JingsongLi commented on a change in pull request #13010:
URL: https://github.com/apache/flink/pull/13010#discussion_r468310256



##########
File path: docs/dev/table/connectors/datagen.md
##########
@@ -29,25 +29,24 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-The DataGen connector allows for reading by data generation rules.
+The DataGen connector allows for creating tables based on in-memory data generation.
+This is useful when developing queries locally without access to external systems such as Kafka.
+Tables can include [Computed Column syntax]({% link dev/table/sql/create.md %}#create-table) which allows for flexible record generation.
 
-The DataGen connector can work with [Computed Column syntax]({% link dev/table/sql/create.md %}#create-table).
-This allows you to generate records flexibly.
+The DataGen connector is built-in, no additional dependencies are required.
 
-The DataGen connector is built-in.
+Usage
+-----
 
-<span class="label label-danger">Attention</span> Complex types are not supported: Array, Map, Row. Please construct these types by computed column.
+By default, a DataGen table will create an unbounded number of rows with a random value for each column.
+For variable sized types, char/varchar/string/array/map/multiset, the length can be specified.
+Additionally, a total number of rows can be specified, resulting in a bounded table.
 
-How to create a DataGen table
-----------------
-
-The boundedness of table: when the generation of field data in the table is completed, the reading
-is finished. So the boundedness of the table depends on the boundedness of fields.
-
-For each field, there are two ways to generate data:
+There also exists a sequence generator, where users specify a sequence of start and end values.
+Complex types cannot be generated as a sequence.
+If any column in a table is a sequence type, the table will be bounded and end with the first sequence completes.
 
-- Random generator is the default generator, you can specify random max and min values. For char/varchar/string, the length can be specified. It is a unbounded generator.
-- Sequence generator, you can specify sequence start and end values. It is a bounded generator, when the sequence number reaches the end value, the reading ends.
+Time types are always the local machines current system time.

Review comment:
       Maybe we could add a table that lists all supported types,
   showing the generation strategies each type supports and the parameters each strategy requires?
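
   To make the suggestion concrete, a DDL along these lines could sit next to such a table. This is only a sketch: the table and column names are hypothetical, and the option keys (`fields.#.kind`, `fields.#.start`, `fields.#.end`, `fields.#.min`, `fields.#.max`, `fields.#.length`) are the ones I recall from the DataGen connector options, so they should be checked against the final option list in this PR.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE datagen_example (
  id    BIGINT,
  score INT,
  name  STRING,
  -- computed column, as mentioned in the doc text
  ts AS localtimestamp
) WITH (
  'connector' = 'datagen',

  -- sequence generator: makes the table bounded
  'fields.id.kind'  = 'sequence',
  'fields.id.start' = '1',
  'fields.id.end'   = '1000',

  -- random generator (the default) with explicit bounds
  'fields.score.min' = '0',
  'fields.score.max' = '100',

  -- variable-sized type: only the length is configured
  'fields.name.length' = '10'
);
```

   Because `id` is a sequence column, reading from this table should finish once the sequence reaches its end value, which is the boundedness behaviour the new text describes.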



