[GitHub] [flink] wuchong commented on a change in pull request #9802: [FLINK-13361][documention] Add documentation for JDBC connector for Table API & SQL

GitBox Wed, 09 Oct 2019 20:11:51 -0700

wuchong commented on a change in pull request #9802: [FLINK-13361][documention] 
Add documentation for JDBC connector for Table API & SQL
URL: https://github.com/apache/flink/pull/9802#discussion_r333314037


 ##########
 File path: docs/dev/table/connect.md
 ##########
 @@ -1075,6 +1076,143 @@ CREATE TABLE MyUserTable (
 
 {% top %}
 
+### JDBC Connector
+
+<span class="label label-primary">Source: Batch</span>
+<span class="label label-primary">Sink: Batch</span>
+<span class="label label-primary">Sink: Streaming Append Mode</span>
+<span class="label label-primary">Sink: Streaming Upsert Mode</span>
+<span class="label label-primary">Temporal Join: Sync Mode</span>
+
+The JDBC connector allows for reading from an JDBC client.
+The JDBC connector allows for writing into an JDBC client.
+
+The connector can operate in [upsert mode](#update-modes) for exchanging UPSERT/DELETE messages with the external system using a [key defined by the query](./streaming/dynamic_tables.html#table-to-stream-conversion).
+
+For append-only queries, the connector can also operate in [append mode](#update-modes) for exchanging only INSERT messages with the external system.
+
+Need specify JDBC library, for example, if want to use Mysql library, the following dependency to your project:
+
+{% highlight xml %}
+<dependency>
+    <groupId>mysql</groupId>
+    <artifactId>mysql-connector-java</artifactId>
+    <version>8.0.17</version>
+</dependency>
+{% endhighlight %}
+
+**Library support:** Now, we only support mysql, derby, postgres.
+
+The connector can be defined as follows:
+
+<div class="codetabs" markdown="1">
+<div data-lang="YAML" markdown="1">
+{% highlight yaml %}
+connector:
+  type: jdbc
+  url: "jdbc:mysql://localhost:3306/flink-test"     # required: JDBC DB url
+  table: "jdbc_table_name"        # required: jdbc table name
+  driver: "com.mysql.jdbc.Driver" # optional: the class name of the JDBC driver to use to connect to this URL.
+                                  # If not set, it will automatically be derived from the URL.
+
+  username: "name"                # optional: jdbc user name and password
+  password: "password"
+  
+  read: # scan options, optional, used when reading from table
+    partition: # These options must all be specified if any of them is specified. In addition, partition.num must be specified. They
+               # describe how to partition the table when reading in parallel from multiple tasks. partition.column must be a numeric,
+               # date, or timestamp column from the table in question. Notice that lowerBound and upperBound are just used to decide
+               # the partition stride, not for filtering the rows in table. So all rows in the table will be partitioned and returned.
+               # This option applies only to reading.
+      column: "column_name" # optional, name of the column used for partitioning the input.
+      num: 50               # optional, the largest value of the last partition.
+      lower-bound: 500      # optional, the smallest value of the first partition.
+      upper-bound: 1000     # optional, the largest value of the last partition.
+    fetch-size: 100         # optional, Gives the reader a hint as to the number of rows that should be fetched
+                            # from the database when reading per round trip. If the value specified is zero, then
+                            # the hint is ignored. The default value is zero.
+  
+  lookup: # lookup options, optional, used in temporary join
+    cache:
+      max-rows: 5000 # optional, max number of rows of lookup cache, over this value, the oldest rows will
+                     # be eliminated. "cache.max-rows" and "cache.ttl" options must all be specified if any
+                     # of them is specified. Cache is not enabled as default.
+      ttl: "10s"     # optional, the max time to live for each rows in lookup cache, over this time, the oldest rows
+                     # will be expired. "cache.max-rows" and "cache.ttl" options must all be specified if any of
+                     # them is specified. Cache is not enabled as default.
+    max-retries: 3   # optional, max retry times if lookup database failed
+  
+  write: # sink options, optional, used when writing into table
+      flush:
+        max-rows: 5000 # optional, flush max size (includes all append, upsert and delete records),
+                       # over this number of records, will flush data. The default value is "5000".
+        interval: "2s" # optional, flush interval mills, over this time, asynchronous threads will flush data.
+                       # The default value is "0s", which means no asynchronous flush thread will be scheduled.
+      max-retries: 3   # optional, max retry times if writing records to database failed.
+{% endhighlight %}
+</div>
+
+<div data-lang="DDL" markdown="1">
+{% highlight sql %}
+CREATE TABLE MyUserTable (
+  ...
+) WITH (
+  'connector.type' = 'jdbc', -- required: specify this table type is jdbc
+  
+  'connector.url' = 'jdbc:mysql://localhost:3306/flink-test', -- required: JDBC DB url
+  
+  'connector.table' = 'jdbc_table_name',  -- required: jdbc table name
+  
+  'connector.driver' = 'com.mysql.jdbc.Driver', -- optional: the class name of the JDBC driver to use to connect to this URL.
+                                                -- If not set, it will automatically be derived from the URL.
+
+  'connector.username' = 'name', -- optional: jdbc user name and password
+  'connector.password' = 'password',
+  
+  -- scan options, optional, used when reading from table
+
+  -- These options must all be specified if any of them is specified. In addition, partition.num must be specified. They
+  -- describe how to partition the table when reading in parallel from multiple tasks. partition.column must be a numeric,
+  -- date, or timestamp column from the table in question. Notice that lowerBound and upperBound are just used to decide
+  -- the partition stride, not for filtering the rows in table. So all rows in the table will be partitioned and returned.
+  -- This option applies only to reading.
+  'connector.read.partition.column' = 'column_name', -- optional, name of the column used for partitioning the input.
+  'connector.read.partition.num' = '50', -- optional, the largest value of the last partition.
 
 Review comment:
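   Should this read "the number of partitions"? The current description, "the largest value of the last partition", repeats the text used for upper-bound in the YAML tab.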
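
   To illustrate why `num` reads more naturally as a partition count, here is a minimal sketch of how a numeric range could be split into strides from the three options. This is not Flink's implementation: the function name `split_range` and the choice of inclusive, evenly sized ranges are assumptions made only for this example. Per the quoted comment in the diff, the bounds only decide the stride and do not filter rows; the sketch computes the strides and nothing more.

```python
from typing import List, Tuple


def split_range(lower_bound: int, upper_bound: int, num_partitions: int) -> List[Tuple[int, int]]:
    """Split [lower_bound, upper_bound] into contiguous inclusive ranges.

    Hypothetical helper for illustration only; not part of Flink.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")

    total = upper_bound - lower_bound + 1
    # Never create more partitions than there are distinct values.
    count = min(num_partitions, total)
    base, extra = divmod(total, count)

    ranges = []
    start = lower_bound
    for i in range(count):
        # The first `extra` partitions take one additional value each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Values taken from the YAML example in the diff: lower-bound 500, upper-bound 1000, num 50.
ranges = split_range(500, 1000, 50)
print(len(ranges))  # 50
```

   With the example values, `num` determines how many ranges are produced, while `lower-bound` and `upper-bound` fix where the first range starts and the last one ends.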