dawidwys commented on a change in pull request #7777: [FLINK-9964][table] Add a full RFC-compliant CSV table format factory
URL: https://github.com/apache/flink/pull/7777#discussion_r258561739
##########
File path: docs/dev/table/connect.md
##########
@@ -731,24 +757,73 @@ The CSV format allows to read and write comma-separated rows.
{% highlight yaml %}
format:
type: csv
- fields: # required: ordered format fields
- - name: field1
- type: VARCHAR
- - name: field2
- type: TIMESTAMP
- field-delimiter: "," # optional: string delimiter "," by default
- line-delimiter: "\n" # optional: string delimiter "\n" by default
- quote-character: '"' # optional: single character for string values, empty by default
- comment-prefix: '#' # optional: string to indicate comments, empty by default
- ignore-first-line: false # optional: boolean flag to ignore the first line, by default it is not skipped
- ignore-parse-errors: true # optional: skip records with parse error instead of failing by default
+
+ # required: define the schema either by using type information
+ schema: "ROW(lon FLOAT, rideTime TIMESTAMP)"
+
+ # or use the table's schema
+ derive-schema: true
+
+ field-delimiter: ";" # optional: field delimiter character (',' by default)
+ line-delimiter: "\r\n" # optional: line delimiter ("\n" by default; otherwise "\r" or "\r\n" are allowed)
+ quote-character: "'" # optional: quote character for enclosing field values ('"' by default)
+ allow-comments: true # optional: ignores comment lines that start with "#" (disabled by default)
+ ignore-parse-errors: true # optional: skip fields and rows with parse errors instead of failing;
+ # fields are set to null in case of errors
+ array-element-delimiter: "|" # optional: the array element delimiter string for separating
+ # array and row element values (";" by default)
+ escape-character: "\\" # optional: escape character for escaping values (disabled by default)
+ null-literal: "n/a" # optional: null literal string that is interpreted as a
+ # null value (disabled by default)
{% endhighlight %}
</div>
</div>
-The CSV format is included in Flink and does not require additional dependencies.
+The following table lists supported types that can be read and written:
+
+| Supported Flink SQL Types |
+| :------------------------ |
+| `ROW` |
+| `VARCHAR` |
+| `ARRAY[_]` |
+| `INT` |
+| `BIGINT` |
+| `FLOAT` |
+| `DOUBLE` |
+| `BOOLEAN` |
+| `DATE` |
+| `TIME` |
+| `TIMESTAMP` |
+| `DECIMAL` |
+| `NULL` (not yet supported) |
+
+**Numeric types:** Value should be a number but the literal `"null"` can also be understood. An empty string is
+considered `null`. Values are also trimmed (leading/trailing white space). Numbers are parsed using
+Java's `valueOf` semantics. Other non-numeric strings may cause a parsing exception.
+
+**String and time types:** Value is not trimmed. The literal `"null"` can also be understood.
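
To make the numeric semantics above concrete, here is a minimal sketch, assuming the format ultimately relies on Java's `valueOf` methods as the quoted docs state. It is an illustration of the described behavior for an `INT` field, not the parser actually used by the CSV format:

```java
// Illustration only: documented numeric field semantics, assuming Java valueOf parsing.
public final class NumericFieldSketch {

    static Integer parseIntField(String raw) {
        String value = raw.trim();                      // leading/trailing whitespace is removed
        if (value.isEmpty() || value.equals("null")) {  // empty string and the "null" literal become null
            return null;
        }
        return Integer.valueOf(value);                  // non-numeric input throws NumberFormatException,
                                                        // i.e. a parse error unless ignore-parse-errors is set
    }

    public static void main(String[] args) {
        System.out.println(parseIntField(" 42 "));  // 42
        System.out.println(parseIntField(""));      // null
        System.out.println(parseIntField("null"));  // null
    }
}
```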
Review comment:
How about we add info about the expected format for time types? This pops up often in
users' questions. I believe we can either point to `Timestamp/Time.valueOf` or copy the
expected formats.
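
For reference, the expected formats would then be the JDBC escape formats accepted by `java.sql.Date/Time/Timestamp.valueOf`. A minimal sketch, assuming the format indeed delegates to these methods (which would need to be confirmed against the implementation):

```java
import java.sql.Date;
import java.sql.Time;
import java.sql.Timestamp;

// JDBC escape formats accepted by the java.sql valueOf methods mentioned above.
// Whether the CSV format delegates to exactly these methods is an assumption here.
public final class TimeFormatSketch {
    public static void main(String[] args) {
        Date date = Date.valueOf("2019-02-21");                       // DATE: yyyy-[m]m-[d]d
        Time time = Time.valueOf("12:34:56");                         // TIME: hh:mm:ss
        Timestamp ts = Timestamp.valueOf("2019-02-21 12:34:56.789");  // TIMESTAMP: yyyy-[m]m-[d]d hh:mm:ss[.f...]
        System.out.println(date + " " + time + " " + ts);
    }
}
```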