[jira] [Comment Edited] (FLINK-21583) Allow comments in CSV format without having to ignore parse errors

liwei li (Jira) Fri, 19 Nov 2021 19:53:10 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-21583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446747#comment-17446747
 ]


liwei li edited comment on FLINK-21583 at 11/20/21, 3:52 AM:
-------------------------------------------------------------

This already seems to be supported.

I created the following test case.

CSV file:

 
{code:java}
#comment1
1,lisi
2,wangwu
3,zhangsan
#comment2{code}
Flink SQL:

 
{code:java}
// 'csv.ignore-parse-errors' is intentionally not set in this test.
tEnv.executeSql(
  """
    |CREATE TABLE MyUserTable (
    |  id INT,
    |  name STRING
    |) WITH (
    |  'connector' = 'filesystem',
    |  'path' = '/test/test/csv',
    |  'csv.allow-comments' = 'true',
    |  'format' = 'csv'
    |)
    |""".stripMargin) {code}
 

Here's what I got:

 
{code:java}
+----+-------------+--------------------------------+
| op |          id |                           name |
+----+-------------+--------------------------------+
| +I |           3 |                       zhangsan |
| +I |           2 |                         wangwu |
| +I |           1 |                           lisi |
+----+-------------+--------------------------------+
3 rows in set {code}
 

The comment lines were skipped even though {{'csv.ignore-parse-errors'}} was 
not set, which is the result we expected. Can I assume that only the 
documentation needs to be updated?

If my understanding is wrong, please give me some guidance. Thank you.

[~nkruber] [~jark] 

 

 

 


> Allow comments in CSV format without having to ignore parse errors
> ------------------------------------------------------------------
>
>                 Key: FLINK-21583
>                 URL: https://issues.apache.org/jira/browse/FLINK-21583
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>    Affects Versions: 1.12.1
>            Reporter: Nico Kruber
>            Assignee: liwei li
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Currently, when you pass {{'csv.allow-comments' = 'true'}} to a table 
> definition, you also have to set {{'csv.ignore-parse-errors' = 'true'}} to 
> actually skip the commented-out line (and the docs mention this prominently 
> as well). This, however, may mask actual parsing errors that you want to be 
> notified of.
> I would like to propose that {{allow-comments}} actually also skips the 
> commented-out lines automatically because these shouldn't be used anyway.
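
The behaviour proposed in the description can be illustrated with a minimal, self-contained Java sketch. It is independent of Flink and does not reflect how Flink's CSV format is implemented; the names `CsvCommentSketch`, `Row`, and `parse` are hypothetical, and the two-column layout simply mirrors the `id INT, name STRING` table from the test case. The point is only that skipping `#` lines and reporting malformed rows can be separate decisions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed semantics; not Flink's CSV format code. */
final class CsvCommentSketch {

    /** One parsed row of the two-column (id INT, name STRING) example table. */
    static final class Row {
        final int id;
        final String name;

        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /**
     * Parses lines into rows. When allowComments is true, lines starting with '#'
     * are skipped. Any other malformed line still raises an exception, so real
     * parse errors are not masked.
     */
    static List<Row> parse(List<String> lines, boolean allowComments) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            if (allowComments && line.startsWith("#")) {
                continue; // a comment is not a parse error, so skip it silently
            }
            String[] fields = line.split(",", -1);
            if (fields.length != 2) {
                throw new IllegalArgumentException("Expected 2 fields in line: " + line);
            }
            try {
                rows.add(new Row(Integer.parseInt(fields[0].trim()), fields[1]));
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid id in line: " + line, e);
            }
        }
        return rows;
    }
}
```

Under these assumptions, the five-line file from the test case yields three rows when comments are allowed, while a line such as `x,lisi` still fails loudly instead of being dropped.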



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
