[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337244#comment-17337244 ]

Leonard Xu commented on FLINK-14364:

[~lirui] could you take a look at this one? It should be a CSV format bug.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Reporter: Jingsong Lee
> Priority: Major
> Labels: auto-unassigned, pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333861#comment-17333861 ]

Flink Jira Bot commented on FLINK-14364:

This issue was marked "stale-assigned" and has not received an update in 7 days. It is now automatically unassigned. If you are still working on it, you can assign it to yourself again. Please also give an update about the status of the work.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17323175#comment-17323175 ]

Flink Jira Bot commented on FLINK-14364:

This issue is assigned but has not received an update in 7 days, so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign so someone else may work on it. In 7 days the issue will be automatically unassigned.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255178#comment-17255178 ]

Jark Wu commented on FLINK-14364:

According to {{CsvRowDataSerDeSchemaTest#testDeserializeIgnoreComment}} and {{CsvRowDataSerDeSchemaTest#testDeserializeAllowComment}}, I think this has been fixed in the new CSV format. Could we close this?

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999291#comment-16999291 ]

Jiayi Liao commented on FLINK-14364:

[~lzljs3620320] [~ykt836] Sorry for the late reply, I have been busy with something else recently :(. I've added a validation in {{CsvValidator}} to improve usability in this [PR|https://github.com/apache/flink/pull/10622]. I'd appreciate it if you could spare the time to take a look.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993178#comment-16993178 ]

Jingsong Lee commented on FLINK-14364:

Hi [~wind_ljy], the current status is that setAllowComments(true) only works together with setIgnoreParseErrors(true). As you said, we should add this check to CsvValidator#validate. I think that is a good way to improve usability. Since this ticket is a usability improvement, I will move it to 1.11.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
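For readers following the thread, the validation rule discussed in the comment above could look roughly like the sketch below. This is only an illustration: {{CsvOptionCheck}}, its method, and the error message are hypothetical names, not Flink's actual {{CsvValidator}} code.

```java
/** Hypothetical stand-in for the proposed check; this is not Flink's CsvValidator. */
final class CsvOptionCheck {

    private CsvOptionCheck() {}

    /**
     * Rejects the option combination reported in this ticket:
     * comments allowed while parse errors are not ignored.
     */
    static void validate(boolean allowComments, boolean ignoreParseErrors) {
        if (allowComments && !ignoreParseErrors) {
            throw new IllegalArgumentException(
                "allow-comments requires ignore-parse-errors to be enabled");
        }
    }

    public static void main(String[] args) {
        validate(true, true);   // accepted
        validate(false, false); // accepted
        try {
            validate(true, false); // the combination from the bug report
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at configuration time like this would surface the unsupported combination before any record is read, instead of as a MismatchedInputException on the first comment line.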
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985908#comment-16985908 ]

Jiayi Liao commented on FLINK-14364:

[~ykt836] We still don't have a consensus on the solution. I think it is enough to add the rule to {{CsvValidator.validate}} rather than changing the {{objectReader.readValue}} call, since that change may introduce an unnecessary performance loss. [~twalthr] [~ykt836] Do you have a better idea?

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985899#comment-16985899 ]

Kurt Young commented on FLINK-14364:

Hi [~wind_ljy], are you still working on this?

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953802#comment-16953802 ]

Jiayi Liao commented on FLINK-14364:

[~twalthr] I just don't think it's worth it. We would get an iterator by calling {{objectReader.readValues(message)}} and then have to check it with {{iterator.hasNext()}}, which is an extra call compared with {{objectReader.readValue(message)}} and costs some performance.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953467#comment-16953467 ]

Timo Walther commented on FLINK-14364:

[~wind_ljy] we can also use {{objectReader.readValues(message)}} if it does not affect correctness. Feel free to prepare a PR that improves the usability.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953412#comment-16953412 ]

Jiayi Liao commented on FLINK-14364:

[~twalthr] I've checked the Jackson code and found that allowComments only works with multi-line messages read via objectReader.readValues(message) instead of objectReader.readValue(message). I'm not sure whether this is a Jackson bug or expected behaviour. Either way, I think we should add this check to CsvValidator#validate.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
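The difference between the two read paths described in the comment above can be pictured with a toy model. The sketch below does not call Jackson; {{CommentAwareReader}} and its methods are made-up names that only imitate the idea of an iterator-style read that skips comment lines, so that a comment-only message yields no rows rather than an error.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy model of an iterator-style read; it does not use or reproduce Jackson's code. */
final class CommentAwareReader {

    private CommentAwareReader() {}

    /** Reads all rows of a message, skipping blank lines and lines starting with '#'. */
    static Iterator<String[]> readValues(String message) {
        List<String[]> rows = new ArrayList<>();
        for (String line : message.split("\n")) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // comment or blank line: contributes no row
            }
            rows.add(line.split(",", -1));
        }
        return rows.iterator();
    }

    /** Maps "no rows in this message" to null instead of raising an error. */
    static String[] readFirstOrNull(String message) {
        Iterator<String[]> it = readValues(message);
        return it.hasNext() ? it.next() : null;
    }
}
```

With this shape, {{readFirstOrNull("#Test,12,Test")}} returns null, at the price of the extra {{hasNext()}} check that the thread weighs against a single-value read.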
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952878#comment-16952878 ]

Timo Walther commented on FLINK-14364:

[~lzljs3620320] I just saw that this is actually intended behavior. See the JavaDocs of {{org.apache.flink.table.descriptors.Csv#allowComments}}. But I'm fine with implementing this in a nicer way.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950679#comment-16950679 ]

Jark Wu commented on FLINK-14364:

[~wind_ljy], I assigned this issue to you. Feel free to submit a pull request.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Assignee: Jiayi Liao
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950368#comment-16950368 ]

Jiayi Liao commented on FLINK-14364:

[~twalthr] Are you working on this? I can look into it in the next few days if you don't have time.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949247#comment-16949247 ]

Timo Walther commented on FLINK-14364:

Yes, this seems like a bug. The result should be null. Jackson should not throw an exception here.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949243#comment-16949243 ]

Jingsong Lee commented on FLINK-14364:

{code:java}
// Minimal reproduction: a comment line, with comments allowed and parse errors not ignored.
String string = "#Test,12,Test";
final TypeInformation<Row> rowInfo = Types.ROW(Types.STRING, Types.INT, Types.STRING);
final CsvRowDeserializationSchema.Builder deserSchemaBuilder =
    new CsvRowDeserializationSchema.Builder(rowInfo)
        .setIgnoreParseErrors(false)
        .setAllowComments(true);
// deserialize(...) is a helper that is not shown in this comment.
// Per the issue description, this throws MismatchedInputException.
System.out.println(deserialize(deserSchemaBuilder, string));
{code}
Here is an example.

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949221#comment-16949221 ]

Timo Walther commented on FLINK-14364:

If {{setIgnoreParseErrors}} is set to {{false}}, it is fine to throw an exception. But if comments are enabled, there should be no exception. Can you provide an example that causes this error?

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?
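The semantics described in the comment above can be written down as a small, self-contained sketch. It is only an illustration of the expected behaviour under assumed names ({{ExpectedCsvSemantics}}, {{parseSecondField}}, a fixed three-field row), not the Flink implementation.

```java
/** Illustration of the expected option semantics; hypothetical names, not Flink code. */
final class ExpectedCsvSemantics {

    private ExpectedCsvSemantics() {}

    /**
     * Parses the integer in the second of three comma-separated fields.
     * A comment line is skipped (null) when comments are allowed; a malformed
     * line throws unless parse errors are ignored, in which case it is null too.
     */
    static Integer parseSecondField(String line, boolean ignoreParseErrors, boolean allowComments) {
        if (allowComments && line.startsWith("#")) {
            return null; // a comment is not a parse error, so nothing is thrown
        }
        try {
            String[] fields = line.split(",", -1);
            if (fields.length != 3) {
                throw new IllegalArgumentException("expected 3 fields but got " + fields.length);
            }
            return Integer.valueOf(fields[1].trim());
        } catch (IllegalArgumentException e) { // also covers NumberFormatException
            if (ignoreParseErrors) {
                return null;
            }
            throw e;
        }
    }
}
```

Under this reading, the two options are independent: the comment check runs before any parsing, so enabling comments never depends on whether parse errors are ignored.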
[jira] [Commented] (FLINK-14364) Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
[ https://issues.apache.org/jira/browse/FLINK-14364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949111#comment-16949111 ]

Jingsong Lee commented on FLINK-14364:

CC: [~twalthr]

> Allow comments fail when not ignore parse errors in CsvRowDeserializationSchema
> ---
>
> Key: FLINK-14364
> URL: https://issues.apache.org/jira/browse/FLINK-14364
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Reporter: Jingsong Lee
> Priority: Major
> Fix For: 1.10.0
>
> Use CsvRowDeserializationSchema with setIgnoreParseErrors(false) and setAllowComments(true).
> If the message contains comments, a MismatchedInputException is thrown.
> Is this a bug? Should we catch MismatchedInputException and return null?