[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15503640#comment-15503640 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/2297

> FieldParsers should support empty strings
> ---
>
> Key: FLINK-4081
> URL: https://issues.apache.org/jira/browse/FLINK-4081
> Project: Flink
> Issue Type: Bug
> Components: Core
> Reporter: Flavio Pompermaier
> Assignee: Timo Walther
> Labels: csvparser, table-api
> Fix For: 1.2.0
>
> In order to parse CSV files using the new Table API that converts rows to Row
> objects (that support null values), FieldParser implementations should
> support empty strings by setting the parser state to
> ParseErrorState.EMPTY_STRING (for example, FloatParser and DoubleParser
> don't respect this constraint)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15502888#comment-15502888 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2297

That looks more consistent. If the tests pass, +1 from my side.
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496347#comment-15496347 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user twalthr commented on the issue:

    https://github.com/apache/flink/pull/2297

If there are no objections, I would like to merge this PR next week. My other PRs depend on this.
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420758#comment-15420758 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user twalthr commented on the issue:

    https://github.com/apache/flink/pull/2297

@StephanEwen I renamed `EMPTY_STRING` to `EMPTY_COLUMN`. All parsers that have a "format" (like Double, Boolean, Integer etc.) return `-1` and set `EMPTY_COLUMN`. The `StringParser` returns the String but sets `EMPTY_COLUMN` in case no quoting could be found. So in `..,12,,xyz,..` the column `,,` will always be set as `EMPTY_COLUMN` consistently across all parsers.
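To make the behavior described for the "format" parsers concrete, here is a minimal, self-contained sketch. It is not Flink's `FieldParser` code: the class `EmptyColumnSketch`, the nested `IntFieldSketch`, the `ErrorState` enum, and the `parseField` signature are all hypothetical stand-ins, and the sketch assumes that a successful parse returns the position just after the field's delimiter. It only illustrates the idea of returning `-1` and recording `EMPTY_COLUMN` when nothing sits between two delimiters.

```java
// Illustrative sketch only; not Flink's FieldParser. All names are hypothetical.
class EmptyColumnSketch {

    enum ErrorState { NONE, EMPTY_COLUMN, ILLEGAL_CHARACTER }

    /** Simplified stand-in for a parser of non-negative integer fields (no overflow checks). */
    static final class IntFieldSketch {
        ErrorState errorState = ErrorState.NONE;
        int lastResult;

        /**
         * Parses one field starting at {@code start}.
         * Assumption: returns the position after the field's delimiter
         * (or {@code limit} for the last field), and -1 on any error state.
         */
        int parseField(byte[] bytes, int start, int limit, byte delim) {
            errorState = ErrorState.NONE;
            // Nothing between the previous delimiter and the next one: empty column.
            if (start >= limit || bytes[start] == delim) {
                errorState = ErrorState.EMPTY_COLUMN;
                return -1;
            }
            int value = 0;
            int i = start;
            while (i < limit && bytes[i] != delim) {
                int digit = bytes[i] - '0';
                if (digit < 0 || digit > 9) {
                    errorState = ErrorState.ILLEGAL_CHARACTER;
                    return -1;
                }
                value = value * 10 + digit;
                i++;
            }
            lastResult = value;
            return i < limit ? i + 1 : limit;
        }
    }
}
```

With the input `12,,7`, parsing from offset 0 stores `12` and returns `3`; parsing from offset 3 returns `-1` with `EMPTY_COLUMN` set, so the caller can tell a missing value apart from a malformed one.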
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411745#comment-15411745 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2297

Okay, I think the main source of confusion here is that `EMPTY_STRING` does not actually refer to an empty string (as specific to the string parser) but to something like "null", "empty column", or "missing value". Let's call it something like that, and then it actually makes sense to me to communicate that via an error state, because it is not a regular value (as an empty string would be).
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411602#comment-15411602 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user twalthr commented on the issue:

    https://github.com/apache/flink/pull/2297

The length is not sufficient to differentiate between an empty column and an empty string, e.g. if you have quoted strings: `,,` and `,"",`. With the changes in this PR we don't modify the return values, but record the information about an empty column in the error state.
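The `,,` versus `,"",` distinction can be sketched as follows. This is a simplified illustration, not Flink's `StringParser`: the class `StringFieldSketch`, its `ErrorState` enum, and the `parseField` signature are hypothetical, and the sketch assumes a closing quote is always followed directly by a delimiter or the end of the line. Both inputs yield an empty string as the parsed value, so only the error state tells them apart.

```java
// Illustrative sketch only; not Flink's StringParser. All names are hypothetical.
class StringFieldSketch {

    enum ErrorState { NONE, EMPTY_COLUMN, UNTERMINATED_QUOTE }

    ErrorState errorState = ErrorState.NONE;
    String lastResult = "";

    /**
     * Parses one string field starting at {@code start}.
     * Assumption: returns the position after the field's delimiter
     * (or the line length for the last field), and -1 on a quoting error.
     */
    int parseField(String line, int start, char delim, char quote) {
        errorState = ErrorState.NONE;
        if (start < line.length() && line.charAt(start) == quote) {
            int close = line.indexOf(quote, start + 1);
            if (close < 0) {
                errorState = ErrorState.UNTERMINATED_QUOTE;
                return -1;
            }
            // A quoted field, even "", is a regular value: no EMPTY_COLUMN here.
            lastResult = line.substring(start + 1, close);
            return Math.min(close + 2, line.length());
        }
        int end = line.indexOf(delim, start);
        if (end < 0) {
            end = line.length();
        }
        lastResult = line.substring(start, end);
        if (end == start) {
            // Unquoted and zero-length: the value is returned as "", but flagged.
            errorState = ErrorState.EMPTY_COLUMN;
        }
        return Math.min(end + 1, line.length());
    }
}
```

Parsing the second field of `a,,b` and of `a,"",b` gives `""` in both cases, with `EMPTY_COLUMN` set only for the unquoted one.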
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405909#comment-15405909 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2297

I am confused here: the return value is not the parsed value, but the length that got parsed.
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404167#comment-15404167 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user fpompermaier commented on the issue:

    https://github.com/apache/flink/pull/2297

For our use cases it is important to know whether the value of a cell is empty or is 0. The EMPTY_STRING error could be intercepted by the CSV parser in order to generate Row objects with null values instead of 0 when using the Table API.
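The interception idea can be sketched roughly like this. It is a toy illustration under stated assumptions, not the Table API or Flink's CSV input format: `NullableRowSketch`, `parseIntField`, and `parseRow` are hypothetical names, the row is modeled as a plain `Integer[]`, and fields are split with `String.split` rather than parsed from bytes.

```java
// Illustrative sketch only; not Flink's CSV input format. All names are hypothetical.
class NullableRowSketch {

    enum ErrorState { NONE, EMPTY_COLUMN }

    /** Writes the parsed value into out[0] and reports whether the field was empty. */
    static ErrorState parseIntField(String field, int[] out) {
        if (field.isEmpty()) {
            return ErrorState.EMPTY_COLUMN;
        }
        out[0] = Integer.parseInt(field);
        return ErrorState.NONE;
    }

    /** Builds a row in which empty columns become null instead of 0. */
    static Integer[] parseRow(String line) {
        // limit -1 keeps trailing empty fields, so "1,," yields three columns.
        String[] fields = line.split(",", -1);
        Integer[] row = new Integer[fields.length];
        int[] holder = new int[1];
        for (int i = 0; i < fields.length; i++) {
            ErrorState state = parseIntField(fields[i], holder);
            // The row builder intercepts EMPTY_COLUMN and stores null.
            row[i] = (state == ErrorState.EMPTY_COLUMN) ? null : Integer.valueOf(holder[0]);
        }
        return row;
    }
}
```

For the line `12,,0` this produces the three values `12`, `null`, and `0`, which keeps an empty cell distinguishable from a cell that really contains 0.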
[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings
[ https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404107#comment-15404107 ]

ASF GitHub Bot commented on FLINK-4081:
---

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2297

Using an error state for an empty string seems a bit unorthodox. I take it returning `0` does not work for some reason?