[
https://issues.apache.org/jira/browse/FLINK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504687#comment-14504687
]
ASF GitHub Bot commented on FLINK-1820:
---------------------------------------
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/566#issuecomment-94722161
Your PR changes the semantics of the Integer parsers a bit because it
ignores whitespace. This change has a few implications. The following fields
are parsed as valid Integer values:
- `" 123 "`
- `"- 123"`
- `"1 2 3"`
but the following is not accepted:
- `" -123"`
This behavior is not expected, IMO.
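To make the asymmetry concrete, here is a minimal sketch of a parser that would behave this way. It is not the code from the PR: the class and method names are hypothetical, and it assumes the sign is checked only at the first byte while spaces are skipped inside the digit loop.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the parser from the pull request.
class LenientIntParseSketch {

    // Assumption: the sign is inspected once at position 0, and spaces are
    // skipped inside the digit loop. Overflow checks are omitted for brevity.
    static int parse(String field) {
        byte[] bytes = field.getBytes(StandardCharsets.US_ASCII);
        int pos = 0;
        boolean negative = false;
        if (bytes.length > 0 && bytes[0] == '-') {
            negative = true;
            pos = 1;
        }
        int value = 0;
        for (; pos < bytes.length; pos++) {
            byte b = bytes[pos];
            if (b == ' ') {
                continue; // whitespace is dropped wherever it occurs
            }
            if (b < '0' || b > '9') {
                // a '-' that is not the very first byte ends up here
                throw new NumberFormatException("Invalid character in field: '" + field + "'");
            }
            value = value * 10 + (b - '0');
        }
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        System.out.println(parse(" 123 ")); // accepted
        System.out.println(parse("- 123")); // accepted
        System.out.println(parse("1 2 3")); // accepted
        try {
            parse(" -123");
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions, a space before the sign pushes the `-` into the digit loop, where it is treated as an invalid character, while spaces anywhere else are silently dropped.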
I know that `Double.parseDouble()` and `Float.parseFloat()` both ignore
leading and trailing whitespace, and that the intention of this PR is to make
the parsing of floating point and integer numeric values consistent.
Instead of accepting leading and trailing whitespace in the Integer
parsers, I propose to check for leading and trailing whitespace in floating
point fields and make these parsers fail in such cases. This would also give
consistent parsing behavior.
What do you think?
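A rough sketch of what such a check could look like, assuming the field has already been extracted as a `String`. The class and method names are hypothetical, and the actual parsers may operate on byte arrays instead. Handling of empty fields is deliberately left out here.

```java
// Hypothetical illustration of the proposal; not existing Flink code.
class StrictDoubleParseSketch {

    // Rejects any field that Double.parseDouble() would only accept after
    // trimming. Double.parseDouble() strips leading and trailing whitespace
    // as if by String.trim(), so a field that differs from its trimmed form
    // carries whitespace that would otherwise be silently ignored.
    static double parseStrict(String field) {
        if (field.length() != field.trim().length()) {
            throw new NumberFormatException(
                    "Leading or trailing whitespace in field: '" + field + "'");
        }
        return Double.parseDouble(field);
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("1.5")); // accepted
        try {
            parseStrict(" 1.5 ");
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this check in place, `" 1.5"` and `"1.5 "` fail in the same way that a whitespace-strict Integer parser rejects `" 123 "`.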
> Bug in DoubleParser and FloatParser - empty String is not casted to 0
> ---------------------------------------------------------------------
>
> Key: FLINK-1820
> URL: https://issues.apache.org/jira/browse/FLINK-1820
> Project: Flink
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.8.0, 0.9, 0.8.1
> Reporter: Felix Neutatz
> Assignee: Felix Neutatz
> Priority: Critical
> Fix For: 0.9
>
>
> Hi,
> I found the bug when I wanted to read a CSV file that had a line like:
> "||\n"
> If I treat it as a Tuple2<Long,Long>, I get a tuple (0L,0L), as expected.
> But if I want to read it into a Double-Tuple or a Float-Tuple, I get the
> following error:
> java.lang.AssertionError: Test failed due to a
> org.apache.flink.api.common.io.ParseException: Line could not be parsed: '||'
> ParserError NUMERIC_VALUE_FORMAT_ERROR
> This error can be solved by adding an additional condition for empty strings
> in the FloatParser / DoubleParser.
> We definitely need the CSVReader to be able to read "empty values".
> I can fix it as described if there are no better ideas :)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)