[ https://issues.apache.org/jira/browse/FLINK-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307155#comment-16307155 ]

ASF GitHub Bot commented on FLINK-8331:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5218#discussion_r159136001
  
    --- Diff: flink-java/src/test/java/org/apache/flink/api/java/io/RowCsvInputFormatTest.java ---
    @@ -362,61 +365,95 @@ public void readStringFieldsWithTrailingDelimiters() throws Exception {
     
        @Test
        public void testTailingEmptyFields() throws Exception {
    -           String fileContent = "abc|-def|-ghijk\n" +
    -                           "abc|-def|-\n" +
    -                           "abc|-|-\n" +
    -                           "|-|-|-\n" +
    -                           "|-|-\n" +
    -                           "abc|-def\n";
     
    -           FileInputSplit split = createTempFile(fileContent);
    -
    -           TypeInformation[] fieldTypes = new TypeInformation[]{
    -                           BasicTypeInfo.STRING_TYPE_INFO,
    -                           BasicTypeInfo.STRING_TYPE_INFO,
    -                           BasicTypeInfo.STRING_TYPE_INFO};
    -
    -           RowCsvInputFormat format = new RowCsvInputFormat(PATH, fieldTypes, "\n", "|");
    -           format.setFieldDelimiter("|-");
    -           format.configure(new Configuration());
    -           format.open(split);
    -
    -           Row result = new Row(3);
    -
    -           result = format.nextRecord(result);
    -           assertNotNull(result);
    -           assertEquals("abc", result.getField(0));
    -           assertEquals("def", result.getField(1));
    -           assertEquals("ghijk", result.getField(2));
    +           List<Tuple4<TypeInformation, String, String, String>> dataList = new java.util.ArrayList<>();
    +
    +           // test String
    +           dataList.add(new Tuple4<>(BasicTypeInfo.STRING_TYPE_INFO, "bdc", "bdc", ""));
    +           // test BigInt
    +           dataList.add(new Tuple4<>(BasicTypeInfo.BIG_INT_TYPE_INFO,
    --- End diff --
    
    I think we can extend this test, but we should not exhaustively test all
    types here. Each `FieldParser` implementation is responsible for handling
    empty fields correctly, so that logic should be tested in the field parser
    tests and not in this test.


> FieldParsers do not correctly set EMPTY_COLUMN error state
> ----------------------------------------------------------
>
>                 Key: FLINK-8331
>                 URL: https://issues.apache.org/jira/browse/FLINK-8331
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.5.0, 1.4.1
>            Reporter: Fabian Hueske
>            Assignee: sunjincheng
>
> Some {{FieldParser}} implementations do not set the EMPTY_COLUMN error state
> when a field is empty.
> Instead, they try to parse the field value from an empty String, which fails.
> For example, the {{DoubleParser}} throws a {{NumberFormatException}}.
> The {{RowCsvInputFormat}} has a flag to interpret empty fields as {{null}}
> values. The implementation requires that every {{FieldParser}} returns
> the EMPTY_COLUMN error state for an empty field.
> Affected {{FieldParser}} implementations:
> - BigDecParser
> - BigIntParser
> - DoubleParser
> - FloatParser
> - SqlDateParser
> - SqlTimeParser
> - SqlTimestampParser
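
The behavior described in the issue can be illustrated with a minimal, self-contained sketch. It is not Flink's actual `FieldParser` API: the class `DoubleFieldParserSketch`, its `ErrorState` enum, and the simplified single-byte delimiter signature are hypothetical names chosen for illustration. The sketch only assumes what the issue states, namely that an empty field should be reported through an EMPTY_COLUMN error state instead of being handed to the number parser.

```java
// Hypothetical, simplified stand-in for a field parser.
// This is NOT Flink's FieldParser API; names and signatures are assumptions.
class DoubleFieldParserSketch {

    // Hypothetical subset of error states, named after the state in the issue.
    enum ErrorState { NONE, EMPTY_COLUMN, NUMERIC_VALUE_FORMAT_ERROR }

    private ErrorState errorState = ErrorState.NONE;
    private double result;

    /**
     * Parses one field that starts at startPos and ends before the next
     * delimiter (or at limit). Returns the position after the delimiter,
     * or -1 if the field could not be parsed.
     */
    int parseField(byte[] bytes, int startPos, int limit, byte delimiter) {
        errorState = ErrorState.NONE;

        // Find the end of the field.
        int end = startPos;
        while (end < limit && bytes[end] != delimiter) {
            end++;
        }

        if (end == startPos) {
            // Empty field: report it via the error state instead of
            // trying to parse an empty String.
            errorState = ErrorState.EMPTY_COLUMN;
            return -1;
        }

        String text = new String(bytes, startPos, end - startPos,
                java.nio.charset.StandardCharsets.UTF_8);
        try {
            result = Double.parseDouble(text);
        } catch (NumberFormatException e) {
            errorState = ErrorState.NUMERIC_VALUE_FORMAT_ERROR;
            return -1;
        }

        // Skip the delimiter if the field did not end at the limit.
        return end == limit ? limit : end + 1;
    }

    ErrorState getErrorState() {
        return errorState;
    }

    double getLastResult() {
        return result;
    }
}
```

With a check of this kind in each parser, a caller such as a CSV input format could inspect the error state after a failed parse and, if configured to do so, map an empty field to {{null}} rather than fail.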



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
