[
https://issues.apache.org/jira/browse/FLINK-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307155#comment-16307155
]
ASF GitHub Bot commented on FLINK-8331:
---------------------------------------
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/5218#discussion_r159136001
--- Diff: flink-java/src/test/java/org/apache/flink/api/java/io/RowCsvInputFormatTest.java ---
@@ -362,61 +365,95 @@ public void readStringFieldsWithTrailingDelimiters() throws Exception {
@Test
public void testTailingEmptyFields() throws Exception {
- String fileContent = "abc|-def|-ghijk\n" +
- "abc|-def|-\n" +
- "abc|-|-\n" +
- "|-|-|-\n" +
- "|-|-\n" +
- "abc|-def\n";
- FileInputSplit split = createTempFile(fileContent);
-
- TypeInformation[] fieldTypes = new TypeInformation[]{
- BasicTypeInfo.STRING_TYPE_INFO,
- BasicTypeInfo.STRING_TYPE_INFO,
- BasicTypeInfo.STRING_TYPE_INFO};
-
- RowCsvInputFormat format = new RowCsvInputFormat(PATH, fieldTypes, "\n", "|");
- format.setFieldDelimiter("|-");
- format.configure(new Configuration());
- format.open(split);
-
- Row result = new Row(3);
-
- result = format.nextRecord(result);
- assertNotNull(result);
- assertEquals("abc", result.getField(0));
- assertEquals("def", result.getField(1));
- assertEquals("ghijk", result.getField(2));
+ List<Tuple4<TypeInformation, String, String, String>> dataList = new java.util.ArrayList<>();
+
+ // test String
+ dataList.add(new Tuple4<>(BasicTypeInfo.STRING_TYPE_INFO, "bdc", "bdc", ""));
+ // test BigInt
+ dataList.add(new Tuple4<>(BasicTypeInfo.BIG_INT_TYPE_INFO,
--- End diff --
I think we can extend this test, but it should not exhaustively cover all types.
Each `FieldParser` implementation is responsible for handling empty fields
correctly. Hence, that logic should be tested in the field parser tests
and not here.
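
To illustrate what a parser-level check could look like, here is a minimal standalone sketch. It does not use Flink's classes: `EmptyFieldSketch`, `ToyDoubleParser`, and the `ErrorState` enum are hypothetical stand-ins that only mirror the idea of an EMPTY_COLUMN error state, and the single-byte delimiter is a simplifying assumption.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch; all names here are hypothetical stand-ins, not Flink's classes.
class EmptyFieldSketch {

    /** Simplified error states, loosely modeled on the EMPTY_COLUMN state named in the issue. */
    enum ErrorState { NONE, EMPTY_COLUMN, NUMERIC_VALUE_FORMAT_ERROR }

    /** Toy double parser that assumes a single-byte field delimiter. */
    static class ToyDoubleParser {
        private ErrorState errorState = ErrorState.NONE;
        private double result;

        /** Returns the position after the field's delimiter, or -1 on error. */
        int parseField(byte[] bytes, int startPos, int limit, byte delim) {
            errorState = ErrorState.NONE;
            // Find the end of the current field.
            int end = startPos;
            while (end < limit && bytes[end] != delim) {
                end++;
            }
            if (end == startPos) {
                // Empty field: report it explicitly instead of passing "" to Double.parseDouble.
                errorState = ErrorState.EMPTY_COLUMN;
                return -1;
            }
            String text = new String(bytes, startPos, end - startPos, StandardCharsets.US_ASCII);
            try {
                result = Double.parseDouble(text);
            } catch (NumberFormatException e) {
                errorState = ErrorState.NUMERIC_VALUE_FORMAT_ERROR;
                return -1;
            }
            // Skip the delimiter unless the field ran to the end of the input.
            return end == limit ? limit : end + 1;
        }

        ErrorState getErrorState() {
            return errorState;
        }

        double getLastResult() {
            return result;
        }
    }

    public static void main(String[] args) {
        ToyDoubleParser parser = new ToyDoubleParser();
        byte[] line = "1.5||".getBytes(StandardCharsets.US_ASCII);
        int next = parser.parseField(line, 0, line.length, (byte) '|');
        System.out.println(parser.getLastResult() + " " + parser.getErrorState());
        // The second field is empty, so the parser should flag it rather than throw.
        parser.parseField(line, next, line.length, (byte) '|');
        System.out.println(parser.getErrorState());
    }
}
```

A test in this style stays next to the parser it exercises, so the input format test only needs to cover how empty fields are handled once a parser reports them.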
> FieldParsers do not correctly set EMPTY_COLUMN error state
> ----------------------------------------------------------
>
> Key: FLINK-8331
> URL: https://issues.apache.org/jira/browse/FLINK-8331
> Project: Flink
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.5.0, 1.4.1
> Reporter: Fabian Hueske
> Assignee: sunjincheng
>
> Some {{FieldParser}} implementations do not correctly set the EMPTY_COLUMN
> error state if a field is empty.
> Instead, they try to parse the field value from an empty String, which fails,
> e.g., in the case of the {{DoubleParser}} with a {{NumberFormatException}}.
> The {{RowCsvInputFormat}} has a flag to interpret empty fields as {{null}}
> values. The implementation requires that all {{FieldParser}} implementations
> correctly return the EMPTY_COLUMN error state in case of an empty field.
> Affected {{FieldParser}} implementations:
> - BigDecParser
> - BigIntParser
> - DoubleParser
> - FloatParser
> - SqlDateParser
> - SqlTimeParser
> - SqlTimestampParser