[ https://issues.apache.org/jira/browse/FLINK-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885905#comment-15885905 ]
ASF GitHub Bot commented on FLINK-5907:
---------------------------------------
Github user KurtYoung commented on a diff in the pull request:
https://github.com/apache/flink/pull/3417#discussion_r103225729
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/io/RowCsvInputFormat.java ---
@@ -197,6 +197,14 @@ protected boolean parseRecord(Object[] holders, byte[] bytes, int offset, int nu
     if (startPos < 0) {
         throw new ParseException(String.format("Unexpected parser position for column %1$s of row '%2$s'",
             field, new String(bytes, offset, numBytes)));
+    } else if (startPos == limit
+            && field != fieldIncluded.length - 1
+            && !FieldParser.endsWithDelimiter(bytes, startPos - 1, fieldDelimiter)) {
+        if (isLenient()) {
--- End diff --
added
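
For readers following the thread, the condition added in the diff above can be read as "the row ran out of bytes before the last expected field, and it does not end in a delimiter that would imply a trailing empty field". The following is only a minimal standalone sketch of that reading, not the code in the pull request: the class RowEndCheck and the method rowEndedEarly are hypothetical names, and endsWithDelimiter here is a local stand-in assumed to behave like FieldParser.endsWithDelimiter (checking whether the bytes ending at the given index are the delimiter). What the parser does once the condition holds, beyond consulting isLenient(), is not visible in the diff and is left out.

```java
final class RowEndCheck {

    // Local stand-in (assumption) for FieldParser.endsWithDelimiter:
    // returns true if the bytes ending at index endPos (inclusive)
    // are exactly the delimiter bytes.
    static boolean endsWithDelimiter(byte[] bytes, int endPos, byte[] delim) {
        if (endPos < delim.length - 1 || endPos >= bytes.length) {
            return false;
        }
        for (int i = 0; i < delim.length; i++) {
            if (bytes[endPos - delim.length + 1 + i] != delim[i]) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper mirroring the three-part condition from the diff:
    // the parser position reached the end of the row (startPos == limit),
    // the field just parsed is not the last expected one, and the row does
    // not end with a delimiter.
    static boolean rowEndedEarly(byte[] bytes, int startPos, int limit,
                                 int field, int fieldCount, byte[] delim) {
        return startPos == limit
            && field != fieldCount - 1
            && !endsWithDelimiter(bytes, startPos - 1, delim);
    }
}
```

Under these assumptions, a row "a\tb" parsed against three expected fields is reported as ended early after the second field, while "a\tb\t" is not, because the trailing tab leaves room for an empty third field.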
> RowCsvInputFormat bug on parsing tsv
> ------------------------------------
>
> Key: FLINK-5907
> URL: https://issues.apache.org/jira/browse/FLINK-5907
> Project: Flink
> Issue Type: Bug
> Components: Java API
> Affects Versions: 1.2.0
> Reporter: Flavio Pompermaier
> Assignee: Kurt Young
> Labels: csv, parsing
> Attachments: test.tsv
>
>
> The following snippet reproduces the problem (using the attached file as
> input):
> {code:language=java}
> char fieldDelim = '\t';
> TypeInformation<?>[] fieldTypes = new TypeInformation<?>[51];
> for (int i = 0; i < fieldTypes.length; i++) {
>     fieldTypes[i] = BasicTypeInfo.STRING_TYPE_INFO;
> }
> int[] fieldMask = new int[fieldTypes.length];
> for (int i = 0; i < fieldMask.length; i++) {
>     fieldMask[i] = i;
> }
> RowCsvInputFormat csvIF = new RowCsvInputFormat(new Path(testCsv),
>     fieldTypes, "\n", fieldDelim + "",
>     fieldMask, true);
> csvIF.setNestedFileEnumeration(true);
> DataSet<Row> csv = env.createInput(csvIF);
> csv.print();
> {code}