[ 
https://issues.apache.org/jira/browse/FLINK-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224247#comment-14224247
 ] 

ASF GitHub Bot commented on FLINK-1208:
---------------------------------------

Github user FelixNeutatz commented on a diff in the pull request:

    https://github.com/apache/incubator-flink/pull/201#discussion_r20849398
  
    --- Diff: flink-java/src/main/java/org/apache/flink/api/java/io/CsvInputFormat.java ---
    @@ -137,6 +239,7 @@ public OUT readRecord(OUT reuse, byte[] bytes, int offset, int numBytes) {
                        }
                        return reuse;
                } else {
    +                   this.invalidLineCount++;
    --- End diff --
    
    Input correctness is already checked in 
GenericCsvInputFormat.parseRecord(), and the exception is thrown there, so 
we will never reach this line:
    
    14/11/25 09:46:36 INFO taskmanager.TaskManager: Shutting down TaskManager
    Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: org.apache.flink.api.common.io.ParseException: Row too short: invalid line
        at org.apache.flink.api.common.io.GenericCsvInputFormat.parseRecord(GenericCsvInputFormat.java:284)
        at org.apache.flink.api.java.io.CsvInputFormat.readRecord(CsvInputFormat.java:235)
        at org.apache.flink.api.java.io.CsvInputFormat.readRecord(CsvInputFormat.java:1)
        at org.apache.flink.api.common.io.DelimitedInputFormat.nextRecord(DelimitedInputFormat.java:489)
        at org.apache.flink.api.java.io.CsvInputFormat.nextRecord(CsvInputFormat.java:203)
        at org.apache.flink.api.java.io.CsvInputFormat.nextRecord(CsvInputFormat.java:1)
        at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:195)
        at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:245)
        at java.lang.Thread.run(Thread.java:745)
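
    To illustrate the control flow described above, here is a minimal, 
self-contained sketch. It does not use the real Flink classes: the class 
UnreachableCounterSketch, its nested ParseException, and the simplified 
parseRecord/readRecord bodies are hypothetical stand-ins, and it assumes 
parseRecord always throws on a short row (whether the real method can 
return false under some setting is not shown here).

```java
// Hypothetical stand-ins for the Flink classes; only the control flow matters.
public class UnreachableCounterSketch {

    // Stand-in for org.apache.flink.api.common.io.ParseException.
    static class ParseException extends RuntimeException {
        ParseException(String msg) {
            super(msg);
        }
    }

    static int invalidLineCount = 0;

    // Stand-in for GenericCsvInputFormat.parseRecord(): assumed to throw on a
    // short row instead of returning false.
    static boolean parseRecord(String line, int expectedFields) {
        String[] fields = line.split(",", -1);
        if (fields.length < expectedFields) {
            throw new ParseException("Row too short: " + line);
        }
        return true;
    }

    // Stand-in for CsvInputFormat.readRecord().
    static String readRecord(String line, int expectedFields) {
        if (parseRecord(line, expectedFields)) {
            return line;
        } else {
            // Never reached in this sketch: parseRecord throws before it
            // could return false, so the counter stays at 0.
            invalidLineCount++;
            return null;
        }
    }

    public static void main(String[] args) {
        try {
            readRecord("invalid line", 2);
        } catch (ParseException e) {
            System.out.println(e.getMessage() + " / invalidLineCount=" + invalidLineCount);
        }
    }
}
```

    In this sketch the counter can only be incremented if the exception is 
caught (or not thrown) before the else branch, for example by catching 
ParseException inside readRecord.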


> Skip comment lines in CSV input format. Allow user to specify comment 
> character.
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-1208
>                 URL: https://issues.apache.org/jira/browse/FLINK-1208
>             Project: Flink
>          Issue Type: Improvement
>          Components: Java API, Scala API
>    Affects Versions: 0.8-incubating
>            Reporter: Aljoscha Krettek
>            Assignee: Felix Neutatz
>            Priority: Minor
>              Labels: starter
>
> The current skipFirstLine is limited. Skipping arbitrary lines that start 
> with a certain character would be much more flexible while still easy to 
> implement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
