Nikola created ORC-633:
--------------------------

             Summary: Skip broken ORC files when reading
                 Key: ORC-633
                 URL: https://issues.apache.org/jira/browse/ORC-633
             Project: ORC
          Issue Type: Improvement
          Components: Reader
    Affects Versions: 1.6.3
            Reporter: Nikola


I am reading a directory of ORC files with Flink. However, some of the files 
are broken.

I get exceptions like this:

org.apache.orc.FileFormatException: Not a valid ORC file /user/orc/0.orc (maxFileLength= 9223372036854775807)
    at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:546)
    at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370)
    at org.apache.orc.OrcFile.createReader(OrcFile.java:342)
    at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:225)
    at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:63)
    at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
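
Until the reader can skip such files itself, one possible workaround is to 
filter the input before handing it to the job. The following is only a rough 
sketch, not part of ORC or Flink: it assumes the files sit on a local 
filesystem reachable through java.nio (an HDFS path like the one above would 
need the Hadoop FileSystem API instead), that they use the .orc extension, and 
that Java 11+ is available. The class and method names (OrcPreFilter, 
looksLikeOrc, readableCandidates) are hypothetical. It only checks the 
three-byte "ORC" magic at the start of each file, so a file that is truncated 
further in would still pass and fail later in the reader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not an ORC or Flink API.
class OrcPreFilter {
    // ORC files begin with the three-byte magic "ORC".
    private static final byte[] MAGIC = "ORC".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the file starts with the ORC magic bytes. */
    static boolean looksLikeOrc(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            // readNBytes returns fewer bytes only if the file is shorter.
            byte[] head = in.readNBytes(MAGIC.length);
            return Arrays.equals(head, MAGIC);
        } catch (IOException e) {
            // Assumption: an unreadable file is treated as broken.
            return false;
        }
    }

    /** Lists *.orc files in dir that pass the header check; logs the rest. */
    static List<Path> readableCandidates(Path dir) throws IOException {
        List<Path> keep = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.orc")) {
            for (Path f : files) {
                if (looksLikeOrc(f)) {
                    keep.add(f);
                } else {
                    System.err.println("Skipping invalid ORC file: " + f);
                }
            }
        }
        return keep;
    }
}
```

The returned list could then be passed to the job as its explicit set of input 
paths. A stricter check would attempt OrcFile.createReader on each file and 
catch FileFormatException, at the cost of opening every file twice.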



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
