[ 
https://issues.apache.org/jira/browse/ORC-525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reassigned ORC-525:
---------------------------------

    Assignee: Owen O'Malley
     Summary: Users must close Readers in ORC 1.5.6  (was: OrcFile.createReader 
may leave open stream)

Spark users noticed that Spark generated twice as many file opens on the 
NameNode as Hive did for this pattern:

{code}
Reader reader = OrcFile.createReader(...);
RecordReader rows = reader.rows(...);
{code} 

This was fixed in ORC-498, but applications must now close any Reader objects 
that are not used to create RecordReaders.
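A minimal sketch of the updated usage, assuming a Hadoop {{Configuration}} and 
{{Path}} are already in scope; with the 1.5.6 behavior, a Reader that is only 
used for metadata should be closed explicitly, e.g. with try-with-resources:

{code}
// Sketch only: `conf` (Configuration) and `path` (Path) are assumed to exist.
// Since ORC-498 the Reader keeps the file handle open, so close it yourself
// when you never call reader.rows().
try (Reader reader = OrcFile.createReader(path,
        OrcFile.readerOptions(conf))) {
  long rowCount = reader.getNumberOfRows();  // metadata-only access
  System.out.println("rows: " + rowCount);
}
// Per this issue's description, a Reader that IS used to create a
// RecordReader does not need a separate close: closing the RecordReader
// releases the handle.
{code}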

> Users must close Readers in ORC 1.5.6
> -------------------------------------
>
>                 Key: ORC-525
>                 URL: https://issues.apache.org/jira/browse/ORC-525
>             Project: ORC
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 1.5.6
>            Reporter: Dongjoon Hyun
>            Assignee: Owen O'Malley
>            Priority: Major
>
> Unlike 1.5.5, ORC 1.5.6-rc1 seems to have the following issue, which causes 
> Apache Spark unit test failures.
> {code}
> $ build/sbt "sql/testOnly *.SQLQuerySuite"
> ...
> [info]   Cause: java.lang.Throwable:
> [info]   at 
> org.apache.spark.DebugFilesystem$.addOpenStream(DebugFilesystem.scala:36)
> [info]   at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:70)
> [info]   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
> [info]   at 
> org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:535)
> [info]   at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:368)
> [info]   at org.apache.orc.OrcFile.createReader(OrcFile.java:343)
> {code}
> ORC-498 (ReaderImpl and RecordReaderImpl open separate file handles) seems to 
> be the cause of this change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)