[ https://issues.apache.org/jira/browse/ORC-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412852#comment-16412852 ]
Sandeep More commented on ORC-305:
----------------------------------

It looks like streamFactory does not expose getPhysicalWriter(): the [WriterContext|https://github.com/apache/orc/blob/ded204a4a10bfad1ed739fc98f612a41005640c5/java/core/src/java/org/apache/orc/impl/writer/WriterContext.java] interface that TreeWriterBase uses for streamFactory does not have a getPhysicalWriter() method. That method is exposed by a different WriterContext, [OrcFile.WriterContext|https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/OrcFile.java#L330], but the TreeWriterBase class cannot access it.

One option would be to add a getWriter() method to the [org.apache.orc.impl.writer.WriterContext|https://github.com/apache/orc/blob/ded204a4a10bfad1ed739fc98f612a41005640c5/java/core/src/java/org/apache/orc/impl/writer/WriterContext.java] interface, so that we could call TreeWriterBase.streamFactory.getWriter() to get the PhysicalWriter.

Let me know your thoughts!

> Add column statistics for the size on disk
> ------------------------------------------
>
>                 Key: ORC-305
>                 URL: https://issues.apache.org/jira/browse/ORC-305
>             Project: ORC
>          Issue Type: Test
>            Reporter: Owen O'Malley
>            Assignee: Sandeep More
>            Priority: Major
>
> It would be great to have the size on disk of each column.
> You can generate this by adding up the sizes of the dictionary and data streams.
> It is only relevant at the stripe and file level.
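
For illustration, the getWriter() proposal from the comment could be sketched roughly as below. This is only a minimal, self-contained sketch and not the actual ORC code: the nested PhysicalWriter, WriterContext and ColumnWriter types are simplified stand-ins for the real ORC classes, and getFileBytes(), recordStream() and InMemoryPhysicalWriter are hypothetical names introduced just for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class SizeOnDiskSketch {

  /** Hypothetical stand-in for ORC's PhysicalWriter; only the part needed here. */
  interface PhysicalWriter {
    // Assumed helper: total bytes written so far for one column.
    long getFileBytes(int columnId);
  }

  /** Hypothetical stand-in for org.apache.orc.impl.writer.WriterContext. */
  interface WriterContext {
    // The accessor proposed in the comment.
    PhysicalWriter getWriter();
  }

  /** Hypothetical stand-in for TreeWriterBase, which holds a streamFactory. */
  static class ColumnWriter {
    private final int columnId;
    private final WriterContext streamFactory;

    ColumnWriter(int columnId, WriterContext streamFactory) {
      this.columnId = columnId;
      this.streamFactory = streamFactory;
    }

    // With getWriter() available, the tree writer can ask for its own size on disk.
    long sizeOnDisk() {
      return streamFactory.getWriter().getFileBytes(columnId);
    }
  }

  /** Toy PhysicalWriter that adds up the bytes of every stream recorded per column. */
  static class InMemoryPhysicalWriter implements PhysicalWriter {
    private final Map<Integer, Long> bytesPerColumn = new HashMap<>();

    void recordStream(int columnId, long bytes) {
      bytesPerColumn.merge(columnId, bytes, Long::sum);
    }

    @Override
    public long getFileBytes(int columnId) {
      return bytesPerColumn.getOrDefault(columnId, 0L);
    }
  }

  public static void main(String[] args) {
    InMemoryPhysicalWriter physical = new InMemoryPhysicalWriter();
    physical.recordStream(1, 100L); // e.g. dictionary stream of column 1
    physical.recordStream(1, 28L);  // e.g. data stream of column 1

    WriterContext context = () -> physical;
    ColumnWriter writer = new ColumnWriter(1, context);
    System.out.println(writer.sizeOnDisk()); // prints 128
  }
}
```

In this sketch the size on disk of a column is simply the sum of the sizes of its dictionary and data streams, as the issue description suggests; how the real PhysicalWriter would track those bytes is left open.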