liyunzhang_intel created PARQUET-1117:
-----------------------------------------

             Summary: ParquetRecordWriter does not provide interfaces like 
getRowCount(), getRawDataSize() as org.apache.orc.Writer does
                 Key: PARQUET-1117
                 URL: https://issues.apache.org/jira/browse/PARQUET-1117
             Project: Parquet
          Issue Type: Bug
            Reporter: liyunzhang_intel


Hive with ORC can update statistics such as rowCount and rawDataSize after 
loading data into a table. Hive with Parquet cannot, and the user has to run an 
analyze command such as "analyze table xxx compute statistics noscan" to update 
these two statistics. The reason is that the ParquetRecordWriter used in Hive 
does not provide interfaces like getRowCount() and getRawDataSize(), whereas 
org.apache.orc.Writer provides these 
[two interfaces|https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/Writer.java#L68].
Does anyone know how to get rowCount and rawDataSize in ParquetRecordWriter?
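
To make the request concrete, here is a minimal sketch of the kind of interface 
being asked for. It is not Parquet or Hive code: RecordSink and CountingSink are 
hypothetical names, records are modelled as plain strings, and rawDataSize is 
assumed to be the UTF-8 byte length of each record, which may differ from how 
ORC computes its own value.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a record writer; not a Parquet or Hive API. */
interface RecordSink {
    void write(String record);
}

/** Wraps a sink and tracks both statistics as records pass through. */
final class CountingSink implements RecordSink {
    private final RecordSink delegate;
    private long rowCount;
    private long rawDataSize;

    CountingSink(RecordSink delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(String record) {
        delegate.write(record);
        rowCount++;
        // Assumption: raw size is the UTF-8 encoded length of the record.
        rawDataSize += record.getBytes(StandardCharsets.UTF_8).length;
    }

    long getRowCount() {
        return rowCount;
    }

    long getRawDataSize() {
        return rawDataSize;
    }
}

class CountingSinkDemo {
    public static void main(String[] args) {
        // An in-memory list plays the role of the underlying file writer.
        List<String> written = new ArrayList<>();
        CountingSink sink = new CountingSink(written::add);
        sink.write("alice,30");
        sink.write("bob,25");
        System.out.println("rowCount=" + sink.getRowCount());
        System.out.println("rawDataSize=" + sink.getRawDataSize());
    }
}
```

With accessors like these, the caller could read both values once the writer is 
closed and publish them as table statistics without a separate analyze step.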




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
