liyunzhang_intel created PARQUET-1117:
-----------------------------------------
Summary: ParquetRecordWriter does not provide interfaces like
getRowCount() and getRawDataSize(), unlike org.apache.orc.Writer
Key: PARQUET-1117
URL: https://issues.apache.org/jira/browse/PARQUET-1117
Project: Parquet
Issue Type: Bug
Reporter: liyunzhang_intel
Hive with ORC can update statistics such as rowCount and rawDataSize after
loading data into a table. Hive with Parquet cannot, and needs an analyze
command such as "analyze table xxx compute statistics noscan" to update these
two statistics. The reason is that the ParquetRecordWriter used in Hive does
not provide interfaces like getRowCount() and getRawDataSize(), while
org.apache.orc.Writer provides these
[two interfaces|https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/Writer.java#L68].
Does anyone know how to get rowCount and rawDataSize in ParquetRecordWriter?
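
To make the request concrete, here is a rough sketch of the kind of bookkeeping a writer would need in order to report these two numbers. It is not the Parquet or Hive API: SimpleRecordWriter and StatsTrackingWriter are hypothetical names invented for illustration, records are plain strings, and "raw data size" is assumed to mean the UTF-8 byte length of each record before any encoding or compression.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-method writer contract; stands in for a real record writer. */
interface SimpleRecordWriter {
    void write(String record);
}

/** Hypothetical decorator that counts rows and raw bytes as records pass through. */
final class StatsTrackingWriter implements SimpleRecordWriter {
    private final SimpleRecordWriter delegate;
    private long rowCount;
    private long rawDataSize;

    StatsTrackingWriter(SimpleRecordWriter delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(String record) {
        // Forward the record first so a failed write is not counted.
        delegate.write(record);
        rowCount++;
        // Assumption: raw size is the UTF-8 length of the unencoded record.
        rawDataSize += record.getBytes(StandardCharsets.UTF_8).length;
    }

    long getRowCount() {
        return rowCount;
    }

    long getRawDataSize() {
        return rawDataSize;
    }
}

/** Small demonstration that collects records in memory instead of a file. */
final class StatsTrackingWriterDemo {
    static long[] run() {
        List<String> sink = new ArrayList<>();
        StatsTrackingWriter writer = new StatsTrackingWriter(sink::add);
        writer.write("ab");
        writer.write("cde");
        return new long[] {writer.getRowCount(), writer.getRawDataSize()};
    }
}
```

With accessors like these, the caller could read both values once the writer is closed and store them as table statistics, instead of running a separate analyze command.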