Hi folks,

I have been using parquet-tools to inspect the data and metadata of Parquet 
files. I noticed that parquet-tools has been deprecated and removed from the 
latest branch, and that parquet-cli replaces it. parquet-cli covers most of my 
use cases, but one thing is missing: I cannot find any way to get the 
uncompressed size of the data. “parquet-tools size -u” reported the 
uncompressed size, but there is no equivalent parquet-cli command, and 
“parquet-cli meta” prints only the compressed size.

I looked through the codebase and noticed that the meta command assigns 
uncompressedSize to a variable but never uses or prints it [1]. I suspect 
that printing it was simply overlooked, but I could not find an open issue 
in Jira, so I may be completely wrong here. Could you please confirm whether 
this is actually an issue, and whether there is another way to get the 
uncompressed size that I am missing?
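
To make the request concrete, below is a minimal Java sketch of the totals 
that a size report could include. RowGroupSize is a hypothetical stand-in, 
not a parquet-mr class, and the byte counts are made-up sample values; in 
parquet-mr the real numbers would presumably come from each row group's 
footer metadata.

```java
import java.util.Arrays;
import java.util.List;

class UncompressedSizeSketch {

    // Hypothetical stand-in for the size fields of one row group.
    // Not a parquet-mr type; it only carries the two byte counts.
    static final class RowGroupSize {
        final long compressedBytes;
        final long uncompressedBytes;

        RowGroupSize(long compressedBytes, long uncompressedBytes) {
            this.compressedBytes = compressedBytes;
            this.uncompressedBytes = uncompressedBytes;
        }
    }

    // Sum the compressed byte counts over all row groups.
    static long totalCompressed(List<RowGroupSize> rowGroups) {
        long total = 0L;
        for (RowGroupSize rg : rowGroups) {
            total += rg.compressedBytes;
        }
        return total;
    }

    // Sum the uncompressed byte counts over all row groups.
    static long totalUncompressed(List<RowGroupSize> rowGroups) {
        long total = 0L;
        for (RowGroupSize rg : rowGroups) {
            total += rg.uncompressedBytes;
        }
        return total;
    }

    // Render both totals on one line, side by side.
    static String describe(List<RowGroupSize> rowGroups) {
        return "compressed: " + totalCompressed(rowGroups)
            + " bytes, uncompressed: " + totalUncompressed(rowGroups) + " bytes";
    }

    public static void main(String[] args) {
        // Made-up sample values for two row groups.
        List<RowGroupSize> rowGroups = Arrays.asList(
            new RowGroupSize(1200L, 4800L),
            new RowGroupSize(800L, 3100L));
        System.out.println(describe(rowGroups));
    }
}
```

Reporting both totals together would let users judge the compression ratio 
without a separate tool.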


[1] 
https://github.com/apache/parquet-mr/blob/master/parquet-cli/src/main/java/org/apache/parquet/cli/commands/ParquetMetadataCommand.java#L123
--
Thanks & Regards
Deepak Gangwar
