You appear to be right: 'uncompressedSize' is assigned a value but is never printed anywhere. Would you like to submit a fix?
On Thu, Feb 17, 2022 at 3:29 AM Deepak Gangwar <[email protected]> wrote:

> Hi folks,
>
> I was using parquet-tools to see the data or metadata of parquet files. I
> noticed that parquet-tools has been deprecated and removed from the latest
> branch and it is replaced by parquet-cli. Most of my use-cases are
> fulfilled by parquet-cli but there is 1 thing missing in parquet-cli. I am
> not able to find any way to get the uncompressed size of the data present.
> “parquet-tools size -u” gave the uncompressed size but there is no
> equivalent parquet-cli command and “parquet-cli meta” only prints the
> compressed size.
>
> I looked around in the codebase and noticed that uncompressedSize is
> assigned to a variable in meta command but it is not used or printed
> anywhere [1]. I think usage of the variable is missed but I am not able to
> find any open issue in jira so I might be completely wrong here. Please
> confirm whether this is actually an issue and is there any other way to get
> uncompressed size that I am missing?
>
>
> [1]
> https://github.com/apache/parquet-mr/blob/master/parquet-cli/src/main/java/org/apache/parquet/cli/commands/ParquetMetadataCommand.java#L123
> --
> Thanks & Regards
> Deepak Gangwar
>
>

--
Xinli Shang
