Thanks Xinli for confirming. Looks like Vinoo already has a fix, so I will look out for that PR.
--
Thanks & Regards
Deepak Gangwar

From: Vinoo Ganesh <[email protected]>
Date: Monday, 21 February 2022 at 2:42 AM
To: [email protected] <[email protected]>
Subject: Re: Get uncompressed size of parquet file via parquet-cli

Ironically, I've needed this and added it recently on my fork of my parquet. Happy to contribute it back:

https://issues.apache.org/jira/browse/PARQUET-2129
https://github.com/apache/parquet-mr/pull/949

Thanks,
Vinoo Ganesh | [email protected] <[email protected]>

On Sun, Feb 20, 2022 at 1:18 PM Xinli shang <[email protected]> wrote:

> You seem right. The 'uncompressedSize' is having the value but not printed
> out anywhere. Do you want to make a fix?
>
> On Thu, Feb 17, 2022 at 3:29 AM Deepak Gangwar <[email protected]>
> wrote:
>
> > Hi folks,
> >
> > I was using parquet-tools to see the data or metadata of parquet files. I
> > noticed that parquet-tools has been deprecated and removed from the latest
> > branch and it is replaced by parquet-cli. Most of my use-cases are
> > fulfilled by parquet-cli but there is 1 thing missing in parquet-cli. I am
> > not able to find any way to get the uncompressed size of the data present.
> > “parquet-tools size -u” gave the uncompressed size but there is no
> > equivalent parquet-cli command and “parquet-cli meta” only prints the
> > compressed size.
> >
> > I looked around in the codebase and noticed that uncompressedSize is
> > assigned to a variable in meta command but it is not used or printed
> > anywhere [1]. I think usage of the variable is missed but I am not able to
> > find any open issue in jira so I might be completely wrong here. Please
> > confirm whether this is actually an issue and is there any other way to get
> > uncompressed size that I am missing?
> >
> > [1]
> > https://github.com/apache/parquet-mr/blob/master/parquet-cli/src/main/java/org/apache/parquet/cli/commands/ParquetMetadataCommand.java#L123
> >
> > --
> > Thanks & Regards
> > Deepak Gangwar
>
> --
> Xinli Shang
