I am a new pyarrow/Parquet user. I ran the following test:
- Took an 18 MB zipped CSV file (approx. 1.5 million rows) containing one month of data.
- Saved it as Parquet, partitioned on date, with the default compression. The Parquet output is ~45 MB.

If I don't partition on date, the file size is ~30 MB.

I expected the Parquet output to be smaller than the zipped CSV file. Any comments? Thanks.
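
For reference, here is a minimal sketch of how a comparison like this could be reproduced. It is not the exact code I ran. The helper names `total_size` and `compare_layouts`, the paths, and the partition column name `date` are placeholders. A partitioned dataset is a directory containing one subdirectory per partition value, so the sketch sums the sizes of all files under it instead of looking at a single file.

```python
from pathlib import Path


def total_size(path):
    """Return the size in bytes of a file, or of all files under a directory."""
    p = Path(path)
    if p.is_file():
        return p.stat().st_size
    return sum(f.stat().st_size for f in p.rglob("*") if f.is_file())


def compare_layouts(csv_path, out_dir, partition_col="date"):
    """Write one CSV as a single Parquet file and as a partitioned dataset.

    csv_path, out_dir and partition_col are hypothetical placeholders.
    Returns the on-disk size in bytes of each layout.
    """
    # Imported here so total_size() can be used without pyarrow installed.
    import pyarrow.csv as pv
    import pyarrow.parquet as pq

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    table = pv.read_csv(str(csv_path))

    # Layout 1: a single file, default compression.
    single = out / "single.parquet"
    pq.write_table(table, str(single))

    # Layout 2: one subdirectory per distinct value of partition_col.
    partitioned = out / "partitioned"
    pq.write_to_dataset(table, root_path=str(partitioned),
                        partition_cols=[partition_col])

    return {
        "csv": total_size(csv_path),
        "single": total_size(single),
        "partitioned": total_size(partitioned),
    }
```

Calling `compare_layouts("data.csv", "out")` on a file with a `date` column would return the three sizes side by side, which makes it easy to see how much of the difference comes from partitioning alone.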
