I recommend trying different values using the parquet-cli. That's an easy
way to see how different row group and page sizes perform. That's what I do
to tune all of our tables.

rb

On Fri, Jan 12, 2018 at 10:43 AM, ALeX Wang <[email protected]> wrote:

> Hi,
>
> I'm using parquet to store a big table (400+ columns), and most of the
> columns will be null.
>
> Is there a recommended row group size and number of row groups per
> parquet file for my use case?  Or is there a reference/paper that I could
> read myself?
>
>
> Thanks,
> --
> Alex Wang,
> Open vSwitch developer
>



-- 
Ryan Blue
Software Engineer
Netflix
