Hi all,

I am trying to figure out whether a large Parquet file can be striped across
multiple small files based on a row group chunk size, so that each stripe
naturally ends up containing the data pages of a single row group. So, if I
tell my writer to "write a Parquet file in chunks of 128 MB" (assuming my row
groups are around 128 MB each), each chunk would end up being a self-contained
row group, except perhaps the last chunk, which would also hold the footer
contents. Is this possible? Can we fix the row group size (the amount of disk
space a row group uses) while writing Parquet files? Thanks a lot.
