Hi, I'm converting my Pig dataset of 2700+ columns to Parquet format.
I set parquet.block.size to 1 GB, but I'm still getting OOM errors. Is that still too small? I'm guessing there's only one row group, which is the case for another dataset of mine with 600+ columns.

Is there a setting to specify the number of row groups?

Thanks,
--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
