Why do you want larger files? Doesn't the resulting Parquet file contain
all the data in the original TSV file?
Cheng
On 10/7/15 11:07 AM, Younes Naguib wrote:
Hi,
I'm reading a large TSV file and creating Parquet files using Spark SQL:
insert overwrite
table tbl partition(year, month, day)....
Select .... from tbl_tsv;
This works nicely, but generates small Parquet files (15 MB).
I'd like to generate larger files. Any idea how to address this?
Thanks,
Younes Naguib
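
For context on the question above: the number of files written per table partition generally follows the number of tasks writing into it, so fewer writers per partition tends to mean larger files (for instance by adding DISTRIBUTE BY on the partition columns to the SELECT, or by calling repartition or coalesce on the DataFrame before writing). The sketch below only does the arithmetic for choosing how many files to aim for. The function name, the 128 MB target, and the "40 files per day" figure are assumptions for illustration and do not come from the thread.

```python
import math

MB = 1024 * 1024


def estimate_file_count(total_bytes: int, target_file_bytes: int = 128 * MB) -> int:
    """Estimate how many files of roughly target_file_bytes would hold total_bytes.

    Hypothetical helper: the result could be used as the number of writers
    per partition (for example, the argument to a repartition call).
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_file_bytes must be > 0")
    # Always write at least one file, even for an empty or tiny partition.
    return max(1, math.ceil(total_bytes / target_file_bytes))


# Assumed example: one day's partition currently written as 40 files of 15 MB each.
current_bytes = 40 * 15 * MB
print(estimate_file_count(current_bytes))  # 600 MB / 128 MB, rounded up -> 5
```

Parquet sizes depend on compression and encoding, so the measured size of the existing output is a better input here than the size of the source TSV.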