hive set block size not working

2018-06-13 Thread cathy zhu
I wanted to find the optimal Parquet file size. It looks like no matter what value I set for the block size, Hive always produces the same Parquet file sizes. For the experiment, I was copying everything from one table into an identical dummy table. There are a lot of small files. Here are the

snappy compression & set parquet size not working on hive

2018-06-04 Thread cathy zhu
I created a Hive table and used an insert select to load existing Impala data into it. I noticed two things. 1. The data size is more than twice the size of the old data. The old data was compressed by Impala. 2. No matter how large I set the Parquet block size, Hive always generates Parquet files with