Re: Generating a fixed size parquet file when doing Insert select *

2020-03-25 Thread Tim Armstrong
I believe that statement about the estimation is true. PARQUET_FILE_SIZE is also an upper bound, and the actual file size depends on the amount of data being written to that partition on a particular Impala daemon - if you have less than a file's worth of data written on that node, you will only get a single (maybe
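
To make the "upper bound" point concrete, here is a rough sketch in Python. It is not Impala's writer logic, only a simplified model that assumes a node cuts the data it writes for one partition into files no larger than the limit. The function name sketch_file_sizes and the byte counts are made up for illustration. Real files can come out smaller than this model suggests, because the writer has to estimate the final size while it is writing.

```python
from typing import List

MB = 1024 * 1024


def sketch_file_sizes(bytes_on_node: int, parquet_file_size: int) -> List[int]:
    """Hypothetical model: split one node's data for one partition into
    files capped at parquet_file_size. Not Impala's actual algorithm."""
    if bytes_on_node < 0 or parquet_file_size <= 0:
        raise ValueError("sizes must be non-negative and the cap positive")
    # Number of files that reach the cap, plus whatever is left over.
    full_files, remainder = divmod(bytes_on_node, parquet_file_size)
    sizes = [parquet_file_size] * full_files
    if remainder:
        # The leftover data ends up in one smaller file.
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    cap = 256 * MB
    # Less than one file's worth on this node: a single small file.
    print([s // MB for s in sketch_file_sizes(100 * MB, cap)])
    # More than one file's worth: capped files plus a smaller tail.
    print([s // MB for s in sketch_file_sizes(600 * MB, cap)])
```

Under this model the limit only caps file size. It cannot make a file bigger than the data that one node holds for the partition, so spreading the same insert across more nodes or more partitions gives smaller files.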

RE: Generating a fixed size parquet file when doing Insert select *

2020-03-25 Thread Antoni Ivanov
Hi, the Impala team can correct me, but even if you set PARQUET_FILE_SIZE to 256MB, Impala may, and likely will, create smaller files (e.g. 128MB or even smaller). As far as I understand, that’s because when Impala is writing the Parquet file, it’s making a guess about the potential file size