Got it. Thanks a lot!
Tianqi

-----Original Message-----
From: Ryan Blue [mailto:[email protected]] 
Sent: Monday, April 13, 2015 4:59 PM
To: [email protected]
Subject: Re: PARQUET_FILE_SIZE & parquet.block.size & dfs.blocksize

On 04/13/2015 03:47 PM, Tianqi Tong wrote:
> Hi Ryan,
> Then back to the original topic: it should be okay if I break a Parquet file 
> into multiple HDFS blocks, right?
> Because when I was querying via Impala, there's a warning like: Parquet file 
> should not be split into multiple hdfs-blocks.
>
> Thanks!
> Tianqi

It is fine to write data as multiple blocks, but Impala performance will be 
better if you keep data in a single block for now. This is something that the 
Impala team is working on.

rb


--
Ryan Blue
Software Engineer
Cloudera, Inc.
