It depends on your queries, the data structure, and so on. Flat is generally
better, but if your query filters on the highest level of the structure, a
nested layout may perform better. It really depends on the workload.
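
To make the "flat" option concrete, here is a minimal sketch in plain Python
(no Spark needed) of turning a nested record into flat, dotted column names
before writing it out. The record, its field names, and the flatten() helper
are hypothetical and only for illustration; they are not taken from the
original question.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Recursively turn nested dicts into one flat dict with dotted keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        # Build the column name from the path walked so far.
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the nested struct, carrying the path along.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical record nested four levels deep, as in the question.
nested = {
    "order_id": 1,
    "customer": {
        "name": "Ann",
        "address": {"city": "Sydney", "geo": {"lat": -33.87}},
    },
}

flat = flatten(nested)
print(flat)
```

Whether the flat or the nested form is faster for you still has to be measured
against your own queries.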

> On 30. Apr 2017, at 10:19, Zeming Yu <zemin...@gmail.com> wrote:
> 
> Hi,
> 
> We're building a parquet based data lake. I was under the impression that 
> flat files are more efficient than deeply nested files (say 3 or 4 levels 
> down). Is that correct?
> 
> Thanks,
> Zeming

