It depends on your queries, the data structure, etc. Generally flat is better, but if your queries filter on the highest-level fields, you may get better performance with a nested structure. It really depends on your use case.
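If you do decide to go flat, here is a minimal sketch of what flattening a 3-level record could look like before it is written out. This is plain Python, not a Spark or Parquet API; the `flatten` helper, the sample record and the `_` separator are all made up for illustration.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "", sep: str = "_") -> Dict[str, Any]:
    """Hypothetical helper: collapse nested dicts into one level of joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        # Build the column name from the path walked so far.
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested structures, carrying the path as a prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Made-up record, nested 3 levels deep.
nested = {
    "id": 1,
    "customer": {"name": "Ann", "address": {"city": "Sydney", "zip": "2000"}},
}

flat = flatten(nested)
print(flat)
```

Each leaf becomes its own top-level column (for example `customer_address_city`), so a filter on it no longer has to go through the parent structures. Whether that actually helps depends on your queries, as said above.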
> On 30. Apr 2017, at 10:19, Zeming Yu <zemin...@gmail.com> wrote:
>
> Hi,
>
> We're building a parquet based data lake. I was under the impression that
> flat files are more efficient than deeply nested files (say 3 or 4 levels
> down). Is that correct?
>
> Thanks,
> Zeming