I have 10k complex Parquet files with large footers. The schema is the
same for all of these files. Drill ended up generating a cache file that
is 2.26 GB. After that, a simple count(*) query hung in sqlline and never
returned.

In this specific case, I compared the footers of 2 files and many parts
were identical. Would it make sense to store the common information once
and override only the file-specific details?
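
To make the idea concrete, here is a rough sketch in Python. It is not
Drill's actual cache format or code: the footer dicts, the field names
(schema, created_by, row_count) and the helpers split_common/rebuild are
all made up for illustration. It only shows entries shared by every file
being stored once, with each file keeping just what differs.

```python
def split_common(footers):
    """Split {file_name: footer_dict} into shared entries and per-file overrides.

    The footer layout here is hypothetical, not Drill's real metadata format.
    """
    if not footers:
        return {}, {}
    files = list(footers.values())
    first = files[0]
    # An entry is common only if every file has the key with an equal value.
    common = {
        key: value
        for key, value in first.items()
        if all(key in f and f[key] == value for f in files[1:])
    }
    # Each file keeps only the entries that are not covered by the common part.
    overrides = {
        name: {k: v for k, v in footer.items() if k not in common}
        for name, footer in footers.items()
    }
    return common, overrides


def rebuild(common, override):
    """Recreate one file's full footer from the common part plus its overrides."""
    merged = dict(common)
    merged.update(override)
    return merged


# Made-up footers for two files that share a schema.
footers = {
    "a.parquet": {"schema": ["id", "name"], "created_by": "parquet-mr", "row_count": 100},
    "b.parquet": {"schema": ["id", "name"], "created_by": "parquet-mr", "row_count": 250},
}
common, overrides = split_common(footers)
```

With 10k files that share a schema, the common part would be written once
and each per-file entry would shrink to things like row counts and
statistics, assuming those are the parts that actually differ.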

- Rahul
