Querying parquet files

Yousef Lasi Mon, 06 Jul 2015 19:11:36 -0700

I'm hoping someone can expand my understanding of the mechanics of a query 
against a parquet file. We're finding that selecting a single column in a 
record from a file with > 40 million records is extremely fast - typically less 
than a second. However, running a 'select *' query against the same record 
using the same criteria is much slower - as in greater than 60 seconds.


This might be expected behavior, but a better understanding of why it occurs 
should help us optimize the structure of our data files as we create them.
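
To make the comparison concrete, here is a toy sketch of the difference between 
the two queries. It assumes that the cost is dominated by how many column chunks 
have to be read and decoded, which is how a columnar format is generally expected 
to behave. It is not how Parquet or any query engine is actually implemented; the 
layout, the function names (write_columnar, read_columns) and the sample columns 
are all hypothetical.

```python
import json

# Hypothetical toy layout: each column is serialized as its own chunk,
# loosely mimicking a format that groups values by column instead of by row.
def write_columnar(rows, columns):
    return {col: json.dumps([row[col] for row in rows]).encode() for col in columns}

def read_columns(chunks, wanted, predicate_col, predicate_val):
    """Return matching rows restricted to `wanted`, plus the bytes decoded."""
    # Only the requested columns and the filter column are touched.
    needed = set(wanted) | {predicate_col}
    bytes_read = sum(len(chunks[c]) for c in needed)
    decoded = {c: json.loads(chunks[c]) for c in needed}
    hits = [i for i, v in enumerate(decoded[predicate_col]) if v == predicate_val]
    result = [{c: decoded[c][i] for c in wanted} for i in hits]
    return result, bytes_read

# Made-up sample data: two narrow columns and two wide ones.
columns = ["id", "name", "payload", "notes"]
rows = [
    {"id": i, "name": f"user{i}", "payload": "x" * 200, "notes": "y" * 100}
    for i in range(1000)
]
chunks = write_columnar(rows, columns)

# "select name ... where id = 42" versus "select * ... where id = 42"
one_col, one_col_bytes = read_columns(chunks, ["name"], "id", 42)
all_cols, all_cols_bytes = read_columns(chunks, columns, "id", 42)
print(one_col_bytes, all_cols_bytes)
```

In this sketch both queries return a single record, but the 'select *' case has 
to decode every column chunk, including the wide ones, while the single-column 
case only decodes the requested column and the filter column.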

 Thanks
