I'm hoping someone can help me understand the mechanics of a query against a Parquet file. We're finding that selecting a single column for one record from a file with more than 40 million records is extremely fast, typically under a second. However, running a 'select *' query for the same record with the same criteria is much slower, taking more than 60 seconds.
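To make the comparison concrete, here is a toy sketch of the two access patterns. It assumes only that Parquet is column-oriented, so each column's values are stored together. It is a simplification for discussion, not how any particular Parquet reader works. The names `build_toy_file` and `bytes_scanned`, the `payload_*` columns, and the 64-byte row width are all hypothetical.

```python
import struct


def build_toy_file(num_rows, wide_columns):
    """Build a toy 'file': a dict mapping column name -> contiguous bytes.

    This only imitates the column-oriented idea; real Parquet files add
    row groups, pages, encodings, compression, and footer metadata.
    """
    # One narrow column of 8-byte integers.
    columns = {"id": struct.pack(f"<{num_rows}q", *range(num_rows))}
    for name in wide_columns:
        # Hypothetical wide payload: 64 bytes per row.
        columns[name] = b"x" * (64 * num_rows)
    return columns


def bytes_scanned(toy_file, selected):
    """Bytes a reader touches if it reads only the selected columns."""
    return sum(len(toy_file[name]) for name in selected)


toy = build_toy_file(
    num_rows=1000,
    wide_columns=[f"payload_{i}" for i in range(20)],
)

one_column = bytes_scanned(toy, ["id"])       # like selecting a single column
all_columns = bytes_scanned(toy, list(toy))   # like 'select *'

print("single column:", one_column, "bytes")
print("all columns:  ", all_columns, "bytes")
```

In this toy model, the 'select *' case touches far more bytes than the single-column case even though the filter is identical. Whether that alone explains a jump from under a second to over 60 seconds on our real files is exactly what I'm unsure about.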
This might be expected behavior, but a better understanding of why it occurs would help us structure our data files more effectively as we create them. Thanks.
