Hi,
I'm new to the list so apologies up front if this is the wrong place to
post this (glad to take input).
I converted a large set of CSV files to Parquet files using Drill. I
tried this both with Snappy compression and uncompressed.
Subsequent reads with a query such as 'select count(*) from dfs.`mydir`
where `somecolumn` > 47;' always issue 16k reads. According to Flight
Recorder, these seem to come from the Page Header in the Parquet files.
Does anyone know a way to increase the size of these 16k reads? I'm
thinking about writing my own Parquet files, but thought I'd first ask
whether there is a config option for it. I'd also like to know whether
writing my own Parquet files with bigger sizes in the Page Header would
help.
Thanks in advance,
Mark