Hey there,

I am currently doing some performance measurements on Drill. In my case it's a single Parquet file with a single local Drillbit.
Now in one case I have unexpected results, and I'm curious whether somebody has a good explanation for them. I have a file with 10 million rows and 9 columns, and I'm running a select statement to find one single row.

Runtime with select * : ~14.79 s
Runtime with select(filterField) : ~1.5 s

I'm surprised that there is so much variance depending on the fields I select, since I thought Drill needs most of its time to find that one element and then deserializes the other fields only on a hit. But 10 s for deserializing 8 more fields seems way too much!?

Best,
Johannes
