Paul,

I cannot thank you enough for your help and guidance! You are right that 
columnar readers will have a harder time balancing resource requirements and 
performance. Nevertheless, DRILL-6147 is a starting point; it should allow us 
to gain knowledge and refine our strategy accordingly as we go.

FYI - On a completely different topic: I was working on an EBF for the Parquet 
complex reader (though the bug itself was midstream). I was surprised by the 
level of overhead associated with nested data processing; the code was 
literally jumping from one column/level to another just to process a single 
value. There was a comment in the code suggesting that such processing be done 
in bulk, which I agree with. The moral of the story is that Drill is dealing 
with complex use cases that haven't been handled before (at least not with 
great success); as this shows, we started with simpler solutions only to 
realize they are inefficient. What is needed is to spend time understanding 
such use cases and incrementally fix those shortcomings.
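
To make the bulk-processing point concrete, here is a rough sketch of the two 
patterns; the types and methods below are invented for illustration and are 
not Drill's actual reader API:

import java.util.List;

// Hypothetical sketch (not Drill's reader code) contrasting the two patterns.
class NestedReadSketch {

  // Minimal stand-in for a nested column: depth() levels, values per row.
  interface NestedColumn {
    int depth();
    void writeOneValue(int level, int row);          // per-value write
    void writeValueRun(int level, int from, int to); // bulk write of a run
  }

  // Per-value style: for every row we hop across columns and nesting levels,
  // paying dispatch and branching cost for each single value written.
  static void readPerValue(List<NestedColumn> columns, int rowCount) {
    for (int row = 0; row < rowCount; row++) {
      for (NestedColumn col : columns) {
        for (int level = 0; level < col.depth(); level++) {
          col.writeOneValue(level, row);
        }
      }
    }
  }

  // Bulk style: stay on one column/level and write a whole run of values,
  // keeping the hot inner loop tight and cache-friendly.
  static void readBulk(List<NestedColumn> columns, int rowCount) {
    for (NestedColumn col : columns) {
      for (int level = 0; level < col.depth(); level++) {
        col.writeValueRun(level, 0, rowCount);
      }
    }
  }
}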

Regards,
Salim


> On Feb 11, 2018, at 3:44 PM, Paul Rogers <par0328@yahoo.com.INVALID> wrote:
> 
> One more thought:
>>> 3) Assuming that you go with the average batch size calculation approach,
> 
> The average batch size approach is a quick and dirty approach for non-leaf 
> operators that can observe an incoming batch to estimate row width. Because 
> Drill batches are large, the law of large numbers means that the average of a 
> large input batch is likely to be a good estimator for the average size of a 
> large output batch.
> 
> Note that this works only because non-leaf operators have an input batch to 
> sample. Leaf operators (readers) do not have this luxury. Hence the result 
> set loader uses the actual accumulated size for the current batch.
> 
> Also note that the average row method, while handy, is not optimal. It will, 
> in general, result in greater internal fragmentation than the result set 
> loader. Why? The result set loader packs vectors right up to the point where 
> the largest would overflow. The average row method works at the aggregate 
> level and will likely result in wasted space (internal fragmentation) in the 
> largest vector. Said another way, with the average row size method, we can 
> usually pack in a few more rows before the batch actually fills, and so we 
> end up with batches with lower "density" than the optimal. This is important 
> when the consuming operator is a buffering one such as sort.
> 
> The key reason Padma is using the quick & dirty average row size method is 
> not that it is ideal (it is not), but rather that it is, in fact, quick.
> 
> We do want to move to the result set loader over time so we get improved 
> memory utilization. And, it is the only way to control row size in readers 
> such as CSV or JSON in which we have no size information until we read the 
> data.
> - Paul   
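
For reference, a rough sketch of the arithmetic behind the two approaches Paul 
describes above; the class and method names are invented for illustration and 
are not Drill's actual batch-sizing code:

// Illustrative arithmetic only; not Drill's batch-sizing implementation.
class BatchSizingSketch {

  // Average-row-size method (non-leaf operators): sample the incoming batch,
  // derive an average row width, and size the outgoing batch from the budget.
  static int rowLimitFromAverage(long incomingBatchBytes, int incomingRowCount,
                                 long memoryBudgetBytes) {
    long avgRowBytes = Math.max(1, incomingBatchBytes / incomingRowCount);
    return (int) (memoryBudgetBytes / avgRowBytes);
  }

  // Result-set-loader style (leaf operators/readers): no incoming batch to
  // sample, so track actual accumulated bytes per vector and stop only when
  // the largest vector would exceed its limit.
  static boolean wouldOverflow(long[] vectorBytesSoFar, long[] nextValueBytes,
                               long perVectorLimitBytes) {
    for (int i = 0; i < vectorBytesSoFar.length; i++) {
      if (vectorBytesSoFar[i] + nextValueBytes[i] > perVectorLimitBytes) {
        return true;  // finish this batch before writing the next row
      }
    }
    return false;
  }
}

The difference between the two is the "density" Paul mentions: the 
average-based row limit generally leaves slack in the largest vector, while 
the per-vector overflow check packs it right up to the limit.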
