[
https://issues.apache.org/jira/browse/IMPALA-8786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sahil Takiar resolved IMPALA-8786.
----------------------------------
Fix Version/s: Not Applicable
Resolution: Later
After a lot of perf profiling (some results are in IMPALA-8888), I have
concluded that result spooling does not add significant overhead; in some cases
it actually *improves* performance (seen mostly when selecting a large number
of rows from Impala). So while there are some interesting ideas for possible
optimizations here, I am going to close this JIRA and mark the 'Resolution' as
'Later'. We can revisit these optimizations later if we think they add
significant benefit.
> BufferedPlanRootSink should directly write to a QueryResultSet if one is
> available
> ----------------------------------------------------------------------------------
>
> Key: IMPALA-8786
> URL: https://issues.apache.org/jira/browse/IMPALA-8786
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Fix For: Not Applicable
>
>
> {{BufferedPlanRootSink}} uses a {{RowBatchQueue}} to buffer {{RowBatch}}-es
> and then the consumer thread reads them and writes them to a given
> {{QueryResultSet}}. Implementations of {{RowBatchQueue}} might end up copying
> the buffered {{RowBatch}}-es (e.g. if the queue is backed by a
> {{BufferedTupleStream}}). An optimization would be for the producer thread to
> directly write to the consumer {{QueryResultSet}}. This optimization would
> only be triggered if (1) the queue is empty, and (2) the consumer thread has
> a {{QueryResultSet}} available for writing.
> This "fast path" is useful in a few different scenarios:
> * If the consumer is faster at reading rows than the producer is at
> sending them; in this case, the overhead of buffering rows in a
> {{RowBatchQueue}} can be completely avoided
> * For queries that return under 1024 rows, it's likely that the consumer will
> produce a {{QueryResultSet}} before the first {{RowBatch}} is returned
> (except perhaps for very trivial queries)
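
To make the proposed fast path concrete, here is a minimal single-threaded
sketch of the two conditions described above (queue empty, consumer result set
waiting). It is not Impala code: the class and method names
(BufferedSinkSketch, send, get_next, add_rows) are hypothetical stand-ins, the
list copy only models the cost of buffering, and a real sink would need a lock
and condition variables around this state. It also assumes, for simplicity,
that one fetch is satisfied by one batch.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Row = Tuple[int, ...]
RowBatch = List[Row]


class QueryResultSet:
    """Hypothetical stand-in for the client-facing result container."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def add_rows(self, batch: RowBatch) -> None:
        self.rows.extend(batch)


class BufferedSinkSketch:
    """Illustrative model of a buffered sink with a direct-write fast path."""

    def __init__(self) -> None:
        self.queue: Deque[RowBatch] = deque()  # models the RowBatchQueue
        self.pending_results: Optional[QueryResultSet] = None
        self.fast_path_batches = 0
        self.buffered_batches = 0

    def send(self, batch: RowBatch) -> None:
        """Producer side: called once per produced batch."""
        if not self.queue and self.pending_results is not None:
            # Fast path: (1) nothing is buffered, and (2) the consumer has
            # already handed over a result set, so write to it directly.
            self.pending_results.add_rows(batch)
            self.pending_results = None  # assumption: one batch per fetch
            self.fast_path_batches += 1
        else:
            # Slow path: buffer the batch; the copy models the queue's cost.
            self.queue.append(list(batch))
            self.buffered_batches += 1

    def get_next(self, results: QueryResultSet) -> bool:
        """Consumer side: returns True if rows were available immediately."""
        if self.queue:
            results.add_rows(self.queue.popleft())
            return True
        # Nothing buffered yet: leave the result set for the producer.
        self.pending_results = results
        return False


sink = BufferedSinkSketch()
first = QueryResultSet()
sink.get_next(first)        # consumer arrives first; queue is empty
sink.send([(1,), (2,)])     # written directly into `first`
sink.send([(3,)])           # no result set waiting, so this one is buffered
second = QueryResultSet()
sink.get_next(second)       # drained from the queue
```

In this model the first batch never touches the queue, which is the saving the
issue describes for a consumer that keeps ahead of the producer.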