Sahil Takiar created IMPALA-8819:
------------------------------------

             Summary: BufferedPlanRootSink should handle non-default fetch sizes
                 Key: IMPALA-8819
                 URL: https://issues.apache.org/jira/browse/IMPALA-8819
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Sahil Takiar
            Assignee: Sahil Takiar


As of IMPALA-8780, the {{BufferedPlanRootSink}} returns an error whenever a 
client sets the fetch size to a value lower than {{BATCH_SIZE}}. The issue 
is that a {{RowBatch}} read from the queue might contain more rows than the 
number requested by the client. So the {{BufferedPlanRootSink}} needs to be 
able to partially read a {{RowBatch}} and remember how far into the batch it 
has read. Furthermore, {{num_results}} in {{BufferedPlanRootSink::GetNext}} 
could be lower than {{BATCH_SIZE}} if the query results cache in 
{{ClientRequestState}} has a cache hit (this only happens if the client 
cursor is reset).
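
As a rough illustration of the partial-read idea, the sketch below keeps an offset into the current batch so that a later fetch resumes where the previous one stopped. It is not Impala code: {{Batch}}, {{BatchCursor}}, {{next_row}} and {{Read}} are hypothetical names, and a plain vector of ints stands in for a {{RowBatch}}.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a RowBatch: a vector of row values.
using Batch = std::vector<int>;

// Hypothetical cursor over a single batch. It remembers how many rows have
// already been returned so that a later fetch can resume from that point.
struct BatchCursor {
  Batch batch;
  std::size_t next_row = 0;  // index of the first row not yet returned

  bool Exhausted() const { return next_row >= batch.size(); }

  // Appends at most max_rows unread rows to *out, advances next_row, and
  // returns the number of rows copied.
  std::size_t Read(std::size_t max_rows, std::vector<int>* out) {
    std::size_t n = std::min(max_rows, batch.size() - next_row);
    auto first = batch.begin() + static_cast<std::ptrdiff_t>(next_row);
    out->insert(out->end(), first, first + static_cast<std::ptrdiff_t>(n));
    next_row += n;
    return n;
  }
};
```

With a five-row batch, a fetch of two rows followed by a larger fetch would return the first two rows and then the remaining three, without returning any row twice.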

Another issue is that the {{BufferedPlanRootSink}} can read at most a single 
{{RowBatch}} per call. So if a fetch size larger than {{BATCH_SIZE}} is 
specified, at most {{BATCH_SIZE}} rows will be written to the given 
{{QueryResultSet}}. This is consistent with the legacy behavior of 
{{PlanRootSink}} (now {{BlockingPlanRootSink}}), but is not ideal because that 
means clients can only read {{BATCH_SIZE}} rows at a time. A higher fetch size 
would potentially reduce the number of round-trips necessary between the client 
and the coordinator, which could improve fetch performance (but only if the 
{{BlockingPlanRootSink}} is capable of filling all the requested rows).
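
One possible way to serve a fetch size larger than a single batch is to keep draining queued batches until the request is filled or the queue is empty. The sketch below assumes a simple in-memory queue. {{BufferedRows}}, {{front_offset}} and {{Fetch}} are hypothetical names and do not correspond to the actual sink implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a RowBatch: a vector of row values.
using Batch = std::vector<int>;

// Hypothetical sink state: queued batches plus the read offset into the
// batch at the front of the queue.
struct BufferedRows {
  std::deque<Batch> queue;
  std::size_t front_offset = 0;

  // Copies up to fetch_size rows into *out, crossing batch boundaries as
  // needed. Returns the number of rows copied, which is smaller than
  // fetch_size only when the queue runs out of rows.
  std::size_t Fetch(std::size_t fetch_size, std::vector<int>* out) {
    std::size_t copied = 0;
    while (copied < fetch_size && !queue.empty()) {
      const Batch& front = queue.front();
      std::size_t n =
          std::min(fetch_size - copied, front.size() - front_offset);
      auto first = front.begin() + static_cast<std::ptrdiff_t>(front_offset);
      out->insert(out->end(), first, first + static_cast<std::ptrdiff_t>(n));
      copied += n;
      front_offset += n;
      if (front_offset == front.size()) {
        // Front batch fully consumed: drop it and start the next at row 0.
        queue.pop_front();
        front_offset = 0;
      }
    }
    return copied;
  }
};
```

In this sketch a single call can span several batches, so a client asking for more than {{BATCH_SIZE}} rows would need fewer round-trips whenever enough rows are already queued.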


