[ 
https://issues.apache.org/jira/browse/IMPALA-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yida Wu reassigned IMPALA-5705:
-------------------------------

    Assignee: Yida Wu

> Parallelise read I/O by prefetching pages when iterating over unpinned 
> BufferedTupleStream
> ------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-5705
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5705
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>    Affects Versions: Impala 2.10.0
>            Reporter: Tim Armstrong
>            Assignee: Yida Wu
>            Priority: Major
>
> We could improve read I/O performance when iterating over unpinned streams in 
> the hash join and hash aggregation by using additional memory to prefetch 
> pages ahead of the current read position. Currently, iterating over an 
> unpinned stream uses only a single buffer and issues a read I/O only after 
> it has finished processing the previous page.
> This slows down processing of spilled probe rows in the hash join and spilled 
> unaggregated rows in the hash aggregation.
> We'd need to figure out how to expose this in the BufferedTupleStream 
> interface. Probably, when preparing to read a stream, the client could 
> specify a number of bytes to read ahead in the stream, which would require 
> additional memory but increase performance.
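
The read-ahead idea described above can be illustrated with a minimal Python
sketch. It is not Impala code and not the BufferedTupleStream interface: the
names iterate_with_read_ahead, read_page, page_size and read_ahead_bytes are
hypothetical, and a thread pool stands in for asynchronous disk I/O. The only
point is to show how a byte budget chosen by the client could bound the number
of page reads kept in flight ahead of the current read position.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel marking the end of the page list


def iterate_with_read_ahead(page_ids, read_page, page_size, read_ahead_bytes):
    """Yield pages in order, prefetching up to read_ahead_bytes beyond the current page.

    Hypothetical sketch: read_page(page_id) stands in for a blocking read of
    one spilled page and is an assumption, not an Impala API.
    """
    read_ahead_pages = read_ahead_bytes // page_size
    if read_ahead_pages == 0:
        # No extra memory: one buffer, and each read is issued only after the
        # caller has finished with the previous page.
        for page_id in page_ids:
            yield read_page(page_id)
        return

    ids = iter(page_ids)
    pending = deque()  # futures for pages being read ahead, in stream order
    with ThreadPoolExecutor(max_workers=read_ahead_pages) as pool:

        def top_up():
            # Keep at most read_ahead_pages reads outstanding.
            while len(pending) < read_ahead_pages:
                page_id = next(ids, _DONE)
                if page_id is _DONE:
                    return
                pending.append(pool.submit(read_page, page_id))

        top_up()
        while pending:
            page = pending.popleft().result()
            # Issue the next read before handing the page to the caller, so
            # I/O overlaps with the caller's processing of this page.
            top_up()
            yield page
```

With read_ahead_bytes set to 0 the sketch falls back to the single-buffer
behaviour; a larger budget holds roughly page_size + read_ahead_bytes of page
memory in exchange for overlapping reads with processing.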



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

