[ 
https://issues.apache.org/jira/browse/IMPALA-9316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Rorke reassigned IMPALA-9316:
-----------------------------------

    Assignee: David Rorke

> Consider coalescing S3 scans
> ----------------------------
>
>                 Key: IMPALA-9316
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9316
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Sahil Takiar
>            Assignee: David Rorke
>            Priority: Major
>
> We should consider coalescing S3 reads. IIUC the current {{DiskIoMgr}} code 
> for S3A does not do anything special for scheduling S3 scan ranges. It simply 
> round-robin assigns scans to IO threads.
> I think there might be a smarter algorithm we could employ when scheduling S3 
> reads. A few things to consider:
> * With the migration to {{hdfsPreadFully}}, each S3 scan range should 
> correspond to a single HTTP GET request (assuming the 8 MB limit is not hit, 
> see below)
> * {{read_size}} limits the size of a read to 8 MB (I believe if a scan range 
> exceeds this limit, the reads are just done on the same IO thread, but 
> sequentially - they are broken up into multiple HTTP GET requests)
> * S3A has a readahead option that defaults to 64 KB, however, it only applies 
> in certain situations
> ** If {{fs.s3a.experimental.input.fadvise=random}} (which is the recommended 
> value when reading Parquet / ORC data), the readahead applies if (1) it won't 
> cause the read to go past the end of the file, and (2) the request read 
> length is under 64 KB (it reads up to Math.max(requested-read-length, 64 KB)) 
> (so the readahead most likely applies for small reads)
> Coalescing reads would allow Impala to combine multiple, small HTTP GET 
> requests into fewer, larger HTTP GET requests. There may be some data that 
> needs to be skipped over, but the cost of reading that extra data might 
> outweigh the cost of issuing multiple HTTP requests. Since each HTTP request 
> requires a round-trip to S3, issuing a lot of GET requests can be costly, 
> especially if each only reads a small amount of data.
> Some implementation factors to consider:
> * There should probably be a limit on the maximum size of a read request (is 
> 8 MB the right value for S3?)
> * Since S3A uses a default of 64 KB for their readahead, we can probably use 
> a similar value
> * Should the number of disk IO threads be considered when coalescing reads? 
> e.g. by default there are 16 IO threads, if there are 16 small scan ranges, 
> does it make more sense to coalesce them into a single large scan range, or 
> would we get better throughput by issuing all 16 in parallel



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to