[ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887912#comment-17887912
 ] 

Steve Loughran commented on HADOOP-19229:
-----------------------------------------

Facebook's Velox paper, on when to merge (coalesce) reads: 
https://research.facebook.com/publications/velox-metas-unified-execution-engine/

-----

cached columns are first read from disaggregated storage systems, such as S3 or 
HDFS, stored in RAM for the time of first use, and eventually persisted to 
local SSD. Furthermore, IO reads for nearby columns are typically coalesced 
(merged) if the gap between them is small enough (currently about 20K for SSD 
and 500K for disaggregated storage), aiming to serve neighboring reads in as 
few IO reads as possible. Naturally, this leverages the effect of temporal 
locality which makes correlated columns to be cached together on SSD.

Considering that all remote columnar formats have similar access patterns, 
consisting of first reading file metadata to identify the buffer boundaries, 
followed by read of parts of these buffers, IO reads can be scheduled in 
advance (prefetched) in order to interleave IO stalls and CPU processing. Velox 
tracks access frequencies of columns on a per-query basis, and adaptively 
schedules prefetches for hot columns. The combination of memory caching and 
smart pre-fetching logic makes many SQL interactive analytical workloads, which 
are commonly built based on small to mid-sized tables, to be effectively served 
from memory, since IO stalls are taken off of the critical path and do not 
contribute to query latency.

-----
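
The gap-based merging described in the excerpt can be sketched as follows. This is only an illustration, not Velox's or Hadoop's actual code: the `coalesce` function, its `max_gap` and `max_size` parameters, and the reading of "500K" as 500 * 1024 bytes are all assumptions made for the example.

```python
from typing import List, Tuple

# A range is (offset, length), both in bytes.
Range = Tuple[int, int]

# Assumed interpretation of the "about 500K" gap quoted for disaggregated storage.
REMOTE_MAX_GAP = 500 * 1024


def coalesce(ranges: List[Range], max_gap: int, max_size: int) -> List[Range]:
    """Merge ranges whose gap is at most max_gap, without letting any
    merged range grow beyond max_size bytes (hypothetical helper)."""
    merged: List[Range] = []
    for offset, length in sorted(ranges):
        if merged:
            prev_off, prev_len = merged[-1]
            prev_end = prev_off + prev_len
            gap = offset - prev_end  # negative if the ranges overlap
            new_end = max(prev_end, offset + length)
            if gap <= max_gap and new_end - prev_off <= max_size:
                # Close enough: extend the previous range to cover this one.
                merged[-1] = (prev_off, new_end - prev_off)
                continue
        merged.append((offset, length))
    return merged
```

With these assumptions, two 8-byte ranges separated by a 1 MiB gap stay separate under `REMOTE_MAX_GAP`, while two 100-byte ranges 50 bytes apart become a single 250-byte read.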


> Vector IO on cloud storage: what is a good minimum seek size?
> -------------------------------------------------------------
>
>                 Key: HADOOP-19229
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19229
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.1
>            Reporter: Steve Loughran
>            Priority: Major
>
> Vector IO has a maximum size for coalesced ranges, but it also needs a maximum 
> gap between ranges to justify a merge. Right now we could have a read where 
> two vectors of 8 bytes each are merged across a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes read to bytes discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
