[
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887912#comment-17887912
]
Steve Loughran commented on HADOOP-19229:
-----------------------------------------
Facebook Velox paper on merging reads:
https://research.facebook.com/publications/velox-metas-unified-execution-engine/
-----
cached columns are first read from disaggregated storage systems, such as S3 or
HDFS, stored in RAM for the time of first use, and eventually persisted to
local SSD. Furthermore, IO reads for nearby columns are typically coalesced
(merged) if the gap between them is small enough (currently about 20K for SSD
and 500K for disaggregated storage), aiming to serve neighboring reads in as
few IO reads as possible. Naturally, this leverages the effect of temporal
locality which makes correlated columns to be cached together on SSD.
Considering that all remote columnar formats have similar access patterns,
consisting of first reading file metadata to identify the buffer boundaries,
followed by read of parts of these buffers, IO reads can be scheduled in
advance (prefetched) in order to interleave IO stalls and CPU processing. Velox
tracks access frequencies of columns on a per-query basis, and adaptively
schedules prefetches for hot columns. The combination of memory caching and
smart pre-fetching logic makes many SQL interactive analytical workloads, which
are commonly built based on small to mid-sized tables, to be effectively served
from memory, since IO stalls are taken off of the critical path and do not
contribute to query latency.
-----
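A rough sketch of the gap-based coalescing the excerpt describes, to make the trade-off concrete. This is not the Velox code and not the Hadoop vector IO implementation; `coalesce`, `max_gap` and `max_size` are hypothetical names, and the 500K threshold is only the "about 500K for disaggregated storage" figure quoted above.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length) in bytes


def coalesce(ranges: List[Range], max_gap: int, max_size: int) -> List[Range]:
    """Merge sorted ranges whose gap is <= max_gap, capping merged size.

    Hypothetical helper: both thresholds are assumptions for illustration.
    """
    merged: List[Tuple[int, int]] = []  # (start, end) pairs, end exclusive
    for offset, length in sorted(ranges):
        end = offset + length
        if merged:
            cur_start, cur_end = merged[-1]
            gap = offset - cur_end  # negative when ranges overlap
            new_end = max(cur_end, end)
            if gap <= max_gap and new_end - cur_start <= max_size:
                # Close enough: extend the current read over the gap.
                merged[-1] = (cur_start, new_end)
                continue
        merged.append((offset, end))
    # Convert back to (offset, length) form.
    return [(start, end - start) for start, end in merged]


MB = 1024 * 1024
# Two 8-byte reads separated by a 1 MB gap.
reads = [(0, 8), (8 + MB, 8)]

# With a 500K gap limit the reads stay separate.
separate = coalesce(reads, max_gap=500 * 1024, max_size=4 * MB)

# With no effective gap limit they collapse into one ~1 MB read.
single = coalesce(reads, max_gap=2 * MB, max_size=4 * MB)
```

With the gap limit in place, `separate` keeps both 8-byte reads; without it, `single` is one read that fetches the whole 1 MB gap as well.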
> Vector IO on cloud storage: what is a good minimum seek size?
> -------------------------------------------------------------
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.1
> Reporter: Steve Loughran
> Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap
> between ranges to justify the merge. Right now we could have a read where two
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and
> that's wasteful.
> We could also consider an "efficiency" metric which looks at the ratio of
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could
> track it as an IOStat.
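One possible shape for the "efficiency" metric floated in the description, as a sketch only. The issue does not define the metric precisely, so the names here (`read_efficiency`, `ReadEfficiency`) and the choice to report a useful-bytes fraction alongside the raw counts are assumptions; it also assumes the requested ranges do not overlap.

```python
from dataclasses import dataclass
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length) in bytes


@dataclass
class ReadEfficiency:
    bytes_read: int       # bytes actually fetched by the merged reads
    bytes_discarded: int  # fetched bytes that no caller asked for
    useful_fraction: float  # requested bytes / fetched bytes


def read_efficiency(requested: List[Range], merged: List[Range]) -> ReadEfficiency:
    """Compare what callers asked for against what the merged reads fetched.

    Hypothetical helper; assumes requested ranges are non-overlapping and
    are fully covered by the merged ranges.
    """
    wanted = sum(length for _, length in requested)
    fetched = sum(length for _, length in merged)
    discarded = fetched - wanted
    fraction = wanted / fetched if fetched else 1.0
    return ReadEfficiency(fetched, discarded, fraction)


MB = 1024 * 1024
# The case from the description: two 8-byte ranges merged across a 1 MB gap.
requested = [(0, 8), (8 + MB, 8)]
merged = [(0, 8 + MB + 8)]
stats = read_efficiency(requested, merged)
```

For this case `stats.bytes_discarded` is the full 1 MB gap and the useful fraction is 16 bytes out of roughly 1 MB, which is the kind of number that could be tracked as a statistic.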