Anuj Modi created HADOOP-19767:
----------------------------------
Summary: ABFS: [Read] Introduce Abfs Input Policy for detecting
read patterns
Key: HADOOP-19767
URL: https://issues.apache.org/jira/browse/HADOOP-19767
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/azure
Affects Versions: 3.4.2
Reporter: Anuj Modi
Assignee: Anuj Modi
Since the inception of the ABFS driver, there has been a single implementation of
AbfsInputStream. Different kinds of workloads need different heuristics to
perform at their best. For example:
# Sequential read workloads such as DFSIO and DistCp gain performance from
prefetching.
# Random read workloads, on the other hand, do not need prefetching; enabling
it for them adds overhead and is TPS heavy.
# Query workloads involving Parquet/ORC files benefit from improvements like
footer reads and small-file reads.
To accommodate this, we need to determine the read pattern and create
input streams implemented for that particular pattern.
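
A rough sketch of what such pattern detection could look like is given below. It is not the proposed implementation: ReadPattern, StreamSettings, detectPattern and settingsFor are hypothetical names, and the hint strings and file-extension fallback are assumptions made purely for illustration. The idea is to map a caller-supplied hint (or, failing that, the file name) to a pattern, then pick per-pattern settings that mirror the three cases listed above.

```java
import java.util.Locale;

/** Illustrative only: none of these names exist in the ABFS driver. */
final class ReadPolicySketch {

    /** Hypothetical read patterns matching the workloads described above. */
    enum ReadPattern { SEQUENTIAL, RANDOM, COLUMNAR, ADAPTIVE }

    /** Hypothetical per-pattern tuning knobs. */
    static final class StreamSettings {
        final boolean prefetchEnabled;
        final boolean footerReadOptimized;

        StreamSettings(boolean prefetchEnabled, boolean footerReadOptimized) {
            this.prefetchEnabled = prefetchEnabled;
            this.footerReadOptimized = footerReadOptimized;
        }
    }

    /**
     * Picks a pattern from an optional hint, falling back to the file
     * extension. The hint values are assumed, not taken from the issue.
     */
    static ReadPattern detectPattern(String policyHint, String path) {
        String hint = policyHint == null
            ? "" : policyHint.trim().toLowerCase(Locale.ROOT);
        switch (hint) {
            case "sequential":
                return ReadPattern.SEQUENTIAL;
            case "random":
                return ReadPattern.RANDOM;
            case "parquet":
            case "orc":
            case "columnar":
                return ReadPattern.COLUMNAR;
            default:
                break;
        }
        String lower = path == null ? "" : path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".parquet") || lower.endsWith(".orc")) {
            return ReadPattern.COLUMNAR;
        }
        // No usable signal: leave the decision to a general-purpose stream.
        return ReadPattern.ADAPTIVE;
    }

    /** Maps a pattern to settings; the exact choices are assumptions. */
    static StreamSettings settingsFor(ReadPattern pattern) {
        switch (pattern) {
            case SEQUENTIAL:
                return new StreamSettings(true, false);   // prefetch helps
            case RANDOM:
                return new StreamSettings(false, false);  // prefetch is overhead
            case COLUMNAR:
                return new StreamSettings(false, true);   // footer read helps
            default:
                return new StreamSettings(true, true);    // assumed general default
        }
    }
}
```

In this sketch, a call such as settingsFor(detectPattern("random", path)) would yield settings with prefetching switched off. A real driver would more likely return a dedicated input stream class per pattern than a settings object.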
--
This message was sent by Atlassian Jira
(v8.20.10#820010)