Anuj Modi created HADOOP-19767:
----------------------------------

             Summary: ABFS: [Read] Introduce Abfs Input Policy for detecting 
read patterns
                 Key: HADOOP-19767
                 URL: https://issues.apache.org/jira/browse/HADOOP-19767
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/azure
    Affects Versions: 3.4.2
            Reporter: Anuj Modi
            Assignee: Anuj Modi


Since the ABFS driver was first introduced, there has been a single 
implementation of AbfsInputStream. Different kinds of workloads need different 
heuristics to perform at their best. For example: 
 # Sequential read workloads such as DFSIO and DistCp gain performance from 
prefetching.

 # Random read workloads, on the other hand, do not need prefetches; enabling 
prefetches for them adds overhead and is TPS-heavy.

 # Query workloads involving Parquet/ORC files benefit from improvements such as 
footer reads and small-file reads.

To accommodate this, we need to detect the read pattern and create an input 
stream implemented for that particular pattern.
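
As a rough illustration of the idea, the sketch below classifies a short trace 
of reads into one of the three workload types listed above. Everything in it is 
hypothetical: the names ReadPatternSketch, ReadPattern, Read, classify and 
prefetchEnabled, the footer-window check, and the "at least half of the reads 
are contiguous" threshold are assumptions made for illustration only. None of 
them come from the ABFS driver or from this issue.

```java
import java.util.List;

/** Hypothetical sketch only; none of these names exist in the ABFS driver. */
final class ReadPatternSketch {

    /** Illustrative labels mirroring the three workload types in the description. */
    enum ReadPattern { SEQUENTIAL, RANDOM, COLUMNAR }

    /** One read request: starting offset and length in bytes. */
    static final class Read {
        final long offset;
        final int length;

        Read(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Classifies a trace of reads against a file of the given length.
     * Assumed heuristics: a first read that lands inside the last
     * footerWindow bytes suggests a Parquet/ORC style footer read;
     * otherwise the trace is sequential when at least half of the reads
     * start exactly where the previous read ended.
     */
    static ReadPattern classify(List<Read> reads, long fileLength, long footerWindow) {
        if (reads.isEmpty()) {
            return ReadPattern.SEQUENTIAL; // assumed default when nothing is known
        }
        Read first = reads.get(0);
        if (fileLength > footerWindow && first.offset >= fileLength - footerWindow) {
            return ReadPattern.COLUMNAR;
        }
        int contiguous = 0;
        for (int i = 1; i < reads.size(); i++) {
            Read prev = reads.get(i - 1);
            if (reads.get(i).offset == prev.offset + prev.length) {
                contiguous++;
            }
        }
        int transitions = reads.size() - 1;
        // A single read gives no evidence against sequential access.
        boolean mostlyContiguous = transitions == 0 || contiguous * 2 >= transitions;
        return mostlyContiguous ? ReadPattern.SEQUENTIAL : ReadPattern.RANDOM;
    }

    /** Per the description, only sequential workloads gain from prefetching. */
    static boolean prefetchEnabled(ReadPattern pattern) {
        return pattern == ReadPattern.SEQUENTIAL;
    }
}
```

A caller could use the result of classify to pick a stream implementation, for 
example by turning prefetching off when the pattern is RANDOM. How the real 
policy would be detected or configured is not specified in this issue.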



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
