[GitHub] [doris] morningman opened a new pull request, #22634: [improvement](hdfs) support hedged read

via GitHub Sat, 05 Aug 2023 00:23:14 -0700


morningman opened a new pull request, #22634:
URL: https://github.com/apache/doris/pull/22634


   ## Proposed changes
   
   Under high load, HDFS can take a long time to serve reads, which slows down
   queries that scan data stored on HDFS. The HDFS client provides a hedged read
   feature for this case: when a read request has not returned within a
   configured threshold, the client starts a second read of the same data on
   another thread and uses whichever result returns first.
   
   For example, set the hedged read properties when creating a catalog:
   
   ```
   create catalog regression properties (
       'type'='hms',
       'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
       'dfs.client.hedged.read.threadpool.size' = '128',
       'dfs.client.hedged.read.threshold.millis' = '500'
   );
   ```
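
The hedged read behaviour described above can be illustrated with a minimal Python sketch. It is not the HDFS client's or Doris's implementation: `hedged_read`, `make_reader`, and the simulated latencies are hypothetical names and values chosen for illustration. Here `threshold_s` plays the role of `dfs.client.hedged.read.threshold.millis`, and the executor's `max_workers` plays the role of `dfs.client.hedged.read.threadpool.size`.

```python
import concurrent.futures as cf
import threading
import time


def hedged_read(read_fn, threshold_s, pool):
    """Run read_fn; if it has not returned within threshold_s seconds,
    start a second attempt and return whichever attempt finishes first.

    Returns (data, hedged), where hedged says whether a second read was issued.
    Simplification: an exception from the first finisher is raised as-is.
    """
    first = pool.submit(read_fn)
    done, _ = cf.wait([first], timeout=threshold_s)
    if done:
        # The first read beat the threshold, so no hedge is needed.
        return first.result(), False

    # Threshold exceeded: issue the hedged (second) read and race the two.
    second = pool.submit(read_fn)
    done, _ = cf.wait([first, second], return_when=cf.FIRST_COMPLETED)
    return next(iter(done)).result(), True


def make_reader(latencies):
    """Hypothetical stand-in for a block read: the n-th call sleeps latencies[n]."""
    remaining = iter(latencies)
    lock = threading.Lock()

    def read():
        with lock:
            delay = next(remaining)
        time.sleep(delay)
        return f"block read in {delay}s"

    return read
```

With a reader whose first attempt takes 0.5 s and whose second takes 0.01 s, a 0.05 s threshold causes the sketch to issue a hedge and return the second attempt's data. The slower attempt is not cancelled in this sketch and keeps occupying a pool thread until it finishes.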
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

