[PR] HADOOP-18872: [Backport to 3.3] [ABFS] [BugFix] Misreporting Retry Count for Sub-sequential and Parallel Operations [hadoop]

via GitHub Mon, 20 Nov 2023 20:21:12 -0800


anujmodi2021 opened a new pull request, #6284:
URL: https://github.com/apache/hadoop/pull/6284


   Jira Ticket: https://issues.apache.org/jira/browse/HADOOP-18872
   Trunk PR: https://github.com/apache/hadoop/pull/6019
   Cherry-picked commit: 
https://github.com/apache/hadoop/commit/000a39ba2d2131ac158e23b35eae8c1329681bff
   
   Description: 
   There was a bug where the retry count in the client correlation ID
was misreported for subsequent and parallel operations triggered by a
single file system call. The cause was that the same tracing context was
reused for all such calls.
   We create a new tracing context (TC) as soon as an HDFS call arrives, and
then pass that same TC to every client call made on its behalf.
   
   For instance, when we get a createFile call, we first issue metadata
operations. If those metadata operations succeed only after a few retries,
the tracing context records that retry count. When the actual create call
is then made, the same retry count is used to construct the
headers (clientCorrelationId). Although the create operation never failed, we
still see the retry count from the previous request.
   
   The fix is to use a new tracing context object for each network call.
All the subsequent and parallel operations share the same primary request
ID so they can be correlated, yet each tracks its own retry count.
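
   The idea behind the fix can be illustrated with a minimal sketch. This is
not the code from the patch: SketchTracingContext, forNewOperation() and the
operation names passed to header() are hypothetical names chosen for
illustration, and the header layout is an assumption. It only shows how a
per-call copy keeps the primary request ID while starting with a retry count
of zero.

```java
import java.util.UUID;

/** Hypothetical stand-in for a tracing context; not the real ABFS class. */
final class SketchTracingContext {
    private final String primaryRequestId;
    private int retryCount;

    SketchTracingContext(String primaryRequestId) {
        this.primaryRequestId = primaryRequestId;
        this.retryCount = 0;
    }

    /** Copy for a new network call: same primary request ID, retry count reset to 0. */
    SketchTracingContext forNewOperation() {
        return new SketchTracingContext(primaryRequestId);
    }

    /** Called each time the network call owning this context is retried. */
    void recordRetry() {
        retryCount++;
    }

    String getPrimaryRequestId() {
        return primaryRequestId;
    }

    int getRetryCount() {
        return retryCount;
    }

    /** Assumed header layout for illustration: primaryRequestId:operation:retryCount. */
    String header(String operation) {
        return primaryRequestId + ":" + operation + ":" + retryCount;
    }
}

final class RetryCountDemo {
    public static void main(String[] args) {
        // One context is created when the file system call arrives.
        SketchTracingContext fsCall =
            new SketchTracingContext(UUID.randomUUID().toString());

        // The metadata call gets its own copy and needs two retries.
        SketchTracingContext metadataCall = fsCall.forNewOperation();
        metadataCall.recordRetry();
        metadataCall.recordRetry();

        // The create call gets a separate copy, so it does not inherit those retries.
        SketchTracingContext createCall = fsCall.forNewOperation();

        System.out.println(metadataCall.header("metadata"));
        System.out.println(createCall.header("create"));
    }
}
```

   With a single shared context, the create call in this sketch would have
reported the two retries that belonged to the metadata call.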


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

