huangzhaobo99 commented on PR #6505:
URL: https://github.com/apache/hadoop/pull/6505#issuecomment-1914509805

   > Thanks for your contribution, but I don't quite understand why this metric 
needs to be added to jmx. If you only want to obtain the blocks that are 
frequently accessed during a certain time period, is it enough to open 
CLIENT_TRACE_LOG on the datanode? Then we can process and analyze the audit 
logs to obtain the information we need.
   
   @zhangshuyan0, Thanks for your review.
   
   1. Enabling CLIENT_TRACE_LOG requires manually aggregating the logs afterwards.
   2. The current read-related log is at the debug level, while the write-related log is at the info level. It probably defaults to debug because reads produce too many log entries; write requests are relatively few, so info-level logging does not cause a log explosion.
   ```java
   if ((clientTraceFmt != null) && CLIENT_TRACE_LOG.isDebugEnabled()) {
       final long endTime = System.nanoTime();
       CLIENT_TRACE_LOG.debug(String.format(clientTraceFmt, totalRead,
           initialOffset, endTime - startTime));
   }
   ```
   
   3. Recording these block IDs through metrics and exporting them to JMX as a map makes it very easy to locate hot blocks. ("DFSClientId" may also need to be recorded.)
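
   To make point 3 concrete, a minimal sketch of the idea (class and method names here are illustrative, not the PR's actual API): count reads per block in a concurrent map and take a snapshot that a JMX bean attribute could return.
   ```java
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.atomic.LongAdder;

   // Hypothetical sketch: track read counts per block id so the hottest
   // blocks can be exported to jmx as a map. A real version would also
   // bound the map size and could record the DFSClientId per block.
   public class BlockReadCounter {
     private final Map<Long, LongAdder> readsPerBlock = new ConcurrentHashMap<>();

     // Called from the datanode read path with the block id.
     public void incrementRead(long blockId) {
       readsPerBlock.computeIfAbsent(blockId, k -> new LongAdder()).increment();
     }

     // Snapshot suitable for returning from a JMX bean getter.
     public Map<Long, Long> snapshot() {
       Map<Long, Long> out = new ConcurrentHashMap<>();
       readsPerBlock.forEach((id, counter) -> out.put(id, counter.sum()));
       return out;
     }
   }
   ```
   Compared with parsing CLIENT_TRACE_LOG, this keeps the aggregation inside the datanode, so locating a frequently read block is a single JMX query instead of a log-processing job.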


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
