Re: [PR] HDFS-16446. Consider ioutils of disk when choosing volume [hadoop]

via GitHub Wed, 28 Aug 2024 20:45:52 -0700


uflswe commented on PR #3960:
URL: https://github.com/apache/hadoop/pull/3960#issuecomment-2316657301


   @whbing 
   
   > Hi, @tomscut , we also face slow nodes caused by io, but in most cases it 
   > is reads that are slow, because reads far outnumber writes in the cluster. 
   > In the many cases we have counted, **iowait** is generally high when reads 
   > are slow, but only one or a few disks are saturated on **ioutil** (12 disks 
   > per dn). Therefore, I have the following points to discuss.
   > 
   > 1. Is there any further consideration for slow reading?
   > 2. Have you considered reporting iowait to nn, like XmitsInProgress, 
   > ActiveTransferThreadCount, etc, so that it is taken into account when 
   > choosing a DN?
   
   The first suggestion is good. But in my opinion, VolumeChoosingPolicy is 
only relevant on the write path, where it determines which disk on a DataNode 
stores a new data block. When reading files from HDFS, the choice is which 
DataNode holding the required data blocks to read from, not which disk within 
a DataNode, because a block is read from whichever disk it was written to.
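
To make the write-path point concrete, here is a minimal sketch of choosing a volume by disk utilization at write time. It is not the code in this PR and does not use the Hadoop VolumeChoosingPolicy interface; `Volume`, `VolumeChooserSketch`, `chooseForWrite`, and the `ioUtilPercent` field are hypothetical names assumed for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical model of a DataNode disk; NOT a Hadoop class.
final class Volume {
    final String path;
    final long availableBytes;
    final double ioUtilPercent; // assumed to come from an iostat-like sampler

    Volume(String path, long availableBytes, double ioUtilPercent) {
        this.path = path;
        this.availableBytes = availableBytes;
        this.ioUtilPercent = ioUtilPercent;
    }
}

final class VolumeChooserSketch {
    // Write path: among volumes with room for the block, pick the least busy one.
    // There is no read-path counterpart here: a read is served from whichever
    // volume already holds the replica, so no disk is "chosen" at read time.
    static Optional<Volume> chooseForWrite(List<Volume> volumes, long blockSize) {
        return volumes.stream()
                .filter(v -> v.availableBytes >= blockSize)
                .min(Comparator.comparingDouble(v -> v.ioUtilPercent));
    }

    public static void main(String[] args) {
        List<Volume> volumes = Arrays.asList(
                new Volume("/data1", 500L, 95.0),
                new Volume("/data2", 500L, 10.0),
                new Volume("/data3", 1L, 0.0)); // idle, but too full for the block
        chooseForWrite(volumes, 128L)
                .ifPresent(v -> System.out.println("write to " + v.path));
    }
}
```

A real policy would also need to handle stale utilization samples and ties between equally idle disks; the sketch ignores both.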


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


