[jira] [Commented] (HDFS-17280) Pipeline recovery should better end block in advance when bytes acked greater than half of blocksize.

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809007#comment-17809007 ]

ASF GitHub Bot commented on HDFS-17280:
---

hfutatzhanghb commented on PR #6336:
URL: https://github.com/apache/hadoop/pull/6336#issuecomment-1902486982

   > > @hfutatzhanghb Thanks for your contribution! Sorry I didn't get this 
   > > proposal clearly. Would you mind offering some more information about 
   > > what issue you met, and what this PR could do? Thanks again.
   > 
   > @Hexiaoqiao Sir, so sorry for responding so late. Let me describe this 
   > PR in detail.
   > 
   > ### The goal of this PR
   > Since [HDFS-16348](https://issues.apache.org/jira/browse/HDFS-16348), we 
   > can kick a SLOW node out of the pipeline while writing data to it. If we 
   > call the addDatanode2ExistingPipeline() method, it triggers the block 
   > transfer process.
   > 
   > Consider the following situation: we have a cluster with a block size of 
   > 512MB. If we have already written 500MB, one datanode is kicked out of 
   > the pipeline, and a new datanode is added, then 500MB of data will be 
   > transferred to the newly chosen datanode. This is not efficient.
   > 
   > So this PR tries to alleviate that cost: if we have already written more 
   > than half of the block size, we can end the block in advance to avoid 
   > transferring data.
   
   @Hexiaoqiao @zhangshuyan0 Sir, could you please take a look at this PR 
when you are free? Thanks in advance.




> Pipeline recovery should better end block in advance when bytes acked greater 
> than half of blocksize.
> -
>
> Key: HDFS-17280
> URL: https://issues.apache.org/jira/browse/HDFS-17280
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17280) Pipeline recovery should better end block in advance when bytes acked greater than half of blocksize.

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809004#comment-17809004 ]

ASF GitHub Bot commented on HDFS-17280:
---

hfutatzhanghb commented on PR #6336:
URL: https://github.com/apache/hadoop/pull/6336#issuecomment-1902486702

   > @hfutatzhanghb Thanks for your contribution! Sorry I didn't get this 
   > proposal clearly. Would you mind offering some more information about 
   > what issue you met, and what this PR could do? Thanks again.
   
   @Hexiaoqiao Sir, so sorry for responding so late. Let me describe this PR 
in detail.
   ### 1. The goal of this PR
   Since [HDFS-16348](https://issues.apache.org/jira/browse/HDFS-16348), we 
can kick a SLOW node out of the pipeline while writing data to it. If we call 
the addDatanode2ExistingPipeline() method, it triggers the block transfer 
process.
   
   Consider the following situation: we have a cluster with a block size of 
512MB. If we have already written 500MB, one datanode is kicked out of the 
pipeline, and a new datanode is added, then 500MB of data will be transferred 
to the newly chosen datanode. This is not efficient.
   
   So this PR tries to alleviate that cost: if we have already written more 
than half of the block size, we can end the block in advance to avoid 
transferring data.
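   
   For illustration, here is a minimal, editorial sketch of the proposed 
check; `shouldEndBlockEarly` and its parameters are hypothetical names, not 
the actual DataStreamer API:
   
   ```java
   public class EndBlockEarlySketch {
     // Decide whether pipeline recovery should seal the current block
     // instead of transferring its already-acked bytes to a new datanode.
     static boolean shouldEndBlockEarly(long bytesAcked, long blockSize) {
       return bytesAcked > blockSize / 2;
     }
   
     public static void main(String[] args) {
       long blockSize = 512L << 20;   // 512MB block
       long bytesAcked = 500L << 20;  // 500MB already acked
       // Prints true: ending the block beats transferring 500MB.
       System.out.println(shouldEndBlockEarly(bytesAcked, blockSize));
     }
   }
   ```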
   




> Pipeline recovery should better end block in advance when bytes acked greater 
> than half of blocksize.
> -
>
> Key: HDFS-17280
> URL: https://issues.apache.org/jira/browse/HDFS-17280
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17345) Add a metrics to record block report generating cost time

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808997#comment-17808997 ]

ASF GitHub Bot commented on HDFS-17345:
---

hfutatzhanghb opened a new pull request, #6475:
URL: https://github.com/apache/hadoop/pull/6475

   ### Description of PR
   HDFS-17345.
   Currently, the block report send time is recorded by the blockReports 
metric.
   
   We should add another metric to record the block report creation cost:
   
   `long brCreateCost = brSendStartTime - brCreateStartTime;`
   It is useful for measuring the performance of creating block reports.
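   
   As a rough, editorial sketch of the proposed measurement (the timing split 
follows the description above; the println stands in for whatever DataNode 
metric the patch actually registers):
   
   ```java
   public class BlockReportTimingSketch {
     public static void main(String[] args) throws InterruptedException {
       // Use a monotonic clock, as Hadoop's Time.monotonicNow() does.
       long brCreateStartTime = System.nanoTime() / 1_000_000L;
       Thread.sleep(25); // stand-in for generating the block reports
       long brSendStartTime = System.nanoTime() / 1_000_000L;
       long brCreateCost = brSendStartTime - brCreateStartTime;
       System.out.println("brCreateCost(ms) = " + brCreateCost);
     }
   }
   ```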
   
   




> Add a metrics to record block report generating cost time
> -
>
> Key: HDFS-17345
> URL: https://issues.apache.org/jira/browse/HDFS-17345
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.5.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>
> Currently, the block report send time is recorded by the blockReports 
> metric.
> We should add another metric to record the block report creation cost:
> {code:java}
> long brCreateCost = brSendStartTime - brCreateStartTime; {code}
> It is useful for measuring the performance of creating block reports.






[jira] [Updated] (HDFS-17345) Add a metrics to record block report generating cost time

2024-01-20 Thread ASF GitHub Bot (Jira)


 [ https://issues.apache.org/jira/browse/HDFS-17345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-17345:
--
Labels: pull-request-available  (was: )

> Add a metrics to record block report generating cost time
> -
>
> Key: HDFS-17345
> URL: https://issues.apache.org/jira/browse/HDFS-17345
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.5.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, the block report send time is recorded by the blockReports 
> metric.
> We should add another metric to record the block report creation cost:
> {code:java}
> long brCreateCost = brSendStartTime - brCreateStartTime; {code}
> It is useful for measuring the performance of creating block reports.






[jira] [Created] (HDFS-17345) Add a metrics to record block report generating cost time

2024-01-20 Thread farmmamba (Jira)
farmmamba created HDFS-17345:


 Summary: Add a metrics to record block report generating cost time
 Key: HDFS-17345
 URL: https://issues.apache.org/jira/browse/HDFS-17345
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.5.0
Reporter: farmmamba
Assignee: farmmamba


Currently, the block report send time is recorded by the blockReports metric.

We should add another metric to record the block report creation cost:
{code:java}
long brCreateCost = brSendStartTime - brCreateStartTime; {code}
It is useful for measuring the performance of creating block reports.






[jira] [Commented] (HDFS-17293) First packet data + checksum size will be set to 516 bytes when writing to a new block.

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808923#comment-17808923 ]

ASF GitHub Bot commented on HDFS-17293:
---

hfutatzhanghb commented on PR #6368:
URL: https://github.com/apache/hadoop/pull/6368#issuecomment-1902139525

   > This PR has corrected the size of the first packet in a new block, which 
   > is great. However, due to the original logical problem in 
   > `adjustChunkBoundary`, the calculation of the size of the last packet in 
   > a block is still problematic, and I think we need a new PR to solve it.
   > 
   > https://github.com/apache/hadoop/blob/27ecc23ae7c5cafba6a5ea58d4a68d25bd7507dd/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java#L531-L543
   > 
   > At line 540, when we pass `blockSize - getStreamer().getBytesCurBlock()` 
   > to `computePacketChunkSize` as the first parameter, 
   > `computePacketChunkSize` is likely to cause data that could have been 
   > sent in one packet to be split into two packets.
   
   Sir, very nice catch. I think the code below may resolve the problem you 
found. Please take a look~ I will submit another PR to fix it and add a UT.
   
   ```java
   if (!getStreamer().getAppendChunk()) {
     int psize = 0;
     if (blockSize == getStreamer().getBytesCurBlock()) {
       psize = writePacketSize;
     } else if (blockSize - getStreamer().getBytesCurBlock()
         + PacketHeader.PKT_MAX_HEADER_LEN < writePacketSize) {
       psize = (int) (blockSize - getStreamer().getBytesCurBlock())
           + PacketHeader.PKT_MAX_HEADER_LEN;
     } else {
       psize = (int) Math.min(blockSize - getStreamer().getBytesCurBlock(),
           writePacketSize);
     }
     computePacketChunkSize(psize, bytesPerChecksum);
   }
   ```
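   
   (Editorial note: as far as I can tell, `computePacketChunkSize` subtracts 
`PacketHeader.PKT_MAX_HEADER_LEN` from its first argument to derive the 
packet body size, which is presumably why the code above adds the header 
length back when the bytes remaining in the block are smaller than a packet.)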




> First packet data + checksum size will be set to 516 bytes when writing to a 
> new block.
> ---
>
> Key: HDFS-17293
> URL: https://issues.apache.org/jira/browse/HDFS-17293
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.3.6
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: pull-request-available
>
> First packet size will be set to 516 bytes when writing to a new block.
> In the method computePacketChunkSize, the parameters psize and csize would 
> be (0, 512) when writing to a new block. It would be better to use 
> writePacketSize.






[jira] [Commented] (HDFS-17342) Fix DataNode may invalidates normal block causing missing block

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808916#comment-17808916 ]

ASF GitHub Bot commented on HDFS-17342:
---

hadoop-yetus commented on PR #6464:
URL: https://github.com/apache/hadoop/pull/6464#issuecomment-1902126261

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  47m 39s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 57s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 247m 14s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6464/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 399m 46s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6464/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6464 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 778cfbae1130 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0a93644d94377973dbc22ea998b35b2e52b6fae5 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6464/6/testReport/ |
   | Max. process+thread count | 2202 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 

[jira] [Commented] (HDFS-17293) First packet data + checksum size will be set to 516 bytes when writing to a new block.

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808912#comment-17808912 ]

ASF GitHub Bot commented on HDFS-17293:
---

hadoop-yetus commented on PR #6368:
URL: https://github.com/apache/hadoop/pull/6368#issuecomment-1902116701

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 22s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 40s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  19m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   2m 50s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   2m 42s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 45s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 11s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  5s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m  1s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 34s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 20s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 50s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   2m 50s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 40s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   2m 40s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 22s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 49s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 188m 45s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6368/13/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 28s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 293m 47s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.TestDFSShell |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6368/13/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6368 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux a5e118489e2b 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 68292c7eed326eb3d653c66b885502f9befcd371 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 

[jira] [Commented] (HDFS-17333) DFSClient support lazy resolve host->ip.

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808906#comment-17808906 ]

ASF GitHub Bot commented on HDFS-17333:
---

KeeProMise commented on PR #6430:
URL: https://github.com/apache/hadoop/pull/6430#issuecomment-1902086242

   Can anyone help review it? Thanks.




> DFSClient support lazy resolve host->ip.
> 
>
> Key: HDFS-17333
> URL: https://issues.apache.org/jira/browse/HDFS-17333
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-17333.001.patch
>
>
> Currently, when the dfsclient is started, it resolves all hosts of all 
> nameservices: 
>   at DFSUtilClient#getAddresses(conf, null, addressKey)
>   at AbstractNNFailoverProxyProvider#getProxyAddresses(URI uri, 
> String addressKey)
> If the environment where the dfsClient runs makes host->ip resolution very 
> slow, the existing logic will undoubtedly take a long time when there are 
> too many nameservices.
> In fact, each dfsclient needs at most the IPs of all namenodes of a single 
> nameservice. Better still, if the namenode the dfsclient selects first can 
> provide the required services normally, the client only needs to know the 
> IP of that one namenode. Therefore, it is not necessary to resolve all 
> namenodes of all nameservices in the configuration file when the dfsclient 
> is started.
> This patch supports lazy resolution of host->ip: a host is resolved only 
> when it needs to be accessed.
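
A minimal sketch of the lazy-resolution idea described above; the class and 
method names are illustrative, not the actual DFSClient changes:

```java
import java.net.InetSocketAddress;

// Defer the DNS lookup: InetSocketAddress's host constructor resolves
// eagerly, so we construct it only on first access.
class LazyNamenodeAddress {
  private final String host;
  private final int port;
  private volatile InetSocketAddress resolved;

  LazyNamenodeAddress(String host, int port) {
    this.host = host;
    this.port = port;
  }

  InetSocketAddress get() {
    if (resolved == null) {
      // A duplicate resolve under a race is harmless here.
      resolved = new InetSocketAddress(host, port);
    }
    return resolved;
  }
}
```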






[jira] [Updated] (HDFS-17343) Revert HDFS-16016. BPServiceActor to provide new thread to handle IBR

2024-01-20 Thread Takanobu Asanuma (Jira)


 [ https://issues.apache.org/jira/browse/HDFS-17343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takanobu Asanuma updated HDFS-17343:

Fix Version/s: 3.5.0

> Revert HDFS-16016. BPServiceActor to provide new thread to handle IBR
> -
>
> Key: HDFS-17343
> URL: https://issues.apache.org/jira/browse/HDFS-17343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.5.0
>
>
> When preparing for the hadoop-3.4.0 release, we found that HDFS-16016 may 
> cause mis-ordering of IBRs and FBRs on the datanode. After discussion, we 
> decided to revert HDFS-16016.






[jira] [Commented] (HDFS-17293) First packet data + checksum size will be set to 516 bytes when writing to a new block.

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808901#comment-17808901 ]

ASF GitHub Bot commented on HDFS-17293:
---

hadoop-yetus commented on PR #6368:
URL: https://github.com/apache/hadoop/pull/6368#issuecomment-1902069369

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 22s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 49s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   3m 17s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   3m  6s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 16s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 28s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 20s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m  8s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   3m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 36s | 
[/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6368/12/artifact/out/results-checkstyle-hadoop-hdfs-project.txt)
 |  hadoop-hdfs-project: The patch generated 2 new + 28 unchanged - 0 fixed = 
30 total (was 28)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 45s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 193m 27s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6368/12/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 30s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 314m  4s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestDFSShell |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6368/12/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6368 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux b9d8820b9876 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / da1ea49921d46c85ca76b8c3c42110fde18125d2 |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 

[jira] [Commented] (HDFS-17342) Fix DataNode may invalidates normal block causing missing block

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808882#comment-17808882 ]

ASF GitHub Bot commented on HDFS-17342:
---

haiyang1987 commented on code in PR #6464:
URL: https://github.com/apache/hadoop/pull/6464#discussion_r1460295286


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java:
##########
@@ -2011,4 +2011,83 @@ public void tesInvalidateMissingBlock() throws Exception {
       cluster.shutdown();
     }
   }
+
+  @Test
+  public void testCheckFilesWhenInvalidateMissingBlock() throws Exception {
+    long blockSize = 1024;
+    int heartbeatInterval = 1;
+    HdfsConfiguration c = new HdfsConfiguration();
+    c.setInt(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, heartbeatInterval);
+    c.setLong(DFS_BLOCK_SIZE_KEY, blockSize);
+    MiniDFSCluster cluster = new MiniDFSCluster.Builder(c).
+        numDataNodes(1).build();
+    DataNodeFaultInjector oldDnInjector = DataNodeFaultInjector.get();
+    try {
+      cluster.waitActive();
+      BlockReaderTestUtil util = new BlockReaderTestUtil(cluster, new
+          HdfsConfiguration(conf));
+      Path path = new Path("/testFile");
+      util.writeFile(path, 1);
+      String bpid = cluster.getNameNode().getNamesystem().getBlockPoolId();
+      DataNode dn = cluster.getDataNodes().get(0);
+      FsDatasetImpl dnFSDataset = (FsDatasetImpl) dn.getFSDataset();
+      List<ReplicaInfo> replicaInfos = dnFSDataset.getFinalizedBlocks(bpid);
+      assertEquals(1, replicaInfos.size());
+      DFSTestUtil.readFile(cluster.getFileSystem(), path);
+      LocatedBlock blk = util.getFileBlocks(path, 512).get(0);
+      ExtendedBlock block = blk.getBlock();
+
+      // Append a new block with an incremented generation stamp.
+      long newGS = block.getGenerationStamp() + 1;
+      dnFSDataset.append(block, newGS, 1024);
+      block.setGenerationStamp(newGS);
+
+      DataNodeFaultInjector injector = new DataNodeFaultInjector() {
+        @Override
+        public void delayGetMetaDataInputStream() {
+          try {
+            Thread.sleep(8000);
+          } catch (InterruptedException e) {
+            // Ignore exception.
+          }
+        }
+      };
+      // Delay the call to getMetaDataInputStream.
+      DataNodeFaultInjector.set(injector);
+
+      ExecutorService executorService = Executors.newFixedThreadPool(2);
+      try {
+        Future<?> blockReaderFuture = executorService.submit(() -> {
+          try {
+            // Submit a task that reads the block.
+            BlockReaderTestUtil.getBlockReader(cluster.getFileSystem(), blk, 0, 512);
+          } catch (IOException e) {
+            // Ignore exception.
+          }
+        });
+
+        Future<?> finalizeBlockFuture = executorService.submit(() -> {
+          try {
+            // Submit a task that finalizes the block.
+            Thread.sleep(1000);
+            dnFSDataset.finalizeBlock(block, false);
+          } catch (Exception e) {
+            // Ignore exception.
+          }
+        });
+
+        // Wait for both tasks to complete.
+        blockReaderFuture.get();
+        finalizeBlockFuture.get();
+      } finally {
+        executorService.shutdown();
+      }
+
+      // Validate that the replica exists.
+      assertNotNull(dnFSDataset.getReplicaInfo(blk.getBlock()));

Review Comment:
   TestFsDatasetImpl#tesInvalidateMissingBlock, lines [1991-1997]: this case 
simulates the ReplicaInfo being removed from the ReplicaMap when 
checkFiles=false:
   ```
// Mock local block file not found when disk with some exception.
fsdataset.invalidateMissingBlock(bpid, replicaInfo, false);
// Assert local block file wouldn't be deleted from disk.
assertTrue(blockFile.exists());
// Assert block info would be removed from ReplicaMap.
assertEquals("null", fsdataset.getReplicaString(bpid, replicaInfo.getBlockId()));
   ```





> Fix DataNode may invalidates normal block causing missing block
> ---
>
> Key: HDFS-17342
> URL: https://issues.apache.org/jira/browse/HDFS-17342
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> When users read an append file, occasional exceptions may occur, such as 
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: xxx.
> This can happen if one thread is reading the block while the writer thread 
> is finalizing it simultaneously.
> *Root cause:*
> # The reader thread obtains an RBW replica from the VolumeMap, such as 
> blk_xxx_xxx[RBW], and the data file should be in /XXX/rbw/blk_xxx.
> # Simultaneously, the writer thread will finalize 

[jira] [Commented] (HDFS-17342) Fix DataNode may invalidates normal block causing missing block

2024-01-20 Thread ASF GitHub Bot (Jira)


[ https://issues.apache.org/jira/browse/HDFS-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808880#comment-17808880 ]

ASF GitHub Bot commented on HDFS-17342:
---

haiyang1987 commented on code in PR #6464:
URL: https://github.com/apache/hadoop/pull/6464#discussion_r1460291058


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java:
##########
@@ -2416,11 +2419,21 @@ public void invalidateMissingBlock(String bpid, Block block) {
     // So remove it from volume map and notify namenode is ok.
     try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl,
         bpid)) {
-      ReplicaInfo replica = volumeMap.remove(bpid, block);
-      invalidate(bpid, replica);
+      // Check if this block is on the volume map.
+      ReplicaInfo replica = volumeMap.get(bpid, block);
+      // Double-check block or meta file existence when checkFiles is true.
+      if (replica != null && (!checkFiles ||
+          (!replica.blockDataExists() || !replica.metadataExists()))) {
+        volumeMap.remove(bpid, block);
+        invalidate(bpid, replica);

Review Comment:
   If `replica == null`, we should not execute `invalidate(bpid, replica)`, 
to avoid causing an NPE.
   





> Fix DataNode may invalidates normal block causing missing block
> ---
>
> Key: HDFS-17342
> URL: https://issues.apache.org/jira/browse/HDFS-17342
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> When users read an append file, occasional exceptions may occur, such as 
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: xxx.
> This can happen if one thread is reading the block while the writer thread 
> is finalizing it simultaneously.
> *Root cause:*
> # The reader thread obtains an RBW replica from the VolumeMap, such as 
> blk_xxx_xxx[RBW], and the data file should be in /XXX/rbw/blk_xxx.
> # Simultaneously, the writer thread finalizes this block, moving it from 
> the RBW directory to the finalized directory; the data file moves from 
> /XXX/rbw/blk_xxx to /XXX/finalized/blk_xxx.
> # The reader thread attempts to open its data input stream but encounters 
> a FileNotFoundException, because the data file /XXX/rbw/blk_xxx or meta 
> file /XXX/rbw/blk_xxx_xxx doesn't exist at this moment.
> # The reader thread then treats this block as corrupt, removes the replica 
> from the volume map, and the DataNode reports the deleted block to the 
> NameNode.
> # The NameNode removes this replica for the block.
> # If the current file replication is 1, this file will have a missing 
> block until this DataNode executes the DirectoryScanner again.
> As described above, the FileNotFoundException encountered by the reader 
> thread is expected, because the file has been moved.
> So we need to add a double check to the invalidateMissingBlock logic to 
> verify whether the data file or meta file exists, to avoid similar cases.
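
Restated as a simplified sketch of the double-check in the patch above 
(locking and the surrounding method are omitted; see the FsDatasetImpl diff 
for the real change):

```java
// Only drop the replica when its on-disk files are really missing,
// or when the caller explicitly skipped the file check.
ReplicaInfo replica = volumeMap.get(bpid, block);
boolean filesMissing = replica != null
    && (!replica.blockDataExists() || !replica.metadataExists());
if (replica != null && (!checkFiles || filesMissing)) {
  volumeMap.remove(bpid, block);
  invalidate(bpid, replica);
}
```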


