[ https://issues.apache.org/jira/browse/HDFS-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17905954#comment-17905954 ]
ASF GitHub Bot commented on HDFS-16757:
---------------------------------------

tomscut commented on code in PR #6926:
URL: https://github.com/apache/hadoop/pull/6926#discussion_r1886463808


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java:
##########
@@ -1152,4 +1153,14 @@ void setDeleteDuplicateReplicasForTests(
     this.deleteDuplicateReplicas = deleteDuplicateReplicasForTests;
   }

+  public File hardLinkOneBlock(File src, File srcMeta, Block dstBlock) throws IOException {
+    File dstMeta = new File(tmpDir,
+        DatanodeUtil.getMetaName(dstBlock.getBlockName(), dstBlock.getGenerationStamp()));
+    HardLink.createHardLink(srcMeta, dstMeta);
+
+    File dstBlockFile = new File(tmpDir, dstBlock.getBlockName());
+    HardLink.createHardLink(src, dstBlockFile);

Review Comment:
   Hi @LiuGuH @haiyang1987, do you use this feature at a large scale? Does creating a lot of hard links have a maintenance impact on the system? Have you considered implementing it based on rename?

##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java:
##########
@@ -1956,6 +1957,26 @@ protected IOStreamPair connectToDN(DatanodeInfo dn, int timeout,
         socketFactory, getConf().isConnectToDnViaHostname(), this, blockToken);
   }

+  protected void copyBlockCrossNamespace(ExtendedBlock sourceBlk,
+      Token<BlockTokenIdentifier> sourceBlockToken, DatanodeInfo sourceDatanode,
+      ExtendedBlock targetBlk, Token<BlockTokenIdentifier> targetBlockToken,
+      DatanodeInfo targetDatanode) throws IOException {
+    IOStreamPair pair =
+        DFSUtilClient.connectToDN(sourceDatanode, getConf().getSocketTimeout(), conf, saslClient,
+            socketFactory, getConf().isConnectToDnViaHostname(), this, sourceBlockToken);

Review Comment:
   Please reuse `connectToDN` and close the connection resources. You can refer to `inferChecksumTypeByReading`.
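For readers unfamiliar with the hard-link approach discussed in the `hardLinkOneBlock` review above: a hard link gives a block file a second directory entry pointing at the same inode, so a same-DataNode "copy" moves no data. A minimal, self-contained sketch of that behavior using the JDK's `Files.createLink` (this is illustrative only; the actual patch uses Hadoop's `HardLink.createHardLink`, and `blk_` names here are made up):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkDemo {
    // Create a hard link "dst" sharing the same inode as "src" -- analogous
    // to what HardLink.createHardLink does for block and meta files.
    static Path linkBlock(Path src, Path dst) throws IOException {
        return Files.createLink(dst, src); // (link, existing)
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("bp-slice");
        Path src = Files.write(dir.resolve("blk_1001"), "block-bytes".getBytes());
        Path dst = linkBlock(src, dir.resolve("blk_2001"));

        // Both names resolve to identical content; no bytes were copied.
        System.out.println(new String(Files.readAllBytes(dst)));
        // Deleting one name leaves the data reachable through the other.
        Files.delete(src);
        System.out.println(Files.exists(dst));
    }
}
```

This also shows why the reviewer's maintenance question matters: after linking, the two namespaces share on-disk state until one side's link is deleted.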
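The resource-closing pattern the reviewer asks for (as in `inferChecksumTypeByReading`) amounts to: obtain the stream pair, use it, and guarantee both streams are released even on failure. A self-contained sketch of that pattern, with a stand-in `StreamPair` class instead of HDFS's `IOStreamPair` (names here are illustrative, not the Hadoop API):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ClosePairDemo {
    // Stand-in for HDFS's IOStreamPair: a paired DN input/output stream.
    static class StreamPair implements Closeable {
        final InputStream in;
        final OutputStream out;
        StreamPair(InputStream in, OutputStream out) { this.in = in; this.out = out; }
        @Override public void close() throws IOException {
            // Close both halves; the finally ensures "out" closes even if
            // closing "in" throws.
            try { in.close(); } finally { out.close(); }
        }
    }

    // Use the pair and release it deterministically, mirroring the
    // try/finally style of inferChecksumTypeByReading.
    static int readFirstByte(StreamPair pair) throws IOException {
        try (pair) { // Java 9+: try-with-resources on an existing variable
            return pair.in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        StreamPair pair = new StreamPair(
            new ByteArrayInputStream(new byte[]{42}),
            new ByteArrayOutputStream());
        System.out.println(readFirstByte(pair)); // 42
    }
}
```

In the PR this would mean wrapping the `IOStreamPair` returned by `connectToDN` so the sockets are not leaked if the copy throws.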
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java:
##########
@@ -4388,4 +4409,94 @@ boolean isSlownode() {
   public BlockPoolManager getBlockPoolManager() {
     return blockPoolManager;
   }

+  public void copyBlockCrossNamespace(ExtendedBlock sourceBlk, ExtendedBlock targetBlk,
+      DatanodeInfo targetDn) throws IOException {
+    if (!data.isValidBlock(sourceBlk)) {
+      // block does not exist or is under-construction
+      String errStr =
+          "copyBlock:(" + this.getInfoPort() + ") Can't send invalid block " + sourceBlk + " "
+              + data.getReplicaString(sourceBlk.getBlockPoolId(), sourceBlk.getBlockId());
+      LOG.info(errStr);
+      throw new IOException(errStr);
+    }
+    long onDiskLength = data.getLength(sourceBlk);
+    if (sourceBlk.getNumBytes() > onDiskLength) {
+      // Shorter on-disk len indicates corruption so report NN the corrupt block
+      String msg = "copyBlock: Can't replicate block " + sourceBlk + " because on-disk length "
+          + onDiskLength + " is shorter than provided length " + sourceBlk.getNumBytes();
+      LOG.info(msg);
+      throw new IOException(msg);
+    }
+    LOG.info(getDatanodeInfo() + " copyBlock: Starting thread to transfer: " + "block:"
+        + sourceBlk + " from " + this.getDatanodeUuid() + " to " + targetDn.getDatanodeUuid()
+        + "(" + targetDn + ")");

Review Comment:
   > Hi @LiuGuH, can you use parameterized log messages ("{} xxxx", a) instead of string concatenation (a + xxx)?

   @LiuGuH Please fix this.

> Add a new method copyBlockCrossNamespace to DataNode
> ----------------------------------------------------
>
>                 Key: HDFS-16757
>                 URL: https://issues.apache.org/jira/browse/HDFS-16757
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: ZanderXu
>            Assignee: liuguanghua
>            Priority: Minor
>              Labels: pull-request-available
>
> Add a new method copyBlockCrossNamespace in DataTransferProtocol at the
> DataNode side.
> This method will copy a source block from one namespace to a target block
> in a different namespace.
> If the target DN is the same as the current DN, this method will copy the
> block via HardLink. If the target DN is different from the current DN, this
> method will copy the block via TransferBlock.
> This method will take the following parameters:
>  * ExtendedBlock sourceBlock
>  * Token<BlockTokenIdentifier> sourceBlockToken
>  * ExtendedBlock targetBlock
>  * Token<BlockTokenIdentifier> targetBlockToken
>  * DatanodeInfo targetDN

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
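The issue description's dispatch rule, restated as code: hard-link when source and target DataNodes are the same node, otherwise stream the block over the network. The sketch below is an illustrative outline of that branching only, not the actual DataNode implementation; `BlockCopier`, `chooseStrategy`, and the DN UUID strings are hypothetical stand-ins:

```java
public class CopyDispatchDemo {
    // Hypothetical stand-in for the two copy strategies in the description.
    interface BlockCopier {
        String copy(String sourceBlock, String targetBlock);
    }

    // Same DN => local hard link (metadata-only); different DN => transfer
    // the block bytes to the remote DataNode.
    static BlockCopier chooseStrategy(String currentDnUuid, String targetDnUuid) {
        if (currentDnUuid.equals(targetDnUuid)) {
            return (s, t) -> "hardlink:" + s + "->" + t;
        }
        return (s, t) -> "transfer:" + s + "->" + t;
    }

    public static void main(String[] args) {
        System.out.println(chooseStrategy("dn-1", "dn-1").copy("blk_1", "blk_2"));
        System.out.println(chooseStrategy("dn-1", "dn-9").copy("blk_1", "blk_2"));
    }
}
```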