[jira] [Commented] (HDFS-14315) Add more detailed log message when decreasing replication factor < max in snapshots

Masatake Iwasaki (JIRA) Wed, 27 Feb 2019 09:05:33 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16779517#comment-16779517
 ]


Masatake Iwasaki commented on HDFS-14315:
-----------------------------------------

Comments for TestRplicationSnapshot:

{code}
   public void testReplicationWithSnapshot() throws Exception {
-    short fileRep = 1;
+    short fileRep = 2;
     // Create file1, set its replication to 1
     DFSTestUtil.createFile(hdfs, file1, BLOCKSIZE, fileRep, seed);
     Map<Path, Short> snapshotRepMap = new HashMap<Path, Short>();
+    FileStatus fileStatus = hdfs.getFileStatus(file1);
+    long len = 0;
     // Change replication factor from 1 to 5. In the meanwhile, keep taking
     // snapshots for sub1
{code}
Changing initial value of fileRep makes following comments inconsistent.

{code}
+      for( BlockLocation locations : hdfs.getFileBlockLocations(fileStatus, 0, 
len) ) {
+        //assertTrue(locations.getHosts().length == fileRep);
+        assertEquals(fileRep, locations.getHosts().length);
{code}
The line commented out should be removed.

{code}
+    // At this point in time, number of replicas (fileRep) is 5 and the latest 
snapshot
+    // was taken when the replication factor was 4.
+
+    //Change replication factor back to 3, but it's set to 4.
+    hdfs.setReplication(file1, (short) 3);
{code}
I think there is confusion about terminology in the test. "fileRep" means 
replication factor set in inode. DistributedFileSystem#setReplication sets the 
given number to inode regardless of snapshots. 
INodeFile#getPreferredBlockReplication returns number of replicas adjusted 
considering snapshots. The number of replicas is called "blockRep" in the test.

{code}
+      //Wait for block replication
+      DFSTestUtil.waitForReplication(hdfs, file1, fileRep, 5000);
+
+      // Check number of replicas in datanodes, which should be 5.
+      len = fileStatus.getLen();
+      for( BlockLocation locations : hdfs.getFileBlockLocations(fileStatus, 0, 
len) ) {
+        //assertTrue(locations.getHosts().length == fileRep);
+        assertEquals(fileRep, locations.getHosts().length);
+      }
{code}
The check is redundant since DFSTestUtil#waitForReplication do the same thing. 
"should be 5" is wrong since fileRep is changing in the loop.

{code}
+    //Wait for block invalitation in datanodes, but expect to receive the 
TimeoutException.
+    try {
+      DFSTestUtil.waitForReplication(hdfs, file1, (short) 3, 10000);
+    } catch (Exception e) {
+      assertEquals(TimeoutException.class, e.getClass());
+    }
{code}
TestReplication could be a reference if you want to check that there is no 
invalidation work without waiting 10s, while I think just 
checkSnapshotFileReplication is enough here.

> Add more detailed log message when decreasing replication factor < max in 
> snapshots
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-14315
>                 URL: https://issues.apache.org/jira/browse/HDFS-14315
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.1.2
>            Reporter: Daisuke Kobayashi
>            Assignee: Daisuke Kobayashi
>            Priority: Minor
>         Attachments: HDFS-14315.000.patch, HDFS-14315.001.patch
>
>
> When changing replication factor for a given file, the following 3 types of 
> logging appear in the namenode log, but more detailed message would be useful 
> especially when the file is in snapshot(s).
> {noformat}
> Decreasing replication from X to Y for FILE
> Increasing replication from X to Y for FILE
> Replication remains unchanged at X for FILE
> {noformat}
> I have added additional log messages as well as further test scenarios to 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotReplication#testReplicationWithSnapshot.
> The test sequence is:
> 1) A file is created with replication factor 1 (there are 5 datanodes)
> 2) Take a snapshot and increase the factor by 1. Continue this loop up to 5.
> 3) Setrep back to 3, but the target replication is decided to 4, which is the 
> maximum in snapshots.
> {noformat}
> 2019-02-25 17:17:26,800 [IPC Server handler 9 on default port 55726] INFO  
> namenode.FSDirectory (FSDirAttrOp.java:unprotectedSetReplication(408)) - 
> Decreasing replication from 5 to 4 for /TestSnapshot/sub1/file1. Requested 
> value is 3, but 4 is the maximum in snapshots
> {noformat}
> 4) Setrep to 3 again, but it's unchanged as follows. Both 3) and 4) are 
> expected.
> {noformat}
> 2019-02-25 17:17:26,804 [IPC Server handler 6 on default port 55726] INFO  
> namenode.FSDirectory (FSDirAttrOp.java:unprotectedSetReplication(420)) - 
> Replication remains unchanged at 4 for /TestSnapshot/sub1/file1 . Requested 
> value is 3, but 4 is the maximum in snapshots.
> {noformat}
> 5) Make sure the number of replicas in datanodes remains 4.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (HDFS-14315) Add more detailed log message when decreasing replication factor < max in snapshots

Reply via email to