[jira] [Commented] (HDFS-14847) Erasure Coding: Blocks are over-replicated while EC decommissioning

2019-10-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953450#comment-16953450
 ] 

Ayush Saxena commented on HDFS-14847:
-

If that is just for readability, I must say we should not do that. Why take 
any performance impact, however trivial, when it is unnecessary?

FYI.  HDFS-14768 touches the same part.

> Erasure Coding: Blocks are over-replicated while EC decommissioning
> ---
>
> Key: HDFS-14847
> URL: https://issues.apache.org/jira/browse/HDFS-14847
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Critical
> Attachments: HDFS-14847.001.patch, HDFS-14847.002.patch, 
> HDFS-14847.003.patch
>
>
> Found that some blocks are over-replicated during EC decommissioning. Log 
> messages are as follows:
> {quote}
> INFO BlockStateChange: Block: blk_-9223372035714984112_363779142, Expected 
> Replicas: 9, live replicas: 8, corrupt replicas: 0, decommissioned replicas: 
> 0, decommissioning replicas: 3, maintenance replicas: 0, live entering 
> maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes 
> having this block: 10.254.41.34:50010 10.254.54.53:50010 10.254.28.53:50010 
> 10.254.56.55:50010 10.254.32.21:50010 10.254.33.19:50010 10.254.63.17:50010 
> 10.254.31.19:50010 10.254.35.29:50010 10.254.51.57:50010 10.254.40.58:50010 
> 10.254.69.31:50010 10.254.47.18:50010 10.254.51.18:50010 10.254.43.57:50010 
> 10.254.50.47:50010 10.254.42.37:50010 10.254.57.29:50010 10.254.67.40:50010 
> 10.254.44.16:50010 10.254.59.38:50010 10.254.53.56:50010 10.254.45.11:50010 
> 10.254.39.22:50010 10.254.30.16:50010 10.254.35.53:50010 10.254.22.30:50010 
> 10.254.26.34:50010 10.254.17.58:50010 10.254.65.53:50010 10.254.60.39:50010 
> 10.254.61.20:50010 10.254.64.23:50010 10.254.21.13:50010 10.254.37.35:50010 
> 10.254.68.30:50010 10.254.62.37:50010 10.254.25.58:50010 10.254.52.54:50010 
> 10.254.58.31:50010 10.254.49.11:50010 10.254.55.52:50010 10.254.19.19:50010 
> 10.254.36.40:50010 10.254.18.30:50010 10.254.20.39:50010 10.254.66.52:50010 
> 10.254.56.32:50010 10.254.24.55:50010 10.254.34.11:50010 10.254.29.58:50010 
> 10.254.27.40:50010 10.254.46.33:50010 10.254.23.19:50010 10.254.74.12:50010 
> 10.254.74.13:50010 10.254.41.35:50010 10.254.67.58:50010 10.254.54.11:50010 
> 10.254.68.14:50010 10.254.27.14:50010 10.254.51.29:50010 10.254.45.21:50010 
> 10.254.50.56:50010 10.254.47.31:50010 10.254.40.14:50010 10.254.65.21:50010 
> 10.254.62.22:50010 10.254.57.16:50010 10.254.36.52:50010 10.254.30.13:50010 
> 10.254.35.12:50010 10.254.69.34:50010 10.254.34.58:50010 10.254.17.50:50010 
> 10.254.63.12:50010 10.254.28.21:50010 10.254.58.30:50010 10.254.24.57:50010 
> 10.254.33.50:50010 10.254.44.52:50010 10.254.32.48:50010 10.254.43.39:50010 
> 10.254.20.37:50010 10.254.56.59:50010 10.254.22.33:50010 10.254.60.34:50010 
> 10.254.49.19:50010 10.254.52.21:50010 10.254.23.59:50010 10.254.21.16:50010 
> 10.254.42.55:50010 10.254.29.33:50010 10.254.53.17:50010 10.254.19.14:50010 
> 10.254.64.51:50010 10.254.46.20:50010 10.254.66.22:50010 10.254.18.38:50010 
> 10.254.39.17:50010 10.254.37.57:50010 10.254.31.54:50010 10.254.55.33:50010 
> 10.254.25.17:50010 10.254.61.33:50010 10.254.26.40:50010 10.254.59.23:50010 
> 10.254.59.35:50010 10.254.66.48:50010 10.254.41.15:50010 10.254.54.31:50010 
> 10.254.61.50:50010 10.254.62.31:50010 10.254.17.56:50010 10.254.29.18:50010 
> 10.254.45.16:50010 10.254.63.48:50010 10.254.22.34:50010 10.254.37.51:50010 
> 10.254.65.49:50010 10.254.58.21:50010 10.254.42.12:50010 10.254.55.17:50010 
> 10.254.27.13:50010 10.254.57.17:50010 10.254.67.18:50010 10.254.31.31:50010 
> 10.254.28.12:50010 10.254.36.12:50010 10.254.21.59:50010 10.254.30.30:50010 
> 10.254.26.50:50010 10.254.40.40:50010 10.254.32.17:50010 10.254.47.55:50010 
> 10.254.60.55:50010 10.254.49.33:50010 10.254.68.47:50010 10.254.39.21:50010 
> 10.254.56.14:50010 10.254.33.54:50010 10.254.69.57:50010 10.254.43.50:50010 
> 10.254.50.13:50010 10.254.25.49:50010 10.254.18.20:50010 10.254.52.23:50010 
> 10.254.19.11:50010 10.254.20.21:50010 10.254.74.16:50010 10.254.64.55:50010 
> 10.254.24.48:50010 10.254.46.29:50010 10.254.51.12:50010 10.254.23.56:50010 
> 10.254.44.59:50010 10.254.53.58:50010 10.254.34.38:50010 10.254.35.37:50010 
> 10.254.35.16:50010 10.254.36.23:50010 10.254.41.47:50010 10.254.54.12:50010 
> 10.254.20.59:50010 , Current Datanode: 10.254.56.55:50010, Is current 
> datanode decommissioning: true, Is current datanode entering maintenance: 
> false
> {quote}
> Decommissions hang for a long time.
> Digging into the code, we found a problem in ErasureCodingWork.java.
> For example, there are 2 nodes (dn0, dn1) in decommission and an EC block 
> group on those 2 nodes. After creating an ErasureCodingWork to reconstruct, 
> it will create 2 replication works. If dn0 replicates successfully and dn1 
> fails, T

[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.

2019-10-16 Thread guojh (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953447#comment-16953447
 ] 

guojh commented on HDFS-14768:
--

[~ayushtkn] [~surendrasingh] Could you help review the code? I can drop the 
code that tries to fix the over-replication problem and let it be fixed in 
HDFS-14847.

> In some cases, erasure blocks are corruption  when they are reconstruct.
> 
>
> Key: HDFS-14768
> URL: https://issues.apache.org/jira/browse/HDFS-14768
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding, hdfs, namenode
>Affects Versions: 3.0.2
>Reporter: guojh
>Assignee: guojh
>Priority: Major
>  Labels: patch
> Fix For: 3.3.0
>
> Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, 
> HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, 
> HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, 
> guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, 
> zhaoyiming_UT_beofre_deomission.txt
>
>
> Policy is RS-6-3-1024K, version is Hadoop 3.0.2.
> Suppose a file's block indices are [0,1,2,3,4,5,6,7,8]. We decommission 
> indices [3,4] and increase the index 6 datanode's 
> pendingReplicationWithoutTargets so that it exceeds 
> replicationStreamsHardLimit (we set 14). Then, after the method 
> chooseSourceDatanodes of BlockManager, liveBlockIndices is 
> [0,1,2,3,4,5,7,8] and the block counters are Live: 7, Decommission: 2.
> In the method scheduleReconstruction of BlockManager, additionalReplRequired 
> is 9 - 7 = 2. After the Namenode chooses two target datanodes, it assigns an 
> erasure coding task to the target datanodes.
> When a datanode gets the task, it builds targetIndices from liveBlockIndices 
> and the target length. The code is below.
> {code:java}
> // Build targetIndices from the live-block bitset; the index arguments are
> // restored where the original paste dropped them.
> targetIndices = new short[targets.length];
> private void initTargetIndices() {
>   BitSet bitset = reconstructor.getLiveBitSet();
>   int m = 0;
>   hasValidTargets = false;
>   for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
>     if (!bitset.get(i)) {
>       if (reconstructor.getBlockLen(i) > 0) {
>         if (m < targets.length) {
>           targetIndices[m++] = (short) i;
>           hasValidTargets = true;
>         }
>       }
>     }
>   }
> }
> {code}
> targetIndices[0] = 6, and targetIndices[1] always stays 0 from its initial 
> value.
> The StripedReader always creates readers from the first 6 block indices, 
> i.e. [0,1,2,3,4,5].
> Using indices [0,1,2,3,4,5] to build target indices [6,0] triggers the ISA-L 
> bug: block index 6's data is corrupted (all of it is zero).
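> A minimal, self-contained Java sketch (independent of the HDFS code above; 
> the array size is assumed) of why the unfilled slot is dangerous: a freshly 
> allocated short[] is zero-initialized, so any slot the loop never writes 
> silently reads as block index 0.
> {code:java}
> // Hypothetical illustration, not the reconstruction code itself.
> short[] targetIndices = new short[2]; // JVM zero-initializes: {0, 0}
> targetIndices[0] = 6;                 // only one valid target index was found
> // targetIndices[1] was never written; it still reads as 0, which is
> // indistinguishable from a real "block index 0" target.
> System.out.println(targetIndices[1]); // prints 0
> {code}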
> I wrote a unit test that reproduces this reliably.
> {code:java}
> // Saturate one datanode's replication queue past the hard limit, then
> // decommission two nodes so EC reconstruction is scheduled for this file.
> private int replicationStreamsHardLimit =
>     DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT;
> numDNs = dataBlocks + parityBlocks + 10;
> @Test(timeout = 24)
> public void testFileDecommission() throws Exception {
>   LOG.info("Starting test testFileDecommission");
>   final Path ecFile = new Path(ecDir, "testFileDecommission");
>   int writeBytes = cellSize * dataBlocks;
>   writeStripedFile(dfs, ecFile, writeBytes);
>   Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
>   FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
>   final INodeFile fileNode = cluster.getNamesystem().getFSDirectory()
>       .getINode4Write(ecFile.toString()).asFile();
>   LocatedBlocks locatedBlocks =
>       StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
>   LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
>       .get(0);
>   DatanodeInfo[] dnLocs = lb.getLocations();
>   LocatedStripedBlock lastBlock =
>       (LocatedStripedBlock) locatedBlocks.getLastLocatedBlock();
>   DatanodeInfo[] storageInfos = lastBlock.getLocations();
>   DatanodeDescriptor datanodeDescriptor =
>       cluster.getNameNode().getNamesystem().getBlockManager()
>           .getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
>   BlockInfo firstBlock = fileNode.getBlocks()[0];
>   DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock);
>   // the first heartbeat will consume 3 replica tasks
>   for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) {
>     BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor,
>         new Block(i), new DatanodeStorageInfo[]{dStorageInfos[0]});
>   }
>   assertEquals(dataBlocks + parityBlocks, dnLocs.length);
>   int[] decommNodeIndex = {3, 4};
>   final List<DatanodeInfo> decommisionNodes = new ArrayList<>();
>   // add the nodes which will be decommissioned
>   decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
>   decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
>   decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
>   assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
>   bm.

[jira] [Commented] (HDFS-14847) Erasure Coding: Blocks are over-replicated while EC decommissioning

2019-10-16 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953445#comment-16953445
 ] 

Fei Hui commented on HDFS-14847:


[~ayushtkn] Thanks for your review!
{quote}
Can this be avoided? It seems required only for the size of the bitset; can we 
not use srcNodes?
{quote}
Maybe realTotalBlockNum here is more readable.
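For illustration, a hedged sketch of the two alternatives under discussion 
(only the bitset sizing; the surrounding code is not shown in this thread, so 
both lines are assumptions):
{code:java}
// Alternative suggested above: derive the size from the existing srcNodes.
BitSet bitset = new BitSet(srcNodes.length);
// Alternative kept in the patch: a dedicated count, arguably clearer.
BitSet bitset2 = new BitSet(realTotalBlockNum);
{code}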

{quote}
For the test, for the wait methods, maybe we can use 
GenericTestUtils.waitFor(...)?
Maybe keep the helper methods together after the test, only for better 
readability.
{quote}
Used GenericTestUtils.waitFor, removed the useless code, and moved the helper 
methods together after the test.


[jira] [Updated] (HDFS-14847) Erasure Coding: Blocks are over-replicated while EC decommissioning

2019-10-16 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-14847:
---
Attachment: HDFS-14847.003.patch


[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.

2019-10-16 Thread guojh (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953443#comment-16953443
 ] 

guojh commented on HDFS-14768:
--

[~ayushtkn] Thanks for your reply. This patch tries to fix two problems: one 
is replica over-replication, the other is incorrect blocks. The code quoted by 
[~surendrasingh] tries to fix the over-replication issue.


[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2208:
-
Labels: pull-request-available  (was: )

> Propagate System Exceptions from OM transaction apply phase
> ---
>
> Key: HDDS-2208
> URL: https://issues.apache.org/jira/browse/HDDS-2208
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>
> The change for HDDS-2206 tracks system exceptions during preExecute phase of 
> OM request handling.
> The current jira is to implement exception propagation once the OM request is 
> submitted to Ratis - when the handler is running validateAndUpdateCache for 
> the request.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2208?focusedWorklogId=329629&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329629
 ]

ASF GitHub Bot logged work on HDDS-2208:


Author: ASF GitHub Bot
Created on: 17/Oct/19 06:25
Start Date: 17/Oct/19 06:25
Worklog Time Spent: 10m 
  Work Description: supratimdeka commented on pull request #47: HDDS-2208. 
Propagate System Exceptions from OM transaction apply phase. Contributed by 
Supratim Deka
URL: https://github.com/apache/hadoop-ozone/pull/47
 
 
   https://issues.apache.org/jira/browse/HDDS-2208
   
   This is a follow-up to the patch for HDDS-2206
   https://github.com/apache/hadoop-ozone/pull/12
   
   The change propagates complete stacktraces for system exceptions encountered 
during the Ratis phase of the OM request handling.
   
   However, this patch does not consider the configuration parameter introduced 
earlier in the patch for HDDS-2206. Controlling the behaviour through a 
configuration parameter would require a much larger footprint in the code, and 
at this point there is no clear requirement for such a parameter, so we are 
going ahead without it.
   Will update the patch for HDDS-2206 as well, to remove the configuration 
that was introduced.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329629)
Remaining Estimate: 0h
Time Spent: 10m

> Propagate System Exceptions from OM transaction apply phase
> ---
>
> Key: HDDS-2208
> URL: https://issues.apache.org/jira/browse/HDDS-2208
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The change for HDDS-2206 tracks system exceptions during preExecute phase of 
> OM request handling.
> The current jira is to implement exception propagation once the OM request is 
> submitted to Ratis - when the handler is running validateAndUpdateCache for 
> the request.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase

2019-10-16 Thread Supratim Deka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Supratim Deka updated HDDS-2208:

Description: 
The change for HDDS-2206 tracks system exceptions during preExecute phase of OM 
request handling.

The current jira is to implement exception propagation once the OM request is 
submitted to Ratis - when the handler is running validateAndUpdateCache for the 
request.

  was:
applyTransaction handling in the OzoneManagerStateMachine does not propagate 
exceptions/failures to the initiator.
The future which is returned from applyTransaction simply tracks completion of 
the async executor represented by the "executorService" in 
OzoneManagerStateMachine.java
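
For illustration, a minimal generic Java sketch (invented names, not the Ozone 
code) of the difference between a future that merely tracks executor 
completion and one that propagates the failure to the initiator:
{code:java}
import java.util.concurrent.CompletableFuture;

class ApplyTransactionSketch {
  // Behaviour described above: failures are caught inside the task, so the
  // returned future always completes normally and the initiator sees nothing.
  static CompletableFuture<Void> trackCompletionOnly(Runnable txn) {
    return CompletableFuture.runAsync(() -> {
      try {
        txn.run();
      } catch (RuntimeException e) {
        // swallowed: the caller's future still completes normally
      }
    });
  }

  // Propagating variant: let the exception complete the future exceptionally,
  // so the initiator observes it via join()/handle()/whenComplete().
  static CompletableFuture<Void> propagateFailures(Runnable txn) {
    return CompletableFuture.runAsync(txn);
  }
}
{code}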


> Propagate System Exceptions from OM transaction apply phase
> ---
>
> Key: HDDS-2208
> URL: https://issues.apache.org/jira/browse/HDDS-2208
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>
> The change for HDDS-2206 tracks system exceptions during preExecute phase of 
> OM request handling.
> The current jira is to implement exception propagation once the OM request is 
> submitted to Ratis - when the handler is running validateAndUpdateCache for 
> the request.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase

2019-10-16 Thread Supratim Deka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Supratim Deka updated HDDS-2208:

Summary: Propagate System Exceptions from OM transaction apply phase  (was: 
OzoneManagerStateMachine does not track failures in applyTransaction)

> Propagate System Exceptions from OM transaction apply phase
> ---
>
> Key: HDDS-2208
> URL: https://issues.apache.org/jira/browse/HDDS-2208
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>
> applyTransaction handling in the OzoneManagerStateMachine does not propagate 
> exceptions/failures to the initiator.
> The future which is returned from applyTransaction simply tracks completion 
> of the async executor represented by the "executorService" in 
> OzoneManagerStateMachine.java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope

2019-10-16 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953415#comment-16953415
 ] 

Brahma Reddy Battula commented on HDFS-14909:
-

+1, nice finding.

> DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage 
> count for excluded node which is already part of excluded scope 
> -
>
> Key: HDFS-14909
> URL: https://issues.apache.org/jira/browse/HDFS-14909
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch, 
> HDFS-14909.003.patch
>
>
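
The description field is empty, but the summary implies a double-subtraction 
in the storage-count bookkeeping. A hedged, generic Java sketch (all names and 
numbers invented for illustration):
{code:java}
class ExcludedCountSketch {
  // Hypothetical counts, for illustration only.
  static void demo() {
    int totalStorages = 100;
    int storagesInExcludedScope = 10; // already includes the excluded node's 2
    int storagesOfExcludedNode = 2;

    // Buggy: the excluded node sits inside the excluded scope, so its
    // storages are subtracted twice (88 instead of 90).
    int buggy = totalStorages - storagesInExcludedScope - storagesOfExcludedNode;

    // Fixed: skip the per-node decrement when the scope already covers it.
    boolean nodeInsideScope = true;
    int fixed = totalStorages - storagesInExcludedScope
        - (nodeInsideScope ? 0 : storagesOfExcludedNode);
    System.out.println(buggy + " vs " + fixed); // 88 vs 90
  }
}
{code}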




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2286) Add a log info in ozone client and scm to print the exclusion list during allocate block

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2286?focusedWorklogId=329591&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329591
 ]

ASF GitHub Bot logged work on HDDS-2286:


Author: ASF GitHub Bot
Created on: 17/Oct/19 05:01
Start Date: 17/Oct/19 05:01
Worklog Time Spent: 10m 
  Work Description: swagle commented on pull request #46: HDDS-2286. Add a 
log info in ozone client and scm to print the exclus…
URL: https://github.com/apache/hadoop-ozone/pull/46
 
 
   ##  What changes were proposed in this pull request?
   Added additional logging to print exclude lists on client and SCM
   
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-2286
   
   ## How was this patch tested?
   Waiting on unit tests since only log statements were added.
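   For context, a hedged sketch of the kind of log statement added (SLF4J 
style; the variable names are assumptions, not the actual client/SCM code):
{code:java}
// Hypothetical placement in the allocate-block path.
LOG.info("Allocating block for key {}, exclude list: {}", keyName, excludeList);
{code}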
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329591)
Remaining Estimate: 0h
Time Spent: 10m

> Add a log info in ozone client and scm to print the exclusion list during 
> allocate block
> 
>
> Key: HDDS-2286
> URL: https://issues.apache.org/jira/browse/HDDS-2286
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2286) Add a log info in ozone client and scm to print the exclusion list during allocate block

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2286:
-
Labels: pull-request-available  (was: )

> Add a log info in ozone client and scm to print the exclusion list during 
> allocate block
> 
>
> Key: HDDS-2286
> URL: https://issues.apache.org/jira/browse/HDDS-2286
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14880) Balancer sequence of statistics & exit message is not correct

2019-10-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953389#comment-16953389
 ] 

Ayush Saxena commented on HDFS-14880:
-

Thanx [~prasad-acit] for the report. The fix looks good and is fair enough to 
have.
The only concern I see is possible incompatibility due to the change in CLI 
output.
[~vinayakumarb] can you help with how this can be taken ahead?

> Balancer sequence of statistics & exit message is not correct
> -
>
> Key: HDFS-14880
> URL: https://issues.apache.org/jira/browse/HDFS-14880
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.1.1, 3.2.1
> Environment: Run the balancer tool in cluster.
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-14880.0001.patch
>
>
> Actual:
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
> The cluster is balanced. Exiting...
> Sep 27, 2019 5:13:15 PM   0   0 B   0 B   0 B
> Sep 27, 2019 5:13:15 PM Balancing took 1.726 seconds
> Done!
> Expected: The exit message should come after logging all the balancer 
> movement statistics:
> Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
> Sep 27, 2019 5:13:15 PM   0   0 B   0 B   0 B
> The cluster is balanced. Exiting...
> Sep 27, 2019 5:13:15 PM Balancing took 1.726 seconds
> Done!
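
A hedged Java sketch of the reordering being requested (simplified; the real 
Balancer formats these lines differently, and the variables are assumed to be 
in scope):
{code:java}
// Expected order: statistics row first, then the exit message, then timing.
System.out.printf("%s %d %s %s %s%n", timeStamp, iteration,
    bytesAlreadyMoved, bytesLeftToMove, bytesBeingMoved);
System.out.println("The cluster is balanced. Exiting...");
System.out.println("Balancing took " + elapsedSeconds + " seconds");
System.out.println("Done!");
{code}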



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2275) In BatchOperation.SingleOperation, do not clone byte[]

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2275?focusedWorklogId=329575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329575
 ]

ASF GitHub Bot logged work on HDDS-2275:


Author: ASF GitHub Bot
Created on: 17/Oct/19 04:15
Start Date: 17/Oct/19 04:15
Worklog Time Spent: 10m 
  Work Description: szetszwo commented on pull request #45: HDDS-2275. In 
BatchOperation.SingleOperation, do not clone byte[].
URL: https://github.com/apache/hadoop-ozone/pull/45
 
 
   See https://issues.apache.org/jira/browse/HDDS-2275
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329575)
Remaining Estimate: 0h
Time Spent: 10m

> In BatchOperation.SingleOperation, do not clone byte[]
> --
>
> Key: HDDS-2275
> URL: https://issues.apache.org/jira/browse/HDDS-2275
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Attachments: o2275_20191015.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> byte[] is cloned once in the constructor and then it is cloned again in the 
> getter methods.
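
A hedged sketch of the pattern the issue describes (generic Java, not the 
actual BatchOperation code): cloning in both the constructor and the getter 
yields two copies per round trip, while dropping the constructor clone keeps 
the data a single handoff.
{code:java}
// Before: two clones per put/get round trip.
final class SingleOperation {
  private final byte[] value;
  SingleOperation(byte[] value) {
    this.value = value.clone();   // clone #1
  }
  byte[] getValue() {
    return value.clone();         // clone #2
  }
}

// After: store the caller's array directly and document the ownership
// contract, so each access costs no extra copy.
final class SingleOperationNoClone {
  private final byte[] value;
  SingleOperationNoClone(byte[] value) {
    this.value = value;           // caller must not mutate after handoff
  }
  byte[] getValue() {
    return value;
  }
}
{code}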



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2275) In BatchOperation.SingleOperation, do not clone byte[]

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2275:
-
Labels: pull-request-available  (was: )

> In BatchOperation.SingleOperation, do not clone byte[]
> --
>
> Key: HDDS-2275
> URL: https://issues.apache.org/jira/browse/HDDS-2275
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Attachments: o2275_20191015.patch
>
>
> byte[] is cloned once in the constructor and then it is cloned again in the 
> getter methods.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329570&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329570
 ]

ASF GitHub Bot logged work on HDDS-2271:


Author: ASF GitHub Bot
Created on: 17/Oct/19 04:06
Start Date: 17/Oct/19 04:06
Worklog Time Spent: 10m 
  Work Description: szetszwo commented on issue #1625: HDDS-2271. Avoid 
buffer copying in KeyValueHandler.
URL: https://github.com/apache/hadoop/pull/1625#issuecomment-542990305
 
 
   This was migrated to https://github.com/apache/hadoop-ozone/pull/44
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329570)
Time Spent: 1h 10m  (was: 1h)

> Avoid buffer copying in KeyValueHandler
> ---
>
> Key: HDDS-2271
> URL: https://issues.apache.org/jira/browse/HDDS-2271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: o2271_20191015.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> - In handleGetSmallFile, it first reads chunk data into a byte[] and then 
> copies it to a ByteString.
> - In handlePutBlock/handleGetBlock, in order to get the length, it first 
> builds a ContainerProtos.BlockData and then copies it to a byte[].
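
A hedged sketch of the copy being avoided, using protobuf's ByteString API 
(readChunk() is a hypothetical stand-in for the actual chunk-reading code):
{code:java}
import com.google.protobuf.ByteString;
import com.google.protobuf.UnsafeByteOperations;

class ChunkToByteStringSketch {
  // Copying path: the chunk is already materialized in data[], and
  // copyFrom duplicates it a second time.
  static ByteString withCopy(byte[] data) {
    return ByteString.copyFrom(data);
  }

  // Copy-avoiding path: wrap the existing array without copying.
  // Safe only if data[] is never mutated afterwards.
  static ByteString withoutCopy(byte[] data) {
    return UnsafeByteOperations.unsafeWrap(data);
  }
}
{code}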



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2271) Avoid buffer copying in KeyValueHandler

2019-10-16 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze updated HDDS-2271:
-
Fix Version/s: 0.6.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

I have committed this.

> Avoid buffer copying in KeyValueHandler
> ---
>
> Key: HDDS-2271
> URL: https://issues.apache.org/jira/browse/HDDS-2271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: o2271_20191015.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> - In handleGetSmallFile, it first reads chunk data into a byte[] and then 
> copies it to a ByteString.
> - In handlePutBlock/handleGetBlock, in order to get the length, it first 
> builds a ContainerProtos.BlockData and then copies it to a byte[].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329569&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329569
 ]

ASF GitHub Bot logged work on HDDS-2271:


Author: ASF GitHub Bot
Created on: 17/Oct/19 04:05
Start Date: 17/Oct/19 04:05
Worklog Time Spent: 10m 
  Work Description: szetszwo commented on pull request #1625: HDDS-2271. 
Avoid buffer copying in KeyValueHandler.
URL: https://github.com/apache/hadoop/pull/1625
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329569)
Time Spent: 1h  (was: 50m)

> Avoid buffer copying in KeyValueHandler
> ---
>
> Key: HDDS-2271
> URL: https://issues.apache.org/jira/browse/HDDS-2271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: o2271_20191015.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> - In handleGetSmallFile, it first reads chunk data into a byte[] and then 
> copies it to a ByteString.
> - In handlePutBlock/handleGetBlock, in order to get the length, it first 
> builds a ContainerProtos.BlockData and then copies it to a byte[].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329568
 ]

ASF GitHub Bot logged work on HDDS-2271:


Author: ASF GitHub Bot
Created on: 17/Oct/19 04:04
Start Date: 17/Oct/19 04:04
Worklog Time Spent: 10m 
  Work Description: szetszwo commented on pull request #44: HDDS-2271. 
Avoid buffer copying in KeyValueHandler.
URL: https://github.com/apache/hadoop-ozone/pull/44
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329568)
Time Spent: 50m  (was: 40m)

> Avoid buffer copying in KeyValueHandler
> ---
>
> Key: HDDS-2271
> URL: https://issues.apache.org/jira/browse/HDDS-2271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Attachments: o2271_20191015.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> - In handleGetSmallFile, it first reads chunk data into a byte[] and then 
> copies it to a ByteString.
> - In handlePutBlock/handleGetBlock, in order to get the length, it first 
> builds a ContainerProtos.BlockData and then copies it to a byte[].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2271) Avoid buffer copying in KeyValueHandler

2019-10-16 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953378#comment-16953378
 ] 

Tsz-wo Sze commented on HDDS-2271:
--

Thanks [~shashikant] and [~hanishakoneru] for reviewing this.

> Avoid buffer copying in KeyValueHandler
> ---
>
> Key: HDDS-2271
> URL: https://issues.apache.org/jira/browse/HDDS-2271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Attachments: o2271_20191015.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> - In handleGetSmallFile, it first reads chunk data into a byte[] and then 
> copies it to a ByteString.
> - In handlePutBlock/handleGetBlock, in order to get the length, it first 
> builds a ContainerProtos.BlockData and then copies it to a byte[].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329567
 ]

ASF GitHub Bot logged work on HDDS-2271:


Author: ASF GitHub Bot
Created on: 17/Oct/19 04:02
Start Date: 17/Oct/19 04:02
Worklog Time Spent: 10m 
  Work Description: szetszwo commented on pull request #44: HDDS-2271. 
Avoid buffer copying in KeyValueHandler.
URL: https://github.com/apache/hadoop-ozone/pull/44
 
 
   Migrated from https://github.com/apache/hadoop/pull/1625
   
   See https://issues.apache.org/jira/browse/HDDS-2271
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329567)
Time Spent: 40m  (was: 0.5h)

> Avoid buffer copying in KeyValueHandler
> ---
>
> Key: HDDS-2271
> URL: https://issues.apache.org/jira/browse/HDDS-2271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Attachments: o2271_20191015.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> - In handleGetSmallFile, it first reads chunk data into a byte[] and then 
> copies it to a ByteString.
> - In handlePutBlock/handleGetBlock, in order to get the length, it first 
> builds a ContainerProtos.BlockData and then copies it to a byte[].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.

2019-10-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953341#comment-16953341
 ] 

Ayush Saxena commented on HDFS-14768:
-

[~gjhkael] is the scenario you are talking about the one in HDFS-14847?


[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-10-16 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953314#comment-16953314
 ] 

Jinglun commented on HDFS-14908:


Hi [~elgoiri], thanks for your nice comments! One such place is 
FederationUtil.isParentEntry(); I took it as a reference when I did v01.

I'll make a benchmark comparing this method with String.startsWith().

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch
>
>
> Currently, when doing listOpenFiles(), LeaseManager only checks whether the 
> filter path is a string prefix of an open file's path. It should check 
> whether the filter path is a parent/ancestor of the open file.
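
A hedged sketch of why a plain prefix check is insufficient (generic Java; 
isParent is an invented helper in the spirit of FederationUtil.isParentEntry):
{code:java}
class ParentCheckSketch {
  // Correct check: require a path-separator boundary after the parent path.
  static boolean isParent(String parent, String child) {
    String p = parent.endsWith("/") ? parent : parent + "/";
    return child.equals(parent) || child.startsWith(p);
  }

  public static void main(String[] args) {
    String filter = "/dir1";
    String openFile = "/dir10/file";
    // Plain prefix check wrongly matches: "/dir1" prefixes "/dir10/file".
    System.out.println(openFile.startsWith(filter)); // true  (false positive)
    System.out.println(isParent(filter, openFile));  // false (correct)
  }
}
{code}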



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.

2019-10-16 Thread guojh (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953287#comment-16953287
 ] 

guojh commented on HDFS-14768:
--

[~surendrasingh] If I decommission two datanodes that hold block indices 6 and 
8, then after the first replication schedule, index 6 is replicated 
successfully but index 8 is not. The file then has indices [0, 1, 2, 3, 4, 5, 
6(Decommissioning), 7, 8(Decommissioning), 6]. After the timeout, the 
BlockManager will schedule it again, and findLeavingServiceSources will still 
choose index 6 to replicate. This is another problem. Should I create a new 
issue and write another UT?

> In some cases, erasure blocks are corruption  when they are reconstruct.
> 
>
> Key: HDFS-14768
> URL: https://issues.apache.org/jira/browse/HDFS-14768
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding, hdfs, namenode
>Affects Versions: 3.0.2
>Reporter: guojh
>Assignee: guojh
>Priority: Major
>  Labels: patch
> Fix For: 3.3.0
>
> Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, 
> HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, 
> HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, 
> guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, 
> zhaoyiming_UT_beofre_deomission.txt
>
>
> Policy is RS-6-3-1024K, version is Hadoop 3.0.2.
> Suppose a file's block indices are [0,1,2,3,4,5,6,7,8], we decommission 
> indices [3,4], and we increase the index 6 datanode's 
> pendingReplicationWithoutTargets so that it is larger than 
> replicationStreamsHardLimit (we set 14). Then, after the method 
> chooseSourceDatanodes of BlockManager, the liveBlockIndices is 
> [0,1,2,3,4,5,7,8] and the block counters are Live: 7, Decommission: 2.
> In the method scheduleReconstruction of BlockManager, additionalReplRequired 
> is 9 - 7 = 2. After the Namenode chooses two target datanodes, it will assign 
> an erasure-code reconstruction task to them.
> When the datanode gets the task, it builds targetIndices from liveBlockIndices 
> and the target length. The code is below.
> {code:java}
> targetIndices = new short[targets.length];
> private void initTargetIndices() {
>   BitSet bitset = reconstructor.getLiveBitSet();
>   int m = 0;
>   hasValidTargets = false;
>   for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
>     // a block index missing from the live set is a reconstruction target
>     if (!bitset.get(i)) {
>       if (reconstructor.getBlockLen(i) > 0) {
>         if (m < targets.length) {
>           targetIndices[m++] = (short) i;
>           hasValidTargets = true;
>         }
>       }
>     }
>   }
> }
> {code}
> targetIndices[0] = 6, but targetIndices[1] always stays 0, its initial value.
> The StripedReader therefore always creates readers from the first 6 block 
> indices, i.e. [0,1,2,3,4,5].
> Using the indices [0,1,2,3,4,5] to reconstruct target indices [6,0] triggers 
> the ISA-L bug: block index 6's data is corrupted (all data is zero).
> I wrote a unit test that reproduces this reliably.
> {code:java}
> // code placeholder
> private int replicationStreamsHardLimit = 
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT;
> numDNs = dataBlocks + parityBlocks + 10;
> @Test(timeout = 24)
> public void testFileDecommission() throws Exception {
>   LOG.info("Starting test testFileDecommission");
>   final Path ecFile = new Path(ecDir, "testFileDecommission");
>   int writeBytes = cellSize * dataBlocks;
>   writeStripedFile(dfs, ecFile, writeBytes);
>   Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
>   FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
>   final INodeFile fileNode = cluster.getNamesystem().getFSDirectory()
>   .getINode4Write(ecFile.toString()).asFile();
>   LocatedBlocks locatedBlocks =
>   StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
>   LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
>   .get(0);
>   DatanodeInfo[] dnLocs = lb.getLocations();
>   LocatedStripedBlock lastBlock =
>   (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
>   DatanodeInfo[] storageInfos = lastBlock.getLocations();
>   //
>   DatanodeDescriptor datanodeDescriptor = 
> cluster.getNameNode().getNamesystem()
>   
> .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
>   BlockInfo firstBlock = fileNode.getBlocks()[0];
>   DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock);
>   // the first heartbeat will consume 3 replica tasks
>   for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) {
> BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor, new 
> Block(i),
> new DatanodeStorageInfo[]{dStorageInfos[0]});
>   }
>   assertEquals(dataBlocks + parityBlocks, dnLocs.length);
>   int[] decommNodeI

[jira] [Updated] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer

2019-10-16 Thread Vivek Ratnavel Subramanian (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Ratnavel Subramanian updated HDDS-2181:
-
Status: Patch Available  (was: Reopened)

> Ozone Manager should send correct ACL type in ACL requests to Authorizer
> 
>
> Key: HDDS-2181
> URL: https://issues.apache.org/jira/browse/HDDS-2181
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> Currently, Ozone Manager sends "WRITE" as the ACLType for key create, key 
> delete, and bucket create operations. Fix the ACL type in all requests to the 
> authorizer.
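
For illustration only, a hedged sketch of the intent (the ACLType values exist 
in Ozone's IAccessAuthorizer, but the checkAcl helper here is hypothetical, 
not the actual patch):

{code:java}
// Before: every request checked the same ACL type.
// checkAcl(obj, context, ACLType.WRITE);

// After: the ACL type matches the operation.
checkAcl(bucketObj, context, ACLType.CREATE);  // bucket create
checkAcl(keyObj, context, ACLType.CREATE);     // key create
checkAcl(keyObj, context, ACLType.DELETE);     // key delete
{code}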



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2181?focusedWorklogId=329498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329498
 ]

ASF GitHub Bot logged work on HDDS-2181:


Author: ASF GitHub Bot
Created on: 17/Oct/19 00:14
Start Date: 17/Oct/19 00:14
Worklog Time Spent: 10m 
  Work Description: vivekratnavel commented on pull request #43: HDDS-2181. 
Ozone Manager should send correct ACL type in ACL requests…
URL: https://github.com/apache/hadoop-ozone/pull/43
 
 
   … to Authorizer
   
   ## What changes were proposed in this pull request?
   
   The ACL type sent to authorizers is changed from always sending "WRITE" to 
sending the appropriate ACL type required by each operation.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-2181
   
   ## How was this patch tested?
   
   This patch was tested by updating and running unit tests and acceptance 
tests. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329498)
Time Spent: 11h  (was: 10h 50m)

> Ozone Manager should send correct ACL type in ACL requests to Authorizer
> 
>
> Key: HDDS-2181
> URL: https://issues.apache.org/jira/browse/HDDS-2181
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> Currently, Ozone Manager sends "WRITE" as the ACLType for key create, key 
> delete, and bucket create operations. Fix the ACL type in all requests to the 
> authorizer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2302) Manage common pom versions in one common place

2019-10-16 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HDDS-2302.

Fix Version/s: 0.5.0
   Resolution: Fixed

> Manage common pom versions in one common place
> --
>
> Key: HDDS-2302
> URL: https://issues.apache.org/jira/browse/HDDS-2302
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: build
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Some of the versions (e.g. ozone.version, hdds.version, ratis.version) are 
> required by both the ozone and hdds subprojects. As we have a common pom.xml, 
> it is safer to manage them in one place, the root pom.xml, instead of 
> maintaining them multiple times.
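
For illustration, a minimal sketch of the centralization (the version numbers 
below are placeholders, not the real ones):

{code:xml}
<!-- root pom.xml: declare each shared version once -->
<properties>
  <ozone.version>0.5.0-SNAPSHOT</ozone.version>
  <hdds.version>0.5.0-SNAPSHOT</hdds.version>
  <ratis.version>0.4.0-SNAPSHOT</ratis.version>
</properties>

<!-- child modules then reference the property instead of a literal -->
<dependency>
  <groupId>org.apache.ratis</groupId>
  <artifactId>ratis-server</artifactId>
  <version>${ratis.version}</version>
</dependency>
{code}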



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2302) Manage common pom versions in one common place

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2302?focusedWorklogId=329484&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329484
 ]

ASF GitHub Bot logged work on HDDS-2302:


Author: ASF GitHub Bot
Created on: 16/Oct/19 23:32
Start Date: 16/Oct/19 23:32
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #21: HDDS-2302. 
Manage common pom versions in one common place
URL: https://github.com/apache/hadoop-ozone/pull/21
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329484)
Time Spent: 20m  (was: 10m)

> Manage common pom versions in one common place
> --
>
> Key: HDDS-2302
> URL: https://issues.apache.org/jira/browse/HDDS-2302
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: build
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Some of the versions (e.g. ozone.version, hdds.version, ratis.version) are 
> required by both the ozone and hdds subprojects. As we have a common pom.xml, 
> it is safer to manage them in one place, the root pom.xml, instead of 
> maintaining them multiple times.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2289) Put testing information and a problem description to the github PR template

2019-10-16 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HDDS-2289.

Resolution: Fixed

> Put testing information and a problem description to the github PR template
> ---
>
> Key: HDDS-2289
> URL: https://issues.apache.org/jira/browse/HDDS-2289
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This was suggested by [~aengineer] during an offline discussion: add more 
> information to the github PR template, based on the template of Ambari (by 
> Vivek):
> https://github.com/apache/ambari/commit/579cec8cf5bcfe1a1a0feacf055ed6569f674e6a



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope

2019-10-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953236#comment-16953236
 ] 

Hadoop QA commented on HDFS-14909:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 10s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14909 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12983211/HDFS-14909.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7c35529448fa 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 375224e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28099/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28099/testReport/ |
| Max. process+thread count | 3149 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-

[jira] [Commented] (HDDS-2271) Avoid buffer copying in KeyValueHandler

2019-10-16 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953233#comment-16953233
 ] 

Hanisha Koneru commented on HDDS-2271:
--

Thanks [~szetszwo] for the patch.
LGTM. +1.

> Avoid buffer copying in KeyValueHandler
> ---
>
> Key: HDDS-2271
> URL: https://issues.apache.org/jira/browse/HDDS-2271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Attachments: o2271_20191015.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> - In handleGetSmallFile, it first reads chunk data into a byte[] and then 
> copies it to a ByteString.
> - In handlePutBlock/handleGetBlock, in order to get the length, it first 
> builds a ContainerProtos.BlockData and then copies it to a byte[].
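
For context, a minimal self-contained sketch of the copy-avoiding idea using 
protobuf's UnsafeByteOperations (an assumption about the mechanism, not the 
actual patch):

{code:java}
import com.google.protobuf.ByteString;
import com.google.protobuf.UnsafeByteOperations;

public class NoCopyExample {
  public static void main(String[] args) {
    byte[] data = new byte[]{1, 2, 3};  // stands in for chunk data read from disk

    // copyFrom duplicates the whole array a second time
    ByteString copied = ByteString.copyFrom(data);

    // unsafeWrap shares the array without copying; safe only if 'data'
    // is never mutated afterwards
    ByteString wrapped = UnsafeByteOperations.unsafeWrap(data);

    System.out.println(copied.size() + " " + wrapped.size());
  }
}
{code}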



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation

2019-10-16 Thread Siddharth Wagle (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953232#comment-16953232
 ] 

Siddharth Wagle edited comment on HDDS-2283 at 10/16/19 10:22 PM:
--

[~aengineer] Yes, the follow-up Jira will not be taken up blindly without 
first figuring out which is better or worse: tens of RocksDB instances sharing 
a disk versus a single RocksDB per disk with ten tables inside it.

I took this up as low-hanging fruit, and I agree with the comment about not 
focusing on micro-benchmarks. This was just a curiosity/exploratory effort on 
my part that took all of 20 mins including the fix, so I went ahead with the 
patch.


was (Author: swagle):
[~aengineer] Yes, the follow-up Jira will not be taken up blindly without 
first figuring out which is better or worse: tens of RocksDB instances sharing 
a disk versus a single RocksDB per disk with ten tables inside it.

I took this up as low-hanging fruit, and I agree with the comment about not 
focusing on micro-benchmarks. This was just a curiosity/exploratory effort on 
my part that took all of 20 mins including the fix.

> Container Creation on datanodes take around 300ms due to rocksdb creation
> -
>
> Key: HDDS-2283
> URL: https://issues.apache.org/jira/browse/HDDS-2283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-2283.00.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Container creation on datanodes takes around 300ms due to rocksdb creation. 
> Rocksdb creation takes a considerable time and this needs to be optimized.
> Creating a rocksdb per disk should be enough, and each container can be a 
> table inside the rocksdb.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation

2019-10-16 Thread Siddharth Wagle (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953232#comment-16953232
 ] 

Siddharth Wagle commented on HDDS-2283:
---

[~aengineer] Yes, the follow-up Jira will not be taken up blindly without 
first figuring out which is better or worse: tens of RocksDB instances sharing 
a disk versus a single RocksDB per disk with ten tables inside it.

I took this up as low-hanging fruit, and I agree with the comment about not 
focusing on micro-benchmarks. This was just a curiosity/exploratory effort on 
my part that took all of 20 mins including the fix.

> Container Creation on datanodes take around 300ms due to rocksdb creation
> -
>
> Key: HDDS-2283
> URL: https://issues.apache.org/jira/browse/HDDS-2283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-2283.00.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Container creation on datanodes takes around 300ms due to rocksdb creation. 
> Rocksdb creation takes a considerable time and this needs to be optimized.
> Creating a rocksdb per disk should be enough, and each container can be a 
> table inside the rocksdb.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2275) In BatchOperation.SingleOperation, do not clone byte[]

2019-10-16 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953216#comment-16953216
 ] 

Hanisha Koneru commented on HDDS-2275:
--

Thanks [~szetszwo] for the patch.
LGTM. +1.

> In BatchOperation.SingleOperation, do not clone byte[]
> --
>
> Key: HDDS-2275
> URL: https://issues.apache.org/jira/browse/HDDS-2275
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
> Attachments: o2275_20191015.patch
>
>
> byte[] is cloned once in the constructor and then it is cloned again in the 
> getter methods.
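
A minimal sketch of the idea (a hypothetical class, not the actual Ozone 
code): hold the reference once and document the ownership contract instead of 
cloning in both places:

{code:java}
/** Hypothetical sketch: the caller hands over ownership, so no clone is needed. */
final class SingleOperation {
  private final byte[] key;    // stored as-is; callers must not mutate it
  private final byte[] value;

  SingleOperation(byte[] key, byte[] value) {
    this.key = key;            // previously: key.clone()
    this.value = value;        // previously: value.clone()
  }

  byte[] getKey() {
    return key;                // previously: key.clone() again
  }

  byte[] getValue() {
    return value;
  }
}
{code}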



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2289) Put testing information and a problem description to the github PR template

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2289?focusedWorklogId=329408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329408
 ]

ASF GitHub Bot logged work on HDDS-2289:


Author: ASF GitHub Bot
Created on: 16/Oct/19 21:07
Start Date: 16/Oct/19 21:07
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #5: HDDS-2289. 
Put testing information and a problem description to the g…
URL: https://github.com/apache/hadoop-ozone/pull/5
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329408)
Time Spent: 20m  (was: 10m)

> Put testing information and a problem description to the github PR template
> ---
>
> Key: HDDS-2289
> URL: https://issues.apache.org/jira/browse/HDDS-2289
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This was suggested by [~aengineer] during an offline discussion: add more 
> information to the github PR template, based on the template of Ambari (by 
> Vivek):
> https://github.com/apache/ambari/commit/579cec8cf5bcfe1a1a0feacf055ed6569f674e6a



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer

2019-10-16 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953198#comment-16953198
 ] 

Anu Engineer commented on HDDS-2181:


[~elek] Didn't you revert this?

> Ozone Manager should send correct ACL type in ACL requests to Authorizer
> 
>
> Key: HDDS-2181
> URL: https://issues.apache.org/jira/browse/HDDS-2181
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Currently, Ozone Manager sends "WRITE" as the ACLType for key create, key 
> delete, and bucket create operations. Fix the ACL type in all requests to the 
> authorizer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation

2019-10-16 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-2283:
---
Comment: was deleted

(was: I really doubt this is a bottleneck. I am not disagreeing with the 
measurement; I am just saying that, in the big picture of writing 5GB of data, 
optimizing away this 300 ms buys you nothing. If we optimize for 
micro-benchmarks, what we will end up building is a system that is optimized 
for those benchmarks, not something that is really needed.
)

> Container Creation on datanodes take around 300ms due to rocksdb creation
> -
>
> Key: HDDS-2283
> URL: https://issues.apache.org/jira/browse/HDDS-2283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-2283.00.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Container creation on datanodes takes around 300ms due to rocksdb creation. 
> Rocksdb creation takes a considerable time and this needs to be optimized.
> Creating a rocksdb per disk should be enough, and each container can be a 
> table inside the rocksdb.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation

2019-10-16 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953181#comment-16953181
 ] 

Anu Engineer commented on HDDS-2283:


I am fine with this change, but the follow-up JIRA makes no sense to me; my 
earlier deleted comment was about that.

> Container Creation on datanodes take around 300ms due to rocksdb creation
> -
>
> Key: HDDS-2283
> URL: https://issues.apache.org/jira/browse/HDDS-2283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-2283.00.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Container creation on datanodes takes around 300ms due to rocksdb creation. 
> Rocksdb creation takes a considerable time and this needs to be optimized.
> Creating a rocksdb per disk should be enough, and each container can be a 
> table inside the rocksdb.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation

2019-10-16 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953179#comment-16953179
 ] 

Anu Engineer commented on HDDS-2283:


I really doubt this is a bottleneck. I am not disagreeing with the 
measurement; I am just saying that, in the big picture of writing 5GB of data, 
optimizing away this 300 ms buys you nothing. If we optimize for 
micro-benchmarks, what we will end up building is a system that is optimized 
for those benchmarks, not something that is really needed.


> Container Creation on datanodes take around 300ms due to rocksdb creation
> -
>
> Key: HDDS-2283
> URL: https://issues.apache.org/jira/browse/HDDS-2283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-2283.00.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Container creation on datanodes takes around 300ms due to rocksdb creation. 
> Rocksdb creation takes a considerable time and this needs to be optimized.
> Creating a rocksdb per disk should be enough, and each container can be a 
> table inside the rocksdb.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk

2019-10-16 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953176#comment-16953176
 ] 

Anu Engineer commented on HDDS-2317:


And how do we propose to move a container from one machine to another now?

> Change rocksDB per Container model to have table per container on RocksDb per 
> disk
> --
>
> Key: HDDS-2317
> URL: https://issues.apache.org/jira/browse/HDDS-2317
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Priority: Major
>
> Idea proposed by [~msingh] in HDDS-2283.
> Better utilize disk bandwidth by having a RocksDB per disk and making each 
> container a table inside it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk

2019-10-16 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953175#comment-16953175
 ] 

Anu Engineer commented on HDDS-2317:


Don't we create a container once per 5GB of data? Are we optimizing for 
micro-benchmarking scenarios?
I am really surprised to learn that 300ms for a DB creation is the bottleneck 
for a 5GB data write.


> Change rocksDB per Container model to have table per container on RocksDb per 
> disk
> --
>
> Key: HDDS-2317
> URL: https://issues.apache.org/jira/browse/HDDS-2317
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Priority: Major
>
> Idea proposed by [~msingh] in HDDS-2283.
> Better utilize disk bandwidth by having a RocksDB per disk and making each 
> container a table inside it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope

2019-10-16 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953133#comment-16953133
 ] 

Surendra Singh Lilhore commented on HDFS-14909:
---

Thanks [~elgoiri] for review.

Attached v3 patch.

> DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage 
> count for excluded node which is already part of excluded scope 
> -
>
> Key: HDFS-14909
> URL: https://issues.apache.org/jira/browse/HDFS-14909
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch, 
> HDFS-14909.003.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope

2019-10-16 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-14909:
--
Attachment: HDFS-14909.003.patch

> DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage 
> count for excluded node which is already part of excluded scope 
> -
>
> Key: HDFS-14909
> URL: https://issues.apache.org/jira/browse/HDFS-14909
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch, 
> HDFS-14909.003.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corrupted when they are reconstructed.

2019-10-16 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953129#comment-16953129
 ] 

Surendra Singh Lilhore commented on HDFS-14768:
---

Hi [~gjhkael],

Why is this change required? I didn't get it; could you please explain?
{code:java}
+  getSrcNodes()[i].isAlive())) {
+// Filter out decommissioning sources whose block index already has a live replica
+boolean existLiveReplica = false;
+int currentBlockIndex = liveBlockIndicies[i];
+for (int j = liveBlockIndicies.length - 1; j > i; j--) {
+  if (liveBlockIndicies[j] == currentBlockIndex
+  && getSrcNodes()[j].isInService()) {
+existLiveReplica = true;
+break;
+  }
+}
+if (!existLiveReplica) {
+  srcIndices.add(i);
+} {code}

> In some cases, erasure blocks are corrupted when they are reconstructed.
> 
>
> Key: HDFS-14768
> URL: https://issues.apache.org/jira/browse/HDFS-14768
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding, hdfs, namenode
>Affects Versions: 3.0.2
>Reporter: guojh
>Assignee: guojh
>Priority: Major
>  Labels: patch
> Fix For: 3.3.0
>
> Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, 
> HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, 
> HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, 
> guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, 
> zhaoyiming_UT_beofre_deomission.txt
>
>
> Policy is RS-6-3-1024K, version is Hadoop 3.0.2.
> Suppose a file's block indices are [0,1,2,3,4,5,6,7,8], we decommission 
> indices [3,4], and we increase the index 6 datanode's 
> pendingReplicationWithoutTargets so that it is larger than 
> replicationStreamsHardLimit (we set 14). Then, after the method 
> chooseSourceDatanodes of BlockManager, the liveBlockIndices is 
> [0,1,2,3,4,5,7,8] and the block counters are Live: 7, Decommission: 2.
> In the method scheduleReconstruction of BlockManager, additionalReplRequired 
> is 9 - 7 = 2. After the Namenode chooses two target datanodes, it will assign 
> an erasure-code reconstruction task to them.
> When the datanode gets the task, it builds targetIndices from liveBlockIndices 
> and the target length. The code is below.
> {code:java}
> targetIndices = new short[targets.length];
> private void initTargetIndices() {
>   BitSet bitset = reconstructor.getLiveBitSet();
>   int m = 0;
>   hasValidTargets = false;
>   for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
>     // a block index missing from the live set is a reconstruction target
>     if (!bitset.get(i)) {
>       if (reconstructor.getBlockLen(i) > 0) {
>         if (m < targets.length) {
>           targetIndices[m++] = (short) i;
>           hasValidTargets = true;
>         }
>       }
>     }
>   }
> }
> {code}
> targetIndices[0] = 6, but targetIndices[1] always stays 0, its initial value.
> The StripedReader therefore always creates readers from the first 6 block 
> indices, i.e. [0,1,2,3,4,5].
> Using the indices [0,1,2,3,4,5] to reconstruct target indices [6,0] triggers 
> the ISA-L bug: block index 6's data is corrupted (all data is zero).
> I wrote a unit test that reproduces this reliably.
> {code:java}
> // code placeholder
> private int replicationStreamsHardLimit = 
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT;
> numDNs = dataBlocks + parityBlocks + 10;
> @Test(timeout = 24)
> public void testFileDecommission() throws Exception {
>   LOG.info("Starting test testFileDecommission");
>   final Path ecFile = new Path(ecDir, "testFileDecommission");
>   int writeBytes = cellSize * dataBlocks;
>   writeStripedFile(dfs, ecFile, writeBytes);
>   Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
>   FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
>   final INodeFile fileNode = cluster.getNamesystem().getFSDirectory()
>   .getINode4Write(ecFile.toString()).asFile();
>   LocatedBlocks locatedBlocks =
>   StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
>   LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
>   .get(0);
>   DatanodeInfo[] dnLocs = lb.getLocations();
>   LocatedStripedBlock lastBlock =
>   (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
>   DatanodeInfo[] storageInfos = lastBlock.getLocations();
>   //
>   DatanodeDescriptor datanodeDescriptor = 
> cluster.getNameNode().getNamesystem()
>   
> .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
>   BlockInfo firstBlock = fileNode.getBlocks()[0];
>   DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock);
>   // the first heartbeat will consume 3 replica tasks
>   for (int i = 0; i <= replicationStreamsHardLimit + 3; i++)

[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd

2019-10-16 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953089#comment-16953089
 ] 

hemanthboyina commented on HDFS-13507:
--

Hi [~gangli2384], are you working on this?

> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Gang Li
>Priority: Minor
>  Labels: incompatible
> Attachments: HDFS-13507-HDFS-13891.003.patch, 
> HDFS-13507-HDFS-13891.004.patch, HDFS-13507.000.patch, HDFS-13507.001.patch, 
> HDFS-13507.002.patch
>
>
> Following up on the discussion in HDFS-13326, we should remove the "update" 
> functionality from routeradmin's add cmd to make it consistent with the RPC 
> calls.
> Note that this is an incompatible change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope

2019-10-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953062#comment-16953062
 ] 

Íñigo Goiri commented on HDFS-14909:


Thanks [~surendrasingh] for the update.
Let's just fix the checkstyles and we are good to go.

> DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage 
> count for excluded node which is already part of excluded scope 
> -
>
> Key: HDFS-14909
> URL: https://issues.apache.org/jira/browse/HDFS-14909
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk

2019-10-16 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDDS-2317:
--
Fix Version/s: (was: 0.5.0)

> Change rocksDB per Container model to have table per container on RocksDb per 
> disk
> --
>
> Key: HDDS-2317
> URL: https://issues.apache.org/jira/browse/HDDS-2317
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Priority: Major
>
> Idea proposed by [~msingh] in HDDS-2283.
> Better utilize disk bandwidth by having a RocksDB per disk and making each 
> container a table inside it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk

2019-10-16 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle reassigned HDDS-2317:
-

Assignee: (was: Siddharth Wagle)

> Change rocksDB per Container model to have table per container on RocksDb per 
> disk
> --
>
> Key: HDDS-2317
> URL: https://issues.apache.org/jira/browse/HDDS-2317
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
>
> Idea proposed by [~msingh] in HDDS-2283.
> Better utilize disk bandwidth by having a RocksDB per disk and making each 
> container a table inside it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk

2019-10-16 Thread Siddharth Wagle (Jira)
Siddharth Wagle created HDDS-2317:
-

 Summary: Change rocksDB per Container model to have table per 
container on RocksDb per disk
 Key: HDDS-2317
 URL: https://issues.apache.org/jira/browse/HDDS-2317
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Affects Versions: 0.5.0
Reporter: Siddharth Wagle
Assignee: Siddharth Wagle
 Fix For: 0.5.0


Idea proposed by [~msingh] in HDDS-2283.

Better utilize disk bandwidth by having a RocksDB per disk and making each 
container a table inside it.
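
A minimal sketch of what this could look like, with RocksDB column families 
standing in for the per-container tables (the paths, names, and the 
table-to-column-family mapping are assumptions; requires a recent rocksdbjni):

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.rocksdb.ColumnFamilyDescriptor;
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.DBOptions;
import org.rocksdb.RocksDB;

public class PerDiskDbSketch {
  public static void main(String[] args) throws Exception {
    RocksDB.loadLibrary();
    // one DB per disk; each container becomes a column family ("table")
    List<ColumnFamilyDescriptor> cfs = Arrays.asList(
        new ColumnFamilyDescriptor(RocksDB.DEFAULT_COLUMN_FAMILY),
        new ColumnFamilyDescriptor("container-1".getBytes(StandardCharsets.UTF_8)));
    List<ColumnFamilyHandle> handles = new ArrayList<>();
    try (DBOptions opts = new DBOptions()
             .setCreateIfMissing(true)
             .setCreateMissingColumnFamilies(true);
         RocksDB db = RocksDB.open(opts, "/data/disk1/containers.db", cfs, handles)) {
      // all keys for container-1 live in its own column family
      db.put(handles.get(1), "blockKey".getBytes(StandardCharsets.UTF_8),
          "blockData".getBytes(StandardCharsets.UTF_8));
    }
  }
}
{code}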



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2283?focusedWorklogId=329291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329291
 ]

ASF GitHub Bot logged work on HDDS-2283:


Author: ASF GitHub Bot
Created on: 16/Oct/19 17:45
Start Date: 16/Oct/19 17:45
Worklog Time Spent: 10m 
  Work Description: swagle commented on pull request #41: HDDS-2283. 
Container Creation on datanodes take around 300ms due to rocksdb creation.
URL: https://github.com/apache/hadoop-ozone/pull/41
 
 
   Container creation on datanodes takes around 300ms due to rocksdb creation. 
Rocksdb creation takes a considerable time and this needs to be optimized.
   
   Creating a rocksdb per disk should be enough, and each container can be a 
table inside the rocksdb.
   
   `2019-10-15 13:20:10,714 INFO  utils.MetadataStoreBuilder 
(MetadataStoreBuilder.java:build(124)) - Time before create, load options: 81
   2019-10-15 13:20:10,715 INFO  utils.RocksDBStore 
(RocksDBStore.java:(68)) - Time to load library: 0
   2019-10-15 13:20:10,723 INFO  utils.RocksDBStore 
(RocksDBStore.java:(75)) - Time to open: 8
   2019-10-15 13:20:10,723 INFO  helpers.KeyValueContainerUtil 
(KeyValueContainerUtil.java:createContainerMetaData(85)) - Total time to 
create: {}95`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329291)
Remaining Estimate: 0h
Time Spent: 10m

> Container Creation on datanodes take around 300ms due to rocksdb creation
> -
>
> Key: HDDS-2283
> URL: https://issues.apache.org/jira/browse/HDDS-2283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-2283.00.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Container creation on datanodes takes around 300ms due to rocksdb creation. 
> Rocksdb creation takes a considerable time and this needs to be optimized.
> Creating a rocksdb per disk should be enough, and each container can be a 
> table inside the rocksdb.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2283:
-
Labels: pull-request-available  (was: )

> Container Creation on datanodes take around 300ms due to rocksdb creation
> -
>
> Key: HDDS-2283
> URL: https://issues.apache.org/jira/browse/HDDS-2283
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-2283.00.patch
>
>
> Container creation on datanodes takes around 300ms due to rocksdb creation. 
> Rocksdb creation takes a considerable time and this needs to be optimized.
> Creating a rocksdb per disk should be enough, and each container can be a 
> table inside the rocksdb.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14546) Document block placement policies

2019-10-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953044#comment-16953044
 ] 

Íñigo Goiri commented on HDFS-14546:


Nothing to add to the comments; it would be nice to be able to see the 
markdown rendered.
This would be a good case for reviewing via GitHub PRs.

> Document block placement policies
> -
>
> Key: HDFS-14546
> URL: https://issues.apache.org/jira/browse/HDFS-14546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Amithsha
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, 
> HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, 
> HDFS-14546-06.patch, HdfsDesign.patch
>
>
> Currently, all the documentation refers to the default block placement policy.
> However, over time there have been new policies:
> * BlockPlacementPolicyRackFaultTolerant (HDFS-7891)
> * BlockPlacementPolicyWithNodeGroup (HDFS-3601)
> * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006)
> We should update the documentation to cover them, explaining their 
> particularities and probably how to set up each one of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions

2019-10-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953043#comment-16953043
 ] 

Íñigo Goiri commented on HDFS-14284:


Thanks [~ayushtkn]. Yes, that's it: RemoteException only unwraps a few 
exception types, not all of them.
I guess this is fine then.

> RBF: Log Router identifier when reporting exceptions
> 
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch, 
> HDFS-14284.003.patch, HDFS-14284.004.patch, HDFS-14284.005.patch, 
> HDFS-14284.006.patch
>
>
> The typical setup is to use multiple Routers through 
> ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is 
> hard to know which one it was.
> We should have a way to identify which Router/Namenode triggered the 
> exception.
> This would also apply to Observer Namenodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2285) GetBlock and ReadChunk command from the client should be sent to the same datanode to re-use the same connection

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2285?focusedWorklogId=329288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329288
 ]

ASF GitHub Bot logged work on HDDS-2285:


Author: ASF GitHub Bot
Created on: 16/Oct/19 17:39
Start Date: 16/Oct/19 17:39
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #40: HDDS-2285. 
GetBlock and ReadChunk command from the client should be s…
URL: https://github.com/apache/hadoop-ozone/pull/40
 
 
   It can be observed that the GetBlock and ReadChunk commands are sent to 2 
different datanodes. They should be sent to the same datanode to re-use the 
connection.
   
   ```
   19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to 
datanode 172.26.32.224
   19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to 
datanode 172.26.32.231
   ```
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329288)
Remaining Estimate: 0h
Time Spent: 10m

> GetBlock and ReadChunk command from the client should be sent to the same 
> datanode to re-use the same connection
> 
>
> Key: HDDS-2285
> URL: https://issues.apache.org/jira/browse/HDDS-2285
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Mukul Kumar Singh
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It can be observed that the GetBlock and ReadChunk commands are sent to 2 
> different datanodes. They should be sent to the same datanode to re-use the 
> connection.
> {code}
> 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to 
> datanode 172.26.32.224
> 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to 
> datanode 172.26.32.231
> {code}
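
Illustrative only (not the actual patch): a tiny sticky-node helper showing 
the idea of pinning all commands for a block to whichever datanode was chosen 
first:

{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch: pin all commands for one block to a single datanode. */
class StickyNodeSelector<N> {
  private final Map<Long, N> pinned = new ConcurrentHashMap<>();

  /** The first call picks a node for blockId; later calls return the same node. */
  N nodeFor(long blockId, List<N> candidates) {
    return pinned.computeIfAbsent(blockId, id -> candidates.get(0));
  }
}
{code}

With such a helper, GetBlock and ReadChunk for the same block would resolve to 
the same datanode and re-use its connection.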



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-10-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953041#comment-16953041
 ] 

Íñigo Goiri commented on HDFS-14854:


[~belugabehr], thanks for taking a pass.

[~sodonnell], a few minor things:
* I would solve the HiddenField checkstyle warning.
* It is not very common to have a blank line between the class and its javadoc.
* Add short javadocs (one-line /** */) for cancelledNodes and blocksPerLock.
* Unify the comment lines 174-182 in the new monitor.
* moveBlocksToPending is a little hard to read, especially the loops breaking 
in a couple of places. Extracting some methods might also help, for example 
around line 469.
* L633 should be a single line.

Other than these cosmetic changes, this looks good.
As this is pretty big, I'd like others to take another look.

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, 
> HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch, 
> HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch, 
> HDFS-14854.008.patch, HDFS-14854.009.patch, HDFS-14854.010.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue, and under-replicated 
> blocks from a future node or disk failure may wait for a long time before 
> they are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicated, wasting resources
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled, giving the option of using either the existing implementation or 
> the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2285) GetBlock and ReadChunk command from the client should be sent to the same datanode to re-use the same connection

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2285:
-
Labels: pull-request-available  (was: )

> GetBlock and ReadChunk command from the client should be sent to the same 
> datanode to re-use the same connection
> 
>
> Key: HDDS-2285
> URL: https://issues.apache.org/jira/browse/HDDS-2285
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Mukul Kumar Singh
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>
> It can be observed that the GetBlock and ReadChunk commands are sent to 2 
> different datanodes. They should be sent to the same datanode to re-use the 
> connection.
> {code}
> 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to 
> datanode 172.26.32.224
> 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to 
> datanode 172.26.32.231
> {code}
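A minimal sketch of the pinning idea behind this issue (illustrative only; the real XceiverClientGrpc API is not reproduced here):

{code:java}
import java.util.List;

/** Sketch only: remember the datanode that served the first command and
 *  route follow-up commands (e.g. ReadChunk after GetBlock) to the same
 *  datanode so the underlying connection is reused. */
class PinnedDatanodeSelector {
  private String pinned; // address of the datanode used for the first command

  String select(List<String> replicaAddresses) {
    if (pinned == null || !replicaAddresses.contains(pinned)) {
      pinned = replicaAddresses.get(0); // first call: pick any replica
    }
    return pinned; // later calls stick to the same datanode
  }
}
{code}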



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-10-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953029#comment-16953029
 ] 

Íñigo Goiri commented on HDFS-14908:


At this point, I think we are replicating a lot of what 
{{hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Path.java}}
 does.
However, I have to say that Path may have more overhead.
The other thing is that this will be called pretty frequently in the 
LeaseManager (metaSave is not that common).

We need to do a couple of things here:
* Make sure that this is not done somewhere else in Hadoop.
* Try to benchmark/profile this a little.

[~weichiu], are you familiar with other places where we use this kind of pattern?
[~belugabehr], any idea on optimizations (while keeping this somewhat readable) 
for isParent()?
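
For reference, a minimal sketch of the parent/ancestor check being discussed (illustrative only, not the patch): a plain prefix check would wrongly treat /mnt2/test1 as being under /mnt, so the sketch requires a separator right after the ancestor.

{code:java}
/** Sketch only: true iff ancestor is a proper ancestor of path. */
final class PathSketch {
  static boolean isParent(String ancestor, String path) {
    if (ancestor == null || path == null || !path.startsWith(ancestor)) {
      return false;
    }
    if (ancestor.equals("/")) {
      return path.length() > 1; // "/" is the ancestor of everything else
    }
    // "/mnt" matches "/mnt/test1" but not "/mnt2/test1".
    return path.length() > ancestor.length()
        && path.charAt(ancestor.length()) == '/';
  }
}
{code}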


> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is a prefix of the open file paths. We should check whether the filter path 
> is a parent/ancestor of the open files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable

2019-10-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953021#comment-16953021
 ] 

Íñigo Goiri commented on HDFS-14887:


[~tasanuma], do you mind taking a final look to see if [^HDFS-14887.008.patch] 
solves your concerns?

> RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
> --
>
> Key: HDFS-14887
> URL: https://issues.apache.org/jira/browse/HDFS-14887
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, 
> HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, 
> HDFS-14887.005.patch, HDFS-14887.006.patch, HDFS-14887.007.patch, 
> HDFS-14887.008.patch
>
>
> In the Router Web UI, the Observer Namenode information is displayed as Unavailable.
> We should show a proper icon for them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable

2019-10-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953018#comment-16953018
 ] 

Íñigo Goiri commented on HDFS-14887:


+1 on  [^HDFS-14887.008.patch].

> RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
> --
>
> Key: HDFS-14887
> URL: https://issues.apache.org/jira/browse/HDFS-14887
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, 
> HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, 
> HDFS-14887.005.patch, HDFS-14887.006.patch, HDFS-14887.007.patch, 
> HDFS-14887.008.patch
>
>
> In the Router Web UI, the Observer Namenode information is displayed as Unavailable.
> We should show a proper icon for them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2254) Fix flaky unit test TestContainerStateMachine#testRatisSnapshotRetention

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2254?focusedWorklogId=329262&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329262
 ]

ASF GitHub Bot logged work on HDDS-2254:


Author: ASF GitHub Bot
Created on: 16/Oct/19 17:02
Start Date: 16/Oct/19 17:02
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #31: HDDS-2254. 
Fix flaky unit test TestContainerStateMachine#testRatisSnapshotRetention.
URL: https://github.com/apache/hadoop-ozone/pull/31
 
   
   ## What changes were proposed in this pull request?
   On running the unit test repeatedly on a local machine, it failed 
intermittently while asserting a null value for the CSM snapshot. This assertion 
does not hold when the other unit test in the class executes first and creates 
keys in the cluster/container. Hence, the tests were moved to a model where each 
unit test creates its own cluster.
   
   https://issues.apache.org/jira/browse/HDDS-2254
   
   ## How was this patch tested?
   Ran the unit tests in the IDE and command line.
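   A minimal JUnit sketch of that per-test isolation model (hypothetical 
stand-ins; not the real MiniOzoneCluster API):
   
{code:java}
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

/** Sketch only: each test builds and tears down its own cluster, so keys
 *  created by one test can no longer break another test's assertions. */
public class PerTestClusterSketch {
  /** Hypothetical stand-in for the real mini cluster. */
  static class FakeCluster implements AutoCloseable {
    @Override public void close() { }
  }

  private FakeCluster cluster;

  @Before
  public void setUp() {
    cluster = new FakeCluster(); // fresh state for every test
  }

  @After
  public void tearDown() throws Exception {
    cluster.close();
  }

  @Test
  public void testSnapshotRetention() {
    // assertions run against this test's own cluster
  }
}
{code}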
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329262)
Time Spent: 1h 40m  (was: 1.5h)

> Fix flaky unit test TestContainerStateMachine#testRatisSnapshotRetention
> ---
>
> Key: HDDS-2254
> URL: https://issues.apache.org/jira/browse/HDDS-2254
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The test always fails with an assertion error:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachine.testRatisSnapshotRetention(TestContainerStateMachine.java:188)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2254) Fix flaky unit test TestContainerStateMachine#testRatisSnapshotRetention

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2254?focusedWorklogId=329259&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329259
 ]

ASF GitHub Bot logged work on HDDS-2254:


Author: ASF GitHub Bot
Created on: 16/Oct/19 17:00
Start Date: 16/Oct/19 17:00
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #31: HDDS-2254. 
Fix flaky unit test TestContainerStateMachine#testRatisSnapshotRetention.
URL: https://github.com/apache/hadoop-ozone/pull/31
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329259)
Time Spent: 1.5h  (was: 1h 20m)

> Fix flaky unit test TestContainerStateMachine#testRatisSnapshotRetention
> ---
>
> Key: HDDS-2254
> URL: https://issues.apache.org/jira/browse/HDDS-2254
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The test always fails with an assertion error:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachine.testRatisSnapshotRetention(TestContainerStateMachine.java:188)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer

2019-10-16 Thread Jitendra Nath Pandey (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reopened HDDS-2181:


The pull request is still open.

> Ozone Manager should send correct ACL type in ACL requests to Authorizer
> 
>
> Key: HDDS-2181
> URL: https://issues.apache.org/jira/browse/HDDS-2181
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Currently, Ozone Manager sends "WRITE" as the ACLType for key create, key delete, 
> and bucket create operations. Fix the ACL type in all requests to the 
> authorizer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable

2019-10-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953000#comment-16953000
 ] 

Hadoop QA commented on HDFS-14887:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  6m 
54s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14887 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12983186/HDFS-14887.008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ced60378d4de 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 375224e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28098/testReport/ |
| Max. process+thread count | 2691 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28098/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
> --

[jira] [Assigned] (HDDS-2314) Fix TestOMKeyCommitRequest Error

2019-10-16 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien reassigned HDDS-2314:
--

Assignee: (was: YiSheng Lien)

> Fix TestOMKeyCommitRequest Error
> 
>
> Key: HDDS-2314
> URL: https://issues.apache.org/jira/browse/HDDS-2314
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: YiSheng Lien
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
> [ERROR] Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 2.479 
> s <<< FAILURE! - in 
> org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest
> [ERROR] 
> testValidateAndUpdateCacheWithKeyNotFound(org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest)
>   Time elapsed: 2.045 s  <<< ERROR!
> java.lang.IllegalMonitorStateException: Releasing lock on resource 
> /e4ec6d72-f27c-46f8-8434-e704e091f87b/db3319a6-6d78-42e1-8352-9feb099de70a 
> without acquiring lock
>   at 
> org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220)
>   at 
> org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java:168)
>   at 
> org.apache.hadoop.ozone.lock.LockManager.writeUnlock(LockManager.java:148)
>   at 
> org.apache.hadoop.ozone.om.lock.OzoneManagerLock.unlock(OzoneManagerLock.java:364)
>   at 
> org.apache.hadoop.ozone.om.lock.OzoneManagerLock.releaseWriteLock(OzoneManagerLock.java:329)
>   at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest.validateAndUpdateCache(OMKeyCommitRequest.java:177)
>   at 
> org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest.testValidateAndUpdateCacheWithKeyNotFound(TestOMKeyCommitRequest.java:202)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> [ERROR] 
> testValidateAndUpdateCacheWithBucketNotFound(org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest)
>   Time elapsed: 0.098 s  <<< ERROR!
> java.lang.IllegalMonitorStateException: Releasing lock on resource 
> /4696e0f1-6439-4300-a1bc-f30c37a12a37/352527b9-eb75-49af-b06a-57cbc697730c 
> without acquiring lock
>   at 
> org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220)
>   at 
> org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java

[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-10-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952984#comment-16952984
 ] 

Hadoop QA commented on HDFS-14908:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 44s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 402 unchanged - 0 fixed = 403 total (was 402) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 28s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14908 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12983175/HDFS-14908.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f2312368abc2 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 090f73a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28097/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28097/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28097/testReport/ |
| Max. process+thread count | 2820 (v

[jira] [Commented] (HDFS-13506) RBF: Create destination directory when adding mount entry using router admin cmds.

2019-10-16 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952963#comment-16952963
 ] 

hemanthboyina commented on HDFS-13506:
--

Hi [~dibyendu_hadoop], are you working on this?

> RBF: Create destination directory when adding mount entry using router admin 
> cmds.
> --
>
> Key: HDFS-13506
> URL: https://issues.apache.org/jira/browse/HDFS-13506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
>
> Currently there is no option to create the destination when adding a mount entry. 
> The user has to create the destination separately.
> In the router admin -add command we can add an option -createDest to create the 
> destination for the mount entry.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable

2019-10-16 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14887:
-
Attachment: HDFS-14887.008.patch

> RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
> --
>
> Key: HDFS-14887
> URL: https://issues.apache.org/jira/browse/HDFS-14887
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, 
> HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, 
> HDFS-14887.005.patch, HDFS-14887.006.patch, HDFS-14887.007.patch, 
> HDFS-14887.008.patch
>
>
> In the Router Web UI, the Observer Namenode information is displayed as Unavailable.
> We should show a proper icon for them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14802) The feature of protect directories should be used in RenameOp

2019-10-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952900#comment-16952900
 ] 

Ayush Saxena commented on HDFS-14802:
-

Thanx Steve for the help.
[~ferhui] can you update accordingly?
Additionally, I think you can add the method in {{DfsUtil}} rather than creating a new Util 
class, and the check should be before the unprotected methods; we can't throw an 
exception from the unprotected ones.
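
A minimal self-contained sketch of that ordering (hypothetical names and paths, not the actual patch):

{code:java}
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Sketch only: the protected-directory guard runs, and may throw,
 *  before the "unprotected" method is entered. */
class RenameOpSketch {
  // Hypothetical entries standing in for fs.protected.directories.
  private final Set<String> protectedDirs =
      new HashSet<>(Arrays.asList("/warehouse", "/user/critical"));

  void rename(String src, String dst) throws IOException {
    if (protectedDirs.contains(src)) {
      throw new IOException("Cannot rename protected directory " + src);
    }
    unprotectedRename(src, dst); // assumes all validation already happened
  }

  private void unprotectedRename(String src, String dst) {
    // the actual rename work; never throws for validation reasons
  }
}
{code}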

> The feature of protect directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, 
> HDFS-14802.003.patch
>
>
> Now we can set fs.protected.directories to prevent users from deleting 
> important directories, but users can work around the limitation to delete them:
> 1. Rename the directories and then delete them.
> 2. Move the directories to the trash, and the namenode will delete them.
> So I think we should use the feature of protected directories in RenameOp too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14546) Document block placement policies

2019-10-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952897#comment-16952897
 ] 

Ayush Saxena commented on HDFS-14546:
-

You can run {{mvn clean site}} in {{/hadoop-hdfs-project/hadoop-hdfs}} and build the 
webpage locally. The webpage will be in the 
{{/hadoop-hdfs-project/hadoop-hdfs/target/site}} path.

> Document block placement policies
> -
>
> Key: HDFS-14546
> URL: https://issues.apache.org/jira/browse/HDFS-14546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Amithsha
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, 
> HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, 
> HDFS-14546-06.patch, HdfsDesign.patch
>
>
> Currently, all the documentation refers to the default block placement policy.
> However, over time there have been new policies:
> * BlockPlacementPolicyRackFaultTolerant (HDFS-7891)
> * BlockPlacementPolicyWithNodeGroup (HDFS-3601)
> * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006)
> We should update the documentation to refer to them, explaining their 
> particularities and probably how to set up each one of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14546) Document block placement policies

2019-10-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952893#comment-16952893
 ] 

Ayush Saxena commented on HDFS-14546:
-

Thanx [~Amithsha] for the patch. It almost looks good; a couple of things:

This isn't required:
{code:java}
+* [Block Placement Policies](#Block_Placement_Policies)
+ * [Introduction](#Introduction)
+ * [Additional Types of Policies](#Policy_Types)
+ * 
[BlockPlacementPolicyRackFaultTolerant](#Block_PlacementPolicy_RackFaultTolerant)
+ * [BlockPlacementPolicyWithNodeGroup](#BlockPlacementPolicy_With_NodeGroup)
+ * 
[BlockPlacementPolicyWithUpgradeDomain](#BlockPlacementPolicy_With_UpgradeDomain)
{code}

Just having a single line will automatically generate this, like:
{code:java}
BlockPlacementPolicies
==

   // Add this line

##Introduction

{code}

These should have ## rather than ###:


{code:java}
+### BlockPlacementPolicyRackFaultTolerant
{code}

These links should be relative paths:


{code:java}
+Detailed info about configuration 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUpgradeDomain.html
{code}

The image has too much whitespace at the bottom; can you reduce it? It pushes the 
text too far down.

Let me know if you are facing any problems or need any help with anything.

[~elgoiri], do you want to add anything?

> Document block placement policies
> -
>
> Key: HDFS-14546
> URL: https://issues.apache.org/jira/browse/HDFS-14546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Íñigo Goiri
>Assignee: Amithsha
>Priority: Major
>  Labels: documentation
> Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, 
> HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, 
> HDFS-14546-06.patch, HdfsDesign.patch
>
>
> Currently, all the documentation refers to the default block placement policy.
> However, over time there have been new policies:
> * BlockPlacementPolicyRackFaultTolerant (HDFS-7891)
> * BlockPlacementPolicyWithNodeGroup (HDFS-3601)
> * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006)
> We should update the documentation to refer to them, explaining their 
> particularities and probably how to set up each one of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14739) RBF: LS command for mount point shows wrong owner and permission information.

2019-10-16 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952877#comment-16952877
 ] 

Hudson commented on HDFS-14739:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17541 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17541/])
HDFS-14739. RBF: LS command for mount point shows wrong owner and (ayushsaxena: 
rev 375224edebb1c937afe4bbea8fe884499ca8ece5)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestDisableNameservices.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/MountTableResolver.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterMountTable.java
* (add) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/RouterResolveException.java


> RBF: LS command for mount point shows wrong owner and permission information.
> -
>
> Key: HDFS-14739
> URL: https://issues.apache.org/jira/browse/HDFS-14739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: xuzq
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14739-trunk-001.patch, HDFS-14739-trunk-002.patch, 
> HDFS-14739-trunk-003.patch, HDFS-14739-trunk-004.patch, 
> HDFS-14739-trunk-005.patch, HDFS-14739-trunk-006.patch, 
> HDFS-14739-trunk-007.patch, HDFS-14739-trunk-008.patch, 
> HDFS-14739-trunk-009.patch, HDFS-14739-trunk-010.patch, 
> HDFS-14739-trunk-011.patch, image-2019-08-16-17-15-50-614.png, 
> image-2019-08-16-17-16-00-863.png, image-2019-08-16-17-16-34-325.png
>
>
> ||source||target namespace||destination||owner||group||permission||
> |/mnt|ns0|/mnt|mnt|mnt_group|755|
> |/mnt/test1|ns1|/mnt/test1|mnt_test1|mnt_test1_group|755|
> |/test1|ns1|/test1|test1|test1_group|755|
> When doing getListing("/mnt"), the owner of */mnt/test1* should be *mnt_test1* 
> instead of *test1* in the result.
>  
> And if the mount table is as below, we should support getListing("/mnt") instead 
> of throwing an IOException when dfs.federation.router.default.nameservice.enable is 
> false.
> ||source||target namespace||destination||owner||group||permission||
> |/mnt/test1|ns0|/mnt/test1|test1|test1|755|
> |/mnt/test2|ns1|/mnt/test2|test2|test2|755|
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14739) RBF: LS command for mount point shows wrong owner and permission information.

2019-10-16 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14739:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> RBF: LS command for mount point shows wrong owner and permission information.
> -
>
> Key: HDFS-14739
> URL: https://issues.apache.org/jira/browse/HDFS-14739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: xuzq
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14739-trunk-001.patch, HDFS-14739-trunk-002.patch, 
> HDFS-14739-trunk-003.patch, HDFS-14739-trunk-004.patch, 
> HDFS-14739-trunk-005.patch, HDFS-14739-trunk-006.patch, 
> HDFS-14739-trunk-007.patch, HDFS-14739-trunk-008.patch, 
> HDFS-14739-trunk-009.patch, HDFS-14739-trunk-010.patch, 
> HDFS-14739-trunk-011.patch, image-2019-08-16-17-15-50-614.png, 
> image-2019-08-16-17-16-00-863.png, image-2019-08-16-17-16-34-325.png
>
>
> ||source||target namespace||destination||owner||group||permission||
> |/mnt|ns0|/mnt|mnt|mnt_group|755|
> |/mnt/test1|ns1|/mnt/test1|mnt_test1|mnt_test1_group|755|
> |/test1|ns1|/test1|test1|test1_group|755|
> When doing getListing("/mnt"), the owner of */mnt/test1* should be *mnt_test1* 
> instead of *test1* in the result.
>  
> And if the mount table is as below, we should support getListing("/mnt") instead 
> of throwing an IOException when dfs.federation.router.default.nameservice.enable is 
> false.
> ||source||target namespace||destination||owner||group||permission||
> |/mnt/test1|ns0|/mnt/test1|test1|test1|755|
> |/mnt/test2|ns1|/mnt/test2|test2|test2|755|
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2220) HddsVolume needs a toString method

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2220?focusedWorklogId=329190&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329190
 ]

ASF GitHub Bot logged work on HDDS-2220:


Author: ASF GitHub Bot
Created on: 16/Oct/19 14:16
Start Date: 16/Oct/19 14:16
Worklog Time Spent: 10m 
  Work Description: cxorm commented on pull request #1652: HDDS-2220. 
HddsVolume needs a toString method.
URL: https://github.com/apache/hadoop/pull/1652
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329190)
Time Spent: 1h  (was: 50m)

> HddsVolume needs a toString method
> --
>
> Key: HDDS-2220
> URL: https://issues.apache.org/jira/browse/HDDS-2220
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Marton Elek
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is logged to the console of datanodes:
> {code:java}
> 2019-10-01 11:37:59 INFO  HddsVolumeChecker:202 - Scheduled health check for 
> volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 11:52:59 INFO  ThrottledAsyncChecker:139 - Scheduling a check for 
> org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 11:52:59 INFO  HddsVolumeChecker:202 - Scheduled health check for 
> volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:07:59 INFO  ThrottledAsyncChecker:139 - Scheduling a check for 
> org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:07:59 INFO  HddsVolumeChecker:202 - Scheduled health check for 
> volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:22:59 INFO  ThrottledAsyncChecker:139 - Scheduling a check for 
> org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:22:59 INFO  HddsVolumeChecker:202 - Scheduled health check for 
> volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:37:59 INFO  ThrottledAsyncChecker:139 - Scheduling a check for 
> org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:37:59 INFO  HddsVolumeChecker:202 - Scheduled health check for 
> volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:52:59 INFO  ThrottledAsyncChecker:139 - Scheduling a check for 
> org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a
> 2019-10-01 12:52:59 INFO  HddsVolumeChecker:202 - Scheduled health check for 
> volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a 
> {code}
> Without a proper HddsVolume.toString it's hard to say which volume is 
> checked...
>  
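
A minimal sketch of the kind of toString being asked for (the field and constructor are assumed for illustration, not taken from HddsVolume):

{code:java}
import java.io.File;

/** Sketch only: name the volume by its root directory instead of the
 *  default identity hash ("HddsVolume@5460cf3a"). */
class VolumeToStringSketch {
  private final File rootDir;

  VolumeToStringSketch(File rootDir) {
    this.rootDir = rootDir;
  }

  @Override
  public String toString() {
    // e.g. "HddsVolume{root=/data/disk1/hdds}"
    return "HddsVolume{root=" + rootDir.getPath() + "}";
  }
}
{code}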



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14739) RBF: LS command for mount point shows wrong owner and permission information.

2019-10-16 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952868#comment-16952868
 ] 

Ayush Saxena commented on HDFS-14739:
-

Committed to trunk.
Thanx [~LiJinglun] for the contribution, [~xuzq_zander] for the report, and 
[~elgoiri] for the review!!!

> RBF: LS command for mount point shows wrong owner and permission information.
> -
>
> Key: HDFS-14739
> URL: https://issues.apache.org/jira/browse/HDFS-14739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: xuzq
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-14739-trunk-001.patch, HDFS-14739-trunk-002.patch, 
> HDFS-14739-trunk-003.patch, HDFS-14739-trunk-004.patch, 
> HDFS-14739-trunk-005.patch, HDFS-14739-trunk-006.patch, 
> HDFS-14739-trunk-007.patch, HDFS-14739-trunk-008.patch, 
> HDFS-14739-trunk-009.patch, HDFS-14739-trunk-010.patch, 
> HDFS-14739-trunk-011.patch, image-2019-08-16-17-15-50-614.png, 
> image-2019-08-16-17-16-00-863.png, image-2019-08-16-17-16-34-325.png
>
>
> ||source||target namespace||destination||owner||group||permission||
> |/mnt|ns0|/mnt|mnt|mnt_group|755|
> |/mnt/test1|ns1|/mnt/test1|mnt_test1|mnt_test1_group|755|
> |/test1|ns1|/test1|test1|test1_group|755|
> When doing getListing("/mnt"), the owner of */mnt/test1* should be *mnt_test1* 
> instead of *test1* in the result.
>  
> And if the mount table is as below, we should support getListing("/mnt") instead 
> of throwing an IOException when dfs.federation.router.default.nameservice.enable is 
> false.
> ||source||target namespace||destination||owner||group||permission||
> |/mnt/test1|ns0|/mnt/test1|test1|test1|755|
> |/mnt/test2|ns1|/mnt/test2|test2|test2|755|
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2316) Support to skip recon and/or ozonefs during the build

2019-10-16 Thread Anu Engineer (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HDDS-2316.

Fix Version/s: 0.5.0
   Resolution: Fixed

Committed to the master.

> Support to skip recon and/or ozonefs during the build
> -
>
> Key: HDDS-2316
> URL: https://issues.apache.org/jira/browse/HDDS-2316
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> (I almost used this Jira summary: "Fast-lane to ozone build". It was very hard 
> to resist...)
>  
>  The two slowest parts of the Ozone build as of now are:
>  # The (multiple) shading of ozonefs
>  # The frontend build/obfuscation of ozone recon
> [~aengineer] suggested introducing options to skip them, as they are not 
> required for the build all the time.
> This patch introduces '-DskipRecon' and '-DskipShade' options to provide a 
> faster way to create a *partial* build.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2316) Support to skip recon and/or ozonefs during the build

2019-10-16 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952849#comment-16952849
 ] 

Anu Engineer commented on HDDS-2316:


Thank you for the fast-lane. I am taking the first ride. 


> Support to skip recon and/or ozonefs during the build
> -
>
> Key: HDDS-2316
> URL: https://issues.apache.org/jira/browse/HDDS-2316
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> (I almost used this Jira summary: "Fast-lane to ozone build". It was very hard 
> to resist...)
>  
>  The two slowest parts of the Ozone build as of now are:
>  # The (multiple) shading of ozonefs
>  # The frontend build/obfuscation of ozone recon
> [~aengineer] suggested introducing options to skip them, as they are not 
> required for the build all the time.
> This patch introduces '-DskipRecon' and '-DskipShade' options to provide a 
> faster way to create a *partial* build.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2316) Support to skip recon and/or ozonefs during the build

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2316?focusedWorklogId=329177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329177
 ]

ASF GitHub Bot logged work on HDDS-2316:


Author: ASF GitHub Bot
Created on: 16/Oct/19 13:54
Start Date: 16/Oct/19 13:54
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #39: HDDS-2316. 
Support to skip recon and/or ozonefs during the build
URL: https://github.com/apache/hadoop-ozone/pull/39
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329177)
Time Spent: 20m  (was: 10m)

> Support to skip recon and/or ozonefs during the build
> -
>
> Key: HDDS-2316
> URL: https://issues.apache.org/jira/browse/HDDS-2316
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> (I almost used this Jira summary: "Fast-lane to ozone build". It was very hard 
> to resist...)
>  
>  The two slowest parts of the Ozone build as of now are:
>  # The (multiple) shading of ozonefs
>  # The frontend build/obfuscation of ozone recon
> [~aengineer] suggested introducing options to skip them, as they are not 
> required for the build all the time.
> This patch introduces '-DskipRecon' and '-DskipShade' options to provide a 
> faster way to create a *partial* build.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-10-16 Thread Jinglun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HDFS-14908:
---
Attachment: HDFS-14908.002.patch

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is a prefix of the open file paths. We should check whether the filter path 
> is a parent/ancestor of the open files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-10-16 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952838#comment-16952838
 ] 

Jinglun commented on HDFS-14908:


Thanks [~elgoiri] for your nice comments! I made DFSUtil.isParent() stronger 
and more readable, and added a new test case for it.

Upload v02.

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is a prefix of the open file paths. We should check whether the filter path 
> is a parent/ancestor of the open files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2235) Ozone Datanode web page doesn't exist

2019-10-16 Thread Sandeep Nemuri (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Nemuri reassigned HDDS-2235:


Assignee: Sandeep Nemuri

> Ozone Datanode web page doesn't exist
> -
>
> Key: HDDS-2235
> URL: https://issues.apache.org/jira/browse/HDDS-2235
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Sandeep Nemuri
>Priority: Major
>
> On trying to access the dn UI, the following error is seen.
> http://dn_ip:9882/
> {code}
> HTTP ERROR 403
> Problem accessing /. Reason:
> Forbidden
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2218) Use OZONE_CLASSPATH instead of HADOOP_CLASSPATH

2019-10-16 Thread Sandeep Nemuri (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandeep Nemuri reassigned HDDS-2218:


Assignee: Sandeep Nemuri

> Use OZONE_CLASSPATH instead of HADOOP_CLASSPATH
> ---
>
> Key: HDDS-2218
> URL: https://issues.apache.org/jira/browse/HDDS-2218
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: docker
>Reporter: Marton Elek
>Assignee: Sandeep Nemuri
>Priority: Major
>  Labels: newbe
>
> HADOOP_CLASSPATH is the standard way to add additional jar files to the 
> classpath of the mapreduce/spark/... jobs. If something is added to the 
> HADOOP_CLASSPATH, then it should be on the classpath of the classic hadoop 
> daemons.
> But for the Ozone components we don't need any new jar files (cloud 
> connectors, libraries). I think it's safer to separate HADOOP_CLASSPATH 
> from OZONE_CLASSPATH. If something is really needed on the classpath for Ozone 
> daemons, the dedicated environment variable should be used.
>  
> Most probably it can be fixed in
> hadoop-hdds/common/src/main/bin/hadoop-functions.sh
> And the hadoop-ozone/dev/src/main/compose files should also be checked (some 
> of them contain HADOOP_CLASSPATH).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-10-16 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952812#comment-16952812
 ] 

David Mollitor commented on HDFS-14854:
---

[~sodonnell] Thanks.  Looks good!

> Create improved decommission monitor implementation
> ---
>
> Key: HDFS-14854
> URL: https://issues.apache.org/jira/browse/HDFS-14854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, 
> HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch, 
> HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch, 
> HDFS-14854.008.patch, HDFS-14854.009.patch, HDFS-14854.010.patch
>
>
> In HDFS-13157, we discovered a series of problems with the current 
> decommission monitor implementation, such as:
>  * Blocks are replicated sequentially disk by disk and node by node, and 
> hence the load is not spread well across the cluster
>  * Adding a node for decommission can cause the namenode write lock to be 
> held for a long time.
>  * Decommissioning nodes floods the replication queue, and under-replicated 
> blocks from a future node or disk failure may wait for a long time before they 
> are replicated.
>  * Blocks pending replication are checked many times under a write lock 
> before they are sufficiently replicated, wasting resources.
> In this Jira I propose to create a new implementation of the decommission 
> monitor that resolves these issues. As it will be difficult to prove one 
> implementation is better than another, the new implementation can be enabled 
> or disabled giving the option of the existing implementation or the new one.
> I will attach a pdf with some more details on the design and then a version 1 
> patch shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1737) Add Volume check in KeyManager and File Operations

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1737?focusedWorklogId=329159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329159
 ]

ASF GitHub Bot logged work on HDDS-1737:


Author: ASF GitHub Bot
Created on: 16/Oct/19 13:00
Start Date: 16/Oct/19 13:00
Worklog Time Spent: 10m 
  Work Description: elek commented on pull request #2: HDDS-1737. Add 
Volume check in KeyManager and File Operations.
URL: https://github.com/apache/hadoop-ozone/pull/2
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329159)
Time Spent: 5.5h  (was: 5h 20m)

> Add Volume check in KeyManager and File Operations
> --
>
> Key: HDDS-1737
> URL: https://issues.apache.org/jira/browse/HDDS-1737
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> This is to address a TODO to add a volume existence check when performing
> Key/File operations; a sketch of such a check follows below.
>  
> // TODO: Not checking volume exist here, once we have full cache we can
> // add volume exist check also.
>  
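
A minimal sketch of the kind of check the quoted TODO asks for; the exact
integration point and the metadata-manager names are assumptions based on
common OM conventions, not the actual patch:

{code:java}
// Hypothetical volume existence check before a key/file operation.
String volumeKey = omMetadataManager.getVolumeKey(volumeName);
if (omMetadataManager.getVolumeTable().get(volumeKey) == null) {
  throw new OMException("Volume " + volumeName + " is not found",
      OMException.ResultCodes.VOLUME_NOT_FOUND);
}
{code}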



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1737) Add Volume check in KeyManager and File Operations

2019-10-16 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-1737:
--
Fix Version/s: 0.5.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Add Volume check in KeyManager and File Operations
> --
>
> Key: HDDS-1737
> URL: https://issues.apache.org/jira/browse/HDDS-1737
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> This is to address a TODO to add a volume existence check when performing
> Key/File operations.
>  
> // TODO: Not checking volume exist here, once we have full cache we can
> // add volume exist check also.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1737) Add Volume check in KeyManager and File Operations

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1737?focusedWorklogId=329156&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329156
 ]

ASF GitHub Bot logged work on HDDS-1737:


Author: ASF GitHub Bot
Created on: 16/Oct/19 12:58
Start Date: 16/Oct/19 12:58
Worklog Time Spent: 10m 
  Work Description: elek commented on issue #1559: HDDS-1737. Add Volume 
check in KeyManager and File Operations.
URL: https://github.com/apache/hadoop/pull/1559#issuecomment-542687280
 
 
   Migrated to https://github.com/apache/hadoop-ozone/pull/2
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329156)
Time Spent: 5h 10m  (was: 5h)

> Add Volume check in KeyManager and File Operations
> --
>
> Key: HDDS-1737
> URL: https://issues.apache.org/jira/browse/HDDS-1737
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: newbie, pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> This is to address a TODO to add a volume existence check when performing
> Key/File operations.
>  
> // TODO: Not checking volume exist here, once we have full cache we can
> // add volume exist check also.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1737) Add Volume check in KeyManager and File Operations

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1737?focusedWorklogId=329157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329157
 ]

ASF GitHub Bot logged work on HDDS-1737:


Author: ASF GitHub Bot
Created on: 16/Oct/19 12:58
Start Date: 16/Oct/19 12:58
Worklog Time Spent: 10m 
  Work Description: elek commented on pull request #1559: HDDS-1737. Add 
Volume check in KeyManager and File Operations.
URL: https://github.com/apache/hadoop/pull/1559
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329157)
Time Spent: 5h 20m  (was: 5h 10m)

> Add Volume check in KeyManager and File Operations
> --
>
> Key: HDDS-1737
> URL: https://issues.apache.org/jira/browse/HDDS-1737
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: newbie, pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> This is to address a TODO to add a volume existence check when performing
> Key/File operations.
>  
> // TODO: Not checking volume exist here, once we have full cache we can
> // add volume exist check also.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation

2019-10-16 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952775#comment-16952775
 ] 

Hadoop QA commented on HDFS-14854:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
51s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 12 new + 462 unchanged - 5 fixed = 474 total (was 467) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m 22s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14854 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12983148/HDFS-14854.010.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 53ec2adf14cb 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bbcf0b9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28096/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/P

[jira] [Work logged] (HDDS-2316) Support to skip recon and/or ozonefs during the build

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2316?focusedWorklogId=329139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329139
 ]

ASF GitHub Bot logged work on HDDS-2316:


Author: ASF GitHub Bot
Created on: 16/Oct/19 12:15
Start Date: 16/Oct/19 12:15
Worklog Time Spent: 10m 
  Work Description: elek commented on pull request #39: HDDS-2316. Support 
to skip recon and/or ozonefs during the build
URL: https://github.com/apache/hadoop-ozone/pull/39
 
 
   ## What changes were proposed in this pull request?
   
The two slowest parts of the Ozone build as of now are:
   
   * The (multiple) shading of ozonefs
   * The frontend build/obfuscation of ozone recon
   
   @anuengineer suggested introducing options to skip them, as they are not
required for every build.
   
   This patch introduces `-DskipRecon` and `-DskipShade` options to provide a 
faster way to create a partial build.
   
   ## What is the link to the Apache JIRA
   
   https://github.com/elek/hadoop-ozone/pull/new/HDDS-2316
   
   ## How this patch can be tested?
   
   ```
   mvn clean install -DskipShade -DskipRecon -DskipTests
   ```
   
   ```
   mvn clean install -DskipShade -DskipTests
   ```
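   
   For illustration, one plausible way a `-DskipShade` style switch can be
   wired in Maven is profile deactivation by property; the profile id and
   module name here are illustrative, not the actual pom change:
   
   ```xml
   <!-- Sketch: profile active by default, skipped when -DskipShade is set. -->
   <profile>
     <id>ozonefs-shade</id>
     <activation>
       <property>
         <!-- "!" means: active only while the property is NOT defined. -->
         <name>!skipShade</name>
       </property>
     </activation>
     <modules>
       <module>hadoop-ozone-filesystem-lib-current</module>
     </modules>
   </profile>
   ```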
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329139)
Remaining Estimate: 0h
Time Spent: 10m

> Support to skip recon and/or ozonefs during the build
> -
>
> Key: HDDS-2316
> URL: https://issues.apache.org/jira/browse/HDDS-2316
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> (I almost used this Jira summary: "Fast-lane to ozone build". It was very
> hard to resist...)
>  
>  The two slowest parts of the Ozone build as of now are:
>  # The (multiple) shading of ozonefs
>  # The frontend build/obfuscation of ozone recon
> [~aengineer] suggested introducing options to skip them, as they are not
> required for every build.
> This patch introduces '-DskipRecon' and '-DskipShade' options to provide a 
> faster way to create a *partial* build.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2316) Support to skip recon and/or ozonefs during the build

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-2316:
-
Labels: pull-request-available  (was: )

> Support to skip recon and/or ozonefs during the build
> -
>
> Key: HDDS-2316
> URL: https://issues.apache.org/jira/browse/HDDS-2316
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>
> (I almost used this Jira summary: "Fast-lane to ozone build". It was very
> hard to resist...)
>  
>  The two slowest parts of the Ozone build as of now are:
>  # The (multiple) shading of ozonefs
>  # The frontend build/obfuscation of ozone recon
> [~aengineer] suggested introducing options to skip them, as they are not
> required for every build.
> This patch introduces '-DskipRecon' and '-DskipShade' options to provide a 
> faster way to create a *partial* build.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2316) Support to skip recon and/or ozonefs during the build

2019-10-16 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-2316:
--
Description: 
(I almost used this Jira summary: "Fast-lane to ozone build". It was very
hard to resist...)

 The two slowest parts of the Ozone build as of now are:

 # The (multiple) shading of ozonefs
 # The frontend build/obfuscation of ozone recon

[~aengineer] suggested introducing options to skip them, as they are not
required for every build.

This patch introduces '-DskipRecon' and '-DskipShade' options to provide a
faster way to create a *partial* build.

  was:
(I almost used this Jira summary: "Fast-lane to ozone build". It was very
hard to resist...)

 

 


> Support to skip recon and/or ozonefs during the build
> -
>
> Key: HDDS-2316
> URL: https://issues.apache.org/jira/browse/HDDS-2316
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Anu Engineer
>Assignee: Marton Elek
>Priority: Major
>
> (I almost used this Jira summary: "Fast-lane to ozone build". It was very
> hard to resist...)
>  
>  The two slowest parts of the Ozone build as of now are:
>  # The (multiple) shading of ozonefs
>  # The frontend build/obfuscation of ozone recon
> [~aengineer] suggested introducing options to skip them, as they are not
> required for every build.
> This patch introduces '-DskipRecon' and '-DskipShade' options to provide a 
> faster way to create a *partial* build.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2316) Support to skip recon and/or ozonefs during the build

2019-10-16 Thread Marton Elek (Jira)
Marton Elek created HDDS-2316:
-

 Summary: Support to skip recon and/or ozonefs during the build
 Key: HDDS-2316
 URL: https://issues.apache.org/jira/browse/HDDS-2316
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
Reporter: Anu Engineer
Assignee: Marton Elek


(I almost used this Jira summary: "Fast-lane to ozone build". It was very
hard to resist...)

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2315) bucket creation fails because bucket does not exist

2019-10-16 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai resolved HDDS-2315.

Resolution: Invalid

Fixed by revert: https://github.com/apache/hadoop-ozone/commit/17081c2e

> bucket creation fails because bucket does not exist
> ---
>
> Key: HDDS-2315
> URL: https://issues.apache.org/jira/browse/HDDS-2315
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.5.0
>Reporter: Attila Doroszlai
>Priority: Blocker
>
> Secure acceptance tests fail because no bucket can be created, due to an ACL
> check:
> {noformat}
> om_1| 2019-10-16 10:42:04,422 [IPC Server handler 0 on 9862] INFO 
>   - created volume:vol-0-38760 for user:HTTP/s...@example.com
> om_1| 2019-10-16 10:42:04,464 [IPC Server handler 4 on 9862] INFO 
>   - created volume:vol-1-41642 for user:HTTP/s...@example.com
> om_1| 2019-10-16 10:42:04,481 [IPC Server handler 11 on 9862] INFO
>- created volume:vol-2-97489 for user:HTTP/s...@example.com
> om_1| 2019-10-16 10:42:04,496 [IPC Server handler 12 on 9862] INFO
>- created volume:vol-3-24784 for user:HTTP/s...@example.com
> om_1| 2019-10-16 10:42:04,512 [IPC Server handler 6 on 9862] INFO 
>   - created volume:vol-4-01299 for user:HTTP/s...@example.com
> om_1| 2019-10-16 10:42:04,550 [IPC Server handler 7 on 9862] ERROR
>   - Bucket creation failed for bucket:bucket-0-94230 in volume:vol-0-38760
> om_1| BUCKET_NOT_FOUND 
> org.apache.hadoop.ozone.om.exceptions.OMException: Bucket bucket-0-94230 is 
> not found
> om_1| at 
> org.apache.hadoop.ozone.om.BucketManagerImpl.checkAccess(BucketManagerImpl.java:568)
> om_1| at 
> org.apache.hadoop.ozone.security.acl.OzoneNativeAuthorizer.checkAccess(OzoneNativeAuthorizer.java:89)
> om_1| at 
> org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1625)
> om_1| at 
> org.apache.hadoop.ozone.om.request.OMClientRequest.checkAcls(OMClientRequest.java:135)
> om_1| at 
> org.apache.hadoop.ozone.om.request.bucket.OMBucketCreateRequest.validateAndUpdateCache(OMBucketCreateRequest.java:146)
> om_1| at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:219)
> om_1| at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
> om_1| at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> om_1| at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2313) Duplicate release of lock in OMKeyCommitRequest

2019-10-16 Thread Attila Doroszlai (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Doroszlai resolved HDDS-2313.

Resolution: Invalid

Fixed by revert: https://github.com/apache/hadoop-ozone/commit/17081c2e

> Duplicate release of lock in OMKeyCommitRequest
> ---
>
> Key: HDDS-2313
> URL: https://issues.apache.org/jira/browse/HDDS-2313
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.5.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> om_1| 2019-10-16 05:33:57,413 [IPC Server handler 19 on 9862] ERROR   
>- Trying to release the lock on /bypdd/mybucket4, which was never acquired.
> om_1| 2019-10-16 05:33:57,414 WARN ipc.Server: IPC Server handler 19 
> on 9862, call Call#4 Retry#8 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 172.29.0.4:37018
> om_1| java.lang.IllegalMonitorStateException: Releasing lock on 
> resource /bypdd/mybucket4 without acquiring lock
> om_1| at 
> org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220)
> om_1| at 
> org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java:168)
> om_1| at 
> org.apache.hadoop.ozone.lock.LockManager.writeUnlock(LockManager.java:148)
> om_1| at 
> org.apache.hadoop.ozone.om.lock.OzoneManagerLock.unlock(OzoneManagerLock.java:364)
> om_1| at 
> org.apache.hadoop.ozone.om.lock.OzoneManagerLock.releaseWriteLock(OzoneManagerLock.java:329)
> om_1| at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest.validateAndUpdateCache(OMKeyCommitRequest.java:177)
> {noformat}
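
A minimal sketch of the usual guard against releasing a lock that was never
acquired (or releasing it twice); the surrounding names follow the common OM
request pattern and are simplified, not the exact patch:

{code:java}
boolean acquiredLock = false;
try {
  // Remember whether this call actually took the lock.
  acquiredLock = omMetadataManager.getLock()
      .acquireLock(BUCKET_LOCK, volumeName, bucketName);
  // ... validate the request and update the cache exactly once ...
} finally {
  // Release only what was acquired, and only on this code path.
  if (acquiredLock) {
    omMetadataManager.getLock()
        .releaseLock(BUCKET_LOCK, volumeName, bucketName);
  }
}
{code}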



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2313) Duplicate release of lock in OMKeyCommitRequest

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2313?focusedWorklogId=329133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329133
 ]

ASF GitHub Bot logged work on HDDS-2313:


Author: ASF GitHub Bot
Created on: 16/Oct/19 12:01
Start Date: 16/Oct/19 12:01
Worklog Time Spent: 10m 
  Work Description: adoroszlai commented on pull request #35: HDDS-2313. 
Duplicate release of lock in OMKeyCommitRequest
URL: https://github.com/apache/hadoop-ozone/pull/35
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329133)
Time Spent: 20m  (was: 10m)

> Duplicate release of lock in OMKeyCommitRequest
> ---
>
> Key: HDDS-2313
> URL: https://issues.apache.org/jira/browse/HDDS-2313
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.5.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> om_1| 2019-10-16 05:33:57,413 [IPC Server handler 19 on 9862] ERROR   
>- Trying to release the lock on /bypdd/mybucket4, which was never acquired.
> om_1| 2019-10-16 05:33:57,414 WARN ipc.Server: IPC Server handler 19 
> on 9862, call Call#4 Retry#8 
> org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 
> 172.29.0.4:37018
> om_1| java.lang.IllegalMonitorStateException: Releasing lock on 
> resource /bypdd/mybucket4 without acquiring lock
> om_1| at 
> org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220)
> om_1| at 
> org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java:168)
> om_1| at 
> org.apache.hadoop.ozone.lock.LockManager.writeUnlock(LockManager.java:148)
> om_1| at 
> org.apache.hadoop.ozone.om.lock.OzoneManagerLock.unlock(OzoneManagerLock.java:364)
> om_1| at 
> org.apache.hadoop.ozone.om.lock.OzoneManagerLock.releaseWriteLock(OzoneManagerLock.java:329)
> om_1| at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest.validateAndUpdateCache(OMKeyCommitRequest.java:177)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1988) Fix listParts API

2019-10-16 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-1988:
--
Fix Version/s: 0.5.0

> Fix listParts API
> -
>
> Key: HDDS-1988
> URL: https://issues.apache.org/jira/browse/HDDS-1988
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listParts API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and
> return the response, and it is later picked up by the double buffer thread
> and flushed to disk. So when listing the parts of an MPU key, the call
> should use both the in-memory cache and the RocksDB MPU table.
>  
> No fix is required for this, as the information is retrieved from the MPU
> Key table via get() rather than through RocksDB table iteration, and get()
> checks the cache first and then the table (see the sketch below).
>  
> Used this Jira to add an integration test to verify the behavior.
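
For reference, a simplified sketch of the cache-then-table lookup described
above; the generic names (cache, rawTable) are assumptions, not the actual
Ozone table implementation:

{code:java}
public VALUE get(KEY key) throws IOException {
  // Check the in-memory table cache first.
  CacheValue<VALUE> cacheValue = cache.get(new CacheKey<>(key));
  if (cacheValue != null) {
    // A cached null value marks a key deleted but not yet flushed.
    return cacheValue.getCacheValue();
  }
  // Fall back to the persisted RocksDB table.
  return rawTable.get(key);
}
{code}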



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2267) Container metadata scanner interval mismatch

2019-10-16 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-2267:
--
Fix Version/s: 0.5.0

> Container metadata scanner interval mismatch
> 
>
> Key: HDDS-2267
> URL: https://issues.apache.org/jira/browse/HDDS-2267
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The container metadata scanner can be configured to run at specific time
> intervals, e.g. hourly ({{hdds.containerscrub.metadata.scan.interval}}).
> However, the actual run interval does not match the configuration. After a
> datanode restart it runs in quick succession; later it runs at apparently
> random intervals.
> {noformat:title=sample log}
> datanode_1  | 2019-10-08 14:05:30 INFO  ContainerMetadataScanner:88 - 
> Completed an iteration of container metadata scrubber in 0 minutes. Number of 
>  iterations (since the data-node restart) : 1, Number of containers scanned 
> in this iteration : 0, Number of unhealthy containers found in this iteration 
> : 0
> datanode_1  | 2019-10-08 14:09:33 INFO  ContainerMetadataScanner:88 - 
> Completed an iteration of container metadata scrubber in 0 minutes. Number of 
>  iterations (since the data-node restart) : 1, Number of containers scanned 
> in this iteration : 6, Number of unhealthy containers found in this iteration 
> : 0
> ...
> datanode_1  | 2019-10-08 14:09:33 INFO  ContainerMetadataScanner:88 - 
> Completed an iteration of container metadata scrubber in 0 minutes. Number of 
>  iterations (since the data-node restart) : 28, Number of containers scanned 
> in this iteration : 6, Number of unhealthy containers found in this iteration 
> : 0
> datanode_1  | 2019-10-08 14:21:01 INFO  ContainerMetadataScanner:88 - 
> Completed an iteration of container metadata scrubber in 0 minutes. Number of 
>  iterations (since the data-node restart) : 29, Number of containers scanned 
> in this iteration : 6, Number of unhealthy containers found in this iteration 
> : 0
> datanode_1  | 2019-10-08 14:21:01 INFO  ContainerMetadataScanner:88 - 
> Completed an iteration of container metadata scrubber in 0 minutes. Number of 
>  iterations (since the data-node restart) : 30, Number of containers scanned 
> in this iteration : 6, Number of unhealthy containers found in this iteration 
> : 0
> datanode_1  | 2019-10-08 15:30:38 INFO  ContainerMetadataScanner:88 - 
> Completed an iteration of container metadata scrubber in 0 minutes. Number of 
>  iterations (since the data-node restart) : 31, Number of containers scanned 
> in this iteration : 6, Number of unhealthy containers found in this iteration 
> : 0
> datanode_1  | 2019-10-08 16:45:01 INFO  ContainerMetadataScanner:88 - 
> Completed an iteration of container metadata scrubber in 0 minutes. Number of 
>  iterations (since the data-node restart) : 32, Number of containers scanned 
> in this iteration : 6, Number of unhealthy containers found in this iteration 
> : 0
> {noformat}
> The problem is that the elapsed time is measured in nanoseconds, while the
> configuration is in milliseconds; a sketch of the mismatch follows below.
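
A minimal sketch of the unit mismatch and the unit-consistent fix; the
variable names and the wait logic are illustrative, not the actual scanner
code (requires java.util.concurrent.TimeUnit):

{code:java}
long intervalMillis = getScanIntervalMillis(); // configured in milliseconds
long startNanos = System.nanoTime();
runIteration();
// Bug: comparing (System.nanoTime() - startNanos) directly against a
// millisecond interval makes the elapsed time look ~10^6 times larger.
// Fix: convert the elapsed time to the configured unit before comparing.
long elapsedMillis =
    TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
long remainingMillis = intervalMillis - elapsedMillis;
if (remainingMillis > 0) {
  Thread.sleep(remainingMillis); // keep the configured cadence
}
{code}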



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2312) Fix typo in ozone command

2019-10-16 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved HDDS-2312.
---
Fix Version/s: 0.5.0
   Resolution: Fixed

> Fix typo in ozone command
> -
>
> Key: HDDS-2312
> URL: https://issues.apache.org/jira/browse/HDDS-2312
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 0.5.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat:title=ozone}
> Usage: ozone [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
> ...
> insight   tool to get runtime opeartion information
> ...
> {noformat}
> Should be "operation".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-2312) Fix typo in ozone command

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2312?focusedWorklogId=329114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329114
 ]

ASF GitHub Bot logged work on HDDS-2312:


Author: ASF GitHub Bot
Created on: 16/Oct/19 11:17
Start Date: 16/Oct/19 11:17
Worklog Time Spent: 10m 
  Work Description: elek commented on pull request #34: HDDS-2312. Fix typo 
in ozone command
URL: https://github.com/apache/hadoop-ozone/pull/34
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329114)
Time Spent: 20m  (was: 10m)

> Fix typo in ozone command
> -
>
> Key: HDDS-2312
> URL: https://issues.apache.org/jira/browse/HDDS-2312
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 0.5.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat:title=ozone}
> Usage: ozone [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
> ...
> insight   tool to get runtime opeartion information
> ...
> {noformat}
> Should be "operation".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1985) Fix listVolumes API

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1985?focusedWorklogId=329112&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329112
 ]

ASF GitHub Bot logged work on HDDS-1985:


Author: ASF GitHub Bot
Created on: 16/Oct/19 11:11
Start Date: 16/Oct/19 11:11
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #33: 
HDDS-1985. Fix listVolumes API
URL: https://github.com/apache/hadoop-ozone/pull/33
 
 
   https://issues.apache.org/jira/browse/HDDS-1985
   
   No fix is required for this, as the information is retrieved via get()
rather than through RocksDB Table iteration; get() checks the cache first,
and then the table.
   

   
   Used this Jira to add an integration test to verify the behavior.
   
   (This has cumulative changes required for HDDS-1988 and HDDS-1985)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329112)
Time Spent: 20m  (was: 10m)

> Fix listVolumes API
> ---
>
> Key: HDDS-1985
> URL: https://issues.apache.org/jira/browse/HDDS-1985
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listVolumes API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and
> return the response, and it is later picked up by the double buffer thread
> and flushed to disk. So when listing volumes for a user, the call should use
> both the in-memory cache and the RocksDB volume table.
>  
> No fix is required for this, as the information is retrieved via get()
> rather than through RocksDB table iteration, and get() checks the cache
> first and then the table.
>  
> Used this Jira to add an integration test to verify the behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1985) Fix listVolumes API

2019-10-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1985?focusedWorklogId=329111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329111
 ]

ASF GitHub Bot logged work on HDDS-1985:


Author: ASF GitHub Bot
Created on: 16/Oct/19 11:10
Start Date: 16/Oct/19 11:10
Worklog Time Spent: 10m 
  Work Description: elek commented on pull request #33: HDDS-1985. Fix 
listVolumes API
URL: https://github.com/apache/hadoop-ozone/pull/33
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 329111)
Remaining Estimate: 0h
Time Spent: 10m

> Fix listVolumes API
> ---
>
> Key: HDDS-1985
> URL: https://issues.apache.org/jira/browse/HDDS-1985
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listVolumes API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the cache and
> return the response, and it is later picked up by the double buffer thread
> and flushed to disk. So when listing volumes for a user, the call should use
> both the in-memory cache and the RocksDB volume table.
>  
> No fix is required for this, as the information is retrieved via get()
> rather than through RocksDB table iteration, and get() checks the cache
> first and then the table.
>  
> Used this Jira to add an integration test to verify the behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


