[jira] [Commented] (HDFS-14847) Erasure Coding: Blocks are over-replicated while EC decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953450#comment-16953450 ] Ayush Saxena commented on HDFS-14847: - If that is just for readability, I must say we should not do it; even if the cost is trivial, why take any performance impact unnecessarily? FYI, HDFS-14768 touches the same part. > Erasure Coding: Blocks are over-replicated while EC decommissioning > --- > > Key: HDFS-14847 > URL: https://issues.apache.org/jira/browse/HDFS-14847 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Critical > Attachments: HDFS-14847.001.patch, HDFS-14847.002.patch, > HDFS-14847.003.patch > > > Found that some blocks are over-replicated during EC decommissioning. Messages > in the log are as follows: > {quote} > INFO BlockStateChange: Block: blk_-9223372035714984112_363779142, Expected > Replicas: 9, live replicas: 8, corrupt replicas: 0, decommissioned replicas: > 0, decommissioning replicas: 3, maintenance replicas: 0, live entering > maintenance replicas: 0, excess replicas: 0, Is Open File: false, Datanodes > having this block: 10.254.41.34:50010 10.254.54.53:50010 10.254.28.53:50010 > 10.254.56.55:50010 10.254.32.21:50010 10.254.33.19:50010 10.254.63.17:50010 > 10.254.31.19:50010 10.254.35.29:50010 10.254.51.57:50010 10.254.40.58:50010 > 10.254.69.31:50010 10.254.47.18:50010 10.254.51.18:50010 10.254.43.57:50010 > 10.254.50.47:50010 10.254.42.37:50010 10.254.57.29:50010 10.254.67.40:50010 > 10.254.44.16:50010 10.254.59.38:50010 10.254.53.56:50010 10.254.45.11:50010 > 10.254.39.22:50010 10.254.30.16:50010 10.254.35.53:50010 10.254.22.30:50010 > 10.254.26.34:50010 10.254.17.58:50010 10.254.65.53:50010 10.254.60.39:50010 > 10.254.61.20:50010 10.254.64.23:50010 10.254.21.13:50010 10.254.37.35:50010 > 10.254.68.30:50010 10.254.62.37:50010 10.254.25.58:50010 10.254.52.54:50010 > 10.254.58.31:50010 
10.254.49.11:50010 10.254.55.52:50010 10.254.19.19:50010 > 10.254.36.40:50010 10.254.18.30:50010 10.254.20.39:50010 10.254.66.52:50010 > 10.254.56.32:50010 10.254.24.55:50010 10.254.34.11:50010 10.254.29.58:50010 > 10.254.27.40:50010 10.254.46.33:50010 10.254.23.19:50010 10.254.74.12:50010 > 10.254.74.13:50010 10.254.41.35:50010 10.254.67.58:50010 10.254.54.11:50010 > 10.254.68.14:50010 10.254.27.14:50010 10.254.51.29:50010 10.254.45.21:50010 > 10.254.50.56:50010 10.254.47.31:50010 10.254.40.14:50010 10.254.65.21:50010 > 10.254.62.22:50010 10.254.57.16:50010 10.254.36.52:50010 10.254.30.13:50010 > 10.254.35.12:50010 10.254.69.34:50010 10.254.34.58:50010 10.254.17.50:50010 > 10.254.63.12:50010 10.254.28.21:50010 10.254.58.30:50010 10.254.24.57:50010 > 10.254.33.50:50010 10.254.44.52:50010 10.254.32.48:50010 10.254.43.39:50010 > 10.254.20.37:50010 10.254.56.59:50010 10.254.22.33:50010 10.254.60.34:50010 > 10.254.49.19:50010 10.254.52.21:50010 10.254.23.59:50010 10.254.21.16:50010 > 10.254.42.55:50010 10.254.29.33:50010 10.254.53.17:50010 10.254.19.14:50010 > 10.254.64.51:50010 10.254.46.20:50010 10.254.66.22:50010 10.254.18.38:50010 > 10.254.39.17:50010 10.254.37.57:50010 10.254.31.54:50010 10.254.55.33:50010 > 10.254.25.17:50010 10.254.61.33:50010 10.254.26.40:50010 10.254.59.23:50010 > 10.254.59.35:50010 10.254.66.48:50010 10.254.41.15:50010 10.254.54.31:50010 > 10.254.61.50:50010 10.254.62.31:50010 10.254.17.56:50010 10.254.29.18:50010 > 10.254.45.16:50010 10.254.63.48:50010 10.254.22.34:50010 10.254.37.51:50010 > 10.254.65.49:50010 10.254.58.21:50010 10.254.42.12:50010 10.254.55.17:50010 > 10.254.27.13:50010 10.254.57.17:50010 10.254.67.18:50010 10.254.31.31:50010 > 10.254.28.12:50010 10.254.36.12:50010 10.254.21.59:50010 10.254.30.30:50010 > 10.254.26.50:50010 10.254.40.40:50010 10.254.32.17:50010 10.254.47.55:50010 > 10.254.60.55:50010 10.254.49.33:50010 10.254.68.47:50010 10.254.39.21:50010 > 10.254.56.14:50010 10.254.33.54:50010 10.254.69.57:50010 
10.254.43.50:50010 > 10.254.50.13:50010 10.254.25.49:50010 10.254.18.20:50010 10.254.52.23:50010 > 10.254.19.11:50010 10.254.20.21:50010 10.254.74.16:50010 10.254.64.55:50010 > 10.254.24.48:50010 10.254.46.29:50010 10.254.51.12:50010 10.254.23.56:50010 > 10.254.44.59:50010 10.254.53.58:50010 10.254.34.38:50010 10.254.35.37:50010 > 10.254.35.16:50010 10.254.36.23:50010 10.254.41.47:50010 10.254.54.12:50010 > 10.254.20.59:50010 , Current Datanode: 10.254.56.55:50010, Is current > datanode decommissioning: true, Is current datanode entering maintenance: > false > {quote} > Decommissions hang for a long time. > Digging into the code, we find there is a problem in ErasureCodingWork.java. > For example, there are 2 nodes (dn0, dn1) in decommission and an EC block group on the 2 nodes. After creating an ErasureCodingWork to reconstruct, it will create 2 replication works. If dn0 replicates in success and dn1 replicates in failure, T
[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953447#comment-16953447 ] guojh commented on HDFS-14768: -- [~ayushtkn] [~surendrasingh] Could you help review the code? I can drop the code that tries to fix the over-replicated problem and let it be fixed in HDFS-14847. > In some cases, erasure blocks are corruption when they are reconstruct. > > > Key: HDFS-14768 > URL: https://issues.apache.org/jira/browse/HDFS-14768 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding, hdfs, namenode >Affects Versions: 3.0.2 >Reporter: guojh >Assignee: guojh >Priority: Major > Labels: patch > Fix For: 3.3.0 > > Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, > HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, > HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, > guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, > zhaoyiming_UT_beofre_deomission.txt > > > Policy is RS-6-3-1024K, version is Hadoop 3.0.2. > We suppose a file's block indices are [0,1,2,3,4,5,6,7,8], and we decommission > indices [3,4], then increase the index 6 datanode's > pendingReplicationWithoutTargets to make it larger than > replicationStreamsHardLimit (we set 14). Then, after the method > chooseSourceDatanodes of BlockManager, the liveBlockIndices is > [0,1,2,3,4,5,7,8], and the block counter is Live: 7, Decommission: 2. > In method scheduleReconstruction of BlockManager, the additionalReplRequired > is 9 - 7 = 2. After the Namenode chooses two target datanodes, it will assign an > erasure coding task to the target datanodes. > When a datanode gets the task, it will build targetIndices from liveBlockIndices > and the target length. The code is below. 
> {code:java}
> targetIndices = new short[targets.length];
> private void initTargetIndices() {
>   BitSet bitset = reconstructor.getLiveBitSet();
>   int m = 0;
>   hasValidTargets = false;
>   for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
>     if (!bitset.get(i)) {
>       if (reconstructor.getBlockLen(i) > 0) {
>         if (m < targets.length) {
>           targetIndices[m++] = (short) i;
>           hasValidTargets = true;
>         }
>       }
>     }
>   }
> }
> {code}
> targetIndices[0] = 6, and targetIndices[1] is always 0 from its initial value.
> The StripedReader always creates readers from the first 6 index blocks, i.e. [0,1,2,3,4,5].
> Using the indices [0,1,2,3,4,5] to build target indices [6,0] will trigger the ISA-L
> bug: block index 6's data is corrupted (all data is zero).
> I wrote a unit test that can stably reproduce it.
> {code:java}
> private int replicationStreamsHardLimit =
>     DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT;
> numDNs = dataBlocks + parityBlocks + 10;
> @Test(timeout = 24)
> public void testFileDecommission() throws Exception {
>   LOG.info("Starting test testFileDecommission");
>   final Path ecFile = new Path(ecDir, "testFileDecommission");
>   int writeBytes = cellSize * dataBlocks;
>   writeStripedFile(dfs, ecFile, writeBytes);
>   Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
>   FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
>   final INodeFile fileNode = cluster.getNamesystem().getFSDirectory()
>       .getINode4Write(ecFile.toString()).asFile();
>   LocatedBlocks locatedBlocks =
>       StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
>   LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
>       .get(0);
>   DatanodeInfo[] dnLocs = lb.getLocations();
>   LocatedStripedBlock lastBlock =
>       (LocatedStripedBlock) locatedBlocks.getLastLocatedBlock();
>   DatanodeInfo[] storageInfos = lastBlock.getLocations();
>   DatanodeDescriptor datanodeDescriptor = cluster.getNameNode().getNamesystem()
.getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid()); > BlockInfo firstBlock = fileNode.getBlocks()[0]; > DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock); > // the first heartbeat will consume 3 replica tasks > for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) { > BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor, new > Block(i), > new DatanodeStorageInfo[]{dStorageInfos[0]}); > } > assertEquals(dataBlocks + parityBlocks, dnLocs.length); > int[] decommNodeIndex = {3, 4}; > final List decommisionNodes = new ArrayList(); > // add the node which will be decommissioning > decommisionNodes.add(dnLocs[decommNodeIndex[0]]); > decommisionNodes.add(dnLocs[decommNodeIndex[1]]); > decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED); > ass
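The indexing flaw above can be demonstrated outside Hadoop. The following toy sketch (hypothetical names, not the actual StripedWriter/StripedReconstructor code) mirrors the loop with its arguments restored: with live indices [0,1,2,3,4,5,7,8] and two requested targets, only index 6 is found missing, so the second slot of targetIndices keeps its initial value 0 and collides with a live block, exactly as described.

```java
import java.util.BitSet;

/**
 * Toy reproduction of the targetIndices flaw discussed above. With RS-6-3,
 * live indices {0,1,2,3,4,5,7,8} and targets.length == 2, only index 6 is
 * missing, so targetIndices[1] stays at its default 0 -- a live block index.
 */
public class TargetIndicesDemo {
  static final int DATA_BLK_NUM = 6;
  static final int PARITY_BLK_NUM = 3;

  static short[] buildTargetIndices(BitSet liveBitSet, int targetsLength) {
    short[] targetIndices = new short[targetsLength];
    int m = 0;
    for (int i = 0; i < DATA_BLK_NUM + PARITY_BLK_NUM; i++) {
      // mirrors: if (!bitset.get(i)) { if (getBlockLen(i) > 0) { ... } }
      if (!liveBitSet.get(i) && m < targetsLength) {
        targetIndices[m++] = (short) i;
      }
    }
    // if m < targetsLength, the trailing slots silently stay 0
    return targetIndices;
  }

  public static void main(String[] args) {
    BitSet live = new BitSet();
    for (int i : new int[] {0, 1, 2, 3, 4, 5, 7, 8}) {
      live.set(i);
    }
    // the NameNode asked for 2 targets, but only one index is really missing
    short[] t = buildTargetIndices(live, 2);
    System.out.println(t[0] + " " + t[1]); // 6 0 -- the 0 is bogus
  }
}
```

The defensive fix implied by the report is to track how many target indices were actually filled (m) rather than trusting targets.length.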
[jira] [Commented] (HDFS-14847) Erasure Coding: Blocks are over-replicated while EC decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953445#comment-16953445 ] Fei Hui commented on HDFS-14847: [~ayushtkn] Thanks for your review! {quote} Can this be avoided, seems required only for size of bitset, can we not use srcNodes? {quote} Maybe realTotalBlockNum here is more readable. {quote} For the test, for the wait methods, may be we can use GenericTestUtils.waitFor(...) ? May be keep the helper methods together after the test only for better readability. {quote} Used GenericTestUtils.waitFor, removed the useless code, and moved the helper methods together after the test. > Erasure Coding: Blocks are over-replicated while EC decommissioning > --- > > Key: HDFS-14847 > URL: https://issues.apache.org/jira/browse/HDFS-14847 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Critical > Attachments: HDFS-14847.001.patch, HDFS-14847.002.patch, > HDFS-14847.003.patch > > > Found that some blocks are over-replicated during EC decommissioning.
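GenericTestUtils.waitFor (in org.apache.hadoop.test) is the polling helper the reviewer suggests instead of hand-rolled sleep loops. As a minimal self-contained stand-in (this is an illustrative sketch of the pattern, not the Hadoop implementation), the helper polls a condition at a fixed interval and fails with a timeout if it never becomes true:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/**
 * Minimal stand-in for the polling pattern behind GenericTestUtils.waitFor;
 * the real helper lives in org.apache.hadoop.test. Illustration only.
 */
public class WaitForSketch {
  static void waitFor(Supplier<Boolean> check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + waitForMillis;
    while (!check.get()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException(
            "condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    // e.g. in the decommission test: poll until the node reaches the
    // expected admin state instead of sleeping a fixed amount of time
    int[] polls = {0};
    waitFor(() -> ++polls[0] >= 3, 10, 1000);
    System.out.println("condition met after " + polls[0] + " polls");
  }
}
```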
[jira] [Updated] (HDFS-14847) Erasure Coding: Blocks are over-replicated while EC decommissioning
[ https://issues.apache.org/jira/browse/HDFS-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-14847: --- Attachment: HDFS-14847.003.patch > Erasure Coding: Blocks are over-replicated while EC decommissioning > --- > > Key: HDFS-14847 > URL: https://issues.apache.org/jira/browse/HDFS-14847 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Critical > Attachments: HDFS-14847.001.patch, HDFS-14847.002.patch, > HDFS-14847.003.patch > > > Found that some blocks are over-replicated during EC decommissioning. > Decommissions hang for a long time. > Digging into the code, we find there is a problem in ErasureCodingWork.java. > For example, there are 2 nodes (dn0, dn1) in decommission and an EC block group > on the 2 nodes. After creating an ErasureCodingWork to reconstruct, it will create > 2 replication works. If dn0 replicates in success and dn1 replicates in failure, T
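The retry race described in the report can be modelled with simple bookkeeping (a hypothetical toy model, not the ErasureCodingWork code): if, after a partial failure, the retry schedules replication work for every decommissioning index instead of only the one that failed, the index that already succeeded gains an excess replica.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of the over-replication race described above: replicating
 * ALL decommissioning indices again after a partial failure leaves the
 * already-copied index with an excess replica. Hypothetical names.
 */
public class EcDecommissionRace {
  // live replica count per block index; indices 0 and 1 sit on
  // decommissioning nodes dn0 and dn1, so each needs one extra copy
  static Map<Integer, Integer> live = new HashMap<>();

  static void replicate(int index, boolean succeeds) {
    if (succeeds) {
      live.merge(index, 1, Integer::sum);
    }
  }

  public static void main(String[] args) {
    live.put(0, 1);
    live.put(1, 1);

    // round 1: one replication work per decommissioning node
    replicate(0, true);  // dn0's copy succeeds -> index 0 has 2 replicas
    replicate(1, false); // dn1's copy fails    -> index 1 still has 1

    // round 2: the block is still insufficiently replicated, and the
    // retry schedules work for BOTH indices instead of only the failed one
    replicate(0, true);
    replicate(1, true);

    System.out.println(live.get(0)); // 3 -> one excess replica
    System.out.println(live.get(1)); // 2 -> as intended
  }
}
```

The excess replica for index 0 then has to be detected and invalidated, which matches the "excess replicas" churn visible in the quoted BlockStateChange log.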
[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953443#comment-16953443 ] guojh commented on HDFS-14768: -- [~ayushtkn] Thanks for your reply. This patch tries to fix two problems: one is replicas being over-replicated, the other is block data being incorrect. The code quoted by [~surendrasingh] tries to fix the over-replicated issue. > In some cases, erasure blocks are corruption when they are reconstruct. > > > Key: HDFS-14768 > URL: https://issues.apache.org/jira/browse/HDFS-14768 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding, hdfs, namenode >Affects Versions: 3.0.2 >Reporter: guojh >Assignee: guojh >Priority: Major > Labels: patch > Fix For: 3.3.0 > > Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, > HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, > HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, > guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, > zhaoyiming_UT_beofre_deomission.txt
[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase
[ https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2208: - Labels: pull-request-available (was: ) > Propagate System Exceptions from OM transaction apply phase > --- > > Key: HDDS-2208 > URL: https://issues.apache.org/jira/browse/HDDS-2208 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > > The change for HDDS-2206 tracks system exceptions during preExecute phase of > OM request handling. > The current jira is to implement exception propagation once the OM request is > submitted to Ratis - when the handler is running validateAndUpdateCache for > the request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase
[ https://issues.apache.org/jira/browse/HDDS-2208?focusedWorklogId=329629&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329629 ] ASF GitHub Bot logged work on HDDS-2208: Author: ASF GitHub Bot Created on: 17/Oct/19 06:25 Start Date: 17/Oct/19 06:25 Worklog Time Spent: 10m Work Description: supratimdeka commented on pull request #47: HDDS-2208. Propagate System Exceptions from OM transaction apply phase. Contributed by Supratim Deka URL: https://github.com/apache/hadoop-ozone/pull/47 https://issues.apache.org/jira/browse/HDDS-2208 This is a follow-up to the patch for HDDS-2206 https://github.com/apache/hadoop-ozone/pull/12 The change propagates complete stacktraces for system exceptions encountered during the Ratis phase of the OM request handling. However, this patch does not consider the configuration parameter introduced earlier in the patch for HDDS-2206. Controlling the behaviour using a configuration parameter requires a much greater footprint in the code. Because at this point, there is no clear requirement for such a config parameter - going ahead without the config param. Will update the patch for HDDS-2206 as well - will remove the configuration that was introduced. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329629) Remaining Estimate: 0h Time Spent: 10m > Propagate System Exceptions from OM transaction apply phase > --- > > Key: HDDS-2208 > URL: https://issues.apache.org/jira/browse/HDDS-2208 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The change for HDDS-2206 tracks system exceptions during preExecute phase of > OM request handling. > The current jira is to implement exception propagation once the OM request is > submitted to Ratis - when the handler is running validateAndUpdateCache for > the request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase
[ https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2208: Description: The change for HDDS-2206 tracks system exceptions during preExecute phase of OM request handling. The current jira is to implement exception propagation once the OM request is submitted to Ratis - when the handler is running validateAndUpdateCache for the request. was: applyTransaction handling in the OzoneManagerStateMachine does not propagate exceptions/failures to the initiator. The future which is returned from applyTransaction simply tracks completion of the async executor represented by the "executorService" in OzoneManagerStateMachine.java > Propagate System Exceptions from OM transaction apply phase > --- > > Key: HDDS-2208 > URL: https://issues.apache.org/jira/browse/HDDS-2208 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > > The change for HDDS-2206 tracks system exceptions during preExecute phase of > OM request handling. > The current jira is to implement exception propagation once the OM request is > submitted to Ratis - when the handler is running validateAndUpdateCache for > the request.
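The difference between the old and the proposed behaviour can be sketched with java.util.concurrent.CompletableFuture (a hypothetical stand-in, not the actual OzoneManagerStateMachine code): a future that only tracks that the async executor ran hides failures from the initiator, while a future completed exceptionally carries the system exception, and its stack trace, back to the caller.

```java
import java.util.concurrent.CompletableFuture;

/**
 * Hypothetical sketch of the exception-propagation fix described above;
 * not the actual OzoneManagerStateMachine code.
 */
public class ApplyTransactionSketch {
  // Before: the future only tracks that the executor ran; an exception
  // inside validateAndUpdateCache is caught and never reaches the caller.
  static CompletableFuture<Void> applySwallowing(Runnable validateAndUpdateCache) {
    return CompletableFuture.runAsync(() -> {
      try {
        validateAndUpdateCache.run();
      } catch (RuntimeException swallowed) {
        // lost: the initiator sees a normally completed future
      }
    });
  }

  // After: a failure completes the returned future exceptionally, so the
  // initiator observes the system exception with its stack trace.
  static CompletableFuture<Void> applyPropagating(Runnable validateAndUpdateCache) {
    return CompletableFuture.runAsync(validateAndUpdateCache);
  }

  public static void main(String[] args) {
    Runnable failing = () -> {
      throw new IllegalStateException("validateAndUpdateCache failed");
    };
    applySwallowing(failing).join(); // completes normally: failure hidden
    boolean seen = applyPropagating(failing)
        .handle((v, t) -> t != null).join();
    System.out.println("initiator sees failure: " + seen); // true
  }
}
```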
[jira] [Updated] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase
[ https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supratim Deka updated HDDS-2208: Summary: Propagate System Exceptions from OM transaction apply phase (was: OzoneManagerStateMachine does not track failures in applyTransaction) > Propagate System Exceptions from OM transaction apply phase > --- > > Key: HDDS-2208 > URL: https://issues.apache.org/jira/browse/HDDS-2208 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > > applyTransaction handling in the OzoneManagerStateMachine does not propagate > exceptions/failures to the initiator. > The future which is returned from applyTransaction simply tracks completion > of the async executor represented by the "executorService" in > OzoneManagerStateMachine.java
[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope
[ https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953415#comment-16953415 ] Brahma Reddy Battula commented on HDFS-14909: - +1, nice finding. > DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage > count for excluded node which is already part of excluded scope > - > > Key: HDFS-14909 > URL: https://issues.apache.org/jira/browse/HDFS-14909 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch, > HDFS-14909.003.patch
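The bug named in the title is a double-count: when the candidate pool is reduced by an excluded scope and then again by individually excluded nodes, a node that is both excluded and inside the excluded scope must be subtracted only once. A toy illustration of that accounting rule (hypothetical names, not the DFSNetworkTopology code):

```java
import java.util.List;
import java.util.Set;

/**
 * Toy illustration of the double-count bug named in HDFS-14909: an
 * excluded node already inside the excluded scope must not reduce the
 * candidate count a second time. Not the actual DFSNetworkTopology code.
 */
public class ExcludedCountSketch {
  static int availableNodes(List<String> allNodes, String excludedScope,
      Set<String> excludedNodes) {
    int count = 0;
    for (String node : allNodes) {
      boolean inScope = node.startsWith(excludedScope);
      // count a node only if it is outside the excluded scope AND not
      // individually excluded; evaluating both conditions per node avoids
      // subtracting an in-scope excluded node twice
      if (!inScope && !excludedNodes.contains(node)) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    List<String> nodes =
        List.of("/rack1/dn1", "/rack1/dn2", "/rack2/dn3", "/rack2/dn4");
    // dn1 is excluded explicitly AND sits inside the excluded scope /rack1:
    // only the two /rack2 nodes remain, not one
    System.out.println(availableNodes(nodes, "/rack1", Set.of("/rack1/dn1")));
  }
}
```

Subtracting scope size and excluded-node count independently would yield 4 - 2 - 1 = 1 here, one fewer candidate than actually exists, which is the kind of off-by-one the patch addresses.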
[jira] [Work logged] (HDDS-2286) Add a log info in ozone client and scm to print the exclusion list during allocate block
[ https://issues.apache.org/jira/browse/HDDS-2286?focusedWorklogId=329591&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329591 ] ASF GitHub Bot logged work on HDDS-2286: Author: ASF GitHub Bot Created on: 17/Oct/19 05:01 Start Date: 17/Oct/19 05:01 Worklog Time Spent: 10m Work Description: swagle commented on pull request #46: HDDS-2286. Add a log info in ozone client and scm to print the exclus… URL: https://github.com/apache/hadoop-ozone/pull/46 ## What changes were proposed in this pull request? Added additional logging to print exclude lists on the client and SCM. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2286 ## How was this patch tested? Waiting on unit tests since only log statements were added. Issue Time Tracking --- Worklog Id: (was: 329591) Remaining Estimate: 0h Time Spent: 10m > Add a log info in ozone client and scm to print the exclusion list during > allocate block > > > Key: HDDS-2286 > URL: https://issues.apache.org/jira/browse/HDDS-2286 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h
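The patch only adds log statements, so the interesting part is what gets printed. A hedged sketch of the kind of log line involved (hypothetical helper and field names; the real code would emit this through an SLF4J logger in the Ozone client and SCM):

```java
import java.util.List;

/**
 * Hypothetical sketch of formatting an exclude list for logging during
 * block allocation; names are illustrative, not the Ozone code.
 */
public class ExcludeListLogSketch {
  static String formatExcludeList(List<String> excludedDatanodes,
      List<String> excludedContainers) {
    // a single structured line makes it easy to grep why a datanode or
    // container was skipped during allocation
    return "Allocating block; excludeList: datanodes=" + excludedDatanodes
        + ", containers=" + excludedContainers;
  }

  public static void main(String[] args) {
    // in the real client/SCM this string would go through LOG.info(...)
    System.out.println(formatExcludeList(List.of("dn1", "dn2"), List.of("c7")));
  }
}
```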
[jira] [Updated] (HDDS-2286) Add a log info in ozone client and scm to print the exclusion list during allocate block
[ https://issues.apache.org/jira/browse/HDDS-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2286: - Labels: pull-request-available (was: ) > Add a log info in ozone client and scm to print the exclusion list during > allocate block > > > Key: HDDS-2286 > URL: https://issues.apache.org/jira/browse/HDDS-2286 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14880) Balancer sequence of statistics & exit message is not correct
[ https://issues.apache.org/jira/browse/HDFS-14880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953389#comment-16953389 ] Ayush Saxena commented on HDFS-14880: - Thanx [~prasad-acit] for the report, The fix looks good and fair enough to have. The only concern that I see could be incompatibility due to change in CLI. [~vinayakumarb] can you help, how can this be taken ahead.? > Balancer sequence of statistics & exit message is not correct > - > > Key: HDFS-14880 > URL: https://issues.apache.org/jira/browse/HDFS-14880 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 3.1.1, 3.2.1 > Environment: Run the balancer tool in cluster. >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > Attachments: HDFS-14880.0001.patch > > > Actual: > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved > The cluster is balanced. Exiting... > Sep 27, 2019 5:13:15 PM 0 0 B 0 B > 0 B > Sep 27, 2019 5:13:15 PM Balancing took 1.726 seconds > Done! > Expected: Exit message should be after loggin all the balancer movement > statistics data. > Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved > Sep 27, 2019 5:13:15 PM 0 0 B 0 B > 0 B > The cluster is balanced. Exiting... > Sep 27, 2019 5:13:15 PM Balancing took 1.726 seconds > Done! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2275) In BatchOperation.SingleOperation, do not clone byte[]
[ https://issues.apache.org/jira/browse/HDDS-2275?focusedWorklogId=329575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329575 ] ASF GitHub Bot logged work on HDDS-2275: Author: ASF GitHub Bot Created on: 17/Oct/19 04:15 Start Date: 17/Oct/19 04:15 Worklog Time Spent: 10m Work Description: szetszwo commented on pull request #45: HDDS-2275. In BatchOperation.SingleOperation, do not clone byte[]. URL: https://github.com/apache/hadoop-ozone/pull/45 See https://issues.apache.org/jira/browse/HDDS-2275 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329575) Remaining Estimate: 0h Time Spent: 10m > In BatchOperation.SingleOperation, do not clone byte[] > -- > > Key: HDDS-2275 > URL: https://issues.apache.org/jira/browse/HDDS-2275 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Attachments: o2275_20191015.patch > > Time Spent: 10m > Remaining Estimate: 0h > > byte[] is cloned once in the constructor and then it is cloned again in the > getter methods. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2275) In BatchOperation.SingleOperation, do not clone byte[]
[ https://issues.apache.org/jira/browse/HDDS-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2275: - Labels: pull-request-available (was: ) > In BatchOperation.SingleOperation, do not clone byte[] > -- > > Key: HDDS-2275 > URL: https://issues.apache.org/jira/browse/HDDS-2275 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Attachments: o2275_20191015.patch > > > byte[] is cloned once in the constructor and then it is cloned again in the > getter methods. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
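The HDDS-2275 description above boils down to a defensive-copy pattern applied twice: once on the way into the constructor and again in every getter, so each round trip allocates two copies. A standalone sketch of the pattern (class and method names are illustrative, not the actual Ozone BatchOperation code):

```java
// Illustrates the double-clone pattern HDDS-2275 removes; names are illustrative.
public class SingleOp {
  private final byte[] key;

  // One clone on the way in isolates the stored copy from later caller mutation...
  SingleOp(byte[] key) {
    this.key = key.clone();
  }

  // ...the old code then cloned AGAIN on the way out: two allocations per round trip.
  byte[] getKeyDoubleClone() {
    return key.clone();
  }

  // Dropping the redundant copy: callers must treat the returned array as read-only.
  byte[] getKey() {
    return key;
  }

  public static void main(String[] args) {
    byte[] original = {1, 2, 3};
    SingleOp op = new SingleOp(original);
    original[0] = 9;                     // caller mutates its own array...
    System.out.println(op.getKey()[0]);  // 1 - the constructor copy is isolated
  }
}
```

The trade-off is the usual one: skipping clones saves allocations on a hot path, but the class and its callers must agree not to mutate the shared array.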
[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler
[ https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329570&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329570 ] ASF GitHub Bot logged work on HDDS-2271: Author: ASF GitHub Bot Created on: 17/Oct/19 04:06 Start Date: 17/Oct/19 04:06 Worklog Time Spent: 10m Work Description: szetszwo commented on issue #1625: HDDS-2271. Avoid buffer copying in KeyValueHandler. URL: https://github.com/apache/hadoop/pull/1625#issuecomment-542990305 This was migrated to https://github.com/apache/hadoop-ozone/pull/44 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329570) Time Spent: 1h 10m (was: 1h) > Avoid buffer copying in KeyValueHandler > --- > > Key: HDDS-2271 > URL: https://issues.apache.org/jira/browse/HDDS-2271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Fix For: 0.6.0 > > Attachments: o2271_20191015.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > - In handleGetSmallFile, it first reads chunk data to a byte[] and the copy > them to a ByteString. > - In handlePutBlock/handleGetBlock, in order to get the length, it (1) builds > a ContainerProtos.BlockData and then copies it to a byte[]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2271) Avoid buffer copying in KeyValueHandler
[ https://issues.apache.org/jira/browse/HDDS-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDDS-2271: - Fix Version/s: 0.6.0 Resolution: Fixed Status: Resolved (was: Patch Available) I have committed this. > Avoid buffer copying in KeyValueHandler > --- > > Key: HDDS-2271 > URL: https://issues.apache.org/jira/browse/HDDS-2271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Fix For: 0.6.0 > > Attachments: o2271_20191015.patch > > Time Spent: 50m > Remaining Estimate: 0h > > - In handleGetSmallFile, it first reads chunk data to a byte[] and the copy > them to a ByteString. > - In handlePutBlock/handleGetBlock, in order to get the length, it (1) builds > a ContainerProtos.BlockData and then copies it to a byte[]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler
[ https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329569&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329569 ] ASF GitHub Bot logged work on HDDS-2271: Author: ASF GitHub Bot Created on: 17/Oct/19 04:05 Start Date: 17/Oct/19 04:05 Worklog Time Spent: 10m Work Description: szetszwo commented on pull request #1625: HDDS-2271. Avoid buffer copying in KeyValueHandler. URL: https://github.com/apache/hadoop/pull/1625 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329569) Time Spent: 1h (was: 50m) > Avoid buffer copying in KeyValueHandler > --- > > Key: HDDS-2271 > URL: https://issues.apache.org/jira/browse/HDDS-2271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Fix For: 0.6.0 > > Attachments: o2271_20191015.patch > > Time Spent: 1h > Remaining Estimate: 0h > > - In handleGetSmallFile, it first reads chunk data to a byte[] and the copy > them to a ByteString. > - In handlePutBlock/handleGetBlock, in order to get the length, it (1) builds > a ContainerProtos.BlockData and then copies it to a byte[]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler
[ https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329568 ] ASF GitHub Bot logged work on HDDS-2271: Author: ASF GitHub Bot Created on: 17/Oct/19 04:04 Start Date: 17/Oct/19 04:04 Worklog Time Spent: 10m Work Description: szetszwo commented on pull request #44: HDDS-2271. Avoid buffer copying in KeyValueHandler. URL: https://github.com/apache/hadoop-ozone/pull/44 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329568) Time Spent: 50m (was: 40m) > Avoid buffer copying in KeyValueHandler > --- > > Key: HDDS-2271 > URL: https://issues.apache.org/jira/browse/HDDS-2271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Attachments: o2271_20191015.patch > > Time Spent: 50m > Remaining Estimate: 0h > > - In handleGetSmallFile, it first reads chunk data to a byte[] and the copy > them to a ByteString. > - In handlePutBlock/handleGetBlock, in order to get the length, it (1) builds > a ContainerProtos.BlockData and then copies it to a byte[]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2271) Avoid buffer copying in KeyValueHandler
[ https://issues.apache.org/jira/browse/HDDS-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953378#comment-16953378 ] Tsz-wo Sze commented on HDDS-2271: -- Thanks [~shashikant] and [~hanishakoneru] for reviewing this. > Avoid buffer copying in KeyValueHandler > --- > > Key: HDDS-2271 > URL: https://issues.apache.org/jira/browse/HDDS-2271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Attachments: o2271_20191015.patch > > Time Spent: 40m > Remaining Estimate: 0h > > - In handleGetSmallFile, it first reads chunk data to a byte[] and the copy > them to a ByteString. > - In handlePutBlock/handleGetBlock, in order to get the length, it (1) builds > a ContainerProtos.BlockData and then copies it to a byte[]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2271) Avoid buffer copying in KeyValueHandler
[ https://issues.apache.org/jira/browse/HDDS-2271?focusedWorklogId=329567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329567 ] ASF GitHub Bot logged work on HDDS-2271: Author: ASF GitHub Bot Created on: 17/Oct/19 04:02 Start Date: 17/Oct/19 04:02 Worklog Time Spent: 10m Work Description: szetszwo commented on pull request #44: HDDS-2271. Avoid buffer copying in KeyValueHandler. URL: https://github.com/apache/hadoop-ozone/pull/44 Migrated from https://github.com/apache/hadoop/pull/1625 See https://issues.apache.org/jira/browse/HDDS-2271 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329567) Time Spent: 40m (was: 0.5h) > Avoid buffer copying in KeyValueHandler > --- > > Key: HDDS-2271 > URL: https://issues.apache.org/jira/browse/HDDS-2271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Attachments: o2271_20191015.patch > > Time Spent: 40m > Remaining Estimate: 0h > > - In handleGetSmallFile, it first reads chunk data to a byte[] and the copy > them to a ByteString. > - In handlePutBlock/handleGetBlock, in order to get the length, it (1) builds > a ContainerProtos.BlockData and then copies it to a byte[]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
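The HDDS-2271 thread above is about removing a read-into-byte[]-then-copy-again step (e.g. chunk data copied into a ByteString in handleGetSmallFile). The same copy-vs-wrap trade-off can be shown with plain NIO; this is only an analogy under that assumption, not the actual KeyValueHandler fix, which operates on protobuf types:

```java
import java.nio.ByteBuffer;

// Copy-vs-wrap analogy for HDDS-2271 using plain NIO (illustrative only).
public class WrapVsCopy {
  // Extra copy: allocate a second buffer and duplicate every byte.
  static ByteBuffer viaCopy(byte[] chunk) {
    ByteBuffer out = ByteBuffer.allocate(chunk.length);
    out.put(chunk);
    out.flip();
    return out;
  }

  // Zero-copy: wrap the existing array. No second allocation, but the view
  // aliases the array, so the producer must not modify it afterwards.
  static ByteBuffer viaWrap(byte[] chunk) {
    return ByteBuffer.wrap(chunk);
  }

  public static void main(String[] args) {
    byte[] chunk = {10, 20, 30};
    System.out.println(viaCopy(chunk).get(0)); // 10, after a full extra copy
    System.out.println(viaWrap(chunk).get(0)); // 10, without the extra allocation
  }
}
```

For multi-megabyte chunks the avoided copy is the difference between one and two full traversals of the data per request.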
[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953341#comment-16953341 ] Ayush Saxena commented on HDFS-14768: - [~gjhkael] is the scenario you are talking about the one in HDFS-14847? > In some cases, erasure blocks are corruption when they are reconstruct. > > > Key: HDFS-14768 > URL: https://issues.apache.org/jira/browse/HDFS-14768 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding, hdfs, namenode >Affects Versions: 3.0.2 >Reporter: guojh >Assignee: guojh >Priority: Major > Labels: patch > Fix For: 3.3.0 > > Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, > HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, > HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, > guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, > zhaoyiming_UT_beofre_deomission.txt > > > Policy is RS-6-3-1024K, version is hadoop 3.0.2; > We suppose a file's block index is [0,1,2,3,4,5,6,7,8], and decommission > index [3,4], then increase the index 6 datanode's > pendingReplicationWithoutTargets to make it larger than > replicationStreamsHardLimit (we set 14). Then, after the method > chooseSourceDatanodes of BlockManager, the liveBlockIndices is > [0,1,2,3,4,5,7,8] and the block counter is Live:7, Decommission:2. > In method scheduleReconstruction of BlockManager, the additionalReplRequired > is 9 - 7 = 2. After the Namenode chooses two target datanodes, it will assign an > erasure coding task to each target datanode. > When the datanode gets the task, it will build targetIndices from liveBlockIndices > and the target length. The code is below. 
> {code:java} > // code placeholder > targetIndices = new short[targets.length]; > private void initTargetIndices() { > BitSet bitset = reconstructor.getLiveBitSet(); > int m = 0; hasValidTargets = false; > for (int i = 0; i < dataBlkNum + parityBlkNum; i++) { > if (!bitset.get(i)) { > if (reconstructor.getBlockLen(i) > 0) { > if (m < targets.length) { > targetIndices[m++] = (short)i; > hasValidTargets = true; > } > } > } > } > } > {code} > targetIndices[0]=6, and targetIndices[1] is always 0 from its initial value. > The StripedReader always creates readers from the first 6 index blocks, i.e. > [0,1,2,3,4,5]. > Using indices [0,1,2,3,4,5] to build target indices [6,0] will trigger the ISA-L > bug: block index 6's data is corrupt (all data is zero). > I wrote a unit test that reproduces this reliably. > {code:java} > // code placeholder > private int replicationStreamsHardLimit = > DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT; > numDNs = dataBlocks + parityBlocks + 10; > @Test(timeout = 24) > public void testFileDecommission() throws Exception { > LOG.info("Starting test testFileDecommission"); > final Path ecFile = new Path(ecDir, "testFileDecommission"); > int writeBytes = cellSize * dataBlocks; > writeStripedFile(dfs, ecFile, writeBytes); > Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks()); > FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes); > final INodeFile fileNode = cluster.getNamesystem().getFSDirectory() > .getINode4Write(ecFile.toString()).asFile(); > LocatedBlocks locatedBlocks = > StripedFileTestUtil.getLocatedBlocks(ecFile, dfs); > LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0) > .get(0); > DatanodeInfo[] dnLocs = lb.getLocations(); > LocatedStripedBlock lastBlock = > (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock(); > DatanodeInfo[] storageInfos = lastBlock.getLocations(); > // > DatanodeDescriptor datanodeDescriptor = > cluster.getNameNode().getNamesystem() > > 
.getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid()); > BlockInfo firstBlock = fileNode.getBlocks()[0]; > DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock); > // the first heartbeat will consume 3 replica tasks > for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) { > BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor, new > Block(i), > new DatanodeStorageInfo[]{dStorageInfos[0]}); > } > assertEquals(dataBlocks + parityBlocks, dnLocs.length); > int[] decommNodeIndex = {3, 4}; > final List decommisionNodes = new ArrayList(); > // add the node which will be decommissioning > decommisionNodes.add(dnLocs[decommNodeIndex[0]]); > decommisionNodes.add(dnLocs[decommNodeIndex[1]]); > decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED); > assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes()); > bm.
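The quoted `initTargetIndices()` loop lost its method arguments in transit; assuming the calls were `bitset.get(i)` and `getBlockLen(i)`, the failure the reporter describes can be modeled standalone: when fewer missing indices qualify than there are reconstruction targets, the tail of `targetIndices` keeps its zero-initialized value, producing the bogus `[6, 0]` pair.

```java
import java.util.BitSet;

// Standalone model of the initTargetIndices() bug described in HDFS-14768.
// dataBlkNum=6, parityBlkNum=3 matches the RS-6-3 policy in the report.
public class TargetIndicesBug {
  static short[] initTargetIndices(BitSet liveBitSet, long[] blockLen, int numTargets) {
    short[] targetIndices = new short[numTargets];   // zero-initialized!
    int m = 0;
    for (int i = 0; i < 9; i++) {                    // dataBlkNum + parityBlkNum
      if (!liveBitSet.get(i) && blockLen[i] > 0 && m < numTargets) {
        targetIndices[m++] = (short) i;
      }
    }
    return targetIndices;                            // unfilled slots stay 0
  }

  /** The report's scenario: live indices [0,1,2,3,4,5,7,8], two targets assigned. */
  static short[] demo() {
    BitSet live = new BitSet();
    for (int i : new int[]{0, 1, 2, 3, 4, 5, 7, 8}) {
      live.set(i);                                   // only index 6 is missing
    }
    long[] len = {1, 1, 1, 1, 1, 1, 1, 1, 1};
    return initTargetIndices(live, len, 2);
  }

  public static void main(String[] args) {
    short[] t = demo();
    System.out.println(t[0] + " " + t[1]);           // 6 0 - the bogus trailing 0
  }
}
```

Only one missing index (6) qualifies, yet two targets were allocated, so the second slot silently tells the reconstructor to rebuild block index 0, which is still live.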
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953314#comment-16953314 ] Jinglun commented on HDFS-14908: Hi [~elgoiri], thanks your nice comments ! One place is FederationUtil.isParentEntry(). When I did v01 I took it as reference. I'll make a benchmark comparing this method and String.startsWith(). > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is the prefix of the open files. We should check whether the filter path > is the parent/ancestor of the open files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
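The problem HDFS-14908 describes above is that a plain `String.startsWith()` prefix test treats "/foobar" as being under "/foo". A path-aware check must also require a separator boundary after the prefix; a sketch in the spirit of the referenced FederationUtil.isParentEntry (names illustrative, not the actual LeaseManager patch):

```java
// Path-aware ancestor check vs. naive prefix check (HDFS-14908); names illustrative.
public class OpenFileFilter {
  static boolean isParentEntry(String path, String parent) {
    if (!path.startsWith(parent)) {
      return false;
    }
    // Accept an exact match, the root parent, or a '/' right after the prefix;
    // otherwise "/foo" would wrongly claim "/foobar" as a child.
    return path.length() == parent.length()
        || parent.equals("/")
        || path.charAt(parent.length()) == '/';
  }

  public static void main(String[] args) {
    System.out.println("/foo/bar".startsWith("/foo"));      // true
    System.out.println("/foobar".startsWith("/foo"));       // true  - false positive
    System.out.println(isParentEntry("/foo/bar", "/foo"));  // true
    System.out.println(isParentEntry("/foobar", "/foo"));   // false - fixed
  }
}
```

With the naive check, listOpenFiles("/foo") would also report open files under the unrelated directory "/foobar".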
[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953287#comment-16953287 ] guojh commented on HDFS-14768: -- [~surendrasingh] If I decommission two datanodes that have block indices 6 and 8, then after the first replication schedule index 6 is successfully replicated but index 8 is not. The file then has indices [0, 1, 2, 3, 4, 5, 6(Decommissioning), 7, 8(Decommissioning), 6]; after the timeout, the BlockManager will schedule it again, and findLeavingServiceSources will still choose index 6 to replicate. This is another problem. Should I create a new issue and write another UT? > In some cases, erasure blocks are corruption when they are reconstruct. > > > Key: HDFS-14768 > URL: https://issues.apache.org/jira/browse/HDFS-14768 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding, hdfs, namenode >Affects Versions: 3.0.2 >Reporter: guojh >Assignee: guojh >Priority: Major > Labels: patch > Fix For: 3.3.0 > > Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, > HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, > HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, > guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, > zhaoyiming_UT_beofre_deomission.txt > > > Policy is RS-6-3-1024K, version is hadoop 3.0.2; > We suppose a file's block index is [0,1,2,3,4,5,6,7,8], and decommission > index [3,4], then increase the index 6 datanode's > pendingReplicationWithoutTargets to make it larger than > replicationStreamsHardLimit (we set 14). Then, after the method > chooseSourceDatanodes of BlockManager, the liveBlockIndices is > [0,1,2,3,4,5,7,8] and the block counter is Live:7, Decommission:2. > In method scheduleReconstruction of BlockManager, the additionalReplRequired > is 9 - 7 = 2. After the Namenode chooses two target datanodes, it will assign an > erasure coding task to each target datanode. 
> When the datanode gets the task, it will build targetIndices from liveBlockIndices > and the target length. The code is below. > {code:java} > // code placeholder > targetIndices = new short[targets.length]; > private void initTargetIndices() { > BitSet bitset = reconstructor.getLiveBitSet(); > int m = 0; hasValidTargets = false; > for (int i = 0; i < dataBlkNum + parityBlkNum; i++) { > if (!bitset.get(i)) { > if (reconstructor.getBlockLen(i) > 0) { > if (m < targets.length) { > targetIndices[m++] = (short)i; > hasValidTargets = true; > } > } > } > } > } > {code} > targetIndices[0]=6, and targetIndices[1] is always 0 from its initial value. > The StripedReader always creates readers from the first 6 index blocks, i.e. > [0,1,2,3,4,5]. > Using indices [0,1,2,3,4,5] to build target indices [6,0] will trigger the ISA-L > bug: block index 6's data is corrupt (all data is zero). > I wrote a unit test that reproduces this reliably. > {code:java} > // code placeholder > private int replicationStreamsHardLimit = > DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT; > numDNs = dataBlocks + parityBlocks + 10; > @Test(timeout = 24) > public void testFileDecommission() throws Exception { > LOG.info("Starting test testFileDecommission"); > final Path ecFile = new Path(ecDir, "testFileDecommission"); > int writeBytes = cellSize * dataBlocks; > writeStripedFile(dfs, ecFile, writeBytes); > Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks()); > FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes); > final INodeFile fileNode = cluster.getNamesystem().getFSDirectory() > .getINode4Write(ecFile.toString()).asFile(); > LocatedBlocks locatedBlocks = > StripedFileTestUtil.getLocatedBlocks(ecFile, dfs); > LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0) > .get(0); > DatanodeInfo[] dnLocs = lb.getLocations(); > LocatedStripedBlock lastBlock = > (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock(); > DatanodeInfo[] storageInfos = lastBlock.getLocations(); > // > 
DatanodeDescriptor datanodeDescriptor = > cluster.getNameNode().getNamesystem() > > .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid()); > BlockInfo firstBlock = fileNode.getBlocks()[0]; > DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock); > // the first heartbeat will consume 3 replica tasks > for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) { > BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor, new > Block(i), > new DatanodeStorageInfo[]{dStorageInfos[0]}); > } > assertEquals(dataBlocks + parityBlocks, dnLocs.length); > int[] decommNodeI
[jira] [Updated] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer
[ https://issues.apache.org/jira/browse/HDDS-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vivek Ratnavel Subramanian updated HDDS-2181: - Status: Patch Available (was: Reopened) > Ozone Manager should send correct ACL type in ACL requests to Authorizer > > > Key: HDDS-2181 > URL: https://issues.apache.org/jira/browse/HDDS-2181 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 11h > Remaining Estimate: 0h > > Currently, Ozone manager sends "WRITE" as ACLType for key create, key delete > and bucket create operation. Fix the acl type in all requests to the > authorizer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer
[ https://issues.apache.org/jira/browse/HDDS-2181?focusedWorklogId=329498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329498 ] ASF GitHub Bot logged work on HDDS-2181: Author: ASF GitHub Bot Created on: 17/Oct/19 00:14 Start Date: 17/Oct/19 00:14 Worklog Time Spent: 10m Work Description: vivekratnavel commented on pull request #43: HDDS-2181. Ozone Manager should send correct ACL type in ACL requests… URL: https://github.com/apache/hadoop-ozone/pull/43 … to Authorizer ## What changes were proposed in this pull request? The ACL types sent to authorizers is changed from sending "WRITE" ACL type always to sending appropriate ACL types as required. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2181 ## How was this patch tested? This patch was tested by updating and running unit tests and acceptance tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329498) Time Spent: 11h (was: 10h 50m) > Ozone Manager should send correct ACL type in ACL requests to Authorizer > > > Key: HDDS-2181 > URL: https://issues.apache.org/jira/browse/HDDS-2181 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 11h > Remaining Estimate: 0h > > Currently, Ozone manager sends "WRITE" as ACLType for key create, key delete > and bucket create operation. Fix the acl type in all requests to the > authorizer. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
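The HDDS-2181 fix above stops hard-coding ACLType WRITE for every authorizer request and derives the ACL type from the operation instead. A hedged sketch of such an operation-to-ACL mapping (the enums and table are illustrative, not the real OzoneManager or IAccessAuthorizer types):

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative operation-to-ACL mapping in the spirit of HDDS-2181; these enum
// names and the table are a sketch, not the actual Ozone Manager code.
public class AclMapping {
  enum Op { KEY_CREATE, KEY_DELETE, KEY_READ, BUCKET_CREATE }
  enum AclType { CREATE, DELETE, READ, WRITE }

  static final Map<Op, AclType> ACL_FOR_OP = new EnumMap<>(Op.class);
  static {
    ACL_FOR_OP.put(Op.KEY_CREATE, AclType.CREATE);    // was blanket WRITE before
    ACL_FOR_OP.put(Op.KEY_DELETE, AclType.DELETE);    // was blanket WRITE before
    ACL_FOR_OP.put(Op.KEY_READ, AclType.READ);
    ACL_FOR_OP.put(Op.BUCKET_CREATE, AclType.CREATE); // was blanket WRITE before
  }

  public static void main(String[] args) {
    System.out.println(ACL_FOR_OP.get(Op.KEY_DELETE)); // DELETE, not a blanket WRITE
  }
}
```

Sending the precise type matters because an external authorizer (e.g. a Ranger-style policy engine) may grant WRITE while denying DELETE, and a blanket WRITE request would bypass that distinction.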
[jira] [Resolved] (HDDS-2302) Manage common pom versions in one common place
[ https://issues.apache.org/jira/browse/HDDS-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2302. Fix Version/s: 0.5.0 Resolution: Fixed > Manage common pom versions in one common place > -- > > Key: HDDS-2302 > URL: https://issues.apache.org/jira/browse/HDDS-2302 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Some of the versions (eg. ozone.version, hdds.version, ratis.version) are > required for both ozone and hdds subprojects. As we have a common pom.xml it > can be safer to manage them in one common place at the root pom.xml instead > of managing them multiple times. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2302) Manage common pom versions in one common place
[ https://issues.apache.org/jira/browse/HDDS-2302?focusedWorklogId=329484&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329484 ] ASF GitHub Bot logged work on HDDS-2302: Author: ASF GitHub Bot Created on: 16/Oct/19 23:32 Start Date: 16/Oct/19 23:32 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #21: HDDS-2302. Manage common pom versions in one common place URL: https://github.com/apache/hadoop-ozone/pull/21 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329484) Time Spent: 20m (was: 10m) > Manage common pom versions in one common place > -- > > Key: HDDS-2302 > URL: https://issues.apache.org/jira/browse/HDDS-2302 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Some of the versions (eg. ozone.version, hdds.version, ratis.version) are > required for both ozone and hdds subprojects. As we have a common pom.xml it > can be safer to manage them in one common place at the root pom.xml instead > of managing them multiple times. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
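Centralizing the versions, as HDDS-2302 above does, means declaring each shared version once as a property in the root pom.xml and referencing it from the subprojects. A minimal illustration — the property names ozone.version, hdds.version, and ratis.version come from the issue, while the values and the ratis-server dependency are placeholders:

```xml
<!-- root pom.xml: declare shared versions once... -->
<properties>
  <ozone.version>0.5.0-SNAPSHOT</ozone.version>  <!-- placeholder value -->
  <hdds.version>0.5.0-SNAPSHOT</hdds.version>    <!-- placeholder value -->
  <ratis.version>0.4.0</ratis.version>           <!-- placeholder value -->
</properties>

<!-- ...subproject pom.xml: reference the property, never redefine it -->
<dependency>
  <groupId>org.apache.ratis</groupId>
  <artifactId>ratis-server</artifactId>
  <version>${ratis.version}</version>
</dependency>
```

With a single definition, bumping a dependency touches one line, and the hdds and ozone subprojects cannot drift onto different versions of the same artifact.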
[jira] [Resolved] (HDDS-2289) Put testing information and a problem description to the github PR template
[ https://issues.apache.org/jira/browse/HDDS-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2289. Resolution: Fixed > Put testing information and a problem description to the github PR template > --- > > Key: HDDS-2289 > URL: https://issues.apache.org/jira/browse/HDDS-2289 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This is suggested by [~aengineer] during an offline discussion to add more > information to the github PR template based on the template of ambari (by > Vivek): > https://github.com/apache/ambari/commit/579cec8cf5bcfe1a1a0feacf055ed6569f674e6a -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope
[ https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953236#comment-16953236 ] Hadoop QA commented on HDFS-14909: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 53s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 10s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}162m 8s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14909 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983211/HDFS-14909.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7c35529448fa 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 375224e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28099/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28099/testReport/ | | Max. process+thread count | 3149 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-
[jira] [Commented] (HDDS-2271) Avoid buffer copying in KeyValueHandler
[ https://issues.apache.org/jira/browse/HDDS-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953233#comment-16953233 ] Hanisha Koneru commented on HDDS-2271: -- Thanks [~szetszwo] for the patch. LGTM. +1. > Avoid buffer copying in KeyValueHandler > --- > > Key: HDDS-2271 > URL: https://issues.apache.org/jira/browse/HDDS-2271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Attachments: o2271_20191015.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > - In handleGetSmallFile, it first reads chunk data into a byte[] and then copies > it to a ByteString. > - In handlePutBlock/handleGetBlock, in order to get the length, it first builds > a ContainerProtos.BlockData and then copies it to a byte[]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
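The copy-versus-wrap distinction behind HDDS-2271 can be illustrated with plain java.nio buffers; protobuf's ByteString behaves analogously (ByteString.copyFrom duplicates the array, while an unsafe wrap shares it). The method and variable names below are illustrative, not Ozone's actual handler code:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BufferCopyDemo {
  // Copying approach: allocate a new array and duplicate the data,
  // which is the extra cost the issue wants to avoid.
  public static ByteBuffer withCopy(byte[] chunkData) {
    return ByteBuffer.wrap(Arrays.copyOf(chunkData, chunkData.length));
  }

  // Wrapping approach: share the backing array, no copy. Safe only if the
  // caller never mutates chunkData after handing it over.
  public static ByteBuffer wrapped(byte[] chunkData) {
    return ByteBuffer.wrap(chunkData);
  }

  public static void main(String[] args) {
    byte[] data = {1, 2, 3};
    ByteBuffer copied = withCopy(data);
    ByteBuffer shared = wrapped(data);
    data[0] = 9; // mutation is visible only through the shared view
    System.out.println(copied.get(0) + " " + shared.get(0)); // prints: 1 9
  }
}
```

The trade-off is ownership: wrapping is only correct when the byte[] is effectively immutable after the handoff, which is why protobuf hides the zero-copy variant behind UnsafeByteOperations.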
[jira] [Comment Edited] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953232#comment-16953232 ] Siddharth Wagle edited comment on HDDS-2283 at 10/16/19 10:22 PM: -- [~aengineer] Yes, the follow-up Jira will not be taken up blindly without first figuring out which is better or worse: tens of RocksDBs sharing a disk, or one RocksDB per disk with ten tables inside it. I took this up as low-hanging fruit, and I agree with the comment about not focusing on micro-benchmarks. This was just a curiosity/exploratory effort that took all of 20 mins including the fix, so I went ahead with the patch. was (Author: swagle): [~aengineer] Yes the follow-up Jira will not be blindly taken up without figuring out if 10s of RocksDBs sharing a disk vs 1 RocksDB with 10 tables in a single RocksDB which one is better/worse. I took this up as a low hanging fruit, agree with not focusing on micro-benchmarks comment. This was just a curiosity / exploratory effort from me that took all of 20 mins including the fix. > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953232#comment-16953232 ] Siddharth Wagle commented on HDDS-2283: --- [~aengineer] Yes the follow-up Jira will not be blindly taken up without figuring out if 10s of RocksDBs sharing a disk vs 1 RocksDB with 10 tables in a single RocksDB which one is better/worse. I took this up as a low hanging fruit, agree with not focusing on micro-benchmarks comment. This was just a curiosity / exploratory effort from me that took all of 20 mins including the fix. > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2275) In BatchOperation.SingleOperation, do not clone byte[]
[ https://issues.apache.org/jira/browse/HDDS-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953216#comment-16953216 ] Hanisha Koneru commented on HDDS-2275: -- Thanks [~szetszwo] for the patch. LGTM. +1. > In BatchOperation.SingleOperation, do not clone byte[] > -- > > Key: HDDS-2275 > URL: https://issues.apache.org/jira/browse/HDDS-2275 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Attachments: o2275_20191015.patch > > > The byte[] is cloned once in the constructor and then cloned again in the > getter methods. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
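A minimal sketch of the double-clone pattern the issue describes; the class and method names below are hypothetical, only the clone-in-constructor plus clone-in-getter shape is taken from the report:

```java
public class SingleOperationDemo {
  private final byte[] key;

  // Clone once on the way in, so later caller-side mutation of the
  // argument cannot corrupt the stored key.
  public SingleOperationDemo(byte[] key) {
    this.key = key.clone();
  }

  // The anti-pattern: cloning again on the way out means every read
  // allocates and copies a fresh array.
  public byte[] getKeyCloned() {
    return key.clone();
  }

  // One possible fix, dropping the getter-side clone: return the internal
  // array directly and rely on callers treating it as read-only.
  public byte[] getKey() {
    return key;
  }

  public static void main(String[] args) {
    byte[] src = {1, 2};
    SingleOperationDemo op = new SingleOperationDemo(src);
    src[0] = 9; // does not affect the stored copy
    System.out.println(op.getKey()[0]); // prints: 1
  }
}
```

Keeping exactly one defensive copy per operation, instead of one per access, is the whole optimization; it only holds if callers honor the read-only contract on the returned array.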
[jira] [Work logged] (HDDS-2289) Put testing information and a problem description to the github PR template
[ https://issues.apache.org/jira/browse/HDDS-2289?focusedWorklogId=329408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329408 ] ASF GitHub Bot logged work on HDDS-2289: Author: ASF GitHub Bot Created on: 16/Oct/19 21:07 Start Date: 16/Oct/19 21:07 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #5: HDDS-2289. Put testing information and a problem description to the g… URL: https://github.com/apache/hadoop-ozone/pull/5 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329408) Time Spent: 20m (was: 10m) > Put testing information and a problem description to the github PR template > --- > > Key: HDDS-2289 > URL: https://issues.apache.org/jira/browse/HDDS-2289 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This is suggested by [~aengineer] during an offline discussion to add more > information to the github PR template based on the template of ambari (by > Vivek): > https://github.com/apache/ambari/commit/579cec8cf5bcfe1a1a0feacf055ed6569f674e6a -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer
[ https://issues.apache.org/jira/browse/HDDS-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953198#comment-16953198 ] Anu Engineer commented on HDDS-2181: [~elek] Didn't you revert this ? > Ozone Manager should send correct ACL type in ACL requests to Authorizer > > > Key: HDDS-2181 > URL: https://issues.apache.org/jira/browse/HDDS-2181 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10h 50m > Remaining Estimate: 0h > > Currently, Ozone manager sends "WRITE" as ACLType for key create, key delete > and bucket create operation. Fix the acl type in all requests to the > authorizer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2283: --- Comment: was deleted (was: I really doubt if this is a bottleneck. I am not disagreeing with the measurement. I am just saying in the big picture of writing 5GB of data, optimizing for this 300 ms buys you nothing. if you are optimizing for micro-benchmarks, what we will end of building is a system that is optimized for these benchmarks; not something that is really needed. ) > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953181#comment-16953181 ] Anu Engineer commented on HDDS-2283: I am fine with this change, but the follow-up JIRA makes no sense to me; my earlier deleted comment was about that. > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953179#comment-16953179 ] Anu Engineer commented on HDDS-2283: I really doubt that this is a bottleneck. I am not disagreeing with the measurement; I am just saying that in the big picture of writing 5GB of data, optimizing away this 300 ms buys you nothing. If we optimize for micro-benchmarks, what we will end up building is a system that is optimized for those benchmarks, not something that is really needed. > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk
[ https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953176#comment-16953176 ] Anu Engineer commented on HDDS-2317: and how do we propose to move a container from one machine to another now? > Change rocksDB per Container model to have table per container on RocksDb per > disk > -- > > Key: HDDS-2317 > URL: https://issues.apache.org/jira/browse/HDDS-2317 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Priority: Major > > Idea proposed by [~msingh] in HDDS-2283. > Better utilize disk bandwidth by having Rocks DB per disk and put containers > as tables inside. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk
[ https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953175#comment-16953175 ] Anu Engineer commented on HDDS-2317: Don't we create a container once per 5GB of data? are we optimizing for the micro-bench-marking scenarios? I am really surprised to learn that 300ms for a DB creation is the bottleneck for 5GB data write. > Change rocksDB per Container model to have table per container on RocksDb per > disk > -- > > Key: HDDS-2317 > URL: https://issues.apache.org/jira/browse/HDDS-2317 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Priority: Major > > Idea proposed by [~msingh] in HDDS-2283. > Better utilize disk bandwidth by having Rocks DB per disk and put containers > as tables inside. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
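To make the HDDS-2317 proposal concrete, here is a sketch of "one store per disk, one logical table per container" using key prefixes over an ordered map. In a real datanode the per-disk store would be RocksDB and the logical tables would likely be column families or prefixed key ranges; that mapping is my assumption, as the issue does not fix a mechanism:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PerDiskStoreDemo {
  // One ordered store per disk; the container id is the key prefix, so a
  // container behaves like its own table without paying a per-container
  // database-creation cost (the ~300ms measured in HDDS-2283).
  private final NavigableMap<String, String> db = new TreeMap<>();

  public void put(long containerId, String key, String value) {
    db.put(containerId + "|" + key, value);
  }

  public String get(long containerId, String key) {
    return db.get(containerId + "|" + key);
  }

  // Dropping a container is a range delete over its prefix; '}' is the
  // character after '|', so ["id|", "id}") covers exactly one container.
  public void dropContainer(long containerId) {
    db.subMap(containerId + "|", true, containerId + "}", false).clear();
  }

  public static void main(String[] args) {
    PerDiskStoreDemo diskStore = new PerDiskStoreDemo();
    diskStore.put(5, "blk_1", "data");
    diskStore.put(51, "blk_2", "data");
    diskStore.dropContainer(5); // container 51 is untouched
    System.out.println(diskStore.get(5, "blk_1") + " " + diskStore.get(51, "blk_2"));
  }
}
```

This also hints at the trade-off Anu raises above: with a shared per-disk store, moving a container to another machine is no longer a simple file copy of its private DB directory but an export of its key range.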
[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope
[ https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953133#comment-16953133 ] Surendra Singh Lilhore commented on HDFS-14909: --- Thanks [~elgoiri] for review. Attached v3 patch. > DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage > count for excluded node which is already part of excluded scope > - > > Key: HDFS-14909 > URL: https://issues.apache.org/jira/browse/HDFS-14909 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch, > HDFS-14909.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope
[ https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-14909: -- Attachment: HDFS-14909.003.patch > DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage > count for excluded node which is already part of excluded scope > - > > Key: HDFS-14909 > URL: https://issues.apache.org/jira/browse/HDFS-14909 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch, > HDFS-14909.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953129#comment-16953129 ] Surendra Singh Lilhore commented on HDFS-14768: --- Hi [~gjhkael], why is this change required? I didn't get this; can you please explain? {code:java} + getSrcNodes()[i].isAlive())) { +// Filter out decommission blocks that have successfully replica +boolean existLiveReplica = false; +int currentBlockIndex = liveBlockIndicies[i]; +for (int j = liveBlockIndicies.length - 1; j > i; j--) { + if (liveBlockIndicies[j] == currentBlockIndex + && getSrcNodes()[j].isInService()) { +existLiveReplica = true; +break; + } +} +if (!existLiveReplica) { + srcIndices.add(i); +} {code} > In some cases, erasure blocks are corruption when they are reconstruct. > > > Key: HDFS-14768 > URL: https://issues.apache.org/jira/browse/HDFS-14768 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, erasure-coding, hdfs, namenode >Affects Versions: 3.0.2 >Reporter: guojh >Assignee: guojh >Priority: Major > Labels: patch > Fix For: 3.3.0 > > Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg, > HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch, > HDFS-14768.003.patch, HDFS-14768.jpg, guojh_UT_after_deomission.txt, > guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt, > zhaoyiming_UT_beofre_deomission.txt > > > Policy is RS-6-3-1024K, version is Hadoop 3.0.2. > Suppose a file's block indices are [0,1,2,3,4,5,6,7,8], and we decommission > indices [3,4] and increase the index-6 datanode's > pendingReplicationWithoutTargets so that it becomes larger than > replicationStreamsHardLimit (we set 14). Then, after the method > chooseSourceDatanodes of BlockManager, liveBlockIndices is > [0,1,2,3,4,5,7,8] and the block counters are Live: 7, Decommission: 2. > In the method scheduleReconstruction of BlockManager, additionalReplRequired > is 9 - 7 = 2. 
After the NameNode chooses two target DataNodes, it will assign an > erasure coding task to them. > When the DataNode gets the task, it builds targetIndices from liveBlockIndices > and the target length. The code is below. > {code:java} > targetIndices = new short[targets.length]; > private void initTargetIndices() { > BitSet bitset = reconstructor.getLiveBitSet(); > int m = 0; hasValidTargets = false; > for (int i = 0; i < dataBlkNum + parityBlkNum; i++) { > if (!bitset.get(i)) { > if (reconstructor.getBlockLen(i) > 0) { > if (m < targets.length) { > targetIndices[m++] = (short) i; > hasValidTargets = true; > } > } > } > } > } > {code} > targetIndices[0] = 6, and targetIndices[1] is always 0, its initial value. > The StripedReader always creates readers from the first 6 block indices, > i.e. [0,1,2,3,4,5]. > Using indices [0,1,2,3,4,5] to build target indices [6,0] will trigger the ISA-L > bug: block index 6's data is corrupted (all data is zero). > I wrote a unit test that reproduces this reliably. > {code:java} > private int replicationStreamsHardLimit = > DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT; > numDNs = dataBlocks + parityBlocks + 10; > @Test(timeout = 24) > public void testFileDecommission() throws Exception { > LOG.info("Starting test testFileDecommission"); > final Path ecFile = new Path(ecDir, "testFileDecommission"); > int writeBytes = cellSize * dataBlocks; > writeStripedFile(dfs, ecFile, writeBytes); > Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks()); > FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes); > final INodeFile fileNode = cluster.getNamesystem().getFSDirectory() > .getINode4Write(ecFile.toString()).asFile(); > LocatedBlocks locatedBlocks = > StripedFileTestUtil.getLocatedBlocks(ecFile, dfs); > LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0) > .get(0); > DatanodeInfo[] dnLocs = lb.getLocations(); > LocatedStripedBlock lastBlock = > 
(LocatedStripedBlock)locatedBlocks.getLastLocatedBlock(); > DatanodeInfo[] storageInfos = lastBlock.getLocations(); > // > DatanodeDescriptor datanodeDescriptor = > cluster.getNameNode().getNamesystem() > > .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid()); > BlockInfo firstBlock = fileNode.getBlocks()[0]; > DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock); > // the first heartbeat will consume 3 replica tasks > for (int i = 0; i <= replicationStreamsHardLimit + 3; i++)
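The zero-fill problem in the initTargetIndices loop quoted above can be reproduced without HDFS: when two reconstruction targets are assigned but only one block index is actually missing from the live set, the second slot of targetIndices keeps Java's default value 0, so block index 0 is silently treated as a reconstruction target. A stand-alone sketch (constants chosen to match the RS-6-3 example in the report; this is an illustration, not the HDFS class):

```java
import java.util.BitSet;

public class TargetIndicesDemo {
  static final int DATA_BLK_NUM = 6;
  static final int PARITY_BLK_NUM = 3;

  // Mirrors the quoted loop: collect the block indices missing from the
  // live set, up to the number of assigned targets.
  public static short[] initTargetIndices(BitSet liveBitSet, int numTargets) {
    short[] targetIndices = new short[numTargets]; // unwritten slots stay 0
    int m = 0;
    for (int i = 0; i < DATA_BLK_NUM + PARITY_BLK_NUM; i++) {
      if (!liveBitSet.get(i) && m < numTargets) {
        targetIndices[m++] = (short) i;
      }
    }
    return targetIndices;
  }

  public static void main(String[] args) {
    // Live indices [0,1,2,3,4,5,7,8] from the report: only index 6 is
    // missing, yet the NameNode assigned 2 targets.
    BitSet live = new BitSet();
    for (int i : new int[]{0, 1, 2, 3, 4, 5, 7, 8}) {
      live.set(i);
    }
    short[] targets = initTargetIndices(live, 2);
    // Slot 1 was never written, so block 0 would be "reconstructed"
    // even though it is alive.
    System.out.println(targets[0] + "," + targets[1]); // prints: 6,0
  }
}
```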
[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd
[ https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953089#comment-16953089 ] hemanthboyina commented on HDFS-13507: -- hi [~gangli2384] are you working on this ? > RBF: Remove update functionality from routeradmin's add cmd > --- > > Key: HDFS-13507 > URL: https://issues.apache.org/jira/browse/HDFS-13507 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Wei Yan >Assignee: Gang Li >Priority: Minor > Labels: incompatible > Attachments: HDFS-13507-HDFS-13891.003.patch, > HDFS-13507-HDFS-13891.004.patch, HDFS-13507.000.patch, HDFS-13507.001.patch, > HDFS-13507.002.patch > > > Follow up the discussion in HDFS-13326. We should remove the "update" > functionality from routeradmin's add cmd, to make it consistent with RPC > calls. > Note that: this is an incompatible change. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14909) DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage count for excluded node which is already part of excluded scope
[ https://issues.apache.org/jira/browse/HDFS-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953062#comment-16953062 ] Íñigo Goiri commented on HDFS-14909: Thanks [~surendrasingh] for the update. Let's just fix the checkstyles and we are good to go. > DFSNetworkTopology#chooseRandomWithStorageType() should not decrease storage > count for excluded node which is already part of excluded scope > - > > Key: HDFS-14909 > URL: https://issues.apache.org/jira/browse/HDFS-14909 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Attachments: HDFS-14909.001.patch, HDFS-14909.002.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk
[ https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDDS-2317: -- Fix Version/s: (was: 0.5.0) > Change rocksDB per Container model to have table per container on RocksDb per > disk > -- > > Key: HDDS-2317 > URL: https://issues.apache.org/jira/browse/HDDS-2317 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Priority: Major > > Idea proposed by [~msingh] in HDDS-2283. > Better utilize disk bandwidth by having Rocks DB per disk and put containers > as tables inside. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk
[ https://issues.apache.org/jira/browse/HDDS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-2317: - Assignee: (was: Siddharth Wagle) > Change rocksDB per Container model to have table per container on RocksDb per > disk > -- > > Key: HDDS-2317 > URL: https://issues.apache.org/jira/browse/HDDS-2317 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Priority: Major > Fix For: 0.5.0 > > > Idea proposed by [~msingh] in HDDS-2283. > Better utilize disk bandwidth by having Rocks DB per disk and put containers > as tables inside. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2317) Change rocksDB per Container model to have table per container on RocksDb per disk
Siddharth Wagle created HDDS-2317: - Summary: Change rocksDB per Container model to have table per container on RocksDb per disk Key: HDDS-2317 URL: https://issues.apache.org/jira/browse/HDDS-2317 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Affects Versions: 0.5.0 Reporter: Siddharth Wagle Assignee: Siddharth Wagle Fix For: 0.5.0 Idea proposed by [~msingh] in HDDS-2283. Better utilize disk bandwidth by having Rocks DB per disk and put containers as tables inside. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?focusedWorklogId=329291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329291 ] ASF GitHub Bot logged work on HDDS-2283: Author: ASF GitHub Bot Created on: 16/Oct/19 17:45 Start Date: 16/Oct/19 17:45 Worklog Time Spent: 10m Work Description: swagle commented on pull request #41: HDDS-2283. Container Creation on datanodes take around 300ms due to rocksdb creation. URL: https://github.com/apache/hadoop-ozone/pull/41 Container Creation on datanodes take around 300ms due to rocksdb creation. Rocksdb creation is taking a considerable time and this needs to be optimized. Creating a rocksdb per disk should be enough and each container can be a table inside the rocksdb. `2019-10-15 13:20:10,714 INFO utils.MetadataStoreBuilder (MetadataStoreBuilder.java:build(124)) - Time before create, load options: 81 2019-10-15 13:20:10,715 INFO utils.RocksDBStore (RocksDBStore.java:(68)) - Time to load library: 0 2019-10-15 13:20:10,723 INFO utils.RocksDBStore (RocksDBStore.java:(75)) - Time to open: 8 2019-10-15 13:20:10,723 INFO helpers.KeyValueContainerUtil (KeyValueContainerUtil.java:createContainerMetaData(85)) - Total time to create: {}95` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329291) Remaining Estimate: 0h Time Spent: 10m > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
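The per-disk model proposed in the thread above (one RocksDB per disk, one table per container) can be sketched roughly as follows. This is only an illustration of the idea, not Ozone code: a ConcurrentHashMap stands in for the per-disk RocksDB instance, a nested map per container ID stands in for a RocksDB table/column family, and all class and method names are made up for the sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerDiskStoreSketch {

  /** One long-lived store per disk; the open cost is paid once per disk. */
  static final class DiskStore {
    private final Map<Long, Map<String, String>> containerTables =
        new ConcurrentHashMap<>();

    /** Creating a "container" is just creating a table, not a new DB. */
    Map<String, String> getOrCreateContainerTable(long containerId) {
      return containerTables.computeIfAbsent(
          containerId, id -> new ConcurrentHashMap<>());
    }
  }

  public static void main(String[] args) {
    DiskStore disk = new DiskStore();  // opened once at datanode startup

    // Two containers share the same per-disk store but stay isolated.
    disk.getOrCreateContainerTable(1L).put("block-1", "chunk-data-a");
    disk.getOrCreateContainerTable(2L).put("block-1", "chunk-data-b");

    System.out.println(disk.getOrCreateContainerTable(1L).get("block-1"));
    System.out.println(disk.getOrCreateContainerTable(2L).get("block-1"));
  }
}
```

The point of the sketch is where the ~300ms cost lands: the expensive open happens once per disk, while per-container creation becomes a cheap in-store operation.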
[jira] [Updated] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2283: - Labels: pull-request-available (was: ) > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14546) Document block placement policies
[ https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953044#comment-16953044 ] Íñigo Goiri commented on HDFS-14546: Nothing to add to the comments, it would be nice to be able to see the markdown rendered. This would be a good case to review GitHub PRs. > Document block placement policies > - > > Key: HDFS-14546 > URL: https://issues.apache.org/jira/browse/HDFS-14546 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Amithsha >Priority: Major > Labels: documentation > Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, > HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, > HDFS-14546-06.patch, HdfsDesign.patch > > > Currently, all the documentation refers to the default block placement policy. > However, over time there have been new policies: > * BlockPlacementPolicyRackFaultTolerant (HDFS-7891) > * BlockPlacementPolicyWithNodeGroup (HDFS-3601) > * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006) > We should update the documentation to refer to them explaining their > particularities and probably how to setup each one of them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953043#comment-16953043 ] Íñigo Goiri commented on HDFS-14284: Thanks [~ayushtkn], yes, that's it, RemoteException only extracts a few exceptions not all. I guess this is fine then. > RBF: Log Router identifier when reporting exceptions > > > Key: HDFS-14284 > URL: https://issues.apache.org/jira/browse/HDFS-14284 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch, > HDFS-14284.003.patch, HDFS-14284.004.patch, HDFS-14284.005.patch, > HDFS-14284.006.patch > > > The typical setup is to use multiple Routers through > ConfiguredFailoverProxyProvider. > In a regular HA Namenode setup, it is easy to know which NN was used. > However, in RBF, any Router can be the one reporting the exception and it is > hard to know which was the one. > We should have a way to identify which Router/Namenode was the one triggering > the exception. > This would also apply with Observer Namenodes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2285) GetBlock and ReadChunk command from the client should be sent to the same datanode to re-use the same connection
[ https://issues.apache.org/jira/browse/HDDS-2285?focusedWorklogId=329288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329288 ] ASF GitHub Bot logged work on HDDS-2285: Author: ASF GitHub Bot Created on: 16/Oct/19 17:39 Start Date: 16/Oct/19 17:39 Worklog Time Spent: 10m Work Description: hanishakoneru commented on pull request #40: HDDS-2285. GetBlock and ReadChunk command from the client should be s… URL: https://github.com/apache/hadoop-ozone/pull/40 It can be observed that the GetBlock and ReadChunk command is sent to 2 different datanodes. It should be sent to the same datanode to re-use the connection. ``` 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to datanode 172.26.32.224 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to datanode 172.26.32.231 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329288) Remaining Estimate: 0h Time Spent: 10m > GetBlock and ReadChunk command from the client should be sent to the same > datanode to re-use the same connection > > > Key: HDDS-2285 > URL: https://issues.apache.org/jira/browse/HDDS-2285 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Mukul Kumar Singh >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > It can be observed that the GetBlock and ReadChunk command is sent to 2 > different datanodes. It should be sent to the same datanode to re-use the > connection. 
> {code} > 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to > datanode 172.26.32.224 > 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to > datanode 172.26.32.231 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953041#comment-16953041 ] Íñigo Goiri commented on HDFS-14854: [~belugabehr], thanks for taking a pass. [~sodonnell], a few minor things: * I would solve the HiddenField checkstyle warning. * It is not very common to have a blank line between the class and its javadoc. * Add short javadocs (one line /** */) for cancelledNodes and blocksPerLock. * Unify the comments lines 174-182 in the new monitor. * moveBlocksToPending is a little hard to read, especially the loops breaking in a couple of places. Extracting some might also help, for example 469. * L633 should be a single line. Other than these cosmetic changes, this looks good. As this is pretty big, I'd like others to take another look. > Create improved decommission monitor implementation > --- > > Key: HDFS-14854 > URL: https://issues.apache.org/jira/browse/HDFS-14854 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, > HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch, > HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch, > HDFS-14854.008.patch, HDFS-14854.009.patch, HDFS-14854.010.patch > > > In HDFS-13157, we discovered a series of problems with the current > decommission monitor implementation, such as: > * Blocks are replicated sequentially disk by disk and node by node, and > hence the load is not spread well across the cluster > * Adding a node for decommission can cause the namenode write lock to be > held for a long time. > * Decommissioning nodes floods the replication queue and under-replicated > blocks from a future node or disk failure may wait for a long time before they > are replicated. 
> * Blocks pending replication are checked many times under a write lock > before they are sufficiently replicated, wasting resources > In this Jira I propose to create a new implementation of the decommission > monitor that resolves these issues. As it will be difficult to prove one > implementation is better than another, the new implementation can be enabled > or disabled, giving the option of the existing implementation or the new one. > I will attach a pdf with some more details on the design and then a version 1 > patch shortly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2285) GetBlock and ReadChunk command from the client should be sent to the same datanode to re-use the same connection
[ https://issues.apache.org/jira/browse/HDDS-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2285: - Labels: pull-request-available (was: ) > GetBlock and ReadChunk command from the client should be sent to the same > datanode to re-use the same connection > > > Key: HDDS-2285 > URL: https://issues.apache.org/jira/browse/HDDS-2285 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Mukul Kumar Singh >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > > It can be observed that the GetBlock and ReadChunk command is sent to 2 > different datanodes. It should be sent to the same datanode to re-use the > connection. > {code} > 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to > datanode 172.26.32.224 > 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to > datanode 172.26.32.231 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
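The fix under review amounts to pinning follow-up commands for a block to the datanode that served the first command, so the same connection is reused. A rough sketch of that idea, with illustrative names rather than the real XceiverClientGrpc API:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

public class PinnedDatanodeClient {
  // Remembers which datanode served the first command for each block.
  private final Map<Long, String> pinnedNodeByBlock = new ConcurrentHashMap<>();

  /** Pick a node for the block once, then keep returning the same one. */
  String nodeFor(long blockId, List<String> pipelineNodes) {
    return pinnedNodeByBlock.computeIfAbsent(blockId,
        id -> pipelineNodes.get(
            ThreadLocalRandom.current().nextInt(pipelineNodes.size())));
  }

  public static void main(String[] args) {
    PinnedDatanodeClient client = new PinnedDatanodeClient();
    List<String> pipeline = List.of("172.26.32.224", "172.26.32.231");

    // GetBlock and the subsequent ReadChunk now target the same node.
    String getBlockTarget = client.nodeFor(42L, pipeline);
    String readChunkTarget = client.nodeFor(42L, pipeline);
    System.out.println(getBlockTarget.equals(readChunkTarget));
  }
}
```

A real implementation would also need to drop the pin and fall back to another pipeline node if the pinned datanode fails; that is omitted here.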
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953029#comment-16953029 ] Íñigo Goiri commented on HDFS-14908: At this point, I think we are replicating a lot of what {{hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Path.java}} does. However, I have to say that Path may have more overhead. The other thing is that this will be called pretty frequently in the LeaseManager (metaSave is not that common). We need to do a couple of things here: * Make sure that this is not done somewhere else in Hadoop. * Try to benchmark/profile this a little. [~weichiu], are you familiar with other places where we do this kind of pattern? [~belugabehr], any idea on optimizations (while keeping this somewhat readable) for isParent()? > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is the prefix of the open files. We should check whether the filter path > is the parent/ancestor of the open files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
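For reference, the difference between the plain prefix check and the parent/ancestor check this issue is about fits in a few lines: String.startsWith() wrongly matches "/foobar" for filter "/foo", and requiring a path-separator boundary fixes that. This is a sketch of the idea only, not the LeaseManager patch itself:

```java
public class AncestorCheck {

  /** True iff ancestor is the root or a proper path ancestor of path. */
  static boolean isAncestor(String ancestor, String path) {
    if (ancestor.equals("/")) {
      return path.startsWith("/");
    }
    // Prefix match alone is not enough; the next character must be a
    // path separator, otherwise "/foo" would "contain" "/foobar".
    return path.startsWith(ancestor)
        && path.length() > ancestor.length()
        && path.charAt(ancestor.length()) == '/';
  }

  public static void main(String[] args) {
    System.out.println(isAncestor("/foo", "/foo/bar"));  // true
    System.out.println(isAncestor("/foo", "/foobar"));   // false: prefix only
  }
}
```

Any real version would also have to handle trailing separators and normalization, which is why the comment above points at org.apache.hadoop.fs.Path as prior art.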
[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953021#comment-16953021 ] Íñigo Goiri commented on HDFS-14887: [~tasanuma], do you mind taking a final look to see if [^HDFS-14887.008.patch] solves your concerns? > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, > HDFS-14887.005.patch, HDFS-14887.006.patch, HDFS-14887.007.patch, > HDFS-14887.008.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953018#comment-16953018 ] Íñigo Goiri commented on HDFS-14887: +1 on [^HDFS-14887.008.patch]. > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, > HDFS-14887.005.patch, HDFS-14887.006.patch, HDFS-14887.007.patch, > HDFS-14887.008.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2254) Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention
[ https://issues.apache.org/jira/browse/HDDS-2254?focusedWorklogId=329262&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329262 ] ASF GitHub Bot logged work on HDDS-2254: Author: ASF GitHub Bot Created on: 16/Oct/19 17:02 Start Date: 16/Oct/19 17:02 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #31: HDDS-2254. Fix flaky unit test TestContainerStateMachine#testRatisSn… URL: https://github.com/apache/hadoop-ozone/pull/31 …apshotRetention. ## What changes were proposed in this pull request? On locally trying out repeated runs of the unit test, the unit test failed intermittently while asserting "Null" value for CSM snapshot. This assertion is not valid when the other unit test in the class executes before and creates keys in the cluster/container. Hence, moved to a model where each unit test creates its own cluster. https://issues.apache.org/jira/browse/HDDS-2254 ## How was this patch tested? Ran the unit tests in the IDE and command line. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329262) Time Spent: 1h 40m (was: 1.5h) > Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention > --- > > Key: HDDS-2254 > URL: https://issues.apache.org/jira/browse/HDDS-2254 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Test always fails with assertion error: > {code} > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.ozone.client.rpc.TestContainerStateMachine.testRatisSnapshotRetention(TestContainerStateMachine.java:188) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
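The flakiness pattern described in the pull request, an assertion that only holds on a fresh cluster becoming order-dependent once tests share one cluster, reduces to a small stand-alone example. "Cluster" below is a toy stand-in for MiniOzoneCluster, not the real class:

```java
public class PerTestClusterSketch {
  static final class Cluster {
    boolean snapshotTaken;                 // state mutated by key writes
    void writeKey() { snapshotTaken = true; }
  }

  /** The assertion from the flaky test: "no snapshot exists yet". */
  static boolean noSnapshotYet(Cluster cluster) {
    return !cluster.snapshotTaken;
  }

  public static void main(String[] args) {
    // Shared cluster: whether the assertion holds depends on test order.
    Cluster shared = new Cluster();
    shared.writeKey();                     // the "other" test ran first
    System.out.println(noSnapshotYet(shared));       // false: flaky ordering

    // Per-test cluster: the assertion always starts from a clean state.
    Cluster fresh = new Cluster();
    System.out.println(noSnapshotYet(fresh));        // true
  }
}
```

This is the trade the patch makes: a fresh cluster per test costs setup time but removes the hidden coupling between test methods.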
[jira] [Work logged] (HDDS-2254) Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention
[ https://issues.apache.org/jira/browse/HDDS-2254?focusedWorklogId=329259&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329259 ] ASF GitHub Bot logged work on HDDS-2254: Author: ASF GitHub Bot Created on: 16/Oct/19 17:00 Start Date: 16/Oct/19 17:00 Worklog Time Spent: 10m Work Description: avijayanhwx commented on pull request #31: HDDS-2254. Fix flaky unit test TestContainerStateMachine#testRatisSn… URL: https://github.com/apache/hadoop-ozone/pull/31 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329259) Time Spent: 1.5h (was: 1h 20m) > Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention > --- > > Key: HDDS-2254 > URL: https://issues.apache.org/jira/browse/HDDS-2254 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Test always fails with assertion error: > {code} > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.ozone.client.rpc.TestContainerStateMachine.testRatisSnapshotRetention(TestContainerStateMachine.java:188) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Reopened] (HDDS-2181) Ozone Manager should send correct ACL type in ACL requests to Authorizer
[ https://issues.apache.org/jira/browse/HDDS-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey reopened HDDS-2181: The pull request is still open. > Ozone Manager should send correct ACL type in ACL requests to Authorizer > > > Key: HDDS-2181 > URL: https://issues.apache.org/jira/browse/HDDS-2181 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10h 50m > Remaining Estimate: 0h > > Currently, Ozone manager sends "WRITE" as ACLType for key create, key delete > and bucket create operation. Fix the acl type in all requests to the > authorizer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
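The bug being fixed above is a coarse operation-to-ACL mapping: the OM sent WRITE for key create, key delete, and bucket create alike. A sketch of the intended per-operation mapping follows; the enum values mirror the names in Ozone's IAccessAuthorizer.ACLType, but the mapping shown is illustrative rather than a copy of the patch:

```java
public class AclTypeMapping {
  enum ACLType { CREATE, WRITE, READ, DELETE }
  enum OmOp { CREATE_KEY, DELETE_KEY, CREATE_BUCKET, READ_KEY }

  /** Each OM operation maps to its own ACL type, not a blanket WRITE. */
  static ACLType aclFor(OmOp op) {
    switch (op) {
      case CREATE_KEY:
      case CREATE_BUCKET:
        return ACLType.CREATE;   // previously sent as WRITE
      case DELETE_KEY:
        return ACLType.DELETE;   // previously sent as WRITE
      case READ_KEY:
        return ACLType.READ;
      default:
        throw new IllegalArgumentException(op.toString());
    }
  }

  public static void main(String[] args) {
    System.out.println(aclFor(OmOp.CREATE_KEY));   // CREATE
    System.out.println(aclFor(OmOp.DELETE_KEY));   // DELETE
  }
}
```

With the correct types, an external authorizer (e.g. Ranger) can grant create without implicitly granting delete.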
[jira] [Commented] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953000#comment-16953000 ] Hadoop QA commented on HDFS-14887: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 54s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 70m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14887 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983186/HDFS-14887.008.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ced60378d4de 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 375224e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28098/testReport/ | | Max. process+thread count | 2691 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28098/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > --
[jira] [Assigned] (HDDS-2314) Fix TestOMKeyCommitRequest Error
[ https://issues.apache.org/jira/browse/HDDS-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-2314: -- Assignee: (was: YiSheng Lien) > Fix TestOMKeyCommitRequest Error > > > Key: HDDS-2314 > URL: https://issues.apache.org/jira/browse/HDDS-2314 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: YiSheng Lien >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {code} > [ERROR] Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 2.479 > s <<< FAILURE! - in > org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest > [ERROR] > testValidateAndUpdateCacheWithKeyNotFound(org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest) > Time elapsed: 2.045 s <<< ERROR! > java.lang.IllegalMonitorStateException: Releasing lock on resource > /e4ec6d72-f27c-46f8-8434-e704e091f87b/db3319a6-6d78-42e1-8352-9feb099de70a > without acquiring lock > at > org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220) > at > org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java:168) > at > org.apache.hadoop.ozone.lock.LockManager.writeUnlock(LockManager.java:148) > at > org.apache.hadoop.ozone.om.lock.OzoneManagerLock.unlock(OzoneManagerLock.java:364) > at > org.apache.hadoop.ozone.om.lock.OzoneManagerLock.releaseWriteLock(OzoneManagerLock.java:329) > at > org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest.validateAndUpdateCache(OMKeyCommitRequest.java:177) > at > org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest.testValidateAndUpdateCacheWithKeyNotFound(TestOMKeyCommitRequest.java:202) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) 
> at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > [ERROR] > testValidateAndUpdateCacheWithBucketNotFound(org.apache.hadoop.ozone.om.request.key.TestOMKeyCommitRequest) > Time elapsed: 0.098 s <<< ERROR! > java.lang.IllegalMonitorStateException: Releasing lock on resource > /4696e0f1-6439-4300-a1bc-f30c37a12a37/352527b9-eb75-49af-b06a-57cbc697730c > without acquiring lock > at > org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220) > at > org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java
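The IllegalMonitorStateException in the stack trace above comes from a release-side sanity check: the request path unlocks a resource the current thread never locked. That guard can be sketched with JDK locks as follows; this mirrors the idea of Ozone's LockManager, not its actual code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ResourceLockSketch {
  private final Map<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  public void writeLock(String resource) {
    locks.computeIfAbsent(resource, r -> new ReentrantReadWriteLock())
        .writeLock().lock();
  }

  public void writeUnlock(String resource) {
    ReentrantReadWriteLock lock = locks.get(resource);
    // Fail loudly on an unbalanced release instead of corrupting state.
    if (lock == null || !lock.isWriteLockedByCurrentThread()) {
      throw new IllegalMonitorStateException(
          "Releasing lock on resource " + resource + " without acquiring lock");
    }
    lock.writeLock().unlock();
  }

  public static void main(String[] args) {
    ResourceLockSketch lm = new ResourceLockSketch();
    lm.writeLock("/vol/bucket");
    lm.writeUnlock("/vol/bucket");         // balanced: fine
    try {
      lm.writeUnlock("/vol/bucket");       // unbalanced: detected
    } catch (IllegalMonitorStateException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

The test failure above suggests a code path in OMKeyCommitRequest that reaches the unlock without the matching lock, exactly the case this guard exists to surface.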
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952984#comment-16952984 ] Hadoop QA commented on HDFS-14908: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 47s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 44s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 402 unchanged - 0 fixed = 403 total (was 402) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 28s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}163m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14908 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983175/HDFS-14908.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f2312368abc2 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 090f73a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28097/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28097/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28097/testReport/ | | Max. process+thread count | 2820 (v
[jira] [Commented] (HDFS-13506) RBF: Create destination directory when adding mount entry using router admin cmds.
[ https://issues.apache.org/jira/browse/HDFS-13506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952963#comment-16952963 ] hemanthboyina commented on HDFS-13506: -- Hi [~dibyendu_hadoop], are you working on this? > RBF: Create destination directory when adding mount entry using router admin > cmds. > -- > > Key: HDFS-13506 > URL: https://issues.apache.org/jira/browse/HDFS-13506 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Dibyendu Karmakar >Assignee: Dibyendu Karmakar >Priority: Major > > Currently there is no option to create the destination when adding a mount > entry; the user has to create the destination separately. > In the router admin -add command we can add an option -createDest to create the > destination for the mount entry. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14887) RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable
[ https://issues.apache.org/jira/browse/HDFS-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14887: - Attachment: HDFS-14887.008.patch > RBF: In Router Web UI, Observer Namenode Information displaying as Unavailable > -- > > Key: HDFS-14887 > URL: https://issues.apache.org/jira/browse/HDFS-14887 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: 14887.after.png, 14887.before.png, HDFS-14887.001.patch, > HDFS-14887.002.patch, HDFS-14887.003.patch, HDFS-14887.004.patch, > HDFS-14887.005.patch, HDFS-14887.006.patch, HDFS-14887.007.patch, > HDFS-14887.008.patch > > > In Router Web UI, Observer Namenode Information displaying as Unavailable. > We should show a proper icon for them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14802) The feature of protect directories should be used in RenameOp
[ https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952900#comment-16952900 ] Ayush Saxena commented on HDFS-14802: - Thanx Steve for the help. [~ferhui], can you update accordingly? Additionally, I think you can add the method to {{DfsUtil}} rather than creating a new Util class, and the check should be before the unprotected methods; we can't throw an exception from the unprotected ones. > The feature of protect directories should be used in RenameOp > - > > Key: HDFS-14802 > URL: https://issues.apache.org/jira/browse/HDFS-14802 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, > HDFS-14802.003.patch > > > Now we can set fs.protected.directories to prevent users from deleting > important directories, but users can still delete them by working around the > limitation: > 1. Rename the directories and then delete them. > 2. Move the directories to trash, and the namenode will delete them. > So I think we should apply the feature of protected directories in RenameOp as well.
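For reference, the protection discussed above is driven by a single configuration property on the namenode; a hedged sketch (the directory list is illustrative, not from this Jira):

```xml
<!-- Illustrative only: paths the namenode refuses to delete while non-empty
     (and, per this Jira, should also refuse to rename). -->
<property>
  <name>fs.protected.directories</name>
  <value>/warehouse,/user/important</value>
</property>
```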
[jira] [Commented] (HDFS-14546) Document block placement policies
[ https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952897#comment-16952897 ] Ayush Saxena commented on HDFS-14546: - You can do mvn clean site in {{/hadoop-hdfs-project/hadoop-hdfs}} and build the webpage in the local. The webpage would be there in {{/hadoop-hdfs-project/hadoop-hdfs/target/site}} path. > Document block placement policies > - > > Key: HDFS-14546 > URL: https://issues.apache.org/jira/browse/HDFS-14546 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Amithsha >Priority: Major > Labels: documentation > Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, > HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, > HDFS-14546-06.patch, HdfsDesign.patch > > > Currently, all the documentation refers to the default block placement policy. > However, over time there have been new policies: > * BlockPlacementPolicyRackFaultTolerant (HDFS-7891) > * BlockPlacementPolicyWithNodeGroup (HDFS-3601) > * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006) > We should update the documentation to refer to them explaining their > particularities and probably how to setup each one of them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14546) Document block placement policies
[ https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952893#comment-16952893 ] Ayush Saxena commented on HDFS-14546: - Thanx [~Amithsha] for the patch. Almost looks good, couple of things : This isn't required : {code:java} +* [Block Placement Policies](#Block_Placement_Policies) + * [Introduction](#Introduction) + * [Additional Types of Policies](#Policy_Types) + * [BlockPlacementPolicyRackFaultTolerant](#Block_PlacementPolicy_RackFaultTolerant) + * [BlockPlacementPolicyWithNodeGroup](#BlockPlacementPolicy_With_NodeGroup) + * [BlockPlacementPolicyWithUpgradeDomain](#BlockPlacementPolicy_With_UpgradeDomain) {code} Just having a single line, will automatically generate this, Like : {code:java} BlockPlacementPolicies == // Add this line ##Introduction {code} These should have ## rather than ### {code:java} +### BlockPlacementPolicyRackFaultTolerant {code} These Links should be relative paths : {code:java} +Detailed info about configuration https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUpgradeDomain.html {code} The image has too much whitespace at the bottom can you reduce it. It makes the text go too much below. Let me know, if you are facing any problem or need any help handling anything, I can help. [~elgoiri] you want to add anything? > Document block placement policies > - > > Key: HDFS-14546 > URL: https://issues.apache.org/jira/browse/HDFS-14546 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Amithsha >Priority: Major > Labels: documentation > Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, > HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, > HDFS-14546-06.patch, HdfsDesign.patch > > > Currently, all the documentation refers to the default block placement policy. 
> However, over time there have been new policies: > * BlockPlacementPolicyRackFaultTolerant (HDFS-7891) > * BlockPlacementPolicyWithNodeGroup (HDFS-3601) > * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006) > We should update the documentation to refer to them explaining their > particularities and probably how to setup each one of them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14739) RBF: LS command for mount point shows wrong owner and permission information.
[ https://issues.apache.org/jira/browse/HDFS-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952877#comment-16952877 ] Hudson commented on HDFS-14739: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17541 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17541/]) HDFS-14739. RBF: LS command for mount point shows wrong owner and (ayushsaxena: rev 375224edebb1c937afe4bbea8fe884499ca8ece5) * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestDisableNameservices.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/MountTableResolver.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java * (edit) hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterMountTable.java * (add) hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/RouterResolveException.java > RBF: LS command for mount point shows wrong owner and permission information. 
> - > > Key: HDFS-14739 > URL: https://issues.apache.org/jira/browse/HDFS-14739 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: xuzq >Assignee: Jinglun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14739-trunk-001.patch, HDFS-14739-trunk-002.patch, > HDFS-14739-trunk-003.patch, HDFS-14739-trunk-004.patch, > HDFS-14739-trunk-005.patch, HDFS-14739-trunk-006.patch, > HDFS-14739-trunk-007.patch, HDFS-14739-trunk-008.patch, > HDFS-14739-trunk-009.patch, HDFS-14739-trunk-010.patch, > HDFS-14739-trunk-011.patch, image-2019-08-16-17-15-50-614.png, > image-2019-08-16-17-16-00-863.png, image-2019-08-16-17-16-34-325.png > > > ||source||target namespace||destination||owner||group||permission|| > |/mnt|ns0|/mnt|mnt|mnt_group|755| > |/mnt/test1|ns1|/mnt/test1|mnt_test1|mnt_test1_group|755| > |/test1|ns1|/test1|test1|test1_group|755| > When doing getListing("/mnt"), the owner of */mnt/test1* should be *mnt_test1* > instead of *test1* in the result. > > And with the mount table as below, we should support getListing("/mnt") instead > of throwing an IOException when dfs.federation.router.default.nameservice.enable is > false. > ||source||target namespace||destination||owner||group||permission|| > |/mnt/test1|ns0|/mnt/test1|test1|test1|755| > |/mnt/test2|ns1|/mnt/test2|test2|test2|755| > >
[jira] [Updated] (HDFS-14739) RBF: LS command for mount point shows wrong owner and permission information.
[ https://issues.apache.org/jira/browse/HDFS-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-14739: Fix Version/s: 3.3.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > RBF: LS command for mount point shows wrong owner and permission information. > - > > Key: HDFS-14739 > URL: https://issues.apache.org/jira/browse/HDFS-14739 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: xuzq >Assignee: Jinglun >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14739-trunk-001.patch, HDFS-14739-trunk-002.patch, > HDFS-14739-trunk-003.patch, HDFS-14739-trunk-004.patch, > HDFS-14739-trunk-005.patch, HDFS-14739-trunk-006.patch, > HDFS-14739-trunk-007.patch, HDFS-14739-trunk-008.patch, > HDFS-14739-trunk-009.patch, HDFS-14739-trunk-010.patch, > HDFS-14739-trunk-011.patch, image-2019-08-16-17-15-50-614.png, > image-2019-08-16-17-16-00-863.png, image-2019-08-16-17-16-34-325.png > > > ||source||target namespace||destination||owner||group||permission|| > |/mnt|ns0|/mnt|mnt|mnt_group|755| > |/mnt/test1|ns1|/mnt/test1|mnt_test1|mnt_test1_group|755| > |/test1|ns1|/test1|test1|test1_group|755| > When do getListing("/mnt"), the owner of */mnt/test1* should be *mnt_test1* > instead of *test1* in result. > > And if the mount table as blew, we should support getListing("/mnt") instead > of throw IOException when dfs.federation.router.default.nameservice.enable is > false. > ||source||target namespace||destination||owner||group||permission|| > |/mnt/test1|ns0|/mnt/test1|test1|test1|755| > |/mnt/test2|ns1|/mnt/test2|test2|test2|755| > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2220) HddsVolume needs a toString method
[ https://issues.apache.org/jira/browse/HDDS-2220?focusedWorklogId=329190&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329190 ] ASF GitHub Bot logged work on HDDS-2220: Author: ASF GitHub Bot Created on: 16/Oct/19 14:16 Start Date: 16/Oct/19 14:16 Worklog Time Spent: 10m Work Description: cxorm commented on pull request #1652: HDDS-2220. HddsVolume needs a toString method. URL: https://github.com/apache/hadoop/pull/1652 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329190) Time Spent: 1h (was: 50m) > HddsVolume needs a toString method > -- > > Key: HDDS-2220 > URL: https://issues.apache.org/jira/browse/HDDS-2220 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Marton Elek >Assignee: YiSheng Lien >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > This is logged to the console of datanodes: > {code:java} > 2019-10-01 11:37:59 INFO HddsVolumeChecker:202 - Scheduled health check for > volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 11:52:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for > org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 11:52:59 INFO HddsVolumeChecker:202 - Scheduled health check for > volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:07:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for > org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:07:59 INFO HddsVolumeChecker:202 - Scheduled health check for > volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:22:59 INFO 
ThrottledAsyncChecker:139 - Scheduling a check for > org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:22:59 INFO HddsVolumeChecker:202 - Scheduled health check for > volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:37:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for > org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:37:59 INFO HddsVolumeChecker:202 - Scheduled health check for > volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:52:59 INFO ThrottledAsyncChecker:139 - Scheduling a check for > org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > 2019-10-01 12:52:59 INFO HddsVolumeChecker:202 - Scheduled health check for > volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@5460cf3a > {code} > Without a proper HddsVolume.toString it's hard to say which volume is > checked... > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14739) RBF: LS command for mount point shows wrong owner and permission information.
[ https://issues.apache.org/jira/browse/HDFS-14739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952868#comment-16952868 ] Ayush Saxena commented on HDFS-14739: - Committed to trunk. Thanx [~LiJinglun] for the contribution [~xuzq_zander] for the report and [~elgoiri] for the review!!! > RBF: LS command for mount point shows wrong owner and permission information. > - > > Key: HDFS-14739 > URL: https://issues.apache.org/jira/browse/HDFS-14739 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: xuzq >Assignee: Jinglun >Priority: Major > Attachments: HDFS-14739-trunk-001.patch, HDFS-14739-trunk-002.patch, > HDFS-14739-trunk-003.patch, HDFS-14739-trunk-004.patch, > HDFS-14739-trunk-005.patch, HDFS-14739-trunk-006.patch, > HDFS-14739-trunk-007.patch, HDFS-14739-trunk-008.patch, > HDFS-14739-trunk-009.patch, HDFS-14739-trunk-010.patch, > HDFS-14739-trunk-011.patch, image-2019-08-16-17-15-50-614.png, > image-2019-08-16-17-16-00-863.png, image-2019-08-16-17-16-34-325.png > > > ||source||target namespace||destination||owner||group||permission|| > |/mnt|ns0|/mnt|mnt|mnt_group|755| > |/mnt/test1|ns1|/mnt/test1|mnt_test1|mnt_test1_group|755| > |/test1|ns1|/test1|test1|test1_group|755| > When do getListing("/mnt"), the owner of */mnt/test1* should be *mnt_test1* > instead of *test1* in result. > > And if the mount table as blew, we should support getListing("/mnt") instead > of throw IOException when dfs.federation.router.default.nameservice.enable is > false. > ||source||target namespace||destination||owner||group||permission|| > |/mnt/test1|ns0|/mnt/test1|test1|test1|755| > |/mnt/test2|ns1|/mnt/test2|test2|test2|755| > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2316) Support to skip recon and/or ozonefs during the build
[ https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2316. Fix Version/s: 0.5.0 Resolution: Fixed Committed to master. > Support to skip recon and/or ozonefs during the build > - > > Key: HDDS-2316 > URL: https://issues.apache.org/jira/browse/HDDS-2316 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > (I almost used "Fast-lane to ozone build" as this Jira's summary; it was very > hard to resist...) > > The two slowest parts of the Ozone build as of now are: > # The (multiple) shading of ozonefs > # The frontend build/obfuscation of Ozone Recon > [~aengineer] suggested introducing options to skip them, as they are not > required for every build. > This patch introduces the '-DskipRecon' and '-DskipShade' options to provide a > faster way to create a *partial* build.
[jira] [Commented] (HDDS-2316) Support to skip recon and/or ozonefs during the build
[ https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952849#comment-16952849 ] Anu Engineer commented on HDDS-2316: Thank you for the fast-lane. I am taking the first ride. > Support to skip recon and/or ozonefs during the build > - > > Key: HDDS-2316 > URL: https://issues.apache.org/jira/browse/HDDS-2316 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard > to resist...) > > The two slowest part of Ozone build as of now: > # The (multiple) shading of ozonefs > # And the frontend build/obfuscation of ozone recon > [~aengineer] suggested to introduce options to skip them as they are not > required for the build all the time. > This patch introduces '-DskipRecon' and '-DskipShade' options to provide a > faster way to create a *partial* build. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2316) Support to skip recon and/or ozonefs during the build
[ https://issues.apache.org/jira/browse/HDDS-2316?focusedWorklogId=329177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329177 ] ASF GitHub Bot logged work on HDDS-2316: Author: ASF GitHub Bot Created on: 16/Oct/19 13:54 Start Date: 16/Oct/19 13:54 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #39: HDDS-2316. Support to skip recon and/or ozonefs during the build URL: https://github.com/apache/hadoop-ozone/pull/39 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329177) Time Spent: 20m (was: 10m) > Support to skip recon and/or ozonefs during the build > - > > Key: HDDS-2316 > URL: https://issues.apache.org/jira/browse/HDDS-2316 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard > to resist...) > > The two slowest part of Ozone build as of now: > # The (multiple) shading of ozonefs > # And the frontend build/obfuscation of ozone recon > [~aengineer] suggested to introduce options to skip them as they are not > required for the build all the time. > This patch introduces '-DskipRecon' and '-DskipShade' options to provide a > faster way to create a *partial* build. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-14908: --- Attachment: HDFS-14908.002.patch > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is the prefix of the open files. We should check whether the filter path > is the parent/ancestor of the open files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952838#comment-16952838 ] Jinglun commented on HDFS-14908: Thanks [~elgoiri] for your nice comments! I made DFSUtil.isParent() stronger and more readable, and added a new test case for it. Uploaded v02. > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-14908.001.patch > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is a prefix of each open file's path. We should check whether the filter > path is a parent/ancestor of the open file.
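The distinction this patch targets is between a plain string-prefix test and a true ancestor test; a small Python sketch of the intended logic (names and exact semantics are illustrative, not the actual DFSUtil code):

```python
def is_parent(ancestor: str, path: str) -> bool:
    """True if `ancestor` is the same path as, or an ancestor of, `path`.

    A plain prefix test is not enough: "/mnt" is a string prefix of
    "/mnt2/test1" but not its ancestor. After matching the prefix, the
    next character must be a path separator (or the prefix must be the
    root "/" or equal to the full path).
    """
    if not ancestor or not path:
        return False
    if ancestor == "/":
        return True
    if not path.startswith(ancestor):
        return False
    return len(path) == len(ancestor) or path[len(ancestor)] == "/"
```

With the naive prefix check, listOpenFiles("/mnt") would wrongly report open files under "/mnt2"; the separator check above excludes them.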
[jira] [Assigned] (HDDS-2235) Ozone Datanode web page doesn't exist
[ https://issues.apache.org/jira/browse/HDDS-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandeep Nemuri reassigned HDDS-2235: Assignee: Sandeep Nemuri > Ozone Datanode web page doesn't exist > - > > Key: HDDS-2235 > URL: https://issues.apache.org/jira/browse/HDDS-2235 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Sandeep Nemuri >Priority: Major > > On trying to access the dn UI, the following error is seen. > http://dn_ip:9882/ > {code} > HTTP ERROR 403 > Problem accessing /. Reason: > Forbidden > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2218) Use OZONE_CLASSPATH instead of HADOOP_CLASSPATH
[ https://issues.apache.org/jira/browse/HDDS-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandeep Nemuri reassigned HDDS-2218: Assignee: Sandeep Nemuri > Use OZONE_CLASSPATH instead of HADOOP_CLASSPATH > --- > > Key: HDDS-2218 > URL: https://issues.apache.org/jira/browse/HDDS-2218 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: docker >Reporter: Marton Elek >Assignee: Sandeep Nemuri >Priority: Major > Labels: newbe > > HADOOP_CLASSPATH is the standard way to add additional jar files to the > classpath of mapreduce/spark/... jobs. If something is added to > HADOOP_CLASSPATH, then it should be on the classpath of the classic Hadoop > daemons. > But for the Ozone components we don't need any new jar files (cloud > connectors, libraries). I think it's safer to separate HADOOP_CLASSPATH > from OZONE_CLASSPATH. If something is really needed on the classpath for Ozone > daemons, the dedicated environment variable should be used. > > Most probably it can be fixed in > hadoop-hdds/common/src/main/bin/hadoop-functions.sh > And the hadoop-ozone/dev/src/main/compose files should also be checked (some > of them contain HADOOP_CLASSPATH
[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952812#comment-16952812 ] David Mollitor commented on HDFS-14854: --- [~sodonnell] Thanks. Looks good! > Create improved decommission monitor implementation > --- > > Key: HDFS-14854 > URL: https://issues.apache.org/jira/browse/HDFS-14854 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: Decommission_Monitor_V2_001.pdf, HDFS-14854.001.patch, > HDFS-14854.002.patch, HDFS-14854.003.patch, HDFS-14854.004.patch, > HDFS-14854.005.patch, HDFS-14854.006.patch, HDFS-14854.007.patch, > HDFS-14854.008.patch, HDFS-14854.009.patch, HDFS-14854.010.patch > > > In HDFS-13157, we discovered a series of problems with the current > decommission monitor implementation, such as: > * Blocks are replicated sequentially disk by disk and node by node, and > hence the load is not spread well across the cluster. > * Adding a node for decommission can cause the namenode write lock to be > held for a long time. > * Decommissioning nodes floods the replication queue, and under-replicated > blocks from a future node or disk failure may wait for a long time before they > are replicated. > * Blocks pending replication are checked many times under a write lock > before they are sufficiently replicated, wasting resources. > In this Jira I propose to create a new implementation of the decommission > monitor that resolves these issues. As it will be difficult to prove one > implementation is better than another, the new implementation can be enabled > or disabled, giving the option of the existing implementation or the new one. > I will attach a pdf with some more details on the design and then a version 1 > patch shortly.
[jira] [Work logged] (HDDS-1737) Add Volume check in KeyManager and File Operations
[ https://issues.apache.org/jira/browse/HDDS-1737?focusedWorklogId=329159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329159 ] ASF GitHub Bot logged work on HDDS-1737: Author: ASF GitHub Bot Created on: 16/Oct/19 13:00 Start Date: 16/Oct/19 13:00 Worklog Time Spent: 10m Work Description: elek commented on pull request #2: HDDS-1737. Add Volume check in KeyManager and File Operations. URL: https://github.com/apache/hadoop-ozone/pull/2 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329159) Time Spent: 5.5h (was: 5h 20m) > Add Volume check in KeyManager and File Operations > -- > > Key: HDDS-1737 > URL: https://issues.apache.org/jira/browse/HDDS-1737 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 5.5h > Remaining Estimate: 0h > > This is to address a TODO to check volume checks when performing Key/File > operations. > > // TODO: Not checking volume exist here, once we have full cache we can > // add volume exist check also. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1737) Add Volume check in KeyManager and File Operations
[ https://issues.apache.org/jira/browse/HDDS-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-1737: -- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Add Volume check in KeyManager and File Operations > -- > > Key: HDDS-1737 > URL: https://issues.apache.org/jira/browse/HDDS-1737 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 5.5h > Remaining Estimate: 0h > > This is to address a TODO to check volume checks when performing Key/File > operations. > > // TODO: Not checking volume exist here, once we have full cache we can > // add volume exist check also. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1737) Add Volume check in KeyManager and File Operations
[ https://issues.apache.org/jira/browse/HDDS-1737?focusedWorklogId=329156&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329156 ] ASF GitHub Bot logged work on HDDS-1737: Author: ASF GitHub Bot Created on: 16/Oct/19 12:58 Start Date: 16/Oct/19 12:58 Worklog Time Spent: 10m Work Description: elek commented on issue #1559: HDDS-1737. Add Volume check in KeyManager and File Operations. URL: https://github.com/apache/hadoop/pull/1559#issuecomment-542687280 Migrated to https://github.com/apache/hadoop-ozone/pull/2 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329156) Time Spent: 5h 10m (was: 5h) > Add Volume check in KeyManager and File Operations > -- > > Key: HDDS-1737 > URL: https://issues.apache.org/jira/browse/HDDS-1737 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: newbie, pull-request-available > Time Spent: 5h 10m > Remaining Estimate: 0h > > This is to address a TODO to check volume checks when performing Key/File > operations. > > // TODO: Not checking volume exist here, once we have full cache we can > // add volume exist check also. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1737) Add Volume check in KeyManager and File Operations
[ https://issues.apache.org/jira/browse/HDDS-1737?focusedWorklogId=329157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329157 ] ASF GitHub Bot logged work on HDDS-1737: Author: ASF GitHub Bot Created on: 16/Oct/19 12:58 Start Date: 16/Oct/19 12:58 Worklog Time Spent: 10m Work Description: elek commented on pull request #1559: HDDS-1737. Add Volume check in KeyManager and File Operations. URL: https://github.com/apache/hadoop/pull/1559 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329157) Time Spent: 5h 20m (was: 5h 10m) > Add Volume check in KeyManager and File Operations > -- > > Key: HDDS-1737 > URL: https://issues.apache.org/jira/browse/HDDS-1737 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: newbie, pull-request-available > Time Spent: 5h 20m > Remaining Estimate: 0h > > This is to address a TODO to check volume checks when performing Key/File > operations. > > // TODO: Not checking volume exist here, once we have full cache we can > // add volume exist check also. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14854) Create improved decommission monitor implementation
[ https://issues.apache.org/jira/browse/HDFS-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952775#comment-16952775 ] Hadoop QA commented on HDFS-14854: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 51s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 19s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 45s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 12 new + 462 unchanged - 5 fixed = 474 total (was 467) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 42s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}163m 22s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits | | | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14854 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983148/HDFS-14854.010.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 53ec2adf14cb 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / bbcf0b9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28096/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/P
[jira] [Work logged] (HDDS-2316) Support to skip recon and/or ozonefs during the build
[ https://issues.apache.org/jira/browse/HDDS-2316?focusedWorklogId=329139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329139 ] ASF GitHub Bot logged work on HDDS-2316: Author: ASF GitHub Bot Created on: 16/Oct/19 12:15 Start Date: 16/Oct/19 12:15 Worklog Time Spent: 10m Work Description: elek commented on pull request #39: HDDS-2316. Support to skip recon and/or ozonefs during the build URL: https://github.com/apache/hadoop-ozone/pull/39 ## What changes were proposed in this pull request? The two slowest part of Ozone build as of now: * The (multiple) shading of ozonefs * And the frontend build/obfuscation of ozone recon @anuengineer suggested to introduce options to skip them as they are not required for the build all the time. This patch introduces `-DskipRecon` and `-DskipShade` options to provide a faster way to create a partial build. ## What is the link to the Apache JIRA https://github.com/elek/hadoop-ozone/pull/new/HDDS-2316 ## How this patch can be tested? ``` mvn clean install -DskipShade -DskipRecon -DskipTests ``` ``` mvn clean install -DskipShade -DskipTests ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329139) Remaining Estimate: 0h Time Spent: 10m > Support to skip recon and/or ozonefs during the build > - > > Key: HDDS-2316 > URL: https://issues.apache.org/jira/browse/HDDS-2316 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard > to resist...) 
> > The two slowest parts of the Ozone build as of now: > # The (multiple) shading of ozonefs > # And the frontend build/obfuscation of ozone recon > [~aengineer] suggested introducing options to skip them as they are not > required for the build all the time. > This patch introduces '-DskipRecon' and '-DskipShade' options to provide a > faster way to create a *partial* build. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2316) Support to skip recon and/or ozonefs during the build
[ https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2316: - Labels: pull-request-available (was: ) > Support to skip recon and/or ozonefs during the build > - > > Key: HDDS-2316 > URL: https://issues.apache.org/jira/browse/HDDS-2316 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard > to resist...) > > The two slowest part of Ozone build as of now: > # The (multiple) shading of ozonefs > # And the frontend build/obfuscation of ozone recon > [~aengineer] suggested to introduce options to skip them as they are not > required for the build all the time. > This patch introduces '-DskipRecon' and '-DskipShade' options to provide a > faster way to create a *partial* build. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2316) Support to skip recon and/or ozonefs during the build
[ https://issues.apache.org/jira/browse/HDDS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-2316: -- Description: (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard to resist...) The two slowest part of Ozone build as of now: # The (multiple) shading of ozonefs # And the frontend build/obfuscation of ozone recon [~aengineer] suggested to introduce options to skip them as they are not required for the build all the time. This patch introduces '-DskipRecon' and '-DskipShade' options to provide a faster way to create a *partial* build. was: (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard to resist...) > Support to skip recon and/or ozonefs during the build > - > > Key: HDDS-2316 > URL: https://issues.apache.org/jira/browse/HDDS-2316 > Project: Hadoop Distributed Data Store > Issue Type: New Feature >Reporter: Anu Engineer >Assignee: Marton Elek >Priority: Major > > (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard > to resist...) > > The two slowest part of Ozone build as of now: > # The (multiple) shading of ozonefs > # And the frontend build/obfuscation of ozone recon > [~aengineer] suggested to introduce options to skip them as they are not > required for the build all the time. > This patch introduces '-DskipRecon' and '-DskipShade' options to provide a > faster way to create a *partial* build. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2316) Support to skip recon and/or ozonefs during the build
Marton Elek created HDDS-2316: - Summary: Support to skip recon and/or ozonefs during the build Key: HDDS-2316 URL: https://issues.apache.org/jira/browse/HDDS-2316 Project: Hadoop Distributed Data Store Issue Type: New Feature Reporter: Anu Engineer Assignee: Marton Elek (I almost use this Jira summary: "Fast-lane to ozone build" It was very hard to resist...) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2315) bucket creation fails because bucket does not exist
[ https://issues.apache.org/jira/browse/HDDS-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai resolved HDDS-2315. Resolution: Invalid Fixed by revert: https://github.com/apache/hadoop-ozone/commit/17081c2e > bucket creation fails because bucket does not exist > --- > > Key: HDDS-2315 > URL: https://issues.apache.org/jira/browse/HDDS-2315 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Priority: Blocker > > Secure acceptance tests fail because no bucket can be created due to ACL > check: > {noformat} > om_1| 2019-10-16 10:42:04,422 [IPC Server handler 0 on 9862] INFO > - created volume:vol-0-38760 for user:HTTP/s...@example.com > om_1| 2019-10-16 10:42:04,464 [IPC Server handler 4 on 9862] INFO > - created volume:vol-1-41642 for user:HTTP/s...@example.com > om_1| 2019-10-16 10:42:04,481 [IPC Server handler 11 on 9862] INFO >- created volume:vol-2-97489 for user:HTTP/s...@example.com > om_1| 2019-10-16 10:42:04,496 [IPC Server handler 12 on 9862] INFO >- created volume:vol-3-24784 for user:HTTP/s...@example.com > om_1| 2019-10-16 10:42:04,512 [IPC Server handler 6 on 9862] INFO > - created volume:vol-4-01299 for user:HTTP/s...@example.com > om_1| 2019-10-16 10:42:04,550 [IPC Server handler 7 on 9862] ERROR > - Bucket creation failed for bucket:bucket-0-94230 in volume:vol-0-38760 > om_1| BUCKET_NOT_FOUND > org.apache.hadoop.ozone.om.exceptions.OMException: Bucket bucket-0-94230 is > not found > om_1| at > org.apache.hadoop.ozone.om.BucketManagerImpl.checkAccess(BucketManagerImpl.java:568) > om_1| at > org.apache.hadoop.ozone.security.acl.OzoneNativeAuthorizer.checkAccess(OzoneNativeAuthorizer.java:89) > om_1| at > org.apache.hadoop.ozone.om.OzoneManager.checkAcls(OzoneManager.java:1625) > om_1| at > org.apache.hadoop.ozone.om.request.OMClientRequest.checkAcls(OMClientRequest.java:135) > om_1| at > 
org.apache.hadoop.ozone.om.request.bucket.OMBucketCreateRequest.validateAndUpdateCache(OMBucketCreateRequest.java:146) > om_1| at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:219) > om_1| at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134) > om_1| at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > om_1| at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2313) Duplicate release of lock in OMKeyCommitRequest
[ https://issues.apache.org/jira/browse/HDDS-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai resolved HDDS-2313. Resolution: Invalid Fixed by revert: https://github.com/apache/hadoop-ozone/commit/17081c2e > Duplicate release of lock in OMKeyCommitRequest > --- > > Key: HDDS-2313 > URL: https://issues.apache.org/jira/browse/HDDS-2313 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Blocker > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {noformat} > om_1| 2019-10-16 05:33:57,413 [IPC Server handler 19 on 9862] ERROR >- Trying to release the lock on /bypdd/mybucket4, which was never acquired. > om_1| 2019-10-16 05:33:57,414 WARN ipc.Server: IPC Server handler 19 > on 9862, call Call#4 Retry#8 > org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from > 172.29.0.4:37018 > om_1| java.lang.IllegalMonitorStateException: Releasing lock on > resource /bypdd/mybucket4 without acquiring lock > om_1| at > org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220) > om_1| at > org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java:168) > om_1| at > org.apache.hadoop.ozone.lock.LockManager.writeUnlock(LockManager.java:148) > om_1| at > org.apache.hadoop.ozone.om.lock.OzoneManagerLock.unlock(OzoneManagerLock.java:364) > om_1| at > org.apache.hadoop.ozone.om.lock.OzoneManagerLock.releaseWriteLock(OzoneManagerLock.java:329) > om_1| at > org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest.validateAndUpdateCache(OMKeyCommitRequest.java:177) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2313) Duplicate release of lock in OMKeyCommitRequest
[ https://issues.apache.org/jira/browse/HDDS-2313?focusedWorklogId=329133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329133 ] ASF GitHub Bot logged work on HDDS-2313: Author: ASF GitHub Bot Created on: 16/Oct/19 12:01 Start Date: 16/Oct/19 12:01 Worklog Time Spent: 10m Work Description: adoroszlai commented on pull request #35: HDDS-2313. Duplicate release of lock in OMKeyCommitRequest URL: https://github.com/apache/hadoop-ozone/pull/35 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329133) Time Spent: 20m (was: 10m) > Duplicate release of lock in OMKeyCommitRequest > --- > > Key: HDDS-2313 > URL: https://issues.apache.org/jira/browse/HDDS-2313 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Blocker > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {noformat} > om_1| 2019-10-16 05:33:57,413 [IPC Server handler 19 on 9862] ERROR >- Trying to release the lock on /bypdd/mybucket4, which was never acquired. 
> om_1| 2019-10-16 05:33:57,414 WARN ipc.Server: IPC Server handler 19 > on 9862, call Call#4 Retry#8 > org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from > 172.29.0.4:37018 > om_1| java.lang.IllegalMonitorStateException: Releasing lock on > resource /bypdd/mybucket4 without acquiring lock > om_1| at > org.apache.hadoop.ozone.lock.LockManager.getLockForReleasing(LockManager.java:220) > om_1| at > org.apache.hadoop.ozone.lock.LockManager.release(LockManager.java:168) > om_1| at > org.apache.hadoop.ozone.lock.LockManager.writeUnlock(LockManager.java:148) > om_1| at > org.apache.hadoop.ozone.om.lock.OzoneManagerLock.unlock(OzoneManagerLock.java:364) > om_1| at > org.apache.hadoop.ozone.om.lock.OzoneManagerLock.releaseWriteLock(OzoneManagerLock.java:329) > om_1| at > org.apache.hadoop.ozone.om.request.key.OMKeyCommitRequest.validateAndUpdateCache(OMKeyCommitRequest.java:177) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
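The IllegalMonitorStateException in the stack trace above comes from releasing the same bucket lock twice. As a minimal sketch (the `GuardedLock` class and its `acquired` flag are hypothetical; the real OzoneManagerLock API differs), one common way to make a release idempotent is to track whether the lock is still held:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative guard against a double release, the failure mode reported
// above. This is NOT the actual Ozone lock implementation.
public class GuardedLock {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean acquired = false;

    public void acquireWriteLock() {
        lock.writeLock().lock();
        acquired = true;
    }

    // Release only when the lock is actually held; calling
    // ReentrantReadWriteLock's unlock() without holding the lock
    // throws IllegalMonitorStateException, as seen in the log.
    public void releaseWriteLock() {
        if (acquired) {
            lock.writeLock().unlock();
            acquired = false;
        }
    }
}
```

The fix that was actually applied (per the resolution above) was a revert, so this sketch only illustrates the class of defect, not the committed change.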
[jira] [Updated] (HDDS-1988) Fix listParts API
[ https://issues.apache.org/jira/browse/HDDS-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-1988: -- Fix Version/s: 0.5.0 > Fix listParts API > - > > Key: HDDS-1988 > URL: https://issues.apache.org/jira/browse/HDDS-1988 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This Jira is to fix the listParts API in the HA code path. > In HA, we have an in-memory cache, where we put the result into the in-memory cache > and return the response; later it will be picked up by the double buffer thread and > flushed to disk. So now, when we do listParts of an MPU key, it should use > both the in-memory cache and the RocksDB MPU table to list parts of an MPU key. > > No fix is required for this, as the information is retrieved from the MPU Key > table, this information is not retrieved through RocksDB Table iteration. (As > when we use get(), it checks the cache first, and then it checks the table) > > Used this Jira to add an integration test to verify the behavior. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
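The "cache first, then table" lookup described above can be sketched as follows; `CachedTable` and its fields are illustrative stand-ins, not the actual Ozone TypedTable API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a double-buffered table: writes land in an in-memory
// cache and are flushed to the backing store later, but get() consults the
// cache before the store, so unflushed entries stay visible to readers
// such as listParts. Names here are hypothetical.
public class CachedTable {
    private final Map<String, String> cache = new HashMap<>();   // double-buffer cache
    private final Map<String, String> backingDb = new HashMap<>(); // stands in for RocksDB

    // Writes go to the cache first; a background thread would flush later.
    public void putToCache(String key, String value) {
        cache.put(key, value);
    }

    // Simulates the double buffer thread flushing cached entries to disk.
    public void flush() {
        backingDb.putAll(cache);
        cache.clear();
    }

    // Cache-first read: an entry is visible whether or not it was flushed.
    public String get(String key) {
        String cached = cache.get(key);
        return cached != null ? cached : backingDb.get(key);
    }
}
```

This is why no fix was needed: point lookups via get() already see cached (unflushed) entries, and only table *iteration* would bypass the cache.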
[jira] [Updated] (HDDS-2267) Container metadata scanner interval mismatch
[ https://issues.apache.org/jira/browse/HDDS-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-2267: -- Fix Version/s: 0.5.0 > Container metadata scanner interval mismatch > > > Key: HDDS-2267 > URL: https://issues.apache.org/jira/browse/HDDS-2267 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Container metadata scanner can be configured to run at specific time > intervals, eg. hourly ({{hdds.containerscrub.metadata.scan.interval}}). > However, the actual run interval does not match the configuration. After a > datanode restart, it runs in quick succession, later it runs at apparently > random intervals. > {noformat:title=sample log} > datanode_1 | 2019-10-08 14:05:30 INFO ContainerMetadataScanner:88 - > Completed an iteration of container metadata scrubber in 0 minutes. Number of > iterations (since the data-node restart) : 1, Number of containers scanned > in this iteration : 0, Number of unhealthy containers found in this iteration > : 0 > datanode_1 | 2019-10-08 14:09:33 INFO ContainerMetadataScanner:88 - > Completed an iteration of container metadata scrubber in 0 minutes. Number of > iterations (since the data-node restart) : 1, Number of containers scanned > in this iteration : 6, Number of unhealthy containers found in this iteration > : 0 > ... > datanode_1 | 2019-10-08 14:09:33 INFO ContainerMetadataScanner:88 - > Completed an iteration of container metadata scrubber in 0 minutes. Number of > iterations (since the data-node restart) : 28, Number of containers scanned > in this iteration : 6, Number of unhealthy containers found in this iteration > : 0 > datanode_1 | 2019-10-08 14:21:01 INFO ContainerMetadataScanner:88 - > Completed an iteration of container metadata scrubber in 0 minutes. 
Number of > iterations (since the data-node restart) : 29, Number of containers scanned > in this iteration : 6, Number of unhealthy containers found in this iteration > : 0 > datanode_1 | 2019-10-08 14:21:01 INFO ContainerMetadataScanner:88 - > Completed an iteration of container metadata scrubber in 0 minutes. Number of > iterations (since the data-node restart) : 30, Number of containers scanned > in this iteration : 6, Number of unhealthy containers found in this iteration > : 0 > datanode_1 | 2019-10-08 15:30:38 INFO ContainerMetadataScanner:88 - > Completed an iteration of container metadata scrubber in 0 minutes. Number of > iterations (since the data-node restart) : 31, Number of containers scanned > in this iteration : 6, Number of unhealthy containers found in this iteration > : 0 > datanode_1 | 2019-10-08 16:45:01 INFO ContainerMetadataScanner:88 - > Completed an iteration of container metadata scrubber in 0 minutes. Number of > iterations (since the data-node restart) : 32, Number of containers scanned > in this iteration : 6, Number of unhealthy containers found in this iteration > : 0 > {noformat} > The problem is that time elapsed is measured in nanoseconds, while the > configuration is in milliseconds. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
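The unit mismatch described above (elapsed time measured in nanoseconds, compared against an interval configured in milliseconds) can be sketched with a small example; the class and method names are illustrative, not the actual ContainerMetadataScanner code:

```java
import java.util.concurrent.TimeUnit;

public class ScanIntervalCheck {
    // Configured scan interval (e.g. hdds.containerscrub.metadata.scan.interval),
    // expressed in milliseconds: hourly.
    static final long INTERVAL_MS = 3_600_000L;

    // Buggy variant: the elapsed value is in nanoseconds but is compared
    // against a millisecond interval, so the "interval elapsed" check is
    // effectively always true and the scanner reruns immediately.
    static boolean shouldSleepBuggy(long startNanos, long nowNanos) {
        long elapsed = nowNanos - startNanos; // nanoseconds
        return elapsed < INTERVAL_MS;         // wrong: mixes units
    }

    // Fixed variant: convert the elapsed time to milliseconds first.
    static boolean shouldSleepFixed(long startNanos, long nowNanos) {
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(nowNanos - startNanos);
        return elapsedMs < INTERVAL_MS;
    }
}
```

After 30 minutes the fixed check still waits out the hourly interval, while the buggy check already sees a "huge" elapsed value and reruns at once, matching the quick-succession iterations in the log above.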
[jira] [Resolved] (HDDS-2312) Fix typo in ozone command
[ https://issues.apache.org/jira/browse/HDDS-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek resolved HDDS-2312. --- Fix Version/s: 0.5.0 Resolution: Fixed > Fix typo in ozone command > - > > Key: HDDS-2312 > URL: https://issues.apache.org/jira/browse/HDDS-2312 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone CLI >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {noformat:title=ozone} > Usage: ozone [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS] > ... > insight tool to get runtime opeartion information > ... > {noformat} > Should be "operation". -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2312) Fix typo in ozone command
[ https://issues.apache.org/jira/browse/HDDS-2312?focusedWorklogId=329114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329114 ] ASF GitHub Bot logged work on HDDS-2312: Author: ASF GitHub Bot Created on: 16/Oct/19 11:17 Start Date: 16/Oct/19 11:17 Worklog Time Spent: 10m Work Description: elek commented on pull request #34: HDDS-2312. Fix typo in ozone command URL: https://github.com/apache/hadoop-ozone/pull/34 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 329114) Time Spent: 20m (was: 10m) > Fix typo in ozone command > - > > Key: HDDS-2312 > URL: https://issues.apache.org/jira/browse/HDDS-2312 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone CLI >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {noformat:title=ozone} > Usage: ozone [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS] > ... > insight tool to get runtime opeartion information > ... > {noformat} > Should be "operation". -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1985) Fix listVolumes API
[ https://issues.apache.org/jira/browse/HDDS-1985?focusedWorklogId=329112&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329112 ]

ASF GitHub Bot logged work on HDDS-1985:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Oct/19 11:11
            Start Date: 16/Oct/19 11:11
    Worklog Time Spent: 10m
      Work Description: bharatviswa504 commented on pull request #33: HDDS-1985. Fix listVolumes API
URL: https://github.com/apache/hadoop-ozone/pull/33

https://issues.apache.org/jira/browse/HDDS-1985

No fix is required for this, as the information is retrieved from the MPU Key table, not through RocksDB table iteration. (When we use get(), it checks the cache first, and then the table.)

Used this Jira to add an integration test to verify the behavior. (This has the cumulative changes required for HDDS-1988 and HDDS-1985.)

Issue Time Tracking
-------------------

    Worklog Id:     (was: 329112)
    Time Spent: 20m  (was: 10m)

> Fix listVolumes API
> ---
>
>                 Key: HDDS-1985
>                 URL: https://issues.apache.org/jira/browse/HDDS-1985
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listVolumes API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the in-memory cache and return the response; later it is picked up by the double-buffer thread and flushed to disk. So, when we do listVolumes, it should use both the in-memory cache and the RocksDB volume table to list volumes for a user.
>
> No fix is required for this, as the information is retrieved from the MPU Key table, not through RocksDB table iteration. (When we use get(), it checks the cache first, and then the table.)
>
> Used this Jira to add an integration test to verify the behavior.
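The lookup order described in the comment above (get() consults the in-memory cache first and falls back to the persisted table only on a miss, so point reads need no RocksDB table iteration) can be sketched as follows. This is a hypothetical, simplified illustration, not the actual Ozone OM classes: `CacheBackedTable`, `putToCache`, and `flush` are names invented here, and plain HashMaps stand in for the real cache and RocksDB-backed table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the cache-first read path described above.
public class CacheBackedTable {
    // In-memory cache: holds entries until the double-buffer thread
    // flushes them to the persistent table.
    private final Map<String, String> cache = new HashMap<>();
    // Stand-in for the RocksDB-backed table on disk.
    private final Map<String, String> table = new HashMap<>();

    public void putToCache(String key, String value) {
        cache.put(key, value);
    }

    // Simulates the double-buffer flush: move an entry from the cache
    // into the persistent table.
    public void flush(String key) {
        String value = cache.remove(key);
        if (value != null) {
            table.put(key, value);
        }
    }

    // Cache-first point read: check the cache, then fall back to the
    // table. No table iteration is needed.
    public Optional<String> get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached);
        }
        return Optional.ofNullable(table.get(key));
    }

    public static void main(String[] args) {
        CacheBackedTable t = new CacheBackedTable();
        t.putToCache("/vol1", "ownerA");
        // Visible before the flush, served from the cache...
        System.out.println(t.get("/vol1").orElse("missing"));
        t.flush("/vol1");
        // ...and after the flush, served from the table.
        System.out.println(t.get("/vol1").orElse("missing"));
    }
}
```

A listVolumes-style operation, by contrast, must iterate, which is why it would need to merge cache and table contents rather than rely on this point-read path.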
[jira] [Work logged] (HDDS-1985) Fix listVolumes API
[ https://issues.apache.org/jira/browse/HDDS-1985?focusedWorklogId=329111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329111 ]

ASF GitHub Bot logged work on HDDS-1985:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Oct/19 11:10
            Start Date: 16/Oct/19 11:10
    Worklog Time Spent: 10m
      Work Description: elek commented on pull request #33: HDDS-1985. Fix listVolumes API
URL: https://github.com/apache/hadoop-ozone/pull/33

Issue Time Tracking
-------------------

            Worklog Id:     (was: 329111)
    Remaining Estimate: 0h
            Time Spent: 10m

> Fix listVolumes API
> ---
>
>                 Key: HDDS-1985
>                 URL: https://issues.apache.org/jira/browse/HDDS-1985
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This Jira is to fix the listVolumes API in the HA code path.
> In HA, we have an in-memory cache: we put the result into the in-memory cache and return the response; later it is picked up by the double-buffer thread and flushed to disk. So, when we do listVolumes, it should use both the in-memory cache and the RocksDB volume table to list volumes for a user.
>
> No fix is required for this, as the information is retrieved from the MPU Key table, not through RocksDB table iteration. (When we use get(), it checks the cache first, and then the table.)
>
> Used this Jira to add an integration test to verify the behavior.