[jira] [Comment Edited] (HDDS-245) Handle ContainerReports in the SCM
[ https://issues.apache.org/jira/browse/HDDS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559283#comment-16559283 ]

Lokesh Jain edited comment on HDDS-245 at 7/27/18 5:41 AM:
---

Thanks [~elek] for working on this! The patch looks very good to me. I have a few minor comments.
# ReportResult:58,59 - we can keep the missingContainers and newContainers as null.
# ContainerMapping#getContainerWithPipeline needs to be updated for the closed container case. For closed containers we need to fetch the datanodes from ContainerStateMap and return the appropriate pipeline information.
# START_REPLICATION is currently not fired by any publisher. I guess it will be part of another jira?
# We are currently processing the report as soon as it is received. Are we handling the case where a container has been added on one DN and removed from another DN? In such a case we might send out a false replicate event, as the replica count would still match the replication factor.

was (Author: ljain):
Thanks [~elek] for working on this! I have a few minor comments.
# ReportResult:58,59 - we can keep the missingContainers and newContainers as null.
# ContainerMapping#getContainerWithPipeline needs to be updated for the closed container case. For closed containers we need to fetch the datanodes from ContainerStateMap and return the appropriate pipeline information.
# START_REPLICATION is currently not fired by any publisher. I guess it will be part of another jira?
# We are currently processing the report as soon as it is received. Are we handling the case where a container has been added on one DN and removed from another DN? In such a case we might send out a false replicate event, as the replica count would still match the replication factor.

> Handle ContainerReports in the SCM
> --
>
> Key: HDDS-245
> URL: https://issues.apache.org/jira/browse/HDDS-245
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-245.001.patch, HDDS-245.002.patch, HDDS-245.003.patch
>
> HDDS-242 provides a new class ContainerReportHandler which could handle the ContainerReports from the SCMHeartbeatDispatcher.
> HDDS-228 introduces a new map to store the container -> datanode[] mapping.
> HDDS-199 implements the ReplicationManager which could send commands to the datanodes to copy the containers.
> To wire all these components together, we need to add the implementation to the ContainerReportHandler (created in HDDS-242).
> The ContainerReportHandler should process the new ContainerReportForDatanode events, update the containerStateMap and node2ContainerMap, calculate the missing/duplicate containers, and send the ReplicateCommand to the ReplicationManager.
[jira] [Commented] (HDDS-245) Handle ContainerReports in the SCM
[ https://issues.apache.org/jira/browse/HDDS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559283#comment-16559283 ]

Lokesh Jain commented on HDDS-245:
--

Thanks [~elek] for working on this! I have a few minor comments.
# ReportResult:58,59 - we can keep the missingContainers and newContainers as null.
# ContainerMapping#getContainerWithPipeline needs to be updated for the closed container case. For closed containers we need to fetch the datanodes from ContainerStateMap and return the appropriate pipeline information.
# START_REPLICATION is currently not fired by any publisher. I guess it will be part of another jira?
# We are currently processing the report as soon as it is received. Are we handling the case where a container has been added on one DN and removed from another DN? In such a case we might send out a false replicate event, as the replica count would still match the replication factor.

> Handle ContainerReports in the SCM
> --
>
> Key: HDDS-245
> URL: https://issues.apache.org/jira/browse/HDDS-245
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: SCM
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-245.001.patch, HDDS-245.002.patch, HDDS-245.003.patch
>
> HDDS-242 provides a new class ContainerReportHandler which could handle the ContainerReports from the SCMHeartbeatDispatcher.
> HDDS-228 introduces a new map to store the container -> datanode[] mapping.
> HDDS-199 implements the ReplicationManager which could send commands to the datanodes to copy the containers.
> To wire all these components together, we need to add the implementation to the ContainerReportHandler (created in HDDS-242).
> The ContainerReportHandler should process the new ContainerReportForDatanode events, update the containerStateMap and node2ContainerMap, calculate the missing/duplicate containers, and send the ReplicateCommand to the ReplicationManager.
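[Editor's note] A minimal sketch of the report-diff step the description calls for: computing the missing and newly-reported containers for a datanode as a pair of set differences. The class and method shapes below are illustrative stand-ins, not the actual HDDS-242/HDDS-245 API.

{code:java}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Illustrative stand-in for the ReportResult discussed above. */
final class ReportResult<T> {
  final Set<T> missingContainers;
  final Set<T> newContainers;

  ReportResult(Set<T> missing, Set<T> added) {
    // Per comment 1, empty diffs could also be represented as null;
    // empty sets are used here to keep the sketch null-safe.
    this.missingContainers = Collections.unmodifiableSet(missing);
    this.newContainers = Collections.unmodifiableSet(added);
  }

  /** Diff a datanode's previously known containers against a fresh report. */
  static <T> ReportResult<T> process(Set<T> known, Set<T> reported) {
    Set<T> missing = new HashSet<>(known);
    missing.removeAll(reported);          // known before, absent from report
    Set<T> added = new HashSet<>(reported);
    added.removeAll(known);               // in report, not known before
    return new ReportResult<>(missing, added);
  }
}
{code}

Comment 4 above points at the limitation of diffing one report at a time: a replica that moved between two datanodes can look fully replicated in each individual diff.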
[jira] [Commented] (HDDS-296) OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan
[ https://issues.apache.org/jira/browse/HDDS-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559247#comment-16559247 ]

Xiaoyu Yao commented on HDDS-296:
-

[~anu], I notice that we don't have a basic bloom filter and prefix_extractor enabled on the OM metadata store. With those, I believe the range scan performance will be much better than what we have now. There are many other rocksdb tuning knobs for us to explore.

> OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan
> -
>
> Key: HDDS-296
> URL: https://issues.apache.org/jira/browse/HDDS-296
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Elek, Marton
> Assignee: Anu Engineer
> Priority: Critical
> Fix For: 0.2.1
>
> Attachments: local.png
>
> We identified the problem during freon tests on real clusters. First I saw it on a kubernetes based pseudo cluster (50 datanodes, 1 freon). After a while the rate of key allocation slowed down. (See the attached image.)
> I could also reproduce the problem with a local cluster (I used the hadoop-dist/target/compose/ozoneperf setup). After the first 1 million keys the key creation almost stopped.
> With the help of [~nandakumar131] we identified that the problem is the lock in the ozone manager. (We profiled the OM with VisualVM and found that the code is locked for an extremely long time; we also checked the rocksdb/rpc metrics from prometheus and everything else was working well.)
> [~nandakumar131] suggested to use an instrumented lock in the OMMetadataManager. With a custom build we identified that the problem is that the deletion service holds the OMMetadataManager lock for a full range scan. For 1 million keys it took about 10 seconds (with my local developer machine + ssd).
> {code}
> ozoneManager_1 | 2018-07-25 12:45:03 WARN OMMetadataManager:143 - Lock held time above threshold: lock identifier: OMMetadataManagerLock lockHeldTimeMs=2648 ms. Suppressed 0 lock warnings. The stack trace is: java.lang.Thread.getStackTrace(Thread.java:1559)
> ozoneManager_1 | org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> ozoneManager_1 | org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> ozoneManager_1 | org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> ozoneManager_1 | org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78)
> ozoneManager_1 | org.apache.hadoop.ozone.om.KeyManagerImpl.getPendingDeletionKeys(KeyManagerImpl.java:506)
> ozoneManager_1 | org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:98)
> ozoneManager_1 | org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:85)
> ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ozoneManager_1 | java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ozoneManager_1 | java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ozoneManager_1 | java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> ozoneManager_1 | java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> ozoneManager_1 | java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> ozoneManager_1 | java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> ozoneManager_1 | java.lang.Thread.run(Thread.java:748)
> {code}
> I checked it with the DeletionService disabled and it worked well.
> The deletion service should be improved to work without long-term locking.
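[Editor's note] For reference, enabling a prefix extractor plus a bloom filter in RocksJava looks roughly like the sketch below. The prefix length is a placeholder and would have to match the OM key layout, so treat this as an assumption rather than the eventual OM configuration.

{code:java}
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.BloomFilter;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;

public class OmRocksDbTuningSketch {
  static Options tunedOptions() {
    RocksDB.loadLibrary();
    // Bloom filter with ~10 bits per key is the commonly used default.
    BlockBasedTableConfig tableConfig = new BlockBasedTableConfig()
        .setFilter(new BloomFilter(10));
    return new Options()
        .setCreateIfMissing(true)
        // Placeholder prefix length; it must match the OM key layout
        // (e.g. the /volume/bucket/ portion of a key) to be useful.
        .useFixedLengthPrefixExtractor(16)
        .setTableFormatConfig(tableConfig);
  }
}
{code}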
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559236#comment-16559236 ]

genericqa commented on HDFS-13769:
--

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 4m 6s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 1s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 28m 4s{color} | {color:red} root generated 4 new + 1468 unchanged - 0 fixed = 1472 total (was 1468) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 30s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}124m 15s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13769 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933177/HDFS-13769.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 930a95974683 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8d3c068 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/24664/artifact/out/diff-compile-javac-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/24664/testReport/ |
| Max. process+thread count | 1428 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24664/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Namenode
[jira] [Updated] (HDDS-283) Need an option to list all volumes created in the cluster
[ https://issues.apache.org/jira/browse/HDDS-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nilotpal Nandi updated HDDS-283:
    Attachment: HDDS-283.001.patch

> Need an option to list all volumes created in the cluster
> -
>
> Key: HDDS-283
> URL: https://issues.apache.org/jira/browse/HDDS-283
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Reporter: Nilotpal Nandi
> Assignee: Nilotpal Nandi
> Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-283.001.patch
>
> Currently, the listVolume command either gives:
> 1) all the volumes created by a particular user, using the -user argument,
> 2) or all the volumes created by the logged-in user, if no -user argument is provided.
>
> We need an option to list all the volumes created in the cluster.
[jira] [Updated] (HDDS-283) Need an option to list all volumes created in the cluster
[ https://issues.apache.org/jira/browse/HDDS-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nilotpal Nandi updated HDDS-283:
    Status: Patch Available  (was: Open)

> Need an option to list all volumes created in the cluster
> -
>
> Key: HDDS-283
> URL: https://issues.apache.org/jira/browse/HDDS-283
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Reporter: Nilotpal Nandi
> Assignee: Nilotpal Nandi
> Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-283.001.patch
>
> Currently, the listVolume command either gives:
> 1) all the volumes created by a particular user, using the -user argument,
> 2) or all the volumes created by the logged-in user, if no -user argument is provided.
>
> We need an option to list all the volumes created in the cluster.
[jira] [Commented] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica
[ https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559183#comment-16559183 ]

Xiao Chen commented on HDFS-13658:
--

Thanks for the continued work on this Kitti! I think this is pretty close, some comments:
- For EC blocks, low redundancy is not having 0 or 1 replicas. I guess we could borrow from the comment of {{LowRedundancyBlocks#getPriorityStriped}} and call it 'at highest risk of loss'.
- {{ECBlockGroupStats}} and {{ReplicatedBlockStats}} are public, so we cannot change the constructors. You can either add an overloaded ctor, or use the builder pattern if you want.
- I don't feel strongly about the new fsck option. But for cleanness I propose we do the metrics work here and split the fsck option out to another jira. My take is that with the new stats admins can get the information they want, and the fsck flag seems to add limited value.

> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica
> ---
>
> Key: HDFS-13658
> URL: https://issues.apache.org/jira/browse/HDFS-13658
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.1.0
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Major
>
> Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch, HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch, HDFS-13658.006.patch, HDFS-13658.007.patch, HDFS-13658.008.patch
>
> fsck, dfsadmin -report, and NN WebUI should report the number of blocks that have 1 replica. We have had many cases opened in which a customer has lost a disk or a DN, losing files/blocks because they had blocks with only 1 replica. We need to make the customer better aware of this situation and that they should take action.
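[Editor's note] On the compatibility point above: the usual pattern is to keep the published constructor and chain it to a wider overload, roughly as below. Field names are illustrative, not the actual patch.

{code:java}
/** Sketch only: keeps the old public constructor binary-compatible. */
public final class ReplicatedBlockStatsSketch {
  private final long lowRedundancyBlocks;
  private final long corruptBlocks;
  private final long highestPriorityLowRedundancyBlocks; // new stat

  // Existing public constructor keeps its signature and delegates.
  public ReplicatedBlockStatsSketch(long lowRedundancy, long corrupt) {
    this(lowRedundancy, corrupt, 0L);
  }

  // New overload added alongside; old callers keep compiling and linking.
  public ReplicatedBlockStatsSketch(long lowRedundancy, long corrupt,
      long highestPriorityLowRedundancy) {
    this.lowRedundancyBlocks = lowRedundancy;
    this.corruptBlocks = corrupt;
    this.highestPriorityLowRedundancyBlocks = highestPriorityLowRedundancy;
  }

  public long getLowRedundancyBlocks() { return lowRedundancyBlocks; }
  public long getCorruptBlocks() { return corruptBlocks; }
  public long getHighestPriorityLowRedundancyBlocks() {
    return highestPriorityLowRedundancyBlocks;
  }
}
{code}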
[jira] [Comment Edited] (HDDS-226) Client should update block length in OM while committing the key
[ https://issues.apache.org/jira/browse/HDDS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559179#comment-16559179 ]

Mukul Kumar Singh edited comment on HDDS-226 at 7/27/18 3:01 AM:
-

Thanks for working on this [~shashikant]. Please find my comments below:
1) OmKeyInfo#updateBlockLength, there are 2 for loops. Normally the order of the blocks in the ksmKeyLocations and in the blockIDList will be the same, so I feel this can be optimized by walking the lists only once. Also, once we have found a match, we should break out of the first loop.
2) Also, we have a DatanodeBlockID in DatanodeContainerProto; the block length is not an argument there. Should this be updated as well?

was (Author: msingh):
Thanks for working on this [~shashikant]. Apart from Ni
1) OmKeyInfo#updateBlockLength, there are 2 for loops. Normally the order of the blocks in the ksmKeyLocations and in the blockIDList will be the same, so I feel this can be optimized by walking the lists only once. Also, once we have found a match, we should break out of the first loop.
2) Also, we have a DatanodeBlockID in DatanodeContainerProto; the block length is not an argument there. Should this be updated as well?

> Client should update block length in OM while committing the key
> --
>
> Key: HDDS-226
> URL: https://issues.apache.org/jira/browse/HDDS-226
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Mukul Kumar Singh
> Assignee: Shashikant Banerjee
> Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-226.00.patch, HDDS-226.01.patch, HDDS-226.02.patch, HDDS-226.03.patch, HDDS-226.04.patch, HDDS-226.05.patch
>
> Currently the client allocates a key with the SCM block size; however, a client can always write a smaller amount of data and close the key. The block length in this case should be updated in OM.
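[Editor's note] A sketch of the single-pass idea from comment 1: since both lists normally share the same order, one cursor can advance through the key locations and stop at the first id match. The BlockInfo type here is a simplified stand-in for the actual OmKeyInfo structures.

{code:java}
import java.util.List;

class BlockLengthUpdateSketch {
  static final class BlockInfo {
    final long blockId;
    long length;
    BlockInfo(long blockId, long length) {
      this.blockId = blockId;
      this.length = length;
    }
  }

  /** Walk both lists once instead of the nested loops in the patch. */
  static void updateBlockLength(List<BlockInfo> keyLocations,
      List<BlockInfo> reported) {
    int cursor = 0;
    for (BlockInfo r : reported) {
      while (cursor < keyLocations.size()) {
        BlockInfo k = keyLocations.get(cursor++);
        if (k.blockId == r.blockId) {
          k.length = r.length;  // found the match; break to the next block
          break;
        }
      }
    }
  }
}
{code}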
[jira] [Comment Edited] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559171#comment-16559171 ]

Yiqun Lin edited comment on HDFS-13769 at 7/27/18 3:01 AM:
---

{quote}
Also clear checkpoint in trash is a typical situation of deleting a large dir, since the checkpoint dir of trash accumulates deleted files within several hours.
{quote}
Agree. We also met this problem. There is a big chance of the checkpoint dir being a large dir. As [~kihwal] mentioned, the safe deleting might not be an atomic operation. But it should be okay to use for clearing the trash dir.
{quote}
Wei-Chiu Chuang, Agree! getContentSummary is a recursive method and it may take several seconds if the dir is very large. getContentSummary holds the read-lock in FSNameSystem rather than the write-lock. Also we need a way to know whether a dir is large. If there is a better solution I don't know, please tell me, and I think it need not to be very accurate.
{quote}
I am thinking, for this, we can skip invoking the expensive {{getContentSummary}} call for the first-level dir, since there is a big chance it is a large dir. For the deeper child paths, we can do as the current patch does. This might be a better way, I think.

was (Author: linyiqun):
{quote}
Also clear checkpoint in trash is a typical situation of deleting a large dir, since the checkpoint dir of trash accumulates deleted files within several hours.
{quote}
Agree. We also met this problem. There is a big chance of the checkpoint dir being a large dir. As [~kihwal] mentioned, the safe deleting might not be an atomic operation. But it should be okay to use for clearing the trash dir.
{quote}
Wei-Chiu Chuang, Agree! getContentSummary is a recursive method and it may take several seconds if the dir is very large. getContentSummary holds the read-lock in FSNameSystem rather than the write-lock. Also we need a way to know whether a dir is large. If there is a better solution I don't know, please tell me, and I think it need not to be very accurate.
{quote}
I am thinking, for this, we can skip invoking the expensive {{getContentSummary}} call for the first-level dir, since there is a large chance it is a big dir. For the child paths, we can do as the current patch does. This might be a better way, I think.

> Namenode gets stuck when deleting large dir in trash
>
> Key: HDFS-13769
> URL: https://issues.apache.org/jira/browse/HDFS-13769
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.8.2, 3.1.0
> Reporter: Tao Jie
> Assignee: Tao Jie
> Priority: Major
> Attachments: HDFS-13769.001.patch
>
> Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a long time when deleting a trash dir with a large amount of data. We found this log in the namenode:
> {quote}
> 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for 23018 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047)
> {quote}
> One simple solution is to avoid deleting a large amount of data in one delete RPC call.
> We implement a trashPolicy that divides the delete operation into several delete RPCs, so each single deletion does not delete too many files.
> Any thoughts? [~linyiqun]
[jira] [Commented] (HDDS-226) Client should update block length in OM while committing the key
[ https://issues.apache.org/jira/browse/HDDS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559179#comment-16559179 ]

Mukul Kumar Singh commented on HDDS-226:

Thanks for working on this [~shashikant]. Apart from Ni
1) OmKeyInfo#updateBlockLength, there are 2 for loops. Normally the order of the blocks in the ksmKeyLocations and in the blockIDList will be the same, so I feel this can be optimized by walking the lists only once. Also, once we have found a match, we should break out of the first loop.
2) Also, we have a DatanodeBlockID in DatanodeContainerProto; the block length is not an argument there. Should this be updated as well?

> Client should update block length in OM while committing the key
> --
>
> Key: HDDS-226
> URL: https://issues.apache.org/jira/browse/HDDS-226
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Mukul Kumar Singh
> Assignee: Shashikant Banerjee
> Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-226.00.patch, HDDS-226.01.patch, HDDS-226.02.patch, HDDS-226.03.patch, HDDS-226.04.patch, HDDS-226.05.patch
>
> Currently the client allocates a key with the SCM block size; however, a client can always write a smaller amount of data and close the key. The block length in this case should be updated in OM.
[jira] [Comment Edited] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559171#comment-16559171 ]

Yiqun Lin edited comment on HDFS-13769 at 7/27/18 2:58 AM:
---

{quote}
Also clear checkpoint in trash is a typical situation of deleting a large dir, since the checkpoint dir of trash accumulates deleted files within several hours.
{quote}
Agree. We also met this problem. There is a big chance of the checkpoint dir being a large dir. As [~kihwal] mentioned, the safe deleting might not be an atomic operation. But it should be okay to use for clearing the trash dir.
{quote}
Wei-Chiu Chuang, Agree! getContentSummary is a recursive method and it may take several seconds if the dir is very large. getContentSummary holds the read-lock in FSNameSystem rather than the write-lock. Also we need a way to know whether a dir is large. If there is a better solution I don't know, please tell me, and I think it need not to be very accurate.
{quote}
I am thinking, for this, we can skip invoking the expensive {{getContentSummary}} call for the first-level dir, since there is a large chance it is a big dir. For the child paths, we can do as the current patch does. This might be a better way, I think.

was (Author: linyiqun):
{quote}
Also clear checkpoint in trash is a typical situation of deleting a large dir, since the checkpoint dir of trash accumulates deleted files within several hours.
{quote}
Agree. We also met this problem. There is a big chance of the checkpoint dir being a large dir. As [~kihwal] mentioned, the safe deleting might not be an atomic operation. But it should be okay to use for clearing the trash dir.
{quote}
Wei-Chiu Chuang, Agree! getContentSummary is a recursive method and it may take several seconds if the dir is very large. getContentSummary holds the read-lock in FSNameSystem rather than the write-lock. Also we need a way to know whether a dir is large. If there is a better solution I don't know, please tell me, and I think it need not to be very accurate.
{quote}
I am thinking, for this, we don't really need a limitation value {{FS_TRASH_SAFE_DELETE_ITEM_LIMIT_KEY}}. I mean, if users enable the safe-deletion trash policy, we are assuming the trash dir has a big chance of being a large dir. And we just use the safe deletion way in {{deleteTrashInternal#safeDelete}}. And there is no need to invoke the expensive {{getContentSummary}} call to get the counts.

> Namenode gets stuck when deleting large dir in trash
>
> Key: HDFS-13769
> URL: https://issues.apache.org/jira/browse/HDFS-13769
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.8.2, 3.1.0
> Reporter: Tao Jie
> Assignee: Tao Jie
> Priority: Major
> Attachments: HDFS-13769.001.patch
>
> Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a long time when deleting a trash dir with a large amount of data.
> We found this log in the namenode:
> {quote}
> 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for 23018 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047)
> {quote}
> One simple solution is to avoid deleting a large amount of data in one delete RPC call.
> We implement a trashPolicy that divides the delete operation into several delete RPCs, so each single deletion does not delete too many files.
> Any thoughts? [~linyiqun]
[jira] [Updated] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-13769:
-
    Status: Patch Available  (was: Open)

> Namenode gets stuck when deleting large dir in trash
>
> Key: HDFS-13769
> URL: https://issues.apache.org/jira/browse/HDFS-13769
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.1.0, 2.8.2
> Reporter: Tao Jie
> Assignee: Tao Jie
> Priority: Major
> Attachments: HDFS-13769.001.patch
>
> Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a long time when deleting a trash dir with a large amount of data. We found this log in the namenode:
> {quote}
> 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for 23018 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047)
> {quote}
> One simple solution is to avoid deleting a large amount of data in one delete RPC call.
> We implement a trashPolicy that divides the delete operation into several delete RPCs, so each single deletion does not delete too many files.
> Any thoughts? [~linyiqun]
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559171#comment-16559171 ]

Yiqun Lin commented on HDFS-13769:
--

{quote}
Also clear checkpoint in trash is a typical situation of deleting a large dir, since the checkpoint dir of trash accumulates deleted files within several hours.
{quote}
Agree. We also met this problem. There is a big chance of the checkpoint dir being a large dir. As [~kihwal] mentioned, the safe deleting might not be an atomic operation. But it should be okay to use for clearing the trash dir.
{quote}
Wei-Chiu Chuang, Agree! getContentSummary is a recursive method and it may take several seconds if the dir is very large. getContentSummary holds the read-lock in FSNameSystem rather than the write-lock. Also we need a way to know whether a dir is large. If there is a better solution I don't know, please tell me, and I think it need not to be very accurate.
{quote}
I am thinking, for this, we don't really need a limitation value {{FS_TRASH_SAFE_DELETE_ITEM_LIMIT_KEY}}. I mean, if users enable the safe-deletion trash policy, we are assuming the trash dir has a big chance of being a large dir. And we just use the safe deletion way in {{deleteTrashInternal#safeDelete}}. And there is no need to invoke the expensive {{getContentSummary}} call to get the counts.

> Namenode gets stuck when deleting large dir in trash
>
> Key: HDFS-13769
> URL: https://issues.apache.org/jira/browse/HDFS-13769
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.8.2, 3.1.0
> Reporter: Tao Jie
> Assignee: Tao Jie
> Priority: Major
> Attachments: HDFS-13769.001.patch
>
> Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a long time when deleting a trash dir with a large amount of data. We found this log in the namenode:
> {quote}
> 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for 23018 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047)
> {quote}
> One simple solution is to avoid deleting a large amount of data in one delete RPC call.
> We implement a trashPolicy that divides the delete operation into several delete RPCs, so each single deletion does not delete too many files.
> Any thoughts? [~linyiqun]
[jira] [Commented] (HDFS-13767) Add msync server implementation.
[ https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559149#comment-16559149 ]

Konstantin Shvachko commented on HDFS-13767:

I like the approach. I don't think preserving the ordering is needed. It seems you need to put in some more work to complete this. Maybe change {{receiveRequestState(header)}} to return {{clientStateId}}, so that you could pass it into {{RpcCall}}; it is then retrieved in {{Handler.run()}} to verify whether the SBN has already caught up. BTW, you can also incorporate the {{AC.isAlwaysRecent()}} logic inside {{receiveRequestState()}} instead of adding a new method.

> Add msync server implementation.
>
> Key: HDFS-13767
> URL: https://issues.apache.org/jira/browse/HDFS-13767
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-13767.WIP.001.patch, HDFS-13767.WIP.002.patch
>
> This is a followup on HDFS-13688, where the msync API is introduced to {{ClientProtocol}} but the server-side implementation is missing. This Jira is to implement the server-side logic.
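[Editor's note] The suggested flow condenses to a check like the one below; every name here is an illustrative stand-in for the actual HDFS-13767 classes, not the patch itself.

{code:java}
/** Sketch: should the (standby) server execute this call yet? */
class StateIdCheckSketch {
  interface AlignmentContext {
    boolean isAlwaysRecent();   // true on the active NN
    long getLastSeenStateId();  // last txid this server has applied
  }

  static boolean canServe(AlignmentContext ac, long clientStateId) {
    // The active always serves; a standby serves only once it has applied
    // at least the state id the client last observed. Otherwise the call
    // would be requeued in Handler.run() instead of returning stale data.
    return ac.isAlwaysRecent() || ac.getLastSeenStateId() >= clientStateId;
  }
}
{code}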
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559125#comment-16559125 ]

Tao Jie commented on HDFS-13769:

[~csun], I agree with [~kihwal]. We cannot use this logic in the default delete operation, since it breaks the existing delete semantics. However, we can use this logic in trash deletion, which has fewer side effects. Also, clearing a checkpoint in trash is a typical case of deleting a large dir, since the trash checkpoint dir accumulates deleted files over several hours.

[~jojochuang], Agree! {{getContentSummary}} is a recursive method and it may take several seconds if the dir is very large. {{getContentSummary}} holds the read-lock in {{FSNameSystem}} rather than the write-lock. Also, we need a way to know whether a dir is large. If there is a better solution that I don't know of, please tell me; I think it need not be very accurate.

> Namenode gets stuck when deleting large dir in trash
>
> Key: HDFS-13769
> URL: https://issues.apache.org/jira/browse/HDFS-13769
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.8.2, 3.1.0
> Reporter: Tao Jie
> Assignee: Tao Jie
> Priority: Major
> Attachments: HDFS-13769.001.patch
>
> Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a long time when deleting a trash dir with a large amount of data. We found this log in the namenode:
> {quote}
> 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for 23018 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047)
> {quote}
> One simple solution is to avoid deleting a large amount of data in one delete RPC call.
> We implement a trashPolicy that divides the delete operation into several delete RPCs, so each single deletion does not delete too many files.
> Any thoughts? [~linyiqun]
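[Editor's note] For illustration, the "several small delete RPCs" idea from the description can be sketched client-side as a bottom-up walk. {{listStatusIterator}} and {{delete}} are real {{FileSystem}} APIs; the walk itself is only an assumed policy, not the attached patch.

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

class IncrementalDeleteSketch {
  /** Empty a directory bottom-up so no single delete RPC removes a huge subtree. */
  static void deleteIncrementally(FileSystem fs, Path dir) throws IOException {
    RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
    while (it.hasNext()) {
      FileStatus child = it.next();
      if (child.isDirectory()) {
        deleteIncrementally(fs, child.getPath());  // recurse first
      }
      fs.delete(child.getPath(), true);  // each call is one small RPC
    }
    fs.delete(dir, false);  // dir is empty now, so this is cheap
  }
}
{code}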
[jira] [Updated] (HDDS-270) Move generic container util functions to ContainerUtils
[ https://issues.apache.org/jira/browse/HDDS-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharat Viswanadham updated HDDS-270:
    Fix Version/s: 0.2.1

> Move generic container util functions to ContainerUtils
> ---
>
> Key: HDDS-270
> URL: https://issues.apache.org/jira/browse/HDDS-270
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Hanisha Koneru
> Assignee: Hanisha Koneru
> Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-270.001.patch
>
> Some container util functions, such as getContainerFile(), are common for all ContainerTypes. These functions should be moved to ContainerUtils.
> Also moved some functions to KeyValueContainer as applicable.
[jira] [Created] (HDFS-13771) enableManagedDfsDirsRedundancy typo in creating MiniDFSCluster
wilderchen created HDFS-13771:
-

             Summary: enableManagedDfsDirsRedundancy typo in creating MiniDFSCluster
                 Key: HDFS-13771
                 URL: https://issues.apache.org/jira/browse/HDFS-13771
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs
    Affects Versions: 3.0.3
            Reporter: wilderchen

There is a typo (wrong parameter) in the call to "initNameNodeConf" from "configureNameService" in the file "hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java".
* The prototype of initNameNodeConf is "void initNameNodeConf(Configuration conf, String nameserviceId, int nsIndex, String nnId, boolean manageNameDfsDirs, boolean enableManagedDfsDirsRedundancy, int nnIndex)".
* The actual call to initNameNodeConf in configureNameService is "initNameNodeConf(conf, nsId, nsCounter, nn.getNnId(), manageNameDfsDirs, manageNameDfsDirs, nnIndex)".
* The expected call is "initNameNodeConf(conf, nsId, nsCounter, nn.getNnId(), manageNameDfsDirs, enableManagedDfsDirsRedundancy, nnIndex)".
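[Editor's note] The fix implied by the three bullets above is a one-argument change in configureNameService, shown here as an illustrative diff (surrounding formatting is approximate):

{code}
-    initNameNodeConf(conf, nsId, nsCounter, nn.getNnId(),
-        manageNameDfsDirs, manageNameDfsDirs, nnIndex);
+    initNameNodeConf(conf, nsId, nsCounter, nn.getNnId(),
+        manageNameDfsDirs, enableManagedDfsDirsRedundancy, nnIndex);
{code}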
[jira] [Commented] (HDDS-270) Move generic container util functions to ContainerUtils
[ https://issues.apache.org/jira/browse/HDDS-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559064#comment-16559064 ]

genericqa commented on HDDS-270:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 0s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 36s{color} | {color:red} hadoop-hdds_container-service generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s{color} | {color:green} container-service in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 6s{color} | {color:green} integration-test in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 31s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-270 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933275/HDDS-270.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 0df7f0871355 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559011#comment-16559011 ]

genericqa commented on HDDS-268:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 34s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} framework in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 26s{color} | {color:red} server-scm in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 48s{color} | {color:black} {color} |

|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.TestCloseContainerWatcher |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-268 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933269/HDDS-268.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux b8b394062ada 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d70d845 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| findbugs |
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558999#comment-16558999 ]

Wei-Chiu Chuang commented on HDFS-13769:

{code}
ContentSummary cs = fs.getContentSummary(path);
{code}
is recursive in nature, so iterating over a big directory can be slow (probably not as slow as a recursive delete). You should avoid calling it if possible.

> Namenode gets stuck when deleting large dir in trash
>
> Key: HDFS-13769
> URL: https://issues.apache.org/jira/browse/HDFS-13769
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.8.2, 3.1.0
> Reporter: Tao Jie
> Assignee: Tao Jie
> Priority: Major
> Attachments: HDFS-13769.001.patch
>
> Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a long time when deleting a trash dir with a large amount of data. We found this log in the namenode:
> {quote}
> 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for 23018 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047)
> {quote}
> One simple solution is to avoid deleting a large amount of data in one delete RPC call.
> We implement a trashPolicy that divides the delete operation into several delete RPCs, so each single deletion does not delete too many files.
> Any thoughts? [~linyiqun]
[jira] [Commented] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider at creation time for consistent UGI handling
[ https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558993#comment-16558993 ]

genericqa commented on HDFS-13697:
--

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 29s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 10s{color} | {color:green} hadoop-kms in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 47s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 18s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}254m 25s{color} | {color:black} {color} |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.server.datanode.TestDataNodeMXBean |
| | hadoop.fs.viewfs.TestViewFileSystemHdfs |
| | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.client.impl.TestBlockReaderLocal |
| | hadoop.hdfs.TestErasureCodingExerciseAPIs |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce
[jira] [Comment Edited] (HDDS-296) OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan
[ https://issues.apache.org/jira/browse/HDDS-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558977#comment-16558977 ] Anu Engineer edited comment on HDDS-296 at 7/26/18 10:24 PM: - [~elek]/[~nandakumar131] Thanks for root causing this issue. I will take care of this; we cannot have a release without this getting fixed. The reason I want to fix this is that this issue is just a symptom: we have these range scans at other places in the code too, and OM has not gotten as much love as SCM :) was (Author: anu): [~elek]/[~nandakumar131] Thanks for root causing this issue. I will take care of this; we cannot have a release without this getting fixed. > OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan > - > > Key: HDDS-296 > URL: https://issues.apache.org/jira/browse/HDDS-296 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Elek, Marton >Assignee: Anu Engineer >Priority: Critical > Fix For: 0.2.1 > > Attachments: local.png > > > We identified the problem during freon tests on real clusters. First I saw it > on a kubernetes-based pseudo cluster (50 datanodes, 1 freon). After a while > the rate of key allocation slowed down. (See the attached image.) > I could also reproduce the problem with a local cluster (I used the > hadoop-dist/target/compose/ozoneperf setup). After the first 1 million keys > key creation almost stopped. > With the help of [~nandakumar131] we identified that the problem is the lock in > the ozone manager. (We profiled the OM with VisualVM and found that the code > was blocked on the lock for an extremely long time; we also checked the rocksdb/rpc metrics > from prometheus and everything else worked well.) > [~nandakumar131] suggested using an Instrumented lock in the OMMetadataManager. > With a custom build we identified that the problem is that the deletion > service holds the OMMetadataManager lock for a full range scan. For 1 million > keys it took about 10 seconds (with my local developer machine + ssd). > {code} > ozoneManager_1 | 2018-07-25 12:45:03 WARN OMMetadataManager:143 - Lock held > time above threshold: lock identifier: OMMetadataManagerLock > lockHeldTimeMs=2648 ms. Suppressed 0 lock warnings.
The stack trace is: > java.lang.Thread.getStackTrace(Thread.java:1559) > ozoneManager_1 | > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyManagerImpl.getPendingDeletionKeys(KeyManagerImpl.java:506) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:98) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:85) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > ozoneManager_1 | > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > ozoneManager_1 | > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ozoneManager_1 | > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ozoneManager_1 | java.lang.Thread.run(Thread.java:748) > {code} > I checked it with the DeletionService disabled and it worked well. > The deletion service should be improved to work without long-term locking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
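A minimal, hypothetical sketch of one way to avoid the long-held lock described above: take the read lock once per small batch of keys rather than once for the whole range scan. The names (deletedKeyTable, BATCH, getPendingDeletionKeys) are illustrative only, not the actual KeyManagerImpl code:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public final class BatchedPendingKeyScan {
  private static final int BATCH = 1000;
  private final NavigableMap<String, byte[]> deletedKeyTable =
      new ConcurrentSkipListMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  /** Collects up to {@code limit} keys, holding the read lock per batch
   *  instead of for the whole table scan. */
  public List<String> getPendingDeletionKeys(int limit) {
    List<String> result = new ArrayList<>();
    String cursor = null;
    boolean more = true;
    while (more && result.size() < limit) {
      lock.readLock().lock();
      try {
        NavigableMap<String, byte[]> view = (cursor == null)
            ? deletedKeyTable
            : deletedKeyTable.tailMap(cursor, /* inclusive */ false);
        int taken = 0;
        for (String key : view.keySet()) {
          result.add(key);
          cursor = key; // resume after this key in the next batch
          if (++taken == BATCH || result.size() == limit) {
            break;
          }
        }
        more = (taken == BATCH); // fewer than BATCH means end of table
      } finally {
        lock.readLock().unlock(); // writers can interleave between batches
      }
    }
    return result;
  }
}
{code}

The trade-off is that writers can interleave between batches, so the result is consistent per batch rather than a point-in-time snapshot; for a background deletion service that is usually acceptable.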
[jira] [Commented] (HDDS-296) OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan
[ https://issues.apache.org/jira/browse/HDDS-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558977#comment-16558977 ] Anu Engineer commented on HDDS-296: --- [~elek]/[~nandakumar131] Thanks for root causing this issue. I will take care of this; we cannot have a release without this getting fixed. > OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan > - > > Key: HDDS-296 > URL: https://issues.apache.org/jira/browse/HDDS-296 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Elek, Marton >Priority: Critical > Fix For: 0.2.1 > > Attachments: local.png > > > We identified the problem during freon tests on real clusters. First I saw it > on a kubernetes-based pseudo cluster (50 datanodes, 1 freon). After a while > the rate of key allocation slowed down. (See the attached image.) > I could also reproduce the problem with a local cluster (I used the > hadoop-dist/target/compose/ozoneperf setup). After the first 1 million keys > key creation almost stopped. > With the help of [~nandakumar131] we identified that the problem is the lock in > the ozone manager. (We profiled the OM with VisualVM and found that the code > was blocked on the lock for an extremely long time; we also checked the rocksdb/rpc metrics > from prometheus and everything else worked well.) > [~nandakumar131] suggested using an Instrumented lock in the OMMetadataManager. > With a custom build we identified that the problem is that the deletion > service holds the OMMetadataManager lock for a full range scan. For 1 million > keys it took about 10 seconds (with my local developer machine + ssd). > {code} > ozoneManager_1 | 2018-07-25 12:45:03 WARN OMMetadataManager:143 - Lock held > time above threshold: lock identifier: OMMetadataManagerLock > lockHeldTimeMs=2648 ms. Suppressed 0 lock warnings.
The stack trace is: > java.lang.Thread.getStackTrace(Thread.java:1559) > ozoneManager_1 | > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyManagerImpl.getPendingDeletionKeys(KeyManagerImpl.java:506) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:98) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:85) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > ozoneManager_1 | > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > ozoneManager_1 | > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ozoneManager_1 | > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ozoneManager_1 | java.lang.Thread.run(Thread.java:748) > {code} > I checked it with the DeletionService disabled and it worked well. > The deletion service should be improved to work without long-term locking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-296) OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan
[ https://issues.apache.org/jira/browse/HDDS-296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer reassigned HDDS-296: - Assignee: Anu Engineer > OMMetadataManagerLock is held by getPendingDeletionKeys for a full table scan > - > > Key: HDDS-296 > URL: https://issues.apache.org/jira/browse/HDDS-296 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Elek, Marton >Assignee: Anu Engineer >Priority: Critical > Fix For: 0.2.1 > > Attachments: local.png > > > We identified the problem during freon tests on real clusters. First I saw it > on a kubernetes-based pseudo cluster (50 datanodes, 1 freon). After a while > the rate of key allocation slowed down. (See the attached image.) > I could also reproduce the problem with a local cluster (I used the > hadoop-dist/target/compose/ozoneperf setup). After the first 1 million keys > key creation almost stopped. > With the help of [~nandakumar131] we identified that the problem is the lock in > the ozone manager. (We profiled the OM with VisualVM and found that the code > was blocked on the lock for an extremely long time; we also checked the rocksdb/rpc metrics > from prometheus and everything else worked well.) > [~nandakumar131] suggested using an Instrumented lock in the OMMetadataManager. > With a custom build we identified that the problem is that the deletion > service holds the OMMetadataManager lock for a full range scan. For 1 million > keys it took about 10 seconds (with my local developer machine + ssd). > {code} > ozoneManager_1 | 2018-07-25 12:45:03 WARN OMMetadataManager:143 - Lock held > time above threshold: lock identifier: OMMetadataManagerLock > lockHeldTimeMs=2648 ms. Suppressed 0 lock warnings. The stack trace is: > java.lang.Thread.getStackTrace(Thread.java:1559) > ozoneManager_1 | > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186) > ozoneManager_1 | > org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyManagerImpl.getPendingDeletionKeys(KeyManagerImpl.java:506) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:98) > ozoneManager_1 | > org.apache.hadoop.ozone.om.KeyDeletingService$KeyDeletingTask.call(KeyDeletingService.java:85) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ozoneManager_1 | java.util.concurrent.FutureTask.run(FutureTask.java:266) > ozoneManager_1 | > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > ozoneManager_1 | > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > ozoneManager_1 | > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ozoneManager_1 | > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ozoneManager_1 | java.lang.Thread.run(Thread.java:748) > {code} > I checked it with the DeletionService disabled and it worked well.
> The deletion service should be improved to work without long-term locking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-270) Move generic container util functions to ContianerUtils
[ https://issues.apache.org/jira/browse/HDDS-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-270: Summary: Move generic container util functions to ContianerUtils (was: Move generic container utils to ContianerUitls) > Move generic container util functions to ContianerUtils > --- > > Key: HDDS-270 > URL: https://issues.apache.org/jira/browse/HDDS-270 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Attachments: HDDS-270.001.patch > > > Some container util functions such as getContainerFile() are common for all > ContainerTypes. These functions should be moved to ContainerUtils. > Also moved some functions to KeyValueContainer as applicable. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-270) Move generic container utils to ContianerUitls
[ https://issues.apache.org/jira/browse/HDDS-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-270: Attachment: HDDS-270.001.patch > Move generic container utils to ContianerUitls > -- > > Key: HDDS-270 > URL: https://issues.apache.org/jira/browse/HDDS-270 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Attachments: HDDS-270.001.patch > > > Some container util functions such as getContainerFile() are common for all > ContainerTypes. These functions should be moved to ContainerUtils. > Also moved some functions to KeyValueContainer as applicable. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-270) Move generic container utils to ContianerUitls
[ https://issues.apache.org/jira/browse/HDDS-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDDS-270: Status: Patch Available (was: Open) > Move generic container utils to ContianerUitls > -- > > Key: HDDS-270 > URL: https://issues.apache.org/jira/browse/HDDS-270 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Attachments: HDDS-270.001.patch > > > Some container util functions such as getContainerFile() are common for all > ContainerTypes. These functions should be moved to ContainerUtils. > Also moved some functions to KeyValueContainer as applicable. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-271) Create a block iterator to iterate blocks in a container
[ https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558941#comment-16558941 ] genericqa commented on HDDS-271: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 36s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 27s{color} | {color:red} hadoop-hdds_container-service generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 38s{color} | {color:green} container-service in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 67m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-271 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933261/HDDS-271.04.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ef9d1f1f2a2e 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d70d845 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | javadoc |
[jira] [Commented] (HDFS-13767) Add msync server implementation.
[ https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558934#comment-16558934 ] Chen Liang commented on HDFS-13767: --- Uploaded the WIP.002 patch. The main difference from the WIP.001 patch is that the logic that keeps calls from the same client in the same processing order is removed. Specifically, if a call has a state id larger than the server state id, the Handler will simply insert the call back into the callQueue and continue. As an example, say the callQueue has two calls from the same client [1,2]. 1 gets checked, and the server state id hasn't caught up. Then 1 gets added back to the queue, making it [2, 1]. Then the server catches up to state id, say, 3. Then 2 gets checked and processed, then 1. So the processing order becomes 2,1. But this is fine because even in the current Server logic, there is no guarantee on the order: it is already possible that two handler threads pick up 1 and 2 respectively and 2 finishes first. In fact, due to the synchronized nature of the API, only when the same client instance is used by multiple threads will there be multiple calls from the same client in the callQueue. But in this case, there should be no expectation on ordering. Furthermore, this logic is for the Observer exclusively, which only handles reads. (Please correct me if I'm wrong on this.) > Add msync server implementation. > > > Key: HDFS-13767 > URL: https://issues.apache.org/jira/browse/HDFS-13767 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13767.WIP.001.patch, HDFS-13767.WIP.002.patch > > > This is a followup on HDFS-13688, where the msync API is introduced to > {{ClientProtocol}} but the server side implementation is missing. This > Jira is to implement the server side logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
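A hypothetical sketch of the re-queue behaviour described in that comment: a handler that defers calls whose client state id is ahead of the server, instead of blocking to preserve per-client order. This is illustrative only, not the WIP patch; a real implementation would also avoid busy-spinning when every queued call is deferred:

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.function.LongSupplier;

public final class ObserverHandlerSketch implements Runnable {
  static final class Call {
    final long clientStateId;
    final Runnable body;
    Call(long clientStateId, Runnable body) {
      this.clientStateId = clientStateId;
      this.body = body;
    }
  }

  private final BlockingQueue<Call> callQueue;
  private final LongSupplier serverStateId;

  ObserverHandlerSketch(BlockingQueue<Call> callQueue,
      LongSupplier serverStateId) {
    this.callQueue = callQueue;
    this.serverStateId = serverStateId;
  }

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        Call call = callQueue.take();
        if (call.clientStateId > serverStateId.getAsLong()) {
          // Server not caught up yet: re-insert and move on. Per-client
          // order may change, which is acceptable for Observer reads.
          callQueue.put(call);
          continue;
        }
        call.body.run(); // state id satisfied: process the read
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }
}
{code}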
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558925#comment-16558925 ] Ajay Kumar commented on HDDS-268: - Patch v1 to rebase with trunk and add license to {{TestCloseContainerWatcher}}. > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch, HDDS-268.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-268: Attachment: HDDS-268.01.patch > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch, HDDS-268.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13767) Add msync server implementation.
[ https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-13767: -- Attachment: HDFS-13767.WIP.002.patch > Add msync server implementation. > > > Key: HDFS-13767 > URL: https://issues.apache.org/jira/browse/HDFS-13767 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Chen Liang >Assignee: Chen Liang >Priority: Major > Attachments: HDFS-13767.WIP.001.patch, HDFS-13767.WIP.002.patch > > > This is a followup on HDFS-13688, where the msync API is introduced to > {{ClientProtocol}} but the server side implementation is missing. This > Jira is to implement the server side logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-226) Client should update block length in OM while committing the key
[ https://issues.apache.org/jira/browse/HDDS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558923#comment-16558923 ] Tsz Wo Nicholas Sze commented on HDDS-226: -- Hi [~shashikant], it seems not a good idea to add blockLength to BlockID. BlockID is used everywhere as an id. How about adding KeyLocation to KeyArgs? > Client should update block length in OM while committing the key > > > Key: HDDS-226 > URL: https://issues.apache.org/jira/browse/HDDS-226 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Mukul Kumar Singh >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-226.00.patch, HDDS-226.01.patch, HDDS-226.02.patch, > HDDS-226.03.patch, HDDS-226.04.patch, HDDS-226.05.patch > > > Currently the client allocates a key sized to the SCM block size; however, a > client can always write a smaller amount of data and close the key. The block > length in this case should be updated in OM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
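A hypothetical sketch of the separation suggested above: keep BlockID a pure identifier and carry the actually-written length in the commit-time key arguments. These classes are invented for illustration and are not the Ozone API:

{code:java}
public final class CommitKeySketch {
  /** Identity only: no mutable length inside the id. */
  static final class BlockId {
    final long containerId;
    final long localId;
    BlockId(long containerId, long localId) {
      this.containerId = containerId;
      this.localId = localId;
    }
  }

  /** Where the data landed and how much was actually written. */
  static final class KeyLocation {
    final BlockId blockId;
    final long offset;
    final long length;
    KeyLocation(BlockId blockId, long offset, long length) {
      this.blockId = blockId;
      this.offset = offset;
      this.length = length;
    }
  }

  /** Commit-time arguments: the written lengths travel here, so BlockId
   *  stays a pure identifier everywhere else in the system. */
  static final class KeyArgs {
    final String volume, bucket, key;
    final java.util.List<KeyLocation> locations;
    KeyArgs(String volume, String bucket, String key,
        java.util.List<KeyLocation> locations) {
      this.volume = volume;
      this.bucket = bucket;
      this.key = key;
      this.locations = locations;
    }
  }
}
{code}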
[jira] [Commented] (HDDS-287) Add Close ContainerAction to Datanode#StateContext when the container gets full
[ https://issues.apache.org/jira/browse/HDDS-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558920#comment-16558920 ] Xiaoyu Yao commented on HDDS-287: - Thanks [~nandakumar131] for the patch. It looks good to me. +1 We will need to add the handler part in SCM to process ContainerAction.Action.CLOSE once HDDS-245 is in. > Add Close ContainerAction to Datanode#StateContext when the container gets > full > --- > > Key: HDDS-287 > URL: https://issues.apache.org/jira/browse/HDDS-287 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.2.1 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-287.000.patch > > > The Datanode has to send a Close ContainerAction to SCM whenever a container gets > full. {{Datanode#StateContext}} has a {{containerActions}} queue from which the > ContainerActions are picked and sent as part of the heartbeat. In this jira we > have to add the ContainerAction to the StateContext whenever a container gets full. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
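A minimal, hypothetical sketch of the flow the description outlines: a write that fills the container queues a CLOSE action, and the heartbeat thread drains the queue. The names mirror the jira's wording, not the exact datanode code:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public final class ContainerFullSketch {
  enum Action { CLOSE }

  static final class ContainerAction {
    final long containerId;
    final Action action;
    ContainerAction(long containerId, Action action) {
      this.containerId = containerId;
      this.action = action;
    }
  }

  private final BlockingQueue<ContainerAction> containerActions =
      new LinkedBlockingQueue<>();

  /** Called after each write; a real implementation would also deduplicate
   *  repeated CLOSE actions for the same container. */
  void onWrite(long containerId, long usedBytes, long maxBytes) {
    if (usedBytes >= maxBytes) { // container is full
      containerActions.offer(new ContainerAction(containerId, Action.CLOSE));
    }
  }

  /** Drained by the heartbeat thread and sent to SCM with the next heartbeat. */
  List<ContainerAction> drainForHeartbeat() {
    List<ContainerAction> batch = new ArrayList<>();
    containerActions.drainTo(batch);
    return batch;
  }
}
{code}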
[jira] [Commented] (HDFS-8131) Implement a space balanced block placement policy
[ https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558894#comment-16558894 ] Yongjun Zhang commented on HDFS-8131: - I just read HDFS-4946, and I found it doesn't exactly do what I meant by comment #3 above. HDFS-4946 introduced a config to disable/enable preferLocalDN; if disabled, the localDN will be skipped for all applications. Whereas when I wrote comment #3 above, I was thinking that when choosing the first DN, we could apply the same fix done here in HDFS-8131, such that we can choose either a local or a remote node for the first DN, instead of always skipping the local DN. Welcome to comment on this thought. > Implement a space balanced block placement policy > - > > Key: HDFS-8131 > URL: https://issues.apache.org/jira/browse/HDFS-8131 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.0.0-alpha1 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Labels: BlockPlacementPolicy > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-8131-branch-2.7.patch, HDFS-8131-v1.diff, > HDFS-8131-v2.diff, HDFS-8131-v3.diff, HDFS-8131.004.patch, > HDFS-8131.005.patch, HDFS-8131.006.patch, balanced.png > > > The default block placement policy chooses datanodes for new blocks > randomly, which results in unbalanced space-used percentages among datanodes > after a cluster expansion. The old datanodes are always at a high used percentage > of space and newly added ones at a low percentage. > Though we can use the external balancer tool to balance the space usage, > it costs extra network IO and it's not easy to control the balancing speed. > An easy solution is to implement a balanced block placement policy which > chooses datanodes with a low used percentage for new blocks with a slightly higher > probability. Before long, the used percentages of the datanodes will trend toward > balance. > Suggestions and discussions are welcomed. Thanks -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
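To make the thought above concrete, a hypothetical sketch (not the HDFS-8131 patch itself): the first-replica choice between the local and a remote datanode could be weighted by remaining-space percentage, instead of always preferring, or always skipping, the local node:

{code:java}
import java.util.Random;

// Illustrative only: weight the local-vs-remote decision for the first
// replica by each node's remaining-space percentage.
public final class SpaceWeightedFirstNode {
  private final Random random = new Random();

  /** Returns true if the local node should receive the first replica. */
  boolean preferLocal(double localFreePercent, double remoteFreePercent) {
    double total = localFreePercent + remoteFreePercent;
    if (total <= 0) {
      return true; // both nodes full; fall back to the default preference
    }
    // The emptier node wins proportionally more often.
    return random.nextDouble() < localFreePercent / total;
  }
}
{code}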
[jira] [Commented] (HDDS-291) Initialize hadoop metrics system in standalone hdds datanodes
[ https://issues.apache.org/jira/browse/HDDS-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558886#comment-16558886 ] Hudson commented on HDDS-291: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14648 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14648/]) HDDS-291. Initialize hadoop metrics system in standalone hdds datanodes. (xyao: rev d70d84570575574b7e3ad0f00baf54f1dde76d97) * (edit) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/HddsDatanodeService.java * (edit) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/SCMConnectionManager.java > Initialize hadoop metrics system in standalone hdds datanodes > - > > Key: HDDS-291 > URL: https://issues.apache.org/jira/browse/HDDS-291 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Minor > Fix For: 0.2.1 > > Attachments: HDDS-291.001.patch > > > Since HDDS-94 we can start a standalone HDDS datanode process without HDFS > datanode parts. > But to see the hadoop metrics over the jmx interface we need to initialize > the hadoop metrics system (we have existing metrics by the storage io layer). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558883#comment-16558883 ] Xiao Chen commented on HDFS-13770: -- Thanks Kitti for identifying this and providing a fix! Patch looks pretty good, some minor comments: - We can extract a method {{decrementBlockStat}} in {{UnderReplicatedBlocks#remove}} for less duplication. - We can tidy up the new 3-param {{remove}}: make it private, and point its javadoc to the 2-param one. Something like: {code}* For details, see {@link #remove(BlockInfo, int)} {code} and explain the difference only (i.e. how oldExpectedReplicas is used). - Original javadoc had a typo: s/attmpted/attempted/g. - Test should have a timeout - Do you think it's helpful to add a few other sanity tests in the same test case? For example, oldExpectedReplica of 2 doesn't trigger a counter decrease. From the code it's pretty clear, so this is really just adding some extra coverage. Up to you. :) > dfsadmin -report does not always decrease "missing blocks (with replication > factor 1)" metrics when file is deleted > --- > > Key: HDFS-13770 > URL: https://issues.apache.org/jira/browse/HDFS-13770 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.7 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13770-branch-2.001.patch > > > The missing blocks (with replication factor 1) metric is not always decreased > when a file is deleted. > If a file is deleted, the remove function of UnderReplicatedBlocks can be > called with the wrong priority (UnderReplicatedBlocks.LEVEL). If it is called > with the wrong priority, the corruptReplOneBlocks metric is not decreased, > although the block is removed from the priority queue which contains it. > The corresponding code: > {code:java} > /** remove a block from a under replication queue */ > synchronized boolean remove(BlockInfo block, > int oldReplicas, > int oldReadOnlyReplicas, > int decommissionedReplicas, > int oldExpectedReplicas) { > final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas, > decommissionedReplicas, oldExpectedReplicas); > boolean removedBlock = remove(block, priLevel); > if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && > oldExpectedReplicas == 1 && > removedBlock) { > corruptReplOneBlocks--; > assert corruptReplOneBlocks >= 0 : > "Number of corrupt blocks with replication factor 1 " + > "should be non-negative"; > } > return removedBlock; > } > /** > * Remove a block from the under replication queues. > * > * The priLevel parameter is a hint of which queue to query > * first: if negative or = \{@link #LEVEL} this shortcutting > * is not attmpted. > * > * If the block is not found in the nominated queue, an attempt is made to > * remove it from all queues. > * > * Warning: This is not a synchronized method. > * @param block block to remove > * @param priLevel expected privilege level > * @return true if the block was found and removed from one of the priority > queues > */ > boolean remove(BlockInfo block, int priLevel) { > if(priLevel >= 0 && priLevel < LEVEL > && priorityQueues.get(priLevel).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" + > " from priority queue {}", block, priLevel); > return true; > } else { > // Try to remove the block from all queues if the block was > // not found in the queue for the given priority level. > for (int i = 0; i < LEVEL; i++) { > if (i != priLevel && priorityQueues.get(i).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" + > " {} from priority queue {}", block, i); > return true; > } > } > } > return false; > } > {code} > It is already fixed on trunk by this jira: HDFS-10999, but that ticket > introduces new metrics, which I think shouldn't be backported to branch-2. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
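To make the first review point concrete, a hypothetical sketch of the extracted helper: both remove() overloads would delegate their corruptReplOneBlocks bookkeeping to it. The surrounding class is trimmed down so the fragment compiles on its own, and the constant value is an assumption, not a quote from the patch:

{code:java}
// Hypothetical sketch of the suggested decrementBlockStat extraction.
class UnderReplicatedBlocksSketch {
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4; // assumed queue index
  private int corruptReplOneBlocks;

  /** Shared bookkeeping for both remove() overloads. */
  private void decrementBlockStat(int priLevel, int oldExpectedReplicas,
      boolean removedBlock) {
    if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1
        && removedBlock) {
      corruptReplOneBlocks--;
      assert corruptReplOneBlocks >= 0 :
          "Number of corrupt blocks with replication factor 1 "
              + "should be non-negative";
    }
  }
}
{code}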
[jira] [Commented] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558865#comment-16558865 ] Hudson commented on HDDS-277: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14647 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14647/]) HDDS-277. PipelineStateMachine should handle closure of pipelines in (xyao: rev fd31cb6cfeef0c7e9bb0a054cb0f78853df8976f) * (edit) hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestContainerPlacement.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManager.java * (edit) hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/closer/TestContainerCloser.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/states/ContainerStateMap.java * (edit) hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestCloseContainerEventHandler.java * (edit) hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/container/common/helpers/ContainerInfo.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipelines/standalone/StandaloneManagerImpl.java * (edit) hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestContainerMapping.java * (edit) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/scm/TestContainerSQLCli.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipelines/ratis/RatisManagerImpl.java * (add) hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/pipeline/TestPipelineClose.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipelines/Node2PipelineMap.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/CloseContainerEventHandler.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java * (edit) hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/block/TestBlockManager.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerMapping.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipelines/PipelineManager.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipelines/PipelineSelector.java > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch, HDDS-277.005.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds capability to PipelineStateMachine to close a SCM pipeline and > corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed then the nodes of the pipeline will be released > back to the free node pool -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-271) Create a block iterator to iterate blocks in a container
[ https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558862#comment-16558862 ] Bharat Viswanadham commented on HDDS-271: - Hi [~nandakumar131] Thanks for the review and the offline discussion. Addressed your review comments in patch v04. > Create a block iterator to iterate blocks in a container > > > Key: HDDS-271 > URL: https://issues.apache.org/jira/browse/HDDS-271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-271.00.patch, HDDS-271.01.patch, HDDS-271.02.patch, > HDDS-271.03.patch, HDDS-271.04.patch > > > Create a block iterator to scan all blocks in a container. > This will be useful during the implementation of the container scanner. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
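For context, a hypothetical sketch of what such a block iterator could look like: it walks a container's metadata store lazily and skips non-block bookkeeping entries, so a container scanner can stream blocks without loading them all at once. The names and the '#' prefix convention are assumptions, not the HDDS-271 patch:

{code:java}
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

public final class BlockIteratorSketch
    implements Iterator<Map.Entry<String, byte[]>> {
  private final Iterator<Map.Entry<String, byte[]>> raw;
  private Map.Entry<String, byte[]> next;

  BlockIteratorSketch(Iterator<Map.Entry<String, byte[]>> metadataStore) {
    this.raw = metadataStore;
    advance();
  }

  /** Look ahead to the next block entry, skipping bookkeeping keys. */
  private void advance() {
    next = null;
    while (raw.hasNext()) {
      Map.Entry<String, byte[]> entry = raw.next();
      if (!entry.getKey().startsWith("#")) { // assumed non-block prefix
        next = entry;
        return;
      }
    }
  }

  @Override
  public boolean hasNext() {
    return next != null;
  }

  @Override
  public Map.Entry<String, byte[]> next() {
    if (next == null) {
      throw new NoSuchElementException();
    }
    Map.Entry<String, byte[]> current = next;
    advance();
    return current;
  }
}
{code}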
[jira] [Updated] (HDDS-271) Create a block iterator to iterate blocks in a container
[ https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-271: Attachment: HDDS-271.04.patch > Create a block iterator to iterate blocks in a container > > > Key: HDDS-271 > URL: https://issues.apache.org/jira/browse/HDDS-271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-271.00.patch, HDDS-271.01.patch, HDDS-271.02.patch, > HDDS-271.03.patch, HDDS-271.04.patch > > > Create a block iterator to scan all blocks in a container. > This will be useful during the implementation of the container scanner. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-293) Reduce memory usage in KeyData
[ https://issues.apache.org/jira/browse/HDDS-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558860#comment-16558860 ] genericqa commented on HDDS-293: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 32m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 32m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s{color} | {color:green} container-service in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 17s{color} | {color:red} integration-test in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}142m 7s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerByPipeline | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-293 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933239/HDDS-293.20180726.patch | |
[jira] [Updated] (HDDS-291) Initialize hadoop metrics system in standalone hdds datanodes
[ https://issues.apache.org/jira/browse/HDDS-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-291: Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~elek] for the contribution. I've committed the patch to trunk. > Initialize hadoop metrics system in standalone hdds datanodes > - > > Key: HDDS-291 > URL: https://issues.apache.org/jira/browse/HDDS-291 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Minor > Fix For: 0.2.1 > > Attachments: HDDS-291.001.patch > > > Since HDDS-94 we can start a standalone HDDS datanode process without HDFS > datanode parts. > But to see the hadoop metrics over the jmx interface we need to initialize > the hadoop metrics system (we have existing metrics by the storage io layer). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-277: Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~msingh] for the contribution. I've committed the patch to trunk. > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch, HDDS-277.005.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds capability to PipelineStateMachine to close a SCM pipeline and > corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed then the nodes of the pipeline will be released > back to the free node pool -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558773#comment-16558773 ] genericqa commented on HDDS-277: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 31m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 49s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 40s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 29m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 5s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 36s{color} | {color:green} server-scm in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 4m 23s{color} | {color:red} integration-test in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}137m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerByPipeline | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-277 | | JIRA Patch URL |
[jira] [Commented] (HDDS-252) Eliminate the datanode ID file
[ https://issues.apache.org/jira/browse/HDDS-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558764#comment-16558764 ] genericqa commented on HDDS-252: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 36 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 51s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 27s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 42s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 42s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 29m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 36s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s{color} | {color:green} container-service in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 27s{color} | {color:green} server-scm in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 37s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 34s{color} | {color:green} tools in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 33s{color} | {color:green} integration-test in the patch passed. {color} | | {color:green}+1{color} | {color:green}
[jira] [Comment Edited] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider at creation time for consistent UGI handling
[ https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558750#comment-16558750 ] Zsolt Venczel edited comment on HDFS-13697 at 7/26/18 6:50 PM: --- The latest patch contains: * Revert of HDFS-7718 and HADOOP-13749. * DFSClient creates and caches the KeyProvider at construction time. * KMSClientProvider holds on to the UGI at creation time and also supports HADOOP-10698 efforts. * HADOOP-11368 resolves the SSLFactory truststore reloader thread leak. This patch does not cover the shared, periodic method for checking the truststore files. If you agree, it could be covered in a separate jira. was (Author: zvenczel): The latest patch contains: * Revert of HDFS-7718 and HADOOP-13749. * DFSClient creates and caches the KeyProvider at construction time. * KMSClientProvider holds on to the UGI at creation time and also supports HADOOP-10698 efforts. * HADOOP-11368 resolves the SSLFactory truststore reloader thread leak. This patch does not cover the shared, periodic method for checking the truststore files. If you agree, it could be covered in a separate jira. > DFSClient should instantiate and cache KMSClientProvider at creation time for > consistent UGI handling > - > > Key: HDFS-13697 > URL: https://issues.apache.org/jira/browse/HDFS-13697 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, > HDFS-13697.03.patch, HDFS-13697.04.patch > > > While calling KeyProviderCryptoExtension decryptEncryptedKey the call stack > might not have a doAs privileged execution call (in the DFSClient for example). > This results in losing the proxy user from UGI as UGI.getCurrentUser finds > no AccessControllerContext and does a re-login for the login user only. > This can cause the following, for example: if we have set up the oozie user to > be entitled to perform actions on behalf of example_user but oozie is > forbidden to decrypt any EDEK (for security reasons), due to the above issue, > example_user entitlements are lost from UGI and the following error is > reported: > {code} > [0] > SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] > JOB[0020905-180313191552532-oozie-oozi-W] > ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting > action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message > [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with > ACL name [encrypted_key]!!] > org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not > authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!! 
> at > org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463) > at > org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441) > at > org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523) > at > org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199) > at > org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) > at org.apache.oozie.command.XCommand.call(XCommand.java:286) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User > [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name > [encrypted_key]!! > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157) > at >
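To make the caching idea in the comment above concrete: the sketch below is a hypothetical illustration, not the actual HDFS-13697 patch. The class name and the {{callKms}} helper are invented for this example; it only shows the principle of capturing the UGI at provider-creation time and running later KMS calls under it, so a proxy-user context is not lost when no AccessControllerContext is on the stack.

{code:java}
import java.io.IOException;
import java.net.URI;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

/** Hypothetical sketch, not the actual HDFS-13697 patch. */
public class CachedUgiKmsSketch {
  private final UserGroupInformation actualUgi;

  public CachedUgiKmsSketch(URI kmsUri, Configuration conf) throws IOException {
    // Capture the current (possibly proxy) user while the caller's doAs
    // context is still on the stack.
    this.actualUgi = UserGroupInformation.getCurrentUser();
  }

  public byte[] decryptEncryptedKey(final byte[] edek) throws IOException {
    try {
      // A later call with no AccessControllerContext still runs as the
      // captured user instead of falling back to the login user.
      return actualUgi.doAs(
          (PrivilegedExceptionAction<byte[]>) () -> callKms(edek));
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException(e);
    }
  }

  // Stand-in for the real HTTP round trip to the KMS.
  private byte[] callKms(byte[] edek) {
    return edek;
  }
}
{code}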
[jira] [Commented] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider at creation time for consistent UGI handling
[ https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558750#comment-16558750 ] Zsolt Venczel commented on HDFS-13697: -- The latest patch contains: * Revert of HDFS-7718 and HADOOP-13749. * DFSClient creates and caches the KeyProvider at construction time. * KMSClientProvider holds on to the UGI at creation time and also supports HADOOP-10698 efforts. * HADOOP-11368 resolves the SSLFactory truststore reloader thread leak. This patch does not cover the shared, periodic method for checking the truststore files. If you agree, it could be covered in a separate jira. > DFSClient should instantiate and cache KMSClientProvider at creation time for > consistent UGI handling > - > > Key: HDFS-13697 > URL: https://issues.apache.org/jira/browse/HDFS-13697 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, > HDFS-13697.03.patch, HDFS-13697.04.patch > > > While calling KeyProviderCryptoExtension decryptEncryptedKey the call stack > might not have a doAs privileged execution call (in the DFSClient for example). > This results in losing the proxy user from UGI as UGI.getCurrentUser finds > no AccessControllerContext and does a re-login for the login user only. > This can cause the following, for example: if we have set up the oozie user to > be entitled to perform actions on behalf of example_user but oozie is > forbidden to decrypt any EDEK (for security reasons), due to the above issue, > example_user entitlements are lost from UGI and the following error is > reported: > {code} > [0] > SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] > JOB[0020905-180313191552532-oozie-oozi-W] > ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting > action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message > [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with > ACL name [encrypted_key]!!] > org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not > authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!! 
> at > org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463) > at > org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441) > at > org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523) > at > org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199) > at > org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) > at org.apache.oozie.command.XCommand.call(XCommand.java:286) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User > [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name > [encrypted_key]!! > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:607) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:565) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:832) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:209) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:205) > at >
[jira] [Commented] (HDFS-8131) Implement a space balanced block placement policy
[ https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558746#comment-16558746 ] Yongjun Zhang commented on HDFS-8131: - Hm, just noticed HDFS-4946 for my comment #3 above. Thanks. > Implement a space balanced block placement policy > - > > Key: HDFS-8131 > URL: https://issues.apache.org/jira/browse/HDFS-8131 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.0.0-alpha1 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Labels: BlockPlacementPolicy > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-8131-branch-2.7.patch, HDFS-8131-v1.diff, > HDFS-8131-v2.diff, HDFS-8131-v3.diff, HDFS-8131.004.patch, > HDFS-8131.005.patch, HDFS-8131.006.patch, balanced.png > > > The default block placement policy chooses datanodes for new blocks > randomly, which results in an unbalanced space used percent among datanodes > after a cluster expansion. The old datanodes are always at a high used percent > of space and newly added ones are at a low percent. > Though we can use the external balancer tool to balance the space used rate, > it costs extra network IO and it's not easy to control the balance speed. > An easy solution is to implement a balanced block placement policy which > chooses low used percent datanodes for new blocks with a slightly higher > probability. Before long, the used percent of datanodes will tend to be > balanced. > Suggestions and discussions are welcome. Thanks -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8131) Implement a space balanced block placement policy
[ https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558739#comment-16558739 ] Yongjun Zhang commented on HDFS-8131: - Hi [~liushaohui], Thanks much for the nice work here. I have some comments. 1. This jira is described as an "improvement" rather than a new feature; it should be a new feature and be documented. 2. A question related to the one [~Tagar] asked above: https://issues.apache.org/jira/browse/HDFS-8131?focusedCommentId=15981732=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15981732 Class AvailableSpaceBlockPlacementPolicy extends BlockPlacementPolicyDefault, but it doesn't change the behavior of choosing the first node in BlockPlacementPolicyDefault. So even with this new feature, the local DN is always chosen as the first DN (when it is not excluded, of course), and the new feature only changes the selection of the remaining two DNs. 3. I wonder if we could have another placement policy that could potentially choose a DN other than the local DN for the first node, so we don't always choose the local DN first. Would you please share your thoughts? Thanks. > Implement a space balanced block placement policy > - > > Key: HDFS-8131 > URL: https://issues.apache.org/jira/browse/HDFS-8131 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.0.0-alpha1 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Labels: BlockPlacementPolicy > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1 > > Attachments: HDFS-8131-branch-2.7.patch, HDFS-8131-v1.diff, > HDFS-8131-v2.diff, HDFS-8131-v3.diff, HDFS-8131.004.patch, > HDFS-8131.005.patch, HDFS-8131.006.patch, balanced.png > > > The default block placement policy chooses datanodes for new blocks > randomly, which results in an unbalanced space used percent among datanodes > after a cluster expansion. The old datanodes are always at a high used percent > of space and newly added ones are at a low percent. > Though we can use the external balancer tool to balance the space used rate, > it costs extra network IO and it's not easy to control the balance speed. > An easy solution is to implement a balanced block placement policy which > chooses low used percent datanodes for new blocks with a slightly higher > probability. Before long, the used percent of datanodes will tend to be > balanced. > Suggestions and discussions are welcome. Thanks -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
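As background for the discussion above: the core of the available-space idea is a biased coin flip between two randomly chosen candidate datanodes. The sketch below is illustrative only; the class name and the probability knob are assumptions, not the actual HDFS-8131 code, and per comment 2 the bias would apply to the non-first replicas.

{code:java}
import java.util.Random;

/** Hypothetical sketch of the biased choice between two candidates. */
public class AvailableSpaceChooserSketch {
  private final Random rand = new Random();
  /** In [0.5, 1.0]: how often the less-used candidate should win. */
  private final float preferLowUsed;

  public AvailableSpaceChooserSketch(float preferLowUsed) {
    this.preferLowUsed = preferLowUsed;
  }

  /** Returns 0 or 1: which of two candidates (by DFS used ratio) to pick. */
  public int choose(double usedRatioA, double usedRatioB) {
    if (usedRatioA == usedRatioB) {
      return rand.nextBoolean() ? 0 : 1;   // equal usage: no bias needed
    }
    int lessUsed = usedRatioA < usedRatioB ? 0 : 1;
    // Prefer the emptier node with probability preferLowUsed, so placement
    // drifts toward balance without starving the fuller nodes entirely.
    return rand.nextFloat() < preferLowUsed ? lessUsed : 1 - lessUsed;
  }
}
{code}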
[jira] [Updated] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider at creation time for consistent UGI handling
[ https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zsolt Venczel updated HDFS-13697: - Attachment: HDFS-13697.04.patch > DFSClient should instantiate and cache KMSClientProvider at creation time for > consistent UGI handling > - > > Key: HDFS-13697 > URL: https://issues.apache.org/jira/browse/HDFS-13697 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, > HDFS-13697.03.patch, HDFS-13697.04.patch > > > While calling KeyProviderCryptoExtension decryptEncryptedKey the call stack > might not have a doAs privileged execution call (in the DFSClient for example). > This results in losing the proxy user from UGI as UGI.getCurrentUser finds > no AccessControllerContext and does a re-login for the login user only. > This can cause the following, for example: if we have set up the oozie user to > be entitled to perform actions on behalf of example_user but oozie is > forbidden to decrypt any EDEK (for security reasons), due to the above issue, > example_user entitlements are lost from UGI and the following error is > reported: > {code} > [0] > SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] > JOB[0020905-180313191552532-oozie-oozi-W] > ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting > action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message > [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with > ACL name [encrypted_key]!!] > org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not > authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!! > at > org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463) > at > org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441) > at > org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523) > at > org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199) > at > org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) > at org.apache.oozie.command.XCommand.call(XCommand.java:286) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User > [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name > [encrypted_key]!! 
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:607) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:565) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:832) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:209) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:205) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:205) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1440) > at >
[jira] [Updated] (HDDS-293) Reduce memory usage in KeyData
[ https://issues.apache.org/jira/browse/HDDS-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-293: Fix Version/s: 0.2.1 > Reduce memory usage in KeyData > -- > > Key: HDDS-293 > URL: https://issues.apache.org/jira/browse/HDDS-293 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-293.20180726.patch > > > Currently, the field chunks is declared as a List in KeyData as > shown below. > {code} > //KeyData.java > private List chunks; > {code} > It is expected that many KeyData objects only have a single chunk. We could > reduce the memory usage. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
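Since most keys have exactly one chunk, the saving described in HDDS-293 typically comes from not materializing a list in the common case. A hypothetical sketch of that pattern follows; the class and field names are illustrative, and the element type is left generic because the quoted description elides it.

{code:java}
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: keep a lone chunk in a plain field and only
 *  hold a List when a key really has several chunks. */
public class SingleChunkHolder<C> {
  private C singleChunk;   // set when there is exactly one chunk
  private List<C> chunks;  // set only for two or more chunks

  public void setChunks(List<C> newChunks) {
    if (newChunks != null && newChunks.size() == 1) {
      singleChunk = newChunks.get(0);  // drop the list wrapper
      chunks = null;
    } else {
      singleChunk = null;
      chunks = newChunks;
    }
  }

  public List<C> getChunks() {
    return singleChunk != null
        ? Collections.singletonList(singleChunk) : chunks;
  }
}
{code}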
[jira] [Commented] (HDDS-291) Initialize hadoop metrics system in standalone hdds datanodes
[ https://issues.apache.org/jira/browse/HDDS-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558694#comment-16558694 ] genericqa commented on HDDS-291: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 0s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 56s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} container-service in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 74m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-291 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933029/HDDS-291.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d69eee8acbae 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a192295 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDDS-Build/642/testReport/ | | Max. process+thread count | 407 (vs. ulimit of 1) | | modules | C: hadoop-hdds/container-service U: hadoop-hdds/container-service | | Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/642/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Initialize hadoop metrics system in standalone hdds datanodes >
[jira] [Updated] (HDDS-293) Reduce memory usage in KeyData
[ https://issues.apache.org/jira/browse/HDDS-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDDS-293: - Status: Patch Available (was: Open) HDDS-293.20180726.patch: 1st patch > Reduce memory usage in KeyData > -- > > Key: HDDS-293 > URL: https://issues.apache.org/jira/browse/HDDS-293 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Major > Attachments: HDDS-293.20180726.patch > > > Currently, the field chunks is declared as a List in KeyData as > shown below. > {code} > //KeyData.java > private List chunks; > {code} > It is expected that many KeyData objects only have a single chunk. We could > reduce the memory usage. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-293) Reduce memory usage in KeyData
[ https://issues.apache.org/jira/browse/HDDS-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDDS-293: - Attachment: HDDS-293.20180726.patch > Reduce memory usage in KeyData > -- > > Key: HDDS-293 > URL: https://issues.apache.org/jira/browse/HDDS-293 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Major > Attachments: HDDS-293.20180726.patch > > > Currently, the field chunks is declared as a List in KeyData as > shown below. > {code} > //KeyData.java > private List chunks; > {code} > It is expected that many KeyData objects only have a single chunk. We could > reduce the memory usage. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13622) mkdir should print the parent directory in the error message when parent directories do not exist
[ https://issues.apache.org/jira/browse/HDFS-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558684#comment-16558684 ] Hudson commented on HDFS-13622: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14646 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14646/]) HDFS-13622. mkdir should print the parent directory in the error message (xiao: rev be150a17b15d15f5de6d4839d5e805e8d6c57850) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Mkdir.java > mkdir should print the parent directory in the error message when parent > directories do not exist > - > > Key: HDFS-13622 > URL: https://issues.apache.org/jira/browse/HDFS-13622 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Shweta >Priority: Major > Fix For: 3.2.0 > > Attachments: HDFS-13622.02.patch, HDFS-13622.03.patch, > HDFS-13622.04.patch, HDFS-13622.05.patch, HDFS-13622.06.patch > > > this is a bit misleading: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent/newdir': No such file or directory > {code} > I think this command should fail because "nonexistent" doesn't exist... > the correct output would be: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent': No such file or directory > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
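For context, the committed behavior above amounts to reporting the deepest missing ancestor rather than the requested path. Here is a minimal sketch of that walk, assuming the usual FileSystem API; the helper class is hypothetical, not the committed Mkdir.java change.

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical sketch: find the deepest ancestor that is missing. */
public final class MkdirErrorSketch {
  private MkdirErrorSketch() {}

  public static String missingAncestorMessage(FileSystem fs, Path dir)
      throws IOException {
    Path missing = dir;
    // Climb until the parent exists (or we reach the root).
    while (missing.getParent() != null && !fs.exists(missing.getParent())) {
      missing = missing.getParent();
    }
    return "mkdir: `" + missing + "': No such file or directory";
  }
}
{code}

For {{/nonexistent/newdir}} with {{/nonexistent}} missing, the walk stops at {{/nonexistent}} and the message names the parent, matching the expected output in the description.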
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558665#comment-16558665 ] Kihwal Lee commented on HDFS-13769: --- bq. This seems to apply not only for trash dir, but also any directory with large amount of data, You mean the performance hit? Sure. But the same kind of logic cannot be used as a generic solution. It is equivalent to users dividing a large dir structure and deleting the pieces individually. If this logic is applied by default in FSShell, it will break the delete semantics. We might add an option for the FSShell to delete in this mode with a clear warning that the delete is no longer atomic. In any case, we can't do this on the RPC server side (i.e. the namenode). > Namenode gets stuck when deleting large dir in trash > > > Key: HDFS-13769 > URL: https://issues.apache.org/jira/browse/HDFS-13769 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.8.2, 3.1.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Major > Attachments: HDFS-13769.001.patch > > > Similar to the situation discussed in HDFS-13671, Namenode gets stuck for a > long time when deleting a trash dir with a large amount of data. We found this log in the > namenode: > {quote} > 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem > (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for > 23018 ms via > java.lang.Thread.getStackTrace(Thread.java:1552) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820) > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047) > {quote} > One simple solution is to avoid deleting large data in one delete RPC call. > We implement a TrashPolicy that divides the delete operation into several > delete RPCs, so each single deletion does not delete too many files. > Any thoughts? [~linyiqun] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
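The non-atomic, client-side variant Kihwal describes could look roughly like the sketch below: recurse into a directory whose subtree is too large so that no single delete RPC holds the namesystem write lock for long. The helper is hypothetical; the threshold and traversal are assumptions for illustration, not the HDFS-13769 patch, and, as noted above, the overall delete is no longer atomic.

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical sketch of a piecewise (non-atomic) delete. */
public final class IncrementalDeleteSketch {
  private IncrementalDeleteSketch() {}

  public static void deleteInPieces(FileSystem fs, Path dir, long maxPerRpc)
      throws IOException {
    ContentSummary cs = fs.getContentSummary(dir);
    if (cs.getFileCount() + cs.getDirectoryCount() > maxPerRpc) {
      // Subtree too big for one RPC: delete children individually first.
      for (FileStatus child : fs.listStatus(dir)) {
        if (child.isDirectory()) {
          deleteInPieces(fs, child.getPath(), maxPerRpc);
        } else {
          fs.delete(child.getPath(), false);
        }
      }
    }
    fs.delete(dir, true);  // small (or now-empty) directory: one RPC
  }
}
{code}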
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558663#comment-16558663 ] Ajay Kumar commented on HDDS-268: - [~elek] you might be interested in this as it involves some changes in EventWatcher. > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558662#comment-16558662 ] Ajay Kumar commented on HDDS-268: - Thanks [~nandakumar131]! Will fix the ASF license warning for {{TestCloseContainerWatcher}} along with any review comments. > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558646#comment-16558646 ] genericqa commented on HDDS-268: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 36s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 21s{color} | {color:red} server-scm in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 51s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 51s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 24s{color} | {color:red} server-scm in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 23s{color} | {color:red} server-scm in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 30s{color} | {color:green} framework in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 25s{color} | {color:red} server-scm in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 27s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 34s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-268 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932983/HDDS-268.00.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f5e5ecf72067 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a192295 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HDDS-Build/639/artifact/out/branch-findbugs-hadoop-hdds_server-scm-warnings.html | |
[jira] [Commented] (HDFS-13622) mkdir should print the parent directory in the error message when parent directories do not exist
[ https://issues.apache.org/jira/browse/HDFS-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558645#comment-16558645 ] Shweta commented on HDFS-13622: --- Thank you [~xiaochen] for doing the commit. > mkdir should print the parent directory in the error message when parent > directories do not exist > - > > Key: HDFS-13622 > URL: https://issues.apache.org/jira/browse/HDFS-13622 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Shweta >Priority: Major > Fix For: 3.2.0 > > Attachments: HDFS-13622.02.patch, HDFS-13622.03.patch, > HDFS-13622.04.patch, HDFS-13622.05.patch, HDFS-13622.06.patch > > > this is a bit misleading: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent/newdir': No such file or directory > {code} > I think this command should fail because "nonexistent" doesn't exist... > the correct output would be: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent': No such file or directory > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13622) mkdir should print the parent directory in the error message when parent directories do not exist
[ https://issues.apache.org/jira/browse/HDFS-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558640#comment-16558640 ] Xiao Chen edited comment on HDFS-13622 at 7/26/18 5:25 PM: --- Failed tests look unrelated and passed locally. Committed to trunk. Thanks for the contribution [~shwetayakkali]! was (Author: xiaochen): Committed to trunk. Thanks for the contribution [~shwetayakkali]! > mkdir should print the parent directory in the error message when parent > directories do not exist > - > > Key: HDFS-13622 > URL: https://issues.apache.org/jira/browse/HDFS-13622 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Shweta >Priority: Major > Fix For: 3.2.0 > > Attachments: HDFS-13622.02.patch, HDFS-13622.03.patch, > HDFS-13622.04.patch, HDFS-13622.05.patch, HDFS-13622.06.patch > > > this is a bit misleading: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent/newdir': No such file or directory > {code} > I think this command should fail because "nonexistent" doesn't exist... > the correct output would be: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent': No such file or directory > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13622) mkdir should print the parent directory in the error message when parent directories do not exist
[ https://issues.apache.org/jira/browse/HDFS-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-13622: - Summary: mkdir should print the parent directory in the error message when parent directories do not exist (was: mkdir should not print the directory being created in the error message when parent directories do not exist) > mkdir should print the parent directory in the error message when parent > directories do not exist > - > > Key: HDFS-13622 > URL: https://issues.apache.org/jira/browse/HDFS-13622 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Shweta >Priority: Major > Fix For: 3.2.0 > > Attachments: HDFS-13622.02.patch, HDFS-13622.03.patch, > HDFS-13622.04.patch, HDFS-13622.05.patch, HDFS-13622.06.patch > > > this is a bit misleading: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent/newdir': No such file or directory > {code} > I think this command should fail because "nonexistent" doesn't exist... > the correct output would be: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent': No such file or directory > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13622) mkdir should print the parent directory in the error message when parent directories do not exist
[ https://issues.apache.org/jira/browse/HDFS-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-13622: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.2.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks for the contribution [~shwetayakkali]! > mkdir should print the parent directory in the error message when parent > directories do not exist > - > > Key: HDFS-13622 > URL: https://issues.apache.org/jira/browse/HDFS-13622 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Shweta >Priority: Major > Fix For: 3.2.0 > > Attachments: HDFS-13622.02.patch, HDFS-13622.03.patch, > HDFS-13622.04.patch, HDFS-13622.05.patch, HDFS-13622.06.patch > > > this is a bit misleading: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent/newdir': No such file or directory > {code} > I think this command should fail because "nonexistent" doesn't exist... > the correct output would be: > {code} > $ hdfs dfs -mkdir /nonexistent/newdir > mkdir: `/nonexistent': No such file or directory > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558630#comment-16558630 ] Xiaoyu Yao commented on HDDS-277: - [~msingh], can you rebase the patch? > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch, HDDS-277.005.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds capability to PipelineStateMachine to close a SCM pipeline and > corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed then the nodes of the pipeline will be released > back to the free node pool -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-291) Initialize hadoop metrics system in standalone hdds datanodes
[ https://issues.apache.org/jira/browse/HDDS-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558610#comment-16558610 ] Xiaoyu Yao commented on HDDS-291: - +1, pending Jenkins. I manually triggered a Jenkins run at: https://builds.apache.org/job/PreCommit-HDDS-Build/642/. > Initialize hadoop metrics system in standalone hdds datanodes > - > > Key: HDDS-291 > URL: https://issues.apache.org/jira/browse/HDDS-291 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Minor > Fix For: 0.2.1 > > Attachments: HDDS-291.001.patch > > > Since HDDS-94 we can start a standalone HDDS datanode process without HDFS > datanode parts. > But to see the Hadoop metrics over the JMX interface we need to initialize > the Hadoop metrics system (we have existing metrics from the storage IO layer). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
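For readers unfamiliar with the hook being discussed: initializing the metrics system is essentially a one-liner at process startup, after which already-registered metrics sources become visible over JMX. A minimal sketch follows; the "HddsDatanode" prefix and class name are assumptions for illustration, not necessarily what the HDDS-291 patch uses.

{code:java}
import org.apache.hadoop.metrics2.MetricsSystem;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;

/** Hypothetical sketch of the startup hook for a standalone datanode. */
public class HddsDatanodeMetricsSketch {
  public static MetricsSystem initMetrics() {
    // The prefix scopes the metrics/JMX names for this process;
    // "HddsDatanode" is illustrative, not confirmed from the patch.
    return DefaultMetricsSystem.initialize("HddsDatanode");
  }
}
{code}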
[jira] [Commented] (HDDS-10) docker changes to test secure ozone cluster
[ https://issues.apache.org/jira/browse/HDDS-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558602#comment-16558602 ] Ajay Kumar commented on HDDS-10: [~elek] thanks for checking the latest patch. I might be missing something here, but I don't see the issuer binary included in patch v3. (Patch v3 is 16 KB, while patch v2 with the binary files was ~4 MB.) > docker changes to test secure ozone cluster > --- > > Key: HDDS-10 > URL: https://issues.apache.org/jira/browse/HDDS-10 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Security >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.3.0 > > Attachments: HDDS-10-HDDS-4.00.patch, HDDS-10-HDDS-4.01.patch, > HDDS-10-HDDS-4.02.patch, HDDS-10-HDDS-4.03.patch > > > Update docker compose and settings to test secure ozone cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558601#comment-16558601 ] Kitti Nanasi commented on HDFS-13770: - [~jojochuang], I ran the same tests and 3.x does not have this bug. > dfsadmin -report does not always decrease "missing blocks (with replication > factor 1)" metrics when file is deleted > --- > > Key: HDFS-13770 > URL: https://issues.apache.org/jira/browse/HDFS-13770 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.7 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13770-branch-2.001.patch > > > The missing blocks (with replication factor 1) metric is not always decreased > when a file is deleted. > If a file is deleted, the remove function of UnderReplicatedBlocks can be > called with the wrong priority (UnderReplicatedBlocks.LEVEL). If it is called > with the wrong priority, the corruptReplOneBlocks metric is not decreased, > even though the block is removed from the priority queue which contains it. > The corresponding code: > {code:java} > /** remove a block from an under replication queue */ > synchronized boolean remove(BlockInfo block, > int oldReplicas, > int oldReadOnlyReplicas, > int decommissionedReplicas, > int oldExpectedReplicas) { > final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas, > decommissionedReplicas, oldExpectedReplicas); > boolean removedBlock = remove(block, priLevel); > if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && > oldExpectedReplicas == 1 && > removedBlock) { > corruptReplOneBlocks--; > assert corruptReplOneBlocks >= 0 : > "Number of corrupt blocks with replication factor 1 " + > "should be non-negative"; > } > return removedBlock; > } > /** > * Remove a block from the under replication queues. > * > * The priLevel parameter is a hint of which queue to query > * first: if negative or >= \{@link #LEVEL} this shortcutting > * is not attempted. > * > * If the block is not found in the nominated queue, an attempt is made to > * remove it from all queues. > * > * Warning: This is not a synchronized method. > * @param block block to remove > * @param priLevel expected privilege level > * @return true if the block was found and removed from one of the priority > queues > */ > boolean remove(BlockInfo block, int priLevel) { > if(priLevel >= 0 && priLevel < LEVEL > && priorityQueues.get(priLevel).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" + > " from priority queue {}", block, priLevel); > return true; > } else { > // Try to remove the block from all queues if the block was > // not found in the queue for the given priority level. > for (int i = 0; i < LEVEL; i++) { > if (i != priLevel && priorityQueues.get(i).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" + > " {} from priority queue {}", block, i); > return true; > } > } > } > return false; > } > {code} > It is already fixed on trunk by this jira: HDFS-10999, but that ticket > introduces new metrics, which I think shouldn't be backported to branch-2. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
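To spell out the fix direction implied above: the decrement should depend on which queue the block was actually removed from, not on the recomputed (possibly stale) priority hint. The sketch below is hypothetical, not the HDFS-10999 change itself; it assumes the fields and helpers shown in the quoted UnderReplicatedBlocks snippet, and {{removeFromLevel}} is an invented stand-in that reports the level the block was really found in.

{code:java}
// Hypothetical sketch against the quoted UnderReplicatedBlocks fields.
synchronized boolean removeTrackingLevel(BlockInfo block, int oldReplicas,
    int oldReadOnlyReplicas, int decommissionedReplicas,
    int oldExpectedReplicas) {
  final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas,
      decommissionedReplicas, oldExpectedReplicas);
  // removeFromLevel() would return the level the block was actually
  // removed from (or -1 if not found), instead of a plain boolean.
  final int removedLevel = removeFromLevel(block, priLevel);
  if (removedLevel == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
    corruptReplOneBlocks--;  // decrement even when the hint was stale
  }
  return removedLevel >= 0;
}
{code}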
[jira] [Updated] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider at creation time for consistent UGI handling
[ https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zsolt Venczel updated HDFS-13697: - Attachment: (was: HDFS-13697.04.patch) > DFSClient should instantiate and cache KMSClientProvider at creation time for > consistent UGI handling > - > > Key: HDFS-13697 > URL: https://issues.apache.org/jira/browse/HDFS-13697 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, > HDFS-13697.03.patch > > > While calling KeyProviderCryptoExtension decryptEncryptedKey the call stack > might not have a doAs privileged execution call (in the DFSClient for example). > This results in losing the proxy user from UGI as UGI.getCurrentUser finds > no AccessControllerContext and does a re-login for the login user only. > This can cause the following, for example: if we have set up the oozie user to > be entitled to perform actions on behalf of example_user but oozie is > forbidden to decrypt any EDEK (for security reasons), due to the above issue, > example_user entitlements are lost from UGI and the following error is > reported: > {code} > [0] > SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] > JOB[0020905-180313191552532-oozie-oozi-W] > ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting > action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message > [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with > ACL name [encrypted_key]!!] > org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not > authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!! > at > org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463) > at > org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441) > at > org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523) > at > org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199) > at > org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) > at org.apache.oozie.command.XCommand.call(XCommand.java:286) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User > [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name > [encrypted_key]!! 
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:607) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:565) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:832) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:209) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:205) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:205) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1440) > at >
[jira] [Updated] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider at creation time for consistent UGI handling
[ https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zsolt Venczel updated HDFS-13697: - Attachment: HDFS-13697.04.patch > DFSClient should instantiate and cache KMSClientProvider at creation time for > consistent UGI handling > - > > Key: HDFS-13697 > URL: https://issues.apache.org/jira/browse/HDFS-13697 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Zsolt Venczel >Assignee: Zsolt Venczel >Priority: Major > Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, > HDFS-13697.03.patch, HDFS-13697.04.patch > > > While calling KeyProviderCryptoExtension decryptEncryptedKey the call stack > might not have a doAs privileged execution call (in the DFSClient, for example). > This results in losing the proxy user from UGI, as UGI.getCurrentUser finds > no AccessControlContext and does a re-login for the login user only. > This can cause, for example, the following: if we have set up the oozie user to > be entitled to perform actions on behalf of example_user but oozie is > forbidden to decrypt any EDEK (for security reasons), due to the above issue, > example_user entitlements are lost from UGI and the following error is > reported: > {code} > [0] > SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] > JOB[0020905-180313191552532-oozie-oozi-W] > ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting > action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message > [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with > ACL name [encrypted_key]!!] > org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not > authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!! > at > org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463) > at > org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441) > at > org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523) > at > org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199) > at > org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232) > at > org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) > at org.apache.oozie.command.XCommand.call(XCommand.java:286) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332) > at > org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User > [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name > [encrypted_key]!!
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:607) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:565) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:832) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:209) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:205) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:205) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1440) > at >
[jira] [Updated] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-277: --- Attachment: HDDS-277.005.patch > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch, HDDS-277.005.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds the capability to PipelineStateMachine to close an SCM pipeline and > the corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed, the nodes of the pipeline will be released > back to the free node pool. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-277: --- Attachment: (was: HDDS-277.004.patch) > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds the capability to PipelineStateMachine to close an SCM pipeline and > the corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed, the nodes of the pipeline will be released > back to the free node pool. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558530#comment-16558530 ] genericqa commented on HDDS-277: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HDDS-277 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDDS-277 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933230/HDDS-277.004.patch | | Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/640/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch, HDDS-277.004.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds the capability to PipelineStateMachine to close an SCM pipeline and > the corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed, the nodes of the pipeline will be released > back to the free node pool. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-268) Add SCM close container watcher
[ https://issues.apache.org/jira/browse/HDDS-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558524#comment-16558524 ] Nanda kumar commented on HDDS-268: -- Manually triggered Jenkins build: https://builds.apache.org/job/PreCommit-HDDS-Build/639/ > Add SCM close container watcher > --- > > Key: HDDS-268 > URL: https://issues.apache.org/jira/browse/HDDS-268 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-268.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-277: --- Attachment: HDDS-277.004.patch > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch, HDDS-277.004.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds the capability to PipelineStateMachine to close an SCM pipeline and > the corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed, the nodes of the pipeline will be released > back to the free node pool. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558523#comment-16558523 ] Mukul Kumar Singh commented on HDDS-277: Thanks for the review [~xyao]. Review comments are incorporated in the latest patch. > PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch, HDDS-277.004.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds the capability to PipelineStateMachine to close an SCM pipeline and > the corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed, the nodes of the pipeline will be released > back to the free node pool. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558514#comment-16558514 ] Chao Sun commented on HDFS-13769: - This seems to apply not only to the trash dir, but also to any directory with a large amount of data, is that correct [~Tao Jie]? > Namenode gets stuck when deleting large dir in trash > > > Key: HDFS-13769 > URL: https://issues.apache.org/jira/browse/HDFS-13769 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.8.2, 3.1.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Major > Attachments: HDFS-13769.001.patch > > > Similar to the situation discussed in HDFS-13671, Namenode gets stuck for a > long time when deleting a trash dir with a large amount of data. We found this log in > namenode: > {quote} > 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem > (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for > 23018 ms via > java.lang.Thread.getStackTrace(Thread.java:1552) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820) > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047) > {quote} > One simple solution is to avoid deleting a large amount of data in one delete RPC call. > We implement a trashPolicy that divides the delete operation into several > delete RPCs, so that each single deletion does not delete too many files. > Any thoughts? [~linyiqun] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
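To frame the batching idea being discussed: a rough, hypothetical sketch follows. It is not the attached patch; only FileSystem, Path and FileStatus are real Hadoop APIs, and the bottom-up walk issues one small delete RPC per entry so no single call holds the namesystem write lock for a huge subtree.
{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical sketch: delete a big trash directory bottom-up so that
 *  each delete RPC touches only a small part of the namespace. */
class TrashBatchDeleter {
  private final FileSystem fs;

  TrashBatchDeleter(FileSystem fs) {
    this.fs = fs;
  }

  void deleteInBatches(Path dir) throws IOException {
    for (FileStatus child : fs.listStatus(dir)) {
      if (child.isDirectory()) {
        deleteInBatches(child.getPath()); // keep each RPC small
      } else {
        fs.delete(child.getPath(), false); // one file per RPC
      }
    }
    fs.delete(dir, false); // directory is empty now, cheap to remove
  }
}
{code}
A real policy would batch more than one file per RPC and stop recursing below some size threshold; the point is only that many bounded deletes allow the lock to be released between calls.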
[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558509#comment-16558509 ] Wei-Chiu Chuang commented on HDFS-13770: Thanks [~knanasi], really good finding! HDFS-10999 pertains to erasure coding, so there is no way to backport it to branch-2. That said, because HDFS-10999 is a huge internal refactor, have you run the same test (with some modification) and verified the issue is not reproducible in 3.x? > dfsadmin -report does not always decrease "missing blocks (with replication > factor 1)" metrics when file is deleted > --- > > Key: HDFS-13770 > URL: https://issues.apache.org/jira/browse/HDFS-13770 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.7 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13770-branch-2.001.patch > > > Missing blocks (with replication factor 1) metric is not always decreased > when a file is deleted. > If a file is deleted, the remove function of UnderReplicatedBlocks can be > called with the wrong priority (UnderReplicatedBlocks.LEVEL). If it is called > with the wrong priority, the corruptReplOneBlocks metric is not decreased; > however, the block is removed from the priority queue which contains it. > The corresponding code: > {code:java} > /** remove a block from an under replication queue */ > synchronized boolean remove(BlockInfo block, > int oldReplicas, > int oldReadOnlyReplicas, > int decommissionedReplicas, > int oldExpectedReplicas) { > final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas, > decommissionedReplicas, oldExpectedReplicas); > boolean removedBlock = remove(block, priLevel); > if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && > oldExpectedReplicas == 1 && > removedBlock) { > corruptReplOneBlocks--; > assert corruptReplOneBlocks >= 0 : > "Number of corrupt blocks with replication factor 1 " + > "should be non-negative"; > } > return removedBlock; > } > /** > * Remove a block from the under replication queues. > * > * The priLevel parameter is a hint of which queue to query > * first: if negative or >= \{@link #LEVEL} this shortcutting > is not attempted. > * > * If the block is not found in the nominated queue, an attempt is made to > * remove it from all queues. > * > * Warning: This is not a synchronized method. > * @param block block to remove > * @param priLevel expected priority level > * @return true if the block was found and removed from one of the priority > queues > */ > boolean remove(BlockInfo block, int priLevel) { > if(priLevel >= 0 && priLevel < LEVEL > && priorityQueues.get(priLevel).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" + > " from priority queue {}", block, priLevel); > return true; > } else { > // Try to remove the block from all queues if the block was > // not found in the queue for the given priority level. > for (int i = 0; i < LEVEL; i++) { > if (i != priLevel && priorityQueues.get(i).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" + > " {} from priority queue {}", block, i); > return true; > } > } > } > return false; > } > {code} > It is already fixed on trunk by this jira: HDFS-10999, but that ticket > introduces new metrics, which I think shouldn't be backported to branch-2.
> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-252) Eliminate the datanode ID file
[ https://issues.apache.org/jira/browse/HDDS-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558491#comment-16558491 ] Bharat Viswanadham commented on HDDS-252: - Fixed findbugs issues in patch v07. > Eliminate the datanode ID file > -- > > Key: HDDS-252 > URL: https://issues.apache.org/jira/browse/HDDS-252 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-252.00.patch, HDDS-252.01.patch, HDDS-252.02.patch, > HDDS-252.03.patch, HDDS-252.04.patch, HDDS-252.05.patch, HDDS-252.06.patch, > HDDS-252.07.patch > > > This Jira is to remove the datanodeID file. After the ContainerIO work (HDDS-48 > branch) is merged, we have a version file in each Volume which stores the > datanodeUuid and some additional fields. > Also, if the disk containing the datanodeId path is removed, that DN will be > unusable with the current code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-252) Eliminate the datanode ID file
[ https://issues.apache.org/jira/browse/HDDS-252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-252: Attachment: HDDS-252.07.patch > Eliminate the datanode ID file > -- > > Key: HDDS-252 > URL: https://issues.apache.org/jira/browse/HDDS-252 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-252.00.patch, HDDS-252.01.patch, HDDS-252.02.patch, > HDDS-252.03.patch, HDDS-252.04.patch, HDDS-252.05.patch, HDDS-252.06.patch, > HDDS-252.07.patch > > > This Jira is to remove the datanodeID file. After the ContainerIO work (HDDS-48 > branch) is merged, we have a version file in each Volume which stores the > datanodeUuid and some additional fields. > Also, if the disk containing the datanodeId path is removed, that DN will be > unusable with the current code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558483#comment-16558483 ] genericqa commented on HDFS-13770: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 21m 6s{color} | {color:red} Docker failed to build yetus/hadoop:f667ef1. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-13770 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933221/HDFS-13770-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24661/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > dfsadmin -report does not always decrease "missing blocks (with replication > factor 1)" metrics when file is deleted > --- > > Key: HDFS-13770 > URL: https://issues.apache.org/jira/browse/HDFS-13770 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.7 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13770-branch-2.001.patch > > > Missing blocks (with replication factor 1) metric is not always decreased > when a file is deleted. > If a file is deleted, the remove function of UnderReplicatedBlocks can be > called with the wrong priority (UnderReplicatedBlocks.LEVEL). If it is called > with the wrong priority, the corruptReplOneBlocks metric is not decreased; > however, the block is removed from the priority queue which contains it. > The corresponding code: > {code:java} > /** remove a block from an under replication queue */ > synchronized boolean remove(BlockInfo block, > int oldReplicas, > int oldReadOnlyReplicas, > int decommissionedReplicas, > int oldExpectedReplicas) { > final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas, > decommissionedReplicas, oldExpectedReplicas); > boolean removedBlock = remove(block, priLevel); > if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && > oldExpectedReplicas == 1 && > removedBlock) { > corruptReplOneBlocks--; > assert corruptReplOneBlocks >= 0 : > "Number of corrupt blocks with replication factor 1 " + > "should be non-negative"; > } > return removedBlock; > } > /** > * Remove a block from the under replication queues. > * > * The priLevel parameter is a hint of which queue to query > * first: if negative or >= \{@link #LEVEL} this shortcutting > is not attempted. > * > * If the block is not found in the nominated queue, an attempt is made to > * remove it from all queues. > * > * Warning: This is not a synchronized method. > * @param block block to remove > * @param priLevel expected priority level > * @return true if the block was found and removed from one of the priority > queues > */ > boolean remove(BlockInfo block, int priLevel) { > if(priLevel >= 0 && priLevel < LEVEL > && priorityQueues.get(priLevel).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" + > " from priority queue {}", block, priLevel); > return true; > } else { > // Try to remove the block from all queues if the block was > // not found in the queue for the given priority level. > for (int i = 0; i < LEVEL; i++) { > if (i != priLevel && priorityQueues.get(i).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" + > " {} from priority queue {}", block, i); > return true; > } > } > } > return false; > } > {code} > It is already fixed on trunk by this jira: HDFS-10999, but that ticket introduces new metrics, which I think shouldn't be backported to branch-2. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-277) PipelineStateMachine should handle closure of pipelines in SCM
[ https://issues.apache.org/jira/browse/HDDS-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558464#comment-16558464 ] Xiaoyu Yao commented on HDDS-277: - [~msingh], thanks for the update. Patch v2 looks good except one minor comment; +1 after that is fixed. PipelineSelector.java Line 339
{code}
NavigableSet<ContainerID> containerIDS = containerStateManager
    .getMatchingContainerIDsByPipeline(pipeline.getPipelineName());
if (pipeline.getLifeCycleState() == LifeCycleState.CLOSING &&
    containerIDS.size() == 0) {
  updatePipelineState(pipeline, HddsProtos.LifeCycleEvent.CLOSE);
  LOG.info("Closing pipeline. pipelineID: {}", pipeline.getPipelineName());
}
{code}
Can we change it to the following, to avoid an unnecessary getMatchingContainerIDsByPipeline call?
{code}
if (pipeline.getLifeCycleState() != LifeCycleState.CLOSING) {
  return;
}
NavigableSet<ContainerID> containerIDS = containerStateManager
    .getMatchingContainerIDsByPipeline(pipeline.getPipelineName());
if (containerIDS.size() == 0) {
  updatePipelineState(pipeline, HddsProtos.LifeCycleEvent.CLOSE);
  LOG.info("Closing pipeline. pipelineID: {}", pipeline.getPipelineName());
}
{code}
> PipelineStateMachine should handle closure of pipelines in SCM > -- > > Key: HDDS-277 > URL: https://issues.apache.org/jira/browse/HDDS-277 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Blocker > Fix For: 0.2.1 > > Attachments: HDDS-277.001.patch, HDDS-277.002.patch, > HDDS-277.003.patch > > > Currently the only visible state of pipelines in SCM is the open state. This > jira adds the capability to PipelineStateMachine to close an SCM pipeline and > the corresponding open containers on the pipeline. Once all the containers on the > pipeline have been closed, the nodes of the pipeline will be released > back to the free node pool. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-13770: Attachment: HDFS-13770-branch-2.001.patch > dfsadmin -report does not always decrease "missing blocks (with replication > factor 1)" metrics when file is deleted > --- > > Key: HDFS-13770 > URL: https://issues.apache.org/jira/browse/HDFS-13770 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.7 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13770-branch-2.001.patch > > > Missing blocks (with replication factor 1) metric is not always decreased > when a file is deleted. > If a file is deleted, the remove function of UnderReplicatedBlocks can be > called with the wrong priority (UnderReplicatedBlocks.LEVEL). If it is called > with the wrong priority, the corruptReplOneBlocks metric is not decreased; > however, the block is removed from the priority queue which contains it. > The corresponding code: > {code:java} > /** remove a block from an under replication queue */ > synchronized boolean remove(BlockInfo block, > int oldReplicas, > int oldReadOnlyReplicas, > int decommissionedReplicas, > int oldExpectedReplicas) { > final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas, > decommissionedReplicas, oldExpectedReplicas); > boolean removedBlock = remove(block, priLevel); > if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && > oldExpectedReplicas == 1 && > removedBlock) { > corruptReplOneBlocks--; > assert corruptReplOneBlocks >= 0 : > "Number of corrupt blocks with replication factor 1 " + > "should be non-negative"; > } > return removedBlock; > } > /** > * Remove a block from the under replication queues. > * > * The priLevel parameter is a hint of which queue to query > * first: if negative or >= \{@link #LEVEL} this shortcutting > is not attempted. > * > * If the block is not found in the nominated queue, an attempt is made to > * remove it from all queues. > * > * Warning: This is not a synchronized method. > * @param block block to remove > * @param priLevel expected priority level > * @return true if the block was found and removed from one of the priority > queues > */ > boolean remove(BlockInfo block, int priLevel) { > if(priLevel >= 0 && priLevel < LEVEL > && priorityQueues.get(priLevel).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" + > " from priority queue {}", block, priLevel); > return true; > } else { > // Try to remove the block from all queues if the block was > // not found in the queue for the given priority level. > for (int i = 0; i < LEVEL; i++) { > if (i != priLevel && priorityQueues.get(i).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" + > " {} from priority queue {}", block, i); > return true; > } > } > } > return false; > } > {code} > It is already fixed on trunk by this jira: HDFS-10999, but that ticket > introduces new metrics, which I think shouldn't be backported to branch-2. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-13770: Status: Patch Available (was: Open) > dfsadmin -report does not always decrease "missing blocks (with replication > factor 1)" metrics when file is deleted > --- > > Key: HDFS-13770 > URL: https://issues.apache.org/jira/browse/HDFS-13770 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.7 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13770-branch-2.001.patch > > > Missing blocks (with replication factor 1) metric is not always decreased > when a file is deleted. > If a file is deleted, the remove function of UnderReplicatedBlocks can be > called with the wrong priority (UnderReplicatedBlocks.LEVEL). If it is called > with the wrong priority, the corruptReplOneBlocks metric is not decreased; > however, the block is removed from the priority queue which contains it. > The corresponding code: > {code:java} > /** remove a block from an under replication queue */ > synchronized boolean remove(BlockInfo block, > int oldReplicas, > int oldReadOnlyReplicas, > int decommissionedReplicas, > int oldExpectedReplicas) { > final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas, > decommissionedReplicas, oldExpectedReplicas); > boolean removedBlock = remove(block, priLevel); > if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && > oldExpectedReplicas == 1 && > removedBlock) { > corruptReplOneBlocks--; > assert corruptReplOneBlocks >= 0 : > "Number of corrupt blocks with replication factor 1 " + > "should be non-negative"; > } > return removedBlock; > } > /** > * Remove a block from the under replication queues. > * > * The priLevel parameter is a hint of which queue to query > * first: if negative or >= \{@link #LEVEL} this shortcutting > is not attempted. > * > * If the block is not found in the nominated queue, an attempt is made to > * remove it from all queues. > * > * Warning: This is not a synchronized method. > * @param block block to remove > * @param priLevel expected priority level > * @return true if the block was found and removed from one of the priority > queues > */ > boolean remove(BlockInfo block, int priLevel) { > if(priLevel >= 0 && priLevel < LEVEL > && priorityQueues.get(priLevel).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" + > " from priority queue {}", block, priLevel); > return true; > } else { > // Try to remove the block from all queues if the block was > // not found in the queue for the given priority level. > for (int i = 0; i < LEVEL; i++) { > if (i != priLevel && priorityQueues.get(i).remove(block)) { > NameNode.blockStateChangeLog.debug( > "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" + > " {} from priority queue {}", block, i); > return true; > } > } > } > return false; > } > {code} > It is already fixed on trunk by this jira: HDFS-10999, but that ticket > introduces new metrics, which I think shouldn't be backported to branch-2. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-287) Add Close ContainerAction to Datanode#StateContext when the container gets full
[ https://issues.apache.org/jira/browse/HDDS-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558434#comment-16558434 ] genericqa commented on HDDS-287: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 41s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 30m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s{color} | {color:green} container-service in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 35s{color} | {color:green} server-scm in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s{color} | {color:green} tools in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 32s{color} | {color:green} integration-test in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}142m 57s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-287 | | JIRA Patch URL |
[jira] [Created] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted
Kitti Nanasi created HDFS-13770: --- Summary: dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted Key: HDFS-13770 URL: https://issues.apache.org/jira/browse/HDFS-13770 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 2.7.7 Reporter: Kitti Nanasi Assignee: Kitti Nanasi Missing blocks (with replication factor 1) metric is not always decreased when a file is deleted. If a file is deleted, the remove function of UnderReplicatedBlocks can be called with the wrong priority (UnderReplicatedBlocks.LEVEL). If it is called with the wrong priority, the corruptReplOneBlocks metric is not decreased; however, the block is removed from the priority queue which contains it. The corresponding code: {code:java} /** remove a block from an under replication queue */ synchronized boolean remove(BlockInfo block, int oldReplicas, int oldReadOnlyReplicas, int decommissionedReplicas, int oldExpectedReplicas) { final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas, decommissionedReplicas, oldExpectedReplicas); boolean removedBlock = remove(block, priLevel); if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1 && removedBlock) { corruptReplOneBlocks--; assert corruptReplOneBlocks >= 0 : "Number of corrupt blocks with replication factor 1 " + "should be non-negative"; } return removedBlock; } /** * Remove a block from the under replication queues. * * The priLevel parameter is a hint of which queue to query * first: if negative or >= \{@link #LEVEL} this shortcutting * is not attempted. * * If the block is not found in the nominated queue, an attempt is made to * remove it from all queues. * * Warning: This is not a synchronized method. * @param block block to remove * @param priLevel expected priority level * @return true if the block was found and removed from one of the priority queues */ boolean remove(BlockInfo block, int priLevel) { if(priLevel >= 0 && priLevel < LEVEL && priorityQueues.get(priLevel).remove(block)) { NameNode.blockStateChangeLog.debug( "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" + " from priority queue {}", block, priLevel); return true; } else { // Try to remove the block from all queues if the block was // not found in the queue for the given priority level. for (int i = 0; i < LEVEL; i++) { if (i != priLevel && priorityQueues.get(i).remove(block)) { NameNode.blockStateChangeLog.debug( "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" + " {} from priority queue {}", block, i); return true; } } } return false; } {code} It is already fixed on trunk by this jira: HDFS-10999, but that ticket introduces new metrics, which I think shouldn't be backported to branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
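To make the failure mode concrete, here is a self-contained toy model of the queues and the counter (it is neither the attached branch-2 patch nor the HDFS-10999 change; all names are simplified). The fix shown is to decide the decrement from the queue the block was actually removed from, rather than from the caller's priority hint, so a wrong hint such as LEVEL can no longer remove the block yet skip the counter update.
{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of UnderReplicatedBlocks' queues and the repl-1 counter. */
class UnderReplicatedQueuesModel {
  static final int LEVEL = 5;
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

  private final List<Set<Long>> priorityQueues = new ArrayList<>();
  int corruptReplOneBlocks = 0;

  UnderReplicatedQueuesModel() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<Long>());
    }
  }

  /** Returns the level the block was actually removed from, or -1. */
  private int removeAndGetLevel(long blockId, int priLevelHint) {
    if (priLevelHint >= 0 && priLevelHint < LEVEL
        && priorityQueues.get(priLevelHint).remove(blockId)) {
      return priLevelHint;
    }
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevelHint && priorityQueues.get(i).remove(blockId)) {
        return i; // the queue that really held the block
      }
    }
    return -1; // not found anywhere
  }

  boolean remove(long blockId, int priLevelHint, int oldExpectedReplicas) {
    int actual = removeAndGetLevel(blockId, priLevelHint);
    // Decide the decrement from the real queue, not the hint.
    if (actual == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
      corruptReplOneBlocks--;
    }
    return actual >= 0;
  }
}
{code}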
[jira] [Commented] (HDFS-9260) Improve the performance and GC friendliness of NameNode startup and full block reports
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558382#comment-16558382 ] Kihwal Lee commented on HDFS-9260: -- I propose reverting this. HDFS-13671 also reports about 4x slower performance. It might help GC, but regular operations are being affected too much. > Improve the performance and GC friendliness of NameNode startup and full > block reports > -- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg >Priority: Major > Fix For: 3.0.0-alpha1 > > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, > HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, > HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, > HDFSBenchmarks.zip, HDFSBenchmarks2.zip > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-201) Add name for LeaseManager
[ https://issues.apache.org/jira/browse/HDDS-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558334#comment-16558334 ] Hudson commented on HDDS-201: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14645 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14645/]) HDDS-201. Add name for LeaseManager. Contributed by Sandeep Nemuri. (nanda: rev a19229594e48fad9f50dbdb1f0b2fcbf7443ce66) * (edit) hadoop-hdds/common/src/test/java/org/apache/hadoop/ozone/lease/TestLeaseManager.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerMapping.java * (edit) hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java * (edit) hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipelines/PipelineSelector.java * (edit) hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/lease/LeaseManager.java * (edit) hadoop-hdds/framework/src/test/java/org/apache/hadoop/hdds/server/events/TestEventWatcher.java > Add name for LeaseManager > - > > Key: HDDS-201 > URL: https://issues.apache.org/jira/browse/HDDS-201 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Elek, Marton >Assignee: Sandeep Nemuri >Priority: Minor > Labels: newbie > Fix For: 0.2.1 > > Attachments: HDDS-201.001.patch, HDDS-201.002.patch > > > During the review of HDDS-195 we realised that one server could have multiple > LeaseManagers (for example, one for the watchers and one for the container > creation). > To make it easier to monitor, it would be good to use some specific names for > the lease manager. > This jira is about adding a new field (name) to the lease manager which > should be defined by a constructor parameter and should be required. > It should be used in the name of the Threads and all the log messages > (Something like "Starting CommandWatcher LeaseManager") -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
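The shape of the change is simple enough to sketch. This is an illustrative outline only, not the committed patch (the real class is org.apache.hadoop.ozone.lease.LeaseManager, and the field and method names below are assumptions): a required name, validated in the constructor and wired into both the monitor thread name and the log messages.
{code:java}
/** Illustrative sketch of a LeaseManager with a required name. */
class NamedLeaseManager {
  private final String name;    // e.g. "CommandWatcher", required
  private final long timeoutMs; // lease timeout, details elided
  private Thread leaseMonitor;

  NamedLeaseManager(String name, long timeoutMs) {
    if (name == null || name.isEmpty()) {
      throw new IllegalArgumentException("LeaseManager name is required");
    }
    this.name = name;
    this.timeoutMs = timeoutMs;
  }

  void start() {
    // The name shows up in logs and in thread dumps, which is the whole
    // point when one server runs several LeaseManagers.
    System.out.println("Starting " + name + " LeaseManager");
    leaseMonitor = new Thread(() -> {
      // periodic lease-expiry checks elided
    });
    leaseMonitor.setName(name + "-LeaseManager#LeaseMonitor");
    leaseMonitor.start();
  }
}
{code}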
[jira] [Issue Comment Deleted] (HDDS-271) Create a block iterator to iterate blocks in a container
[ https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-271: - Comment: was deleted (was: As this is an iterator, we can throw NoSuchElementException if the iteration has no more elements, instead of returning null. If {{hasNext}} returns true, we should be able to return the next block on the {{nextBlock}} call. Consider a case where we have two blocks [key1:block1, #deleting#key2:block2]. For the first {{hasNext}} call we will return {{true}} and the {{nextBlock}} call will return key1:block1. For the second {{hasNext}} call we will return {{true}}, but the {{nextBlock}} call will return {{null}}. This will create inconsistent behavior in code wherever the iterator is used. We cannot fully rely on the {{hasNext}} call anymore.) > Create a block iterator to iterate blocks in a container > > > Key: HDDS-271 > URL: https://issues.apache.org/jira/browse/HDDS-271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-271.00.patch, HDDS-271.01.patch, HDDS-271.02.patch, > HDDS-271.03.patch > > > Create a block iterator to scan all blocks in a container. > This one will be useful during the implementation of the container scanner. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-271) Create a block iterator to iterate blocks in a container
[ https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558329#comment-16558329 ] Nanda kumar commented on HDDS-271: -- As this is an iterator, we can throw NoSuchElementException if the iteration has no more elements, instead of returning null. If hasNext returns true, we should be able to return the next block on the nextBlock call. Consider a case where we have two blocks \[key1:block1, #deleting#key2:block2\]. For the first hasNext call we will return true and the nextBlock call will return key1:block1. For the second hasNext call we will return true, but the nextBlock call will return null. This will create inconsistent behavior in code wherever the iterator is used. We cannot fully rely on the hasNext call anymore. > Create a block iterator to iterate blocks in a container > > > Key: HDDS-271 > URL: https://issues.apache.org/jira/browse/HDDS-271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-271.00.patch, HDDS-271.01.patch, HDDS-271.02.patch, > HDDS-271.03.patch > > > Create a block iterator to scan all blocks in a container. > This one will be useful during the implementation of the container scanner. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
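The consistency concern raised above is the classic filtered-iterator problem, and the usual fix is to pre-fetch in hasNext() so that next() can never come up empty after hasNext() returned true. A self-contained sketch follows; it is not the attached patch, and plain strings stand in for container block entries (the "#deleting#" prefix mirrors the example in the comment).
{code:java}
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Skips keys marked "#deleting#" by pre-fetching in hasNext(). */
class FilteringBlockIterator implements Iterator<String> {
  private final Iterator<String> raw;
  private String next; // pre-fetched block key, null when exhausted

  FilteringBlockIterator(Iterator<String> raw) {
    this.raw = raw;
  }

  @Override
  public boolean hasNext() {
    // Advance past filtered-out entries until a real one is found.
    while (next == null && raw.hasNext()) {
      String candidate = raw.next();
      if (!candidate.startsWith("#deleting#")) {
        next = candidate; // hold it for the matching next() call
      }
    }
    return next != null;
  }

  @Override
  public String next() {
    if (!hasNext()) {
      throw new NoSuchElementException("no more blocks");
    }
    String result = next;
    next = null;
    return result;
  }

  public static void main(String[] args) {
    // [key1:block1, #deleting#key2:block2] -> only key1:block1 is returned
    Iterator<String> it = new FilteringBlockIterator(
        Arrays.asList("key1:block1", "#deleting#key2:block2").iterator());
    while (it.hasNext()) {
      System.out.println(it.next());
    }
  }
}
{code}
With this shape, the second hasNext() in the two-block example simply returns false, so callers can rely on the hasNext()/next() contract.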
[jira] [Commented] (HDDS-271) Create a block iterator to iterate blocks in a container
[ https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558327#comment-16558327 ] Nanda kumar commented on HDDS-271: -- As this is an iterator, we can throw NoSuchElementException if the iteration has no more elements, instead of returning null. If {{hasNext}} returns true, we should be able to return the next block on the {{nextBlock}} call. Consider a case where we have two blocks [key1:block1, #deleting#key2:block2]. For the first {{hasNext}} call we will return {{true}} and the {{nextBlock}} call will return key1:block1. For the second {{hasNext}} call we will return {{true}}, but the {{nextBlock}} call will return {{null}}. This will create inconsistent behavior in code wherever the iterator is used. We cannot fully rely on the {{hasNext}} call anymore. > Create a block iterator to iterate blocks in a container > > > Key: HDDS-271 > URL: https://issues.apache.org/jira/browse/HDDS-271 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-271.00.patch, HDDS-271.01.patch, HDDS-271.02.patch, > HDDS-271.03.patch > > > Create a block iterator to scan all blocks in a container. > This one will be useful during the implementation of the container scanner. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-201) Add name for LeaseManager
[ https://issues.apache.org/jira/browse/HDDS-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558306#comment-16558306 ] Nanda kumar commented on HDDS-201: -- Thanks [~Sandeep Nemuri] for the contribution and [~elek] for suggesting this improvement and reviewing it. I have committed it to trunk. > Add name for LeaseManager > - > > Key: HDDS-201 > URL: https://issues.apache.org/jira/browse/HDDS-201 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Elek, Marton >Assignee: Sandeep Nemuri >Priority: Minor > Labels: newbie > Fix For: 0.2.1 > > Attachments: HDDS-201.001.patch, HDDS-201.002.patch > > > During the review of HDDS-195 we realised that one server could have multiple > LeaseManagers (for example, one for the watchers and one for the container > creation). > To make it easier to monitor, it would be good to use some specific names for > the lease manager. > This jira is about adding a new field (name) to the lease manager which > should be defined by a constructor parameter and should be required. > It should be used in the name of the Threads and all the log messages > (Something like "Starting CommandWatcher LeaseManager") -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-201) Add name for LeaseManager
[ https://issues.apache.org/jira/browse/HDDS-201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-201: - Resolution: Fixed Status: Resolved (was: Patch Available) > Add name for LeaseManager > - > > Key: HDDS-201 > URL: https://issues.apache.org/jira/browse/HDDS-201 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Elek, Marton >Assignee: Sandeep Nemuri >Priority: Minor > Labels: newbie > Fix For: 0.2.1 > > Attachments: HDDS-201.001.patch, HDDS-201.002.patch > > > During the review of HDDS-195 we realised that one server could have multiple > LeaseManagers (for example, one for the watchers and one for the container > creation). > To make it easier to monitor, it would be good to use some specific names for > the lease manager. > This jira is about adding a new field (name) to the lease manager which > should be defined by a constructor parameter and should be required. > It should be used in the name of the Threads and all the log messages > (Something like "Starting CommandWatcher LeaseManager") -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558292#comment-16558292 ] Tao Jie commented on HDFS-13769: Hi [~jojochuang], the version of our cluster is 2.8.2, and this patch is based on trunk. However, I found the trash-policy logic is almost the same in 2.8.2 and 3. > Namenode gets stuck when deleting large dir in trash > > > Key: HDFS-13769 > URL: https://issues.apache.org/jira/browse/HDFS-13769 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.8.2, 3.1.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Major > Attachments: HDFS-13769.001.patch > > > Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a > long time when deleting a trash dir with a large amount of data. We found this log in the > namenode: > {quote} > 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem > (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for > 23018 ms via > java.lang.Thread.getStackTrace(Thread.java:1552) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820) > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047) > {quote} > One simple solution is to avoid deleting large data in one delete RPC call. > We implemented a trashPolicy that divides the delete operation into several > delete RPCs, so that each single deletion does not delete too many files. > Any thoughts? [~linyiqun] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
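To make the batching idea concrete, here is a hedged sketch under assumed names (the actual TrashPolicy in HDFS-13769.001.patch may split the work differently): instead of a single recursive delete RPC on the checkpoint directory, each top-level child is deleted in its own RPC, so the FSNamesystem write lock is released between subtrees.

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical helper: delete a trash checkpoint in several RPCs. */
public final class BatchedTrashDelete {

  private BatchedTrashDelete() { }

  public static void deleteCheckpoint(FileSystem fs, Path checkpoint)
      throws IOException {
    for (FileStatus child : fs.listStatus(checkpoint)) {
      // One delete RPC per subtree: the NameNode write lock is
      // reacquired per call instead of being held for 20+ seconds.
      fs.delete(child.getPath(), true);
    }
    // Finally remove the (now much smaller) checkpoint dir itself.
    fs.delete(checkpoint, true);
  }
}
{code}

A real policy would presumably recurse further, or cap the number of files per RPC, when a single child subtree is itself large; this sketch only shows the lock-amortizing idea.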
[jira] [Commented] (HDDS-201) Add name for LeaseManager
[ https://issues.apache.org/jira/browse/HDDS-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558289#comment-16558289 ] Nanda kumar commented on HDDS-201: -- +1, looks good to me. I will commit this shortly. > Add name for LeaseManager > - > > Key: HDDS-201 > URL: https://issues.apache.org/jira/browse/HDDS-201 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Elek, Marton >Assignee: Sandeep Nemuri >Priority: Minor > Labels: newbie > Fix For: 0.2.1 > > Attachments: HDDS-201.001.patch, HDDS-201.002.patch > > > During the review of HDDS-195 we realised that one server could have multiple > LeaseManagers (for example, one for the watchers and one for the container > creation). > To make it easier to monitor, it would be good to use specific names for > the lease manager. > This jira is about adding a new field (name) to the lease manager, which > should be defined by a constructor parameter and should be required. > It should be used in the names of the threads and in all the log messages > (something like "Starting CommandWatcher LeaseManager"). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-287) Add Close ContainerAction to Datanode#StateContext when the container gets full
[ https://issues.apache.org/jira/browse/HDDS-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-287: - Affects Version/s: 0.2.1 Status: Patch Available (was: Open) > Add Close ContainerAction to Datanode#StateContext when the container gets > full > --- > > Key: HDDS-287 > URL: https://issues.apache.org/jira/browse/HDDS-287 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.2.1 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-287.000.patch > > > The Datanode has to send a Close ContainerAction to the SCM whenever a container gets > full. {{Datanode#StateContext}} has a {{containerActions}} queue from which the > ContainerActions are picked and sent as part of the heartbeat. In this jira we > have to add a ContainerAction to the StateContext whenever a container gets full. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
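An illustrative sketch of the flow described above (class names and threshold are assumptions; see HDDS-287.000.patch for the real change): when a write pushes a container's used bytes past a fullness threshold, a CLOSE action for that container is queued on the StateContext, and the heartbeat sender later drains the queue toward the SCM.

{code:java}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class StateContextSketch {

  /** Minimal stand-in for the real ContainerAction message. */
  static final class ContainerAction {
    final long containerId;
    final String action; // only "CLOSE" is used in this sketch

    ContainerAction(long containerId, String action) {
      this.containerId = containerId;
      this.action = action;
    }
  }

  // Assumed ratio; the real patch may read the threshold from configuration.
  private static final double CLOSE_THRESHOLD = 0.9;

  // Drained by the heartbeat thread, so it must be thread-safe.
  private final Queue<ContainerAction> containerActions =
      new ConcurrentLinkedQueue<>();

  /** Called from the write path after a container's usage is updated. */
  void maybeQueueClose(long containerId, long usedBytes, long maxBytes) {
    if ((double) usedBytes / maxBytes >= CLOSE_THRESHOLD) {
      containerActions.add(new ContainerAction(containerId, "CLOSE"));
    }
  }
}
{code}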
[jira] [Updated] (HDDS-287) Add Close ContainerAction to Datanode#StateContext when the container gets full
[ https://issues.apache.org/jira/browse/HDDS-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-287: - Attachment: HDDS-287.000.patch > Add Close ContainerAction to Datanode#StateContext when the container gets > full > --- > > Key: HDDS-287 > URL: https://issues.apache.org/jira/browse/HDDS-287 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-287.000.patch > > > The Datanode has to send a Close ContainerAction to the SCM whenever a container gets > full. {{Datanode#StateContext}} has a {{containerActions}} queue from which the > ContainerActions are picked and sent as part of the heartbeat. In this jira we > have to add a ContainerAction to the StateContext whenever a container gets full. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13769) Namenode gets stuck when deleting large dir in trash
[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558280#comment-16558280 ] Wei-Chiu Chuang commented on HDFS-13769: Just to clarify a bit: did you observe this behavior on a Hadoop 2.8.2 cluster as well? Or do you mean the patch (a new trash policy) is applicable to 2.8.2? We changed the internal block data structure in Hadoop 3, so I would expect the performance regression to happen on a Hadoop 3 cluster only. Thanks > Namenode gets stuck when deleting large dir in trash > > > Key: HDFS-13769 > URL: https://issues.apache.org/jira/browse/HDFS-13769 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.8.2, 3.1.0 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Major > Attachments: HDFS-13769.001.patch > > > Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a > long time when deleting a trash dir with a large amount of data. We found this log in the > namenode: > {quote} > 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem > (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for > 23018 ms via > java.lang.Thread.getStackTrace(Thread.java:1552) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820) > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047) > {quote} > One simple solution is to avoid deleting large data in one delete RPC call. > We implemented a trashPolicy that divides the delete operation into several > delete RPCs, so that each single deletion does not delete too many files. > Any thoughts? [~linyiqun] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-219) Genearate version-info.properties for hadoop and ozone
[ https://issues.apache.org/jira/browse/HDDS-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558235#comment-16558235 ] genericqa commented on HDDS-219: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 24s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 33s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. 
Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 9s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 47s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}128m 4s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HDDS-219 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932927/HDDS-219.001.patch | | Optional