[jira] [Updated] (HDDS-695) Introduce new SCM Commands to list and close Pipelines
[ https://issues.apache.org/jira/browse/HDDS-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-695: - Summary: Introduce new SCM Commands to list and close Pipelines (was: Introduce a new SCM Command to teardown a Pipeline) > Introduce new SCM Commands to list and close Pipelines > -- > > Key: HDDS-695 > URL: https://issues.apache.org/jira/browse/HDDS-695 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Blocker > Attachments: HDDS-695-ozone-0.3.000.patch, > HDDS-695-ozone-0.3.001.patch, HDDS-695-ozone-0.3.002.patch, > HDDS-695-ozone-0.3.003.patch > > > We need to have a tear-down pipeline command in SCM so that an administrator > can close/destroy a pipeline in the cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-718) Introduce new SCM Commands to list and close Pipelines
[ https://issues.apache.org/jira/browse/HDDS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-718: - Target Version/s: 0.4.0 (was: 0.3.0) > Introduce new SCM Commands to list and close Pipelines > -- > > Key: HDDS-718 > URL: https://issues.apache.org/jira/browse/HDDS-718 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Blocker > > We need to have a tear-down pipeline command in SCM so that an administrator > can close/destroy a pipeline in the cluster. > HDDS-695 brings in the commands in branch ozone-0.3, this Jira is for porting > them to trunk. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-723) CloseContainerCommandHandler throwing NullPointerException
[ https://issues.apache.org/jira/browse/HDDS-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar reassigned HDDS-723:
--------------------------------

    Assignee: Nanda kumar

> CloseContainerCommandHandler throwing NullPointerException
> ----------------------------------------------------------
>
>                 Key: HDDS-723
>                 URL: https://issues.apache.org/jira/browse/HDDS-723
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>    Affects Versions: 0.3.0
>            Reporter: Nilotpal Nandi
>            Assignee: Nanda kumar
>            Priority: Major
>         Attachments: all-node-ozone-logs-1540356965.tar.gz
>
>
> Seeing NullPointerException error while CloseContainerCommandHandler is trying to close container.
>
> {noformat}
> 2018-10-24 04:22:04,699 INFO org.apache.ratis.server.storage.RaftLogWorker: 8a61160b-8985-412e-9f25-9e65ceafa824-RaftLogWorker got closed and hit exception
> java.io.IOException: java.lang.InterruptedException
>     at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:51)
>     at org.apache.ratis.server.storage.RaftLogWorker.flushWrites(RaftLogWorker.java:232)
>     at org.apache.ratis.server.storage.RaftLogWorker.access$600(RaftLogWorker.java:51)
>     at org.apache.ratis.server.storage.RaftLogWorker$WriteLog.execute(RaftLogWorker.java:309)
>     at org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:179)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.InterruptedException
>     at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
>     at org.apache.ratis.server.storage.RaftLogWorker.flushWrites(RaftLogWorker.java:230)
>     ... 4 more
> 2018-10-24 04:22:04,712 INFO org.apache.ratis.server.storage.RaftLogWorker: 8a61160b-8985-412e-9f25-9e65ceafa824-RaftLogWorker close()
> 2018-10-24 04:22:31,293 ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler: Can't close container 18
> java.lang.NullPointerException
>     at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(CloseContainerCommandHandler.java:78)
>     at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93)
>     at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:381)
>     at java.lang.Thread.run(Thread.java:745)
> 2018-10-24 04:22:31,293 ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler: Can't close container 10
> java.lang.NullPointerException
>     at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(CloseContainerCommandHandler.java:78)
>     at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93)
>     at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:381)
>     at java.lang.Thread.run(Thread.java:745)
> 2018-10-24 04:22:31,293 ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler: Can't close container 14
> java.lang.NullPointerException
>     at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(CloseContainerCommandHandler.java:78)
>     at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93)
>     at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:381)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
[jira] [Commented] (HDDS-692) Use the ProgressBar class in the RandomKeyGenerator freon test
[ https://issues.apache.org/jira/browse/HDDS-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661114#comment-16661114 ]

Nanda kumar commented on HDDS-692:
----------------------------------

[~horzsolt2006], thanks for working on this.

It is not a good idea to hand the actual task to the {{ProgressBar}} thread. It should work like this:
* Instantiate the ProgressBar class with a {{PrintStream}}, a {{MaxValue}} of type Long, and a {{Supplier}} function.
* ProgressBar#start: this should start the ProgressBar thread.
* ProgressBar#shutdown: this should stop the ProgressBar thread.

Apart from the {{shutdown}} method, which waits for the progress bar to complete, we should also have a {{terminate}} method that can be used when the actual job hits an exception. Upon calling {{terminate}}, the {{ProgressBar}} thread should terminate immediately.

> Use the ProgressBar class in the RandomKeyGenerator freon test
> --------------------------------------------------------------
>
>                 Key: HDDS-692
>                 URL: https://issues.apache.org/jira/browse/HDDS-692
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Elek, Marton
>            Assignee: Zsolt Horvath
>            Priority: Major
>         Attachments: HDDS-692.001.patch
>
>
> HDDS-443 provides a reusable progress bar to make it easier to add more freon tests, but the existing RandomKeyGenerator test (hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java) still doesn't use it.
> It would be good to switch to use the new progress bar there.
[jira] [Updated] (HDDS-694) Plugin new Pipeline management code in SCM
[ https://issues.apache.org/jira/browse/HDDS-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-694: - Resolution: Fixed Fix Version/s: 0.4.0 Status: Resolved (was: Patch Available) > Plugin new Pipeline management code in SCM > -- > > Key: HDDS-694 > URL: https://issues.apache.org/jira/browse/HDDS-694 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-694.001.patch, HDDS-694.002.patch, > HDDS-694.003.patch > > > This Jira aims to plugin new pipeline management code in SCM. It removes the > old pipeline related classes as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-694) Plugin new Pipeline management code in SCM
[ https://issues.apache.org/jira/browse/HDDS-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665135#comment-16665135 ] Nanda kumar commented on HDDS-694: -- [~ljain], thanks for the contribution and thanks to [~anu] for review. Committed this to trunk. > Plugin new Pipeline management code in SCM > -- > > Key: HDDS-694 > URL: https://issues.apache.org/jira/browse/HDDS-694 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-694.001.patch, HDDS-694.002.patch, > HDDS-694.003.patch > > > This Jira aims to plugin new pipeline management code in SCM. It removes the > old pipeline related classes as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-692) Use the ProgressBar class in the RandomKeyGenerator freon test
[ https://issues.apache.org/jira/browse/HDDS-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1520#comment-1520 ]

Nanda kumar commented on HDDS-692:
----------------------------------

[~horzsolt2006], sorry for the delay in response.

{quote}I'm not sure if I understand you wrt giving the actual task to the Progressbar thread.
{quote}
RandomKeyGenerator Line:253 - {{progressBar.start(task)}}: here {{task}} is the actual runnable which starts/submits the job to the ExecutorService, so we are passing the actual job to ProgressBar. In the case of {{RandomKeyGenerator}}, we use an ExecutorService to run the tasks in parallel. If a caller that doesn't use an ExecutorService uses ProgressBar, ProgressBar will be the one running the job.

{quote}In its public void start(Runnable task) the task parameter used as a functional interface, it doesn't actually start a thread..
{quote}
Actually, the {{public void start(Runnable task)}} method is the one which runs the job. It doesn't create a new Thread to run it, but runs the job in the calling Thread.
{code:java}
public void start(Runnable task) {
  startTime = System.nanoTime();
  try {
    progressBar.start();
    task.run(); // This will run the job.
  } catch (Exception e) {
    exception = true;
  }
}
{code}
We should not pass a {{Runnable}} as an argument to the {{ProgressBar}} class. ProgressBar should take a {{Supplier}} which returns a Long value.

This is how the ProgressBar APIs should look.
{code:java}
public class ProgressBar {

  /**
   * Constructs the ProgressBar instance.
   *
   * @param stream The stream to print
   * @param maxValue The max value
   * @param currentValue current value supplier
   */
  public ProgressBar(PrintStream stream, Long maxValue, Supplier<Long> currentValue) {
    ...
    // Create new progress bar task (runnable)
    ...
  }

  /**
   * Starts the ProgressBar in a new Thread.
   * This is a non blocking call.
   */
  public void start() {
    ...
    // Start the progress bar task
    ...
  }

  /**
   * Graceful shutdown, waits for the progress bar to complete.
   * This is a blocking call.
   */
  public void shutdown() {
    ...
    // Wait for the progress bar task to complete
    ...
  }

  /**
   * Terminates the progress bar.
   * This doesn't wait for the progress bar to complete.
   */
  public void terminate() {
    ...
    // Terminate the progress bar task
    ...
  }
}
{code}
{quote}Sorry for my newbie questions, I'm just getting familiar with the code now.
{quote}
No issues :) If I have confused you more, we can get on a call to discuss this.

> Use the ProgressBar class in the RandomKeyGenerator freon test
> --------------------------------------------------------------
>
>                 Key: HDDS-692
>                 URL: https://issues.apache.org/jira/browse/HDDS-692
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Elek, Marton
>            Assignee: Zsolt Horvath
>            Priority: Major
>         Attachments: HDDS-692.001.patch
>
>
> HDDS-443 provides a reusable progress bar to make it easier to add more freon tests, but the existing RandomKeyGenerator test (hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java) still doesn't use it.
> It would be good to switch to use the new progress bar there.
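For illustration, here is a minimal sketch of how the API proposed in this comment could be filled in. It is not the code from any HDDS-692 patch: the class name {{ProgressBarSketch}}, the 10 ms polling interval, the percentage output format, and the {{demo}} helper are all assumptions made for the example, and {{maxValue}} is assumed to be greater than zero. The point it tries to show is that the bar only observes progress through the supplier, while the caller keeps ownership of the job.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Illustrative sketch only; not the ProgressBar class from the HDDS-692 patch. */
class ProgressBarSketch {
  private final PrintStream stream;
  private final long maxValue; // assumed to be > 0
  private final Supplier<Long> currentValue;
  private final Thread worker;
  private volatile boolean running = true;

  ProgressBarSketch(PrintStream stream, long maxValue, Supplier<Long> currentValue) {
    this.stream = stream;
    this.maxValue = maxValue;
    this.currentValue = currentValue;
    this.worker = new Thread(this::render, "progress-bar");
    this.worker.setDaemon(true);
  }

  /** Non-blocking: starts only the rendering thread; the job runs elsewhere. */
  void start() {
    worker.start();
  }

  /** Blocking: waits until the rendering thread has finished. */
  void shutdown() {
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Asks the rendering thread to stop; does not wait for completion. */
  void terminate() {
    running = false;
    worker.interrupt();
  }

  boolean isAlive() {
    return worker.isAlive();
  }

  private void render() {
    while (running) {
      long value = Math.min(currentValue.get(), maxValue);
      stream.print("\r" + (value * 100 / maxValue) + "%");
      if (value >= maxValue) {
        stream.println();
        return; // progress is complete
      }
      try {
        Thread.sleep(10); // hypothetical refresh interval
      } catch (InterruptedException e) {
        return; // terminate() interrupts the sleep
      }
    }
  }

  /** Hypothetical usage: the caller runs the job and only exposes a counter. */
  static String demo(long max) {
    AtomicLong done = new AtomicLong();
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    ProgressBarSketch bar =
        new ProgressBarSketch(new PrintStream(buffer, true), max, done::get);
    bar.start();
    for (long i = 0; i < max; i++) {
      done.incrementAndGet(); // the job itself stays on the caller's thread
    }
    bar.shutdown();
    return buffer.toString();
  }
}
```

With this shape, a caller that hits an exception in its own job can call {{terminate}} instead of {{shutdown}}, so the bar does not keep waiting for a maximum value that will never be reached.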
[jira] [Commented] (HDDS-754) VolumeInfo#getScmUsed throws NPE
[ https://issues.apache.org/jira/browse/HDDS-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667652#comment-16667652 ]

Nanda kumar commented on HDDS-754:
----------------------------------

This looks similar to HDDS-354.

> VolumeInfo#getScmUsed throws NPE
> --------------------------------
>
>                 Key: HDDS-754
>                 URL: https://issues.apache.org/jira/browse/HDDS-754
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Mukul Kumar Singh
>            Priority: Blocker
>
> The failure can be seen at the following jenkins run
> https://builds.apache.org/job/PreCommit-HDDS-Build/1540/testReport/org.apache.hadoop.hdds.scm.pipeline/TestNodeFailure/testPipelineFail/
> {code}
> 2018-10-29 13:44:11,984 WARN concurrent.ExecutorHelper (ExecutorHelper.java:logThrowableFromAfterExecute(50)) - Execution exception when running task in Datanode ReportManager Thread - 3
> 2018-10-29 13:44:11,984 WARN concurrent.ExecutorHelper (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in thread Datanode ReportManager Thread - 3:
> java.lang.NullPointerException
>     at org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
>     at org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:379)
>     at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:225)
>     at org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
>     at org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
>     at org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
>     at org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
[jira] [Created] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
Nanda kumar created HDDS-755:
--------------------------------

             Summary: ContainerInfo and ContainerReplica protobuf changes
                 Key: HDDS-755
                 URL: https://issues.apache.org/jira/browse/HDDS-755
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
          Components: Ozone Datanode, SCM
            Reporter: Nanda kumar
            Assignee: Nanda kumar


We have different classes that maintain container-related information; we can consolidate them so that the code is easier to read.

Proposal:

In SCM: used in communication between SCM and Client, also used for storing in db
* ContainerInfoProto
* ContainerInfo

In Datanode: used in communication between Datanode and SCM
* ContainerReplicaProto
* ContainerReplica

In Datanode: used in communication between Datanode and Client
* ContainerDataProto
* ContainerData
[jira] [Updated] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
[ https://issues.apache.org/jira/browse/HDDS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-755: - Status: Patch Available (was: Open) > ContainerInfo and ContainerReplica protobuf changes > --- > > Key: HDDS-755 > URL: https://issues.apache.org/jira/browse/HDDS-755 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-755.000.patch > > > We have different classes that maintain container related information, we can > consolidate them so that it is easy to read the code. > Proposal: > In SCM: will be used in communication between SCM and Client, also used for > storing in db > * ContainerInfoProto > * ContainerInfo > > In Datanode: Used in communication between Datanode and SCM > * ContainerReplicaProto > * ContainerReplica > > In Datanode: Used in communication between Datanode and Client > * ContainerDataProto > * ContainerData -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
[ https://issues.apache.org/jira/browse/HDDS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-755: - Attachment: HDDS-755.000.patch > ContainerInfo and ContainerReplica protobuf changes > --- > > Key: HDDS-755 > URL: https://issues.apache.org/jira/browse/HDDS-755 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-755.000.patch > > > We have different classes that maintain container related information, we can > consolidate them so that it is easy to read the code. > Proposal: > In SCM: will be used in communication between SCM and Client, also used for > storing in db > * ContainerInfoProto > * ContainerInfo > > In Datanode: Used in communication between Datanode and SCM > * ContainerReplicaProto > * ContainerReplica > > In Datanode: Used in communication between Datanode and Client > * ContainerDataProto > * ContainerData -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-692) Use the ProgressBar class in the RandomKeyGenerator freon test
[ https://issues.apache.org/jira/browse/HDDS-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-692: - Component/s: Tools > Use the ProgressBar class in the RandomKeyGenerator freon test > -- > > Key: HDDS-692 > URL: https://issues.apache.org/jira/browse/HDDS-692 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Tools >Reporter: Elek, Marton >Assignee: Zsolt Horvath >Priority: Major > Attachments: HDDS-692.001.patch > > > HDDS-443 provides a reusable progress bar to make it easier to add more freon > tests, but the existing RandomKeyGenerator test > (hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java) > still doesn't use it. > It would be good to switch to use the new progress bar there. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-775) Batch updates to container db to minimize number of updates.
[ https://issues.apache.org/jira/browse/HDDS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671500#comment-16671500 ]

Nanda kumar commented on HDDS-775:
----------------------------------

Thanks [~msingh] for the patch. +1, looks good to me.

[~linyiqun], thanks for the review. Please find my responses below.

bq. Looks like this change can also be used for trunk
In trunk the whole container report processing is getting refactored as part of HDDS-737. I will upload the patch over there shortly.

bq. writeBatch operation should under lock protection. And lock operation should be moved outside loop.
I agree that this would make the code look cleaner, but having {{writeBatch}} outside of the lock will not cause any correctness issue.
* {{batch}} is a method-local variable, so there won't be any corruption here even when multiple threads are running this method.
* Since {{writeBatch}} is a RocksDB operation, we can rely on it for the correctness of the batch write.

> Batch updates to container db to minimize number of updates.
> ------------------------------------------------------------
>
>                 Key: HDDS-775
>                 URL: https://issues.apache.org/jira/browse/HDDS-775
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>    Affects Versions: 0.3.0
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>            Priority: Major
>         Attachments: HDDS-775-ozone-0.3.001.patch
>
>
> Currently while processing container reports, each report results in a put operation to the db. This can be optimized by replacing put with a batch operation.
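The argument in the comment above can be sketched roughly as follows. This is a simplified illustration, not the SCM code from the patch: {{FakeStore}} is a hypothetical in-memory stand-in for the container DB whose {{writeBatch}} is assumed to apply a batch atomically, and {{processReport}} and {{demo}} are made-up names. It shows the two points being made: each call builds its own method-local batch, and the whole report results in a single write instead of one put per container.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

/** Illustrative sketch only; FakeStore is a hypothetical stand-in for the DB. */
class BatchWriteSketch {

  /** In-memory store that is assumed to apply each batch atomically. */
  static final class FakeStore {
    private final Map<Long, String> data = new HashMap<>();
    private int writeCalls;

    synchronized void writeBatch(Map<Long, String> batch) {
      data.putAll(batch);
      writeCalls++;
    }

    synchronized int size() {
      return data.size();
    }

    synchronized int writeCalls() {
      return writeCalls;
    }
  }

  /** Collects all updates of one report locally, then writes them once. */
  static void processReport(FakeStore store, List<Long> containerIds, String state) {
    // Method-local batch: every calling thread gets its own instance,
    // so concurrent reports cannot corrupt each other's batch.
    Map<Long, String> batch = new HashMap<>();
    for (Long id : containerIds) {
      batch.put(id, state);
    }
    // One store operation per report instead of one put per container.
    store.writeBatch(batch);
  }

  /** Runs two reports concurrently; returns {stored entries, write calls}. */
  static int[] demo() {
    FakeStore store = new FakeStore();
    List<Long> first = LongStream.range(0, 100).boxed().collect(Collectors.toList());
    List<Long> second = LongStream.range(100, 200).boxed().collect(Collectors.toList());
    Thread a = new Thread(() -> processReport(store, first, "CLOSED"));
    Thread b = new Thread(() -> processReport(store, second, "CLOSED"));
    a.start();
    b.start();
    try {
      a.join();
      b.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return new int[] {store.size(), store.writeCalls()};
  }
}
```

In this sketch the store's own synchronization plays the role that the comment assigns to RocksDB; whether any extra locking is needed around the per-container bookkeeping depends on code that is not shown here.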
[jira] [Commented] (HDDS-775) Batch updates to container db to minimize number of updates.
[ https://issues.apache.org/jira/browse/HDDS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671531#comment-16671531 ] Nanda kumar commented on HDDS-775: -- Thanks [~linyiqun] for the quick response. I will commit it shortly. > Batch updates to container db to minimize number of updates. > > > Key: HDDS-775 > URL: https://issues.apache.org/jira/browse/HDDS-775 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.3.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDDS-775-ozone-0.3.001.patch > > > Currently while processing container reports, each report results in a put > operation to the db. This can be optimized by replacing put with a batch > operation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-775) Batch updates to container db to minimize number of updates.
[ https://issues.apache.org/jira/browse/HDDS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671551#comment-16671551 ] Nanda kumar commented on HDDS-775: -- Thanks [~msingh] for the contribution and [~linyiqun] for the review. Committed it to ozone-0.3 branch. > Batch updates to container db to minimize number of updates. > > > Key: HDDS-775 > URL: https://issues.apache.org/jira/browse/HDDS-775 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.3.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Fix For: 0.3.0 > > Attachments: HDDS-775-ozone-0.3.001.patch > > > Currently while processing container reports, each report results in a put > operation to the db. This can be optimized by replacing put with a batch > operation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-775) Batch updates to container db to minimize number of updates.
[ https://issues.apache.org/jira/browse/HDDS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671533#comment-16671533 ] Nanda kumar commented on HDDS-775: -- Findbug warning is not related to this patch, asflicense warnings are fixed in HDDS-777. I will fix the checkstyle issue while committing. > Batch updates to container db to minimize number of updates. > > > Key: HDDS-775 > URL: https://issues.apache.org/jira/browse/HDDS-775 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.3.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDDS-775-ozone-0.3.001.patch > > > Currently while processing container reports, each report results in a put > operation to the db. This can be optimized by replacing put with a batch > operation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-775) Batch updates to container db to minimize number of updates.
[ https://issues.apache.org/jira/browse/HDDS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-775: - Resolution: Fixed Fix Version/s: 0.3.0 Status: Resolved (was: Patch Available) > Batch updates to container db to minimize number of updates. > > > Key: HDDS-775 > URL: https://issues.apache.org/jira/browse/HDDS-775 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.3.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Fix For: 0.3.0 > > Attachments: HDDS-775-ozone-0.3.001.patch > > > Currently while processing container reports, each report results in a put > operation to the db. This can be optimized by replacing put with a batch > operation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
[ https://issues.apache.org/jira/browse/HDDS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-755: - Resolution: Fixed Fix Version/s: 0.4.0 Status: Resolved (was: Patch Available) > ContainerInfo and ContainerReplica protobuf changes > --- > > Key: HDDS-755 > URL: https://issues.apache.org/jira/browse/HDDS-755 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-755.000.patch, HDDS-755.001.patch > > > We have different classes that maintain container related information, we can > consolidate them so that it is easy to read the code. > Proposal: > In SCM: will be used in communication between SCM and Client, also used for > storing in db > * ContainerInfoProto > * ContainerInfo > > In Datanode: Used in communication between Datanode and SCM > * ContainerReplicaProto > * ContainerReplica > > In Datanode: Used in communication between Datanode and Client > * ContainerDataProto > * ContainerData -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
[ https://issues.apache.org/jira/browse/HDDS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669611#comment-16669611 ] Nanda kumar commented on HDDS-755: -- Thanks [~linyiqun] and [~msingh] for the review. I have committed this to trunk. > ContainerInfo and ContainerReplica protobuf changes > --- > > Key: HDDS-755 > URL: https://issues.apache.org/jira/browse/HDDS-755 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-755.000.patch, HDDS-755.001.patch > > > We have different classes that maintain container related information, we can > consolidate them so that it is easy to read the code. > Proposal: > In SCM: will be used in communication between SCM and Client, also used for > storing in db > * ContainerInfoProto > * ContainerInfo > > In Datanode: Used in communication between Datanode and SCM > * ContainerReplicaProto > * ContainerReplica > > In Datanode: Used in communication between Datanode and Client > * ContainerDataProto > * ContainerData -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
[ https://issues.apache.org/jira/browse/HDDS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669606#comment-16669606 ] Nanda kumar commented on HDDS-755: -- [~linyiqun], will take care of the checkstyle issues while committing. > ContainerInfo and ContainerReplica protobuf changes > --- > > Key: HDDS-755 > URL: https://issues.apache.org/jira/browse/HDDS-755 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-755.000.patch, HDDS-755.001.patch > > > We have different classes that maintain container related information, we can > consolidate them so that it is easy to read the code. > Proposal: > In SCM: will be used in communication between SCM and Client, also used for > storing in db > * ContainerInfoProto > * ContainerInfo > > In Datanode: Used in communication between Datanode and SCM > * ContainerReplicaProto > * ContainerReplica > > In Datanode: Used in communication between Datanode and Client > * ContainerDataProto > * ContainerData -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
[ https://issues.apache.org/jira/browse/HDDS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-755: - Attachment: HDDS-755.001.patch > ContainerInfo and ContainerReplica protobuf changes > --- > > Key: HDDS-755 > URL: https://issues.apache.org/jira/browse/HDDS-755 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-755.000.patch, HDDS-755.001.patch > > > We have different classes that maintain container related information, we can > consolidate them so that it is easy to read the code. > Proposal: > In SCM: will be used in communication between SCM and Client, also used for > storing in db > * ContainerInfoProto > * ContainerInfo > > In Datanode: Used in communication between Datanode and SCM > * ContainerReplicaProto > * ContainerReplica > > In Datanode: Used in communication between Datanode and Client > * ContainerDataProto > * ContainerData -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-755) ContainerInfo and ContainerReplica protobuf changes
[ https://issues.apache.org/jira/browse/HDDS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668571#comment-16668571 ] Nanda kumar commented on HDDS-755: --

[~linyiqun], thanks for the review.

{quote}Compared with original logic, we introduce the new state QUASI_CLOSED, is intended change? {quote}
Yes, the QUASI_CLOSED state will be used when there is no pipeline and we want to close the container. I plan to file follow-up jiras which will use this state.

{quote}Can we reuse State definition like before? And not define the same State both in ContainerReplicaProto and ContainerDataProto. {quote}
The reason for duplicating it is that protobuf doesn't allow the same constant name to be used in different enums that share a scope in the same proto file. We already have {{OPEN}}, {{CLOSING}}, {{CLOSED}}, etc. in {{LifeCycleState}}, so we cannot have another enum in Hdds.proto which has these values. There is also a plan to simplify the Container and Pipeline states in SCM. This will bring changes to the {{LifeCycleState}} and {{LifeCycleEvent}} enums in {{Hdds.proto}}. HDDS-735 and follow-up jiras will bring those changes.

{quote}I mean we won't throw error for the case of default case after this change. Maybe we should add the state check. {quote}
Actually, if there is no corresponding value in the enum, calling {{valueOf}} throws {{java.lang.IllegalArgumentException: No enum constant ...}}. I agree that an exception with a custom message makes more sense, so I changed it back to the older format. I also fixed the related test failures in patch v001.
> ContainerInfo and ContainerReplica protobuf changes > --- > > Key: HDDS-755 > URL: https://issues.apache.org/jira/browse/HDDS-755 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-755.000.patch, HDDS-755.001.patch > > > We have different classes that maintain container related information, we can > consolidate them so that it is easy to read the code. > Proposal: > In SCM: will be used in communication between SCM and Client, also used for > storing in db > * ContainerInfoProto > * ContainerInfo > > In Datanode: Used in communication between Datanode and SCM > * ContainerReplicaProto > * ContainerReplica > > In Datanode: Used in communication between Datanode and Client > * ContainerDataProto > * ContainerData
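The {{valueOf}} behaviour discussed in the HDDS-755 comment above can be illustrated with a small sketch. It is not taken from the patch: the enum {{ReplicaState}}, the helper {{ReplicaStates.parse}} and the message text are hypothetical names chosen for illustration. It only shows the standard Java behaviour that {{Enum.valueOf}} throws {{IllegalArgumentException}} for an unknown name, and one way to rethrow it with a custom message.

```java
/** Hypothetical stand-in for a replica state enum; not the real HDDS class. */
enum ReplicaState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

/** Illustrative helper; the class name and message text are assumptions. */
final class ReplicaStates {
  private ReplicaStates() { }

  /**
   * Parses a state name. Enum.valueOf already throws
   * IllegalArgumentException for an unknown name; this wrapper
   * replaces the generic JDK message with a domain-specific one.
   */
  static ReplicaState parse(String name) {
    try {
      return ReplicaState.valueOf(name);
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Unknown container replica state: " + name, e);
    }
  }
}
```

Keeping the original exception as the cause preserves the JDK's own message for debugging while giving callers a message that names the domain concept.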
[jira] [Commented] (HDDS-762) Fix unit test failure for TestContainerSQLCli & TestCSMMetrics
[ https://issues.apache.org/jira/browse/HDDS-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668588#comment-16668588 ] Nanda kumar commented on HDDS-762: -- Thanks for the patch [~msingh]. +1, pending Jenkins. > Fix unit test failure for TestContainerSQLCli & TestCSMMetrics > -- > > Key: HDDS-762 > URL: https://issues.apache.org/jira/browse/HDDS-762 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDDS-762.001.patch > > > TestContainerSQLCli & TestCSMMetrics are currently failing consistently > because of a mismatch in metrics register name.
[jira] [Commented] (HDDS-694) Plugin new Pipeline management code in SCM
[ https://issues.apache.org/jira/browse/HDDS-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663778#comment-16663778 ] Nanda kumar commented on HDDS-694: -- [~ljain], thanks for updating the patch. +1, pending Jenkins. > Plugin new Pipeline management code in SCM > -- > > Key: HDDS-694 > URL: https://issues.apache.org/jira/browse/HDDS-694 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-694.001.patch, HDDS-694.002.patch, > HDDS-694.003.patch > > > This Jira aims to plugin new pipeline management code in SCM. It removes the > old pipeline related classes as well.
[jira] [Created] (HDDS-737) Introduce Incremental Container Report
Nanda kumar created HDDS-737: Summary: Introduce Incremental Container Report Key: HDDS-737 URL: https://issues.apache.org/jira/browse/HDDS-737 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode, SCM Reporter: Nanda kumar Assignee: Nanda kumar

We will use an Incremental Container Report (ICR) to inform SCM immediately when the state of a container changes in a datanode. This ensures that SCM is updated as soon as the state of a container changes and doesn’t have to wait for the full container report.

*When do we send ICR?*
* When a container replica state changes from open/closing to closed
* When a container replica state changes from open/closing to quasi closed
* When a container replica state changes from quasi closed to closed
* When a container replica is deleted in datanode
* When a container replica is copied from another datanode
* When a container replica is discovered to be corrupted
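The trigger conditions listed in the HDDS-737 description can be written down as a small predicate. This is only a sketch of the rules as stated there: the class {{IcrPolicy}}, its enums and its method names are hypothetical and are not part of the HDDS code base.

```java
/** Illustrative encoding of the ICR trigger rules; all names are assumptions. */
final class IcrPolicy {
  enum State { OPEN, CLOSING, QUASI_CLOSED, CLOSED }
  enum Event { DELETED, COPIED_FROM_PEER, FOUND_CORRUPT }

  private IcrPolicy() { }

  /** True for the three state transitions listed in the description. */
  static boolean shouldSendIcr(State from, State to) {
    boolean wasOpenOrClosing = from == State.OPEN || from == State.CLOSING;
    if (wasOpenOrClosing
        && (to == State.CLOSED || to == State.QUASI_CLOSED)) {
      return true;
    }
    return from == State.QUASI_CLOSED && to == State.CLOSED;
  }

  /** Deletion, copy and corruption always produce a report in this sketch. */
  static boolean shouldSendIcr(Event event) {
    return event != null;
  }
}
```

A transition such as open to closing is deliberately not reported here, because the description does not list it.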
[jira] [Created] (HDDS-738) Removing REST protocol support from OzoneClient
Nanda kumar created HDDS-738: Summary: Removing REST protocol support from OzoneClient Key: HDDS-738 URL: https://issues.apache.org/jira/browse/HDDS-738 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Client Reporter: Nanda kumar

Since we have a functional {{S3Gateway}} for Ozone which works over the REST protocol, REST protocol support in OzoneClient feels redundant, and keeping it up to date will take a lot of effort. As S3Gateway is in a functional state now, I propose to remove REST protocol support from OzoneClient.

Once we remove REST support from OzoneClient, the following will be the interfaces to access an Ozone cluster:
* OzoneClient (RPC Protocol)
* OzoneFS (RPC Protocol)
* S3Gateway (REST Protocol)
[jira] [Commented] (HDDS-728) Datanodes are going to dead state after some interval
[ https://issues.apache.org/jira/browse/HDDS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663763#comment-16663763 ] Nanda kumar commented on HDDS-728: -- [~msingh], thanks for working on this. Overall the patch looks good to me; some minor comments: In XceiverServerRatis we don't need to maintain {{stateMachineMap}}; RaftServerProxy already has a map for this, and the entry is removed from that map whenever we do a group remove. In MiniOzoneClusterImpl, do we need this change? We can always wait for the datanode to get ready whenever we do a datanode restart. > Datanodes are going to dead state after some interval > - > > Key: HDDS-728 > URL: https://issues.apache.org/jira/browse/HDDS-728 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Affects Versions: 0.3.0 >Reporter: Soumitra Sulav >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDDS-728.001.patch, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-03.hwx.site.log, > hadoop-root-om-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-scm-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > om-audit-ctr-e138-1518143905142-541600-02-02.hwx.site.log > > > Setup a 5 datanode ozone cluster with HDP on top of it. > After restarting all HDP services few times encountered below issue which is > making the HDP services to fail. > Same exception was observed in an old setup but I thought it could have been > issue with the setup but now encountered the same issue in new setup as well. > {code:java} > 2018-10-24 10:42:03,308 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. 
> at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256) > at > org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411) > at > org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54) > at > org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319) > at > org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2018-10-24 10:42:03,342 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. 
> at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256) > at > org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411) > at > org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54) > at > org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319) > at > org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at >
[jira] [Commented] (HDDS-744) Fix ASF license warning in PipelineNotFoundException class
[ https://issues.apache.org/jira/browse/HDDS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1036#comment-1036 ] Nanda kumar commented on HDDS-744: -- Thank you for your contribution [~ljain]. Committed it to trunk. > Fix ASF license warning in PipelineNotFoundException class > -- > > Key: HDDS-744 > URL: https://issues.apache.org/jira/browse/HDDS-744 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Trivial > Attachments: HDDS-744.001.patch
[jira] [Updated] (HDDS-744) Fix ASF license warning in PipelineNotFoundException class
[ https://issues.apache.org/jira/browse/HDDS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-744: - Resolution: Fixed Fix Version/s: 0.4.0 Status: Resolved (was: Patch Available) > Fix ASF license warning in PipelineNotFoundException class > -- > > Key: HDDS-744 > URL: https://issues.apache.org/jira/browse/HDDS-744 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Trivial > Fix For: 0.4.0 > > Attachments: HDDS-744.001.patch
[jira] [Commented] (HDDS-692) Use the ProgressBar class in the RandomKeyGenerator freon test
[ https://issues.apache.org/jira/browse/HDDS-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665960#comment-16665960 ] Nanda kumar commented on HDDS-692: -- [~horzsolt2006], sorry for the delay. Got stuck with other tasks, I will try to respond to your doubts/questions by tomorrow. > Use the ProgressBar class in the RandomKeyGenerator freon test > -- > > Key: HDDS-692 > URL: https://issues.apache.org/jira/browse/HDDS-692 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Elek, Marton >Assignee: Zsolt Horvath >Priority: Major > Attachments: HDDS-692.001.patch > > > HDDS-443 provides a reusable progress bar to make it easier to add more freon > tests, but the existing RandomKeyGenerator test > (hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java) > still doesn't use it. > It would be good to switch to use the new progress bar there.
[jira] [Commented] (HDDS-744) Fix ASF license warning in PipelineNotFoundException class
[ https://issues.apache.org/jira/browse/HDDS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1035#comment-1035 ] Nanda kumar commented on HDDS-744: -- [~ljain], thanks for taking care of this. I will commit it shortly. > Fix ASF license warning in PipelineNotFoundException class > -- > > Key: HDDS-744 > URL: https://issues.apache.org/jira/browse/HDDS-744 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Trivial > Attachments: HDDS-744.001.patch
[jira] [Created] (HDDS-801) Quasi close the container when close is not executed via Ratis
Nanda kumar created HDDS-801: Summary: Quasi close the container when close is not executed via Ratis Key: HDDS-801 URL: https://issues.apache.org/jira/browse/HDDS-801 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Affects Versions: 0.3.0 Reporter: Nanda kumar Assignee: Nanda kumar

When a datanode receives a CloseContainerCommand and the replication type is not RATIS, we should quasi-close the container. After quasi-closing the container, an ICR has to be sent to SCM.
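The rule in the HDDS-801 description could look roughly like the following. This is a simplified sketch, not the patch: {{CloseCommandSketch}}, its enums and the string format of the queued report are hypothetical, and it assumes that a close driven through Ratis ends in CLOSED while every other replication type ends in QUASI_CLOSED.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative close handling on a datanode; all names are assumptions. */
final class CloseCommandSketch {
  enum ReplicationType { RATIS, STAND_ALONE }
  enum State { QUASI_CLOSED, CLOSED }

  /** A plain list stands in for the real queue of reports bound for SCM. */
  private final List<String> pendingIcrs = new ArrayList<>();

  /** Picks the resulting state and queues an ICR describing it. */
  State handleClose(long containerId, ReplicationType type) {
    // Assumption: only a close executed via Ratis may end in CLOSED.
    State result = (type == ReplicationType.RATIS)
        ? State.CLOSED
        : State.QUASI_CLOSED;
    pendingIcrs.add(containerId + ":" + result);
    return result;
  }

  /** Reports queued so far, oldest first. */
  List<String> pendingIcrs() {
    return pendingIcrs;
  }
}
```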
[jira] [Commented] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674422#comment-16674422 ] Nanda kumar commented on HDDS-737: --

Thanks [~linyiqun] for taking a look at the patch.

{quote}could you give a summary note of the change {quote}
Sure.
* The main change in this patch is to introduce a way to send an ICR immediately. This is done by introducing a {{triggerHeartbeat}} method in {{DatanodeStateMachine}}, which immediately triggers a heartbeat to SCM that also includes the reports which are ready to send. So, to send an ICR immediately, we have to 1) add the ICR (Container Report) to {{StateContext}} and 2) call the {{triggerHeartbeat}} method.
* Since we have ICR in place, we don't need to send command status for {{CloseContainerCommand}}. (We eventually want to remove the command status for all the commands.) This patch removes the command status logic for {{CloseContainerCommand}} and also removes the command watcher (CloseContainerWatcher) for {{CloseContainerCommand}} in SCM.
* Added IncrementalContainerReportHandler in SCM. (It is not complete yet; added a TODO. Needs follow-up jiras.)
* Processing of container reports was previously done by {{NodeManager}} and {{ContainerManager}}. This logic is moved to {{ContainerReportHandler}}. (There is a TODO which needs a follow-up jira.)
* A few more refactorings, like removing the Node2Container class and moving that data structure to {{NodeStateMap}}.

I wanted to cover all the scenarios in this jira itself, but the patch has already become huge and would be very difficult to review. I will start filing follow-up jiras.

{quote}Current change has addressed all the points (When do we send ICR) mentioned in JIRA's description? {quote}
Only one scenario is handled in this jira:
* When a container replica state changes from open/closing to closed

We still don't have code to QUASI_CLOSE a container in the datanode; we should trigger an ICR when we do that (HDDS-801). The same applies to the deleted, copied and corrupted cases. I will start filing jiras so that we can keep track of them.

Thanks a lot for spending your time reviewing the patch.

> Introduce Incremental Container Report > -- > > Key: HDDS-737 > URL: https://issues.apache.org/jira/browse/HDDS-737 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-737.000.patch > > > We will use Incremental Container Report (ICR) to immediately inform SCM when > there is some state change to the container in datanode. This will make sure > that SCM is updated as soon as the state of a container changes and doesn’t > have to wait for full container report. > *When do we send ICR?* > * When a container replica state changes from open/closing to closed > * When a container replica state changes from open/closing to quasi closed > * When a container replica state changes from quasi closed to closed > * When a container replica is deleted in datanode > * When a container replica is copied from another datanode > * When a container replica is discovered to be corrupted
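The two-step sequence described in the HDDS-737 comment above (queue the report, then trigger a heartbeat that carries every report that is ready) can be modelled in a few lines. This is a toy model under stated assumptions, not the patch itself: {{HeartbeatSketch}} folds the roles of {{StateContext}} and {{DatanodeStateMachine}} into one hypothetical class, reports are plain strings, and "sending" just records the payload in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Toy model of "add report, then trigger heartbeat"; names are assumptions. */
final class HeartbeatSketch {
  /** Reports that are ready to go out with the next heartbeat. */
  private final Queue<String> readyReports = new ConcurrentLinkedQueue<>();
  /** Payload of every heartbeat sent so far; stands in for the RPC to SCM. */
  private final List<List<String>> sentHeartbeats = new ArrayList<>();

  /** Step 1: queue a report (for example an ICR). */
  void addReport(String report) {
    readyReports.add(report);
  }

  /**
   * Step 2: send a heartbeat now instead of waiting for the next
   * periodic one. It drains and carries all reports queued so far.
   */
  synchronized void triggerHeartbeat() {
    List<String> payload = new ArrayList<>();
    String report;
    while ((report = readyReports.poll()) != null) {
      payload.add(report);
    }
    sentHeartbeats.add(payload);
  }

  /** Heartbeat payloads in the order they were sent. */
  synchronized List<List<String>> sentHeartbeats() {
    return sentHeartbeats;
  }
}
```

In a real datanode the periodic heartbeat timer would presumably run the same drain-and-send step, so a triggered heartbeat differs only in when it fires.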
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Attachment: HDDS-801.000.patch > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM.
[jira] [Commented] (HDDS-117) Wrapper for set/get Standalone, Ratis and Rest Ports in DatanodeDetails.
[ https://issues.apache.org/jira/browse/HDDS-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672060#comment-16672060 ] Nanda kumar commented on HDDS-117: -- [~haridas124], I have made you a contributor to the HDDS project. From now on you should be able to assign HDDS jiras to yourself. Welcome to Ozone! > Wrapper for set/get Standalone, Ratis and Rest Ports in DatanodeDetails. > > > Key: HDDS-117 > URL: https://issues.apache.org/jira/browse/HDDS-117 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Nanda kumar >Priority: Major > Labels: newbie > > It will be very helpful to have a wrapper for set/get Standalone, Ratis and > Rest Ports in DatanodeDetails. > Search and Replace usage of DatanodeDetails#newPort directly in current code.
[jira] [Assigned] (HDDS-117) Wrapper for set/get Standalone, Ratis and Rest Ports in DatanodeDetails.
[ https://issues.apache.org/jira/browse/HDDS-117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar reassigned HDDS-117: Assignee: Haridas Kandath > Wrapper for set/get Standalone, Ratis and Rest Ports in DatanodeDetails. > > > Key: HDDS-117 > URL: https://issues.apache.org/jira/browse/HDDS-117 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Nanda kumar >Assignee: Haridas Kandath >Priority: Major > Labels: newbie > > It will be very helpful to have a wrapper for set/get Standalone, Ratis and > Rest Ports in DatanodeDetails. > Search and Replace usage of DatanodeDetails#newPort directly in current code.
[jira] [Updated] (HDDS-738) Removing REST protocol support from OzoneClient
[ https://issues.apache.org/jira/browse/HDDS-738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-738: - Target Version/s: 0.5.0 (was: 0.4.0) > Removing REST protocol support from OzoneClient > --- > > Key: HDDS-738 > URL: https://issues.apache.org/jira/browse/HDDS-738 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Client >Reporter: Nanda kumar >Assignee: chencan >Priority: Major > > Since we have functional {{S3Gateway}} for Ozone which works on REST > protocol, having REST protocol support in OzoneClient feels redundant and it > will take a lot of effort to maintain it up to date. > As S3Gateway is in a functional state now, I propose to remove REST protocol > support from OzoneClient. > Once we remove REST support from OzoneClient, the following will be the > interface to access Ozone cluster > * OzoneClient (RPC Protocol) > * OzoneFS (RPC Protocol) > * S3Gateway (REST Protocol)
[jira] [Assigned] (HDDS-738) Removing REST protocol support from OzoneClient
[ https://issues.apache.org/jira/browse/HDDS-738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar reassigned HDDS-738: Assignee: (was: chencan) > Removing REST protocol support from OzoneClient > --- > > Key: HDDS-738 > URL: https://issues.apache.org/jira/browse/HDDS-738 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Client >Reporter: Nanda kumar >Priority: Major > > Since we have functional {{S3Gateway}} for Ozone which works on REST > protocol, having REST protocol support in OzoneClient feels redundant and it > will take a lot of effort to maintain it up to date. > As S3Gateway is in a functional state now, I propose to remove REST protocol > support from OzoneClient. > Once we remove REST support from OzoneClient, the following will be the > interface to access Ozone cluster > * OzoneClient (RPC Protocol) > * OzoneFS (RPC Protocol) > * S3Gateway (REST Protocol)
[jira] [Commented] (HDDS-738) Removing REST protocol support from OzoneClient
[ https://issues.apache.org/jira/browse/HDDS-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664995#comment-16664995 ] Nanda kumar commented on HDDS-738: -- This jira is for discussion and it will act as an umbrella jira for removing REST protocol support from OzoneClient. > Removing REST protocol support from OzoneClient > --- > > Key: HDDS-738 > URL: https://issues.apache.org/jira/browse/HDDS-738 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Client >Reporter: Nanda kumar >Assignee: chencan >Priority: Major > > Since we have functional {{S3Gateway}} for Ozone which works on REST > protocol, having REST protocol support in OzoneClient feels redundant and it > will take a lot of effort to maintain it up to date. > As S3Gateway is in a functional state now, I propose to remove REST protocol > support from OzoneClient. > Once we remove REST support from OzoneClient, the following will be the > interface to access Ozone cluster > * OzoneClient (RPC Protocol) > * OzoneFS (RPC Protocol) > * S3Gateway (REST Protocol)
[jira] [Assigned] (HDDS-618) Separate DN registration from Heartbeat
[ https://issues.apache.org/jira/browse/HDDS-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar reassigned HDDS-618: Assignee: (was: Nanda kumar) > Separate DN registration from Heartbeat > --- > > Key: HDDS-618 > URL: https://issues.apache.org/jira/browse/HDDS-618 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Hanisha Koneru >Priority: Major > > Currently, if SCM has to send ReRegister command to a DN, it can only do so > through heartbeat response. Due to this, DN reregistration can take up to 2 > heartbeat intervals. > We should decouple registration requests from heartbeat, so that DN can > reregister as soon as SCM detects that the node is not registered.
[jira] [Commented] (HDDS-728) Datanodes should use different ContainerStateMachine for each pipeline.
[ https://issues.apache.org/jira/browse/HDDS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667095#comment-16667095 ] Nanda kumar commented on HDDS-728: -- Thanks [~msingh] for the patches. +1 on [^HDDS-728.012.patch] and [^HDDS-728-ozone-0.3.005.patch], pending Jenkins. Tested them locally. > Datanodes should use different ContainerStateMachine for each pipeline. > --- > > Key: HDDS-728 > URL: https://issues.apache.org/jira/browse/HDDS-728 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Affects Versions: 0.3.0 >Reporter: Soumitra Sulav >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDDS-728-ozone-0.3.005.patch, HDDS-728.001.patch, > HDDS-728.002.patch, HDDS-728.003.patch, HDDS-728.004.patch, > HDDS-728.005.patch, HDDS-728.006.patch, HDDS-728.007.patch, > HDDS-728.008.patch, HDDS-728.009.patch, HDDS-728.010.patch, > HDDS-728.011.patch, HDDS-728.012.patch, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-03.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-08.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-09.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-10.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-04.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-05.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-06.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-07.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-08.hwx.site.log, > hadoop-root-om-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-scm-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > om-audit-ctr-e138-1518143905142-541600-02-02.hwx.site.log > > > Setup a 5 datanode ozone cluster with HDP on top of it. 
> After restarting all HDP services few times encountered below issue which is > making the HDP services to fail. > Same exception was observed in an old setup but I thought it could have been > issue with the setup but now encountered the same issue in new setup as well. > {code:java} > 2018-10-24 10:42:03,308 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256) > at > org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411) > at > org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54) > at > org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319) > at > org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2018-10-24 10:42:03,342 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252) > at >
[jira] [Updated] (HDDS-728) Datanodes should use different ContainerStateMachine for each pipeline.
[ https://issues.apache.org/jira/browse/HDDS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-728: - Resolution: Fixed Fix Version/s: 0.4.0 0.3.0 Status: Resolved (was: Patch Available) > Datanodes should use different ContainerStateMachine for each pipeline. > --- > > Key: HDDS-728 > URL: https://issues.apache.org/jira/browse/HDDS-728 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Affects Versions: 0.3.0 >Reporter: Soumitra Sulav >Assignee: Mukul Kumar Singh >Priority: Major > Fix For: 0.3.0, 0.4.0 > > Attachments: HDDS-728-ozone-0.3.005.patch, HDDS-728.001.patch, > HDDS-728.002.patch, HDDS-728.003.patch, HDDS-728.004.patch, > HDDS-728.005.patch, HDDS-728.006.patch, HDDS-728.007.patch, > HDDS-728.008.patch, HDDS-728.009.patch, HDDS-728.010.patch, > HDDS-728.011.patch, HDDS-728.012.patch, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-03.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-08.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-09.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-10.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-04.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-05.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-06.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-07.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-08.hwx.site.log, > hadoop-root-om-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-scm-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > om-audit-ctr-e138-1518143905142-541600-02-02.hwx.site.log > > > Setup a 5 datanode ozone cluster with HDP on top of it. > After restarting all HDP services few times encountered below issue which is > making the HDP services to fail. 
> Same exception was observed in an old setup but I thought it could have been > issue with the setup but now encountered the same issue in new setup as well. > {code:java} > 2018-10-24 10:42:03,308 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256) > at > org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411) > at > org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54) > at > org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319) > at > org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2018-10-24 10:42:03,342 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252) > at >
[jira] [Commented] (HDDS-728) Datanodes should use different ContainerStateMachine for each pipeline.
[ https://issues.apache.org/jira/browse/HDDS-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667295#comment-16667295 ] Nanda kumar commented on HDDS-728: -- [~msingh], thanks for the contribution. Thanks to [~ssulav] for reporting and testing it and thanks to [~shashikant] and [~anu] for the review. I committed it to trunk and ozone-0.3 branch. > Datanodes should use different ContainerStateMachine for each pipeline. > --- > > Key: HDDS-728 > URL: https://issues.apache.org/jira/browse/HDDS-728 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Affects Versions: 0.3.0 >Reporter: Soumitra Sulav >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDDS-728-ozone-0.3.005.patch, HDDS-728.001.patch, > HDDS-728.002.patch, HDDS-728.003.patch, HDDS-728.004.patch, > HDDS-728.005.patch, HDDS-728.006.patch, HDDS-728.007.patch, > HDDS-728.008.patch, HDDS-728.009.patch, HDDS-728.010.patch, > HDDS-728.011.patch, HDDS-728.012.patch, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-03.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-08.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-09.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-541600-02-10.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-04.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-05.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-06.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-07.hwx.site.log, > hadoop-root-datanode-ctr-e138-1518143905142-552728-01-08.hwx.site.log, > hadoop-root-om-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > hadoop-root-scm-ctr-e138-1518143905142-541600-02-02.hwx.site.log, > om-audit-ctr-e138-1518143905142-541600-02-02.hwx.site.log > > > Setup a 5 datanode ozone cluster with HDP on top of it. 
> After restarting all HDP services few times encountered below issue which is > making the HDP services to fail. > Same exception was observed in an old setup but I thought it could have been > issue with the setup but now encountered the same issue in new setup as well. > {code:java} > 2018-10-24 10:42:03,308 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 1672d28e-800f-4318-895b-1648976acff6->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at > org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:252) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:256) > at > org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:411) > at > org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54) > at > org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319) > at > org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2018-10-24 10:42:03,342 WARN > org.apache.ratis.grpc.server.GrpcServerProtocolService: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: Failed requestVote > 7839294e-5657-447f-b320-6b390fffb963->2974da2b-e765-43f9-8d30-45fe40dcb9ab#0 > org.apache.ratis.protocol.GroupMismatchException: > 2974da2b-e765-43f9-8d30-45fe40dcb9ab: group-CE87A994686F not found. > at > org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114) > at >
[jira] [Created] (HDDS-733) Create container if not exist, as part of chunk write
Nanda kumar created HDDS-733:
--------------------------------

Summary: Create container if not exist, as part of chunk write
Key: HDDS-733
URL: https://issues.apache.org/jira/browse/HDDS-733
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Datanode
Reporter: Nanda kumar

The current implementation requires a container to be created in datanode before starting the chunk write. This can be optimized by creating the container on the first chunk write.
During chunk write, if the container is missing, we can go ahead and create the container.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
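The create-on-first-write idea proposed in HDDS-733 can be sketched roughly as follows. This is a toy in-memory model, not the Ozone datanode code: `ToyDatanode`, `writeChunk`, `readChunks` and `hasContainer` are hypothetical names chosen for illustration, and the error on the read path only mimics the CONTAINER_NOT_FOUND behaviour discussed in the review comments.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory datanode; hypothetical, not the Ozone implementation. */
final class ToyDatanode {
  // containerId -> chunks written to that container
  private final Map<Long, List<String>> containers = new HashMap<>();

  /** Writes a chunk, creating the container first if it is missing. */
  void writeChunk(long containerId, String chunk) {
    containers.computeIfAbsent(containerId, id -> new ArrayList<>()).add(chunk);
  }

  /** Reads never create a container; a missing container is an error. */
  List<String> readChunks(long containerId) {
    List<String> chunks = containers.get(containerId);
    if (chunks == null) {
      throw new IllegalStateException("CONTAINER_NOT_FOUND: " + containerId);
    }
    return chunks;
  }

  /** Reports whether the container has been created on this datanode. */
  boolean hasContainer(long containerId) {
    return containers.containsKey(containerId);
  }
}
```

In this sketch only the write path may create a container, so a client no longer needs a separate create step before its first write.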
[jira] [Created] (HDDS-735) Remove ALLOCATED and CREATING state from ContainerStateManager
Nanda kumar created HDDS-735:
--------------------------------

Summary: Remove ALLOCATED and CREATING state from ContainerStateManager
Key: HDDS-735
URL: https://issues.apache.org/jira/browse/HDDS-735
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Nanda kumar

After HDDS-733 and HDDS-734, we don't need the ALLOCATED and CREATING states for containers in SCM. The container will move to the OPEN state as soon as it is allocated in SCM. Since container creation happens as part of the first chunk write, and the container creation operation in the datanode is idempotent, we don't have to worry about giving out the same container to multiple clients as soon as it is allocated.
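The argument above rests on container creation being idempotent. A minimal sketch of that property, assuming a hypothetical `IdempotentContainerRegistry` (none of these names come from Ozone), could look like this:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical registry illustrating idempotent container creation. */
final class IdempotentContainerRegistry {
  private final ConcurrentMap<Long, String> containers =
      new ConcurrentHashMap<>();
  // Counts how many times a container was actually created.
  private final AtomicInteger creations = new AtomicInteger();

  /**
   * Creates the container if it is absent and returns it. Repeated or
   * concurrent calls for the same id perform the creation at most once.
   */
  String createIfAbsent(long containerId) {
    return containers.computeIfAbsent(containerId, id -> {
      creations.incrementAndGet();
      return "container-" + id;
    });
  }

  /** Number of real creations performed so far. */
  int creationCount() {
    return creations.get();
  }
}
```

Because a second `createIfAbsent` call for the same id is a no-op, several clients holding the same freshly allocated container can all issue their first write without coordinating who creates it.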
[jira] [Commented] (HDDS-694) Plugin new Pipeline management code in SCM
[ https://issues.apache.org/jira/browse/HDDS-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663442#comment-16663442 ]

Nanda kumar commented on HDDS-694:
----------------------------------

[~ljain], the patch is not applying anymore. Can you rebase it on top of the latest changes? Also, can you please fix the checkstyle issues?

A couple of very minor comments:

Pipeline.java
Line:107 We can avoid creating a new ArrayList object by not calling {{getNodes}}. {{getNodes().get(0)}} can be replaced with {{nodeStatus.keySet().iterator().next()}}.

RatisPipelineProvider.java
Line:129 The state should be {{OPEN}}.

> Plugin new Pipeline management code in SCM
> ------------------------------------------
>
> Key: HDDS-694
> URL: https://issues.apache.org/jira/browse/HDDS-694
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Attachments: HDDS-694.001.patch, HDDS-694.002.patch
>
>
> This Jira aims to plug in the new pipeline management code in SCM. It removes the
> old pipeline related classes as well.
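The first review comment can be illustrated with a small stand-in class. `ToyPipeline` is hypothetical; in particular, the type of `nodeStatus` (an insertion-ordered map from node name to a status value) is an assumption made only so that the sketch compiles, and it is not taken from Pipeline.java.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a pipeline that keeps a node-to-status map. */
final class ToyPipeline {
  // Insertion-ordered so that "first node" is well defined in this sketch.
  private final Map<String, Long> nodeStatus = new LinkedHashMap<>();

  ToyPipeline(List<String> nodes) {
    for (String node : nodes) {
      nodeStatus.put(node, 0L);
    }
  }

  /** Returns a defensive copy, allocating a new list on every call. */
  List<String> getNodes() {
    return new ArrayList<>(nodeStatus.keySet());
  }

  /** Copies the whole key set just to read one element. */
  String firstNodeViaCopy() {
    return getNodes().get(0);
  }

  /** Reads the first key directly, with no intermediate list. */
  String firstNodeDirect() {
    return nodeStatus.keySet().iterator().next();
  }
}
```

Both accessors return the same node; they differ only in the allocation and in the exception raised on an empty pipeline (`IndexOutOfBoundsException` versus `NoSuchElementException`).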
[jira] [Created] (HDDS-734) Remove create container logic from OzoneClient
Nanda kumar created HDDS-734:
--------------------------------

Summary: Remove create container logic from OzoneClient
Key: HDDS-734
URL: https://issues.apache.org/jira/browse/HDDS-734
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Client
Reporter: Nanda kumar

After HDDS-733, the container will be created as part of the first chunk write, so we don't need explicit container creation code in {{OzoneClient}} anymore.
[jira] [Commented] (HDDS-733) Create container if not exist, as part of chunk write
[ https://issues.apache.org/jira/browse/HDDS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675599#comment-16675599 ]

Nanda kumar commented on HDDS-733:
----------------------------------

[~ljain], thanks for working on this. The patch looks good to me; a couple of minor comments:

In SCMChillModeManager#ContainerChillModeRule we should exclude containers that are in OPEN state from being added to containerMap. With this change, containers in OPEN state might not have been created in the datanode yet. ChillModeManager should track pipelines in the cluster for containers in OPEN state.

There are unused imports in SCMContainerManager and TestDeadNodeHandler.

As [~linyiqun] pointed out, we can add a test case to make sure that we don't create containers for ReadChunk.
{quote}Send ReadChunk request before WriteChunk request and verify the StorageContainerException of CONTAINER_NOT_FOUND.
{quote}

It looks like the following tests are failing with this patch; can you take a look?
* org.apache.hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient
* org.apache.hadoop.ozone.freon.TestFreonWithDatanodeRestart

> Create container if not exist, as part of chunk write
> -----------------------------------------------------
>
> Key: HDDS-733
> URL: https://issues.apache.org/jira/browse/HDDS-733
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Reporter: Nanda kumar
> Assignee: Lokesh Jain
> Priority: Major
> Attachments: HDDS-733.001.patch, HDDS-733.002.patch
>
>
> The current implementation requires a container to be created in datanode
> before starting the chunk write. This can be optimized by creating the
> container on the first chunk write.
> During chunk write, if the container is missing, we can go ahead and create
> the container.
> Along with this change ALLOCATED and CREATING container states can be removed
> as they were used to track which containers have been successfully created.
> Also there is a shouldCreateContainer flag which is used by client to know if
> it needs to create container. This flag can be removed.
[jira] [Commented] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675615#comment-16675615 ]

Nanda kumar commented on HDDS-737:
----------------------------------

[~linyiqun], thanks for the review.

bq. I prefer to add additional try-catch for thread.sleep and get InterruptedException.
Good idea. Will address it in the next patch.

bq. Can we have a new unit test for the incremental container report?
I have added {{TestCloseContainerCommandHandler}} in HDDS-801, which tests whether ICR is properly triggered. HDDS-801 also introduces the {{OzoneContainer#updateContainerState}} call, which triggers ICR, and it restructures some of the code around how we trigger ICR.

> Introduce Incremental Container Report
> --------------------------------------
>
> Key: HDDS-737
> URL: https://issues.apache.org/jira/browse/HDDS-737
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode, SCM
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-737.000.patch
>
>
> We will use Incremental Container Report (ICR) to immediately inform SCM when
> there is some state change to the container in datanode. This will make sure
> that SCM is updated as soon as the state of a container changes and doesn’t
> have to wait for full container report.
> *When do we send ICR?*
> * When a container replica state changes from open/closing to closed
> * When a container replica state changes from open/closing to quasi closed
> * When a container replica state changes from quasi closed to closed
> * When a container replica is deleted in datanode
> * When a container replica is copied from another datanode
> * When a container replica is discovered to be corrupted
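The "When do we send ICR?" list in the issue description can be read as a small decision table. The sketch below encodes it under that reading; `IcrPolicy`, its enums and `triggersIcr` are hypothetical names for illustration and are not part of the HDDS-737 patch.

```java
/** Hypothetical model of when a datanode would send an ICR to SCM. */
final class IcrPolicy {
  /** Replica states named in the issue description. */
  enum ReplicaState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  /** Replica events named in the issue description. */
  enum ReplicaEvent { DELETED, COPIED_FROM_DATANODE, FOUND_CORRUPTED }

  private IcrPolicy() { }

  /** True only for the three state transitions listed in the issue. */
  static boolean triggersIcr(ReplicaState from, ReplicaState to) {
    boolean openOrClosing =
        from == ReplicaState.OPEN || from == ReplicaState.CLOSING;
    if (openOrClosing) {
      return to == ReplicaState.CLOSED || to == ReplicaState.QUASI_CLOSED;
    }
    return from == ReplicaState.QUASI_CLOSED && to == ReplicaState.CLOSED;
  }

  /** Every listed event (delete, copy, corruption) triggers an ICR. */
  static boolean triggersIcr(ReplicaEvent event) {
    return event != null;
  }
}
```

A transition such as OPEN to CLOSING is not in the list, so in this model it would be left to the periodic full container report.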
[jira] [Commented] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677143#comment-16677143 ] Nanda kumar commented on HDDS-737: -- Looks like Jenkins run has some problem, the build failed with the below error {code:java} [ERROR] Please refer to /testptch/hadoop/hadoop-hdds/common/target/surefire-reports for the individual test results. [ERROR] Please refer to dump files (if any exist) [date]-jvmRun[N].dump, [date].dumpstream and [date]-jvmRun[N].dumpstream. [ERROR] ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called? [ERROR] Command was /bin/sh -c cd /testptch/hadoop/hadoop-hdds/common && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2048m -XX:+HeapDumpOnOutOfMemoryError -DminiClusterDedicatedDirs=true -jar /testptch/hadoop/hadoop-hdds/common/target/surefire/surefirebooter1364262303668308594.jar /testptch/hadoop/hadoop-hdds/common/target/surefire 2018-11-06T11-45-30_771-jvmRun4 surefire2576372466595873598tmp surefire_22763712882772709533tmp {code} Re-triggered Jenkins pre-commit build: [https://builds.apache.org/job/PreCommit-HDDS-Build/1626/] > Introduce Incremental Container Report > -- > > Key: HDDS-737 > URL: https://issues.apache.org/jira/browse/HDDS-737 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-737.000.patch, HDDS-737.001.patch > > > We will use Incremental Container Report (ICR) to immediately inform SCM when > there is some state change to the container in datanode. This will make sure > that SCM is updated as soon as the state of a container changes and doesn’t > have to wait for full container report. 
> *When do we send ICR?* > * When a container replica state changes from open/closing to closed > * When a container replica state changes from open/closing to quasi closed > * When a container replica state changes from quasi closed to closed > * When a container replica is deleted in datanode > * When a container replica is copied from another datanode > * When a container replica is discovered to be corrupted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-692) Use the ProgressBar class in the RandomKeyGenerator freon test
[ https://issues.apache.org/jira/browse/HDDS-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677190#comment-16677190 ]

Nanda kumar commented on HDDS-692:
----------------------------------

[~horzsolt2006], the ProgressBar code can be refactored like below
{code:java}
import java.io.PrintStream;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Creates and runs a ProgressBar in new Thread which gets printed on
 * the provided PrintStream.
 */
public class ProgressBar {

  private static final Logger LOG =
      LoggerFactory.getLogger(ProgressBar.class);
  private static final long REFRESH_INTERVAL = 1000L;

  private final long maxValue;
  private final Supplier<Long> currentValue;
  private final Thread progressBar;

  private volatile boolean running;

  private long startTime;

  /**
   * Creates a new ProgressBar instance which prints the progress on the given
   * PrintStream when started.
   *
   * @param stream to display the progress
   * @param maxValue Maximum value of the progress
   * @param currentValue Supplier that provides the current value
   */
  public ProgressBar(final PrintStream stream, final Long maxValue,
      final Supplier<Long> currentValue) {
    this.maxValue = maxValue;
    this.currentValue = currentValue;
    this.progressBar = new Thread(getProgressBar(stream));
    this.running = false;
  }

  /**
   * Starts the ProgressBar in a new Thread.
   * This is a non blocking call.
   */
  public synchronized void start() {
    if (!running) {
      running = true;
      startTime = System.nanoTime();
      progressBar.start();
    }
  }

  /**
   * Graceful shutdown, waits for the progress bar to complete.
   * This is a blocking call.
   */
  public synchronized void shutdown() {
    if (running) {
      try {
        progressBar.join();
        running = false;
      } catch (InterruptedException e) {
        LOG.warn("Got interrupted while waiting for the progress bar to " +
            "complete.");
      }
    }
  }

  /**
   * Terminates the progress bar. This doesn't wait for the progress bar
   * to complete.
   */
  public synchronized void terminate() {
    if (running) {
      try {
        running = false;
        progressBar.join();
      } catch (InterruptedException e) {
        LOG.warn("Got interrupted while waiting for the progress bar to " +
            "complete.");
      }
    }
  }

  private Runnable getProgressBar(final PrintStream stream) {
    return () -> {
      stream.println();
      while (running && currentValue.get() < maxValue) {
        print(stream, currentValue.get());
        try {
          Thread.sleep(REFRESH_INTERVAL);
        } catch (InterruptedException e) {
          LOG.warn("ProgressBar was interrupted.");
        }
      }
      print(stream, maxValue);
      stream.println();
      running = false;
    };
  }

  /**
   * Given current value prints the progress bar.
   *
   * @param stream to display the progress
   * @param value current progress position
   */
  private void print(final PrintStream stream, final long value) {
    stream.print('\r');
    double percent = 100.0 * value / maxValue;
    StringBuilder sb = new StringBuilder();
    sb.append(" " + String.format("%.2f", percent) + "% |");

    for (int i = 0; i <= percent; i++) {
      sb.append('█');
    }
    for (int j = 0; j < 100 - percent; j++) {
      sb.append(' ');
    }
    sb.append("| ");
    sb.append(value + "/" + maxValue);
    long timeInSec = TimeUnit.SECONDS.convert(
        System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
    String timeToPrint = String.format("%d:%02d:%02d", timeInSec / 3600,
        (timeInSec % 3600) / 60, timeInSec % 60);
    sb.append(" Time: " + timeToPrint);
    stream.print(sb.toString());
  }
}
{code}

> Use the ProgressBar class in the RandomKeyGenerator freon test
> --------------------------------------------------------------
>
> Key: HDDS-692
> URL: https://issues.apache.org/jira/browse/HDDS-692
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Tools
> Reporter: Elek, Marton
> Assignee: Zsolt Horvath
> Priority: Major
> Attachments: HDDS-692.001.patch, HDDS-692.002.patch,
> HDDS-692.003.patch
>
>
> HDDS-443 provides a reusable progress bar to make it easier to add more freon
> tests, but the existing RandomKeyGenerator test
> (hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/RandomKeyGenerator.java)
> still doesn't use it.
> It would be good to switch to the new progress bar there.
[jira] [Commented] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679739#comment-16679739 ] Nanda kumar commented on HDDS-737: -- Thanks [~linyiqun] & [~jnp] for the reviews. Committed it to trunk. > Introduce Incremental Container Report > -- > > Key: HDDS-737 > URL: https://issues.apache.org/jira/browse/HDDS-737 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-737.000.patch, HDDS-737.001.patch > > > We will use Incremental Container Report (ICR) to immediately inform SCM when > there is some state change to the container in datanode. This will make sure > that SCM is updated as soon as the state of a container changes and doesn’t > have to wait for full container report. > *When do we send ICR?* > * When a container replica state changes from open/closing to closed > * When a container replica state changes from open/closing to quasi closed > * When a container replica state changes from quasi closed to closed > * When a container replica is deleted in datanode > * When a container replica is copied from another datanode > * When a container replica is discovered to be corrupted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-812) TestEndPoint#testCheckVersionResponse is failing
[ https://issues.apache.org/jira/browse/HDDS-812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679704#comment-16679704 ] Nanda kumar commented on HDDS-812: -- [~hanishakoneru], It seems you have accidentally attached HDDS-797's patch to this jira. > TestEndPoint#testCheckVersionResponse is failing > > > Key: HDDS-812 > URL: https://issues.apache.org/jira/browse/HDDS-812 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Nanda kumar >Assignee: Hanisha Koneru >Priority: Major > Attachments: HDDS-797.001.patch > > > TestEndPoint#testCheckVersionResponse is failing with the below error > {code:java} > [ERROR] > testCheckVersionResponse(org.apache.hadoop.ozone.container.common.TestEndPoint) > Time elapsed: 0.142 s <<< FAILURE! > java.lang.AssertionError: expected: but was: > {code} > Once we are in REGISTER state we don't allow getVersion call anymore. This is > causing the test case to fail. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-737: - Resolution: Fixed Fix Version/s: 0.4.0 Status: Resolved (was: Patch Available) > Introduce Incremental Container Report > -- > > Key: HDDS-737 > URL: https://issues.apache.org/jira/browse/HDDS-737 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Fix For: 0.4.0 > > Attachments: HDDS-737.000.patch, HDDS-737.001.patch > > > We will use Incremental Container Report (ICR) to immediately inform SCM when > there is some state change to the container in datanode. This will make sure > that SCM is updated as soon as the state of a container changes and doesn’t > have to wait for full container report. > *When do we send ICR?* > * When a container replica state changes from open/closing to closed > * When a container replica state changes from open/closing to quasi closed > * When a container replica state changes from quasi closed to closed > * When a container replica is deleted in datanode > * When a container replica is copied from another datanode > * When a container replica is discovered to be corrupted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-823) OzoneRestClient is failing with NPE on getKeyDetails call
[ https://issues.apache.org/jira/browse/HDDS-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679663#comment-16679663 ]

Nanda kumar commented on HDDS-823:
----------------------------------

This is happening after HDDS-798.

> OzoneRestClient is failing with NPE on getKeyDetails call
> ---------------------------------------------------------
>
> Key: HDDS-823
> URL: https://issues.apache.org/jira/browse/HDDS-823
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Affects Versions: 0.3.0
> Reporter: Nanda kumar
> Priority: Blocker
>
> {{RestClient#getKeyDetails}} is failing with {{NullPointerException}}, which
> is causing a lot of unit tests and smoke tests to fail.
> Exception trace:
> {code:java}
> Error while calling command (org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler@13713486): java.lang.NullPointerException
>   at picocli.CommandLine.execute(CommandLine.java:926)
>   at picocli.CommandLine.access$700(CommandLine.java:104)
>   at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
>   at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
>   at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
>   at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
>   at org.apache.hadoop.ozone.ozShell.TestOzoneShell.execute(TestOzoneShell.java:259)
>   at org.apache.hadoop.ozone.ozShell.TestOzoneShell.testInfoDirKey(TestOzoneShell.java:1013)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.ozone.client.rest.RestClient.getKeyDetails(RestClient.java:817)
>   at org.apache.hadoop.ozone.client.OzoneBucket.getKey(OzoneBucket.java:282)
>   at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:65)
>   at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:37)
>   at picocli.CommandLine.execute(CommandLine.java:919)
>   ... 18 more
> {code}
[jira] [Commented] (HDDS-798) Storage-class is showing incorrectly
[ https://issues.apache.org/jira/browse/HDDS-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679669#comment-16679669 ]

Nanda kumar commented on HDDS-798:
----------------------------------

After this change, {{RestClient#getKeyDetails}} is failing with {{NullPointerException}}, which is causing a few unit tests and smoke tests to fail.

> Storage-class is showing incorrectly
> ------------------------------------
>
> Key: HDDS-798
> URL: https://issues.apache.org/jira/browse/HDDS-798
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Bharat Viswanadham
> Assignee: Bharat Viswanadham
> Priority: Major
> Fix For: 0.3.0, 0.4.0
>
> Attachments: HDDS-798.00.patch
>
>
> After HDDS-712, we support storage-class.
> For list-objects, even if a key has its storage-class set to REDUCED_REDUNDANCY,
> it still shows STANDARD.
> This is because the list object response code hardcodes it as below.
> keyMetadata.setStorageClass("STANDARD");
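The bug described in the issue is a hardcoded literal shadowing a per-key value. The sketch below shows the general shape of that mistake and its fix. It assumes, purely for illustration, a `Key` record that carries its own storage class; `ListObjectsSketch`, `Key`, `listHardcoded` and `listFromKey` are hypothetical and do not reflect how the HDDS-798 patch actually derives the value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a hardcoded versus per-key storage class. */
final class ListObjectsSketch {

  /** Hypothetical key record that carries its own storage class. */
  static final class Key {
    final String name;
    final String storageClass;

    Key(String name, String storageClass) {
      this.name = name;
      this.storageClass = storageClass;
    }
  }

  private ListObjectsSketch() { }

  /** Behaviour described in the issue: every key is reported as STANDARD. */
  static List<String> listHardcoded(List<Key> keys) {
    List<String> out = new ArrayList<>();
    for (Key key : keys) {
      out.add(key.name + ":" + "STANDARD");
    }
    return out;
  }

  /** Reports the storage class that belongs to each key. */
  static List<String> listFromKey(List<Key> keys) {
    List<String> out = new ArrayList<>();
    for (Key key : keys) {
      out.add(key.name + ":" + key.storageClass);
    }
    return out;
  }
}
```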
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-801:
----------------------------
Status: Patch Available (was: Open)

> Quasi close the container when close is not executed via Ratis
> --------------------------------------------------------------
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.3.0
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-801.000.patch
>
> When a datanode receives a CloseContainerCommand and the replication type is not
> RATIS, we should QUASI close the container. After quasi-closing the container,
> an ICR has to be sent to SCM.
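The behaviour requested in HDDS-801 can be sketched roughly as follows. All types here (`Container`, `ReplicationType`, `ContainerState`, the `sentIcrs` list) are hypothetical stand-ins, not the datanode's real classes, and the Ratis branch is only a placeholder because the issue does not describe it.

```java
import java.util.ArrayList;
import java.util.List;

final class QuasiCloseSketch {
  enum ReplicationType { RATIS, STAND_ALONE }
  enum ContainerState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  /** Hypothetical stand-in for a container replica on the datanode. */
  static final class Container {
    final long id;
    ContainerState state = ContainerState.OPEN;
    Container(long id) { this.id = id; }
  }

  /** Records the incremental container reports (ICRs) that would go to SCM. */
  final List<String> sentIcrs = new ArrayList<>();

  void handleCloseContainerCommand(Container container, ReplicationType type) {
    if (type == ReplicationType.RATIS) {
      // Assumption: the close is driven through Ratis elsewhere, so this
      // sketch only marks the replica as CLOSING and sends nothing.
      container.state = ContainerState.CLOSING;
      return;
    }
    // Close not executed via Ratis: quasi-close, then report to SCM.
    container.state = ContainerState.QUASI_CLOSED;
    sentIcrs.add(container.id + ":" + container.state);
  }
}
```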
[jira] [Created] (HDDS-823) OzoneRestClient is failing with NPE on getKeyDetails call
Nanda kumar created HDDS-823:
-----------------------------

Summary: OzoneRestClient is failing with NPE on getKeyDetails call
Key: HDDS-823
URL: https://issues.apache.org/jira/browse/HDDS-823
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Client
Affects Versions: 0.3.0
Reporter: Nanda kumar

{{RestClient#getKeyDetails}} is failing with {{NullPointerException}}, which is causing a lot of unit tests and smoke tests to fail.

Exception trace:
{code:java}
Error while calling command (org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler@13713486): java.lang.NullPointerException
    at picocli.CommandLine.execute(CommandLine.java:926)
    at picocli.CommandLine.access$700(CommandLine.java:104)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
    at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
    at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
    at org.apache.hadoop.ozone.ozShell.TestOzoneShell.execute(TestOzoneShell.java:259)
    at org.apache.hadoop.ozone.ozShell.TestOzoneShell.testInfoDirKey(TestOzoneShell.java:1013)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.ozone.client.rest.RestClient.getKeyDetails(RestClient.java:817)
    at org.apache.hadoop.ozone.client.OzoneBucket.getKey(OzoneBucket.java:282)
    at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:65)
    at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:37)
    at picocli.CommandLine.execute(CommandLine.java:919)
    ... 18 more
{code}
[jira] [Commented] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679732#comment-16679732 ]

Nanda kumar commented on HDDS-737:
----------------------------------

Tested it locally; the failures are not related to this patch. I will commit it shortly.

> Introduce Incremental Container Report
> --------------------------------------
>
> Key: HDDS-737
> URL: https://issues.apache.org/jira/browse/HDDS-737
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode, SCM
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-737.000.patch, HDDS-737.001.patch
>
> We will use Incremental Container Report (ICR) to immediately inform SCM when
> the state of a container changes in a datanode. This will make sure
> that SCM is updated as soon as the state of a container changes and doesn’t
> have to wait for the full container report.
> *When do we send ICR?*
> * When a container replica state changes from open/closing to closed
> * When a container replica state changes from open/closing to quasi closed
> * When a container replica state changes from quasi closed to closed
> * When a container replica is deleted in datanode
> * When a container replica is copied from another datanode
> * When a container replica is discovered to be corrupted
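The state-change triggers in the list above can be sketched as a small predicate. The `ReplicaState` enum and the `triggersIcr` method are hypothetical names used only for illustration; the three non-transition triggers (replica deleted, copied from another datanode, found corrupted) always send an ICR and are therefore not modelled here.

```java
import java.util.EnumSet;
import java.util.Set;

final class IcrTriggerSketch {
  enum ReplicaState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  /** States from which a replica can still become closed or quasi closed. */
  private static final Set<ReplicaState> NOT_YET_CLOSED =
      EnumSet.of(ReplicaState.OPEN, ReplicaState.CLOSING);

  /** Returns true if the transition is one of the listed ICR triggers. */
  static boolean triggersIcr(ReplicaState from, ReplicaState to) {
    if (NOT_YET_CLOSED.contains(from)) {
      // open/closing -> closed, or open/closing -> quasi closed
      return to == ReplicaState.CLOSED || to == ReplicaState.QUASI_CLOSED;
    }
    // quasi closed -> closed
    return from == ReplicaState.QUASI_CLOSED && to == ReplicaState.CLOSED;
  }
}
```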
[jira] [Updated] (HDDS-823) OzoneRestClient is failing with NPE on getKeyDetails call
[ https://issues.apache.org/jira/browse/HDDS-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-823:
----------------------------
Target Version/s: 0.3.0, 0.4.0 (was: 0.3.0)

> OzoneRestClient is failing with NPE on getKeyDetails call
> ---------------------------------------------------------
>
> Key: HDDS-823
> URL: https://issues.apache.org/jira/browse/HDDS-823
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Affects Versions: 0.3.0
> Reporter: Nanda kumar
> Priority: Blocker
>
> {{RestClient#getKeyDetails}} is failing with {{NullPointerException}}, which
> is causing a few unit tests and smoke tests to fail.
> Exception trace:
> {code:java}
> Error while calling command (org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler@13713486): java.lang.NullPointerException
>     at picocli.CommandLine.execute(CommandLine.java:926)
>     at picocli.CommandLine.access$700(CommandLine.java:104)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
>     at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
>     at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
>     at org.apache.hadoop.ozone.ozShell.TestOzoneShell.execute(TestOzoneShell.java:259)
>     at org.apache.hadoop.ozone.ozShell.TestOzoneShell.testInfoDirKey(TestOzoneShell.java:1013)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.ozone.client.rest.RestClient.getKeyDetails(RestClient.java:817)
>     at org.apache.hadoop.ozone.client.OzoneBucket.getKey(OzoneBucket.java:282)
>     at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:65)
>     at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:37)
>     at picocli.CommandLine.execute(CommandLine.java:919)
>     ... 18 more
> {code}
[jira] [Updated] (HDDS-823) OzoneRestClient is failing with NPE on getKeyDetails call
[ https://issues.apache.org/jira/browse/HDDS-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-823: - Description: {{RestClient#getKeyDetails}} is failing with {{NullPointerException}} which is causing few of unit test and smoke test to fail. Exception trace: {code:java} Error while calling command (org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler@13713486): java.lang.NullPointerException at picocli.CommandLine.execute(CommandLine.java:926) at picocli.CommandLine.access$700(CommandLine.java:104) at picocli.CommandLine$RunLast.handle(CommandLine.java:1083) at picocli.CommandLine$RunLast.handle(CommandLine.java:1051) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959) at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242) at org.apache.hadoop.ozone.ozShell.TestOzoneShell.execute(TestOzoneShell.java:259) at org.apache.hadoop.ozone.ozShell.TestOzoneShell.testInfoDirKey(TestOzoneShell.java:1013) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) Caused by: java.lang.NullPointerException at 
org.apache.hadoop.ozone.client.rest.RestClient.getKeyDetails(RestClient.java:817) at org.apache.hadoop.ozone.client.OzoneBucket.getKey(OzoneBucket.java:282) at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:65) at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:37) at picocli.CommandLine.execute(CommandLine.java:919) ... 18 more {code} was: {{RestClient#getKeyDetails}} is failing with {{NullPointerException}} which is causing a lot of unit test and smoke test to fail. Exception trace: {code:java} Error while calling command (org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler@13713486): java.lang.NullPointerException at picocli.CommandLine.execute(CommandLine.java:926) at picocli.CommandLine.access$700(CommandLine.java:104) at picocli.CommandLine$RunLast.handle(CommandLine.java:1083) at picocli.CommandLine$RunLast.handle(CommandLine.java:1051) at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959) at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242) at org.apache.hadoop.ozone.ozShell.TestOzoneShell.execute(TestOzoneShell.java:259) at org.apache.hadoop.ozone.ozShell.TestOzoneShell.testInfoDirKey(TestOzoneShell.java:1013) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) Caused by: java.lang.NullPointerException at org.apache.hadoop.ozone.client.rest.RestClient.getKeyDetails(RestClient.java:817) at org.apache.hadoop.ozone.client.OzoneBucket.getKey(OzoneBucket.java:282) at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:65) at org.apache.hadoop.ozone.web.ozShell.keys.InfoKeyHandler.call(InfoKeyHandler.java:37) at picocli.CommandLine.execute(CommandLine.java:919)
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-801:
----------------------------
Status: Open (was: Patch Available)

> Quasi close the container when close is not executed via Ratis
> --------------------------------------------------------------
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.3.0
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-801.000.patch
>
> When a datanode receives a CloseContainerCommand and the replication type is not
> RATIS, we should QUASI close the container. After quasi-closing the container,
> an ICR has to be sent to SCM.
[jira] [Work started] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HDDS-801 started by Nanda kumar.
----------------------------------------

> Quasi close the container when close is not executed via Ratis
> --------------------------------------------------------------
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.3.0
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-801.000.patch
>
> When a datanode receives a CloseContainerCommand and the replication type is not
> RATIS, we should QUASI close the container. After quasi-closing the container,
> an ICR has to be sent to SCM.
[jira] [Updated] (HDDS-576) Move ContainerWithPipeline creation to RPC endpoint
[ https://issues.apache.org/jira/browse/HDDS-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-576:
----------------------------
Description: With independent Pipeline and Container Managers in SCM, the creation of ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear separation of the pipeline Manager and Container Manager
(was: with independent Pipeline and Container Managers in SCM, the creation of ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear separation of the pipeline Manager and Container Manager)

> Move ContainerWithPipeline creation to RPC endpoint
> ---------------------------------------------------
>
> Key: HDDS-576
> URL: https://issues.apache.org/jira/browse/HDDS-576
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Mukul Kumar Singh
> Priority: Major
>
> With independent Pipeline and Container Managers in SCM, the creation of
> ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear
> separation of the pipeline Manager and Container Manager
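A rough sketch of the separation HDDS-576 describes: neither manager knows about the other, and the RPC endpoint is the only place where a container and its pipeline are combined. The classes below are simplified, hypothetical stand-ins (plain maps play the role of the two managers), not the SCM implementation.

```java
import java.util.HashMap;
import java.util.Map;

final class RpcEndpointSketch {
  static final class Pipeline {
    final String id;
    Pipeline(String id) { this.id = id; }
  }

  static final class ContainerInfo {
    final long containerId;
    final String pipelineId;
    ContainerInfo(long containerId, String pipelineId) {
      this.containerId = containerId;
      this.pipelineId = pipelineId;
    }
  }

  static final class ContainerWithPipeline {
    final ContainerInfo container;
    final Pipeline pipeline;
    ContainerWithPipeline(ContainerInfo container, Pipeline pipeline) {
      this.container = container;
      this.pipeline = pipeline;
    }
  }

  // Stand-ins for the two independent managers: each holds only its own entities.
  final Map<Long, ContainerInfo> containerManager = new HashMap<>();
  final Map<String, Pipeline> pipelineManager = new HashMap<>();

  /** The RPC endpoint queries both managers and builds the combined object. */
  ContainerWithPipeline getContainerWithPipeline(long containerId) {
    ContainerInfo info = containerManager.get(containerId);
    if (info == null) {
      throw new IllegalArgumentException("Unknown container " + containerId);
    }
    Pipeline pipeline = pipelineManager.get(info.pipelineId);
    if (pipeline == null) {
      throw new IllegalStateException("Unknown pipeline " + info.pipelineId);
    }
    return new ContainerWithPipeline(info, pipeline);
  }
}
```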
[jira] [Assigned] (HDDS-576) Move ContainerWithPipeline creation to RPC endpoint
[ https://issues.apache.org/jira/browse/HDDS-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar reassigned HDDS-576:
-------------------------------
Assignee: Nanda kumar

> Move ContainerWithPipeline creation to RPC endpoint
> ---------------------------------------------------
>
> Key: HDDS-576
> URL: https://issues.apache.org/jira/browse/HDDS-576
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Mukul Kumar Singh
> Assignee: Nanda kumar
> Priority: Major
>
> With independent Pipeline and Container Managers in SCM, the creation of
> ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear
> separation of the pipeline Manager and Container Manager
[jira] [Created] (HDDS-827) TestStorageContainerManagerHttpServer should use dynamic port
Nanda kumar created HDDS-827:
-----------------------------

Summary: TestStorageContainerManagerHttpServer should use dynamic port
Key: HDDS-827
URL: https://issues.apache.org/jira/browse/HDDS-827
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: test
Reporter: Nanda kumar

Most of the time {{TestStorageContainerManagerHttpServer}} is failing with
{code}
java.net.BindException: Port in use: 0.0.0.0:9876
...
Caused by: java.net.BindException: Address already in use
{code}
TestStorageContainerManagerHttpServer should use a port which is free (dynamic), instead of trying to bind to the default port 9876.
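One common way for a test to obtain a free port, shown here as a sketch rather than as the change made for this issue: bind a `java.net.ServerSocket` to port 0 so the operating system picks an ephemeral port, read the port number, and release the socket. `FreePortSketch` and `findFreePort` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

final class FreePortSketch {
  /**
   * Asks the OS for a currently unused TCP port by binding to port 0.
   * The socket is closed before returning, so another process could in
   * principle grab the port before the test binds to it.
   */
  static int findFreePort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException("Could not allocate a free port", e);
    }
  }
}
```

Because of the small window between closing the probe socket and starting the server, letting the server under test bind to port 0 itself, where it supports that, avoids the race entirely.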
[jira] [Updated] (HDDS-827) TestStorageContainerManagerHttpServer should use dynamic port
[ https://issues.apache.org/jira/browse/HDDS-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-827:
----------------------------
Labels: newbie (was: )

> TestStorageContainerManagerHttpServer should use dynamic port
> -------------------------------------------------------------
>
> Key: HDDS-827
> URL: https://issues.apache.org/jira/browse/HDDS-827
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: test
> Reporter: Nanda kumar
> Priority: Major
> Labels: newbie
>
> Most of the time {{TestStorageContainerManagerHttpServer}} is failing with
> {code}
> java.net.BindException: Port in use: 0.0.0.0:9876
> ...
> Caused by: java.net.BindException: Address already in use
> {code}
> TestStorageContainerManagerHttpServer should use a port which is free
> (dynamic), instead of trying to bind to the default port 9876.
[jira] [Commented] (HDDS-576) Move ContainerWithPipeline creation to RPC endpoint
[ https://issues.apache.org/jira/browse/HDDS-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685072#comment-16685072 ]

Nanda kumar commented on HDDS-576:
----------------------------------

[~linyiqun], thanks for the review. Created HDDS-833 for updating the javadoc.

> Move ContainerWithPipeline creation to RPC endpoint
> ---------------------------------------------------
>
> Key: HDDS-576
> URL: https://issues.apache.org/jira/browse/HDDS-576
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Mukul Kumar Singh
> Assignee: Nanda kumar
> Priority: Major
> Fix For: 0.4.0
> Attachments: HDDS-576.000.patch
>
> With independent Pipeline and Container Managers in SCM, the creation of
> ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear
> separation of the pipeline Manager and Container Manager
[jira] [Created] (HDDS-833) Update javadoc in StorageContainerManager, NodeManager, PipelineManager and ContainerManager
Nanda kumar created HDDS-833:
-----------------------------

Summary: Update javadoc in StorageContainerManager, NodeManager, PipelineManager and ContainerManager
Key: HDDS-833
URL: https://issues.apache.org/jira/browse/HDDS-833
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Nanda kumar
Assignee: Nanda kumar

The javadoc in the following interfaces/classes has to be updated:
* StorageContainerManager
* NodeManager
* NodeStateManager
* PipelineManager
* PipelineStateManager
* ContainerManager
* ContainerStateManager
[jira] [Created] (HDDS-830) Datanode should not start XceiverServerRatis before getting version information from SCM
Nanda kumar created HDDS-830:
-----------------------------

Summary: Datanode should not start XceiverServerRatis before getting version information from SCM
Key: HDDS-830
URL: https://issues.apache.org/jira/browse/HDDS-830
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Datanode
Affects Versions: 0.3.0
Reporter: Nanda kumar

If a datanode restarts quickly, before SCM detects the restart, it will rejoin the Ratis ring (existing pipeline). Since SCM didn't detect this restart, the pipeline is not closed. There is a time gap between the datanode starting and it getting the version information from SCM. During this time, the SCM ID in the datanode is not set (null). If a client tries to use this pipeline during that time, the container state machine will throw {{java.lang.NullPointerException: scmId cannot be null}}. This will cause {{RaftLogWorker}} to terminate, resulting in a datanode crash.
{code}
2018-11-12 19:45:31,811 ERROR storage.RaftLogWorker (ExitUtils.java:terminate(86)) - Terminating with exit status 1: 407fd181-2ff7-4651-9a47-a0927ede4c51-RaftLogWorker failed.
java.io.IOException: java.lang.NullPointerException: scmId cannot be null
    at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
    at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
    at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:83)
    at org.apache.ratis.server.storage.RaftLogWorker$StateMachineDataPolicy.getFromFuture(RaftLogWorker.java:76)
    at org.apache.ratis.server.storage.RaftLogWorker$WriteLog.execute(RaftLogWorker.java:344)
    at org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:216)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: scmId cannot be null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.create(KeyValueContainer.java:106)
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleCreateContainer(KeyValueHandler.java:242)
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:165)
    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.createContainer(HddsDispatcher.java:206)
    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:124)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:274)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:280)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$handleWriteChunk$1(ContainerStateMachine.java:301)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
{code}
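The ordering this issue asks for can be sketched with a `CompletableFuture`: the Ratis server start is chained onto the arrival of the version response, so no request is served while the SCM ID is still unknown. Everything here (`StartAfterVersionSketch`, `onVersionResponse`, `createContainer`) is a hypothetical illustration, not the datanode's actual startup code.

```java
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

final class StartAfterVersionSketch {
  /** Completed once SCM has answered the version request. */
  private final CompletableFuture<String> scmId = new CompletableFuture<>();
  private final AtomicBoolean ratisServerStarted = new AtomicBoolean(false);

  StartAfterVersionSketch() {
    // The (simulated) Ratis server starts only after the SCM ID is known.
    scmId.thenAccept(id -> ratisServerStarted.set(true));
  }

  /** Called when the version information arrives from SCM. */
  void onVersionResponse(String id) {
    scmId.complete(Objects.requireNonNull(id, "scmId cannot be null"));
  }

  boolean isRatisServerStarted() {
    return ratisServerStarted.get();
  }

  /** Rejects requests that arrive before the server has been started. */
  String createContainer(long containerId) {
    if (!ratisServerStarted.get()) {
      throw new IllegalStateException("Ratis server not started: SCM ID unknown");
    }
    return scmId.join() + "/" + containerId;
  }
}
```

In this sketch an early client request fails with a recoverable `IllegalStateException` instead of reaching container creation with a null SCM ID.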
[jira] [Updated] (HDDS-576) Move ContainerWithPipeline creation to RPC endpoint
[ https://issues.apache.org/jira/browse/HDDS-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-576:
----------------------------
Status: Patch Available (was: Open)

> Move ContainerWithPipeline creation to RPC endpoint
> ---------------------------------------------------
>
> Key: HDDS-576
> URL: https://issues.apache.org/jira/browse/HDDS-576
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Mukul Kumar Singh
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-576.000.patch
>
> With independent Pipeline and Container Managers in SCM, the creation of
> ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear
> separation of the pipeline Manager and Container Manager
[jira] [Commented] (HDDS-576) Move ContainerWithPipeline creation to RPC endpoint
[ https://issues.apache.org/jira/browse/HDDS-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683928#comment-16683928 ]

Nanda kumar commented on HDDS-576:
----------------------------------

This patch also fixes the test failures.

> Move ContainerWithPipeline creation to RPC endpoint
> ---------------------------------------------------
>
> Key: HDDS-576
> URL: https://issues.apache.org/jira/browse/HDDS-576
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Mukul Kumar Singh
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-576.000.patch
>
> With independent Pipeline and Container Managers in SCM, the creation of
> ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear
> separation of the pipeline Manager and Container Manager
[jira] [Updated] (HDDS-830) Datanode should not start XceiverServerRatis before getting version information from SCM
[ https://issues.apache.org/jira/browse/HDDS-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-830:
----------------------------
Issue Type: Bug (was: Improvement)

> Datanode should not start XceiverServerRatis before getting version
> information from SCM
> -------------------------------------------------------------------
>
> Key: HDDS-830
> URL: https://issues.apache.org/jira/browse/HDDS-830
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.3.0
> Reporter: Nanda kumar
> Priority: Major
>
> If a datanode restarts quickly, before SCM detects the restart, it will rejoin the Ratis
> ring (existing pipeline). Since SCM didn't detect this restart, the pipeline
> is not closed. There is a time gap between the datanode starting and it
> getting the version information from SCM. During this time, the SCM ID in the
> datanode is not set (null). If a client tries to use this pipeline during that
> time, the container state machine will throw
> {{java.lang.NullPointerException: scmId cannot be null}}. This will cause
> {{RaftLogWorker}} to terminate, resulting in a datanode crash.
> {code}
> 2018-11-12 19:45:31,811 ERROR storage.RaftLogWorker (ExitUtils.java:terminate(86)) - Terminating with exit status 1: 407fd181-2ff7-4651-9a47-a0927ede4c51-RaftLogWorker failed.
> java.io.IOException: java.lang.NullPointerException: scmId cannot be null
>     at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
>     at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
>     at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:83)
>     at org.apache.ratis.server.storage.RaftLogWorker$StateMachineDataPolicy.getFromFuture(RaftLogWorker.java:76)
>     at org.apache.ratis.server.storage.RaftLogWorker$WriteLog.execute(RaftLogWorker.java:344)
>     at org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:216)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException: scmId cannot be null
>     at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.create(KeyValueContainer.java:106)
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleCreateContainer(KeyValueHandler.java:242)
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:165)
>     at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.createContainer(HddsDispatcher.java:206)
>     at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:124)
>     at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:274)
>     at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:280)
>     at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$handleWriteChunk$1(ContainerStateMachine.java:301)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     ... 1 more
> {code}
[jira] [Updated] (HDDS-576) Move ContainerWithPipeline creation to RPC endpoint
[ https://issues.apache.org/jira/browse/HDDS-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDDS-576:
----------------------------
Attachment: HDDS-576.000.patch

> Move ContainerWithPipeline creation to RPC endpoint
> ---------------------------------------------------
>
> Key: HDDS-576
> URL: https://issues.apache.org/jira/browse/HDDS-576
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Reporter: Mukul Kumar Singh
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-576.000.patch
>
> With independent Pipeline and Container Managers in SCM, the creation of
> ContainerWithPipeline can be moved to RPC endpoint. This will ensure clear
> separation of the pipeline Manager and Container Manager
[jira] [Commented] (HDDS-827) TestStorageContainerManagerHttpServer should use dynamic port
[ https://issues.apache.org/jira/browse/HDDS-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685347#comment-16685347 ]

Nanda kumar commented on HDDS-827:
----------------------------------

+1, will commit this shortly.

> TestStorageContainerManagerHttpServer should use dynamic port
> -------------------------------------------------------------
>
> Key: HDDS-827
> URL: https://issues.apache.org/jira/browse/HDDS-827
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: test
> Reporter: Nanda kumar
> Assignee: Sandeep Nemuri
> Priority: Major
> Labels: newbie
> Attachments: HDDS-827.001.patch
>
> Most of the time {{TestStorageContainerManagerHttpServer}} is failing with
> {code}
> java.net.BindException: Port in use: 0.0.0.0:9876
> ...
> Caused by: java.net.BindException: Address already in use
> {code}
> TestStorageContainerManagerHttpServer should use a port which is free
> (dynamic), instead of trying to bind to the default port 9876.
[jira] [Commented] (HDDS-827) TestStorageContainerManagerHttpServer should use dynamic port
[ https://issues.apache.org/jira/browse/HDDS-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688049#comment-16688049 ] Nanda kumar commented on HDDS-827: -- Thanks [~elek] for taking care of this. > TestStorageContainerManagerHttpServer should use dynamic port > - > > Key: HDDS-827 > URL: https://issues.apache.org/jira/browse/HDDS-827 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Nanda kumar >Assignee: Sandeep Nemuri >Priority: Major > Labels: newbie > Fix For: 0.4.0 > > Attachments: HDDS-827.001.patch > > > Most of the time {{TestStorageContainerManagerHttpServer}} is failing with > {code} > java.net.BindException: Port in use: 0.0.0.0:9876 > ... > Caused by: java.net.BindException: Address already in use > {code} > TestStorageContainerManagerHttpServer should use a port which is free > (dynamic), instead of trying to bind with default 9876. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Attachment: HDDS-801.003.patch > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch, > HDDS-801.002.patch, HDDS-801.003.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-837) Persist originNodeId as part of .container file in datanode
[ https://issues.apache.org/jira/browse/HDDS-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688622#comment-16688622 ] Nanda kumar commented on HDDS-837: -- {{originPipelineId}} is good to have as part of the container info. I will work on adding originPipelineId. The current wip patch only has originNodeId, will update the patch shortly to include originPipelineId as well. > Persist originNodeId as part of .container file in datanode > --- > > Key: HDDS-837 > URL: https://issues.apache.org/jira/browse/HDDS-837 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-837.wip.patch > > > To differentiate the replica of QUASI_CLOSED containers we need > {{originNodeId}} field. With this field, we can uniquely identify a > QUASI_CLOSED container replica. This will be needed when we want to CLOSE a > QUASI_CLOSED container. > This field will be set by the node where the container is created and stored > as part of {{.container}} file and will be sent as part of ContainerReport to > SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-837) Persist originNodeId as part of .container file in datanode
[ https://issues.apache.org/jira/browse/HDDS-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-837: - Attachment: HDDS-837.wip.patch > Persist originNodeId as part of .container file in datanode > --- > > Key: HDDS-837 > URL: https://issues.apache.org/jira/browse/HDDS-837 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-837.wip.patch > > > To differentiate the replica of QUASI_CLOSED containers we need > {{originNodeId}} field. With this field, we can uniquely identify a > QUASI_CLOSED container replica. This will be needed when we want to CLOSE a > QUASI_CLOSED container. > This field will be set by the node where the container is created and stored > as part of {{.container}} file and will be sent as part of ContainerReport to > SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work started] (HDDS-837) Persist originNodeId as part of .container file in datanode
[ https://issues.apache.org/jira/browse/HDDS-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-837 started by Nanda kumar. > Persist originNodeId as part of .container file in datanode > --- > > Key: HDDS-837 > URL: https://issues.apache.org/jira/browse/HDDS-837 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-837.wip.patch > > > To differentiate the replica of QUASI_CLOSED containers we need > {{originNodeId}} field. With this field, we can uniquely identify a > QUASI_CLOSED container replica. This will be needed when we want to CLOSE a > QUASI_CLOSED container. > This field will be set by the node where the container is created and stored > as part of {{.container}} file and will be sent as part of ContainerReport to > SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Attachment: (was: HDDS-801.003.patch) > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch, > HDDS-801.002.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Attachment: HDDS-801.003.patch > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch, > HDDS-801.002.patch, HDDS-801.003.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688608#comment-16688608 ]

Nanda kumar commented on HDDS-801:
--

Thanks [~msingh], [~shashikant] for the review.

bq. contaienrState to containerState
Fixed.

bq. updateContainerState should be changed to appropriate type like closing or stopContainer
Fixed.

bq. I feel this can be moved inside XceiverServerRatis
Done.

bq. lets add a getter function
Added.

bq. Should this be encapsulated in one function
Introduced a private {{closeInternal}} method; both close and quasi close use it now.

bq. When the transition from QUASI_CLOSED to CLOSED is allowed later, we should not compact the DB again.
Keeping the compaction makes the code simpler, and it shouldn't be a problem even if we compact the DB multiple times.

bq. the container should already be in CLOSING state, Lets add an precondition here that the container is already in closing state.
There are cases where {{handleCloseContainer}} is called while the container is still in OPEN state (close container called via the client API), some of which [~shashikant] has mentioned. So we move the container to CLOSING state here if it is not already in CLOSING state.

bq. lets change the assertion here to isQuasiClosed.
Fixed.

bq. update the comment to be container getting "quasi closed" rather than getting closed.
Done.

bq. closeContainer is exposed to clients in ContainerProtocolCalls.Java
To handle this case, if the container is in OPEN state we move it to CLOSING in {{KeyValueHandler#handleCloseContainer}}.

bq. Any state change in ContainerState should triggerICR
The ICR is triggered inside the closeContainer/quasiCloseContainer call itself, so there is no need to call updateContainerState internally.

bq. There can be a case where let's say the SCM gets network separated from a follower before sending...
This call will come through {{KeyValueHandler#handleCloseContainer}}; we will move the container to CLOSING state there if it is not in that state already.

bq. The comments look misleading here.
The TODO is for a performance optimization which can be done later. The comment says that "Close container is not expected to be instantaneous" (current implementation). It looks fine to me.

> Quasi close the container when close is not executed via Ratis
> --
>
> Key: HDDS-801
> URL: https://issues.apache.org/jira/browse/HDDS-801
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Affects Versions: 0.3.0
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-801.000.patch, HDDS-801.001.patch,
> HDDS-801.002.patch, HDDS-801.003.patch
>
>
> When datanode received CloseContainerCommand and the replication type is not
> RATIS, we should QUASI close the container. After quasi-closing the container
> an ICR has to be sent to SCM.
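The close path discussed in the comment above (move an OPEN container to CLOSING, then close or quasi-close it through one shared private method that also triggers the report) might look roughly like the following sketch. It is an illustration only, not the code in HDDS-801.003.patch: {{QuasiCloseSketch}}, {{Container}}, {{markForClose}} and the {{reports}} list standing in for ICRs are all hypothetical names. Only {{closeInternal}} and {{handleCloseContainer}} are names mentioned in the discussion, and their real signatures are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the datanode-side close path; not Ozone code.
final class QuasiCloseSketch {

  enum State { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  static final class Container {
    private State state = State.OPEN;
    // Stands in for the incremental container reports (ICRs) sent to SCM.
    private final List<State> reports = new ArrayList<>();

    State getState() { return state; }

    List<State> getReports() { return reports; }

    /** Moves an OPEN container to CLOSING; a no-op in any other state. */
    void markForClose() {
      if (state == State.OPEN) {
        state = State.CLOSING;
      }
    }

    void close() { closeInternal(State.CLOSED); }

    void quasiClose() { closeInternal(State.QUASI_CLOSED); }

    /** Shared by close and quasi close; the report is triggered here. */
    private void closeInternal(State target) {
      if (state != State.CLOSING) {
        throw new IllegalStateException("expected CLOSING but was " + state);
      }
      state = target;
      reports.add(target);
    }
  }

  /** Quasi-closes when the close was not executed via Ratis. */
  static State handleCloseContainer(Container container, boolean viaRatis) {
    container.markForClose();
    if (viaRatis) {
      container.close();
    } else {
      container.quasiClose();
    }
    return container.getState();
  }

  public static void main(String[] args) {
    System.out.println(handleCloseContainer(new Container(), false));
  }
}
```

This sketch deliberately leaves out the later QUASI_CLOSED to CLOSED transition and the DB compaction mentioned in the review, since the thread does not describe how they are implemented.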
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Attachment: HDDS-801.004.patch > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch, > HDDS-801.002.patch, HDDS-801.003.patch, HDDS-801.004.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677790#comment-16677790 ]

Nanda kumar commented on HDDS-737:
--

The shadedclient build is failing in the Jenkins run even without the patch:
https://builds.apache.org/job/PreCommit-HDDS-Build/1629/artifact/out/branch-shadedclient.txt

I think it affects the unit test run: whenever the shadedclient build fails, none of the unit tests run and the build fails with
{code}
[ERROR] ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
{code}

> Introduce Incremental Container Report
> --
>
> Key: HDDS-737
> URL: https://issues.apache.org/jira/browse/HDDS-737
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode, SCM
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-737.000.patch, HDDS-737.001.patch
>
>
> We will use Incremental Container Report (ICR) to immediately inform SCM when
> there is some state change to the container in datanode. This will make sure
> that SCM is updated as soon as the state of a container changes and doesn’t
> have to wait for full container report.
> *When do we send ICR?*
> * When a container replica state changes from open/closing to closed
> * When a container replica state changes from open/closing to quasi closed
> * When a container replica state changes from quasi closed to closed
> * When a container replica is deleted in datanode
> * When a container replica is copied from another datanode
> * When a container replica is discovered to be corrupted
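The "When do we send ICR?" list in the issue description can be read as a small decision rule. The sketch below encodes just that list; it is an assumption-laden illustration rather than the logic in the HDDS-737 patches, and {{IcrTriggerSketch}}, {{ReplicaState}}, {{ReplicaEvent}}, {{onStateChange}} and {{onEvent}} are hypothetical names.

```java
// Hypothetical encoding of the ICR trigger list; not Ozone code.
final class IcrTriggerSketch {

  enum ReplicaState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  // The three non-transition triggers from the list.
  enum ReplicaEvent { DELETED, COPIED_FROM_DATANODE, FOUND_CORRUPTED }

  /** True if a replica state transition should produce an ICR. */
  static boolean onStateChange(ReplicaState from, ReplicaState to) {
    boolean fromOpenOrClosing =
        from == ReplicaState.OPEN || from == ReplicaState.CLOSING;
    switch (to) {
      case CLOSED:
        // open/closing -> closed, or quasi closed -> closed
        return fromOpenOrClosing || from == ReplicaState.QUASI_CLOSED;
      case QUASI_CLOSED:
        // open/closing -> quasi closed
        return fromOpenOrClosing;
      default:
        return false;
    }
  }

  /** Deletion, copy and corruption events always produce an ICR. */
  static boolean onEvent(ReplicaEvent event) {
    return event != null;
  }

  public static void main(String[] args) {
    System.out.println(
        onStateChange(ReplicaState.CLOSING, ReplicaState.QUASI_CLOSED));
  }
}
```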
[jira] [Commented] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676432#comment-16676432 ]

Nanda kumar commented on HDDS-737:
--

[~jnp], thanks for the review.

{quote}In CloseContainerCommandHandler#handle the container state should be set to CLOSING before making a ratis call.{quote}
This is done as part of HDDS-801.

{quote}pipelineManager is set in ContainerReportHandler but never used.{quote}
In both ContainerReportHandler and IncrementalContainerReportHandler, pipelineManager will be required when we handle state changes: we need to remove the container from the OPEN pipeline when the container is moved to CLOSED state. For now, I have added a TODO in both classes. When we handle state changes, pipelineManager will be used.

{quote}Heartbeating thread can also receive interrupt when shutting down{quote}
Good catch. Updated the comment.

{quote}NewNodeHandler does nothing. Shouldn't it send command for a container report?{quote}
The NewNode event is triggered by NodeManager, which has already made an entry for the registered node in NodeStateManager. We get the container report as part of the register call, and that container report will be processed by ContainerReportHandler to update the container replica state. We currently have nothing to do when we receive a new node event from NodeManager. NewNodeHandler is just a placeholder for now; we can use it in the future if required.

{quote}Why is removeNode removed from NodeManager? It seems like the right place.{quote}
We currently don't remove a node from NodeManager once it is registered. We can add the removeNode logic when we implement decommissioning of a datanode (the existing removeNode logic was incomplete).

[~linyiqun]
{quote}I prefer to add additional try-catch for thread.sleep and get InterruptedException.{quote}
Since we also have to handle {{InterruptedException}} when the shutdown is initiated, I feel it is better to have a try-catch around the complete code inside the while loop.

> Introduce Incremental Container Report
> --
>
> Key: HDDS-737
> URL: https://issues.apache.org/jira/browse/HDDS-737
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode, SCM
> Reporter: Nanda kumar
> Assignee: Nanda kumar
> Priority: Major
> Attachments: HDDS-737.000.patch, HDDS-737.001.patch
>
>
> We will use Incremental Container Report (ICR) to immediately inform SCM when
> there is some state change to the container in datanode. This will make sure
> that SCM is updated as soon as the state of a container changes and doesn’t
> have to wait for full container report.
> *When do we send ICR?*
> * When a container replica state changes from open/closing to closed
> * When a container replica state changes from open/closing to quasi closed
> * When a container replica state changes from quasi closed to closed
> * When a container replica is deleted in datanode
> * When a container replica is copied from another datanode
> * When a container replica is discovered to be corrupted
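The last point in the comment above, wrapping the whole loop body in one try-catch so that an interrupt during shutdown is handled wherever it arrives, could be sketched as follows. This is a generic illustration, not the heartbeat code from the patch; {{HeartbeatLoopSketch}}, {{sendHeartbeat}} and {{stopsOnInterrupt}} are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical heartbeat loop; not the Ozone datanode implementation.
final class HeartbeatLoopSketch implements Runnable {

  private final long intervalMillis;
  private final AtomicInteger beats = new AtomicInteger();
  private volatile boolean running = true;

  HeartbeatLoopSketch(long intervalMillis) {
    this.intervalMillis = intervalMillis;
  }

  int beats() { return beats.get(); }

  @Override
  public void run() {
    while (running) {
      // One try-catch around the complete loop body: the interrupt may
      // arrive while sending the heartbeat or while sleeping.
      try {
        sendHeartbeat();
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and leave the loop on shutdown.
        Thread.currentThread().interrupt();
        running = false;
      }
    }
  }

  private void sendHeartbeat() throws InterruptedException {
    if (Thread.interrupted()) {
      throw new InterruptedException("interrupted before heartbeat");
    }
    beats.incrementAndGet();
  }

  /** Starts the loop, interrupts it, and reports whether it stopped. */
  static boolean stopsOnInterrupt() {
    HeartbeatLoopSketch loop = new HeartbeatLoopSketch(10);
    Thread thread = new Thread(loop, "heartbeat-sketch");
    thread.start();
    thread.interrupt();
    try {
      thread.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !thread.isAlive() && !loop.running;
  }

  public static void main(String[] args) {
    System.out.println("stopped cleanly: " + stopsOnInterrupt());
  }
}
```

A narrower try-catch around {{Thread.sleep}} alone would miss an interrupt raised while the heartbeat itself is being sent, which is the case the reply argues for covering.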
[jira] [Commented] (HDDS-797) If DN is started before SCM, it does not register
[ https://issues.apache.org/jira/browse/HDDS-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676484#comment-16676484 ] Nanda kumar commented on HDDS-797: -- [~hanishakoneru], It seems {{org.apache.hadoop.ozone.container.common.TestEndPoint#testCheckVersionResponse}} is failing after this commit. {code:java} [ERROR] testCheckVersionResponse(org.apache.hadoop.ozone.container.common.TestEndPoint) Time elapsed: 0.142 s <<< FAILURE! java.lang.AssertionError: expected: but was: {code} Once we are in REGISTER state we don't allow {{getVersion}} call after this patch. This is causing the test case to fail. Created HDDS-812. > If DN is started before SCM, it does not register > - > > Key: HDDS-797 > URL: https://issues.apache.org/jira/browse/HDDS-797 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Assignee: Hanisha Koneru >Priority: Blocker > Fix For: 0.3.0, 0.4.0 > > Attachments: HDDS-797.001.patch > > > If a DN is started before SCM, it does not register with the SCM. DNs keep > trying to connect with the SCM and once SCM is up, the DN services are > shutdown instead of registering with SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-812) TestEndPoint#testCheckVersionResponse is failing
Nanda kumar created HDDS-812: Summary: TestEndPoint#testCheckVersionResponse is failing Key: HDDS-812 URL: https://issues.apache.org/jira/browse/HDDS-812 Project: Hadoop Distributed Data Store Issue Type: Bug Components: test Reporter: Nanda kumar TestEndPoint#testCheckVersionResponse is failing with the below error {code:java} [ERROR] testCheckVersionResponse(org.apache.hadoop.ozone.container.common.TestEndPoint) Time elapsed: 0.142 s <<< FAILURE! java.lang.AssertionError: expected: but was: {code} Once we are in REGISTER state we don't allow getVersion call anymore. This is causing the test case to fail. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-737: - Attachment: HDDS-737.001.patch > Introduce Incremental Container Report > -- > > Key: HDDS-737 > URL: https://issues.apache.org/jira/browse/HDDS-737 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-737.000.patch, HDDS-737.001.patch > > > We will use Incremental Container Report (ICR) to immediately inform SCM when > there is some state change to the container in datanode. This will make sure > that SCM is updated as soon as the state of a container changes and doesn’t > have to wait for full container report. > *When do we send ICR?* > * When a container replica state changes from open/closing to closed > * When a container replica state changes from open/closing to quasi closed > * When a container replica state changes from quasi closed to closed > * When a container replica is deleted in datanode > * When a container replica is copied from another datanode > * When a container replica is discovered to be corrupted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-737: - Attachment: (was: HDDS-737.001.patch) > Introduce Incremental Container Report > -- > > Key: HDDS-737 > URL: https://issues.apache.org/jira/browse/HDDS-737 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-737.000.patch > > > We will use Incremental Container Report (ICR) to immediately inform SCM when > there is some state change to the container in datanode. This will make sure > that SCM is updated as soon as the state of a container changes and doesn’t > have to wait for full container report. > *When do we send ICR?* > * When a container replica state changes from open/closing to closed > * When a container replica state changes from open/closing to quasi closed > * When a container replica state changes from quasi closed to closed > * When a container replica is deleted in datanode > * When a container replica is copied from another datanode > * When a container replica is discovered to be corrupted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-737) Introduce Incremental Container Report
[ https://issues.apache.org/jira/browse/HDDS-737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-737: - Attachment: HDDS-737.001.patch > Introduce Incremental Container Report > -- > > Key: HDDS-737 > URL: https://issues.apache.org/jira/browse/HDDS-737 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode, SCM >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-737.000.patch, HDDS-737.001.patch > > > We will use Incremental Container Report (ICR) to immediately inform SCM when > there is some state change to the container in datanode. This will make sure > that SCM is updated as soon as the state of a container changes and doesn’t > have to wait for full container report. > *When do we send ICR?* > * When a container replica state changes from open/closing to closed > * When a container replica state changes from open/closing to quasi closed > * When a container replica state changes from quasi closed to closed > * When a container replica is deleted in datanode > * When a container replica is copied from another datanode > * When a container replica is discovered to be corrupted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13348) Ozone: Update IP and hostname in Datanode from SCM's response to the register call
[ https://issues.apache.org/jira/browse/HDFS-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676773#comment-16676773 ]

Nanda kumar commented on HDFS-13348:
---

[~sunilg], hdds/ozone is not released as part of Hadoop. Ozone follows a separate release cycle. This jira was created before the creation of the new Hadoop sub-project HDDS and was committed to the HDFS-7240 branch. Even after merging HDFS-7240 to trunk, Hadoop releases don't contain hdds/ozone. The correct fix version for this jira should be 0.2.1 of Ozone. Since these jiras were created before the sub-project was created, we don't have any correct fix version for this in the HDFS project.

> Ozone: Update IP and hostname in Datanode from SCM's response to the register
> call
> --
>
> Key: HDFS-13348
> URL: https://issues.apache.org/jira/browse/HDFS-13348
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Nanda kumar
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDFS-13348-HDFS-7240.000.patch,
> HDFS-13348-HDFS-7240.001.patch, HDFS-13348-HDFS-7240.002.patch
>
>
> Whenever a Datanode registers with SCM, the SCM resolves the IP address and
> hostname of the Datanode from the RPC call. This IP address and hostname
> should be sent back to Datanode in the response to register call and the
> Datanode has to update the values from the response to its
> {{DatanodeDetails}}.
[jira] [Updated] (HDFS-13348) Ozone: Update IP and hostname in Datanode from SCM's response to the register call
[ https://issues.apache.org/jira/browse/HDFS-13348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar updated HDFS-13348:
---
Fix Version/s: HDFS-7240

> Ozone: Update IP and hostname in Datanode from SCM's response to the register
> call
> --
>
> Key: HDFS-13348
> URL: https://issues.apache.org/jira/browse/HDFS-13348
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Nanda kumar
> Assignee: Shashikant Banerjee
> Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-13348-HDFS-7240.000.patch,
> HDFS-13348-HDFS-7240.001.patch, HDFS-13348-HDFS-7240.002.patch
>
>
> Whenever a Datanode registers with SCM, the SCM resolves the IP address and
> hostname of the Datanode from the RPC call. This IP address and hostname
> should be sent back to Datanode in the response to register call and the
> Datanode has to update the values from the response to its
> {{DatanodeDetails}}.
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Status: Patch Available (was: In Progress) > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Attachment: HDDS-801.001.patch > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-837) Persist originNodeId as part of .container file in datanode
Nanda kumar created HDDS-837: Summary: Persist originNodeId as part of .container file in datanode Key: HDDS-837 URL: https://issues.apache.org/jira/browse/HDDS-837 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Reporter: Nanda kumar Assignee: Nanda kumar To differentiate the replica of QUASI_CLOSED containers we need {{originNodeId}} field. With this field, we can uniquely identify a QUASI_CLOSED container replica. This will be needed when we want to CLOSE a QUASI_CLOSED container. This field will be set by the node where the container is created and stored as part of {{.container}} file and will be sent as part of ContainerReport to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDDS-801: - Attachment: HDDS-801.002.patch > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch, > HDDS-801.002.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-801) Quasi close the container when close is not executed via Ratis
[ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686242#comment-16686242 ] Nanda kumar commented on HDDS-801: -- /cc [~jnp] [~arpitagarwal] [~msingh] [~hanishakoneru] > Quasi close the container when close is not executed via Ratis > -- > > Key: HDDS-801 > URL: https://issues.apache.org/jira/browse/HDDS-801 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.3.0 >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Attachments: HDDS-801.000.patch, HDDS-801.001.patch, > HDDS-801.002.patch > > > When datanode received CloseContainerCommand and the replication type is not > RATIS, we should QUASI close the container. After quasi-closing the container > an ICR has to be sent to SCM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-798) Storage-class is showing incorrectly
[ https://issues.apache.org/jira/browse/HDDS-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679672#comment-16679672 ] Nanda kumar commented on HDDS-798: -- Created HDDS-823 for {{RestClient#getKeyDetails}} failure. > Storage-class is showing incorrectly > > > Key: HDDS-798 > URL: https://issues.apache.org/jira/browse/HDDS-798 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Fix For: 0.3.0, 0.4.0 > > Attachments: HDDS-798.00.patch > > > After HDDS-712, we support storage-class. > For list-objects, even if a key has its storage-class set to REDUCED_REDUNDANCY, > it still shows STANDARD. > In the list-object response code, the value is hardcoded as below: > keyMetadata.setStorageClass("STANDARD");
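One way to picture the fix for HDDS-798 is to copy the storage class from the key instead of hardcoding it. The sketch below is only an illustration under stated assumptions: `KeyInfoSketch`, `KeyMetadataSketch`, and `ListObjectsSketch` are hypothetical types, and how the real gateway stores or derives a key's storage class is not stated in the issue. Only the `setStorageClass` call shape comes from the quoted line.

```java
// Hypothetical key record; the real key representation is not shown in the issue.
final class KeyInfoSketch {
    final String name;
    final String storageClass; // e.g. "STANDARD" or "REDUCED_REDUNDANCY"

    KeyInfoSketch(String name, String storageClass) {
        this.name = name;
        this.storageClass = storageClass;
    }
}

// Hypothetical stand-in for the list-objects response entry.
final class KeyMetadataSketch {
    private String key;
    private String storageClass;

    void setKey(String key) { this.key = key; }
    void setStorageClass(String storageClass) { this.storageClass = storageClass; }
    String getKey() { return key; }
    String getStorageClass() { return storageClass; }
}

final class ListObjectsSketch {
    static KeyMetadataSketch toMetadata(KeyInfoSketch key) {
        KeyMetadataSketch keyMetadata = new KeyMetadataSketch();
        keyMetadata.setKey(key.name);
        // Use the key's own value rather than keyMetadata.setStorageClass("STANDARD").
        keyMetadata.setStorageClass(key.storageClass);
        return keyMetadata;
    }
}
```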
[jira] [Commented] (HDDS-733) Create container if not exist, as part of chunk write
[ https://issues.apache.org/jira/browse/HDDS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681705#comment-16681705 ] Nanda kumar commented on HDDS-733: -- +1, LGTM. Will fix the checkstyle issues while committing. > Create container if not exist, as part of chunk write > - > > Key: HDDS-733 > URL: https://issues.apache.org/jira/browse/HDDS-733 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Reporter: Nanda kumar >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-733.001.patch, HDDS-733.002.patch, > HDDS-733.003.patch, HDDS-733.004.patch > > > The current implementation requires a container to be created in datanode > before starting the chunk write. This can be optimized by creating the > container on the first chunk write. > During chunk write, if the container is missing, we can go ahead and create > the container. > Along with this change ALLOCATED and CREATING container states can be removed > as they were used to track which containers have been successfully created. > Also there is a shouldCreateContainer flag which is used by client to know if > it needs to create container. This flag can be removed.
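The "create on first chunk write" idea from HDDS-733 can be sketched as follows. This is an illustration under assumptions, not the patch: `ChunkWriteSketch` and its in-memory map are hypothetical, and a real datanode would persist containers and handle concurrent writers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory datanode; real containers live on disk.
final class ChunkWriteSketch {
    private final Map<Long, List<byte[]>> containers = new HashMap<>();
    private int containersCreated = 0;

    void writeChunk(long containerId, byte[] chunk) {
        List<byte[]> chunks = containers.get(containerId);
        if (chunks == null) {
            // Container is missing: create it as part of the write, so the
            // client needs no separate create call or shouldCreateContainer flag.
            chunks = new ArrayList<>();
            containers.put(containerId, chunks);
            containersCreated++;
        }
        chunks.add(chunk.clone());
    }

    int containersCreated() { return containersCreated; }

    int chunkCount(long containerId) {
        List<byte[]> chunks = containers.get(containerId);
        return chunks == null ? 0 : chunks.size();
    }
}
```

Because the write path itself creates the container, the sketch needs no intermediate state to track whether creation has finished, which matches the issue's reasoning for dropping ALLOCATED and CREATING.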