[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222819
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 04/Apr/19 05:50
Start Date: 04/Apr/19 05:50
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222819)
Time Spent: 5h  (was: 4h 50m)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpointed on demand. Ratis 
> snapshots will persist to disk only the last log index applied by the State 
> Machine. This index will be stored in a file in the OM metadata directory.
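
The description above says the snapshot only records the last applied log index in a file under the OM metadata directory. A minimal sketch of that idea follows. It is not the code from the pull request: the class name `SnapshotIndexFile`, the file name `ratisSnapshotIndex`, and the convention of returning 0 when no snapshot exists are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper; not the class added by the pull request. */
final class SnapshotIndexFile {
  // Assumed file name, chosen for illustration only.
  private static final String FILE_NAME = "ratisSnapshotIndex";

  private final Path file;

  SnapshotIndexFile(Path metadataDir) {
    this.file = metadataDir.resolve(FILE_NAME);
  }

  /** Writes the index to a temporary file, then renames it over the old one. */
  void save(long lastAppliedIndex) throws IOException {
    Path tmp = file.resolveSibling(FILE_NAME + ".tmp");
    byte[] bytes = Long.toString(lastAppliedIndex).getBytes(StandardCharsets.UTF_8);
    Files.write(tmp, bytes);
    Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
  }

  /** Returns the saved index, or 0 if no snapshot has been recorded yet. */
  long load() throws IOException {
    if (!Files.exists(file)) {
      return 0L;
    }
    String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    return Long.parseLong(text.trim());
  }
}
```

Writing to a temporary file first means a crash during `save` leaves the previous index readable, which matters if the file is consulted on restart.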



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222818
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 04/Apr/19 05:49
Start Date: 04/Apr/19 05:49
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479759384
 
 
   Thank you @bharatviswa504 for the reviews.
   The CI unit and acceptance test failures are not related to this PR. I will 
merge the PR into trunk.
 



Issue Time Tracking
---

Worklog Id: (was: 222818)
Time Spent: 4h 50m  (was: 4h 40m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222797
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 04/Apr/19 04:07
Start Date: 04/Apr/19 04:07
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479742555
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 24 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 59 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1006 | trunk passed |
   | +1 | compile | 964 | trunk passed |
   | +1 | checkstyle | 192 | trunk passed |
   | -1 | mvnsite | 37 | ozone-manager in trunk failed. |
   | +1 | shadedclient | 1092 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | -1 | findbugs | 29 | ozone-manager in trunk failed. |
   | +1 | javadoc | 122 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 21 | Maven dependency ordering for patch |
   | +1 | mvninstall | 129 | the patch passed |
   | +1 | compile | 938 | the patch passed |
   | +1 | javac | 938 | the patch passed |
   | +1 | checkstyle | 209 | the patch passed |
   | +1 | mvnsite | 148 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 1 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 614 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 198 | the patch passed |
   | +1 | javadoc | 118 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 82 | common in the patch passed. |
   | +1 | unit | 39 | common in the patch passed. |
   | -1 | unit | 1545 | integration-test in the patch failed. |
   | +1 | unit | 50 | ozone-manager in the patch passed. |
   | +1 | asflicense | 44 | The patch does not generate ASF License warnings. |
   | | | 7827 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-651/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
   | uname | Linux 78ec5a2dac0e 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 7b5b783 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/7/artifact/out/branch-mvnsite-hadoop-ozone_ozone-manager.txt |
   | findbugs | v3.1.0-RC1 |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/7/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/7/artifact/out/patch-unit-hadoop-ozone_integration-test.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/7/testReport/ |
   | Max. process+thread count | 3676 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/7/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 222797)
Time Spent: 4h 40m  (was: 4.5h)


[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222743
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 04/Apr/19 01:20
Start Date: 04/Apr/19 01:20
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479713962
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 24 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 64 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1011 | trunk passed |
   | +1 | compile | 963 | trunk passed |
   | +1 | checkstyle | 191 | trunk passed |
   | +1 | mvnsite | 217 | trunk passed |
   | +1 | shadedclient | 1135 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 213 | trunk passed |
   | +1 | javadoc | 169 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 61 | Maven dependency ordering for patch |
   | -1 | mvninstall | 25 | integration-test in the patch failed. |
   | +1 | compile | 923 | the patch passed |
   | +1 | javac | 923 | the patch passed |
   | +1 | checkstyle | 193 | the patch passed |
   | +1 | mvnsite | 171 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 2 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 656 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 214 | the patch passed |
   | +1 | javadoc | 139 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 73 | common in the patch passed. |
   | +1 | unit | 40 | common in the patch passed. |
   | -1 | unit | 1095 | integration-test in the patch failed. |
   | +1 | unit | 50 | ozone-manager in the patch passed. |
   | +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
   | | | 7661 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.client.rpc.TestOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestContainerStateMachineFailures |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-651/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
   | uname | Linux abb72d223c51 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 7b5b783 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/6/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/6/artifact/out/patch-unit-hadoop-ozone_integration-test.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/6/testReport/ |
   | Max. process+thread count | 3897 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/6/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 222743)
Time Spent: 4.5h  (was: 4h 20m)


[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222679
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 03/Apr/19 23:00
Start Date: 03/Apr/19 23:00
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479688829
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 25 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 24 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1014 | trunk passed |
   | +1 | compile | 951 | trunk passed |
   | +1 | checkstyle | 189 | trunk passed |
   | -1 | mvnsite | 28 | ozone-manager in trunk failed. |
   | +1 | shadedclient | 1015 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | -1 | findbugs | 37 | ozone-manager in trunk failed. |
   | +1 | javadoc | 166 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | -1 | mvninstall | 24 | integration-test in the patch failed. |
   | +1 | compile | 904 | the patch passed |
   | +1 | javac | 904 | the patch passed |
   | +1 | checkstyle | 195 | the patch passed |
   | +1 | mvnsite | 190 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 1 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 678 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 224 | the patch passed |
   | +1 | javadoc | 162 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 91 | common in the patch passed. |
   | +1 | unit | 47 | common in the patch passed. |
   | -1 | unit | 1162 | integration-test in the patch failed. |
   | +1 | unit | 60 | ozone-manager in the patch passed. |
   | +1 | asflicense | 55 | The patch does not generate ASF License warnings. |
   | | | 7584 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient |
   |   | hadoop.ozone.TestMiniChaosOzoneCluster |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-651/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
   | uname | Linux b30eca944121 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 366186d |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/5/artifact/out/branch-mvnsite-hadoop-ozone_ozone-manager.txt |
   | findbugs | v3.1.0-RC1 |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/5/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/5/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/5/artifact/out/patch-unit-hadoop-ozone_integration-test.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/5/testReport/ |
   | Max. process+thread count | 4333 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/5/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222601&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222601
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 03/Apr/19 20:20
Start Date: 03/Apr/19 20:20
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479643547
 
 
   +1 LGTM.
 



Issue Time Tracking
---

Worklog Id: (was: 222601)
Time Spent: 4h 10m  (was: 4h)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222599&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222599
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 03/Apr/19 20:19
Start Date: 03/Apr/19 20:19
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271915226
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1617,7 +1617,7 @@
 
   
     <name>ozone.om.ratis.snapshot.auto.trigger.threshold</name>
-    <value>400000L</value>
+    <value>400000</value>
 
 Review comment:
   Thanks for the info. We can tweak this later.
 



Issue Time Tracking
---

Worklog Id: (was: 222599)
Time Spent: 4h  (was: 3h 50m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-03 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222597&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222597
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 03/Apr/19 20:17
Start Date: 03/Apr/19 20:17
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271914300
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1617,7 +1617,7 @@
 
   
     <name>ozone.om.ratis.snapshot.auto.trigger.threshold</name>
-    <value>400000L</value>
+    <value>400000</value>
 
 Review comment:
   I don't think 400k is too small a number. In HDFS, the default number 
of transactions after which a checkpoint is saved is 1M. Also, the Ratis log 
index is not the same as the actual transaction count, because the log also 
contains many internal Ratis entries.
   But we can re-tweak the default after some testing.
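
To make the threshold discussion concrete, here is a minimal sketch of a log-index based trigger. It is not Ratis code: the class `SnapshotTrigger` and its methods are hypothetical, and the comparison rule (snapshot once the applied index has advanced by at least the threshold since the last snapshot) is an assumption about how such a setting is typically used.

```java
/** Hypothetical illustration of a log-index based snapshot trigger. */
final class SnapshotTrigger {
  private final long threshold;
  private long lastSnapshotIndex;

  SnapshotTrigger(long threshold) {
    if (threshold <= 0) {
      throw new IllegalArgumentException("threshold must be positive");
    }
    this.threshold = threshold;
  }

  /**
   * Returns true once at least `threshold` log entries have been applied
   * since the last snapshot. The index counts every log entry, including
   * internal ones, so fewer client transactions than `threshold` may have
   * been applied when this fires.
   */
  boolean shouldSnapshot(long lastAppliedIndex) {
    return lastAppliedIndex - lastSnapshotIndex >= threshold;
  }

  /** Records the index at which a snapshot was taken. */
  void snapshotTaken(long index) {
    lastSnapshotIndex = index;
  }
}
```

With a threshold of 400000, `shouldSnapshot(399999)` is false and `shouldSnapshot(400000)` is true; after `snapshotTaken(400000)` the next trigger point moves to index 800000.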
 



Issue Time Tracking
---

Worklog Id: (was: 222597)
Time Spent: 3h 50m  (was: 3h 40m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222030&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222030
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:47
Start Date: 02/Apr/19 21:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479218191
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 22 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 71 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1133 | trunk passed |
   | +1 | compile | 949 | trunk passed |
   | +1 | checkstyle | 208 | trunk passed |
   | +1 | mvnsite | 204 | trunk passed |
   | +1 | shadedclient | 1166 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 198 | trunk passed |
   | +1 | javadoc | 145 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | -1 | mvninstall | 26 | integration-test in the patch failed. |
   | +1 | compile | 1000 | the patch passed |
   | +1 | javac | 1000 | the patch passed |
   | +1 | checkstyle | 213 | the patch passed |
   | +1 | mvnsite | 163 | the patch passed |
   | -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
   | +1 | xml | 1 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 784 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 227 | the patch passed |
   | +1 | javadoc | 146 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 90 | common in the patch passed. |
   | +1 | unit | 44 | common in the patch passed. |
   | -1 | unit | 754 | integration-test in the patch failed. |
   | +1 | unit | 56 | ozone-manager in the patch passed. |
   | +1 | asflicense | 46 | The patch does not generate ASF License warnings. |
   | | | 7665 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.om.TestScmChillMode |
   |   | hadoop.hdds.scm.pipeline.TestRatisPipelineUtils |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
   | uname | Linux 23206903a05e 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / cf26811 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt |
   | whitespace | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/whitespace-eol.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/patch-unit-hadoop-ozone_integration-test.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/testReport/ |
   | Max. process+thread count | 4419 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222029
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:47
Start Date: 02/Apr/19 21:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271508687
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -534,4 +536,84 @@ public void testReadRequest() throws Exception {
   proxyProvider.getCurrentProxyOMNodeId());
 }
   }
+
+  @Test
+  public void testOMRatisSnapshot() throws Exception {
+    String userName = "user" + RandomStringUtils.randomNumeric(5);
+    String adminName = "admin" + RandomStringUtils.randomNumeric(5);
+    String volumeName = "volume" + RandomStringUtils.randomNumeric(5);
+    String bucketName = "bucket" + RandomStringUtils.randomNumeric(5);
+
+    VolumeArgs createVolumeArgs = VolumeArgs.newBuilder()
+        .setOwner(userName)
+        .setAdmin(adminName)
+        .build();
+
+    objectStore.createVolume(volumeName, createVolumeArgs);
+    OzoneVolume retVolumeinfo = objectStore.getVolume(volumeName);
+
+    retVolumeinfo.createBucket(bucketName);
+    OzoneBucket ozoneBucket = retVolumeinfo.getBucket(bucketName);
+
+    String leaderOMNodeId = objectStore.getClientProxy().getOMProxyProvider()
+        .getCurrentProxyOMNodeId();
+    OzoneManager ozoneManager = cluster.getOzoneManager(leaderOMNodeId);
+
+    // Send commands to ratis to increase the log index so that ratis
+    // triggers a snapshot on the state machine.
+
+    long appliedLogIndex = 0;
+    while (appliedLogIndex <= SNAPSHOT_THRESHOLD) {
+      createKey(ozoneBucket);
+      appliedLogIndex = ozoneManager.getOmRatisServer()
+          .getStateMachineLastAppliedIndex();
+    }
+
+    GenericTestUtils.waitFor(() -> {
+      if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+        return true;
+      }
+      return false;
+    }, 1000, 10);
+
+    // The current lastAppliedLogIndex on the state machine should be greater
+    // than or equal to the saved snapshot index.
+    long smLastAppliedIndex =
+        ozoneManager.getOmRatisServer().getStateMachineLastAppliedIndex();
+    long ratisSnapshotIndex = ozoneManager.loadRatisSnapshotIndex();
+    Assert.assertTrue("LastAppliedIndex on OM State Machine ("
+            + smLastAppliedIndex + ") is less than the saved snapshot index("
+            + ratisSnapshotIndex + ").",
+        smLastAppliedIndex >= ratisSnapshotIndex);
+
+    // Add more transactions to Ratis to trigger another snapshot
+    while (appliedLogIndex <= (smLastAppliedIndex + SNAPSHOT_THRESHOLD)) {
+      createKey(ozoneBucket);
+      appliedLogIndex = ozoneManager.getOmRatisServer()
+          .getStateMachineLastAppliedIndex();
+    }
+
+    GenericTestUtils.waitFor(() -> {
+      if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+        return true;
+      }
+      return false;
+    }, 1000, 10);
+
+    // The new snapshot index must be greater than the previous snapshot index
+    long ratisSnapshotIndexNew = ozoneManager.loadRatisSnapshotIndex();
+    Assert.assertTrue("Latest snapshot index must be greater than previous " +
+        "snapshot indices", ratisSnapshotIndexNew > ratisSnapshotIndex);  
 
 Review comment:
   whitespace:end of line
   
 



Issue Time Tracking
---

Worklog Id: (was: 222029)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221920=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221920
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:48
Start Date: 02/Apr/19 18:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479143795
 
 
   I think we need to rebase with trunk to get a Yetus run.
 



Issue Time Tracking
---

Worklog Id: (was: 221920)
Time Spent: 3h 20m  (was: 3h 10m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221918
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:47
Start Date: 02/Apr/19 18:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479143473
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 0 | Docker mode activated. |
   | -1 | patch | 7 | https://github.com/apache/hadoop/pull/651 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 221918)
Time Spent: 3h 10m  (was: 3h)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221917
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:45
Start Date: 02/Apr/19 18:45
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271446311
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1617,7 +1617,7 @@
 
   
 ozone.om.ratis.snapshot.auto.trigger.threshold
-40L
+40
 
 Review comment:
   Then I think having a lower value makes sense. 
   Do we want to revisit this later and, for now, just go with the Ratis default 
value of 400k?
 



Issue Time Tracking
---

Worklog Id: (was: 221917)
Time Spent: 3h  (was: 2h 50m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221916=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221916
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:39
Start Date: 02/Apr/19 18:39
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271444007
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1617,7 +1617,7 @@
 
   
 ozone.om.ratis.snapshot.auto.trigger.threshold
-40L
+40
 
 Review comment:
   Yes that's right
 



Issue Time Tracking
---

Worklog Id: (was: 221916)
Time Spent: 2h 50m  (was: 2h 40m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221915
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:38
Start Date: 02/Apr/19 18:38
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271443794
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -161,7 +161,10 @@ public TransactionContext startTransaction(
   @Override
   public long takeSnapshot() throws IOException {
 LOG.info("Saving Ratis snapshot on the OM.");
-return ozoneManager.saveRatisSnapshot();
+if (ozoneManager != null) {
+  return ozoneManager.saveRatisSnapshot();
+}
+return 0;
 
 Review comment:
   We do return the last applied index which is being stored on disk. The null 
check is for tests only (TestOzoneManagerRatisServer).
   Ratis currently does not do anything with the returned value.
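
As a rough illustration of the behaviour described in this comment, the sketch below persists the last applied index to a file and guards against a missing OM reference. `SnapshotIndexStore` and `SnapshotStateMachineSketch` are hypothetical stand-ins, not the actual `OzoneManager` or `OzoneManagerStateMachine` classes, and the file format (a decimal number in a text file) is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for the OM side: stores the snapshot index in a file. */
class SnapshotIndexStore {
  private final Path indexFile;
  private volatile long lastAppliedIndex;

  SnapshotIndexStore(Path indexFile) {
    this.indexFile = indexFile;
  }

  void setLastAppliedIndex(long index) {
    this.lastAppliedIndex = index;
  }

  /** Writes the last applied index to disk and returns the value written. */
  long saveRatisSnapshot() throws IOException {
    long index = lastAppliedIndex;
    Files.write(indexFile,
        Long.toString(index).getBytes(StandardCharsets.UTF_8));
    return index;
  }

  /** Returns the saved index, or 0 if no snapshot file exists yet. */
  long loadRatisSnapshotIndex() throws IOException {
    if (!Files.exists(indexFile)) {
      return 0L;
    }
    String text = new String(Files.readAllBytes(indexFile),
        StandardCharsets.UTF_8).trim();
    return Long.parseLong(text);
  }
}

/** Hypothetical state machine wrapper showing the null guard from the diff. */
class SnapshotStateMachineSketch {
  private final SnapshotIndexStore store; // may be null in unit tests

  SnapshotStateMachineSketch(SnapshotIndexStore store) {
    this.store = store;
  }

  long takeSnapshot() throws IOException {
    if (store != null) {
      return store.saveRatisSnapshot();
    }
    return 0L; // only reached when no OM is wired in (assumed: tests only)
  }
}
```

With a store attached, `takeSnapshot()` returns the index it just wrote; with `null`, it returns 0 and writes nothing.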
 



Issue Time Tracking
---

Worklog Id: (was: 221915)
Time Spent: 2h 40m  (was: 2.5h)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220901=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220901
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 30/Mar/19 17:36
Start Date: 30/Mar/19 17:36
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270634511
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -161,7 +161,10 @@ public TransactionContext startTransaction(
   @Override
   public long takeSnapshot() throws IOException {
 LOG.info("Saving Ratis snapshot on the OM.");
-return ozoneManager.saveRatisSnapshot();
+if (ozoneManager != null) {
+  return ozoneManager.saveRatisSnapshot();
+}
+return 0;
 
 Review comment:
   Question: Here we are returning 0. Should we return the lastAppliedIndex we 
are writing? How will this be used by Ratis?
 



Issue Time Tracking
---

Worklog Id: (was: 220901)
Time Spent: 2.5h  (was: 2h 20m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220902
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 30/Mar/19 17:36
Start Date: 30/Mar/19 17:36
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270634602
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1617,7 +1617,7 @@
 
   
 ozone.om.ratis.snapshot.auto.trigger.threshold
-40L
+40
 
 Review comment:
   Question: If we take a snapshot every 400k transactions and 200k more 
transactions happen after the last snapshot, then when a follower OM restarts it 
only knows that it has applied up to 400k. Will it apply those 200k transactions again?
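
The arithmetic behind this question can be sketched as follows. This is a simplified model, assuming that a restarting OM re-applies every committed log entry above the saved snapshot index; the class and method names are hypothetical and not part of Ratis or Ozone.

```java
/** Sketch: how many log entries a restarting OM would re-apply. */
class ReplayAfterRestartSketch {
  /**
   * Entries with an index in (snapshotIndex, lastCommittedIndex] are not
   * covered by the saved snapshot index, so they are counted as replayed.
   */
  static long entriesToReplay(long snapshotIndex, long lastCommittedIndex) {
    return Math.max(0L, lastCommittedIndex - snapshotIndex);
  }

  public static void main(String[] args) {
    long snapshotIndex = 400_000L;      // last saved snapshot index
    long lastCommittedIndex = 600_000L; // 200k transactions after the snapshot
    System.out.println(entriesToReplay(snapshotIndex, lastCommittedIndex));
  }
}
```

Under this model a lower snapshot threshold shortens the replay window, which is the trade-off the reviewers discuss elsewhere in this thread.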
 



Issue Time Tracking
---

Worklog Id: (was: 220902)
Time Spent: 2.5h  (was: 2h 20m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220342
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 29/Mar/19 01:28
Start Date: 29/Mar/19 01:28
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270255127
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -534,4 +536,84 @@ public void testReadRequest() throws Exception {
   proxyProvider.getCurrentProxyOMNodeId());
 }
   }
+
+  @Test
+  public void testOMRatisSnapshot() throws Exception {
+String userName = "user" + RandomStringUtils.randomNumeric(5);
+String adminName = "admin" + RandomStringUtils.randomNumeric(5);
+String volumeName = "volume" + RandomStringUtils.randomNumeric(5);
+String bucketName = "bucket" + RandomStringUtils.randomNumeric(5);
+
+VolumeArgs createVolumeArgs = VolumeArgs.newBuilder()
+.setOwner(userName)
+.setAdmin(adminName)
+.build();
+
+objectStore.createVolume(volumeName, createVolumeArgs);
+OzoneVolume retVolumeinfo = objectStore.getVolume(volumeName);
+
+retVolumeinfo.createBucket(bucketName);
+OzoneBucket ozoneBucket = retVolumeinfo.getBucket(bucketName);
+
+String leaderOMNodeId = objectStore.getClientProxy().getOMProxyProvider()
+.getCurrentProxyOMNodeId();
+OzoneManager ozoneManager = cluster.getOzoneManager(leaderOMNodeId);
+
+// Send commands to ratis to increase the log index so that ratis
+// triggers a snapshot on the state machine.
+
+long appliedLogIndex = 0;
+while (appliedLogIndex <= SNAPSHOT_THRESHOLD) {
+  createKey(ozoneBucket);
+  appliedLogIndex = ozoneManager.getOmRatisServer()
+  .getStateMachineLastAppliedIndex();
+}
+
+GenericTestUtils.waitFor(() -> {
+  if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+return true;
+  }
+  return false;
+}, 1000, 10);
+
+// The current lastAppliedLogIndex on the state machine should be greater
+// than or equal to the saved snapshot index.
+long smLastAppliedIndex =
+ozoneManager.getOmRatisServer().getStateMachineLastAppliedIndex();
+long ratisSnapshotIndex = ozoneManager.loadRatisSnapshotIndex();
+Assert.assertTrue("LastAppliedIndex on OM State Machine ("
++ smLastAppliedIndex + ") is less than the saved snapshot index("
++ ratisSnapshotIndex + ").",
+smLastAppliedIndex >= ratisSnapshotIndex);
+
+// Add more transactions to Ratis to trigger another snapshot
+while (appliedLogIndex <= (smLastAppliedIndex + SNAPSHOT_THRESHOLD)) {
+  createKey(ozoneBucket);
+  appliedLogIndex = ozoneManager.getOmRatisServer()
+  .getStateMachineLastAppliedIndex();
+}
+
+GenericTestUtils.waitFor(() -> {
+  if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+return true;
+  }
+  return false;
+}, 1000, 10);
+
+// The new snapshot index must be greater than the previous snapshot index
+long ratisSnapshotIndexNew = ozoneManager.loadRatisSnapshotIndex();
+Assert.assertTrue("Latest snapshot index must be greater than previous " +
+"snapshot indices", ratisSnapshotIndexNew > ratisSnapshotIndex);  
 
 Review comment:
   whitespace:end of line
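
A side note on the test in this diff: the second wait checks `loadRatisSnapshotIndex() > 0`, which already holds after the first snapshot, so it may return before a second snapshot has been taken. A possible alternative, sketched under assumptions, is to wait until the index exceeds the previously observed value. The `waitFor` helper below is a local stand-in written for this example (not `GenericTestUtils.waitFor`), and the `AtomicLong` simulates the snapshot index.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

class WaitForNewSnapshotSketch {
  /** Polls check every intervalMs until it is true; fails after timeoutMs. */
  static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("Condition not met in " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for ozoneManager.loadRatisSnapshotIndex().
    AtomicLong snapshotIndex = new AtomicLong(50L);
    long previous = snapshotIndex.get();

    // Simulates a second snapshot being taken a little later.
    Thread snapshotter = new Thread(() -> {
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      snapshotIndex.set(120L);
    });
    snapshotter.start();

    // Wait for the index to advance past the previous value, not just past 0.
    waitFor(() -> snapshotIndex.get() > previous, 10, 5000);
    System.out.println("new snapshot index: " + snapshotIndex.get());
  }
}
```

Waiting on `> previous` makes the final "greater than previous snapshot index" assertion follow directly from the wait condition.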
   
 



Issue Time Tracking
---

Worklog Id: (was: 220342)
Time Spent: 2h 10m  (was: 2h)


[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220343
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 29/Mar/19 01:28
Start Date: 29/Mar/19 01:28
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-477830367
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 23 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 22 | Maven dependency ordering for branch |
   | +1 | mvninstall | 975 | trunk passed |
   | +1 | compile | 936 | trunk passed |
   | +1 | checkstyle | 229 | trunk passed |
   | -1 | mvnsite | 54 | integration-test in trunk failed. |
   | +1 | shadedclient | 1140 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 196 | trunk passed |
   | +1 | javadoc | 160 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 24 | Maven dependency ordering for patch |
   | -1 | mvninstall | 25 | integration-test in the patch failed. |
   | +1 | compile | 881 | the patch passed |
   | +1 | javac | 881 | the patch passed |
   | +1 | checkstyle | 188 | the patch passed |
   | +1 | mvnsite | 170 | the patch passed |
   | -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
   | +1 | xml | 2 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 673 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 223 | the patch passed |
   | +1 | javadoc | 159 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 74 | common in the patch passed. |
   | +1 | unit | 47 | common in the patch passed. |
   | -1 | unit | 597 | integration-test in the patch failed. |
   | +1 | unit | 57 | ozone-manager in the patch passed. |
   | +1 | asflicense | 51 | The patch does not generate ASF License warnings. |
   | | | 7018 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.ozShell.TestOzoneShell |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-651/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
   | uname | Linux d097c6508e41 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / d7a2f94 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/2/artifact/out/branch-mvnsite-hadoop-ozone_integration-test.txt |
   | findbugs | v3.1.0-RC1 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/2/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt |
   | whitespace | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/2/artifact/out/whitespace-eol.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/2/artifact/out/patch-unit-hadoop-ozone_integration-test.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/2/testReport/ |
   | Max. process+thread count | 4099 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/2/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking

[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220318=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220318
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 23:32
Start Date: 28/Mar/19 23:32
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270237034
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -115,7 +117,60 @@ public TransactionContext startTransaction(
   return ctxt;
 }
 return handleStartTransactionRequests(raftClientRequest, omRequest);
+  }
+
+  /*
+   * Apply a committed log entry to the state machine.
+   */
+  @Override
+  public CompletableFuture<Message> applyTransaction(TransactionContext trx) {
+try {
+  OMRequest request = OMRatisHelper.convertByteStringToOMRequest(
+  trx.getStateMachineLogEntry().getLogData());
+  long trxLogIndex = trx.getLogEntry().getIndex();
+  CompletableFuture<Message> future = CompletableFuture
+  .supplyAsync(() -> runCommand(request, trxLogIndex));
+  return future;
+} catch (IOException e) {
+  return completeExceptionally(e);
+}
+  }
+
+  /**
+   * Query the state machine. The request must be read-only.
+   */
+  @Override
+  public CompletableFuture<Message> query(Message request) {
+try {
+  OMRequest omRequest = OMRatisHelper.convertByteStringToOMRequest(
+  request.getContent());
+  return CompletableFuture.completedFuture(queryCommand(omRequest));
+} catch (IOException e) {
+  return completeExceptionally(e);
+}
+  }
+
+  /**
+   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
+   * is the log index corresponding to the last applied transaction on the OM
+   * State Machine.
+   *
+   * @return the last applied index on the state machine which has been
+   * stored in the snapshot file.
+   */
+  @Override
+  public long takeSnapshot() throws IOException {
+LOG.info("Saving Ratis snapshot on the OM.");
+return ozoneManager.saveRatisSnapshot();
 
 Review comment:
   done. flushing the DB before saving a snapshot.
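
The ordering mentioned in this comment (flush the DB first, then save the snapshot index) can be sketched as below. The rationale shown in the comments, that the persisted index should not run ahead of what the DB has made durable, is an assumption on my part and is not stated in the thread. `MetadataDb` and `IndexWriter` are hypothetical interfaces, not Ozone or RocksDB APIs.

```java
import java.util.ArrayList;
import java.util.List;

class FlushThenSnapshotSketch {
  /** Hypothetical DB handle; only the flush call matters here. */
  interface MetadataDb {
    void flush();
  }

  /** Hypothetical sink that persists the snapshot index. */
  interface IndexWriter {
    void write(long index);
  }

  /**
   * Flushes the DB before recording the index, so that (by assumption) the
   * recorded index never refers to state that is still only in memory.
   */
  static long saveSnapshot(MetadataDb db, IndexWriter writer,
      long lastAppliedIndex) {
    db.flush();
    writer.write(lastAppliedIndex);
    return lastAppliedIndex;
  }

  public static void main(String[] args) {
    List<String> steps = new ArrayList<>();
    long saved = saveSnapshot(
        () -> steps.add("flush"),
        index -> steps.add("write " + index),
        7L);
    System.out.println(steps + " -> " + saved);
  }
}
```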
 



Issue Time Tracking
---

Worklog Id: (was: 220318)
Time Spent: 1h 50m  (was: 1h 40m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220319=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220319
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 23:32
Start Date: 28/Mar/19 23:32
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-477808698
 
 
   Thank you Bharat for the review. I have updated the patch to address your 
comments.
 



Issue Time Tracking
---

Worklog Id: (was: 220319)
Time Spent: 2h  (was: 1h 50m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220315=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220315
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 23:31
Start Date: 28/Mar/19 23:31
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270236900
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -534,4 +536,61 @@ public void testReadRequest() throws Exception {
   proxyProvider.getCurrentProxyOMNodeId());
 }
   }
+
+  @Test
+  public void testOMRatisSnapshot() throws Exception {
+String userName = "user" + RandomStringUtils.randomNumeric(5);
+String adminName = "admin" + RandomStringUtils.randomNumeric(5);
+String volumeName = "volume" + RandomStringUtils.randomNumeric(5);
+String bucketName = "bucket" + RandomStringUtils.randomNumeric(5);
+
+VolumeArgs createVolumeArgs = VolumeArgs.newBuilder()
+.setOwner(userName)
+.setAdmin(adminName)
+.build();
+
+objectStore.createVolume(volumeName, createVolumeArgs);
+OzoneVolume retVolumeinfo = objectStore.getVolume(volumeName);
+
+retVolumeinfo.createBucket(bucketName);
+OzoneBucket ozoneBucket = retVolumeinfo.getBucket(bucketName);
+
+String leaderOMNodeId = objectStore.getClientProxy().getOMProxyProvider()
+.getCurrentProxyOMNodeId();
+OzoneManager ozoneManager = cluster.getOzoneManager(leaderOMNodeId);
+
+// Send commands to ratis to increase the log index so that ratis
+// triggers a snapshot on the state machine.
+
+long appliedLogIndex = 0;
+while (appliedLogIndex <= SNAPSHOT_THRESHOLD) {
+  String keyName = "key" + RandomStringUtils.randomNumeric(5);
+  String data = "data" + RandomStringUtils.randomNumeric(5);
+  OzoneOutputStream ozoneOutputStream = ozoneBucket.createKey(keyName,
+  data.length(), ReplicationType.STAND_ALONE,
+  ReplicationFactor.ONE, new HashMap<>());
+  ozoneOutputStream.write(data.getBytes(), 0, data.length());
+  ozoneOutputStream.close();
+
+  appliedLogIndex = ozoneManager.getOmRatisServer()
+  .getStateMachineLastAppliedIndex();
+}
+
+GenericTestUtils.waitFor(() -> {
+  if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+return true;
+  }
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 220315)
Time Spent: 1h 20m  (was: 1h 10m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220317=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220317
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 23:31
Start Date: 28/Mar/19 23:31
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270236979
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -308,56 +357,35 @@ private IOException constructExceptionForFailedRequest(
 STATUS_CODE + omResponse.getStatus());
   }
 
-  /*
-   * Apply a committed log entry to the state machine.
-   */
-  @Override
-  public CompletableFuture<Message> applyTransaction(TransactionContext trx) {
-try {
-  OMRequest request = OMRatisHelper.convertByteStringToOMRequest(
-  trx.getStateMachineLogEntry().getLogData());
-  CompletableFuture<Message> future = CompletableFuture
-  .supplyAsync(() -> runCommand(request));
-  return future;
-} catch (IOException e) {
-  return completeExceptionally(e);
-}
-  }
-
   /**
-   * Query the state machine. The request must be read-only.
+   * Submits write request to OM and returns the response Message.
+   * @param request OMRequest
+   * @return response from OM
+   * @throws ServiceException
*/
-  @Override
-  public CompletableFuture<Message> query(Message request) {
-try {
-  OMRequest omRequest = OMRatisHelper.convertByteStringToOMRequest(
-  request.getContent());
-  return CompletableFuture.completedFuture(runCommand(omRequest));
-} catch (IOException e) {
-  return completeExceptionally(e);
+  private Message runCommand(OMRequest request, long trxLogIndex) {
+OMResponse response = handler.handle(request);
+if (response.getSuccess()) {
 
 Review comment:
   done. 
 



Issue Time Tracking
---

Worklog Id: (was: 220317)
Time Spent: 1h 40m  (was: 1.5h)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220316
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 23:31
Start Date: 28/Mar/19 23:31
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270236979
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -308,56 +357,35 @@ private IOException constructExceptionForFailedRequest(
 STATUS_CODE + omResponse.getStatus());
   }
 
-  /*
-   * Apply a committed log entry to the state machine.
-   */
-  @Override
-  public CompletableFuture<Message> applyTransaction(TransactionContext trx) {
-try {
-  OMRequest request = OMRatisHelper.convertByteStringToOMRequest(
-  trx.getStateMachineLogEntry().getLogData());
-  CompletableFuture<Message> future = CompletableFuture
-  .supplyAsync(() -> runCommand(request));
-  return future;
-} catch (IOException e) {
-  return completeExceptionally(e);
-}
-  }
-
   /**
-   * Query the state machine. The request must be read-only.
+   * Submits write request to OM and returns the response Message.
+   * @param request OMRequest
+   * @return response from OM
+   * @throws ServiceException
*/
-  @Override
-  public CompletableFuture<Message> query(Message request) {
-try {
-  OMRequest omRequest = OMRatisHelper.convertByteStringToOMRequest(
-  request.getContent());
-  return CompletableFuture.completedFuture(runCommand(omRequest));
-} catch (IOException e) {
-  return completeExceptionally(e);
+  private Message runCommand(OMRequest request, long trxLogIndex) {
+OMResponse response = handler.handle(request);
+if (response.getSuccess()) {
 
 Review comment:
   Done. The DB is now flushed before saving a snapshot.
 



Issue Time Tracking
---

Worklog Id: (was: 220316)
Time Spent: 1.5h  (was: 1h 20m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220280=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220280
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 22:08
Start Date: 28/Mar/19 22:08
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270217752
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
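 @@ -1603,18 +1603,27 @@
   <property>
     <name>ozone.om.ratis.log.appender.queue.num-elements</name>
     <value>1024</value>
-    <tag>OZONE, DEBUG, CONTAINER, RATIS</tag>
+    <tag>OZONE, DEBUG, OM, RATIS</tag>
     <description>Number of operation pending with Raft's Log Worker.
     </description>
   </property>
   <property>
     <name>ozone.om.ratis.log.appender.queue.byte-limit</name>
     <value>32MB</value>
-    <tag>OZONE, DEBUG, CONTAINER, RATIS</tag>
+    <tag>OZONE, DEBUG, OM, RATIS</tag>
     <description>Byte limit for Raft's Log Worker queue.
     </description>
   </property>
 
+  <property>
+    <name>ozone.om.ratis.snapshot.auto.trigger.threshold</name>
+    <value>40L</value>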
 
 Review comment:
   This is the default in Ratis, so I used it. I was thinking we could update it 
after extensive testing, but I am open to suggestions.
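   For context, a log-index threshold of this kind can be pictured with the 
sketch below. It is an illustration only, not the Ratis implementation: 
SnapshotTrigger and its methods are hypothetical names, and the exact trigger 
condition Ratis uses is an assumption here.

```java
/** Illustrative log-index snapshot trigger; not the Ratis implementation. */
final class SnapshotTrigger {
  private final long threshold;
  private long lastSnapshotIndex = 0L;

  SnapshotTrigger(long threshold) {
    if (threshold <= 0) {
      throw new IllegalArgumentException("threshold must be positive");
    }
    this.threshold = threshold;
  }

  /**
   * Called after each applied log entry. Returns true when at least
   * threshold entries have been applied since the last snapshot
   * (assumed condition, chosen for illustration).
   */
  boolean onApplied(long appliedIndex) {
    if (appliedIndex - lastSnapshotIndex >= threshold) {
      lastSnapshotIndex = appliedIndex;
      return true;
    }
    return false;
  }

  long getLastSnapshotIndex() {
    return lastSnapshotIndex;
  }
}
```

   A lower threshold would mean more frequent snapshots and less log to replay 
on restart, at the cost of more snapshot work. That trade-off is what the 
testing mentioned above would need to measure.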
 



Issue Time Tracking
---

Worklog Id: (was: 220280)
Time Spent: 1h 10m  (was: 1h)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220180=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220180
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 17:48
Start Date: 28/Mar/19 17:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270120787
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -115,7 +117,60 @@ public TransactionContext startTransaction(
   return ctxt;
 }
 return handleStartTransactionRequests(raftClientRequest, omRequest);
+  }
+
+  /*
+   * Apply a committed log entry to the state machine.
+   */
+  @Override
+  public CompletableFuture<Message> applyTransaction(TransactionContext trx) {
+try {
+  OMRequest request = OMRatisHelper.convertByteStringToOMRequest(
+  trx.getStateMachineLogEntry().getLogData());
+  long trxLogIndex = trx.getLogEntry().getIndex();
+  CompletableFuture<Message> future = CompletableFuture
+  .supplyAsync(() -> runCommand(request, trxLogIndex));
+  return future;
+} catch (IOException e) {
+  return completeExceptionally(e);
+}
+  }
+
+  /**
+   * Query the state machine. The request must be read-only.
+   */
+  @Override
+  public CompletableFuture<Message> query(Message request) {
+try {
+  OMRequest omRequest = OMRatisHelper.convertByteStringToOMRequest(
+  request.getContent());
+  return CompletableFuture.completedFuture(queryCommand(omRequest));
+} catch (IOException e) {
+  return completeExceptionally(e);
+}
+  }
+
+  /**
+   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
+   * is the log index corresponding to the last applied transaction on the OM
+   * State Machine.
+   *
+   * @return the last applied index on the state machine which has been
+   * stored in the snapshot file.
+   */
+  @Override
+  public long takeSnapshot() throws IOException {
+LOG.info("Saving Ratis snapshot on the OM.");
+return ozoneManager.saveRatisSnapshot();
 
 Review comment:
   Question: can we consider this transaction applied, and the snapshot 
complete once it is taken?
   
   A write to RocksDB is not guaranteed to be on disk, because we write with 
sync set to false. So when we take a snapshot, we should also flush the DB to 
make sure the transaction at the applied index has been persisted to the OM DB.
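   The ordering suggested here could look roughly like the sketch below. 
SnapshotOrdering, Store and IndexWriter are hypothetical names used for 
illustration; they are not the OM or RocksDB API, and the real flush call 
would depend on how the OM wraps its DB.

```java
import java.io.IOException;

/** Sketch: flush buffered DB writes before recording the snapshot index. */
final class SnapshotOrdering {

  /** Hypothetical stand-in for a store written with sync disabled. */
  interface Store {
    void flush() throws IOException;
  }

  /** Hypothetical stand-in for whatever persists the snapshot index. */
  interface IndexWriter {
    void write(long index) throws IOException;
  }

  private SnapshotOrdering() {
  }

  /**
   * Flushes the store first, then records the index. If the flush fails,
   * the index is never written, so the recorded index cannot get ahead
   * of the data that is actually on disk.
   */
  static long saveSnapshot(Store store, IndexWriter writer,
      long lastAppliedIndex) throws IOException {
    store.flush();
    writer.write(lastAppliedIndex);
    return lastAppliedIndex;
  }
}
```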
 



Issue Time Tracking
---

Worklog Id: (was: 220180)
Time Spent: 1h  (was: 50m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220176=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220176
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 17:48
Start Date: 28/Mar/19 17:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270115982
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerRatisServer.java
 ##
 @@ -130,11 +131,13 @@ private OzoneManagerRatisServer(Configuration conf,
 LOG.info("Instantiating OM Ratis server with GroupID: {} and " +
 "Raft Peers: {}", raftGroupIdStr, 
raftPeersStr.toString().substring(2));
 
+this.omStateMachine = getStateMachine(this.raftGroupId);
 
 Review comment:
   Here we pass raftGroupId to getStateMachine, but the method does not use 
that parameter.
 



Issue Time Tracking
---

Worklog Id: (was: 220176)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpoint-ed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in file in the OM metadata dir.






[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220177=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220177
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 17:48
Start Date: 28/Mar/19 17:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270117723
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerRatisServer.java
 ##
 @@ -130,11 +131,13 @@ private OzoneManagerRatisServer(Configuration conf,
 LOG.info("Instantiating OM Ratis server with GroupID: {} and " +
 "Raft Peers: {}", raftGroupIdStr, 
raftPeersStr.toString().substring(2));
 
+this.omStateMachine = getStateMachine(this.raftGroupId);
 
 Review comment:
   We are passing raftGroupId, but it is not used in getStateMachine.
 



Issue Time Tracking
---

Worklog Id: (was: 220177)
Time Spent: 40m  (was: 0.5h)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220181=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220181
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 17:48
Start Date: 28/Mar/19 17:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270125836
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
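 @@ -1603,18 +1603,27 @@
   <property>
     <name>ozone.om.ratis.log.appender.queue.num-elements</name>
     <value>1024</value>
-    <tag>OZONE, DEBUG, CONTAINER, RATIS</tag>
+    <tag>OZONE, DEBUG, OM, RATIS</tag>
     <description>Number of operation pending with Raft's Log Worker.
     </description>
   </property>
   <property>
     <name>ozone.om.ratis.log.appender.queue.byte-limit</name>
     <value>32MB</value>
-    <tag>OZONE, DEBUG, CONTAINER, RATIS</tag>
+    <tag>OZONE, DEBUG, OM, RATIS</tag>
     <description>Byte limit for Raft's Log Worker queue.
     </description>
   </property>
 
+  <property>
+    <name>ozone.om.ratis.snapshot.auto.trigger.threshold</name>
+    <value>40L</value>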
 
 Review comment:
   Why was 400k chosen as the default? Is there a reason for this value?
 



Issue Time Tracking
---

Worklog Id: (was: 220181)
Time Spent: 1h  (was: 50m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220175=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220175
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 17:48
Start Date: 28/Mar/19 17:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270117510
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerRatisServer.java
 ##
 @@ -186,7 +189,7 @@ public static OzoneManagerRatisServer newOMRatisServer(
   raftPeers.add(raftPeer);
 }
 
-return new OzoneManagerRatisServer(ozoneConf, om, omServiceId,
+return new OzoneManagerRatisServer(ozoneConf, omProtocol, omServiceId,
 
 Review comment:
   I think this should be om here, or the method parameter should be renamed 
to omProtocol.
   The code currently does not compile.
 



Issue Time Tracking
---

Worklog Id: (was: 220175)
Time Spent: 0.5h  (was: 20m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220179=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220179
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 17:48
Start Date: 28/Mar/19 17:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270125514
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -534,4 +536,61 @@ public void testReadRequest() throws Exception {
   proxyProvider.getCurrentProxyOMNodeId());
 }
   }
+
+  @Test
+  public void testOMRatisSnapshot() throws Exception {
+String userName = "user" + RandomStringUtils.randomNumeric(5);
+String adminName = "admin" + RandomStringUtils.randomNumeric(5);
+String volumeName = "volume" + RandomStringUtils.randomNumeric(5);
+String bucketName = "bucket" + RandomStringUtils.randomNumeric(5);
+
+VolumeArgs createVolumeArgs = VolumeArgs.newBuilder()
+.setOwner(userName)
+.setAdmin(adminName)
+.build();
+
+objectStore.createVolume(volumeName, createVolumeArgs);
+OzoneVolume retVolumeinfo = objectStore.getVolume(volumeName);
+
+retVolumeinfo.createBucket(bucketName);
+OzoneBucket ozoneBucket = retVolumeinfo.getBucket(bucketName);
+
+String leaderOMNodeId = objectStore.getClientProxy().getOMProxyProvider()
+.getCurrentProxyOMNodeId();
+OzoneManager ozoneManager = cluster.getOzoneManager(leaderOMNodeId);
+
+// Send commands to ratis to increase the log index so that ratis
+// triggers a snapshot on the state machine.
+
+long appliedLogIndex = 0;
+while (appliedLogIndex <= SNAPSHOT_THRESHOLD) {
+  String keyName = "key" + RandomStringUtils.randomNumeric(5);
+  String data = "data" + RandomStringUtils.randomNumeric(5);
+  OzoneOutputStream ozoneOutputStream = ozoneBucket.createKey(keyName,
+  data.length(), ReplicationType.STAND_ALONE,
+  ReplicationFactor.ONE, new HashMap<>());
+  ozoneOutputStream.write(data.getBytes(), 0, data.length());
+  ozoneOutputStream.close();
+
+  appliedLogIndex = ozoneManager.getOmRatisServer()
+  .getStateMachineLastAppliedIndex();
+}
+
+GenericTestUtils.waitFor(() -> {
+  if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+return true;
+  }
 
 Review comment:
   Test LGTM. Since every snapshot writes to the same file, can we add another 
round of SNAPSHOT_THRESHOLD requests and then check that the stored value is 
the expected one?
 



Issue Time Tracking
---

Worklog Id: (was: 220179)
Time Spent: 50m  (was: 40m)




[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=220178=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-220178
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 17:48
Start Date: 28/Mar/19 17:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r270122895
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -308,56 +357,35 @@ private IOException constructExceptionForFailedRequest(
 STATUS_CODE + omResponse.getStatus());
   }
 
-  /*
-   * Apply a committed log entry to the state machine.
-   */
-  @Override
-  public CompletableFuture<Message> applyTransaction(TransactionContext trx) {
-try {
-  OMRequest request = OMRatisHelper.convertByteStringToOMRequest(
-  trx.getStateMachineLogEntry().getLogData());
-  CompletableFuture<Message> future = CompletableFuture
-  .supplyAsync(() -> runCommand(request));
-  return future;
-} catch (IOException e) {
-  return completeExceptionally(e);
-}
-  }
-
   /**
-   * Query the state machine. The request must be read-only.
+   * Submits write request to OM and returns the response Message.
+   * @param request OMRequest
+   * @return response from OM
+   * @throws ServiceException
*/
-  @Override
-  public CompletableFuture<Message> query(Message request) {
-try {
-  OMRequest omRequest = OMRatisHelper.convertByteStringToOMRequest(
-  request.getContent());
-  return CompletableFuture.completedFuture(runCommand(omRequest));
-} catch (IOException e) {
-  return completeExceptionally(e);
+  private Message runCommand(OMRequest request, long trxLogIndex) {
+OMResponse response = handler.handle(request);
+if (response.getSuccess()) {
 
 Review comment:
   Why do we check getSuccess here before treating the index as 
lastAppliedIndex?
   
   In cases such as creating a bucket that already exists, success is set to 
false, but the transaction request has still completed. Shouldn't that also be 
considered applied? (It does not mutate the OM DB, but the transaction request 
has been completed.)
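   One way to picture the suggestion is the sketch below. AppliedIndexTracker 
is a hypothetical class, not the OzoneManagerStateMachine code; it assumes a 
failed request fails deterministically and leaves the DB unchanged, so the 
entry does not need to be replayed.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: advance the applied index whether or not the request succeeded. */
final class AppliedIndexTracker {
  private final AtomicLong lastAppliedIndex = new AtomicLong(0L);
  private final AtomicLong failedRequests = new AtomicLong(0L);

  /**
   * Records that the entry at logIndex has been processed. The success
   * flag only feeds a failure counter; it does not gate the index, since
   * a request such as "create a bucket that already exists" has still
   * been fully handled.
   */
  void onApplied(long logIndex, boolean success) {
    if (!success) {
      failedRequests.incrementAndGet();
    }
    // Keep the index monotonic even if completions arrive out of order.
    lastAppliedIndex.accumulateAndGet(logIndex, Math::max);
  }

  long getLastAppliedIndex() {
    return lastAppliedIndex.get();
  }

  long getFailedRequests() {
    return failedRequests.get();
  }
}
```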
 



Issue Time Tracking
---

Worklog Id: (was: 220178)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpoint-ed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in file in the OM metadata dir.
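
The description above says the snapshot only needs to persist the last applied log index in a file in the OM metadata dir. A minimal sketch of that idea, assuming a plain-text file holding a single long; the class name `SnapshotIndexFile`, the file name `ratisSnapshotIndex`, and the `-1` sentinel are hypothetical and not taken from the PR:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of persisting the last applied index to disk. */
public class SnapshotIndexFile {

  // Assumed file name; the real name used by OM is not given in the source.
  static final String FILE_NAME = "ratisSnapshotIndex";

  /** Writes the index to a temp file, then renames it over the old one. */
  static void save(Path metadataDir, long index) throws IOException {
    Path tmp = metadataDir.resolve(FILE_NAME + ".tmp");
    Files.write(tmp, Long.toString(index).getBytes(StandardCharsets.UTF_8));
    Files.move(tmp, metadataDir.resolve(FILE_NAME),
        StandardCopyOption.REPLACE_EXISTING);
  }

  /** Returns the saved index, or -1 if no snapshot has been taken yet. */
  static long load(Path metadataDir) throws IOException {
    Path file = metadataDir.resolve(FILE_NAME);
    if (!Files.exists(file)) {
      return -1L;
    }
    String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    return Long.parseLong(text.trim());
  }
}
```

On restart, an OM following this scheme would call `load` and replay only log entries above the returned index; writing via a temporary file keeps a crash mid-write from leaving a truncated index file behind.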






[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=219732=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-219732
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 28/Mar/19 01:11
Start Date: 28/Mar/19 01:11
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-477404785
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 97 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 22 | Maven dependency ordering for branch |
   | +1 | mvninstall | 970 | trunk passed |
   | +1 | compile | 959 | trunk passed |
   | +1 | checkstyle | 228 | trunk passed |
   | +1 | mvnsite | 200 | trunk passed |
   | +1 | shadedclient | 1158 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 197 | trunk passed |
   | +1 | javadoc | 151 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | -1 | mvninstall | 22 | integration-test in the patch failed. |
   | -1 | mvninstall | 21 | ozone-manager in the patch failed. |
   | +1 | compile | 996 | the patch passed |
   | +1 | javac | 996 | the patch passed |
   | +1 | checkstyle | 236 | the patch passed |
   | -1 | mvnsite | 33 | integration-test in the patch failed. |
   | -1 | mvnsite | 31 | ozone-manager in the patch failed. |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 1 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 621 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | -1 | findbugs | 26 | ozone-manager in the patch failed. |
   | +1 | javadoc | 123 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 67 | common in the patch passed. |
   | +1 | unit | 40 | common in the patch passed. |
   | -1 | unit | 32 | integration-test in the patch failed. |
   | -1 | unit | 30 | ozone-manager in the patch failed. |
   | +1 | asflicense | 38 | The patch does not generate ASF License warnings. |
   | | | 6407 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
   | uname | Linux 06fe1375ad45 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9cd6619 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvninstall-hadoop-ozone_ozone-manager.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvnsite-hadoop-ozone_integration-test.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-mvnsite-hadoop-ozone_ozone-manager.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-findbugs-hadoop-ozone_ozone-manager.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-unit-hadoop-ozone_integration-test.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/artifact/out/patch-unit-hadoop-ozone_ozone-manager.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/testReport/ |
   | Max. process+thread count | 411 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-651/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was 

[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-03-27 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=219711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-219711
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 27/Mar/19 23:23
Start Date: 27/Mar/19 23:23
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 219711)
Time Spent: 10m
Remaining Estimate: 0h

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpoint-ed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in file in the OM metadata dir.


