[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-05-01 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831257#comment-16831257
 ] 

Anu Engineer commented on HDDS-1175:


[~hanishakoneru] Sorry for commenting so late. I have not been looking at HA 
patches. I have a concern here.

bq. On OM leader, we run a periodic role checked to verify its leader status.
This means that, at the end of the day, it is possible that we do not "know" 
for sure if we are the leader. This suffers from the issue of time of check vs. 
time of access issue. One OM might think that it is a leader when it really is 
not.

Many other systems have used a notion of "Leader Lease" to avoid this problem. 
I have been thinking another way to solve this issue is to read from any 2 
nodes, and if they value of the key does not agree, we can use the later 
version of the key.

Without one of these approaches, OM HA will weaken the current set of strict 
serializability guarantees of OM ( that is OM without HA). Thought I will flag 
this here, for your consideration.


> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-1175.001.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786376#comment-16786376
 ] 

Hudson commented on HDDS-1175:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16149 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16149/])
HDDS-1175. Serve read requests directly from RocksDB. (#557) (github: rev 
bb12e81ec84e2a2a1154b756d05336018c70f946)
* (delete) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OMRatisHelper.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
* (edit) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java
* (edit) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/OMFailoverProxyProvider.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerProtocolServerSideTranslatorPB.java
* (edit) hadoop-hdds/common/src/main/resources/ozone-default.xml
* (edit) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/OMConfigKeys.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerRatisClient.java
* (add) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OMRatisHelper.java
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerRatisServer.java
* (add) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/exceptions/NotLeaderException.java


> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-1175.001.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-06 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786345#comment-16786345
 ] 

Hanisha Koneru commented on HDDS-1175:
--

Thank you [~linyiqun] and [~arpitagarwal].

I have merged the PR to trunk.

> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-1175.001.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-05 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785244#comment-16785244
 ] 

Yiqun Lin commented on HDDS-1175:
-

+1 from me too, :).

> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-1175.001.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-05 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785215#comment-16785215
 ] 

Arpit Agarwal commented on HDDS-1175:
-

+1 pending Jenkins for the updated PR.

[~linyiqun], does the updated patch look good to you.

> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-1175.001.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-05 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784923#comment-16784923
 ] 

Hanisha Koneru commented on HDDS-1175:
--

Hi [~linyiqun], I have posted a PullRequest on github for patch v02. 
As per your suggestion, I made the RoleCheckInterval configurable.
bq. Another place, it would be better to have a check to verify that proxy 
provider has real performed fail-over to current leader id. If leader om id is 
not contained in om proxy infos, it will perform failed.
The leader Id we get as part of the NotLeaderException is only a suggested 
leader id. This is because between us querying the ratis server for leader id, 
and the client actually performing the failover, a new election might have 
occurred. So I think irrespective of whether failover happened to suggested 
leader id, we should retry the request.

> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-1175.001.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-01 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781835#comment-16781835
 ] 

Yiqun Lin commented on HDDS-1175:
-

Hi [~hanishakoneru], just take a quick review, one comment from me: I think we 
can make {{ROLE_CHECK_INTERVAL}} configurable. In some scenarios, admins maybe 
want to let cache leader id be more-updated.


> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDDS-1175.001.patch
>
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781524#comment-16781524
 ] 

Hadoop QA commented on HDDS-1175:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
 6s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
41s{color} | {color:red} hadoop-ozone in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
25s{color} | {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
18s{color} | {color:red} ozone-manager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-ozone in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 26s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} hadoop-ozone: The patch generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
26s{color} | {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
19s{color} | {color:red} ozone-manager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
21s{color} | {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
19s{color} | {color:red} ozone-manager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
18s{color} | {color:red} hadoop-ozone_ozone-manager generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 27s{color} 
| {color:red} common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 21s{color} 
| {color:red} ozone-manager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 56s{color} 
| {color:red} integration-test in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 

[jira] [Commented] (HDDS-1175) Serve read requests directly from RocksDB

2019-03-01 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781462#comment-16781462
 ] 

Hanisha Koneru commented on HDDS-1175:
--

Patch v01 implements the following:
 * OM caches the RaftPeerRole of its Ratis server.
 * If OM leader receives a read request, it services the request directly from 
RocksDB.
 * If non leader OM receives a read request, it queries its ratis server to 
find the current leader and returns this information to the client in the form 
of NotLeaderException
 * If client gets a NotLeaderException, it fails over to the suggested leader 
and retries the request.
 * On OM leader, we run a periodic role checked to verify its leader status.
 * If a leader node's role changes to non leader role, the ratis server 
notifies the OMStateMachine and this results in a failover to current leader.

> Serve read requests directly from RocksDB
> -
>
> Key: HDDS-1175
> URL: https://issues.apache.org/jira/browse/HDDS-1175
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDDS-1175.001.patch
>
>
> We can directly server read requests from the OM's RocksDB instead of going 
> through the Ratis server. OM should first check its role and only if it is 
> the leader can it server read requests. 
> There can be a scenario where an OM can lose its Leader status but not know 
> about the new election in the ring. This OM could server stale reads for the 
> duration of the heartbeat timeout but this should be acceptable (similar to 
> how Standby Namenode could possibly server stale reads till it figures out 
> the new status).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org