[jira] [Comment Edited] (SENTRY-872) Uber jira for HMS HA + Sentry HA redesign

2016-09-27 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/SENTRY-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527974#comment-15527974
 ] 

Alexander Kolbasov edited comment on SENTRY-872 at 9/28/16 1:05 AM:


For MySQL with JDO optimistic locking, the deadlock appears to happen when two 
transactions update the version of the same two objects concurrently, but in 
opposite order. The JDO optimistic lock code issues statements like

{code}
UPDATE  SET VERSION  WHERE ... 
{code}
which actually takes a row lock on the matching object in order to update its 
version.

So we can still get a deadlock with optimistic locking if we are unlucky. It 
also issues a DELETE on the join table, which acquires row locks as well.

So the conclusion is that while optimistic locking may provide some other 
benefits, it doesn't prevent deadlocks.
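
To illustrate the lock-ordering problem described above, here is a minimal 
sketch in Python. It is not Sentry, JDO, or MySQL code: it only simulates 
exclusive row locks and a wait-for graph. The function name find_deadlock, the 
transaction names T1/T2, and the row names obj_a/obj_b are all hypothetical.

```python
def find_deadlock(schedule):
    """Return the set of transactions stuck in a wait-for cycle.

    schedule is a list of (txn, row) lock requests in arrival order.
    Locks are exclusive and are assumed to be held until the transaction
    ends; commit/release is deliberately not modeled in this sketch.
    """
    owner = {}      # row -> transaction currently holding its lock
    waits_for = {}  # blocked transaction -> transaction it is waiting on

    for txn, row in schedule:
        if txn in waits_for:
            continue  # a blocked transaction cannot issue further requests
        holder = owner.setdefault(row, txn)
        if holder != txn:
            waits_for[txn] = holder

    deadlocked = set()
    for start in waits_for:
        node = start
        # Follow the wait-for chain; returning to start means a cycle.
        for _ in range(len(waits_for)):
            node = waits_for.get(node)
            if node is None:
                break
            if node == start:
                deadlocked.add(start)
                break
    return deadlocked


# Two transactions bump the version of the same two rows in opposite order.
opposite_order = [("T1", "obj_a"), ("T2", "obj_b"),
                  ("T1", "obj_b"), ("T2", "obj_a")]

# The same updates, but both transactions touch the rows in the same order.
same_order = [("T1", "obj_a"), ("T2", "obj_a"),
              ("T1", "obj_b"), ("T2", "obj_b")]

print(find_deadlock(opposite_order))
print(find_deadlock(same_order))
```

In the first schedule each transaction holds one row and waits for the other, 
so both end up in a cycle. In the second schedule T2 merely waits for T1, and 
no cycle forms.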


> Uber jira for HMS HA + Sentry HA redesign
> -
>
> Key: SENTRY-872
> URL: https://issues.apache.org/jira/browse/SENTRY-872
> Project: Sentry
>  Issue Type: Improvement
>  Components: Hdfs Plugin
>Affects Versions: 1.5.0
>Reporter: Sravya Tirukkovalur
>Assignee: Sravya Tirukkovalur
> Fix For: sentry-ha-redesign
>
> Attachments: SENTRY-872.0.patch, SENTRY-872.pdf, 
> SENTRY-872_design.pdf, SENTRY-872_design_v2.pdf, Sentry-872_design_v2_1.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (SENTRY-872) Uber jira for HMS HA + Sentry HA redesign

2016-09-21 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/SENTRY-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511590#comment-15511590
 ] 

Alexander Kolbasov edited comment on SENTRY-872 at 9/22/16 12:07 AM:

I uploaded an updated design doc; please take a look! The major emphasis is 
shifting from an Active/Passive to an Active/Active approach.




[jira] [Comment Edited] (SENTRY-872) Uber jira for HMS HA + Sentry HA redesign

2016-06-07 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/SENTRY-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319585#comment-15319585
 ] 

Colin Patrick McCabe edited comment on SENTRY-872 at 6/7/16 10:36 PM:

Thanks for checking out the design doc!

Just to be clear, HIVE-7973 has been committed to Hive already.  In fact, it is 
already a part of Cloudera's CDH5.8 distribution.  While it's true that there 
are a few open subtasks remaining on the upstream JIRA, the same could be said 
for almost any Hadoop feature.  We always have plans to improve things :)  We 
are planning on using HIVE-7973 for other things besides Sentry HA-- for 
example, it is useful for replicating the Hive database.  That code will 
receive additional testing and attention due to the other uses that it's being 
put to.  When using HIVE-7973, it doesn't matter which HMS process we talk to-- 
both of them have access to the notification log stored in SQL.  This allows us 
to see what is going on in Hive, and exactly what order it occurred in, even 
when there are multiple HMS processes involved-- something we currently cannot 
do.

With an active/active design, all the Sentry daemons would have to request 
updates (or be sent updates) from the HMS.  This is inefficient because it 
multiplies the RPC load on the HMS service.  It is especially inefficient if we 
have 3 Sentry daemons (for extra redundancy).  It also opens the door to 
divergence between Sentry daemons, because some of them might receive updates 
from HMS earlier or later due to network conditions.  If we are persisting the 
HMS updates in the Sentry SQL database, we must somehow choose which Sentry 
daemon does the persisting.  They can't all do it, because their updates would 
conflict.  Choosing one Sentry daemon to do the persistence is essentially 
equivalent to choosing a master.

The update log is useful for more than just implementing HA.  It can be used as 
a generalized mechanism for synchronizing a cache.  For example, the HDFS 
plugin can read the update log and apply its updates to keep the cache 
maintained in the NameNode process in sync with what is going on in Sentry.  
This is better than the current mechanism of buffering "deltas" in memory in 
the Sentry daemon.  The delta mechanism requires lots of heap memory, whereas 
the update log mechanism does not.  Because the update log is stored in the SQL 
database, the HDFS plugin will be able to continue requesting update log 
entries even if the Sentry service is restarted or has a failover.  In 
contrast, the deltas buffered in memory will be lost if either of those events 
occurs.

So in conclusion I would say that we do agree that Sentry should move towards 
becoming stateless, and we view this design as a stepping stone towards that.
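
As a rough illustration of the update log idea above, here is a minimal sketch 
in Python using an in-memory SQLite database.  It is not the Sentry schema or 
the HDFS plugin API: the table update_log, its columns, and the functions 
open_log, append_update and sync_cache are all hypothetical.  The point is only 
that a consumer that remembers the last sequence number it applied can resume 
from the SQL-backed log at any time, without the writer buffering anything in 
memory.

```python
import sqlite3


def open_log():
    """Create a stand-in for the SQL-backed update log (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE update_log ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " path TEXT NOT NULL,"
        " role TEXT)"
    )
    return db


def append_update(db, path, role):
    """Record a change; role=None means the privilege on path was removed."""
    with db:  # commits on success
        db.execute(
            "INSERT INTO update_log (path, role) VALUES (?, ?)", (path, role)
        )


def sync_cache(db, cache, last_seq):
    """Apply every entry newer than last_seq; return the new high-water mark."""
    rows = db.execute(
        "SELECT seq, path, role FROM update_log WHERE seq > ? ORDER BY seq",
        (last_seq,),
    )
    for seq, path, role in rows:
        if role is None:
            cache.pop(path, None)
        else:
            cache[path] = role
        last_seq = seq
    return last_seq


db = open_log()
append_update(db, "/warehouse/t1", "analyst")
append_update(db, "/warehouse/t2", "etl")

cache = {}
seq = sync_cache(db, cache, 0)

# Imagine the writing daemon restarts here.  Nothing is lost, because the
# log lives in the database rather than in the daemon's heap.
append_update(db, "/warehouse/t1", None)
seq = sync_cache(db, cache, seq)
print(seq, cache)
```

The consumer keeps only its cache and one integer, so a restart or failover on 
the writing side does not interrupt it.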

