> On July 24, 2015, 9:18 a.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/host/HostImpl.java, lines 1183-1189
> > <https://reviews.apache.org/r/36779/diff/1/?file=1020970#file1020970line1183>
> >
> > Is a lock here even necessary to return a member variable that is already
> > set within a write lock?

I thought about this one a while back when we were doing our first set of deadlock fixes. The problem arises when a writer is persisting this for the very first time but has not yet updated the boolean. If another thread comes in, sees that persisted is FALSE, and tries to persist it again, we'll get a JPA error. We _could_ use a volatile boolean here, but the lock follows our anti-pattern of locking much better.


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36779/#review92903
-----------------------------------------------------------


On July 24, 2015, 9:05 a.m., Jonathan Hurley wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36779/
> -----------------------------------------------------------
>
> (Updated July 24, 2015, 9:05 a.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sumit Mohanty.
>
>
> Bugs: AMBARI-12526
>     https://issues.apache.org/jira/browse/AMBARI-12526
>
>
> Repository: ambari
>
>
> Description
> -------
>
> When deploying a new cluster on SQL Azure, there is a recurring deadlock on
> the SQL Server.
>
> Essentially, we have concurrent UPDATE statements in separate transactions
> acting on different rows of hostcomponentstate. This seems to cause a
> deadlock because both processes have an X lock and then try to acquire a U
> lock. The U lock is what is making me think they are trying to acquire the
> table lock in order to update the clustered index.
>
> The solution was to:
> - Ensure that some of the failing transactions were placed within the scope
>   of our internal Java locks
> - Flush writing to the problem table
>
>
> Diffs
> -----
>
>   ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListener.java c016cbd
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostComponentStateDAO.java 00ffd5a
>   ambari-server/src/main/java/org/apache/ambari/server/state/Host.java 7a53c21
>   ambari-server/src/main/java/org/apache/ambari/server/state/Service.java 1137cba
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponent.java 60a16eb
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentHost.java 6917a15
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java aa147de
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceImpl.java 6484c9f
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 2b3bf05
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClustersImpl.java 90fdbec
>   ambari-server/src/main/java/org/apache/ambari/server/state/configgroup/ConfigGroupImpl.java a01f4d4
>   ambari-server/src/main/java/org/apache/ambari/server/state/host/HostImpl.java e59f4aa
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java b623479
>   ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ServiceComponentHostConcurrentWriteDeadlockTest.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/36779/diff/
>
>
> Testing
> -------
>
> Deployed on SQL Azure about 50 times and did not see the deadlock occur. It
> would normally occur in the first 5 cluster deployments.
>
>
> Thanks,
>
> Jonathan Hurley
>
>
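
For readers following the exchange about the persisted flag, here is a minimal sketch of the locking pattern under discussion. It is not the actual HostImpl code: the class name PersistedFlagSketch, the insertCount() helper, and the counter standing in for the JPA insert are all hypothetical, and the re-check of the flag inside the write lock is an assumption about how such a guard is typically written. The read lock makes isPersisted() wait for any in-flight writer, and the re-check under the write lock keeps a second caller from persisting again.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an entity wrapper such as HostImpl.
public class PersistedFlagSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Guarded by 'lock'; flipped only after the (simulated) insert completes.
    private boolean persisted = false;

    // Stand-in for the JPA insert; counts how many times it ran.
    private final AtomicInteger inserts = new AtomicInteger();

    // Readers block while a writer is mid-persist, so they never observe
    // 'false' for an entity whose insert is already in flight.
    public boolean isPersisted() {
        lock.readLock().lock();
        try {
            return persisted;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Re-checks the flag under the write lock so only one caller inserts.
    public void persist() {
        lock.writeLock().lock();
        try {
            if (!persisted) {
                inserts.incrementAndGet(); // a real implementation would call JPA here
                persisted = true;
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    public int insertCount() {
        return inserts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        PersistedFlagSketch host = new PersistedFlagSketch();
        int threads = 8;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);

        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // release all threads at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (!host.isPersisted()) {
                    host.persist();
                }
            });
        }

        start.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("inserts = " + host.insertCount());
    }
}
```

In this sketch the check-then-act in the worker threads is safe only because persist() re-checks the flag while holding the write lock; a check made outside any lock, followed by an unguarded insert, would allow the duplicate persist described in the reply.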
