-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48392/#review136637
-----------------------------------------------------------


Ship it!




Ship It!

- Robert Levas


On June 8, 2016, 8:44 a.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48392/
> -----------------------------------------------------------
> 
> (Updated June 8, 2016, 8:44 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Robert Levas.
> 
> 
> Bugs: AMBARI-17106
>     https://issues.apache.org/jira/browse/AMBARI-17106
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> This is another case of an Ambari cache competing with a JPA transaction. 
> Consider these steps:
> - A new configuration is created within the context of a Transaction
> - Within that same Transaction, the stale configuration cache is told to 
> invalidate
> - After purging the old data, but before the Transaction is committed, 
> another thread tries to read from the cache. It ends up re-populating the old 
> data.
> 
> Sometimes the code works because the Transaction is able to committ before 
> the cache is re-populated by another thread. In theory, we should be locking 
> around reading the cache to ensure that there isn't a transaction writing to 
> it. However, this is what caused the deadlock since it interferes with our 
> wonder "cluster global lock of doom".
> 
> Instead, it's safer in this case to just invalidate the cache after the 
> Transaction completes. 
> - We do this invalidate on a separate thread to ensure we don't have issues 
> with the cluster global lock
> - Since the cache isn't needed within the context of the invalidation call, 
> it's OK to purge it asynchronously.
> 
> ```
> "Server Action Executor Worker 401" #225 prio=5 os_prio=0 
> tid=0x00007fa07c03e800 nid=0x65df waiting on condition [0x00007fa0737ef000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000000a059d4f0> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>       at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> 
>   --> @TransactionalLock(lockArea = LockArea.STALE_CONFIG_CACHE, lockType = 
> LockType.WRITE)
> 
>   at 
> org.apache.ambari.server.orm.AmbariJpaLocalTxnInterceptor.lockTransaction(AmbariJpaLocalTxnInterceptor.java:291)
>       at 
> org.apache.ambari.server.orm.AmbariJpaLocalTxnInterceptor.invoke(AmbariJpaLocalTxnInterceptor.java:114)
>       at 
> com.google.inject.internal.InterceptorStackCallback$InterceptedMethodInvocation.proceed(InterceptorStackCallback.java:72)
>       at 
> com.google.inject.internal.InterceptorStackCallback.intercept(InterceptorStackCallback.java:52)
>       at 
> org.apache.ambari.server.state.cluster.ClusterImpl$$EnhancerByGuice$$991b84fc.applyConfigs(<generated>)
> 
>   --> clusterGlobalLock.writeLock().lock();
> 
>   at 
> org.apache.ambari.server.state.cluster.ClusterImpl.addDesiredConfig(ClusterImpl.java:2340)
>       at 
> org.apache.ambari.server.state.ConfigHelper.createConfigTypes(ConfigHelper.java:897)
>       at 
> org.apache.ambari.server.controller.internal.UpgradeResourceProvider.applyStackAndProcessConfigurations(UpgradeResourceProvider.java:1174)
> ```
> 
> ```
> "ambari-hearbeat-monitor" #23 prio=5 os_prio=0 tid=0x00007fa07476c000 
> nid=0x20ad waiting on condition [0x00007fa07bbfb000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000000a32d44a0> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>       at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>       at 
> org.apache.ambari.server.state.cluster.ClusterImpl.getDesiredStackVersion(ClusterImpl.java:1052)
>       at 
> org.apache.ambari.server.state.ConfigHelper.calculateIsStaleConfigs(ConfigHelper.java:1075)
> 
>   --> @TransactionalLock(lockArea = LockArea.STALE_CONFIG_CACHE, lockType = 
> LockType.READ)
>   
>       at 
> org.apache.ambari.server.state.ConfigHelper.isStaleConfigs(ConfigHelper.java:456)
>       at 
> org.apache.ambari.server.agent.HeartbeatMonitor.createStatusCommand(HeartbeatMonitor.java:311)
> ```
> 
> 
> Diffs
> -----
> 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/ConfigHelper.java 
> 1bd0ba9 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 
> b590c56 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
>  9e9128e 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/ConfigHelperTest.java
>  b4f0c52 
> 
> Diff: https://reviews.apache.org/r/48392/diff/
> 
> 
> Testing
> -------
> 
> Tests run: 4463, Failures: 0, Errors: 0, Skipped: 34
> 
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Total time: 34:11 min
> [INFO] Finished at: 2016-06-08T01:10:29-04:00
> [INFO] Final Memory: 38M/645M
> [INFO] 
> ------------------------------------------------------------------------
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>

Reply via email to