-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50079/#review142388
-----------------------------------------------------------

Ship It!

- Nate Cole


On July 15, 2016, 11:56 a.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50079/
> -----------------------------------------------------------
> 
> (Updated July 15, 2016, 11:56 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sid Wagle.
> 
> 
> Bugs: AMBARI-17738
>     https://issues.apache.org/jira/browse/AMBARI-17738
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Reproduced as part of creating a rolling upgrade on a large cluster.
> 
> Although this initially appears to be a deadlock, it is caused by Postgres
> holding the socket open indefinitely. We have a write lock being held while
> the socket is open. Jstack dumps taken many minutes apart show the same
> thread stuck in a socket read. Investigating on Postgres shows that there is
> a lock blocking the thread which is waiting.
> 
> The sequence query is currently stuck in the {{idle in transaction}} state,
> which is why it's blocking the other query. The transaction isn't being ended
> by EclipseLink.
> 
> The cause is that we begin a transaction and then hammer the database for 2-3
> minutes. During that time, Postgres must keep track of all kinds of
> hostcomponentstate updates isolated from our current transaction. When we go
> to commit the upgrade, Postgres eventually ends up in a deadlock where it
> doesn't think that the transaction has ended.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2e976ba 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeEntity.java db27ea5 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeGroupEntity.java 96f96d5 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeItemEntity.java 6e4a889 
> 
> Diff: https://reviews.apache.org/r/50079/diff/
> 
> 
> Testing
> -------
> 
> Fixed on a live cluster where it was 100% reproducible.
> 
> UNIT TESTS PENDING...
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>
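
The description attributes the hang to a transaction that stays open while the database is worked hard for 2-3 minutes. As an illustration of one general way to avoid that pattern, and not necessarily what this patch does, the sketch below runs the slow preparation step before any transaction is open and keeps the transaction limited to the writes. `ShortTransactionSketch`, `TransactionRunner`, and `prepareThenPersist` are hypothetical names invented for this example; they are not Ambari or EclipseLink APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class ShortTransactionSketch {

    /** Hypothetical stand-in for whatever begins and commits a transaction. */
    interface TransactionRunner {
        void inTransaction(Runnable work);
    }

    /**
     * Runs the slow preparation step before any transaction is open, then
     * persists the prepared items inside a single transaction.
     * Returns the number of items persisted.
     */
    static <T> int prepareThenPersist(Supplier<List<T>> prepare,
                                      TransactionRunner runner,
                                      Consumer<T> persist) {
        // Slow work: no transaction (and no database lock) is held here.
        List<T> prepared = new ArrayList<>(prepare.get());

        // Short work: the transaction spans only the writes themselves.
        runner.inTransaction(() -> prepared.forEach(persist));
        return prepared.size();
    }
}
```

In real code the `TransactionRunner` would wrap the persistence layer's begin/commit calls. The writes still commit atomically; only the time spent inside the transaction shrinks.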