Re: Review Request 50079: EclipseLink Sequence Query Stuck Inside of Transaction And Blocks Other Threads
> On July 15, 2016, 5:03 p.m., Sid Wagle wrote:
> > Ship It!
>
> Sid Wagle wrote:
>     General question: Any reason why we started to see this now? Is it
>     possible the postgres version 9.2 does not suffer from this? We seem to be
>     still installing postgresql-server-8.4.20-6
>
> Jonathan Hurley wrote:
>     Yeah, I asked myself that same question. The Postgres instance this was
>     seen on was remote (truly remote) and rather slow. I don't think it has to do
>     with the version of Postgres (I think it was 9.1 on Debian).
>
>     But I don't really know why this happened all of a sudden - perhaps all
>     of our large upgrade tests have been on local Postgres up until now? I looked
>     through the code which was making queries to the DB during the transaction
>     and it hadn't changed in a while. If we had recently introduced a massive
>     call to DB during the upgrade creation, that could have done it. But I think
>     this is just a case of "it's always been there, but we never had the right
>     circumstances to show it".
>
>     Feel free to look at the Jira; it has my analysis in detail.

Thanks for the detailed explanation on the Jira.

- Sid

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50079/#review142395
---

On July 15, 2016, 8:06 p.m., Jonathan Hurley wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50079/
> ---
>
> (Updated July 15, 2016, 8:06 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sid Wagle.
>
>
> Bugs: AMBARI-17738
>     https://issues.apache.org/jira/browse/AMBARI-17738
>
>
> Repository: ambari
>
>
> Description
> ---
>
> Reproduced as part of creating a rolling upgrade on a large cluster.
>
> Although this initially appears to be a deadlock, it's caused by Postgres
> holding the socket open indefinitely. We have a write lock being held while
> the socket is open. Jstack dumps taken many minutes apart show the same
> thread is stuck in a socket read. Investigating on Postgres shows that there
> is a lock blocking the thread which is waiting.
>
> The sequence query is currently stuck in the {{idle in transaction}} state
> which is why it's blocking the other query. The transaction isn't being ended
> by EclipseLink.
>
> The cause is that we begin a transaction and then hammer the database for 2-3
> minutes. During that time, Postgres must keep track of all kinds of
> hostcomponentstate updates isolated from our current transaction. When we go
> to commit the upgrade, Postgres eventually ends up in a deadlock where it
> doesn't think that the transaction ended.
>
>
> Diffs
> ---
>
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2e976ba
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeEntity.java db27ea5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeGroupEntity.java 96f96d5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeItemEntity.java 6e4a889
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeResourceProviderTest.java a5db0f0
>
> Diff: https://reviews.apache.org/r/50079/diff/
>
>
> Testing
> ---
>
> Fixed on a live cluster where it was 100% reproducible.
>
>
> Thanks,
>
> Jonathan Hurley
>
>
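The description quoted above relies on comparing jstack dumps taken minutes apart and noticing that the same thread is still sitting in a socket read. A minimal sketch of that comparison follows, assuming each dump has already been reduced to a map from thread name to its top stack frame. The class `StuckThreadFinder`, its method, and the thread names used in the usage note are hypothetical and are not part of Ambari, EclipseLink, or jstack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Ambari or jstack. */
final class StuckThreadFinder {

  /**
   * Returns the names of threads whose top stack frame is identical in both
   * snapshots and contains the given marker (for example "socketRead").
   *
   * @param earlier thread name -> top stack frame from the first dump
   * @param later   thread name -> top stack frame from the second dump
   * @param marker  substring that identifies the frame of interest
   */
  static List<String> findStuck(Map<String, String> earlier,
                                Map<String, String> later,
                                String marker) {
    List<String> stuck = new ArrayList<>();
    for (Map.Entry<String, String> entry : earlier.entrySet()) {
      String frameBefore = entry.getValue();
      // A thread missing from the later dump yields null and is skipped.
      String frameAfter = later.get(entry.getKey());
      if (frameBefore.equals(frameAfter) && frameBefore.contains(marker)) {
        stuck.add(entry.getKey());
      }
    }
    return stuck;
  }
}
```

For example, if a thread named `worker-1` shows `java.net.SocketInputStream.socketRead0(Native Method)` as its top frame in both dumps, `findStuck(first, second, "socketRead")` would include `worker-1` in its result. An unchanged top frame is only a hint; the thread could have completed and re-entered the same call between dumps.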
Re: Review Request 50079: EclipseLink Sequence Query Stuck Inside of Transaction And Blocks Other Threads
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50079/
---

(Updated July 15, 2016, 4:06 p.m.)


Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sid Wagle.


Changes
---

Added test coverage.


Bugs: AMBARI-17738
    https://issues.apache.org/jira/browse/AMBARI-17738


Repository: ambari


Description
---

Reproduced as part of creating a rolling upgrade on a large cluster.

Although this initially appears to be a deadlock, it's caused by Postgres
holding the socket open indefinitely. We have a write lock being held while
the socket is open. Jstack dumps taken many minutes apart show the same
thread is stuck in a socket read. Investigating on Postgres shows that there
is a lock blocking the thread which is waiting.

The sequence query is currently stuck in the {{idle in transaction}} state
which is why it's blocking the other query. The transaction isn't being ended
by EclipseLink.

The cause is that we begin a transaction and then hammer the database for 2-3
minutes. During that time, Postgres must keep track of all kinds of
hostcomponentstate updates isolated from our current transaction. When we go
to commit the upgrade, Postgres eventually ends up in a deadlock where it
doesn't think that the transaction ended.


Diffs (updated)
---

  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2e976ba
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeEntity.java db27ea5
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeGroupEntity.java 96f96d5
  ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeItemEntity.java 6e4a889
  ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeResourceProviderTest.java a5db0f0

Diff: https://reviews.apache.org/r/50079/diff/


Testing (updated)
---

Fixed on a live cluster where it was 100% reproducible.


Thanks,

Jonathan Hurley
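The description above attributes the hang to a write lock and a transaction being held open while the database is queried for several minutes. The thread does not say how the patch restructures the code, so the following is only a sketch of one general way to narrow such a window: do the slow preparation first, then take the lock only for the brief step that stores the result. The class `ShortCriticalSection` and all of its members are hypothetical and do not come from the Ambari code base; the in-memory list stands in for "begin transaction, persist, commit".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical illustration; names do not come from the Ambari code base. */
final class ShortCriticalSection {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> persisted = new ArrayList<>();

  /**
   * Runs the slow preparation step without holding the write lock, then
   * takes the lock only for the brief step that stores the result.
   */
  void prepareThenPersist(Supplier<List<String>> slowPreparation) {
    // Potentially minutes of work (queries, planning): no lock held here.
    List<String> prepared = slowPreparation.get();

    lock.writeLock().lock();
    try {
      // Stand-in for a short "begin transaction, persist, commit" step.
      persisted.addAll(prepared);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Returns a copy of what has been stored so far, under the read lock. */
  List<String> snapshot() {
    lock.readLock().lock();
    try {
      return new ArrayList<>(persisted);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Reports whether any thread currently holds the write lock. */
  boolean writeLocked() {
    return lock.isWriteLocked();
  }
}
```

The trade-off in this shape is that the prepared data can go stale between preparation and persistence, so it suits cases where the slow part is read-mostly and the final write can tolerate or re-check that.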
Re: Review Request 50079: EclipseLink Sequence Query Stuck Inside of Transaction And Blocks Other Threads
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50079/#review142398
---

Ship it!

Ship It!

- Alejandro Fernandez

On July 15, 2016, 3:56 p.m., Jonathan Hurley wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50079/
> ---
>
> (Updated July 15, 2016, 3:56 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sid Wagle.
>
>
> Bugs: AMBARI-17738
>     https://issues.apache.org/jira/browse/AMBARI-17738
>
>
> Repository: ambari
>
>
> Description
> ---
>
> Reproduced as part of creating a rolling upgrade on a large cluster.
>
> Although this initially appears to be a deadlock, it's caused by Postgres
> holding the socket open indefinitely. We have a write lock being held while
> the socket is open. Jstack dumps taken many minutes apart show the same
> thread is stuck in a socket read. Investigating on Postgres shows that there
> is a lock blocking the thread which is waiting.
>
> The sequence query is currently stuck in the {{idle in transaction}} state
> which is why it's blocking the other query. The transaction isn't being ended
> by EclipseLink.
>
> The cause is that we begin a transaction and then hammer the database for 2-3
> minutes. During that time, Postgres must keep track of all kinds of
> hostcomponentstate updates isolated from our current transaction. When we go
> to commit the upgrade, Postgres eventually ends up in a deadlock where it
> doesn't think that the transaction ended.
>
>
> Diffs
> ---
>
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2e976ba
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeEntity.java db27ea5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeGroupEntity.java 96f96d5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeItemEntity.java 6e4a889
>
> Diff: https://reviews.apache.org/r/50079/diff/
>
>
> Testing
> ---
>
> Fixed on a live cluster where it was 100% reproducible.
>
> UNIT TESTS PENDING...
>
>
> Thanks,
>
> Jonathan Hurley
>
>
Re: Review Request 50079: EclipseLink Sequence Query Stuck Inside of Transaction And Blocks Other Threads
> On July 15, 2016, 5:03 p.m., Sid Wagle wrote:
> > Ship It!

General question: Any reason why we started to see this now? Is it possible
the postgres version 9.2 does not suffer from this? We seem to be still
installing postgresql-server-8.4.20-6

- Sid

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50079/#review142395
---

On July 15, 2016, 3:56 p.m., Jonathan Hurley wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50079/
> ---
>
> (Updated July 15, 2016, 3:56 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sid Wagle.
>
>
> Bugs: AMBARI-17738
>     https://issues.apache.org/jira/browse/AMBARI-17738
>
>
> Repository: ambari
>
>
> Description
> ---
>
> Reproduced as part of creating a rolling upgrade on a large cluster.
>
> Although this initially appears to be a deadlock, it's caused by Postgres
> holding the socket open indefinitely. We have a write lock being held while
> the socket is open. Jstack dumps taken many minutes apart show the same
> thread is stuck in a socket read. Investigating on Postgres shows that there
> is a lock blocking the thread which is waiting.
>
> The sequence query is currently stuck in the {{idle in transaction}} state
> which is why it's blocking the other query. The transaction isn't being ended
> by EclipseLink.
>
> The cause is that we begin a transaction and then hammer the database for 2-3
> minutes. During that time, Postgres must keep track of all kinds of
> hostcomponentstate updates isolated from our current transaction. When we go
> to commit the upgrade, Postgres eventually ends up in a deadlock where it
> doesn't think that the transaction ended.
>
>
> Diffs
> ---
>
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2e976ba
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeEntity.java db27ea5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeGroupEntity.java 96f96d5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeItemEntity.java 6e4a889
>
> Diff: https://reviews.apache.org/r/50079/diff/
>
>
> Testing
> ---
>
> Fixed on a live cluster where it was 100% reproducible.
>
> UNIT TESTS PENDING...
>
>
> Thanks,
>
> Jonathan Hurley
>
>
Re: Review Request 50079: EclipseLink Sequence Query Stuck Inside of Transaction And Blocks Other Threads
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50079/#review142395
---

Ship it!

Ship It!

- Sid Wagle

On July 15, 2016, 3:56 p.m., Jonathan Hurley wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50079/
> ---
>
> (Updated July 15, 2016, 3:56 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sid Wagle.
>
>
> Bugs: AMBARI-17738
>     https://issues.apache.org/jira/browse/AMBARI-17738
>
>
> Repository: ambari
>
>
> Description
> ---
>
> Reproduced as part of creating a rolling upgrade on a large cluster.
>
> Although this initially appears to be a deadlock, it's caused by Postgres
> holding the socket open indefinitely. We have a write lock being held while
> the socket is open. Jstack dumps taken many minutes apart show the same
> thread is stuck in a socket read. Investigating on Postgres shows that there
> is a lock blocking the thread which is waiting.
>
> The sequence query is currently stuck in the {{idle in transaction}} state
> which is why it's blocking the other query. The transaction isn't being ended
> by EclipseLink.
>
> The cause is that we begin a transaction and then hammer the database for 2-3
> minutes. During that time, Postgres must keep track of all kinds of
> hostcomponentstate updates isolated from our current transaction. When we go
> to commit the upgrade, Postgres eventually ends up in a deadlock where it
> doesn't think that the transaction ended.
>
>
> Diffs
> ---
>
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2e976ba
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeEntity.java db27ea5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeGroupEntity.java 96f96d5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeItemEntity.java 6e4a889
>
> Diff: https://reviews.apache.org/r/50079/diff/
>
>
> Testing
> ---
>
> Fixed on a live cluster where it was 100% reproducible.
>
> UNIT TESTS PENDING...
>
>
> Thanks,
>
> Jonathan Hurley
>
>
Re: Review Request 50079: EclipseLink Sequence Query Stuck Inside of Transaction And Blocks Other Threads
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50079/#review142388
---

Ship it!

Ship It!

- Nate Cole

On July 15, 2016, 11:56 a.m., Jonathan Hurley wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50079/
> ---
>
> (Updated July 15, 2016, 11:56 a.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Sid Wagle.
>
>
> Bugs: AMBARI-17738
>     https://issues.apache.org/jira/browse/AMBARI-17738
>
>
> Repository: ambari
>
>
> Description
> ---
>
> Reproduced as part of creating a rolling upgrade on a large cluster.
>
> Although this initially appears to be a deadlock, it's caused by Postgres
> holding the socket open indefinitely. We have a write lock being held while
> the socket is open. Jstack dumps taken many minutes apart show the same
> thread is stuck in a socket read. Investigating on Postgres shows that there
> is a lock blocking the thread which is waiting.
>
> The sequence query is currently stuck in the {{idle in transaction}} state
> which is why it's blocking the other query. The transaction isn't being ended
> by EclipseLink.
>
> The cause is that we begin a transaction and then hammer the database for 2-3
> minutes. During that time, Postgres must keep track of all kinds of
> hostcomponentstate updates isolated from our current transaction. When we go
> to commit the upgrade, Postgres eventually ends up in a deadlock where it
> doesn't think that the transaction ended.
>
>
> Diffs
> ---
>
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java 2e976ba
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeEntity.java db27ea5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeGroupEntity.java 96f96d5
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/UpgradeItemEntity.java 6e4a889
>
> Diff: https://reviews.apache.org/r/50079/diff/
>
>
> Testing
> ---
>
> Fixed on a live cluster where it was 100% reproducible.
>
> UNIT TESTS PENDING...
>
>
> Thanks,
>
> Jonathan Hurley
>
>