[jira] [Commented] (HIVE-3826) Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)

2013-01-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547920#comment-13547920
 ] 

Hudson commented on HIVE-3826:
--

Integrated in Hive-trunk-hadoop2 #54 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/54/])
HIVE-3826 Rollbacks and retries of drops cause 
org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)
(Kevin Wilfong via namit) (Revision 1425247)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425247
Files : 
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java


 Rollbacks and retries of drops cause 
 org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database 
 row)
 -

 Key: HIVE-3826
 URL: https://issues.apache.org/jira/browse/HIVE-3826
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-3826.1.patch.txt


 I'm not sure this is the only cause of the exception 
 org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database 
 row) from the metastore, but one cause seems to be a drop command failing 
 and then being retried by the client.
 Following a single metastore thread with DEBUG-level logging, I saw the 
 objects that were meant to be dropped remain in the PersistenceManager cache 
 even after a rollback.  The steps seemed to be as follows:
 1) On the first attempt to drop the table, the table is pulled into the 
 PersistenceManager cache for the purposes of dropping.
 2) The drop fails, e.g. due to a lock wait timeout on the SQL backend, which 
 causes a rollback of the transaction.
 3) The drop is retried on a different thread of the metastore Thrift server, 
 or on a different server, and succeeds.
 4) Back on the original thread of the original Thrift server, someone 
 performs a write operation that produces a commit.  This causes the detached 
 objects related to the dropped table to attempt to reattach, so JDO queries 
 the SQL backend for those objects and cannot find them.  This causes the 
 exception.
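
The four steps above can be illustrated with a small toy model. This is only a
sketch under stated assumptions: Backend, Session, and their methods are
hypothetical stand-ins for the SQL backend and a per-thread object cache, not
Hive, JDO, or DataNucleus APIs, and the "commit revalidates every cached
object" behavior is a simplification of the reattach described in step 4.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: these classes are hypothetical, not Hive, JDO, or DataNucleus APIs.
class StaleCacheSketch {

    /** Stands in for the shared SQL backend: the set of rows that currently exist. */
    static final class Backend {
        final Set<String> rows = new HashSet<>();
    }

    /** Stands in for the per-thread object cache of one metastore thread. */
    static final class Session {
        private final Backend backend;
        private final Set<String> cached = new HashSet<>();

        Session(Backend backend) {
            this.backend = backend;
        }

        /** Steps 1 and 2: load the object for dropping; optionally fail before the commit. */
        void drop(String id, boolean failBeforeCommit) {
            cached.add(id);
            if (failBeforeCommit) {
                return; // rolled back: the row survives, and the cache entry is left behind
            }
            backend.rows.remove(id);
            cached.remove(id);
        }

        /** Step 4: a later commit revalidates every cached object against the backend. */
        void commit(String newRow) {
            for (String id : cached) {
                if (!backend.rows.contains(id)) {
                    throw new IllegalStateException("No such database row: " + id);
                }
            }
            backend.rows.add(newRow);
        }
    }

    /** Runs steps 1 to 4 and returns the failure message, or "ok" if the commit succeeds. */
    static String run() {
        Backend backend = new Backend();
        backend.rows.add("t1");
        Session first = new Session(backend);
        Session second = new Session(backend);
        first.drop("t1", true);    // first attempt fails and rolls back
        second.drop("t1", false);  // the retry on another thread or server succeeds
        try {
            first.commit("d2");    // unrelated write on the original thread
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model the unrelated write fails only because the first session still
holds an entry for a row that another session has since deleted.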
 I was able to reproduce this regularly with the following setup and sequence 
 of commands:
 Hive client 1 (Hive1): connected to a metastore Thrift server running a 
 single thread.  I hard-coded a RuntimeException into the ObjectStore code 
 that drops a table, specifically right before the commit in 
 preDropStorageDescriptor, to induce a rollback.  I also turned off all 
 retries at all layers of the metastore.
 Hive client 2 (Hive2): connected to a separate metastore Thrift server 
 running with standard configs and code.
 1: On Hive1, CREATE TABLE t1 (c STRING);
 2: On Hive1, DROP TABLE t1; // This failed due to the hard-coded exception
 3: On Hive2, DROP TABLE t1; // Succeeds
 4: On Hive1, CREATE DATABASE d1; // This database already existed.  I'm not 
 sure why this step was necessary, but the reproduction didn't work without 
 it; it seemed to have an effect on the order in which objects were committed 
 in the next step
 5: On Hive1, CREATE DATABASE d2; // This database didn't exist; the command 
 would fail with the NucleusObjectNotFoundException
 The object that caused the exception varied: I saw the MTable, the 
 MSerDeInfo, and the MTablePrivilege from the table whose drop had been 
 attempted.
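
The report does not say what the attached patch changes. As a sketch of one
possible mitigation, the toy model below replays the command sequence with and
without clearing the per-thread cache when a drop rolls back. Everything here
is an assumption made for illustration: Session, dropTable, createDatabase,
and the evictOnRollback flag are hypothetical names, not the code in
HIVE-3826.1.patch.txt. In real JDO, PersistenceManager.evictAll() is a call
that evicts cached instances, but whether the patch uses it is not stated in
this thread.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical classes, not the code in HIVE-3826.1.patch.txt.
class EvictOnRollbackSketch {

    /** One metastore thread's view: the shared backend rows plus a private object cache. */
    static final class Session {
        private final Set<String> rows;                      // shared backend rows
        private final Set<String> cached = new HashSet<>();  // this thread's cache
        private final boolean evictOnRollback;

        Session(Set<String> rows, boolean evictOnRollback) {
            this.rows = rows;
            this.evictOnRollback = evictOnRollback;
        }

        /** Loads the table into the cache, then deletes it unless the drop fails first. */
        void dropTable(String id, boolean failBeforeCommit) {
            cached.add(id);
            if (failBeforeCommit) {
                if (evictOnRollback) {
                    cached.clear(); // assumed mitigation: forget everything on rollback
                }
                return;
            }
            rows.remove(id);
            cached.remove(id);
        }

        /** Returns false if a cached object no longer has a backend row, true otherwise. */
        boolean createDatabase(String name) {
            if (!rows.containsAll(cached)) {
                return false; // models the NucleusObjectNotFoundException
            }
            rows.add(name);
            return true;
        }
    }

    /** Replays the commands; returns true if the final CREATE DATABASE d2 succeeds. */
    static boolean replay(boolean evictOnRollback) {
        Set<String> rows = new HashSet<>();
        rows.add("d1");                        // d1 already exists before the test
        Session hive1 = new Session(rows, evictOnRollback);
        Session hive2 = new Session(rows, false);
        rows.add("t1");                        // 1: CREATE TABLE t1 on Hive1
        hive1.dropTable("t1", true);           // 2: DROP TABLE t1 fails on Hive1
        hive2.dropTable("t1", false);          // 3: DROP TABLE t1 succeeds on Hive2
        // 4: CREATE DATABASE d1 is not modeled; the report does not explain its effect.
        return hive1.createDatabase("d2");     // 5: CREATE DATABASE d2 on Hive1
    }

    public static void main(String[] args) {
        System.out.println("without eviction: " + replay(false));
        System.out.println("with eviction:    " + replay(true));
    }
}
```

The only difference between the two runs is whether the failed drop leaves its
entry in the first session's cache.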

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3826) Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)

2012-12-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538819#comment-13538819
 ] 

Hudson commented on HIVE-3826:
--

Integrated in Hive-trunk-h0.21 #1871 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1871/])
HIVE-3826 Rollbacks and retries of drops cause 
org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)
(Kevin Wilfong via namit) (Revision 1425247)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425247
Files : 
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java




[jira] [Commented] (HIVE-3826) Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)

2012-12-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538660#comment-13538660
 ] 

Namit Jain commented on HIVE-3826:
--

+1

Great catch - running tests.



[jira] [Commented] (HIVE-3826) Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)

2012-12-20 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537407#comment-13537407
 ] 

Kevin Wilfong commented on HIVE-3826:
-

https://reviews.facebook.net/D7539



[jira] [Commented] (HIVE-3826) Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)

2012-12-20 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537534#comment-13537534
 ] 

Kevin Wilfong commented on HIVE-3826:
-

The tests pass.
