[ 
https://issues.apache.org/jira/browse/CONNECTORS-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138221#comment-13138221
 ] 

Karl Wright commented on CONNECTORS-279:
----------------------------------------

Postgresql team responded and gave debugging tips, which show the following:

{code}
postgres=# select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid | transactionid
 | classid | objid | objsubid | virtualtransaction | pid  |       mode       | g
ranted
---------------+----------+----------+------+-------+------------+--------------
-+---------+-------+----------+--------------------+------+------------------+--
-------
 relation      |   133853 |   140809 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140809 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 relation      |   133853 |   140807 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140807 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 virtualxid    |          |          |      |       | 2/204727   |
 |         |       |          | 2/204727           | 7260 | ExclusiveLock    | t
 virtualxid    |          |          |      |       | 3/185434   |
 |         |       |          | 3/185434           | 2384 | ExclusiveLock    | t
 virtualxid    |          |          |      |       | 1/216105   |
 |         |       |          | 1/216105           | 1800 | ExclusiveLock    | t
 relation      |   133853 |   140751 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140745 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140810 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140810 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 relation      |   133853 |   140808 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140808 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 relation      |   133853 |   140812 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140812 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 relation      |   133853 |   140784 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140785 |      |       |            |
 |         |       |          | 1/216105           | 1800 | ShareLock        | f
 relation      |   133853 |   140791 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140791 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 relation      |   133853 |   140785 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140785 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 transactionid |          |          |      |       |            |       6252988
 |         |       |          | 2/204727           | 7260 | ExclusiveLock    | t
 relation      |   133853 |   140811 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140811 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 relation      |   133853 |   140813 |      |       |            |
 |         |       |          | 2/204727           | 7260 | AccessShareLock  | t
 relation      |   133853 |   140813 |      |       |            |
 |         |       |          | 2/204727           | 7260 | RowExclusiveLock | t
 relation      |    11564 |    10969 |      |       |            |
 |         |       |          | 3/185434           | 2384 | AccessShareLock  | t
(27 rows)

postgres=# select * from pg_stat_activity;
 datid  | datname  | procpid | usesysid | usename  |          current_query
     | waiting |         xact_start         |        query_start         |
 backend_start        | client_addr | client_port
--------+----------+---------+----------+----------+----------------------------
-----+---------+----------------------------+----------------------------+------
----------------------+-------------+-------------
 133853 | testdb   |    1800 |    68231 | testuser | REINDEX TABLE jobqueue
     | t       | 2011-10-28 06:53:59.562-04 | 2011-10-28 06:53:59.562-04 | 2011-
10-28 06:53:57.789-04 | 127.0.0.1   |       54835
 133853 | testdb   |    7260 |    68231 | testuser | <IDLE> in transaction
     | f       | 2011-10-28 06:53:33.481-04 | 2011-10-28 06:53:34.895-04 | 2011-
10-28 06:53:33.404-04 | 127.0.0.1   |       54828
  11564 | postgres |    2384 |       10 | postgres | select * from pg_stat_activ
ity; | f       | 2011-10-28 06:57:14.212-04 | 2011-10-28 06:57:14.212-04 | 2011-
10-28 06:55:12.74-04  | ::1         |       54852
 133853 | testdb   |    7664 |    68231 | testuser | <IDLE>
     | f       |                            | 2011-10-28 06:57:11.029-04 | 2011-
10-28 06:57:11.017-04 | 127.0.0.1   |       54876
 133853 | testdb   |    3128 |    68231 | testuser | <IDLE>
     | f       |                            | 2011-10-28 06:57:11.059-04 | 2011-
10-28 06:57:11.052-04 | 127.0.0.1   |       54877
(5 rows)

"Thread-4398639" daemon prio=6 tid=0x087f9400 nid=0x9f0 runnable [0x04b4f000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at 
org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:135)
        at 
org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:104)
        at 
org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
        at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:255)
        at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1165)
        at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:191)
        - locked <0x260f2220> (a org.postgresql.core.v3.QueryExecutorImpl)
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:337)
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:329)
        at 
org.apache.manifoldcf.core.database.Database.execute(Database.java:566)
        at 
org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:421)

"Document delete thread '2'" daemon prio=6 tid=0x05660c00 nid=0xc88 in 
Object.wait() [0x06d8f000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1143)
        - locked <0x260f0118> (a 
org.apache.manifoldcf.core.database.Database$ExecuteQueryThread)
        at java.lang.Thread.join(Thread.java:1196)
        at 
org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:453)
        at 
org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:505)
        at 
org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1152)
        at 
org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
        at 
org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
        at 
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:639)
        at 
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reindexTableInternal(DBInterfacePostgreSQL.java:1284)
        at 
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.noteModifications(DBInterfacePostgreSQL.java:1356)
        at 
org.apache.manifoldcf.core.database.BaseTable.noteModifications(BaseTable.java:299)
        at 
org.apache.manifoldcf.crawler.jobs.JobQueue.deleteIngestedDocumentIdentifiers(JobQueue.java:565)
        at 
org.apache.manifoldcf.crawler.jobs.JobManager.deleteIngestedDocumentIdentifiers(JobManager.java:789)
        at 
org.apache.manifoldcf.crawler.system.DocumentDeleteThread.run(DocumentDeleteThread.java:176)

"Document delete stuffer thread" daemon prio=6 tid=0x0565fc00 nid=0x16b0 in 
Object.wait() [0x06c9f000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at 
org.apache.manifoldcf.core.lockmanager.LockObject.enterWriteLock(LockObject.java:111)
        - locked <0x260f26f8> (a 
org.apache.manifoldcf.core.lockmanager.LockObject)
        at 
org.apache.manifoldcf.core.lockmanager.LockManager.enterWriteCriticalSection(LockManager.java:1459)
        at 
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.noteModifications(DBInterfacePostgreSQL.java:1322)
        at 
org.apache.manifoldcf.core.database.BaseTable.noteModifications(BaseTable.java:299)
        at 
org.apache.manifoldcf.crawler.jobs.JobQueue.setDeletingStatus(JobQueue.java:718)
        at 
org.apache.manifoldcf.crawler.jobs.JobManager.getNextDeletableDocuments(JobManager.java:1227)
        at 
org.apache.manifoldcf.crawler.system.DocumentDeleteStufferThread.run(DocumentDeleteStufferThread.java:105)
{code}

So it looks like our reindexing locks are what are hanging the system - at 
least in this case.

                
> Postgresql load test job delete document cleanup fails sometimes and leaves 
> orphaned documents
> ----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-279
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-279
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework crawler agent
>    Affects Versions: ManifoldCF 0.4
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 0.4
>
>
> Running the postgresql load test on my laptop, I was surprised when the test 
> did not finish.  The UI indicated that the job was being deleted, but there 
> were 49,000 documents and that number was not moving.  Further inspection 
> yielded the following:
> - Job was in the "DELETING" state
> - Documents were in the "BEINGDELETED" state
> - No activity of any kind ongoing
> The log had no errors.
> It was impossible to get a thread dump, but a cursory inspection of the code 
> indicated that either the documents were being marked as "BEINGDELETED" but 
> were not actually being placed on the in-memory queue, or the delete threads 
> were picking up the documents and somehow avoiding marking them as being 
> processed.
> Also, probably unrelated, the Document Status report listed these documents 
> as having a status of "Being removed" and a state of "Unknown".  The 
> "Unknown" should have been a "Deleting".  Since the extended WHEN... ELSE 
> clause has a reasonable condition for the "Deleting" answer, it's hard to see 
> how this could have occurred either.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to