Today, as I was working on fixing handling of duplicate keys with feeds,
everything seemed to work fine. here is what we do when we encounter a
duplicate key exception.

1. we remove the tuple causing the exception.
2. we continue from where we stopped.

The problem is that when I try to query the dataset after that to check and
see which records made it into the dataset, I get a deadlock.

I have looked at the stack trace (attached) and I think the threads in the
file are the relevant ones. Please, have a look and let me know if you have
a possible cause in mind.

The threads are related to:
1. BufferCache.
2. Logging.
3. Locking.

Let me know what you think. I can reproduce this bug. it happened on 100%
of my test runs.

I will let you know when I solve it but it is taking longer than I thought.

Amoudi, Abdullah.
"Thread-62" #100 prio=5 os_prio=31 tid=0x00007f9babaf7000 nid=0x10003 waiting 
on condition [0x0000000130528000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000006d10be9b8> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at 
org.apache.asterix.transaction.management.service.logging.LogManager$FlushLogsLogger.run(LogManager.java:503)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


"Thread-56" #91 prio=5 os_prio=31 tid=0x00007f9baac2d000 nid=0xf403 waiting on 
condition [0x000000012ff16000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000006ca14f1e0> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at 
org.apache.asterix.transaction.management.service.logging.LogManager$FlushLogsLogger.run(LogManager.java:503)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

"FIFO Writer Thread" #90 prio=5 os_prio=31 tid=0x00007f9baac2a000 nid=0xf203 
waiting on condition [0x000000012fe13000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000006ca474fc0> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at 
org.apache.hyracks.storage.common.buffercache.AsyncFIFOPageQueueManager.run(AsyncFIFOPageQueueManager.java:140)
        at java.lang.Thread.run(Thread.java:745)

"Thread-43" #71 daemon prio=5 os_prio=31 tid=0x00007f9baa2ab000 nid=0xec03 in 
Object.wait() [0x000000012f896000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at 
org.apache.asterix.transaction.management.service.logging.LogManager.appendToLogTail(LogManager.java:136)
        - locked <0x000000076fd7c5f8> (a 
org.apache.asterix.common.transactions.LogRecord)
        at 
org.apache.asterix.transaction.management.service.logging.LogManager.log(LogManager.java:125)
        at 
org.apache.asterix.transaction.management.service.transaction.TransactionManager.commitTransaction(TransactionManager.java:111)
        at 
org.apache.asterix.transaction.management.service.transaction.TransactionManager.completedTransaction(TransactionManager.java:131)
        at 
org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:58)
        at org.apache.hyracks.control.nc.Joblet.performCleanup(Joblet.java:317)
        at org.apache.hyracks.control.nc.Joblet.cleanup(Joblet.java:309)
        at 
org.apache.hyracks.control.nc.work.CleanupJobletWork.run(CleanupJobletWork.java:67)
        at 
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:132)



"Thread-47" #76 prio=5 os_prio=31 tid=0x00007f9baa2ae800 nid=0xd603 waiting on 
condition [0x000000012e975000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000006ca14f1e0> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at 
org.apache.asterix.transaction.management.service.logging.LogManager$FlushLogsLogger.run(LogManager.java:503)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Reply via email to