Today, as I was working on fixing handling of duplicate keys with feeds,
everything seemed to work fine. here is what we do when we encounter a
duplicate key exception.
1. we remove the tuple causing the exception.
2. we continue from where we stopped.
The problem is that when I try to query the dataset after that to check and
see which records made it into the dataset, I get a deadlock.
I have looked at the stack trace (attached) and I think the threads in the
file are the relevant ones. Please, have a look and let me know if you have
a possible cause in mind.
The threads are related to:
1. BufferCache.
2. Logging.
3. Locking.
Let me know what you think. I can reproduce this bug. it happened on 100%
of my test runs.
I will let you know when I solve it but it is taking longer than I thought.
Amoudi, Abdullah.
"Thread-62" #100 prio=5 os_prio=31 tid=0x00007f9babaf7000 nid=0x10003 waiting
on condition [0x0000000130528000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006d10be9b8> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at
org.apache.asterix.transaction.management.service.logging.LogManager$FlushLogsLogger.run(LogManager.java:503)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"Thread-56" #91 prio=5 os_prio=31 tid=0x00007f9baac2d000 nid=0xf403 waiting on
condition [0x000000012ff16000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006ca14f1e0> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at
org.apache.asterix.transaction.management.service.logging.LogManager$FlushLogsLogger.run(LogManager.java:503)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"FIFO Writer Thread" #90 prio=5 os_prio=31 tid=0x00007f9baac2a000 nid=0xf203
waiting on condition [0x000000012fe13000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006ca474fc0> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at
org.apache.hyracks.storage.common.buffercache.AsyncFIFOPageQueueManager.run(AsyncFIFOPageQueueManager.java:140)
at java.lang.Thread.run(Thread.java:745)
"Thread-43" #71 daemon prio=5 os_prio=31 tid=0x00007f9baa2ab000 nid=0xec03 in
Object.wait() [0x000000012f896000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at
org.apache.asterix.transaction.management.service.logging.LogManager.appendToLogTail(LogManager.java:136)
- locked <0x000000076fd7c5f8> (a
org.apache.asterix.common.transactions.LogRecord)
at
org.apache.asterix.transaction.management.service.logging.LogManager.log(LogManager.java:125)
at
org.apache.asterix.transaction.management.service.transaction.TransactionManager.commitTransaction(TransactionManager.java:111)
at
org.apache.asterix.transaction.management.service.transaction.TransactionManager.completedTransaction(TransactionManager.java:131)
at
org.apache.asterix.runtime.job.listener.JobEventListenerFactory$1.jobletFinish(JobEventListenerFactory.java:58)
at org.apache.hyracks.control.nc.Joblet.performCleanup(Joblet.java:317)
at org.apache.hyracks.control.nc.Joblet.cleanup(Joblet.java:309)
at
org.apache.hyracks.control.nc.work.CleanupJobletWork.run(CleanupJobletWork.java:67)
at
org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:132)
"Thread-47" #76 prio=5 os_prio=31 tid=0x00007f9baa2ae800 nid=0xd603 waiting on
condition [0x000000012e975000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006ca14f1e0> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at
org.apache.asterix.transaction.management.service.logging.LogManager$FlushLogsLogger.run(LogManager.java:503)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)