Mark Payne created NIFI-3152:
--------------------------------

             Summary: If Provenance Repository runs out of disk space, it may not recover even when disk space is freed up
                 Key: NIFI-3152
                 URL: https://issues.apache.org/jira/browse/NIFI-3152
             Project: Apache NiFi
          Issue Type: Bug
            Reporter: Mark Payne
            Assignee: Mark Payne
             Fix For: 1.1.1


If we run out of disk space in the provenance repository, we can sometimes get 
into a situation where the logs show us still waiting for the repo to roll 
over, even after disk space is freed up. A thread dump shows that the 
processors are trying to force the repo to roll over. However, the rollover 
never completes because we can't create an IndexWriter:

{code}
"Provenance Repository Rollover Thread-1" Id=128 TIMED_WAITING  on null
        at java.lang.Thread.sleep(Native Method)
        at org.apache.lucene.store.Lock.obtain(Lock.java:92)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
        at org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:104)
        - waiting on org.apache.nifi.provenance.lucene.SimpleIndexManager@22f9da45
        at org.apache.nifi.provenance.PersistentProvenanceRepository.mergeJournals(PersistentProvenanceRepository.java:1711)
        at org.apache.nifi.provenance.PersistentProvenanceRepository$8.run(PersistentProvenanceRepository.java:1311)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
        Number of Locked Synchronizers: 1
        - java.util.concurrent.ThreadPoolExecutor$Worker@850f87f
{code}

The IndexWriter constructor is blocked, waiting to obtain the write lock for 
the Directory.

Digging around, I believe the issue is in 
SimpleIndexManager.returnIndexWriter, which calls IndexWriter.commit(). If 
commit() throws an Exception, we don't properly close the writer, so the 
unclosed writer presumably keeps holding the Directory's write lock and any 
new IndexWriter waits on that lock indefinitely. If we are running out of disk 
space, it is likely that IndexWriter.commit() will throw an Exception, so this 
appears to be the root cause.
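
To illustrate the suspected failure mode, here is a minimal sketch of closing 
the writer even when commit() fails. It is not the actual SimpleIndexManager 
code or a proposed patch: Writer, FakeWriter, lockHeld and returnWriter are 
hypothetical stand-ins, and the only assumption carried over from Lucene is 
that a writer holds the Directory's write lock until it is closed.

```java
import java.io.Closeable;
import java.io.IOException;

class ReturnWriterSketch {

    /** Hypothetical stand-in for the two IndexWriter calls that matter here. */
    interface Writer extends Closeable {
        void commit() throws IOException;
    }

    /** Toy writer that "holds" a directory lock until close() is called. */
    static class FakeWriter implements Writer {
        private final boolean failCommit;
        boolean lockHeld = true;

        FakeWriter(boolean failCommit) {
            this.failCommit = failCommit;
        }

        @Override
        public void commit() throws IOException {
            // Simulates commit() failing because the disk is full.
            if (failCommit) {
                throw new IOException("No space left on device");
            }
        }

        @Override
        public void close() {
            // Closing is what releases the (simulated) write lock.
            lockHeld = false;
        }
    }

    /** Commits, then always closes, so the lock is released even on failure. */
    static void returnWriter(Writer writer) throws IOException {
        try {
            writer.commit();
        } finally {
            // Runs whether or not commit() threw; the commit exception
            // still propagates to the caller after close() returns.
            writer.close();
        }
    }

    public static void main(String[] args) {
        FakeWriter writer = new FakeWriter(true);
        try {
            returnWriter(writer);
        } catch (IOException e) {
            System.out.println("commit failed: " + e.getMessage());
        }
        System.out.println("lock held after return: " + writer.lockHeld);
    }
}
```

Without the finally block, a failed commit() would leave lockHeld set to true, 
which mirrors the rollover thread above waiting in Lock.obtain. The real 
manager may share one writer between several borrowers, so an actual fix would 
also need to account for that.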



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
