[
https://issues.apache.org/jira/browse/NIFI-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16039018#comment-16039018
]
Mark Payne commented on NIFI-4023:
----------------------------------
[~joewitt] - i've been looking into this. I haven't been able to recreate the
exact issue but have seen something similar, when I use the default values for
nifi.properties, other than changing to the WriteAheadProvenanceRepository. I
suspect both problems have the same root cause. I believe that it's because the
default prov repo size is 1 GB and we can fill that up very quickly. We run
cleanup every 1 minute. So if we fill up faster than we cleanup, we can end up
in a condition where we're deleting the provenance file that we're writing to.
This then causes some funkiness with indexes not getting cleaned up
appropriately.
Specifically, the lucene indices are cleaned up every 1 minute. So are the
event files. But they are on separate timers so happen at different times.
Also, we ensure that we don't delete the index that we are writing to. If the
repo fills up, and then the cleanup for the event files triggers, this can
result in deleting all event files. When we then go to cleanup the indices, we
ask the Event Store for the first event so that we can determine its ID. The
Store has no events because all event files were deleted. As a result, the
Lucene Index cleanup task just returns without deleting any indices. This
further compounds the issue because now we don't cleanup old indices. So every
time that the event file cleanup triggers, it deletes all event files. And the
indices are never properly cleaned up.
When I changed my prov repo size to 10 GB, I am not able to run into any issues
anymore.
> WriteAheadProvenanceRepository indexing and query failure under high rate
> stress testing
> ----------------------------------------------------------------------------------------
>
> Key: NIFI-4023
> URL: https://issues.apache.org/jira/browse/NIFI-4023
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.3.0
> Reporter: Joseph Witt
>
> 2017-06-06 00:32:35,995 INFO [pool-10-thread-1]
> org.wali.MinimalLockingWriteAheadLog
> org.wali.MinimalLockingWriteAheadLog@5ce7ab6f checkpointed with 5737 Records
> and 0 Swap Files in 467 milliseconds (Stop-the-world time = 172 milliseconds,
> Clear Edit Logs time = 137 millis), max Transaction ID 5739
> 2017-06-06 00:32:35,996 INFO [pool-10-thread-1]
> o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile
> Repository with 5737 records in 467 milliseconds
> 2017-06-06 00:33:35,418 ERROR [Index Provenance Events-2]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:36,420 ERROR [Index Provenance Events-1]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:37,425 ERROR [Index Provenance Events-2]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:38,425 ERROR [Index Provenance Events-1]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:39,431 ERROR [Index Provenance Events-2]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:40,431 ERROR [Index Provenance Events-1]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:41,433 ERROR [Index Provenance Events-2]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:42,434 ERROR [Index Provenance Events-1]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:43,438 ERROR [Index Provenance Events-2]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:44,441 ERROR [Index Provenance Events-1]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:45,444 ERROR [Index Provenance Events-2]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:46,445 ERROR [Index Provenance Events-1]
> o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> NativeFSLock@/Users/jwitt/build-verify/nifi-1.3.0/nifi-assembly/target/nifi-1.3.0-bin/nifi-1.3.0/provenance_repository/index-1496723454612/write.lock
> at org.apache.lucene.store.Lock.obtain(Lock.java:89)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.createWriter(SimpleIndexManager.java:198)
> at
> org.apache.nifi.provenance.lucene.SimpleIndexManager.borrowIndexWriter(SimpleIndexManager.java:227)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:184)
> at
> org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:104)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> 2017-06-06 00:33:47,449 INFO [Provenance Repository Maintenance-1]
> o.a.n.p.index.lucene.LuceneEventIndex Successfully removed expired Lucene
> Index ./provenance_repository/index-1496723454612
> 2017-06-06 00:33:49,035 ERROR [Compress Provenance Logs-1-thread-1]
> o.a.n.p.s.EventFileCompressor Failed to read TOC File
> ./provenance_repository/toc/30941400.toc
> java.io.EOFException: null
> at
> org.apache.nifi.provenance.toc.StandardTocReader.<init>(StandardTocReader.java:48)
> at
> org.apache.nifi.provenance.serialization.EventFileCompressor.run(EventFileCompressor.java:93)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)