Michal W created NIFI-7611:
------------------------------

             Summary: NiFi fails to index provenance events
                 Key: NIFI-7611
                 URL: https://issues.apache.org/jira/browse/NIFI-7611
             Project: Apache NiFi
          Issue Type: Bug
    Affects Versions: 1.11.4
         Environment: Microsoft Windows Server 2016 Standard - Intel Xeon Gold 6140 CPU @ 2.30 GHz, 8 processors, 32 GB RAM, total disk space 877 GB
            Reporter: Michal W


NiFi reports the error "Failed to index Provenance Events". The nifi-app.log 
file shows the following:

2020-07-08 09:00:00,406 ERROR [Index Provenance Events-4] o.a.n.p.index.lucene.EventIndexTask Failed to index Provenance Events
org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
                at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:681)
                at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:695)
                at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1281)
                at org.apache.lucene.index.IndexWriter.addDocuments(IndexWriter.java:1257)
                at org.apache.nifi.provenance.lucene.LuceneEventIndexWriter.index(LuceneEventIndexWriter.java:70)
                at org.apache.nifi.provenance.index.lucene.EventIndexTask.index(EventIndexTask.java:202)
                at org.apache.nifi.provenance.index.lucene.EventIndexTask.run(EventIndexTask.java:113)
                at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
                at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
                at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
                at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
                at java.base/java.lang.Thread.run(Thread.java:834)

Caused by: java.nio.file.FileSystemException: E:\nifi-storage\provenance_repository\lucene-8-index-1593163985970\_11r.cfe: The process cannot access the file because it is being used by another process.
                at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
                at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
                at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:108)
                at java.base/sun.nio.fs.WindowsFileSystemProvider.newFileChannel(WindowsFileSystemProvider.java:120)
                at java.base/java.nio.channels.FileChannel.open(FileChannel.java:292)
                at java.base/java.nio.channels.FileChannel.open(FileChannel.java:345)
                at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
                at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:157)
                at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.readEntries(Lucene50CompoundReader.java:105)
                at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.<init>(Lucene50CompoundReader.java:69)
                at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.getCompoundReader(Lucene50CompoundFormat.java:70)
                at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:100)
                at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:83)
                at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:172)
                at org.apache.lucene.index.ReadersAndUpdates.getReaderForMerge(ReadersAndUpdates.java:709)
                at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4396)
                at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4054)
                at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
                at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)

 

The logs grow over time and eventually fill up the partition.
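
The root cause in the trace is a FileSystemException thrown by FileChannel.open, called from MMapDirectory.openInput during a merge. A minimal diagnostic sketch, not part of NiFi, could repeat that same JDK call on every file in an index directory and list the ones that cannot be opened. The class name IndexFileProbe and the method findUnreadable are hypothetical, and the index directory must be passed as an argument. The probe only shows whether a file can be opened at that moment; it cannot tell which other process holds it.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a NiFi or Lucene class.
public class IndexFileProbe {

    /** Tries to open every regular file in dir for reading and returns the ones that fail. */
    static List<String> findUnreadable(Path dir) throws IOException {
        List<String> failures = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories
                }
                // Same JDK call that appears in the "Caused by" part of the trace.
                try (FileChannel ignored = FileChannel.open(file, StandardOpenOption.READ)) {
                    // Opened and closed without error: nothing to report.
                } catch (IOException e) {
                    // FileSystemException is a subclass of IOException.
                    failures.add(file + " -> " + e);
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the index directory is given as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> failures = findUnreadable(dir);
        failures.forEach(System.out::println);
        System.out.println(failures.size() + " file(s) could not be opened in " + dir);
    }
}
```

Running it against a lucene-8-index-* directory while the error is occurring might help confirm whether the lock is still held on files such as _11r.cfe.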

 

Configuration related to provenance repository:

 

# Provenance Repository Properties
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
nifi.provenance.repository.debug.frequency=1_000_000
nifi.provenance.repository.encryption.key.provider.implementation=
nifi.provenance.repository.encryption.key.provider.location=
nifi.provenance.repository.encryption.key.id=
nifi.provenance.repository.encryption.key=

# Persistent Provenance Repository Properties
nifi.provenance.repository.directory.default=E:\\nifi-storage\\provenance_repository
nifi.provenance.repository.directory.content1=F:\\nifi-storage\\provenance_repository
nifi.provenance.repository.max.storage.time=24 hours
# nifi.provenance.repository.max.storage.size=1 GB
nifi.provenance.repository.max.storage.size=8 GB
nifi.provenance.repository.rollover.time=30 secs
# nifi.provenance.repository.rollover.size=100 MB
nifi.provenance.repository.rollover.size=1 GB
nifi.provenance.repository.query.threads=2
nifi.provenance.repository.index.threads=4
#default: nifi.provenance.repository.compress.on.rollover=true
nifi.provenance.repository.compress.on.rollover=false
nifi.provenance.repository.always.sync=false
# Comma-separated list of fields. Fields that are not indexed will not be searchable. Valid fields are:
# EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details
nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
# FlowFile Attributes that should be indexed and made searchable.  Some examples to consider are filename, uuid, mime.type
nifi.provenance.repository.indexed.attributes=
# Large values for the shard size will result in more Java heap usage when searching the Provenance Repository
# but should provide better performance
# nifi.provenance.repository.index.shard.size=500 MB
nifi.provenance.repository.index.shard.size=4 GB

# Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from
# the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved.
nifi.provenance.repository.max.attribute.length=65536
nifi.provenance.repository.concurrent.merge.threads=2

 

# Volatile Provenance Repository Properties

nifi.provenance.repository.buffer.size=100000
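
The repository directories in the configuration are written with doubled backslashes, while the path in the stack trace has single ones. A short sketch, assuming the file is read with the escaping rules of java.util.Properties, shows that both spellings name the same directory. The class name ProvenancePathCheck and the method load are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration only; not a NiFi class.
public class ProvenancePathCheck {

    /** Parses properties text with java.util.Properties and returns the value stored under key. */
    static String load(String propertiesText, String key) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        return props.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        // The property line as it appears in the configuration.
        // Java source needs its own escaping, hence four backslashes per separator here.
        String line = "nifi.provenance.repository.directory.default="
                + "E:\\\\nifi-storage\\\\provenance_repository";
        // prints E:\nifi-storage\provenance_repository
        System.out.println(load(line, "nifi.provenance.repository.directory.default"));
    }
}
```

Under these rules a single backslash before "n" would be read as a newline escape, which is why Windows paths in such a file are written with doubled backslashes.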



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
