[ https://issues.apache.org/jira/browse/NIFI-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15467730#comment-15467730 ]
ASF GitHub Bot commented on NIFI-2681:
--------------------------------------
Github user bbende commented on a diff in the pull request:
https://github.com/apache/nifi/pull/958#discussion_r77661605
--- Diff: nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/lucene/CachingIndexManager.java ---
@@ -0,0 +1,535 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.provenance.lucene;
+
+import java.io.Closeable;
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantLock;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.standard.StandardAnalyzer;
+import org.apache.lucene.index.DirectoryReader;
+import org.apache.lucene.index.IndexWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.FSDirectory;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class CachingIndexManager implements Closeable, IndexManager {
+ private static final Logger logger = LoggerFactory.getLogger(CachingIndexManager.class);
+
+ private final Lock lock = new ReentrantLock();
+ private final Map<File, IndexWriterCount> writerCounts = new HashMap<>();
+ private final Map<File, List<ActiveIndexSearcher>> activeSearchers = new HashMap<>();
+
+
+ public void removeIndex(final File indexDirectory) {
+ final File absoluteFile = indexDirectory.getAbsoluteFile();
+ logger.info("Removing index {}", indexDirectory);
+
+ lock.lock();
+ try {
+ final IndexWriterCount count = writerCounts.remove(absoluteFile);
+ if ( count != null ) {
+ try {
+ count.close();
+ } catch (final IOException ioe) {
+ logger.warn("Failed to close Index Writer {} for {}", count.getWriter(), absoluteFile);
+ if ( logger.isDebugEnabled() ) {
+ logger.warn("", ioe);
+ }
+ }
+ }
+
+ for ( final List<ActiveIndexSearcher> searcherList : activeSearchers.values() ) {
--- End diff --
Since removeIndex targets a single index directory, wouldn't we want to look up
only the List<ActiveIndexSearcher> for absoluteFile, rather than iterating over
every active searcher?
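
A minimal sketch of that suggestion, not the code from the PR. The class `PerDirectorySearcherRemoval`, the stand-in `TrackedSearcher` type, and the `open` helper are hypothetical names introduced for illustration. The sketch also assumes that removing an index should close only the searchers registered for that directory, which the truncated diff does not confirm.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class PerDirectorySearcherRemoval {

    /** Hypothetical stand-in for ActiveIndexSearcher; it only records whether it was closed. */
    static final class TrackedSearcher {
        private boolean closed;

        void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    private final Lock lock = new ReentrantLock();
    private final Map<File, List<TrackedSearcher>> activeSearchers = new HashMap<>();

    /** Registers a new searcher under the absolute form of the index directory. */
    TrackedSearcher open(final File indexDirectory) {
        final File absoluteFile = indexDirectory.getAbsoluteFile();
        final TrackedSearcher searcher = new TrackedSearcher();

        lock.lock();
        try {
            activeSearchers.computeIfAbsent(absoluteFile, key -> new ArrayList<>()).add(searcher);
        } finally {
            lock.unlock();
        }
        return searcher;
    }

    /**
     * Removes and closes only the searchers keyed by this directory
     * (assumed behavior), leaving every other directory's searchers alone.
     * Returns the number of searchers that were closed.
     */
    int removeIndex(final File indexDirectory) {
        final File absoluteFile = indexDirectory.getAbsoluteFile();

        lock.lock();
        try {
            // Single map lookup instead of looping over activeSearchers.values().
            final List<TrackedSearcher> searcherList = activeSearchers.remove(absoluteFile);
            if (searcherList == null) {
                return 0;
            }
            for (final TrackedSearcher searcher : searcherList) {
                searcher.close();
            }
            return searcherList.size();
        } finally {
            lock.unlock();
        }
    }

    public static void main(final String[] args) {
        final PerDirectorySearcherRemoval manager = new PerDirectorySearcherRemoval();
        final TrackedSearcher first = manager.open(new File("index-a"));
        final TrackedSearcher second = manager.open(new File("index-b"));

        final int closedCount = manager.removeIndex(new File("index-a"));
        System.out.println("closed for index-a: " + closedCount);
        System.out.println("index-a searcher closed: " + first.isClosed());
        System.out.println("index-b searcher closed: " + second.isClosed());
    }
}
```

Keying the removal on `absoluteFile` keeps it consistent with the `writerCounts.remove(absoluteFile)` call earlier in the same method, and it avoids touching searchers that belong to other index directories.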
> Avoid caching Provenance Index Searchers
> ----------------------------------------
>
> Key: NIFI-2681
> URL: https://issues.apache.org/jira/browse/NIFI-2681
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Critical
> Fix For: 1.1.0
>
>
> In NIFI-2600 and NIFI-2452, we addressed two bugs where the Provenance
> Repository closes a cached IndexSearcher too soon. The IndexManager keeps the
> searchers cached in an effort to offer better performance when performing a
> Provenance Query. This was done because it was recommended in the Lucene
> documentation. However, we occasionally still see nodes crashing with
> segfaults due to Lucene searching. We should update the Persistent
> Provenance Repository to stop caching Index Searchers in order to trade a
> slight performance improvement for significantly better reliability.
> Playing around with the idea in order to test it out shows very favorable
> results. On a system where I could cause a segfault almost every time that I
> ran a large provenance query, I updated the code to no longer cache the
> readers and saw perfect stability with no noticeable performance degradation.
> I will clean up the code and submit a PR for these changes.