[
https://issues.apache.org/jira/browse/NIFI-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404638#comment-15404638
]
ASF GitHub Bot commented on NIFI-2452:
--------------------------------------
Github user markap14 commented on a diff in the pull request:
https://github.com/apache/nifi/pull/771#discussion_r73220717
--- Diff:
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java
---
@@ -2166,6 +2171,28 @@ private Lineage computeLineage(final Collection<String> flowFileUuids, final NiF
     }
     @Override
+    public ComputeLineageSubmission submitLineageComputation(final long eventId, final NiFiUser user) {
+        final ProvenanceEventRecord event;
+        try {
+            event = getEvent(eventId);
+        } catch (final Exception e) {
+            logger.error("Failed to retrieve Provenance Event with ID " + eventId + " to calculate data lineage due to: " + e, e);
+            final AsyncLineageSubmission result = new AsyncLineageSubmission(LineageComputationType.FLOWFILE_LINEAGE, eventId, Collections.<String> emptySet(), 1, user.getIdentity());
+            result.getResult().setError("Failed to retrieve Provenance Event with ID " + eventId + ". See logs for more information.");
+            return result;
+        }
+
+        if (event == null) {
+            final AsyncLineageSubmission result = new AsyncLineageSubmission(LineageComputationType.FLOWFILE_LINEAGE, eventId, Collections.<String> emptySet(), 1, user.getIdentity());
+            result.getResult().setError("Could not find Provenance Event with ID " + eventId);
+            lineageSubmissionMap.put(result.getLineageIdentifier(), result);
+            return result;
+        }
+
+        return submitLineageComputation(Collections.singleton(event.getFlowFileUuid()), user, LineageComputationType.FLOWFILE_LINEAGE, eventId, event.getLineageStartDate(), Long.MAX_VALUE);
--- End diff --
When we obtain the event by ID above, we are not querying anything; we are
simply looking up the event directly by ID. We then formulate a query based on
the UUID of the FlowFile. This is different from passing in the UUID of the
FlowFile directly, though, because when we look up the Event directly, we also
have its Lineage Start Date. Knowing the Lineage Start Date can dramatically
reduce the amount of searching required if the start date is fairly recent,
because events older than that date never need to be examined.
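
To illustrate the point about the Lineage Start Date, here is a minimal,
self-contained sketch. It is not NiFi code: the class LineageWindowSketch, the
Event and SearchResult types, and the sampleEvents() helper are all hypothetical,
and a sorted in-memory list stands in for the real indexed repository. It only
shows that a known lower time bound lets a UUID search skip older events.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the NiFi implementation. */
public class LineageWindowSketch {

    /** Minimal stand-in for a provenance event (field names are assumptions). */
    static final class Event {
        final long id;
        final String flowFileUuid;
        final long eventTime;
        final long lineageStartDate;

        Event(long id, String flowFileUuid, long eventTime, long lineageStartDate) {
            this.id = id;
            this.flowFileUuid = flowFileUuid;
            this.eventTime = eventTime;
            this.lineageStartDate = lineageStartDate;
        }
    }

    /** Matching events plus the number of events that had to be examined. */
    static final class SearchResult {
        final List<Event> matches;
        final int examined;

        SearchResult(List<Event> matches, int examined) {
            this.matches = matches;
            this.examined = examined;
        }
    }

    /**
     * Finds events for the given UUID. The list is assumed to be sorted by
     * eventTime ascending, so everything older than minTime can be skipped.
     */
    static SearchResult search(List<Event> eventsByTime, String uuid, long minTime) {
        // Binary search for the first event with eventTime >= minTime.
        int lo = 0;
        int hi = eventsByTime.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (eventsByTime.get(mid).eventTime < minTime) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        List<Event> matches = new ArrayList<>();
        int examined = 0;
        for (int i = lo; i < eventsByTime.size(); i++) {
            examined++;
            Event e = eventsByTime.get(i);
            if (e.flowFileUuid.equals(uuid)) {
                matches.add(e);
            }
        }
        return new SearchResult(matches, examined);
    }

    /** 1000 events at times 0..999; the last 10 belong to FlowFile "target". */
    static List<Event> sampleEvents() {
        List<Event> events = new ArrayList<>();
        for (long t = 0; t < 1000; t++) {
            boolean isTarget = t >= 990;
            events.add(new Event(t, isTarget ? "target" : "other-" + t, t, isTarget ? 990 : t));
        }
        return events;
    }

    public static void main(String[] args) {
        List<Event> events = sampleEvents();

        // Searching by UUID alone: no lower bound, so every event is examined.
        SearchResult byUuid = search(events, "target", Long.MIN_VALUE);

        // Looking up the event by ID first yields its lineage start date,
        // which bounds the search from below.
        Event known = events.get(995);
        SearchResult byEvent = search(events, known.flowFileUuid, known.lineageStartDate);

        System.out.println("UUID only:   examined " + byUuid.examined);
        System.out.println("via event:   examined " + byEvent.examined);
    }
}
```

In this toy data set both searches return the same 10 events, but the bounded
search examines 10 events instead of 1000.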
> Provenance Repository's Index Readers can be prematurely closed
> ---------------------------------------------------------------
>
> Key: NIFI-2452
> URL: https://issues.apache.org/jira/browse/NIFI-2452
> Project: Apache NiFi
> Issue Type: Sub-task
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Joseph Witt
> Priority: Blocker
> Fix For: 1.0.0
>
>
> When I run Provenance queries against an active provenance repository, the
> JVM occasionally crashes, writing out an hs_err_pid_XXX.log file.
> This appears to be related to LUCENE-7183
> (https://issues.apache.org/jira/browse/LUCENE-7183), which indicates that the
> crash is caused by using a closed IndexReader.
> Adding to the description, from Mark's helpful GitHub comments:
> Ensure that we keep track of how many references we have to each Lucene
> searcher and only close the underlying index reader if there are no
> references to the searcher. Also updated to prefer newer provenance events
> over older provenance events, and to calculate FlowFile lineage based on an
> event ID instead of a FlowFile UUID, as it's much more efficient.
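
The reference-tracking idea in the description can be sketched as follows. This
is an assumption-laden illustration, not the code from the pull request:
RefCountedResourceSketch and RefCounted are hypothetical names, and a generic
Closeable stands in for the index reader. (Lucene's own IndexReader also exposes
incRef(), tryIncRef(), and decRef() for the same purpose.)

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not the NiFi implementation. */
public class RefCountedResourceSketch {

    /** Closes the wrapped resource only when the last reference is released. */
    static final class RefCounted<T extends Closeable> {
        private final T resource;
        // The creator (for example, a cache of searchers) holds the first reference.
        private final AtomicInteger refCount = new AtomicInteger(1);

        RefCounted(T resource) {
            this.resource = resource;
        }

        T get() {
            return resource;
        }

        /** Takes a reference; returns false if the resource is already closed. */
        boolean tryAcquire() {
            while (true) {
                int current = refCount.get();
                if (current <= 0) {
                    return false; // already closed, caller must obtain a fresh one
                }
                if (refCount.compareAndSet(current, current + 1)) {
                    return true;
                }
            }
        }

        /** Drops a reference and closes the resource when none remain. */
        void release() {
            int remaining = refCount.decrementAndGet();
            if (remaining == 0) {
                try {
                    resource.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            } else if (remaining < 0) {
                throw new IllegalStateException("release() called more often than acquire");
            }
        }
    }

    public static void main(String[] args) {
        RefCounted<Closeable> reader =
                new RefCounted<>(() -> System.out.println("underlying reader closed"));

        // A query borrows the reader while the owner decides to retire it.
        if (reader.tryAcquire()) {
            reader.release(); // owner drops its reference; the query still holds one
            System.out.println("query still running");
            reader.release(); // query finishes; only now is the reader closed
        }
    }
}
```

The key property is that the owner retiring a reader while a query is in flight
does not close it; the close happens when the last holder releases.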