GitHub user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/771#discussion_r73220717
  
    --- Diff: nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java ---
    @@ -2166,6 +2171,28 @@ private Lineage computeLineage(final Collection<String> flowFileUuids, final NiF
         }
     
         @Override
    +    public ComputeLineageSubmission submitLineageComputation(final long eventId, final NiFiUser user) {
    +        final ProvenanceEventRecord event;
    +        try {
    +            event = getEvent(eventId);
    +        } catch (final Exception e) {
    +            logger.error("Failed to retrieve Provenance Event with ID " + eventId + " to calculate data lineage due to: " + e, e);
    +            final AsyncLineageSubmission result = new AsyncLineageSubmission(LineageComputationType.FLOWFILE_LINEAGE, eventId, Collections.<String> emptySet(), 1, user.getIdentity());
    +            result.getResult().setError("Failed to retrieve Provenance Event with ID " + eventId + ". See logs for more information.");
    +            return result;
    +        }
    +
    +        if (event == null) {
    +            final AsyncLineageSubmission result = new AsyncLineageSubmission(LineageComputationType.FLOWFILE_LINEAGE, eventId, Collections.<String> emptySet(), 1, user.getIdentity());
    +            result.getResult().setError("Could not find Provenance Event with ID " + eventId);
    +            lineageSubmissionMap.put(result.getLineageIdentifier(), result);
    +            return result;
    +        }
    +
    +        return submitLineageComputation(Collections.singleton(event.getFlowFileUuid()), user, LineageComputationType.FLOWFILE_LINEAGE, eventId, event.getLineageStartDate(), Long.MAX_VALUE);
    --- End diff --
    
    When we obtain the event by ID above, we are not querying anything; we are
    simply looking up the event directly by ID. We then formulate a query based
    on the UUID of the FlowFile. This is different from passing in the UUID of
    the FlowFile directly, though, because when we look up the Event directly,
    we also have the Lineage Start Date. Knowing the Lineage Start Date can
    dramatically reduce the amount of searching required when that date is
    fairly recent, since nothing older than it needs to be searched.
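
To illustrate why the start date helps, here is a minimal sketch. It assumes
the indexed events are split into time-bounded pieces, so that a lower time
bound lets whole pieces be skipped. The names LineageSearchSketch, IndexShard,
and shardsToSearch are hypothetical and are not part of the NiFi code base;
the real repository may prune its search differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineageSearchSketch {

    /** Hypothetical stand-in for one time-bounded index; not a NiFi class. */
    static final class IndexShard {
        final String name;
        final long minEventTime;
        final long maxEventTime;

        IndexShard(final String name, final long minEventTime, final long maxEventTime) {
            this.name = name;
            this.minEventTime = minEventTime;
            this.maxEventTime = maxEventTime;
        }
    }

    /** Returns only the shards whose time range overlaps [earliestTime, latestTime]. */
    static List<IndexShard> shardsToSearch(final List<IndexShard> shards,
                                           final long earliestTime, final long latestTime) {
        final List<IndexShard> selected = new ArrayList<>();
        for (final IndexShard shard : shards) {
            if (shard.maxEventTime >= earliestTime && shard.minEventTime <= latestTime) {
                selected.add(shard);
            }
        }
        return selected;
    }

    public static void main(final String[] args) {
        final List<IndexShard> shards = Arrays.asList(
            new IndexShard("index-0", 0L, 99L),
            new IndexShard("index-1", 100L, 199L),
            new IndexShard("index-2", 200L, 299L),
            new IndexShard("index-3", 300L, 399L));

        // UUID only: no lower bound is known, so every shard must be searched.
        System.out.println(shardsToSearch(shards, 0L, Long.MAX_VALUE).size()); // prints 4

        // Event looked up by ID: its lineage start date (250 here) is the lower bound.
        System.out.println(shardsToSearch(shards, 250L, Long.MAX_VALUE).size()); // prints 2
    }
}
```

In this sketch the upper bound stays at Long.MAX_VALUE, as in the call in the
diff, so only the lower bound shrinks the search.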

