[ https://issues.apache.org/jira/browse/NIFI-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph Witt updated NIFI-4993:
------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> ReportLineageToAtlas complete path strategy does not report some lineages
> with secured NiFi
> -------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4993
>                 URL: https://issues.apache.org/jira/browse/NIFI-4993
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.5.0
>            Reporter: Koji Kawamura
>            Assignee: Koji Kawamura
>            Priority: Major
>             Fix For: 1.6.0
>
>         Attachments: NIFI-4993.xml, flow-screenshot.png, hdfs-route.png,
> kafka-route.png, unauthorized-query-with-fix.png, unauthorized-query.png
>
>
> The ReportLineageToAtlas 'complete path' strategy issues NiFi provenance
> lineage queries as an anonymous user. If NiFi is secured and the user who made
> the lineage query request does not have the required privileges, NiFi returns
> the provenance event type as UNKNOWN and does not fully traverse the lineage.
> Specifically, the authorization is implemented here:
>
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java#L2641]
> {code:java|title=PersistentProvenanceRepository$ComputeLineageRunnable.run}
> final StandardLineageResult result = submission.getResult();
> result.update(replaceUnauthorizedWithPlaceholders(matchingRecords, user),
>     matchingRecords.size());
> {code}
> This affects the ReportLineageToAtlas 'complete path' strategy, as it will not
> be able to traverse parent provenance events to analyze the full lineage path
> for a FlowFile. As a result, the reporting task cannot report lineage for some
> flow structures.
> For example, with the following NiFi flow, the FlowFile that was RECEIVEd by
> GetFile went through the Kafka route (the right branch). Also, the FlowFile was
> CLONEd to go through the Hive and HDFS routes.
> !flow-screenshot.png|width=100%!
> Then the original FlowFile that went through the Kafka route would have a NiFi
> lineage like this. This lineage can be retrieved by a single lineage query and
> works even with an anonymous user. These routes can be reported to Atlas:
> !kafka-route.png|width=180!
> However, the CLONEd routes would have the following lineage. This graph was
> queried from the NiFi UI by a NiFi user who has sufficient privileges. But with
> an anonymous user, the link from SEND (23) to the FlowFile then CLONE (18) is
> not returned, because event types are masked as UNKNOWN and the NiFi framework
> does not traverse the linkage. Thus, these cloned routes are not reported to
> Atlas.
> !hdfs-route.png!
> -ReportLineageToAtlas needs to have a property so that the user can specify a
> NiFi user id to impersonate, so that the required policies can be administered.
> 1st PR [2567|https://github.com/apache/nifi/pull/2567]-
> -Instead of letting the user specify a NiFi user id, the updated 2nd PR
> ([2577|https://github.com/apache/nifi/pull/2577]) fixes lineage computation
> with an unauthorized user.-
> The 3rd PR ([2589|https://github.com/apache/nifi/pull/2589]) attempts to fix
> this issue by modifying the ProvenanceRepository implementations to accept a
> null user so that lineage queries can be issued by NiFi internal components.
> This issue was originally reported by [~nayakmahesh616].
> A simplified NiFi flow template to test the proposed fix is attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
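
The masking behaviour described in this issue, and the null-user approach of the 3rd PR, can be illustrated with a minimal sketch. This is not NiFi code: `LineageSketch`, `Event`, `lineageTypes`, and the RECEIVE event id `1` are hypothetical names and values invented for illustration, and the assumption that a null user means "internal caller, skip authorization" is only a simplified reading of the PR description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of a provenance repository; none of these
// names are NiFi APIs.
class LineageSketch {

    /** A provenance event: its id, its type, and its parent event id (null for a root). */
    static final class Event {
        final int id;
        final String type;
        final Integer parentId;

        Event(int id, String type, Integer parentId) {
            this.id = id;
            this.type = type;
            this.parentId = parentId;
        }
    }

    private final Map<Integer, Event> events = new HashMap<>();
    private final Set<String> authorizedUsers;

    LineageSketch(Set<String> authorizedUsers) {
        this.authorizedUsers = authorizedUsers;
    }

    void add(Event event) {
        events.put(event.id, event);
    }

    /**
     * Walks from the given event toward the root and returns the event types seen.
     * Assumption: a null user marks an internal caller and bypasses authorization.
     */
    List<String> lineageTypes(int startId, String user) {
        boolean authorized = user == null || authorizedUsers.contains(user);
        List<String> types = new ArrayList<>();
        Event current = events.get(startId);
        while (current != null) {
            if (!authorized) {
                // Unauthorized: the type is masked and parents are not traversed.
                types.add("UNKNOWN");
                break;
            }
            types.add(current.type);
            current = current.parentId == null ? null : events.get(current.parentId);
        }
        return types;
    }
}
```

With a chain RECEIVE (1) <- CLONE (18) <- SEND (23) and only "admin" authorized, querying from event 23 as "anonymous" yields just [UNKNOWN], while querying with a null user yields [SEND, CLONE, RECEIVE], which is the full path the reporting task needs.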