[ 
https://issues.apache.org/jira/browse/NIFI-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-4993:
------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> ReportLineageToAtlas complete path strategy does not report some lineages 
> with secured NiFi
> -------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4993
>                 URL: https://issues.apache.org/jira/browse/NIFI-4993
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.5.0
>            Reporter: Koji Kawamura
>            Assignee: Koji Kawamura
>            Priority: Major
>             Fix For: 1.6.0
>
>         Attachments: NIFI-4993.xml, flow-screenshot.png, hdfs-route.png, 
> kafka-route.png, unauthorized-query-with-fix.png, unauthorized-query.png
>
>
> ReportLineageToAtlas 'complete path' strategy issues NiFi provenance lineage 
> queries as an anonymous user. If NiFi is secured and the user who made the 
> lineage query request does not have the required privileges, NiFi returns 
> the provenance event type as UNKNOWN and does not traverse the lineage fully.
> Specifically, the authorization is implemented here:
>  
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java#L2641]
> {code:java|title=PersistentProvenanceRepository$ComputeLineageRunnable.run}
> final StandardLineageResult result = submission.getResult();
> result.update(replaceUnauthorizedWithPlaceholders(matchingRecords, user), 
> matchingRecords.size());
> {code}
> This affects the ReportLineageToAtlas 'complete path' strategy, as it cannot 
> traverse parent provenance events to analyze the full lineage path of 
> a FlowFile. As a result, the reporting task cannot report lineage for some 
> flow structures.
>  For example, with the following NiFi flow, the FlowFile that was RECEIVEd by 
> GetFile went through the Kafka route (the right branch). The FlowFile was 
> also CLONEd to go through the Hive and HDFS routes.
> !flow-screenshot.png|width=100%!
> The original FlowFile that went through the Kafka route then has a NiFi 
> lineage like this. This lineage can be retrieved by a single lineage query and 
> works even with an anonymous user. These routes can be reported to Atlas:
>  !kafka-route.png|width=180!
> However, the CLONEd routes have the following lineage. This graph was 
> queried from the NiFi UI by a NiFi user who has sufficient privileges. With an 
> anonymous user, the link from SEND (23) to the FlowFile and then CLONE (18) is 
> not returned, because event types are masked as UNKNOWN and the NiFi framework 
> does not traverse the linkage. Thus, these cloned routes are not reported to 
> Atlas.
>  !hdfs-route.png!
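
The effect described above can be illustrated with a toy model. This is a minimal sketch, not NiFi code: the `Event` class, `maskIfUnauthorized`, `traverse`, and the FlowFile names "original" and "clone" are all hypothetical, and the RECEIVE event id is made up. Only the ids 18 and 23 and the event type names come from the description. It assumes that the traversal follows parent links only from events whose type it recognizes as CLONE.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LineageMaskingSketch {

    enum EventType { RECEIVE, CLONE, SEND, UNKNOWN }

    /** Hypothetical, heavily simplified event; not NiFi's ProvenanceEventRecord. */
    static final class Event {
        final long id;
        final EventType type;
        final String flowFileUuid;
        final List<String> parentUuids;

        Event(long id, EventType type, String flowFileUuid, List<String> parentUuids) {
            this.id = id;
            this.type = type;
            this.flowFileUuid = flowFileUuid;
            this.parentUuids = parentUuids;
        }
    }

    /** Stand-in for the placeholder replacement: unauthorized callers see UNKNOWN types. */
    static List<Event> maskIfUnauthorized(List<Event> events, boolean authorized) {
        if (authorized) {
            return events;
        }
        List<Event> masked = new ArrayList<>();
        for (Event e : events) {
            masked.add(new Event(e.id, EventType.UNKNOWN, e.flowFileUuid, e.parentUuids));
        }
        return masked;
    }

    /** Collects the FlowFiles reachable from startUuid, following parents of CLONE events only. */
    static Set<String> traverse(List<Event> events, String startUuid) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(startUuid);
        while (!queue.isEmpty()) {
            String uuid = queue.poll();
            if (!visited.add(uuid)) {
                continue; // already handled
            }
            for (Event e : events) {
                // A masked (UNKNOWN) event is never recognized as a CLONE, so its parents are skipped.
                if (e.flowFileUuid.equals(uuid) && e.type == EventType.CLONE) {
                    queue.addAll(e.parentUuids);
                }
            }
        }
        return visited;
    }

    /** Illustrative events: the RECEIVE id is invented, 18 and 23 mirror the screenshot. */
    static List<Event> sampleEvents() {
        return Arrays.asList(
                new Event(1, EventType.RECEIVE, "original", Collections.<String>emptyList()),
                new Event(18, EventType.CLONE, "clone", Arrays.asList("original")),
                new Event(23, EventType.SEND, "clone", Collections.<String>emptyList()));
    }

    public static void main(String[] args) {
        List<Event> events = sampleEvents();
        System.out.println(traverse(maskIfUnauthorized(events, true), "clone"));
        System.out.println(traverse(maskIfUnauthorized(events, false), "clone"));
    }
}
```

In this model the authorized traversal reaches both "clone" and "original", while the masked traversal stops at "clone", which mirrors the missing link between SEND (23) and CLONE (18).
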
> -ReportLineageToAtlas needs to have a property so that the user can specify a 
> NiFi user id to impersonate, so that the required policies can be administered. 
> 1st PR [2567|https://github.com/apache/nifi/pull/2567]-
> -Instead of letting the user specify a NiFi user id, the updated 2nd PR 
> ([2577|https://github.com/apache/nifi/pull/2577]) fixes lineage computation 
> with an unauthorized user.-
> The 3rd PR ([2589|https://github.com/apache/nifi/pull/2589]) attempts to fix 
> this issue by modifying ProvenanceRepository implementations to accept a null 
> user so that lineage queries can be called by NiFi internal components.
> This issue was originally reported by [~nayakmahesh616].
> A simplified NiFi flow template to test the proposed fix is attached.
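
The idea behind the 3rd PR, accepting a null user for calls made by internal components, might look roughly like the sketch below. This is an illustration under assumptions, not the code from PR 2589: the `User` interface and the `isAuthorized` and `visibleEventType` methods are hypothetical names, and treating a null user as "skip the authorization check" is an interpretation of the description above.

```java
public class NullUserLineageSketch {

    /** Hypothetical stand-in for a NiFi user; not the real NiFiUser interface. */
    interface User {
        boolean canReadProvenance();
    }

    /**
     * Assumption: a null user means the query comes from an internal NiFi component
     * (such as a reporting task), so no authorization check is applied.
     */
    static boolean isAuthorized(User user) {
        if (user == null) {
            return true;
        }
        return user.canReadProvenance();
    }

    /** Returns the real event type for authorized callers and UNKNOWN otherwise. */
    static String visibleEventType(String actualType, User user) {
        return isAuthorized(user) ? actualType : "UNKNOWN";
    }

    public static void main(String[] args) {
        User anonymous = () -> false; // a user without the required privilege
        System.out.println(visibleEventType("CLONE", anonymous));
        System.out.println(visibleEventType("CLONE", null));
    }
}
```

With this shape, an unprivileged user still sees masked event types, while an internal caller passing null sees the real types and can traverse the full lineage.
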



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
