[
https://issues.apache.org/jira/browse/NIFI-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417669#comment-16417669
]
ASF GitHub Bot commented on NIFI-4993:
--------------------------------------
Github user markap14 commented on the issue:
https://github.com/apache/nifi/pull/2589
@ijokarumawak thanks for the latest updates. All looks good to me at this
point. +1 merged to master.
> ReportLineageToAtlas complete path strategy does not report some lineages
> with secured NiFi
> -------------------------------------------------------------------------------------------
>
> Key: NIFI-4993
> URL: https://issues.apache.org/jira/browse/NIFI-4993
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.5.0
> Reporter: Koji Kawamura
> Assignee: Koji Kawamura
> Priority: Major
> Attachments: NIFI-4993.xml, flow-screenshot.png, hdfs-route.png,
> kafka-route.png, unauthorized-query-with-fix.png, unauthorized-query.png
>
>
> ReportLineageToAtlas 'complete path' strategy issues NiFi provenance lineage
> queries as an anonymous user. If NiFi is secured and the user who made the
> lineage query request does not have the required privileges, NiFi returns
> the provenance event type as UNKNOWN and does not traverse the lineage fully.
> Specifically, the authorization is implemented here:
>
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java#L2641]
> {code:java|title=PersistentProvenanceRepository$ComputeLineageRunnable.run}
> final StandardLineageResult result = submission.getResult();
> result.update(replaceUnauthorizedWithPlaceholders(matchingRecords, user),
> matchingRecords.size());
> {code}
> This affects the ReportLineageToAtlas 'complete path' strategy because it
> cannot traverse parent provenance events to analyze the full lineage path of
> a FlowFile. As a result, the reporting task cannot report lineage for some
> flow structures.
> For example, with the following NiFi flow, the FlowFile that was RECEIVEd by
> GetFile went through the Kafka route (the right branch). The FlowFile was
> also CLONEd to go through the Hive and HDFS routes.
> !flow-screenshot.png|width=100%!
> The original FlowFile that went through the Kafka route then has a NiFi
> lineage like this. This lineage can be retrieved with a single lineage query,
> which works even with an anonymous user, so these routes can be reported to
> Atlas:
> !kafka-route.png|width=180!
> However, the CLONEd routes have the following lineage. This graph was
> queried from the NiFi UI by a NiFi user who has sufficient privileges. With
> an anonymous user, the link from SEND (23) to the FlowFile and then to
> CLONE (18) is not returned, because event types are masked as UNKNOWN and
> the NiFi framework does not traverse the linkage. Thus, these cloned routes
> are not reported to Atlas.
> !hdfs-route.png!
> -ReportLineageToAtlas needs a property that lets the user specify a NiFi
> user id to impersonate, so that the required policies can be administered.
> 1st PR [2567|https://github.com/apache/nifi/pull/2567]-
> -Instead of letting the user specify a NiFi user id, the updated 2nd PR
> ([2577|https://github.com/apache/nifi/pull/2577]) fixes lineage computation
> for unauthorized users.-
> The 3rd PR ([2589|https://github.com/apache/nifi/pull/2589]) attempts to fix
> this issue by modifying the ProvenanceRepository implementations to accept a
> null user, so that lineage queries can be issued by NiFi internal components.
> This issue was originally reported by [~nayakmahesh616].
> A simplified NiFi flow template to test the proposed fix is attached.
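
The masking behavior described in the quoted issue can be illustrated with a small, self-contained sketch. It is not NiFi code: the `Event` class, `isAuthorized`, `maskForUser`, `traverse`, the "admin" user name, and the FlowFile ids "parent" and "child" are all hypothetical names chosen for illustration, and the sketch assumes a placeholder keeps the event id and FlowFile UUID but loses its type and parent link. It only shows why masking events as UNKNOWN stops the walk from a cloned FlowFile back to its parent, and how treating a null user as an internal caller (the idea behind the 3rd PR) avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LineageMaskingSketch {

    enum EventType { RECEIVE, CLONE, SEND, UNKNOWN }

    /** Hypothetical, simplified event; not NiFi's ProvenanceEventRecord. */
    static final class Event {
        final long id;
        final EventType type;
        final String flowFileUuid;
        final String parentUuid; // null when the event carries no parent link

        Event(long id, EventType type, String flowFileUuid, String parentUuid) {
            this.id = id;
            this.type = type;
            this.flowFileUuid = flowFileUuid;
            this.parentUuid = parentUuid;
        }
    }

    /** Assumption: a null user is an internal caller; only "admin" is authorized. */
    static boolean isAuthorized(String user) {
        return user == null || "admin".equals(user);
    }

    /** Replaces every event with an UNKNOWN placeholder for unauthorized users. */
    static List<Event> maskForUser(List<Event> events, String user) {
        if (isAuthorized(user)) {
            return new ArrayList<>(events);
        }
        List<Event> masked = new ArrayList<>();
        for (Event e : events) {
            // The placeholder drops the type and the parent link.
            masked.add(new Event(e.id, EventType.UNKNOWN, e.flowFileUuid, null));
        }
        return masked;
    }

    /** Walks from a FlowFile up through CLONE parent links, collecting event types. */
    static List<EventType> traverse(List<Event> visible, String startUuid) {
        Map<String, List<Event>> byUuid = new HashMap<>();
        for (Event e : visible) {
            byUuid.computeIfAbsent(e.flowFileUuid, k -> new ArrayList<>()).add(e);
        }
        List<EventType> seen = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        String uuid = startUuid;
        while (uuid != null && visited.add(uuid)) {
            String next = null;
            for (Event e : byUuid.getOrDefault(uuid, Collections.<Event>emptyList())) {
                seen.add(e.type);
                if (e.type == EventType.CLONE && e.parentUuid != null) {
                    next = e.parentUuid; // only a visible CLONE leads to the parent
                }
            }
            uuid = next;
        }
        return seen;
    }

    /** Sample events loosely modeled on the cloned route in the issue. */
    static List<Event> sampleEvents() {
        return Arrays.asList(
                new Event(1, EventType.RECEIVE, "parent", null),
                new Event(18, EventType.CLONE, "child", "parent"),
                new Event(23, EventType.SEND, "child", null));
    }

    public static void main(String[] args) {
        List<Event> events = sampleEvents();
        System.out.println(traverse(maskForUser(events, null), "child"));
        System.out.println(traverse(maskForUser(events, "anonymous"), "child"));
    }
}
```

Running `main` prints `[CLONE, SEND, RECEIVE]` for the null (internal) user and `[UNKNOWN, UNKNOWN]` for the anonymous user: in the second case the walk never reaches the parent FlowFile's RECEIVE event, which corresponds to the cloned routes missing from the Atlas report.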