[
https://issues.apache.org/jira/browse/NIFI-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410923#comment-16410923
]
ASF GitHub Bot commented on NIFI-4993:
--------------------------------------
GitHub user ijokarumawak opened a pull request:
https://github.com/apache/nifi/pull/2577
NIFI-4993: Fixed unauthorized lineage computation
## For reviewer(s)
Please be aware that there is another major issue with the 'complete path'
strategy, covered by #2542. This fix should be easier to confirm if that PR is
cherry-picked as well when you review this one (in case it has not yet been
merged to the master branch).
---
Before this fix, if provenance lineage is queried by a user who does
not have the 'view the data' privilege for a component that emits
provenance events which create new FlowFiles, lineage computation stops
at that component instead of connecting other available events with
the created FlowFile lineage node. The expected edges are not created
because PlaceholderProvenanceEvent returns the 'UNKNOWN'
event type, so the edge population logic that relies on the type does not work.
This commit modifies PlaceholderProvenanceEvent so that it can return
the original event type if necessary, as well as child and parent event
UUIDs.
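The effect of the masked event type on edge creation can be illustrated with a minimal sketch. Everything in it is hypothetical and simplified for illustration: `PlaceholderLineageSketch`, `PlaceholderEvent`, `EventType`, and `childEdges` are stand-ins, not the actual NiFi classes or the code changed by this PR. It assumes that edges to child FlowFiles are only built when the event type is known to create FlowFiles (CLONE here).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model; these are NOT the real NiFi classes.
final class PlaceholderLineageSketch {

    enum EventType { RECEIVE, CLONE, SEND, UNKNOWN }

    // Stand-in for the placeholder returned for an unauthorized component.
    static final class PlaceholderEvent {
        final String flowFileUuid;
        final EventType originalType;   // null models the behaviour before the fix
        final List<String> childUuids;

        PlaceholderEvent(String flowFileUuid, EventType originalType, List<String> childUuids) {
            this.flowFileUuid = flowFileUuid;
            this.originalType = originalType;
            this.childUuids = childUuids;
        }

        // Report the original type when it was retained, otherwise UNKNOWN.
        EventType getEventType() {
            return originalType != null ? originalType : EventType.UNKNOWN;
        }
    }

    // Build "parent -> child" edges only for an event type that creates FlowFiles.
    static List<String> childEdges(PlaceholderEvent event) {
        List<String> edges = new ArrayList<>();
        if (event.getEventType() == EventType.CLONE) {
            for (String child : event.childUuids) {
                edges.add(event.flowFileUuid + " -> " + child);
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        // Masked placeholder: type and children are hidden, so no edges are built.
        PlaceholderEvent masked =
                new PlaceholderEvent("parent", null, Collections.<String>emptyList());
        // Placeholder that retains the original type and child UUIDs.
        PlaceholderEvent retained =
                new PlaceholderEvent("parent", EventType.CLONE, Arrays.asList("child-1", "child-2"));

        System.out.println(childEdges(masked));   // prints []
        System.out.println(childEdges(retained)); // prints [parent -> child-1, parent -> child-2]
    }
}
```

In this sketch the placeholder still hides everything else about the event; only the type and the related UUIDs are exposed, which is what lets the lineage graph stay connected.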
Thank you for submitting a contribution to Apache NiFi.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number
you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target
branch (typically master)?
- [ ] Is your initial contribution a single, squashed commit?
### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to
.name (programmatic access) for each of the new properties?
### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in
which it is rendered?
### Note:
Please ensure that once the PR is submitted, you check travis-ci for build
issues and submit an update to your PR as soon as possible.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ijokarumawak/nifi nifi-4993-connect-lineage
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nifi/pull/2577.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2577
----
commit 39caff73274aeaf0fc611c9bb2e192c956467601
Author: Koji Kawamura <ijokarumawak@...>
Date: 2018-03-23T06:57:17Z
NIFI-4993: Fixed unauthorized lineage computation
Before this fix, if provenance lineage is queried by a user who does
not have the 'view the data' privilege for a component that emits
provenance events which create new FlowFiles, lineage computation stops
at that component instead of connecting other available events with
the created FlowFile lineage node. The expected edges are not created
because PlaceholderProvenanceEvent returns the 'UNKNOWN'
event type, so the edge population logic that relies on the type does not work.
This commit modifies PlaceholderProvenanceEvent so that it can return
the original event type if necessary, as well as child and parent event
UUIDs.
----
> ReportLineageToAtlas complete path strategy does not report some lineages
> with secured NiFi
> -------------------------------------------------------------------------------------------
>
> Key: NIFI-4993
> URL: https://issues.apache.org/jira/browse/NIFI-4993
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.5.0
> Reporter: Koji Kawamura
> Assignee: Koji Kawamura
> Priority: Major
> Attachments: flow-screenshot.png, hdfs-route.png, kafka-route.png
>
>
> The ReportLineageToAtlas 'complete path' strategy issues NiFi provenance
> lineage queries as an anonymous user. If NiFi is secured and the user who
> made the lineage query request does not have the required privileges, NiFi
> returns the provenance event type as UNKNOWN and does not traverse the
> lineage fully.
> Specifically, the authorization is implemented here:
>
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java#L2641]
> {code:java|title=PersistentProvenanceRepository$ComputeLineageRunnable.run}
> final StandardLineageResult result = submission.getResult();
> result.update(replaceUnauthorizedWithPlaceholders(matchingRecords, user),
> matchingRecords.size());
> {code}
> This affects the ReportLineageToAtlas 'complete path' strategy, as it is not
> able to traverse parent provenance events to analyze the full lineage path of
> a FlowFile. As a result, the reporting task cannot report lineage for some
> flow structures.
> For example, with the following NiFi flow, the FlowFile that was RECEIVEd by
> GetFile went through the Kafka route (the right branch). The FlowFile was
> also CLONEd to go through the Hive and HDFS routes.
> !flow-screenshot.png|width=100%!
> The original FlowFile that went through the Kafka route has a NiFi lineage
> like the one below. This lineage can be retrieved by a single lineage query
> and works even with an anonymous user, so these routes can be reported to
> Atlas:
> !kafka-route.png|width=180!
> However, the CLONEd routes have the following lineage. This graph was
> queried from the NiFi UI by a NiFi user who has sufficient privileges. With
> an anonymous user, the link from SEND (23) to the FlowFile and then to
> CLONE (18) is not returned, because event types are masked as UNKNOWN and
> the NiFi framework does not traverse the linkage. Thus, these cloned routes
> are not reported to Atlas.
> !hdfs-route.png!
> ReportLineageToAtlas needs a property that lets the user specify a NiFi
> user ID to impersonate, so that the required policies can be administered.
> This issue was originally reported by [~nayakmahesh616].
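
The placeholder substitution described in the quoted issue can be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not the NiFi implementation: `UnauthorizedMaskingSketch`, `Event`, and `replaceUnauthorized` are hypothetical names, and authorization is reduced to a predicate over component IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified model; NOT the real PersistentProvenanceRepository code.
final class UnauthorizedMaskingSketch {

    static final class Event {
        final String componentId;
        final String eventType;

        Event(String componentId, String eventType) {
            this.componentId = componentId;
            this.eventType = eventType;
        }
    }

    // Replace every event the user may not view with an UNKNOWN placeholder.
    // The list size is preserved; only the event details are masked.
    static List<Event> replaceUnauthorized(List<Event> events, Predicate<String> canView) {
        List<Event> result = new ArrayList<>(events.size());
        for (Event event : events) {
            if (canView.test(event.componentId)) {
                result.add(event);
            } else {
                result.add(new Event(event.componentId, "UNKNOWN"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("GetFile", "RECEIVE"),
                new Event("PutHDFS", "SEND"));

        // An anonymous user is modelled here as having no 'view the data' access at all.
        for (Event event : replaceUnauthorized(events, componentId -> false)) {
            System.out.println(event.componentId + ": " + event.eventType);
        }
    }
}
```

With every type reported as UNKNOWN, a consumer such as the reporting task has nothing to distinguish a CLONE from a SEND, which is why the cloned routes drop out of the reported lineage.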
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)