[
https://issues.apache.org/jira/browse/NIFI-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405761#comment-16405761
]
ASF GitHub Bot commented on NIFI-4993:
--------------------------------------
GitHub user ijokarumawak opened a pull request:
https://github.com/apache/nifi/pull/2567
NIFI-4993: Add NiFi UserId to ReportLineageToAtlas
## For reviewer(s)
Please be aware that there is another major issue with the 'complete path'
strategy, addressed in #2542. This fix is easier to confirm if that PR is
cherry-picked as well when you review this one (in case it has not yet been
merged to the master branch).
---
For the 'complete path' strategy to work as expected, full lineage
information must be queried as an existing NiFi user who has sufficient
privileges.
Added a new reporting task property so that users can specify such a user id.
Also added exception handling logic to inform users that a proper user
id is required.
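
As a rough illustration of the kind of check described above, the following is a minimal sketch and not the code from this PR. The class, the exception type, the method name `requireQueryUser`, and the set of privileged user ids are all hypothetical stand-ins; NiFi's real authorization is policy-based and is not modeled here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class LineageUserIdSketch {

    /** Hypothetical exception type; the PR's actual exception handling is not shown in this message. */
    static final class LineageQueryUserException extends RuntimeException {
        LineageQueryUserException(String message) {
            super(message);
        }
    }

    /**
     * Returns the user id to run lineage queries as, or fails with a message
     * that tells the operator what to fix. The privileged set is an assumption
     * standing in for NiFi's policy checks.
     */
    static String requireQueryUser(String configuredUserId, Set<String> privilegedUserIds) {
        if (configuredUserId == null || configuredUserId.trim().isEmpty()) {
            throw new LineageQueryUserException(
                "No NiFi user id is configured; the 'complete path' strategy needs an existing user with sufficient privilege.");
        }
        String userId = configuredUserId.trim();
        if (!privilegedUserIds.contains(userId)) {
            throw new LineageQueryUserException(
                "NiFi user id '" + userId + "' cannot query full lineage; specify a user with sufficient privilege.");
        }
        return userId;
    }

    public static void main(String[] args) {
        Set<String> privileged = new HashSet<>(Arrays.asList("admin"));
        System.out.println(requireQueryUser("admin", privileged));
        try {
            requireQueryUser("", privileged);
        } catch (LineageQueryUserException e) {
            // The message is what an operator would see in a log or bulletin.
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early with a descriptive message, rather than silently reporting partial lineage, is the design point this sketch is meant to show.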
---
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ijokarumawak/nifi nifi-4993
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nifi/pull/2567.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2567
----
commit 3c2a64f90ddbfa80047b702b29f3247f6d5bc927
Author: Koji Kawamura <ijokarumawak@...>
Date: 2018-03-20T03:22:46Z
NIFI-4993: Add NiFi UserId to ReportLineageToAtlas
An existing NiFi user id who has sufficient privilege needs to be used
in order to query full lineage information for 'complete path' strategy
to work as expected.
Added new reporting task property so that user can specify such user id.
Also added exception handling logic to inform users that a proper user
id is required.
----
> ReportLineageToAtlas complete path strategy does not report some lineages
> with secured NiFi
> -------------------------------------------------------------------------------------------
>
> Key: NIFI-4993
> URL: https://issues.apache.org/jira/browse/NIFI-4993
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.5.0
> Reporter: Koji Kawamura
> Assignee: Koji Kawamura
> Priority: Major
> Attachments: flow-screenshot.png, hdfs-route.png, kafka-route.png
>
>
> The ReportLineageToAtlas 'complete path' strategy runs NiFi provenance lineage
> queries as an anonymous user. If NiFi is secured and the user making the
> lineage query lacks the required privileges, NiFi returns the provenance
> event type as UNKNOWN and does not traverse the lineage fully.
> Specifically, the authorization is implemented here:
>
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java#L2641]
> {code:java|title=PersistentProvenanceRepository$ComputeLineageRunnable.run}
> final StandardLineageResult result = submission.getResult();
> result.update(replaceUnauthorizedWithPlaceholders(matchingRecords, user),
> matchingRecords.size());
> {code}
> This affects the ReportLineageToAtlas 'complete path' strategy, because it
> cannot traverse parent provenance events to analyze the full lineage path of
> a FlowFile. As a result, the reporting task cannot report lineage for some
> flow structures.
> For example, in the following NiFi flow, the FlowFile that was RECEIVEd by
> GetFile went through the Kafka route (the right branch). The FlowFile was
> also CLONEd to go through the Hive and HDFS routes.
> !flow-screenshot.png|width=100%!
> The original FlowFile that went through the Kafka route then has a NiFi
> lineage like this. This lineage can be retrieved with a single lineage query,
> which works even with an anonymous user, so these routes can be reported to Atlas:
> !kafka-route.png|width=180!
> However, the CLONEd routes have the following lineage. This graph was
> queried from the NiFi UI by a NiFi user with sufficient privileges. With an
> anonymous user, the link from SEND (23) to the FlowFile and then to CLONE (18)
> is not returned, because event types are masked as UNKNOWN and the NiFi
> framework does not traverse the linkage. Thus, these cloned routes are not
> reported to Atlas.
> !hdfs-route.png!
> ReportLineageToAtlas needs a property that lets the user specify a NiFi
> user id to impersonate, so that the required policies can be administered.
> This issue was originally reported by [~nayakmahesh616].
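
The masking effect described in the issue can be pictured with a small, simplified model. This is only a sketch under stated assumptions, not NiFi's provenance implementation: the `Event` class, `viewAs`, and `traverseParents` are hypothetical, the event ids 23 (SEND) and 18 (CLONE) come from the issue text, and the RECEIVE id 10 is made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MaskedLineageSketch {

    enum EventType { RECEIVE, CLONE, SEND, UNKNOWN }

    /** Simplified provenance event: an id, a type, and the ids of its parent events. */
    static final class Event {
        final long id;
        final EventType type;
        final List<Long> parentIds;

        Event(long id, EventType type, Long... parentIds) {
            this.id = id;
            this.type = type;
            this.parentIds = Arrays.asList(parentIds);
        }
    }

    /** An unauthorized user sees only a placeholder: type UNKNOWN and no parent links. */
    static Event viewAs(Event event, boolean authorized) {
        return authorized ? event : new Event(event.id, EventType.UNKNOWN);
    }

    /** Walks from a start event toward its ancestors and returns the ids reached, in visit order. */
    static List<Long> traverseParents(Map<Long, Event> repo, long startId, boolean authorized) {
        List<Long> visited = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        Deque<Long> queue = new ArrayDeque<>();
        queue.add(startId);
        while (!queue.isEmpty()) {
            long id = queue.poll();
            Event stored = repo.get(id);
            if (stored == null || !seen.add(id)) {
                continue;
            }
            Event visible = viewAs(stored, authorized);
            visited.add(id);
            if (visible.type == EventType.UNKNOWN) {
                continue; // masked event: the linkage cannot be followed any further
            }
            queue.addAll(visible.parentIds);
        }
        return visited;
    }

    /** RECEIVE (10, made-up id) -> CLONE (18) -> SEND (23). */
    static Map<Long, Event> sampleRepo() {
        Map<Long, Event> repo = new HashMap<>();
        repo.put(10L, new Event(10L, EventType.RECEIVE));
        repo.put(18L, new Event(18L, EventType.CLONE, 10L));
        repo.put(23L, new Event(23L, EventType.SEND, 18L));
        return repo;
    }

    public static void main(String[] args) {
        Map<Long, Event> repo = sampleRepo();
        System.out.println("privileged user: " + traverseParents(repo, 23L, true));  // [23, 18, 10]
        System.out.println("anonymous user:  " + traverseParents(repo, 23L, false)); // [23]
    }
}
```

In this model the privileged query reaches the CLONE and RECEIVE ancestors of SEND (23), while the anonymous query stops at the first masked event, which mirrors why the cloned routes never reach Atlas.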