Philipp Freyer created NIFI-15233:
-------------------------------------

             Summary: Transported GitHub flows follow foreign branches after 
merging a PR
                 Key: NIFI-15233
                 URL: https://issues.apache.org/jira/browse/NIFI-15233
             Project: Apache NiFi
          Issue Type: Improvement
            Reporter: Philipp Freyer


I am not sure whether this should be classified as a bug instead.
h2. Environment description:

Two or more Apache NiFi instances are connected to the same GitHub repository. 
This is done to separate NiFi flow development from the flow used in production.

These Apache NiFi instances are configured to use different Git branches to 
separate their flow versions, so that a flow version can only be used in 
production once it has been approved.
The production Git branch is protected so that changes can only land there via 
pull requests, with approvals required.

To facilitate the transport of NiFi flows from development to production 
(possibly through other NiFi instances, such as for testing), pull requests can 
be used to update the configuration between Git branches.

The root process group of the NiFi flow (the first PG created on the canvas) is 
versioned; all flow logic resides under this versioned process group.

Since the flow is huge and development may occur in parallel on different areas 
of the flow, process groups (separating different flow logic) are versioned as 
separate flows.
h2. Scenario:

A company uses NiFi to transport potentially sensitive data. That company has a 
bad actor who, given the environment above, does the following:

*On the development system:*

1) Go to a versioned sub-process group that works on said sensitive data
2) Add a flow configuration that sends this sensitive data to a remote server, 
controlled by said bad actor.
3) Create a new flow version for that flow => Version M
4) Revert the changes
5) Create a new flow version for that flow (now without the malicious 
configuration) => Version N

*On the production system* (either the actor themselves, or someone the actor 
has told that this is an urgent fix):
1) Find the same flow
2) Set it to Version M, which is available without any approval or pull request

This scenario is hard to detect: a version change in a sub-process group is not 
visible in higher process groups (at least when there is one or more other 
versioned sub-process groups on the breadcrumb path), and the development 
system does not show the changes, since they were reverted in Version N.

No approval was necessary even though the production system uses a separate, 
protected GitHub branch; see the problem description.
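
To make the scenario concrete, here is a minimal toy model of it. This is not 
NiFi code: the `Repo` class, the branch names and the version labels are all 
made up for illustration, and the only assumption carried over from the report 
is that the versions selectable for a sub-process group are looked up via the 
branch stored in its reference.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy stand-in for a Git repository: branch name -> ordered flow versions."""
    branches: dict = field(default_factory=dict)

    def commit(self, branch, version):
        self.branches.setdefault(branch, []).append(version)

    def versions(self, branch):
        return list(self.branches.get(branch, []))


repo = Repo()
# Development system: malicious Version M, then the reverted Version N.
repo.commit("development", "M")
repo.commit("development", "N")
# Only N goes through a pull request into the protected branch.
repo.commit("production", "N")

# The transported sub-process group still carries its original branch reference.
sub_pg_coordinates = {"branch": "development", "version": "N"}

# Versions the production instance can switch to are resolved via that reference.
selectable = repo.versions(sub_pg_coordinates["branch"])
approved = repo.versions("production")
unapproved = [v for v in selectable if v not in approved]
print(unapproved)  # ['M']
```

In this model, Version M is selectable on production although it never passed 
the protected branch.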
h2. Problem description:

The above scenario is possible because the GitHubFlow definition specifies not 
only a commit ID but also the branch that the commit is on. Thus, if the 
versioned root process group is replicated from the development branch to the 
production branch, any references stored in that versioned root process group 
would, after transport, still contain the branch reference to the development 
branch.

This can be seen on the production system (after all, the branch is shown in 
the version info), but it cannot be changed there. It can only be changed with 
access to the branch on GitHub, not graphically in NiFi itself. Any new version 
would also change the branch again if it is different in the GitHub definition.

However, since the separately versioned sub-process group was created with a 
reference to the development branch, any new versions are also picked up from 
the development branch, including versions that were neither approved nor 
transported to the production Git branch. As a result, it is possible to "hide" 
malicious flow versions in the development branch that will be available in 
production, bypassing the guardrails set up in GitHub that a company may rely 
on.
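
As a rough illustration of how such leftover references could be spotted, the 
following sketch walks a flow snapshot and reports nested process groups whose 
stored branch differs from the one the instance is supposed to follow. The JSON 
layout and the field names (`flowContents`, `processGroups`, 
`versionedFlowCoordinates`, `branch`) are assumptions about the exported 
snapshot format and should be checked against a real file; the sample data and 
the function name are invented for this example.

```python
import json


def find_foreign_branch_refs(group, expected_branch, path=""):
    """Yield (path, branch) for nested groups whose reference points elsewhere."""
    for child in group.get("processGroups", []):
        child_path = f"{path}/{child.get('name', '?')}"
        coords = child.get("versionedFlowCoordinates") or {}
        branch = coords.get("branch")
        if branch is not None and branch != expected_branch:
            yield child_path, branch
        # Descend further in case deeper groups are embedded in this snapshot.
        yield from find_foreign_branch_refs(child, expected_branch, child_path)


# Invented sample snapshot; field names are assumed, not confirmed.
snapshot = json.loads("""
{
  "flowContents": {
    "name": "root",
    "processGroups": [
      {"name": "ingest",
       "versionedFlowCoordinates": {"branch": "development", "version": "abc123"},
       "processGroups": []},
      {"name": "export",
       "versionedFlowCoordinates": {"branch": "production", "version": "def456"},
       "processGroups": []}
    ]
  }
}
""")

hits = list(find_foreign_branch_refs(snapshot["flowContents"], "production"))
print(hits)  # [('/ingest', 'development')]
```

Such a check could, for example, run against the production branch after a 
merge, but it only detects the foreign reference; it does not prevent it.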
h2. Proposed solutions:

1) Similar to Parameter Context values, it would be good to be able to define 
branch reference behavior. An option to enforce the use of only one branch 
(`development` and `production` on the respective NiFi instances, for example) 
would mean that the malicious Version M (mentioned above) would not be 
available without another PR and transport.

2) Alternatively, it would be great to be able to graphically change the branch 
that a versioned process group is following and to protect this change from 
being overwritten by the versioned flows. This would mean manual intervention 
on the production system, though.

I realize that both proposed solutions may have issues when commit IDs are not 
preserved (due to different ways of merging/transporting the logic between 
branches), and that another way of referencing the flow version may be needed 
for stable, transportable version references, based on IDs controlled by 
Apache NiFi itself.
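
Proposed solution 1 could behave roughly like the sketch below. It is purely 
hypothetical: `Coordinates`, `apply_branch_policy`, `ForeignBranchError` and the 
`enforced_branch` setting do not exist in NiFi and only illustrate the intended 
behavior of rejecting a foreign branch reference or pinning it to the 
instance's own branch.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Coordinates:
    """Hypothetical, simplified reference to a versioned flow."""
    flow_id: str
    branch: str
    version: str  # commit ID


class ForeignBranchError(Exception):
    """Raised when a reference points at a branch the instance may not follow."""


def apply_branch_policy(coords: Coordinates,
                        enforced_branch: Optional[str],
                        rewrite: bool = False) -> Coordinates:
    # No policy configured, or the reference already matches: leave it alone.
    if enforced_branch is None or coords.branch == enforced_branch:
        return coords
    # Pin the reference to the instance's own branch instead of failing.
    if rewrite:
        return replace(coords, branch=enforced_branch)
    raise ForeignBranchError(
        f"{coords.flow_id}: branch {coords.branch!r} is not {enforced_branch!r}"
    )


transported = Coordinates("sub-flow", "development", "abc123")
pinned = apply_branch_policy(transported, "production", rewrite=True)
print(pinned.branch)  # production
```

Note that the rewrite keeps the commit ID unchanged, so it only works if that 
commit also exists on the enforced branch, which is exactly the caveat about 
commit IDs mentioned above.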



