Philipp Freyer created NIFI-15233:
-------------------------------------
Summary: Transported GitHub flows follow foreign branches after
merging a PR
Key: NIFI-15233
URL: https://issues.apache.org/jira/browse/NIFI-15233
Project: Apache NiFi
Issue Type: Improvement
Reporter: Philipp Freyer
I am not sure if this could be seen as a bug instead.
h2. Environment description:
Two or more Apache NiFi instances are connected to the same GitHub repository.
This is done to separate NiFi flow development from the flow used in production.
These Apache NiFi instances are configured to use different Git branches to
separate their flow versions so that a flow version can only be used in
production once it has been approved.
The production Git branch is protected so that changes can only reach it
through pull requests, with approvals required.
To facilitate the transportation of NiFi flows from development to production
(possibly through other NiFi instances, such as for testing), pull requests can
be used to update the configuration between Git branches.
The root process group of the NiFi flow (the first PG created on the canvas) is
versioned; all flow logic resides under this versioned process group.
Since the flow is huge and development may occur in parallel on different areas
of the flow, process groups (separating different flow logic) are versioned as
separate flows.
h2. Scenario:
A company uses NiFi to transport potentially sensitive data. That company has a
bad actor who, given the environment above, does the following:
*On the development system:*
1) Go to a versioned sub-process group that works on said sensitive data
2) Add a flow configuration that sends this sensitive data to a remote server,
controlled by said bad actor.
3) Create a new flow version for that flow => Version M
4) Revert the changes
5) Create a new flow version for that flow (now without the malicious
configuration) => Version N
*On the productive system* (done either by the actor directly or by someone the
actor has told that this is an urgent fix):
1) Find the same flow
2) Set it to Version M, which is available without any approval or pull request
This scenario is hard to detect: a version change in a sub-process group is not
visible in higher process groups (at least when another versioned sub-process
group lies on the breadcrumb path), and the development system no longer shows
the malicious changes, because they were reverted in Version N.
No approval was necessary even though the productive system uses a separate,
protected GitHub branch; see the problem description below.
h2. Problem description:
The above scenario is possible because the GitHubFlow definition specifies not
only a commit ID but also the branch that commit is on. Thus, if the versioned
root process group is replicated from the development branch to the production
branch, any references that are stored in that versioned root process group
would - after transport - still contain the branch reference to the
development branch.
This can be seen on the productive system - after all, the branch is shown in
the version info - but it cannot be changed there. It can only be changed with
access to the branch on GitHub, not graphically in NiFi itself. Any new version
would also change the branch again if it differs in the GitHub definition.
However, since that separately versioned sub-process group has been created
with a reference to the development branch, this means that any new versions
are also picked up from the development branch, including versions that were
neither approved nor transported to the production Git branch. As a result, it
is possible to "hide" malicious flow versions in the development branch that
will be available in production without any further guardrails set up in GitHub
that a company may rely on.
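
As a rough illustration of how such leftover references could be detected, the
following Python sketch walks an exported flow definition and reports every
nested versioned process group whose stored branch differs from the branch the
instance is expected to follow. This is not NiFi code. The field names
`flowContents`, `processGroups`, `versionedFlowCoordinates` and `branch` are
assumptions about the exported JSON layout, and the sample data is made up.

```python
from typing import Any, Iterator, Optional


def find_foreign_branch_refs(
    group: dict[str, Any], expected_branch: str, path: str = ""
) -> Iterator[tuple[str, Optional[str]]]:
    """Yield (path, branch) for nested groups that follow another branch."""
    here = f"{path}/{group.get('name', '<unnamed>')}"
    # Assumed field: only separately versioned groups carry coordinates.
    coords = group.get("versionedFlowCoordinates")
    if coords is not None:
        branch = coords.get("branch")
        # A missing branch is reported too, since it cannot be verified.
        if branch != expected_branch:
            yield here, branch
    # Assumed field: child process groups are nested under "processGroups".
    for child in group.get("processGroups", []):
        yield from find_foreign_branch_refs(child, expected_branch, here)


# Hypothetical snapshot as it might look after a PR from development.
snapshot = {
    "flowContents": {
        "name": "root",
        "processGroups": [
            {
                "name": "sensitive-data",
                "versionedFlowCoordinates": {"branch": "development"},
                "processGroups": [],
            },
            {
                "name": "ingest",
                "versionedFlowCoordinates": {"branch": "production"},
            },
            {"name": "unversioned-helper"},
        ],
    }
}

findings = list(find_foreign_branch_refs(snapshot["flowContents"], "production"))
print(findings)
```

A check like this could run as a required status check on pull requests into
the production branch, so that a foreign branch reference blocks the merge.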
h2. Proposed solutions:
1) Similar to Parameter Context values, it would be good to be able to define
branch reference behavior per instance. An option to enforce the use of only
one branch (`development` and `production` on the respective NiFi instances,
for example) would mean that the malicious version M (mentioned above) would
not be available without another PR and transport.
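
The intended effect of option 1 can be sketched in a few lines of Python. This
is a toy model only: `selectable_versions`, `enforced_branch` and the version
lists are hypothetical names and data, not existing NiFi settings or APIs, and
it assumes that only Version N reached `production` through an approved PR.

```python
from typing import Optional


def selectable_versions(
    versions_by_branch: dict[str, list[str]],
    stored_branch: str,
    enforced_branch: Optional[str] = None,
) -> list[str]:
    """Return the versions an instance would offer for one versioned flow."""
    # With enforcement, the branch stored in the flow definition is ignored.
    branch = enforced_branch if enforced_branch is not None else stored_branch
    return list(versions_by_branch.get(branch, []))


# Hypothetical state of the repository in the scenario above.
versions_by_branch = {
    "development": ["M", "N"],  # M contains the malicious configuration
    "production": ["N"],        # assumed: only N was merged via an approved PR
}

# Current behavior: the transported reference still points at development.
current = selectable_versions(versions_by_branch, stored_branch="development")

# Proposed behavior: the production instance enforces its own branch.
enforced = selectable_versions(
    versions_by_branch, stored_branch="development", enforced_branch="production"
)

print(current)
print(enforced)
```

In this model Version M is selectable only while the stored branch is
honored. Once the instance enforces `production`, M cannot be chosen until it
has gone through a pull request.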
2) Alternatively, it would be great to be able to graphically change the branch
that a versioned process group is following and to protect this change from
being overwritten by the versioned flows. This would mean manual intervention
on the productive system, though.
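
Option 2 amounts to keeping a locally chosen branch when a new flow version
arrives. The Python sketch below shows one way this could behave. The function
`apply_incoming_coordinates`, the `local_branch_override` parameter and the
dictionary keys are hypothetical and only illustrate the idea.

```python
from typing import Any, Optional


def apply_incoming_coordinates(
    incoming: dict[str, Any], local_branch_override: Optional[str] = None
) -> dict[str, Any]:
    """Merge incoming version coordinates, keeping a local branch override."""
    result = dict(incoming)  # copy so the incoming definition stays untouched
    if local_branch_override is not None:
        # The manually chosen branch wins over the branch in the versioned flow.
        result["branch"] = local_branch_override
    return result


# Hypothetical coordinates arriving with a new version of the parent flow.
incoming = {"branch": "development", "version": "N"}

protected = apply_incoming_coordinates(incoming, local_branch_override="production")
unprotected = apply_incoming_coordinates(incoming)

print(protected)
print(unprotected)
```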
I realize that both proposed solutions may have issues when commit IDs are not
preserved (due to different ways of merging/transporting the logic between
branches), and that another way of identifying the referenced flow version,
based on IDs controlled by Apache NiFi itself, may be needed for stable,
transportable version references.