[ 
https://issues.apache.org/jira/browse/NIFIREG-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807788#comment-16807788
 ] 

Bryan Bende commented on NIFIREG-242:
-------------------------------------

[~DennisSeiffert] this sounds like an interesting feature, although I have a 
few concerns....

The general setup of the registry is to be able to store versioned items. 
Currently the only versioned items are flows, but in the master branch we have 
added extension bundles like NARs, and eventually we also want to add assets 
like datasets and config files. A bucket can contain any number of versioned 
items and types, so a given bucket could have flows, extensions, and assets 
all together.

The back-end is set up so that the metadata database holds the knowledge of 
all the buckets and which items belong to which buckets, and each type of item 
has its own persistence provider. For example, flows may be stored in Git and 
extension bundles may be stored in an object store like S3, and really anyone 
can implement their own persistence provider for any of these because it's a 
pluggable extension point.
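
To make that split concrete, here is a minimal sketch of the idea, not the actual registry code. All names in it (ItemType, ContentProvider, InMemoryProvider, RegistrySketch) are hypothetical and exist only for illustration. The metadata map records which items live in which bucket, while the content itself goes to whichever provider is registered for the item's type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical item types, named after the kinds of items mentioned above.
enum ItemType { FLOW, EXTENSION_BUNDLE, ASSET }

// Hypothetical stand-in for a pluggable persistence provider of one item type.
interface ContentProvider {
    void save(String bucketId, String itemId, byte[] content);
    byte[] load(String bucketId, String itemId);
}

// In-memory provider, used only to keep the sketch runnable.
final class InMemoryProvider implements ContentProvider {
    private final Map<String, byte[]> store = new HashMap<>();

    @Override
    public void save(String bucketId, String itemId, byte[] content) {
        store.put(bucketId + "/" + itemId, content);
    }

    @Override
    public byte[] load(String bucketId, String itemId) {
        return store.get(bucketId + "/" + itemId);
    }
}

// The metadata index knows every bucket and item; providers only hold content.
final class RegistrySketch {
    private final Map<String, Map<String, ItemType>> metadata = new HashMap<>();
    private final Map<ItemType, ContentProvider> providers = new EnumMap<>(ItemType.class);

    void registerProvider(ItemType type, ContentProvider provider) {
        providers.put(type, provider);
    }

    void createBucket(String bucketId) {
        metadata.putIfAbsent(bucketId, new HashMap<>());
    }

    void saveItem(String bucketId, String itemId, ItemType type, byte[] content) {
        Map<String, ItemType> items = metadata.get(bucketId);
        if (items == null) {
            throw new IllegalArgumentException("Unknown bucket: " + bucketId);
        }
        ContentProvider provider = providers.get(type);
        if (provider == null) {
            throw new IllegalStateException("No provider registered for " + type);
        }
        provider.save(bucketId, itemId, content); // content goes to the type's provider
        items.put(itemId, type);                  // membership is recorded in metadata
    }

    // Returns the item ids of a bucket in sorted order, across all item types.
    List<String> listItems(String bucketId) {
        Map<String, ItemType> items = metadata.get(bucketId);
        if (items == null) {
            return Collections.emptyList();
        }
        List<String> ids = new ArrayList<>(items.keySet());
        Collections.sort(ids);
        return ids;
    }
}
```

In this sketch only the metadata map can answer "what is in this bucket?". The provider registered for flows never sees the extension bundle stored in the same bucket.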

Given the above, we can't do things like this:
{code:java}
if (this.flowPersistenceProvider instanceof GitFlowPersistenceProvider) {
    deleteAllBucketsInMetaDatabase();
    return createBucketsFromGitProvider();
}{code}
There could be buckets that only have extension bundles stored in them, which 
git wouldn't know about and wouldn't be able to recreate, and there could be 
buckets with both flows and extension bundles in them, in which case we'd lose 
the knowledge of the extension bundles.
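
As a rough illustration of what would be lost, the sketch below takes a snapshot of the metadata (bucket name -> item id -> type label) and reports the items that a rebuild from the flow provider alone could not bring back. The class FlowOnlyRebuildImpact, the method lostItems, and the type labels are all made up for this example and are not part of the registry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FlowOnlyRebuildImpact {
    // Hypothetical type label for flows in the metadata snapshot.
    static final String FLOW = "FLOW";

    /**
     * Given a metadata snapshot (bucket name -> item id -> type label), returns
     * the items per bucket that a rebuild from the flow provider alone could
     * not recreate. Buckets containing only flows are omitted from the result.
     */
    static Map<String, List<String>> lostItems(Map<String, Map<String, String>> metadata) {
        Map<String, List<String>> lost = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> bucket : metadata.entrySet()) {
            List<String> missing = new ArrayList<>();
            for (Map.Entry<String, String> item : bucket.getValue().entrySet()) {
                if (!FLOW.equals(item.getValue())) {
                    missing.add(item.getKey()); // the flow provider never stored this item
                }
            }
            if (!missing.isEmpty()) {
                Collections.sort(missing);
                lost.put(bucket.getKey(), missing);
            }
        }
        return lost;
    }
}
```

A bucket that appears in the result with every one of its items listed would disappear entirely after such a rebuild. A bucket listed with only some of its items would come back incomplete.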

The reason we could implement NIFIREG-209 (rebuild metadata DB from git repo) 
is because that logic is only triggered when starting a fresh instance with an 
empty DB, so at that point it is safe to populate the DB from what is in the 
git repo. Once the app is running and there are potentially different types of 
versioned items in different buckets, the DB has to be the source of truth for 
what buckets and items exist.
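
The guard described here can be sketched roughly as follows. This is not the NIFIREG-209 implementation. FreshStartImport and importIfEmpty are hypothetical names, and buckets are reduced to plain names to keep the example short.

```java
import java.util.Collection;
import java.util.Set;

final class FreshStartImport {
    /**
     * Copies bucket names found in git into the metadata set, but only when
     * the metadata set is empty. Returns true if the import ran.
     */
    static boolean importIfEmpty(Set<String> metadataBuckets, Collection<String> gitBuckets) {
        if (!metadataBuckets.isEmpty()) {
            return false; // existing metadata stays the source of truth
        }
        metadataBuckets.addAll(gitBuckets);
        return true;
    }
}
```

The point of the early return is that nothing is ever deleted. Once any bucket exists in the metadata, whatever is in git is ignored by this step.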

Generally I think we want to avoid people bypassing the application and doing 
things directly in git, because then the application becomes tightly coupled 
to git being present. For example, the description mentioned repairing a 
broken flow due to a changed registry URL. This should be something we support 
fixing through the application, and I believe there is already a PR open for 
that (NIFIREG-238). If we need a branching concept (not sure we do) then we 
should consider building that into the application so that it could work 
across any FlowPersistenceProvider and not just git.

 

> Two-way synchronization of git repository backed flows
> ------------------------------------------------------
>
>                 Key: NIFIREG-242
>                 URL: https://issues.apache.org/jira/browse/NIFIREG-242
>             Project: NiFi Registry
>          Issue Type: New Feature
>    Affects Versions: 0.4.0
>            Reporter: Dennis Seiffert
>            Priority: Major
>              Labels: git
>
> With this feature the NiFi user and developer's life using git as version 
> control as a backend for the registry would be easier (especially in 
> dockerized environments). As a conclusion the git repository would be the 
> single source of truth in order to maintain NiFi flows. This feature contains 
> the following abilities without affecting existing functionality:
>  * synchronize remote git repository with local (nifi-registry) git 
> repository in order to support multiple registries (imagine changing a flow 
> in a test environment and update the flow in a productive environment via 
> feature branches in git, etc.) and third party systems (git changes not done 
> by the registry, repair broken flow file because of changed registry url in 
> flow xml)
>  * initial import of git repository into registry's metadata database on 
> startup (see open issue #NIFIREG-227)
>  * ability to reset local git repository (including metadata database) to the 
> state of the remote repository 
>  * get recent status of synchronization process
>  * control synchronization via REST endpoints (reset repository to initial 
> state, pull latest changes from git remote repository)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
