Mark Payne created NIFI-12139:
---------------------------------

             Summary: Allow for cleaner migration of extensions' configuration
                 Key: NIFI-12139
                 URL: https://issues.apache.org/jira/browse/NIFI-12139
             Project: Apache NiFi
          Issue Type: Task
          Components: Core Framework, Extensions
            Reporter: Mark Payne
            Assignee: Mark Payne


Over time, Processors, Controller Services, and Reporting Tasks need to evolve. 
New capabilities become available in the library that a component uses or the 
endpoint it interacts with. The developer may have overlooked some 
configuration that is important for some use cases. Or we simply have a typo, 
or best practices for naming conventions evolve.

Today, we have some ways to handle these scenarios:
 * We can add a new property descriptor with a default value.
 * We can add a displayName for Property Descriptors.

But these mechanisms are lacking. Some of the problems that we have:
 * If we add a new Property Descriptor, we generally have to set the default 
value such that we don't change the behavior of existing components. This 
sometimes means that we have to use a default value that's not really the best 
default, just to maintain backward compatibility.
 * We have to maintain both a 'name' and a 'displayName' for property 
descriptors. This makes the UI and the API confusing. If the UI shows a 
property is named 'My Property', setting the value of 'My Property' should 
work. Instead, we have to set the value of 'my-property' because that's the 
property's 'name' - even though it's not made clear in the UI.
 * If we add a new PropertyDescriptor that doesn't have a default value, all 
existing instances are made invalid upon upgrade.
 * If we want to add a new Relationship, unless it is auto-terminated, all 
existing instances are made invalid upon upgrade - and making the new 
relationship auto-terminated is rarely OK because it could result in data loss.
 * There is no way to remove a Relationship.
 * There is no way to remove a Property Descriptor. Once added, it must stay 
there, even though it is ignored.

We need to introduce a new ability to migrate old configuration to a new 
configuration. Something along the lines of:
{code:java}
public void migrateProperties(PropertyConfiguration existingConfig) {
}{code}
The default implementation would do nothing. But an extension might implement 
it like this:
{code:java}
public void migrateProperties(PropertyConfiguration config) {
    config.renameProperty("old-name", "New Name");
} {code}
Or, if a property is no longer necessary, instead of leaving it to be ignored, 
we could simply use:
{code:java}
public void migrateProperties(PropertyConfiguration config) {
    config.removeProperty("deprecated property name");
}{code}
This would mean we can actually eliminate the use of displayName and instead 
just use clean, clear names for properties. This would lead to much less 
confusion.
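
To make the intended semantics of renameProperty and removeProperty concrete, here is a minimal in-memory sketch. MapPropertyConfiguration and its asMap method are hypothetical names invented for illustration; this is not an existing NiFi API. It assumes a component's configuration can be modeled as a simple map of property name to value:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PropertyMigrationSketch {

    /** Hypothetical in-memory stand-in for the proposed PropertyConfiguration. */
    static class MapPropertyConfiguration {
        private final Map<String, String> values;

        MapPropertyConfiguration(Map<String, String> existingValues) {
            // Copy so that migration never mutates the caller's map.
            this.values = new LinkedHashMap<>(existingValues);
        }

        boolean hasProperty(String name) {
            return values.containsKey(name);
        }

        /** Moves the stored value to the new name; a no-op if the old name was never set. */
        void renameProperty(String oldName, String newName) {
            if (values.containsKey(oldName)) {
                values.put(newName, values.remove(oldName));
            }
        }

        /** Drops the stored value, if any. */
        void removeProperty(String name) {
            values.remove(name);
        }

        /** Returns a snapshot of the migrated values. */
        Map<String, String> asMap() {
            return new LinkedHashMap<>(values);
        }
    }

    public static void main(String[] args) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("old-name", "/data/in");
        existing.put("deprecated property name", "unused");

        MapPropertyConfiguration config = new MapPropertyConfiguration(existing);
        config.renameProperty("old-name", "New Name");
        config.removeProperty("deprecated property name");

        System.out.println(config.asMap()); // {New Name=/data/in}
    }
}
```

Making the rename a no-op when the old name is absent keeps the migration safe to run more than once, which seems desirable if the framework cannot guarantee it runs exactly once per component.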

It gives us MUCH more freedom to evolve processors, as well. For example, let's 
say that we have a Processor that processes a file and then deletes it from a 
directory. We now decide that it should be configurable - and by default we 
don't want to delete the file. But for existing processors, we don't want to 
change their behavior. We can handle this by introducing the new Property 
Descriptor with a default but having the migration ensure that we don't change 
existing behavior:
{code:java}
static PropertyDescriptor DELETE_FILE_ON_COMPLETION = new PropertyDescriptor.Builder()
  .name("Delete File on Completion")
  .description("Whether or not to delete the file after processing completes.")
  .defaultValue("false")
  .allowableValues("true", "false")
  .build();

...

public void migrateProperties(PropertyConfiguration config) {
    // Maintain existing behavior for processors that were created before
    // the option was added to delete files or not.
    if (!config.hasProperty(DELETE_FILE_ON_COMPLETION)) {
        config.setProperty(DELETE_FILE_ON_COMPLETION, "true");
    }
}{code}
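The effect of that migration can be traced with a small sketch. It assumes that a value is stored for a component only when it was set explicitly (or by an earlier migration), and that the effective value otherwise falls back to the descriptor's default. The class and the effectiveValue helper are hypothetical, and how the framework decides when to run a migration is not specified here:

```java
import java.util.HashMap;
import java.util.Map;

class DefaultMigrationSketch {
    static final String DELETE_FILE_ON_COMPLETION = "Delete File on Completion";
    static final String DEFAULT_VALUE = "false";

    /** The migration step, applied to a plain map of explicitly stored values. */
    static void migrateProperties(Map<String, String> stored) {
        // Only fill in "true" when no value was stored for this property.
        stored.putIfAbsent(DELETE_FILE_ON_COMPLETION, "true");
    }

    /** Effective value: the stored value if present, otherwise the descriptor default. */
    static String effectiveValue(Map<String, String> stored) {
        return stored.getOrDefault(DELETE_FILE_ON_COMPLETION, DEFAULT_VALUE);
    }

    public static void main(String[] args) {
        // A processor saved before the property existed: nothing is stored.
        Map<String, String> oldProcessor = new HashMap<>();
        System.out.println(effectiveValue(oldProcessor)); // false (behavior would change)
        migrateProperties(oldProcessor);
        System.out.println(effectiveValue(oldProcessor)); // true (old behavior kept)

        // A processor whose user explicitly chose "false" is left alone.
        Map<String, String> explicit = new HashMap<>();
        explicit.put(DELETE_FILE_ON_COMPLETION, "false");
        migrateProperties(explicit);
        System.out.println(effectiveValue(explicit)); // false
    }
}
```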
Importantly, we also want the ability to handle evolution of Relationships:
{code:java}
public void migrateRelationships(RelationshipConfiguration config) {
} {code}
If we decide that we now want to rename the "comms.failure" relationship to 
"Communications Failure" we can do so thusly:
{code:java}
public void migrateRelationships(RelationshipConfiguration config) {
    config.renameRelationship("comms.failure", "Communications Failure");
} {code}
We also sometimes have a situation where we'd like to break a relationship into 
two. For example, we have a "failure" relationship that we want to split into 
"failure" and "comms failure". But introducing a new "comms failure" 
relationship today would mean that existing processors become invalid. A 
relationship migration takes care of this:
{code:java}
public void migrateRelationships(RelationshipConfiguration config) {
    config.splitRelationship("failure", "failure", "comms failure");
} {code}
With this, any existing Connection that has the "failure" relationship will 
have both the "failure" and "comms failure" Relationships. If "failure" is 
retried, so is "comms failure", and if "failure" is auto-terminated, so is 
"comms failure".
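
The split semantics described above can be sketched as follows. RelationshipState is a hypothetical model invented for illustration, not a NiFi class. It assumes a processor's relationship settings reduce to three collections: the relationships selected on each outgoing Connection, the auto-terminated set, and the retried set:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RelationshipSplitSketch {

    /** Hypothetical model of one processor's relationship settings. */
    static class RelationshipState {
        // One entry per outgoing Connection: the relationships selected on it.
        final List<Set<String>> connections = new ArrayList<>();
        final Set<String> autoTerminated = new LinkedHashSet<>();
        final Set<String> retried = new LinkedHashSet<>();

        void addConnection(String... relationships) {
            connections.add(new LinkedHashSet<>(Arrays.asList(relationships)));
        }

        /** Wherever the original relationship appears, substitute all replacements. */
        void splitRelationship(String original, String... replacements) {
            List<String> names = Arrays.asList(replacements);
            for (Set<String> selected : connections) {
                replace(selected, original, names);
            }
            replace(autoTerminated, original, names);
            replace(retried, original, names);
        }

        private static void replace(Set<String> set, String original, List<String> names) {
            // Sets that never contained the original are left untouched.
            if (set.remove(original)) {
                set.addAll(names);
            }
        }
    }

    public static void main(String[] args) {
        RelationshipState state = new RelationshipState();
        state.addConnection("success");
        state.addConnection("failure");
        state.retried.add("failure");

        state.splitRelationship("failure", "failure", "comms failure");

        System.out.println(state.connections);    // [[success], [failure, comms failure]]
        System.out.println(state.retried);        // [failure, comms failure]
        System.out.println(state.autoTerminated); // []
    }
}
```

Under this model, renaming a relationship is just a split with a single replacement name.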

The framework should be smart enough to handle all of the necessary lifecycle 
mechanisms here. This will give us far greater flexibility in evolving 
components.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
