Mark Payne created NIFI-12139:
---------------------------------
Summary: Allow for cleaner migration of extensions' configuration
Key: NIFI-12139
URL: https://issues.apache.org/jira/browse/NIFI-12139
Project: Apache NiFi
Issue Type: Task
Components: Core Framework, Extensions
Reporter: Mark Payne
Assignee: Mark Payne
Over time, Processors, Controller Services, and Reporting Tasks need to evolve.
New capabilities become available in the libraries they use or the endpoints
they interact with. A developer may have overlooked some configuration that is
important for certain use cases. Or a property name simply has a typo, or best
practices for naming conventions evolve.
Today, we have some ways to handle these scenarios:
* We can add a new property descriptor with a default value.
* We can add a displayName for Property Descriptors.
But these mechanisms are lacking. Some of the problems that we have:
* If we add a new Property Descriptor, we generally have to set the default
value such that we don't change the behavior of existing components. This
sometimes means that we have to use a default value that's not really the best
default, just to maintain backward compatibility.
* We have to maintain both a 'name' and a 'displayName' for property
descriptors. This makes the UI and the API confusing. If the UI shows a
property is named 'My Property', setting the value of 'My Property' should
work. Instead, we have to set the value of 'my-property' because that's the
property's 'name', even though the UI does not make that clear.
* If we add a new PropertyDescriptor that doesn't have a default value, all
existing instances are made invalid upon upgrade.
* If we want to add a new Relationship, unless it is auto-terminated, all
existing instances are made invalid upon upgrade - and making the new
relationship auto-terminated is rarely OK because it could result in data loss.
* There is no way to remove a Relationship.
* There is no way to remove a Property Descriptor. Once added, it must stay
there, even if it is ignored.
We need to introduce a new ability to migrate old configuration to a new
configuration. Something along the lines of:
{code:java}
public void migrateProperties(PropertyConfiguration existingConfig) {
}{code}
The default implementation would do nothing. But a component might implement
this as follows:
{code:java}
public void migrateProperties(PropertyConfiguration config) {
    config.renameProperty("old-name", "New Name");
} {code}
Or, if a property is no longer necessary, instead of leaving it to be ignored,
we could simply use:
{code:java}
public void migrateProperties(PropertyConfiguration config) {
    config.removeProperty("deprecated property name");
}{code}
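To make the intended semantics of the two calls above concrete, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: SketchPropertyConfiguration is a hypothetical Map-backed stand-in for the proposed PropertyConfiguration, and the boolean return values and the asMap() helper are not part of the proposal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed PropertyConfiguration; not an existing NiFi API.
class SketchPropertyConfiguration {
    private final Map<String, String> values = new LinkedHashMap<>();

    SketchPropertyConfiguration(Map<String, String> initial) {
        values.putAll(initial);
    }

    // Moves the value stored under oldName to newName.
    // Returns false (and changes nothing) if oldName is not set.
    // Assumption: an existing value under newName is overwritten.
    boolean renameProperty(String oldName, String newName) {
        if (!values.containsKey(oldName)) {
            return false;
        }
        values.put(newName, values.remove(oldName));
        return true;
    }

    // Drops the property. Returns false if it was not set.
    boolean removeProperty(String name) {
        if (!values.containsKey(name)) {
            return false;
        }
        values.remove(name);
        return true;
    }

    // Defensive copy so callers cannot modify the configuration directly.
    Map<String, String> asMap() {
        return new LinkedHashMap<>(values);
    }
}

class PropertyMigrationDemo {
    // Combines the two migrateProperties examples from the description.
    static void migrateProperties(SketchPropertyConfiguration config) {
        config.renameProperty("old-name", "New Name");
        config.removeProperty("deprecated property name");
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("old-name", "some value");
        stored.put("deprecated property name", "ignored");

        SketchPropertyConfiguration config = new SketchPropertyConfiguration(stored);
        migrateProperties(config);
        System.out.println(config.asMap());
    }
}
```

Because both operations do nothing when the old name is absent, running the migration a second time against an already-migrated configuration leaves it unchanged in this sketch.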
This would mean we can actually eliminate the use of displayName and instead
just use clean, clear names for properties. This would lead to much less
confusion.
It gives us MUCH more freedom to evolve processors, as well. For example, let's
say that we have a Processor that processes a file and then deletes it from a
directory. We now decide that the deletion should be configurable and that, by
default, the file should not be deleted. But for existing processors, we don't want to
change their behavior. We can handle this by introducing the new Property
Descriptor with a default but having the migration ensure that we don't change
existing behavior:
{code:java}
static final PropertyDescriptor DELETE_FILE_ON_COMPLETION = new PropertyDescriptor.Builder()
    .name("Delete File on Completion")
    .description("Whether or not to delete the file after processing completes.")
    .defaultValue("false")
    .allowableValues("true", "false")
    .build();
...
public void migrateProperties(PropertyConfiguration config) {
    // Maintain existing behavior for processors that were created before
    // the option was added to delete files or not.
    if (!config.hasProperty(DELETE_FILE_ON_COMPLETION)) {
        config.setProperty(DELETE_FILE_ON_COMPLETION, "true");
    }
}{code}
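The effect of this migration on stored configuration can be sketched in isolation. The Config class below is a hypothetical, Map-backed stand-in for the proposed PropertyConfiguration, and it keys properties by name rather than by PropertyDescriptor to stay self-contained; getProperty is an assumed helper that the proposal does not mention.

```java
import java.util.HashMap;
import java.util.Map;

class DeleteFileMigrationSketch {
    static final String DELETE_FILE_ON_COMPLETION = "Delete File on Completion";

    // Hypothetical minimal configuration holder; not an existing NiFi API.
    static final class Config {
        private final Map<String, String> values = new HashMap<>();

        boolean hasProperty(String name) {
            return values.containsKey(name);
        }

        void setProperty(String name, String value) {
            values.put(name, value);
        }

        // Assumed helper, used here only to inspect the result.
        String getProperty(String name) {
            return values.get(name);
        }
    }

    // Same logic as the migrateProperties example above: a configuration that
    // predates the property keeps the old delete-after-processing behavior.
    static void migrateProperties(Config config) {
        if (!config.hasProperty(DELETE_FILE_ON_COMPLETION)) {
            config.setProperty(DELETE_FILE_ON_COMPLETION, "true");
        }
    }

    public static void main(String[] args) {
        Config savedBeforeUpgrade = new Config(); // property was never stored
        migrateProperties(savedBeforeUpgrade);
        System.out.println(savedBeforeUpgrade.getProperty(DELETE_FILE_ON_COMPLETION));

        Config explicitlyConfigured = new Config();
        explicitlyConfigured.setProperty(DELETE_FILE_ON_COMPLETION, "false");
        migrateProperties(explicitlyConfigured); // explicit value is left alone
        System.out.println(explicitlyConfigured.getProperty(DELETE_FILE_ON_COMPLETION));
    }
}
```

The point of the design is that the descriptor's default ("false") can be the best default for new instances, while a stored configuration that lacks the property gets an explicit "true".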
Importantly, we also want the ability to handle evolution of Relationships:
{code:java}
public void migrateRelationships(RelationshipConfiguration config) {
} {code}
If we decide that we now want to rename the "comms.failure" relationship to
"Communications Failure" we can do so thusly:
{code:java}
public void migrateRelationships(RelationshipConfiguration config) {
    config.renameRelationship("comms.failure", "Communications Failure");
} {code}
We also sometimes have a situation where we'd like to break a relationship into
two. For example, we have a "failure" relationship that we want to split into
"failure" and "comms failure". But introducing a new "comms failure"
relationship today would mean that existing processors become invalid. A
migration method lets us handle this case:
{code:java}
public void migrateRelationships(RelationshipConfiguration config) {
    config.splitRelationship("failure", "failure", "comms failure");
} {code}
Now, any existing Connection that has the "failure" relationship will
have both the "failure" and "comms failure" Relationships. If "failure" is
retried, so is "comms failure", and if "failure" is auto-terminated, so
is "comms failure".
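The rename and split semantics described above could be modeled roughly as follows. This is only a sketch under stated assumptions: SketchRelationshipConfiguration is a hypothetical stand-in for the proposed RelationshipConfiguration, and representing Connections, auto-termination, and retry as plain sets of relationship names is a simplification, not how the framework stores them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the proposed RelationshipConfiguration; not an existing NiFi API.
class SketchRelationshipConfiguration {
    // Each inner set holds the relationship names selected on one Connection.
    final List<Set<String>> connections = new ArrayList<>();
    final Set<String> autoTerminated = new LinkedHashSet<>();
    final Set<String> retried = new LinkedHashSet<>();

    // A rename is treated here as a split into exactly one new name.
    void renameRelationship(String oldName, String newName) {
        splitRelationship(oldName, newName);
    }

    // Wherever the original relationship is referenced, reference all of the
    // replacements instead, so each one inherits the original's handling.
    void splitRelationship(String original, String... replacements) {
        for (Set<String> selected : connections) {
            replace(selected, original, replacements);
        }
        replace(autoTerminated, original, replacements);
        replace(retried, original, replacements);
    }

    private static void replace(Set<String> names, String original, String[] replacements) {
        // Only touch sets that actually referenced the original relationship.
        if (names.remove(original)) {
            Collections.addAll(names, replacements);
        }
    }

    public static void main(String[] args) {
        SketchRelationshipConfiguration config = new SketchRelationshipConfiguration();
        Set<String> connection = new LinkedHashSet<>();
        connection.add("failure");
        config.connections.add(connection);
        config.retried.add("failure");

        config.splitRelationship("failure", "failure", "comms failure");
        System.out.println(config.connections + " retried=" + config.retried);
    }
}
```

In this model, a Connection, auto-terminate setting, or retry setting that did not reference the original relationship is left untouched, which matches the "if failure is retried, so is comms failure" behavior described above.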
The framework should be smart enough to handle all of the necessary lifecycle
mechanisms here. This will give us far greater flexibility in evolving
components.