Peter Gyori created NIFI-9206:
---------------------------------

             Summary: Create a processor that is capable of removing fields 
from records
                 Key: NIFI-9206
                 URL: https://issues.apache.org/jira/browse/NIFI-9206
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Extensions
            Reporter: Peter Gyori
            Assignee: Peter Gyori


A processor should be created that is capable of removing fields from records 
(RemoveRecordField might be a name for it).

The processor should have 3 properties:
 * Record Reader (a Reader controller service could be specified)
 * Record Writer (a Writer controller service could be specified)
 * Field To Remove (expects a RecordPath that points to the field to be removed)

The processor should be able to accept additional dynamic properties that 
specify further fields (by RecordPath) to be removed from the record.

+*Example*+

+input:+
{code:java}
{
    "id": 1,
    "name": "John",
    "address": {
        "zip": 1111,
        "street": "Main",
        "building": 11
    }
}
{code}
+Field to remove:+ /address/building

+output:+
{code:java}
{
    "id": 1,
    "name": "John",
    "address": {
        "zip": 1111,
        "street": "Main"
    }
}
{code}
The record's schema should be modified accordingly (removing the 
/address/building field from the schema). Field removal should be permitted 
regardless of the field being nullable or not.
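
The example above can be sketched in a few lines of Python. This only illustrates the intended semantics and is not NiFi code: records and schemas are modelled as plain nested dicts, and remove_field is a hypothetical helper that understands only simple slash-separated paths such as /address/building.

```python
import copy


def remove_field(node, path):
    """Return a copy of a nested dict with the field at `path` removed.

    Hypothetical helper: supports only simple paths like "/address/building".
    A missing field leaves the copy unchanged.
    """
    result = copy.deepcopy(node)
    *parents, leaf = path.strip("/").split("/")
    current = result
    for name in parents:
        current = current.get(name)
        if not isinstance(current, dict):
            return result  # parent is absent or not a record: nothing to remove
    current.pop(leaf, None)
    return result


record = {"id": 1, "name": "John",
          "address": {"zip": 1111, "street": "Main", "building": 11}}
# Assumed schema model: field name -> type name, nested dict for a nested record.
schema = {"id": "int", "name": "string",
          "address": {"zip": "int", "street": "string", "building": "int"}}

# The same path is applied to the data and to the schema.
new_record = remove_field(record, "/address/building")
new_schema = remove_field(schema, "/address/building")
```

Applying the same path to both structures keeps the data and the schema in step, which is the behaviour requested for removals that are not data-dependent.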

Generally, removing a field should remove it from both the schema AND the 
data. The exception is a data-dependent removal (e.g. the field should be 
removed only if its value equals "xyz"). In this case no schema modification 
should occur.

The processor should be able to remove one or more elements from arrays (e.g. 
/addresses[ 1 ] should remove the element at index 1 of the addresses array). 
When removing a field from elements of an array, the array's schema should be 
modified only if the removal applies to all elements of the array (i.e. 
/addresses[ * ]/building should modify the schema of the array, but 
/addresses[ 1 ]/building should not).

The same rule should apply when handling the Map data type.
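
A minimal sketch of the array rule, again in plain Python rather than NiFi code. Both function names are hypothetical, array elements are modelled as dicts, and a numeric index is interpreted the way Python does (zero-based). The schema check simply treats any bracketed selector other than the wildcard as "not all elements", which would also cover a selector that picks a single map key.

```python
import re


def touches_all_elements(path):
    """True if every bracketed selector in `path` is the wildcard [*]."""
    selectors = re.findall(r"\[\s*(.*?)\s*\]", path)
    return all(sel == "*" for sel in selectors)


def remove_from_array_elements(elements, selector, field):
    """Remove `field` from the selected elements of a list of dicts.

    `selector` is "*" for every element or an integer index for one element.
    """
    result = [dict(e) for e in elements]
    targets = range(len(result)) if selector == "*" else [selector]
    for i in targets:
        if 0 <= i < len(result):  # an out-of-range index removes nothing
            result[i].pop(field, None)
    return result


addresses = [{"street": "Main", "building": 11},
             {"street": "Oak", "building": 7}]

only_one = remove_from_array_elements(addresses, 1, "building")
every = remove_from_array_elements(addresses, "*", "building")

# The schema of the array elements changes only for the wildcard path.
schema_changes_for_wildcard = touches_all_elements("/addresses[ * ]/building")
schema_changes_for_index = touches_all_elements("/addresses[ 1 ]/building")
```

With the single-index path, the elements that were not selected still carry the building field, so the element schema has to keep it.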

If the record does not contain the field that is to be removed, the record 
should still be transferred to the 'success' relationship, unmodified. The 
only expectation is that when /x/y is configured for removal, the record 
leaving the processor does not contain an /x/y field.

If a field can be of several types (e.g. the address field can be a "string", 
a "record", or another "record" whose schema differs from the first) and 
/address/building is to be removed, the processor is expected to remove the 
building field from the schema of every possible type of the address field, 
regardless of which concrete type the address field has in the record being 
processed.
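
The rule for fields with several possible types can be sketched as follows. The schema model is an assumption made for illustration only: a str stands for a primitive type, a dict for a record (field name to type), and a list for a choice between alternative types. remove_from_schema is a hypothetical helper, not part of any NiFi API.

```python
def remove_from_schema(field_type, names):
    """Remove the field addressed by `names` from every record alternative.

    Assumed schema model: a str is a primitive type, a dict is a record
    (field name -> type), and a list is a choice between alternative types.
    """
    if isinstance(field_type, list):      # choice: recurse into each alternative
        return [remove_from_schema(alt, names) for alt in field_type]
    if not isinstance(field_type, dict):  # primitive: has no child fields
        return field_type
    head, rest = names[0], names[1:]
    if not rest:
        return {k: v for k, v in field_type.items() if k != head}
    return {k: remove_from_schema(v, rest) if k == head else v
            for k, v in field_type.items()}


schema = {
    "id": "int",
    "address": [
        "string",
        {"street": "string", "building": "int"},
        {"zip": "int", "building": "string", "floor": "int"},
    ],
}

# Equivalent of removing /address/building from the schema.
pruned = remove_from_schema(schema, ["address", "building"])
```

Every record alternative of address loses its building field, while the "string" alternative is left as it is, independent of which alternative a given record actually uses.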



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
