Thanks Pierre!

On Mon 6 Jan 2020, 17:06 Pierre Villard, <[email protected]>
wrote:

> Hi Emanuel,
>
> The PR is currently under review so that would not be included in NiFi
> 1.10.0 (which is already released). We recently discussed releasing a
> new NiFi version (1.10.1 or 1.11.0) and if the PR is merged before such a
> release, it would certainly be included in that version.
>
> Hope it makes sense,
> Pierre
>
>
> On Mon, 6 Jan 2020 at 22:08, Oliveira, Emanuel <[email protected]>
> wrote:
>
>> Thanks Matt and Mark!
>> We are still on version
>> 1.8.0
>> 10/22/2018 23:48:30 EDT
>> Tagged nifi-1.8.0-RC3
>>
>> Current version is 1.10
>>
>> Out of curiosity, when can we expect this fix to be available? Would it
>> mean we upgrade to 1.10? Thanks.
>>
>> Thanks//Regards,
>> Emanuel Oliveira
>>
>>
>>
>> -----Original Message-----
>> From: Matt Burgess <[email protected]>
>> Sent: Friday 20 December 2019 17:52
>> To: [email protected]
>> Subject: Re: NiFi ValidateRecord - unable to handle missing mandatory
>> ARRAY ?
>>
>> This email is from an external source - exercise caution regarding links
>> and attachments.
>>
>>
>> Mark is spot-on with the diagnosis: a default empty array is being
>> created for the missing field even if no default value is specified in the
>> schema. All it needs is an extra null check in order to return null as the
>> default value, then the record is marked invalid as expected.
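
A minimal sketch of the behaviour described above, written in Python rather than NiFi's Java and meant only as an illustration. The names `default_for_array`, `read_record` and `is_valid` are hypothetical and are not NiFi APIs, and the model assumes that a required field counts as missing only when its value is null.

```python
from typing import Any, Optional

def default_for_array(declared_default: Optional[list], null_check: bool) -> Optional[list]:
    """Pick the default value for an array field (hypothetical model)."""
    if declared_default is None:
        # No default in the schema: with the null check we return null,
        # without it we fall back to an empty array (the reported bug).
        return None if null_check else []
    return list(declared_default)

def read_record(payload: dict, field: str, null_check: bool) -> dict:
    """Copy the payload, filling in the default when the field is absent."""
    record = dict(payload)
    if field not in record:
        record[field] = default_for_array(None, null_check)
    return record

def is_valid(record: dict, field: str) -> bool:
    """A required field is considered present unless its value is null."""
    return record.get(field) is not None

payload = {"Service": "sssssss"}  # no "Records" array
buggy = read_record(payload, "Records", null_check=False)
fixed = read_record(payload, "Records", null_check=True)
print(is_valid(buggy, "Records"))  # True: the empty array hides the missing field
print(is_valid(fixed, "Records"))  # False: the record is marked invalid
```

In this model the only difference between the two runs is the null check, which is the change the paragraph above proposes.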
>>
>> I have written up NIFI-6963 [1] to cover this, and issued a PR to fix it
>> [2]. Mark, would you kindly do the honors of a review? Please and thanks!
>>
>> -Matt
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-6963
>> [2] https://github.com/apache/nifi/pull/3948
>>
>> On Wed, Dec 11, 2019 at 10:25 AM Mark Payne <[email protected]> wrote:
>> >
>> > Emanuel,
>> >
>> > I looked into this a week or so ago, but haven't had a chance to
>> resolve the issue yet. It does appear to be a bug. Specifically, I believe
>> the bug is here [1].  When we create a RecordSchema from the Avro Schema,
>> we set the default value for the array to an empty array, instead of null.
>> Because of this, when the JSON is parsed, we end up creating a Record with
>> an empty array for the "Records" field instead of a null. As a result, the
>> Record is considered valid because it does have an array (it's just empty).
>> I think it *should* be a null value instead.
>> >
>> > It looks like this was introduced in NIFI-4893 [2]. We can easily
>> change it to just return a null value for the default, but that does result
>> in two of the unit tests added in NIFI-4893 failing. It may be that those
>> unit tests need to be fixed, or it may be that such a change does break
>> something. I just haven't had a chance yet to dig that far into it.
>> >
>> > If you're someone who is comfortable digging into the code and making
>> the updates, then please do and I'm happy to review a PR as soon as I'm
>> able.
>> >
>> > Thanks
>> > -Mark
>> >
>> >
>> > [1]
>> > https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-exten
>> > sion-utils/nifi-record-utils/nifi-avro-record-utils/src/main/java/org/
>> > apache/nifi/avro/AvroTypeUtil.java#L629-L631
>> >
>> > [2] https://issues.apache.org/jira/browse/NIFI-4893
>> >
>> >
>> >
>> > On Dec 11, 2019, at 8:02 AM, Oliveira, Emanuel <
>> [email protected]> wrote:
>> >
>> > Can anyone knowledgeable about Avro schemas please confirm whether this
>> inability to invalidate a JSON payload missing an array in the root, when
>> Allow Extra Fields=true, is normal?
>> >
>> > There are 2 options:
>> >
>> > ValidateRecord.Allow Extra Fields=false -> need to supply the full schema
>> > ValidateRecord.Allow Extra Fields=true -> this is what I have been
>> testing/want, a way to supply a schema with only the mandatory fields.
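
The two options above can be sketched roughly as follows. This is a simplified Python model, not NiFi's implementation: the function `validate` and its parameters are hypothetical, and it only looks at top-level field names.

```python
def validate(record: dict, schema_fields: set, allow_extra_fields: bool) -> list:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    # Fields named in the schema but absent from the record.
    missing = sorted(schema_fields - set(record))
    if missing:
        problems.append(f"missing fields: {missing}")
    # Fields in the record that the schema does not mention.
    extra = sorted(set(record) - schema_fields)
    if extra and not allow_extra_fields:
        problems.append(f"extra fields: {extra}")
    return problems

payload = {"Service": "sssssss", "Records": [{"eventVersion": "aaa"}]}
schema_fields = {"Records"}  # schema lists only the mandatory field

# Extra fields tolerated: the partial schema is enough.
print(validate(payload, schema_fields, allow_extra_fields=True))
# Extra fields rejected: "Service" would have to be added to the schema.
print(validate(payload, schema_fields, allow_extra_fields=False))
```

With extra fields allowed, the schema can stay limited to the mandatory fields, which is the setup being tested here.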
>> >
>> >
>> > I want 2 mandatory things: an array with at least 1 element, and
>> eventVersion on each element, so the minimal JSON should be:
>> > { (..)
>> >    "Records": [{
>> >          "eventVersion": "aaa"
>> >          (..)
>> >       }
>> >    ]
>> >    (..)
>> > }
>> >
>> > The problem is that ValidateRecord considers the FlowFile valid if the
>> “Records” array is missing from the root:
>> > {
>> >    "Service": "sssssss",
>> >    "Event": "eeeee",
>> >    "Time": "2019-11-25T16:21:53.280Z",
>> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
>> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
>> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
>> > }
>> >
>> > If I supply the “Records” array, then validation correctly requires at
>> least eventVersion on each array element record.
>> >
>> >
>> > So… maybe my question can be rephrased as: is it possible in Avro schema
>> syntax to specify cardinalities, as in a DB E/R diagram, where a relation
>> can be one of the following:
>> > 0..n
>> > 1..n
>> > 1 and only 1 ?
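
As far as I know, Avro schema syntax lets an array declare only its item type (and a field can be made optional through a union with null); it has no minimum or maximum element count. A cardinality rule like the ones listed above would therefore need a separate check. A minimal Python sketch, where `check_cardinality` and its parameters are hypothetical names chosen for illustration:

```python
from typing import Optional

def check_cardinality(record: dict, field: str,
                      min_items: int = 1,
                      max_items: Optional[int] = None) -> bool:
    """True when `field` is a list whose length is within [min_items, max_items]."""
    value = record.get(field)
    if not isinstance(value, list):
        return False  # missing, null, or not an array
    if len(value) < min_items:
        return False
    if max_items is not None and len(value) > max_items:
        return False
    return True

good = {"Records": [{"eventVersion": "aaa"}]}
empty = {"Records": []}
absent = {"Service": "sssssss"}

print(check_cardinality(good, "Records"))                 # 1..n
print(check_cardinality(empty, "Records", min_items=0))   # 0..n
print(check_cardinality(good, "Records", max_items=1))    # 1 and only 1
print(check_cardinality(absent, "Records"))               # missing array
```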
>> >
>> >
>> > Thanks//Regards,
>> > Emanuel Oliveira
>> > Senior Oracle/Data Engineer | CTG | Galway TEL ext: 353 – (0)91-74
>> > 4971 | int: 8-737 4971 |  who's who
>> >
>> > From: Oliveira, Emanuel <[email protected]>
>> > Sent: Friday 6 December 2019 10:15
>> > To: [email protected]
>> > Subject: RE: NiFi ValidateRecord - unable to handle missing mandatory
>> ARRAY ?
>> >
>> > Hi Mark, I forgot to share the NiFi version we are using:
>> > 1.8.0
>> > 10/22/2018 23:48:30 EDT
>> > Tagged nifi-1.8.0-RC3
>> >
>> >
>> > Thanks//Regards,
>> > Emanuel Oliveira
>> > Senior Oracle/Data Engineer | CTG | Galway TEL ext: 353 – (0)91-74
>> > 4971 | int: 8-737 4971 |  who's who
>> >
>> > From: Emanuel Oliveira <[email protected]>
>> > Sent: Thursday 5 December 2019 22:42
>> > To: [email protected]
>> > Subject: Re: NiFi ValidateRecord - unable to handle missing mandatory
>> ARRAY ?
>> >
>> >
>> > Hi Mark, be sure to copy and paste "NOK - payload BAD 1 - " into
>> GenerateFlowFile, as this is the one that shows the problem.
>> >
>> > Cheers,
>> > Emanuel
>> >
>> > On Thu 5 Dec 2019, 22:03 Mark Payne, <[email protected]> wrote:
>> >
>> > Emanuel,
>> >
>> > What version of NiFi are you using?
>> >
>> > I just tested the attached template against the latest, and the
>> FlowFile was routed to 'invalid' with the explanation:
>> >
>> > Records in this FlowFile were invalid for the following reasons: The
>> > following 1 fields were missing: [[0]/Records/eventVersion]
>> >
>> >
>> >
>> >
>> > Thanks
>> > -Mark
>> >
>> >
>> >
>> >
>> > On Dec 5, 2019, at 7:06 AM, Oliveira, Emanuel <[email protected]>
>> wrote:
>> >
>> > Hi all,
>> >
>> > I have been struggling to find a way for ValidateRecord, using an Avro
>> schema, to make the presence of an array in the JSON payload mandatory. The
>> problem is that if the array “Records” is missing, ValidateRecord considers
>> the FlowFile valid ☹.
>> > --objective - Mandatory to have the "Records" array with at least
>> "eventVersion"
>> > - using ValidateRecord > Allow Extra Fields
>> > - the problem I'm facing is that NiFi doesn't flag payload BAD 1 as invalid!!
>> >
>> > How can I make the Records array mandatory? Is it possible?
>> >
>> > I know I can eventually use SplitJson with JsonPath Expression=$.Records
>> to get rid of the ARRAY, and also to fail if the array "Records" is not
>> present. But I would like to have a clean solution using just the Avro
>> schema. Is this possible?
>> >
>> >
>> >
>> > --OK - payload GOOD
>> > {
>> >    "Service": "sssssss",
>> >    "Event": "eeeee",
>> >    "Time": "2019-11-25T16:21:53.280Z",
>> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
>> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
>> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
>> >    "Records": [{
>> >          "eventVersion": "aaa"
>> >       }
>> >    ]
>> > }
>> >
>> > --NOK - payload BAD 1 - missing "Records" array -> BUT
>> VALIDATERECORD/AVROSCHEMA SENDS FF TO “valid”!! I want it to be sent to
>> “invalid” since it is not compliant with my Avro schema, which requires the
>> array “Records” with element “eventVersion” as 2 mandatory things.
>> > {
>> >    "Service": "sssssss",
>> >    "Event": "eeeee",
>> >    "Time": "2019-11-25T16:21:53.280Z",
>> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
>> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
>> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
>> >    "RecordsXXX": [{
>> >          "eventVersion": "aaa"
>> >       }
>> >    ]
>> > }
>> >
>> > --OK - payload BAD 2 - "Records" array present but missing
>> "eventVersion"
>> > {
>> >    "Service": "sssssss",
>> >    "Event": "eeeee",
>> >    "Time": "2019-11-25T16:21:53.280Z",
>> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
>> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
>> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
>> >    "Records": [{
>> >          "eventVersionXX": "aaa"
>> >       }
>> >    ]
>> > }
>> >
>> > It's a very simple test flow (I attached the XML template
>> ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml) just using
>> ValidateRecord with JsonTreeReader/JsonRecordSetWriter:
>> > <image001.png>
>> >
>> >
>> > Here's the ValidateRecord processor + reader/writer controllers:
>> >
>> > Avro schema with just the array “Records” and “eventVersion” as the
>> minimum tag on each array element.
>> > Using Allow Extra Fields true:
>> >
>> > So I'm OK having other fields in the root side by side with the array
>> “Records”, and also OK having extra fields inside each array element.
>> > FYI: in the real use case I'm trying to validate an AWS SQS message (S3
>> trigger) where I will be interested in several fields, but I crafted this
>> simpler example just to ask if it's possible to force an array to be
>> mandatory and to have at least 1 element?
>> >
>> > ==========================================================
>> >
>> > --ValidateRecord 1.8.0
>> > Record Reader                           JsonTreeReader
>> > Record Writer                           JsonRecordSetWriter
>> > Record Writer for Invalid Records
>> > Schema Access Strategy                  Use Reader's Schema
>> > Schema Registry                         No value set
>> > Schema Name                             ${schema.name}
>> > Schema Text                             ${avro.schema}
>> > Allow Extra Fields                      true
>> > Strict Type Checking                    true
>> >
>> > --JsonTreeReader 1.8.0 - MANDATORY TO HAVE "Records" ARRAY +
>> "eventVersion" on each ARRAY element
>> > Schema Access Strategy                  Use 'Schema Text' Property
>> > Schema Registry
>> > Schema Name                             ${schema.name}
>> > Schema Version
>> > Schema Branch
>> > Schema Text
>> >                                         {
>> >                                            "name": "MyName",
>> >                                            "type": "record",
>> >                                            "namespace": "aa.bb.cc",
>> >                                            "fields": [{
>> >                                                  "name": "Records",
>> >                                                  "type": {
>> >                                                     "type": "array",
>> >                                                     "items": {
>> >                                                        "name": "Records_record",
>> >                                                        "type": "record",
>> >                                                        "fields": [{
>> >                                                              "name": "eventVersion",
>> >                                                              "type": "string"
>> >                                                           }
>> >                                                        ]
>> >                                                     }
>> >                                                  }
>> >                                               }
>> >                                            ]
>> >                                         }
>> > Date Format
>> > Time Format
>> > Timestamp Format
>> >
>> > --JsonRecordSetWriter 1.8.0
>> > Schema Write Strategy                   Do Not Write Schema
>> > Schema Access Strategy                  Inherit Record Schema
>> > Schema Registry
>> > Schema Name                             ${schema.name}
>> > Schema Version
>> > Schema Branch
>> > Schema Text                             { "name": "eventVersion", "type": "string" }
>> > Date Format
>> > Time Format
>> > Timestamp Format
>> > Pretty Print JSON                       true
>> > Suppress Null Values                    Never Suppress
>> > Output Grouping                         Array
>> >
>> > Thanks in advance,
>> > Emanuel Oliveira
>> >
>> > <ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml>
>> >
>> >
>>
>
