Hi Emanuel,

The PR is currently under review, so it will not be included in NiFi
1.10.0 (which is already released). We recently discussed releasing a
new NiFi version (1.10.1 or 1.11.0); if the PR is merged before that
release, it will certainly be included in that version.

Hope it makes sense,
Pierre


On Mon, Jan 6, 2020 at 22:08, Oliveira, Emanuel <[email protected]>
wrote:

> Thanks Matt and Mark!
> We are still on version
> 1.8.0
> 10/22/2018 23:48:30 EDT
> Tagged nifi-1.8.0-RC3
>
> Current version is 1.10
>
> Out of curiosity, when could we expect this fix to be available? Would it
> mean we upgrade to 1.10? Thanks.
>
> Thanks//Regards,
> Emanuel Oliveira
>
>
>
> -----Original Message-----
> From: Matt Burgess <[email protected]>
> Sent: Friday 20 December 2019 17:52
> To: [email protected]
> Subject: Re: NiFi ValidateRecord - unable to handle missing mandatory
> ARRAY ?
>
> This email is from an external source - exercise caution regarding links
> and attachments.
>
>
> Mark is spot-on with the diagnosis, a default empty array is being created
> for the missing field even if no default value is specified in the schema.
> All it needs is an extra null check in order to return null as the default
> value, then the record is marked invalid as expected.
>
> I have written up NIFI-6963 [1] to cover this, and issued a PR to fix it
> [2]. Mark, would you kindly do the honors of a review? Please and thanks!
>
> -Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-6963
> [2] https://github.com/apache/nifi/pull/3948
>
> On Wed, Dec 11, 2019 at 10:25 AM Mark Payne <[email protected]> wrote:
> >
> > Emanuel,
> >
> > I looked into this a week or so ago, but haven't had a chance to resolve
> the issue yet. It does appear to be a bug. Specifically, I believe the bug
> is here [1].  When we create a RecordSchema from the Avro Schema, we set
> the default value for the array to an empty array, instead of null. Because
> of this, when the JSON is parsed, we end up creating a Record with an empty
> array for the "Records" field instead of a null. As a result, the Record is
> considered valid because it does have an array (it's just empty). I think
> it *should* be a null value instead.
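
The behaviour described above can be illustrated with a small toy model. This is only a sketch in Python, not NiFi's actual code: the helper names (`default_for_array`, `read_record`, `is_valid`) and the `use_null_check` flag are hypothetical and exist only to contrast "default to an empty array" with "default to null".

```python
# Toy model of the behaviour discussed in this thread; NOT NiFi's real code.
# All names below are hypothetical and used for illustration only.
MISSING = object()  # sentinel: the schema declares no default value


def default_for_array(schema_default, use_null_check):
    """Return the value used when an array field is absent from the input."""
    if schema_default is not MISSING:
        return schema_default
    # Reported behaviour: fall back to an empty array.
    # Proposed behaviour: fall back to None (null) when no default is declared.
    return None if use_null_check else []


def read_record(payload, array_fields, use_null_check):
    """Copy the payload and fill in defaults for absent array fields."""
    record = dict(payload)
    for name in array_fields:
        if name not in record:
            record[name] = default_for_array(MISSING, use_null_check)
    return record


def is_valid(record, required_fields):
    """A required field is satisfied by any non-null value, even an empty list."""
    return all(record.get(name) is not None for name in required_fields)


payload = {"Service": "sssssss", "Event": "eeeee"}  # no "Records" array

buggy = read_record(payload, ["Records"], use_null_check=False)
fixed = read_record(payload, ["Records"], use_null_check=True)

print(is_valid(buggy, ["Records"]))  # True: the empty array hides the missing field
print(is_valid(fixed, ["Records"]))  # False: the null default exposes it
```

With the empty-array default the missing field looks present, so validation passes; with a null default the same payload is rejected.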
> >
> > It looks like this was introduced in NIFI-4893 [2]. We can easily change
> it to just return a null value for the default, but that does result in two
> of the unit tests added in NIFI-4893 failing. It may be that those unit
> tests need to be fixed, or it may be that such a change does break
> something. I just haven't had a chance yet to dig that far into it.
> >
> > If you're someone who is comfortable digging into the code and making
> the updates, then please do and I'm happy to review a PR as soon as I'm
> able.
> >
> > Thanks
> > -Mark
> >
> >
> > [1]
> > https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-avro-record-utils/src/main/java/org/apache/nifi/avro/AvroTypeUtil.java#L629-L631
> >
> > [2] https://issues.apache.org/jira/browse/NIFI-4893
> >
> >
> >
> > On Dec 11, 2019, at 8:02 AM, Oliveira, Emanuel <[email protected]>
> wrote:
> >
> > Can anyone knowledgeable about Avro schemas please confirm whether this
> > inability to invalidate a JSON payload that is missing the array in the root,
> > when Allow Extra Fields=true, is normal?
> >
> > There are 2 options:
> >
> > ValidateRecord.Allow Extra Fields=false -> need to supply the full schema
> > ValidateRecord.Allow Extra Fields=true -> this is what I have been
> > testing/want, a way to supply a schema with only the mandatory fields.
> >
> >
> > I want 2 mandatory things: the array, with at least 1 element, and
> > eventVersion on each element, so the minimal JSON should be:
> > { (..)
> >    "Records": [{
> >          "eventVersion": "aaa"
> >          (..)
> >       }
> >    ]
> >    (..)
> > }
> >
> > The problem is that ValidateRecord considers the FlowFile valid when the
> > “Records” array is missing from the root!
> > {
> >    "Service": "sssssss",
> >    "Event": "eeeee",
> >    "Time": "2019-11-25T16:21:53.280Z",
> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
> > }
> >
> > If I supply the “Records” array, then the schema correctly enforces that
> > each array element record has at least eventVersion.
> >
> >
> > So… maybe my question can be tuned to: “is it possible in Avro schema
> > syntax to specify cardinalities like in a DB E/R diagram, where a relation
> > can be one of the following:
> > 0..n
> > 1..n
> > 1 and only 1?”
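
To my knowledge, the Avro schema language has no way to declare a minimum or maximum number of items for an array, so a cardinality rule such as "at least one element" would have to be checked outside the schema. A minimal sketch in Python, where `check_cardinality` and its parameters are hypothetical names and not part of Avro or NiFi:

```python
# Hypothetical helper for illustration only; not part of Avro or NiFi.
def check_cardinality(record, field, min_items=0, max_items=None):
    """Return True if record[field] is a list whose length is within bounds.

    A missing or null field counts as zero items, so it only passes
    when min_items is 0.
    """
    value = record.get(field)
    if value is None:
        count = 0
    elif isinstance(value, list):
        count = len(value)
    else:
        return False  # present, but not an array
    if count < min_items:
        return False
    if max_items is not None and count > max_items:
        return False
    return True


good = {"Service": "sssssss", "Records": [{"eventVersion": "aaa"}]}
missing = {"Service": "sssssss"}

# "Zero or more": a missing array is acceptable.
print(check_cardinality(missing, "Records", min_items=0))
# "At least one": a missing array is rejected, a one-element array passes.
print(check_cardinality(missing, "Records", min_items=1))
print(check_cardinality(good, "Records", min_items=1))
# "One and only one".
print(check_cardinality(good, "Records", min_items=1, max_items=1))
```

In a flow, a check of this kind would sit alongside schema validation rather than replace it; the schema still enforces field names and types.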
> >
> >
> > Thanks//Regards,
> > Emanuel Oliveira
> > Senior Oracle/Data Engineer | CTG | Galway TEL ext: 353 – (0)91-74
> > 4971 | int: 8-737 4971 |  who's who
> >
> > From: Oliveira, Emanuel <[email protected]>
> > Sent: Friday 6 December 2019 10:15
> > To: [email protected]
> > Subject: RE: NiFi ValidateRecord - unable to handle missing mandatory
> ARRAY ?
> >
> > Hi Mark, I forgot to share the NiFi version we are using:
> > 1.8.0
> > 10/22/2018 23:48:30 EDT
> > Tagged nifi-1.8.0-RC3
> >
> >
> > Thanks//Regards,
> > Emanuel Oliveira
> > Senior Oracle/Data Engineer | CTG | Galway TEL ext: 353 – (0)91-74
> > 4971 | int: 8-737 4971 |  who's who
> >
> > From: Emanuel Oliveira <[email protected]>
> > Sent: Thursday 5 December 2019 22:42
> > To: [email protected]
> > Subject: Re: NiFi ValidateRecord - unable to handle missing mandatory
> ARRAY ?
> >
> > Hi Mark, be sure to copy and paste "NOK - payload BAD 1 - " into
> > GenerateFlowFile, as this is the problem case.
> >
> > Cheers,
> > Emanuel
> >
> > On Thu 5 Dec 2019, 22:03 Mark Payne, <[email protected]> wrote:
> >
> > Emanuel,
> >
> > What version of NiFi are you using?
> >
> > I just tested the attached template against the latest, and the FlowFile
> was routed to 'invalid' with the explanation:
> >
> > Records in this FlowFile were invalid for the following reasons: The
> > following 1 fields were missing: [[0]/Records/eventVersion]
> >
> >
> >
> >
> > Thanks
> > -Mark
> >
> >
> >
> >
> > On Dec 5, 2019, at 7:06 AM, Oliveira, Emanuel <[email protected]>
> wrote:
> >
> > Hi all,
> >
> > I have been struggling to find a way for ValidateRecord, using an Avro schema,
> > to make the presence of an array in a JSON payload mandatory. The problem is that
> > if the array “Records” is missing, ValidateRecord considers the FlowFile valid ☹.
> > --objective - mandatory to have the "Records" array with at least
> > "eventVersion"
> > - using ValidateRecord > Allow Extra Fields
> > - the problem I'm facing is that NiFi doesn't flag payload BAD 1 as invalid!!
> >
> > How can I make mandatory the Records array ? Is it possible ?
> >
> > I know I can eventually use SplitJson with JsonPath Expression=$.Records to
> > get rid of the array, and also to fail if the array "Records" is not present. But I
> > would like to have a clean solution using just the Avro schema. Is this
> > possible?
> >
> >
> >
> > --OK - payload GOOD
> > {
> >    "Service": "sssssss",
> >    "Event": "eeeee",
> >    "Time": "2019-11-25T16:21:53.280Z",
> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
> >    "Records": [{
> >          "eventVersion": "aaa"
> >       }
> >    ]
> > }
> >
> > --NOK - payload BAD 1 - missing "Records" array -> BUT
> > VALIDATERECORD/AVROSCHEMA SENDS THE FF TO “valid”!! I want it to be sent to
> > “invalid”, since it is not compliant with my Avro schema, which requires the array
> > “Records” with element “eventVersion” as 2 mandatory things.
> > {
> >    "Service": "sssssss",
> >    "Event": "eeeee",
> >    "Time": "2019-11-25T16:21:53.280Z",
> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
> >    "RecordsXXX": [{
> >          "eventVersion": "aaa"
> >       }
> >    ]
> > }
> >
> > --OK - payload BAD 2 - "Records" array present but missing "eventVersion"
> > {
> >    "Service": "sssssss",
> >    "Event": "eeeee",
> >    "Time": "2019-11-25T16:21:53.280Z",
> >    "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
> >    "RequestId": "RRRRRRRRRRRRRRRRRR",
> >    "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
> >    "Records": [{
> >          "eventVersionXX": "aaa"
> >       }
> >    ]
> > }
> >
> > It's a very simple test flow (attached the XML template
> > ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml), just using
> > ValidateRecord with JsonTreeReader/JsonRecordSetWriter:
> > <image001.png>
> >
> >
> > Here's the ValidateRecord processor + reader/writer controllers:
> >
> > Avro schema with just the array “Records” and “eventVersion” as the minimum tag on
> > each array element.
> > Using Allow Extra Fields true:
> >
> > So I'm OK with having other fields in the root side by side with the array
> > “Records”, and also OK with extra fields inside each array element.
> > FYI: in the real use case I'm trying to validate an AWS SQS message (S3
> > trigger) where I will be interested in several fields, but I crafted this
> > simpler example just to ask whether it's possible to force the array to be
> > mandatory and to have at least 1 element.
> >
> > ==========================================================
> >
> > --ValidateRecord 1.8.0
> > Record Reader                           JsonTreeReader
> > Record Writer                           JsonRecordSetWriter
> > Record Writer for Invalid Records
> > Schema Access Strategy                  Use Reader's Schema
> > Schema Registry                         No value set
> > Schema Name                             ${schema.name}
> > Schema Text                             ${avro.schema}
> > Allow Extra Fields                      true
> > Strict Type Checking                    true
> >
> > --JsonTreeReader 1.8.0 - MANDATORY TO HAVE "Records" ARRAY +
> "eventVersion" on each ARRAY element
> > Schema Access Strategy                  Use 'Schema Text' Property
> > Schema Registry
> > Schema Name                             ${schema.name}
> > Schema Version
> > Schema Branch
> > Schema Text
> >                                         {
> >                                            "name": "MyName",
> >                                            "type": "record",
> >                                            "namespace": "aa.bb.cc",
> >                                            "fields": [{
> >                                                  "name": "Records",
> >                                                  "type": {
> >                                                     "type": "array",
> >                                                     "items": {
> >                                                        "name": "Records_record",
> >                                                        "type": "record",
> >                                                        "fields": [{
> >                                                              "name": "eventVersion",
> >                                                              "type": "string"
> >                                                           }
> >                                                        ]
> >                                                     }
> >                                                  }
> >                                               }
> >                                            ]
> >                                         }
> > Date Format
> > Time Format
> > Timestamp Format
> >
> > --JsonRecordSetWriter 1.8.0
> > Schema Write Strategy                   Do Not Write Schema
> > Schema Access Strategy                  Inherit Record Schema
> > Schema Registry
> > Schema Name                             ${schema.name}
> > Schema Version
> > Schema Branch
> > Schema Text                             { "name": "eventVersion", "type": "string" }
> > Date Format
> > Time Format
> > Timestamp Format
> > Pretty Print JSON                       true
> > Suppress Null Values                    Never Suppress
> > Output Grouping                         Array
> >
> > Thanks in advance,
> > Emanuel Oliveira
> >
> > <ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml>
> >
> >
>
