Andre,
Thanks for the clarification. I missed the salient point: "NiFi simply
failed to validate the second schema (with nested records)".



On Fri, Jun 23, 2017 at 3:41 AM, Andre <[email protected]> wrote:

> Andrew,
>
> You are correct. The JSONs must be different... but I have not even gone
> to that level yet! NiFi simply failed to validate the second schema (with
> nested records).
>
> What was not clear to me is why the Avro Registry Service would
> not validate the second schema, even after minifying it and confirming
> that both single quotes and double quotes are pure ASCII characters.
>
> It turns out the issue is fixed by NIFI-4029.
>
> Are we targeting a 1.3.1 release by any chance? 😎
>
> Cheers
>
>
> PS: I wasn't familiar with avro-tools random, so thanks for that! 😀
>
>
> On Fri, Jun 23, 2017 at 2:57 PM, Andrew Psaltis <[email protected]>
> wrote:
>
> > Andre,
> > I may be totally off-base here as it relates to the record-related
> > processing in NiFi. From a pure Avro viewpoint, the same JSON will not
> > work between those two schemas that you provided. The second one has a
> > nested header record and the first does not. I used the Java Avro tools
> > to do the following:
> >
> >
> > 1. Generate sample Avro for schema 1
> >
> > java -jar avro-tools-1.7.7.jar random --count 1 --schema-file schema1.asvc test1.avro
> >
> >
> > 2. Convert binary Avro to JSON
> >
> >
> > java -jar avro-tools-1.7.7.jar tojson test1.avro
> >
> >
> > This produces JSON such as:
> >
> >
> > {"version":297340384,"deviceVendor":null}
> >
> >
> > 3. Generate sample Avro for schema 2
> >
> > java -jar avro-tools-1.7.7.jar random --count 1 --schema-file schema2.asvc test2.avro
> >
> >
> > 4. Convert binary Avro to JSON
> >
> >
> > java -jar avro-tools-1.7.7.jar tojson test2.avro
> >
> >
> > This produces JSON such as:
> >
> >
> > {"header":{"version":-492928400,"deviceVendor":{"float":0.59934044}}}
> >
> >
> > Valid JSON for the second schema with a null deviceVendor would then be
> > represented as:
> >
> > {"header":{"version":-492928400,"deviceVendor":null}}
> >
> >
> > If you try to generate Avro from the JSON in step 2 above
> > ({"version":297340384,"deviceVendor":null}) using the second schema:
> >
> >
> > java -jar avro-tools-1.7.7.jar fromjson --schema-file schema2.asvc test1.json
> >
> > It will throw an exception:
> >
> >
> > Exception in thread "main"
> > org.apache.avro.AvroTypeException: Expected field name not found: header
> >
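
The mismatch described above can also be illustrated without avro-tools. The following is only a rough sketch using Python's standard library: `check` is a hypothetical helper written for this thread (it is not part of Avro or NiFi), it handles only records, unions of primitive names, int, float and null, and the schemas are trimmed copies of the two discussed here with the "doc" attributes omitted.

```python
import json

# Schemas from this thread, trimmed to the parts that matter ("doc" omitted).
HEADER = {
    "type": "record",
    "name": "header",
    "fields": [
        {"name": "version", "type": "int"},
        {"name": "deviceVendor", "type": ["null", "float"], "default": None},
    ],
}
SCHEMA_1 = HEADER
SCHEMA_2 = {
    "type": "record",
    "name": "nifiCommonEventFormat",
    "namespace": "com.fluenda.SecuritySchemas.cefRev23",
    "fields": [{"name": "header", "type": HEADER}],
}


def check(schema, datum):
    """Raise ValueError if datum does not fit schema (hypothetical helper).

    Supports only records, unions of primitive names, int, float and null.
    Extra fields in the datum are not rejected in this sketch.
    """
    if isinstance(schema, list):  # union
        if datum is None:
            if "null" not in schema:
                raise ValueError("null is not a branch of this union")
            return
        # Avro's JSON encoding wraps non-null union values: {"float": 0.5}
        if not isinstance(datum, dict) or len(datum) != 1:
            raise ValueError("expected a single-key union wrapper")
        branch, value = next(iter(datum.items()))
        if branch not in schema:
            raise ValueError(f"unknown union branch: {branch}")
        check(branch, value)
    elif isinstance(schema, dict) and schema.get("type") == "record":
        if not isinstance(datum, dict):
            raise ValueError("expected a JSON object for a record")
        for field in schema["fields"]:
            if field["name"] not in datum:
                raise ValueError(f"Expected field name not found: {field['name']}")
            check(field["type"], datum[field["name"]])
    elif schema == "null":
        if datum is not None:
            raise ValueError("expected null")
    elif schema == "int":
        if isinstance(datum, bool) or not isinstance(datum, int):
            raise ValueError("expected int")
    elif schema == "float":
        if isinstance(datum, bool) or not isinstance(datum, (int, float)):
            raise ValueError("expected float")
    else:
        raise ValueError(f"unsupported schema in this sketch: {schema!r}")


flat = json.loads('{"version":297340384,"deviceVendor":null}')
nested = json.loads('{"header":{"version":-492928400,"deviceVendor":null}}')

check(SCHEMA_1, flat)    # the step 2 JSON fits schema 1
check(SCHEMA_2, nested)  # the nested JSON fits schema 2
try:
    check(SCHEMA_2, flat)  # the step 2 JSON does not fit schema 2
except ValueError as err:
    print(err)
```

The last call fails because the top-level record of the second schema has a single field named header, which the flat JSON does not contain.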
> >
> > Hopefully this did not totally muddy the water.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> > On Thu, Jun 22, 2017 at 11:00 PM, Andre <[email protected]> wrote:
> >
> > > All,
> > >
> > > So it turns out the validation failure seems to arise here:
> > >
> > > https://github.com/apache/nifi/blob/b1901d5fe0bf87be3dcce144f13b74eb995be168/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/RecordField.java#L44
> > >
> > > When using the following schema:
> > >
> > > {
> > >     "type":"record",
> > >     "name":"header",
> > >     "fields":[
> > >       {
> > >         "name":"version",
> > >         "type":"int",
> > >         "doc":"The CEF version extracted from (CEF:0) where 0 is the version"
> > >       },
> > >       {
> > >         "name":"deviceVendor",
> > >         "type":[
> > >           "null",
> > >           "float"
> > >         ],
> > >         "default":null
> > >       }
> > >     ]
> > >
> > > }
> > >
> > > The float field with a null default parses correctly.
> > >
> > > However, when using this:
> > >
> > > {
> > >   "name":"nifiCommonEventFormat",
> > >   "namespace":"com.fluenda.SecuritySchemas.cefRev23",
> > >   "type":"record",
> > >   "fields":[
> > >     {
> > >       "name":"header",
> > >       "type":{
> > >         "type":"record",
> > >         "name":"header",
> > >         "fields":[
> > >           {
> > >             "name":"version",
> > >             "type":"int",
> > >             "doc":"The CEF version extracted from (CEF:0) where 0 is the version"
> > >           },
> > >           {
> > >             "name":"deviceVendor",
> > >             "type":[
> > >               "null",
> > >               "float"
> > >             ],
> > >             "default":null
> > >           }
> > >         ]
> > >       }
> > >     }
> > >   ]
> > > }
> > >
> > > It fails.
> > >
> > > IntelliJ seems to suggest an NPE is the underlying cause?
> > >
> > > [image: Inline image 1]
> > >
> > >
> > > What puzzles me is that I tested with Python and the schema seems valid:
> > >
> > > $ ./test.py
> > > {u'header': {u'version': 0, u'deviceVendor': 1.2345000505447388}}
> > > {u'header': {u'version': 0, u'deviceVendor': None}}
> > >
> > >
> > > The test.py script can be found here:
> > >
> > > https://gist.github.com/trixpan/4d190bef83712001857548b63c1d995a
> > >
> > >
> > > Cheers
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Fri, Jun 23, 2017 at 12:20 AM, Andre <[email protected]> wrote:
> > >
> > >> Joey,
> > >>
> > >> Using Chrome on Windows. Tried with Firefox and faced the same issue.
> > >>
> > >> On Fri, Jun 23, 2017 at 12:14 AM, Joey Frazee <[email protected]> wrote:
> > >>
> > >>> Andre, this occurs if there are any line breaks in the schema and
> > >>> leading spaces on the new lines. So if you minify the avsc or remove
> > >>> the leading spaces, all should be good.
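
A minimal sketch of the minify workaround mentioned above, using only Python's standard `json` module. The function name `minify_avsc` and the sample schema text are illustrative assumptions, not something NiFi or Avro provides:

```python
import json


def minify_avsc(text: str) -> str:
    """Return the schema as single-line JSON with no insignificant whitespace."""
    # json.loads rejects anything that is not valid JSON (for example curly
    # "smart" quotes), so a successful round trip also confirms the text parses.
    return json.dumps(json.loads(text), separators=(",", ":"))


# Hypothetical pretty-printed schema with line breaks and leading spaces.
pretty = """
{
    "type": "record",
    "name": "header",
    "fields": [
        {"name": "version", "type": "int"}
    ]
}
"""
print(minify_avsc(pretty))
```

The result contains no line breaks or indentation, so it can be pasted into the property value as a single line.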
> > >>>
> > >>> Will open a JIRA on this since, including myself, I think you’re the
> > >>> third to see this. Any chance you’re using Safari?
> > >>>
> > >>> > On Jun 22, 2017, at 8:53 AM, Andrew Grande <[email protected]> wrote:
> > >>> >
> > >>> > Definitely something is auto-replacing quotes. I can confirm pasting
> > >>> > worked fine before from a programmer's editor.
> > >>> >
> > >>> > Andrew
> > >>> >
> > >>> > On Thu, Jun 22, 2017, 9:06 AM Mark Payne <[email protected]> wrote:
> > >>> >
> > >>> >> Andre,
> > >>> >>
> > >>> >> I've not seen this personally. I just clicked on the link you sent,
> > >>> >> copied the schema, and pasted it in, and it did not have any
> > >>> >> problems. What application are you copying the text from? I've
> > >>> >> certainly seen that some applications (specifically Microsoft
> > >>> >> Outlook and Office) love to take double-quotes and change them into
> > >>> >> other characters so that they look nicer. But if you then copy that
> > >>> >> and paste it, it is not pasting a double-quote but some other
> > >>> >> Unicode character.
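
One way to check pasted text for the substituted characters described above is to list every non-ASCII character with its position. This is only a sketch: `find_non_ascii` and the sample string are made up for illustration.

```python
def find_non_ascii(text: str):
    """Return (line, column, character, code point) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col_no, ch, f"U+{ord(ch):04X}"))
    return hits


# Hypothetical pasted text: the first key uses curly quotes, the rest plain ones.
pasted = '{\u201cname\u201d: "header",\n "type": "record"}'
for hit in find_non_ascii(pasted):
    print(hit)
```

An empty result means every character, including the quotes, is plain ASCII.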
> > >>> >>
> > >>> >> I would recommend you open the link below in Chrome, copy from
> > >>> >> there, and see if that works.
> > >>> >>
> > >>> >> Thanks
> > >>> >> -Mark
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >>> On Jun 22, 2017, at 8:56 AM, Andre <[email protected]> wrote:
> > >>> >>>
> > >>> >>> All,
> > >>> >>>
> > >>> >>> I was playing with the AvroSchemaRegistry and noticed it seems to
> > >>> >>> not play ball when the DFM pastes the schema into the dynamic
> > >>> >>> property value.
> > >>> >>>
> > >>> >>> To test it I basically copied the demo schema from Mark's blog
> > >>> >>> post[1] and pasted it into a NiFi 1.3.0 instance. To my surprise the
> > >>> >>> controller would not validate; instead it displayed:
> > >>> >>>
> > >>> >>> "was expecting double-quote to start field name"
> > >>> >>>
> > >>> >>> I also faced similar errors using the following schema:
> > >>> >>>
> > >>> >>>
> > >>> >>> https://github.com/fluenda/SecuritySchemas/blob/master/CEFRev23/cefRev23_nifi.avsc
> > >>> >>>
> > >>> >>> Has anyone else seen this?
> > >>> >>>
> > >>> >>> Cheers
> > >>> >>
> > >>> >>
> > >>>
> > >>>
> > >>
> > >
> >
> >
> > --
> > Thanks,
> > Andrew
> >
> > Subscribe to my book: Streaming Data <http://manning.com/psaltis>
> > <https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
> > twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>
> >
>



-- 
Thanks,
Andrew

Subscribe to my book: Streaming Data <http://manning.com/psaltis>
<https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>
