Andre,

Thanks for the clarification. I missed the salient point: "NiFi simply failed to validate the second schema (with nested records)".
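
For anyone following along, the "minify it and confirm the quotes are pure ASCII" step that Andre describes below can be sketched in Python using only the standard library. This is a rough illustration, not anything shipped with NiFi or Avro: the function name `minify_avsc` is made up for this example, and rejecting every non-ASCII character is a deliberately blunt assumption (a schema with non-ASCII text in a "doc" string would be rejected too).

```python
import json


def minify_avsc(text):
    """Return the schema as compact single-line JSON.

    Hypothetical helper: raises ValueError if the text contains any
    non-ASCII character (e.g. "smart quotes" introduced by copy/paste)
    or if it is not valid JSON.
    """
    suspicious = sorted({ch for ch in text if ord(ch) > 127})
    if suspicious:
        raise ValueError("non-ASCII characters found: %r" % suspicious)
    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad
    # input; compact separators drop all line breaks and leading spaces.
    return json.dumps(json.loads(text), separators=(",", ":"))


# The flat "header" schema from the thread, pasted with line breaks and
# leading spaces (the "doc" attribute is omitted here for brevity).
pasted = """{
    "type": "record",
    "name": "header",
    "fields": [
        {"name": "version", "type": "int"},
        {"name": "deviceVendor", "type": ["null", "float"], "default": null}
    ]
}"""

print(minify_avsc(pasted))
```

The minified string is what would then be pasted into the dynamic property value; whether that avoids the validation error depends on the NiFi version in use.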
On Fri, Jun 23, 2017 at 3:41 AM, Andre <[email protected]> wrote:

> Andrew,
>
> You are correct. The JSONs must be different... but I have not even gone
> to that level yet! NiFi simply failed to validate the second schema (with
> nested records).
>
> The thing that was not clear to me is why the Avro Registry Service would
> not validate the second schema, even after minifying it and confirming
> both single quotes and double quotes are pure ASCII characters.
>
> It turned out the issue is fixed by NIFI-4029.
>
> Are we targeting releasing 1.3.1 by any chance? 😎
>
> Cheers
>
> PS - I wasn't familiar with avro-tools random, so thanks for that! 😀
>
> On Fri, Jun 23, 2017 at 2:57 PM, Andrew Psaltis <[email protected]>
> wrote:
>
> > Andre,
> > I may be totally off-base here as it relates to the record related
> > processing in NiFi. From a pure Avro viewpoint the same JSON will not
> > work between those two schemas that you provided. The second one has a
> > nested header record and the first does not. I used the Java Avro tools
> > to do the following:
> >
> > 1. Generate sample Avro for schema 1
> >
> > java -jar avro-tools-1.7.7.jar random --count 1 --schema-file schema1.asvc test1.avro
> >
> > 2. Convert binary Avro to JSON
> >
> > java -jar avro-tools-1.7.7.jar tojson test1.avro
> >
> > This produces JSON like such:
> >
> > {"version":297340384,"deviceVendor":null}
> >
> > 3. Generate sample Avro for schema 2
> >
> > java -jar avro-tools-1.7.7.jar random --count 1 --schema-file schema2.asvc test2.avro
> >
> > 4. Convert binary Avro to JSON
> >
> > java -jar avro-tools-1.7.7.jar tojson test2.avro
> >
> > This produces JSON like such:
> >
> > {"header":{"version":-492928400,"deviceVendor":{"float":0.59934044}}}
> >
> > To then have valid JSON with the second schema, you would represent it
> > as:
> >
> > {"header":{"version":-492928400,"deviceVendor":null}}
> >
> > If you try and generate Avro with JSON from step 2 above
> > ({"version":297340384,"deviceVendor":null}) using the second schema:
> >
> > java -jar avro-tools-1.7.7.jar fromjson --schema-file schema2.asvc test1.json
> >
> > It will throw an exception:
> >
> > Exception in thread "main"
> > org.apache.avro.AvroTypeException: Expected field name not found: header
> >
> > Hopefully this did not totally muddy the water.
> >
> > Thanks,
> > Andrew
> >
> > On Thu, Jun 22, 2017 at 11:00 PM, Andre <[email protected]> wrote:
> >
> > > All,
> > >
> > > So it turns out the reason for the validation failure seems to be
> > > arising here:
> > >
> > > https://github.com/apache/nifi/blob/b1901d5fe0bf87be3dcce144f13b74eb995be168/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/RecordField.java#L44
> > >
> > > When using the following schema:
> > >
> > > {
> > >   "type":"record",
> > >   "name":"header",
> > >   "fields":[
> > >     {
> > >       "name":"version",
> > >       "type":"int",
> > >       "doc":"The CEF version extracted from (CEF:0) where 0 is the version"
> > >     },
> > >     {
> > >       "name":"deviceVendor",
> > >       "type":[
> > >         "null",
> > >         "float"
> > >       ],
> > >       "default":null
> > >     }
> > >   ]
> > > }
> > >
> > > The float field with default null will parse correctly.
> > >
> > > However, when using this:
> > >
> > > {
> > >   "name":"nifiCommonEventFormat",
> > >   "namespace":"com.fluenda.SecuritySchemas.cefRev23",
> > >   "type":"record",
> > >   "fields":[
> > >     {
> > >       "name":"header",
> > >       "type":{
> > >         "type":"record",
> > >         "name":"header",
> > >         "fields":[
> > >           {
> > >             "name":"version",
> > >             "type":"int",
> > >             "doc":"The CEF version extracted from (CEF:0) where 0 is the version"
> > >           },
> > >           {
> > >             "name":"deviceVendor",
> > >             "type":[
> > >               "null",
> > >               "float"
> > >             ],
> > >             "default":null
> > >           }
> > >         ]
> > >       }
> > >     }
> > >   ]
> > > }
> > >
> > > It fails.
> > >
> > > IntelliJ seems to suggest an NPE is the underlying cause?
> > >
> > > [image: Inline image 1]
> > >
> > > What puzzles me is that I tested with Python and the schema seems
> > > valid?
> > >
> > > $ ./test.py
> > > {u'header': {u'version': 0, u'deviceVendor': 1.2345000505447388}}
> > > {u'header': {u'version': 0, u'deviceVendor': None}}
> > >
> > > The test.py script can be found here:
> > >
> > > https://gist.github.com/trixpan/4d190bef83712001857548b63c1d995a
> > >
> > > Cheers
> > >
> > > On Fri, Jun 23, 2017 at 12:20 AM, Andre <[email protected]> wrote:
> > >
> > >> Joey,
> > >>
> > >> Using Chrome on Windows. Tried with Firefox and faced the same issue.
> > >>
> > >> On Fri, Jun 23, 2017 at 12:14 AM, Joey Frazee <[email protected]>
> > >> wrote:
> > >>
> > >>> Andre, if there are any line breaks in the schema and leading spaces
> > >>> on those new lines, then this occurs. So if you minify the avsc or
> > >>> remove the leading spaces, all should be good.
> > >>>
> > >>> Will open a JIRA on this since, including myself, I think you’re the
> > >>> third to see this. Any chance you’re using Safari?
> > >>>
> > >>> > On Jun 22, 2017, at 8:53 AM, Andrew Grande <[email protected]>
> > >>> > wrote:
> > >>> >
> > >>> > Definitely something is auto replacing quotes, I can confirm
> > >>> > pasting worked fine before from a programmer's editor.
> > >>> >
> > >>> > Andrew
> > >>> >
> > >>> > On Thu, Jun 22, 2017, 9:06 AM Mark Payne <[email protected]>
> > >>> > wrote:
> > >>> >
> > >>> >> Andre,
> > >>> >>
> > >>> >> I've not seen this personally. I just clicked on the link you
> > >>> >> sent, copied the schema, and pasted it in, and it did not have
> > >>> >> any problems. What application are you copying the text from?
> > >>> >> I've certainly seen that some applications (specifically
> > >>> >> Microsoft Outlook and Office) love to take double-quotes and
> > >>> >> change them into other characters so that they look nicer. But
> > >>> >> if you then copy that and paste it, it is not pasting a
> > >>> >> double-quote but some other unicode character.
> > >>> >>
> > >>> >> Would recommend you open the below link in Chrome and copy from
> > >>> >> there and see if that works?
> > >>> >>
> > >>> >> Thanks
> > >>> >> -Mark
> > >>> >>
> > >>> >>> On Jun 22, 2017, at 8:56 AM, Andre <[email protected]> wrote:
> > >>> >>>
> > >>> >>> All,
> > >>> >>>
> > >>> >>> I was playing with the AvroSchemaRegistry and noticed it seems
> > >>> >>> to not play ball when the DFM pastes the schema into the
> > >>> >>> dynamic property value.
> > >>> >>>
> > >>> >>> To test it I basically copied the demo schema from Mark's blog
> > >>> >>> post[1] and pasted into a NiFi 1.3.0 instance. To my surprise
> > >>> >>> the controller would not validate, instead it displayed:
> > >>> >>>
> > >>> >>> "was expecting double-quote to start field name"
> > >>> >>>
> > >>> >>> I also faced similar errors using the following schema:
> > >>> >>>
> > >>> >>> https://github.com/fluenda/SecuritySchemas/blob/master/CEFRev23/cefRev23_nifi.avsc
> > >>> >>>
> > >>> >>> Has anyone else seen this?
> > >>> >>>
> > >>> >>> Cheers
> >
> > --
> > Thanks,
> > Andrew
> >
> > Subscribe to my book: Streaming Data <http://manning.com/psaltis>
> > <https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
> > twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>

--
Thanks,
Andrew

Subscribe to my book: Streaming Data <http://manning.com/psaltis>
<https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>
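
A footnote on the avro-tools walkthrough quoted above: the reason the flat JSON from step 2 is rejected by the nested schema can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: `missing_fields` is a hypothetical helper written for this note, it compares top-level field names only, and it is not how Avro's own JSON decoder is implemented (which also checks types and union encoding).

```python
import json

# The nested schema from the thread ("doc" attributes omitted for brevity).
SCHEMA_2 = json.loads("""
{
  "name": "nifiCommonEventFormat",
  "namespace": "com.fluenda.SecuritySchemas.cefRev23",
  "type": "record",
  "fields": [
    {"name": "header",
     "type": {"type": "record", "name": "header",
              "fields": [
                {"name": "version", "type": "int"},
                {"name": "deviceVendor", "type": ["null", "float"],
                 "default": null}
              ]}}
  ]
}
""")


def missing_fields(record_schema, datum):
    """Hypothetical helper: names of top-level record fields absent from datum."""
    return [f["name"] for f in record_schema["fields"] if f["name"] not in datum]


# JSON generated from schema 1 (flat) and the JSON shape schema 2 expects.
flat = json.loads('{"version":297340384,"deviceVendor":null}')
nested = json.loads('{"header":{"version":-492928400,"deviceVendor":null}}')

print(missing_fields(SCHEMA_2, flat))    # ['header']
print(missing_fields(SCHEMA_2, nested))  # []
```

The flat datum has no "header" key at the top level, which lines up with the "Expected field name not found: header" message reported by fromjson.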
