Andre,
I may be totally off-base here as far as the record-related processing in
NiFi goes, but from a pure Avro viewpoint the same JSON will not work with
both of the schemas you provided. The second one has a nested header record
and the first does not. I used the Java Avro tools to do the following:
1. Generate sample Avro for schema 1
java -jar avro-tools-1.7.7.jar random --count 1 --schema-file schema1.asvc test1.avro
2. Convert binary Avro to JSON
java -jar avro-tools-1.7.7.jar tojson test1.avro
This produces JSON such as:
{"version":297340384,"deviceVendor":null}
3. Generate sample Avro for schema 2
java -jar avro-tools-1.7.7.jar random --count 1 --schema-file schema2.asvc test2.avro
4. Convert binary Avro to JSON
java -jar avro-tools-1.7.7.jar tojson test2.avro
This produces JSON such as:
{"header":{"version":-492928400,"deviceVendor":{"float":0.59934044}}}
To have valid JSON for the second schema with a null deviceVendor, you would
represent it as:
{"header":{"version":-492928400,"deviceVendor":null}}
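As a rough illustration of why step 4 shows the float wrapped in an object while the null case is a bare null: in Avro's JSON encoding, a non-null union value is written as an object keyed by its type name, and null is written as-is. Below is a minimal, stdlib-only Python sketch of that rule. It is not how avro-tools implements it; the to_avro_json helper and the SCHEMA2 constant are names I made up for the example, and it only covers records, primitives, and unions of null with one other type.

```python
import json

# Schema 2 from the thread, trimmed to the parts that matter here.
SCHEMA2 = {
    "type": "record",
    "name": "nifiCommonEventFormat",
    "fields": [{
        "name": "header",
        "type": {
            "type": "record",
            "name": "header",
            "fields": [
                {"name": "version", "type": "int"},
                {"name": "deviceVendor", "type": ["null", "float"], "default": None},
            ],
        },
    }],
}


def to_avro_json(value, schema):
    """Hypothetical helper: apply Avro's JSON-encoding rules to a plain value."""
    if isinstance(schema, list):  # a union such as ["null", "float"]
        if value is None:
            if "null" not in schema:
                raise ValueError("null is not a branch of this union")
            return None  # null is written bare, with no wrapper
        # Simplification (assumption): use the first non-null branch.
        branch = next(b for b in schema if b != "null")
        label = branch if isinstance(branch, str) else branch["name"]
        return {label: to_avro_json(value, branch)}
    if isinstance(schema, dict) and schema.get("type") == "record":
        return {
            f["name"]: to_avro_json(value.get(f["name"]), f["type"])
            for f in schema["fields"]
        }
    return value  # primitives pass through unchanged


print(json.dumps(to_avro_json({"header": {"version": 0, "deviceVendor": 1.5}}, SCHEMA2)))
print(json.dumps(to_avro_json({"header": {"version": 0, "deviceVendor": None}}, SCHEMA2)))
```

The first call wraps the float as {"float": 1.5}; the second leaves deviceVendor as a plain null, matching the two shapes shown above.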
If you try to generate Avro from the JSON in step 2 above
({"version":297340384,"deviceVendor":null}) using the second schema:
java -jar avro-tools-1.7.7.jar fromjson --schema-file schema2.asvc test1.json
It will throw an exception:
Exception in thread "main" org.apache.avro.AvroTypeException: Expected field name not found: header
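The mismatch can be seen without Avro at all by walking the schema and checking which record fields the JSON actually carries. This is only a stdlib Python sketch under my own assumptions: missing_fields, HEADER, and SCHEMA2 are names invented for the example, every record field is treated as required, and union contents are not inspected.

```python
# The inner "header" record (schema 1) and the wrapping record (schema 2).
HEADER = {
    "type": "record",
    "name": "header",
    "fields": [
        {"name": "version", "type": "int"},
        {"name": "deviceVendor", "type": ["null", "float"], "default": None},
    ],
}
SCHEMA2 = {
    "type": "record",
    "name": "nifiCommonEventFormat",
    "fields": [{"name": "header", "type": HEADER}],
}


def missing_fields(datum, schema, path=""):
    """Hypothetical helper: dotted paths of record fields absent from datum."""
    missing = []
    if isinstance(schema, dict) and schema.get("type") == "record":
        for field in schema["fields"]:
            name = field["name"]
            here = f"{path}.{name}" if path else name
            if not isinstance(datum, dict) or name not in datum:
                missing.append(here)
            else:
                # Recurse so nested records are checked as well.
                missing.extend(missing_fields(datum[name], field["type"], here))
    return missing


flat = {"version": 297340384, "deviceVendor": None}  # JSON from step 2
print(missing_fields(flat, HEADER))              # flat JSON against schema 1
print(missing_fields(flat, SCHEMA2))             # flat JSON against schema 2
print(missing_fields({"header": flat}, SCHEMA2)) # wrapped under "header"
```

The flat JSON satisfies schema 1 but lacks the top-level header field that schema 2 expects; wrapping it under "header" gives the shape schema 2 wants.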
Hopefully this did not totally muddy the waters.
Thanks,
Andrew
On Thu, Jun 22, 2017 at 11:00 PM, Andre <[email protected]> wrote:
> All,
>
> So it turns out the validation failure seems to arise here:
>
> https://github.com/apache/nifi/blob/b1901d5fe0bf87be3dcce144f13b74eb995be168/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/RecordField.java#L44
>
> When using the following schema:
>
> {
> "type":"record",
> "name":"header",
> "fields":[
> {
> "name":"version",
> "type":"int",
> "doc":"The CEF version extracted from (CEF:0) where 0 is the version"
> },
> {
> "name":"deviceVendor",
> "type":[
> "null",
> "float"
> ],
> "default":null
> }
> ]
>
> }
>
> The float field with a null default parses correctly.
>
> However, when using this:
>
> {
> "name":"nifiCommonEventFormat",
> "namespace":"com.fluenda.SecuritySchemas.cefRev23",
> "type":"record",
> "fields":[
> {
> "name":"header",
> "type":{
> "type":"record",
> "name":"header",
> "fields":[
> {
> "name":"version",
> "type":"int",
> "doc":"The CEF version extracted from (CEF:0) where 0 is the version"
> },
> {
> "name":"deviceVendor",
> "type":[
> "null",
> "float"
> ],
> "default":null
> }
> ]
> }
> }
> ]
> }
>
> It fails.
>
> IntelliJ seems to suggest an NPE is the underlying cause?
>
> [image: Inline image 1]
>
>
> What puzzles me is that I tested with Python and the schema seems valid.
>
> $ ./test.py
> {u'header': {u'version': 0, u'deviceVendor': 1.2345000505447388}}
> {u'header': {u'version': 0, u'deviceVendor': None}}
>
>
> The test.py script can be found here:
>
> https://gist.github.com/trixpan/4d190bef83712001857548b63c1d995a
>
>
> Cheers
>
> On Fri, Jun 23, 2017 at 12:20 AM, Andre <[email protected]> wrote:
>
>> Joey,
>>
>> Using Chrome on Windows. Tried with Firefox and faced the same issue.
>>
>> On Fri, Jun 23, 2017 at 12:14 AM, Joey Frazee <[email protected]>
>> wrote:
>>
>>> Andre, if there are any line breaks in the schema and leading spaces on
>>> those new lines, then this occurs. So if you minify the avsc or remove the
>>> leading spaces, all should be good.
>>>
>>> Will open a JIRA on this since, including myself, I think you’re the
>>> third to see this. Any chance you’re using Safari?
>>>
>>> > On Jun 22, 2017, at 8:53 AM, Andrew Grande <[email protected]> wrote:
>>> >
>>> > Definitely something is auto-replacing quotes; I can confirm pasting
>>> > worked fine before from a programmer's editor.
>>> >
>>> > Andrew
>>> >
>>> > On Thu, Jun 22, 2017, 9:06 AM Mark Payne <[email protected]> wrote:
>>> >
>>> >> Andre,
>>> >>
>>> >> I've not seen this personally. I just clicked on the link you sent,
>>> >> copied the schema, and pasted it in, and it did not have any problems.
>>> >> What application are you copying the text from? I've certainly seen
>>> >> that some applications (specifically Microsoft Outlook and Office) love
>>> >> to take double-quotes and change them into other characters so that
>>> >> they look nicer. But if you then copy that and paste it, it is not
>>> >> pasting a double-quote but some other Unicode character.
>>> >>
>>> >> I would recommend you open the link below in Chrome, copy from there,
>>> >> and see if that works.
>>> >>
>>> >> Thanks
>>> >> -Mark
>>> >>
>>> >>
>>> >>
>>> >>> On Jun 22, 2017, at 8:56 AM, Andre <[email protected]> wrote:
>>> >>>
>>> >>> All,
>>> >>>
>>> >>> I was playing with the AvroSchemaRegistry and noticed it seems to not
>>> >>> play ball when the DFM pastes the schema into the dynamic property
>>> >>> value.
>>> >>>
>>> >>> To test it I basically copied the demo schema from Mark's blog post[1]
>>> >>> and pasted it into a NiFi 1.3.0 instance. To my surprise the controller
>>> >>> would not validate; instead it displayed:
>>> >>>
>>> >>> "was expecting double-quote to start field name"
>>> >>>
>>> >>> I also faced similar errors using the following schema:
>>> >>>
>>> >>>
>>> >>> https://github.com/fluenda/SecuritySchemas/blob/master/CEFRev23/cefRev23_nifi.avsc
>>> >>>
>>> >>> Has anyone else seen this?
>>> >>>
>>> >>> Cheers
>>> >>
>>> >>
>>>
>>>
>>
>
--
Thanks,
Andrew
Subscribe to my book: Streaming Data <http://manning.com/psaltis>
<https://www.linkedin.com/pub/andrew-psaltis/1/17b/306>
twitter: @itmdata <http://twitter.com/intent/user?screen_name=itmdata>