Hi all,

I'm not completely sure whether this is a developer or a user question, but
I'm posting it here for now since at this point it concerns flow design.

What I'm trying to achieve is to get a JSON response from an API, extract
the relevant values, validate the data and convert it to Avro. The first two
steps work with InvokeHTTP and JoltTransformJSON, after which my data is a
JSON array of objects, so my flowfile looks like this:

[
  {"key1": "val1", "key2": "val2"},
  {"key1": "val3", "key2": "val4"}
]

My idea was to feed this JSON into ConvertJSONToAvro together with the
appropriate Avro schema. However, ConvertJSONToAvro cannot apply schema
validation to the individual elements of an array. It can, however, validate
records that are not contained in an array but are separated by newlines, so
it can handle the following flowfile (note that this, taken as a whole file,
is not a valid JSON document):

{"key1": "val1", "key2": "val2"}
{"key1": "val3", "key2": "val4"}

I can achieve this in NiFi by splitting the JSON flowfile with SplitJson and
immediately merging it back together with a MergeContent processor that uses
'\n' as demarcator. Both have to be applied before ConvertJSONToAvro, because
otherwise invalid records would cause the merge step to fail. That means the
split can't even be used to redistribute files in a cluster setting, which is
why I don't really like this workaround.
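
To make clear what the split/merge pair boils down to, here is a rough
sketch of the same conversion in plain Python, outside NiFi. The function
name array_to_ndjson is just something I made up for illustration; it is not
a NiFi API, and this is only meant to show the intended input and output.

```python
import json


def array_to_ndjson(text: str) -> str:
    """Turn a JSON array of objects into one JSON object per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # One object per line, without the enclosing brackets or separating commas.
    return "\n".join(json.dumps(record) for record in records)


# Same content as the array flowfile above.
flowfile_content = (
    '[{"key1": "val1", "key2": "val2"}, {"key1": "val3", "key2": "val4"}]'
)
print(array_to_ndjson(flowfile_content))
```

Something like this could presumably be wrapped in a scripting processor, but
I haven't tried that, and I'd still prefer a solution using the standard
processors.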

I was wondering if anyone knows a way to produce the second format with a
JOLT transformation, which would be an elegant fix. If not, I'd like to ask
whether there is a reason that ConvertJSONToAvro can only handle
newline-separated objects and not objects in an array (which, in my opinion,
is the closest JSON representation of the concept of records in Avro). If
there is no such reason, I think this can be considered a bug, and I would
like to propose adding an option to the ConvertJSONToAvro processor to apply
the schema validation either to the whole file, to objects separated by
newlines, or to objects in an array.
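
As a rough illustration of the "objects in an array" mode I have in mind,
here is a small Python sketch. Everything in it is an assumption on my side:
the schema is one I made up for the example records, and matches_schema is a
deliberately simplistic stand-in for real Avro validation (flat records with
exactly the listed string fields). It is not how ConvertJSONToAvro works
internally; it only shows each array element being checked on its own.

```python
import json

# Hypothetical Avro-style schema for the example records above.
SCHEMA = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "key1", "type": "string"},
        {"name": "key2", "type": "string"},
    ],
}


def matches_schema(record, schema=SCHEMA):
    """Simplistic check: a flat object with exactly the schema's string fields."""
    if not isinstance(record, dict):
        return False
    expected = {field["name"] for field in schema["fields"]}
    if set(record) != expected:
        return False
    return all(isinstance(record[name], str) for name in expected)


def validate_array(text):
    """Validate every array element individually; return (valid, invalid)."""
    valid, invalid = [], []
    for record in json.loads(text):
        (valid if matches_schema(record) else invalid).append(record)
    return valid, invalid


valid, invalid = validate_array(
    '[{"key1": "val1", "key2": "val2"}, {"key1": 3, "key2": "val4"}]'
)
print(len(valid), "valid,", len(invalid), "invalid")
```

The point is that one bad element ends up in the invalid list without
affecting the other records in the same array.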

Please let me know what you think!

Regards,
Bas
