Bob,
InferAvroSchema can infer primitive types such as boolean, integer, long,
float, and double, and I believe that for JSON it can also descend
correctly into arrays and nested maps/structs/objects. Here is an example
record from NiFi provenance data that covers most of those (all except
boolean and float/double, which you can add yourself):
{
"eventId" : "7422645d-056e-423b-b280-6305f9daccaa",
"eventOrdinal" : 0,
"eventType" : "CREATE",
"timestampMillis" : 1496934288944,
"timestamp" : "2017-06-08T15:04:48.944Z",
"durationMillis" : -1,
"lineageStart" : 1496934288930,
"componentId" : "8821e5d8-015c-1000-30b0-f7211bbf43e5",
"componentType" : "GenerateFlowFile",
"componentName" : "_GenerateFlowFile",
"entityId" : "b99a56c6-e032-4396-915e-24186974b84a",
"entityType" : "org.apache.nifi.flowfile.FlowFile",
"entitySize" : 52,
"updatedAttributes" : {
"path" : "./",
"uuid" : "b99a56c6-e032-4396-915e-24186974b84a",
"filename" : "924304881186293"
},
"previousAttributes" : { },
"actorHostname" : "localhost",
"contentURI" : "http://localhost:8989/nifi-api/provenance-events/0/content/output",
"previousContentURI" : "http://localhost:8989/nifi-api/provenance-events/0/content/input",
"parentIds" : [ ],
"childIds" : [ ],
"platform" : "nifi",
"application" : "NiFi Flow"
}
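As a rough illustration of how value-based inference arrives at those types, here is a minimal Python sketch. It is not the code InferAvroSchema actually runs. The function name infer_type, the 32-bit cutoff between int and long, the handling of empty arrays, and the extra fields "succeeded" and "ratio" (added to cover boolean and double) are all assumptions made for the example.

```python
import json

# Assumption: integers that fit in 32 bits become "int", larger ones "long".
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def infer_type(value, name="record"):
    """Map a parsed JSON value to an Avro-style type (illustrative only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must precede the int check: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "int" if INT_MIN <= value <= INT_MAX else "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        # Assumption: use the first element's type; empty arrays get "null" items.
        items = infer_type(value[0], name) if value else "null"
        return {"type": "array", "items": items}
    if isinstance(value, dict):
        # Descend into nested objects, naming each nested record after its key.
        fields = [{"name": k, "type": infer_type(v, k)} for k, v in value.items()]
        return {"type": "record", "name": name, "fields": fields}
    raise TypeError(f"unsupported JSON value: {value!r}")

# Trimmed-down version of the provenance record, plus two hypothetical fields.
sample = json.loads("""
{
  "eventOrdinal": 0,
  "timestampMillis": 1496934288944,
  "timestamp": "2017-06-08T15:04:48.944Z",
  "succeeded": true,
  "ratio": 0.5,
  "updatedAttributes": {"path": "./"},
  "parentIds": []
}
""")

schema = infer_type(sample, "provenance")
print(json.dumps(schema, indent=2))
```

In this sketch the ISO-8601 "timestamp" value comes out as a plain string and "timestampMillis" as a long, because only the JSON value types are inspected.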
Note that the timestamps are longs as InferAvroSchema does not support
Avro logical types (such as timestamp, date, decimal). I'd like to see an
InferRecordSchema that is record-aware, supports time/date types, etc. I
wrote up a Jira a while back to cover it [1] but haven't gotten around to
implementing it yet.
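For reference, this is roughly what a logical-type annotation looks like in an Avro schema. The Avro specification defines timestamp-millis as an annotation on a long. The sketch below builds such a field by hand in Python and decodes the sample value. Whether a future InferRecordSchema would emit exactly this is an assumption on my part, and the variable names are only illustrative.

```python
import json
from datetime import datetime, timedelta, timezone

# A long field annotated with the Avro logical type "timestamp-millis".
# Written by hand here; InferAvroSchema itself would emit a plain "long".
timestamp_field = {
    "name": "timestampMillis",
    "type": {"type": "long", "logicalType": "timestamp-millis"},
}
print(json.dumps(timestamp_field, indent=2))

# The underlying value is still milliseconds since the Unix epoch (UTC).
millis = 1496934288944
# Integer arithmetic avoids any floating-point rounding of the milliseconds.
decoded = datetime.fromtimestamp(millis // 1000, tz=timezone.utc) + timedelta(
    milliseconds=millis % 1000
)
print(decoded.isoformat(timespec="milliseconds"))  # 2017-06-08T15:04:48.944+00:00
```

The decoded value matches the "timestamp" string in the example record, which is the kind of information a logical-type-aware reader could recover from the long.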
Regards,
Matt
[1] https://issues.apache.org/jira/browse/NIFI-4109
On Mon, Aug 13, 2018 at 11:02 AM Kuhfahl, Bob wrote:
> Trying to develop a sample input file of json data to feed into
> InferAvroSchema so I can feed that into PutDatabaseRecord.
>
> Need a hello world example ☺
>
> But, to get started, I’d be happy to get InferAvroSchema working. I’m
> “trial and error”-ing the input file hoping to get lucky, but..
>
> No log messages, flow of json data is going to failure, I’m reading the
> code for InferAvroSchema()
>
> But it just calls JsonUtil.inferSchema(), so I’ll keep digging down the
> path but… if someone has a sample input that demonstrates how it’s supposed
> to work, I’d be grateful!