Thank you Matt for the suggestion, I'm currently using JsonTreeReader w/
PutDatabaseRecord. I wasn't sure if there was a way to define the desired
content with this Record Reader.
And the records I am currently working with are:
{
  "content": [
    {
      "owner": "blah",
      "share": false,
      "description": "Job URL: https:.....",
      "id": "5f6afdb03972e5000164886c",
      "name": "NAME",
      "number": 38,
      "start_time": 1600847279311,
      "end_time": 1600847284052,
      "status": "PASSED",
      "statistics": {
        "executions": {
          "total": "142",
          "passed": "142",
          "failed": "0",
          "skipped": "0"
        },
        "defects": {
          "product_bug": {
            "total": 0,
            "PB001": 0
          },
          "automation_bug": {
            "AB001": 0,
            "total": 0
          },
          "system_issue": {
            "total": 0,
            "SI001": 0
          },
          "to_investigate": {
            "total": 0,
            "TI001": 0
          },
          "no_defect": {
            "ND001": 0,
            "total": 0
          }
        }
      },
      "tags": ["tag1", "tag2", "tag3"],
      ...
  ]
}
And typically in python I would do
pd.json_normalize(results.json()['content']), then use
json_normalize() again on the statistics and defects columns with
the sep='_' option.
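For reference, here is a minimal sketch of that flattening against a
trimmed copy of one record from the sample above (the record is
abbreviated, but the field names match the sample):

```python
import pandas as pd

# Trimmed copy of one record from the sample JSON, wrapped the way
# the endpoint returns it.
results = {
    "content": [
        {
            "name": "NAME",
            "number": 38,
            "status": "PASSED",
            "statistics": {
                "executions": {"total": "142", "failed": "0"},
                "defects": {"product_bug": {"total": 0, "PB001": 0}},
            },
        }
    ]
}

# json_normalize flattens nested dicts in one pass; sep="_" joins the
# key path with underscores, e.g. "statistics_executions_total".
df = pd.json_normalize(results["content"], sep="_")
print(sorted(df.columns))
```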
Thanks!
Eric
On Tue, Sep 29, 2020 at 2:12 PM Matt Burgess <[email protected]> wrote:
> Eric,
>
> Depending on how large the JSON content is, you could use
> JoltTransformJSON to "hoist" the desired data to the top level. Given
> this example JSON:
>
> {
>   "json": {
>     "data": {
>       "value": 3
>     }
>   }
> }
>
> The spec would be:
>
> [
>   {
>     "operation": "shift",
>     "spec": {
>       "json": {
>         "data": {
>           "*": "&"
>         }
>       }
>     }
>   }
> ]
>
> It "walks down the tree" until it gets to what you want (in this case
> all fields under "data"), then outputs them at the top level.
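> The same "hoist" can be sketched in plain Python, purely as an
> illustration of what the shift spec produces (not how Jolt runs it):

```python
# Input document from the example above.
doc = {"json": {"data": {"value": 3}}}

# Walk down to "data" and promote its fields to the top level,
# mirroring the shift spec's "*": "&" wildcard.
hoisted = dict(doc["json"]["data"])
print(hoisted)  # {'value': 3}
```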
>
>
> For PutDatabaseRecord, there currently is no way to have it create the
> target table if it doesn't exist, I wrote up NIFI-7862 [1] to cover
> this improvement. For the "none of the fields map to columns" issue,
> make sure the table and field/column names match the correct
> upper/lower case. You may also need to set "Translate Field Names" to
> true so it is less strict about the matching.
>
> Regards,
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-7862
>
> On Tue, Sep 29, 2020 at 4:44 PM Eric Sammons <[email protected]> wrote:
> >
> > Hello,
> >
> > As the subject line states, I am new to NiFi. For my first flow I am
> > attempting to replace my python code with a NiFi workflow that extracts
> > data from a JSON endpoint and then writes the JSON to a Redshift
> > database.schema.table.
> >
> > I have set up my DBCP for Redshift and I have assigned it to my
> > PutDatabaseRecord processor; however, I am having (at least) two issues
> > so far.
> >
> > 1. How do I tell InvokeHTTP (or PutDatabaseRecord) to start reading the
> > JSON from a specific point? Most JSON responses are in the form of
> > result.json()['data'] or result.json()['content'].
> > 2. What is the best approach to having PutDatabaseRecord go to a
> > "Create Table" step when the table, and thus its columns, does not
> > exist? Is there a way to have PutDatabaseRecord create the table if it
> > doesn't exist?
> >
> > As a side note to #2, PutDatabaseRecord is reporting
> > "PutDatabaseRecord failed to process standardflowfilerecord.... due to None
> > of the fields in the record map to the columns defined by the schema.table
> > table." On failures the JSON contents are being written to a file, so I am
> > able to see evidence that InvokeHTTP is working and that PutDatabaseRecord
> > has at least looked at / for the schema.table.
> >
> > Thank you!
>