Thanks Matt and Boris for your comments/suggestions.

I came to the same conclusion as Boris yesterday and put together a Groovy script. After I merge all the records into one flowfile, the Groovy script iterates through and comes up with a complete unique set of fields.

I was hoping there was a solution using distributed cache or something similar, but I couldn't find a way to iterate through all keys/values in the cache without first knowing the names of the keys. Any thought's about that approach?


Thanks again,

Scott


On 03/12/2018 07:26 AM, Matt Burgess wrote:
Scott,

I'm assuming you're trying to create a single (possibly sparse) table
for all the incoming JSON records, and that you do not know the total
set of fields ahead of time. If you do know the fields ahead of time,
you can create the table ahead of time and use the JoltTransformJSON
processor to add default values for any missing fields in any records.
You may also be able to do this with ConvertRecord.

If you do not know the total set of fields ahead of time, how will you
know you have "enough" records in the flow whose fields fully define
the table? If you won't ever know for sure that your flow contains
records whose values fully define the table, you'd have to try to add
missing columns as they are encountered (see [1] for SQL Server
commands to do this).  That can get messy, you'd have to pull the
field names out into attributes and use PutSQL's SQL Statement
property (added in NiFi 1.5.0) with NiFi Expression Language to
execute the SQL command for adding a column if it doesn't exist.
Otherwise you'd have to use Wait/Notify or some other barrier stuff to
ensure the column was added before a record with that field is pushed
to the database.

Basically it's very difficult to do if you don't know all the possible
table columns up front. If you are consuming from a Web service, I
would hope their API at least documents all possible fields, and you
could create a table from that.

Regards,
Matt

[1] 
https://stackoverflow.com/questions/8870802/add-a-column-to-a-table-if-it-does-not-already-exist


On Sat, Mar 10, 2018 at 7:18 PM, scott <tcots8...@gmail.com> wrote:
Hello NiFi community,

I have a problem I've been working on that I was hoping to get some
suggestions for a solution. I'm ingesting a dump of json records from a web
service periodically. The json sometimes comes with some fields missing. My
goal is to insert the data into a SQLServer DB after dynamically creating
the table based on the fields in the json records. The problem is that when
I use the list of fields from a record that is missing some fields, I get
errors when I try to insert. How do I take all the fields from all records
in my dump and then create a unique list of all fields for me to create the
table with?


Thanks,

Scott


Reply via email to