Use case I'm attempting:

1.) ingest a CSV file with header lines;
2.) remove header lines (i.e. remove N lines at head);
3.) SQL INSERT each remaining line as a row in an existing MySQL table.
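
To make the header-removal step concrete, here is a minimal Python sketch of what I mean, outside of NiFi. The function name drop_header_lines and the sample data are made up for illustration; it only assumes the headers occupy the first N physical lines of the file.

```python
import csv
import io
from itertools import islice


def drop_header_lines(text, n):
    """Parse CSV text and return its rows, skipping the first n lines."""
    lines = io.StringIO(text)
    # islice(lines, n, None) yields every line after the first n.
    return list(csv.reader(islice(lines, n, None)))


if __name__ == "__main__":
    # Hypothetical input: two header lines followed by two data lines.
    sample = "nightly export\nid,name\n1,alice\n2,bob\n"
    for row in drop_header_lines(sample, 2):
        print(row)
```

Each returned row is a list of strings, one per column, which is the shape the INSERT step would need.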

My thinking so far:

#1 is a given (the CSV is already fetched);
#2 is simple and should be handled with ExecuteStreamCommand;

#3 is where I'm scratching my head: I keep re-reading the Description field for
the PutSQL processor in http://nifi.apache.org/docs.html but can't work out
what I need to do to turn a flowfile made up of lines of comma-separated text
into a series of INSERT statements:

"Executes a SQL UPDATE or INSERT command. The content of an incoming
FlowFile is expected to be the SQL command to execute. The SQL command
may use the ? to escape parameters. In this case, the parameters to
use must exist as FlowFile attributes with the naming convention
sql.args.N.type and sql.args.N.value, where N is a positive integer.
The sql.args.N.type is expected to be a number indicating the JDBC
Type."
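
As far as I can read that description, each CSV row would need to become one flowfile whose content is a parameterized INSERT and whose attributes carry the values. Below is a minimal Python sketch of that shape only; it is not NiFi code. The function name row_to_putsql and the table and column names are hypothetical, and the type codes are the java.sql.Types constants (4 for INTEGER, 12 for VARCHAR).

```python
JDBC_INTEGER = 4   # java.sql.Types.INTEGER
JDBC_VARCHAR = 12  # java.sql.Types.VARCHAR


def row_to_putsql(table, columns, jdbc_types, row):
    """Build the SQL text and sql.args.N.* attributes for one CSV row."""
    placeholders = ", ".join("?" for _ in columns)
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders
    )
    attributes = {}
    # N starts at 1, matching the "positive integer" in the description.
    for n, (jdbc_type, value) in enumerate(zip(jdbc_types, row), start=1):
        attributes["sql.args.{}.type".format(n)] = str(jdbc_type)
        attributes["sql.args.{}.value".format(n)] = value
    return sql, attributes


if __name__ == "__main__":
    # Hypothetical table "people" with columns id (INTEGER), name (VARCHAR).
    sql, attrs = row_to_putsql(
        "people", ["id", "name"], [JDBC_INTEGER, JDBC_VARCHAR], ["1", "alice"]
    )
    print(sql)
    for key in sorted(attrs):
        print(key, "=", attrs[key])
```

What I still can't see is which processors would produce that content and those attributes inside a flow.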

Of related interest: there seems to be only one CSV-relevant processor type in
v0.3.0, ConvertCSVToAvro; I fear I will have to do something like this:

ConvertCSVToAvro->ConvertAvroToJSON->ConvertJSONToSQL->PutSQL

Guidance, suggestions? Thanks!

Russell

-- 
Russell Whitaker
http://twitter.com/OrthoNormalRuss
http://www.linkedin.com/pub/russell-whitaker/0/b86/329
