Russell, How big are these CSVs in terms of rows and columns?
If they aren't too big, another option could be to use SplitText + ReplaceText to split the CSV into a FlowFile per line, and then convert each line into SQL in ReplaceText. The downside is that this would create a lot of FlowFiles for very large CSVs.

-Bryan

On Mon, Oct 5, 2015 at 4:14 PM, Russell Whitaker <[email protected]> wrote:
> Use case I'm attempting:
>
> 1.) ingest a CSV file with header lines;
> 2.) remove the header lines (i.e. remove N lines at the head);
> 3.) SQL INSERT each remaining line as a row in an existing MySQL table.
>
> My thinking so far:
>
> #1 is given (the CSV is fetched already);
> #2 is simple, and should be handled in the context of ExecuteStreamCommand;
> #3 is where I'm scratching my head: I keep re-reading the Description
> field for the PutSQL processor at http://nifi.apache.org/docs.html but
> can't seem to work out what I need to do to turn a flowfile comprising
> lines of comma-separated text into a series of INSERT statements:
>
> "Executes a SQL UPDATE or INSERT command. The content of an incoming
> FlowFile is expected to be the SQL command to execute. The SQL command
> may use the ? to escape parameters. In this case, the parameters to
> use must exist as FlowFile attributes with the naming convention
> sql.args.N.type and sql.args.N.value, where N is a positive integer.
> The sql.args.N.type is expected to be a number indicating the JDBC
> Type."
>
> Of related interest: there seems to be only one CSV-relevant processor
> type in v0.3.0, ConvertCSVToAvro; I fear the need to do something like
> this:
>
> ConvertCSVToAvro -> ConvertAvroToJSON -> ConvertJSONToSQL -> PutSQL
>
> Guidance, suggestions? Thanks!
>
> Russell
>
> --
> Russell Whitaker
> http://twitter.com/OrthoNormalRuss
> http://www.linkedin.com/pub/russell-whitaker/0/b86/329
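To make the PutSQL convention quoted above concrete, here is a minimal Python sketch of what each per-line FlowFile would need to look like: the content is a parameterized INSERT, and the values travel as sql.args.N.type / sql.args.N.value attributes. The table name, column names, and the helper function itself are hypothetical placeholders, not anything NiFi provides; in a real flow this shaping would be done by ReplaceText (or a script in ExecuteStreamCommand) rather than by this function.

```python
import csv
import io

# 12 is java.sql.Types.VARCHAR; PutSQL expects the numeric JDBC type code.
JDBC_VARCHAR = "12"

def line_to_putsql(csv_line, table="mytable", columns=("name", "age")):
    """Turn one CSV line into the (content, attributes) pair PutSQL expects.

    `table` and `columns` are made-up examples for illustration.
    Returns the SQL text (FlowFile content) and a dict of the
    sql.args.N.* FlowFile attributes carrying the parameter values.
    """
    values = next(csv.reader(io.StringIO(csv_line)))
    placeholders = ", ".join("?" for _ in values)
    content = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders)
    attributes = {}
    for i, value in enumerate(values, start=1):
        attributes["sql.args.{}.type".format(i)] = JDBC_VARCHAR
        attributes["sql.args.{}.value".format(i)] = value
    return content, attributes

content, attrs = line_to_putsql("alice,30")
print(content)  # INSERT INTO mytable (name, age) VALUES (?, ?)
print(attrs["sql.args.1.value"], attrs["sql.args.2.value"])  # alice 30
```

The point of the sketch is just the shape of the contract: PutSQL executes whatever SQL is in the FlowFile content, and each `?` is filled from the numbered attribute pair, so whichever upstream processor you choose only has to produce that content/attribute layout.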
