Peter,

Since each of your statements ends with a semicolon, I would think you could use SplitText with Enable Multiline Mode and a delimiter of ';' to get flowfiles containing a single statement apiece, then route those to a single PutHiveQL. Not sure what the exact regex would look like but on its face it looks possible :)
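To illustrate the splitting idea outside of NiFi, here is a minimal Python sketch (not a NiFi configuration and not a tested regex). It splits a script on ';' while leaving semicolons inside quotes or backticks alone, and keeps the statements in their original order. The function name `split_statements`, the table name `db.tbl_x`, the column `c1`, and the path `/tmp/x` are made up for the example. It does not handle backslash-escaped quotes or `--` comments.

```python
from typing import List


def split_statements(script: str) -> List[str]:
    """Split a multi-statement script on ';', ignoring ';' inside quotes."""
    statements = []
    current = []
    quote = None  # the quote character we are currently inside, if any
    for ch in script:
        if quote:
            if ch == quote:
                quote = None  # closing quote
        elif ch in ("'", '"', "`"):
            quote = ch  # opening quote
        if ch == ";" and quote is None:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()  # trailing statement without a ';'
    if tail:
        statements.append(tail)
    return statements


# Hypothetical stand-in for the script in the original message.
script = """
DROP TABLE `db.tbl_x`;
CREATE TABLE `db.tbl_x` (c1 STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\\001'
STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/x' INTO TABLE `db.tbl_x`;
INSERT INTO `prod_db.tbl`
SELECT * FROM `db.tbl_x`;
DROP TABLE `db.tbl_x`;
"""

for number, statement in enumerate(split_statements(script), start=1):
    print(number, statement.splitlines()[0])
```

Each element of the returned list corresponds to what one flowfile would carry after the split. Since the statements are order dependent, whatever does the splitting in the flow would also need to preserve that order on the way to PutHiveQL.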
Regards,
Matt

On Fri, Sep 23, 2016 at 8:14 AM, Peter Wicks (pwicks) <[email protected]> wrote:
> I have a PutHDFS processor drop a file, I then have a long chain of
> ReplaceText -> PutHiveQL processors that runs a series of steps.
>
> The below ~4 steps allow me to take the file generated by NiFi in one format
> and move it into the final table, which is ORC with several Timestamp
> columns (thus why I’m not using AvroToORC, since I’d lose my Timestamps).
>
> The exact HQL, all in one block, is roughly:
>
> DROP TABLE `db.tbl_${filename}`;
>
> CREATE TABLE `db.tbl_${filename}`(
> Some list of columns goes here that exactly matches the schema of
> `prod_db.tbl`
> )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\001'
> STORED AS TEXTFILE;
>
> LOAD DATA INPATH '${absolute.hdfs.path}/${filename}' INTO TABLE
> `db.tbl_${filename}`;
>
> INSERT INTO `prod_db.tbl`
> SELECT * FROM `db.tbl_${filename}`;
>
> DROP TABLE `db.tbl_${filename}`;
>
> Right now I’m having to split this into 5 separate ReplaceText steps, each
> one followed by a PutHiveQL. Is there a way I can push a multi-statement,
> order dependent, script like this to Hive in a simpler way?
>
> Thanks,
> Peter
