Peter,

Since each of your statements ends with a semicolon, I would think you
could use SplitText with Enable Multiline Mode and a delimiter of ';'
to get flowfiles containing a single statement apiece, then route
those to a single PutHiveQL. Not sure what the exact regex would look
like but on its face it looks possible :)
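
To illustrate the idea outside NiFi, here is a minimal Python sketch of
splitting a multi-statement script on ';' while keeping the statements in
order. It is not NiFi code: `split_statements` and the table name `tbl_x`
are hypothetical, and the split is deliberately naive, so a ';' inside a
quoted string or a comment would also be treated as a delimiter.

```python
import re
from typing import List


def split_statements(script: str) -> List[str]:
    """Split an HQL script on ';' and return the non-empty statements in order.

    Assumption: no ';' appears inside string literals or comments.
    """
    parts = re.split(r";\s*", script)
    return [p.strip() for p in parts if p.strip()]


# Hypothetical script shaped like the one in this thread.
script = """
DROP TABLE `db.tbl_x`;
CREATE TABLE `db.tbl_x`(c1 STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\\001'
STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/x' INTO TABLE `db.tbl_x`;
INSERT INTO `prod_db.tbl`
SELECT * FROM `db.tbl_x`;
DROP TABLE `db.tbl_x`;
"""

statements = split_statements(script)
for number, statement in enumerate(statements, 1):
    # Print only the first line of each statement to keep the output short.
    print(number, statement.splitlines()[0])
```

Since the statements are order dependent, whatever does the splitting in
the flow would also need to hand them to PutHiveQL in their original order.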

Regards,
Matt

On Fri, Sep 23, 2016 at 8:14 AM, Peter Wicks (pwicks) <[email protected]> wrote:
> I have a PutHDFS processor drop a file, and then a long chain of
> ReplaceText -> PutHiveQL processors that runs a series of steps.
>
> The ~4 steps below let me take the file generated by NiFi in one format
> and move it into the final table, which is ORC with several Timestamp
> columns (which is why I’m not using AvroToORC, since I’d lose my Timestamps).
>
>
>
> The exact HQL, all in one block, is roughly:
>
>
>
> DROP TABLE `db.tbl_${filename}`;
>
> CREATE TABLE `db.tbl_${filename}`(
>    Some list of columns goes here that exactly matches the schema of
> `prod_db.tbl`
> )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\001'
> STORED AS TEXTFILE;
>
> LOAD DATA INPATH '${absolute.hdfs.path}/${filename}' INTO TABLE `db.tbl_${filename}`;
>
> INSERT INTO `prod_db.tbl`
> SELECT * FROM `db.tbl_${filename}`;
>
> DROP TABLE `db.tbl_${filename}`;
>
>
>
> Right now I’m having to split this into 5 separate ReplaceText steps, each
> one followed by a PutHiveQL.  Is there a simpler way to push a
> multi-statement, order-dependent script like this to Hive?
>
>
>
> Thanks,
>
>   Peter
