Ashwin,

I recommend PutDatabaseRecord for batch updates/inserts inside NiFi (i.e., if you don't have a bulk loader program available). If you use something like QueryDatabaseTable, GenerateTableFetch, and/or ExecuteSQL to get your data from the source table, it will be in Avro format (with an embedded schema), which PutDatabaseRecord can read directly with an AvroReader. If you are looking to replicate the source table onto the target DB, you can use PutDatabaseRecord with a Statement Type of INSERT.

If you are trying to do incremental replication (i.e., fetch only "new" rows every so often) and you have a column that is always increasing (such as an ID, or a timestamp like "creation date"), you can use the same technique: point QueryDatabaseTable (or GenerateTableFetch) at that column via its Maximum-value Columns property, so it only fetches rows it hasn't seen before. Since you won't get duplicate rows from the source, you avoid issuing an INSERT for a row that already exists on the target.
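As a concrete illustration, here is a minimal sketch of that incremental flow. The table, column, and controller-service names below are placeholders for this example:

    QueryDatabaseTable
        Database Connection Pooling Service: OracleConnectionPool   <- placeholder
        Table Name: SOURCE_TABLE
        Maximum-value Columns: ID
            |
            | success (Avro with embedded schema)
            v
    PutDatabaseRecord
        Record Reader: AvroReader (Schema Access Strategy: Use Embedded Avro Schema)
        Database Connection Pooling Service: VerticaConnectionPool  <- placeholder
        Statement Type: INSERT
        Table Name: TARGET_TABLE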
If you are trying to do periodic replication (i.e., full table copies every so often), you might be better off truncating the target table first and then using the same technique (the first sketch below shows one way to schedule the truncate). Basically, I am saying you want to avoid an UPSERT situation, since there is no UPSERT statement type in PutDatabaseRecord.

That said, UPSERTs are sometimes approximated by setting the Statement Type property in PutDatabaseRecord to "Use statement.type Attribute", setting the statement.type attribute to "insert" upstream, then routing the failure relationship to an UpdateAttribute processor that changes statement.type to "update" and sends the flow file back into PutDatabaseRecord (second sketch below).

For completeness, if none of these work and you want to issue SQL statements to the target database yourself, you can use any number of conversion processors: ReplaceText -> PutSQL is a non-record-aware option, and ConvertRecord -> PutDatabaseRecord is a record-aware option (third sketch below).

I am working on a blog post that describes how to take raw data, create SQL from it, and execute that SQL, all using record-aware processors. Before the record-aware processors existed, you had to split the data into individual rows/lines, convert each to JSON, convert the JSON to SQL, then PutSQL a batch at a time. With the record-aware processors, you can handle your dataset as a single entity instead of splitting it up and applying the exact same transformation(s) to each piece.
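For the periodic-replication case, one way to schedule the truncate is a GenerateFlowFile whose content is the TRUNCATE statement, executed by PutSQL (which runs the flow file content as SQL). A minimal sketch; the table and connection-pool names are placeholders, and sequencing the truncate ahead of the reload is left to your flow design (e.g., with Wait/Notify):

    GenerateFlowFile  (Run Schedule: e.g. once per refresh period)
        Custom Text: TRUNCATE TABLE TARGET_TABLE
            |
            v
    PutSQL
        JDBC Connection Pool: VerticaConnectionPool  <- placeholder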
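Here is the failure-routing UPSERT workaround sketched out. One caveat: the failure relationship fires for any error, not just key violations, so a row can end up retried as an UPDATE for unrelated reasons:

    UpdateAttribute  (sets statement.type = insert)
            |
            v
    PutDatabaseRecord  (Statement Type: Use statement.type Attribute)
            |
            | failure
            v
    UpdateAttribute  (sets statement.type = update)
            |
            v
    (route back into PutDatabaseRecord)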
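Finally, a sketch of the non-record-aware option, assuming each incoming flow file is a single JSON record and that EvaluateJsonPath has already extracted the column values into attributes named id and name (all placeholders here). Note that building SQL by string substitution like this glosses over quoting/escaping; ConvertJSONToSQL -> PutSQL is the safer pre-record-aware route, since it generates parameterized statements:

    EvaluateJsonPath  (Destination: flowfile-attribute; id = $.id, name = $.name)
            |
            v
    ReplaceText
        Replacement Strategy: Always Replace
        Evaluation Mode: Entire text
        Replacement Value: INSERT INTO TARGET_TABLE (ID, NAME) VALUES (${id}, '${name}')
            |
            v
    PutSQL
        JDBC Connection Pool: VerticaConnectionPool  <- placeholder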
Regards,
Matt

P.S. Since the question is about using NiFi processors (versus developing them), you might consider asking the users list instead ([email protected]). If you are not subscribed, you can send an email to [email protected]. Once you are subscribed, your emails will go directly to the list (instead of needing to be manually moderated).

On Wed, Nov 8, 2017 at 1:48 AM, ashwinb <[email protected]> wrote:
> Hi,
> Can someone let me know which processor to use to do update or insert data
> from an Oracle source table to a Vertica target table in batch mode?
>
> Thanks
> Ashwin