One of the easiest ways to trigger a flow in NiFi is to set up a message-queue consumer processor listening on a queue, and then post an event to that queue whenever you want the flow to run.
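As a rough illustration, here is a minimal sketch of the publishing side, assuming the flow starts with a ConsumeKafka processor subscribed to a topic; the broker address, topic name, and payload below are made-up examples, and any queueing system NiFi has a consumer processor for (JMS, AMQP, etc.) would work the same way:

```python
# Sketch only: publish a trigger event that a ConsumeKafka processor
# (subscribed to the same topic) would pick up to start the flow.
# Broker address, topic name, and payload are illustrative assumptions.
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(bootstrap_servers="broker-host:9092")

# The message body can be anything the flow knows how to interpret;
# here it is just an opaque trigger marker.
producer.send("nifi-triggers", b"run-ingest")
producer.flush()
producer.close()
```

On the NiFi side the arriving message becomes a FlowFile, so the rest of the flow runs exactly as if it had been started on a schedule.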
On Tue, Aug 13, 2019 at 11:45 AM Bimal Mehta <bimal...@gmail.com> wrote:

> Thanks Mike.
> ExecuteSQL looks good and I am trying it.
>
> I also wanted to understand how we can control triggering the NiFi jobs
> from DevOps tools like CloudBees/ElectricFlow.
>
> On Tue, Aug 13, 2019 at 7:35 AM Mike Thomsen <mikerthom...@gmail.com>
> wrote:
>
>> Bimal,
>>
>> 1. Take a look at ExecuteSQLRecord and see if that works for you. I don't
>> use SQL databases that much, but it works like a charm for me and others
>> for querying and getting an inferred Avro schema based on the schema of the
>> database table (you can massage it into another format with ConvertRecord).
>> 2. Take a look at QueryRecord and PartitionRecord, configured to
>> use Avro readers and writers.
>>
>> Mike
>>
>> On Tue, Aug 13, 2019 at 12:25 AM Bimal Mehta <bimal...@gmail.com> wrote:
>>
>>> Hi NiFi users,
>>>
>>> We had been using the Kylo data ingest template to read data from
>>> our Oracle and DB2 databases and move it into HDFS and Hive.
>>> The Kylo data ingest template also provided features to validate,
>>> profile, and split the data based on validation rules. We also built some
>>> custom processors and added them to the template.
>>> We recently migrated to NiFi 1.9.0 (CDF), and a lot of Kylo processors
>>> don't work there. We were able to make our custom processors work in 1.9.0,
>>> but the Kylo NAR files don't. I don't know if any workaround exists
>>> for that.
>>>
>>> However, given that the Kylo project is dead, I don't want to depend on
>>> those Kylo NAR files and processors. What I wanted to understand is how I
>>> can replicate that functionality using the standard processors available in
>>> NiFi.
>>>
>>> Essentially, are there processors that allow me to do the following?
>>> 1. Read data from a database - I know QueryDatabaseTable. Any others? How
>>> do I make it parameterized so that I don't need to create one flow per
>>> table? How can we pass the table name while running the job?
>>> 2. Partition and convert to Avro - I know SplitAvro, but does it
>>> partition as well, and how do I pass the partition parameters?
>>> 3. Write data to HDFS and Hive - I know PutHDFS works for writing to
>>> HDFS, but should I use PutSQL for Hive by converting the Avro from step 2 to
>>> SQL? Or is there a better option? Does this support upserts as well?
>>> 4. Apply validation rules to the data before it is written into Hive,
>>> for example by calling a custom Spark job that executes the validation rules
>>> and splits the data. Is there a processor that can help achieve this?
>>>
>>> I know a few users in this group have used Kylo on top of NiFi. It would
>>> be great if some of you could share your perspective as well.
>>>
>>> Thanks in advance.
>>>
>>> Bimal Mehta
>>>
>>
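For the question in the thread about passing the table name while running the job, one possible approach (a sketch, not something the thread settled on) is to carry the table name in the trigger message itself. Assuming the flow consumes the message, extracts the value into a FlowFile attribute (for example with EvaluateJsonPath), and references it from ExecuteSQLRecord's query via Expression Language (e.g. `SELECT * FROM ${table.name}`), the publishing side from a CI/CD or scheduler step might look like the following; the topic, broker, and table names are all hypothetical:

```python
# Sketch only: trigger message that carries the table to ingest, so one
# parameterized flow can serve many tables. All names are illustrative.
import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="broker-host:9092",
    value_serializer=lambda d: json.dumps(d).encode("utf-8"),
)

# Each pipeline run publishes the table it wants loaded; the flow turns
# "table" into a FlowFile attribute and builds its SQL query from it.
producer.send("nifi-triggers", {"table": "MYSCHEMA.CUSTOMERS"})
producer.flush()
producer.close()
```

Note that QueryDatabaseTable does not accept incoming FlowFiles, which is why this attribute-driven pattern pairs with ExecuteSQL/ExecuteSQLRecord rather than QueryDatabaseTable.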