Hi Joe,

Thank you for the tip: great!
I guess NiFi is still too new, because I did some extensive searching
on this subject and QueryDatabaseTable was never mentioned. This
processor does exactly what I expect and need!

One shortcoming, though; maybe I should enter a ticket for this.
Extracting data from an RDBMS usually involves complex queries with
joins, and these are not supported as far as I can see. We could extend
the processor with a configuration option to specify the full query,
which I believe is much more flexible than enumerating columns from a
specific table.

Paul

On Wed, Mar 30, 2016 at 5:19 PM, Joe Witt <[email protected]> wrote:
> Paul,
>
> In Apache NiFi 0.6.0 if you're looking for a change capture type
> mechanism to source from relational databases take a look at
> QueryDatabaseTable [1].
>
> That processor is new and any feedback and or contribs for it would be
> awesome.
>
> ExecuteSQL does have some time driven use cases to capture snapshots
> and such but you're right that it doesn't sound like a good fit for
> your case.
>
> [1]
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.QueryDatabaseTable/index.html
>
> On Wed, Mar 30, 2016 at 9:11 AM, Paul Bormans <[email protected]> wrote:
> > I'm evaluating Apache Nifi as data ingestion tool to load data from
> > an RDBMS into S3. A first test shows odd behavior where the same
> > rows are written to the flowfile over and over again while i
> > expected that only new rows are written.
> >
> > In fact i was missing configuration options to specify what column
> > could be used to query only for new rows.
> >
> > Taking a look at the processor implementation makes me believe that
> > the only option is to define a query including OFFSET n LIMIT m
> > where "n" is dynamically set based upon previous onTriggers; would
> > this even be possible?
> >
> > Some setup info:
> > nifi: 0.6.0
> > backend: postgresql
> > driver: postgresql-9.4.1208.jre6.jar
> > query: select * from addresses
> >
> > More in general i don't see a use-case where the current ExecuteSQL
> > processor fits as a processor (without input flowfile). Someone can
> > explain?
> >
> > Paul
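
P.S. To make the "full query" idea from my reply concrete, here is a
rough sketch of what I have in mind. It is not how QueryDatabaseTable is
implemented; it only illustrates wrapping an arbitrary query (joins
included) and filtering on a tracking column so that each run returns
only new rows. It assumes the tracking column only ever increases. The
function name fetch_new_rows, the customers table, and the column names
are made up for illustration, and it uses an in-memory SQLite database
instead of PostgreSQL so that it runs on its own.

```python
import sqlite3


def fetch_new_rows(conn, base_query, tracking_column, last_seen):
    """Run base_query and return only rows whose tracking column exceeds
    last_seen, together with the new high-water mark."""
    # Wrap the full query as a subquery so joins work unchanged.
    # NOTE: base_query and tracking_column are interpolated as-is, so they
    # must come from trusted configuration, never from user input.
    sql = (
        f"SELECT * FROM ({base_query}) AS q "
        f"WHERE q.{tracking_column} > ? "
        f"ORDER BY q.{tracking_column}"
    )
    cur = conn.execute(sql, (last_seen,))
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    # Keep the old mark when nothing new arrived.
    new_mark = rows[-1][tracking_column] if rows else last_seen
    return rows, new_mark


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE addresses (
            id INTEGER PRIMARY KEY, customer_id INTEGER, street TEXT);
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO addresses VALUES (1, 1, 'Main St'), (2, 2, 'High St');
        """
    )
    # Hypothetical "full query" with a join, as proposed above.
    query = (
        "SELECT a.id AS id, a.street AS street, c.name AS name "
        "FROM addresses a JOIN customers c ON c.id = a.customer_id"
    )
    first, mark = fetch_new_rows(conn, query, "id", 0)
    # A row added between runs is the only one returned by the second run.
    conn.execute("INSERT INTO addresses VALUES (3, 1, 'Side St')")
    second, mark = fetch_new_rows(conn, query, "id", mark)
    return first, second, mark
```

Calling demo() runs the query twice; the second run should only see the
address inserted in between, which is the behavior I was missing with
ExecuteSQL.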
