Great to hear, Marcio!

On Thu, Oct 13, 2016 at 9:26 PM Márcio Faria <faria.mar...@ymail.com> wrote:

> Jeff,
>
> Many thanks. I'm now more confident NiFi could be a good fit for us.
>
> Marcio
>
>
> On Wednesday, October 12, 2016 9:06 PM, Jeff <jtsw...@gmail.com> wrote:
>
>
> Hello Marcio,
>
> You're asking on the right list!
>
> Based on the scenario you described, I think NiFi would suit your needs.
> To address the 3 major steps of your workflow:
>
> 1) Processors can run on a timer-based or cron-based schedule.
> GenerateTableFetch is a processor that creates SQL SELECT statements for
> a table based on increasing values in one or more columns, and the
> statements can be partitioned depending on your batching needs.  These
> SELECT statements are typically executed against the source database with
> ExecuteSQL, and the fetched rows can then be written to the destination
> database with PutSQL, which runs INSERT or UPDATE statements.
>
> 2) With the more recent data, which I'm assuming is queried from the
> destination database, you can use QueryDatabaseTable to retrieve the new
> rows in Avro format and then transform as needed, which may include
> processors that encapsulate any custom logic you might have written for
> your homemade ETL solution.
>
> 3) The PostHTTP processor can be used to send files over HTTPS to the
> external server.
>
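
To make the paging idea in step 1 concrete, here is a minimal sketch in plain Python. It is not NiFi code and not how GenerateTableFetch is implemented; it only illustrates remembering the largest value seen in an increasing column and building one SELECT per page. The function names, the LIMIT/OFFSET paging, the integer column, and the use of sqlite3 are all assumptions made for the illustration.

```python
import sqlite3


def generate_page_queries(table, column, last_max, new_max, row_count, page_size):
    """Build one SELECT per page for rows with last_max < column <= new_max.

    Assumes `table` and `column` are trusted identifiers and that the
    column holds integers (hypothetical simplification).
    """
    queries = []
    for offset in range(0, row_count, page_size):
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE {column} > {last_max} AND {column} <= {new_max} "
            f"ORDER BY {column} LIMIT {page_size} OFFSET {offset}"
        )
    return queries


def fetch_new_rows(conn, table, column, last_max, page_size):
    """Return (rows, new_max) for rows added since last_max, read page by page."""
    new_max, row_count = conn.execute(
        f"SELECT MAX({column}), COUNT(*) FROM {table} WHERE {column} > ?",
        (last_max,),
    ).fetchone()
    if row_count == 0:
        # Nothing new: keep the previous high-water mark.
        return [], last_max
    rows = []
    for query in generate_page_queries(
        table, column, last_max, new_max, row_count, page_size
    ):
        rows.extend(conn.execute(query).fetchall())
    return rows, new_max
```

Each scheduled run would pass in the maximum value returned by the previous run, so only fresh rows are fetched.
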
> Processors have failure relationships that receive a flow file when
> processing for it fails, and those flow files can be routed as
> appropriate, for example back to the processor for a retry.  For errors
> that require human intervention, there are a number of
> options.  Most likely, the way your homemade solution currently handles
> errors that require human intervention can be done by NiFi as well.
>
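
The retry-then-escalate routing described above can be sketched outside NiFi as follows. This is only an illustration in plain Python, assuming a hypothetical `process` callable and a hypothetical `max_attempts` limit; in NiFi the same effect comes from how the failure relationship is wired in the flow, not from code like this.

```python
from collections import deque


def run_with_retries(items, process, max_attempts=3):
    """Route each item to 'success', or to 'manual_review' once it has
    failed max_attempts times (a stand-in for a failure relationship
    that loops back for a retry and eventually needs a human)."""
    queue = deque((item, 0) for item in items)
    success, manual_review = [], []
    while queue:
        item, attempts = queue.popleft()
        try:
            success.append(process(item))
        except Exception as error:
            if attempts + 1 < max_attempts:
                # Loop the item back for another try.
                queue.append((item, attempts + 1))
            else:
                # Give up and park the item with the reason it failed.
                manual_review.append((item, str(error)))
    return success, manual_review
```

Items parked in `manual_review` keep the error message, so whoever fixes them can see why they were rejected.
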
> Personally, I have used NiFi in similar ways to what you have described.
> There are some examples on the Apache NiFi site [1] that you can check
> out.  Stopping and restarting processing when errors occur, as you asked
> about, is possible, though much of that depends on how you design your
> flow.
>
> Feel free to ask any questions!  Much of the information above is fairly
> high-level, and NiFi offers a lot of processors to meet your data flow
> needs.
>
> - Jeff
>
> On Tue, Oct 11, 2016 at 5:18 PM Márcio Faria <faria.mar...@ymail.com>
> wrote:
>
> Hi,
>
> Potential NiFi user here.
>
> I'm trying to figure out if NiFi could be a good choice to replace our
> existing homemade ETL system, which roughly works like this:
>
> 1) Either on demand or at periodic intervals, fetch fresh rows from one or
> more tables in the source database and insert or update them into the
> destination database;
>
> 2) Run the jobs which depend on the more recent data, and generate files
> based on those;
>
> 3) Upload the generated files to an external server using HTTPS.
>
> Since our use cases are more of a "pull" style (Ex: It's time to run the
> report -> get the required data updated -> run the processing job and
> submit the results) than "push" (Ex: Get the latest data available -> when
> some condition is met, run the processing job and submit the results), I'm
> wondering if NiFi, or any other flow-based toolset for that matter, would
> be a good option for us to try or not. Your opinion? Suggestions?
>
> Besides, what is the recommended way to handle errors in an ETL scenario
> like that? For example, we submit a "page" of rows to a remote server and
> its response tells us which of those rows were accepted and which ones had
> a validation error. What would be the recommended approach to handle such
> errors if the fix requires some human intervention? Is there a way of
> stopping the whole flow until the correction is done? How to restart it
> when part of the data were already processed by some of the processors? The
> server won't accept a transaction B if it depends on a transaction A that
> wasn't successfully submitted before.
>
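
One way to picture the "stop until it is fixed, then resume" requirement is the sketch below. It is plain Python rather than a NiFi flow, and `submit` is a hypothetical callable standing in for the remote server: it takes a page of rows and returns the rows that were rejected. Pages are sent strictly in order, so a later page is never sent while an earlier one still has rejected rows.

```python
def submit_in_order(pages, submit, start=0):
    """Submit pages sequentially and stop at the first page with rejections.

    Returns (next_index, rejected_rows). When every page was accepted,
    next_index equals len(pages) and rejected_rows is empty.
    """
    for index in range(start, len(pages)):
        rejected = submit(pages[index])
        if rejected:
            # Halt here: later pages may depend on this one.
            return index, rejected
    return len(pages), []
```

After the rejected rows are corrected by hand, the caller would replace the page at `next_index` with just the corrected rows and call the function again with `start=next_index`, so pages that were already accepted are not resent.
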
> As you see, our processing is very batch-oriented. I know NiFi can fetch
> data in chunks from a relational database, but I'm not sure how to approach
> the conversion from our current style to a more "stream"-oriented one. I'm
> afraid I could try to use the "right tool for the wrong problem", if you
> know what I mean.
>
> Apologies if this is not the proper venue to ask. I checked all the posts
> in this mailing list and also tried to search for information elsewhere,
> but I wasn't able to find the answers myself.
>
> Any guidance, like examples or links to further reading, would be very
> much appreciated. I'm just starting to learn the ropes.
>
> Thank you,
> Marcio
>
>
>
>
