Many thanks.

Mike Thomsen <[email protected]> wrote on Mon, Feb 24, 2020, 03:50:
> A wide variety of institutions use NiFi for common enterprise data
> processing, ETL and more. It is also very good at being plugged into
> oddball locations in enterprise systems for tasks that people might never
> even think of; one of the best examples I've seen was NiFi being used to
> act as a fuzzer for downstream systems that were being considered by a
> client as potential purchases.
>
> For consultants, I don't think that's a real issue. We've got a mostly
> junior data engineer workforce and they rarely need any sort of
> intervention by more experienced data engineers. If you anticipate that
> you'll encounter stiff resistance if you don't have an answer for where to
> hire expertise on day one, the best option I am aware of would be
> Cloudera's professional service team (I am not a Cloudera employee). They
> could also provide you with commercial case studies if you anticipate that
> need.
>
> Beyond that, I think we'd need to take this as a sidebar conversation
> because I think there are at least certain rules of decorum on ASF mailing
> lists that can be violated if we do vendor-related discussions on ASF
> lists.
>
> Hope that helps.
>
> On Sun, Feb 23, 2020 at 7:26 PM Martin Ebert <[email protected]> wrote:
>
> > Can you send me at least 3 links to verify your statement? This would be
> > really helpful.
> >
> > I see the potential of NiFi and would like to push it in management as
> > well. Therefore it is essential to have as many good reasons as possible
> > (besides my own experience).
> >
> > Who uses NiFi in concrete terms?
> > How high is the satisfaction?
> > Where can I find suitable consultants? And how many are freely available
> > on the market?
> > What are the success stories?
> > ...
> >
> > I often hear the accusation that NiFi is just another open source tool. I
> > cannot share this opinion.
> >
> > Mike Thomsen <[email protected]> wrote on Sun, Feb 23, 2020, 18:05:
> >
> > > Not with hard numbers, but when you look at job reqs and proposals it's
> > > ***everywhere***. I also can't remember the last time I saw a data
> > > engineering demo or discussion where NiFi or StreamSets wasn't the
> > > foundation.
> > >
> > > On Sun, Feb 23, 2020 at 4:21 PM Martin Ebert <[email protected]> wrote:
> > >
> > > > "NiFi is now emerging as the de facto standard for data engineering in
> > > > the government market in the US in part because properly hardening it is
> > > > closer to something a well-motivated intern can do than requiring a
> > > > "seasoned professional.""
> > > > Is there any way to prove this? Sounds interesting.
> > > >
> > > > Mike Thomsen <[email protected]> wrote on Sun, Feb 23, 2020, 17:08:
> > > >
> > > > > > I just made a few benchmarks with NiFi to compare it to another
> > > > > > solution.
> > > > >
> > > > > Raw performance is only one consideration when choosing an ETL or data
> > > > > orchestration tool. NiFi has some very critical competitive advantages,
> > > > > such as how aggressively it protects the contents of the data flow from
> > > > > external failure (e.g., someone killing the JVM doesn't corrupt hours of
> > > > > work) and how easy it is to very deeply harden** it on the security side
> > > > > of things. Plus, you have the fact that unlike many tools in this space,
> > > > > it's very agile in being able to stop a job at any time and inspect the
> > > > > inputs and outputs.
> > > > >
> > > > > ** NiFi is now emerging as the de facto standard for data engineering in
> > > > > the government market in the US in part because properly hardening it is
> > > > > closer to something a well-motivated intern can do than requiring a
> > > > > "seasoned professional."
> > > > >
> > > > > On Sun, Feb 23, 2020 at 3:36 PM Marc Pellmann <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I am interested in some insight into timer driven vs. event driven
> > > > > > scheduling and the future plans for event driven.
> > > > > >
> > > > > > I just made a few benchmarks with NiFi to compare it to another
> > > > > > solution.
> > > > > >
> > > > > > The flows primarily consist of synchronous Web Service/REST-like calls,
> > > > > > so I use HandleHttpRequest/HandleHttpResponse. In the concrete example I
> > > > > > just have two processors in between - a ReplaceText and a TransformXml.
> > > > > >
> > > > > > From the client side I use JMeter to generate the load (just POST calls
> > > > > > with a few bytes of content).
> > > > > >
> > > > > > First I tested this with the standard values, which means the timer
> > > > > > driven scheduling strategy and 1 task.
> > > > > >
> > > > > > The numbers from these tests were not very impressive, so I played with
> > > > > > the configuration and set the scheduling strategy to event driven (with
> > > > > > task value 0 and a maximum event driven thread count of 1). This could
> > > > > > only be done for the two processors in between and not for
> > > > > > HandleHttpRequest/HandleHttpResponse, since they do not allow such a
> > > > > > configuration.
> > > > > >
> > > > > > This increased the throughput by a factor of 6.
> > > > > >
> > > > > > I also tried to increase the throughput with some other configurations,
> > > > > > such as more tasks or different run durations, but this did not change
> > > > > > the values significantly.
> > > > > >
> > > > > > So at least for this type of scenario, the event driven configuration is
> > > > > > much better. But on the other hand it is still experimental, and
> > > > > > according to some posts it is not seen as a good option and sounds more
> > > > > > like something that might be removed.
> > > > > >
> > > > > > Why is this?
> > > > > >
> > > > > > Also, I would expect an event driven configuration option for
> > > > > > HandleHttpRequest, since the arrival of an HTTP request is already an
> > > > > > event.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Marc
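
For anyone who wants to repeat the client side of the benchmark described in the original message without JMeter, here is a minimal sketch in Python. It is not the setup that was actually used: it sends sequential POST calls with a few bytes of content (roughly one JMeter thread) and reports requests per second. The local echo server is only a stand-in so the script runs on its own; to test a real flow you would point measure_throughput at whatever URL your HandleHttpRequest processor listens on. The names EchoHandler, measure_throughput and run_demo are made up for this illustration.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the endpoint under test; echoes the POST body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # suppress per-request logging so it does not skew timings


def measure_throughput(url, payload, requests):
    """Send `requests` sequential POSTs; return (successful calls, requests/second)."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    ok = 0
    start = time.perf_counter()
    for _ in range(requests):
        req = urllib.request.Request(url, data=payload, method="POST")
        with opener.open(req, timeout=5) as resp:
            resp.read()
            if resp.status == 200:
                ok += 1
    elapsed = time.perf_counter() - start
    return ok, requests / elapsed


def run_demo(requests=50):
    """Run the measurement against the local stand-in server."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        return measure_throughput(url, b"<a>hi</a>", requests)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    ok, rps = run_demo()
    print(f"{ok} successful requests, {rps:.1f} requests/second")
```

Running the same measurement once per scheduling configuration and comparing the requests-per-second figures gives the kind of ratio mentioned above. Absolute numbers from a single sequential client will differ from a multi-threaded JMeter run.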
