Kumiko,

A couple of quick thoughts to share. You can absolutely code your processor to operate in batches, and you can of course multi-thread the processor. The general unit-of-work concept that Apache NiFi supports is called a ProcessSession: you can operate on as many flow files as you need in that session and then commit it as one batch. NiFi automatically tracks and records a lot of useful information at the process session level. In addition, NiFi captures provenance information, which is useful for understanding the specific items that went through the flow, their latencies, and so on. Beyond these options there is also the concept of counters, which you can use to capture, generally for development purposes, interesting things you'd like to observe over time. You'll also want to get a good handle on what performance to expect when interacting with the web service independent of NiFi, so you have a good baseline to work from.
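To make the batching idea a bit more concrete, here is a minimal plain-Java sketch, deliberately independent of the NiFi API, that expands a URL template once per context value and fetches the whole batch on a small thread pool. The class name BatchRequestSketch, the methods expandUrls and fetchBatch, the 30-second timeout, and the injected fetcher function are all hypothetical names and values chosen for illustration; inside a real processor the batch would be bounded by the ProcessSession rather than by this helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper, not part of NiFi: batches templated requests.
final class BatchRequestSketch {

    // Replace the "{0}" placeholder once per context value.
    static List<String> expandUrls(String template, List<String> contextValues) {
        List<String> urls = new ArrayList<>();
        for (String value : contextValues) {
            urls.add(template.replace("{0}", value));
        }
        return urls;
    }

    // Run the fetcher for every URL of one batch on a fixed-size pool.
    // Results come back in the same order as the input URLs.
    static List<String> fetchBatch(List<String> urls,
                                   Function<String, String> fetcher,
                                   int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> fetcher.apply(url);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                // Assumed per-request timeout; tune against your baseline.
                results.add(future.get(30, TimeUnit.SECONDS));
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("batch interrupted", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("batch failed", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Passing the fetcher in as a function keeps the sketch testable without a network; measuring the real web service with the same helper outside NiFi is one way to get the baseline mentioned above.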
The quota question is also one where you have choices and design decisions to make. You can bake the quota handling logic into your processor itself, or you could wire in an existing or new processor that specifically handles the quota/grouping logic you need, with relationships such as 'within quota' and 'exceeds quota'.

I apologize for not giving a more precise response. There are many ways to approach this and the best trade-offs will depend on finer details. As you advance with this, please feel free to ask more questions. If you find things you wish were available and you think should exist in NiFi, we'd love to have your contribution in any form (ideas, code, JIRAs, etc.).

Thanks
Joe

On Thu, May 26, 2016 at 9:08 PM, Kumiko Yada <[email protected]> wrote:
> Hello,
>
> We implemented the custom process that are similar to the InvokeHTTP that the
> part of URL can be replaced with the Context Data List, then write the
> weather to the flowfile. For example, URL to get the weather feed have to
> include the zip code in URL, and the ZIP code is {0} in the URL and replaced
> the zip code from the Context Data List property.
>
> URL
> http://example{0}/weather
>
> Context Data List:
> 00000
> 11111
> 22222
>
> Processor with make the following requests:
> http://example{0}/weather
>
> http://example00000/weather
> http://example11111/weather
> http://example22222/weather
>
> This processor is processed in one request at a time and have a perf issue.
> I'd like to modify to process in batches. What are the best way to process
> in batches? And also, would the Nifi keep track how many requests the
> processor is processed? If so, how the Nifi keep track this and how long the
> Nifi keep track of data? I'd like to add the quota priorities in this
> processor to keep track of quota. For example, if the weather feeds can be
> requested only 100 requests a day, I don't want to processor to executed once
> the quota is reached.
>
> Thanks
> Kumiko
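
For the 100-requests-a-day case, one possible shape for the quota logic is sketched below in plain Java. It is only an illustration under stated assumptions: the class DailyQuotaSketch and its route method are hypothetical names, the count lives in memory only (so it would reset on a restart), and the two returned strings simply mirror the 'within quota' and 'exceeds quota' relationships suggested above.

```java
import java.time.LocalDate;

// Hypothetical in-memory quota gate, not an existing NiFi class.
final class DailyQuotaSketch {
    private final int maxPerDay;
    private LocalDate currentDay;
    private int usedToday;

    DailyQuotaSketch(int maxPerDay, LocalDate startDay) {
        this.maxPerDay = maxPerDay;
        this.currentDay = startDay;
        this.usedToday = 0;
    }

    // Decide where one request should go. A request is counted only
    // when it is let through; the count resets when the day changes.
    synchronized String route(LocalDate today) {
        if (!today.equals(currentDay)) {
            currentDay = today;
            usedToday = 0;
        }
        if (usedToday < maxPerDay) {
            usedToday++;
            return "within quota";
        }
        return "exceeds quota";
    }

    // Number of requests let through so far on the current day.
    synchronized int usedToday() {
        return usedToday;
    }
}
```

The date is passed in rather than read from the clock so the behavior around midnight can be checked easily; whether this lives inside the custom processor or in a separate one upstream is the design choice discussed above.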
