Re: Nifi hardware recommendation

2016-10-14 Thread Joe Witt
The validity of that advice depends on a lot of factors. G1 has certainly improved pause times, but you can still see longer pauses than are acceptable in some cases. In any event, I agree that we should be more careful with how we describe heap usage. Thanks Joe On Oct 14, 2016

Re: Nifi hardware recommendation

2016-10-14 Thread Corey Flowers
We actually use heap sizes from 32 to 64 GB for ours, but our volumes and graphs are both extremely large. I believe the smaller heap sizes were a limitation of the garbage collection in Java 7. We also moved to SSD drives, which helped throughput quite a bit. Our systems were

Re: PutDynamoDB processor

2016-10-14 Thread Gop Krr
Thanks James. I would be happy to contribute the scan processor for DynamoDB. Just to clarify, based on your comment, we can't take all the rows of a DynamoDB table and put them into another table? Do we have to do it one record at a time? On Fri, Oct 14, 2016 at 10:50 AM, James Wing

Re: PutDynamoDB processor

2016-10-14 Thread James Wing
NiFi's GetDynamoDB processor uses the underlying BatchGetItem API, which requires item keys as inputs. Iterating over the keys in a table would require the Scan API, but NiFi does not have a processor to scan a DynamoDB table. This would be a great addition to NiFi. If you have any interest in
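The difference between key-based reads and a full-table Scan can be sketched in Python. This is a minimal sketch, not NiFi code: it assumes a client object that exposes a boto3-style `scan(TableName=..., ExclusiveStartKey=...)` call returning `Items` and, while more pages remain, `LastEvaluatedKey`. The names `scan_all_items` and `FakeClient` are hypothetical, and the fake client uses a simple index as its page token rather than a real DynamoDB key.

```python
def scan_all_items(client, table_name):
    """Yield every item in a table by following Scan pagination."""
    kwargs = {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        for item in page.get("Items", []):
            yield item
        # A missing LastEvaluatedKey means the final page has been read.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        kwargs["ExclusiveStartKey"] = last_key


class FakeClient:
    """Hypothetical in-memory stand-in for a DynamoDB client (illustration only)."""

    def __init__(self, items, page_size=2):
        self._items = items
        self._page_size = page_size

    def scan(self, TableName, ExclusiveStartKey=None):
        # The page token here is just a list index, an assumption of this sketch.
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["index"]
        end = start + self._page_size
        page = {"Items": self._items[start:end]}
        if end < len(self._items):
            page["LastEvaluatedKey"] = {"index": end}
        return page
```

Unlike BatchGetItem, nothing here needs to know the item keys up front; the loop simply keeps requesting pages until no continuation key is returned.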

Re: Penalize Flow File on Failure

2016-10-14 Thread Matt Burgess
Manish, You could use ExecuteScript with Groovy and the following:

    def flowFile = session.get()
    if(!flowFile) return
    flowFile = session.penalize(flowFile)
    session.transfer(flowFile, REL_SUCCESS)

Then you can route "success" back to the processor. You'd set the Penalty Duration on the

RE: Penalize Flow File on Failure

2016-10-14 Thread Manish Gupta 8
Thanks Matt. As a workaround, is there a processor that does not modify the flow file (content or attributes) at all, which I could use to keep the self-looping flow files from hitting the main processor again immediately? Regards, Manish -Original Message- From: Matt Burgess

Re: Penalize Flow File on Failure

2016-10-14 Thread Matt Burgess
Manish, The use of penalize(), yield(), etc. is not enforced by the framework, so processors can have different behavior, sometimes on purpose, and sometimes inadvertently. The Developer's Guide has guidance on when to use such methods [1], and reviewers often check the submissions to see if

Re: Nifi hardware recommendation

2016-10-14 Thread Russell Bateman
Ali, "not recommended to dedicate more than 8-10 GB to JVM heap space" by whom? Do you have links/references establishing this? I couldn't find anyone saying this or why. Russ On 10/13/2016 05:47 PM, Ali Nazemian wrote: Hi, I have another question regarding the hardware recommendation. As

Penalize Flow File on Failure

2016-10-14 Thread Manish Gupta 8
Hello Everyone, In some of the processors I have seen that flow files are not penalized on failure, for example Kite processors like ConvertJsonToAvro. Is there some specific reason why some processors have different behavior? I think every processor should penalize every non-success

Re: Nifi hardware recommendation

2016-10-14 Thread Joe Witt
I'd also add to Mark's great reply that another good use of RAM, beyond the heap, disk caching, and avoiding swapping, is off-heap native storage of things like reference datasets that can be wired into NiFi flows for high-speed enrichment where you can even do hot

Re: GetKafka maximum fetch size

2016-10-14 Thread Igor Kravzov
Thanks Jeremy. Where can I set/change this property? In GetKafka? On Thu, Oct 13, 2016 at 5:33 PM, Jeremy Farbota wrote: > Igor, > > Kafka consumer properties can be found here: http://kafka.apache.org/documentation.html#consumerconfigs > > GetKafka uses the old consumer

Re: Nifi hardware recommendation

2016-10-14 Thread Mark Payne
Hi Ali, Typically, we see people using a 4-8 GB heap with NiFi. 8 GB is pretty typical for a flow that is expected to have pretty high throughput in terms of the number of FlowFiles, or a large number of processors. However, one thing that you will want to consider in terms of RAM is disk