Github user apiri commented on the pull request:
https://github.com/apache/nifi/pull/213#issuecomment-194975737
@mans2singh I had a chance to sit down and revisit this. Overall, it looks
good and I was able to test a flow successfully putting to a Kinesis Firehose
which aggregated and dumped to S3.
One thing mentioned in my initial review that we still need to cover is how we are handling batching. I do think we need to handle that in a more constrained fashion, given that file sizes could vary widely.
With how the processor is currently configured, it could hold up to 250MB in
memory, by default. Instead, what are your thoughts on converting this to a
buffer size property? If people want batching, they can specify a given
memory size (perhaps something like 1 MB by default) and then we can wait until
that threshold is hit or no more input flowfiles are available, at which point
they are sent off in a batch. If batching is not desired, they can either
empty the buffer property or specify 0 bytes.
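To make the proposal concrete, here is a minimal sketch of the size-bounded buffering described above. This is not actual NiFi processor code; the class and method names (`SizeBoundedBatcher`, `offer`, `flush`) are hypothetical, and flowfiles are represented by their payload sizes only:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed buffer-size batching, not NiFi API code.
// Flowfiles are represented by their payload sizes in bytes.
public class SizeBoundedBatcher {
    private final long maxBufferBytes;           // e.g. 1 MB by default, as proposed
    private final List<Long> buffer = new ArrayList<>();
    private long bufferedBytes = 0;

    public SizeBoundedBatcher(long maxBufferBytes) {
        this.maxBufferBytes = maxBufferBytes;
    }

    // Offer one flowfile's size; returns a batch to send once the
    // threshold is reached, otherwise null.
    public List<Long> offer(long flowFileBytes) {
        buffer.add(flowFileBytes);
        bufferedBytes += flowFileBytes;
        // A threshold of 0 disables batching: every flowfile is sent alone.
        if (maxBufferBytes == 0 || bufferedBytes >= maxBufferBytes) {
            return flush();
        }
        return null;
    }

    // Called when no more input flowfiles are available, so a partial
    // batch is still sent rather than held in memory.
    public List<Long> flush() {
        List<Long> batch = new ArrayList<>(buffer);
        buffer.clear();
        bufferedBytes = 0;
        return batch;
    }
}
```

The key property is that heap usage is bounded by the configured buffer size regardless of how many flowfiles arrive, which is the constraint the current 250MB default does not provide.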
Thoughts on this approach? Ultimately, we are trying to prevent people from
inadvertently causing heap exhaustion. With the approach prescribed here,
people can be as aggressive as they wish with batching while still having a
finitely constrained amount of space per instance.