Thank you Mark.

On Tue, May 31, 2016 at 1:02 PM, Mark Payne <marka...@hotmail.com> wrote:

> Igor,
>
> MergeContent will consider a 'bin' full when any one of those conditions
> is met. I.e., if you set:
>
> Max Group Size = 64 MB
> Max Number of Entries = 100
> Max Bin Age = 5 mins
>
> Then you will get a merged bin whenever a bin hits 64 MB, regardless of
> how long it's been or how many entries there are.
> Similarly, if you have 100 entries, then you'll get a bin even if the data
> is only 1 MB total.
> Also, if you go 5 minutes without reaching either of those thresholds, the
> 5 minute threshold will cause the bin to be merged,
> regardless of how many FlowFiles there are.
>
> A common pattern for sending to HDFS is to set the Maximum Bin Age to some
> threshold (5 mins or 1 hour or whatever makes
> sense for you) and the Min Group Size to 64 MB and Max Group Size to 128
> MB and not set anything for the Maximum Number
> of Entries. In this case, you will get bins of 64 - 128 MB most of the
> time, but if the data volume is low for a while, you'll still get some
> data flowing into HDFS because of the Max Bin Age.
>
> Thanks
> -Mark
>
> > On May 31, 2016, at 12:07 PM, Igor Kravzov <igork.ine...@gmail.com>
> wrote:
> >
> > There are 2 configuration properties: Maximum Group Size and Maximum
> Number of entries.
> > Are these mutually exclusive? I want to create a file to store in HDFS
> but limit its size to 64 MB, the HDFS block size (or should I go bigger?).
> >
> > Max Bin Age property
> > Since content can vary in length and I won't know when the max size
> will be reached, what role will it play?
>
>

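For anyone reading this later, the rule Mark describes in the quoted reply can be sketched roughly as below. This is only a simplified model of the behavior as explained in this thread, not NiFi's actual code: the `BinLimits` class, the `merge_trigger` function, and the order in which the limits are checked are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MB = 1024 * 1024


@dataclass
class BinLimits:
    # Hypothetical names that mirror the MergeContent properties from the thread.
    # None (or 0 for the minimum) means the property is left unset.
    max_group_size: Optional[int] = None  # bytes
    max_entries: Optional[int] = None     # number of FlowFiles
    max_bin_age: Optional[float] = None   # seconds
    min_group_size: int = 0               # bytes


def merge_trigger(limits: BinLimits, size: int, entries: int, age: float) -> Optional[str]:
    """Return the name of the first limit the bin has hit, or None if it keeps waiting."""
    if limits.max_group_size is not None and size >= limits.max_group_size:
        return "Max Group Size"
    if limits.max_entries is not None and entries >= limits.max_entries:
        return "Max Number of Entries"
    if limits.max_bin_age is not None and age >= limits.max_bin_age:
        return "Max Bin Age"
    # Assumption: once a configured minimum size is reached, the bin may be merged.
    if limits.min_group_size > 0 and size >= limits.min_group_size:
        return "Min Group Size"
    return None


# First example from the thread: 64 MB, 100 entries, 5 minutes.
first = BinLimits(max_group_size=64 * MB, max_entries=100, max_bin_age=300)
print(merge_trigger(first, 64 * MB, 3, 10))   # size limit hit first
print(merge_trigger(first, 1 * MB, 100, 10))  # entry limit hit first
print(merge_trigger(first, 1 * MB, 5, 300))   # age limit hit first
print(merge_trigger(first, 1 * MB, 5, 10))    # nothing hit yet

# HDFS pattern from the thread: min 64 MB, max 128 MB, no entry limit, 1 hour age.
hdfs = BinLimits(min_group_size=64 * MB, max_group_size=128 * MB, max_bin_age=3600)
print(merge_trigger(hdfs, 70 * MB, 5000, 60))  # enough data, merged by size
print(merge_trigger(hdfs, 2 * MB, 10, 3600))   # low volume, flushed by age
```

The point is that the limits are alternatives: whichever one is reached first releases the bin, and Max Bin Age acts as the fallback when data volume is low.
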