Hi Jay or Kafka Dev Team,

Any suggestions on how I can deal with this situation of expanding
partitions for the new Java producer, for scalability on the consumer side?

Thanks,

Bhavesh

On Tue, Nov 4, 2014 at 7:08 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com>
wrote:

> Also, to add to this: the old (Scala-based) producer is not impacted by
> partition changes, so the new Java producer effectively takes away this
> critical scalability feature unless you plan for expansion from the
> beginning.
>
> Thanks,
>
> Bhavesh
>
> On Tue, Nov 4, 2014 at 4:56 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com
> > wrote:
>
>> Hi Jay,
>>
>> The fundamental problem is that the batch size is already configured and
>> the producers are running in production with that configuration (the
>> previous values were just samples). How do we increase the partitions for
>> a topic when batch.size times the partition count would exceed the
>> configured buffer limit? Yes, had we planned for a smaller batch size we
>> could do this, but we cannot once the producers are already running. Have
>> you faced this problem at LinkedIn or anywhere else?
>>
>>
>> Thanks,
>>
>> Bhavesh
>>
>> On Tue, Nov 4, 2014 at 4:25 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>
>>> Hey Bhavesh,
>>>
>>> No, there isn't such a setting. But what I am saying is that I don't
>>> think you really need that feature. I think instead you can use a 32k
>>> batch size with your 64MB memory limit. This should mean you can have up
>>> to 2048 batches in flight. Assuming one batch in flight and one being
>>> added to at any given time, this should work well for up to ~1000
>>> partitions, so there is no need to do anything dynamic. Assuming each
>>> producer sends to just one topic, you would be fine as long as that
>>> topic had fewer than 1000 partitions. If you wanted to add more
>>> partitions you would need to add memory on the producers.
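>>>
>>> In code, that sizing is roughly the following (a sketch; the broker
>>> address and serializers are placeholders, property names per the new
>>> producer's config):
>>>
>>>   import java.util.Properties;
>>>   import org.apache.kafka.clients.producer.KafkaProducer;
>>>   import org.apache.kafka.clients.producer.Producer;
>>>
>>>   Properties props = new Properties();
>>>   props.put("bootstrap.servers", "broker1:9092"); // placeholder
>>>   props.put("key.serializer",
>>>       "org.apache.kafka.common.serialization.ByteArraySerializer");
>>>   props.put("value.serializer",
>>>       "org.apache.kafka.common.serialization.ByteArraySerializer");
>>>   props.put("buffer.memory", "67108864"); // 64MB total
>>>   props.put("batch.size", "32768");       // 32k per-partition batches
>>>   // 64MB / 32k = 2048 batches; at ~2 live batches per partition
>>>   // (one filling, one sending) that covers roughly 1000 partitions.
>>>   Producer<byte[], byte[]> producer = new KafkaProducer<>(props);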
>>>
>>> -Jay
>>>
>>> On Tue, Nov 4, 2014 at 4:04 PM, Bhavesh Mistry <
>>> mistry.p.bhav...@gmail.com>
>>> wrote:
>>>
>>> > Hi Jay,
>>> >
>>> > I agree with and understood what you mentioned in your previous email.
>>> > But when you have 5000+ producers running in the cloud (I am sure
>>> > LinkedIn has many more and needs to increase partitions for
>>> > scalability), then after the expansion none of the running producers
>>> > will be able to send any data. So is there any feature or setting that
>>> > would shrink the batch size to fit the increase? I am sure others will
>>> > face the same issue. Had I configured block.on.buffer.full=true it
>>> > would be even worse and would block application threads. Our use case
>>> > is that the *logger.log(msg)* method cannot block, which is why we set
>>> > that configuration to false.
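>>> >
>>> > For context, our non-blocking path looks roughly like this (a sketch;
>>> > the wrapper method, topic name, and dropped counter are ours, the
>>> > exception type is from the new producer API):
>>> >
>>> >   // producer: KafkaProducer<byte[], byte[]>; dropped: AtomicLong
>>> >   // block.on.buffer.full=false => send() throws instead of blocking
>>> >   public void log(String msg) {
>>> >       try {
>>> >           producer.send(
>>> >               new ProducerRecord<byte[], byte[]>("logs", msg.getBytes()));
>>> >       } catch (BufferExhaustedException e) {
>>> >           dropped.incrementAndGet(); // drop rather than block the caller
>>> >       }
>>> >   }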
>>> >
>>> > So I am sure others will run into this same issue. Please try to find
>>> > an optimal solution and recommendation from the Kafka dev team for
>>> > this particular use case (which may become common).
>>> >
>>> > Thanks,
>>> >
>>> > Bhavesh
>>> >
>>> > On Tue, Nov 4, 2014 at 3:12 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>> >
>>> > > Hey Bhavesh,
>>> > >
>>> > > Here is what your configuration means:
>>> > > buffer.memory=64MB # This means don't use more than 64MB of memory
>>> > > batch.size=1MB # This means allocate a 1MB buffer for each
>>> > > partition with data
>>> > > block.on.buffer.full=false # This means immediately throw an
>>> > > exception if there is not enough memory to create a new buffer
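>>> > >
>>> > > As a properties sketch (just these three settings, other required
>>> > > settings omitted):
>>> > >
>>> > >   Properties props = new Properties();
>>> > >   props.put("buffer.memory", "67108864");     // 64MB cap
>>> > >   props.put("batch.size", "1048576");         // 1MB per partition
>>> > >   props.put("block.on.buffer.full", "false"); // throw, don't wait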
>>> > >
>>> > > Not sure what linger time you have set.
>>> > >
>>> > > So what you see makes sense. If you have 1MB buffers and 32
>>> > > partitions then you will have approximately 32MB of memory in use
>>> > > (actually a bit more than this since one buffer will be filling
>>> > > while another is sending). If you have 128 partitions then you will
>>> > > try to use 128MB, and since you have configured the producer to fail
>>> > > when you reach 64MB (rather than waiting for memory to become
>>> > > available) that is what happens.
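>>> > >
>>> > > In numbers:
>>> > >
>>> > >   //  32 partitions * 1MB batch.size =  32MB <= 64MB buffer -> ok
>>> > >   // 128 partitions * 1MB batch.size = 128MB >  64MB buffer ->
>>> > >   //   immediate BufferExhaustedException (block.on.buffer.full=false)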
>>> > >
>>> > > I suspect you want a smaller batch size. More than 64k is usually
>>> > > not going to help throughput.
>>> > >
>>> > > -Jay
>>> > >
>>> > > On Tue, Nov 4, 2014 at 11:39 AM, Bhavesh Mistry <
>>> > > mistry.p.bhav...@gmail.com>
>>> > > wrote:
>>> > >
>>> > > > Hi Kafka Dev,
>>> > > >
>>> > > > With the new producer, we have a need to change the number of
>>> > > > partitions for a topic, and we face BufferExhaustedException.
>>> > > >
>>> > > > Here is an example: we have set a 64MiB buffer, 32 partitions, and
>>> > > > a 1MiB batch size. But when we increase the partitions to 128, it
>>> > > > throws BufferExhaustedException right away (non-key-based
>>> > > > messages). A buffer is allocated per partition based on
>>> > > > batch.size. Auto-calculating the batch size when partitions
>>> > > > increase is a very common need, because we have about ~5000 boxes
>>> > > > and it is not practical to deploy code to all machines before
>>> > > > expanding partitions for scalability. What options are available
>>> > > > while the new producer is running, a partition increase is needed,
>>> > > > and there is not enough buffer to allocate batches for the
>>> > > > additional partitions?
>>> > > >
>>> > > > buffer.memory=64MiB
>>> > > > batch.size=1MiB
>>> > > > block.on.buffer.full=false
>>> > > >
>>> > > >
>>> > > > Thanks,
>>> > > >
>>> > > > Bhavesh
>>> > > >
>>> > >
>>> >
>>>
>>
>>
>
