Hi Jay or Kafka Dev Team,

Any suggestions on how I can deal with this situation of expanding partitions for the New Java Producer for scalability (on the consumer side)?
Thanks,
Bhavesh

On Tue, Nov 4, 2014 at 7:08 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:
> Also, to add to this: the old producer (Scala based) is not impacted by
> partition changes. So an important scalability feature is being taken away
> by the New Java Producer if you do not plan for expansion from the
> beginning.
>
> Thanks,
> Bhavesh
>
> On Tue, Nov 4, 2014 at 4:56 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:
>> Hi Jay,
>>
>> The fundamental problem is that the batch size is already configured and
>> the producers are running in production with that configuration (the
>> previous values were just a sample). How do we increase partitions for a
>> topic when the batch size would exceed the configured buffer limit? Yes,
>> had we planned for a smaller batch size we could do this, but we cannot
>> once the producers are already running. Have you faced this problem at
>> LinkedIn or anywhere else?
>>
>> Thanks,
>> Bhavesh
>>
>> On Tue, Nov 4, 2014 at 4:25 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>> Hey Bhavesh,
>>>
>>> No, there isn't such a setting. But what I am saying is that I don't
>>> think you really need that feature. Instead, you can use a 32k batch
>>> size with your 64MB memory limit. This means you can have up to 2048
>>> batches in flight. Assuming one batch in flight and one being filled at
>>> any given time, this should work well for up to ~1000 partitions, so
>>> there is no need to do anything dynamic. Assuming each producer sends
>>> to just one topic, you would be fine as long as that topic had fewer
>>> than 1000 partitions. If you wanted to add more, you would need to add
>>> memory on the producers.
>>>
>>> -Jay
>>>
>>> On Tue, Nov 4, 2014 at 4:04 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:
>>>> Hi Jay,
>>>>
>>>> I agree with and understood what you mentioned in the previous email.
>>>> But when you have 5000+ producers running in the cloud (I am sure
>>>> LinkedIn has many more and needs to increase partitions for
>>>> scalability), none of the running producers will be able to send any
>>>> data. So is there any feature or setting that would shrink the batch
>>>> size to fit the increased partition count? Had I configured
>>>> block.on.buffer.full=true it would be even worse and would block
>>>> application threads. In our use case the *logger.log(msg)* method
>>>> cannot block, which is why we set that configuration to false.
>>>>
>>>> So I am sure others will run into this same issue. I am trying to find
>>>> the optimal solution and a recommendation from the Kafka dev team for
>>>> this particular use case (which may become common).
>>>>
>>>> Thanks,
>>>> Bhavesh
>>>>
>>>> On Tue, Nov 4, 2014 at 3:12 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>> Hey Bhavesh,
>>>>>
>>>>> Here is what your configuration means:
>>>>> buffer.memory=64MB          # don't use more than 64MB of memory
>>>>> batch.size=1MB              # allocate a 1MB buffer for each partition with data
>>>>> block.on.buffer.full=false  # immediately throw an exception if there is
>>>>>                             # not enough memory to create a new buffer
>>>>>
>>>>> Not sure what linger time you have set.
>>>>>
>>>>> So what you see makes sense. If you have 1MB buffers and 32
>>>>> partitions, then you will have approximately 32MB of memory in use
>>>>> (actually a bit more than this, since one buffer will be filling while
>>>>> another is sending). If you have 128 partitions, then you will try to
>>>>> use 128MB, and since you have configured the producer to fail when you
>>>>> reach 64MB (rather than waiting for memory to become available), that
>>>>> is what happens.
>>>>>
>>>>> I suspect you want a smaller batch size. More than 64k is usually not
>>>>> going to help throughput.
>>>>>
>>>>> -Jay
>>>>>
>>>>> On Tue, Nov 4, 2014 at 11:39 AM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:
>>>>>> Hi Kafka Dev,
>>>>>>
>>>>>> With the new producer, we had to change the number of partitions for
>>>>>> a topic, and we hit a BufferExhaustedException.
>>>>>>
>>>>>> Here is an example: we have set 64MiB of buffer memory, 32
>>>>>> partitions, and a 1MiB batch size. But when we increase the
>>>>>> partitions to 128, it throws BufferExhaustedException right away
>>>>>> (non-key-based messages). The buffer is allocated based on
>>>>>> batch.size. There is a very common need to auto-calculate the batch
>>>>>> size when partitions increase, because we have about ~5000 boxes and
>>>>>> it is not practical to deploy code to all machines before expanding
>>>>>> partitions for scalability. What options are available while the new
>>>>>> producer is running, partitions need to increase, and there is not
>>>>>> enough buffer to allocate a batch for the additional partitions?
>>>>>>
>>>>>> buffer.memory=64MiB
>>>>>> batch.size=1MiB
>>>>>> block.on.buffer.full=false
>>>>>>
>>>>>> Thanks,
>>>>>> Bhavesh