I would look at writing a service that reads from your existing topic and 
writes to a new topic with more partitions (e.g. four).
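
A minimal sketch of such a bridge, assuming the org.apache.kafka.clients 
consumer/producer API; the broker address, topic names, and group id below 
are placeholders for illustration only:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RepartitionBridge {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        consumerProps.put("group.id", "repartition-bridge");        // placeholder group id
        consumerProps.put("auto.offset.reset", "earliest");         // start from the backlog
        consumerProps.put("key.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        consumerProps.put("value.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        producerProps.put("value.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(Collections.singletonList("old-topic"));  // single-partition source
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    // Re-publish to the new four-partition topic; the producer's
                    // partitioner decides which partition each message lands on.
                    producer.send(new ProducerRecord<>("new-topic", record.key(), record.value()));
                }
                consumer.commitSync();
            }
        }
    }
}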

You will also need to pay attention to the partitioning policy (or implement 
your own), as the default hashing in the current kafka version can lead to 
poor distribution.
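
If the keys do hash unevenly, one option is a simple round-robin partitioner; 
here is a sketch against the org.apache.kafka.clients.producer.Partitioner 
interface (the class name is illustrative, and you would enable it on the 
bridge's producer via the partitioner.class config):

import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

// Spreads messages evenly across all partitions of the target topic,
// ignoring the key. Enable with partitioner.class=RoundRobinRepartitioner.
public class RoundRobinRepartitioner implements Partitioner {
    private final AtomicInteger counter = new AtomicInteger(0);

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        // Mask to keep the counter non-negative after overflow, then cycle
        // through partitions 0..numPartitions-1 in order.
        int next = counter.getAndIncrement() & Integer.MAX_VALUE;
        return next % numPartitions;
    }

    @Override
    public void configure(Map<String, ?> configs) {}

    @Override
    public void close() {}
}

Note that round-robin discards key-based ordering, so only do this if your 
consumers do not rely on all messages for a key landing in one partition.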

Best Regards,

-Jonathan

 
On Sep 19, 2014, at 8:57 AM, Dennis Haller <dhal...@talemetry.com> wrote:

> Hi,
> 
> We have an interesting problem to solve due to very large traffic volumes
> on particular topics. In our initial system configuration we had only one
> partition per topic, and in a couple of topics we have built up huge
> backlogs of several million messages that our consumers are slowly
> processing.
> 
> However, now that we have this constant backlog, we wish to repartition
> those topics into several partitions, and allow parallel consumers to run
> to handle the high message volume.
> 
> If we simply repartition the topic, say from 1 to 4 partitions, the
> backlogged messages stay in partition 1, while partitions 2,3,4 only get
> newly arrived messages. To eat away the backlog, we need to redistribute
> the backlogged messages evenly among the 4 partitions.
> 
> The tools I've seen do not allow me to rewrite or "replay" the existing
> backlogged messages from one partition into the same or another topic with
> several partitions:
> - kafka.tools.MirrorMaker does not allow me to move the data within the
>   same zookeeper network, and
> - kafka.tools.ReplayLogProducer does not write to multiple partitions; it
>   seems that it will write only from a single partition to a single
>   partition.
> 
> Does anyone have any other way to solve this problem or a better way of
> using the kafka tools?
> 
> Thanks
> Dennis
