>
>  Only add synchronous send functions to the MQProducer interface, such as
> send(final Collection msgs)

Would you provide an asynchronous send(Message msg) in the future, even
though async send may lead to message loss?

Offering only synchronous sending with a collection may not be elegant
enough for users. They have to think about how to prevent messages that
have not yet reached the size threshold from staying in the collection too
long; another thread may need to be created to check whether batch.max.time
has been reached. If the message collection is shared between threads,
users also have to consider thread safety; for instance, the argument
passed to send should be a copy of the gathered collection, to avoid
messages being added to it while it is being sent.

If users can tolerate potential message loss, a simple asynchronous
send(Message msg) API that sends out batches in the background would be
more suitable.
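
For example, a thin wrapper along the lines below could hide those details
behind an asynchronous send(Message msg). This is only an illustrative
sketch; the class name BatchingProducer, the thresholds, and the reliance
on the proposed send(Collection) are my own assumptions, not RocketMQ code:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.apache.rocketmq.client.producer.DefaultMQProducer;
    import org.apache.rocketmq.common.message.Message;

    // Hypothetical wrapper: buffers messages and flushes them as a batch
    // when either the size threshold or batch.max.time is reached. Buffered
    // messages are lost if the process dies, hence the trade-off above.
    public class BatchingProducer {
        private final DefaultMQProducer producer;
        private final int maxBatchSize;
        private final List<Message> buffer = new ArrayList<>();
        private final ScheduledExecutorService flusher =
                Executors.newSingleThreadScheduledExecutor();

        public BatchingProducer(DefaultMQProducer producer,
                                int maxBatchSize, long batchMaxTimeMs) {
            this.producer = producer;
            this.maxBatchSize = maxBatchSize;
            // Background thread handles messages that never reach the
            // size threshold, so nothing stays buffered too long.
            flusher.scheduleWithFixedDelay(this::flush,
                    batchMaxTimeMs, batchMaxTimeMs, TimeUnit.MILLISECONDS);
        }

        // Asynchronous from the caller's point of view: only buffers.
        public void send(Message msg) {
            boolean full;
            synchronized (buffer) {
                buffer.add(msg);
                full = buffer.size() >= maxBatchSize;
            }
            if (full) {
                flush();
            }
        }

        private void flush() {
            List<Message> batch;
            // Copy and clear under the lock, then send outside it, so
            // other threads can keep adding while the batch is in flight.
            synchronized (buffer) {
                if (buffer.isEmpty()) {
                    return;
                }
                batch = new ArrayList<>(buffer);
                buffer.clear();
            }
            try {
                producer.send(batch); // the proposed synchronous batch send
            } catch (Exception e) {
                e.printStackTrace(); // messages in this batch are lost
            }
        }
    }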

2017-03-07 15:46 GMT+08:00 dongeforever <[email protected]>:

> Tests show that Kafka's million-level TPS is mainly owed to batching. When
> the batch size is set to 1, the TPS drops by an order of magnitude. So I am
> trying to add this feature to RocketMQ.
>
> https://github.com/apache/incubator-rocketmq/pull/53
>
> Original intention
>
> Batching is not for packaging but for improving the performance of small
> messages. So the messages of the same batch should play the same role, and
> no extra effort should be taken to split the batch.
>
> Not splitting has another important advantage: the messages of the same
> batch can be sent atomically, that is, all successfully or all
> unsuccessfully, the importance of which is self-evident.
>
> So performance and atomicity are the original intentions, and they are
> reflected in the usage constraints below.
>
> How it works
>
> To keep the effort minimal, it works as follows:
>
> Only add synchronous send functions to the MQProducer interface, such as
> send(final Collection msgs)
>
> Use MessageBatch, which extends Message and implements Iterable<Message>
>
> Use a byte buffer instead of a list of objects to avoid too much GC in the
> Broker.
>
> Split the decode and encode logic out of lockForPutMessage to avoid too
> many race conditions.
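>
> To make this concrete, the batch type could look roughly like the sketch
> below (field and constructor names are illustrative, not the final code):
>
>     import java.util.ArrayList;
>     import java.util.Collection;
>     import java.util.Iterator;
>     import java.util.List;
>
>     import org.apache.rocketmq.common.message.Message;
>
>     // Sketch: the whole batch is itself a Message, so the existing send
>     // path barely changes, and the Broker can iterate the batch to encode
>     // all messages into a single byte buffer. The only MQProducer addition
>     // would be an overload like: SendResult send(Collection<Message> msgs)
>     public class MessageBatch extends Message implements Iterable<Message> {
>         private final List<Message> messages;
>
>         public MessageBatch(Collection<Message> msgs) {
>             if (msgs.isEmpty()) {
>                 throw new IllegalArgumentException("empty batch");
>             }
>             this.messages = new ArrayList<>(msgs);
>             // The batch is routed like a single message of this topic.
>             this.setTopic(messages.get(0).getTopic());
>         }
>
>         @Override
>         public Iterator<Message> iterator() {
>             return messages.iterator();
>         }
>     }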
>
> Usage constraints
>
> Messages of the same batch should have:
>
> 1. the same topic: If they belong to different topics (internally, to
> different queues), they may be sent to different brokers, which goes
> against atomicity.
>
> 2. the same waitStoreMsgOK: differences here also go against atomicity.
>
> 3. no delay level: If we cared about the delay level, we would need to
> decode the internal properties of every message, which would cause a large
> performance loss.
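>
> These constraints could be enforced with a simple check before the batch
> is built; a rough sketch (the helper name is illustrative):
>
>     import java.util.Collection;
>
>     import org.apache.rocketmq.common.message.Message;
>
>     public final class BatchChecker {
>         public static void checkBatch(Collection<Message> msgs) {
>             Message first = null;
>             for (Message msg : msgs) {
>                 // 3. no delay level, or every message's properties would
>                 // have to be decoded on the Broker.
>                 if (msg.getDelayTimeLevel() > 0) {
>                     throw new UnsupportedOperationException(
>                         "delay level is not supported in batch");
>                 }
>                 if (first == null) {
>                     first = msg;
>                     continue;
>                 }
>                 // 1. same topic, or the batch may span brokers.
>                 if (!first.getTopic().equals(msg.getTopic())) {
>                     throw new UnsupportedOperationException(
>                         "messages in one batch should have the same topic");
>                 }
>                 // 2. same waitStoreMsgOK, to keep all-or-nothing semantics.
>                 if (first.isWaitStoreMsgOK() != msg.isWaitStoreMsgOK()) {
>                     throw new UnsupportedOperationException(
>                         "messages in one batch should have the same waitStoreMsgOK");
>                 }
>             }
>         }
>     }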
>
> Performance Tests:
>
> On Linux with 24 cores, 48G RAM, and SSD, using 50 threads to send
> messages with 50-byte bodies in batches of 50, we get about 1.5 million
> TPS until the disk is full.
>
> Potential problems:
>
> Although messages can be accumulated in the Broker very quickly, it takes
> time to dispatch them to the consume queue, which is much slower than
> accepting messages. So the messages may not be consumable immediately.
>
> We may need to refactor the ReputMessageService to solve this problem.
>
> Please feel free to reach out with any questions.
>
> Best Regards
>
> dongeforever
