Hi Asanka,

On Fri, Feb 13, 2015 at 2:32 PM, Asanka Abeyweera <[email protected]> wrote:

> Hi Asitha,
>
> On Fri, Feb 13, 2015 at 2:16 PM, Asitha Nanayakkara <[email protected]>
> wrote:
>
>> Hi Asanka,
>>
>> On Fri, Feb 13, 2015 at 1:46 PM, Asanka Abeyweera <[email protected]>
>> wrote:
>>
>>> Hi Asitha,
>>>
>>>
>>> On Fri, Feb 13, 2015 at 1:30 PM, Asitha Nanayakkara <[email protected]>
>>> wrote:
>>>
>>>> Hi Asanka,
>>>>
>>>> On Fri, Feb 13, 2015 at 1:07 PM, Asanka Abeyweera <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 13, 2015 at 12:38 PM, Asitha Nanayakkara <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Asanka,
>>>>>>
>>>>>> On Fri, Feb 13, 2015 at 10:22 AM, Asanka Abeyweera <[email protected]
>>>>>> > wrote:
>>>>>>
>>>>>>> Hi Asitha,
>>>>>>>
>>>>>>> I don't think we need to write a custom batch processor for this. For
>>>>>>> me it is an additional maintenance headache, it reduces readability,
>>>>>>> and we might have to change our custom processor implementation when
>>>>>>> we upgrade Disruptor :). Therefore I'm -1 on writing a custom
>>>>>>> processor for this. I think it's OK to add the batching logic to the
>>>>>>> content reading handler. This is just my idea; I might have missed
>>>>>>> some details in understanding this.
>>>>>>>
>>>>>>
>>>>>> I'm OK with dropping custom batch processors and having that batching
>>>>>> logic in the event handler.
>>>>>>
>>>>>>
>>>>>
>>>> When batching, we need to ensure that the DeliveryEventHandler (which
>>>> comes after the content readers) won't process messages until the
>>>> batched contents are read from the DB. If we use the current event
>>>> handler, at each event it will update the sequence barrier to the next
>>>> one, allowing the delivery handler to process the following slots in
>>>> the ring buffer. But in this scenario we may still be batching those
>>>> events and may not have read the content from the DB yet. To ensure
>>>> that we have batched and read the content before the
>>>> DeliveryEventHandler processes that slot, we need a batch processor.
>>>> And since we are using concurrent batch processors to read content with
>>>> custom batching logic, we needed a custom batch processor here, similar
>>>> to what we have in inbound event handling with Disruptor. Sorry, I
>>>> forgot the whole thing before. Please correct me if I'm wrong or if
>>>> there is a better way to do this.
>>>>
>>>>
>>> This does not happen if we use the default batch processor.
>>> "sequence.set(nextSequence - 1L)" is called only after the onEvent call
>>> with endOfBatch set to true has been processed. Therefore the above
>>> scenario won't happen.
>>>
>>> Source location:
>>> https://github.com/LMAX-Exchange/disruptor/blob/2.10.4/code/src/main/com/lmax/disruptor/BatchEventProcessor.java#L117
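For reference, the loop at that source location can be sketched roughly as follows. The class and interface names below are simplified stand-ins, not the real Disruptor code; the point is that the processor's sequence is published only after onEvent has run for every slot in the batch, so a downstream handler gated on that sequence cannot overtake it.

```java
// Simplified sketch (NOT the real Disruptor code) of BatchEventProcessor's
// inner loop: process every available slot, then publish the sequence.
public class BatchLoopSketch {

    interface Handler {
        void onEvent(String event, long sequence, boolean endOfBatch);
    }

    // Returns the sequence value that would be published via
    // sequence.set(nextSequence - 1L) once the whole batch completes.
    static long processBatch(String[] ring, long nextSequence,
                             long availableSequence, Handler handler) {
        while (nextSequence <= availableSequence) {
            // endOfBatch is true only for the last available slot.
            handler.onEvent(ring[(int) nextSequence], nextSequence,
                    nextSequence == availableSequence);
            nextSequence++;
        }
        return nextSequence - 1L;
    }
}
```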
>>>
>>
>> Yes, I agree, the default batch processor can be used in this scenario.
>> The idea behind writing a custom batch processor was to integrate our
>> custom concurrent batching logic. Yes, we can move the custom batching
>> logic to the event handler and use the default Disruptor. The initial
>> idea was to keep the batching logic in the batch processor and the logic
>> for handling batched events in the event handler.
>>
>
> If the requirement is to separate the batching logic from the handler,
> what if we write a handler with the batching logic and, inside that
> handler, call our batch content reading handler?
>

+1
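A rough sketch of what that could look like, assuming we keep the default Disruptor processor. The interface and all names below are hypothetical stand-ins for illustration, not the actual Andes or Disruptor types:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a handler that owns the batching logic and hands each
// completed batch to the content reading logic. Because the default
// BatchEventProcessor only advances its sequence after onEvent returns for
// the endOfBatch event, downstream handlers never see unread content.
public class BatchingContentReadHandler {

    interface ContentBatchReader {
        void readBatch(List<String> messageIds);
    }

    private final List<String> batch = new ArrayList<>();
    private final int maxBatchSize;
    private final ContentBatchReader reader;

    BatchingContentReadHandler(int maxBatchSize, ContentBatchReader reader) {
        this.maxBatchSize = maxBatchSize;
        this.reader = reader;
    }

    // Same shape as Disruptor's EventHandler#onEvent.
    public void onEvent(String messageId, long sequence, boolean endOfBatch) {
        batch.add(messageId);
        // Flush when the batch is full or no more events are pending.
        if (batch.size() >= maxBatchSize || endOfBatch) {
            reader.readBatch(new ArrayList<>(batch));
            batch.clear();
        }
    }
}
```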

>
>
>>
>>
>>>
>>>
>>>
>>>>
>>>>>>> What I understood about the batching mechanism is that if we have
>>>>>>> two parallel readers, one will batch odd sequences and the other
>>>>>>> will batch even sequences. Can't we batch neighboring ones together?
>>>>>>> i.e. when there are two parallel readers, sequences 1 and 2 are done
>>>>>>> by one handler, and 3 and 4 by the other. In that mechanism, if we
>>>>>>> have 5 items to batch, 5 readers, and a batch size of five, only one
>>>>>>> handler will do the batching. But in the current implementation all
>>>>>>> 5 readers will be involved in batching (each handler will do one
>>>>>>> item).
>>>>>>>
>>>>>>
>>>>>> This is a possible improvement I thought of for inbound event
>>>>>> batching as well. But at high message rates, where we need the
>>>>>> batching performance, this type of sparse batching doesn't happen.
>>>>>> Yes, I agree that the mentioned approach would batch events much
>>>>>> better in all scenarios.
>>>>>>
>>>>>>
>>>>>> BTW, any ideas on batching using content chunks rather than content
>>>>>> length? This would give much better control over the batching
>>>>>> process.
>>>>>>
>>>>> What is batching using content length?
>>>>>
>>>>
>>>> Currently, what we can retrieve from metadata is the content length of
>>>> a message. (To get the number of chunks we would need the chunk size
>>>> from a reliable source.) Therefore we use the content length of each
>>>> message and aggregate the value until we reach a specified maximum
>>>> aggregate content length to batch messages. This is suboptimal: we have
>>>> no guarantee of how many message chunks will be received from the DB in
>>>> one call, since that depends on the message sizes. I think a better
>>>> approach would be to batch by content chunks, where we have a guarantee
>>>> of the maximum number of chunks requested in one DB query. Any ideas on
>>>> this?
>>>>
>>> Yes, +1 for batching using content chunks. Can we get the number of
>>> chunks for a message ID from AndesMetadata or from any other place?
>>>
>>
>> I couldn't find a way to get the number of chunks in a message from
>> metadata. The only information we can get is the content length, through
>> StorableMessageMetaData#getContentSize().
>> If we can get the current content chunk size reliably
>> (org.wso2.andes.configuration.qpid.ServerConfiguration has the default
>> chunk size, but this value can change), we can derive the chunk count per
>> message. Otherwise we might need to add the content chunk count to the
>> metadata when persisting messages.
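If the chunk size can be pinned down reliably, the derivation is just a ceiling division. A minimal sketch (class and method names are hypothetical, and it assumes the chunk size used at persist time is known and fixed):

```java
// Hypothetical sketch: deriving a message's chunk count from the content
// length available in metadata. This only works if the chunk size used when
// the message was persisted is known reliably and has not changed since.
public class ChunkCountSketch {

    static int chunkCount(long contentLength, int chunkSize) {
        // Ceiling division: a 1500-byte message stored in 1024-byte chunks
        // occupies 2 chunks.
        return (int) ((contentLength + chunkSize - 1) / chunkSize);
    }
}
```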
>>
>> Thanks,
>> Asitha
>>
>>
>>>> Thanks,
>>>> Asitha
>>>>
>>>>
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 13, 2015 at 5:18 AM, Asitha Nanayakkara <[email protected]
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Hi Pamod,
>>>>>>>>
>>>>>>>> The branch with the parallel read implementation:
>>>>>>>> https://github.com/asitha/andes/tree/parrallel-readers
>>>>>>>>
>>>>>>>>
>>>>>>>> You can configure the max content size to batch, meaning the average
>>>>>>>> content size of a batch. For smaller messages, setting a high
>>>>>>>> content size will lead to loading a lot of message chunks.
>>>>>>>>
>>>>>>>> The property can be added to broker.xml:
>>>>>>>>
>>>>>>>> performanceTuning/delivery/contentReadBatchSize
>>>>>>>>
>>>>>>>> @Asanka Please take a look for any issues or improvements. It would
>>>>>>>> be better if we could batch by content chunk count, I guess.
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Asitha Nanayakkara*
>>>>>>>> Software Engineer
>>>>>>>> WSO2, Inc. http://wso2.com/
>>>>>>>> Mob: + 94 77 85 30 682
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Asanka Abeyweera
>>>>>>> Software Engineer
>>>>>>> WSO2 Inc.
>>>>>>>
>>>>>>> Phone: +94 712228648
>>>>>>> Blog: a5anka.github.io
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Asitha Nanayakkara*
>>>>>> Software Engineer
>>>>>> WSO2, Inc. http://wso2.com/
>>>>>> Mob: + 94 77 85 30 682
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Asanka Abeyweera
>>>>> Software Engineer
>>>>> WSO2 Inc.
>>>>>
>>>>> Phone: +94 712228648
>>>>> Blog: a5anka.github.io
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Asitha Nanayakkara*
>>>> Software Engineer
>>>> WSO2, Inc. http://wso2.com/
>>>> Mob: + 94 77 85 30 682
>>>>
>>>>
>>>
>>>
>>> --
>>> Asanka Abeyweera
>>> Software Engineer
>>> WSO2 Inc.
>>>
>>> Phone: +94 712228648
>>> Blog: a5anka.github.io
>>>
>>
>>
>>
>> --
>> *Asitha Nanayakkara*
>> Software Engineer
>> WSO2, Inc. http://wso2.com/
>> Mob: + 94 77 85 30 682
>>
>>
>
>
> --
> Asanka Abeyweera
> Software Engineer
> WSO2 Inc.
>
> Phone: +94 712228648
> Blog: a5anka.github.io
>



-- 
*Asitha Nanayakkara*
Software Engineer
WSO2, Inc. http://wso2.com/
Mob: + 94 77 85 30 682
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
