Re: Inserts from multiple threads

Charlie Black Tue, 28 Nov 2017 11:37:01 -0800

Socket Buffers - good catch.

On Tue, Nov 28, 2017 at 11:33 AM Udo Kohlmeyer <[email protected]>
wrote:


> Another thing to keep in mind: putAll does have a notion of
> micro-batching. In a partitioned region, the client will try to use
> single-hop semantics and send each server only the entries relevant to
> that server.
> But as everybody else has already stated, you'll have to test what your
> "optimal" batch size is, AND maybe tune your socket buffers to match.
>
> --Udo
>
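
To picture what "only send the entries relevant to a server to that server" means, here is a toy sketch. It is not Geode's routing code: the class name SingleHopSketch, the hash-mod-bucket rule, and the bucket-to-server rule are all assumptions made up for illustration. A real client learns bucket locations from metadata supplied by the servers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the Geode API.
final class SingleHopSketch {

    /**
     * Groups entries by an assumed owning server so that each server
     * would receive one sub-batch containing only its own keys.
     * Keys must be non-null.
     */
    static <K, V> Map<Integer, Map<K, V>> groupByOwner(
            Map<K, V> entries, int bucketCount, int serverCount) {
        if (bucketCount <= 0 || serverCount <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        Map<Integer, Map<K, V>> perServer = new TreeMap<>();
        for (Map.Entry<K, V> e : entries.entrySet()) {
            // Assumed rule: key hash -> bucket, bucket -> server.
            int bucket = Math.floorMod(e.getKey().hashCode(), bucketCount);
            int server = bucket % serverCount;
            perServer.computeIfAbsent(server, s -> new HashMap<>())
                     .put(e.getKey(), e.getValue());
        }
        return perServer;
    }

    public static void main(String[] args) {
        Map<Integer, String> batch = new HashMap<>();
        for (int i = 0; i < 10; i++) {
            batch.put(i, "value-" + i);
        }
        // One sub-batch per assumed server.
        groupByOwner(batch, 4, 2).forEach((server, sub) ->
                System.out.println("server " + server + " gets " + sub.size() + " entries"));
    }
}
```

The point of the sketch is only that a single client-side putAll can fan out into several smaller per-server requests, which is one reason the best batch size has to be measured rather than guessed.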
> On Tue, Nov 28, 2017 at 11:21 AM, Charlie Black <[email protected]> wrote:
>
>> Sure, 50 to 1000 key/values in a putAll - just add metrics and see what
>> works best for your environment.   When trying to achieve the best
>> performance, think about amortizing network overhead and parallelizing
>> the storage request (putAll).
>>
>> I would like to point out that more threads aren't necessarily better.
>> Geode does a great job of making sure it's kind to the network and of
>> shuffling the data to the right nodes.   So we have to think about
>> whether there are enough cores/horsepower to perform the unit of work
>> from the client to the servers.
>>
>> Regards,
>>
>> Charlie
>>
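
The "add metrics and try different batch sizes" advice could be sketched roughly as below. This is a minimal sketch, not a benchmark harness: BatchedPutAll, chunk and putInBatches are hypothetical names, and a ConcurrentHashMap stands in for the region so the example runs without a cluster. Since a Geode Region implements java.util.Map, the same putAll call should apply to a real region, but timings from an in-memory map say nothing about a real network.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; not part of the Geode API.
final class BatchedPutAll {

    /** Splits entries into consecutive chunks of at most batchSize entries. */
    static <K, V> List<Map<K, V>> chunk(Map<K, V> entries, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Map<K, V>> chunks = new ArrayList<>();
        Map<K, V> current = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries.entrySet()) {
            current.put(e.getKey(), e.getValue());
            if (current.size() == batchSize) {
                chunks.add(current);
                current = new LinkedHashMap<>();
            }
        }
        if (!current.isEmpty()) {
            chunks.add(current); // last, possibly smaller, chunk
        }
        return chunks;
    }

    /** Writes entries to target one batch per putAll; returns elapsed nanoseconds. */
    static <K, V> long putInBatches(Map<K, V> target, Map<K, V> entries, int batchSize) {
        long start = System.nanoTime();
        for (Map<K, V> batch : chunk(entries, batchSize)) {
            target.putAll(batch);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Map<Integer, String> data = new HashMap<>();
        for (int i = 0; i < 5000; i++) {
            data.put(i, "value-" + i);
        }
        // Try the batch sizes discussed in the thread and record each timing.
        for (int batchSize : new int[] {50, 100, 500, 1000}) {
            Map<Integer, String> target = new ConcurrentHashMap<>(); // stand-in for a region
            long nanos = putInBatches(target, data, batchSize);
            System.out.printf("batchSize=%d elapsed=%d us%n", batchSize, nanos / 1000);
        }
    }
}
```

In a real test the target would be the client-side region, the loop would be repeated enough times to smooth out warm-up effects, and the timings would be recorded alongside whatever metrics the application already collects.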
>> On Tue, Nov 28, 2017 at 10:06 AM Amit Pandey <[email protected]>
>> wrote:
>>
>>> Thanks guys. Much appreciated.
>>>
>>> Charlie, do you mean batches of, say, 50-100 for putAlls?
>>>
>>> Regards
>>>
>>> On Tue, Nov 28, 2017 at 11:15 PM, Charlie Black <[email protected]>
>>> wrote:
>>>
>>>> Both are correct and incorrect at the same time - it depends on
>>>> your application, domain model, workload and physical environment.   I
>>>> would recommend adding some metrics, following what Akihiro mentioned,
>>>> and using what works for your environment.
>>>>
>>>> As a side note: I would also recommend trying smaller batches in
>>>> your testing.
>>>>
>>>> Regards,
>>>>
>>>> Charlie
>>>>
>>>> On Tue, Nov 28, 2017 at 8:32 AM Amit Pandey <[email protected]>
>>>> wrote:
>>>>
>>>>> Hey, thanks for the answer. I guess I didn't explain it correctly. I am
>>>>> not trying to do single puts from threads.
>>>>>
>>>>> So my situation is:
>>>>>
>>>>> I can do 500 inserts from each of 10 threads via putAll,
>>>>>
>>>>> or I can just collect them all (5000) and do a single putAll.
>>>>>
>>>>> Which one is the correct approach?
>>>>>
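
The two options in the question (10 threads each doing a putAll of 500, versus collecting all 5000 and doing one putAll) could be compared with a sketch along these lines. PutAllStrategies and its methods are hypothetical names, and a ConcurrentHashMap stands in for the region, so this only shows the shape of the test, not real Geode numbers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical comparison harness; not part of the Geode API.
final class PutAllStrategies {

    /** Option 1: every batch is written by a worker thread with its own putAll. */
    static <K, V> long putAllPerThread(Map<K, V> target, List<Map<K, V>> batches, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (Map<K, V> batch : batches) {
                pending.add(pool.submit(() -> target.putAll(batch)));
            }
            for (Future<?> f : pending) {
                try {
                    f.get(); // wait, and surface any worker failure
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", ex);
                } catch (ExecutionException ex) {
                    throw new IllegalStateException("a putAll task failed", ex.getCause());
                }
            }
            return System.nanoTime() - start;
        } finally {
            pool.shutdown();
        }
    }

    /** Option 2: merge every batch on the calling thread, then do one putAll. */
    static <K, V> long collectThenPutAll(Map<K, V> target, List<Map<K, V>> batches) {
        long start = System.nanoTime();
        Map<K, V> merged = new HashMap<>();
        for (Map<K, V> batch : batches) {
            merged.putAll(batch);
        }
        target.putAll(merged);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // 10 batches of 500 entries each, as in the question.
        List<Map<Integer, String>> batches = new ArrayList<>();
        for (int b = 0; b < 10; b++) {
            Map<Integer, String> batch = new HashMap<>();
            for (int i = 0; i < 500; i++) {
                batch.put(b * 500 + i, "value");
            }
            batches.add(batch);
        }
        Map<Integer, String> regionA = new ConcurrentHashMap<>(); // stand-in for a region
        Map<Integer, String> regionB = new ConcurrentHashMap<>(); // stand-in for a region
        System.out.println("per-thread putAll ns: " + putAllPerThread(regionA, batches, 10));
        System.out.println("single putAll ns:     " + collectThenPutAll(regionB, batches));
    }
}
```

Both methods leave the target with the same 5000 entries; which one finishes sooner against a real cluster depends on the factors discussed in this thread (cores, network, region type), so the measurement has to be taken in the actual environment.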
>>>>> On Mon, Nov 27, 2017 at 8:07 AM, Akihiro Kitada <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello Amit,
>>>>>>
>>>>>> >Now my question is will it be faster to do it on the individual
>>>>>> >threads and just return that they have completed the task so that they
>>>>>> >can be sent back to the caller or the way we do it now, i.e. collect all
>>>>>> >data and insert, is better?
>>>>>>
>>>>>> It depends on the workload and cluster configuration (data size, number
>>>>>> of entries, number of threads, number of members, region type and so
>>>>>> on), although putAll could be more efficient in terms of throughput per
>>>>>> thread.
>>>>>>
>>>>>> I recommend trying both ways based on the expected workload and
>>>>>> configuration.
>>>>>>
>>>>>> Thanks, regards.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Akihiro Kitada  |  Staff Customer Engineer |  +81 80 3716 3736
>>>>>> Support.Pivotal.io <http://support.pivotal.io/>  |  Mon-Fri  9:00am
>>>>>> to 5:30pm JST  |  1-877-477-2269
>>>>>>
>>>>>>
>>>>>> 2017-11-26 0:33 GMT+09:00 Amit Pandey <[email protected]>:
>>>>>>
>>>>>>> Hey Guys,
>>>>>>>
>>>>>>> I have a question. So I have a function which calls some threads to
>>>>>>> get data to be inserted into a region. It collects all the data and then
>>>>>>> puts them into a region with putAll.
>>>>>>>
>>>>>>> Now my question is will it be faster to do it on the individual
>>>>>>> threads and just return that they have completed the task so that they
>>>>>>> can be sent back to the caller or the way we do it now, i.e. collect all
>>>>>>> data and insert, is better?
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> [email protected] | +1.858.480.9722
>>>>
>>>
>>> --
>> [email protected] | +1.858.480.9722
>>
>
>
>
> --
> Kindest Regards
> -----------------------------
> *Udo Kohlmeyer* | *Pivotal*
> [email protected]
> <http://www.gopivotal.com/>
> www.pivotal.io
>
-- 
[email protected] | +1.858.480.9722
