Re: Inserts from multiple threads

Charlie Black Tue, 28 Nov 2017 11:22:37 -0800

Sure, 50 to 1000 key/values in a putAll - just add metrics and see what
works best for your environment.   To get the best performance, think
about amortizing network overhead and parallelizing the storage requests
(putAll).
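
To make the batch-size experiment concrete, here is a minimal sketch of splitting a collected map into fixed-size batches and timing one putAll per batch. The class and method names (PutAllBatcher, partition, timedPutAll) are hypothetical and not part of Geode. The code is written against java.util.Map so it runs without a cluster; a Geode Region extends ConcurrentMap, so it could be passed as the target.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for comparing putAll batch sizes; not a Geode API.
final class PutAllBatcher {

    // Split `data` into consecutive chunks holding at most `batchSize` entries each.
    static <K, V> List<Map<K, V>> partition(Map<K, V> data, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Map<K, V>> batches = new ArrayList<>();
        Map<K, V> current = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : data.entrySet()) {
            current.put(e.getKey(), e.getValue());
            if (current.size() == batchSize) {
                batches.add(current);
                current = new LinkedHashMap<>();
            }
        }
        // Keep the final, possibly smaller, batch.
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Issue one putAll per batch, sequentially, and return the elapsed nanoseconds.
    static <K, V> long timedPutAll(Map<K, V> target, List<Map<K, V>> batches) {
        long start = System.nanoTime();
        for (Map<K, V> batch : batches) {
            target.putAll(batch);
        }
        return System.nanoTime() - start;
    }
}
```

Running timedPutAll for several batch sizes (say 50, 100, 500, 1000) against the real region gives the kind of metric to compare; the numbers from a local map say nothing about network cost.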


I would like to point out that more threads isn't necessarily better.   Geode
does a great job of making sure it's kind to the network and shuffling the
data to the right nodes.   So we have to think about whether there are enough
cores/horsepower to perform the unit of work from the client to the servers.
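
As a rough illustration of keeping the thread count tied to available cores, the sketch below submits one putAll per batch to a fixed pool whose size is capped at the core count the JVM reports. Capping at availableProcessors() is only one possible heuristic and an assumption here, not Geode guidance; the names ParallelPutAll, boundedThreads and putAllInParallel are hypothetical. The target must be safe for concurrent writes (a ConcurrentMap, which a Geode Region is).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; not a Geode API.
final class ParallelPutAll {

    // Cap the requested worker count at the number of cores the JVM reports (assumed heuristic).
    static int boundedThreads(int requested) {
        int cores = Runtime.getRuntime().availableProcessors();
        return Math.max(1, Math.min(requested, cores));
    }

    // Submit one putAll per batch to a fixed pool and block until all have finished.
    static <K, V> void putAllInParallel(Map<K, V> target, List<Map<K, V>> batches,
                                        int requestedThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(boundedThreads(requestedThreads));
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Map<K, V> batch : batches) {
                pending.add(pool.submit(() -> target.putAll(batch)));
            }
            for (Future<?> f : pending) {
                f.get(); // surfaces any failure raised inside a worker
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for putAll", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a putAll batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Timing this with different requestedThreads values, alongside different batch sizes, is one way to see where extra threads stop helping in a given environment.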

Regards,

Charlie

On Tue, Nov 28, 2017 at 10:06 AM Amit Pandey <[email protected]>
wrote:

> Thanks guys. Much appreciated.
>
> Charlie, do you mean batches of, say, 50-100 for putAlls?
>
> Regards
>
> On Tue, Nov 28, 2017 at 11:15 PM, Charlie Black <[email protected]> wrote:
>
>> Both are correct and incorrect at the same time - it depends on
>> your application, domain model, workload and physical environment.   I
>> would recommend adding some metrics, following what Akihiro mentioned, and
>> using what works for your environment.
>>
>> As a side note: I would also recommend trying smaller batches in
>> your testing.
>>
>> Regards,
>>
>> Charlie
>>
>> On Tue, Nov 28, 2017 at 8:32 AM Amit Pandey <[email protected]>
>> wrote:
>>
>>> Hey, thanks for the answer. I guess I didn't explain it correctly. I am
>>> not trying to do single puts from threads.
>>>
>>> So my situation is:
>>>
>>> I can do 500 inserts from each of 10 threads via putAll
>>>
>>> or I can just collect them all (5000) and do a single putAll.
>>>
>>> Which one is the correct approach?
>>>
>>> On Mon, Nov 27, 2017 at 8:07 AM, Akihiro Kitada <[email protected]>
>>> wrote:
>>>
>>>> Hello Amit,
>>>>
>>>> >Now my question is will it be faster to do it on the individual
>>>> threads and just return that they have completed the task so that they can
>>>> be sent back to the caller, or is the way we do it now, i.e. collect all data
>>>> and insert, better?
>>>>
>>>> It depends on the workload and cluster configuration (data size, number of
>>>> entries, number of threads, number of members, region type and so on),
>>>> although putAll could be more efficient in terms of throughput per thread.
>>>>
>>>> I recommend trying both ways based on the possible workload and
>>>> configuration.
>>>>
>>>> Thanks, regards.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Akihiro Kitada  |  Staff Customer Engineer |  +81 80 3716 3736
>>>> Support.Pivotal.io <http://support.pivotal.io/>  |  Mon-Fri  9:00am to
>>>> 5:30pm JST  |  1-877-477-2269
>>>>
>>>>
>>>> 2017-11-26 0:33 GMT+09:00 Amit Pandey <[email protected]>:
>>>>
>>>>> Hey Guys,
>>>>>
>>>>> I have a question. So I have a function which calls some threads to
>>>>> get data to be inserted into a region. It collects all the data and then
>>>>> puts them into a region with putAll.
>>>>>
>>>>> Now my question is will it be faster to do it on the individual
>>>>> threads and just return that they have completed the task so that they can
>>>>> be sent back to the caller, or is the way we do it now, i.e. collect all data
>>>>> and insert, better?
>>>>>
>>>>> Regards
>>>>>
>>>>
>>>>
>>> --
>> [email protected] | +1.858.480.9722
>>
>
> --
[email protected] | +1.858.480.9722
