Re: fastest way to bulk insert in geode

John Blum Mon, 06 Mar 2017 10:36:51 -0800

Amit-

Note, a CacheLoader does not necessarily imply "loading data from a
database"; it can load data from any [external] data source and does so on
demand (i.e. lazily, on a cache miss).  However, as Mike points out, this
might not work for your Use Case in situations where you are querying, for
instance.
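
To make the lazy-loading point concrete, here is a minimal sketch in plain Java. It only models the behavior and is not Geode's API: `LazyCache`, its `loader` function and the `query` method are hypothetical names, whereas a real CacheLoader is attached to the Region. The sketch shows that a get on a missing key triggers the load, while a query only sees entries that are already in the cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-in for a region with a loader attached; not Geode's API.
class LazyCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // may read from any external source

    LazyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // On a cache miss the loader is invoked and its result is cached.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // A query scans only what is already cached; it never calls the loader.
    List<V> query(Predicate<V> filter) {
        List<V> result = new ArrayList<>();
        for (V value : entries.values()) {
            if (filter.test(value)) {
                result.add(value);
            }
        }
        return result;
    }

    int size() {
        return entries.size();
    }
}
```

With a loader such as `k -> "value-" + k`, a query on a fresh cache returns nothing until individual keys have been fetched with `get`, which is why lazy loading alone may not suit a query-driven use case.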


I guess the real question here is, what is the requirement to pre-load this
data quickly?  What is the driving requirement here?

For instance, is the need to be able to bring another system online quickly
in case of "failover"?  If so, perhaps an architectural change is more
appropriate, such as an Active/Passive arch (using WAN).

-j



On Mon, Mar 6, 2017 at 9:45 AM, Amit Pandey <[email protected]>
wrote:

> We might need that actually... The problem is we can't use a dataloader
> because we are not loading from a database. So we have to use putAll. It's
> taking 2 seconds for over 30000 entries. If implementing it will bring that
> down, that would be helpful.
> On 06-Mar-2017 10:05 pm, "Michael Stolz" <[email protected]> wrote:
>
>> Of course if you're REALLY in need of speed you can write your own custom
>> implementations of toData and fromData for the DataSerializable Interface.
>>
>> I haven't seen anyone need that much speed in a long time though.
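
As a rough illustration of the custom toData/fromData idea, here is a minimal sketch that compiles with the JDK alone. `Trade` and its fields are hypothetical. In Geode the class would declare `implements org.apache.geode.DataSerializable`; that clause is left out here as an assumption, so the sketch runs without Geode on the classpath. The `roundTrip` helper exists only to exercise the two methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical value class; the Geode interface is omitted (see lead-in).
class Trade {
    private long id;
    private String symbol;
    private double price;

    // A public zero-argument constructor is needed so an empty instance
    // can be created before fromData fills it in.
    public Trade() { }

    Trade(long id, String symbol, double price) {
        this.id = id;
        this.symbol = symbol;
        this.price = price;
    }

    // Write the fields in a fixed order, with no reflection involved.
    public void toData(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(symbol);
        out.writeDouble(price);
    }

    // Read the fields back in exactly the same order they were written.
    public void fromData(DataInput in) throws IOException, ClassNotFoundException {
        id = in.readLong();
        symbol = in.readUTF();
        price = in.readDouble();
    }

    long getId() { return id; }
    String getSymbol() { return symbol; }
    double getPrice() { return price; }

    // Sketch-only helper: serialize to a byte array and read it back.
    static Trade roundTrip(Trade original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.toData(out);
            }
            Trade copy = new Trade();
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.fromData(in);
            }
            return copy;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whether hand-written serialization actually speeds up a given putAll depends on the shape of the values, so it is worth measuring before and after.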
>>
>>
>> --
>>
>> Mike Stolz
>> Principal Engineer - Gemfire Product Manager
>> Mobile: 631-835-4771
>>
>> On Mar 3, 2017 11:16 PM, "Real Wes" <[email protected]> wrote:
>>
>>> Amit,
>>>
>>>
>>>
>>> John and Mike’s advice about tradeoffs is worth heeding. You’ll find
>>> that your speed is probably just fine with putAll, but if you just have to
>>> have NOS in your tank, you might consider, since you’re inside a function,
>>> doing the putAll from the function into your region but changing the region
>>> scope to distributed-no-ack.  See:
>>> https://geode.apache.org/docs/guide/developing/distributed_regions/choosing_level_of_dist.html
>>>
>>>
>>>
>>> Wes
>>>
>>>
>>>
>>> *From:* Amit Pandey [mailto:[email protected]]
>>> *Sent:* Friday, March 3, 2017 12:26 PM
>>> *To:* [email protected]
>>> *Subject:* Re: fastest way to bulk insert in geode
>>>
>>>
>>>
>>> Hey John,
>>>
>>>
>>>
>>> Thanks, I am planning to use Spring XD. But my current use case is that I
>>> am aggregating and doing some computation in a Function, and then I want to
>>> populate the region with the values. I have a map; is region.putAll the
>>> fastest?
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>>
>>> On Fri, Mar 3, 2017 at 10:52 PM, John Blum <[email protected]> wrote:
>>>
>>> You might consider using the Snapshot service [1] if you previously had
>>> data in a Region of another Cluster (for instance).
>>>
>>> If the data is coming from an external source, then Spring XD [2] is a
>>> great tool for moving (streaming) data from a source [3] to a sink [4].
>>> It also allows you to perform all manner of transformations/conversions,
>>> trigger events, and so on and so forth.
>>>
>>>
>>>
>>> -j
>>>
>>>
>>>
>>>
>>>
>>> [1] http://gemfire90.docs.pivotal.io/geode/managing/cache_snapshots/chapter_overview.html
>>>
>>> [2] http://projects.spring.io/spring-xd/
>>>
>>> [3] http://docs.spring.io/spring-xd/docs/1.3.1.RELEASE/reference/html/#sources
>>>
>>> [4] http://docs.spring.io/spring-xd/docs/1.3.1.RELEASE/reference/html/#sinks
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Mar 3, 2017 at 9:13 AM, Amit Pandey <[email protected]>
>>> wrote:
>>>
>>> Hey Guys,
>>>
>>>
>>>
>>> What's the fastest way to do a bulk insert in a region?
>>>
>>>
>>>
>>> I am using region.putAll; is there any alternative/faster API?
>>>
>>>
>>>
>>> regards
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> -John
>>>
>>> john.blum10101 (skype)
>>>
>>>
>>>
>>


-- 
-John
john.blum10101 (skype)
