Re: Load all data from DB on Cache Start

John Blum Mon, 26 Dec 2016 12:29:50 -0800

Amit-

Regarding...

*> I want to load all data on cache startup at a go.*

Since you are using "*Spring*", you could easily implement a *Spring*
BeanPostProcessor [1] (BPP) for each (or all the) *Region(s)* in which you
need to load data.  I do this frequently in *Spring Data GemFire/Geode's*
test suite when testing *Region* data access operations using the
GemfireTemplate, *Repositories* or things of that nature.  Clearly your BPP
could use a DataSource to load the data from an external data store (e.g.
RDBMS).
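
As a rough sketch, this is how such a BPP could be registered in *Spring* XML
config.  The class example.RegionDataLoadingBeanPostProcessor, its two
properties, the Region name and the "dataSource" bean are all hypothetical
names used for illustration; only the gfe elements come from Spring Data
GemFire/Geode.  A Spring ApplicationContext detects beans implementing
BeanPostProcessor automatically, so declaring the bean is enough.

```xml
<gfe:cache/>

<!-- The Region to pre-load; "ReferenceData" is only an example name -->
<gfe:replicated-region id="ReferenceData"/>

<!-- Hypothetical BPP: in postProcessAfterInitialization(..) it would look for
     the Region bean named below, query the DataSource and put the rows into
     the Region.  The "dataSource" bean is assumed to be defined elsewhere. -->
<bean class="example.RegionDataLoadingBeanPostProcessor">
  <property name="regionName" value="ReferenceData"/>
  <property name="dataSource" ref="dataSource"/>
</bean>
```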

Another way to load data on startup is to use a Geode *Initializer*.
However, this requires you to specify a snippet of cache.xml and does
not work if you define your *Regions* in *Spring* (XML/Java) config, as you
should when using *Spring*.  I don't recommend using cache.xml, but it is
the pure, non-*Spring* way to invoke logic after the cache has been "fully"
initialized (i.e. where the *Regions* have been defined in cache.xml).

See here [2] for more details.  Note, the documentation talks of "launching
an application" on startup, after cache initialization, but technically,
you can do whatever you want, like load data.
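
For illustration, a cache.xml declaring an *Initializer* looks roughly like
the following.  This is only a sketch: example.ReferenceDataInitializer and
its "table" parameter are hypothetical.  The class would implement Geode's
Declarable interface and do the loading in its init(Properties) method.

```xml
<cache xmlns="http://geode.apache.org/schema/cache"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://geode.apache.org/schema/cache
                           http://geode.apache.org/schema/cache/cache-1.0.xsd"
       version="1.0">

  <!-- Regions must be defined here, in cache.xml, for the Initializer to see them -->
  <region name="ReferenceData" refid="REPLICATE"/>

  <!-- Invoked once the cache has been initialized -->
  <initializer>
    <class-name>example.ReferenceDataInitializer</class-name>
    <!-- Hypothetical parameter, passed to init(Properties) -->
    <parameter name="table">
      <string>REFERENCE_DATA</string>
    </parameter>
  </initializer>
</cache>
```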

I recommend the BPP.

*> How should I set it up in config to allow it to join other nodes in
cluster?*

Regardless of whether your server data node is "embedded" or not, you can
still use a Locator or mcast to have the node join the cluster.  In the
"embedded" scenario, the "application" is itself a GemFire Server data node
and will be part of the cluster, as Udo said.

This is easily achievable with...

<util:properties id="gemfireProperties">
  <prop key="name">Example</prop>
  <!-- Set to non-zero value to use Multicast; comment out "locators" -->
  <prop key="mcast-port">0</prop>
  <prop key="log-level">${gemfire.log-level:config}</prop>
  <prop key="locators">someHost[10334]</prop>
  <prop key="start-locator">localhost[1034]</prop>
</util:properties>

<gfe:cache properties-ref="gemfireProperties"/>

...

As you can see from the snippet of *Spring* XML config above, this
application is a Geode "peer" cache (i.e. embeds a Geode data node/server).

The "*locators*" Geode/GemFire property enables this node to connect to a
cluster.  Likewise, you can use the "*mcast-port*" property instead;
however, I recommend *Locators* over mcast.

Additionally, you can see that I specified the "start-locator"
Geode/GemFire property, which enables me to start an embedded Locator.
This is useful for testing purposes and for connecting Geode data nodes
together in a cluster without a dedicated Locator, though this approach is
less resilient if the applications/servers go down (as may be the case in a
micro-services scenario)!
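
For completeness, a multicast-only variant of the same properties might look
like this.  It is a sketch under assumptions: the port 10335 is an arbitrary
example value, and every member of the cluster would need to use the same
mcast-port (and mcast-address, if you override the default).

```xml
<util:properties id="gemfireProperties">
  <prop key="name">Example</prop>
  <!-- Non-zero port enables multicast discovery; 10335 is just an example -->
  <prop key="mcast-port">10335</prop>
  <prop key="log-level">${gemfire.log-level:config}</prop>
  <!-- "locators" is intentionally left out -->
</util:properties>

<gfe:cache properties-ref="gemfireProperties"/>
```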

*> if I start with embedded server is it required to use client pool or is
it not required?*

A "client pool" is only applicable to cache clients (i.e. ClientCaches) on
the "client-side" of the equation.  "peers" find (Locator, mcast) and
communicate (TCP/UDP, JGroups) with each other through other means once a
cluster is formed.

In fact, it is more common to position your microservices-based
applications as Geode cache clients (i.e. <gfe:client-cache ...>) and have
them connect to a dedicated Geode service (i.e. a cluster of Geode
servers/data nodes where 1 or more of those nodes are also running a
"CacheServer", listening for cache clients to connect).  These dedicated
Geode server nodes in the cluster constituting the service can still be
configured with *Spring*, but they typically will not contain any
application-specific components other than CacheListeners, Loaders, Writers,
AEQ *Listeners*, etc.

ClientCache applications use 1 or more Pools configured to talk to the
servers in the cluster (either by way of a Locator or direct server
communication).  Pools can be configured with groups to target specific
members (in that group) in the cluster.  Typically, the members in 1 group
host a different set of Regions from those in another group, which is a way
to separate the data traffic of 1 client from that of another, with each
group dedicated to a specific resource/purpose (usually based on business
function, etc).
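
A minimal sketch of that client/server split in *Spring* XML config follows.
The bean ids, the group name "referenceData", the Region name and the
host/port values are assumptions for illustration, and the two halves belong
in separate applications (a server and a client).

```xml
<!-- Server-side config: a peer cache that also runs a CacheServer in a group -->
<gfe:cache properties-ref="gemfireProperties"/>
<gfe:cache-server port="40404" groups="referenceData"/>
<gfe:replicated-region id="ReferenceData"/>

<!-- Client-side config (separate application): a ClientCache whose Pool
     asks the Locator for servers in the "referenceData" group -->
<gfe:pool id="referenceDataPool" server-group="referenceData">
  <gfe:locator host="someHost" port="10334"/>
</gfe:pool>

<gfe:client-cache pool-name="referenceDataPool"/>

<!-- PROXY keeps no data locally; every operation goes to the servers -->
<gfe:client-region id="ReferenceData" pool-name="referenceDataPool" shortcut="PROXY"/>
```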

On a side note, some of what you want to do "scale-wise" seems like
a perfect fit for Pivotal CloudFoundry, which can auto-scale the nodes in
your cluster up or down based on load and other factors.

Anyway, hope this helps!

-John

[1]
http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension-bpp
[2]
http://geode.apache.org/docs/guide/basic_config/the_cache/setting_cache_initializer.html

On Sun, Dec 25, 2016 at 11:12 PM, Amit Pandey <[email protected]>
wrote:

> Hey,
>
> Thanks.
>
> I have lots of reference data which will be loaded at start of day. This
> data is not bound to change much and as such I want to keep it loaded at
> the start of day. Read through will make it slow while it is being actually
> accessed so I want to keep it loaded in memory.
>
> Also I want to have functions which will be called by clients to do some
> compute and return results. Using functions should allow me to add nodes
> and speed up the compute.
>
> I have some micro services each of which will start a gemfire node, and I
> want to connect, so yes I can set it up with locator.
>
> However I have one doubt, if I start with embedded server is it required
> to use client pool or is it not required?
>
> Regards
>
> On Mon, Dec 26, 2016 at 1:18 AM, Udo Kohlmeyer <[email protected]>
> wrote:
>
>> Hi there Amit,
>>
>> At this stage the only way you could load all data at one go is to write
>> a client to connect to the db and load all in. Another approach could be to
>> write the same code into a function and invoke the function at start up.
>> But in both cases both are manual.
>>
>> To have geode servers join a cluster, you have 2 ways.
>>
>>    1. Connecting them up via a locator
>>    2. Connecting them up via mcast.
>>
>> Please be aware that once you connect a server to a cluster, that server
>> becomes an integral part of the cluster so adding/removing servers from a
>> cluster is not something you'd want to do in a load-based scaling model.
>> i.e if the load is high, add a server and if load is low, shut down a
>> server.
>>
>> Just for interest's sake, what is your use case?
>>
>> --Udo
>>
>> On 12/24/16 05:57, Amit Pandey wrote:
>>
>> Hi Guys,
>>
>> I am using Spring Data Geode. I have been able to use read and write
>> through/ write behind. I want to load all data on cache startup at a go.
>>
>> Secondly my geode server is embedded but I want to allow it to join other
>> nodes.  How should I set it up in config to allow it to join other nodes in
>> cluster?
>>
>> Regards
>>
>>
>>
>

-- 
-John
john.blum10101 (skype)
