Re: Load all data from DB on Cache Start

Udo Kohlmeyer Mon, 26 Dec 2016 14:15:06 -0800

it helps a lot! :D


On 12/26/16 12:28, John Blum wrote:

Amit-

Regarding...

> I want to load all data on cache startup at a go.
Since you are using "/Spring/", you could easily implement a /Spring/ BeanPostProcessor [1] (BPP) for each (or all the) /Region(s)/ in which you need to load data. I do this frequently in /Spring Data GemFire/Geode's/ test suite when testing /Region/ data access operations using the GemfireTemplate, /Repositories/ or things of that nature. Clearly, your BPP could use a DataSource to load the data from an external data store (e.g. an RDBMS).
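For illustration, here is a minimal sketch of how such a BPP might be wired in /Spring/ XML config. The class example.app.RegionDataLoadingBeanPostProcessor, its dataSource and regionName properties, the dataSource bean and the ReferenceData /Region/ are all hypothetical names, not part of /Spring/ or Geode. The assumption is that the class implements BeanPostProcessor and, in postProcessAfterInitialization(..), checks whether the bean is the /Region/ with the configured name and, if so, queries the DataSource and puts the rows into the /Region/ (e.g. with Region.putAll(..)). /Spring/ detects BPP beans declared in an ApplicationContext automatically, so declaring the bean is enough to register it.

<!-- Hypothetical BPP class; you would write this yourself -->
<bean class="example.app.RegionDataLoadingBeanPostProcessor">
  <!-- assumes a javax.sql.DataSource bean named "dataSource" is defined elsewhere -->
  <property name="dataSource" ref="dataSource"/>
  <!-- hypothetical property naming the Region to populate -->
  <property name="regionName" value="ReferenceData"/>
</bean>

<!-- The Region the BPP would populate once Spring has initialized it -->
<gfe:replicated-region id="ReferenceData"/>

Note that BPPs are instantiated early in the container lifecycle, so any beans they reference (here, the DataSource) are created early as well.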
Another way to load data on startup is to use a Geode /Initializer/. However, this would require you to specify a snippet of cache.xml and does not work if you specify your /Regions/ in /Spring/ (XML/Java) config, as you should when using /Spring/. I also don't recommend using cache.xml, but it is the pure, non-/Spring/ way to invoke logic after the cache has been "fully" initialized (i.e. where the /Regions/ have been defined in cache.xml).
See here [2] for more details. Note, the documentation talks of "launching an application" on startup, after cache initialization, but technically, you can do whatever you want, like load data.
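As a rough sketch of the cache.xml route, the declaration might look like the following. The class example.app.ReferenceDataInitializer, the /Region/ name and the parameter name and value are hypothetical; the class is assumed to implement Geode's Declarable interface and to do the loading in its init method, which receives the declared parameters. Namespace declarations on the <cache> element are omitted for brevity.

<cache>
  <!-- Regions must be defined in cache.xml for the Initializer to see them -->
  <region name="ReferenceData" refid="REPLICATE"/>

  <!-- Invoked after the cache has been initialized; the class is hypothetical -->
  <initializer>
    <class-name>example.app.ReferenceDataInitializer</class-name>
    <parameter name="batchSize">
      <string>1000</string>
    </parameter>
  </initializer>
</cache>
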
I recommend the BPP.
> How should I set it up in config to allow it to join other nodes in cluster?
Regardless of whether your server data node is "embedded" or not, you can still use a Locator, or mcast, to have the node join the cluster. In the "embedded" scenario, where the "application" is a GemFire Server data node, it will be part of the cluster, as Udo said.
This is easily achievable with...

<util:properties id="gemfireProperties">
  <prop key="name">Example</prop>

  <prop key="mcast-port">0</prop>
  <prop key="log-level">${gemfire.log-level:config}</prop>
  <prop key="locators">someHost[10334]</prop>
  <prop key="start-locator">localhost[1034]</prop>
</util:properties>

<gfe:cache properties-ref="gemfireProperties"/>

...
As you can see from the snippet of /Spring/ XML config above, this application is a Geode "peer" cache (i.e. embeds a Geode data node/server).
The "*locators*" Geode/GemFire property enables this node to connect to a cluster. Likewise, you can use the "*mcast-port*" property instead; however, I would recommend /Locators/ over mcast.
Additionally, you can see that I specified the "start-locator" Geode/GemFire property, which enables me to start an embedded Locator. This is useful for testing purposes and for connecting Geode data nodes together in a cluster without a dedicated Locator, though this approach is less resilient if the applications/servers go down (as may be the case in a micro-services scenario)!
> if I start with embedded server is it required to use client pool or is it not required?
A "client pool" is only applicable to cache clients (i.e. ClientCaches) on the "client-side" of the equation. "peers" find (Locator, mcast) and communicate (TCP/UDP, JGroups) with each other through other means once a cluster is formed.
In fact, typically, it is more common to position your microservices-based applications as Geode cache clients (i.e. <gfe:client-cache ...>) and have them connect to a dedicated Geode service (i.e. a cluster of Geode servers/data nodes where 1 or more of those nodes are also running a "CacheServer", listening for cache clients to connect). These dedicated Geode server nodes in a cluster constituting the service can still be configured with /Spring/, but they typically will not contain any application-specific components other than CacheListeners, Loaders, Writers, AEQ /Listeners/, etc.
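To make the server side of that arrangement concrete, a dedicated server node configured with /Spring/ might look roughly like this sketch. The member name, the group name "reference-data" and the /Region/ are assumptions for illustration; 40404 is Geode's default cache server port and someHost[10334] is simply the Locator used in the snippet above.

<util:properties id="gemfireProperties">
  <prop key="name">ReferenceDataServer</prop>
  <prop key="locators">someHost[10334]</prop>
  <!-- hypothetical member group that clients could target -->
  <prop key="groups">reference-data</prop>
</util:properties>

<gfe:cache properties-ref="gemfireProperties"/>

<!-- Starts a CacheServer so cache clients can connect to this peer -->
<gfe:cache-server port="40404"/>

<gfe:replicated-region id="ReferenceData"/>
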
ClientCache applications use 1 or more Pools configured to talk to the servers in the cluster (either by way of Locator or direct server communication). Pools can be configured with groups to target specific members (in that group) in the cluster. Typically, members in 1 group host a different set of Regions from members in another group, which is a way to separate data traffic from 1 client to another, each dedicated to a specific resource/purpose (usually based on business function, etc).
On a side note, some of what you are wanting to do "scale-wise" seems like a perfect fit for Pivotal CloudFoundry, which can auto-scale up or down nodes in your cluster based on load and other factors.
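A minimal client-side sketch of such a Pool in /Spring/ XML config follows. The Pool id, the /Region/ name and the server group "reference-data" are hypothetical, and the sketch assumes servers in the cluster were started as members of that group (e.g. via the "groups" Geode property) and that a Locator is running at someHost[10334].

<!-- Pool that finds servers through the Locator and only targets the given group -->
<gfe:pool id="referenceDataPool" server-group="reference-data">
  <gfe:locator host="someHost" port="10334"/>
</gfe:pool>

<gfe:client-cache pool-name="referenceDataPool"/>

<!-- PROXY keeps no local copy; all operations go to the servers through the Pool -->
<gfe:client-region id="ReferenceData" pool-name="referenceDataPool" shortcut="PROXY"/>

A second Pool with a different server-group could be declared the same way to route another client /Region/ to a different set of servers.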
Anyway, hope this helps!

-John
[1] http://docs.spring.io/spring/docs/current/spring-framework-reference/htmlsingle/#beans-factory-extension-bpp
[2] http://geode.apache.org/docs/guide/basic_config/the_cache/setting_cache_initializer.html
On Sun, Dec 25, 2016 at 11:12 PM, Amit Pandey wrote:
    Hey,

    Thanks.

    I have lots of reference data which will be loaded at start of
    day. This data is not bound to change much and as such I want to
    keep it loaded at the start of day. Read through will make it slow
    while it is being actually accessed so I want to keep it loaded in
    memory.

    Also I want to have functions which will be called by clients to
    do some compute and return results. Using functions should allow
    me to add nodes and speed up the compute.

    I have some micro services each of which will start a gemfire
    node, and I want to connect, so yes I can set it up with locator.

    However I have one doubt, if I start with embedded server is it
    required to use client pool or is it not required?

    Regards

    On Mon, Dec 26, 2016 at 1:18 AM, Udo Kohlmeyer wrote:

        Hi there Amit,

        At this stage the only way you could load all data in one go
        is to write a client to connect to the db and load it all in.
        Another approach could be to write the same code into a
        function and invoke the function at start up. But both
        approaches are manual.

        To have geode servers join a cluster, you have 2 ways.

         1. Connecting them up via a locator
         2. Connecting them up via mcast.

        Please be aware that once you connect a server to a cluster,
        that server becomes an integral part of the cluster, so
        adding/removing servers from a cluster is not something you'd
        want to do in a load-based scaling model, i.e. if the load is
        high, add a server and if load is low, shut down a server.

        Just for interest's sake, what is your use case?

        --Udo


        On 12/24/16 05:57, Amit Pandey wrote:
        Hi Guys,

        I am using Spring Data Geode. I have been able to use read
        and write through/ write behind. I want to load all data on
        cache startup at a go.

        Secondly, my geode server is embedded but I want to allow it
        to join other nodes.  How should I set it up in config to
        allow it to join other nodes in the cluster?

        Regards
--
-John
john.blum10101 (skype)
