Yep. This will be a good challenge for our second release ;)
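For anyone following along, here is a minimal sketch of the pointer-map idea discussed in the thread below. The class and field names are illustrative only, not the actual DirectMemory API: the on-heap map holds small Pointer objects (the index), while the payloads live in off-heap direct buffers.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class OffHeapIndexSketch {

    // A Pointer records where a value lives, not the value itself.
    static final class Pointer {
        final int bufferIndex; // which direct buffer holds the payload
        final int offset;      // start of the payload within that buffer
        final int length;      // payload size in bytes

        Pointer(int bufferIndex, int offset, int length) {
            this.bufferIndex = bufferIndex;
            this.offset = offset;
            this.length = length;
        }
    }

    private final ByteBuffer[] buffers; // allocateDirect'd slabs
    private final ConcurrentMap<String, Pointer> index =
            new ConcurrentHashMap<String, Pointer>();

    OffHeapIndexSketch(int slabs, int slabSize) {
        buffers = new ByteBuffer[slabs];
        for (int i = 0; i < slabs; i++) {
            buffers[i] = ByteBuffer.allocateDirect(slabSize);
        }
    }

    // Replicating "the map" means replicating these small Pointer entries,
    // not the byte arrays they refer to.
    byte[] retrieve(String key) {
        Pointer p = index.get(key);
        if (p == null) return null;
        byte[] payload = new byte[p.length];
        // Duplicate so concurrent readers don't fight over position/limit.
        ByteBuffer view = buffers[p.bufferIndex].duplicate();
        view.position(p.offset);
        view.get(payload);
        return payload;
    }
}
```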
On Wed, Oct 19, 2011 at 5:33 PM, Akash Ashok <[email protected]> wrote:

> On Wed, Oct 19, 2011 at 7:01 PM, Raffaele P. Guidi
> <[email protected]> wrote:
>
> > Sorry, Ashish, but I think there must be a misunderstanding: the map
> > doesn't contain the actual data, it is just the index to the data itself,
> > which is in the off-heap memory. In fact it is a collection of Pointer
> > objects, which contain the offset and the length of the DirectBuffer that
> > contains the actual byte array. So: replicating the map (which is
> > natively offered by both hc and terracotta) means replicating the INDEX
> > of the data, not the data itself.
>
> Ah, this is a good point - it would eliminate the SPOF (single point of
> failure) of the entire cache.
>
> > Again: replication of the map (index) is one matter, distribution of the
> > data is a different question. I'm not proposing to use terracotta or
> > hazelcast for their caching features but for their *clustering* features.
>
> Well, distribution of data at face value, without replication, I presume
> wouldn't be highly complicated. Assume a cluster of DirectMemory nodes: we
> have a load balancer which routes to a particular system, which stores the
> data off-heap and adds the Pointer to Hazelcast.
>
> So Hazelcast would be acting as a meta-store. But this would require 2
> round trips to fetch some data. Better than a SPOF for the cache.
>
> But replicating the data is where the real challenge would be. Group
> membership is pretty complex.
>
> > On Wed, Oct 19, 2011 at 2:46 PM, Ashish <[email protected]> wrote:
> >
> > > On Wed, Oct 19, 2011 at 5:41 PM, Raffaele P. Guidi
> > > <[email protected]> wrote:
> > > > Also, on replication/distribution, we have two distinct aspects:
> > > >
> > > > 1. *map replication* - the pointers map has to be replicated to all
> > > > nodes, and each pointer also has to contain a reference to the node
> > > > that "owns" the real data
> > > > 2. *communication between nodes* - once one node knows that an entry
> > > > is contained in node "n", it has to ask that node for it
> > > >
> > > > The first point is easily covered by terracotta or hazelcast, while
> > > > the second one should be implemented using an RPC mechanism (Thrift
> > > > or Avro are both good choices). Another option is to cover point 1
> > > > as well with a custom replication built on top of the chosen RPC
> > > > framework - of course this would lead to yet another (do we really
> > > > need it?) distributed map implementation.
> > >
> > > Disagree on this. Be it TC or Hazelcast, they can cover both points.
> > > Let's take the example of Terracotta. It's a client-server architecture
> > > with striping on the server side.
> > > Now if you choose TC (short for Terracotta), you have 3 options:
> > > 1. Use DSO, or Distributed Shared Object mode - needs instrumentation
> > > and other stuff, not recommended
> > > 2. Use Ehcache at the back, and TC takes care of distributing the data
> > > 3. Use a Map via the TC Toolkit
> > >
> > > TC will not let you know where it's storing the key (which in fact is
> > > stored in an HA manner on the server stripe). That's the beauty of TC.
> > > It does the faulting/flushing transparently to the user code.
> > >
> > > On the Hazelcast side, it does let you know where the key is, but the
> > > moment you use its client, it becomes transparent to you.
> > >
> > > IMHO, using any existing cache solution would complicate the user
> > > story.
> > >
> > > Distribution is a nice-to-have feature, and in fact would lead to
> > > wider adoption :)
> > >
> > > > Keeping things like this is easy - of course making it
> > > > efficient/performant is a different story (i.e., should I keep a
> > > > local cache of frequently accessed items stored in other nodes?
> > > > etc..).
> > > >
> > > > Ciao,
> > > >    R
> > >
> > > thanks
> > > ashish
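Appending a rough sketch of the "Hazelcast as meta-store" read path Akash describes above: two round trips, one to the replicated pointer map and one to the node that owns the bytes. Everything here is a hypothetical stand-in - in practice `pointerIndex` would be a replicated Hazelcast IMap and `RemoteStore` a client generated from a Thrift or Avro IDL.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ClusteredGetSketch {

    // Each replicated entry carries the owner node plus the off-heap location,
    // covering "aspect 1" (map replication with a node reference).
    static final class PointerMeta {
        final String ownerNode; // node that holds the actual bytes off-heap
        final int offset;
        final int length;

        PointerMeta(String ownerNode, int offset, int length) {
            this.ownerNode = ownerNode;
            this.offset = offset;
            this.length = length;
        }
    }

    // "Aspect 2": ask the owning node for the bytes (Thrift/Avro in practice).
    interface RemoteStore {
        byte[] fetch(String node, int offset, int length);
    }

    // Stand-in for the replicated map; identical contents on every node.
    private final ConcurrentMap<String, PointerMeta> pointerIndex =
            new ConcurrentHashMap<String, PointerMeta>();
    private final RemoteStore rpc;

    ClusteredGetSketch(RemoteStore rpc) {
        this.rpc = rpc;
    }

    byte[] get(String key) {
        // Round trip #1 - a local read if the map is fully replicated.
        PointerMeta meta = pointerIndex.get(key);
        if (meta == null) return null;
        // Round trip #2 - fetch the payload from the owning node.
        return rpc.fetch(meta.ownerNode, meta.offset, meta.length);
    }
}
```

Note the failure mode this buys us: losing a node loses its payloads but not the index, so a miss is detectable and the cache as a whole has no SPOF.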

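On Raffaele's closing question - should a node keep a local cache of frequently accessed items stored on other nodes? - the usual shape of that is a small LRU "near cache" in front of the remote read path. A minimal sketch, assuming the hypothetical ClusteredGetSketch above as the fallback; staleness is the obvious trade-off, so a real version would need invalidation or a TTL:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class NearCacheSketch {
    private final int capacity;
    private final Map<String, byte[]> lru;

    NearCacheSketch(int capacity) {
        this.capacity = capacity;
        // accessOrder=true turns LinkedHashMap into an LRU map.
        this.lru = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > NearCacheSketch.this.capacity;
            }
        };
    }

    synchronized byte[] get(String key, Function<String, byte[]> remoteFetch) {
        byte[] hit = lru.get(key);
        if (hit != null) return hit; // hot item served locally, no round trips
        byte[] fetched = remoteFetch.apply(key); // falls back to the cluster read path
        if (fetched != null) lru.put(key, fetched);
        return fetched;
    }
}
```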