Re: slices, collocation

Fernando Padilla Sun, 23 Nov 2008 12:22:57 -0800

hmm. So maybe I was too quick to say that the collocation constraint is too inhibiting. Coming from my expectations of what a sharding ORM system would provide for me, it definitely is too constraining. But I promise to put more thought into it; maybe in different use cases it's still ok. So I'll continue to think on this.

But I ask you guys to think about the use cases that can't be implemented and the usability costs that the collocation constraint places on the system.

I know that with sharding you can never execute a join across databases, so fancier queries will not execute as expected. But baking that limitation of sharding into the data model system itself seems like overdoing it. Just warning people that they have to be careful not to traverse relations that are not collocated would be fine.. we're not children after all :) :)

But like I said, we're taking a big bet that OpenJPA slices will fit our scale out requirements. So thank you! This is an amazing head start, and looks solidly built and coded. So I'll keep thinking on this, the limitations and possibilities :) And my complaints are pretty minor in the big picture.

For example, I have a work-around to the collocation constraint, I'm just seeing if we can make the system nicer and easier to use. My work-around would be to store references to objects (ids), not the objects themselves (cross db joins are impossible). Then in our application we'll load the referenced objects as desired.. So that we maintain the relations, not the ORM system...
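
A minimal sketch of that work-around, in plain Java. All names here (User, Comment, UserStore, CommentView) are hypothetical illustrations, not OpenJPA API, and UserStore is only an in-memory stand-in for "fetch the User from whichever database holds it":

```java
import java.util.*;

// Hypothetical entity: lives in whichever database the application picks.
class User {
    final long id;
    final String name;
    User(long id, String name) { this.id = id; this.name = name; }
}

// Hypothetical entity: holds the author's id, not a User reference,
// so the ORM never sees a relation that crosses databases.
class Comment {
    final long userId;
    final String text;
    Comment(long userId, String text) { this.userId = userId; this.text = text; }
}

// In-memory stand-in for a lookup against the database that stores Users.
class UserStore {
    private final Map<Long, User> byId = new HashMap<>();
    void save(User u) { byId.put(u.id, u); }
    Optional<User> find(long id) { return Optional.ofNullable(byId.get(id)); }
}

class CommentView {
    // The application, not the ORM, resolves the relation on demand.
    static String authorName(Comment c, UserStore users) {
        return users.find(c.userId).map(u -> u.name).orElse("(unknown)");
    }
}
```

The cost of this design is that every traversal becomes an explicit lookup in application code, and nothing enforces that a stored id still points at an existing object.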





Fernando Padilla wrote:

right, thank you :)
you have re-confirmed how I thought the collocation constraint worked, and you also gave me a great motivation why the "replicated" feature came about ( as a work around for the collocation constraint ).
So now we're back to square one. Looking at my example use case, the collocation constraint is still too inhibiting. I want to get rid of those requirements! :)
So if you wanted to remove that requirement, how would you go about it? What code would you look at, etc etc. If I want to put work into fixing this up, where should I begin to look, etc etc. what are some possible plans.. :) :) :)
Pinaki Poddar wrote:
  One key aspect of data distribution model used in Slice is that the
distribution policy is based at instance level and *not* at class level.
What it implies for your given scenario is that while User U1 instance can be persisted in Slice A, another User instance U2 can be stored in Slice B. So it is not necessary that all User instances are stored in one Slice and
all Comment instances are in a different slice and so forth.
  But what about related instances? For the sake of concreteness let us
consider the following instances and relations:
  User U1 belongs to Group G1 and has commented C11, C12, C13
  User U2 belongs to Group G1 and has commented C21
The distribution policy determines that U1 and U2 are stored in Slice A and
B respectively.
The collocation constraint forces that any instance reachable from U1 (i.e. closure of U1 in Graph theory terms) is stored in Slice A and any instance reachable from U2 is stored in Slice B. Thus, C11, C12, C13 go to Slice A while
C21 goes to Slice B.
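
The closure rule above can be sketched in plain Java. This is only an illustration of the idea, not Slice's implementation: SliceAssignment and its maps are hypothetical, instances are represented by their names, and an instance reachable from roots in two different slices is simply rejected.

```java
import java.util.*;

class SliceAssignment {
    // rootSlices: root instance -> slice chosen for it by the distribution policy.
    // refs: instance -> instances it refers to.
    // Returns instance -> slice, placing every instance in its root's slice.
    static Map<String, String> assign(Map<String, String> rootSlices,
                                      Map<String, List<String>> refs) {
        Map<String, String> sliceOf = new LinkedHashMap<>();
        for (Map.Entry<String, String> root : rootSlices.entrySet()) {
            Deque<String> pending = new ArrayDeque<>();
            pending.push(root.getKey());
            while (!pending.isEmpty()) {
                String node = pending.pop();
                String previous = sliceOf.putIfAbsent(node, root.getValue());
                if (previous != null) {
                    if (!previous.equals(root.getValue())) {
                        // Reachable from two slices: no single valid placement.
                        throw new IllegalStateException(
                            node + " is reachable from " + previous + " and " + root.getValue());
                    }
                    continue; // already placed in this slice
                }
                for (String next : refs.getOrDefault(node, Collections.emptyList())) {
                    pending.push(next);
                }
            }
        }
        return sliceOf;
    }
}
```

With U1 mapped to Slice A and referring to C11, C12, C13, and U2 mapped to Slice B and referring to C21, the result places the comments alongside their users.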

Where does G1 go? G1 is reachable from both U1 and U2. The only current
option is G1 is annotated as @Replicated and identical copies of G1 are
stored in both Slice A and B.
Of course, collocation constraint will prohibit G1 to have a relation to U1
and U2. So, @Replicated mainly serves to model 'master' data, i.e. data
that are referred to by many but themselves refer to none. However, the relationship is not completely lost. For example, a query such as "select u from User u where u.group.name='G1'" will fetch both U1 and U2 by executing parallel queries across Slice A and B
and merging the results.
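
To make the "parallel queries, then merge" step concrete, here is a small plain-Java sketch. It is not how Slice is implemented: SliceQuery and UserRow are hypothetical, each slice is modelled as an in-memory list of rows, and the group name is stored on each row to stand in for the replicated copy of the group available in every slice.

```java
import java.util.*;
import java.util.concurrent.*;

class SliceQuery {
    // Hypothetical row: a user name plus the name of its (replicated) group.
    static final class UserRow {
        final String name;
        final String groupName;
        UserRow(String name, String groupName) { this.name = name; this.groupName = groupName; }
    }

    // Run the same filter against every slice in parallel, then merge the hits.
    static List<String> usersInGroup(Map<String, List<UserRow>> slices, String group) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, slices.size()));
        try {
            List<Future<List<String>>> parts = new ArrayList<>();
            for (List<UserRow> rows : slices.values()) {
                parts.add(pool.submit(() -> {
                    List<String> hits = new ArrayList<>();
                    for (UserRow r : rows) {
                        if (r.groupName.equals(group)) hits.add(r.name);
                    }
                    return hits;
                }));
            }
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> part : parts) merged.addAll(part.get());
            Collections.sort(merged); // deterministic order for the merged result
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("slice query failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each slice can evaluate the group condition locally, no cross-database join is needed; only the per-slice results are combined.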
Fernando Padilla wrote:
So, now that I have some attention, I'll post up a question I sent out a month ago.
I want to make a connected datamodel, but I want to put objects on different databases..
Let's say I have 3 objects:

User (slice root)
  - name

Group (slice root)
  - name
  - users

Comment (slice grouped with group)
  - group
  - user
  - text
As you can see they are all inter-related. But let's say I want to distribute Users and Groups across databases. But they are related, but can't be collocated.
So can you help me understand the "collocation" limitation of slices, and a way to enhance it to remove this limitation ( if I understand it properly ).
ps - If I understand the limitation, I can't have a ManyToMany relationship from Group to Users, or ManyToOne from Comment to User, instead I would have to have a set of userIds. And I would have to load up each user object myself through code.
