Re: [discuss][scalability] oak:asyncConflictResolution

2016-03-22 Thread Stefan Egli
Hi,

On 21/03/16 21:23, "Michael Dürig" wrote:
> There is org.apache.jackrabbit.oak.spi.commit.PartialConflictHandler and
> a couple of its implementations already. Maybe this could be leveraged
> here by somehow connecting it to the mix-ins you propose.

Yes, I think it should be something like a PartialConflictHandler that is
either configurable or customizable.

On 22/03/16 11:35, "Davide Giannella" wrote:
> I'd go for the mixin, with a default chain/order of conflict resolution,
> and allow defining it in a multivalue property, so that if needed the
> user can define their own chain of conflict resolution, or even a custom
> one.

Right, sounds like a mixin rather than (just) a property would be more
appropriate.

Cheers,
Stefan




Re: [discuss][scalability] oak:asyncConflictResolution

2016-03-22 Thread Davide Giannella
On 21/03/2016 20:03, Stefan Egli wrote:
> Hi oak-devs,
>
> tl;dr: suggestion is to introduce a new property (or mixin) that enables
> async merge for a subtree in a cluster case while at the same time
> pre-defines conflict resolution, since conflicts currently prevent
> trouble-free async merging.
> ...

Overall I like the idea as it leaves to the repository creator the
ability to define such behaviours.

I'd go for the mixin, with a default chain/order of conflict resolution,
and allow defining it in a multivalue property, so that if needed the
user can define their own chain of conflict resolution, or even a custom
one.
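
A minimal Java sketch of what such a configurable chain could look like,
for illustration only. The names ResolverChain, ConflictResolver and
Resolution are hypothetical and not Oak APIs; the resolver names read from
the multivalue property, and the convention that a resolver returns null
when it does not handle a conflict, are assumptions as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not an Oak API: an ordered chain of conflict
// resolvers, configured by a list of names (e.g. a multivalue property).
final class ResolverChain {

    /** Possible outcomes; the names are assumptions for this sketch. */
    enum Resolution { OURS, THEIRS }

    /** A resolver returns null when it does not handle the conflict. */
    interface ConflictResolver {
        Resolution resolve(String conflictType);
    }

    private final List<ConflictResolver> chain = new ArrayList<>();

    /** Builds the chain in the configured order from a registry of known resolvers. */
    ResolverChain(List<String> configuredNames, Map<String, ConflictResolver> registry) {
        for (String name : configuredNames) {
            ConflictResolver resolver = registry.get(name);
            if (resolver == null) {
                throw new IllegalArgumentException("Unknown resolver: " + name);
            }
            chain.add(resolver);
        }
    }

    /** The first resolver that returns non-null wins; null means unresolved. */
    Resolution resolve(String conflictType) {
        for (ConflictResolver resolver : chain) {
            Resolution resolution = resolver.resolve(conflictType);
            if (resolution != null) {
                return resolution;
            }
        }
        return null;
    }
}
```

A default chain would then simply be a default value for the list of
names, and a custom resolver one more entry in the registry.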

Davide



Re: [discuss][scalability] oak:asyncConflictResolution

2016-03-21 Thread Michael Dürig


Hi,

There is org.apache.jackrabbit.oak.spi.commit.PartialConflictHandler and 
a couple of its implementations already. Maybe this could be leveraged 
here by somehow connecting it to the mix-ins you propose.


Michael

On 21.3.16 9:03 , Stefan Egli wrote:

Hi oak-devs,

tl;dr: suggestion is to introduce a new property (or mixin) that enables
async merge for a subtree in a cluster case while at the same time
pre-defines conflict resolution, since conflicts currently prevent
trouble-free async merging.

In case this has been discussed/suggested before, please point me to the
discussion, in case not, here's the suggestion:

When it comes to handling conflicts we either deal with them in a
synchronous way (we throw a CommitFailedException right away) or have no
feasible/implemented solution for handling them asynchronously (we'd have
the possibility of leaving :conflict markers persisted, which would in
theory allow asynchronous merges, but so far we don't have anything built
on top of that).

In any case, for cluster scalability it's critical that we avoid
'synchronous' checks and instead switch to asynchronous merging wherever
possible: while for some parts of the content (eg '/var') it is always
necessary to have synchronous checks, the assumption is that other areas (eg
'/content') might well live with something asynchronous, as normally no
conflicts occur, and if one does, a predefined scheme that then kicks in
is fine.

One way to tackle this would be to mark a node (and thus implicitly its
subtree) in a way that says "from here on below it's ok to do asynchronous
conflict resolution of type X". This could be done by introducing an
explicit marker in the form of eg a mixin or a property
'oak:asyncConflictResolution' (that could either refer to a globally defined
resolution or further detail what that resolution should look like). If a
transaction involves both normal and async conflict resolution, then not
much is gained, as you'd still have to do conflict checks at least for that
'normal/sync' part. But if the expectation is that there are transactions
that include only such async-marked areas, then you can avoid the
synchronous checks for those.
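
To make the "only async-marked areas" condition concrete, here is a rough
Java sketch. It is not Oak code: the class AsyncScope and its methods are
hypothetical, and it assumes the set of marked subtree roots and the set of
paths changed by a transaction are already known as plain path strings.

```java
import java.util.Collection;
import java.util.Set;

// Hypothetical sketch: decides whether a commit touches only subtrees
// marked for asynchronous conflict resolution.
final class AsyncScope {

    private AsyncScope() {
    }

    /** True if path is the marked root itself or lies somewhere below it. */
    static boolean isUnder(String root, String path) {
        if (root.equals("/")) {
            return path.startsWith("/");
        }
        return path.equals(root) || path.startsWith(root + "/");
    }

    /** True only if every changed path is covered by some async-marked root. */
    static boolean canSkipSyncChecks(Set<String> asyncRoots, Collection<String> changedPaths) {
        for (String path : changedPaths) {
            boolean covered = false;
            for (String root : asyncRoots) {
                if (isUnder(root, path)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                // At least one change is in a 'normal/sync' area.
                return false;
            }
        }
        return true;
    }
}
```

With '/content' marked, a transaction changing only '/content/a' and
'/content/b/c' would qualify, while one that also touches '/var/x' would
still need the synchronous checks.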

Examples for these pre-defined resolutions are: 'delete-wins, then
latest-change-wins' (which might be the easiest), or 'latest-change-wins'
(which might be more tricky as that would mean those 'changeDeleted' cases
would resurrect deleted data magically - possible but perhaps too magic), a
third one could again be 'strict' (which would correspond to JCR semantics
as are the default today) - or again
'no-resolution-but-persist-conflict-marker' etc...
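
As a rough illustration of the first two resolutions, a Java sketch
follows. All names in it (AsyncPolicies, Policy, Change, resolve) are
hypothetical and not part of Oak, and it assumes each change carries a
stamp that gives a total order across cluster instances, which is a
simplification.

```java
// Hypothetical sketch of two pre-defined resolutions; not Oak code.
final class AsyncPolicies {

    private AsyncPolicies() {
    }

    enum Policy { DELETE_WINS_THEN_LATEST_CHANGE_WINS, LATEST_CHANGE_WINS }

    /** One side of a conflict: a delete or a new value, with an ordering stamp. */
    static final class Change {
        final long stamp;      // assumed total order across instances
        final boolean delete;  // true if this change deletes the item
        final String value;    // new value, null for deletes

        Change(long stamp, boolean delete, String value) {
            this.stamp = stamp;
            this.delete = delete;
            this.value = value;
        }
    }

    /** Picks the winning change; on equal stamps the first argument wins. */
    static Change resolve(Policy policy, Change a, Change b) {
        if (policy == Policy.DELETE_WINS_THEN_LATEST_CHANGE_WINS) {
            if (a.delete) {
                return a;
            }
            if (b.delete) {
                return b;
            }
        }
        // Plain latest-change-wins: a later change can undo an earlier delete.
        return a.stamp >= b.stamp ? a : b;
    }
}
```

With a delete at stamp 1 and a change at stamp 2, the first policy keeps
the delete, while plain latest-change-wins keeps the change, which is the
'resurrection' case mentioned above.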

Having such pre-defined conflict resolution, and at the same time clearly
indicating that doing conflict-checking asynchronously is OK, would allow
truly parallel writes into the NodeStore from the different instances' pov.

Wdyt?

Cheers,
Stefan





Re: [discuss][scalability] oak:asyncConflictResolution

2016-03-21 Thread Stefan Egli
On 21/03/16 21:03, "Stefan Egli" wrote:

>...a third one could again be 'strict' (which would correspond to JCR
>semantics as are the default today) ..

Actually, that would not be possible asynchronously, scratch that.

Cheers,
Stefan