Hi,

>>> But I don't think we should try to increase concurrency of write
>>> operations within the *same* repository because that's not a problem at
>>> all.
>>
>> i beg to differ ;)
>>
>> in jr2 saves are serialized. IMO that's a *real* problem, especially when
>> saving large change sets. this problem can be addressed e.g. with an
>> MVCC based model.
The problem with Jackrabbit isn't that concurrency for write operations
is bad: the problem is that throughput is bad. Increasing concurrency in
the save operation will not improve throughput in a meaningful way (most
likely it will decrease it). I'm also not aware of a big problem with
large change sets; in any case, large change sets should be split up
into smaller sets. For me, increasing throughput is a lot more important
than increasing concurrency.

> Yes, I agree. It's something I've seen many times in the field
> (consider saving a large pdf in a cms).

Large PDFs are stored in the data store. Large binaries are stored there
well before the save operation, so they are not part of the save
operation at all. Increasing concurrency in the save operation doesn't
affect that in any way.

> you can't scale out the writes in a
> cluster since all writes are serialized for the whole cluster.

Yes, this is a big problem, and we need to solve it. One idea is to
synchronize cluster nodes asynchronously, and to better support
splitting data into multiple repositories (sharding), for example using
virtual repositories that can be linked together.

Regards,
Thomas
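To illustrate the point about splitting large change sets: since saves
are serialized, each save() should hold the write path for as short a
time as possible. Below is a minimal, self-contained sketch of chunking
a change set into fixed-size batches; `BATCH_SIZE` and the use of path
strings are my own assumptions for illustration, and in real JCR code
each batch would of course end with `session.save()` rather than a
println.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedSave {
    // Hypothetical tuning knob; the right value depends on the repository.
    static final int BATCH_SIZE = 100;

    // Split a list of pending changes into batches of at most BATCH_SIZE,
    // so each (serialized) save holds the write lock for a shorter time.
    static List<List<String>> split(List<String> changes) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < changes.size(); i += BATCH_SIZE) {
            batches.add(new ArrayList<>(
                changes.subList(i, Math.min(i + BATCH_SIZE, changes.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> changes = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            changes.add("/content/node-" + i);
        }
        List<List<String>> batches = split(changes);
        // 250 changes with BATCH_SIZE 100 -> batches of 100, 100, 50
        System.out.println(batches.size());
        System.out.println(batches.get(2).size());
    }
}
```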

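As a rough sketch of the sharding idea: writes could be routed to one of
several repositories by a stable function of the content path, so that
cluster-wide write serialization only applies per shard. Everything here
(the shard count, routing by path hash, the `ShardRouter` name) is a
hypothetical illustration, not an existing Jackrabbit API.

```java
public class ShardRouter {
    // Assumed number of backing repositories (shards).
    static final int SHARDS = 4;

    // Deterministically map a content path to a shard index in [0, SHARDS).
    // floorMod keeps the result non-negative even for negative hash codes.
    static int shardFor(String path) {
        return Math.floorMod(path.hashCode(), SHARDS);
    }

    public static void main(String[] args) {
        // The same path always routes to the same shard, so reads can
        // find what writes stored.
        int a = shardFor("/content/site-a/page1");
        int b = shardFor("/content/site-a/page1");
        System.out.println(a == b);
        System.out.println(a >= 0 && a < SHARDS);
    }
}
```

A virtual repository layered on top would then dispatch each session
operation to the shard chosen by such a rule.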