Hi,

>However, as noted in OAK-633, there are a few conceptual problems with
>this approach to processing merges:
>
>a) Since validators and other commit hooks are not run during the
>merge, the result can be an internally inconsistent content tree
>(dangling references, incorrect permission store, etc.)
>
>b) The presence of conflict markers will prevent further changes to
>affected nodes until the conflict gets resolved
>
>c) There's no good way to handle more than one set of conflicts per node
>
>So, apart from problem a (which also affects the new MongoMK), the
>current mechanism works fine (i.e. fully parallel writes) as long as
>the changes are non-conflicting, but runs into trouble when there are
>conflicts.

Sorry, I don't understand: how does the SegmentNodeStore merge affect the new MongoMK? Please note I was talking about SegmentNodeStore merge operations, not MicroKernel.merge. The MongoMK doesn't merge segments and journals; instead, conflicts are detected at the node level when committing (relying on MongoDB features).

>* Use a more aggressive merge algorithm that automatically resolves
>all conflicts by throwing away (or storing somewhere else) "less
>important" changes when needed. Addresses problems b and c, problem a
>still an issue.

I'm worried our customers won't like this. It's very different from the behaviour of regular databases (be it relational databases or NoSQL databases such as MongoDB). If it's configurable for a certain subtree, for improved performance, then it's acceptable in my view, but even then I'm worried about the added complexity on the user/customer/developer side. And I'm worried that if we need to enable it to get a scalable solution, it would turn people away.

In my view, SegmentNodeStore merging is somewhat similar to database synchronization (as when synchronizing a smartphone calendar with the desktop, and so on). A long time ago, I worked on such a database synchronization solution, called PointBase UniSync and MicroSync. It used a hub-and-spoke model and supported multiple types of conflicts (insert/insert, update/update, update/delete, delete/update; delete/delete was not treated as a conflict, for example). Multiple conflict resolution algorithms were supported (spoke wins, hub wins, user-defined using a resolver callback). Interestingly, the documentation is still available at
http://www.ipd.uni-karlsruhe.de/~modbprak/03-MobileDB_mit_Java/pb44/docs/unisync/GettingStarted/

As far as I know, NoSQL databases either try to avoid merging/synchronization (MongoDB: writes always happen on the primary), or do it in a very simple way.
For example, in Cassandra, if concurrent writes are enabled, the latest change always wins:
http://www.datastax.com/docs/1.1/dml/about_writes
"The latest timestamp always wins".

Regards,
Thomas
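
P.S. To make the conflict types and resolution policies mentioned above a bit more concrete, here is a minimal Java sketch. It is not the UniSync API and not Oak code: all names (SyncConflictSketch, Op, Conflict, Change, classify, resolve) are hypothetical, and reading "update/delete" as "hub updated, spoke deleted" is my assumption. The LATEST_WINS resolver is just one example of a user-defined callback, modelled on the "latest timestamp always wins" rule.

```java
import java.util.function.BinaryOperator;

public class SyncConflictSketch {

    /** Kind of change one side made to a row since the last synchronization. */
    enum Op { INSERT, UPDATE, DELETE }

    /** Conflict kinds listed in the mail; NONE covers delete/delete. */
    enum Conflict { NONE, INSERT_INSERT, UPDATE_UPDATE, UPDATE_DELETE, DELETE_UPDATE }

    /** One side's change to a single row (hypothetical shape). */
    static final class Change {
        final Op op;
        final String value;   // null for deletes
        final long timestamp; // only used by the timestamp-based resolver

        Change(Op op, String value, long timestamp) {
            this.op = op;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Classifies two concurrent changes to the same row (hub first, spoke second). */
    static Conflict classify(Op hub, Op spoke) {
        if (hub == Op.INSERT && spoke == Op.INSERT) return Conflict.INSERT_INSERT;
        if (hub == Op.UPDATE && spoke == Op.UPDATE) return Conflict.UPDATE_UPDATE;
        if (hub == Op.UPDATE && spoke == Op.DELETE) return Conflict.UPDATE_DELETE;
        if (hub == Op.DELETE && spoke == Op.UPDATE) return Conflict.DELETE_UPDATE;
        if (hub == Op.DELETE && spoke == Op.DELETE) return Conflict.NONE;
        // Assumption: other pairs (e.g. insert vs. update) cannot occur for the same row.
        throw new IllegalArgumentException(hub + "/" + spoke);
    }

    // Resolvers take (hub, spoke) and return the change that survives.
    static final BinaryOperator<Change> HUB_WINS = (hub, spoke) -> hub;
    static final BinaryOperator<Change> SPOKE_WINS = (hub, spoke) -> spoke;

    /** Example of a user-defined callback: the later timestamp wins, ties go to the hub. */
    static final BinaryOperator<Change> LATEST_WINS =
            (hub, spoke) -> spoke.timestamp > hub.timestamp ? spoke : hub;

    /** Applies the resolver only when the two changes actually conflict. */
    static Change resolve(Change hub, Change spoke, BinaryOperator<Change> resolver) {
        if (classify(hub.op, spoke.op) == Conflict.NONE) {
            return hub; // both sides deleted the row, so either change will do
        }
        return resolver.apply(hub, spoke);
    }

    public static void main(String[] args) {
        Change hub = new Change(Op.UPDATE, "hub value", 100L);
        Change spoke = new Change(Op.UPDATE, "spoke value", 200L);
        System.out.println(classify(hub.op, spoke.op));             // UPDATE_UPDATE
        System.out.println(resolve(hub, spoke, HUB_WINS).value);    // hub value
        System.out.println(resolve(hub, spoke, LATEST_WINS).value); // spoke value
    }
}
```

The point of the sketch is that the policy is pluggable and visible to the user: "hub wins" and "spoke wins" are fixed rules, while the callback lets an application decide per conflict. That is different from silently dropping the "less important" change.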
