Hi,

Large transactions: I think we didn't define this as a strict requirement, and I'm not aware that we ran into big trouble with Jackrabbit 2.x, where this is not supported. For me it is still a nice-to-have. But of course it's something we should test and try to achieve (and resolve problems if we find any).
Flat hierarchies: Yes, this is important (we ran into this problem many times). I didn't analyze the results, but could the problem be orderable child nodes? Currently, oak-core stores a ":childOrder" property. If there are many child nodes, this property gets larger and larger, which consumes more and more disk space, network bandwidth, and CPU, on the order of n^2. It's the same problem as storing the full list of children in the node bundle. So I guess this needs to be solved in oak-core (not in each MK separately)? A small sketch of this cost is below, after the quoted message.

Regards,
Thomas

>I combined these two goals into a simple benchmark
>that tries to import the contents of a Wikipedia dump into an Oak
>repository using just a single save() call.
>
>Here are some initial numbers using the fairly small Faroese
>wikipedia, with just some 12k pages.
>
>The default H2 MK starts to slow down after 5k transient nodes and
>fails after 6k:
>
>$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
>    benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
>    WikipediaImport Oak-Default
>Apache Jackrabbit Oak 0.7-SNAPSHOT
>Wikipedia import (fowiki-20130213-pages-articles.xml)
>Oak-Default: importing Wikipedia...
>Imported 1000 pages in 1 seconds (1271us/page)
>Imported 2000 pages in 2 seconds (1465us/page)
>Imported 3000 pages in 4 seconds (1475us/page)
>Imported 4000 pages in 6 seconds (1749us/page)
>Imported 5000 pages in 11 seconds (2219us/page)
>Imported 6000 pages in 28 seconds (4815us/page)
>Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>
>The new MongoMK prototype fails already sooner:
>
>$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
>    benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
>    WikipediaImport Oak-Mongo
>Apache Jackrabbit Oak 0.7-SNAPSHOT
>Wikipedia import (fowiki-20130213-pages-articles.xml)
>Oak-Mongo: importing Wikipedia...
>Imported 1000 pages in 1 seconds (1949us/page)
>Imported 2000 pages in 6 seconds (3260us/page)
>Imported 3000 pages in 13 seconds (4523us/page)
>Imported 4000 pages in 30 seconds (7613us/page)
>Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>
>After my recent work on OAK-632 the SegmentMK does better, but it also
>experiences some slowdown over time:
>
>$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
>    benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
>    WikipediaImport Oak-Segment
>Apache Jackrabbit Oak 0.7-SNAPSHOT
>Wikipedia import (fowiki-20130213-pages-articles.xml)
>Oak-Segment: importing Wikipedia...
>Imported 1000 pages in 1 seconds (1419us/page)
>Imported 2000 pages in 2 seconds (1447us/page)
>Imported 3000 pages in 4 seconds (1492us/page)
>Imported 4000 pages in 6 seconds (1586us/page)
>Imported 5000 pages in 8 seconds (1697us/page)
>Imported 6000 pages in 10 seconds (1812us/page)
>Imported 7000 pages in 13 seconds (1927us/page)
>Imported 8000 pages in 16 seconds (2042us/page)
>Imported 9000 pages in 19 seconds (2146us/page)
>Imported 10000 pages in 22 seconds (2254us/page)
>Imported 11000 pages in 25 seconds (2355us/page)
>Imported 12000 pages in 29 seconds (2462us/page)
>Imported 12148 pages in 41 seconds (3375us/page)
>
>To summarize, all MKs still need some work on this. Once these initial
>problems are solved, we can try the same benchmark with larger
>Wikipedias.
>
>PS. Note that I'm using the OAK-652 feature flag to speed things up on
>the oak-jcr level.
>
>BR,
>
>Jukka Zitting
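To make the n^2 point above a bit more concrete, here is a toy, self-contained sketch (not actual oak-core code; the class name and numbers are only illustrative) of the write amplification you get if the flat parent's full ":childOrder" list is persisted again for every added child:

    import java.util.ArrayList;
    import java.util.List;

    public class ChildOrderCost {
        public static void main(String[] args) {
            int n = 12_000;                          // roughly the size of the Faroese dump
            List<String> childOrder = new ArrayList<>();
            long entriesWritten = 0;
            for (int i = 0; i < n; i++) {
                childOrder.add("page-" + i);         // one new sibling under a flat parent
                entriesWritten += childOrder.size(); // the full list is written once more
            }
            // 1 + 2 + ... + n = n * (n + 1) / 2, about 72 million entries for n = 12,000
            System.out.println("children added:  " + n);
            System.out.println("entries written: " + entriesWritten);
        }
    }

The same arithmetic applies to keeping the full child list in a node bundle, which is why fixing this once in oak-core looks preferable to fixing it in every MK.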
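And for reference, a minimal sketch of the flat-hierarchy, single-save() import pattern that the quoted benchmark exercises. This is plain JCR and only an assumption of roughly what WikipediaImport does, not its actual code; in practice page titles would also need escaping before being used as node names.

    import java.util.Map;

    import javax.jcr.Node;
    import javax.jcr.Repository;
    import javax.jcr.Session;
    import javax.jcr.SimpleCredentials;

    public class FlatImportSketch {

        // pages: title -> wikitext; every page becomes a direct child of one parent
        public static void importPages(Repository repository, Map<String, String> pages)
                throws Exception {
            Session session = repository.login(
                    new SimpleCredentials("admin", "admin".toCharArray()));
            try {
                Node wiki = session.getRootNode().addNode("wikipedia", "nt:unstructured");
                for (Map.Entry<String, String> page : pages.entrySet()) {
                    Node child = wiki.addNode(page.getKey(), "nt:unstructured");
                    child.setProperty("text", page.getValue());
                }
                session.save(); // a single large transaction covering all pages
            } finally {
                session.logout();
            }
        }
    }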
