Hi Stefan,

On 26.09.2011, at 18:13, Stefan Guggisberg wrote:

>> I wrote a fairly ad-hoc dump of the 5900 data files into Jackrabbit.
>> Storing ~240 MB took roughly 3 minutes. Is this the expected time such
>> an operation takes? Is it possible to improve the performance somehow?
>
> the performance seems rather poor. it's hard to tell what's wrong
> without having the test data. i noticed that you're storing the
> content of the .json files as string properties. why aren't you
> storing the json data as nodes & properties?

I had no code available for serializing the data as JCR nodes. Is there
a simple snippet available somewhere? In any case, I thought this would
work as a first baseline.

> anyway, i quickly ran an adapted ad hoc test on my machine
> (macbook pro 2.66 ghz, standard harddisk). the test imports
> an 'svn export' of jackrabbit/trunk.
>
> importing ~6500 files takes ~30s which is IMO decent.

Thanks for writing your test against your local files! I ran your code
and compared the execution times. Unfortunately, it's not performing
faster :(

The minute delta might be caused by some file traversal differences or by
the additional nodes/properties created in your code. However, the
overall performance is still a bit low (2:24-3:05 minutes in a clean
repository).

Any idea how the performance could be improved? Am I doing something
conceptually wrong? I'm assuming that there is no big delta between
creating hundreds of nodes and properties compared to dumping a file's
content into Jackrabbit. Is this correct?

Thanks,
Marcel


=== Experiments performance results ===

Jackrabbit First Hops code adapted:

0:00:08.522: 500 units persisted. data 17 MB
0:00:17.057: 1000 units persisted. data 33 MB
0:00:31.763: 1500 units persisted. data 53 MB
0:00:41.404: 2000 units persisted. data 72 MB
0:00:53.140: 2500 units persisted. data 97 MB
0:01:02.988: 3000 units persisted. data 113 MB
0:01:16.314: 3500 units persisted. data 133 MB
0:01:35.171: 4000 units persisted. data 143 MB
0:01:49.414: 4500 units persisted. data 173 MB
0:02:04.617: 5000 units persisted. data 204 MB
0:02:12.593: 5500 units persisted. data 221 MB
Mon Sep 26 19:54:58 CEST 2011: 5927 units persisted
Run took 0:02:24.505

Mailing List proposal:

0:00:14.853: 500 units persisted. data 17 MB
0:00:26.353: 1000 units persisted. data 33 MB
0:00:36.114: 1500 units persisted. data 53 MB
0:00:53.274: 2000 units persisted. data 72 MB
0:01:06.643: 2500 units persisted. data 97 MB
0:01:18.230: 3000 units persisted. data 113 MB
0:01:36.765: 3500 units persisted. data 133 MB
0:01:44.245: 4000 units persisted. data 143 MB
0:02:04.026: 4500 units persisted. data 173 MB
0:02:37.533: 5000 units persisted. data 204 MB
0:02:48.089: 5500 units persisted. data 221 MB
Run took 0:03:08.458
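
To make the two runs easier to compare with the "~6500 files in ~30s" figure quoted above, the checkpoint lines can be reduced to throughput numbers. The following is only a minimal sketch: it assumes the h:mm:ss.SSS elapsed-time format shown in the logs, and the class and method names (ImportThroughput, toSeconds, perSecond) are hypothetical helpers, not part of Jackrabbit or of either test program.

```java
import java.util.Locale;

// Hypothetical helper for turning the logged checkpoints into rates.
final class ImportThroughput {

    /** Parses an elapsed time of the form h:mm:ss.SSS into seconds. */
    static double toSeconds(String elapsed) {
        String[] parts = elapsed.trim().split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected h:mm:ss.SSS, got: " + elapsed);
        }
        int hours = Integer.parseInt(parts[0]);
        int minutes = Integer.parseInt(parts[1]);
        double seconds = Double.parseDouble(parts[2]);
        return hours * 3600 + minutes * 60 + seconds;
    }

    /** Returns amount per second for an amount reached after the given elapsed time. */
    static double perSecond(double amount, String elapsed) {
        return amount / toSeconds(elapsed);
    }

    private static void report(String label, int units, int megabytes, String elapsed) {
        System.out.printf(Locale.ROOT, "%-28s %6.1f units/s  %5.2f MB/s%n",
                label, perSecond(units, elapsed), perSecond(megabytes, elapsed));
    }

    public static void main(String[] args) {
        // Last checkpoint that both runs logged: 5500 units, 221 MB.
        report("First Hops code adapted:", 5500, 221, "0:02:12.593");
        report("Mailing List proposal:", 5500, 221, "0:02:48.089");

        // Approximate reference point from the quoted reply (~6500 files in ~30 s).
        System.out.printf(Locale.ROOT, "%-28s %6.1f units/s%n",
                "Quoted reference (approx.):", 6500 / 30.0);
    }
}
```

At the 5500-unit checkpoint this works out to roughly 1.7 MB/s for the first run and roughly 1.3 MB/s for the second, i.e. a few dozen files per second against the roughly 200 files per second implied by the quoted figure.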
