> yes, most probably. Because Jackrabbit stores any pending modification in
> memory, the heap is probably used up and the GC runs very often in your
> import.
> try saving after 1000 nodes.
How do I go about doing this? I import 20000 nodes in one go, and these
elements are all same-level children of a node (let's call it BooksNode).
Currently, I am doing a BooksNode.save() that results in the long commit
time. How do I actually split this save into groups of 1000 nodes? Is there
any save on an iterator? Or do I need to modify the NodeImpl.save() of
Jackrabbit to do a group-wise save when the node count is above a certain
level?

On Feb 6, 2008 1:48 PM, Marcel Reutegger <[EMAIL PROTECTED]> wrote:
> Sridhar Raman wrote:
> > I am not too sure whether the problem we are facing can be solved by
> > tweaking around with the SearchIndex parameters, but I want to give it a
> > shot. The gist of the problem we are facing is that our importing of
> > nodes is very, very slow.
>
> how is your content structured? how many properties do your nodes have on
> average? are there any binary properties?
>
> > We have around 25000 nodes that are being imported, and then being
> > committed by a single session.save(). This particular operation takes a
> > long time. The index folder showed no activity for almost an hour before
> > it began creating the indexes. Could this be because of some faulty
> > SearchIndex parameters? I haven't changed the parameters from the
> > default values.
>
> no, I don't think so. nodes are only indexed at commit time. in a first
> step the nodes are stored using the configured persistence manager, and in
> a second step they are indexed by the query handler.
>
> > Also, would the import process be faster if I did the save() in multiple
> > steps?
>
> yes, most probably. Because Jackrabbit stores any pending modification in
> memory, the heap is probably used up and the GC runs very often in your
> import.
> try saving after 1000 nodes.
>
> regards
> marcel
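To answer the immediate question: there is no save-on-an-iterator in JCR, and
no need to patch NodeImpl. Session.save() persists everything pending in the
session, so you can simply call it from your own import loop every N nodes.
A minimal sketch against the JCR 1.0 API (BATCH_SIZE, the booksNode parent,
the bookNames source, and the nt:unstructured node type are placeholders for
whatever your import actually uses):

    import javax.jcr.Node;
    import javax.jcr.RepositoryException;
    import javax.jcr.Session;

    public class BatchedImport {

        private static final int BATCH_SIZE = 1000; // per Marcel's suggestion

        // Adds all items as same-level children of booksNode, persisting
        // after every BATCH_SIZE additions so pending modifications do not
        // pile up in the session's transient space.
        public static void importBooks(Session session, Node booksNode,
                                       Iterable<String> bookNames)
                throws RepositoryException {
            int count = 0;
            for (String name : bookNames) {
                Node book = booksNode.addNode(name, "nt:unstructured");
                book.setProperty("title", name); // placeholder property
                if (++count % BATCH_SIZE == 0) {
                    // flush this batch to the persistence manager
                    session.save();
                }
            }
            // persist the final partial batch
            session.save();
        }
    }

Saving every 1000 nodes keeps the transient space small, so the heap stays
bounded and GC pressure drops; the trade-off is that a failure mid-import
leaves the earlier batches already committed.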
