On 8/23/07, Michael Neale <[EMAIL PROTECTED]> wrote:
> Stefan - the exported XML won't actually contain the versions will it? ie if
> you export/import you lose your versions (assume you import to a fresh
> repo).

yes, that's correct.

>
> If that is the case - how does this affect the exported data?

on import of a versionable node, version history etc nodes will
automatically be created in the version store. this negatively
affects the performance of the import.

cheers
stefan

>
> (am also interested if it is possible to export versions).
>
> Michael
>
> On 8/23/07, Stefan Guggisberg <[EMAIL PROTECTED]> wrote:
> >
> > hi steve
> >
> > On 7/30/07, Steven Singer <[EMAIL PROTECTED]> wrote:
> > >
> > > How are people using importxml to restore or import anything but small
> > > amounts of data into the repository? I have a 22meg xml file that I'm
> > > unable to import because I keep running out of memory.
> >
> > i analyzed the xml file that you sent me offline (thanks!).
> > i noticed the following:
> >
> > 1) system view xml export
> > 2) file size: 22mb without whitespace,
> >    => 650mb with simple 2-space indentation (!)
> > 3) 23k nodes and 202k properties
> > 4) virtually every node is versionable
> > 5) *very* deep structure: max depth is 2340... (!)
> > 6) lots of junk data (e.g. thousands of _delete_me1234567890 nodes,
> >    btw hundreds/thousands of levels deep and all versionable)
> >
> > i'd say that the content model has lots of room for improvement ;)
> >
> > mainly 5) accounts for the excessive memory consumption during
> > import. while this could certainly be improved in jackrabbit i can't
> > think of a really good use case for creating >2k level deep hierarchies.
> >
> > i also would suggest to review the use of mix:versionable. versionability
> > doesn't come for free since it implies a certain overhead. making 1 node
> > mix:versionable creates approx. 7 nodes and 13 properties in the version
> > store (version history, root version etc etc). mix:versionable should
> > therefore only be used where needed.
> >
> > btw: by using a decorated content handler which performed a save every
> > 200 nodes i was able to import the data with 512mb heap. it took about
> > 30 minutes on a macbook pro (2ghz).
> >
> > cheers
> > stefan
> >
> > >
> > > The importxml in JCR commands works fine but when I go to save the data
> > > the jvm memory usage goes up to 1GB and eventually runs out of memory.
> > > This was sort of discussed
> > > http://mail-archives.apache.org/mod_mbox/jackrabbit-users/200610.mbox/browser
> > > but I didn't see any solutions proposed.
> > >
> > > Does the backup tool suffer from the same problem (being unable to
> > > restore content above a certain size?) How have other people handled
> > > migrating data between different persistence managers or changing a
> > > node-type definition that seems to require a re-import?
> > >
> > > Steven Singer
> > > RAD International Ltd.
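
The thread mentions a "decorated content handler which performed a save every 200 nodes" but does not show it. A minimal sketch of that idea follows; it is not stefan's actual handler. It assumes a namespace-aware SAX parser and a system view export, where each node is an sv:node element in the http://www.jcp.org/jcr/sv/1.0 namespace. The class name PeriodicSaveHandler and the SaveAction callback are hypothetical names introduced for illustration; only the SAX types come from the JDK.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

/**
 * Hypothetical decorator: forwards every SAX event to the wrapped handler
 * and runs a save action each time `interval` sv:node elements have started.
 */
class PeriodicSaveHandler extends XMLFilterImpl {

    /** Namespace of JCR system view XML (sv:node, sv:property, sv:value). */
    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    /** Hypothetical callback; with JCR it would typically wrap Session.save(). */
    interface SaveAction {
        void save() throws Exception;
    }

    private final SaveAction saveAction;
    private final int interval;
    private int nodesSinceSave = 0;

    PeriodicSaveHandler(ContentHandler delegate, SaveAction saveAction, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // XMLFilterImpl forwards all events to the handler set here.
        setContentHandler(delegate);
        this.saveAction = saveAction;
        this.interval = interval;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Forward first, then count the node and save once the batch is full.
        super.startElement(uri, localName, qName, atts);
        if (SV_NS.equals(uri) && "node".equals(localName) && ++nodesSinceSave >= interval) {
            flush();
        }
    }

    @Override
    public void endDocument() throws SAXException {
        super.endDocument();
        flush(); // persist whatever is left after the last full batch
    }

    private void flush() throws SAXException {
        try {
            saveAction.save();
        } catch (Exception e) {
            throw new SAXException("intermediate save failed", e);
        }
        nodesSinceSave = 0;
    }
}
```

With a JCR session this could be wired up as new PeriodicSaveHandler(session.getImportContentHandler(parentPath, ImportUUIDBehavior.IMPORT_UUID_CREATE_NEW), session::save, 200) and handed to the XML parser. Whether an intermediate save succeeds at a given point depends on the node type constraints of the content, so the interval and the place where the save is triggered may need adjusting.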
