hi steve
On 7/30/07, Steven Singer <[EMAIL PROTECTED]> wrote:
>
> How are people using importxml to restore or import anything but small
> amounts of data into the repository? I have a 22meg xml file that I'm
> unable to import because I keep running out of memory.
i analyzed the xml file that you sent me offline (thanks!).
i noticed the following:
1) system view xml export
2) file size: 22mb without whitespace,
=> 650mb with simple 2-space indentation (!)
3) 23k nodes and 202k properties
4) virtually every node is versionable
5) *very* deep structure: max depth is 2340... (!)
6) lots of junk data (e.g. thousands of _delete_me1234567890 nodes,
btw hundreds/thousands of levels deep and all versionable)
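
figures like 3) and 5) can be gathered in one streaming pass over the export. a minimal sketch, assuming a namespace-aware SAX parser and the standard system view namespace; the class name SysViewStats is made up for illustration, and this is not necessarily how the numbers above were obtained:

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: counts sv:node / sv:property elements and tracks node nesting depth. */
class SysViewStats extends DefaultHandler {
    // Namespace URI of the JCR system view export format.
    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    int nodes;       // total number of sv:node elements
    int properties;  // total number of sv:property elements
    int maxDepth;    // deepest sv:node nesting seen so far
    private int depth;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!SV_NS.equals(uri)) {
            return;
        }
        if ("node".equals(localName)) {
            nodes++;
            depth++;
            maxDepth = Math.max(maxDepth, depth);
        } else if ("property".equals(localName)) {
            properties++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (SV_NS.equals(uri) && "node".equals(localName)) {
            depth--;
        }
    }

    /** Streams the export once; memory use does not grow with file size. */
    static SysViewStats scan(InputStream xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SysViewStats stats = new SysViewStats();
        factory.newSAXParser().parse(xml, stats);
        return stats;
    }
}
```

calling SysViewStats.scan(...) on a FileInputStream of the export and reading nodes, properties and maxDepth gives a quick picture of the content model before attempting an import.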
i'd say that the content model has lots of room for improvement ;)
5) is the main cause of the excessive memory consumption during
import. while this could certainly be improved in jackrabbit, i can't think of a
really good use case for creating hierarchies more than 2k levels deep.
i would also suggest reviewing the use of mix:versionable. versionability
doesn't come for free; it carries a certain overhead. making 1 node
mix:versionable creates approx. 7 nodes and 13 properties in the version store
(version history, root version, etc.). mix:versionable should therefore only
be used where needed.
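
to put that overhead in perspective, here is a back-of-the-envelope estimate. it assumes the approximate per-node figures above (7 nodes, 13 properties) and treats all 23k nodes as versionable, so it is an upper bound rather than a measurement; the class and method names are made up for illustration:

```java
/** Hypothetical estimator for version store overhead, based on approximate per-node figures. */
class VersionStoreEstimate {
    // Approximate number of items created in the version store per mix:versionable node.
    static final int NODES_PER_VERSIONABLE = 7;
    static final int PROPERTIES_PER_VERSIONABLE = 13;

    /** Extra nodes created in the version store for the given number of versionable nodes. */
    static long extraNodes(long versionableNodes) {
        return versionableNodes * NODES_PER_VERSIONABLE;
    }

    /** Extra properties created in the version store for the given number of versionable nodes. */
    static long extraProperties(long versionableNodes) {
        return versionableNodes * PROPERTIES_PER_VERSIONABLE;
    }
}
```

for 23,000 versionable nodes this gives roughly 161,000 additional nodes and 299,000 additional properties, i.e. the version store would end up considerably larger than the imported content itself.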
btw: by using a decorated content handler which performed a save every
200 nodes i was able to import the data with 512mb heap. it took about
30 minutes on a macbook pro (2ghz).
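
a rough sketch of what such a decorator could look like (not the actual code used for the import above). it forwards every SAX event to a wrapped handler and triggers a save each time a given number of sv:node elements has been closed. SaveAction, BatchSavingHandler and importInBatches are hypothetical names; in a JCR client the wrapped handler would presumably come from Session.getImportContentHandler(parentAbsPath, uuidBehavior) and the save action would be session::save. counting on the closing tag is an assumption; counting on the opening tag would work similarly.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical callback; with JCR this would typically be session::save. */
interface SaveAction {
    void save() throws Exception;
}

/** Forwards all SAX events to the wrapped handler and saves every batchSize nodes. */
class BatchSavingHandler extends XMLFilterImpl {
    // Namespace URI of the JCR system view export format.
    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    private final int batchSize;
    private final SaveAction saveAction;
    private int pending; // sv:node elements closed since the last save
    private int saves;   // number of saves performed so far

    BatchSavingHandler(ContentHandler delegate, int batchSize, SaveAction saveAction) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        setContentHandler(delegate); // XMLFilterImpl forwards events to this handler
        this.batchSize = batchSize;
        this.saveAction = saveAction;
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, localName, qName); // let the real importer see the event first
        if (SV_NS.equals(uri) && "node".equals(localName) && ++pending >= batchSize) {
            flush();
        }
    }

    @Override
    public void endDocument() throws SAXException {
        super.endDocument();
        if (pending > 0) {
            flush(); // persist the final, partially filled batch
        }
    }

    private void flush() throws SAXException {
        try {
            saveAction.save();
        } catch (Exception e) {
            throw new SAXException("intermediate save failed", e);
        }
        pending = 0;
        saves++;
    }

    int saveCount() {
        return saves;
    }

    /** Parses the export with a namespace-aware reader; returns the number of saves performed. */
    static int importInBatches(InputStream xml, ContentHandler delegate,
                               int batchSize, SaveAction saveAction) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        BatchSavingHandler handler = new BatchSavingHandler(delegate, batchSize, saveAction);
        reader.setContentHandler(handler);
        reader.parse(new InputSource(xml));
        return handler.saveCount();
    }
}
```

the point of the intermediate saves is to keep the amount of pending, unsaved changes bounded; the price is that the import is no longer all-or-nothing, so a failure halfway through leaves partially imported content behind.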
cheers
stefan
>
> The importxml in JCR commands works fine but when I go to save the data
> the jvm memory usage goes up to 1GB and eventually runs out of memory.
> This was sort of discussed at
> http://mail-archives.apache.org/mod_mbox/jackrabbit-users/200610.mbox/browser
> but I didn't see any solutions proposed.
>
> Does the backup tool suffer from the same problem (being unable to restore
> content above a certain size)? How have other people handled migrating
> data between different persistence managers or changing a node-type
> definition that seems to require a re-import?
>
>
>
>
> Steven Singer
> RAD International Ltd.
>
>