Ok, this is the kind of node structure that I have. BOOK ONE, as shown in
the example, is the basic unit. It can have multiple Property nodes, and
each Property node has exactly one Property Value node.
<sv:node sv:name="BOOK ONE">
<sv:property sv:name="jcr:primaryType" sv:type="Name">
<sv:value>sr:BookType</sv:value>
</sv:property>
<sv:property sv:name="sr:lastModifiedOn" sv:type="Date">
<sv:value>2007-07-06T15:24:41.125+05:30</sv:value>
</sv:property>
<sv:property sv:name="sr:data" sv:type="String">
<sv:value/>
</sv:property>
<sv:property sv:name="sr:dimvals" sv:type="Reference">
<sv:value>5480a736-ec58-459e-b796-4faf6be581a9</sv:value>
<sv:value>b8c478c6-c395-4f7e-b049-91724bd35324</sv:value>
</sv:property>
<sv:property sv:name="sr:state" sv:type="String">
<sv:value>success</sv:value>
</sv:property>
<sv:property sv:name="sr:cats" sv:type="Reference">
<sv:value>6ee74207-39a1-475e-94a7-a781564a8a0f</sv:value>
</sv:property>
<sv:property sv:name="sr:dateCreated" sv:type="Date">
<sv:value>2007-07-06T15:24:41.125+05:30</sv:value>
</sv:property>
<sv:property sv:name="sr:brank" sv:type="Double">
<sv:value>1.0</sv:value>
</sv:property>
<sv:property sv:name="sr:ownerName" sv:type="String">
<sv:value>g</sv:value>
</sv:property>
<sv:property sv:name="sr:title" sv:type="String">
<sv:value>Book One</sv:value>
</sv:property>
<sv:property sv:name="sr:error" sv:type="String">
<sv:value>none</sv:value>
</sv:property>
<sv:property sv:name="sr:pfxtitle" sv:type="String">
<sv:value>T2945A Book One</sv:value>
</sv:property>
<sv:property sv:name="sr:url" sv:type="String">
<sv:value/>
</sv:property>
<sv:property sv:name="sr:lastModifiedBy" sv:type="String">
<sv:value>g</sv:value>
</sv:property>
<sv:property sv:name="sr:id" sv:type="Long">
<sv:value>1</sv:value>
</sv:property>
<sv:node sv:name="sr:property">
<sv:property sv:name="jcr:primaryType" sv:type="Name">
<sv:value>sr:PropofBook</sv:value>
</sv:property>
<sv:property sv:name="sr:name" sv:type="String">
<sv:value>Book Property</sv:value>
</sv:property>
<sv:property sv:name="sr:property" sv:type="Reference">
<sv:value>1c78697c-068c-4743-9933-eea91c90097c</sv:value>
</sv:property>
<sv:property sv:name="sr:type" sv:type="String">
<sv:value>unrestricted</sv:value>
</sv:property>
<sv:node sv:name="sr:propvalname">
<sv:property sv:name="jcr:primaryType" sv:type="Name">
<sv:value>sr:BookPropValueType</sv:value>
</sv:property>
<sv:property sv:name="sr:name" sv:type="String">
<sv:value>The Shining</sv:value>
</sv:property>
</sv:node>
</sv:node>
</sv:node>
The total number of such Book nodes = 4105. The total number of Property
nodes = 11006, and hence there will be an equal number (11006) of Property
Value nodes. That makes it a total of 4105 + 11006 + 11006 = 26117 nodes that
will be saved on the session.save() execution. The time taken for this save
step alone is 13.38 mins. Is this expected and normal? Or is there some other
problem?
Oh, I have moved to a bundle Derby Persistence Manager. That is giving me
this 13.38 mins time. Earlier, when it was not bundle, the time used to be
32.35 mins. I am very happy about this decrease. But I am still concerned
that it's taking so long.
Based on Stefan's calculations, it should have been only 26 * 3 = 78
seconds!
So any help?
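
To make the numbers above easy to check, here is a minimal sketch of the arithmetic in Java. The class and method names are made up for illustration. The rate of 3 seconds per 1000 nodes is Stefan's measurement (5 string properties per node, on his machine), so the extrapolation is only a rough yardstick for this data model. The sketch also assumes that "13.38 mins" means decimal minutes.

```java
public class SaveTimeEstimate {

    // Each Book has some Property nodes, and every Property node
    // carries exactly one Property Value node.
    static int totalNodes(int books, int propertyNodes) {
        int propertyValueNodes = propertyNodes; // one value node per property node
        return books + propertyNodes + propertyValueNodes;
    }

    // Linear extrapolation from a measured rate (seconds per 1000 nodes).
    static double expectedSeconds(int nodes, double secondsPerThousandNodes) {
        return nodes / 1000.0 * secondsPerThousandNodes;
    }

    public static void main(String[] args) {
        int nodes = totalNodes(4105, 11006);            // 26117
        double expected = expectedSeconds(nodes, 3.0);  // roughly 78 seconds
        double observed = 13.38 * 60;                   // assumption: decimal minutes

        System.out.println("nodes saved:        " + nodes);
        System.out.println("expected seconds:   " + expected);
        System.out.println("observed seconds:   " + observed);
        System.out.println("observed/expected:  " + (observed / expected));
    }
}
```

Under these assumptions the observed save is about ten times slower than the extrapolated figure, which is the gap I am asking about.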
On 7/16/07, Stefan Guggisberg <[EMAIL PROTECTED]> wrote:
hi,
On 7/16/07, Sridhar Raman <[EMAIL PROTECTED]> wrote:
> Also, how do I switch to bundle persistence? Currently, this is the
> configuration in my workspace.xml file:
>
> > <PersistenceManager class="org.apache.jackrabbit.core.state.db.DerbyPersistenceManager">
> > <param name="url" value="jdbc:derby:${wsp.home}/db;create=true"/>
> > <param name="schemaObjectPrefix" value="${wsp.name}_"/>
> > </PersistenceManager>
> >
>
> How do I change it to include the bundle persistence for Derby?
while switching to BundleDbPersistenceManager would certainly
provide a certain performance gain i doubt that it would solve your
issue. you're using an embedded derby db which should provide
a decent performance. i just ran a quick test using
DerbyPersistenceManager:
saving 1000 nodes with 5 string properties each takes
about 3 seconds on a 1.9ghz intel macbook pro (i.e. ~12s./4000 nodes).
you mentioned that in your case it takes ~32 minutes (!) to save 4000
nodes.
please tell us more on your data model. are you storing large binary
properties?
how many properties (and of what type) are you storing per node?
can you provide a simple test case?
cheers
stefan
>
> Thanks,
> Sridhar
>
> On 7/16/07, Sridhar Raman <[EMAIL PROTECTED]> wrote:
> >
> > I use DerbyPersistenceManager and LocalFileSystem. So would I be able to
> > switch to bundle persistence in this case, and would it be helpful?
> >
> > On 7/15/07, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi,
> > >
> > > On 7/14/07, Sridhar Raman <[EMAIL PROTECTED]> wrote:
> > > > I use Jackrabbit extensively, and one problem that I seem to run into a lot
> > > > of times is when I import data, and save the nodes. For saving 4000 nodes,
> > > > it almost takes 32 mins to execute the session.save() command. Any way of
> > > > fixing this?
> > > >
> > > > Is it probably because all my data is getting indexed? Could I somehow
> > > > specify only specific properties/types to be indexed?
> > >
> > > I much more suspect that the time is spent talking to the persistence
> > > store. Are you using an external database for persistence?
> > >
> > > The traditional database persistence managers issue a separate SQL
> > > statement (causing a network roundtrip to the database) for each node
> > > *and* property being saved, which can quickly end up taking a lot of
> > > time especially if the network roundtrip to a database server takes
> > > more than a few milliseconds.
> > >
> > > Good solutions to this problem are either to switch to the bundle
> > > persistence (which uses just a single statement for a node and all
> > > its properties) included in Jackrabbit 1.3 and/or using an embedded
> > > database like the default Derby.
> > >
> > > BR,
> > >
> > > Jukka Zitting
> > >
> >
> >
>
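
Jukka's point about statement counts can be made concrete with a rough calculation. The sketch below is only illustrative. It assumes every Book node has the same 15 properties as the BOOK ONE example, every Property node has 4, and every Property Value node has 2. It counts one statement per node and per property for the traditional scheme and one per node for the bundle scheme, as described in the thread. It ignores reference tracking, indexing, and anything else the repository does during save. The class and method names are hypothetical.

```java
public class StatementCount {

    // Total properties, given per-node-type property counts.
    static long totalProperties(long books, long propNodes, long valueNodes,
                                long perBook, long perPropNode, long perValueNode) {
        return books * perBook + propNodes * perPropNode + valueNodes * perValueNode;
    }

    // Traditional scheme: one statement per node plus one per property.
    static long perItemStatements(long nodes, long properties) {
        return nodes + properties;
    }

    // Bundle scheme: one statement per node, covering all its properties.
    static long bundleStatements(long nodes) {
        return nodes;
    }

    public static void main(String[] args) {
        long books = 4105, propNodes = 11006, valueNodes = 11006;
        long nodes = books + propNodes + valueNodes;

        // Property counts taken from the BOOK ONE export (assumed uniform).
        long properties = totalProperties(books, propNodes, valueNodes, 15, 4, 2);

        System.out.println("nodes:               " + nodes);
        System.out.println("properties:          " + properties);
        System.out.println("per-item statements: " + perItemStatements(nodes, properties));
        System.out.println("bundle statements:   " + bundleStatements(nodes));
    }
}
```

With these assumed counts the traditional scheme issues several times as many statements as the bundle scheme, which is at least consistent in direction with the drop from 32.35 to 13.38 mins reported at the top of the thread.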