Hi Stefan.
Node types attached, and the example code that rips through it and saves
stuff. Let me know if there is anything obvious I am doing wrong!
Anyone interested can download the loop code and node types from this zip:
http://www.users.on.net/~michaelneale/work/jackrabbit_perf.zip
On 9/4/06, Stefan Guggisberg <[EMAIL PROTECTED]> wrote:
hi michael,
On 9/4/06, Michael Neale <[EMAIL PROTECTED]> wrote:
> hi Stefan.
>
> Yes, I was able to make it rip through saving lots of simple nodes like
> that, no problem.
> When I add more properties, it degrades a fair bit (probably not
> surprising if I guess at how the data is being stored for each property).
>
> Interestingly, when I use my own specific node type it slows down quite
> a lot (and memory consumption goes up) compared with nt:unstructured,
> even though all other properties are set in the same way. I had to bump
> the memory up quite a lot to avoid OutOfMemoryErrors.
that's indeed very interesting and comes as a surprise. would you mind
sharing with us your node type definitions and some sample code? i'd like
to investigate this further.
cheers
stefan
>
> In the end, when I batched things up, I was able to ramp up the number
> of nodes to what I wanted to test. Performance was acceptable once it
> was loaded up - it is definitely the save() operations that are the most
> expensive. It was just very, very difficult to build up my test data
> without killing memory.
>
> Thanks everyone for your help, I have learned a lot about Jackrabbit in
> the meantime.
>
> On 9/1/06, Stefan Guggisberg <[EMAIL PROTECTED]> wrote:
> >
> > hi michael
> >
> > i quickly ran a test which successfully added 20k child nodes to the
> > same parent (whether that's a useful content model is a different
> > story...).
> >
> > here's the code i used to test:
> >
> > Node parent = root.addNode("foo", "nt:unstructured");
> > for (int i = 1; i <= 20000; i++) {
> >     parent.addNode("bar");
> >     if (i % 1000 == 0) {
> >         root.save();
> >         System.out.println("added 1000 child nodes; total=" + i);
> >     }
> > }
> >
> > note that save() is a relatively expensive operation; it therefore
> > makes sense to batch multiple addNode etc. calls (which are relatively
> > inexpensive).
> >
> > please provide a simple self-contained test case that reproduces the
> > behaviour
> > you're describing.
> >
> > cheers
> > stefan
> >
> > On 9/1/06, Michael Neale <[EMAIL PROTECTED]> wrote:
> > > 1:
> > > yeah, I use JProfiler - top of the charts with a bullet was:
> > > org.apache.jackrabbit.util.WeakIdentityCollection$WeakRef (aha! that
> > > would explain the performance slug when GC has to kick in late in
> > > the piece).
> > > followed by:
> > > org.apache.derby.impl.store.raw.data.StoredRecordHeader
> > > and of course a whole lot of byte[].
> > >
> > > I am using default everything (which means Derby) and no blobs
> > > whatsoever (so all in the database).
> > >
> > > 2:
> > > If I logout and use fresh everything, it seems to continue fine
> > > (i.e. at a fast enough pace), but I haven't really pushed it to
> > > where I wanted to get it (10,000 child nodes).
> > >
> > > Responding to Alexandru's email (hi Alex, nice work on InfoQ if I
> > > remember correctly! I am a fan), it would seem that the Session
> > > keeps most in memory, which I can understand.
> > >
> > > I guess my problem is that I am trying to load up the system to
> > > test, really basically, that it scales to the numbers that I know I
> > > need to scale to, but I am having trouble getting the data in,
> > > bulk-load wise. If I bump up the memory, it certainly seems to hum
> > > along better, but if the Session is keeping a lot around, then this
> > > will have limits - is there no way to "clear" the session?
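The bulk-load pattern that emerges from this thread (batch the save() calls, and every few batches log out and start a fresh session so cached item state is released) can be sketched roughly as below. This is only a sketch: FakeSession is a stand-in so the code compiles on its own, not the real javax.jcr API, where the analogous calls would be Session.save(), Session.logout(), and Repository.login(); all names and numbers here are illustrative.

```java
// Sketch of batched saves plus periodic session recycling to bound memory.
// FakeSession is an assumption standing in for javax.jcr.Session.
public class BulkLoadSketch {

    static class FakeSession {
        int pendingChanges = 0;   // transient, unsaved items
        int savedTotal = 0;       // items this session has persisted
        void addNode() { pendingChanges++; }                              // cheap
        void save()    { savedTotal += pendingChanges; pendingChanges = 0; } // expensive
        void logout()  { /* real JCR: drops the session's item cache */ }
    }

    // Returns { number of batch saves, number of sessions used, nodes persisted }.
    static int[] load(int totalNodes, int batchSize, int batchesPerSession) {
        int saves = 0, logins = 1, persisted = 0;
        FakeSession session = new FakeSession();
        for (int i = 1; i <= totalNodes; i++) {
            session.addNode();
            if (i % batchSize == 0) {
                session.save();                    // persist one batch
                saves++;
                if (saves % batchesPerSession == 0) {
                    persisted += session.savedTotal;
                    session.logout();              // release cached state
                    session = new FakeSession();   // "fresh everything"
                    logins++;
                }
            }
        }
        session.save();                            // flush any remainder
        persisted += session.savedTotal;
        return new int[] { saves, logins, persisted };
    }

    public static void main(String[] args) {
        int[] r = load(10000, 500, 5);  // 10k nodes, batches of 500, 5 batches/session
        System.out.println(r[0] + " saves, " + r[1] + " sessions, " + r[2] + " nodes");
    }
}
```

The point of the recycling step is exactly the "clear the session" question above: if the session's internal cache only grows, throwing the session away periodically is the blunt but effective way to cap it.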
> > >
> > > Perhaps I will explain what I am using JCR for (feel free to smack
> > > me down if this is not what JCR and Jackrabbit were ever intended
> > > for): I am storing "atomic business rules" (which means each node
> > > is a small, single business rule). The data on each node is very
> > > small. These nodes are stored flat as child nodes under a top-level
> > > node. To give structure (categorisation) for the users, I have
> > > references to these nodes all over the place, so people can
> > > navigate them in all sorts of different ways (as there is no one
> > > clear hierarchy at the time the rules are created). JCR gives me
> > > most of what I need, but as these rule nodes can number in the
> > > thousands (4000 is not uncommon for a reasonably complex business
> > > unit), I am worried that this just can't work.
> > >
> > > I have seen from past posts that people put nodes under different
> > > parents (so there is no great number of child nodes), so that is
> > > one option, but my gut feel is that it's the WeakIdentityCollection:
> > > this well-meaning code means that the GC has to do a huge amount of
> > > work at the worst possible time (when under stress). I am sure most
> > > of the time this is not an issue.
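The different-parents option mentioned above can be made deterministic by hashing each rule's name into one of a small, fixed set of parent "buckets", so no single parent ever holds thousands of children. A pure-Java sketch follows; the resulting path would be fed to something like Node.addNode() in real JCR code, and the names here (ruleBucketPath, BUCKETS) are illustrative, not from any API.

```java
// Sketch: derive a stable parent bucket from a rule name, so a flat list
// of thousands of children becomes BUCKETS smaller lists. Deterministic:
// the same name always maps to the same bucket, so a rule can be located
// again without searching.
public class BucketSketch {
    static final int BUCKETS = 64;  // illustrative; tune to expected volume

    static String ruleBucketPath(String ruleName) {
        // Mask the sign bit so the modulus is always in [0, BUCKETS).
        int bucket = (ruleName.hashCode() & 0x7fffffff) % BUCKETS;
        return "rules/bucket" + bucket + "/" + ruleName;
    }
}
```

With 4000 rules spread over 64 buckets, each parent averages around 60 children, which keeps any one child-node list far from the sizes being stress-tested in this thread.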
> > >
> > > Any ideas/tips/gotchas for a newbie? I would really like to be
> > > confident that I can scale up enough (it's modest) with JCR for
> > > this purpose.
> > >
> > > On 8/31/06, Nicolas <[EMAIL PROTECTED]> wrote:
> > > >
> > > > 2 more ideas:
> > > >
> > > > 1/ Did you try using a memory profiler so we can know what is
> > > > wrong?
> > > >
> > > > 2/ What happens if you logout after say 100 updates?
> > > >
> > > >
> > > > a+
> > > > Nico
> > > > my blog! http://www.deviant-abstraction.net !!
> > > >
> > > >
> > >
> > >
> >
>
>