Thanks for the tip. I'll give it a try. What sort of performance do you get? Do
you do thousands in one submit?
C N Davies on 18/06/10 14:17, wrote:
Not sure if you have the same issue, but I found breaking my commits down to
10 records per commit and calling em.clear() after each commit improved my
performance and fixed a lot of detached entity issues.
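For anyone finding this thread in the archive, the batching pattern described above might look something like this. This is only a sketch assuming a resource-local javax.persistence EntityManager; the method and variable names are made up:

```java
import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.EntityTransaction;

// Sketch of commit-in-batches with em.clear(): names are illustrative,
// not from the original poster's code.
public class BatchInsert {

    static final int BATCH_SIZE = 10;

    public static void insertAll(EntityManager em, List<?> records) {
        EntityTransaction tx = em.getTransaction();
        tx.begin();
        int count = 0;
        for (Object record : records) {
            em.persist(record);
            if (++count % BATCH_SIZE == 0) {
                tx.commit();   // flush this batch to the database
                em.clear();    // detach everything, so the persistence
                               // context doesn't grow without bound
                tx.begin();
            }
        }
        tx.commit();           // commit the final partial batch
    }
}
```

Note that after em.clear() every previously managed entity is detached, which is presumably why the batch size and clear point have to be chosen together.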
-----Original Message-----
From: Adam Hardy [mailto:[email protected]]
Sent: Friday, 18 June 2010 10:11 PM
To: [email protected]
Subject: Re: Performance with large lists
I have a transaction that submits around 20K records, followed by another
transaction in the user process which inserts the same into a related table.
The performance issues I talked about below initially caused processing times
of 45 mins, but I optimized the Java code and tuned the MySQL database, which
reduced it to 15 mins.
This is still a problem though - I've been through the optimization guidelines
in the documentation and there's nothing there that I'm not already
implementing.
The first transaction I mentioned takes 5 mins, but the second takes 15 mins
and inserts child records of the records created in the first transaction. It
looks like OpenJPA is fetching all of those first records again. Shouldn't
they already be in memory?
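One thing that might avoid the refetch, if the parent ids are already known from the first transaction, is attaching each child through a lazy reference instead of loading the parent. This is a sketch only; Parent and Child here are hypothetical stand-ins for the real entities:

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.ManyToOne;

@Entity
class Parent {
    @Id long id;
}

@Entity
class Child {
    @Id long id;
    @ManyToOne Parent parent;

    void setParent(Parent p) { this.parent = p; }
}

public class AttachChild {
    public static void attach(EntityManager em, long parentId, Child child) {
        // getReference() returns a lazy proxy and does not issue a SELECT,
        // so populating the foreign key doesn't reload the 20K parent rows.
        Parent parentRef = em.getReference(Parent.class, parentId);
        child.setParent(parentRef);
        em.persist(child);
    }
}
```

Whether this helps depends on why OpenJPA is fetching the parents in the first place (cascades and inverse-side maintenance can also trigger loads), so treat it as one thing to try.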
Thanks Adam
Adam Hardy on 12/06/10 13:21, wrote:
I am trying to get a handle on what I should be able to achieve.
Can someone give me some idea of the metrics I should be able to get
optimistically when persisting an object that has a child list with 20,000
child objects and 20,000 grandchildren? (one-to-one child -> grandchild)
Can I reasonably expect to get this done in under a minute?
I think that would work out at a rate of about 1.5 milliseconds per object.
Thanks Adam
Adam Hardy on 11/06/10 17:34, wrote:
I have performance problems with large lists of beans due to the base
class I am using for my entities.
This is slightly non-OpenJPA specific, so I hope nobody minds, but it is
Friday afternoon so I'm hoping you give me a bit of slack here.
The problem arises when I start building lists with over 10,000 items on
a parent class.
The trouble is in the base class for the entities, which is quite clever
(but obviously not clever enough): it has non-trivial equals() and
hashCode() implementations that use reflection. It's here that the
slow-down comes.
When I link a child to a parent that already has 10,000 children, ArrayList
calls the equals() method against the existing elements before the new child
is added.
As far as I can tell I have a couple of options.
(1) ditch the reflection-based equals method and hard-code an equals
method.
(2) don't use ArrayList but find a Collection class that uses hashes or
similar to identify items instead of equals(). This is just speculation -
perhaps there is no such thing, or it wouldn't help anyway:
- would a collection using hashes cache the hashes of the items already
indexed? - would such a collection be persistable?
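On option (2): the standard HashSet/LinkedHashSet do roughly this - the backing HashMap stores each entry's computed hash, and only falls through to equals() when two hashes collide, so adding to a set of 10,000 does not scan all 10,000 elements. A self-contained sketch (Child here is a stand-in with a hard-coded equals/hashCode on a natural key, not the real entity):

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in entity with non-reflective equals()/hashCode() based on a
// natural key. The counter shows how rarely equals() actually runs.
class Child {
    static final AtomicInteger equalsCalls = new AtomicInteger();
    final long id;

    Child(long id) { this.id = id; }

    @Override public boolean equals(Object o) {
        equalsCalls.incrementAndGet();
        return o instanceof Child && ((Child) o).id == this.id;
    }

    @Override public int hashCode() { return Long.hashCode(id); }
}

public class SetDemo {
    public static void main(String[] args) {
        // LinkedHashSet keeps insertion order (List-like iteration) but
        // locates elements by hash, so each add is O(1) rather than a scan.
        Set<Child> children = new LinkedHashSet<>();
        for (long i = 0; i < 10_000; i++) {
            children.add(new Child(i));
        }
        System.out.println("size = " + children.size());
        System.out.println("equals() calls = " + Child.equalsCalls.get());
    }
}
```

As for persistability: JPA can map a java.util.Set-typed relationship, though you then give up a persistent index order unless you map it as an ordered List. And hashCode() still gets called once per add, so a reflection-based hashCode() would need replacing too (your option 1) for this to pay off.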
If anyone has been in this situation before, or has an idea about it,
I'd really appreciate the help.