[ https://issues.apache.org/jira/browse/OPENJPA-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541644 ]

Christiaan commented on OPENJPA-439:
------------------------------------

Attached you will find a test case to reproduce the performance drop. The test case 
should be run twice, once with generateLargeDataset set to true and once with it set 
to false. The first run generates the data, which takes about 15 minutes on my 
machine; the second run performs the actual performance test (with the generated 
data this can be executed 5 times). The test consists of collecting objects for 
deletion and then deleting them.
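
For reference, each timed run boils down to the following shape. This is only a 
minimal sketch, assuming a JDO-style API (pm.deleteAll() is mentioned further down; 
I use the standard deletePersistentAll() here); the Node class and the query are 
placeholders, the real code is in the attached test case:

    import java.util.ArrayList;
    import java.util.Collection;
    import javax.jdo.PersistenceManager;
    import javax.jdo.Query;

    // Placeholder persistent class standing in for the classes in the attached test case.
    class Node { Node parent; }

    class DeletePerfRun {
        // One timed run: collect the objects to delete, then delete them in one transaction.
        static void runOnce(PersistenceManager pm) {
            long start = System.currentTimeMillis();
            pm.currentTransaction().begin();

            // 1) Collect: query for the roots and traverse the object tree to gather
            //    candidates for deletion (traversal of children omitted in this sketch).
            Query query = pm.newQuery(Node.class, "parent == null");
            Collection roots = (Collection) query.execute();
            Collection toDelete = new ArrayList(roots);
            long collected = System.currentTimeMillis();

            // 2) Delete: hand the whole collection to the PersistenceManager and commit.
            pm.deletePersistentAll(toDelete);
            pm.currentTransaction().commit();

            long done = System.currentTimeMillis();
            System.out.println("Collecting took " + (collected - start) + " ms, deleting took "
                    + (done - collected) + " ms");
        }
    }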

The output of the test case:
Collecting 41371 objects took: 0:0:12:799 (12799.0 ms)
Deleting objects took: 0:0:6:267 (6267.0 ms)
duration 1st run: 0:0:19:66 (19066.0 ms)
Collecting 41371 objects took: 0:0:22:730 (22730.0 ms)
Deleting objects took: 0:0:5:569 (5569.0 ms)
duration 2nd run: 0:0:28:299 (28299.0 ms)

A couple of things I noticed:
1) The performance drop only occurs when a large number of objects is involved 
(>20,000 objects). When the number is small there is no performance drop. 
2) The factor of the performance drop grows with the number of objects, 
e.g. 40,000 objects show a slowdown of a factor 2, 50,000 objects a factor 4.
3) The performance drop occurs while traversing the object tree, not in the actual 
delete (which is actually faster in the second run). 

Attached is also the profiler data for this test case. As you can see, the 
performance drop is caused by AbstractHashedMap.clear(). clear() iterates over all 
entries and sets them to null. The question is why iterating is so much slower in 
the second run when the same number of objects is involved. I can imagine that 
leaving the hash map's internal data structure intact and adding objects with new 
identities grows that structure, and that this has an impact on iterating over it 
even if the number of entries stays the same. But this is just my assumption.
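
The issue description mentions explicit clear() calls versus simply dereferencing 
the collections. Just to illustrate the difference being discussed (this is not the 
OpenJPA code, only the general pattern):

    import java.util.HashMap;
    import java.util.Map;

    // Illustration of the two reset strategies discussed in this issue.
    class StateCache {
        private Map cache = new HashMap();

        // clear() walks the internal table and nulls out every entry. The table keeps
        // whatever capacity it has grown to, so the walk stays expensive even though
        // the entries are about to become garbage anyway.
        void resetByClearing() {
            cache.clear();
        }

        // Dereferencing hands the whole old map to the garbage collector in one step;
        // no per-entry work is done on the application thread.
        void resetByDereferencing() {
            cache = new HashMap();
        }
    }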

One other interesting thing to note is that after all objects have been 
collected and pm.deleteAll() + commit() is called, there is quite an increase in 
memory usage. After collecting the objects the memory usage is 40 MB; after 
committing the deleteAll() it is 91 MB. So the memory usage more than doubles, even 
though all objects to delete have already been loaded into memory! This probably 
needs to be investigated in a separate issue. After the commit, the memory usage 
nicely drops back to its level at the start of the transaction. In the second run, 
the memory usage peaks at 105 MB, but this 15 MB increase might be related to the 
implementation of clear().
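
For what it's worth, figures like the ones above can be taken with a plain heap 
sample; a helper along these lines would do (an assumption on my part, not code 
from the attached test case):

    class MemorySample {
        // Assumed helper for sampling heap usage in MB; the actual test case may
        // measure memory differently, e.g. via the profiler.
        static long usedMemoryMb() {
            Runtime rt = Runtime.getRuntime();
            rt.gc(); // request a collection so the reading is less inflated by garbage
            return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        }
    }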

Btw, if you could send me the patched jar file, I could run the test as well. 


> Performance degradation in multi-transaction operations
> -------------------------------------------------------
>
>                 Key: OPENJPA-439
>                 URL: https://issues.apache.org/jira/browse/OPENJPA-439
>             Project: OpenJPA
>          Issue Type: Bug
>          Components: kernel
>    Affects Versions: 0.9.0, 0.9.6, 0.9.7, 1.0.0, 1.0.1, 1.0.2, 1.1.0
>            Reporter: Patrick Linskey
>             Fix For: 1.1.0
>
>         Attachments: OPENJPA-439.patch, performance testcase results.zip, 
> testcaseperformance.zip
>
>
> Reusing a Broker for multiple transactions / persistence contexts 
> demonstrates a performance degradation, possibly due to explicit calls to 
> clear sets and maps, rather than just dereferencing them.
> Discussion: 
> http://www.nabble.com/Performance-drop-in-AbstractHashedMap.clear%28%29-tf4769771.html#a13656730

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
