kinow commented on PR #568:
URL: https://github.com/apache/opennlp/pull/568#issuecomment-1856892659

   Thanks for the GC numbers. The new code does seem a bit heavier on GC and 
memory allocations, but I think that was expected.
   
   A bit late here, but I decided to give it a try. Excuse me if I say anything 
silly — I can double-check these results tomorrow after some :coffee: . I 
opened your branch in IntelliJ, made the `jmh` profile active by default 
(how do you tell IntelliJ to include the dependencies of a profile? anyway), 
invalidated caches, etc.
   
   I executed the tests just to make sure they were running, since you had 
already provided the results. Then I ran them with the profiler to generate the 
`.jfr` file. I saved the `jmh` source folder, checked out `main`, patched the 
`pom.xml` files, and ran the same test again to get a second `.jfr` for the 
`main` branch (so the earlier file is for this branch, the later one is for 
`main`).
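   For reference, the workflow above should also be reproducible from the 
command line. This is only a sketch: the `jmh` profile name comes from the PR, 
but the jar path is my guess, and a recent JMH version is assumed for the 
built-in `jfr` profiler:
   
   ```shell
   # Build with the benchmark profile active (jar location assumed)
   mvn -Pjmh -DskipTests package
   # Run the benchmarks with JMH's JFR profiler; a .jfr file is
   # written per fork alongside the benchmark results
   java -jar target/benchmarks.jar -prof jfr
   ```
   
   Repeating the same two commands on `main` would give the second `.jfr` to 
compare against.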
   
   I opened both in IntelliJ and selected the `.jfr` file from this branch to 
compare against the one from `main`. I'm not sure whether this is a fair 
comparison, since the difference in throughput might skew some numbers, but 
I'm sharing it in case you find it useful, @rzo1 . 
   
   ## CPU samples
   
   
![flamegraph](https://github.com/apache/opennlp/assets/304786/5d715064-5ea0-4dbf-8d89-f8c277ad0a1f)
   
   I think it just confirms the change of calls to intern functions.
   
   
![image](https://github.com/apache/opennlp/assets/304786/26235942-e711-4ed3-b242-18098b9ddca1)
   
   Also the increase in GC calls.
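   If I read the change right, it drops the `String.intern()` calls, which 
would explain both graphs. A minimal sketch of the trade-off (class name and 
strings are mine, not from the PR):
   
   ```java
   // Interned strings are shared canonical instances; without interning,
   // every runtime-built token is a fresh short-lived allocation, which
   // shows up as more young-gen GC activity.
   public class InternSketch {
       public static void main(String[] args) {
           // Two equal strings built at runtime are distinct objects...
           String a = new StringBuilder("to").append("ken").toString();
           String b = new StringBuilder("to").append("ken").toString();
           System.out.println(a == b);                   // false
           // ...unless both are interned into the JVM's string pool.
           System.out.println(a.intern() == b.intern()); // true
       }
   }
   ```
   
   So fewer shared instances means more transient `String` garbage, which 
matches the GC numbers above.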
   
   
![image](https://github.com/apache/opennlp/assets/304786/759928d2-1674-426a-a80a-24a39bf90ff8)
   
   And I think the `Random` calls come from JMH itself, with more or fewer 
samples due to the higher throughput.
   
   
![image](https://github.com/apache/opennlp/assets/304786/a0eb0c07-8feb-421f-bd1d-01f56135479e)
   
   ## Memory allocations
   
   
![flamegraph](https://github.com/apache/opennlp/assets/304786/7e20ff7b-3b5d-45be-9984-67229c9441c6)
   
   A lot more `String`s, as expected, as well as `byte[]`s from the array-copy 
calls.
   
   
![image](https://github.com/apache/opennlp/assets/304786/d3f203ff-1bbc-4d33-8c70-c2ed93861443)
   
   
![image](https://github.com/apache/opennlp/assets/304786/43ba112d-2402-4c68-966f-f2e335334f74)
   
   Unfortunately I don't have a better or more practical test for comparing 
memory (maybe a single run on one large example would be better for that than 
multiple JMH runs…). But I don't think this should really be a problem. I 
wonder whether the interning was a premature optimization, or whether someone 
hit memory issues at some point. Either way, it's an easy change to revert if 
needed, and this way users won't have to tweak heap settings. So I am inclined 
to approve it! :tada: 
   
   I'll wait until tomorrow so you, Martin, Jeff, and others have had some time 
to review :sleeping_bed: 
   
   Thanks!
   
   Bruno

