awrigh11 opened a new issue, #3500:
URL: https://github.com/apache/accumulo/issues/3500
**Describe the bug**
We ran a profiler on the GC cycle iterations and found that the Garbage
Collector was spending a lot of time evaluating paths, specifically in calls to
Path.getParent(), with the following call tree:
Path.getParent()
Path.<init>
Path.initialize()
Path.normalizePath()
Matcher.replaceAll()
Along the execution path measured in the profiler:
- 85.6% of the time was spent in convertRow()
- 78.2% of the time was spent in StoredTabletFile()
- 50.2% of the time was spent in Path.getParent()
- 5.8% of the time was spent in ValidationUtil.validateFileName()
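
The call tree above suggests that every Path.getParent() call constructs a new Path and re-runs regex-based normalization on it. The following is a minimal sketch of that pattern, not Hadoop's or Accumulo's actual code: `SimplePath`, `countNormalizations`, and the synthetic `/accumulo/tables/...` file layout are hypothetical names used for illustration, and it assumes that parsing one file reference walks up two parent levels.

```java
import java.util.regex.Pattern;

public class PathParentSketch {

  /** Simplified, hypothetical stand-in for a path class. This is NOT Hadoop's Path. */
  static final class SimplePath {
    private static final Pattern SLASHES = Pattern.compile("/+");

    /** Counts how many times normalization has run (illustration only). */
    static long normalizeCalls = 0;

    private final String path;

    SimplePath(String raw) {
      // Every construction pays for normalization, even if the input is already clean.
      this.path = normalize(raw);
    }

    private static String normalize(String raw) {
      normalizeCalls++;
      // Collapse runs of slashes with a regex, mirroring the Matcher.replaceAll() frame.
      String collapsed = SLASHES.matcher(raw).replaceAll("/");
      // Trim a trailing slash from any non-root path.
      if (collapsed.length() > 1 && collapsed.endsWith("/")) {
        collapsed = collapsed.substring(0, collapsed.length() - 1);
      }
      return collapsed;
    }

    /** Returns the parent as a NEW SimplePath (so it is normalized again), or null at the root. */
    SimplePath getParent() {
      int idx = path.lastIndexOf('/');
      if (idx <= 0) {
        return null;
      }
      return new SimplePath(path.substring(0, idx));
    }

    @Override
    public String toString() {
      return path;
    }
  }

  /** Builds {@code files} synthetic file paths and walks up two levels for each (assumed shape). */
  static long countNormalizations(int files) {
    SimplePath.normalizeCalls = 0;
    for (int i = 0; i < files; i++) {
      SimplePath file = new SimplePath("/accumulo/tables/1/t-" + (i % 100) + "/F" + i + ".rf");
      SimplePath tabletDir = file.getParent();
      SimplePath tableDir = tabletDir.getParent();
      if (tableDir == null) {
        throw new IllegalStateException("unexpected root for " + file);
      }
    }
    return SimplePath.normalizeCalls;
  }

  public static void main(String[] args) {
    System.out.println("normalizations for 10k files: " + countNormalizations(10_000));
  }
}
```

In this sketch each file costs three normalizations (one for the file path and one per getParent() call), so 10k files trigger 30,000 regex passes over strings that were already normalized. That is the kind of repeated work the profile points at.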
**Versions (OS, Maven, Java, and others, as appropriate):**
- Affected version(s) of this project: 2.1.1
- Hadoop: 3.3.3
**To Reproduce**
We set up an Accumulo cluster with about 10k RFiles and measured the GC with a
profiler.
Here is a git repo detailing how we did that:
https://github.com/dtspence/accumulo-jmh-test