[ 
https://issues.apache.org/jira/browse/VFS-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123095#comment-15123095
 ] 

Bernd Eckenfels commented on VFS-593:
-------------------------------------

Not sure if .intern() is a good default solution. With automatic string 
deduplication running in the background the problem should be smaller, but 
more importantly I am not sure where this excessive duplication is coming 
from, because the internal file cache is supposed to deduplicate the 
FileObjects it retains.
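
For reference, a minimal sketch of what .intern() does and does not change. 
The class and method names (InternSketch, freshCopy, ...) and the sample path 
are made up for illustration and are not part of VFS:

```java
// Hypothetical illustration only; none of these names exist in Commons VFS.
final class InternSketch {

    // Builds a new String instance at runtime, so the compiler cannot
    // fold it into a shared constant.
    static String freshCopy(String s) {
        return new String(s.toCharArray());
    }

    // Two copies with equal content are still two separate objects.
    static boolean distinctBeforeIntern(String path) {
        String a = freshCopy(path);
        String b = freshCopy(path);
        return a != b && a.equals(b);
    }

    // After intern(), both references point at the same canonical instance.
    static boolean sharedAfterIntern(String path) {
        String a = freshCopy(path).intern();
        String b = freshCopy(path).intern();
        return a == b;
    }

    public static void main(String[] args) {
        String path = "file:///C:/projects/example/src"; // made-up sample path
        System.out.println(distinctBeforeIntern(path));
        System.out.println(sharedAfterIntern(path));
    }
}
```

Note that both copies are still allocated in sharedAfterIntern; intern() 
only changes which instance stays referenced afterwards.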

So here are a few things to consider:

A) let us know what strings exactly (class and fieldname and content pattern) 
you see duplicated
B) give us a class histogram with the count of live instances per class from 
the above-mentioned heap dump (especially how many *FileObjects, *FileNames 
and *FileSystem instances we are talking about, and how many of them have 
different paths)
C) can you try with a 2.1 SNAPSHOT? There are multiple leaks fixed in this area 
(especially when using overlay archive filesystems). If you have compatibility 
issues with 2.1 please let us know in a separate bug
D) how is your file manager configured, especially the FilesCache and 
CacheStrategy
E) can you confirm you are talking about live objects and not allocation 
profiling (i.e. are the strings strongly referenced)? (intern() won't reduce 
the rate at which those strings are produced; it only avoids multiple 
instances being referenced.)
F) are you dynamically (explicitly or implicitly) creating new file systems 
(layered on top of the LocalFileSystem)?
G) if you can make a test run with a large heap which does not contain 
sensitive data you could provide us with a heap dump
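
To make the distinction in E) concrete, here is a rough sketch that compares 
how many String instances stay referenced with how many distinct contents 
they hold. DuplicateCount and its methods are hypothetical helpers, not VFS 
API, and the list merely stands in for whatever keeps the strings alive:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Commons VFS.
final class DuplicateCount {

    // Number of separate String objects in the list (compared by identity).
    static int distinctInstances(List<String> live) {
        Set<String> byIdentity =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        byIdentity.addAll(live);
        return byIdentity.size();
    }

    // Number of different string values in the list (compared by equals).
    static int distinctContents(List<String> live) {
        return new HashSet<String>(live).size();
    }

    // Allocates n copies of the same path and keeps them strongly referenced.
    // The allocation happens either way; intern only changes what is retained.
    static List<String> retain(String path, int n, boolean intern) {
        List<String> live = new ArrayList<String>();
        for (int i = 0; i < n; i++) {
            String s = new String(path.toCharArray());
            live.add(intern ? s.intern() : s);
        }
        return live;
    }
}
```

A large gap between distinctInstances and distinctContents on strongly 
referenced strings is the situation where interning could help; a high 
allocation count alone is not.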

> duplicated Strings inside WindowsLocalFile
> ------------------------------------------
>
>                 Key: VFS-593
>                 URL: https://issues.apache.org/jira/browse/VFS-593
>             Project: Commons VFS
>          Issue Type: Bug
>    Affects Versions: 2.0
>            Reporter: Sam Halliday
>
> I did a YourKit analysis of a heapdump when running ENSIME, which uses VFS2, 
> and I found several GB (25%) of the heap was wasted on duplicated strings 
> within VFS2.
> This could be averted by using `.intern()` on Strings that represent paths.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
