[
https://issues.apache.org/jira/browse/OAK-2799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509008#comment-14509008
]
Thomas Mueller commented on OAK-2799:
-------------------------------------
[~chetanm] I don't understand how using a weak reference map would help here.
How does it save memory?
I think the problem is that currently each clone uses its own OakIndexFile. If
there are many clones of the same file, then the same file is loaded (cached?)
in memory multiple times. Maybe _additionally_ the problem is that each
individual OakIndexFile uses too much memory (on average 1 MB in the heap
histogram), but that should be solved in another way.
I would rather ensure that each OakIndexInput clone points to the same
OakIndexFile. If the OakIndexInput is closed, I guess the OakIndexFile should
be closed (with an exception thrown if this is a clone). That way the clones
will detect closed files. In addition, each OakIndexInput needs to keep its own
private position within the OakIndexFile, and OakIndexFile needs to be extended
to support positioned read operations.
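To make the proposal above concrete, here is a rough sketch of the idea, not
the actual Oak code. SharedFile and SharedInput are hypothetical stand-ins for
OakIndexFile and OakIndexInput, copy() stands in for clone(), the backing
store is just a byte array, and IllegalStateException stands in for Lucene's
AlreadyClosedException. All of these are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for OakIndexFile: one shared byte source, no cursor of its own. */
final class SharedFile {
    private final byte[] data;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    SharedFile(byte[] data) {
        this.data = data.clone();
    }

    /** Positioned read: the caller passes the offset, so inputs never share a cursor. */
    void readBytes(long pos, byte[] dst, int off, int len) {
        if (closed.get()) {
            throw new IllegalStateException("already closed");
        }
        if (pos < 0 || len < 0 || pos + len > data.length) {
            throw new IndexOutOfBoundsException("pos=" + pos + ", len=" + len);
        }
        System.arraycopy(data, (int) pos, dst, off, len);
    }

    void close() {
        closed.set(true);
    }
}

/** Hypothetical stand-in for OakIndexInput: a private position over a shared file. */
final class SharedInput {
    private final SharedFile file;
    private final boolean isClone;
    private long position;

    SharedInput(SharedFile file) {
        this(file, false, 0);
    }

    private SharedInput(SharedFile file, boolean isClone, long position) {
        this.file = file;
        this.isClone = isClone;
        this.position = position;
    }

    byte readByte() {
        byte[] one = new byte[1];
        file.readBytes(position, one, 0, 1); // throws once the shared file is closed
        position++;
        return one[0];
    }

    void seek(long pos) {
        position = pos;
    }

    long getFilePointer() {
        return position;
    }

    /** The copy shares the same file but starts with its own copy of the position. */
    SharedInput copy() {
        return new SharedInput(file, true, position);
    }

    /** Only the original may close; closing it closes the file for every clone. */
    void close() {
        if (isClone) {
            throw new IllegalStateException("clones must not be closed");
        }
        file.close();
    }
}

final class SharedInputDemo {
    public static void main(String[] args) {
        SharedInput original = new SharedInput(new SharedFile(new byte[] {10, 20, 30}));
        original.readByte();
        SharedInput clone = original.copy();
        clone.seek(0); // moves only the clone's position
        System.out.println(clone.readByte() + " " + original.readByte());
        original.close(); // any later clone.readByte() now throws
    }
}
```

With this layout no clone holds its own copy of the file content, and a clone
notices that the original was closed on its next read without the original
having to track its clones.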
> OakIndexInput cloned instances are not closed
> ---------------------------------------------
>
> Key: OAK-2799
> URL: https://issues.apache.org/jira/browse/OAK-2799
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: lucene
> Affects Versions: 1.2.1
> Reporter: Tommaso Teofili
> Assignee: Tommaso Teofili
> Fix For: 1.3.0, 1.2.2, 1.0.14
>
> Attachments: OAK-2799.0.patch, OAK-2799.1.patch
>
>
> Related to the inspections I was doing for OAK-2798 I also noticed that we
> don't fully comply with the {{IndexInput}} javadoc [1] as the cloned
> instances should throw the given exception if original is closed, but I also
> think that the original instance should close the cloned instances, see also
> [ByteBufferIndexInput#close|https://github.com/apache/lucene-solr/blob/lucene_solr_4_7_1/lucene/core/src/java/org/apache/lucene/store/ByteBufferIndexInput.java#L271].
> [1] : {code}
> /** Abstract base class for input from a file in a {@link Directory}. A
> * random-access input stream. Used for all Lucene index input operations.
> *
> * <p>{@code IndexInput} may only be used from one thread, because it is not
> * thread safe (it keeps internal state like file position). To allow
> * multithreaded use, every {@code IndexInput} instance must be cloned before
> * used in another thread. Subclasses must therefore implement {@link #clone()},
> * returning a new {@code IndexInput} which operates on the same underlying
> * resource, but positioned independently. Lucene never closes cloned
> * {@code IndexInput}s, it will only do this on the original one.
> * The original instance must take care that cloned instances throw
> * {@link AlreadyClosedException} when the original one is closed.
> {code}