Hi all.

We are using IndexWriter with no limits set and managing the commits
ourselves, mainly so that we can ensure they are done at the same time
as other (non-Lucene) commits.

After upgrading from 3.0 to 3.3, we are seeing a change in
ramSizeInBytes() behaviour: it no longer resets to zero after a
commit().  The end result is that after a while, the code wants to
commit after adding even a single document.
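
To illustrate the failure mode, here is a minimal sketch of a
threshold-style check. It is not our actual code: the class and method
names, the byte counts and the threshold are all made up for
illustration, and a LongSupplier stands in for writer.ramSizeInBytes()
so that the sketch runs without Lucene on the classpath.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: simulates a "commit once the reported RAM size
// reaches a threshold" policy. Nothing here calls Lucene.
class RamThresholdCommitSketch {

    /** Returns true when the reported buffer size has reached the threshold. */
    static boolean shouldCommit(LongSupplier ramSizeInBytes, long thresholdBytes) {
        return ramSizeInBytes.getAsLong() >= thresholdBytes;
    }

    /**
     * Counts how many commits the check triggers over a run of adds.
     * residualAfterCommit is the value the counter reports right after
     * a commit (0 if it resets, non-zero if it does not).
     */
    static int countCommits(int docs, long bytesPerDoc,
                            long residualAfterCommit, long thresholdBytes) {
        long[] reported = {0};
        int commits = 0;
        for (int i = 0; i < docs; i++) {
            reported[0] += bytesPerDoc;            // stand-in for addDocument()
            if (shouldCommit(() -> reported[0], thresholdBytes)) {
                commits++;
                reported[0] = residualAfterCommit; // stand-in for commit()
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        // Counter resets to zero on commit: one commit per 10 documents.
        System.out.println(countCommits(100, 10, 0, 100));
        // Counter stays at the threshold after commit: once the first
        // commit happens, every later document triggers another one.
        System.out.println(countCommits(100, 10, 100, 100));
    }
}
```

With these made-up numbers the first call reports 10 commits and the
second reports 91, which is the "commit after every single document"
pattern described above.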

I boiled it down to a test case (though I'm obviously just using JUnit
as a helper here):

    @Test
    public void testIndexWriterByteCount() throws Exception
    {
        Directory directory = new RAMDirectory();
        IndexWriter writer = new IndexWriter(directory,
            new WhitespaceAnalyzer(), IndexWriter.MaxFieldLength.UNLIMITED);
        System.out.println("At start: " + writer.ramSizeInBytes());

        for (int j = 0; j < 3; j++)
        {
            for (int i = 0; i < 5; i++)
            {
                Document document = new Document();
                document.add(new Field("text", "a", Field.Store.YES,
                    Field.Index.ANALYZED));
                writer.addDocument(document);
            }
            System.out.println("After adding some docs: " +
                writer.ramSizeInBytes());

            writer.commit();
            System.out.println("After commit: " + writer.ramSizeInBytes());
        }

        writer.close();
        directory.close();
    }

The results on Lucene 3.3.0:

    At start: 0
    After adding some docs: 99400
    After commit: 99344
    After adding some docs: 99400
    After commit: 99344
    After adding some docs: 99400
    After commit: 99344

The results of running more or less the same test on Lucene 3.0.3:

    At start: 0
    After adding some docs: 115712
    After commit: 0
    After adding some docs: 50176
    After commit: 0
    After adding some docs: 50176
    After commit: 0

Questions:

(1) Is Lucene now caching more than it used to, which would account
for the extra space usage, or is this simply a bug where the count
isn't being updated correctly?

(2) Is checking ramSizeInBytes() still the recommended way to
determine whether it's time to commit()?

TX
