[ 
https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera reopened LUCENE-2386:
--------------------------------


As I indicated in an email, the Solr tests failed (sorry for not running them 
before). After some investigation (thanks, Robert!), here is the problem: before 
this change, IW always committed first on an empty directory. It called 
SegmentInfos.commit(dir), which through a chain of calls ensured that the 
directory exists (in FSDir) by calling file.mkdirs().

After this change, that chain of calls no longer happens ... yet somehow the 
tests were still passing for Lucene. Further investigation shows that the Solr 
tests that failed used SingleInstanceLockFactory or NoLockFactory. By default, 
FSDir uses either SimpleFSLF or NativeFSLF. The IW ctor calls LF.makeLock and 
obtain, which for these two LFs ends up calling file.mkdirs ... and thus the 
problem was hidden. SingleInstanceLF and NoLF don't do that!
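
To make the masking effect concrete, here is a minimal sketch in plain Java. It does not use the Lucene classes: SketchLockFactory, DirCreatingLockFactory, InMemoryLockFactory and LockFactoryMaskingSketch are hypothetical stand-ins, and the only assumption is the behavior described above (the file-system-based LFs create the directory as a side effect, the in-memory ones do not).

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical stand-in for a lock factory; NOT the Lucene LockFactory API.
interface SketchLockFactory {
  void obtainLock(File dir) throws IOException;
}

// Mimics the side effect described for SimpleFSLF / NativeFSLF:
// obtaining the lock creates the directory if it is missing.
class DirCreatingLockFactory implements SketchLockFactory {
  public void obtainLock(File dir) throws IOException {
    if (!dir.mkdirs() && !dir.isDirectory()) {
      throw new IOException("Cannot create directory: " + dir);
    }
  }
}

// Mimics SingleInstanceLF / NoLF: the file system is never touched.
class InMemoryLockFactory implements SketchLockFactory {
  public void obtainLock(File dir) {
    // intentionally empty
  }
}

class LockFactoryMaskingSketch {
  // Simulates opening a writer that no longer performs the initial commit:
  // the only file system access left is whatever the lock factory does.
  static boolean dirExistsAfterOpen(File dir, SketchLockFactory lf) throws IOException {
    lf.obtainLock(dir);
    return dir.isDirectory();
  }

  // Returns a path inside a fresh temp directory that does not exist yet.
  static File freshPath() throws IOException {
    File parent = Files.createTempDirectory("lucene-2386-sketch").toFile();
    return new File(parent, "index");
  }
}
```

With DirCreatingLockFactory the directory exists after the "open", so a later listing succeeds; with InMemoryLockFactory it does not, which is the situation the failing Solr tests ran into.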

So first, a test which uses FSDir and one of these LFs needs to be created, so 
we catch that problem in Lucene code (this is not related to Solr -- just a 
missing test in Lucene). Second, we need to fix the IW ctor, or Dir, or whatever.

I've added the following code to the IW ctor as a sanity check to make sure it 
works, and indeed all Solr tests pass. So that's one option, even though it's a 
bit messy.
{code}
// Sanity check: creating and deleting a temp file forces the directory to exist.
try {
  directory.createOutput("temp").close();
} finally {
  directory.deleteFile("temp");
}
{code}

Another option is to add to Directory a prepareForWrite() or simply prepare() 
which will be called by IW. A default empty impl on Directory, and file.mkdirs 
on FSDirectory should be enough.
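
A rough sketch of what this second option could look like, using a simplified, hypothetical class hierarchy rather than Lucene's actual Directory API. SketchDirectory, SketchFSDirectory, SketchRAMDirectory and SketchWriter are illustration-only names; only the prepareForWrite() name and the mkdirs idea come from the proposal above.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical simplified base class; NOT Lucene's Directory.
abstract class SketchDirectory {
  // Default implementation: nothing to prepare.
  void prepareForWrite() throws IOException {
  }
}

// File-system-backed variant: make sure the directory exists before writing.
class SketchFSDirectory extends SketchDirectory {
  final File path;

  SketchFSDirectory(File path) {
    this.path = path;
  }

  @Override
  void prepareForWrite() throws IOException {
    // mkdirs() returns false if the directory already exists, so re-check
    // with isDirectory() before treating it as a failure.
    if (!path.mkdirs() && !path.isDirectory()) {
      throw new IOException("Cannot create directory: " + path);
    }
  }
}

// In-memory variant: inherits the empty default.
class SketchRAMDirectory extends SketchDirectory {
}

// Stand-in for the writer: calls prepareForWrite() once, in its constructor.
class SketchWriter {
  SketchWriter(SketchDirectory dir) throws IOException {
    dir.prepareForWrite();
  }
}
```

The call is idempotent in this sketch, so opening a second writer on an existing directory is harmless.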

A third option is to define clear semantics for dir.listAll(), so that it 
throws a NoSuchDirectoryException when the directory does not exist, and then 
change IndexFileDeleter to ignore that exception if the OpenMode of IW is 
CREATE*. It kind of makes sense - if the directory is empty, why bother looking 
for any index files? Lucene code today already expects that exception to be 
thrown in SegmentInfos.getCurrentSegmentGeneration -- so we kind of say 'either 
you use RAMDirectory, or a sub-class of FSDirectory, and then that's what we 
expect'. So it's not much of a backwards-compatibility break ...
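
A minimal sketch of the third option, again in plain Java with hypothetical names (SketchNoSuchDirectoryException, SketchOpenMode, ListAllSketch) rather than the real Lucene classes. It assumes "CREATE*" means CREATE and CREATE_OR_APPEND, and that APPEND should still see the exception.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical exception for illustration; not Lucene's class.
class SketchNoSuchDirectoryException extends FileNotFoundException {
  SketchNoSuchDirectoryException(String message) {
    super(message);
  }
}

// Hypothetical mirror of the three open modes mentioned in this issue.
enum SketchOpenMode { CREATE, CREATE_OR_APPEND, APPEND }

class ListAllSketch {
  // Clear semantics: a missing directory is an exception, never an empty listing.
  static String[] listAll(File dir) throws IOException {
    if (!dir.exists()) {
      throw new SketchNoSuchDirectoryException("directory '" + dir + "' does not exist");
    }
    String[] names = dir.list();
    if (names == null) {
      throw new IOException("cannot list '" + dir + "'");
    }
    return names;
  }

  // What the deleter would do: treat a missing directory as "no files"
  // when the writer was opened in one of the CREATE* modes.
  static String[] filesForDeleter(File dir, SketchOpenMode mode) throws IOException {
    try {
      return listAll(dir);
    } catch (SketchNoSuchDirectoryException e) {
      if (mode == SketchOpenMode.APPEND) {
        throw e; // appending to a non-existent index is still an error
      }
      return new String[0];
    }
  }
}
```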

While dir.prepare() or prepareForWrite() is very explicit ... it's not 
protective enough - one can still call listAll w/o calling prepareForWrite (why 
would you call it if you just want to list files?), so I'm not sure which is the 
best option ... Maybe the last option is the best, as at least the caller should 
not assume anything about the state of the directory. It just needs to be 
prepared to handle the NoSuchDirectoryException case as distinct from the 
'there is a directory but it's empty' case.

I'll revert my commit until this is resolved.

> IndexWriter commits unnecessarily on fresh Directory
> ----------------------------------------------------
>
>                 Key: LUCENE-2386
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2386
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch
>
>
> I've noticed IndexWriter's ctor performs a first (empty) commit if a fresh 
> Directory is passed, w/ OpenMode.CREATE or CREATE_OR_APPEND. This seems 
> unnecessary, and kind of brings back an autoCommit mode, in a strange way 
> ... why do we need that commit? Do we really expect people to open an 
> IndexReader on an empty Directory which they just passed to an IW w/ 
> create=true? If they want, they can simply call commit() right away on the IW 
> they created.
> I ran into this when writing a test which committed N times, then compared 
> the number of commits (via IndexReader.listCommits) and was surprised to see 
> N+1 commits.
> Tried to change doCommit to false in IW ctor, but it got IndexFileDeleter 
> jumping on me .. so the change might not be that simple. But I think it's 
> manageable, so I'll try to attack it (and IFD specifically !) back :).
