While looking at the code I came across the following in the Directory class:

 * A Directory is a flat list of files.  Files may be written once, when they
 * are created.  Once a file is created it may only be opened for read, or
 * deleted.  Random access is permitted both when reading and writing.

What does "Random access is permitted both when reading and writing"
mean? Specifically, IndexOutput doesn't allow seeks. And if "once a
file is created it may only be opened for read" means "ONLY after a
file is created may it be opened for read", then we should allow
Directory implementations that throw an IOException when a file is
opened for reading while an IndexOutput for it is still open for
writing...
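
To make the stricter reading concrete, here is a minimal sketch of a
store that enforces it. This is a toy in-memory class, not Lucene's
Directory API: WriteOnceStore, Output and the method bodies are
hypothetical names and code I made up for illustration; only the
"still open for writing" message text is borrowed from the test
quoted further down.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy store (NOT Lucene's Directory): files are written once,
// sequentially, and can be read only after the output has been closed.
class WriteOnceStore {
  private final Map<String, byte[]> files = new HashMap<>();
  private final Set<String> openForWrite = new HashSet<>();

  /** Append-only output: there is no seek; the file is published on close(). */
  final class Output extends ByteArrayOutputStream {
    private final String name;
    private boolean closed;

    Output(String name) {
      this.name = name;
    }

    @Override
    public void close() {
      synchronized (WriteOnceStore.this) {
        if (closed) {
          return; // closing twice is a no-op
        }
        closed = true;
        files.put(name, toByteArray());
        openForWrite.remove(name);
      }
    }
  }

  /** Creates a new file; fails if the name exists or is already being written. */
  synchronized Output createOutput(String name) throws IOException {
    if (files.containsKey(name) || !openForWrite.add(name)) {
      throw new FileAlreadyExistsException(name);
    }
    return new Output(name);
  }

  /** Returns the file contents; rejects files whose output is still open. */
  synchronized byte[] openInput(String name) throws IOException {
    if (openForWrite.contains(name)) {
      throw new IOException("file \"" + name + "\" is still open for writing");
    }
    byte[] data = files.get(name);
    if (data == null) {
      throw new FileNotFoundException(name);
    }
    return data.clone();
  }
}
```

Under this reading, a caller that invokes openInput between
createOutput and close gets an IOException instead of a partially
written file, and there is no special case for any file name.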

We currently make an exception from the above for "segments*" files,
as shown in MockDirectoryWrapper:

    // cannot open a file for input if it's still open for
    // output, except for segments.gen and segments_N
    if (!allowReadingFilesStillOpenForWrite
        && openFilesForWrite.contains(name)
        && !name.startsWith("segments")) {

and BaseDirectoryTestCase:

    try {
      IndexInput input = dir.openInput(file, newIOContext(random()));
      input.close();
    } catch (FileNotFoundException | NoSuchFileException e) {
      // ignore
    } catch (IOException e) {
      if (e.getMessage() != null
          && e.getMessage().contains("still open for writing")) {
        // ignore
      } else {
        throw new RuntimeException(e);
      }
    }

(For the record, Solr's MockDirectoryFactory lifts this restriction
entirely and allows any file that is still being written to be opened
for reading.)

I understand SegmentInfos.finishCommit does an atomic rename (and a
directory metadata flush) from a temporary (pending) segments file to
the final segments_X, so there should be no possibility of ever reading
or otherwise accessing a partially written (or still open for writing)
segments* file.
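
For illustration, the write, fsync, atomic rename, directory fsync
sequence can be sketched with plain java.nio.file calls. This is not
the actual SegmentInfos.finishCommit code; AtomicPublish and publish
are hypothetical names, and the file names in the usage note are just
examples.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not Lucene code: publishes a file so that readers
// either see no file at all or the complete, durable contents.
class AtomicPublish {
  static Path publish(Path dir, String pendingName, String finalName, byte[] contents)
      throws IOException {
    Path pending = dir.resolve(pendingName);
    Path target = dir.resolve(finalName);

    // 1. Write the pending file and force its contents to stable storage.
    try (FileChannel ch = FileChannel.open(
        pending, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
      ByteBuffer buf = ByteBuffer.wrap(contents);
      while (buf.hasRemaining()) {
        ch.write(buf);
      }
      ch.force(true);
    }

    // 2. Atomically rename to the final name; the final name never refers
    //    to a file that is still open for writing.
    Files.move(pending, target, StandardCopyOption.ATOMIC_MOVE);

    // 3. Best-effort flush of the directory metadata so the rename survives
    //    a crash. Opening a directory as a channel fails on some platforms
    //    (for example Windows), so a failure here is ignored in this sketch.
    try (FileChannel dirChannel = FileChannel.open(dir, StandardOpenOption.READ)) {
      dirChannel.force(true);
    } catch (IOException e) {
      // directory fsync not supported here
    }
    return target;
  }
}
```

A call such as publish(dir, "pending_segments_1", "segments_1", bytes)
only ever exposes the final name after the output has been closed and
synced, which is why the "segments*" exception looks unnecessary to me.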

Am I missing something? Are the above assumptions and exceptions a
historical leftover that can be cleaned up, so that the contract of
the Directory class can be clarified?

Dawid
