Hello Shad,
Ever since you mentioned in your e-mail of Tue 6/13/2017 2:54 PM that we
still have multithreading problems, I've been attempting to track them down
(and why I couldn't see them). Your previous e-mail caused me to redouble my
efforts.
I have two findings to report so far.
[Finding 1]
I've discovered at least one problem when one thread is appending documents,
and another one is reading from the same directory.
At some point when the IndexWriter is disposed, I've observed one of two
possible stack traces (note that the lines are those of my own build):
    Lucene.Net.dll!Lucene.Net.Util.IOUtils.Fsync(string fileToSync, bool isDir) Line 468 C#
    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.Fsync(string name) Line 536 C#
    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.Sync(System.Collections.Generic.ICollection<string> names) Line 365 C#
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.WriteSegmentsGen(Lucene.Net.Store.Directory dir, long generation) Line 301 C#
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.FinishCommit(Lucene.Net.Store.Directory dir) Line 1261 C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.FinishCommit() Line 3792 C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.CommitInternal() Line 3775 C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.CloseInternal(bool waitForMerges, bool doFlush) Line 1253 C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.Dispose(bool waitForMerges) Line 1093 C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.Dispose() Line 1051 C#
More specifically, in the above stack trace, this statement in IOUtils.Fsync
succeeded (fileToSync is the "segments.gen" file):
    file = new FileStream(fileToSync,
        FileMode.Open, // We shouldn't create a file when syncing.
        // Java version uses FileChannel which doesn't create the file if it doesn't already exist,
        // so there should be no reason for attempting to create it in Lucene.Net.
        FileAccess.Write,
        FileShare.ReadWrite);
The other possible stack trace replaces the frames above
IndexWriter.FinishCommit() Line 3792 with:
    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.FSIndexOutput.FSIndexOutput(Lucene.Net.Store.FSDirectory parent, string name) Line 467 C#
    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.CreateOutput(string name, Lucene.Net.Store.IOContext context) Line 323 C#
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.WriteSegmentsGen(Lucene.Net.Store.Directory dir, long generation) Line 289 C#
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.FinishCommit(Lucene.Net.Store.Directory dir) Line 1270 C#
... with a very similar statement which succeeds:
    file = new FileStream(Path.Combine(parent.m_directory.FullName, name),
        FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite);
At the same time, I had a failing IndexReader with the following stack trace:
    Lucene.Net.dll!Lucene.Net.Store.SimpleFSDirectory.OpenInput(string name, Lucene.Net.Store.IOContext context) Line 89 C#
    Lucene.Net.dll!Lucene.Net.Store.Directory.OpenChecksumInput(string name, Lucene.Net.Store.IOContext context) Line 115 C#
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(Lucene.Net.Index.IndexCommit commit) Line 908 C#
    Lucene.Net.dll!Lucene.Net.Index.StandardDirectoryReader.Open(Lucene.Net.Store.Directory directory, Lucene.Net.Index.IndexCommit commit, int termInfosIndexDivisor) Line 55 C#
    Lucene.Net.dll!Lucene.Net.Index.DirectoryReader.Open(Lucene.Net.Store.Directory directory) Line 69 C#
And this statement in SimpleFSDirectory.OpenInput fails (path.FullName is the
same as fileToSync):
    var raf = new FileStream(path.FullName, FileMode.Open,
        FileAccess.Read, FileShare.ReadWrite);
This fails with an UnauthorizedAccessException. The exception is uncaught,
and propagates out of the DirectoryReader, causing it to fail.
Why does it throw UnauthorizedAccessException? I have no idea: the two
FileStream constructor calls should be compatible, since their
FileAccess/FileShare modes allow both reading and writing. But it does fail
with that exception.
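The share-mode reasoning can be checked in isolation with a minimal sketch like the one below. It is not Lucene.Net code: the class name, method name and scratch file are made up for illustration. It simply holds open two handles with the same mode combinations as the writer-side and reader-side statements quoted above. I would expect both opens to succeed when run on their own, which suggests the failure depends on timing against the concurrent commit rather than on the modes themselves.

```csharp
using System;
using System.IO;

static class ShareModeCheck
{
    // Opens the same file twice, using the mode combinations from
    // FSIndexOutput (writer side) and SimpleFSDirectory.OpenInput (reader side),
    // and reports whether both handles could be held at the same time.
    public static bool CanOpenBoth(string path)
    {
        using (var writer = new FileStream(path, FileMode.OpenOrCreate,
                                           FileAccess.ReadWrite, FileShare.ReadWrite))
        using (var reader = new FileStream(path, FileMode.Open,
                                           FileAccess.Read, FileShare.ReadWrite))
        {
            return writer.CanWrite && reader.CanRead;
        }
    }

    static void Main()
    {
        // Hypothetical scratch file; any writable location will do.
        string path = Path.Combine(Path.GetTempPath(), "share-mode-check.tmp");
        try
        {
            Console.WriteLine(CanOpenBoth(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```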
To make sure it's not an artifact of the test framework, I've written the
following small program that reproduces the problem. It's very similar to the
existing test in _testStressLocks, but doesn't use mock-ups since the overhead
seems to be a factor in reproducing it (on my machine). You should be able to
just cut & paste and see for yourself. Of course, if you don't, I shall have
made a fool of myself again.
Ah well. Here it goes:
    using Lucene.Net.Index;
    using Lucene.Net.Store;
    using Lucene.Net.Util;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Search;
    using System.Threading;

    namespace StressTest
    {
        class Program
        {
            public static volatile bool WriterThreadActive;

            private static void WriterThread(Directory directory, int iterations)
            {
                WriterThreadActive = true;
                try
                {
                    for (int i = 0; i < iterations; ++i)
                        using (var writer = new IndexWriter(directory,
                            new IndexWriterConfig(LuceneVersion.LUCENE_48, new StandardAnalyzer(LuceneVersion.LUCENE_48))
                            { OpenMode = OpenMode.APPEND }))
                        {
                            Document doc = new Document();
                            doc.Add(new TextField("content", "aaa", Field.Store.NO));
                            writer.AddDocument(doc);
                        }
                }
                finally
                {
                    WriterThreadActive = false;
                }
            }

            private static void ReaderThread(Directory directory, int iterations)
            {
                var query = new TermQuery(new Term("content", "aaa"));
                for (int i = 0; WriterThreadActive || i < iterations; ++i)
                {
                    using (var reader = DirectoryReader.Open(directory))
                    {
                        var searcher = new IndexSearcher(reader);
                        searcher.Search(query, null, 1000);
                    }
                }
            }

            static void Main(string[] args)
            {
                var directory = FSDirectory.Open(@"E:\Temp\LuceneStressTest");
                using (var writer = new IndexWriter(directory,
                    new IndexWriterConfig(LuceneVersion.LUCENE_48, new StandardAnalyzer(LuceneVersion.LUCENE_48))
                    { OpenMode = OpenMode.CREATE }))
                {
                    Document doc = new Document();
                    doc.Add(new TextField("content", "aaa", Field.Store.NO));
                    writer.AddDocument(doc);
                }

                const int MinimumIterations = 10000; // you may need to increase this
                var writerThread = new Thread(() => WriterThread(directory, MinimumIterations)) { Name = "WriterThread" };
                var readerThread = new Thread(() => ReaderThread(directory, MinimumIterations)) { Name = "ReaderThread" };
                writerThread.Start();
                readerThread.Start();
                writerThread.Join();
                readerThread.Join();
                directory.Dispose();
            }
        }
    }
You may need to adjust the E:\Temp\LuceneStressTest path and the value of
MinimumIterations for your machine.
The solution is to add a catch clause for UnauthorizedAccessException in
SegmentInfos.cs (SegmentInfos.FindSegmentsFile.Run):
    try
    {
        genInput = directory.OpenChecksumInput(IndexFileNames.SEGMENTS_GEN, IOContext.READ_ONCE);
    }
    catch (IOException e)
    {
        if (infoStream != null)
        {
            Message("segments.gen open: IOException " + e);
        }
    }
    catch (UnauthorizedAccessException e)
    {
        if (infoStream != null)
        {
            Message("segments.gen open: UnauthorizedAccessException " + e);
        }
    }
UnauthorizedAccessException doesn't inherit from IOException, so you need to
catch both of them.
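The inheritance point can be verified with a small standalone check. This is only an illustrative sketch: the class and method names are made up and nothing here touches Lucene.Net. It shows that a `catch (IOException)` clause would match a FileNotFoundException but not an UnauthorizedAccessException, whose base type in the .NET class library is SystemException.

```csharp
using System;
using System.IO;

static class ExceptionHierarchyCheck
{
    // True when a "catch (IOException)" clause would handle the given exception.
    public static bool IsIOException(Exception e) => e is IOException;

    static void Main()
    {
        Console.WriteLine(IsIOException(new UnauthorizedAccessException())); // False
        Console.WriteLine(IsIOException(new FileNotFoundException()));       // True
        Console.WriteLine(typeof(UnauthorizedAccessException).BaseType);     // System.SystemException
    }
}
```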
After this modification, the problem (obviously) disappears. Similar code
elsewhere may need the same addition, but I have no evidence for that yet.
[Finding 2]
I am on the trail of the intermittent failure of StressTestLocks /
TestStressLocksNativeFSLockFactory but the margin of this e-mail is too small
to contain it. More later.
Hopefully, this helps a bit.
I know this isn't a direct answer to your debugging request about .NET Core,
but I have zero experience with .NET Core and thought that continuing an
ongoing investigation would be a more efficient use of my time.
Vincent