Vincent,

> I'd like to see IOUtils.FSync gone altogether: it's just not how things are 
> done in Dotnettown, mister sheriff Sir!

No argument there. In fact, while we are at it, let's remove 
FSDirectory.Fsync() as well.

Also, if you look at the FSDirectory.Sync() method, all it does is:

1. EnsureOpen()
2. Keep track of "stale files" vs "current files"
3. Call FSync()

And since we now know that stale files are a myth in .NET because we are 
calling Flush(true) on dispose, why are we tracking them? They are stored in a 
protected ConcurrentHashSet (m_staleFiles) that is just taking up memory and is 
using CPU cycles in various methods for no benefit. This variable isn't used by 
any other FSDirectory implementation. Do we need it? Or YAGNI?

It looks like Sync() actually does something interesting in some of the other 
implementations, so we can't very well yank that method out, but the rest of 
the above could probably go, except maybe the call to EnsureOpen().

Anyway, I am running the tests now to see if these changes are sound.

> Oh yes, and a minor miracle happened: I finally found instructions I could 
> follow to "sync" my Lucenenet branch with master.

I am having trouble understanding since Git is the greatest thing since 
unsliced bread (which is far tastier than sliced bread). Could it be that you 
just discovered the command line interface, which is better than any GUI I have 
found for it? I first learned by reading Pragmatic Version Control Using Git 
(https://pragprog.com/book/tsgit/pragmatic-version-control-using-git), in which 
the first 130 pages cover about everything you need to know (and StackOverflow 
is very helpful for the rest).

More on your other reply coming up...

Thanks,
Shad Storhaug (NightOwl888)


From: Van Den Berghe, Vincent [mailto:[email protected]]
Sent: Saturday, July 15, 2017 12:15 AM
To: Shad Storhaug
Cc: [email protected]
Subject: RE: Debugging Help Requested

Hello Shad,

Actually, the devil is in the call to FileStream.Flush(true). The "true" 
parameter forces a call to FlushFileBuffers(HANDLE). This makes sure that all 
the operating system buffers of the file object are written to disk. Note that 
I wrote "file object", not "HANDLE": all the file buffers are written to disk, 
even if these are file buffers of another thread/process that wrote to the file.
Normally, all Win32 functions with HANDLE arguments only work on whatever you 
did while you had the HANDLE, except one: FlushFileBuffers.
Don't take my word for it, see 
https://blogs.msdn.microsoft.com/oldnewthing/20170524-00/?p=96215
So even if you open the file and only call Flush(true), the OS buffers that are 
dirty will be written to disk. Even if the buffers aren't coming from your file 
handle, or even your process. So in this sense, the IOUtils.FSync code really 
does "something".

The problem is when we're writing files and re-opening them as memory mapped 
files: there is something funny there and the only thing you absolutely must do 
to avoid strange exceptions is to call Flush(true) just before you close the 
handle. If you use fsync, you create a moment between the previous file close 
and the open in FSync where the file is closed but has dirty buffers: memory 
mapped files have trouble handling that and you get Access Denied/Unauthorized 
due to view inconsistencies and write order. I'm not sure why, but I think the 
system tries to map the file content to virtual memory and doesn't take the 
dirty OS buffers into account until after they are written.
If we wrote and read as memory mapped files, we could just call 
FlushViewOfFiles and FlushFileBuffers and be done with it. But we write 
"regularly" and we read via memory mapped files and therein lies the rub. A 
fascinating discussion about view consistency can be found on 
https://ayende.com/blog/162791/on-memory-mapped-files
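For comparison, here is a minimal sketch of the Java side of that pattern: write "regularly" through a stream, close it (which flushes), and only then read the same bytes back through a read-only mapping. The class name and file contents are just for illustration:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteThenMap {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("write-then-map", ".bin");

        // "Regular" write: try-with-resources closes (and flushes) the
        // stream before anyone maps the file, so there is no window where
        // the file is closed but still has dirty buffers.
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write("hello mmap".getBytes(StandardCharsets.UTF_8));
        }

        // Read the same bytes back through a read-only mapping.
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            byte[] bytes = new byte[map.remaining()];
            map.get(bytes);
            System.out.println(new String(bytes, StandardCharsets.UTF_8));
        }
    }
}
```

The point is the ordering: close-then-map is safe; the trouble described above only appears when something (like a separate fsync pass) touches the file between the close and the map.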


So I'm a fan of option 2. And frankly, I'd like to see IOUtils.FSync gone 
altogether: it's just not how things are done in Dotnettown, mister sheriff Sir!


Oh yes, and a minor miracle happened: I finally found instructions I could 
follow to "sync" my Lucenenet branch with master.
Yay!

FSNativeLocks, here I come!

Vincent

From: Shad Storhaug [mailto:[email protected]]
Sent: Friday, July 14, 2017 4:42 PM
To: Van Den Berghe, Vincent <[email protected]>
Cc: [email protected]
Subject: RE: Debugging Help Requested

Vincent,

Looks like there is a snag. The TestDirectory.TestFsyncDoesntCreateNewFiles() 
test is now failing because it is expecting there to be an exception when 
calling Sync() and the file doesn't actually exist on disk.

Although I can't say for certain this is a "problem", it makes me wonder if 
there is some purpose for the Fsync method that you and I are overlooking. 
Going through it, I really don't see what its purpose is in .NET. In Java, it 
opens a FileChannel and calls force(true) 
(https://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html#force(boolean))
 which "forces any updates to this channel's file to be written to the storage 
device".
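Roughly, what the Java version does can be sketched like this (from memory, not the actual Lucene code; the class and method names are illustrative):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FsyncSketch {
    // Open a channel on an existing file and force its updates to the
    // storage device, via the FileChannel.force(true) quoted above.
    static void fsync(Path fileToSync) throws IOException {
        // StandardOpenOption.WRITE alone does not create the file, so a
        // missing file fails here, before any force() happens.
        try (FileChannel channel = FileChannel.open(fileToSync, StandardOpenOption.WRITE)) {
            channel.force(true); // true = also force metadata updates
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fsync-sketch", ".bin");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
        fsync(file); // succeeds: the file exists
        System.out.println("synced: " + Files.size(file) + " bytes");

        Files.delete(file);
        try {
            fsync(file); // the file is gone: open fails before any force()
        } catch (NoSuchFileException e) {
            System.out.println("missing file throws: " + e.getClass().getSimpleName());
        }
    }
}
```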

After reviewing the implementation of Fsync: it opens a FileStream, doesn't do 
any writing, then calls Flush(true), which, since nothing was written, does 
nothing. It only does something when the file doesn't exist, which is to throw 
a FileNotFoundException. So, in .NET, since all of the lower level file writing 
is handled by the FileStream that originally flushed and has already been 
disposed, it would seem its only purpose is to throw an exception when the file 
doesn't exist.

It seems there are 2 possible fixes for this:


1. Change the implementation of IOUtils.Fsync() to check if the file exists 
   and if not, throw a FileNotFoundException (and if isDir is false, just 
   return), and add it back to the Sync() method

2. Remove the check for a FileNotFoundException from the 
   TestDirectory.TestFsyncDoesntCreateNewFiles() test

To me, it feels like the second option is cheating. But then, in .NET it is a 
best practice not to throw an exception unless something bad happens. Since 
nothing is actually being done with the file that doesn't exist I don't see how 
this can be bad. But I have to wonder if we are going down the right execution 
path in every scenario if the exception is not being thrown when one of the 
files doesn't exist on disk.
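For clarity, option 1 boils down to an existence check and nothing else (sketched here in Java syntax since the logic is the same on both sides; the isDir handling is omitted and the names are illustrative):

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FsyncExistenceCheck {
    static void fsync(Path fileToSync) throws IOException {
        if (!Files.exists(fileToSync)) {
            throw new FileNotFoundException(fileToSync.toString());
        }
        // Nothing else to do: in .NET terms, the FileStream that wrote the
        // file already forced its buffers to disk when it was disposed.
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fsync-existence", ".bin");
        fsync(file);
        System.out.println("existing file: ok");

        Files.delete(file);
        try {
            fsync(file);
        } catch (FileNotFoundException e) {
            System.out.println("missing file: FileNotFoundException");
        }
    }
}
```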

It seems that in the Java world you can open a FileChannel, write to it, close 
it, reopen a new FileChannel, force the data that was previously written to the 
first FileChannel memory to disk, and close again. In .NET, once Dispose() is 
called on the FileStream, that is it - there is no more interaction between 
memory and disk. So, the Sync() method seems to have absolutely no purpose in .NET. 
Sure, it keeps track of "stale files" that were written to, and "unique files" 
we will write to, but we are essentially just going through the motions without 
actually doing anything because in .NET they are already written to disk by 
that point.

Thoughts? Opinions on how to handle this?


Thanks,
Shad Storhaug (NightOwl888)



From: Shad Storhaug
Sent: Friday, July 14, 2017 2:33 AM
To: 'Van Den Berghe, Vincent'
Cc: [email protected]<mailto:[email protected]>
Subject: RE: Debugging Help Requested

Vincent,

Thanks. This was definitely the most important issue to fix. I have no fear of 
replacing a buggy implementation with a functional one - there was no direct 
way to port the Java implementation into .NET, so this wasn't another case of 
it not working because it didn't match Lucene.

I ended up making a few changes:


1. According to MSDN, FileStream.Flush() could throw an IOException, which 
   would skip the Dispose() call on the file if it happened in a finally 
   block. Looking at the call to base.Dispose() in the try block, it actually 
   calls the Flush() method on BufferedIndexOutput. So, it makes more sense to 
   override Flush() and call file.Flush(true), which ensures Flush() will work 
   consistently everywhere.

2. I removed the loop that calls FSync instead of taking the call out of 
   that method. FSync is not called anywhere else, but this way it can still 
   be utilized by subclasses.

3. Changed from new ReadOnlyHeapByteBuffer(BytesRef.EMPTY_BYTES, 0, 0) to 
   ByteBuffer.Allocate(0).AsReadOnlyBuffer(). We shouldn't be using internal 
   members that weren't meant for public use in Java.

You can review here: 
https://github.com/apache/lucenenet/commit/c0e7a25c4a5901a4b6bea4f92088b5135c1f550a

The TestStressLocks test looks stable. However, on .NET Core when I put 
[Repeat(20)] on the TestStressLocksNativeFSLockFactory test I got a concurrent 
file access exception. I suspect this has to do with the same LockFactory issue 
that is causing the CLI tool to fail, not the MMapDirectory.

I also fixed the IndexInputSlicer: 
https://github.com/apache/lucenenet/commit/de62e95746f23d6026412e22c6f724fed03b5fa7

I'd appreciate if you could look into the LockFactory issue when you get a 
chance.


Thanks,
Shad Storhaug (NightOwl888)


From: Van Den Berghe, Vincent [mailto:[email protected]]
Sent: Thursday, July 13, 2017 8:31 PM
To: Shad Storhaug
Cc: [email protected]<mailto:[email protected]>
Subject: RE: Debugging Help Requested

>> but I found a similar problem when trying to get all of the tests to pass on 
>> the MMapDirectory. When MemoryMappedFile.CreateViewAccessor() is called with 
>> more bytes on the size parameter than the actual file size, it throws 
>> UnauthorizedAccessException.

You are correct, of course. That is indeed the culprit. It also invalidates my 
"solution" to catch UnauthorizedAccessException: the real problem is the 
implementation of MMapDirectory.

I had a look at 
https://issues.apache.org/jira/projects/LUCENENET/issues/LUCENENET-530 but I 
have doubts about the scalability of the approach. The call to 
_memoryMappedFile.CreateViewStream(0, 0, MemoryMappedFileAccess.Read); needs a 
single continuous chunk of virtual memory as large as the size of the mapped 
file. For a lot of large files, this might be problematic. The "list of 
buffers" approach is probably the better approach. I do have a working 
implementation based on MemoryMappedFile.CreateViewStream, if you're 
interested: it does work and corrects all the problems I'm seeing, but I'm 
hesitant to just replace the existing implementation, even if all the tests run.

However, by making minimal modifications, I was able to make all MMapDirectory 
tests run flawlessly, including my own.
Here's what I did:
In FSDirectory.cs:

- Implemented the file.Flush(true) call in the FSDirectory.FSIndexOutput, and 
  commented out the call to IOUtils.Fsync in FSDirectory.Fsync. This remains 
  vitally important.

In MMapDirectory.cs:

- In MMapDirectory.OpenInput, change the call from:

    var c = new FileStream(file.FullName, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite);

  to:

    var c = new FileStream(file.FullName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

- In MMapDirectory.Map, replace:

    long fileCapacity = length == 0 ? ushort.MaxValue : length;

    if (input.memoryMappedFile == null)
    {
    #if NETSTANDARD
        input.memoryMappedFile = MemoryMappedFile.CreateFromFile(fc, null, fileCapacity, MemoryMappedFileAccess.ReadWrite, HandleInheritability.Inheritable, false);
    #else
        input.memoryMappedFile = MemoryMappedFile.CreateFromFile(fc, null, fileCapacity, MemoryMappedFileAccess.ReadWrite, null, HandleInheritability.Inheritable, false);
    #endif
    }

  by:

    if (length == 0)
        return new[] { new ReadOnlyHeapByteBuffer(BytesRef.EMPTY_BYTES, 0, 0) };

    if (input.memoryMappedFile == null)
    {
    #if NETSTANDARD
        input.memoryMappedFile = MemoryMappedFile.CreateFromFile(fc, null, length, MemoryMappedFileAccess.Read, HandleInheritability.Inheritable, false);
    #else
        input.memoryMappedFile = MemoryMappedFile.CreateFromFile(fc, null, length, MemoryMappedFileAccess.Read, null, HandleInheritability.Inheritable, false);
    #endif
    }

(you may need to add a using Util; inside the namespace Lucene.Net.Store to 
access BytesRef).


I've been running hundreds of iterations of the StressTestLocks and 
TestStressLocksNativeFSLockFactory tests and haven't seen a single failure yet.


Vincent


From: Shad Storhaug [mailto:[email protected]]
Sent: Wednesday, July 12, 2017 4:00 PM
To: Van Den Berghe, Vincent <[email protected]>
Cc: [email protected]
Subject: RE: Debugging Help Requested

Vincent,

Thanks for looking into these.

[Finding 1]


>  Why does it throw UnauthorizedAccessException? I have no idea

This may not be related, but I found a similar problem when trying to get all 
of the tests to pass on the MMapDirectory. When 
MemoryMappedFile.CreateViewAccessor() is called with more bytes on the size 
parameter than the actual file size, it throws UnauthorizedAccessException. 
See: https://stackoverflow.com/a/5501331/181087
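For what it's worth, Java has the same wrinkle: asking for a read-only mapping larger than the file also fails there, just with an IOException instead of an UnauthorizedAccessException. A quick sketch (file contents illustrative):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OversizedMapSketch {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-size", ".bin");
        Files.write(file, "abcdef".getBytes(StandardCharsets.UTF_8));

        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            // Mapping exactly the file size works...
            MappedByteBuffer ok = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            System.out.println("first byte: " + (char) ok.get(0));

            // ...but a read-only view larger than the file cannot extend it,
            // so the map call fails.
            try {
                fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size() + 1024);
            } catch (IOException e) {
                System.out.println("oversized read-only map throws IOException");
            }
        }
    }
}
```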

Note that if you are on 64 bit Windows, the line 
FSDirectory.Open(@"E:\Temp\LuceneStressTest"); will always return MMapDirectory.

When recently going through the old JIRA issues, I noticed that someone 
contributed an MMapDirectory implementation that was never actually used. 
Frankly, that "array of ByteBuffers" is probably not a sound approach in .NET, 
so it is worth at least reviewing to see if this other implementation is a 
better design approach. See: 
https://issues.apache.org/jira/projects/LUCENENET/issues/LUCENENET-530. I was 
able to determine that MMapDirectory (or something it is dependent on) is 
definitely causing 2 of the intermittent test failures by checking each 
Directory implementation individually in the test framework.

If this is the source of the UnauthorizedAccessException, it makes more sense 
to fix the source than to patch it in a catch block (which likely would need to 
be done in multiple places).



>  I know this isn't a direct answer to your debugging request on your .NET 
> Core thing, but I got zero experience on that .NET Core thing and thought 
> that continuing an ongoing investigation would be a more efficient use of my 
> time.

Actually, the issue is not .NET Core-specific. If you feel more comfortable 
working on .NET Framework, you could add a `.csproj` file to the lucene-cli 
tool (which is now on master) and run the same thing on .NET Framework. But we 
have no plans to have a .NET Framework version of this tool, since it won't 
really make any difference anyway on the command line (other than the fact that 
.NET Framework only runs on Windows). You will need to adjust the first part of 
the command-line commands to "lucene-cli.exe" (or whatever you put in the 
assembly name box of project properties) rather than "dotnet lucene-cli.dll" in 
that case.

Thanks,
Shad Storhaug (NightOwl888)

From: Van Den Berghe, Vincent [mailto:[email protected]]
Sent: Wednesday, July 12, 2017 8:36 PM
To: Shad Storhaug
Cc: [email protected]<mailto:[email protected]>
Subject: RE: Debugging Help Requested

Something I forgot to mention with regard to the previous e-mail: the test 
application to reproduce should be running in 32-bit ("32-bit preferred").
For some strange reason, the problem is different in 64-bit and is not 
corrected by it.
I'm starting to think the "access denied" is an antivirus problem... I'll keep 
you informed.

Vincent



From: Van Den Berghe, Vincent [mailto:[email protected]]
Sent: Wednesday, July 12, 2017 7:32 PM
To: Shad Storhaug
Cc: [email protected]<mailto:[email protected]>
Subject: RE: Debugging Help Requested

Hello Shad,

Ever since you've mentioned in your e-mail of Tue 6/13/2017 2:54 PM that we 
still have multithreading problems, I've been attempting to track them down 
(and why I couldn't see them). Your previous e-mail caused me to redouble my 
efforts.
I got 2 findings to report so far.

[Finding 1]
I've discovered at least one problem when one thread is appending documents 
and another one is reading from the same directory.
At some point when the IndexWriter is disposed, I've observed one of two 
possible stack traces (note that the line numbers are those of my own build):

    Lucene.Net.dll!Lucene.Net.Util.IOUtils.Fsync(string fileToSync, bool isDir) Line 468
    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.Fsync(string name) Line 536
    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.Sync(System.Collections.Generic.ICollection<string> names) Line 365
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.WriteSegmentsGen(Lucene.Net.Store.Directory dir, long generation) Line 301
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.FinishCommit(Lucene.Net.Store.Directory dir) Line 1261
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.FinishCommit() Line 3792
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.CommitInternal() Line 3775
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.CloseInternal(bool waitForMerges, bool doFlush) Line 1253
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.Dispose(bool waitForMerges) Line 1093
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.Dispose() Line 1051

More specifically, in the above stack trace, this statement in IOUtils.Fsync 
succeeded (fileToSync is the "segments.gen" file):

    file = new FileStream(fileToSync,
        FileMode.Open, // We shouldn't create a file when syncing.
        // Java version uses FileChannel which doesn't create the file if it
        // doesn't already exist, so there should be no reason for attempting
        // to create it in Lucene.Net.
        FileAccess.Write,
        FileShare.ReadWrite);

The other possible stack trace replaces the lines above FinishCommit() Line 
3792 with:

    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.FSIndexOutput.FSIndexOutput(Lucene.Net.Store.FSDirectory parent, string name) Line 467
    Lucene.Net.dll!Lucene.Net.Store.FSDirectory.CreateOutput(string name, Lucene.Net.Store.IOContext context) Line 323
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.WriteSegmentsGen(Lucene.Net.Store.Directory dir, long generation) Line 289
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.FinishCommit(Lucene.Net.Store.Directory dir) Line 1270

... with a very similar statement which succeeds:

    file = new FileStream(Path.Combine(parent.m_directory.FullName, name),
        FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite);


At the same time, I had a failing IndexReader with the following stack trace:

    Lucene.Net.dll!Lucene.Net.Store.SimpleFSDirectory.OpenInput(string name, Lucene.Net.Store.IOContext context) Line 89
    Lucene.Net.dll!Lucene.Net.Store.Directory.OpenChecksumInput(string name, Lucene.Net.Store.IOContext context) Line 115
    Lucene.Net.dll!Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(Lucene.Net.Index.IndexCommit commit) Line 908
    Lucene.Net.dll!Lucene.Net.Index.StandardDirectoryReader.Open(Lucene.Net.Store.Directory directory, Lucene.Net.Index.IndexCommit commit, int termInfosIndexDivisor) Line 55
    Lucene.Net.dll!Lucene.Net.Index.DirectoryReader.Open(Lucene.Net.Store.Directory directory) Line 69

And this statement in SimpleFSDirectory.OpenInput fails (path.FullName is the 
same as fileToSync):

    var raf = new FileStream(path.FullName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

This fails with the UnauthorizedAccessException. This exception is uncaught, 
and propagates out of the DirectoryReader, causing it to fail.
Why does it throw UnauthorizedAccessException? I have no idea: the 2 FileStream 
constructors are compatible, since their FileAccess/FileShare modes allow both 
reading and writing. But it does fail with that exception.
To make sure it's not an artifact of the test framework, I've written the 
following small program that reproduces the problem. It's very similar to the 
existing test in _testStressLocks, but doesn't use mock-ups since the overhead 
seems to be a factor in reproducing it (on my machine). You should be able to 
just cut & paste and see for yourself. Of course, if you don't, I shall have 
made a fool of myself again.
Ah well. Here it goes:

using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Search;
using System.Threading;

namespace StressTest
{
    class Program
    {
        public static volatile bool WriterThreadActive;

        private static void WriterThread(Directory directory, int iterations)
        {
            WriterThreadActive = true;
            try
            {
                for (int i = 0; i < iterations; ++i)
                    using (var writer = new IndexWriter(directory, new IndexWriterConfig(LuceneVersion.LUCENE_48, new StandardAnalyzer(LuceneVersion.LUCENE_48)) { OpenMode = OpenMode.APPEND }))
                    {
                        Document doc = new Document();
                        doc.Add(new TextField("content", "aaa", Field.Store.NO));
                        writer.AddDocument(doc);
                    }
            }
            finally
            {
                WriterThreadActive = false;
            }
        }

        private static void ReaderThread(Directory directory, int iterations)
        {
            var query = new TermQuery(new Term("content", "aaa"));
            for (int i = 0; WriterThreadActive || i < iterations; ++i)
            {
                using (var reader = DirectoryReader.Open(directory))
                {
                    var searcher = new IndexSearcher(reader);
                    searcher.Search(query, null, 1000);
                }
            }
        }

        static void Main(string[] args)
        {
            var directory = FSDirectory.Open(@"E:\Temp\LuceneStressTest");

            using (var writer = new IndexWriter(directory, new IndexWriterConfig(LuceneVersion.LUCENE_48, new StandardAnalyzer(LuceneVersion.LUCENE_48)) { OpenMode = OpenMode.CREATE }))
            {
                Document doc = new Document();
                doc.Add(new TextField("content", "aaa", Field.Store.NO));
                writer.AddDocument(doc);
            }

            const int MinimumIterations = 10000; // you may need to increase this

            var writerThread = new Thread(() => WriterThread(directory, MinimumIterations)) { Name = "WriterThread" };
            var readerThread = new Thread(() => ReaderThread(directory, MinimumIterations)) { Name = "ReaderThread" };
            writerThread.Start();
            readerThread.Start();
            writerThread.Join();
            readerThread.Join();
            directory.Dispose();
        }
    }
}

You may tweak the E:\Temp\LuceneStressTest and the MinimumIterations 
accordingly.

The solution is to add the UnauthorizedAccessException to the catch clause in 
SegmentInfos.cs (SegmentInfos.FindSegmentsFile.Run):

    try
    {
        genInput = directory.OpenChecksumInput(IndexFileNames.SEGMENTS_GEN, IOContext.READ_ONCE);
    }
    catch (IOException e)
    {
        if (infoStream != null)
        {
            Message("segments.gen open: IOException " + e);
        }
    }
    catch (UnauthorizedAccessException e)
    {
        if (infoStream != null)
        {
            Message("segments.gen open: UnauthorizedAccessException " + e);
        }
    }

UnauthorizedAccessException doesn't inherit from IOException, so you need to 
catch both of them.
After this modification, the problem (obviously) disappears. Maybe similar code 
would need similar additions, but there is a lack of evidence to do so.

[Finding 2]
I am on the trail of the intermittent failure of StressTestLocks / 
TestStressLocksNativeFSLockFactory but the margin of this e-mail is too small 
to contain it. More later.


Hopefully, this helps a bit.
I know this isn't a direct answer to your debugging request on your .NET Core 
thing, but I got zero experience on that .NET Core thing and thought that 
continuing an ongoing investigation would be a more efficient use of my time.


Vincent



From: Shad Storhaug [mailto:[email protected]]
Sent: Thursday, July 06, 2017 11:32 PM
To: Van Den Berghe, Vincent <[email protected]>
Cc: [email protected]
Subject: Debugging Help Requested

Vincent,

If you have the time, I'd appreciate your assistance with a fix for a 
long-standing concurrency bug. I have been putting together a wrapper console 
application for the various utilities that ship with Lucene and discovered that 
2 of them are non-functional because of this bug, but on the upside, there is 
now a reliable way to reproduce it. I suspect the bug is also causing some of 
the random test failures that we are seeing on certain FSDirectory 
implementations.

I have pushed the WIP application to my local repository 
(https://github.com/NightOwl888/lucenenet/tree/cli/src/tools/lucene-cli). It 
only runs on .NET Core and in Visual Studio 2015 Update 3. I don't think it 
makes sense to support .NET Framework for this utility since .NET Core will run 
side-by-side with .NET Framework anyway.

You can run specific commands directly on the command line or in Visual 
Studio 2015. There is a server that needs to be started first, and then a 
client that connects. The problem seems to be the server.

Command Line

dotnet lucene-cli.dll lock verify-server 127.0.0.4 10

dotnet lucene-cli.dll lock stress-test 3 127.0.0.4 <THE_PORT> 
NativeFSLockFactory F:\temp2 50 10

Note the port is dynamically chosen by the server at runtime and displayed on 
the console.

Visual Studio 2015

In Visual Studio 2015, you can just copy everything after "dotnet 
lucene-cli.dll" and paste it into the project properties > Debug > Application 
Arguments text box. Do note I am not sure if those options are optimal (or even 
if they may be causing the issue).

What I Have Found

When the client calls the server, the server locks on LockVerifyServer.cs line 
129 
(https://github.com/NightOwl888/lucenenet/blob/cli/src/Lucene.Net/Store/LockVerifyServer.cs#L129).
 I tried removing that line, and it gets a bit further and then crashes with 
this error:

An unhandled exception of type 'System.Exception' occurred in 
System.Private.CoreLib.ni.dll

Additional information: System.IO.IOException: Unable to read data from the 
transport connection: An existing connection was forcibly closed by the remote 
host. ---> System.Net.Sockets.SocketException: An existing connection was 
forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.IO.Stream.ReadByte()
   at System.IO.BinaryReader.InternalReadOneChar()
   at Lucene.Net.Store.LockVerifyServer.ThreadAnonymousInnerClassHelper.Run() in F:\Projects\lucenenet\src\Lucene.Net\Store\LockVerifyServer.cs:line 135


I suspect that has something to do with removing the wait so the timing is off, 
but I compared the thread handling code to some similar tests and it looks the 
same (including the call to Wait()), so I haven't worked out why that method 
call isn't completing in this case.

I believe this bug is related to a couple of intermittently failing tests that 
also seem to indicate the LockFactory is broken.

https://teamcity.jetbrains.com/viewLog.html?buildId=1101813&tab=buildResultsDiv&buildTypeId=LuceneNet_PortableBuilds_TestOnNet451
https://teamcity.jetbrains.com/viewLog.html?buildId=1084071&tab=buildResultsDiv&buildTypeId=LuceneNet_PortableBuilds_TestOnNet451
https://teamcity.jetbrains.com/viewLog.html?buildId=1071425&tab=buildResultsDiv&buildTypeId=LuceneNet_PortableBuilds_TestOnNet451

Namely, the TestLockFactory.StressTestLocks and 
TestLockFactory.TestStressLocksNativeFSLockFactory tests.


FYI, the TestIndexWriter.TestTwoThreadsInterruptDeadlock test also fails 
intermittently, and is apparently concurrency related. I don't recall which 
tests they were, but I discovered a while back that if you put the [Repeat(20)] 
attribute on them, they would fail more consistently. I also noticed that they 
always fail if MMapDirectory is made the only option provided by the test 
framework.

Anyway, I would really appreciate if you could have a look to see if you can 
work out what is going on.


Thanks,
Shad Storhaug (NightOwl888)

