paulirwin commented on code in PR #938: URL: https://github.com/apache/lucenenet/pull/938#discussion_r1597690668
##########
src/Lucene.Net/Support/ConcurrentHashSet.cs:
##########

@@ -753,7 +753,16 @@ private void CopyToItems(T[] array, int index)

     public void ExceptWith(IEnumerable<T> other)
     {
-        throw new NotImplementedException();
+        if (ReferenceEquals(this, other))

Review Comment:
   Technically this is thread-safe, but you're right that it's not atomic. This method only calls either `Clear` or `TryRemove`, both of which are thread-safe. Since it is non-atomic, it allows concurrent adds (etc.) in the middle of the iteration. As an aside, that could theoretically be preferable in some use cases (which we don't have, which is why this is an aside) where reducing the duration or scope of locking on add operations matters more than atomic bulk removal, such as a caching scenario where the cost of a cache miss is lower than the cost of atomic locking. But that's neither here nor there, because there's a bigger concurrency concern here...

   Even if we make this method atomic, it won't solve the problem of ensuring consistency of the stale files hashset in `FSDirectory`. Example:

   1. Initial state of stale files is "a", "b", and "c"
   2. Thread 1 is doing some operation with the index output on "b"
   3. Thread 2 enters Sync and copies the stale files into `toSync`
   4. Thread 2 fsyncs "a" and "b", and is about to fsync "c"
   5. Thread 1 completes its write to "b" and `OnIndexOutputClosed` is called, which does not add "b" to stale files because it already exists
   6. Thread 2 fsyncs "c" and calls `ExceptWith` (atomic or not), removing "a" through "c"
   7. "b" is not fsynced for the second write, even if Thread 1 calls Sync again, because it has now been removed from the set

   I think a "clear-and-return-all-removed" method would whack this mole, but then another pops up with the `DeleteFile` method, which could result in this problem:

   1. Initial state of stale files is "a", "b", and "c"
   2. Thread 1 enters Sync and copies (even with simultaneous clearing) all entries into `toSync`
   3. Thread 2 enters DeleteFile and deletes the file "b" on disk
   4. Thread 1 fsyncs and throws an exception because "b" no longer exists
   5. Thread 2 removes "b" from stale files

   I don't think any amount of concurrent-izing this hashset avoids these potential race conditions. The only proper solution would be to synchronize around the entire hashset, at which point we might as well just use the BCL non-concurrent HashSet instead. Let me know what you think.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
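The first lost-update scenario above doesn't need real threads to reproduce; replaying the interleaving sequentially against any concurrent set shows the same outcome. Here is a Java sketch (using `ConcurrentHashMap.newKeySet()` as a stand-in for the C# `ConcurrentHashSet`; the class and method names are illustrative, not from Lucene.NET):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class StaleFilesRace {

    /** Replays the seven-step interleaving from the comment above as
     *  sequential steps; returns whether "b" is still marked stale at
     *  the end (it is not, which is the bug). */
    public static boolean staleContainsB() {
        // Steps 1-2: stale files start as "a", "b", "c"; Thread 1 is
        // still writing to "b" via its index output.
        Set<String> staleFiles = ConcurrentHashMap.newKeySet();
        staleFiles.add("a");
        staleFiles.add("b");
        staleFiles.add("c");

        // Step 3: Thread 2 enters Sync and snapshots the stale files.
        Set<String> toSync = new HashSet<>(staleFiles);

        // Step 4: Thread 2 fsyncs "a" and "b" (simulated); "c" pending.

        // Step 5: Thread 1 finishes its second write to "b"; the re-add
        // is a no-op because "b" is already present in the set.
        staleFiles.add("b");

        // Step 6: Thread 2 fsyncs "c", then removes everything it
        // synced. Atomic or not, this drops "b" too.
        staleFiles.removeAll(toSync);

        // Step 7: "b" is gone, so its second write is never fsynced.
        return staleFiles.contains("b");
    }

    public static void main(String[] args) {
        System.out.println("stale set still contains \"b\": " + staleContainsB());
    }
}
```

Note that no amount of atomicity in step 6 changes the result: the damage is done at step 5, where the set cannot distinguish "already stale" from "stale again since the snapshot".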
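The "synchronize around the entire hashset" conclusion can be sketched as follows, again in Java as an analogue of a plain BCL `HashSet` behind one lock. The class and method names mirror the `FSDirectory` members discussed above but are hypothetical, not the Lucene.NET implementation:

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch: a plain (non-concurrent) set guarded by a single monitor,
 *  so DeleteFile and Sync can never interleave mid-operation. */
public class SynchronizedStaleFiles {
    private final Set<String> staleFiles = new HashSet<>();
    private final Object lock = new Object();

    /** Analogue of OnIndexOutputClosed: mark a file stale. */
    public void onIndexOutputClosed(String name) {
        synchronized (lock) {
            staleFiles.add(name);
        }
    }

    /** Analogue of DeleteFile: the on-disk delete would also happen
     *  inside the lock, so Sync cannot fsync a vanished file. */
    public void deleteFile(String name) {
        synchronized (lock) {
            // delete file on disk here, then:
            staleFiles.remove(name);
        }
    }

    /** Atomic snapshot-and-clear. A writer that closes an output after
     *  this runs re-adds its file to the now-empty set, so the
     *  lost-update scenario cannot occur. Note: if the fsync calls
     *  themselves run outside the lock, the DeleteFile race reopens,
     *  so a real fix must decide how much work to hold the lock for. */
    public Set<String> sync() {
        synchronized (lock) {
            Set<String> toSync = new HashSet<>(staleFiles);
            staleFiles.clear();
            return toSync;
        }
    }
}
```

The trade-off is exactly the one named in the comment: every add now contends on one lock, which is why at that point a concurrent set no longer buys anything over the plain BCL HashSet.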