paulirwin commented on code in PR #938: URL: https://github.com/apache/lucenenet/pull/938#discussion_r1597690668
##########
src/Lucene.Net/Support/ConcurrentHashSet.cs:
##########

@@ -753,7 +753,16 @@ private void CopyToItems(T[] array, int index)

     public void ExceptWith(IEnumerable<T> other)
     {
-        throw new NotImplementedException();
+        if (ReferenceEquals(this, other))

Review Comment:
   Technically this is thread-safe, but you're right that it's not atomic. This method only calls either `Clear` or `TryRemove`, both of which are thread-safe. Since it is non-atomic, it allows concurrent adds (etc.) in the middle of the iteration. As an aside, that could theoretically be preferable in some use cases (which we don't have, which is why this is an aside) where reducing the duration or scope of locking on add operations matters more than atomic bulk removal, such as a caching scenario where the cost of a cache miss is lower than the cost of atomic locking. But that's neither here nor there, because there's a bigger concurrency concern here...

   Even if we make this method atomic, it won't solve the problem of ensuring consistency of the stale files hashset in `FSDirectory`. Example:

   1. Initial state of stale files is "a", "b", and "c"
   2. Thread 1 is doing some operation with the index output on "b"
   3. Thread 2 enters Sync and copies the stale files into `toSync`
   4. Thread 2 fsyncs "a" and "b", and is about to fsync "c"
   5. Thread 1 completes its write to "b" and `OnIndexOutputClosed` is called, which does not add "b" to stale files because it already exists
   6. Thread 2 fsyncs "c" and calls `ExceptWith` (atomic or not), removing "a" through "c"
   7. "b" is not fsynced for the second write, even if Thread 1 calls Sync again, because it has now been removed from the set

   I think a "clear-and-return-all-removed" method would whack this mole, but then another pops up with the `DeleteFile` method, which could result in this problem:

   1. Initial state of stale files is "a", "b", and "c"
   2. Thread 1 enters Sync and copies (even with simultaneous clearing) all entries into `toSync`
   3. Thread 2 enters DeleteFile and deletes the file "b" on disk
   4. Thread 1 fsyncs and throws an exception because "b" no longer exists
   5. Thread 2 removes "b" from stale files

   I don't think any amount of concurrent-izing this hashset avoids these potential race conditions. The only proper solution would be to synchronize around the entire hashset, at which point we might as well just use the BCL non-concurrent HashSet instead. Let me know what you think.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
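The first lost-update scenario above doesn't need real threads to reproduce; replaying the interleaving sequentially against any concurrent set shows the same outcome. Here is a Java sketch (using `ConcurrentHashMap.newKeySet()` as a stand-in for the C# `ConcurrentHashSet`; the class and method names are illustrative, not from Lucene.NET):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class StaleFilesRace {

    /** Replays the seven-step interleaving from the comment above as
     *  sequential steps; returns whether "b" is still marked stale at
     *  the end (it is not, which is the bug). */
    public static boolean staleContainsB() {
        // Steps 1-2: stale files start as "a", "b", "c"; Thread 1 is
        // still writing to "b" via its index output.
        Set<String> staleFiles = ConcurrentHashMap.newKeySet();
        staleFiles.add("a");
        staleFiles.add("b");
        staleFiles.add("c");

        // Step 3: Thread 2 enters Sync and snapshots the stale files.
        Set<String> toSync = new HashSet<>(staleFiles);

        // Step 4: Thread 2 fsyncs "a" and "b" (simulated); "c" pending.

        // Step 5: Thread 1 finishes its second write to "b"; the re-add
        // is a no-op because "b" is already present in the set.
        staleFiles.add("b");

        // Step 6: Thread 2 fsyncs "c", then removes everything it
        // synced. Atomic or not, this drops "b" too.
        staleFiles.removeAll(toSync);

        // Step 7: "b" is gone, so its second write is never fsynced.
        return staleFiles.contains("b");
    }

    public static void main(String[] args) {
        System.out.println("stale set still contains \"b\": " + staleContainsB());
    }
}
```

Note that no amount of atomicity in step 6 changes the result: the damage is done at step 5, where the set cannot distinguish "already stale" from "stale again since the snapshot".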
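The "synchronize around the entire hashset" conclusion can be sketched as follows, again in Java as an analogue of a plain BCL `HashSet` behind one lock. The class and method names mirror the `FSDirectory` members discussed above but are hypothetical, not the Lucene.NET implementation:

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch: a plain (non-concurrent) set guarded by a single monitor,
 *  so DeleteFile and Sync can never interleave mid-operation. */
public class SynchronizedStaleFiles {
    private final Set<String> staleFiles = new HashSet<>();
    private final Object lock = new Object();

    /** Analogue of OnIndexOutputClosed: mark a file stale. */
    public void onIndexOutputClosed(String name) {
        synchronized (lock) {
            staleFiles.add(name);
        }
    }

    /** Analogue of DeleteFile: the on-disk delete would also happen
     *  inside the lock, so Sync cannot fsync a vanished file. */
    public void deleteFile(String name) {
        synchronized (lock) {
            // delete file on disk here, then:
            staleFiles.remove(name);
        }
    }

    /** Atomic snapshot-and-clear. A writer that closes an output after
     *  this runs re-adds its file to the now-empty set, so the
     *  lost-update scenario cannot occur. Note: if the fsync calls
     *  themselves run outside the lock, the DeleteFile race reopens,
     *  so a real fix must decide how much work to hold the lock for. */
    public Set<String> sync() {
        synchronized (lock) {
            Set<String> toSync = new HashSet<>(staleFiles);
            staleFiles.clear();
            return toSync;
        }
    }
}
```

The trade-off is exactly the one named in the comment: every add now contends on one lock, which is why at that point a concurrent set no longer buys anything over the plain BCL HashSet.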