[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529831#comment-17529831 ] Dawid Weiss commented on LUCENE-10292: -- > All I was really trying to do with these tests was demonstrate that data you > get out of the Lookup before you call build(), can still be gotten from the > Lookup while build() is incrementally consuming an iterator (which may take a > long time if you are building up from a long iterator) and that this behavior > is consistent across Lookup impls (as opposed to before i filed this issue, > when most Lookups worked that way, but AnalyzingInfixSuggester would throw an > ugly exception – which was certainly confusing to users who might switch from > one impl to another). I guess I am not comfortable with the fact that this test works only by a lucky coincidence and tests the behavior that isn't guaranteed or documented by the Lookup class - this got me confused and I guess it'll confuse people looking at this code after me. It's not a personal stab at you, it's just something that smells fishy around this code in general. When I was looking at the failure and tried to debug the test, I didn't see the reason why this test was necessary (I looked at the Lookup class documentation). When I understood what the test did, I looked at the implementations and they seemed to be designed with a single-thread model in mind (external synchronization between lookups and rebuilds). For example, even now, if you had a tight loop in one thread calling lookup on an FSTCompletionLookup and this loop got compiled, then there's nothing preventing the compiler from reading higherWeightsCompletion and normalCompletion fields once and never again (they're regular fields in FSTCompletionLookup), even if you call build there multiple times in between... Is this likely to happen? I don't know. Is this possible? Sure. Maybe I'm oversensitive because I grew up on machines with much less strict cache coherency protocols but code like this makes me itchy. > I didn't set out to make any hard & fast guarantee about the thread safety of > all lookups – just improve the one that awas obviously inconsistent with the > others (progress, not perfection) That's my point. Either we should make the Lookup interface explicitly state that it's safe to call the build method from another thread or we shouldn't really guarantee (or test) this behavior. I don't want you to revert the changes you made but my gut feeling is that lookup implementations should be designed to be single-threaded or at least immutable (one publisher-multiple readers model) as it makes implementing them much easier - no volatiles, synchronization blocks, etc. Concurrency concerns should be handled by the code that uses Lookups - this code should know whether synchronization or two concurrent instances are required (one doing the lookups, potentially via multiple threads, one rebuilding). Perhaps a change in the API is needed to separate those two phases (build-use) and then the downstream code has to take care of handling/ swapping out Lookup reference where they're used - I don't know, I just state what I think. > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { >
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529677#comment-17529677 ] ASF subversion and git services commented on LUCENE-10292: -- Commit 115bcd9c66d9723a7a1ea5131aa2f5d16a6867b7 in lucene's branch refs/heads/branch_9x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=115bcd9c66d ] LUCENE-10292: prevent thread leak (or test timeout) if exception/assertion failure in test iterator (cherry picked from commit 6afb9bc25a5935a3cba0a06e67b27fe290255467) > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529676#comment-17529676 ] ASF subversion and git services commented on LUCENE-10292: -- Commit 6afb9bc25a5935a3cba0a06e67b27fe290255467 in lucene's branch refs/heads/main from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=6afb9bc25a5 ] LUCENE-10292: prevent thread leak (or test timeout) if exception/assertion failure in test iterator > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529668#comment-17529668 ] Chris M. Hostetter commented on LUCENE-10292: - {quote} I'm still not sure whether these tests make sense without explicitly stating ... {quote} All I was really trying to do with these tests was demonstrate that data you get out of the Lookup before you call build(), can still be gotten from the Lookup while build() is incrementally consuming an iterator (which may take a long time if you are building up from a long iterator) and that this behavior is consistent across Lookup impls (as opposed to before i filed this issue, when most Lookups worked that way, but AnalyzingInfixSuggester would throw an ugly exception – which was certainly confusing to users who might switch from one impl to another). I didn't set out to make any hard & fast guarantee about the thread safety of all lookups – just improve the one that awas obviously inconsistent with the others (progress, not perfection) I'm happy to rename/re-document/remove the test code if you'd like - or confine it to just AnalyzingInfixSuggester - but i think (assume?) we can probably agree that the src/main code changes I've made in this issue are generally for the better? {quote}I think it'd be a much more user-friendly API if Lookup was actually detached entirely from its build process (for example by replacing the current build method with a builder() that would return a new immutable Lookup instance). ... I'm not saying this should be implemented here - perhaps it's worth a new issue to do this refactoring. {quote} +1 ... although some of the lookup impls explicitly support update()int individual suggestions – so you'd either have to remove that functionality or refactor the API in some way that it can still be optional. {quote}Separately from the above, if the test fails, it'll leak threads ... It should be replaced with a try/catch that rethrows an unchecked exception {quote} Yeah, that was lazy & sloppy of me – sorry about that. I'll fix it ASAP. > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529223#comment-17529223 ] Dawid Weiss commented on LUCENE-10292: -- Thanks Chris. I'm still not sure whether these tests make sense without explicitly stating that build() can be called on Lookup to dynamically (and concurrently) replace its internals... For example, FSTCompletionLookup: {code} // The two FSTCompletions share the same automaton. this.higherWeightsCompletion = builder.build(); this.normalCompletion = new FSTCompletion(higherWeightsCompletion.getFST(), false, exactMatchFirst); this.count = newCount; {code} none of these fields are volatile or under a monitor, so no guaranteed flush occurs anywhere. I understand eventually they'll get consistent by piggybacking on some other synchronization/ memfence but it's weird to rely on this behavior. I think it'd be a much more user-friendly API if Lookup was actually detached entirely from its build process (for example by replacing the current build method with a builder() that would return a new immutable Lookup instance). This would be less confusing and would also allow for a cleaner implementation (no synchronization at all required - just regular assignments, maybe even with final fields). I'm not saying this should be implemented here - perhaps it's worth a new issue to do this refactoring. Separately from the above, if the test fails, it'll leak threads - this: + acquireOnNext.acquireUninterruptibly(); literally blocks forever. It should be replaced with a try/catch that rethrows an unchecked exception when the iterator thread is interrupted. > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529142#comment-17529142 ] ASF subversion and git services commented on LUCENE-10292: -- Commit 65f5a3bcc5afc17888258319d76e4a6293490d12 in lucene's branch refs/heads/branch_9x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=65f5a3bcc5a ] LUCENE-10292: Suggest: Fix FreeTextSuggester so that getCount() returned results consistent with lookup() during concurrent build() Fix SuggestRebuildTestUtil to reliably surfice this kind of failure that was previously sporadic (cherry picked from commit a8d86ea6e8b89ea0324e7f8b6e1d5213254274d5) > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529140#comment-17529140 ] ASF subversion and git services commented on LUCENE-10292: -- Commit a8d86ea6e8b89ea0324e7f8b6e1d5213254274d5 in lucene's branch refs/heads/main from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=a8d86ea6e8b ] LUCENE-10292: Suggest: Fix FreeTextSuggester so that getCount() returned results consistent with lookup() during concurrent build() Fix SuggestRebuildTestUtil to reliably surfice this kind of failure that was previously sporadic > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529139#comment-17529139 ] Chris M. Hostetter commented on LUCENE-10292: - {quote}... how did you arrive at the conclusion that Lookup.build can be called concurrently? {quote} Well, my initial understand/impression was that calling lookup concurrent to a (re)build should be "ok" was based on the fact that it worked for every Suggester I could find _except_ AnalyzingInfixSuggester because all other suggesters atomically replaced their internal structures (typically FSTs) at the end of their build() process – only AnalyzingInfixSuggester blew away it's internal data (it's searcherMgr) at the start of the build method, replacing it only at the end of the build – and had a lookup method that was throw an exception if you tried to use it during a (re)build. As i said in my initial comment, it seemed like at a minimum we could/should make AnalyzingInfixSuggester return Collections.emptyList() in the case that searcherMgr was null (consistent with the other Lookup impls i found and what their lookup methods returned if they had _never_ been built) but it seemed very easy/possible to make AnalyzingInfixSuggester support lookup concurrent w/ (re)build – so i added it (since there were no objections) As i mentioned in subsequent comments (above) – while working on the test for this (and refactoring it so that it could be applied to all suggesters) i found that while i couldn't trigger any failures like this in other Lookup impls during concurrent calls to build()/lookup() i did discover some FST based suggesters (like AnalysingSuggester and FSTCompletionLookup) had inconsistencies between the getCount() and lookup() since the "count" variable was being updated iteratively while the fst was being replaced atomicly –which i only noticed because my new test changes were also sanity checking the count to confirm that the "new" data was in fact getting picked up by the time build finished. I attempted to "improve" those inconsistencies as well, in the two analyzers where i noticed it, by replacing the "count" variable atomically as well _but I evidently overlooked that FreeTextSuggester has this same code pattern_ . (And yes, my "improvement" isn't a perfect "fix" because the fst & count variables are updated independently w/o any synchronization – but it didn't seem worth the overhead of adding that since there's nothing in the docs that implies/guarantees anything about what getCount will return during a build – but it certainly seemed wrong to me that those impls would happily return lots of results from lookup calls while getCount() might return '0') {quote}...while the build() method is not even invoked yet because this line: ... is semaphore-blocked in the constructor parameters (InputArrayIterator). {quote} ...ah, yes, i overlooked that. The spirit of what I was trying to do with this test is assert that the various "checks" all pass even while the build method is in the process of consuming an iterator. I initially implemented the test with both the Semaphore and sleep calls, but didn't want to leave them in and make the test "slow" – and when removing the sleep calls I clearly didn't give enough thought to the possibility that the "main" thread would have a chance to release all it's permits before the background thread had acquired any of them. I've got an improved version of the test locally that uses bi-directional Semaphores to ensure the checks are tested between every call to next() (and wraps the InputArrayIterator instead of vice-versa so it doesn't get hung up on the behavior of that classes constructor) which reliably triggers the current sporadic jenkins failures in TestFreeTextSuggester – and making the same improvements to how that FreeTextSuggester tracks its "count" that my previous commits made to AnalysingSuggester and FSTCompletionLookup causes the modified test to reliably pass. I'll commit these changes to main & 9x ASAP to stop the jenkins failures – but if you would like to re-open this issue for further discussion (and/or revert all of these commits in the meantime) I understand > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529018#comment-17529018 ] Dawid Weiss commented on LUCENE-10292: -- I don't see any evidence in implementations of Lookup that build() can be called in a thread-safe manner. Those testLookupsDuringReBuild are only working by a lucky chance (and rarely still fail!). The code typically releases semaphore permissions quickly here: {code} // at every stage of the slow rebuild, we should still be able to get our original suggestions for (int i = 0; i < data.size(); i++) { initialChecks.check(suggester); rebuildGate.release(); } {code} while the build() method is not even invoked yet because this line: {code} suggester.build( new InputArrayIterator(new DelayedIterator<>(suggester, rebuildGate, data.iterator(; {code} is semaphore-blocked in the constructor parameters (InputArrayIterator). So the result is that for suggester.build() is typically entered a long time after the check look has finished. It is enough to modify the code to: {code} // at every stage of the slow rebuild, we should still be able to get our original suggestions for (int i = 0; i < data.size(); i++) { rebuildGate.release(); Thread.sleep(100); initialChecks.check(suggester); } {code} to cause repeatable failures (this isn't a suggested fix but a demonstration that the code is currently broken). > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528998#comment-17528998 ] Dawid Weiss commented on LUCENE-10292: -- [~hossman] - don't know if you saw the recent discussion on the mailing list - how did you arrive at the conclusion that Lookup.build can be called concurrently? I don't think this is mentioned anywhere in Lookup documentation and I don't think the implementation is thread-safe (at least not the TestFreeTextSuggester)? > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 10.0 (main), 9.2 > > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519694#comment-17519694 ] ASF subversion and git services commented on LUCENE-10292: -- Commit b11b0714297ab5f8005aa01001824acc6ef488e7 in lucene's branch refs/heads/branch_9x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=b11b0714297 ] LUCENE-10292: Suggest: Fix AnalyzingInfixSuggester / BlendedInfixSuggester to correctly return existing lookup() results during concurrent build() Fix other FST based suggesters so that getCount() returned results consistent with lookup() during concurrent build() (cherry picked from commit 5015dc6dbb89a2d3f9c2cd0eb1674f6f146d09b4) > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519682#comment-17519682 ] ASF subversion and git services commented on LUCENE-10292: -- Commit 5015dc6dbb89a2d3f9c2cd0eb1674f6f146d09b4 in lucene's branch refs/heads/main from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=5015dc6dbb8 ] LUCENE-10292: Suggest: Fix AnalyzingInfixSuggester / BlendedInfixSuggester to correctly return existing lookup() results during concurrent build() Fix other FST based suggesters so that getCount() returned results consistent with lookup() during concurrent build() > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292-3.patch, LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519219#comment-17519219 ] Michael Sokolov commented on LUCENE-10292: -- It's surprising it took so long for someone to stumble over this. Maybe because most (as I used to do) rebuild their suggesters off line as part of a batch build process? Anyway I looked over the patch and didn't see any problem except a typo in tests "testDurringReBuild" should probably be "testDuringRebuild"? Thanks for the nice tests! > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: LUCENE-10292-1.patch, LUCENE-10292-2.patch, > LUCENE-10292.patch > > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10292) AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
[ https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454288#comment-17454288 ] Chris M. Hostetter commented on LUCENE-10292: - It seems like at a minimum we should make {{AnalyzingInfixSuggester.lookup()}} return an empty List in the {{searcherMgr == null}} case -- but it also seems like it should be possible/better to change {{AnalyzingInfixSuggester.build()}} so that the {{searcherMgr}} is only repalced *after* we build the new index (and/or stop using a new {{IndexWriter}} on every {{AnalyzingInfixSuggester.build()}} call and just do a {{writer.deleteAll()}} instead?) > AnalyzingInfixSuggester thread safety: lookup() fails during (re)build() > > > Key: LUCENE-10292 > URL: https://issues.apache.org/jira/browse/LUCENE-10292 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Priority: Major > > I'm filing this based on anecdotal information from a Solr user w/o > experiencing it first hand (and I don't have a test case to demonstrate it) > but based on a reading of the code the underlying problem seems self > evident... > With all other Lookup implementations I've examined, it is possible to call > {{lookup()}} regardless of whether another thread is concurrently calling > {{build()}} – in all cases I've seen, it is even possible to call > {{lookup()}} even if {{build()}} has never been called: the result is just an > "empty" {{List}} > Typically this is works because the {{build()}} method uses temporary > datastructures until it's "build logic" is complete, at which point it > atomically replaces the datastructures used by the {{lookup()}} method. In > the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method > starts by closing & null'ing out the {{protected SearcherManager > searcherMgr}} (which it only populates again once it's completed building up > it's index) and then the lookup method starts with... > {code:java} > if (searcherMgr == null) { > throw new IllegalStateException("suggester was not built"); > } > {code} > ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any > situation where another thread may be calling > {{AnalyzingInfixSuggester.build()}} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org