[ https://issues.apache.org/jira/browse/LUCENE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tommaso Teofili updated LUCENE-6435:
------------------------------------
    Description: 
While using {{SimpleNaiveBayesClassifier}} on a very large index (all Italian 
Wikipedia articles) I see the following code triggering a 
{{ConcurrentModificationException}} when evicting the {{Query}} from the 
{{LRUCache}}.
{code}
BooleanQuery booleanQuery = new BooleanQuery();
// match the word in any of the text fields
BooleanQuery subQuery = new BooleanQuery();
for (String textFieldName : textFieldNames) {
  subQuery.add(new BooleanClause(new TermQuery(new Term(textFieldName, word)), BooleanClause.Occur.SHOULD));
}
booleanQuery.add(new BooleanClause(subQuery, BooleanClause.Occur.MUST));
// restrict the matches to documents of class c
booleanQuery.add(new BooleanClause(new TermQuery(new Term(classFieldName, c)), BooleanClause.Occur.MUST));
//...
// count the matching documents without scoring or collecting them
TotalHitCountCollector totalHitCountCollector = new TotalHitCountCollector();
indexSearcher.search(booleanQuery, totalHitCountCollector);
return totalHitCountCollector.getTotalHits();
{code}

This is the complete stacktrace:
{code}
java.util.ConcurrentModificationException: Removal from the cache failed! This 
is probably due to a query which has been modified after having been put into  
the cache or a badly implemented clone(). Query class: [class 
org.apache.lucene.search.BooleanQuery], query: [#text:panoram #cat:1356]
{code}

Strangely, the exception does not occur if I change the last lines of the 
snippet above so that they do not use the {{TotalHitCountCollector}}:
{code}
return indexSearcher.search(booleanQuery, 1).totalHits;
{code}
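
The exception message attributes the failure to a query that was modified after being put into the cache. Whether that is what happens in this case is not established here. The following is only a minimal sketch of that general failure mode using plain {{java.util}} collections, not Lucene's cache; {{MutatedKeyDemo}} and {{MutableKey}} are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutatedKeyDemo {

  // Hypothetical stand-in for a query object: equals() and hashCode()
  // depend on a field that can still change after construction.
  static final class MutableKey {
    String term;

    MutableKey(String term) {
      this.term = term;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof MutableKey && Objects.equals(term, ((MutableKey) o).term);
    }

    @Override
    public int hashCode() {
      return Objects.hashCode(term);
    }
  }

  // Puts a key into a hash-based map, mutates it, then tries to remove it.
  // Returns true only if the removal still finds the entry.
  static boolean removeAfterMutation() {
    Map<MutableKey, String> cache = new HashMap<>();
    MutableKey key = new MutableKey("text:panoram");
    cache.put(key, "cached entry");

    // Mutating the key changes its hashCode, so a later lookup hashes
    // differently from the hash recorded when the entry was stored.
    key.term = "text:panoram cat:1356";

    return cache.remove(key) != null;
  }

  public static void main(String[] args) {
    System.out.println("removed after mutation: " + removeAfterMutation());
  }
}
```

In this sketch the entry stays in the map but can no longer be removed through its own key, which is the kind of inconsistency a cache could detect and report on eviction.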

  was:
While using {{SimpleNaiveBayesClassifier}} on a very large index (all Italian 
Wikipedia articles) I see the following code triggering a 
{{ConcurrentModificationException}} when evicting the {{Query}} from the 
{{LRUCache}}.
{code}
BooleanQuery booleanQuery = new BooleanQuery();
    BooleanQuery subQuery = new BooleanQuery();
    for (String textFieldName : textFieldNames) {
      subQuery.add(new BooleanClause(new TermQuery(new Term(textFieldName, 
word)), BooleanClause.Occur.SHOULD));
    }
    booleanQuery.add(new BooleanClause(subQuery, BooleanClause.Occur.MUST));
    booleanQuery.add(new BooleanClause(new TermQuery(new Term(classFieldName, 
c)), BooleanClause.Occur.MUST));
    //...
    TotalHitCountCollector totalHitCountCollector = new 
TotalHitCountCollector();
    indexSearcher.search(booleanQuery, totalHitCountCollector);
    return totalHitCountCollector.getTotalHits();
{code}

this is the complete stacktrace:
{noformat}
java.util.ConcurrentModificationException: Removal from the cache failed! This 
is probably due to a query which has been modified after having been put into  
the cache or a badly implemented clone(). Query class: [class 
org.apache.lucene.search.BooleanQuery], query: [#text:panoram #cat:1356]
{noformat}

The strange thing is that the above doesn't happen if I change the last lines 
of the above piece of code to not use the {{TotalHitCountsCollector}}:
{code}
return indexSearcher.search(booleanQuery, 1).totalHits;
{code}


> java.util.ConcurrentModificationException: Removal from the cache failed 
> error in SimpleNaiveBayesClassifier
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6435
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6435
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/classification
>    Affects Versions: 5.1
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: Trunk
>
>
> While using {{SimpleNaiveBayesClassifier}} on a very large index (all Italian 
> Wikipedia articles) I see the following code triggering a 
> {{ConcurrentModificationException}} when evicting the {{Query}} from the 
> {{LRUCache}}.
> {code}
> BooleanQuery booleanQuery = new BooleanQuery();
> BooleanQuery subQuery = new BooleanQuery();
> for (String textFieldName : textFieldNames) {
>   subQuery.add(new BooleanClause(new TermQuery(new Term(textFieldName, word)), BooleanClause.Occur.SHOULD));
> }
> booleanQuery.add(new BooleanClause(subQuery, BooleanClause.Occur.MUST));
> booleanQuery.add(new BooleanClause(new TermQuery(new Term(classFieldName, c)), BooleanClause.Occur.MUST));
> //...
> TotalHitCountCollector totalHitCountCollector = new TotalHitCountCollector();
> indexSearcher.search(booleanQuery, totalHitCountCollector);
> return totalHitCountCollector.getTotalHits();
> {code}
> This is the complete stacktrace:
> {code}
> java.util.ConcurrentModificationException: Removal from the cache failed! 
> This is probably due to a query which has been modified after having been put 
> into  the cache or a badly implemented clone(). Query class: [class 
> org.apache.lucene.search.BooleanQuery], query: [#text:panoram #cat:1356]
> {code}
> Strangely, the exception does not occur if I change the last lines of the 
> snippet above so that they do not use the {{TotalHitCountCollector}}:
> {code}
> return indexSearcher.search(booleanQuery, 1).totalHits;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
