[jira] [Commented] (LUCENE-5139) ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing

2013-12-16 Thread Rob Audenaerde (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848956#comment-13848956
 ] 

Rob Audenaerde commented on LUCENE-5139:


In the end we worked around this problem by locking around commits, because 
multiple threads could call commit() on either iw or tw.
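
A minimal sketch of that kind of commit locking, not the code actually used in the application. `CommitAction` and `LockedCommitter` are hypothetical names, not Lucene classes; in a real application the two actions would wrap tw.commit() and iw.commit(). The taxonomy is committed first, following the advice elsewhere in this thread that it is fine for the taxonomy to contain more ordinals than the search index uses.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a writer's commit(), e.g. tw::commit or iw::commit. */
interface CommitAction {
    void commit() throws IOException;
}

/** Serializes commits so no thread commits one writer while another thread commits the other. */
final class LockedCommitter {
    private final ReentrantLock commitLock = new ReentrantLock();
    private final CommitAction taxoCommit;
    private final CommitAction indexCommit;

    LockedCommitter(CommitAction taxoCommit, CommitAction indexCommit) {
        this.taxoCommit = taxoCommit;
        this.indexCommit = indexCommit;
    }

    /** Commits the taxonomy first, then the search index, as one critical section. */
    void commitBoth() throws IOException {
        commitLock.lock();
        try {
            taxoCommit.commit();
            indexCommit.commit();
        } finally {
            commitLock.unlock();
        }
    }
}
```

Every thread that wants to commit would call commitBoth() instead of calling commit() on either writer directly.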

> ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing
> -
>
> Key: LUCENE-5139
> URL: https://issues.apache.org/jira/browse/LUCENE-5139
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/facet
> Affects Versions: 4.4
> Environment: Ubuntu 64 bit
> Reporter: Rob Audenaerde
> Attachments: testfacetindexing.zip
>
>
> It is a hard-to-reproduce problem, but I see it from time to time. I am 
> indexing some 100k documents and, while doing that, I use the search and 
> facet module. 
> In some cases, I get an AIOOBE in the FacetsAccumulator.accumulate method. See, 
> for example, this short stack trace:
> java.lang.ArrayIndexOutOfBoundsException: 1400222
>  at org.apache.lucene.facet.search.FastCountingFacetsAggregator.aggregate(FastCountingFacetsAggregator.java:87)
>  at org.apache.lucene.facet.search.FacetsAccumulator.accumulate(FacetsAccumulator.java:167)
>  at org.apache.lucene.facet.search.FacetsCollector.getFacetResults(FacetsCollector.java:214)
>  at ...
> Some more detail:
> I have an index that is being written to by an IndexWriter. The index is 
> searched by a SearcherManager that uses the same Directory. The 
> SearcherManager has a maybeRefresh scheduled every 1000 ms. When refreshing, I 
> also check whether the taxonomy has changed. If so, I replace it with the new 
> one. I use this code:
> {code}
> TaxonomyReader newReader = TaxonomyReader.openIfChanged( this.taxoReader );
> if ( newReader != null )
> {
>   this.taxoReader = newReader;
>   LOG.info( "Reopening taxonomyReader because it has changed!" );
> }
> {code}
> I will try to make it more reproducible, but maybe someone already has an 
> idea of what might trigger this.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5139) ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing

2013-07-26 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720767#comment-13720767
 ] 

Shai Erera commented on LUCENE-5139:


Not exactly. In your app, you control the commit() calls to both indexes, and 
therefore you "know" that after the commits the two directories agree on their 
state, so you can refresh a single SearcherTaxoManager.

This might fail, though, if the commit to one index succeeds and the commit to 
the other fails (e.g. the machine crashes). For that reason, we always commit 
taxo first, because it's ok if it contains more ordinals than are currently 
used by the search index.

If you're not replacing your index content (i.e. iw.deleteAll() + 
tw.replaceTaxo()), you can do the following (by the same object/thread):

* Call iw.commit()
* Call tw.commit()
* Call searcherTaxoManager.refresh()

If you call refresh() from a different thread, you're still not safe, because 
the refresh might run after iw has committed but before tw has.
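
One way to keep the three calls on a single thread is to submit them as one task to a single-threaded executor. The sketch below assumes that approach; `Step` and `CommitAndRefreshCycle` are hypothetical names, and the three steps stand in for iw.commit(), tw.commit() and the manager refresh, in the order listed above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for iw.commit(), tw.commit() or a manager refresh. */
interface Step {
    void run() throws Exception;
}

/** Runs commit, commit, refresh as one task on one dedicated thread. */
final class CommitAndRefreshCycle implements AutoCloseable {
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "commit-refresh");
        t.setDaemon(true); // do not keep the JVM alive for this sketch
        return t;
    });
    private final Step commitIndex;
    private final Step commitTaxo;
    private final Step refresh;

    CommitAndRefreshCycle(Step commitIndex, Step commitTaxo, Step refresh) {
        this.commitIndex = commitIndex;
        this.commitTaxo = commitTaxo;
        this.refresh = refresh;
    }

    /** Queues one full cycle; cycles cannot interleave because the executor has one thread. */
    Future<?> submitCycle() {
        return worker.submit(() -> {
            commitIndex.run();
            commitTaxo.run();
            refresh.run();
            return null;
        });
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

Any thread may call submitCycle(), but the refresh only ever runs after both commits of the same cycle have finished. This only helps if nothing else calls commit() or refresh() directly.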


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (LUCENE-5139) ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing

2013-07-26 Thread Rob Audenaerde (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720755#comment-13720755
 ] 

Rob Audenaerde commented on LUCENE-5139:


If I understand correctly, the problem basically is that it is not (yet?) 
possible to tell if the two directories agree on their state? If that is not 
possible, will I still get the AssertionErrors?




[jira] [Commented] (LUCENE-5139) ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing

2013-07-26 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720745#comment-13720745
 ] 

Shai Erera commented on LUCENE-5139:


Yup, the first test shows exactly the problem. Your refresh() method first 
calls sm.maybeRefresh() and then reopens the taxonomy. If, in between, a 
Searcher acquires the (refreshed) IndexSearcher and runs a search, it uses a 
newer IndexSearcher with an older TaxoReader.

You can try to reverse the order in refresh() so that taxoReader is reopened 
before sm.maybeRefresh(). It's usually ok if taxoReader sees more ordinals than 
IndexSearcher, but not vice versa. I say usually because in some cases, e.g. 
when you call replaceTaxonomy() and writer.deleteAll(), that's not the case. 
For that, we need a SearcherTaxoManager which can verify that the two indexes 
are consistent.
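
A toy model of why the order matters, not Lucene's actual aggregator code. It assumes counting works on an array with one slot per ordinal known to the taxonomy view; `RefreshOrderModel` and its parameters are hypothetical.

```java
/** Toy model: facet counts live in an array sized by the taxonomy view in use. */
final class RefreshOrderModel {
    private RefreshOrderModel() {}

    /**
     * Counts ordinals seen by the searcher view.
     *
     * @param docOrdinals  category ordinals stored in the documents the searcher sees
     * @param taxonomySize number of ordinals known to the taxonomy view
     */
    static int[] aggregate(int[] docOrdinals, int taxonomySize) {
        int[] counts = new int[taxonomySize];
        for (int ord : docOrdinals) {
            // Throws ArrayIndexOutOfBoundsException when the taxonomy view is older
            // than the searcher view and does not know this ordinal yet.
            counts[ord]++;
        }
        return counts;
    }
}
```

With the taxonomy reopened first, the intermediate state is an old searcher (ordinals 0 and 1) with a new taxonomy (size 3), and `aggregate(new int[] {0, 1}, 3)` succeeds. With the searcher refreshed first, the intermediate state is a new searcher (ordinals 0, 1, 2) with an old taxonomy (size 2), and `aggregate(new int[] {0, 1, 2}, 2)` throws.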




[jira] [Commented] (LUCENE-5139) ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing

2013-07-26 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720735#comment-13720735
 ] 

Shai Erera commented on LUCENE-5139:


That assertion seems to be what I was talking about -- an IndexSearcher sees 
more docs than its matching TaxonomyReader.

The reason we don't have a Directory-based SearcherTaxoManager version (yet!) 
is that we'd need to guarantee the two Directories (index and taxo) actually 
agree on their state. That is, all the categories that are indexed in the 
search index, also exist in the taxonomy index. Also, that all the categories 
that are encoded in the taxonomy index (and their ordinals) are consistent with 
the ones in the search index. In other words, that ordinal=7 in both 
Directories denotes the same category!

We could perhaps write an OptimisticSearcherTaxoManager which takes two 
Directories. You could also write one yourself to try it -- should be very 
easy. You can copy most of the code from STManager (with the writers).




[jira] [Commented] (LUCENE-5139) ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing

2013-07-26 Thread Rob Audenaerde (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720729#comment-13720729
 ] 

Rob Audenaerde commented on LUCENE-5139:


Hi Shai, I cannot directly use the SearcherTaxonomyManager, because it can only 
be constructed using an IndexWriter. For my use case, I have this:
{code}
this.searcherManager = new SearcherManager( this.indexDirectory, new 
SearcherFactory() );
{code}

I created some tests; using the {SearcherTaxonomyManager} and the Writers I 
don't get any errors. When not using the {SearcherTaxonomyManager} I sometimes 
get this assertion failure: 

  java.lang.AssertionError: ord=2 vs maxOrd=1
    at org.apache.lucene.facet.search.FastCountingFacetsAggregator.aggregate(FastCountingFacetsAggregator.java:86)
    at org.apache.lucene.facet.search.FacetsAccumulator.accumulate(FacetsAccumulator.java:167)
    at org.apache.lucene.facet.search.FacetsCollector.getFacetResults(FacetsCollector.java:214)
    at org.audenaerde.FacetSearchWhileIndexingLuceneTest$Searcher.run(FacetSearchWhileIndexingLuceneTest.java:126)





[jira] [Commented] (LUCENE-5139) ArrayIndexOutOfBoundsException in FacetsAccumulator.accumulate while indexing

2013-07-26 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720685#comment-13720685
 ] 

Shai Erera commented on LUCENE-5139:


Rob, can you try to reproduce that while using SearcherTaxonomyManager? The 
problem with what you do is that IndexSearcher and TaxonomyReader are not 
reopened atomically, i.e. a thread could call sm.acquire() after your 
sm.maybeRefresh() finished, but before TaxonomyReader was reopened.

Also, I don't know if it's the full code section which updates the taxoReader 
instance, but if it is, note that you don't close the previous reader instance, 
thereby leaking references (and file handles).
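
A minimal sketch of publishing both readers atomically, assuming an immutable pair swapped through an AtomicReference. `ReaderPair` and `PairHolder` are hypothetical names, and the type parameters stand in for IndexSearcher and TaxonomyReader. Unlike SearcherTaxonomyManager, this sketch does no reference counting, so the previous pair returned by swap() may only be closed once no running search still uses it.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Immutable snapshot; S and T stand in for IndexSearcher and TaxonomyReader. */
final class ReaderPair<S, T> {
    final S searcher;
    final T taxonomy;

    ReaderPair(S searcher, T taxonomy) {
        this.searcher = searcher;
        this.taxonomy = taxonomy;
    }
}

/** Holds the current pair so that search threads never see a mixed old/new combination. */
final class PairHolder<S, T> {
    private final AtomicReference<ReaderPair<S, T>> current;

    PairHolder(S searcher, T taxonomy) {
        this.current = new AtomicReference<>(new ReaderPair<>(searcher, taxonomy));
    }

    /** Returns a consistent snapshot; use both fields from the same returned pair. */
    ReaderPair<S, T> acquire() {
        return current.get();
    }

    /** Publishes a new pair in one step and returns the previous one for later cleanup. */
    ReaderPair<S, T> swap(S searcher, T taxonomy) {
        return current.getAndSet(new ReaderPair<>(searcher, taxonomy));
    }
}
```

A search thread calls acquire() once and uses both fields of the returned pair for the whole search, instead of reading the searcher and the taxonomy reader from two separate fields.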
