[ https://issues.apache.org/jira/browse/LUCENE-8380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532368#comment-16532368 ]
ASF subversion and git services commented on LUCENE-8380:
---------------------------------------------------------

Commit 0f652627a06f036beba0a6a6d201004d7d5a365c in lucene-solr's branch refs/heads/master from [~dawid.weiss]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0f65262 ]

LUCENE-8380: UTF8TaxonomyWriterCache page/offset calculation bug

> UTF8TaxonomyWriterCache inconsistency
> -------------------------------------
>
>                 Key: LUCENE-8380
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8380
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/facet
>    Affects Versions: 7.1
>            Reporter: Ruslan Torobaev
>            Priority: Minor
>             Fix For: 7.5
>
>         Attachments: LUCENE-8380.patch, lucene-taxonomy-cache-report.tar.gz, taxonomy-cache.json.gz, taxonomy.tar.gz
>
>
> I'm facing a problem with taxonomy writer cache inconsistency. At some point
> in time UTF8TaxonomyWriterCache starts to return the wrong ord for some facet
> labels. As a result, wrong ords are written to the doc facet fields, and wrong
> counts (undercounts) are returned during search. The bug manifests on
> different servers with different index contents (we have several separate
> indexes with unique data).
> Unfortunately, I can't reproduce this behaviour in tests.
> I've dumped a "broken" UTF8TaxonomyWriterCache instance and created an app to
> load it and compare it with the real taxonomy. The dumps and the app are
> attached.
> To run the demo, extract the archives' contents and execute:
> {code}
> mvn compile
> mvn exec:java -Dexec.mainClass="me.torobaev.lucene.taxonomy.cache.TaxonomyCacheCheck" -DtaxonomyDir=../taxonomy/ -DcacheDump=../taxonomy-cache.json
> {code}
> As you can see, labels [frametype, 7] and [modification_id, 682] have the
> same ord in the cache.
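
For readers without the attachments, the symptom reported above (two distinct labels sharing one ord) can be illustrated with a minimal sketch. It is not the attached TaxonomyCacheCheck app and does not use any Lucene API: it assumes the cache dump has already been loaded into a plain label-to-ord map, and the class name, method name, path strings and ordinal values are all hypothetical. In a consistent taxonomy every label has its own ord, so any ord listed by this check points at the kind of inconsistency described in the issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DuplicateOrdCheck {

    /**
     * Groups labels by ordinal and keeps only the ordinals that are
     * assigned to more than one label.
     */
    public static Map<Integer, List<String>> findSharedOrds(Map<String, Integer> labelToOrd) {
        Map<Integer, List<String>> byOrd = new TreeMap<>();
        for (Map.Entry<String, Integer> e : labelToOrd.entrySet()) {
            byOrd.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // Drop ordinals that belong to exactly one label; those are consistent.
        byOrd.values().removeIf(labels -> labels.size() < 2);
        return byOrd;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a loaded cache dump; the ordinals are made up.
        Map<String, Integer> cache = new LinkedHashMap<>();
        cache.put("frametype/7", 101);
        cache.put("modification_id/682", 101);
        cache.put("color/red", 102);

        // Prints each shared ordinal together with the labels that collide on it.
        System.out.println(findSharedOrds(cache));
    }
}
```

A check like this only detects collisions inside the cache itself. Comparing each cached ord with the ord stored in the on-disk taxonomy, as the attached app does, would also catch labels that map to a wrong but unshared ord.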