[jira] [Commented] (LUCENE-7975) Replace facets taxonomy writer "cache" with BytesRefHash based implementation
[ https://issues.apache.org/jira/browse/LUCENE-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189743#comment-16189743 ]

ASF subversion and git services commented on LUCENE-7975:
---------------------------------------------------------

Commit 6cee162e195bb124cf77c7dc8b9e595cfb3e8a93 in lucene-solr's branch refs/heads/branch_7x from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6cee162 ]

LUCENE-7975: change the default taxonomy facets cache to a faster UTF-8 cache

> Replace facets taxonomy writer "cache" with BytesRefHash based implementation
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-7975
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7975
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 7.1, master (8.0)
>
>         Attachments: LUCENE-7975.patch, LUCENE-7975.patch
>
>
> When the facets module was first created we didn't have {{BytesRefHash}} and
> so the default cache ({{Cl2oTaxonomyWriterCache}}) was quite a bit more
> complex than needed.
> I changed this to use a {{BytesRefHash}}, which stores labels as UTF-8
> (reduces memory for ASCII-only usage), and is also faster (~12% overall
> speedup on indexing time in my private tests).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
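
To make the idea in the issue description concrete, here is a minimal sketch of a label-to-ordinal cache keyed by UTF-8 bytes. It is not the code from the patch and does not use Lucene's {{BytesRefHash}}; the class {{Utf8LabelCacheSketch}} and its methods are hypothetical, built only on the JDK, and the size comparison assumes the alternative is storing labels as 2-byte Java chars.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not Lucene's UTF8TaxonomyWriterCache.
final class Utf8LabelCacheSketch {
    // ByteBuffer compares and hashes by content, so it can key raw UTF-8 bytes.
    private final Map<ByteBuffer, Integer> ordinals = new HashMap<>();

    private static ByteBuffer key(String label) {
        return ByteBuffer.wrap(label.getBytes(StandardCharsets.UTF_8));
    }

    /** Records the ordinal for a label; returns true if the label was new. */
    synchronized boolean put(String label, int ordinal) {
        return ordinals.putIfAbsent(key(label), ordinal) == null;
    }

    /** Returns the ordinal for a label, or -1 if the label is not cached. */
    synchronized int get(String label) {
        Integer ord = ordinals.get(key(label));
        return ord == null ? -1 : ord;
    }

    /** Bytes needed to store the label as UTF-8. */
    static int utf8Size(String label) {
        return label.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Bytes needed to store the label as 2-byte UTF-16 code units. */
    static int utf16Size(String label) {
        return label.length() * Character.BYTES;
    }
}
```

For an ASCII-only label, {{utf8Size}} returns half of {{utf16Size}}, which is the memory reduction the description refers to. A real {{BytesRefHash}} additionally packs all keys into shared byte blocks instead of allocating one object per key, which this sketch does not attempt.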
[jira] [Commented] (LUCENE-7975) Replace facets taxonomy writer "cache" with BytesRefHash based implementation
[ https://issues.apache.org/jira/browse/LUCENE-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16189735#comment-16189735 ]

ASF subversion and git services commented on LUCENE-7975:
---------------------------------------------------------

Commit a9fb4ddf80f28c5de36459569f1c94a261a70e8e in lucene-solr's branch refs/heads/master from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a9fb4dd ]

LUCENE-7975: change the default taxonomy facets cache to a faster UTF-8 cache
[jira] [Commented] (LUCENE-7975) Replace facets taxonomy writer "cache" with BytesRefHash based implementation
[ https://issues.apache.org/jira/browse/LUCENE-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179123#comment-16179123 ]

Michael McCandless commented on LUCENE-7975:
--------------------------------------------

Oh, whoops, yes, the {{xx}} is left over; I'll remove those methods.

bq. Do we really need the bytes ThreadLocal in UTF8TaxonomyWriterCache? It looks like it is always accessed under 'this' lock

Eeek, nice catch! I meant to perf test with that code outside of the lock; I'll re-test and see whether it's warranted.
[jira] [Commented] (LUCENE-7975) Replace facets taxonomy writer "cache" with BytesRefHash based implementation
[ https://issues.apache.org/jira/browse/LUCENE-7975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179027#comment-16179027 ]

Adrien Grand commented on LUCENE-7975:
--------------------------------------

Wow, nice simplification!
- I think you forgot to remove the {{xx}} prefix from some methods (which I believe were used to make the old and new impls co-exist).
- Do we really need the bytes ThreadLocal in UTF8TaxonomyWriterCache? It looks like it is always accessed under 'this' lock
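
The ThreadLocal question can be illustrated with a small JDK-only sketch. It is not the code from the patch: the class {{LockedScratchSketch}} and the method {{encodedLength}} are hypothetical. It shows the general point that a scratch buffer touched only inside methods synchronized on 'this' can be a plain field, because the lock already guarantees one thread at a time.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not taken from UTF8TaxonomyWriterCache.
final class LockedScratchSketch {
    // Neither field is thread-safe on its own. Both are plain fields because
    // every access happens inside a method synchronized on 'this'.
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    private ByteBuffer scratch = ByteBuffer.allocate(16);

    /** Encodes the label into the shared scratch buffer and returns its UTF-8 length. */
    synchronized int encodedLength(String label) {
        // A UTF-16 code unit never needs more than 3 bytes in UTF-8.
        int maxBytes = label.length() * 3;
        if (scratch.capacity() < maxBytes) {
            scratch = ByteBuffer.allocate(maxBytes);
        }
        scratch.clear();
        encoder.reset();
        encoder.encode(CharBuffer.wrap(label), scratch, true);
        encoder.flush(scratch);
        return scratch.position();
    }
}
```

If the encoding step were moved outside the synchronized section to shorten the time the lock is held, the shared fields would no longer be safe, and a per-thread buffer (for example via ThreadLocal) would be one way to keep it correct.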