[jira] [Commented] (LUCENE-3262) Facet benchmarking

Shai Erera (Commented) (JIRA) Wed, 05 Oct 2011 22:16:58 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121728#comment-13121728
 ]


Shai Erera commented on LUCENE-3262:
------------------------------------

Patch looks good! I have a few initial comments:

* facets.alg: since these .alg files often serve as examples, I think it would 
be good if it declared facet.source (to random) explicitly.

* OpenTaxonomyReaderTask: I see that since PerfRunData calls incRef() on the 
incoming taxonomy, you call decRef(). I also see that setIndexReader behaves 
the same way. But I find it confusing. Personally, since this is not an 
application, I don't think we should 'hold a reference to IR/LTR just in case 
the one who set it closes it'. But if we do that, can we at least document on 
setIR/LTR that this is the case? I can certainly see myself opening an IR/LTR 
and setting it on PerfRunData without calling decRef()/close() afterwards. It 
would not occur to me that I should ...

* The abstraction of ItemSource is nice. But its javadocs still mention 
content.source.*. Since we're not committed to backwards compatibility in 
benchmark, and in the interest of clarity, perhaps we should rename them to 
item.source.*?

* ItemSource.resetInputs has a @SuppressWarnings("unused") -- is it a leftover 
from when it was private?

* In the PerfRunData ctor you call Class.forName with the String name of 
RandomFacetSource. Why not use RandomFacetSource.class.getName()?
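
To make the reference-counting concern in the OpenTaxonomyReaderTask comment 
concrete, here is a minimal sketch. RefCountedResource and RunDataSketch are 
hypothetical stand-ins, not the real IndexReader, taxonomy reader, or 
PerfRunData classes; the sketch only assumes the behaviour described above, 
namely that the setter takes its own reference. Under that assumption, a caller 
who does not release its own reference leaves the resource open even after the 
holder is closed.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical ref-counted resource; NOT Lucene's IndexReader or taxonomy reader. */
final class RefCountedResource {
    // The creator owns the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);

    void incRef() {
        if (refCount.get() <= 0) {
            throw new IllegalStateException("resource is already closed");
        }
        refCount.incrementAndGet();
    }

    void decRef() {
        if (refCount.decrementAndGet() < 0) {
            throw new IllegalStateException("too many decRef() calls");
        }
    }

    int getRefCount() {
        return refCount.get();
    }
}

/** Hypothetical holder that mimics the setter behaviour described in the comment. */
final class RunDataSketch {
    private RefCountedResource reader;

    /** Takes its own reference; the caller still has to release the one it holds. */
    void setReader(RefCountedResource newReader) {
        if (newReader != null) {
            newReader.incRef();
        }
        if (reader != null) {
            reader.decRef();
        }
        reader = newReader;
    }

    void close() {
        setReader(null);
    }
}

final class RefCountDemo {
    /** Returns the references still outstanding after the holder has been closed. */
    static int remainingRefs(boolean callerReleases) {
        RefCountedResource resource = new RefCountedResource();
        RunDataSketch data = new RunDataSketch();
        data.setReader(resource);
        if (callerReleases) {
            resource.decRef(); // the step that is easy to forget
        }
        data.close();
        return resource.getRefCount();
    }

    public static void main(String[] args) {
        System.out.println("caller releases: " + remainingRefs(true));
        System.out.println("caller forgets:  " + remainingRefs(false));
    }
}
```

If the real setters keep this behaviour, a one-line note in their javadocs 
stating that the caller remains responsible for its own decRef()/close() would 
cover the case shown by remainingRefs(false).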

Looks very good. Now with FacetSource we can generate facets for whichever 
case we want to test (dense hierarchies, Zipfian distributions, ...)
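
As an illustration of the Zipfian case, here is a rough sketch of how category 
ordinals could be drawn so that a few categories are very frequent and most are 
rare. ZipfCategorySampler is a hypothetical helper, not part of the patch and 
not tied to the FacetSource API; the exponent, seed, and category count are 
assumed parameters chosen for the example.

```java
import java.util.Random;

/** Hypothetical sampler: draws category ordinals with Zipf-like frequencies. */
final class ZipfCategorySampler {
    private final double[] cdf; // cumulative probabilities, cdf[last] == 1.0
    private final Random random;

    ZipfCategorySampler(int numCategories, double exponent, long seed) {
        if (numCategories <= 0) {
            throw new IllegalArgumentException("numCategories must be positive");
        }
        cdf = new double[numCategories];
        double sum = 0.0;
        for (int i = 0; i < numCategories; i++) {
            // Rank (i + 1) gets weight 1 / rank^exponent.
            sum += 1.0 / Math.pow(i + 1, exponent);
            cdf[i] = sum;
        }
        for (int i = 0; i < numCategories; i++) {
            cdf[i] /= sum;
        }
        cdf[numCategories - 1] = 1.0; // guard against rounding error
        random = new Random(seed);
    }

    /** Returns an ordinal in [0, numCategories); lower ordinals are more likely. */
    int next() {
        double u = random.nextDouble();
        int lo = 0;
        int hi = cdf.length - 1;
        // Binary search for the first index whose cumulative probability is >= u.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Counts how often each ordinal is drawn over the given number of samples. */
    static int[] histogram(int numCategories, double exponent, long seed, int samples) {
        ZipfCategorySampler sampler = new ZipfCategorySampler(numCategories, exponent, seed);
        int[] counts = new int[numCategories];
        for (int i = 0; i < samples; i++) {
            counts[sampler.next()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = histogram(10, 1.0, 42L, 10_000);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("category " + i + ": " + counts[i]);
        }
    }
}
```

A facet source built along these lines would map each ordinal to a category 
path; how that mapping should look for the benchmark is left open here.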
                
> Facet benchmarking
> ------------------
>
>                 Key: LUCENE-3262
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3262
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/benchmark, modules/facet
>            Reporter: Shai Erera
>            Assignee: Doron Cohen
>         Attachments: CorpusGenerator.java, LUCENE-3262.patch, 
> TestPerformanceHack.java
>
>
> A spin-off from LUCENE-3079. We should define a few benchmarks for faceting 
> scenarios, so we can evaluate the new faceting module as well as any 
> improvement we'd like to consider in the future (such as cutting over to 
> docvalues, implementing FST-based caches etc.).
> Toke attached a preliminary test case to LUCENE-3079, so I'll attach it here 
> as a starting point.
> We've also done some preliminary work on extending Benchmark for faceting, so 
> I'll attach it here as well.
> We should perhaps create a Wiki page where we clearly describe the benchmark 
> scenarios, then include results of 'default settings' and 'optimized 
> settings', or something like that.
