[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722338#comment-13722338
 ] 

Michael McCandless commented on LUCENE-4985:


Could you post a patch with --show-copies-as-adds?  (The current patch isn't 
easily applied since there were svn mvs involved...).  Thanks.

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722370#comment-13722370
 ] 

Michael McCandless commented on LUCENE-4985:


This is a nice cleanup!

It's still hard to mix all three kinds of facet requests?  E.g. I
think it's realistic for an app to use SSDV for flat fields (less RAM
usage than taxo, important if there are lots of ords), range for
volatile numeric fields (e.g. time delta based), and taxo for
hierarchies.

It seems like we could have a FacetsAccumulator.create that took both
SSDVReaderState and TaxoReader and created the right
FacetsAccumulator ... and I guess we'd need a SSDVFacetRequest.

Or I guess I can just create the directly MultiFacetsAccumulator
myself ... FA.create is just sugar.

This all can wait for a follow-on issue ... these improvements are
already great.

Should we move MultiFacetsAccumulator somewhere else (out of .range
package)?  It's more generic now?

bq. FacetsAccumulator.create documents that you may receive ListFacetResult 
in a different order than you passed in, guaranteeing that all 
RangeFacetRequests come last.

Hmm, can we fix that?  (So that the order of the results matches the
order of the requests).

bq. Modified DrillSideways to take either TaxonomyReader or 
SortedSetDVReaderState because otherise it cannot be used with SortedSetDV 
facets. Mike, can you please review it?

Those changes look good!  I think we can now simplify
TestDrillSideways (previously it had to @Override
getDrillDown/SidewaysAccumulator to use sorted set)?


 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch, LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722375#comment-13722375
 ] 

Shai Erera commented on LUCENE-4985:


Adding State to .create() does not simplify life for an app I think, because 
someone (on the app side) will need to figure out if State should be null or 
not. I'm worried that users will end up creating State even if they don't need 
it?

And since MultiFacetAccumulator lets you wrap any accumulator yourself, I think 
it's fine that these are separate methods, as a first step.

I'm worried about adding SortedSetDVFacetRequest, because unlike 
Count/SumScore/SumIntAssociation, this request is solely about the underlying 
source? And it also implies only counting ...

bq. Should we move MultiFacetsAccumulator somewhere else

You're right! It was left there by mistake because I renamed 
RangeAccumulatorWrapper. Will move.

{quote}
Hmm, can we fix that? (So that the order of the results matches the
order of the requests).
{quote}

I don't know how important it is ... none of our tests depend on it, and it's 
not clear to me how to fix it at all. FA.create() is a factory method. If it 
returns a single Accumulator, then it happens already (order is maintained). 
MultiFacetAccum loses the order. Maybe if we passed it the list of facet 
requests it could re-order them after accumulation, but I don't know how 
important it is... an app can put the ListFacetResult in a Map, and do 
lookups? Also, as a generic MultiFA, it's not easier to determine from which FA 
a source FacetRequest came?

bq. I think we can now simplify TestDrillSideways

You're right. Done.

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch, LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722377#comment-13722377
 ] 

Michael McCandless commented on LUCENE-4985:


{quote}
I don't know how important it is ... none of our tests depend on it, and it's 
not clear to me how to fix it at all. FA.create() is a factory method. If it 
returns a single Accumulator, then it happens already (order is maintained). 
MultiFacetAccum loses the order. Maybe if we passed it the list of facet 
requests it could re-order them after accumulation, but I don't know how 
important it is... an app can put the ListFacetResult in a Map, and do 
lookups? Also, as a generic MultiFA, it's not easier to determine from which FA 
a source FacetRequest came?
{quote}

OK ...

But, I think we should not document that range facet requests come last?  
Let's leave it defined as undefined?  Maybe we should return Collection not 
List?

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722381#comment-13722381
 ] 

Shai Erera commented on LUCENE-4985:


bq. But, I think we should not document that range facet requests come last?

Ok I will remove that comment. As soon as we add more accumulators, this 
comment is not important anyway.

bq. Maybe we should return Collection not List?

Why? I prefer that we don't change that since that will change tests. Many of 
the tests do results.get(idx).
If we don't need to, let's not complicate the users? If an app does pass the 
requests in known order, it shouldn't suffer.
It's only Multi that loses order.

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722387#comment-13722387
 ] 

Michael McCandless commented on LUCENE-4985:


I just think it's a dangerous API if sometimes the order matches and sometimes 
it doesn't ... but we can pursue this separately ...

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722430#comment-13722430
 ] 

ASF subversion and git services commented on LUCENE-4985:
-

Commit 1508043 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1508043 ]

LUCENE-4985: Make it easier to mix different kinds of FacetRequests

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722436#comment-13722436
 ] 

ASF subversion and git services commented on LUCENE-4985:
-

Commit 1508046 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1508046 ]

LUCENE-4985: Make it easier to mix different kinds of FacetRequests

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5

 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-28 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721923#comment-13721923
 ] 

Shai Erera commented on LUCENE-4985:


Didn't expect I'd run into it so quickly, but there are issues with adding 
createFacetsAccumulator to FacetRequest. Here are some details:

* RangeAccumulator only requires ListFacetRequest
* TaxonomyFacetsAccumulator requires: ListFacetRequest, FacetIndexingParams, 
TaxonomyReader and IndexReader
* SortedSetDVAccumulator requires: ListFacetRequest, SortedSetDVReaderState

At first I made:

* FacetsAccumulator.create(FacetSearchParams, TaxonomyReader, IndexReader)
** FacetSearchParams covers ListFacetRequest and FacetIndexingParams
* FacetRequest.createFacestAccumulator(FacetSearchParams, TxonomyReader, 
IndexReader).

But this is problematic on several fronts:

* A FacetRequest does not need ListFacetRequest in createFacetsAccumulator(), 
only FacetIndexingParams
* Not all requests need FacetIndexingParams, TaxonomyReader and IndexReader
** While declaring parameters that aren't needed is not a big deal, it does 
mean that someone will need to pass e.g. taxoReader=null and hope for the best. 
Especially if someone uses only RangeAccumulator and SortedSetDVAccumulator.
* Some requests may need more than that, e.g. SortedSetDVReaderState.

So what's the best solution? It would be best if FacetRequest is 
self-descriptive so that apps can really just call FacetsAccumulator.create().

If we make FacetRequest take all the parameters it needs in the ctor, we sort 
of get that. A request is fully self-descriptive and FR.createFacetsAccumulator 
does not need any arguments. However, each {{new CountFacetRequest()}} will now 
need to pass many parameters, which is annoying and makes the code less 
readable, IMO.

I was thinking perhaps we can extend FacetSearchParams to include additional 
parameters. So ctor takes only the facet requests, and you can optionally set 
FacetIndexingParams, IndexReader, TaxonomyReader, SortedSetDVReaderState. There 
are minor issues that I think can be resolved easily:

* If in the future we'll create an accumulator which requires a different 
setting, we'll need to add that to FSP. That's easy.
* An app may not know which parameters to set on FSP in the first place -- sort 
of easy to fix (on app side) cause it will hit a hard exception, probably NPE 
or some sort, immediately.
** This does show a slight problem though, as you need to create your FRs and 
then remember to set the right parameters on FSP.

There is another problem with FR.createFacetsAccumulator -- the accumulators 
already take a ListFacetRequest when created, however at the single FR level, 
it doesn't know the global picture and cannot pass all the requests up front. 
I was thinking that perhaps it can return a FacetAccumulatorBuilder which has 
API for addFacetRequest() and build(). FacetAccumulator.create() will call 
FR.createFacetAccumulatorBuilder() and will group all the requests that return 
the same builder, and then call builder.build() to get the proper 
FacetAccumulator instance. This is an orthogonal problem BTW to the fact that 
this builder will also need to know e.g. the SortedSetDVReaderState up front...

So ... making FR completely self-descriptive by taking all the parameters 
needed to execute it solves most of the problems -- you have to resolve the 
needed parameters at ctor-time. But, it makes the code uglier. Moving 
parameters to FacetSearchParams keeps the FR ctors clean, but might be trappy 
for an app. I'll think about it more, but if someone has an idea how to tackle 
this ... don't be shy :).

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-28 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721944#comment-13721944
 ] 

Shai Erera commented on LUCENE-4985:


I discussed this with Gilad and perhaps the situation today is not that bad, 
and we can do away with few minor changes. Today one can already construct a 
FacetsAccumulator given the FacetRequests that he wishes to execute. 
RangeAccumulatorWrapper attempts to allow you to work with RangeAccumulator and 
FacetsAccumulator (and it has a TODO to work for SortedSet too). So maybe what 
we need is the following simple solution:

* Make FacetsAccumulator abstract, as outlined above, with implementations for 
Taxonomy, Range, SortedSet and Multi.
* Each of the accumulator impls take the needed parameters in the ctor:
** ListFacetRequest -- the requests that are relevant to *this* accumulator 
only.
** SortedSetReaderState for SortedSetDVAccumulator, 
TaxoReader/FacetIndexingParams etc. for TaxonomyFacetAccumulator.
* If you want to combine different requests, you construct the relevant 
accumulators and wrap w/ MultiFacetsAccumulator.
* Two choices w.r.t. taxonomy accumulators:
** Keep them as FacetsAggregator - don't change much code, but keeps some 
aggregators with irrelevant API, such as rollupValues which is not implemented 
by the associations aggregators (only relevant for hierarchical facets and 
counting)
** Convert all of them to TaxonomyFacetsAccumulator which implement some 
abstract method, so that TaxoFA drives the accumulation and implements the 
common logic of computing top-k over FacetArrays and labeling. This changes 
existing aggregators ...

Bottom line is, a FacetRequest documents which accumulator should be used, but 
does not declare a createAccumulator() method, which leads to all the issues 
I've described above. Its matching accumulator will also make sure that the 
list of requests passed to it are OK (much like is done today).

This has another advantage of keeping e.g. CountFacetRequest as-is, and support 
it by both a CountingTaxonomyFacetsAccumulator and SortedSetFacetsAccumulator.

I'll start with this route and see where it leads me.

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-28 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721946#comment-13721946
 ] 

Shai Erera commented on LUCENE-4985:


And I think that in this process I'll eliminate FacetSearchParams which today 
define only FacetIndexingParams (relevant to TaxonomyFacetsAccumulator only for 
now) and ListFacetRequest which will not be a single list anymore, but per 
accumulator. Really .. FacetsAccumulator is another form of FacetsCollector, 
only results are collected once and processed afterwords by each accumulator.

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests

2013-07-25 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13719637#comment-13719637
 ] 

Shai Erera commented on LUCENE-4985:


I have been thinking about how to achieve that .. here's a proposal:

* Make FacetsAccumulator abstract with following current impls:
** TaxonomyFacetsAccumulator, assumes that TaxoReader is needed, FacetArrays 
etc.
** SortedSetFacetsAccumulator, assumes that categories were indexed to a 
SortedSetDVField
** RangeFacetsAccumulator, for computing facet ranges on NumericDV
** MultiFacetsAccumulator allows chaining several ones (basically a generic 
version of RangeFacetsAccumulatorWrapper)

* Add to FacetRequest.createFacetsAccumulator()
** CountFacetRequest, Association**FacetRequest return TaxoFacetsAccumulator
** SortedSetCountFacetRequest returns SortedSetFA (and also verify that the 
given CategoryPath was actually indexed in a SortedSetDVField)
** RangeFacetRequest returns RangeFacetsAccumulator

This pretty much divides the FacetRequests into the source from which they 
read the facets information. Now we need to handle the different aggregation 
functions currently supported by the TaxoFacetAcc variants: counting, 
associations. TaxoFacetAcc will let you specify the FacetsAggregator:
* CountFacetRequest will set the aggregator to FastCounting (if possible) or 
just Counting.
* **AssociationFacetRequest will set the aggregator to the matching one
* Additional requests can set their own aggregator
* FacetsAggregator will need to implement equals() and hashCode()

Then we'll have FacetsAccumulator.create(ListFacetRequest) which creates the 
right accumulator:

* Group all requests that use the same FacetsAccumulator, so that all RangeFRs 
are grouped together, all TaxoFacetAcc requests together etc.
* For the TaxoFacetAcc requests, it groups them by their aggregator, so that:
** All CountingAggregators that read the same category list are grouped 
together, separate from ones that do counting yet on a different category list
** All AssociationAggregators are grouped together, by their function, list id 
etc.
* It then creates either a single accumulator, or MultiFacetAccumulator which 
chains the accumulate call

What do we gain -- it's easy for an app to create the right accumulator for a 
given list of requests. Today it needs to sort of do this logic on its own, 
which is sometimes impossible (e.g. if it's a component that doesn't know what 
it's given). Also, the requests are self-descriptive.

What do we lose -- today if one wants to count A, B and C using 
CachedOrdsCountingFacetsAggregator, it needs to override 
FacetsAccumulator.getAggregator(), once. With this change, he will need to do 
that for every CountFacetRequest he creates .. I think that's an OK tradeoff, 
given the situation today which makes apps' life tougher.

I think we'll also need to create an Aggregator (old FacetsAggregator) wrapper. 
It is still needed by StandardFacetsAccumulator, until we finish the cleanup of 
sampling, complements counting etc. I'll look into that too, perhaps it can be 
done separately in a different issue.

Now need to hope I took all the parameters into account, and won't hit a brick 
wall when trying to implement it :).

 Make it easier to mix different kinds of FacetRequests
 --

 Key: LUCENE-4985
 URL: https://issues.apache.org/jira/browse/LUCENE-4985
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Fix For: 5.0, 4.5


 Spinoff from LUCENE-4980, where we added a strange class called 
 RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the 
 FacetRequests into range and non-range, delegates to two accumulators for 
 each set, and then zips the results back together in order.
 Somehow we should generalize this class and make it work with 
 SortedSetDocValuesAccumulator as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org