[jira] [Commented] (SOLR-4212) Let facet queries hang off of pivots

Hoss Man (JIRA) Wed, 25 Mar 2015 16:39:07 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381038#comment-14381038
 ]


Hoss Man commented on SOLR-4212:
--------------------------------

I didn't get a chance to review as in depth as i would have liked, but here are 
some comments based on a quick skim, followed by some more detailed review of a 
few classes...

* it seems verbose and confusing to use "facet.range" and "facet.query" as the 
local params for this...
** Confusing: these local params refer to *tag names* where as people might 
think they can inline arbitrary query expressions / field names to compute 
ranges on
** verbose: why not just "ranges" and "queries" for the local param names, 
similar to how "stats" is already used?
* I'm not a big fan of the fact that this patch breaks the determinism of the 
order that the different types of facets are returned in -- It's probably not a 
blocker, but i suspect there _may_ be some clients (or perhaps even client 
libraries other then SolrJ) which will be broken by this.
* In the SolrJ PivotField class, this patch adds "NamedList<Object> 
getRanges()" and "NamedList<Integer> getQueryCounts()" methods.  We should 
really be consistent with the existing equivalent methods in QueryResponse...
** "Map<String,Integer>getFacetQuery()"
** "List<RangeFacet> getFacetRanges()"
* in PivotFacetValue: "// named list with objects because depending on how big 
the counts are we may get either a long or an int"
** FWIW: I had seriously never noticed until today that these facet counts had 
variable *type*
** why can't we just use "Number" instead of "Object" in this new code since 
that's the lavel all of the casting code that deals with this list seems to use 
anyway?
** doesn't this break the SolrJ code if/when it returns a Long? (see above new 
method "NamedList<Integer> getQueryCounts()")
* DateFacetProcessor
** this entire class should be marked deprecated
** maybe i'm missing something, but what exactly is the advantage of this 
subclassing RangeFacetProcessor? ... if they are sharing code it's not obvious 
to me, and if they aren't (intentionally) sharing code then this subclass 
relationship seems dangerous if/when future improvements to range faceting are 
made.
* FacetComponent
** why does doDistribRanges() still need to exist? why not just move that 
casting logic directly into RangeFacetAccumulator.mergeContributionFromShard 
like the rest of the code that use to be here and call it directly?
** now that these skethy "num()" functions are no longer private, can we please 
get some javadocs on them.
* PivotFacetProcessor
** unless i'm missunderstanding the usage, the way addPivotQueriesAndRanges 
(and removeUnwantedQueriesAndRanges) works means that every facet.query and 
facet.range param value (with all localparams) is going to be reparsed over and 
over and over again for every unique value in every pivot field -- just to 
check the "tag" values and see if it's one that should be computed for this 
pivot.
*** This seems really unneccessary -- why not parse each param once into a 
simple datastructure (isn't that what the new ParsedParams" class is designed 
for?), and then put them in a map by tag available fro mthe request context -- 
just like we did for the stats with StatsInfo.getStatsFieldsByTag(String) ?
*** in particular won't this slow down existing requests containing both 
facet.pivot and facet.range || facet.query) ... even if the later aren't tagged 
or hung off of the pivots at all? because they'll still get parsed over and 
over again won't they?
** this logic also seems to completely break instances of facet.query used w/o 
linking it to a face.tpivot
*** 
http://localhost:8983/solr/techproducts/select?q=*:*&rows=0&stats=true&facet.query=inStock:true&facet=true&facet.pivot=manu_id_s,inStock
*** {noformat}
java.lang.NullPointerException
        at 
org.apache.solr.handler.component.PivotFacetProcessor.removeUnwantedQueriesAndRanges(PivotFacetProcessor.java:399)
        at 
org.apache.solr.handler.component.PivotFacetProcessor.addPivotQueriesAndRanges(PivotFacetProcessor.java:371)
        at 
org.apache.solr.handler.component.PivotFacetProcessor.doPivots(PivotFacetProcessor.java:273)
        at 
org.apache.solr.handler.component.PivotFacetProcessor.processSingle(PivotFacetProcessor.java:194)
        at 
org.apache.solr.handler.component.PivotFacetProcessor.process(PivotFacetProcessor.java:135)
        at 
org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:121)
        at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:222)
        at 
{noformat}
** also broken: neither of these requests should result in the facet.query 
hanging off of the pivots, but because of how StringUtils.contains() is used 
they both do erroniously...{noformat}
http://localhost:8983/solr/techproducts/select?q=*:*&rows=0&stats=true&facet.query={!tag=ttt}inStock:true&facet=true&facet.pivot={!facet.query=t}manu_id_s,inStock
http://localhost:8983/solr/techproducts/select?q=*:*&rows=0&stats=true&facet.query={!tag=t}inStock:true&hang=&facet=true&facet.pivot={!facet.query=$hang}manu_id_s,inStock
{noformat}
* tests...
** a lot of code in the new test classes seems to have been copied verbatim 
from other existing tests -- in some cases this is fine, because the copied 
test logic has been modified to include new params+asserts of the new 
functionality -- but theres still a lot of redundent copy/past cruft w/o any 
logic changes
*** eg: DistributedFacetPivotQuerySmallTest lines 428-532 seem to be verbatim 
copied from DistributedFacetPivotSmallTest w/o any additions to test the 
racet.query logic, or even new negative-assertions that stra facet.queries are 
hung off by mistake (ie: to catch the bugs i mentioned above)
*** dito for DistributedFacetPivotRangeSmallTest
** there doesn't seem to be any new tests that show hanging both ranges & 
queries onto a pivot at the same time -- let alone combining it with the 
existing stats logic
** likewise i don't see any testing of using the same tag with multiple 
facet.query instances (or multiple facet.range instances) and confirming that 
both get hung off of the same pivot.



> Let facet queries hang off of pivots
> ------------------------------------
>
>                 Key: SOLR-4212
>                 URL: https://issues.apache.org/jira/browse/SOLR-4212
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>    Affects Versions: 4.0
>            Reporter: Steve Molloy
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 4.9, Trunk
>
>         Attachments: SOLR-4212-multiple-q.patch, SOLR-4212-multiple-q.patch, 
> SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, SOLR-4212.patch, 
> SOLR-4212.patch, SOLR-4212.patch, SOLR-6353-6686-4212.patch, 
> SOLR-6353-6686-4212.patch, patch-4212.txt
>
>
> Facet pivot provide hierarchical support for computing data used to populate 
> a treemap or similar visualization. TreeMaps usually offer users extra 
> information by applying an overlay color on top of the existing square sizes 
> based on hierarchical counts. This second count is based on user choices, 
> representing, usually with gradient, the proportion of the square that fits 
> the user's choices.
> The proposition is to use local parameters to specify facet query to apply 
> for pivot which matches a tag set on facet query. Parameter format would look 
> like:
> facet.pivot={!query=r1}category,manufacturer
> facet.query={!tag=r1}somequery
> facet.query={!tag=r1}somedate:[NOW-1YEAR TO NOW]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SOLR-4212) Let facet queries hang off of pivots

Reply via email to