[jira] [Commented] (SOLR-13759) Optimize Queries when query filtering by TRA router.field

Gus Heck (Jira) Sun, 12 Jan 2020 07:52:17 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013794#comment-17013794
 ]


Gus Heck commented on SOLR-13759:
---------------------------------

I've looked at your patch, here are my comments:
 # Generally the init() methd of HttpSolrCall is way to long to begin with. 
Everything inside the if block should be swapped out to a method called 
optimizeTraFilter or something of the sort. 
 # QueriesCollectorQueryVisitor is not very generic, and I don’t think it 
deserves it’s own class. I’d rather see this as a private inner class or 
anonymous class. Folks should not be seeing it in their ide suggestions etc… 
Also the name is long and hard to read anyway, so moving it down should allow 
us to use a shorter name or no name at all.
 # Similarly ResponseHelper doesn't look like it deserves a top level class. 
Making a new class at the top level should be reserved for cases where A) the 
code gets reused in at least 2 if not 3 other classes,  B) some data and logic 
should be encapsulated together, or C) there is a large coherent set of logic 
which is all closely related and including it in the main class would 
significantly pollute or distract from the main class in which it is used. In 
this case these can probably just be private instance methods in your test.
 # if (path.contains("select") && …. My first reaction is I don’t like any sort 
of contains(“foo”) on the path, because this would get triggered by a 
collection name such as “selected” etc. There’s logic similar to that for 
“update” later in the method but it requires / chars too which would be safer… 
 # Secondarily, though I haven’t thought too deeply about it yet, I don’t see 
good reason to limit this only to the case of the /select search handler. There 
would be no reason for any handler that accepts fq to pester Sub-collections in 
a TRA that are out of range. I’m not terribly concerned about cases where 
people add fq to parameters for any handlers that don’t normally accept fq. 
That falls in the “well… um... don’t do that?” category :). I think we are 
probably safe cuing off the presence of fq.
 # Finally, I know it came from my example, but {{QueryParser queryParser = new 
QueryParser("_text_", new StandardAnalyzer());}} makes me nervous. When I wrote 
it in the example I was thinking that we might have a way to use the parser 
already being created for the request, but given where this code lives that may 
be difficult. In it's current form this makes me nervous. If we can't find a 
better way we will need to ask ourselves if this cost in excess object creation 
is greater than the benefits of this filtering (testing/benchmarks would be 
good) It's possible that this will be a tradeoff in GC vs response time and not 
right for everyone. That may imply the need for a mechanism to turn this 
feature on/off
 # the isTRA method: I'd like to suggest a more future proof name: isTimeRouted 
- "TRA" is a specific notion and a special case of the DRA's which may have 
time dimensions. We might or might not ever extend this to the general case 
(depending on how beneficial this turns out to be) but not using the TRA 
acronym will make it read better if we do. 

> Optimize Queries when query filtering by TRA router.field
> ---------------------------------------------------------
>
>                 Key: SOLR-13759
>                 URL: https://issues.apache.org/jira/browse/SOLR-13759
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: mosh
>            Assignee: Gus Heck
>            Priority: Minor
>         Attachments: QueryVisitorExample.java, SOLR-13759.patch, 
> SOLR-13759.patch, SOLR-13759.patch, SOLR-13759.patch, SOLR-13759.patch, 
> image-2019-12-09-22-45-51-721.png
>
>
> We are currently testing TRA using Solr 7.7, having >300 shards in the alias, 
> with much growth in the coming months.
> The "hot" data(in our case, more recent) will be stored on stronger 
> nodes(SSD, more RAM, etc).
> A proposal of optimizing queries will be by filtering query by date range, by 
> that we will be able to querying the specific TRA collections taking 
> advantage of the TRA mechanism of partitioning data based on date.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (SOLR-13759) Optimize Queries when query filtering by TRA router.field

Reply via email to