Re: index selection when 0 or 1 property clause (or sort clause) matches

Alvaro Cabrerizo Tue, 16 May 2017 07:51:06 -0700

Hello,

Yes, there are some reasons:


   - One team works with all the assets under /nodeA/nodeB
   - The other team works only with a small subset: the assets under
   /nodeA/nodeB/nodeC
   - The two teams have different search requirements
   - Rebuilding the Asset index takes several days
   - Rebuilding deeperAsset takes a couple of hours, so merging the two would
   hurt both the size of deeperAsset and how quickly it can be modified

One option we are evaluating is to artificially lower the deeperAsset cost
(using the costPerEntry property) and have that team work with Lucene native
queries (any recommendation is welcome). Anyway, it is still not clear which
index is selected in case of a tie.
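
To make the tie concrete, here is a minimal Java sketch of the behaviour described in my first message (quoted below). It assumes the planner divides each definition's costPerEntry by the number of matched property and sort clauses, with the zero guard applied first. The class and method names are hypothetical and are not Oak's API; only the guard itself comes from the IndexPlanner snippet.

```java
public class CostFactorSketch {

    // Hypothetical helper, not part of Oak: applies the guard quoted from
    // IndexPlanner and divides an assumed base costPerEntry by the factor.
    static double costPerEntry(double baseCostPerEntry, int matchedProps, int matchedSorts) {
        int costPerEntryFactor = matchedProps + matchedSorts;
        // Guard from the thread: avoids a division by zero, but it also
        // makes "0 matches" indistinguishable from "1 match".
        if (costPerEntryFactor == 0) {
            costPerEntryFactor = 1;
        }
        return baseCostPerEntry / costPerEntryFactor;
    }

    public static void main(String[] args) {
        // Same base cost: zero matches and one match tie.
        System.out.println(costPerEntry(1.0, 0, 0)); // prints 1.0
        System.out.println(costPerEntry(1.0, 1, 0)); // prints 1.0
        // Only a second matching clause lowers the per-entry cost.
        System.out.println(costPerEntry(1.0, 2, 0)); // prints 0.5
        // Lowering the base value on one definition also breaks the tie.
        System.out.println(costPerEntry(0.5, 1, 0)); // prints 0.5
    }
}
```

Under these assumptions, an index matching one clause only beats an index matching none if its base costPerEntry is set lower, which is the workaround mentioned above.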

Regards.


On Tue, May 16, 2017 at 4:36 PM, Chetan Mehrotra <[email protected]>
wrote:

> Any reason for having separate definitions for the same nodetype?
> Chetan Mehrotra
>
>
> On Tue, May 16, 2017 at 7:52 PM, Alvaro Cabrerizo <[email protected]>
> wrote:
> > Hello,
> >
> > Actually, it is OAK-5449. Sorry, I hadn't seen it.
> >
> > On the other hand, having these two definitions under oak:index (just a
> > sketch):
> >
> >
> >    - Asset
> >       - evaluatePathRestrictions="true"
> >       - type="lucene"
> >       - includedPath="/nodeA/nodeB"
> >       - indexRules
> >          - my:Asset
> >             - properties
> >                - name="my:title"
> >    - deeperAsset
> >       - evaluatePathRestrictions="true"
> >       - type="lucene"
> >       - includedPath="/nodeA/nodeB/includedC"
> >       - indexRules
> >          - my:Asset
> >             - properties
> >                - name="my:description"
> >
> > made the system return a cost of 1001 (for both indexes) when performing
> > these kinds of queries:
> >
> >    - SELECT * FROM [my:Asset] AS s WHERE ISDESCENDANTNODE(s,'/nodeA/nodeB')
> >    - SELECT * FROM [my:Asset] AS s WHERE ISDESCENDANTNODE(s,'/nodeA/nodeB')
> >    AND s.[my:title]='title'
> >    - SELECT * FROM [my:Asset] AS s WHERE ISDESCENDANTNODE(s,'/nodeA/nodeB')
> >    AND s.[my:description]='description'
> >
> > Once the cost is assigned (equal for both indexes), it is not clear which
> > index will be selected.
> >
> > Regards.
> >
> > On Tue, May 16, 2017 at 3:50 PM, Chetan Mehrotra <[email protected]>
> > wrote:
> >
> >> This looks similar to OAK-5449 (not yet fixed). Can you post a sample
> >> index definition there, along with some use case details that lead to
> >> the ambiguity in index selection?
> >>
> >> In general, index selection should not involve multiple competing index
> >> definitions, so I am interested in knowing the setup details.
> >> Chetan Mehrotra
> >>
> >>
> >> On Tue, May 16, 2017 at 1:53 PM, Alvaro Cabrerizo <[email protected]>
> >> wrote:
> >> > Hello,
> >> >
> >> > I've been checking the code of the IndexPlanner (Apache Oak 1.4.1) and I
> >> > was surprised because the costPerEntryFactor remains 1 in both cases:
> >> >
> >> >    - when no indexed or sorted property matches any property clause or
> >> >    sort clause from the query
> >> >    - when exactly one indexed or sorted property matches a property
> >> >    clause or sort clause from the query
> >> >
> >> > This piece of code avoids a division by zero (see
> >> > org.apache.jackrabbit.oak.plugins.index.lucene.IndexPlanner, lines
> >> > 201-203):
> >> >
> >> >             if (costPerEntryFactor == 0){
> >> >                 costPerEntryFactor = 1;
> >> >             }
> >> >
> >> > It also prevents boosting indexes that match exactly one query clause
> >> > (in other words, it doesn't penalize indexes that don't match any
> >> > clause). I'm thinking about opening an issue, although that is for the
> >> > long term. For now, I would like to know which index is selected when
> >> > more than one has the same cost.
> >> >
> >> > Regards.
> >>
>
