> -----Original Message-----
> From: Branko Čibej [mailto:br...@xbc.nu] On Behalf Of Branko Cibej
> Sent: maandag 16 mei 2011 9:46
> To: dev@subversion.apache.org
> Subject: Re: SQLite and the LIKE operator
> 
> On 16.05.2011 09:38, Hyrum K Wright wrote:
> > 2011/5/16 Branko Čibej <br...@e-reka.si>:
> >> On 16.05.2011 03:13, Hyrum K Wright wrote:
> >>> Several places in wc_db we use the following pattern to select all
> >>> nodes with a common tree ancestor:
> >>>  WHERE wc_id = ?1 AND (local_relpath = ?2 OR local_relpath LIKE ?3
> ESCAPE '#')
> >>>
> >>> While this works, there was some concern about whether or not SQLite
> >>> was using the proper indicies when executing this query.  By examining
> >>> the output for 'EXPLAIN QUERY PLAN' on some of the relevant SELECT
> >>> statements, I believe it does use the indicies as intended.
> >>>
> >>> However, I stumbled across an alternate implementation which I believe
> >>> has some merit.  Instead of the above clause, we could use:
> >>>  WHERE wc_id = ?1 AND substr(local_relpath, 1, length(?2)) = ?2
> >>>
> >>> This also avoids a table scan by making use of the indicies, but has
> >>> the advantage of not having to compute a separate parameter for the
> >>> LIKE clause in C.  It returns the same results, and has the benefit of
> >>> being a bit more clear to SQLite what we're trying to accomplish.  I'm
> >>> tempted to switch our code to using this new format, but wanted some
> >>> comments first.  I have not yet run extensive timing or other analysis
> >>> on the performance.
> >>>
> >>> Thoughts?
> >>>
> >>> -Hyrum
> >> Can't be right. I'm assuming the first query works correctly iff:
> >>    ?2 = foo
> >>    ?3 = foo/%
> >>
> >> and returns 'foo' and all its subtree.
> >>
> >> The second query can't return the same results; if ?2=foo, it'll match
> >> foobar, which is not foo's child; if ?2=foo/, it won't return foo.
> > That's what I get for writing mail at 3am.
> >
> > I believe the following would fix this:
> >  WHERE wc_id = ?1 AND (local_relpath = ?2 OR substr(local_relpath, 1,
> > length(?2 + 1)) = ?2 || '/')
> >
> > -Hyrum
> 
> That query used to be:
> 
>     local_relpath=?2 OR local_relpath LIKE ?2 || '/%'
> 
> but, for obvious reasons, that was a potential bug since literal % were
> not escaped. Your latest proposal is broken, but I'm sure you'll find
> the bug eventually. :)
> 
> Whether substr can be faster than LIKE -- I have no idea.

I think you were talking about the increment problem, but there is another 
problem. Local_relpath "" would give '' || '/', which is the invalid relpath 
'/'.

        Bert

Reply via email to