Re: QueryParser - proposed change may break existing queries.

Steve Rowe Thu, 17 Sep 2020 12:10:14 -0700

You could avoid (some of?) these problems by supporting /(?i)foo/ instead of 
/foo/i
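
For anyone less familiar with the inline-flag notation, here is a small sketch using Python's `re` module, which accepts `(?i)` at the start of a pattern. It only illustrates why the flag no longer has to trail the closing slash. Whether Lucene's own regex syntax would accept such a prefix is an assumption, not something established in this thread.

```python
import re

# Inline-flag form: the case-insensitivity marker lives inside the pattern,
# so nothing has to follow the closing delimiter of /(?i)foo/.
inline = re.compile(r"(?i)foo")

# Trailing-flag form (/foo/i), modelled here by passing the flag separately.
trailing = re.compile(r"foo", re.IGNORECASE)

# Both forms accept the same inputs.
for text in ("foo", "Foo", "FOO"):
    assert inline.fullmatch(text) and trailing.fullmatch(text)

# Without any flag, only the exact case matches.
plain = re.compile(r"foo")
print([bool(plain.fullmatch(t)) for t in ("foo", "Foo", "FOO")])
# prints [True, False, False]
```

With the flag inside the delimiters, a query such as /(?i)foo/iphone would raise the same question as /foo/bar does today, but no new one about whether the "i" is a flag.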


--
Steve

> On Sep 17, 2020, at 1:55 PM, Gus Heck <[email protected]> wrote:
> 
> And as I understand it, current behavior is the silent misinterpretation. To 
> me, the failure to require a space after the regex (and either not become a 
> regex in that case or complain about invalid regex) might be considered a 
> bug...
> 
> On Thu, Sep 17, 2020 at 9:30 AM Mark Harwood <[email protected]> wrote:
> I think the decision comes down to choosing between silent 
> (mis)interpretations of ambiguous queries and noisy failures.
> 
> On Thu, Sep 17, 2020 at 1:55 PM Uwe Schindler <[email protected]> wrote:
> Hi,
> 
>  
> 
> My idea would have been not to be too strict and instead only detect it as a 
> regex if it's separated. So /foo/bar and /foo/iphone would both go through 
> without being treated as a regex; only ‘/foo/ bar’ or ‘/foo/i phone’ would 
> interpret the first token as a regex.
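
A minimal Python sketch of this relaxed rule, to make the examples concrete. The names `REGEX_CHUNK` and `parse_relaxed` are hypothetical, the sketch splits on whitespace only (so a regex body may not contain spaces), and it is not the Lucene grammar.

```python
import re

# Assumed rule: a whitespace-delimited chunk is a regex only if it is
# exactly /body/ or /body/i. Anything else stays an ordinary term.
REGEX_CHUNK = re.compile(r"/((?:[^/\\]|\\.)+)/(i?)")

def parse_relaxed(query):
    """Return (kind, text, case_insensitive) for each chunk of the query."""
    tokens = []
    for chunk in query.split():
        m = REGEX_CHUNK.fullmatch(chunk)
        if m:
            tokens.append(("regex", m.group(1), m.group(2) == "i"))
        else:
            # No error is raised: the chunk is passed through as a term.
            tokens.append(("term", chunk, False))
    return tokens

for q in ("/foo/bar", "/foo/iphone", "/foo/ bar", "/foo/i phone"):
    print(q, "->", parse_relaxed(q))
```

Under these assumptions the first two queries come back as single terms, and only the last two produce a regex token followed by a term.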
> 
>  
> 
> That’s just my idea, not sure if it makes sense to have this relaxed parsing. 
> I was always very skeptical of adding the regexes, as it breaks many queries. 
> Now it’s even more.
> 
>  
> 
> Uwe
> 
>  
> 
> -----
> 
> Uwe Schindler
> 
> Achterdiek 19, D-28357 Bremen
> 
> https://www.thetaphi.de
> eMail: [email protected]
>  
> 
> From: Mark Harwood <[email protected]> 
> Sent: Wednesday, September 16, 2020 6:45 PM
> To: [email protected]
> Subject: Re: QueryParser - proposed change may break existing queries.
> 
>  
> 
> The strictness I was thinking of adding was to make each of the following an 
> error:
> 
>  /foo/bar
> 
>  /foo//bar/
> 
>  /foo/iphone 
> 
>  /foo/AND x
> 
>  
> 
> These would be allowed:
> 
>  /foo/i bar
> 
>  (/foo/ OR /bar/)
> 
>  (/foo/ OR /bar/i)
> 
>  /foo/^2
> 
>  /foo/i^2
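
The two lists above can be checked against a small Python sketch of the strict rule. It assumes that any unescaped "/" opens a regex and that, after the closing "/" and an optional "i", only whitespace, ")", "^" or end of input may follow. The names `check_strict`, `ALLOWED_AFTER` and `AmbiguousRegexError` are hypothetical. This is not the actual parser change.

```python
# Characters assumed to be acceptable directly after /regex/ or /regex/i.
ALLOWED_AFTER = set(" \t)^")

class AmbiguousRegexError(ValueError):
    """Raised when a regex is not cleanly separated from what follows."""

def check_strict(query):
    """Return [(body, case_insensitive), ...] or raise AmbiguousRegexError."""
    found = []
    i, n = 0, len(query)
    while i < n:
        ch = query[i]
        if ch == "\\":          # escaped character outside a regex: skip it
            i += 2
            continue
        if ch != "/":
            i += 1
            continue
        # Scan forward to the closing unescaped slash.
        j = i + 1
        while j < n and query[j] != "/":
            j += 2 if query[j] == "\\" else 1
        if j >= n:
            raise AmbiguousRegexError(f"unterminated regex at offset {i}")
        body = query[i + 1:j]
        k = j + 1
        flag = k < n and query[k] == "i"
        if flag:
            k += 1
        # Anything other than a separator right after the regex is an error.
        if k < n and query[k] not in ALLOWED_AFTER:
            raise AmbiguousRegexError(f"unexpected {query[k]!r} at offset {k}")
        found.append((body, flag))
        i = k
    return found

for q in ("/foo/bar", "/foo//bar/", "/foo/iphone", "/foo/AND x",
          "/foo/i bar", "(/foo/ OR /bar/i)", "/foo/i^2"):
    try:
        print(q, "->", check_strict(q))
    except AmbiguousRegexError as e:
        print(q, "-> error:", e)
```

In this sketch the first four queries raise and the rest are accepted, matching the two lists.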
> 
>  
> 
>  
> 
> 
> 
> 
> On 16 Sep 2020, at 12:00, Uwe Schindler <[email protected]> wrote:
> 
> 
> 
> In my opinion, the proposed syntax change should require whitespace or some 
> other separator char after the regex “i” parameter.
> 
>  
> 
> Uwe
> 
>  
> 
> -----
> 
> Uwe Schindler
> 
> Achterdiek 19, D-28357 Bremen
> 
> https://www.thetaphi.de
> eMail: [email protected]
>  
> 
> From: Mark Harwood <[email protected]> 
> Sent: Wednesday, September 16, 2020 11:04 AM
> To: [email protected]
> Subject: QueryParser - proposed change may break existing queries.
> 
>  
> 
> In LUCENE-9445 we'd like to add a case-insensitive option to regex queries in 
> the query parser, of the form: 
> 
>    /Foo/i
> 
>  
> 
> However, today people can search for:
> 
>  
> 
>    /foo.com/index.html
>  
> 
> and not get an error. The searcher may think this is a query for a URL but 
> it's actually parsed as a regex "foo.com" ORed with a term query.
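
A toy Python model of the split described above, to show why the silent interpretation matters. The helper `split_like_today` is hypothetical and far simpler than the real grammar. It only reproduces the outcome stated in this message: a leading /.../ becomes a regex and the remainder becomes a term.

```python
import re

def split_like_today(query):
    """Toy model: leading /.../ is taken as a regex, the rest as a term."""
    m = re.fullmatch(r"/([^/]*)/(\S*)", query)
    if m is None:
        return None
    return {"regex": m.group(1), "term": m.group(2)}

parts = split_like_today("/foo.com/index.html")
print(parts)  # {'regex': 'foo.com', 'term': 'index.html'}

# Read as a regex, the dot is a wildcard, so unrelated terms match as well.
print(bool(re.fullmatch(parts["regex"], "fooXcom")))  # True
```

So the user who meant a URL path ends up with a wildcard match on one clause and a bare filename on the other, with no error to signal it.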
> 
>  
> 
> I'd like to draw attention to this proposed change in behaviour because I 
> think it could affect many existing systems. Arguably it may be a positive in 
> drawing attention to a number of existing silent failures (unescaped searches 
> for urls or file paths) but equally could be seen as a negative breaking 
> change by some.
> 
>  
> 
> What is our BWC policy for changes to query parser?
> 
> Do the benefits of the proposed new regex feature outweigh the costs of the 
> breakages in your view?
> 
>  
> 
> https://issues.apache.org/jira/browse/LUCENE-9445?focusedCommentId=17196793&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17196793
>  
>  
> 
>  
> 
> 
> 
> -- 
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
