And as I understand it, the current behavior is silent misinterpretation.
To me, the failure to require a space after the regex (and, when the
space is missing, to either not treat the text as a regex or complain
about an invalid regex) might be considered a bug...
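
To make the rule concrete, here is a rough Python sketch of the strict
check discussed further down this thread. It is not the Lucene parser
itself. The names REGEX_LITERAL, SEPARATORS and is_strictly_valid are
made up for illustration, and the separator set (whitespace, ")" and
"^") is only an assumption inferred from the examples quoted below.

```python
import re

# A regex literal: /body/ with an optional trailing "i" flag.
# Escaped characters (such as \/) are allowed inside the body.
REGEX_LITERAL = re.compile(r"/(?:\\.|[^/\\])*/i?")

# Assumption: characters that may directly follow a regex literal.
SEPARATORS = set(" \t\r\n)^")


def is_strictly_valid(query):
    """Return True if every /.../ literal in query is followed by a
    separator or by the end of the input."""
    pos = 0
    while True:
        match = REGEX_LITERAL.search(query, pos)
        if match is None:
            return True
        end = match.end()
        if end != len(query) and query[end] not in SEPARATORS:
            return False  # e.g. /foo/bar or /foo/iphone
        pos = end


for q in ["/foo/bar", "/foo/iphone", "/foo/i bar", "/foo.com/index.html"]:
    print(q, "->", "ok" if is_strictly_valid(q) else "error")
```

Under Uwe's relaxed variant, the "return False" branch would instead
fall back to treating the text as an ordinary term rather than raising
an error.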

On Thu, Sep 17, 2020 at 9:30 AM Mark Harwood <[email protected]> wrote:

> I think the decision comes down to choosing between silent
> (mis)interpretations of ambiguous queries and noisy failures.
>
> On Thu, Sep 17, 2020 at 1:55 PM Uwe Schindler <[email protected]> wrote:
>
>> Hi,
>>
>>
>>
>> My idea would have been not to be too strict and instead only detect it
>> as a regex if it's separated. So /foo/bar and /foo/iphone would both go
>> through, ignoring the regex; only ‘/foo/ bar’ or ‘/foo/i phone’ would
>> interpret the first token as a regex.
>>
>>
>>
>> That’s just my idea, not sure if it makes sense to have this relaxed
>> parsing. I was always very skeptical of adding the regexes, as it breaks
>> many queries. Now it breaks even more.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: [email protected]
>>
>>
>>
>> *From:* Mark Harwood <[email protected]>
>> *Sent:* Wednesday, September 16, 2020 6:45 PM
>> *To:* [email protected]
>> *Subject:* Re: QueryParser - proposed change may break existing queries.
>>
>>
>>
>> The strictness I was thinking of adding was to make all of the following
>> error:
>>
>>  /foo/bar
>>
>>  /foo//bar/
>>
>>  /foo/iphone
>>
>>  /foo/AND x
>>
>>
>>
>> These would be allowed:
>>
>>  /foo/i bar
>>
>>  (/foo/ OR /bar/)
>>
>>  (/foo/ OR /bar/i)
>>
>>  /foo/^2
>>
>>  /foo/i^2
>>
>>
>>
>>
>>
>>
>>
>> On 16 Sep 2020, at 12:00, Uwe Schindler <[email protected]> wrote:
>>
>>
>> In my opinion, the proposed syntax change should require whitespace
>> or some other separator char after the regex “i” parameter.
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: [email protected]
>>
>>
>>
>> *From:* Mark Harwood <[email protected]>
>> *Sent:* Wednesday, September 16, 2020 11:04 AM
>> *To:* [email protected]
>> *Subject:* QueryParser - proposed change may break existing queries.
>>
>>
>>
>> In LUCENE-9445 we'd like to add a case-insensitive option to regex
>> queries in the query parser of the form:
>>
>>    /Foo/i
>>
>>
>>
>> However, today people can search for:
>>
>>
>>
>>    /foo.com/index.html
>>
>>
>>
>> and not get an error. The searcher may think this is a query for a URL
>> but it's actually parsed as a regex "foo.com" ORed with a term query.
>>
>>
>>
>> I'd like to draw attention to this proposed change in behaviour because I
>> think it could affect many existing systems. Arguably it may be a positive
>> in drawing attention to a number of existing silent failures (unescaped
>> searches for urls or file paths) but equally could be seen as a negative
>> breaking change by some.
>>
>>
>>
>> What is our BWC policy for changes to query parser?
>>
>> Do the benefits of the proposed new regex feature outweigh the costs of
>> the breakages in your view?
>>
>>
>>
>>
>> https://issues.apache.org/jira/browse/LUCENE-9445?focusedCommentId=17196793&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17196793
>>
>>
>>
>>
>>
>>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
