Re: RFC unifying phrase search behaviour

Tibor Simko Tue, 25 Feb 2014 12:16:10 -0800

On Mon, 24 Feb 2014, Alexander Wagner wrote:
>>     245:'some phrase'
>>     245:"some phrase"
>>
>> so that single-quoted and double-quoted phrase queries would always
>> return the same result.
>
> Which is then an exact match, right? So to get '' matches one would
> use "*bla*", right?


No, actually, not an exact match, but a word pair match.  Here is a
possibly clearer example.  Consider the following record:

   245 $a The Kreutzer Sonata

When users type:

   245:'Kreutzer Sonata'
   245:"Kreutzer Sonata"

then the record would be returned.

When users type:

   245:'reutzer son'
   245:"reutzer son"

then the record won't be returned; people would have to type:

   245:/reutzer son/

in order to get a substring match.

In summary:

  +-----------------------+-------------------+--------------------+
  | QUERY                 | CURRENT BEHAVIOUR | PROPOSED BEHAVIOUR |
  +-----------------------+-------------------+--------------------+
  | 245:'Kreutzer Sonata' | hit               | hit                |
  | 245:"Kreutzer Sonata" | miss              | hit                |
  | 245:'reutzer son'     | hit               | miss               |
  | 245:"reutzer son"     | miss              | miss               |
  | 245:/reutzer son/     | hit               | hit                |
  +-----------------------+-------------------+--------------------+
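
The PROPOSED BEHAVIOUR column can be sketched roughly as follows.  This is
only an illustration, not Invenio code: the function names, the tokenizer
(lowercased alphanumeric runs) and the case-insensitive regular expression
used for the /.../ form are all assumptions made for the example.

```python
import re

def words(text):
    # Hypothetical tokenizer: lowercase, then keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_match(query, value):
    # Proposed behaviour for both '...' and "...": the query words must
    # appear in the field value as consecutive whole words.
    q, v = words(query), words(value)
    if not q:
        return False
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))

def substring_match(pattern, value):
    # The /.../ form, sketched here as a case-insensitive regexp search.
    return re.search(pattern, value, re.IGNORECASE) is not None

title = "The Kreutzer Sonata"
print(phrase_match("Kreutzer Sonata", title))   # True  (hit)
print(phrase_match("reutzer son", title))       # False (miss)
print(substring_match("reutzer son", title))    # True  (hit)
```

With such a scheme the quote style no longer matters; only the /.../ form
looks inside words.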

Note that the proposed behaviour is already in place for some logical
indexes, such as "title", in the Invenio v1.1 release series and above.
The current RFC proposes to widen its scope to cover all indexes,
including physical MARC queries.

> If I get it correctly, it breaks almost all our bean counting. IDs are
> something like
>
>   sid:(DE-HGF)1
>
> or
>
>   sid:(DE-HGF)11
>
> if you map "sid:(DE-HGF)1" to the old 'sid:(DE-HGF)1' it matches also
> "sid:(DE-HGF)11", which is wrong and not intended.

Nope, it would not be mapped that way, see above.  The ID matching would
remain safe.
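
To illustrate why whole-word matching keeps such IDs apart, here is a small
sketch.  It is not Invenio code; the tokenizer and the helper names are
hypothetical and were chosen only for this example.

```python
import re

def tokens(text):
    # Hypothetical tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())

def word_sequence_match(query, value):
    # The query tokens must occur as consecutive whole tokens in the value.
    q, v = tokens(query), tokens(value)
    return bool(q) and any(
        v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1)
    )

ids = ["(DE-HGF)1", "(DE-HGF)11"]

# Whole-word matching: the token "1" is not the token "11".
print([i for i in ids if word_sequence_match("(DE-HGF)1", i)])
# ['(DE-HGF)1']

# Plain substring matching would wrongly pick up both IDs.
print([i for i in ids if "(DE-HGF)1" in i])
# ['(DE-HGF)1', '(DE-HGF)11']
```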

Best regards
--
Tibor Simko
