Hi:
For the forthcoming release of Invenio v1.2, we'd like to change one
long-standing feature related to phrase queries.
People don't easily distinguish between the following queries:
title:'some phrase'
title:"some phrase"
which is why in 2012 we introduced a configuration option that lets
you specify, for each index, whether the difference between
single-quoted and double-quoted expressions should be respected.
By default, we removed the difference in the most exposed indexes
such as global, title, and abstract, but kept it for MARC queries in
order not to break existing cataloguing workflows.
We'd now like to remove the difference in all indexes by default,
including MARC queries like:
245:'some phrase'
245:"some phrase"
so that single-quoted and double-quoted phrase queries would always
return the same result.
What this change means for you:
1. End users can use either single-quoted or double-quoted queries to
express a phrase search in any index. The two forms would behave the same.
2. The phrase search would be done by default via word pair matching,
unless indexes are tokenised in a special manner (e.g. exact author
name) or unless users search inside physical MARC tags (when no word
pair index exists).
3. If you have relied on "partial phrase matching", please switch to
regular expression queries like:
245:/some phrase/
245:/[[:blank:]]some phrase[[:blank:]]/
4. If you have relied on "exact phrase matching", please switch to
regular expression queries like:
245:/^Exact title.$/
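
To make the difference between these matching modes concrete, here is a
minimal Python sketch. It is an illustration only, not Invenio code: the
helpers `word_pairs` and `pair_match` are hypothetical, the sample titles
are made up, and since Python's `re` module has no POSIX `[[:blank:]]`
class, `[ \t]` stands in for it.

```python
import re


def word_pairs(text):
    """Return the set of adjacent lowercased word pairs found in text."""
    words = re.findall(r"\w+", text.lower())
    return set(zip(words, words[1:]))


def pair_match(phrase, value):
    """Hypothetical word pair matching: every adjacent word pair of the
    phrase must also occur as an adjacent pair in the field value."""
    return word_pairs(phrase) <= word_pairs(value)


# Made-up field values standing in for MARC 245 (title) contents.
titles = [
    "Exact title.",
    "Not an exact title.",
    "awesome phrasebook",
    "on some phrase in physics",
]

# Phrase search via word pairs: matches whole words only.
by_pairs = [t for t in titles if pair_match("some phrase", t)]

# Partial phrase matching, like 245:/some phrase/ (plain substring).
partial = [t for t in titles if re.search(r"some phrase", t)]

# Partial phrase matching delimited by blanks,
# like 245:/[[:blank:]]some phrase[[:blank:]]/
bounded = [t for t in titles if re.search(r"[ \t]some phrase[ \t]", t)]

# Exact phrase matching, like 245:/^Exact title.$/
exact = [t for t in titles if re.search(r"^Exact title.$", t)]

print(by_pairs)
print(partial)
print(bounded)
print(exact)
```

In this sketch the plain substring pattern also catches "awesome
phrasebook", whereas the blank-delimited pattern and the word pair match
do not. The anchored pattern selects only the value that equals the whole
title.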
Please holler if this change could badly break some of your workflows.
References:
[1] http://invenio-demo.cern.ch/help/search-guide#words-vs-phrases
[2] http://invenio-software.org/ticket/137
[3] https://github.com/inveniosoftware/invenio/blob/master/modules/websearch/lib/search_engine_config.py#L33
Best regards
--
Tibor Simko