#261: Parentheses in search don't work with alternate search terms
------------------------+---------------------------------------------------
Reporter: tbrooks | Owner: tbrooks
Type: defect | Status: in_merge
Priority: critical | Milestone:
Component: WebSearch | Version:
Resolution: | Keywords: INSPIRE Syntax Oct
------------------------+---------------------------------------------------
Comment (by simko):
1) WRT current behaviour of `find a brooks, travis`, this query gets
expanded by the SPIRES syntax compatibility parser into:
{{{
['+', 'author:"brooks, travis*"', '|', 'exactauthor:"brooks, t"', '|',
'exactauthor:"brooks, tr"', '|', 'exactauthor:"brooks, tra"', '|',
'exactauthor:"brooks, trav"', '|', 'exactauthor:"brooks, travi"']
}}}
I wonder how common the need is to iterate through all the various
`travi`, `trav`, `tra`, `tr` forms; maybe a simpler expansion like:
{{{
author:"brooks, travis*" OR exactauthor:"brooks, t"
}}}
would cover the most common use cases well already?
If we manage to simplify the query expansion, it would remove the
weirdest stuff like `brooks, tra` in passing. It would not fully solve the
longer-term task, of course: the internal parsing logic may still get
exposed for unsuccessful queries like `find a brooks, zelda`.
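To make the two expansions concrete, here is a minimal Python sketch. The helpers `expand_full()` and `expand_simple()` are hypothetical names written for illustration only, not the actual SPIRES syntax parser code; they just reproduce the unit lists discussed above.

```python
def expand_full(surname, first_name):
    """Current-style expansion (illustrative): the full-name prefix match
    plus one exactauthor term per proper prefix of the first name."""
    units = ['+', 'author:"%s, %s*"' % (surname, first_name)]
    for n in range(1, len(first_name)):
        units += ['|', 'exactauthor:"%s, %s"' % (surname, first_name[:n])]
    return units


def expand_simple(surname, first_name):
    """Proposed simpler expansion (illustrative): the full-name prefix
    match plus the first initial only."""
    return ['+', 'author:"%s, %s*"' % (surname, first_name),
            '|', 'exactauthor:"%s, %s"' % (surname, first_name[:1])]


if __name__ == "__main__":
    print(expand_full("brooks", "travis"))
    print(expand_simple("brooks", "travis"))
```

For `brooks, travis` the first helper yields six search terms and the second only two, so intermediate forms such as `brooks, tra` never enter the query.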
2) WRT CFG_INSPIRE_SITE changes, I'll wait for the final updates then, but
please check how things work for Invenio syntax vs SPIRES syntax too,
because you are addressing `search_pattern_parenthesised()` globally,
which is //also// called for Invenio syntax. So we may run into trouble
for queries like `reportnumber:cern/th` if they happen to live inside
parens.
In the longer term, if the SPIRES syntax parser does query expansion
before the search, while the Invenio syntax parser leaves this until after
the search, so to speak, then the `ap` behaviour may indeed be harmful in
one case and useful in the other. It would then seem necessary to
distinguish the two for the whole duration of the search, if we want to
offer both alternatives in parallel at run time. This could be done, for
example, by enriching the basic search units (`opft`) with information on
whether `ap` is wanted for a particular unit or not. If we set `ap` only
globally, then we will lose one way or another. If we set `ap` differently
for paren vs non-paren expressions, we may run into `(foo)` vs `foo`
differences. On a similar note, people may get confused by SPIRES mode vs
Invenio mode differences, e.g. `find a brooks, travis` vs `brooks, travis`
in the `author` field.
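A minimal sketch of the per-unit idea, assuming a basic search unit is a plain (operator, pattern, field, type) tuple. The `SearchUnit` name, the `with_ap()` helper and the type code `"a"` are illustrative assumptions, not existing Invenio code. Turning `ap` off for SPIRES-expanded units is likewise only one possible policy.

```python
from collections import namedtuple

# Hypothetical enriched search unit: the usual four `opft` elements
# plus a per-unit `ap` flag (1 = try alternative patterns, 0 = do not).
SearchUnit = namedtuple("SearchUnit", ["op", "pattern", "field", "type", "ap"])


def with_ap(units, ap):
    """Attach the same `ap` flag to every 4-element unit of one query."""
    return [SearchUnit(o, p, f, t, ap) for (o, p, f, t) in units]


def patterns_wanting_alternatives(units):
    """Return the patterns of those units for which `ap` is wanted."""
    return [u.pattern for u in units if u.ap]


# Assumed policy: units already expanded by the SPIRES parser get ap=0,
# while units from plain Invenio syntax keep ap=1.
spires_units = with_ap(
    [("+", "brooks, travis*", "author", "a"),
     ("|", "brooks, t", "exactauthor", "a")],
    ap=0,
)
invenio_units = with_ap([("+", "brooks, travis", "author", "a")], ap=1)

if __name__ == "__main__":
    print(patterns_wanting_alternatives(spires_units + invenio_units))
```

Because the flag travels with each unit, it does not matter whether the unit sits inside parentheses, which would avoid the `(foo)` vs `foo` differences.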
--
Ticket URL: <http://invenio-software.org/ticket/261#comment:6>
Invenio <http://invenio-software.org>