Hello Tibor!
Just briefly, as I am heading for a train:
The ind:val example I gave, which you noted is perfectly OK, is actually
broken on JuSER. To get it working properly I have to remove the redundant
brackets in the second "or" expression. One idea of Martin's is to re-order
and/or/not, but in general I can't drop all brackets, and re-ordering is not
trivial either.
I see the physics case. I even tried adding spaces, but I think those that I
was able to add got stripped. It felt a bit like I wanted to add another sort
of brackets, as in math: (x+y) { a [ (b+c) (d+e) +f]}... :S
--
Kind regards,
Alexander Wagner
Subject Specialist
Central Library
52425 Juelich
mail : [email protected]
phone: +49 2461 61-1586
Fax : +49 2461 61-6103
www.fz-juelich.de/zb/DE/zb-fi
----- Reply message -----
From: "Tibor Simko" <[email protected]>
Date: Wed, Feb 26, 2014 21:29
Subject: Search & Bracketing
To: "Wagner, Alexander" <[email protected]>
Cc: "[email protected]" <[email protected]>
On Wed, 26 Feb 2014, Alexander Wagner wrote:
> http://invenio-software.org/ticket/131
There are some more tickets that are open in this regard, notably:
http://invenio-software.org/ticket/453
> 041__a:"eng"
>
> vs.
>
> (041__a:"eng")
Note that CDS also emits the following warning in the 2nd case:
No exact match found for (041__a:"eng"), using 041 a: eng instead...
This substitute query is wrongly guessed, which leads to wrong results.
The troubles stem from the following. There are physics terms such as
'SU(1)' that we don't want to interpret as a parenthesised search, but
rather match literally. Upon seeing '(041__a:"eng")', the system
interprets it similarly, i.e. not as a "composed search", but as a "math
search", so to speak. This is mostly because there is no blank within
the parenthesised expression. Adding something tautological to create a
Boolean expression would overcome this interpretation, for example:
(041__a:"eng" eng)
would return the same number of hits as 041__a:"eng".
In summary, the best way to use parentheses in order to express
"composed searches" is not to put them around "singletons", but
always around "Boolean expressions", i.e. things containing at least
some white space.
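
The rule of thumb above can be sketched roughly as follows. This is only
an illustration of the behaviour described in this thread, not the actual
Invenio code; the function name classify_parenthesised is hypothetical,
and it assumes the input is a single chunk of the query.

```python
import re


def classify_parenthesised(chunk: str) -> str:
    """Hypothetical helper: guess how a query chunk would be interpreted.

    Returns "composed" for something that looks like a Boolean expression
    and "literal" for something that would be matched as-is (the "math
    search" case described above).
    """
    match = re.fullmatch(r"\((.*)\)", chunk.strip(), flags=re.DOTALL)
    if match is None:
        # Not wrapped in parentheses as a whole, e.g. SU(1).
        return "literal"
    inner = match.group(1)
    # Assumption: white space inside the parentheses is the only signal.
    return "composed" if re.search(r"\s", inner) else "literal"


for query in ['(041__a:"eng")', '(041__a:"eng" eng)', "SU(1)", "(p,q)"]:
    print(query, "->", classify_parenthesised(query))
```

Note that in this sketch white space inside a quoted value would also
count as a blank, which may or may not match what the real parser does.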
> (ind:"val1" and ind:"val2") and ((ind:"val3" or ind:"val4") or
> ind:"val5")
This use is perfectly OK.
> I fear there's still a bug in the bracket handling.
Yes, e.g. see the above ticket #453.
We may try to improve the parenthesised-expression check for word
boundaries so that it behaves more properly for queries like "(xy:zzy)",
e.g. by giving preference to the "composed search" interpretation. Though
there are situations like "(p,q)" where one wants to retain the "math
search" interpretation we are favouring now...
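
One possible shape for such a check, as a rough sketch only: treat a
parenthesised chunk as "composed" if it contains white space or if its
content looks like a single field:value term, and keep everything else
literal. The FIELD_TERM pattern and the name prefer_composed are
assumptions made for illustration, not existing Invenio code.

```python
import re

# Assumed shape of a "field:value" term, e.g. xy:zzy or 041__a:"eng".
FIELD_TERM = re.compile(r'^[A-Za-z0-9_]+:(?:"[^"]*"|\S+)$')


def prefer_composed(chunk: str) -> bool:
    """Hypothetical check: should this chunk be parsed as a composed search?"""
    text = chunk.strip()
    if not (text.startswith("(") and text.endswith(")")):
        return False  # e.g. SU(1) stays a literal match
    inner = text[1:-1].strip()
    if re.search(r"\s", inner):
        return True  # Boolean expression with blanks, as today
    # New part: a lone field:value term is also treated as composed.
    return FIELD_TERM.match(inner) is not None


for query in ["(xy:zzy)", '(041__a:"eng")', "(p,q)", "SU(1)"]:
    print(query, "->", prefer_composed(query))
```

With these assumptions "(xy:zzy)" would be read as a composed search,
while "(p,q)" and "SU(1)" would keep the literal interpretation.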
Best regards
--
Tibor Simko