On 11.04.2012 14:05, Ludmila Marian wrote:
Hello Ludmila!
perform_request_search(p='plo', cc='People')
This will return all the records that contain the word 'plo' in any of
the fields.
So the query to the db would be like: select <something> from
<all_field_index> where value='plo';
This is what I would expect, except perhaps for the notion of "word".
I.e., I would expect simple search to match substrings. I understand
that this is not the case. Right?
perform_request_search(p1='plo', m1='r', f1='author', cc='People')
This indeed is more restrictive since it searches only the author index,
but it is broader because it is doing a REGEXP search.
Though I do not use any regexping here.
The query in this case would be: select <something> from <author_index>
where value REGEXP 'plo';
That's what I understood. Intuitively, I would guess that this is the
same as searching for 'plo' in the first case...
and this will match also the words that contain 'plo' as a substring (so
'fooplobar' would be a match) - as when doing a substring/phrase search.
... as I did NOT search for .*plo.*
I understand it works like the match operator, right? Something like
hit = 1 if str =~ m/plo/;
in perlspeak. So, by selecting regexp search I automagically gain
"left/right truncation" of the search string, which itself is handled as
a phrase, right? Something like "a b c" in regexp search would search for
.*a b c.* (again something like =~ m/a b c/) and not for "a or b or c"
as in simple search?
Or the other way round: to mimic simple search via regexp in my first
example I would have had to search for \bplo\b?
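As a rough illustration of the matching behaviour asked about above, here
is a small Python sketch. It is only an analogy: Python's re module stands
in for the REGEXP operator of the database, the sample values are made up,
and whether the backend's regexp engine accepts \b with the same meaning
is an assumption that would need checking.

```python
import re

# Illustration only: Python's re module stands in for the database REGEXP
# operator, and the sample values below are invented for this sketch.
values = ["plo", "fooplobar", "van der plo", "a b c", "xa b cy", "a", "b c"]

def regexp_hits(pattern, candidates):
    """Return the candidates in which `pattern` matches anywhere (unanchored)."""
    return [v for v in candidates if re.search(pattern, v)]

# Unanchored match, comparable to `=~ m/plo/` in Perl: substrings count.
unanchored = regexp_hits(r"plo", values)

# Word boundaries on both sides: only 'plo' as a separate word counts.
whole_word = regexp_hits(r"\bplo\b", values)

# A pattern with spaces is one literal sequence, not "a or b or c".
phrase = regexp_hits(r"a b c", values)

print(unanchored)
print(whole_word)
print(phrase)
```

In this sketch 'fooplobar' is a hit for the plain pattern but not for the
one with word boundaries, and the pattern "a b c" does not match the
single values 'a' or 'b c'.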
I would assume that the simple search gives at least as
many results as the more complex and in fact more restricted
(I'm searching only in index 'authors') query. However, the
first one yields 0 results, while the second one gives me 8
hits.
I think if you did the same type of search (m='a' or m='r') in both
cases, you would see the behavior that you expect (more results when
doing simple search). Otherwise m='r' will probably yield more results
than m='a' in most cases, even if you are searching a smaller space.
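The point about search type versus search space can be sketched with a
toy model. This is not Invenio's index structure: the records, the
word-splitting rule and the helper names word_search and regexp_search
are all hypothetical, and case-insensitive matching is an assumption.

```python
import re

# Made-up records; a toy model, not Invenio's actual indexes.
records = {
    1: {"author": "Ploetz, A.", "title": "Neutron scattering"},
    2: {"author": "Kaploun, B.", "title": "Plasma physics"},
    3: {"author": "Meyer, C.", "title": "Deployment notes"},
}

def words(text):
    """Split a field value into lowercase words, roughly like a word index."""
    return set(re.findall(r"\w+", text.lower()))

def word_search(term, fields=None):
    """Ids of records where `term` equals a whole word in the given fields."""
    return {rid for rid, rec in records.items()
            for name, value in rec.items()
            if (fields is None or name in fields)
            and term.lower() in words(value)}

def regexp_search(pattern, fields=None):
    """Ids of records where `pattern` matches anywhere in the given fields."""
    return {rid for rid, rec in records.items()
            for name, value in rec.items()
            if (fields is None or name in fields)
            and re.search(pattern, value, re.IGNORECASE)}

word_all = word_search("plo")                            # all fields, whole word
regexp_author = regexp_search("plo", fields={"author"})  # author only, regexp
regexp_all = regexp_search("plo")                        # all fields, regexp

print(sorted(word_all), sorted(regexp_author), sorted(regexp_all))
```

With the same match type, the all-fields result always contains the
author-only result. Across match types there is no such guarantee: here
the whole-word search over all fields finds nothing, while the regexp
search restricted to authors finds two records.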
I wonder if this is intuitive from an end user's perspective. Someone
who goes to simple search in the first place usually does so with the
notion "oh, it's like google, I like that". So wouldn't she expect all
these automagic truncations to happen? In a way this is what I fell for
in my simplistic approach. I always used regexp in all other parts, but
for whatever reason used simple search in this single application with
the notion: oh, I don't need real regexp, substring in all fields is
just fine, probably a bit too broad, but given that collection it does
no harm.
Anyway, thank you for the clarifications :)
--
Kind regards,
Alexander Wagner
Subject Specialist
Central Library
52425 Juelich
mail : [email protected]
phone: +49 2461 61-1586
Fax : +49 2461 61-6103
www.fz-juelich.de/zb/DE/zb-fi