Hi Alexander,

On 04/11/2012 12:08 PM, Alexander Wagner wrote:
Hi!

Consider a websearch in Invenio; perhaps someone can
explain the following behaviour to me. We found this
accidentally while debugging a new function we
introduced recently.

- Simple search in a given collection "People" by term plo, ie:

  search?ln=en&cc=People&sc=1&p=plo&f=

       or

  perform_request_search(p='plo', cc='People')


This returns all the records that contain the word 'plo' in any of the fields, matching on the whole word only. So the query to the db would be something like: select <something> from <all_field_index> where value='plo';



- Advanced search using regexp, ie

  search?ln=en&cc=People&sc=1&p=plo&f=authors

       or

  perform_request_search(p1='plo', m1='r', f1='author', cc='People')


This is indeed more restrictive, since it searches only the author index, but it is also broader because it does a REGEXP search. The query in this case would be: select <something> from <author_index> where value REGEXP 'plo'; and this also matches words that contain 'plo' as a substring (so 'fooplobar' would be a match), as when doing a substring/phrase search.


I would assume that the simple search gives at least as
many results as the more complex and in fact restricted
(I'm searching only in index 'authors') query. However, the
first one yields 0 results, while the second one gives me 8
hits.

I think that if you did the same type of search (m='a' or m='r') in both cases, you would see the behavior you expect (more results when doing the simple search). Otherwise, m='r' will probably yield more results than m='a' in most cases, even if you are searching a smaller space.
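
To illustrate the difference, here is a minimal sketch in plain Python. It is not Invenio code: the word lists and the two helper functions are made up for the example, and they only mimic the two SQL conditions mentioned above (value='plo' versus value REGEXP 'plo').

```python
import re

# Hypothetical index terms, invented for illustration; not real Invenio data.
all_fields_words = ["plot", "fooplobar", "people", "explore"]
author_words = ["fooplobar", "plotkin", "smith"]


def word_match(words, term):
    """Exact word match, similar to: WHERE value = 'plo'."""
    return [w for w in words if w == term]


def regexp_match(words, pattern):
    """Unanchored regexp match, similar to: WHERE value REGEXP 'plo'."""
    return [w for w in words if re.search(pattern, w)]


# Word search over the larger space: no term is exactly 'plo'.
simple_hits = word_match(all_fields_words, "plo")
# Regexp search over the smaller space: any term containing 'plo' matches.
regexp_hits = regexp_match(author_words, "plo")

print(simple_hits)   # []
print(regexp_hits)   # ['fooplobar', 'plotkin']
```

With the same match type on both sides, the larger space wins again: regexp_match(all_fields_words, "plo") returns three terms in this toy data, more than the two found in the author list.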

Best regards,
Ludmila

--
Ludmila Marian ** CERN Document Server **<http://cds.cern.ch/>
