On 26.02.2014 13:18, Tibor Simko wrote:
Hi!
I think this would solve the issue, indeed. I was not aware that I can
hook up a specific tokenizer to an index. I see in our 1.0 that
there's some magic happening with authors, but it always looked a bit
hard-coded, "just for authors".
Yes, it used to be hard-coded, but we have centralised index
configurations since then.
Ah. Now we're filling some gaps :)
See for example:
http://invenio-software.org/ticket/852
In forthcoming Invenio v1.2, one has:
I see. This sounds promising, indeed.
So it would always be an exact match type query, right?
Yes, provided that you don't use values like:
$0 P:(DE-Juel1)12345 P:(DE-Juel1)678
by mishap or something.
Ok, we don't have that as those assignments are not done manually.
In this case a phrase search could lead to a
false positive, unless you use the regexp "/^value$/". This was one of my
motivations behind the RFC: to point out that if somebody needs stricter
matching, the best option would be to switch to regexp.
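
To illustrate the false positive described above, here is a minimal
sketch in plain Python. It does not use any Invenio code; the helper names
phrase_match and anchored_match are hypothetical, and the phrase matcher
is a simplified stand-in for a real phrase search (it only checks for a
contiguous run of whitespace-separated tokens).

```python
import re

# A value that holds two identifiers in one subfield "by mishap".
stored = "P:(DE-Juel1)12345 P:(DE-Juel1)678"
wanted = "P:(DE-Juel1)678"


def phrase_match(query, value):
    """Simplified phrase search: the query tokens occur contiguously in value."""
    q = query.split()
    v = value.split()
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))


def anchored_match(query, value):
    """Equivalent of the regexp /^value$/: the whole value must equal the query."""
    return re.search("^" + re.escape(query) + "$", value) is not None


print(phrase_match(wanted, stored))    # True  (false positive)
print(anchored_match(wanted, stored))  # False (stricter matching)
print(anchored_match(wanted, wanted))  # True
```

With clean, single-valued subfields both approaches give the same answer;
they only differ when a value contains more than the identifier itself.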
My concern was that if I have to use regexp all the time, it will get
terribly slow. At the moment our bean counting takes about 1-2 min using
intbitsets on the backend and "" queries; regexp searches wouldn't be
nice here. Additionally, we have similar searches for frontend-related
stuff. But I think that with the tokenizer assignments, which I did not
know about until now, everything is fine.
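
A rough sketch of why the speed concern arises, in plain Python. This is a
toy model, not Invenio's implementation: the dictionary "index", the
functions exact_lookup and regexp_lookup, and the record IDs are all made
up for illustration, and ordinary Python sets stand in for intbitsets.

```python
import re

# Toy inverted index: indexed term -> set of record IDs.
index = {
    "P:(DE-Juel1)12345": {1, 2, 5},
    "P:(DE-Juel1)678": {2, 3},
    "P:(DE-Juel1)12345 P:(DE-Juel1)678": {9},
}


def exact_lookup(term):
    """Exact query: one dictionary lookup, independent of index size."""
    return index.get(term, set())


def regexp_lookup(pattern):
    """Regexp query: every indexed term has to be tested against the pattern."""
    rx = re.compile(pattern)
    hits = set()
    for term, recids in index.items():
        if rx.search(term):
            hits |= recids
    return hits


a = exact_lookup("P:(DE-Juel1)12345")
b = exact_lookup("P:(DE-Juel1)678")
print(len(a & b))  # 1, records carrying both identifiers

anchored = "^" + re.escape("P:(DE-Juel1)678") + "$"
print(regexp_lookup(anchored) == b)  # True, same result but a full scan
```

If the index already stores each value as one literal term, the exact
lookup gives the strict result without scanning all terms.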
While if I use aid as a logical field I could (somehow) add a
tokenizer to its index that tells the indexer: this has to be taken
literally.
Yes, you can select one of the existing tokenisers via BibIndex Admin Guide,
or, if no provided tokeniser suits your needs, you can write a new one
and drop it into "/opt/invenio/lib/python/invenio/bibindex_tokenizers/".
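
To make the idea of a "take it literally" tokeniser concrete, here is a
minimal standalone sketch in Python. It deliberately does not subclass or
call anything from Invenio: the class names ExactValueTokenizer and
WordTokenizer and the method name tokenize are assumptions for
illustration only, and a real tokeniser dropped into the directory above
would have to follow whatever interface BibIndex expects.

```python
import re


class ExactValueTokenizer(object):
    """Hypothetical tokeniser: index each field value as one literal term."""

    def tokenize(self, value):
        value = value.strip()
        # An empty value produces no term at all.
        return [value] if value else []


class WordTokenizer(object):
    """Hypothetical word-style tokeniser, shown only for contrast."""

    def tokenize(self, value):
        # Lowercase and split on anything that is not a word character.
        return re.findall(r"\w+", value.lower())


value = "P:(DE-Juel1)12345"
print(ExactValueTokenizer().tokenize(value))  # ['P:(DE-Juel1)12345']
print(WordTokenizer().tokenize(value))        # ['p', 'de', 'juel1', '12345']
```

With the first variant an identifier can only be found as a whole, which is
the behaviour wanted for IDs; the second variant would let a search for
"12345" alone match the record.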
Sounds good.
For librarian-style queries though, there is an "exactauthor" index
that behaves more strictly here.
I see. This would, however, then require an explicit "exact" index for
all fields that should support exact searches.
Not necessarily; e.g. for the DOI index, only exact matching makes sense,
hence our "doi" index uses only the "exact" tokeniser, and there is no
need to add another DOI-related index.
Sorry for being imprecise. If I want to allow exact matching for fields
that use fancy stuff by default, I'd need another index. I would also
have to select that other index explicitly and not just place the value
in "".
You can see how it is (will be)
implemented here:
http://invenio-software.org/ticket/1655
Sounds familiar, I think we discussed that :)
--
Kind regards,
Alexander Wagner
Scientific Services / Scientific Publishing
Central Library
52425 Juelich
mail : [email protected]
phone: +49 2461 61-1586
Fax : +49 2461 61-6103
http://www.fz-juelich.de/zb/wp