On 6/6/2012 4:17 PM, Tibor Simko wrote:
> Would you be interested in plugging such a flexible asciification step
> to the indexing process? Have it configurable per title, keyword,
> author, etc?
> Or were you thinking about this mostly in relation to the author
> disambiguation?

Of course, being able to search using an asciified version of the search
string was on my mind too! I was just afraid to ask :)
Now, regarding the possible indexes that could be asciified, I suppose it
would mainly be used for authors, although someone (for some _strange_
reason) could apply it to keywords, titles, etc. [1]
But if you think about it, if the functionality is there (even if it's
only meaningful for author indexes), it could be used with
Arabic/Hindi/Asian/Cyrillic names also... I know you have some projects
jointly run with researchers from these regions, so it would benefit
CERN as well!
Best regards,
Theodoros
[1] One such reason would be not having the appropriate characters
available in your input language, while knowing how to transliterate the
word. It is a pretty rare case, and a somewhat dangerous option to enable
for keywords and titles, as you would get 'strange' results in your search
if the transliterated words match English words too... Personally, I
believe it is only safe for author names, but since you never know which
MARC field would hold a name, it should be an option for all indexes,
disabled by default.
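
As a rough illustration of the "option for all indexes, disabled by
default" idea, here is a minimal sketch. All names in it (ASCIIFY_INDEXES,
asciify, index_terms) are hypothetical and are not existing Invenio code.
It also assumes the simplest possible folding, NFKD decomposition followed
by dropping non-ASCII characters, which only handles Latin letters with
diacritics. Greek, Cyrillic, Arabic, etc. would need a real
transliteration table instead, so here such values are simply indexed
unchanged.

```python
import unicodedata

# Hypothetical per-index switch: asciification is off unless explicitly enabled.
ASCIIFY_INDEXES = {"author": True}  # "title", "keyword", ... default to off


def asciify(text):
    """Fold Latin letters with diacritics to plain ASCII.

    Decomposes with NFKD, then drops every character that is not ASCII
    (combining marks, and any letter without an ASCII base form).
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def index_terms(index_name, value):
    """Return the terms to index: the original value, plus its asciified
    form when the index has asciification enabled and folding changed it."""
    terms = [value]
    if ASCIIFY_INDEXES.get(index_name, False):
        folded = asciify(value)
        # Skip empty results (e.g. non-Latin scripts) and unchanged values.
        if folded and folded != value:
            terms.append(folded)
    return terms
```

With this sketch, index_terms("author", "Šimko") would yield both the
original and the folded form, while the same value in a "title" index
would be indexed as-is.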