I have some concerns about fulltext indexing/searching and Spanish
special characters (accent marks). Or at least that is where I think the
problem lies. Let me illustrate it with an example.
I have this PDF which has already been indexed by our Invenio (0.99.1)
instance: http://zaguan.unizar.es/record/13793/files/BOUZ_08_10_2010.pdf
The PDF file contains the words "ingeniería" and "emérito" (you can
check it using Adobe Acrobat Reader).
When I perform a "simple search" in the "fulltext" index with
term=ingeniería, that record shows up (result number 7). That's OK.
http://zaguan.unizar.es/search?ln=es&cc=BOUZ&sc=1&p=ingenier%C3%ADa&f=fulltext&action_search=Buscar
But when I perform a "simple search" in the "fulltext" index with
term=emérito, there are no results. That's not OK.
http://zaguan.unizar.es/search?ln=es&cc=BOUZ&sc=1&p=em%C3%A9rito&f=fulltext&action_search=Buscar
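To rule out a problem in the query itself, the `p` parameter of both URLs can be decoded and compared. The following is a minimal Python 3 sketch, not part of Invenio; the helper name `search_term` is made up for illustration. It only shows that both terms are sent as UTF-8 percent-encoded text, so the accent reaches the server in the same form in both cases.

```python
from urllib.parse import parse_qs, urlsplit

# The two "simple search" URLs quoted above.
URL_INGENIERIA = (
    "http://zaguan.unizar.es/search?ln=es&cc=BOUZ&sc=1"
    "&p=ingenier%C3%ADa&f=fulltext&action_search=Buscar"
)
URL_EMERITO = (
    "http://zaguan.unizar.es/search?ln=es&cc=BOUZ&sc=1"
    "&p=em%C3%A9rito&f=fulltext&action_search=Buscar"
)


def search_term(url):
    """Return the decoded value of the `p` query parameter (hypothetical helper)."""
    query = urlsplit(url).query
    # parse_qs decodes percent-escapes as UTF-8 by default.
    return parse_qs(query)["p"][0]


for url in (URL_INGENIERIA, URL_EMERITO):
    term = search_term(url)
    # Show each non-ASCII character with its code point.
    accents = ["U+%04X" % ord(ch) for ch in term if ord(ch) > 127]
    print(term, accents)
```

Both terms decode to a single precomposed accented character (U+00ED and U+00E9), so the query side looks consistent.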
That record only shows up in an Advanced search using "partial phrase"
or "exact phrase" in the fulltext index. In that case Invenio reports
that 'there were no hits with %emérito%, but there were using
"em rito"':
- With exact phrase:
http://zaguan.unizar.es/search?as=1&cc=BOUZ&m1=e&p1=em%C3%A9rito&f1=fulltext&op1=a&m2=a&p2=&f2=&op2=a&m3=a&p3=&f3=&action_search=Buscar&c=BOUZ&c=&sf=&so=d&rm=&rg=25&sc=0&of=hb
- With partial phrase:
http://zaguan.unizar.es/search?as=1&cc=BOUZ&m1=p&p1=em%C3%A9rito&f1=fulltext&op1=a&m2=a&p2=&f2=&op2=a&m3=a&p3=&f3=&action_search=Buscar&c=BOUZ&c=&sf=&so=d&rm=&rg=25&sc=0&of=hb
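The "em rito" message suggests that somewhere the "é" is being treated as a non-word character. The Python 3 sketch below is only an illustration of how that can happen; it is not Invenio's tokenizer, and the function names `describe` and `ascii_only_split` are assumptions made up for this example. It also shows how to inspect the code points of a word, which could be applied to the text extracted from the PDF to see whether "é" is stored as one character or as "e" plus a combining accent. It does not by itself explain why "ingeniería" is found.

```python
import re
import unicodedata


def describe(text):
    """List each character with its code point and Unicode name."""
    return [
        (ch, "U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


def ascii_only_split(term):
    """Replace every character outside [A-Za-z0-9_] with a space.

    This mimics a word pattern that is not Unicode-aware (an assumption,
    not the actual Invenio behaviour).
    """
    return re.sub(r"\W", " ", term, flags=re.ASCII)


# Precomposed form: a single U+00E9 character.
composed = unicodedata.normalize("NFC", "em\u00e9rito")
# Decomposed form: "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed), len(decomposed))  # the two forms differ in length
print(composed == decomposed)          # and do not compare equal
print(describe(decomposed)[2:4])       # the "e" and its combining accent
print(repr(ascii_only_split(composed)))  # 'em rito'
```

If the text extracted from the PDF and the search term use different forms, or if the word pattern is ASCII-only, a word search for the accented term can miss even though the word is visibly present in the file.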
Why is this happening? Why do I get search results for the term
"ingeniería" but not for "emérito"? Is this error related to accents?
How can I fix it?
Best regards,
Miguel