Hi Walter,
good advice -- but you need to know the language of your material ...
which could be hard for automated processing ;-)
I also stumbled on the problem of the same words appearing in different
languages. The only solution might be the dream of a world documented
in English only ;-)
Regards from good
Hi,
oops, the URIEncoding was lost during the update to Tomcat 6.0.14.
Thanks for the advice.
But now I am really curious. After indexing the document from scratch,
I see that queries for "this" and "is" work fine, whereas
queries for "really" and "fünny" do not return the result. Fünnily
Hi Marc,
Are you using the same stemmer on your queries that you use when indexing?
Try the analysis function in the admin UI, to see how things are stemmed for
indexing vs. querying. If they don't match for really and fünny, and do
match for kraßen, then that's your problem.
Tom
On 9/14/07,
Hi Tom,
thanks for your response -- and sorry for the newbie question, which may
sound somewhat silly ;-). Here is the quick result from the analysis UI:
Index for really: 5* really. Query for really: 5* really, 2* realli
(from: EnglishPorterFilterFactory {protected=protwords.txt},
Hi Marc,
The searches are going to look for an exact match of the query (after
analysis) in the index (after analysis).
So, realli will not match really.
So you want to have the same stemmer (probably not the English one, given
your examples) in both the index analyzer and the query analyzer.
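
To illustrate the point about exact matching after analysis, here is a minimal Python sketch. It is not Solr code: `toy_stem`, `analyze`, and `matches` are hypothetical helpers, and the toy rule (trailing "y" becomes "i") only imitates the really/realli behaviour reported from the analysis UI.

```python
def toy_stem(token):
    """Toy stand-in for a stemmer (assumption): rewrite a trailing 'y' to 'i'."""
    if len(token) > 2 and token.endswith("y"):
        return token[:-1] + "i"
    return token


def analyze(text, stem):
    """Lowercase, split on whitespace, and optionally apply the toy stemmer."""
    tokens = text.lower().split()
    return [toy_stem(t) for t in tokens] if stem else tokens


def matches(doc, query, stem_index, stem_query):
    """A query matches only if every analyzed query token is an indexed token."""
    indexed = set(analyze(doc, stem_index))
    return all(t in indexed for t in analyze(query, stem_query))


doc = "this is really fünny"

# Stemming only at query time: the query becomes "realli", the index holds "really".
mismatched = matches(doc, "really", stem_index=False, stem_query=True)

# Same analysis on both sides: both become "realli", so the lookup succeeds.
consistent = matches(doc, "really", stem_index=True, stem_query=True)

# Tokens the stemmer leaves untouched match either way.
unaffected = matches(doc, "this", stem_index=False, stem_query=True)

print(mismatched, consistent, unaffected)
```

This would also explain why queries for "this" and "is" kept working: the toy stemmer, like a real one, leaves them unchanged, so the mismatch never shows.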
Hi Tom,
thanks for your professional response -- works fine and looks good :-).
Since I am playing around with mixed texts (English and German), I have
no idea whether an EnglishPorter stemmer will be useful for
German texts. But I will find out by playing around ;-)
Regards from
You could index into multiple fields with different analyzers
and search all of them.
text_en: uses English stemmer
text_de: uses German stemmer
text_exact: no stemming
text_strip: uses ISOLatin1AccentFilter
You can search all of these and put different boosts on them,
with higher boosts for
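
As a rough sketch of the multi-field idea, the Python snippet below builds one query string that searches all four fields using the Lucene `field:term^boost` syntax. The boost values and the `build_query` helper are assumptions made for illustration; the message does not say which fields should get the higher boosts.

```python
# Field names come from the list above; the boost values are hypothetical.
FIELD_BOOSTS = {
    "text_en": 2.0,
    "text_de": 2.0,
    "text_exact": 4.0,
    "text_strip": 1.0,
}


def build_query(term, boosts=FIELD_BOOSTS):
    """Join one boosted clause per field with OR.

    Note: the term is used as-is; escaping of query-syntax characters
    is not handled in this sketch.
    """
    clauses = [f"{field}:{term}^{boost:g}" for field, boost in boosts.items()]
    return " OR ".join(clauses)


query = build_query("fünny")
print(query)
```

Each field still applies its own analyzer at query time, so the same term is stemmed (or not) consistently with how that field was indexed.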
Hi SOLR kings,
I'm just playing around with queries, but I was not able to query for
any special characters like the German Umlaute (i.e., ä, ö, ü). Maybe
others might have the same effects and already found a solution ;-)
Here is my example: I have one field called sometext of type text
If you are using Tomcat, try adding URIEncoding="UTF-8" to your Tomcat
connector.
<Connector port="8080" maxHttpHeaderSize="8192"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
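
To see why the URI encoding matters, here is a small Python sketch (not Tomcat code) of what probably happens to an umlaut in a GET query. It assumes the client percent-encodes the term as UTF-8 and the server decodes it as ISO-8859-1, which was Tomcat 6's default when URIEncoding is not set.

```python
from urllib.parse import quote, unquote

term = "fünny"

# Client side: percent-encode the term as UTF-8 (quote's default encoding).
encoded = quote(term, safe="")

# Server side with URIEncoding="UTF-8": the original term comes back.
as_utf8 = unquote(encoded, encoding="utf-8")

# Server side decoding as ISO-8859-1: each UTF-8 byte becomes its own character,
# so the query term no longer equals what was indexed.
as_latin1 = unquote(encoded, encoding="iso-8859-1")

print(encoded)
print(as_utf8 == term, as_latin1 == term)
```

With the mismatched decoding the search term reaches Solr already garbled, so no analyzer setting can make it match.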
Sent: Thursday, September 13, 2007 3:14 PM
To: solr-user@lucene.apache.org
Subject: Query for German Special Characters (i.e., ä, ö, ß)