Re: wildcards and German umlauts
Hi,

I agree that this is annoying for non-English languages. I understand the idea behind the original behaviour, but there could be more elegant ways of handling it. It would make sense to always run the CharFilters. One idea would be a mechanism by which TokenFilters can be tagged for exclusion from wildcard terms. That way we could skip stemming, synonym and phonetic filters for wildcard terms, but still apply lowercasing and character normalization.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 29 May 2011, at 19:24, mdz-munich wrote:

> Ah, NOW I got it. It's not a bug, it's a feature.
>
> But that would mean, that every character-manipulation (e.g.
> char-mapping/replacement, Porter-Stemmer in some cases ...) would cause a
> wildcard-query to fail. That too bad.
>
> But why? What's the Problem with passing the prefix through the
> analyzer/filter-chain?
>
> Greetz,
>
> Sebastian
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2999237.html
> Sent from the Solr - User mailing list archive at Nabble.com.
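The tagging idea in the message above can be illustrated with a toy filter chain. This is only a sketch under stated assumptions: it does not use Lucene or Solr APIs, and the `Filter` class, the `wildcard_safe` flag, and the three toy filters are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Filter:
    """One step of a toy analysis chain (hypothetical, not a Lucene class)."""
    name: str
    apply: Callable[[str], str]
    wildcard_safe: bool  # hypothetical tag: may this filter run on wildcard terms?


def fold_umlauts(term: str) -> str:
    # Toy character normalization: drop the umlaut accent.
    return term.translate(str.maketrans({"ä": "a", "ö": "o", "ü": "u"}))


def toy_stem(term: str) -> str:
    # Toy stemmer: strip a trailing "en". Real stemmers are far more involved.
    return term[:-2] if len(term) > 4 and term.endswith("en") else term


CHAIN: List[Filter] = [
    Filter("lowercase", str.lower, wildcard_safe=True),
    Filter("fold_umlauts", fold_umlauts, wildcard_safe=True),
    Filter("stem", toy_stem, wildcard_safe=False),
]


def analyze(term: str, chain: List[Filter], wildcard: bool) -> str:
    """Run the chain, skipping filters not tagged as safe for wildcard terms."""
    for f in chain:
        if wildcard and not f.wildcard_safe:
            continue
        term = f.apply(term)
    return term


if __name__ == "__main__":
    indexed = analyze("Übersichten", CHAIN, wildcard=False)
    prefix = analyze("Über", CHAIN, wildcard=True)
    print(indexed, prefix, indexed.startswith(prefix))
```

With this arrangement the wildcard prefix is lowercased and normalized the same way as the indexed term, but it is never stemmed, so the prefix match still succeeds.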
Re: wildcards and German umlauts
Ah, NOW I get it. It's not a bug, it's a feature.

But that would mean that every character manipulation (e.g. char mapping/replacement, the Porter stemmer in some cases ...) would cause a wildcard query to fail. That's too bad.

But why? What's the problem with passing the prefix through the analyzer/filter chain?

Greetz,
Sebastian

--
View this message in context: http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2999237.html
Re: wildcards and German umlauts
I don't follow. Did I write anything about an analyzer? Actually, no.

--
View this message in context: http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2999074.html
Re: wildcards and German umlauts
Wildcard queries are not passed through an analyzer.

> Ah, BTW,
>
> since the problem seems to be a query-parser-issue a simple workarround
> could be done by simple replace all Umlauts with ASCII-Characters (ä = ae,
> ö = oe, ü = ue for example) before sending the query to Solr and use a
> solr.MappingCharFilterFactory with the same replacements (ä = ae, ö = oe,
> ü = ue) while indexing.
>
> It's unflexible in some cases, but it works so far.
>
> Greetz,
>
> Sebastian
Re: wildcards and German umlauts
Ah, BTW,

since the problem seems to be a query parser issue, a simple workaround is to replace all umlauts with ASCII characters (ä = ae, ö = oe, ü = ue, for example) before sending the query to Solr, and to use a solr.MappingCharFilterFactory with the same replacements (ä = ae, ö = oe, ü = ue) while indexing.

It's inflexible in some cases, but it works so far.

Greetz,
Sebastian

--
View this message in context: http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2998449.html
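The client-side half of this workaround could look roughly like the sketch below. The function name `fold_umlauts` is hypothetical, the uppercase entries and the NFC normalization step are assumptions added for illustration, and the index-side solr.MappingCharFilterFactory would need to be configured with the same replacements for the two sides to agree.

```python
import unicodedata

# Replacements named in the message: ä = ae, ö = oe, ü = ue.
# The uppercase entries are an assumption added for illustration.
UMLAUT_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
})


def fold_umlauts(query: str) -> str:
    """Replace umlauts with ASCII digraphs before the query is sent to Solr."""
    # Assumption: compose "u" + combining diaeresis into "ü" first,
    # so that decomposed input is mapped as well.
    composed = unicodedata.normalize("NFC", query)
    return composed.translate(UMLAUT_MAP)


if __name__ == "__main__":
    print(fold_umlauts("über*"))
```

Wildcard characters such as `*` pass through unchanged, so the rewritten string can be used directly as the query term.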
Re: wildcards and German umlauts
Hi,

"if i type complete word (such as "übersicht"). But there are no hits, if i use wildcards (such as "über*") Searching with wildcards and without umlauts works as well."

I can confirm that.

Greetz,
Sebastian

--
View this message in context: http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2998425.html
Re: wildcards and German umlauts
Hi,

I've got the same problem: searching with wildcards and umlauts returns no results. Just as you described it:

"if i type complete word (such as "übersicht"). But there are no hits, if i use wildcards (such as "über*") Searching with wildcards and without umlauts works as well."

Has anyone found a solution to this problem, or does anyone have any new ideas?

--
View this message in context: http://www.nabble.com/wildcards-and-German-umlauts-tp14836043p24517583.html
Re: wildcards and German umlauts
On Tuesday, 15 January 2008, Alexey Shakov wrote:

> Index-searching works, if i type complete word (such as "übersicht").
> But there are no hits, if i use wildcards (such as "über*")
> Searching with wildcards and without umlauts works as well.

Maybe this describes your problem on the Lucene level?
http://wiki.apache.org/lucene-java/LuceneFAQ#head-133cf44dd3dff3680c96c1316a663e881eeac35a

If that doesn't help, try Luke to see how your queries are parsed.

Regards
Daniel

--
http://www.danielnaber.de
wildcards and German umlauts
Hi all,

Index searching works if I type a complete word (such as "übersicht"), but there are no hits if I use wildcards (such as "über*"). Searching with wildcards and without umlauts works as well.

Can someone help me? Thanks in advance!

Here is my field definition:

generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" />
protected="protwords.txt" language="German2" />
synonyms="synonyms.txt" ignoreCase="true" expand="true" />
generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" />
protected="protwords.txt" language="German2" />