I'm using Apache 2.0.55, but I don't think that the problem is in the web server. As I mentioned previously, all characters (including åäö) are displayed correctly. I think the problem is that Nutch simply doesn't calculate a score for these words.

Just so that I understand you correctly: you had the same problem with missing scores, and it was solved by the fix described in the bug report you refer to?

/Erik


From: Sami Siren <[EMAIL PROTECTED]>
Reply-To: [email protected]
To: [email protected]
Subject: Re: No score explanation for non-english characters
Date: Thu, 02 Feb 2006 17:25:50 +0200

If you're running Tomcat you should try this:

http://issues.apache.org/bugzilla/show_bug.cgi?id=29900

It's working for me at least.

--
 Sami Siren

Erik J wrote:
Sorry, I should have included that... Here is what it shows for pages containing åäö:

page
docNo = 4d2
segment = 20060131193849
digest = 72cc98a81e0340c5463ab5e5706b51ac
boost = 1.7436684
url = http://www.aftonbladet.se/vader/snodjup0405/snodjupet.html
title = Aftonbladet: Väder, Snödjup
score for query: ka
0.0 = sum of:

and then nothing more. The word I was searching for was "åka". As you can see, the initial "å" has disappeared on the line "score for query: ka" and the score is 0.0 on the next line.
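One possible explanation for the truncated query, offered only as an assumption and not confirmed anywhere in this thread, is a charset mismatch when the query string is decoded on the server side. The sketch below uses Python's standard urllib.parse (not Nutch or Tomcat code; the helper name decode_query is hypothetical) to show what happens when the UTF-8 bytes for "åka" are decoded as ISO-8859-1: the "å" turns into two unrelated characters, while the trailing "ka" survives.

```python
from urllib.parse import quote, unquote


def decode_query(raw: str, charset: str) -> str:
    """Percent-decode a raw query value, interpreting its bytes in `charset`.

    Hypothetical helper for illustration; it stands in for whatever
    decoding the servlet container performs on the request URI.
    """
    return unquote(raw, encoding=charset)


# A browser submitting from a UTF-8 page would percent-encode "åka" like this.
raw = quote("åka", encoding="utf-8")

# Decoding with the charset the browser used recovers the word.
as_utf8 = decode_query(raw, "utf-8")

# Decoding the same bytes as ISO-8859-1 (assumed server default) garbles "å".
as_latin1 = decode_query(raw, "iso-8859-1")

print(raw)
print(as_utf8)
print(as_latin1)
```

If the garbled characters are then dropped or split off during query analysis, only "ka" would be left to score, which would match the output above. Whether that is what actually happens here would need to be checked against the server configuration.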

/Erik

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
