I searched the mailing list for this issue already and tried all suggestions,
but didn't find a solution.

I have a German website. The site is encoded in UTF-8, and the browser
detects the encoding and displays all pages correctly.
(I use Nutch 0.9 on Gentoo Linux, with JBoss and embedded Tomcat 5.5.)

But Nutch does not display the search results properly, or perhaps does not
even index the special characters properly: it displays a '?' instead of
German umlauts (ä, ü, ö, ...), so the output looks something like:

...unabh�ngige Branchenexperten pr�fen....
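
For illustration, here is a minimal Java sketch of one way output like the above can arise. It is an assumption about the mechanism, not a confirmed diagnosis of this setup: if text was encoded as ISO-8859-1 somewhere along the way and is later decoded as UTF-8, the byte for 'ä' is not valid UTF-8 and Java substitutes the replacement character U+FFFD. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the text with one charset, then decode the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        String original = "unabh\u00e4ngige"; // \u00e4 is 'ä'

        // Matching charsets: the text survives unchanged.
        String ok = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("round trip intact: " + ok.equals(original));

        // Mismatched charsets: the ISO-8859-1 byte for 'ä' is malformed UTF-8.
        String broken = roundTrip(original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("contains U+FFFD: " + (broken.indexOf('\uFFFD') >= 0));
    }
}
```

If a check like this reproduces the symptom, the mismatch would have to be located at one specific step (fetching, parsing, indexing, or rendering); the sketch itself does not say which.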

I already

1) set the meta data correctly as follows:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<%@ page contentType="text/html; charset=utf-8" pageEncoding="utf-8"
language="java" ...

2) in nutch-site.xml I set 
<property>
  <name>parser.character.encoding.default</name>
  <value>utf-8</value>
</property>


I use JBoss with embedded Tomcat:

3) In web.xml I added a parameter
<init-param>
  <param-name>javaEncoding</param-name>
  <param-value>UTF-8</param-value>
</init-param>   

4) In server.xml I added URIEncoding="UTF-8" into the Context

5) in the jsp-page for the search results I set
request.setCharacterEncoding("UTF-8");

I still have the problem described above: the special characters are still
displayed incorrectly.

The same happens when I run the search from the shell via
bin/nutch org.apache.nutch.searcher.NutchBean searchTerm
Here too, the special characters are displayed as '?'.
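
As a side note on the shell case, a literal '?' (rather than a replacement character) is what Java produces when a string is encoded with a charset that cannot represent a character. The sketch below assumes, purely for illustration, a console whose charset is US-ASCII; whether that applies to this shell is not known, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleEncodingDemo {
    // Simulate writing text to a console that uses the given charset:
    // encode to bytes, then decode the same bytes the way the console would.
    static String asSeenOn(String text, Charset consoleCharset) {
        return new String(text.getBytes(consoleCharset), consoleCharset);
    }

    public static void main(String[] args) {
        String text = "pr\u00fcfen"; // \u00fc is 'ü'

        // US-ASCII cannot represent 'ü', so the encoder substitutes '?'.
        System.out.println(asSeenOn(text, StandardCharsets.US_ASCII));

        // A UTF-8 console would keep the character intact.
        System.out.println(asSeenOn(text, StandardCharsets.UTF_8).equals(text));

        // The charset the JVM actually uses by default for output.
        System.out.println(Charset.defaultCharset());
    }
}
```

If the last line prints something other than UTF-8, the '?' in the shell output may come from the console encoding rather than from the index itself.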

Has anyone run into the same problem, or does anyone have an idea?
Thanks.

