non-Latin-1 characters cannot be submitted for search
-----------------------------------------------------
Key: NUTCH-138
URL: http://issues.apache.org/jira/browse/NUTCH-138
Project: Nutch
Type: Bug
Components: web gui
Versions: 0.7.1
Environment: Windows XP, Tomcat 5.5.12
Reporter: KuroSaka TeruHiko
Priority: Minor
search.html currently specifies the GET method for query submission.
Tomcat 5.x only accepts ISO-8859-1 (aka Latin-1) characters submitted over
GET, apparently because of a restriction in the HTML or HTTP spec. (If my
memory is correct, non-ISO-8859-1 characters worked over GET with older
versions of Tomcat as long as setCharacterEncoding() was called properly.)
To allow proper transmission of non-ISO-8859-1 characters, the POST method
should be used.
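To illustrate the kind of corruption described above, here is a minimal
sketch using only the standard java.net.URLEncoder and URLDecoder classes. It
does not call Tomcat; it assumes the browser percent-encodes the query as
UTF-8 and that the server side decodes the bytes as ISO-8859-1. The class
name QueryEncodingDemo and the sample query are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Nutch or Tomcat.
class QueryEncodingDemo {

    // Percent-encode a query value the way a browser would for a UTF-8 page.
    static String encodeUtf8(String query) {
        try {
            return URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decode a percent-encoded value, interpreting the bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String query = "\u691c\u7d22"; // two Japanese characters, outside Latin-1
        String encoded = encodeUtf8(query);

        // Assumption: this mimics a container that decodes GET parameters as ISO-8859-1.
        String asLatin1 = decode(encoded, "ISO-8859-1");
        String asUtf8 = decode(encoded, "UTF-8");

        System.out.println("encoded:            " + encoded);
        System.out.println("UTF-8 round trip:   " + asUtf8.equals(query));
        System.out.println("Latin-1 round trip: " + asLatin1.equals(query));
    }
}
```

Decoding with the wrong charset turns each UTF-8 byte into a separate Latin-1
character, so the query that reaches the search code no longer matches what
the user typed.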
Here's a proposed patch:
*** search-org.html Tue Dec 13 15:02:07 2005
--- search.html Tue Dec 13 15:02:15 2005
***************
*** 59,65 ****
</span><span class="bodytext">
<center>
! <form name="search" action="../search.jsp" method="get">
<input name="query" size="44"> <input type="submit" value="Search">
<a href="help.html">help</a>
--- 59,65 ----
</span><span class="bodytext">
<center>
! <form name="search" action="../search.jsp" method="post">
<input name="query" size="44"> <input type="submit" value="Search">
<a href="help.html">help</a>
BTW, I am aware that Nutch and Lucene won't handle non-Western languages well
as packaged.
--
This message is automatically generated by JIRA.