Author: sshafroi
Date: 2008-11-05 08:38:01 +0100 (Wed, 05 Nov 2008)
New Revision: 6903

Modified:
   trunk/war/src/main/java/no/sesat/search/http/filters/DataModelFilter.java
Log:
Issue SKER4951:  (Encoding problems when characters like ,() are used in a 
query) 

The problem was that the automatic detection of ISO-8859-1 failed, and the code 
thought that a string like ,?\195?\184 was encoded in ISO-8859-1 and ended up 
as ,2 2?\195?\131?\194?\184

I have tried to improve the detection by inverting the process compared to how 
it was done before. I am not 100% sure about the .replaceAll("[+]", "%20") call 
and the matching one for *, which this change removes; I think they might have 
been added to solve a similar problem before. So it would be good if you took a 
sec and looked at the diff.
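
For reviewers, here is a minimal standalone sketch of the inverted check, 
assuming the raw (still percent-encoded) query string value and the value from 
request.getParameter(..) are already at hand. The class and method names are 
hypothetical and not part of DataModelFilter; only the decode-and-compare logic 
mirrors the diff.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical illustration only; not the actual DataModelFilter code. */
public class EncodingDetectionSketch {

    /**
     * @param rawValue       percent-encoded value taken from the query string
     * @param containerValue value as decoded by request.getParameter(..)
     * @return the value we believe is correctly decoded
     */
    static String chooseDecoding(final String rawValue, final String containerValue) {
        if (null == rawValue || null == containerValue) {
            return containerValue;
        }
        try {
            // Decode the raw value as UTF-8 and compare with the container's result.
            final String utf8Decoded = URLDecoder.decode(rawValue, "UTF-8");
            if (utf8Decoded.equals(containerValue)) {
                return containerValue;
            }
            // Mismatch: assume the client sent ISO-8859-1 instead.
            return URLDecoder.decode(rawValue, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // Both charsets are mandatory in every JVM, so this is not expected.
            return containerValue;
        }
    }

    public static void main(final String[] args) {
        // UTF-8 encoded o-slash, container agreed: value is kept.
        System.out.println(chooseDecoding("%C3%B8", "\u00f8"));
        // ISO-8859-1 encoded o-slash, container produced something else: fall back.
        System.out.println(chooseDecoding("%F8", "?"));
    }
}
```

Note that URLDecoder.decode already turns "+" into a space, which may be why 
the explicit "+" handling is no longer needed on this path; that is worth 
confirming against real queries.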




Modified: 
trunk/war/src/main/java/no/sesat/search/http/filters/DataModelFilter.java
===================================================================
--- trunk/war/src/main/java/no/sesat/search/http/filters/DataModelFilter.java   2008-11-03 14:09:42 UTC (rev 6902)
+++ trunk/war/src/main/java/no/sesat/search/http/filters/DataModelFilter.java   2008-11-05 07:38:01 UTC (rev 6903)
@@ -366,14 +366,14 @@
         return retval;
     }
 
-    /** A safer way to get parameters for the query string.
-     * Handles ISO-8859-1 and UTF-8 URL encodings.
+    /**
+     * This function will try to decode the raw parameter, and see if that matches
+     * how the request.getParameter(..) did the decoding. If this dosn't match then we
+     * fall back to ISO-8859-1 which in most cases will be correct.
      *
      * @param request The servlet request we are processing
      * @param parameter The parameter to retrieve
      * @return The correct decoded parameter
-     *
-     *
      */
     private static String getParameterSafely(final HttpServletRequest request, final String parameter){
 
@@ -391,21 +391,12 @@
         }
 
         if (null != value && null != queryStringValue) {
-
             try {
-
-                final String encodedReqValue = URLEncoder.encode(value, "UTF-8")
-                        .replaceAll("[+]", "%20")
-                        .replaceAll("[*]", "%2A");
-
-                queryStringValue = queryStringValue
-                        .replaceAll("[+]", "%20")
-                        .replaceAll("[*]", "%2A");
-
-                if (!queryStringValue.equalsIgnoreCase(encodedReqValue)){
+                final String queryStringValueDecoded = URLDecoder.decode(queryStringValue, "UTF-8");
+                if (!queryStringValueDecoded.equals(value)) {
+                    // We don't think the encoding is utf-8 so go for ISO-8859-1
                     value = URLDecoder.decode(queryStringValue, "ISO-8859-1");
                 }
-
             } catch (UnsupportedEncodingException e) {
                 LOG.trace(e);
             }
