ps: And yes, a util method to check is utf-8-full-support is enabled would be a 
good start - would at least help locating where this is done/used.

-g

On Mar 4, 2010, at 15:34, Grégory Joseph wrote:

> 
> On Mar 4, 2010, at 15:20, Luca Boati wrote:
> 
>> Hi greg,
>> 
>> Is there a reason we removed the URLDecoder.decode() calls in the {else} 
>> block of this, in ContentTypeFilter ?
>> >
>> > if 
>> > (SystemProperty.getBooleanProperty(SystemProperty.MAGNOLIA_UTF8_ENABLED)) {
>> 
>> Actually we can do not remove URLDecoder.decode() if 
>> SystemProperty.MAGNOLIA_UTF8_ENABLED = false but I'd use 
>> UnicodeNormalizerRequestWrapper only if utf8 is enabled.
> 
> Hmm I'm not sure I understand you here. What I was asking is why when 
> utf8=disabled, the URLDecoder.decode() calls have been removed. (it might be 
> correct, or it might be that you just forgot it, I don't know)
> 
>>  
>> Can we avoid constantly having to call SystemProperty.getBooleanProperty in 
>> each and every corner where this might be needed ?
>> 
>> What do you mean? do not use systemProperty to get that information? or only 
>> do not use 
>> SystemProperty.getBooleanProperty(SystemProperty.MAGNOLIA_UTF8_ENABLED) 
>> everywhere but use an util functions that evalutate that property? which is 
>> your idea?
> 
> No specific idea, but I just don't want to have to think, for every piece of 
> code, if 1) i have to normalize or not 2) have 
> "SystemProperty.getBooleanProperty(SystemProperty.MAGNOLIA_UTF8_ENABLED)" 
> allover the place: a) it's long and ugly b) it's unmaintainable and goes back 
> to (1). For instance, if  UnicodeNormalizer already takes care of checking 
> that property, why surround calls to it with yet another check of the 
> property. Additionally, we already have quite a few request/response 
> wrappers, it might be a good idea to see if one of them could be used for the 
> normalization.
> 
>>  
>> UnicodeNormalizer should not have to check this property - or if so, that'd 
>> be only in the "autodetect" implementation of it, so that it'd use the 
>> NonNormalizing impl if the property is set to false.
>> 
>> ok
>> 
>> Why do we have to normalize again in CosMultipartRequestFilter, if 
>> ContentTypeFilter already wrapped the request in a normalizing request 
>> wrapper ?
>> 
>> Because the parameters in cos multipart request does not rely on the 
>> original request (it parses request.getInputStream());
> 
> I see - adding an inline comment for this sort of stuff will prevent this 
> question to be asked again in the future ;)
> The question however, also implied that I was wondering why we don't do the 
> same in the other multipart filter.
> 
>> maybe the right approach is to wrap request in ContentTypeFilter only if it 
>> is not a multipart request.
> 
> None of those 2 filters really sound like the "right" place to do this 
> wrapping. Just a thought: maybe we could have an extra filter (disabled by 
> default) that does this wrapping.
> 
>> 
>> Luca
>>  
>> -g
>> 
>> 
>> Begin forwarded message:
>> 
>> > From: [email protected]
>> > Date: March 4, 2010 12:11:14GMT+01:00
>> > To: [email protected]
>> > Subject: [magnolia-svn] [32381] [MAGNOLIA-3009] NFD/NFC normalization 
>> > added (force to NFC):
>> > Reply-To: "Magnolia SVN-List" <[email protected]>
>> >
>> > Revision
>> > 32381
>> > Author
>> > lucaboati
>> > Date
>> > 2010-03-04 12:11:13 +0100 (Thu, 04 Mar 2010)
>> > Log Message
>> >
>> > [MAGNOLIA-3009
>> > ] NFD/NFC normalization added (force to NFC):
>> > - ContentTypeFilter and CosMultipartRequestFilter normalizing each 
>> > parameters
>> > - ImportXmlRootFilter normalizing sv:name of xml attributes
>> >
>> > Modified Paths
>> >
>> >       • 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/ContentTypeFilter.java
>> >       • 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/CosMultipartRequestFilter.java
>> >       • 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/util/UnicodeNormalizer.java
>> >       • 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/importexport/filters/ImportXmlRootFilter.java
>> >       • 
>> > community/magnolia/branches/utf8support/magnolia-core/src/test/java/info/magnolia/cms/util/UnicodeNormalizerTest.java
>> > Diff
>> >
>> > Modified: 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/ContentTypeFilter.java
>> >  (32380 => 32381)
>> >
>> > --- 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/ContentTypeFilter.java
>> >       2010-03-04 11:07:34 UTC (rev 32380)
>> > +++ 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/ContentTypeFilter.java
>> >       2010-03-04 11:11:13 UTC (rev 32381)
>> >
>> > @@ -35,19 +35,25 @@
>> >
>> >
>> >
>> >  import info.magnolia.cms.beans.config.MIMEMapping;
>> >
>> >  import info.magnolia.cms.core.AggregationState;
>> >
>> > +import info.magnolia.cms.core.SystemProperty;
>> > +import info.magnolia.cms.util.UnicodeNormalizer;
>> >
>> >  import info.magnolia.context.MgnlContext;
>> >
>> >
>> >
>> >  import java.io.IOException;
>> >
>> >  import java.io.UnsupportedEncodingException;
>> >
>> >  import java.net.URLDecoder;
>> >
>> > +import java.util.HashMap;
>> > +import java.util.Map;
>> >
>> >
>> >
>> >  import javax.servlet.FilterChain;
>> >
>> >  import javax.servlet.ServletException;
>> >
>> >  import javax.servlet.http.HttpServletRequest;
>> >
>> > +import javax.servlet.http.HttpServletRequestWrapper;
>> >
>> >  import javax.servlet.http.HttpServletResponse;
>> >
>> >
>> >
>> >  import org.apache.commons.lang.StringUtils;
>> >
>> >
>> >
>> > +
>> >
>> >  /**
>> >
>> >   * TODO : rename this filter. What it really does is initialize and setup 
>> > the basic,
>> >
>> >   * non-content related attributes of the AggregationState. ContentType 
>> > could become an
>> >
>> > @@ -58,7 +64,8 @@
>> >
>> >   * @author gjoseph
>> >
>> >   * @version $Id$
>> >
>> >   */
>> >
>> > -public class ContentTypeFilter extends AbstractMgnlFilter {
>> >
>> > +public class ContentTypeFilter extends AbstractMgnlFilter{
>> > +
>> >
>> >      private static final org.slf4j.Logger log = 
>> > org.slf4j.LoggerFactory.getLogger(ContentTypeFilter.class);
>> >
>> >
>> >
>> >      public void doFilter(HttpServletRequest request, HttpServletResponse 
>> > response, FilterChain chain) throws IOException, ServletException {
>> >
>> > @@ -69,7 +76,7 @@
>> >
>> >          if (!StringUtils.isEmpty(query))
>> >
>> >          {
>> >
>> >              log.info(query);
>> >
>> > -            url.append("?").append(query);
>> >
>> > +            url.append("?").append(query);
>> >
>> >          }
>> >
>> >
>> >
>> >          final String characterEncoding = 
>> > setupContentTypeAndCharacterEncoding(ext, request, response);
>> >
>> > @@ -81,13 +88,25 @@
>> >
>> >          aggregationState.setCharacterEncoding(characterEncoding);
>> >
>> >
>> >
>> >          // @todo this is temporary.. add a new filter?
>> >
>> > -        aggregationState.setOriginalURI(URLDecoder.decode(originalUri, 
>> > characterEncoding));
>> > -        aggregationState.setOriginalURL(URLDecoder.decode(url.toString(), 
>> > characterEncoding));
>> >
>> > +        if 
>> > (SystemProperty.getBooleanProperty(SystemProperty.MAGNOLIA_UTF8_ENABLED)) {
>> > +            request = new UnicodeNormalizerRequestWrapper(request);
>> > +            MgnlContext.push(request, response);
>> > +            
>> > aggregationState.setOriginalURI(URLDecoder.decode(originalUri, 
>> > characterEncoding));
>> > +            
>> > aggregationState.setOriginalURL(URLDecoder.decode(url.toString(), 
>> > characterEncoding));
>> > +        }
>> > +        else {
>> > +            aggregationState.setOriginalURI(originalUri);
>> > +            aggregationState.setOriginalURL(url.toString());
>> > +        }
>> >
>> >          aggregationState.setOriginalBrowserURI(originalUri);
>> >
>> >          aggregationState.setOriginalBrowserURL(url.toString());
>> >
>> >          aggregationState.setExtension(ext);
>> >
>> >
>> >
>> >          chain.doFilter(request, response);
>> >
>> > +
>> > +        if 
>> > (SystemProperty.getBooleanProperty(SystemProperty.MAGNOLIA_UTF8_ENABLED)) {
>> > +            MgnlContext.pop();
>> > +        }
>> >
>> >      }
>> >
>> >
>> >
>> >      // TODO : test + simplification (substringAfterLast(uri, ".") 
>> > probably does the trick !?
>> >
>> > @@ -115,4 +134,76 @@
>> >
>> >          return characterEncoding;
>> >
>> >      }
>> >
>> >
>> >
>> > +    public class UnicodeNormalizerRequestWrapper extends 
>> > HttpServletRequestWrapper
>> > +    {
>> > +
>> > +        private HttpServletRequest original;
>> > +
>> > +        private Map parameters;
>> > +
>> > +        /**
>> > +         * @param request
>> > +         */
>> > +        public UnicodeNormalizerRequestWrapper(HttpServletRequest request)
>> > +        {
>> > +            super(request);
>> > +            original = request;
>> > +        }
>> > +
>> > +        /**
>> > +         * {...@inheritdoc}
>> > +         */
>> > +        @Override
>> > +        public String getParameter(String name)
>> > +        {
>> > +            String[] values = getParameterValues(name);
>> > +            if (values != null && values.length > 0)
>> > +            {
>> > +                return values[0];
>> > +            }
>> > +            return null;
>> > +        }
>> > +
>> > +        /**
>> > +         * {...@inheritdoc}
>> > +         */
>> > +        @Override
>> > +        public Map getParameterMap()
>> > +        {
>> > +            if (parameters == null)
>> > +            {
>> > +                parameters = new HashMap<String, String[]>();
>> > +                for (Object key : original.getParameterMap().keySet())
>> > +                {
>> > +                    String[] value = 
>> > transform((String[])original.getParameterMap().get(key));
>> > +                    parameters.put(key, value);
>> > +                }
>> > +            }
>> > +            return parameters;
>> > +        }
>> > +
>> > +        public String[] transform(String[] input)
>> > +        {
>> > +            String[] toNormalize = input;
>> > +            if (toNormalize != null && toNormalize.length > 0)
>> > +            {
>> > +                for (int i = 0; i < toNormalize.length; i++)
>> > +                {
>> > +                    toNormalize[i] = 
>> > UnicodeNormalizer.normalizeNFC(toNormalize[i]);
>> > +                }
>> > +            }
>> > +            return toNormalize;
>> > +        }
>> > +
>> > +        /**
>> > +         * {...@inheritdoc}
>> > +         */
>> > +        @Override
>> > +        public String[] getParameterValues(String name)
>> > +        {
>> > +            return (String[]) getParameterMap().get(name);
>> > +        }
>> > +
>> > +    }
>> > +
>> >
>> >  }
>> >
>> > Modified: 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/CosMultipartRequestFilter.java
>> >  (32380 => 32381)
>> >
>> > --- 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/CosMultipartRequestFilter.java
>> >       2010-03-04 11:07:34 UTC (rev 32380)
>> > +++ 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/filters/CosMultipartRequestFilter.java
>> >       2010-03-04 11:11:13 UTC (rev 32381)
>> >
>> > @@ -35,6 +35,7 @@
>> >
>> >
>> >
>> >  import info.magnolia.cms.beans.runtime.MultipartForm;
>> >
>> >  import info.magnolia.cms.core.Path;
>> >
>> > +import info.magnolia.cms.util.UnicodeNormalizer;
>> >
>> >  import info.magnolia.context.MgnlContext;
>> >
>> >
>> >
>> >  import java.io.IOException;
>> >
>> > @@ -107,16 +108,20 @@
>> >
>> >          while (params.hasMoreElements()) {
>> >
>> >              String name = (String) params.nextElement();
>> >
>> >              String value = multi.getParameter(name);
>> >
>> > -            form.addParameter(name, value);
>> >
>> > +            form.addParameter(name, 
>> > UnicodeNormalizer.normalizeNFC(value));
>> >
>> >              String[] s = multi.getParameterValues(name);
>> >
>> >              if (s != null) {
>> >
>> > +                for(int i = 0; i < s.length; i++)
>> > +                {
>> > +                    s[i] = UnicodeNormalizer.normalizeNFC(s[i]);
>> > +                }
>> >
>> >                  form.addparameterValues(name, s);
>> >
>> >              }
>> >
>> >          }
>> >
>> >          Enumeration files = multi.getFileNames();
>> >
>> >          while (files.hasMoreElements()) {
>> >
>> >              String name = (String) files.nextElement();
>> >
>> > -            form.addDocument(name, multi.getFilesystemName(name), 
>> > multi.getContentType(name), multi.getFile(name));
>> >
>> > +            form.addDocument(name, 
>> > UnicodeNormalizer.normalizeNFC(multi.getFilesystemName(name)), 
>> > multi.getContentType(name), multi.getFile(name));
>> >
>> >          }
>> >
>> >          request.setAttribute(MultipartForm.REQUEST_ATTRIBUTE_NAME, form);
>> >
>> >          return form;
>> >
>> > Modified: 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/util/UnicodeNormalizer.java
>> >  (32380 => 32381)
>> >
>> > --- 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/util/UnicodeNormalizer.java
>> >  2010-03-04 11:07:34 UTC (rev 32380)
>> > +++ 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/cms/util/UnicodeNormalizer.java
>> >  2010-03-04 11:11:13 UTC (rev 32381)
>> >
>> > @@ -33,10 +33,13 @@
>> >
>> >   */
>> >
>> >  package info.magnolia.cms.util;
>> >
>> >
>> >
>> > +import info.magnolia.cms.core.SystemProperty;
>> >
>> >  import info.magnolia.objectfactory.Components;
>> >
>> >
>> >
>> > +import java.io.UnsupportedEncodingException;
>> >
>> >  import java.lang.reflect.InvocationTargetException;
>> >
>> >  import java.lang.reflect.Method;
>> >
>> > +import java.util.Arrays;
>> >
>> >
>> >
>> >  /**
>> >
>> >   * A wrapper around java.text.Normalizer and com.ibm.icu.text.Normalizer; 
>> > uses the former if present, or none
>> >
>> > @@ -67,7 +70,17 @@
>> >
>> >       * Normalizes the given String to the NFC form.
>> >
>> >       */
>> >
>> >      public static String normalizeNFC(String in) {
>> >
>> > -        return normalizer.normalizeNFC(in);
>> >
>> > +        if 
>> > (SystemProperty.getBooleanProperty(SystemProperty.MAGNOLIA_UTF8_ENABLED)) {
>> > +            try {
>> > +                log.debug("not normalized: " + 
>> > Arrays.toString(in.getBytes("UTF-8")) + " (" + in + ")");
>> > +                String out = normalizer.normalizeNFC(in);
>> > +                log.debug("    normalized: " + 
>> > Arrays.toString(out.getBytes("UTF-8")) + " (" + out + ")");
>> > +                return out;
>> > +            } catch (UnsupportedEncodingException e) {
>> > +                // do nothing
>> > +            }
>> > +        }
>> > +        return in;
>> >
>> >      }
>> >
>> >
>> >
>> >      public interface Normalizer {
>> >
>> > Modified: 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/importexport/filters/ImportXmlRootFilter.java
>> >  (32380 => 32381)
>> >
>> > --- 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/importexport/filters/ImportXmlRootFilter.java
>> >    2010-03-04 11:07:34 UTC (rev 32380)
>> > +++ 
>> > community/magnolia/branches/utf8support/magnolia-core/src/main/java/info/magnolia/importexport/filters/ImportXmlRootFilter.java
>> >    2010-03-04 11:11:13 UTC (rev 32381)
>> >
>> > @@ -33,6 +33,8 @@
>> >
>> >   */
>> >
>> >  package info.magnolia.importexport.filters;
>> >
>> >
>> >
>> > +import info.magnolia.cms.util.UnicodeNormalizer;
>> > +
>> >
>> >  import org.xml.sax.Attributes;
>> >
>> >  import org.xml.sax.SAXException;
>> >
>> >  import org.xml.sax.XMLReader;
>> >
>> > @@ -117,6 +119,7 @@
>> >
>> >          // filter if it is the version store
>> >
>> >          if ("sv:node".equals(qName)) { //$NON-NLS-1$
>> >
>> >              String attName = atts.getValue("sv:name"); //$NON-NLS-1$
>> >
>> > +            attName = UnicodeNormalizer.normalizeNFC(attName);
>> >
>> >              // remember if there was a root node presend
>> >
>> >              if ("jcr:root".equals(attName)) {
>> >
>> >                  this.rootNodeFound = true;
>> >
>> > Modified: 
>> > community/magnolia/branches/utf8support/magnolia-core/src/test/java/info/magnolia/cms/util/UnicodeNormalizerTest.java
>> >  (32380 => 32381)
>> >
>> > --- 
>> > community/magnolia/branches/utf8support/magnolia-core/src/test/java/info/magnolia/cms/util/UnicodeNormalizerTest.java
>> >      2010-03-04 11:07:34 UTC (rev 32380)
>> > +++ 
>> > community/magnolia/branches/utf8support/magnolia-core/src/test/java/info/magnolia/cms/util/UnicodeNormalizerTest.java
>> >      2010-03-04 11:11:13 UTC (rev 32381)
>> >
>> > @@ -68,6 +68,7 @@
>> >
>> >          SystemProperty.getProperties().clear();
>> >
>> >          ComponentsTestUtil.clear();
>> >
>> >          
>> > SystemProperty.setProperty("info.magnolia.cms.util.UnicodeNormalizer$Normalizer",
>> >  "info.magnolia.cms.util.UnicodeNormalizer$AutoDetectNormalizer");
>> >
>> > +        SystemProperty.setProperty("magnolia.utf8.enabled", "true");
>> >
>> >      }
>> >
>> >
>> >
>> >      protected void tearDown() throws Exception {
>> >
>> >
>> >
>> > ----------------------------------------------------------------
>> > For list details see
>> > http://www.magnolia-cms.com/home/community/mailing-lists.html
>> > To unsubscribe, E-mail to: <[email protected]>
>> > ----------------------------------------------------------------
>> 
>> 
>> ----------------------------------------------------------------
>> For list details see
>> http://www.magnolia-cms.com/home/community/mailing-lists.html
>> To unsubscribe, E-mail to: <[email protected]>
>> ----------------------------------------------------------------
>> 
>> 
> 
> 
> 
> ----------------------------------------------------------------
> For list details see
> http://www.magnolia-cms.com/home/community/mailing-lists.html
> To unsubscribe, E-mail to: <[email protected]>
> ----------------------------------------------------------------


----------------------------------------------------------------
For list details see
http://www.magnolia-cms.com/home/community/mailing-lists.html
To unsubscribe, E-mail to: <[email protected]>
----------------------------------------------------------------

Reply via email to