[ http://issues.apache.org/jira/browse/SOLR-32?page=comments#action_12421677 ] Yonik Seeley commented on SOLR-32: ----------------------------------
Anyone know if this is still a problem in the latest Jetty6? Someone might want to follow up with them on it. > Result of select request is not well-formed XML when text field contains > non-ASCII chars and ampersand > ------------------------------------------------------------------------------------------------------ > > Key: SOLR-32 > URL: http://issues.apache.org/jira/browse/SOLR-32 > Project: Solr > Issue Type: Bug > Components: search > Environment: Seen when running with the supplied Jetty container, > macosx, JDK 1.5.0_06 > Reporter: Bertrand Delacretaz > Assigned To: Yonik Seeley > > Starting with the supplied start.jar, the ampersand from this field is not > correctly escaped in the XML search results provided by the select page: > <?xml version="1.0" encoding="UTF-8"?> > <add> > <doc> > <field name="id">amp-test-one</field> > <field name="content">Les événements chez Bonnie & Clyde.</field> > </doc> > </add> > </stuff> > The "content" field is defined as a "text" field in the schema. > Adding this document to the index and querying on "id:amp-test-one" returns > ... > <doc> > <str name="content">Les événements chez Bonnie & Clyde.& Clyde.</str> > <str name="id">amp-test-one</str> > </doc> > With first "Bonnie & Clyde" unescaped and then the correct escaped & > Browsing the index with Luke shows that the field is correctly stored. > I think this might be a Jetty bug: patching the util/XML class of SOLR to > avoid the use of Writer.write(String,start,len) fixes the problem. Maybe the > Jetty ServletWriter gets confused by the presence of non-ascii chars? > Here are my changes in util/XML.java. It looks like the class did use > String.substring(...) before, Writer.write might be faster but it seems like > it's broken in that environment. > Here are my patches to util/XML.java: > Index: src/java/org/apache/solr/util/XML.java > =================================================================== > --- src/java/org/apache/solr/util/XML.java (revision 422655) > +++ src/java/org/apache/solr/util/XML.java (working copy) > @@ -159,8 +159,8 @@ > } > if (subst != null) { > if (start<i) { > - // out.write(str.substring(start,i)); > - out.write(str, start, i-start); > + out.write(str.substring(start,i)); > + // out.write(str, start, i-start); > // n+=i-start; > } > out.write(subst); > @@ -172,8 +172,8 @@ > out.write(str); > // n += str.length(); > } else if (start<str.length()) { > - // out.write(str.substring(start)); > - out.write(str, start, str.length()-start); > + out.write(str.substring(start)); > + // out.write(str, start, str.length()-start); > // n += str.length()-start; > } > // return n; -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
