[ 
http://issues.apache.org/jira/browse/AXIS-2025?page=comments#action_12314564 ] 

Shankar Unni commented on AXIS-2025:
------------------------------------

>  have them be replaced by entity escapes...

Hmm, perhaps not.  But the larger problem still stands - every other transport 
mechanism handles arbitrary characters in strings. Do the standards simply 
*punt* on some subset of strings? 
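
For what it's worth, here is a minimal sketch of why entity escapes don't help, assuming the stock JAXP parser that ships with the JDK. XML 1.0 requires a character reference to name a legal character, so "&#3;" should be rejected just like the raw character. The class and method names (CharRefCheck, parses) are made up for this illustration and are not part of Axis.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not Axis code: reports whether the default
// JAXP parser accepts a document as well-formed.
final class CharRefCheck {

    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors (including illegal characters) land here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Raw control character in element content.
        System.out.println(parses("<a>bad: " + (char) 3 + ".</a>"));
        // The same character written as a character reference.
        System.out.println(parses("<a>bad: &#3;.</a>"));
        // Tab is one of the few control characters XML 1.0 allows.
        System.out.println(parses("<a>tab: &#9;.</a>"));
    }
}
```

The parser may also print its own fatal-error lines to stderr for the rejected documents.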

I see in the SOAP document (3.1.2, Encoding simple values) that *it* basically 
shrugs and says at this point: "Note that certain Unicode characters cannot be 
represented in XML".

So perhaps the true answer is not to use xsd:string for such strings, but to 
encode them as binary somehow? The problem is that there is little or no 
guidance for users in this matter.
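
One way the "encode them as binary" idea could look on the application side is sketched below. It checks a string against the XML 1.0 Char production and, if it fails, Base64-encodes the UTF-8 bytes so the value could travel as xsd:base64Binary instead of xsd:string. This is only an assumption about a possible workaround, not anything Axis does: the class and method names are invented, and java.util.Base64 needs Java 8 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not Axis code.
final class XmlSafeStrings {

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // True if every code point may appear in an XML 1.0 document.
    static boolean isXmlSafe(String s) {
        return s.codePoints().allMatch(XmlSafeStrings::isXmlChar);
    }

    // Encode the string's UTF-8 bytes; the result is plain ASCII.
    static String toBase64(String s) {
        return Base64.getEncoder()
                .encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    // Reverse of toBase64.
    static String fromBase64(String b64) {
        return new String(Base64.getDecoder().decode(b64),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String bad = "bad: " + (char) 3 + ".";
        System.out.println(isXmlSafe(bad));
        String encoded = toBase64(bad);
        System.out.println(isXmlSafe(encoded));
        System.out.println(fromBase64(encoded).equals(bad));
    }
}
```

Both ends of the call would have to agree on this convention, which is exactly the kind of guidance that is missing. Strings containing unpaired surrogates would not survive the UTF-8 round trip either.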



As an aside: I'm also looking at http://www.xmlrpc.com/spec#update1, where I 
see the interesting lines:

"What characters are allowed in strings? Non-printable characters? Null 
characters? Can a "string" be used to hold an arbitrary chunk of binary data?

Any characters are allowed in a string except < and &, which are encoded as 
&lt; and &amp;. A string can be used to encode binary data."

There's got to be *some* way to pass such strings around.  Most applications 
don't have full control of how such strings are created anyway, and this is an 
almost intolerable restriction.


> Illegal XML characters in String arguments and return values cause XML 
> exceptions in Axis calls
> -----------------------------------------------------------------------------------------------
>
>          Key: AXIS-2025
>          URL: http://issues.apache.org/jira/browse/AXIS-2025
>      Project: Apache Axis
>         Type: Bug
>   Components: Serialization/Deserialization
>     Versions: 1.2
>  Environment: All (but reproduced on WinXP).
> Axis 1.1 and 1.2
>     Reporter: Shankar Unni
>     Assignee: Venkat Reddy
>  Attachments: Axis1.1badmsgAPI.log, Axis1.1echoAPI.log, Axis1.2badmsgAPI.log, 
> Axis1.2echoAPI.log
>
> Arguments and return values of Java type String are incorrectly handled if 
> they contain non-printing illegal ASCII characters.
> Example 1: bad return values:
> - - - - - - - - - - - - - - -
> E.g. the string 
>   "bad char: " + (char)3 + "."
> Trivial example:
> foo.jws:
>   public class foo {
>     public String badmsg()
>     {
>       return "bad: " + (char)3 + ".";
>     }
>   }
> When calling this method and the server is running on Axis 1.1, it returns 
> XML with the illegal character ASCII "3" in the text:
>    <badmsgReturn xsi:type="xsd:string">bad: ?.</badmsgReturn>  
> This causes an XML parse exception on the client side 
> ("org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x3) was 
> found in the element content of the document.")
> With Axis 1.2, the server doesn't even return a valid response: I get an HTTP 
> 200 OK with an empty content, causing a different XML parse error.
> Example 2: bad parameter values:
> - - - - - - - - - - - - - - - -
> A similar problem exists when passing such a string from the client side.
> If I have a method in foo.jws:
>   public class foo {
>     public String echo(String s)
>     {
>       return s;
>     }
>   }
> Then if I write an ordinary Java client to call this, and pass it a bad 
> string as in the beginning of this post, I get an exception thrown while the 
> call is being composed:
> java.lang.IllegalArgumentException: The char '0x3' in 'bad char: ?.' is not a 
> valid XML character.
> This is somewhat absurd: shouldn't the serialization layer be encoding these 
> illegal XML characters as entity escapes? They're entirely legal in the 
> current locale (US), and normal Java code handles this character quite 
> normally.  Why should it croak when passed by XML/RPC?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
