[ 
https://issues.apache.org/jira/browse/AXIS2C-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Mitchell updated AXIS2C-859:
---------------------------------

    Attachment: guththila_xml_writer.diff
                diff_2.txt

Lahiru, looking at the code again, I now agree that you were right to replace 
the character by sliding the token data down.  I was under the mistaken 
impression that the code was sliding all the rest of the buffer down; as long 
as we are sliding from one end or the other of the token, there is no reason 
not to do the obvious slide down.  
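
For readers without the attachments, the slide-down idea can be sketched roughly 
as follows.  This is only an illustration, not the code in diff_2.txt: the 
function name unescape_in_place and its buffer-plus-length interface are 
hypothetical, and it handles only the three sequences named in this issue.  
Because each replacement character is shorter than its escape sequence, the 
write position never overtakes the read position, so only bytes inside the 
token move.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the guththila API): replaces &amp;,
 * &lt; and &gt; inside buf[0..len) in place and returns the new length.
 * Unrecognized sequences starting with '&' are copied through unchanged. */
static size_t unescape_in_place(char *buf, size_t len)
{
    static const struct { const char *seq; size_t n; char ch; } map[] = {
        { "&amp;", 5, '&' },
        { "&lt;",  4, '<' },
        { "&gt;",  4, '>' }
    };
    size_t r = 0;   /* read index  */
    size_t w = 0;   /* write index, always <= r */

    assert(buf != NULL);
    while (r < len) {
        int matched = 0;
        if (buf[r] == '&') {
            size_t i;
            for (i = 0; i < sizeof map / sizeof map[0]; i++) {
                if (len - r >= map[i].n &&
                    memcmp(buf + r, map[i].seq, map[i].n) == 0) {
                    buf[w++] = map[i].ch;   /* one char replaces the sequence */
                    r += map[i].n;
                    matched = 1;
                    break;
                }
            }
        }
        if (!matched)
            buf[w++] = buf[r++];            /* slide ordinary bytes down */
    }
    return w;
}
```

A caller would then shorten the token's recorded size to the returned length; 
nothing beyond the end of the token is touched.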

In the attached diff_2.txt, I moved the code to perform the replacement into a 
lower level routine.  As guththila_close_token has constructed a temp token in 
both the text case and the attribute value case, it is easy to perform 
replacement on this temp token string before further processing of the 
attribute for a namespace declaration.  Beware that the line number where we 
change the token type to _text may be different in yours; my version includes 
changes for AXIS2C-933 that Supun wants to review before they are applied.  

Separately, the attached guththila_xml_writer.diff is a patch for the other 
side of this issue: the insertion of escape sequences on outgoing messages 
that include ampersand or greater than in the text.  
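
The outgoing direction can be sketched in the same spirit.  Again this is an 
assumption-laden illustration rather than the contents of 
guththila_xml_writer.diff: escape_text and its snprintf-like return convention 
are hypothetical, and the sketch escapes less than as well as ampersand and 
greater than.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the guththila API): copies src[0..len)
 * into dst, writing &amp; &lt; &gt; for the characters & < >.
 * Returns the number of bytes the full escaped text needs, excluding the
 * terminating NUL.  If that value is >= cap the output was truncated at a
 * whole-sequence boundary; dst is always NUL-terminated when cap > 0. */
static size_t escape_text(const char *src, size_t len, char *dst, size_t cap)
{
    size_t i;
    size_t w = 0;      /* bytes actually written to dst */
    size_t need = 0;   /* bytes the complete escaped text requires */

    assert(src != NULL);
    for (i = 0; i < len; i++) {
        const char *rep;
        size_t rl;
        switch (src[i]) {
        case '&': rep = "&amp;"; rl = 5; break;
        case '<': rep = "&lt;";  rl = 4; break;
        case '>': rep = "&gt;";  rl = 4; break;
        default:  rep = &src[i]; rl = 1; break;
        }
        /* Write only while nothing has been dropped and the piece fits,
         * leaving room for the terminating NUL. */
        if (dst != NULL && need == w && w + rl < cap) {
            memcpy(dst + w, rep, rl);
            w += rl;
        }
        need += rl;
    }
    if (dst != NULL && cap > 0)
        dst[w] = '\0';
    return need;
}
```

Escaping on the way out and unescaping on the way in are meant to be inverses, 
which is what allows ampersand data to arrive intact in both directions.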

With both fixes installed, I was able to see ampersand data characters from the 
client arrive at the server intact, and vice versa.  

> guththila parser fails to handle escape sequences for ampersand, less than, 
> greater than
> ----------------------------------------------------------------------------------------
>
>                 Key: AXIS2C-859
>                 URL: https://issues.apache.org/jira/browse/AXIS2C-859
>             Project: Axis2-C
>          Issue Type: Bug
>          Components: guththila
>    Affects Versions: Current (Nightly)
>         Environment: Windows XP, Visual Studio 2005, guththila parser, libcurl
>            Reporter: Bill Mitchell
>         Attachments: diff.txt, diff_1.txt, diff_2.txt, 
> guththila_xml_writer.diff
>
>
> When an incoming message contains within text the escaped ampersand sequence, 
> "&amp;", this sequence is being passed to the client as raw text without 
> being converted to the single ampersand character.  Clearly, this action must 
> take place at the level of the parser, as only the parser knows whether it is 
> seeing simple text, and conversion is required, or text embedded in a CDATA 
> section, where conversion is not allowed.  I have tested the build with the 
> libxml parser, and of course the libxml parser behaves correctly: the text 
> passed to the client contains only the single ampersand character, not the 
> escaped sequence.  (See section 2.4 of XML 1.0 spec.)
> Looking at the code, I expect the same problem occurs with all escaped 
> sequences, less than and greater than as well as ampersand, on both input and 
> output.  I also don't see where CDATA sections are handled, but as I am not 
> seeing CDATA in the messages from the service I am hitting, I have not tested 
> this case.  


