[ 
https://issues.apache.org/jira/browse/SYNAPSE-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595908#action_12595908
 ] 

Andreas Veithen commented on SYNAPSE-304:
-----------------------------------------

Things in scope for the 1.2 release have been implemented. Items planned for 
1.3 will be implemented later.

> BaseUtils uses incorrect strategy to distinguish between XML, text and binary
> -----------------------------------------------------------------------------
>
>                 Key: SYNAPSE-304
>                 URL: https://issues.apache.org/jira/browse/SYNAPSE-304
>             Project: Synapse
>          Issue Type: Bug
>          Components: Transports
>            Reporter: Andreas Veithen
>            Assignee: Andreas Veithen
>             Fix For: 1.3
>
>
> BaseUtils#setSOAPEnvelope (together with BaseUtils#handleLegacyMessage) is 
> used by the VFS, Mail, JMS and AMQP transports and implements the following 
> strategy to distinguish between XML, text and binary payloads: It first tries 
> to parse the payload as XML. If that fails, it tries to load it as text using 
> BaseUtils#getMessageTextPayload. If that fails again, it loads the message as 
> binary data using BaseUtils#getMessageBinaryPayload.
> This strategy has the following flaws:
> * Corrupted or invalid XML messages are not detected as such but interpreted 
> as text or binary data. This will almost certainly lead to errors at a later 
> stage in the processing (typically in a mediation that doesn't expect text or 
> binary payloads), but for the user it is difficult to identify the root cause 
> of the problem.
> * The VFSUtils and MailUtils implementations of the getMessageTextPayload 
> method actually never fail (except if the file or MIME part can't be read). 
> The reason is that they read the content as binary and then construct a 
> String object using new String(byte[]). This constructor never throws an 
> exception, even if there are byte sequences not valid in the platform's 
> default charset. Therefore the VFS and mail transport listeners will never 
> process messages as binary payloads. This problem can't be solved because 
> there is in fact no (reliable) way to distinguish text from binary data by 
> inspecting the content alone. Note that using the platform's default 
> charset to decode the message is also incorrect (see SYNAPSE-261).
> * This approach doesn't allow using custom message builders to parse messages 
> that are neither XML nor plain text or binary.
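
The second flaw above can be reproduced outside Synapse. The following is a minimal sketch, not code from BaseUtils, VFSUtils or MailUtils: the class and method names are hypothetical, and UTF-8 is passed explicitly (instead of relying on the platform default as new String(byte[]) does) only to keep the result reproducible. It assumes Java 7 or later for StandardCharsets. The String constructor silently substitutes replacement characters, whereas a CharsetDecoder configured with CodingErrorAction.REPORT rejects the same bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Synapse or Axis2.
class LenientDecodingDemo {

    /** Decodes strictly: malformed input raises an exception instead of being replaced. */
    static String decodeStrict(byte[] data) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(data)).toString();
    }

    public static void main(String[] args) {
        // Bytes that are not valid UTF-8 (0xFF and 0xFE never occur in UTF-8).
        byte[] notText = {(byte) 0xFF, (byte) 0xFE, 0x00, (byte) 0x80};

        // The String constructor never throws; invalid sequences are replaced.
        String lenient = new String(notText, StandardCharsets.UTF_8);
        System.out.println("lenient decoding produced " + lenient.length() + " chars");

        try {
            decodeStrict(notText);
            System.out.println("strict decoding accepted the payload");
        } catch (CharacterCodingException e) {
            System.out.println("strict decoding rejected the payload: " + e);
        }
    }
}
```

Even strict decoding only shows that the bytes are invalid in one particular charset. It does not make content sniffing reliable, since many binary payloads are valid in single-byte charsets such as ISO-8859-1.
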
> I think that every transport should first determine the content type of the 
> message and then decode the message according to that content type, rather 
> than trying different ways to decode the message. The decoding should be 
> delegated to the message builder corresponding to the content type. This 
> approach has the following advantages:
> * Corrupted or invalid messages trigger an appropriate error immediately.
> * Since for text payloads the content type can include information about the 
> charset (text/plain; charset=...), it provides a straightforward solution for 
> issues like SYNAPSE-261.
> * Custom message builders can be used.
> * It naturally fits into Axis' architecture since it correctly uses the 
> concepts of transport and message builder.
> * It leads to a more consistent behavior between different transports (in 
> particular with the NIO HTTP transport).
> The transport should determine the content type either from the service 
> configuration (e.g. the transport.vfs.ContentType property for VFS) or from 
> information available at the transport protocol level (Content-Type header 
> for mail messages, FileContentInfo or file suffix for VFS, message type for 
> JMS, etc.).
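
One way to picture the content type lookup described above is sketched below. It is an illustration only: ContentTypeResolver, its resolve method and the suffix table are hypothetical, and the precedence (configured value first, then transport-level information, then file suffix) is an assumption, since the description names the sources without ranking them. It assumes Java 11 or later.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; not part of Synapse or Axis2.
class ContentTypeResolver {

    // Illustrative suffix table only; a real transport would use its own mapping.
    private static final Map<String, String> BY_SUFFIX = Map.of(
            "xml", "text/xml",
            "txt", "text/plain",
            "bin", "application/octet-stream");

    /**
     * Returns the configured content type if set, otherwise the one reported
     * at the transport protocol level, otherwise a guess from the file suffix.
     * Returns an empty Optional when none of the sources yields a value.
     */
    static Optional<String> resolve(String configured, String transportLevel, String fileName) {
        if (configured != null && !configured.isBlank()) {
            return Optional.of(configured.trim());
        }
        if (transportLevel != null && !transportLevel.isBlank()) {
            return Optional.of(transportLevel.trim());
        }
        if (fileName != null) {
            int dot = fileName.lastIndexOf('.');
            if (dot >= 0 && dot < fileName.length() - 1) {
                String suffix = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
                return Optional.ofNullable(BY_SUFFIX.get(suffix));
            }
        }
        return Optional.empty();
    }
}
```

An empty result lets the caller fail with a clear error instead of guessing from the payload.
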
> The algorithm to select the message builder based on the content type and to 
> invoke it to create the SOAP infoset is already implemented in the 
> TransportUtils.createSOAPMessage utility method in the Axis2 kernel (which is 
> also used by the NIO HTTP and the UDP transport). Therefore the proposed 
> changes are:
> 1. Create message builders for text and binary payloads (which are the 
> counterparts of PlainTextFormatter and BinaryFormatter introduced by 
> SYNAPSE-261).
> 2. Let ServerManager#start register these new message builders by default in 
> the Axis configuration for content types "text/plain" and 
> "application/octet-stream" respectively (in a similar way as 
> AxisConfigBuilder#populateConfig registers default message builders for 
> text/xml, application/soap+xml, etc.). In addition they should be added to 
> the default axis2.xml file shipped with Synapse.
> 3. Make sure that every affected transport implements an appropriate strategy 
> to determine the content type and uses TransportUtils.createSOAPMessage 
> instead of BaseUtils#setSOAPEnvelope to process the message payload.
> 4. Remove BaseUtils#setSOAPEnvelope and related code without replacement.
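
The dispatch idea behind items 1 to 3 can be sketched without the Axis2 API. Everything named below (BuilderRegistry, PayloadBuilder, withDefaults) is hypothetical and merely stands in for the real message builder mechanism and TransportUtils.createSOAPMessage. The UTF-8 fallback when no charset parameter is given is also an assumption, and the builders return a String or byte[] rather than a SOAP infoset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical registry; a stand-in for builder lookup by content type.
class BuilderRegistry {

    /** Hypothetical stand-in for a message builder; not the Axis2 Builder interface. */
    interface PayloadBuilder {
        Object build(byte[] payload, Charset charset) throws Exception;
    }

    private final Map<String, PayloadBuilder> builders = new HashMap<>();

    void register(String mediaType, PayloadBuilder builder) {
        builders.put(mediaType.toLowerCase(Locale.ROOT), builder);
    }

    /** Selects the builder by media type and fails immediately if none is registered. */
    Object build(String contentType, byte[] payload) throws Exception {
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);
        Charset charset = StandardCharsets.UTF_8; // assumed default, not from the issue
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).replace("\"", "").trim();
                charset = Charset.forName(name);
            }
        }
        PayloadBuilder builder = builders.get(mediaType);
        if (builder == null) {
            throw new IllegalArgumentException("No builder registered for " + mediaType);
        }
        return builder.build(payload, charset);
    }

    /** Registers illustrative builders for text/plain and application/octet-stream. */
    static BuilderRegistry withDefaults() {
        BuilderRegistry registry = new BuilderRegistry();
        // Text: decode with the declared charset and reject malformed input.
        registry.register("text/plain", (payload, charset) -> charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(payload)).toString());
        // Binary: hand back a copy of the raw bytes.
        registry.register("application/octet-stream", (payload, charset) -> payload.clone());
        return registry;
    }
}
```

With this shape, a payload that does not match its declared content type fails in the builder, and a content type without a registered builder is reported up front rather than being retried with another decoding.
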
> The proposed work plan is as follows:
> * For release 1.2, implement 1 and 2 as well as 3 for the VFS transport. This 
> allows issue SYNAPSE-261 to be resolved completely.
> * For release 1.3, implement 3 and 4 for all remaining transports (for JMS 
> and AMQP after SYNAPSE-303 has been handled).


