[
https://issues.apache.org/jira/browse/SYNAPSE-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595905#action_12595905
]
Andreas Veithen commented on SYNAPSE-304:
-----------------------------------------
The way the VFS transport determines the content type needs review.
VFSTransportListener#startListeningForService treats the content type as a
mandatory service parameter (see the usage of getRequiredServiceParam), while
VFSTransportListener#processFile defines some fallback mechanisms (such as
looking at the file name suffix) for the case where it is not specified as a
service parameter. The two methods are therefore inconsistent.
I will leave it like that for 1.2, but for 1.3 we should sort this out.
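
To make the intended lookup order concrete, here is a minimal sketch of a single resolution routine: use the configured service parameter if present, otherwise fall back to the file name suffix. The class ContentTypeResolver, its suffix table and the final application/octet-stream default are assumptions made for illustration; they are not taken from the VFS transport code.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Synapse: resolves the content type of a file.
final class ContentTypeResolver {

    // Illustrative suffix table; the mapping used by a real transport may differ.
    private static final Map<String, String> BY_SUFFIX = Map.of(
            "xml", "text/xml",
            "txt", "text/plain");

    // configuredContentType stands in for the value of the service parameter
    // (null or blank if the parameter is not set).
    static String resolve(String configuredContentType, String fileName) {
        // 1. An explicitly configured content type always wins.
        if (configuredContentType != null && !configuredContentType.trim().isEmpty()) {
            return configuredContentType.trim();
        }
        // 2. Otherwise fall back to the file name suffix.
        int dot = fileName.lastIndexOf('.');
        if (dot >= 0 && dot < fileName.length() - 1) {
            String suffix = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            String bySuffix = BY_SUFFIX.get(suffix);
            if (bySuffix != null) {
                return bySuffix;
            }
        }
        // 3. Assumed last resort: treat the content as opaque binary data.
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        System.out.println(resolve("text/plain; charset=UTF-8", "in.dat"));
        System.out.println(resolve(null, "order.XML"));
        System.out.println(resolve(null, "blob"));
    }
}
```

With a routine like this, the listener would no longer need to treat the parameter as mandatory at startup while treating it as optional when a file is processed.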
> BaseUtils uses incorrect strategy to distinguish between XML, text and binary
> -----------------------------------------------------------------------------
>
> Key: SYNAPSE-304
> URL: https://issues.apache.org/jira/browse/SYNAPSE-304
> Project: Synapse
> Issue Type: Bug
> Components: Transports
> Reporter: Andreas Veithen
> Assignee: Andreas Veithen
> Fix For: 1.3
>
>
> BaseUtils#setSOAPEnvelope (together with BaseUtils#handleLegacyMessage) is
> used by the VFS, Mail, JMS and AMQP transports and implements the following
> strategy to distinguish between XML, text and binary payloads: It first tries
> to parse the payload as XML. If that fails, it tries to load it as text using
> BaseUtils#getMessageTextPayload. If that fails again, it loads the message as
> binary data using BaseUtils#getMessageBinaryPayload.
> This strategy has the following flaws:
> * Corrupted or invalid XML messages are not detected as such but interpreted
> as text or binary data. This will almost certainly lead to errors at a later
> stage in the processing (typically in a mediation that doesn't expect text or
> binary payloads), but for the user it is difficult to identify the root cause
> of the problem.
> * The VFSUtils and MailUtils implementations of the getMessageTextPayload
> method never actually fail (except if the file or MIME part can't be read).
> The reason is that they read the content as binary and then construct a
> String object using new String(byte[]). This constructor never throws an
> exception, even if there are byte sequences not valid in the platform's
> default charset. Therefore the VFS and mail transport listeners will never
> process messages as binary payloads. This problem can't be solved within
> the current strategy because there is in fact no (reliable) way to
> distinguish text from binary data by inspecting the content alone. Note
> that using the platform's default charset to decode the message is also
> incorrect (see SYNAPSE-261).
> * This approach doesn't allow using custom message builders to parse messages
> that are neither XML nor plain text or binary.
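
The second flaw is easy to reproduce in isolation. The sketch below compares the String constructor with a CharsetDecoder configured to report errors. It uses UTF-8 explicitly instead of the platform default so that the result does not depend on the machine; the class and method names are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Illustration only; not code from BaseUtils, VFSUtils or MailUtils.
final class StrictDecodeDemo {

    // Same approach as new String(byte[]): malformed input is silently
    // replaced with U+FFFD, so this method never fails.
    static String lenient(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // A decoder told to REPORT errors throws on the first invalid sequence.
    static String strict(byte[] data) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data))
                .toString();
    }

    public static void main(String[] args) {
        byte[] notUtf8 = {(byte) 0xFF, (byte) 0xFE}; // never valid in UTF-8
        System.out.println("lenient length: " + lenient(notUtf8).length());
        try {
            strict(notUtf8);
            System.out.println("strict: decoded");
        } catch (CharacterCodingException e) {
            System.out.println("strict: rejected (" + e + ")");
        }
    }
}
```

Even the strict variant is not a text/binary detector: a single-byte charset such as ISO-8859-1 accepts every byte sequence, which is why the content type has to come from somewhere other than the payload.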
> I think that every transport should first determine the content type of the
> message and then decode the message according to that content type, rather
> than trying different ways to decode the message. The decoding should be
> delegated to the message builder corresponding to the content type. This
> approach has the following advantages:
> * Corrupted or invalid messages trigger an appropriate error immediately.
> * Since for text payloads the content type can include information about the
> charset (text/plain; charset=...), it provides a straightforward solution for
> issues like SYNAPSE-261.
> * Custom message builders can be used.
> * It naturally fits into the Axis2 architecture since it correctly uses the
> concepts of transport and message builder.
> * It leads to a more consistent behavior between different transports (in
> particular with the NIO HTTP transport).
> The transport should determine the content type either from the service
> configuration (e.g. the transport.vfs.ContentType property for VFS) or from
> information available at the transport protocol level (Content-Type header
> for mail messages, FileContentInfo or file suffix for VFS, message type for
> JMS, etc.).
> The algorithm to select the message builder based on the content type and to
> invoke it to create the SOAP infoset is already implemented in the
> TransportUtils.createSOAPMessage utility method in the Axis2 kernel (which is
> also used by the NIO HTTP and the UDP transport). Therefore the proposed
> changes are:
> 1. Create message builders for text and binary payloads (which are the
> counterparts of PlainTextFormatter and BinaryFormatter introduced by
> SYNAPSE-261).
> 2. Let ServerManager#start register these new message builders by default in
> the Axis configuration for content types "text/plain" and
> "application/octet-stream" respectively (in a similar way as
> AxisConfigBuilder#populateConfig registers default message builders for
> text/xml, application/soap+xml, etc.). In addition they should be added to
> the default axis2.xml file shipped with Synapse.
> 3. Make sure that every affected transport implements an appropriate strategy
> to determine the content type and uses TransportUtils.createSOAPMessage
> instead of BaseUtils#setSOAPEnvelope to process the message payload.
> 4. Remove BaseUtils#setSOAPEnvelope and related code without replacement.
> The proposed work plan is as follows:
> * For release 1.2, implement 1 and 2 as well as 3 for the VFS transport. This
> allows issue SYNAPSE-261 to be resolved completely.
> * For release 1.3, implement 3 and 4 for all remaining transports (for JMS
> and AMQP after SYNAPSE-303 has been handled).
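
As a rough illustration of the proposal quoted above, the following sketch selects a builder by content type and fails immediately when none is registered, instead of trying XML, text and binary in turn. It deliberately does not use the Axis2 API: PayloadBuilder, BuilderDispatchSketch and the UTF-8 default charset are hypothetical stand-ins, and real code would go through TransportUtils.createSOAPMessage and the builders registered in the Axis configuration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of content-type-driven decoding; not Axis2 or Synapse code.
final class BuilderDispatchSketch {

    // Stand-in for a message builder; not the Axis2 Builder interface.
    interface PayloadBuilder {
        Object build(byte[] data, String contentType);
    }

    private final Map<String, PayloadBuilder> builders = new HashMap<>();

    void register(String mediaType, PayloadBuilder builder) {
        builders.put(mediaType.toLowerCase(Locale.ROOT), builder);
    }

    // "text/plain; charset=UTF-8" -> "text/plain"
    static String mediaType(String contentType) {
        int semi = contentType.indexOf(';');
        String base = semi < 0 ? contentType : contentType.substring(0, semi);
        return base.trim().toLowerCase(Locale.ROOT);
    }

    // Reads the charset parameter; UTF-8 is an assumed default for the sketch.
    static Charset charset(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim();
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                return Charset.forName(name);
            }
        }
        return StandardCharsets.UTF_8;
    }

    // No guessing: an unknown content type is an error, reported right away.
    Object build(byte[] data, String contentType) {
        PayloadBuilder builder = builders.get(mediaType(contentType));
        if (builder == null) {
            throw new IllegalArgumentException("No builder registered for " + contentType);
        }
        return builder.build(data, contentType);
    }

    // Mirrors step 2 of the proposal: default builders for text and binary.
    static BuilderDispatchSketch withDefaults() {
        BuilderDispatchSketch dispatch = new BuilderDispatchSketch();
        dispatch.register("text/plain", (data, ct) -> new String(data, charset(ct)));
        dispatch.register("application/octet-stream", (data, ct) -> data.clone());
        return dispatch;
    }

    public static void main(String[] args) {
        BuilderDispatchSketch dispatch = withDefaults();
        byte[] latin1 = {(byte) 0xE9}; // e with acute accent in ISO-8859-1
        System.out.println(dispatch.build(latin1, "text/plain; charset=ISO-8859-1"));
        try {
            dispatch.build(latin1, "application/x-custom");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A custom format is then supported by registering one more builder, and the charset parameter of the content type replaces the platform default that SYNAPSE-261 complains about.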
--
This message is automatically generated by JIRA.