BaseUtils uses incorrect strategy to distinguish between XML, text and binary
-----------------------------------------------------------------------------
Key: SYNAPSE-304
URL: https://issues.apache.org/jira/browse/SYNAPSE-304
Project: Synapse
Issue Type: Bug
Components: Transports
Reporter: Andreas Veithen
Assignee: Andreas Veithen
Fix For: 1.3
BaseUtils#setSOAPEnvelope (together with BaseUtils#handleLegacyMessage) is used
by the VFS, Mail, JMS and AMQP transports and implements the following
strategy to distinguish between XML, text and binary payloads: It first tries
to parse the payload as XML. If that fails, it tries to load it as text using
BaseUtils#getMessageTextPayload. If that fails again, it loads the message as
binary data using BaseUtils#getMessageBinaryPayload.
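To make the strategy concrete, here is a minimal Java sketch of the fallback
chain. It is written against the JDK's built-in XML parser, not the actual
BaseUtils code. The names FallbackStrategySketch and classify are hypothetical,
and UTF-8 stands in for the platform's default charset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical illustration of "try XML, then fall back to text".
final class FallbackStrategySketch {

    // Returns "xml" if the payload parses as XML, otherwise "text:" + decoded content.
    static String classify(byte[] payload)
            throws IOException, ParserConfigurationException {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(payload));
            return "xml";
        } catch (SAXException e) {
            // The parse error is discarded here; the caller never learns
            // that the payload was meant to be XML.
            return "text:" + new String(payload, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(classify("<a/>".getBytes(StandardCharsets.UTF_8)));
        System.out.println(classify("<a><b></a>".getBytes(StandardCharsets.UTF_8)));
    }
}
```

In this sketch the malformed document <a><b></a> is reported as text, and the
parser's error message is lost.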
This strategy has the following flaws:
* Corrupted or invalid XML messages are not detected as such but interpreted as
text or binary data. This will almost certainly lead to errors at a later stage
in the processing (typically in a mediation that doesn't expect text or binary
payloads), but for the user it is difficult to identify the root cause of the
problem.
* The VFSUtils and MailUtils implementations of the getMessageTextPayload method
never actually fail (except if the file or MIME part can't be read). The reason
is that they read the content as binary and then construct a String object
using new String(byte[]). This constructor never throws an exception, even if
there are byte sequences not valid in the platform's default charset. Therefore
the VFS and mail transport listeners will never process messages as binary
payloads. This problem can't be solved because there is in fact no (reliable)
way to distinguish text from binary data by inspecting the content alone. Note
that using the platform's default charset to decode the message is also
incorrect (see SYNAPSE-261).
* This approach doesn't allow using custom message builders to parse messages
that are not XML, plain text or binary.
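The second flaw can be reproduced with the JDK alone. The sketch below contrasts
the lenient String constructor with a CharsetDecoder configured to report
errors. The class and method names are hypothetical, and UTF-8 is used in place
of the platform's default charset so that the result does not depend on the
machine.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Synapse.
final class LenientDecodingDemo {

    // Lenient decoding: malformed input is silently replaced with U+FFFD,
    // so this method never throws.
    static String decodeLeniently(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Strict decoding: a decoder configured with REPORT throws
    // CharacterCodingException on malformed or unmappable input.
    static String decodeStrictly(byte[] data) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(data)).toString();
    }

    public static void main(String[] args) {
        byte[] notText = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x01};
        String lenient = decodeLeniently(notText);
        System.out.println("lenient decode succeeded, length " + lenient.length());
        try {
            decodeStrictly(notText);
            System.out.println("strict decode succeeded");
        } catch (CharacterCodingException e) {
            System.out.println("strict decode rejected the input");
        }
    }
}
```

Strict decoding is not a fix for the fallback strategy. Single-byte charsets
such as ISO-8859-1 accept every byte sequence, so content inspection still
cannot reliably tell text from binary data.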
I think that every transport should first determine the content type of the
message and then decode the message according to that content type, rather than
trying different ways to decode the message. The decoding should be delegated
to the message builder corresponding to the content type. This approach has the
following advantages:
* Corrupted or invalid messages trigger an appropriate error immediately.
* Since for text payloads the content type can include information about the
charset (text/plain; charset=...), it provides a straightforward solution for
issues like SYNAPSE-261.
* Custom message builders can be used.
* It naturally fits into the Axis2 architecture since it correctly uses the
concepts of transport and message builder.
* It leads to more consistent behavior between different transports (in
particular with the NIO HTTP transport).
The transport should determine the content type either from the service
configuration (e.g. the transport.vfs.ContentType property for VFS) or from
information available at the transport protocol level (Content-Type header for
mail messages, FileContentInfo or file suffix for VFS, message type for JMS,
etc.).
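The content-type-driven approach can be sketched in plain Java as follows. This
is only an illustration of the idea: PayloadBuilder and ContentTypeDispatch are
hypothetical names, not the Axis2 Builder interface or any Synapse class, and
the UTF-8 fallback for a missing charset parameter is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a message builder.
interface PayloadBuilder {
    Object build(byte[] payload, Charset charset) throws Exception;
}

// Hypothetical registry that selects a builder by media type.
final class ContentTypeDispatch {
    private final Map<String, PayloadBuilder> builders = new HashMap<>();

    void register(String mediaType, PayloadBuilder builder) {
        builders.put(mediaType.toLowerCase(Locale.ROOT), builder);
    }

    // Extracts the media type, e.g. "text/plain" from "text/plain; charset=UTF-8".
    static String mediaType(String contentType) {
        int semicolon = contentType.indexOf(';');
        String type = semicolon < 0 ? contentType : contentType.substring(0, semicolon);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Extracts the charset parameter, or returns the fallback if none is given.
    static Charset charset(String contentType, Charset fallback) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("charset")) {
                String name = kv[1].trim();
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                return Charset.forName(name);
            }
        }
        return fallback;
    }

    // Fails immediately if no builder is registered for the content type,
    // instead of trying other interpretations of the payload.
    Object decode(String contentType, byte[] payload) throws Exception {
        PayloadBuilder builder = builders.get(mediaType(contentType));
        if (builder == null) {
            throw new IllegalArgumentException("No builder registered for " + contentType);
        }
        return builder.build(payload, charset(contentType, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        ContentTypeDispatch dispatch = new ContentTypeDispatch();
        dispatch.register("text/plain", (payload, cs) -> new String(payload, cs));
        dispatch.register("application/octet-stream", (payload, cs) -> payload.clone());
        byte[] latin1 = {(byte) 0xE9};
        System.out.println(dispatch.decode("text/plain; charset=ISO-8859-1", latin1));
    }
}
```

Here the charset comes from the content type rather than from the platform
default, and an unknown content type is rejected up front.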
The algorithm to select the message builder based on the content type and to
invoke it to create the SOAP infoset is already implemented in the
TransportUtils.createSOAPMessage utility method in the Axis2 kernel (which is
also used by the NIO HTTP and the UDP transport). Therefore the proposed
changes are:
1. Create message builders for text and binary payloads (which are the
counterparts of PlainTextFormatter and BinaryFormatter introduced by
SYNAPSE-261).
2. Let ServerManager#start register these new message builders by default in
the Axis configuration for content types "text/plain" and
"application/octet-stream" respectively (in a similar way as
AxisConfigBuilder#populateConfig registers default message builders for
text/xml, application/soap+xml, etc.). In addition they should be added to the
default axis2.xml file shipped with Synapse.
3. Make sure that every affected transport implements an appropriate strategy
to determine the content type and uses TransportUtils.createSOAPMessage instead
of BaseUtils#setSOAPEnvelope to process the message payload.
4. Remove BaseUtils#setSOAPEnvelope and related code without replacement.
The proposed work plan is as follows:
* For release 1.2, implement 1 and 2 as well as 3 for the VFS transport. This
allows issue SYNAPSE-261 to be resolved completely.
* For release 1.3, implement 3 and 4 for all remaining transports (for JMS and
AMQP after SYNAPSE-303 has been handled).