In theory you are correct, but we have been a Web Service B2B shop for about 10 
years and nothing can be assumed to work according to standards. We know that 
unless such elements are wrapped in CDATA, the white space will get removed 
somewhere. Opinions on the net:

"Now when the XML specification says any white space, they don’t really mean 
it. HA! The standards leave some aspects of white space handling up to the 
implementers, or at least that’s what the implementers would have us believe. 
I suspect some implementers choose to ignore parts of the standards they 
don’t like or can’t accommodate easily in their toolsets. It’s inevitable 
that different XML parsers make different interpretations of the standards. 
This leads to some fuzzy behavior where white space is concerned."

Another, from an openxml RFP:
"White space handling is an unresolved issue in the present definition of XML 
parsers, falling outside the scope of both the DOM specification and the SAX 
API."

In our case, when we send data on to a customer, who knows what parser or 
technology gets used? How many other routers, mediators and proxies mess up 
the XML, using technology from the 1980s?
So, in practice, we wrap all text data that has leading/trailing spaces in CDATA 
"" "". At the moment, for these small files using this data format standard, I 
am using a JS mediator to take the text payload and insert it into the 
destination XML. This means using CDATA to wrap the JS script, but then I can't 
add another nested CDATA inside the JS script XML to wrap the data. So I will 
have to move this to XSLT or, most likely, Java, unless you know a way around this.
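
One possible way around the nesting limit, as a minimal sketch: a CDATA section 
ends at the first "]]>", so the usual workaround is to never emit that sequence 
literally and to split any occurrence in the data across two sections. The class 
and method names below are hypothetical. The same idea of building the 
terminator from pieces may also carry over to a script that is itself embedded 
in CDATA, but that is an assumption and has not been tested in a Synapse script 
mediator.

```java
final class CdataWrapper {
    // Built from two pieces so the terminator never appears literally in this source.
    private static final String CLOSE = "]]" + ">";
    private static final String OPEN = "<![CDATA[";

    private CdataWrapper() {}

    /**
     * Wraps text in a CDATA section. Any terminator inside the text is split
     * across two adjacent sections so the first section does not end early.
     */
    static String wrap(String text) {
        // "]]" stays in the current section, ">" starts the next one.
        String safe = text.replace(CLOSE, "]]" + CLOSE + OPEN + ">");
        return OPEN + safe + CLOSE;
    }

    public static void main(String[] args) {
        // Leading and trailing spaces end up inside the CDATA section untouched.
        System.out.println(wrap("  AB  "));
        // Data that already contains the terminator is split into two sections.
        System.out.println(wrap("a" + CLOSE + "b"));
    }
}
```

A parser reading the second result sees two adjacent CDATA sections whose 
contents concatenate back to the original text.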

Unfortunately, due to the size/volume of the files, we cannot log them directly 
in Synapse. Due to privacy laws in the US I cannot see the production data 
directly, so it takes a while to debug these issues.
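
One way to trace this without logging the data itself might be to log only a 
summary of each payload at each step: its total length and the number of 
leading and trailing spaces. The sketch below is only an illustration; the class 
and method names are hypothetical, and where it would be called from (a class 
mediator, a test harness) is left open.

```java
final class PaddingSummary {
    private PaddingSummary() {}

    /** Describes a payload by length and space padding, without exposing its content. */
    static String summarize(String payload) {
        int length = payload.length();

        // Count spaces at the start of the payload.
        int leading = 0;
        while (leading < length && payload.charAt(leading) == ' ') {
            leading++;
        }

        // Count spaces at the end, without re-counting spaces already counted as leading.
        int trailing = 0;
        while (trailing < length - leading && payload.charAt(length - 1 - trailing) == ' ') {
            trailing++;
        }

        return "length=" + length + " leadingSpaces=" + leading + " trailingSpaces=" + trailing;
    }

    public static void main(String[] args) {
        // Two leading and three trailing spaces around a two-character field.
        System.out.println(summarize("  AB   "));
    }
}
```

Comparing the summary lines from consecutive steps would show the first step 
at which the padding counts change.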

I trust your view that it cannot be happening inside Synapse/Axis, and that 
really helps narrow down where this is happening. But we are losing the 
spaces, and the first place to rule out when tracing the files through the 
process was Synapse at VFS. Given these problems, wrapping the data in CDATA at 
the start of the process, although a paranoid approach, would mean it is then 
safe all the way through.

It will take a few days to debug this here; I will tell you what we find.


Thanks
Kim






-----Original Message-----
From: Andreas Veithen [mailto:[email protected]]
Sent: Fri 03/04/2009 19:00
To: [email protected]
Subject: Re: VFS Text Files with spaces don't work.
 
This is not very convincing for several reasons:

* An XML parser never removes whitespace.
* A validating XML parser reports whitespace _between elements_ in a
special way, but it is up to the application to decide what to do with
it. Note that we are not talking about this type of whitespace here.
* XSLT is not schema aware.
* While xml:space="preserve" is defined in the XML specs, it has no well-defined
semantics, and it is up to the application to interpret it.
* Axis2 and Synapse don't (or at least shouldn't) remove any whitespace.
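
The first point can be checked quickly with the DOM parser that ships with the 
JDK. This is a minimal sketch, not a statement about any other parser in the 
chain; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class WhitespaceCheck {
    private WhitespaceCheck() {}

    /** Parses the XML string and returns the text content of its root element. */
    static String textOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrapped so callers do not have to handle the checked parser exceptions.
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Brackets make the leading and trailing spaces visible in the output.
        System.out.println("[" + textOf("<text>  AB  </text>") + "]");
        System.out.println("[" + textOf("<text><![CDATA[  AB  ]]></text>") + "]");
    }
}
```

Both calls return the two leading and two trailing spaces intact, with or 
without CDATA.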

What you really need to do is to determine at what step in the
mediation the whitespace is lost. Then we can try to understand why
this is so.

Andreas

On Fri, Apr 3, 2009 at 07:19, kimhorn <[email protected]> wrote:
>
> You are probably right. I haven't had time to look, but I hope the payload
> "text" element is defined as space="preserve". If yes, then it's OK. If not,
> then?
>
> Any idea where the XSD is easily available?
>
> It is probably one of the mediators removing the white space along the way,
> as XML does not preserve this. I will have to add yet more Java to
> wrap the text field in CDATA "". As the recipient cannot change their XSD,
> this looks like the only option. Not sure XSLT will work unless the target
> namespace also defines the element as space preserve?
>
> My simple Synapse script is becoming a massive Java program. And I thought
> "wouldn't it be easy to use a scripting tool like Synapse compared to
> writing Java code". How wrong.
>
> Thanks
> Kim
>
>
>
>
> Andreas Veithen-2 wrote:
>>
>> Are you sure that these spaces get trimmed inside the VFS transport
>> and not somewhere in your mediation? Normally the plain text message
>> builder is designed to strictly preserve the file content (including
>> spaces), so this would be a serious bug.
>>
>> Andreas
>>
>> On Thu, Apr 2, 2009 at 08:52, kimhorn <[email protected]> wrote:
>>>
>>> I have run into a problem with VFS reading text files with fixed-length
>>> fields, where empty fields are padded with spaces. There are a number of
>>> B2B formats that do this.
>>>
>>> If the empty fields are at the start or end of the file, then when these
>>> are inserted into XML as the payload, the XML removes the spaces. The text
>>> should be wrapped in CDATA with double quotes to preserve this space data,
>>> but VFS does not do this. So the fields at the start or end of the file
>>> get lost, and hence the whole file is now garbage.
>>>
>>> Hopefully reading them as binary files (not plain text) will get over
>>> this? Other ideas?
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/VFS-Text-Files-with-spaces-don%27t-work.-tp22841970p22841970.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>
>>>
>>
>>
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/VFS-Text-Files-with-spaces-don%27t-work.-tp22841970p22862146.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
>
>


