There are two things that need to be done to satisfy your requirement:

1. The message builders for text/plain (PlainTextBuilder) and
application/octet-stream (BinaryBuilder) need to be optimized using
techniques that we developed some time ago to efficiently handle large
text output from XSL transformations (TemporaryData,
WrappedTextNodeStreamReader and TextFileDataSource). This is
technically not difficult to do but it is a bit tricky at this moment
because we will have to move some code from Synapse to WS-Commons
transport (and maybe even to Axiom and Axis2) and we are in the middle
of a release process. Note that there was a discussion some time ago
to do a general overhaul and optimization of the message builders and
formatters after the Axis2 1.5 release and the
PlainTextBuilder/BinaryBuilder optimization issue is part of this.

2. We need to be able to store Axiom elements in properties during
mediation. I think this is supported by Synapse's core but not by the
<property> mediator. There was a discussion around this a few days
ago. We also need a mediator that takes an element from a property and
adds it to the current message (in a location specified by an XPath
expression). This is required because using XSLT or a scripting
mediator would indeed cause the content of the file to be read into
memory.
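
As a rough illustration of the idea behind point 1, here is a sketch using
only the JDK. It is not the actual TemporaryData code; the class name
TempFileSpool and the 8 KB buffer size are made up for the example. The
point is that the payload goes to a temporary file in fixed-size chunks, so
heap usage does not depend on the size of the file:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of Synapse): spools a stream to a temp file. */
public final class TempFileSpool {
    private TempFileSpool() {}

    /**
     * Copies the stream to a newly created temporary file in 8 KB chunks
     * and returns the path. The caller is responsible for deleting the file.
     */
    public static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("payload-", ".tmp");
        byte[] buffer = new byte[8192]; // assumed chunk size
        try (OutputStream out = Files.newOutputStream(tmp)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return tmp;
    }
}
```

The returned path can then be kept during mediation and reopened with
Files.newInputStream(path) when the content is finally needed.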

By extending the message builder interface so that messages can be
built from DataSource objects (instead of InputStreams), we could even
go as far as streaming the data from the FTP server directly to the
target Web service without using a temporary file at all.
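
To make that last idea a bit more concrete, here is a minimal sketch in
plain Java. PayloadSource is a hypothetical stand-in for a DataSource
(something that can open its stream on demand), and DirectStreaming.pipe is
an invented name; neither exists in Synapse or Axis2. The source is only
opened at the moment the outgoing message is written:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical stand-in for a DataSource: a payload that is opened on demand. */
interface PayloadSource {
    InputStream open() throws IOException;
}

public final class DirectStreaming {
    private DirectStreaming() {}

    /**
     * Pipes the source to the target through a fixed-size buffer and
     * returns the number of bytes copied. Nothing is held in memory
     * beyond the buffer, and no temporary file is involved.
     */
    public static long pipe(PayloadSource source, OutputStream target) throws IOException {
        byte[] buffer = new byte[8192]; // assumed chunk size
        long total = 0;
        try (InputStream in = source.open()) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                target.write(buffer, 0, n);
                total += n;
            }
        }
        target.flush();
        return total;
    }
}
```

In a real setup the source would wrap the FTP download and the target would
be the output stream of the outgoing HTTP request; both are left out here.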

Regards,

Andreas

On Wed, Feb 25, 2009 at 23:58, kimhorn <[email protected]> wrote:
>
>
> An X12 EDI file is FTPed to us. We want to submit that data to a Web
> Service. The file is essentially a single string of text data up to about 2MB
> in size. We want to submit that data to a 'Submit' Web Service that takes
> that data in a tag, e.g. <data>ISA....ISE</data>. To use the 'Submit' Web
> Service, a transaction ID is required. A call to another Web Service prior
> to the submit is required to get a transaction ID. Hence the chain.
>
> If VFS reads a text file then the XML payload has all the data in one field.
> E.G.
>
> <?xml version="1.0" encoding="utf-8"?>
> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
>  <soapenv:Body>
>    <axis2ns1:text xmlns:axis2ns1="http://ws.apache.org/commons/ns/payload">
> "All the X12 Text data is here. One big blob...."
> </axis2ns1:text>
>  </soapenv:Body>
> </soapenv:Envelope>
>
> So AXIOM does not help here at all. It's all or nothing for that VFS <text>
> tag. The text could be up to 2 MB, once a day, plus many smaller files of
> about 500KB across the day.
>
> The final submit web service call will need to send this data. The data is
> provided by the VFS trigger. A web service call is required in between to
> get the transaction ID. So the data has to be held temporarily in a
> parameter OR some alternative between the calls. So:
>
> VFS Trigger - provides <data> - store temporarily somewhere.
> Web Service Call-1 - get <Transaction ID>
> Web Service Call-2 - send <data> and <Transaction ID>
>
> However this background is not necessary to answer the Technical Question
> above, which focuses on the specific technical risk.
>
> A standalone Java application can solve this problem easily, and most ESBs
> and B2B integration products, e.g. BEA Aqualogic, provide ways to process
> and temporarily store large data blobs. Synapse does not document its
> architecture or provide any information on Quality of Service limits. E.g.
> what is the 'implementation' of a Parameter; can it handle large blobs of
> data? If not, what is the alternative Synapse mechanism to do this?
>
> FTP is traditionally used to move large amounts of data and integration
> products that include FTP adapters, provide mechanisms to process large
> amounts of data. Synapse includes an FTP VFS adapter but no documented
> mechanism to process large blobs of data. Is VFS just a 'TOY' ad hoc add-on
> or can it do real work? Synapse does not have a tailored script
> mechanism for handling B2B data, and only uses other approaches, i.e.
> Javascript etc., that are not designed for integration tasks.
>
> Is the answer that Synapse should not be used to solve this type of
> integration problem, as it does not have mechanisms to process large data
> blobs? If it can, then what are the best Practices and Patterns that should
> be used for Synapse to do this?
>
>
>
> Asankha C. Perera wrote:
>>
>> Hi Kim
>>> The following example script is used for illustrative purposes. In two
>>> places
>>> the VFS payload is copied: into a property, and in the JSScript. In the full
>>> scenario the property value is used later in a chain of Web services but
>>> needs to be stored temporarily, as the request will be replaced by the
>>> request/response of the next web service. The file drop will be replaced by
>>> FTP.
>>> Issue is we get 'big' files by FTP that we need to submit to a Web
>>> Service.
>>> The Web Service is OK with the large data, but will Synapse cope.
>>>
>>> Issues:
>>>
>>> 1) What if the payload gets large;
>>>     What is large (5MB, 20 MB ?) and
>>>     What problems will this mean for this code e.g. Memory etc.
>>>
>> Synapse, especially the NIO transport, is written to never run out of
>> memory (OOM) with large payloads - i.e. it does not bite off more than it
>> can chew. But this does not mean your configuration is safe - for example,
>> reading any payload into a String is bad.. even if it's 1MB.. and I see your
>> script mediator doing that - which is worse.. You could maybe use a
>> simple class/POJO mediator to save or process large payloads etc.
>>> 2) The JSScript could be replaced with a Java Mediator and use a
>>>      stream to better copy the VFS text element {data}. However, it
>>>      would still copy the whole string into memory.
>>>
>> No, in Java, you could get the payload written safely into a file
>> stream, without reading into memory.
>>> 3) Storing the payload temporarily, e.g. alternative to property. How big
>>> can Property be ?
>>>
>> This will depend on how much memory you allocate and how many of these
>> messages are processed concurrently. Ideally, properties should not take
>> much space
>>> 4) using the XPath to get the payload into property.
>>>
>> Not sure I understand.. again, if you are dealing with large payloads,
>> you would need to be careful. Can you state how large your files are,
>> what the frequency of processing is, the number of files in each
>> batch etc, the type of file - XML, Text, CSV etc - and what type of
>> operations you want to perform on this data..
>>
>> One thing I always ask users is to state their business problem or the
>> requirement, and ask us for the best technical solution. Once we
>> understand your problem, we could look into alternatives and weigh pros
>> and cons to help you decide
>>
>> cheers
>> asankha
>>
>> --
>> Asankha C. Perera
>> http://adroitlogic.org
>>
>> http://esbmagic.blogspot.com
>>
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/Large-Payload%3A-Issues-tp22196278p22213925.html
> Sent from the Synapse - User mailing list archive at Nabble.com.
>
>
