Re: Large Web Services responses - how do you deal with them?

Andrew Bruno Mon, 18 May 2009 22:44:23 -0700

Hi Dennis,

I did it as you suggested, via the XMLStreamReader.  I find the XML element
that I am after, assume that the next CDATA event is the mimeContent, and
pipe it to a temp file. Then I base64-decode the file and return an
InputStream over the decoded email to the application that requires it.


I've just tested it with a 200Meg email from a remote Exchange box, and it
worked no probs... ahh finally.

Here is some sample code (to give back to the community ;)

XMLStreamReader xsr = firstElement.getXMLStreamReaderWithoutCaching();
...
events: for (int eventType = xsr.getEventType();; eventType = xsr.next()) {
    switch (eventType) {
        case XMLStreamReader.START_ELEMENT: {
            QName elementName = xsr.getName();
            logger.debug("START_ELEMENT: " + elementName);

            // Only text that directly follows the MimeContent start tag is payload.
            if (mimeContentName.equals(elementName)) {
                logger.info("Setting mimeContent to TRUE. Next CDATA will be mimecontent, saved to ["
                        + tempFilePath + "]");
                mimecontent = true;
            } else {
                logger.debug("Setting mimeContent to false.");
                mimecontent = false;
            }

            depth++;
            break;
        }

        case XMLStreamReader.CDATA: {
            if (mimecontent) {
                // getTextCharacters() may return a larger shared buffer, so write
                // only this event's slice: getTextStart() .. + getTextLength().
                theWriter.write(xsr.getTextCharacters(), xsr.getTextStart(), xsr.getTextLength());
            } else {
                logger.debug("Ignoring CDATA...");
                // We don't care about other data
            }
            break;
        }

        // Remaining cases (END_ELEMENT, END_DOCUMENT, ...) are not shown here;
        // presumably one of them does "break events;" to leave the loop.
    }
}
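
For anyone who wants to try the same idea outside Axis2, here is a minimal, self-contained sketch of the pipeline described above: stream the text of one element to a temp file, then hand back a decoding InputStream. It is not the code I run in production. The class and method names are made up for illustration, it uses the JDK's own StAX parser instead of the Axiom reader, and it relies on java.util.Base64 (Java 8 or later). The JDK parser reports ordinary element text as CHARACTERS events, so the sketch accepts both CHARACTERS and CDATA.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Axis2 or Axiom.
final class MimeContentExtractor {

    /** Streams the text of the first element named localName into a temp file. */
    static Path spoolElementText(InputStream xml, String localName) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Non-coalescing mode lets the parser deliver long text as several events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader xsr = factory.createXMLStreamReader(xml);
        Path tempFile = Files.createTempFile("mimecontent", ".b64");
        boolean inTarget = false;
        try (Writer out = Files.newBufferedWriter(tempFile, StandardCharsets.UTF_8)) {
            while (xsr.hasNext()) {
                int event = xsr.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    inTarget = localName.equals(xsr.getLocalName());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (inTarget) {
                        break; // target element finished, stop reading
                    }
                } else if (inTarget && (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA)) {
                    // Write only this event's slice of the parser's buffer.
                    out.write(xsr.getTextCharacters(), xsr.getTextStart(), xsr.getTextLength());
                }
            }
        } finally {
            xsr.close();
        }
        return tempFile;
    }

    /** Opens the spooled base64 text as a stream of decoded bytes. */
    static InputStream openDecoded(Path base64File) throws IOException {
        // The MIME decoder skips line breaks and decodes lazily as the stream is read.
        return Base64.getMimeDecoder()
                .wrap(new BufferedInputStream(Files.newInputStream(base64File)));
    }
}
```

The caller is responsible for closing the returned stream and deleting the temp file afterwards.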

P.S. If I had more time, I would love to look at ADB or JiBX, but for now
there is no chance, a lesson learned for now.


On Tue, May 19, 2009 at 3:29 PM, Dennis Sosnoski <[email protected]> wrote:

> Ah, I hadn't noticed/remembered that this dealt with a single huge base64
> string. That's just plain bad design for the web service... but no surprise
> if Exchange is involved.
>
> No, you can't handle this cleanly with JiBX or any other data binding tool
> I'm aware of. The only way you *could* handle it would be by getting an
> XMLStreamReader and reading the data directly. The XMLStreamReader
> interface provides next() and getTextCharacters() methods which in theory
> could be used to get this text a chunk at a time. Assuming the parser
> involved actually implements these as intended, you can get a block of text
> at a time and run the base64 decoding on that block, then move on to the
> next block. A lot of work to implement, though.
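
The block-at-a-time decoding described above could be sketched roughly as follows. The main wrinkle is that a text block need not end on a 4-character base64 boundary, so the leftover characters have to be carried over to the next block. The class name ChunkedBase64Decoder and its methods are hypothetical, and java.util.Base64 (Java 8 or later) is assumed; this is an illustration, not code from JiBX, ADB or Axis2.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Base64;

// Hypothetical helper: decodes base64 text that arrives in arbitrary chunks.
final class ChunkedBase64Decoder {
    private final OutputStream sink;
    // Holds at most 3 characters between calls to feed().
    private final StringBuilder pending = new StringBuilder();

    ChunkedBase64Decoder(OutputStream sink) {
        this.sink = sink;
    }

    /** Feed one block of text, e.g. the slice reported by getTextCharacters(). */
    void feed(char[] buf, int start, int length) throws IOException {
        for (int i = start; i < start + length; i++) {
            char c = buf[i];
            if (!Character.isWhitespace(c)) {
                pending.append(c); // drop line breaks inside the encoded data
            }
        }
        // Decode only whole 4-character groups; keep the remainder for later.
        int usable = pending.length() - (pending.length() % 4);
        if (usable > 0) {
            sink.write(Base64.getDecoder().decode(pending.substring(0, usable)));
            pending.delete(0, usable);
        }
    }

    /** Call after the last block; decodes an unpadded remainder, if any. */
    void finish() throws IOException {
        if (pending.length() > 0) {
            sink.write(Base64.getDecoder().decode(pending.toString()));
            pending.setLength(0);
        }
        sink.flush();
    }
}
```

With this shape, each CHARACTERS or CDATA event would call feed() with getTextCharacters(), getTextStart() and getTextLength(), and finish() would be called at the end of the element, so only one block of text is in memory at a time.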
>
> Andrew, you might want to try switching to ADB or JiBX anyway, at least to
> try for an easier solution than working directly with the XMLStreamReader.
> Either of these should use less memory than XMLBeans for the same data, so
> as long as your text strings are only very large and not unbounded these
> might let you squeak by.
>
>  - Dennis
>
>
>
> Andreas Veithen wrote:
>
>> Dennis,
>>
>> I think that the Web service is actually Microsoft Exchange, so
>> changing it is not an option. Does JiBX support caching a base64 value
>> (which is not represented using MTOM) on disk instead of memory?
>>
>> Andrew,
>>
>> You might still want to give MTOM a try. Many servers switch to MTOM
>> if the request is sent as MTOM. I can't believe that Microsoft Exchange
>> can only return inline base64.
>>
>> Regards,
>>
>> Andreas
>>
>> On Mon, May 18, 2009 at 13:22, Dennis Sosnoski <[email protected]> wrote:
>>
>>
>>> Hi Andrew,
>>>
>>> Your best starting point is to get rid of XMLBeans. XMLBeans always stores
>>> raw XML in memory, so there's no way to avoid the memory issues with large
>>> messages.
>>>
>>> ADB would avoid some of the overhead, in that it would convert the XML
>>> message to an object graph, which would typically be a factor of 2-5x
>>> smaller than the raw XML data (depending on the type of data in your
>>> message).
>>>
>>> JiBX would do at least as well as ADB in terms of the reduced-size object
>>> graph. If you have some way of breaking up the response data into more
>>> easily-digestible chunks, JiBX would also allow you to do piece-meal
>>> processing of the message (by creating a fake collection, for instance,
>>> where your data objects expose an addXXX() method which just writes the
>>> object being added to some backing store rather than adding it to an
>>> in-memory collection). Piece-meal processing is the only way you can
>>> dramatically decrease your memory usage and handle messages of effectively
>>> unlimited size without a problem.
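
To make the "fake collection" idea concrete, here is a rough sketch of the data-object side only. The class SpoolingItemList and its addItem() method are invented for illustration, and the items are plain strings for simplicity. How a binding framework such as JiBX would be configured to call the add method is not shown and is not something this sketch claims to know.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class: looks like the owner of a collection, but each added
// item goes straight to a backing file instead of an in-memory list.
final class SpoolingItemList implements AutoCloseable {
    private final BufferedWriter store;
    private long count;

    SpoolingItemList(Path backingFile) throws IOException {
        this.store = Files.newBufferedWriter(backingFile, StandardCharsets.UTF_8);
    }

    /** The add method that would be called once per unmarshalled item. */
    public void addItem(String item) {
        try {
            store.write(item);
            store.newLine();
            count++; // only a counter is kept in memory, never the items
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public long size() {
        return count;
    }

    @Override
    public void close() throws IOException {
        store.close();
    }
}
```

Memory use then stays roughly constant no matter how many items the message contains, which is the point of piece-meal processing. It does not help when a single item is itself huge.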
>>>
>>>  - Dennis
>>>
>>> --
>>> Dennis M. Sosnoski
>>> SOA and Web Services in Java
>>> Axis2 Training and Consulting
>>> http://www.sosnoski.com - http://www.sosnoski.co.nz
>>> Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117
>>>
>>>
>>>
>>> Andrew Bruno wrote:
>>>
>>>
>>>> Mr J,
>>>>
>>>> But it's just one data element that I need, so there is no concept of many
>>>> rows, etc.
>>>>
>>>> As a matter of fact, I am breaking up the call into 2 calls, one for all the
>>>> metadata, and one for the actual mime content.
>>>>
>>>> i.e.
>>>>
>>>>  <m:ResponseMessages>
>>>>   <m:GetItemResponseMessage ResponseClass="Success">
>>>>     <m:ResponseCode>NoError</m:ResponseCode>
>>>>     <m:Items>
>>>>       <t:Message>
>>>>         <t:MimeContent CharacterSet="UTF-8">RnJvbTogTGVlIF..... large mime base64 encoded email......... 120Meg++ of encoded data</t:MimeContent>
>>>> ....
>>>>
>>>> I only ever request one message at a time.  It's just that when the
>>>> MimeContent is greater than 10Meg, an OutOfMemoryError occurs.
>>>>
>>>> It appears to be the case because the parser loads all the content into
>>>> memory.
>>>>
>>>> In theory, this element can have data of any size coming back.
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, May 18, 2009 at 6:54 PM, J. Hondius <[email protected]> wrote:
>>>>
>>>>   I'd think about a different design of the webservice calls.
>>>>   You should try to avoid real big results.
>>>>   Split into more calls.
>>>>
>>>>   Something like:
>>>>   One to get an overview list
>>>>   Another to get details
>>>>
>>>>   Or:
>>>>   One call to get the size like the SQL count() does
>>>>   combined with
>>>>   Add parameters to your call to limit the result, like start_at and
>>>>   number_of_results
>>>>
>>>>   my 2c
>>>>
>>>>   Andrew Bruno wrote:
>>>>
>>>>       Hello all
>>>>
>>>>       I was wondering how some of you may be dealing with web service calls
>>>>       that result in extremely large data responses?
>>>>
>>>>       I have been struggling in trying to change the way the parsing of the
>>>>       XML response works, as I am getting out of memory errors:
>>>>
>>>>       java.lang.OutOfMemoryError: Java heap space
>>>>           at org.apache.xmlbeans.impl.store.CharUtil.allocate(CharUtil.java:397)
>>>>           at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:506)
>>>>           at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:419)
>>>>           at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:489)
>>>>           at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:2911)
>>>>           at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.stripText(Cur.java:3113)
>>>>           at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:3126)
>>>>           at org.apache.xmlbeans.impl.store.Locale.loadXMLStreamReader(Locale.java:1154)
>>>>           at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:843)
>>>>           at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:826)
>>>>           at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:231)
>>>>           .....
>>>>
>>>>       Is there a way to change the parser to use a temp file rather than
>>>>       trying to buffer the response in memory?
>>>>
>>>>       Should I be directing this question to the developers list?
>>>>
>>>>       Or should I be thinking differently on solving this problem?
>>>>
>>>>       Please any ideas :(
>>>>
>>>>       Thank you
>>>>       Andrew
>>>>
>>>>
>>>>
>>>
>>
>>
>
