Glad this approach works for you, Andrew. Thanks for responding to the list for the benefit of others facing the same kind of problem (working with idiotic MS services, that is... :-) ).

 - Dennis



Andrew Bruno wrote:
Hi Dennis,

I did it as you suggested, via the XMLStreamReader. I find the XML element that I am after, assume the next CDATA event is the mimeContent, and pipe it to a temp file. I then base64-decode the file and return an InputStream so that the application that requires the email can read it.

I've just tested it with a 200Meg email from a remote Exchange box, and it worked no probs... ahh finally.

Here is some sample code (to give back to community ;)

XMLStreamReader xsr = firstElement.getXMLStreamReaderWithoutCaching();
...
events: for (int eventType = xsr.getEventType();; eventType = xsr.next()) {
    switch (eventType) {
        case XMLStreamReader.START_ELEMENT: {
            logger.debug("START_ELEMENT: " + xsr.getName());

            QName elementName = xsr.getName();

            if (mimeContentName.equals(elementName)) {
                logger.info("Setting mimeContent to TRUE. Next CDATA will be mimecontent, saved to [" + tempFilePath + "]");
                mimecontent = true;
            } else {
                logger.debug("Setting mimeContent to false.");
                mimecontent = false;
            }

            depth++;
            break;
        }

        case XMLStreamReader.CDATA: {
            if (mimecontent) {
                int sourceStart = 0;
                int length = xsr.getTextLength();
                char[] target = new char[length];
                int copiedLength = 0;

                int available = xsr.getTextLength() - sourceStart;
                if (available < 0) {
                    throw new IndexOutOfBoundsException("sourceStart is greater than the number of characters associated with this event");
                }

                if (available < length) {
                    copiedLength = available;
                } else {
                    copiedLength = length;
                }

                char[] textChars = xsr.getTextCharacters();

                // Only copy the chars that we need.
                System.arraycopy(textChars, xsr.getTextStart() + sourceStart, target, 0, copiedLength);
                // Write only the chars that were actually copied.
                theWriter.write(target, 0, copiedLength);

            } else {
                logger.debug("Ignoring CDATA...");
                // We don't care about other data
            }

            break;
        }

        case XMLStreamReader.END_DOCUMENT: {
            // Assumed exit condition (not in the snippet as posted):
            // leave the labeled loop once the document ends.
            break events;
        }
    }
}
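
The base64-decode step described above is not part of the sample code. A minimal sketch of it, assuming Java 9+ and java.util.Base64 (which did not exist in 2009, when a library decoder would have been needed); the class and method names are hypothetical and not taken from the original application:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

class Base64TempFileExample {

    /** Opens a stream that base64-decodes the temp file lazily, as it is read. */
    static InputStream openDecoded(Path tempFile) throws IOException {
        InputStream raw = new BufferedInputStream(Files.newInputStream(tempFile));
        // The MIME decoder ignores line breaks that may be present in the encoded text.
        return Base64.getMimeDecoder().wrap(raw);
    }

    public static void main(String[] args) throws IOException {
        Path tempFile = Files.createTempFile("mimeContent", ".b64");
        try {
            // Stand-in for the text piped out of the XMLStreamReader.
            Files.write(tempFile, "RnJvbTogTGVl".getBytes(StandardCharsets.US_ASCII));
            try (InputStream in = openDecoded(tempFile)) {
                System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }
}
```

Wrapping the file stream means the decoded email is never held in memory as a whole; the caller reads it at its own pace.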

P.S. If I had more time, I would love to look at ADB or JiBX, but there is no chance of that right now. A lesson learned.


On Tue, May 19, 2009 at 3:29 PM, Dennis Sosnoski <[email protected]> wrote:

    Ah, I hadn't noticed/remembered this dealt with a single huge
    base64 string. That's just plain bad design for the web service...
    but no surprise if Exchange is involved.

    No, you can't handle this cleanly with JiBX or any other data
    binding tool I'm aware of. The only way you *could* handle it
    would be by getting an XMLStreamReader and reading the data
    directly. The XMLStreamReader interface provides next() and
    getTextCharacters() methods which in theory could be used to get
    this text a chunk at a time. Assuming the parser involved actually
    implements these as intended, you can get a block of text at a
    time and run the base64 decoding on that block, then move on to
    the next block. A lot of work to implement, though.
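
A rough sketch of that block-at-a-time idea, assuming the JDK's built-in StAX parser and java.util.Base64 (Java 8+). The class and method names are hypothetical, and whether a given parser really delivers large text as several events depends on the implementation, as noted above:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringReader;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ChunkedBase64Example {

    /** Decodes the text of the first element named localName, one text event at a time. */
    static void decodeElement(XMLStreamReader xsr, String localName, OutputStream out)
            throws XMLStreamException, IOException {
        // Holds the 0-3 characters left over when a chunk does not end on a 4-char boundary.
        StringBuilder pending = new StringBuilder();
        boolean inside = false;
        while (xsr.hasNext()) {
            int event = xsr.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                inside = localName.equals(xsr.getLocalName());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                if (inside) {
                    break;
                }
            } else if (inside && (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA)) {
                char[] buf = xsr.getTextCharacters();
                int start = xsr.getTextStart();
                int end = start + xsr.getTextLength();
                for (int i = start; i < end; i++) {
                    if (!Character.isWhitespace(buf[i])) {
                        pending.append(buf[i]);
                    }
                }
                // Decode only whole 4-character groups; keep the rest for the next chunk.
                int usable = pending.length() - pending.length() % 4;
                if (usable > 0) {
                    out.write(Base64.getDecoder().decode(pending.substring(0, usable)));
                    pending.delete(0, usable);
                }
            }
        }
        if (pending.length() > 0) {
            out.write(Base64.getDecoder().decode(pending.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Message><MimeContent>RnJvbTog\nTGVl</MimeContent></Message>";
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Coalescing stays off so that large text may arrive as several events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader xsr = factory.createXMLStreamReader(new StringReader(xml));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        decodeElement(xsr, "MimeContent", out);
        System.out.println(out.toString("UTF-8"));
    }
}
```

For real messages the OutputStream would be a file rather than a byte array, so memory use stays bounded by the parser's chunk size.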

    Andrew, you might want to try switching to ADB or JiBX anyway, at
    least to try for an easier solution than working directly with the
    XMLStreamReader. Either of these should use less memory than
    XMLBeans for the same data, so as long as your text strings are
    only very large and not unbounded these might let you squeak by.

     - Dennis



    Andreas Veithen wrote:

        Dennis,

        I think that the Web service is actually Microsoft Exchange, so
        changing it is not an option. Does JiBX support caching a base64
        value (which is not represented using MTOM) on disk instead of
        memory?

        Andrew,

        You might still want to give MTOM a try. Many servers switch to
        MTOM if the request is sent as MTOM. I can't believe that
        Microsoft Exchange can only return inline base64.

        Regards,

        Andreas

        On Mon, May 18, 2009 at 13:22, Dennis Sosnoski <[email protected]> wrote:
            Hi Andrew,

            Your best starting point is to get rid of XMLBeans. XMLBeans
            always stores raw XML in memory, so there's no way to avoid the
            memory issues with large messages.

            ADB would avoid some of the overhead, in that it would convert
            the XML message to an object graph, which would typically be a
            factor of 2-5x smaller than the raw XML data (depending on the
            type of data in your message).

            JiBX would do at least as well as ADB in terms of the
            reduced-size object graph. If you have some way of breaking up
            the response data into more easily-digestible chunks, JiBX would
            also allow you to do piece-meal processing of the message (by
            creating a fake collection, for instance, where your data
            objects expose an addXXX() method which just writes the object
            being added to some backing store rather than adding it to an
            in-memory collection). Piece-meal processing is the only way you
            can dramatically decrease your memory usage and handle messages
            of effectively unlimited size without a problem.
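
A minimal sketch of such a "fake collection" in plain Java. The class name, the addItem() method, and the line-per-item file format are all hypothetical, and the JiBX binding configuration that would make the unmarshaller call addItem() is not shown:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SpillingItemList implements AutoCloseable {
    private final BufferedWriter store;
    private long count;

    SpillingItemList(Path backingFile) throws IOException {
        this.store = Files.newBufferedWriter(backingFile, StandardCharsets.UTF_8);
    }

    /** Called once per item; writes the item to the backing file instead of keeping it. */
    public void addItem(String item) {
        try {
            store.write(item);
            store.newLine();
            count++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Number of items written so far; none of them are held in memory. */
    public long size() {
        return count;
    }

    @Override
    public void close() throws IOException {
        store.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("items", ".txt");
        try (SpillingItemList items = new SpillingItemList(file)) {
            items.addItem("first");
            items.addItem("second");
            System.out.println(items.size() + " items written to " + file);
        }
        Files.deleteIfExists(file);
    }
}
```

Memory use then depends on the size of one item rather than on the size of the whole message.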

             - Dennis

            --
            Dennis M. Sosnoski
            SOA and Web Services in Java
            Axis2 Training and Consulting
            http://www.sosnoski.com - http://www.sosnoski.co.nz
            Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117



            Andrew Bruno wrote:
                Mr J,

                But it's just one data element that I need, so there is no
                concept of many rows, etc.

                As a matter of fact, I am breaking up the call into 2 calls,
                one for all the metadata, and one for the actual mime content.

                i.e.

                 <m:ResponseMessages>
                  <m:GetItemResponseMessage ResponseClass="Success">
                    <m:ResponseCode>NoError</m:ResponseCode>
                    <m:Items>
                      <t:Message>
                        <t:MimeContent CharacterSet="UTF-8">RnJvbTogTGVlIF..... large mime base64 encoded email......... 120Meg++ of encoded data</t:MimeContent>
                ....

                I only ever request one message at a time.  It's just that
                when the MimeContent is greater than 10Meg, an
                OutOfMemoryError occurs.

                It appears to be the case because the parser loads all the
                content into memory.

                In theory, this element can have data of any size coming back.




                On Mon, May 18, 2009 at 6:54 PM, J. Hondius <[email protected]> wrote:

                  I'd think about a different design of the webservice calls.
                  You should try to avoid really big results.
                  Split into more calls.

                  Something like:
                  One to get an overview list
                  Another to get details

                  Or:
                  One call to get the size, like the SQL count() does,
                  combined with parameters added to your call to limit the
                  result, like start_at and number_of_results
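
A small sketch of that paging pattern, assuming a service that accepts start_at and number_of_results parameters. The PageSource interface is hypothetical and stands in for the real web service call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class PagedFetchExample {

    /** Hypothetical stand-in for a web service call that returns one page of results. */
    interface PageSource<T> {
        List<T> fetch(int startAt, int numberOfResults);
    }

    /** Fetches page after page until a short page signals the end; returns the item count. */
    static <T> int forEachPaged(PageSource<T> source, int pageSize, Consumer<T> handler) {
        int startAt = 0;
        while (true) {
            List<T> page = source.fetch(startAt, pageSize);
            page.forEach(handler);
            startAt += page.size();
            if (page.size() < pageSize) {
                return startAt;
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            all.add(i);
        }
        // In-memory source used only for the demonstration.
        PageSource<Integer> source = (startAt, n) ->
                all.subList(Math.min(startAt, all.size()), Math.min(startAt + n, all.size()));
        int total = forEachPaged(source, 4, item -> System.out.println("got " + item));
        System.out.println("total = " + total);
    }
}
```

Only one page is held in memory at a time, which is the point of splitting the call.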

                  my 2c

                  Andrew Bruno wrote:

                      Hello all

                      I was wondering how some of you may be dealing with web
                      service calls that result in extremely large data
                      responses?

                      I have been struggling in trying to change the way the
                      parsing of the XML response works, as I am getting out
                      of memory errors:

                      java.lang.OutOfMemoryError: Java heap space
                             at org.apache.xmlbeans.impl.store.CharUtil.allocate(CharUtil.java:397)
                             at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:506)
                             at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:419)
                             at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:489)
                             at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:2911)
                             at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.stripText(Cur.java:3113)
                             at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:3126)
                             at org.apache.xmlbeans.impl.store.Locale.loadXMLStreamReader(Locale.java:1154)
                             at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:843)
                             at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:826)
                             at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:231)
                             .....

                      Is there a way to change the parser to use a temp file
                      rather than trying to buffer the response in memory?

                      Should I be directing this question to the developers
                      list?

                      Or should I be thinking differently on solving this
                      problem?

                      Please any ideas :(

                      Thank you
                      Andrew


