Hi Dennis,
I have done it as you suggested, via the XMLStreamReader. I find the XML
element that I am after, assume that the next CDATA event is the
MimeContent, and pipe it to a temp file. I then base64-decode the file and
return an InputStream so that the application that requires the email can
access it.
I've just tested it with a 200 MB email from a remote Exchange box, and it
worked with no problems... ahh, finally.
Here is some sample code (to give back to the community ;)
XMLStreamReader xsr = firstElement.getXMLStreamReaderWithoutCaching();
...
events: for (int eventType = xsr.getEventType();; eventType = xsr.next()) {
    switch (eventType) {
    case XMLStreamReader.START_ELEMENT: {
        QName elementName = xsr.getName();
        logger.debug("START_ELEMENT: " + elementName);
        // Flag the MimeContent element so that the next CDATA event is saved
        mimecontent = mimeContentName.equals(elementName);
        if (mimecontent) {
            logger.info("Setting mimeContent to TRUE. Next CDATA will be "
                    + "mimecontent, saved to [" + tempFilePath + "]");
        } else {
            logger.debug("Setting mimeContent to false.");
        }
        depth++;
        break;
    }
    case XMLStreamReader.CDATA: {
        if (mimecontent) {
            // Write only the characters that belong to this event; the
            // array from getTextCharacters() may be a larger shared buffer.
            theWriter.write(xsr.getTextCharacters(), xsr.getTextStart(),
                    xsr.getTextLength());
        } else {
            // We don't care about other data
            logger.debug("Ignoring CDATA...");
        }
        break;
    }
    // Other cases are not shown in this sample; the loop needs one (for
    // example END_DOCUMENT) that does "break events;" to terminate.
    }
}
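
The decode step described above (temp file of base64 text in, InputStream of the raw email out) is not part of the sample. It could look roughly like the following. This is only a minimal sketch, assuming Java 8 or later for java.util.Base64; the names MimeContentTempFile and openDecoded are hypothetical and chosen just for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper, not part of Axis2 or Axiom.
class MimeContentTempFile {

    /** Opens the temp file of base64 text and returns a stream of decoded bytes. */
    static InputStream openDecoded(Path base64File) throws IOException {
        // The MIME decoder skips line breaks and other non-base64 characters,
        // and wrap() decodes lazily, so the content is never held in memory.
        return Base64.getMimeDecoder().wrap(Files.newInputStream(base64File));
    }

    public static void main(String[] args) throws IOException {
        byte[] email = "From: someone\r\nSubject: test\r\n\r\nbody".getBytes(StandardCharsets.US_ASCII);
        String encoded = Base64.getMimeEncoder().encodeToString(email);

        Path tempFile = Files.createTempFile("mimeContent", ".b64");
        try {
            // Stand-in for the parsing loop: write the base64 text in two chunks.
            try (Writer theWriter = Files.newBufferedWriter(tempFile, StandardCharsets.US_ASCII)) {
                int half = encoded.length() / 2;
                theWriter.write(encoded, 0, half);
                theWriter.write(encoded, half, encoded.length() - half);
            }
            // Read the decoded email back through the streaming decoder.
            ByteArrayOutputStream decoded = new ByteArrayOutputStream();
            try (InputStream in = openDecoded(tempFile)) {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    decoded.write(buffer, 0, n);
                }
            }
            System.out.println(new String(decoded.toByteArray(), StandardCharsets.US_ASCII));
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }
}
```

The caller is responsible for closing the returned stream and deleting the temp file once the email has been consumed.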
P.S. If I had more time, I would love to look at ADB or JiBX, but there is
no chance of that right now. A lesson learned.
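
For what it's worth, the block-at-a-time decoding that Dennis describes in his reply below could avoid the intermediate file of base64 text altogether. The wrinkle is that a chunk from the parser can end in the middle of a 4-character base64 group, so the leftover characters have to be carried into the next chunk. A rough sketch only, assuming Java 8 or later for java.util.Base64; ChunkedBase64Decoder is a hypothetical helper and I have not run it against Exchange.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: decodes base64 text that arrives in arbitrary chunks.
class ChunkedBase64Decoder {
    private final OutputStream out;
    // Characters not yet decoded; after each write() at most 3 remain.
    private final StringBuilder pending = new StringBuilder();

    ChunkedBase64Decoder(OutputStream out) {
        this.out = out;
    }

    /** Feed one chunk, e.g. getTextCharacters(), getTextStart(), getTextLength(). */
    void write(char[] chars, int start, int length) throws IOException {
        for (int i = start; i < start + length; i++) {
            if (!Character.isWhitespace(chars[i])) {
                pending.append(chars[i]);
            }
        }
        // Decode only whole 4-character groups; keep the remainder for later.
        int usable = pending.length() - (pending.length() % 4);
        if (usable > 0) {
            out.write(Base64.getDecoder().decode(pending.substring(0, usable)));
            pending.delete(0, usable);
        }
    }

    /** Decode whatever is left (an unpadded final group) and flush. */
    void finish() throws IOException {
        if (pending.length() > 0) {
            out.write(Base64.getDecoder().decode(pending.toString()));
            pending.setLength(0);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[] email = "From: someone\r\n\r\nbody".getBytes(StandardCharsets.US_ASCII);
        char[] encoded = Base64.getEncoder().encodeToString(email).toCharArray();

        ByteArrayOutputStream decoded = new ByteArrayOutputStream();
        ChunkedBase64Decoder decoder = new ChunkedBase64Decoder(decoded);
        // Feed 5 characters at a time so that chunks split base64 groups.
        for (int i = 0; i < encoded.length; i += 5) {
            decoder.write(encoded, i, Math.min(5, encoded.length - i));
        }
        decoder.finish();
        System.out.println(new String(decoded.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

In the parsing loop, the write() call would take the place of theWriter.write(...), with the OutputStream pointing at the temp file for the decoded email.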
On Tue, May 19, 2009 at 3:29 PM, Dennis Sosnoski <[email protected]> wrote:
> Ah, I hadn't noticed/remembered that this dealt with a single huge base64
> string. That's just plain bad design for the web service... but no surprise
> if Exchange is involved.
>
> No, you can't handle this cleanly with JiBX or any other data binding tool
> I'm aware of. The only way you *could* handle it would be by getting an
> XMLStreamReader and reading the data directly. The XMLStreamReader
> interface provides next() and getTextCharacters() methods which in theory
> could be used to get this text a chunk at a time. Assuming the parser
> involved actually implements these as intended, you can get a block of text
> at a time and run the base64 decoding on that block, then move on to the
> next block. A lot of work to implement, though.
>
> Andrew, you might want to try switching to ADB or JiBX anyway, at least to
> try for an easier solution than working directly with the XMLStreamReader.
> Either of these should use less memory than XMLBeans for the same data, so
> as long as your text strings are only very large and not unbounded these
> might let you squeak by.
>
> - Dennis
>
>
>
> Andreas Veithen wrote:
>
>> Dennis,
>>
>> I think that the Web service is actually Microsoft Exchange, so
>> changing it is not an option. Does JiBX support caching a base64 value
>> (which is not represented using MTOM) on disk instead of memory?
>>
>> Andrew,
>>
>> You might still want to give MTOM a try. Many servers switch to MTOM
>> if the request is sent as MTOM. I can't believe that Microsoft Exchange
>> can only return inline base64.
>>
>> Regards,
>>
>> Andreas
>>
>> On Mon, May 18, 2009 at 13:22, Dennis Sosnoski <[email protected]> wrote:
>>
>>
>>> Hi Andrew,
>>>
>>> Your best starting point is to get rid of XMLBeans. XMLBeans always stores
>>> raw XML in memory, so there's no way to avoid the memory issues with large
>>> messages.
>>>
>>> ADB would avoid some of the overhead, in that it would convert the XML
>>> message to an object graph, which would typically be a factor of 2-5x
>>> smaller than the raw XML data (depending on the type of data in your
>>> message).
>>>
>>> JiBX would do at least as well as ADB in terms of the reduced-size object
>>> graph. If you have some way of breaking up the response data into more
>>> easily-digestible chunks, JiBX would also allow you to do piece-meal
>>> processing of the message (by creating a fake collection, for instance,
>>> where your data objects expose an addXXX() method which just writes the
>>> object being added to some backing store rather than adding it to an
>>> in-memory collection). Piece-meal processing is the only way you can
>>> dramatically decrease your memory usage and handle messages of effectively
>>> unlimited size without a problem.
>>>
>>> - Dennis
>>>
>>> --
>>> Dennis M. Sosnoski
>>> SOA and Web Services in Java
>>> Axis2 Training and Consulting
>>> http://www.sosnoski.com - http://www.sosnoski.co.nz
>>> Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117
>>>
>>>
>>>
>>> Andrew Bruno wrote:
>>>
>>>
>>>> Mr J,
>>>>
>>>> But it's just one data element that I need, so there is no concept of
>>>> many rows, etc.
>>>>
>>>> As a matter of fact, I am breaking up the call into 2 calls, one for all
>>>> the metadata, and one for the actual mime content.
>>>>
>>>> i.e.
>>>>
>>>> <m:ResponseMessages>
>>>> <m:GetItemResponseMessage ResponseClass="Success">
>>>> <m:ResponseCode>NoError</m:ResponseCode>
>>>> <m:Items>
>>>> <t:Message>
>>>> <t:MimeContent CharacterSet="UTF-8">RnJvbTogTGVlIF..... large
>>>> mime base64 encoded email......... 120Meg++ of encoded
>>>> data</t:MimeContent>
>>>> ....
>>>>
>>>> I only ever request one message at a time. It's just that when the
>>>> MimeContent is greater than 10 MB, an OutOfMemoryError occurs.
>>>>
>>>> That appears to be because the parser loads all the content into
>>>> memory.
>>>>
>>>> In theory, this element can have data of any size coming back.
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, May 18, 2009 at 6:54 PM, J. Hondius <[email protected]> wrote:
>>>>
>>>> I'd think about a different design of the webservice calls.
>>>> You should try to avoid real big results.
>>>> Split into more calls.
>>>>
>>>> Something like:
>>>> One to get an overview list
>>>> Another to get details
>>>>
>>>> Or:
>>>> One call to get the size like the SQL count() does
>>>> combined with
>>>> Add parameters to your call to limit the result: like start_at,
>>>> number_of_results
>>>>
>>>> my 2c
>>>>
>>>> Andrew Bruno schreef:
>>>>
>>>> Hello all
>>>>
>>>> I was wondering how some of you may be dealing with web service calls
>>>> that result in extremely large data responses?
>>>>
>>>> I have been struggling in trying to change the way the parsing of the
>>>> XML response works, as I am getting out of memory errors:
>>>>
>>>> java.lang.OutOfMemoryError: Java heap space
>>>>     at org.apache.xmlbeans.impl.store.CharUtil.allocate(CharUtil.java:397)
>>>>     at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:506)
>>>>     at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:419)
>>>>     at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:489)
>>>>     at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:2911)
>>>>     at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.stripText(Cur.java:3113)
>>>>     at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:3126)
>>>>     at org.apache.xmlbeans.impl.store.Locale.loadXMLStreamReader(Locale.java:1154)
>>>>     at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:843)
>>>>     at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:826)
>>>>     at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:231)
>>>>     .....
>>>>
>>>> Is there a way to change the parser to use a temp file rather than
>>>> trying to buffer the response in memory?
>>>>
>>>> Should I be directing this question to the developers list?
>>>>
>>>> Or should I be thinking differently on solving this problem?
>>>>
>>>> Please any ideas :(
>>>>
>>>> Thank you
>>>> Andrew
>>>>
>>>>
>>>>
>>>
>>
>>
>