Vincent Jacques wrote:
Hello,

I'm trying to use the SAX2 API to implement a match function for boost::asio::async_read_until. Please don't stop reading if you don't know this library, I'll try to explain :-)

This boost::asio::async_read_until function reads some bytes from a TCP/IP socket, then calls the user-provided match function to check whether it has read enough data. If yes, it triggers a user-provided callback function. If no, it performs a new read on the socket and tries again.
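
For readers who don't know the library, here is a minimal sketch of the shape such a match function takes. It only looks for a newline, which is far simpler than the XML case discussed in this thread, and the name match_newline is made up for illustration. It is written against plain iterators so it does not need Boost to compile; with Boost.Asio the iterators would be the library's buffer iterators.

```cpp
#include <string>
#include <utility>

// Same shape as a Boost.Asio match condition: [begin, end) is the data read
// so far. The bool says whether enough data has arrived; the iterator marks
// how far the input has been examined.
template <typename Iterator>
std::pair<Iterator, bool> match_newline(Iterator begin, Iterator end)
{
    for (Iterator it = begin; it != end; ++it) {
        if (*it == '\n') {
            Iterator next = it;
            ++next;                              // one past the delimiter
            return std::make_pair(next, true);   // enough data: callback fires
        }
    }
    return std::make_pair(end, false);           // not enough: read again
}
```

Returning false is what makes the library go back to the socket for more bytes before calling the function again.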

What I want is for the callback to be called each time I receive a complete element that is a direct child of the root element:
<?xml version="1.0"?>
<root>
<element><foo><bar/></foo></element> <!-- CALLBACK HERE -->
<bee><gee/></bee> <!-- CALLBACK HERE -->
</root>

So, I've designed a SAX2 content handler which tells me if the parsing is at such a place, and I loop on SAX2XMLReader::parseNext, asking my handler if it's ok to call the callback, each time it returns.
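
The bookkeeping behind such a handler can be sketched as a simple depth counter. This is only an illustration of the idea, not the actual handler: ChildOfRootTracker and its method names are hypothetical, and in real code the two methods would presumably be called from the startElement and endElement callbacks of a SAX2 content handler.

```cpp
#include <cassert>

// Hypothetical helper, not a Xerces class. It only counts element depth.
class ChildOfRootTracker {
public:
    void startElement() { ++depth_; }

    void endElement() {
        assert(depth_ > 0);
        --depth_;
        // Depth 1 means only the root is still open, so the element that
        // just closed was a direct child of the root.
        if (depth_ == 1) {
            childCompleted_ = true;
        }
    }

    // Returns true once per completed child, then resets the flag.
    bool takeChildCompleted() {
        bool result = childCompleted_;
        childCompleted_ = false;
        return result;
    }

private:
    int depth_ = 0;
    bool childCompleted_ = false;
};
```

The loop around parseNext would then call takeChildCompleted() after each return to decide whether the callback should fire.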

I've also designed a BinInputStream and an InputSource that transfer the data given by Boost.Asio to Xerces.

My problem is that my match function is called in a context where I'm not sure I'll be able to give anything to Xerces if it calls readBytes. It's possible, for example, that I've already given
"<?xml version="1.0"?><root><element><foo><bar/></foo></elem"
to Xerces, because that is all I've received so far. In this case, I must reply 'false' to Boost, to wait for some more bytes.

But at this point, Xerces calls readBytes, and I don't know what to do:
- if I give it 0 bytes, it throws an exception saying it has reached the end of file, and I cannot resume the parse when I've received more data.
- if I throw an exception from readBytes, it's no better.

*To make it short*: I need to cleanly exit parseNext when it calls readBytes and I cannot give anything, while keeping the possibility to call parseNext later.

Any idea?

Unfortunately, parseFirst() and parseNext() weren't designed to support speculative reading of the input source. They assume the input is always available, or that a call to readBytes() will block until something is available. Furthermore, parseNext() won't return until it has actually parsed something, and what counts as that unit of work is not well specified.

My gut feeling is it would be difficult to implement this behavior with parseFirst() and parseNext(), because they are implemented using the same core code as parse().
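
To illustrate the "readBytes() will block until something is available" assumption, here is a rough sketch of a byte queue whose read call waits for data. It is not part of Xerces-C++ and was not proposed in this thread: BlockingByteQueue, push and close are hypothetical names, and the sketch assumes the parse runs on its own thread while the network side feeds the queue from another.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

// Hypothetical class, not a Xerces type: a thread-safe byte queue whose
// readBytes() blocks until data arrives or the producer closes the queue.
class BlockingByteQueue {
public:
    // Called by the network side whenever new bytes arrive.
    void push(const std::string& data) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            buffer_.insert(buffer_.end(), data.begin(), data.end());
        }
        ready_.notify_one();
    }

    // Called by the network side when the connection ends.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Called by the parser side. Blocks while the queue is empty and open;
    // returns 0 only after close() once all buffered bytes are consumed.
    std::size_t readBytes(unsigned char* toFill, std::size_t maxToRead) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !buffer_.empty() || closed_; });
        std::size_t n = std::min(maxToRead, buffer_.size());
        std::copy_n(buffer_.begin(), n, toFill);
        buffer_.erase(buffer_.begin(),
                      buffer_.begin() + static_cast<std::ptrdiff_t>(n));
        return n;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<unsigned char> buffer_;
    bool closed_ = false;
};
```

A custom BinInputStream could forward its readBytes to such a queue, so that a return value of 0 really does mean end of input. Whether a dedicated parser thread fits a design built around Boost.Asio callbacks is a separate question.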

Dave
