Hello all,

I hope this is the correct place to ask this. I didn't want to ask on the 
developers list as it doesn't seem like the proper place to ask questions about 
end usage patterns.

In short, I am having an issue using MemBufInputSource to take a chunk of XML 
contained in a std::wstring and pass it to DOMBuilder::parse. I am creating the 
input source as follows:

MemBufInputSource xmlSource(
        reinterpret_cast<const XMLByte *>(xml.to_utf8()),
        static_cast<const unsigned int>(xml.length() * sizeof(wchar_t)),
        "pidc_rules_file",
        true
);

Ignore the 'to_utf8' call; it is just one of the encoding-agnostic extensions 
we have added to our std::wstring/std::string subclass so we can compile the 
code as ANSI when required. I am telling the input source to adopt the buffer 
that I pass to it, so that it isn't referencing back to the std::wstring's 
internal buffer. Note that in this case the code is compiled with _UNICODE, so 
the to_utf8 call is a simple pass-through to std::wstring::c_str; there is no 
XMLString::transcode going on here.
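
For illustration only, here is a small Xerces-free sketch of what the 
constructor call above is handed when to_utf8 is a pass-through to c_str: the 
raw wchar_t storage, with a length of xml.length() * sizeof(wchar_t) bytes. The 
names rawBytes, looksLikeUtf8 and runChecks are hypothetical helpers invented 
for this sketch, and the check is structural only (lead byte followed by the 
right number of continuation bytes). Whether this is what the parser trips 
over is an assumption, not something the sketch proves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Copy the raw storage of a std::wstring into bytes: the same bytes that
// reinterpret_cast<const XMLByte *>(xml.c_str()) would expose, using the same
// length computation as the constructor call above.
std::vector<unsigned char> rawBytes(const std::wstring& xml) {
    std::vector<unsigned char> bytes(xml.length() * sizeof(wchar_t));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), xml.data(), bytes.size());
    }
    return bytes;
}

// Hypothetical helper (not a Xerces API): checks only that every lead byte is
// followed by the expected number of 10xxxxxx continuation bytes. It does not
// reject overlong forms or surrogates.
bool looksLikeUtf8(const std::vector<unsigned char>& b) {
    std::size_t i = 0;
    while (i < b.size()) {
        const unsigned char c = b[i];
        std::size_t extra = 0;
        if (c < 0x80) {
            extra = 0;                      // single-byte (ASCII range)
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1;                      // 2-byte sequence
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2;                      // 3-byte sequence
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3;                      // 4-byte sequence
        } else {
            return false;                   // stray continuation or bad lead
        }
        if (i + extra >= b.size()) {
            return false;                   // sequence runs past the buffer
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            if ((b[i + k] & 0xC0) != 0x80) {
                return false;               // not a continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}

// Illustrative checks; call this from main() to run them.
void runChecks() {
    const std::wstring xml = L"<a>\u00e9</a>";
    // Same byte count the constructor call above would pass.
    assert(rawBytes(xml).size() == xml.length() * sizeof(wchar_t));
    // The raw wchar_t storage is not structurally valid UTF-8 here.
    assert(!looksLikeUtf8(rawBytes(xml)));
    // The same text encoded as real UTF-8 passes the check.
    const std::vector<unsigned char> utf8 = {'<', 'a', '>', 0xC3, 0xA9,
                                             '<', '/', 'a', '>'};
    assert(looksLikeUtf8(utf8));
}
```

With a non-ASCII character in the string, the raw wide-character bytes fail 
the structural check, while the same text encoded as UTF-8 passes it.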

When I attempt to call DOMBuilder::parse like so:

m_Doc = m_Builder->parse(Wrapper4InputSource(&xmlSource, false));

I get the following error passed to my error handler:

An exception occurred! Type:UTFDataFormatException, Message:invalid byte 2 (╠) 
of a 2-byte sequence.

Obviously it doesn't like part of the UTF-8 sequence I am passing it, but why? 
The XML itself is fine. I can tell my builder to load the document from a file 
using parseURI, and all works well. For functional reasons, I need to be able 
to load an arbitrary chunk of XML that does not come from a file. Is there 
another, more intuitive way to do this aside from creating a 
MemBufInputSource? Please note, I am doing XML Schema validation, so I would 
like to stick to DOMBuilder if possible.

I am pretty new to Xerces (although not to XML or XML parsing in general), so I 
fear I am simply missing something obvious here.

Any thoughts?

Matt Holmes

