Re: LoadGrammar Error?

Alberto Massari Wed, 11 Mar 2009 10:42:10 -0700

Hi Ben,

the cast in the MemBufInputSource is fine, as it is simply a wrapper for a bunch of bytes, regardless of which encoding they are using. The only thing that can be done to avoid your case (a missing XML header in the string) is adding the call to


is.setEncoding(XMLUni::fgXMLChEncodingString);

after the creation of the object.

Alberto

Ben Griffin wrote:

Alberto, thanks for your time.

On 11 Mar 2009, at 15:46, Alberto Massari wrote:
Hi Ben,
1) why do you think that Wrapper4LSInput shouldn't look at the byteStream? The specs list this order
Okay - I see that there is no LSInput.characterStream, which is (sort of) fair enough, so I agree that the order is therefore correct.
2) the stringData is not being converted: MemBufInputSource works on a byte stream, so it needs a cast and a size computed by multiplying sizeof(XMLCh) by the length (in UTF-16 chars) of the string.
Well, here I have to disagree. Look at the (fragment of) makeStream below:
            BinInputStream* Wrapper4DOMLSInput::makeStream() const {
                // The LSParser will use the LSInput object to determine how to read data.
                // The LSParser will look at the different inputs specified in the LSInput
                // in the following order to know which one to read from, the first one
                // that is not null and not an empty string will be used:
                //   1. LSInput.characterStream
                //   2. LSInput.byteStream
                //   3. LSInput.stringData
                //   4. LSInput.systemId
                //   5. LSInput.publicId
                InputSource* binStream=fInputSource->getByteStream();
                if(binStream)
                    return binStream->makeStream();
--->                const XMLCh* xmlString=fInputSource->getStringData();
// xmlString is a XMLCh*, as created using LSInput->setStringData()

                if(xmlString)
                {
--->                MemBufInputSource is((const XMLByte*)xmlString, XMLString::stringLen(xmlString)*sizeof(XMLCh), "", false, getMemoryManager());
                    // So why is it being CAST into XMLByte here?
                    // And now "is" is being instantiated as if the xmlString is a XMLByte*....
                    is.setCopyBufToStream(false);
                    return is.makeStream();

                    // ...which makes a BinInputStream* from "is"
Now, THAT goes on to instantiate a XMLReader which does an initial load of raw bytes.
    refreshRawBuffer();
and then uses an XMLRecognizer to test the encoding. HANG ON - this is meant to be XMLCh... anyway, that should be FINE if it returns the same encoding as a XMLCh.
So being a XMLCh* - the grammar starts (in terms of bytes)  3c 00
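To make the "3c 00" observation concrete, here is a minimal sketch of what a byte-oriented buffer sees when it is handed UTF-16 code units. It does not use Xerces-C++ at all: char16_t stands in for XMLCh (an assumption), and the names Utf16Unit, rawBytes and hostIsLittleEndian are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for XMLCh (assumption: a 16-bit UTF-16 code unit, as in this thread).
using Utf16Unit = char16_t;
static_assert(sizeof(Utf16Unit) == 2, "this sketch assumes 2-byte code units");

// Copy the raw bytes of a UTF-16 string, the way a byte buffer would see them.
// The byte count is length-in-code-units * sizeof(code unit), as in the cast
// discussed above.
std::vector<std::uint8_t> rawBytes(const std::u16string& text) {
    const std::size_t byteCount = text.size() * sizeof(Utf16Unit);
    std::vector<std::uint8_t> bytes(byteCount);
    if (byteCount != 0) {
        std::memcpy(bytes.data(), text.data(), byteCount);
    }
    assert(bytes.size() == byteCount);
    return bytes;
}

// True when the host stores the low byte of a 16-bit value first.
bool hostIsLittleEndian() {
    const std::uint16_t probe = 0x0001;
    std::uint8_t first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 0x01;
}
```

On a little-endian host, rawBytes(u"<grammar/>") begins with 3c 00; on a big-endian host the same string begins with 00 3c. Either way, no byte order mark or XML declaration precedes the first element.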
XMLRecognizer::basicEncodingProbe( const XMLByte* const rawBuffer, const XMLSize_t rawByteCount)
Because this doesn't actually know about non BOM UTF-16BE or UTF-16LE (ie, the XMLCh encoding), it is going to return "UTF-8".
Likewise, the grammar string does not have an <?xml ..> declaration (which is legal), so the XMLRecognizer is going to fail.
As you can imagine, once the BinInputStream has been identified as UTF-8, there really is no turning back.
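The failure mode described here can be illustrated with a deliberately simplified probe. This is a hypothetical stand-in written for this note, not the code of XMLRecognizer::basicEncodingProbe: the function name guessEncoding and its rules are assumptions, and it ignores UTF-32, EBCDIC and everything else a real recognizer handles. It only shows why BOM-less UTF-16 bytes that do not start with an XML declaration end up on a UTF-8 default.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical, simplified encoding probe (NOT the Xerces-C++ implementation).
// It recognises UTF-16 only through a byte order mark or the first two
// characters of an XML declaration ("<?"), and otherwise falls back to UTF-8.
std::string guessEncoding(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);

    // Byte order marks.
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) return "UTF-16LE";
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF) return "UTF-16BE";

    // "<?" encoded as UTF-16 without a BOM, i.e. the start of "<?xml".
    if (size >= 4 && data[0] == 0x3C && data[1] == 0x00 &&
        data[2] == 0x3F && data[3] == 0x00) return "UTF-16LE";
    if (size >= 4 && data[0] == 0x00 && data[1] == 0x3C &&
        data[2] == 0x00 && data[3] == 0x3F) return "UTF-16BE";

    // Nothing recognisable: assume the default encoding.
    return "UTF-8";
}
```

With this sketch, the bytes 3c 00 67 00 (a little-endian "<g" with no declaration) fall through to "UTF-8", while 3c 00 3f 00 (the start of "<?xml") or a leading ff fe are reported as UTF-16LE. Setting the encoding explicitly on the input source, as suggested at the top of this message, avoids relying on any such guess.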
Sure enough, now AbstractDOMParser::startDocument() calls
fDocument->setInputEncoding(fScanner->getReaderMgr()->getCurrentEncodingStr());
Just in time for
IGXMLScanner::scanDocument(const InputSource& src) to call scanStartTagNS(gotData)
This then hits trouble at (!fReaderMgr.getQName(fQNameBuf, &prefixColonPos)), which returns empty,
and the empty result will emit an Error.
As for the error you see, are you sure your transcoder->transcoder(grammar_str.c_str()) is actually generating a string of XMLCh? Could you post its code?
My transcoder?
XMLLCPTranscoder* transcoder = XMLPlatformUtils::fgTransService->makeNewLCPTranscoder(XMLPlatformUtils::fgMemoryManager);
Best regards
    Ben.
