Re: LoadGrammar Error?

Ben Griffin Wed, 11 Mar 2009 10:03:00 -0700

Alberto, thanks for your time.

On 11 Mar 2009, at 15:46, Alberto Massari wrote:

Hi Ben,
1) why do you think that Wrapper4LSInput shouldn't look at the byteStream? The specs list this order

Okay - I see that there is no LSInput.characterStream, which is (sort of) fair enough, so I agree that the order is therefore correct.

2) the stringData is not being converted: MemBufInputSource works on a byte stream, so it needs a cast and a size computed by multiplying sizeof(XMLCh) by the length (in UTF-16 chars) of the string.

Well, here I have to disagree. Look at the fragment of makeStream below:


    BinInputStream* Wrapper4DOMLSInput::makeStream() const {
        // The LSParser will use the LSInput object to determine how to read data.
        // The LSParser will look at the different inputs specified in the
        // LSInput in the following order to know which one to read from, the
        // first one that is not null and not an empty string will be used:
        //   1. LSInput.characterStream
        //   2. LSInput.byteStream
        //   3. LSInput.stringData
        //   4. LSInput.systemId
        //   5. LSInput.publicId
        InputSource* binStream=fInputSource->getByteStream();
        if(binStream)
            return binStream->makeStream();

--->    const XMLCh* xmlString=fInputSource->getStringData();
        // xmlString is an XMLCh*, as created using LSInput->setStringData()

        if(xmlString)
        {
-->         MemBufInputSource is((const XMLByte*)xmlString, XMLString::stringLen(xmlString)*sizeof(XMLCh), "", false, getMemoryManager());
            // So why is it being CAST into XMLByte here?
            // And now "is" is being instantiated as if the xmlString is an XMLByte* ....

            is.setCopyBufToStream(false);
            return is.makeStream();
            // ...which makes a BinInputStream* from "is"
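
To make the effect of that cast concrete, here is a minimal sketch that does not use Xerces-C at all. It assumes XMLCh is a 16-bit UTF-16 code unit (modelled here with char16_t), and rawBytesOf is a hypothetical helper named only for this illustration. It shows what a byte-oriented reader receives when a UTF-16 buffer is handed over unchanged, with the byte count computed as length * sizeof(code unit).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of Xerces-C): returns the raw bytes that a
// byte-oriented reader sees when a UTF-16 buffer is passed to it unchanged.
// char16_t stands in for XMLCh, which is an assumption of this sketch.
std::vector<std::uint8_t> rawBytesOf(const std::u16string& text) {
    // Same arithmetic as in the fragment: code-unit count times unit size.
    const std::size_t byteCount = text.size() * sizeof(char16_t);
    std::vector<std::uint8_t> bytes(byteCount);
    if (byteCount != 0) {
        // No transcoding happens; the bytes are copied in native byte order.
        std::memcpy(bytes.data(), text.data(), byteCount);
    }
    return bytes;
}
```

For a string starting with '<', the first two bytes are 3c 00 on a little-endian machine and 00 3c on a big-endian one, so the reader has to work out the encoding from the bytes alone.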

Now, THAT goes on to instantiate an XMLReader, which does an initial load of raw bytes.

    refreshRawBuffer();

and then uses an XMLRecognizer to test the encoding. HANG ON - this is meant to be XMLCh... Anyway, that should be FINE if it returns the same encoding as an XMLCh.


So, being an XMLCh*, the grammar starts (in terms of bytes) with 3c 00 on a little-endian machine.

XMLRecognizer::basicEncodingProbe(const XMLByte* const rawBuffer, const XMLSize_t rawByteCount)

Because this doesn't actually know about non-BOM UTF-16BE or UTF-16LE (i.e. the XMLCh encoding), it is going to return "UTF-8".

Likewise, the grammar string does not have an <?xml ..> declaration (which is legal), so the XMLRecognizer is going to fail.

As you can imagine, once the BinInputStream has been identified as UTF-8, there really is no turning back.
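
The failure mode described above can be sketched with a deliberately simplified probe. This is NOT the Xerces-C implementation of basicEncodingProbe. probeEncoding is a hypothetical function that only models the behaviour I am describing: it recognises UTF-16 from a byte order mark or from the first two characters of an XML declaration, and otherwise falls back to UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Simplified, hypothetical probe (not the real XMLRecognizer code).
// It accepts UTF-16 only when there is a BOM or a UTF-16 encoded "<?"
// prefix; everything else is reported as UTF-8.
std::string probeEncoding(const std::uint8_t* raw, std::size_t count) {
    if (count >= 2) {
        if (raw[0] == 0xFF && raw[1] == 0xFE) return "UTF-16LE";  // BOM
        if (raw[0] == 0xFE && raw[1] == 0xFF) return "UTF-16BE";  // BOM
    }
    if (count >= 4) {
        // "<?" as UTF-16LE: 3c 00 3f 00
        if (raw[0] == 0x3C && raw[1] == 0x00 && raw[2] == 0x3F && raw[3] == 0x00)
            return "UTF-16LE";
        // "<?" as UTF-16BE: 00 3c 00 3f
        if (raw[0] == 0x00 && raw[1] == 0x3C && raw[2] == 0x00 && raw[3] == 0x3F)
            return "UTF-16BE";
    }
    // No BOM and no declaration: nothing left to go on.
    return "UTF-8";
}
```

With this model, a buffer beginning 3c 00 followed by an element name (no BOM, no declaration) is reported as UTF-8, which is the situation my grammar string ends up in.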


Sure enough, now AbstractDOMParser::startDocument() calls

fDocument->setInputEncoding(fScanner->getReaderMgr()->getCurrentEncodingStr());


Just in time for

IGXMLScanner::scanDocument(const InputSource& src) to call scanStartTagNS(gotData)

This then hits trouble at (!fReaderMgr.getQName(fQNameBuf, &prefixColonPos)), which returns empty,

and the empty result will emit an Error.

As for the error you see, are you sure your transcoder->transcoder(grammar_str.c_str()) is actually generating a string of XMLCh? Could you post its code?


My transcoder?

XMLLCPTranscoder* transcoder = XMLPlatformUtils::fgTransService->makeNewLCPTranscoder(XMLPlatformUtils::fgMemoryManager);



Best regards
        Ben.
