Re: LoadGrammar Error?

Alberto Massari Wed, 11 Mar 2009 08:47:24 -0700

Hi Ben,

1) Why do you think that Wrapper4DOMLSInput shouldn't look at the byteStream? The specs list this order:


  1. |LSInput.characterStream|
     <http://www.w3.org/TR/DOM-Level-3-LS/load-save.html#LS-LSInput-characterStream>

  2. |LSInput.byteStream|
     <http://www.w3.org/TR/DOM-Level-3-LS/load-save.html#LS-LSInput-byteStream>

  3. |LSInput.stringData|
     <http://www.w3.org/TR/DOM-Level-3-LS/load-save.html#LS-LSInput-stringData>

  4. |LSInput.systemId|
     <http://www.w3.org/TR/DOM-Level-3-LS/load-save.html#LS-LSInput-systemId>

  5. |LSInput.publicId|
     <http://www.w3.org/TR/DOM-Level-3-LS/load-save.html#LS-LSInput-publicId>

and the first item, characterStream (of type LSReader), is not available in Xerces-C++, as allowed by the specs (LSReader is an Object, so its purpose is to allow the use of java.lang.String).
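
A minimal sketch of that lookup order with characterStream skipped, assuming plain C strings as stand-ins for the real input types. InputSketch, usable and pickSource are hypothetical names for illustration only, not Xerces-C++ API:

```cpp
#include <string>

// Hypothetical stand-in for an LSInput-like object (NOT the Xerces-C++ class).
// Plain C strings replace the real stream and string types.
struct InputSketch {
    const char* byteStream = nullptr;
    const char* stringData = nullptr;
    const char* systemId   = nullptr;
    const char* publicId   = nullptr;
};

// "Not null and not an empty string", as the spec wording puts it.
static bool usable(const char* s) { return s != nullptr && *s != '\0'; }

// Returns the name of the first usable field, in spec order,
// with characterStream left out because it is not available.
std::string pickSource(const InputSketch& in) {
    if (usable(in.byteStream)) return "byteStream";
    if (usable(in.stringData)) return "stringData";
    if (usable(in.systemId))   return "systemId";
    if (usable(in.publicId))   return "publicId";
    return "none";
}
```

With only stringData and systemId set, pickSource returns "stringData"; once byteStream is set as well, it wins.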

2) The stringData is not being converted: MemBufInputSource works on a byte stream, so it needs a cast and a size computed by multiplying sizeof(XMLCh) by the length (in UTF-16 chars) of the string.
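
To illustrate the cast and the size computation without pulling in Xerces-C++, here is a small sketch. It assumes char16_t as a stand-in for XMLCh, and unitLen, ByteView and asBytes are hypothetical helpers (unitLen plays the role of XMLString::stringLen):

```cpp
#include <cstddef>

// Assumption: char16_t stands in for XMLCh (one 16-bit UTF-16 code unit).
using Ch = char16_t;

// Number of code units before the terminating zero.
std::size_t unitLen(const Ch* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Hypothetical byte view: the same memory, described as bytes.
struct ByteView {
    const unsigned char* data;
    std::size_t size;
};

// No conversion happens here: the pointer is only reinterpreted,
// and the size is (length in code units) * sizeof(Ch).
ByteView asBytes(const Ch* s) {
    return { reinterpret_cast<const unsigned char*>(s),
             unitLen(s) * sizeof(Ch) };
}
```

For the 10-unit string u"<xs:schema" this gives a 20-byte view over the original buffer when code units are 2 bytes wide.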

As for the error you see, are you sure your transcoder->transcode(grammar_str.c_str()) is actually generating a string of XMLCh? Could you post its code?


Alberto

Ben Griffin wrote:

Okay - I've been staring at this for four days now.
Here is a small example of what is bugging me:
-----------------
    class Err: public DOMErrorHandler {
        bool Err::handleError(const xercesc::DOMError& domError) {
            std::cerr << transcode(domError.getMessage());
            return true;
        }
    };

    int main(int argc, char *argv[]) {
        XMLPlatformUtils::Initialize();
        transcoder = XMLPlatformUtils::fgTransService->makeNewLCPTranscoder(XMLPlatformUtils::fgMemoryManager);
        std::string grammar_str = "<xs:schema targetNamespace=\"http://my.org/blah\" xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" ><xs:attribute name=\"box\" fixed=\"true\" /></xs:schema>";
        XMLCh* grammar_file = transcoder->transcode(grammar_str.c_str());
        Grammar::GrammarType grammar_type = Grammar::SchemaGrammarType;
        DOMImplementation* impl = DOMImplementationRegistry::getDOMImplementation(X("LS"));
        DOMLSParser* parser = ((DOMImplementationLS*)impl)->createLSParser(DOMImplementationLS::MODE_SYNCHRONOUS, 0);
        DOMConfiguration* dc = parser->getDomConfig();
        Err* errorHandler = new Err();
        dc->setParameter(XMLUni::fgDOMErrorHandler,errorHandler);
        dc->setParameter(XMLUni::fgXercesUseCachedGrammarInParse, true);
        dc->setParameter(XMLUni::fgXercesSchema, true);
        dc->setParameter(XMLUni::fgXercesCacheGrammarFromParse, true);
        dc->setParameter(XMLUni::fgDOMValidate, true);
        DOMLSInput* input = ((DOMImplementationLS*)impl)->createLSInput();
        input->setStringData(grammar_file);
        parser->loadGrammar(input, grammar_type, true);
        // [...]
    }
-----------------------------------------------
An error is being thrown by IGXMLScanner::scanStartTagNS because fQNameBuf is not being loaded by ReaderMgr.getQName because isFirstNCNameChar is returning false.
    if (!fReaderMgr.getQName(fQNameBuf, &prefixColonPos)) {
        if (fQNameBuf.isEmpty())
            emitError(XMLErrs::ExpectedElementName); // <-- Error thrown here.
        else


// false is being returned by XMLReader::isFirstNCNameChar:
inline bool XMLReader::isFirstNCNameChar(const XMLCh toCheck) const {
    return (((fgCharCharsTable[toCheck] & gFirstNameCharMask) != 0)
            && (toCheck != chColon));
}
The reason is that the schema characters in fCharBuf have been converted twice. (Note that this is little-endian.)
(what follows is the start of a memory dump of the fCharBuf )
3c 00 00 00 78 00 00 00 73 00 00 00 3a 00 00 00
73 00 00 00 63 00 00 00 68 00 00 00 65 00 00 00
6d 00 00 00 61 00 00 00 20 00 00 00 74 00 00 00
61 00 00 00 72 00 00 00 67 00 00 00 65 00 00 00
74 00 00 00 4e 00 00 00 61 00 00 00 6d 00 00 00
65 00 00 00 73 00 00 00 70 00 00 00 61 00 00 00
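
One way to read the dump above, offered as a sketch rather than a diagnosis: assume the buffer holds little-endian 16-bit code units and decode the bytes accordingly. decodeLE16 is a hypothetical helper, not part of Xerces-C++:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode a byte buffer as little-endian 16-bit code units.
// A trailing odd byte, if any, is ignored.
std::vector<std::uint16_t> decodeLE16(const unsigned char* bytes, std::size_t n) {
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < n; i += 2) {
        // Low byte first, then high byte shifted into the upper 8 bits.
        units.push_back(static_cast<std::uint16_t>(bytes[i] | (bytes[i + 1] << 8)));
    }
    return units;
}
```

Fed the first eight bytes of the dump (3c 00 00 00 78 00 00 00), this yields 0x003c, 0x0000, 0x0078, 0x0000: every real character is followed by a zero unit. Under that assumption, a scanner that has consumed '<' would next see a zero unit rather than 'x'.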
#0 0x00fe3453 in xercesc_3_0::Wrapper4DOMLSInput::makeStream at Wrapper4DOMLSInput.cpp:132
#1 0x01011e7b in xercesc_3_0::ReaderMgr::createReader at ReaderMgr.cpp:365
#2 0x0100d6f7 in xercesc_3_0::IGXMLScanner::scanReset at IGXMLScanner2.cpp:1362
#3 0x01003c1b in xercesc_3_0::IGXMLScanner::scanDocument at IGXMLScanner.cpp:197
#4 0x0105b587 in xercesc_3_0::AbstractDOMParser::parse at AbstractDOMParser.cpp:535
#5 0x01008845 in xercesc_3_0::IGXMLScanner::loadXMLSchemaGrammar at IGXMLScanner2.cpp:2085
#6 0x00ffee5f in xercesc_3_0::IGXMLScanner::loadGrammar at IGXMLScanner.cpp:3005
#7 0x010616c9 in xercesc_3_0::DOMLSParserImpl::loadGrammar at DOMLSParserImpl.cpp:935
//So here we see the culprit -
BinInputStream* Wrapper4DOMLSInput::makeStream() const {
    // The LSParser will use the LSInput object to determine how to read data. The LSParser will look at the different inputs specified in the
    // LSInput in the following order to know which one to read from, the first one that is not null and not an empty string will be used:
    //   1. LSInput.characterStream
    //   2. LSInput.byteStream
    //   3. LSInput.stringData
    //   4. LSInput.systemId
    //   5. LSInput.publicId

    InputSource* binStream=fInputSource->getByteStream();
    if(binStream)
        return binStream->makeStream();
    const XMLCh* xmlString=fInputSource->getStringData();
    if(xmlString)
    {
        MemBufInputSource is((const XMLByte*)xmlString, XMLString::stringLen(xmlString)*sizeof(XMLCh), "", false, getMemoryManager()); // <--!!!! what?!
        is.setCopyBufToStream(false);
        return is.makeStream();
    }
-----------------------------------------------
First of all, the fact that this function first looks at the byteStream MUST be a bug.
Secondly, the characterStream is being CONVERTED - when it should already be an XMLCh* (as defined everywhere else).
Or am I missing a trick?
