Hi Xaltotun,

This isn't a bug. characters() may be called multiple times [1][2] for a single run of contiguous text. Your ContentHandler needs to accumulate the text passed to each call of characters() until it receives a callback that isn't characters(), such as startElement() or endElement().
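
The accumulation described above could look roughly like the following. This is only a minimal sketch, not code from the original poster or from Xerces: the class name AccumulatingHandler and the helper parse() are hypothetical, it uses the JAXP parser bundled with the JDK rather than Xerces directly, and it assumes that text appears only inside leaf elements (no mixed content).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the complete text of each leaf element.
class AccumulatingHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // A new element begins: drop anything buffered so far
        // (for example, whitespace between tags).
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one run of text in several chunks,
        // so only append here; do not treat a chunk as the whole value.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only now is the text known to be complete.
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            values.add(text);
        }
        buffer.setLength(0);
    }

    // Hypothetical convenience method: parse a string and return the collected values.
    static List<String> parse(String xml) throws Exception {
        AccumulatingHandler handler = new AccumulatingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Node><NodeName>Location generation type</NodeName>"
                + "<NodeEnabled>true</NodeEnabled></Node>";
        for (String value : parse(xml)) {
            System.out.println(value);
        }
    }
}
```

With this shape, it no longer matters where the parser happens to split the text, because each value is read only in endElement(). Documents with mixed content would need a slightly different buffering strategy.
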

[1] http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int)
[2] http://xerces.apache.org/xerces2-j/faq-sax.html#faq-2

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

"Xaltotun" <[EMAIL PROTECTED]> wrote on 07/16/2006 12:59:37 PM:

> Hi all. My name is Xaltotun. I encountered the following bug in
> Xerces2 version 2.8.0.
>
> BUG DESCRIPTION
> At first there is an xml file that looks like this:
>
> <element_1>
> abcabcabcabcabcabc
> abcabcabcabcabcabc
> </element_1>
> <element_2>
> abcabcabcabcabcabc
> abcabcabcabcabcabc
> </element_2>
> ...
> <element_n>
> abcabcabcabcabcabc
> abcabcabcabcabcabc
> </element_n>
>
> I subclass DefaultHandler and override startDocument, startElement,
> characters, endElement, endDocument.
> The text "abcabc...abc" is processed in the "characters" function. But
> when the file is big enough I sometimes get this text broken into two parts.
> I.e. the characters function returns
> "abcabcabcabcabcabc
> ab"
> and after this characters is called one more time and it returns
> the remaining part
> "cabcabcabcabcabc".
>
>
> FILES ATTACHED
> I've attached two files to this email.
> xml_file.xml - with this file one can reproduce the error I've
> found. Only print what the characters function returns and prevent
> escape characters from printing.
> xercesIssueText.txt - to this file I printed what characters
> returned without escape characters. I marked the wrong lines with
> several exclamation signs "!!!!!!".
>
> For example there is an element in this file
> <Node NodeType="NODE_TYPE_PROPERTY_STRING">
> <NodeName>Location generation type</NodeName>
> <NodeDisplayName>PROPERTY_DISPLAY_NAME_LOCATION_GENERATION_TYPE</NodeDisplayName>
> <NodeDescription>PROPERTY_DESCRIPTION_LOCATION_GENERATION_TYPE</NodeDescription>
> <NodeValueType>VALUE_TYPE_STRING</NodeValueType>
> <NodeValue>NETWORK_DEVICE_LOCATION_GENERATION_TYPE_RANDOM_UNIFORM_X_Y_Z</NodeValue>
> <NodeEnabled>true</NodeEnabled>
> <NodeDisplayable>true</NodeDisplayable>
> <NodeEditable>true</NodeEditable>
> </Node>
> The characters function returns the following:
> Location generation type
> PROPERTY_DISPLAY_NAME_LOCATION_GENERATION_TYPE
> PROPERTY_DESCRIPTION_LOCATION_GENERATION_TYPE
> VALUE_TYPE_STRING
> NETWORK_DEVICE_LOCATION_GENERATION_TYPE_R
> ANDOM_UNIFORM_X_Y_Z
> true
> true
> true
>
> I think the characters function should return
> PROPERTY_DISPLAY_NAME_LOCATION_GENERATION_TYPE
> PROPERTY_DESCRIPTION_LOCATION_GENERATION_TYPE
> VALUE_TYPE_STRING
> NETWORK_DEVICE_LOCATION_GENERATION_TYPE_RANDOM_UNIFORM_X_Y_Z
> true
> true
> true
>
>
> QUESTION
> As for me it is a bug. But I think it is strange no one has ever
> found it. That is why I decided to post my question to this list.
> Can anyone help me to decide if this is a bug?
>
>
> -----------------------------
> Xaltotun
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
