We are trying to use this method
(XMLLocator.getCharacterOffset()) to report the exact
positions of elements inside an XML document. It works
fine for regular XML documents, but for documents
containing special characters, such as foreign
(non-ASCII) characters, the returned offsets are no
longer correct.

Below is the test example I used (all the files are
in the attachments). The main process is pretty
straightforward: for each element, store its start
and end positions within the callback methods, and
finally read the whole XML file as a string and print
out the portion between the start and end positions of
each element. We are using the latest jar file,
xercesImpl-gump-23062006.jar. Please notice in the
result that the output for elements
"CapitalGainNetIncome" and "DateAcquired" is correct,
but not for elements "PropertyDescription" and
"Return"!
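
To make the expected output concrete, here is a minimal sketch that does not use Xerces at all. It assumes, based on the two outputs reported as correct in the result, that the start position is the offset just past the start tag and the end position is the offset just past the end tag. The class SliceSketch and the helpers offsetAfter and slice are hypothetical names used only for this illustration.

```java
class SliceSketch {
    // Hypothetical helper: offset just past the first occurrence of token.
    static int offsetAfter(String doc, String token) {
        int i = doc.indexOf(token);
        if (i < 0) {
            throw new IllegalArgumentException("token not found: " + token);
        }
        return i + token.length();
    }

    // Text between the end of <name> and the end of </name>,
    // i.e. the element content followed by its closing tag.
    static String slice(String doc, String name) {
        int start = offsetAfter(doc, "<" + name + ">");
        int end = offsetAfter(doc, "</" + name + ">");
        return doc.substring(start, end);
    }

    public static void main(String[] args) {
        String doc =
            "<Return><DateAcquired>1999-05-30</DateAcquired></Return>";
        System.out.println(slice(doc, "DateAcquired"));
    }
}
```

With correct offsets, every element should print its content followed by its complete closing tag, including the final ">".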

Please let me know if I missed anything here. Any
response would be greatly appreciated!

Thanks,

James Zhang




***XniTest.java***

import java.io.File;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import org.apache.xerces.parsers.XMLDocumentParser;
import org.apache.xerces.xni.Augmentations;
import org.apache.xerces.xni.NamespaceContext;
import org.apache.xerces.xni.QName;
import org.apache.xerces.xni.XMLAttributes;
import org.apache.xerces.xni.XMLLocator;
import org.apache.xerces.xni.XNIException;
import org.apache.xerces.xni.parser.XMLInputSource;
import org.apache.xerces.xni.parser.XMLParserConfiguration;

public class XniTest extends XMLDocumentParser {
        
    static final String DEFAULT_PARSER_CONFIG =
        "org.apache.xerces.parsers.XIncludeAwareParserConfiguration";
    static final String NAMESPACE_PREFIXES_FEATURE_ID =
        "http://xml.org/sax/features/namespace-prefixes";
    protected static final String SCHEMA_VALIDATION_FEATURE_ID =
        "http://apache.org/xml/features/validation/schema";
    protected static final String HONOUR_ALL_SCHEMA_LOCATIONS_ID =
        "http://apache.org/xml/features/honour-all-schemaLocations";

    public static final String PATH = "\\work\\data\\special_char.xml";

    // element local name -> character offset reported by the locator
    static Map startPositions = new HashMap();
    static Map endPositions = new HashMap();

    private XMLLocator locator;

    public XniTest(XMLParserConfiguration configuration) {
        super(configuration);
    }

    public void startDocument(XMLLocator locator, String encoding,
            NamespaceContext namespaceContext, Augmentations augs)
            throws XNIException {
        this.locator = locator;
    }

    public void startElement(QName element, XMLAttributes attrs,
            Augmentations augs) throws XNIException {
        // record the offset the locator reports for this start tag
        startPositions.put(element.localpart,
                Integer.valueOf(locator.getCharacterOffset()));
    }

    static public void run() {
        XMLParserConfiguration parserConfig = null;
        try {
            parserConfig = (XMLParserConfiguration) ObjectFactory.newInstance(
                    DEFAULT_PARSER_CONFIG, ObjectFactory.findClassLoader(), true);
            parserConfig.addRecognizedFeatures(new String[] {
                NAMESPACE_PREFIXES_FEATURE_ID,
            });
            parserConfig.setFeature(HONOUR_ALL_SCHEMA_LOCATIONS_ID, true);

            XMLDocumentParser parser = new XniTest(parserConfig);
            parser.parse(new XMLInputSource(null, PATH, null));

            String content = getDocument(PATH);

            // loop through all the starting elements
            for (Iterator it = startPositions.keySet().iterator(); it.hasNext();) {
                String element = (String) it.next();
                int start = ((Integer) startPositions.get(element)).intValue();
                int end = ((Integer) endPositions.get(element)).intValue();
                System.out.println("Element:" + element);
                System.out.println(content.substring(start, end));
                System.out.println();
            }
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        run();
    }

    public void endElement(QName element, Augmentations augs)
            throws XNIException {
        // record the offset the locator reports for this end tag
        endPositions.put(element.localpart,
                Integer.valueOf(locator.getCharacterOffset()));
    }


    // read the content of the file into a string
    // NOTE: InputStreamReader without a charset decodes with the platform
    // default encoding, which may differ from the UTF-8 declared in the XML
    static String getDocument(String path) throws Exception {
        StringBuffer doc = new StringBuffer();
        char buf[] = new char[10240];
        Reader reader = new InputStreamReader(new FileInputStream(path));
        try {
            int len;
            // read() returns -1 at end of stream and may return fewer
            // characters than requested before that
            while ((len = reader.read(buf, 0, buf.length)) != -1) {
                doc.append(buf, 0, len);
            }
        } finally {
            reader.close();
        }
        return doc.toString();
    }
        
}

***special_char.xml***
<?xml version="1.0" encoding="UTF-8"?>
<Return>
                <CapitalGainNetIncome>65774204</CapitalGainNetIncome>
                        <DateAcquired>1999-05-30</DateAcquired>
                <PropertyDescription>¼</PropertyDescription>
</Return>


***Test result***
Element:PropertyDescription
¼</PropertyDescription

Element:CapitalGainNetIncome
65774204</CapitalGainNetIncome>

Element:DateAcquired
1999-05-30</DateAcquired>

Element:Return

                
<CapitalGainNetIncome>65774204</CapitalGainNetIncome>
                        <DateAcquired>1999-05-30</DateAcquired>
                
<PropertyDescription>¼</PropertyDescription>
</Return
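
One possibility I have not ruled out, and it is only a guess: getDocument() creates its InputStreamReader without naming a charset, so the file is decoded with the platform default encoding. If that default is a single-byte encoding such as ISO-8859-1, the two UTF-8 bytes of the ¼ character become two chars in the string, and every offset after it would appear one short, which would match the missing ">" above. The sketch below only demonstrates that length difference; it says nothing about what Xerces itself does. The class name CharsetOffsetSketch and the helper decodedLengths are made up for illustration, and StandardCharsets needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

class CharsetOffsetSketch {
    // Encode text as UTF-8, then decode the same bytes two ways and
    // return { length as UTF-8, length as ISO-8859-1 }.
    static int[] decodedLengths(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String asUtf8 = new String(utf8, StandardCharsets.UTF_8);
        String asLatin1 = new String(utf8, StandardCharsets.ISO_8859_1);
        return new int[] { asUtf8.length(), asLatin1.length() };
    }

    public static void main(String[] args) {
        // \u00BC is the one-quarter sign used in special_char.xml
        String xml = "<PropertyDescription>\u00BC</PropertyDescription>";
        int[] n = decodedLengths(xml);
        System.out.println("chars when decoded as UTF-8:      " + n[0]);
        System.out.println("chars when decoded as ISO-8859-1: " + n[1]);
    }
}
```

If the two counts differ on the test machine's default encoding, reading the file with an explicit "UTF-8" charset in getDocument() would be a cheap way to check whether the offsets then line up.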


Attachment: ObjectFactory.java
Description: 1184072937-ObjectFactory.java

Attachment: XniTest.java
Description: 1783284100-XniTest.java

Attachment: SecuritySupport.java
Description: 1241980436-SecuritySupport.java

<?xml version="1.0" encoding="UTF-8"?>
<Return>
			<CapitalGainNetIncome>65774204</CapitalGainNetIncome>
			<DateAcquired>1999-05-30</DateAcquired>
                	<PropertyDescription>¼</PropertyDescription>
</Return>
