How do I parse a DTD in Java?

Andrews, Scott Tue, 13 Jan 2004 07:07:03 -0800

How do I parse a DTD into an in-memory Java object, like a TreeMap or perhaps some XML-specific collection class?

I asked this question the other day and got an answer that the DocumentBuilder parse method should handle the parsing of a DTD, since a DTD IS XML.

However, I get basic parsing errors when inputting a simple DTD. The code works fine on XML documents, but not on DTDs. The code I’m using to parse the DTD looks like this:

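    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    // Class name is illustrative; the original listing showed only the two methods.
    public class DtdParseAttempt {

        public static void main( String argArgs[] ) {
            try {
                File dtdFile = new File( "C:\\APIS\\WorkSpace\\tv.dtd" );
                DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                DocumentBuilder db = dbf.newDocumentBuilder();
                // This is the call that fails when the input is a DTD
                Document document = db.parse( dtdFile );
                parseChildrenRecursively( document.getChildNodes() );
            } catch (Exception e) {
                // Handler assumed from the "ERR:>" line in the reported output
                System.out.println( "ERR:> Exception " + e.getMessage() );
            }
        }

        // Prints every non-text node, depth first
        public static void parseChildrenRecursively( NodeList argNodeList ) {
            if (argNodeList == null) {
                return;
            }
            Node node;
            for (int i=0; i<argNodeList.getLength(); i++) {
                node = argNodeList.item( i );
                if (node.getNodeType() != Node.TEXT_NODE) {
                    System.out.println(
                        "node.nodeName = " + node.getNodeName() + "; " +
                        "node.nodeType = " + Short.toString( node.getNodeType() ) + "; " +
                        "node.localName = " + node.getLocalName() + "; " +
                        "node.namespaceUri = " + node.getNamespaceURI() + "; " +
                        "node.nodeValue = " + node.getNodeValue() + "; "
                    );
                    parseChildrenRecursively( node.getChildNodes() );
                }
            } // for
        }
    }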

However, I get errors when making the attempt:

[Fatal Error] :-1:-1: Premature end of file.

ERR:> Exception Premature end of file.

The DTD I’m trying to parse is just an example. It looks like this, where the elements are embedded inside the DOCTYPE tag:

<!DOCTYPE TVSCHEDULE [
<!ELEMENT TVSCHEDULE (CHANNEL+)>
<!ELEMENT CHANNEL (BANNER, DAY+)>
<!ELEMENT BANNER (#PCDATA)>
<!ELEMENT DAY ((DATE, HOLIDAY) | (DATE, PROGRAMSLOT+))+>
<!ELEMENT HOLIDAY (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT PROGRAMSLOT (TIME, TITLE, DESCRIPTION?)>
<!ELEMENT TIME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
<!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
<!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
<!ATTLIST TITLE RATING CDATA #IMPLIED>
<!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
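
A note on the first error: an XML document must contain a root element, so input that consists only of a DOCTYPE declaration cannot parse as a document, which is consistent with the "Premature end of file" message. Below is a minimal sketch of that behaviour using a made-up one-element DTD. The class name DoctypeOnlyDemo, the helper doctypeNameOrNull, and the NOTE element are hypothetical and not taken from the post.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; the names and the inline sample text are illustrative only.
final class DoctypeOnlyDemo {

    /** Returns the DOCTYPE name if the text parses as an XML document, or null if the parser rejects it. */
    static String doctypeNameOrNull(String xmlText) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            Document doc = db.parse(new InputSource(new StringReader(xmlText)));
            return doc.getDoctype() == null ? null : doc.getDoctype().getName();
        } catch (SAXParseException e) {
            // Not well-formed as a document (for example, no root element)
            System.out.println("rejected: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        String doctypeOnly = "<!DOCTYPE NOTE [ <!ELEMENT NOTE (#PCDATA)> ]>";
        // DOCTYPE with no root element: expected to be rejected
        System.out.println(doctypeNameOrNull(doctypeOnly));
        // Same DOCTYPE followed by a root element: expected to parse
        System.out.println(doctypeNameOrNull(doctypeOnly + "<NOTE>hi</NOTE>"));
    }
}
```

With a root element present, the declarations are at least reachable as raw text through Document.getDoctype().getInternalSubset(), though not as a structured object.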

If I remove the DOCTYPE tag and parse just the ELEMENT and ATTLIST declarations, I still get errors:

Exception The markup in the document preceding the root element must be well-formed.

[Fatal Error] tv.dtd:3:3: The markup in the document preceding the root element must be well-formed.

Anybody have a clue how to parse a DTD, so I can get an in-memory structure of the DTD in Java?
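
One possible approach, offered as a sketch rather than a definitive answer: DTD markup declarations are not an XML document on their own, but a SAX parser reports them through org.xml.sax.ext.DeclHandler when they appear in a document's DOCTYPE. The example below wraps bare declarations in an internal subset with an empty root element and collects the element declarations into a TreeMap. The class DtdDeclSketch, the method elementModels, and the trimmed sample declarations are illustrative assumptions. It also assumes the JDK's default parser, which supports the declaration-handler property.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper; class and method names are illustrative, not from any library.
final class DtdDeclSketch {

    /**
     * Collects element name -> content model from bare DTD declarations.
     * rootName is assumed to be the element the DTD treats as the document root.
     */
    static Map<String, String> elementModels(String declarations, String rootName) throws Exception {
        final Map<String, String> models = new TreeMap<>();

        // DefaultHandler2 implements DeclHandler; only the callback of interest is overridden.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void elementDecl(String name, String model) {
                models.put(name, model);
            }
        };

        // Wrap the declarations in an internal subset and add an empty root element,
        // so the parser sees a well-formed document.
        String wrapped = "<!DOCTYPE " + rootName + " [\n" + declarations + "\n]>\n<" + rootName + "/>";

        // The factory is non-validating by default, so the empty root is not checked against the DTD.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(wrapped)));
        return models;
    }

    public static void main(String[] args) throws Exception {
        // A trimmed subset of the declarations from the post, for illustration
        String decls =
              "<!ELEMENT TVSCHEDULE (CHANNEL+)>\n"
            + "<!ELEMENT CHANNEL (BANNER, DAY+)>\n"
            + "<!ELEMENT BANNER (#PCDATA)>\n"
            + "<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>";
        elementModels(decls, "TVSCHEDULE")
            .forEach((name, model) -> System.out.println(name + " -> " + model));
    }
}
```

Overriding attributeDecl in the same handler would capture the ATTLIST entries as well. For a file like tv.dtd that already begins with the DOCTYPE wrapper, only the declarations between the brackets would be passed in.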

Scott Andrews

Principal Software Engineer

Concurrent Technologies Corporation

(814) 269 6580 (Monday, Wednesday, Friday)

(814) 632 9559 (Tuesday, Thursday)

(814) 880 8522 (Cell)
