Hi Charles,
We've also run into this issue at open3.org and have come up with a solution. Your
analysis of the problem is close. There are two problems in using Crimson in this
manner (they also apply to most other parsers, such as Xerces and Saxon).

1) Parsers usually read in a chunk of data at a time from the stream and then parse it.
2) Parsers will typically close the stream when the end is reached (or an exception is 
thrown).

There are a couple of solutions to this. You can read the relevant data from the
stream yourself, place it in a buffer, and then give that buffer to the parser. Or
you can implement your own stream reader that changes the behavior of the read/close
methods. The Xerces FAQ also contains information on how to do this (Xerces has an
additional requirement in that you should also implement your own
StreamingCharFactory; see that FAQ for more information).
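
As a rough sketch of the first option (buffering the data yourself), assuming the
stream carries a sequence of small, complete documents and that each one ends on a
line containing a known closing tag. The class name BufferedMessageParser, the
helper names, and the end-tag convention are all made up for illustration; they are
not part of Crimson, Xerces, or our code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: buffers one complete message at a time, then parses it.
class BufferedMessageParser {

    // Collects lines up to and including the first line that contains endTag.
    // Returns null if the stream ends before a complete message has arrived.
    static String readMessage(BufferedReader in, String endTag) throws IOException {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            sb.append(line).append('\n');
            if (line.contains(endTag)) {
                return sb.toString();
            }
        }
        return null;
    }

    // Parses every buffered message with SAX and records the element names seen.
    static List<String> parseAll(Reader source, String endTag) throws Exception {
        final List<String> names = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        BufferedReader in = new BufferedReader(source);
        String message;
        while ((message = readMessage(in, endTag)) != null) {
            // The parser only ever sees an in-memory buffer, so it can neither
            // block on the socket nor close it.
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(message)), handler);
        }
        return names;
    }
}
```

The obvious limitation is that this only works if you can tell where one message
ends without parsing it, which is not true of every protocol.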

For our purposes, we use our own reader, which extends the BufferedReader class. I
have attached this class so you can see how we handle it. We create the reader over
the socket stream and give it to the parser like this:
        XMLStreamReader is = new XMLStreamReader(
                new InputStreamReader(socket.getInputStream(), "UTF8"));
        XMLReader reader = ... // Get an XML Reader - or use JAXP or whatever
        reader.parse(new InputSource(is)); // org.xml.sax.InputSource wraps the Reader
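
For anyone reading without the attachment, here is a minimal sketch of the general
idea. It is not the attached class: the name KeepOpenReader is made up, and it
extends FilterReader rather than BufferedReader to keep it short. It addresses the
two problems above by returning whatever characters are already available instead
of waiting for a full chunk, and by ignoring close(). Whether this alone is enough
depends on how a given parser does its own buffering.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical wrapper: hands the parser only the data that has already
// arrived, and refuses to close the underlying stream.
class KeepOpenReader extends FilterReader {

    KeepOpenReader(Reader in) {
        super(in);
    }

    // Blocks for the first character only, then copies characters for as long
    // as the underlying reader reports that more can be read without blocking.
    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int first = in.read();
        if (first == -1) {
            return -1; // genuine end of stream
        }
        cbuf[off] = (char) first;
        int count = 1;
        while (count < len && in.ready()) {
            int c = in.read();
            if (c == -1) {
                break;
            }
            cbuf[off + count++] = (char) c;
        }
        return count;
    }

    // The parser may call close() at end of document or on error; ignore it.
    // Whoever owns the socket is responsible for closing it.
    @Override
    public void close() {
        // intentionally empty
    }
}
```

It would be used the same way as above, e.g. new InputSource(new KeepOpenReader(
new InputStreamReader(socket.getInputStream(), "UTF8"))).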

I should point out that the included code carries a copyright notice, as it is part
of our open source distribution. It is under a GPL-style license, so depending on
your needs this shouldn't be a problem. If it is an issue, or you are cautious about
such things, you can use this (or the Xerces FAQ) as an example to formulate your
own solution.

Hope that helped,
Duane L. Stoddard
[EMAIL PROTECTED]         http://www.open3.org
Open Source Integration Solutions for the Enterprise


-----Original Message-----
From: Charles Owen [mailto:[EMAIL PROTECTED]]
Sent: Friday, August 03, 2001 4:55 PM
To: [EMAIL PROTECTED]
Subject: continuous document; sax



Hello.

I'm trying to use Crimson to parse a "continuous" XML document using SAX.

More specifically, an xml stream is passing through a socket, and the end of
the document may not arrive for quite some time. However, I'd like the SAX
parser to act on elements as soon as they arrive.

I haven't yet looked at the code, but it seems that the parser is waiting until
it sees an EOF before beginning to parse. (Just a guess. Sorry.)

In any case, the parser simply hangs until I close the socket, then it processes
the document (complaining about things at the end).

Is there a way to do what I want using Crimson?

Thanks for any help,

Charles.


---------------------------------------------------------------------
In case of troubles, e-mail:     [EMAIL PROTECTED]
To unsubscribe, e-mail:          [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

XMLStreamReader.java
