Adriano Bonat wrote:
On Tue, Sep 9, 2008 at 5:39 PM, Simon Kitching <[EMAIL PROTECTED]> wrote:
Can you post the relevant part of the rss input text?

For example:
http://news.google.com/?output=rss&ned=en&num=50&q=test&ie=UTF-8
That isn't what I meant; I'm quite sure that google.com is generating good xml. But what is being passed to Digester?
From this error message, it sure looks like the input is invalid xml.
And if that is the case, then there is no way to parse it with any xml
parser.

The <description> content from Google's RSS is escaped, so "<" is
&lt;, ">" is &gt;... so I don't understand why I'm getting that error.
By the way, how do you view the raw xml from that url?
If it is intermittent, then maybe you are getting intermittent
truncation of the input data stream.

Hmm.. it is implemented like this:

InputStreamReader isr = new InputStreamReader(urlConnection.getInputStream(), "UTF-8");
BufferedReader br = new BufferedReader(isr);

Channel channel = (Channel) this.rssParser.parse(br);

urlConnection.disconnect();

... so is this "intermittent" problem possible when using a BufferedReader?

It would seem so.

I would recommend reading the contents of the input stream into a String first, then passing that to digester. Then you can see what data is really being parsed.
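
A minimal sketch of that idea, using only the standard library. The class name FeedCapture and the method readAll are made up for illustration, and the in-memory stream in main is just a stand-in for urlConnection.getInputStream():

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Digester.
class FeedCapture {

    /** Reads everything the stream delivers and decodes it as UTF-8. */
    static String readAll(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringWriter out = new StringWriter();
        char[] buf = new char[4096];
        int n;
        // read() returns -1 only at end of stream, so short reads are handled.
        while ((n = reader.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for urlConnection.getInputStream().
        InputStream in = new ByteArrayInputStream(
                "<rss><channel/></rss>".getBytes(StandardCharsets.UTF_8));
        String xml = readAll(in);

        // Log what was really received before any parser sees it.
        System.out.println(xml.length() + " chars received");
        System.out.println(xml);
    }
}
```

With the text in hand, the same parser object from the snippet above could presumably be fed via rssParser.parse(new StringReader(xml)), and a failing input can be logged or saved for inspection.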

By the way, Digester does not parse the input itself. Digester is simply a SAX event handler. The parse methods are just convenience wrappers that create an instance of whatever XML parser is bundled with the JVM, configure the Digester instance to listen to events from that parser, then pass the input to the XML parser. So what you are seeing is an error reported by the standard XML parser built into your JVM; it really has nothing to do with Digester.
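
To illustrate, here is a rough sketch of that wiring with a plain DefaultHandler standing in for the Digester instance (Digester extends DefaultHandler). The class name SaxWiring and the method check are invented for this example; it is not Digester's actual source, just the same general pattern with the JAXP API:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Digester.
class SaxWiring {

    /** Returns null if the text is well-formed, else the parser's complaint. */
    static String check(String xml)
            throws IOException, ParserConfigurationException, SAXException {
        // Whatever SAX parser the JVM provides.
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // A Digester instance would take this place; it only listens to events.
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);

        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            // Raised by the XML parser, not by the handler.
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(check("<rss><channel/></rss>"));
        // Truncated document, as a cut-off stream would produce.
        System.out.println(check("<rss><channel>"));
    }
}
```

Running the truncated input through this should fail with a parse error even though no Digester is involved, which is a quick way to confirm where the message comes from.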

Regards,
Simon

