Adriano Bonat wrote:
On Tue, Sep 9, 2008 at 5:39 PM, Simon Kitching <[EMAIL PROTECTED]> wrote:
Can you post the relevant part of the rss input text?
For example:
http://news.google.com/?output=rss&ned=en&num=50&q=test&ie=UTF-8
That isn't what I meant; I'm quite sure that google.com is generating
good xml. But what is being passed to Digester?
From this error message, it sure looks like the input is invalid xml.
And if that is the case, then there is no way to parse it with any xml
parser.
The <description> content from Google's RSS is escaped, so "<" is
&lt;, ">" is &gt;... so I don't understand why I'm getting that error.
By the way, how do you view the raw xml from that url?
If it is intermittent, then maybe you are getting intermittent
truncation of the input data stream.
Hmm.. it is implemented like this:
InputStreamReader isr =
    new InputStreamReader(urlConnection.getInputStream(), "UTF-8");
BufferedReader br = new BufferedReader(isr);
Channel channel = (Channel) this.rssParser.parse(br);
urlConnection.disconnect();
... so is this "intermittent" problem possible when using a BufferedReader?
It would seem so.
I would recommend reading the contents of the input stream into a String
first, then passing that to digester. Then you can see what data is
really being parsed.
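
A minimal sketch of that suggestion, using only the JDK. The class and
method names (ReadFully, readAll) are made up for illustration, and a
StringReader stands in for the network stream so the example runs
without a connection.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadFully {
    /** Drains the reader into a String so the exact parser input can be inspected. */
    static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        // read() returns -1 only at end of stream; short reads are normal.
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in the real code this would be the InputStreamReader
        // wrapped around urlConnection.getInputStream().
        Reader source = new StringReader(
            "<rss><channel><title>test</title></channel></rss>");
        String xml = readAll(source);

        // Log what was actually received before handing it to the parser.
        System.out.println(xml.length() + " chars read:");
        System.out.println(xml);
    }
}
```

The resulting String can then be wrapped in a new StringReader and
passed to the same parse call that currently receives the
BufferedReader. If the failure reappears, the logged text shows whether
the data was cut short.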
By the way, digester does not parse the input itself. Digester is simply
a "sax event handler". The parse methods are just simple convenience
wrappers that create an instance of whatever xml parser is bundled with
the jvm, configure the digester instance to listen to events from that
parser, then pass the input to the xml parser. So what you are seeing
is an error being reported from the standard xml parser built into your
jvm; it really has nothing to do with Digester.
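
To illustrate the split between parser and handler, here is a rough
sketch that uses the JDK's own SAX parser with a trivial handler in
place of Digester. SaxHandlerSketch and ElementRecorder are hypothetical
names, and the truncated string only simulates a cut-off stream. It is
not how Digester is implemented internally, just the same general
arrangement.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SaxHandlerSketch {
    /** Hypothetical handler: records the name of every element the parser reports. */
    static class ElementRecorder extends DefaultHandler {
        final List<String> names = new ArrayList<String>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /** Returns the element names seen; throws if the parser rejects the input. */
    static List<String> parse(String xml) throws Exception {
        // The JVM's bundled parser does the parsing; the handler only listens.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ElementRecorder handler = new ElementRecorder();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<rss><channel><title>t</title></channel></rss>"));
        try {
            parse("<rss><channel><title>t</tit"); // simulated truncation
        } catch (SAXParseException e) {
            // Raised by the XML parser itself, before any handler logic matters.
            System.out.println("parser error at line " + e.getLineNumber()
                + ": " + e.getMessage());
        }
    }
}
```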
Regards,
Simon