Thanks for the code frag. It's been inserted into my app, and is working beautifully.
But, of course, Oleg's comment begs the question. I'm now reading the response as a stream, but still parsing it as "one huge String." I based my HTML parser on the ones in The Java Tutorial, which all use a String as input, so I assumed (ah... the Problem) that feeding in one huge String was the way to go.
My method calls look like this:
setParser.parseSetPage(readFully(new InputStreamReader(get.getResponseBodyAsStream(), get.getResponseCharSet())));
where: parseSetPage(String str)
Can you point me in the direction of an online example where the page is fed to the parser chunk by chunk?
M.
Oleg Kalnichevski wrote:
Duncan & Michael,
This is precisely the way we recommend the response body be consumed.
The whole idea is that one should REALLY avoid converting the response body to a String unless absolutely necessary. One should really be consuming the response body as a byte or char stream, which results in much more memory-efficient code. For instance, if the content body ultimately gets fed to an HTML parser or a scanner, it is far more efficient to feed it through a Reader in smaller chunks than as one huge String.
There's one little change which I would have made, though:
readFully(
new InputStreamReader(
get.getResponseBodyAsStream(), get.getResponseCharSet()));
Otherwise, everything looks cool.
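To illustrate the point above about feeding the body through a Reader in smaller chunks, here is a minimal sketch. The class name, the ChunkHandler interface and the toy countOpenAngles "parser" are all hypothetical, made up for illustration; they are not part of HttpClient. A StringReader stands in for the response so the example runs on its own. In the real application the Reader would presumably be the same InputStreamReader wrapped around get.getResponseBodyAsStream() shown earlier.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ChunkedReaderDemo {

    /** Hypothetical callback that receives each chunk as soon as it is read. */
    public interface ChunkHandler {
        void handle(char[] buffer, int offset, int length);
    }

    /** Reads the input in fixed-size chunks and hands each one to the handler. */
    public static void feedInChunks(Reader input, ChunkHandler handler, int chunkSize)
            throws IOException {
        char[] buffer = new char[chunkSize];
        int charsRead;
        while ((charsRead = input.read(buffer)) != -1) {
            handler.handle(buffer, 0, charsRead);
        }
    }

    /** Toy "parser": counts '<' characters without ever holding the whole page. */
    public static int countOpenAngles(Reader input, int chunkSize) throws IOException {
        final int[] count = {0};
        feedInChunks(input, (buffer, offset, length) -> {
            for (int i = offset; i < offset + length; i++) {
                if (buffer[i] == '<') {
                    count[0]++;
                }
            }
        }, chunkSize);
        return count[0];
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response body; only 8 chars are in memory at a time.
        Reader page = new StringReader("<html><body><p>hi</p></body></html>");
        System.out.println(countOpenAngles(page, 8)); // prints 6
    }
}
```

Counting characters is easy because no state carries over between chunks. A real HTML parser has to cope with tags that straddle a chunk boundary, so it would need to keep some state from one call to the next.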
Cheers,
Oleg
On Sat, 2004-11-27 at 10:05 +0000, Duncan McGregor wrote:
It will kind of work, although readLine discards the line end character, which you might well want when parsing the string. And you may want to consider the character set used in the InputStreamReader.
Coincidentally I wrote this code yesterday
public static String readFully(Reader input) throws IOException {
    BufferedReader bufferedReader = input instanceof BufferedReader
            ? (BufferedReader) input
            : new BufferedReader(input);
    StringBuffer result = new StringBuffer();
    char[] buffer = new char[4 * 1024];
    int charsRead;
    while ((charsRead = bufferedReader.read(buffer)) != -1) {
        result.append(buffer, 0, charsRead);
    }
    return result.toString();
}
Call this with doc = readFully(new InputStreamReader(get.getResponseBodyAsStream(), YOURCHARSET));
Another good bet would be Jakarta Commons IO - IOUtils.toString(Reader)
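For what it's worth, the readLine caveat mentioned at the top of this reply can be seen in a small sketch. The class and method names are invented for the demonstration, and a StringReader stands in for the response stream. The first method keeps the logic of the proposed loop as posted; the second is just one possible variant, not the only way to do it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadLineDemo {

    /** The loop as proposed in the original post, with its logic unchanged. */
    public static String readWithOriginalLoop(Reader input) throws IOException {
        BufferedReader in = new BufferedReader(input);
        StringBuffer buffer = new StringBuffer();
        String str = "";
        while (str != null) {
            str = in.readLine();   // line terminator is stripped here
            buffer.append(str);    // on the last pass str is null, so "null" is appended
        }
        return buffer.toString();
    }

    /** Variant that tests for null before appending and puts a '\n' after each line. */
    public static String readLinesKeepingBreaks(Reader input) throws IOException {
        BufferedReader in = new BufferedReader(input);
        StringBuffer buffer = new StringBuffer();
        String line;
        while ((line = in.readLine()) != null) {
            buffer.append(line).append('\n'); // every terminator becomes a single \n
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        String page = "a\nb\n";
        System.out.println(readWithOriginalLoop(new StringReader(page)));     // prints abnull
        System.out.print(readLinesKeepingBreaks(new StringReader(page)));     // prints a and b on separate lines
    }
}
```

Note that the variant does not reproduce the input exactly: it turns \r\n into \n and adds a trailing line break even when the input had none. Reading into a char[] buffer, as readFully does, sidesteps both issues.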
Duncan Mc^Gregor The name rings a bell www.oneeyedmen.com
-----Original Message-----
From: Michael Taft [mailto:[EMAIL PROTECTED]
Sent: 27 November 2004 07:03
To: HttpClient User Discussion
Subject: getResponseBodyAsStream
HttpClient keeps begging me to use getResponseBodyAsStream, rather than getResponseBodyAsString, due to the size of the response body. I'm willing to do this, even if just to make it happy. However, as a total newbie, I'm not clear about the best way to take a response stream and turn it into a string (that I can then parse, which is what I'm up to).
I realize this is a trivial task for most of you. Here is how I propose to do it:
------
StringBuffer buffer = new StringBuffer();
try {
    InputStream is = get.getResponseBodyAsStream();
    BufferedReader in = new BufferedReader(new InputStreamReader(is));
    String str = "";
    while(str != null) {
        str = in.readLine();
        buffer.append(str);
    }
} catch(IOException e) { ...etc. }
------
My questions about this are: 1) Will this work? 2) Is there a better way to do it?
Thanks. M.
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
-- Michael W. Taft Writer/Editor 4614 Finley Avenue, #3 Los Angeles, CA 90027 (323)663-6042
