Re: Accept-Encoding defaults

2008-07-06 Thread Gisle Aas
On Sat, Jul 5, 2008 at 6:39 PM, Bill Moseley [EMAIL PROTECTED] wrote:
> HTTP::Message has a decoded_content() method that will attempt
> to uncompress based on the Content-Encoding header in the response.
>
> It's wrapped in an eval which will trap exceptions when trying to
> require the modules used to uncompress the content.
>
> It would make sense that I would set Accept-Encoding based on if I
> have those modules installed.

Right.

> Since the list of modules (Compress::Zlib and Compress::Bzip2) is
> internal to HTTP::Message, would it make sense to provide a method
> that could set the Accept-Encoding based on what HTTP::Message uses?
> Something like:
>
>     $req->set_default_accept_encoding;

I don't like defaults to be set at that level given that we already
have a $ua->default_header() method, so I think it should be something
like:

   $ua->default_header('Accept-Encoding', join(',',
       HTTP::Message::decodable()));
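
For illustration, here is a minimal sketch of that suggestion. It assumes an installed libwww-perl in which HTTP::Message::decodable() is available (at this point in the thread it was only a proposal), and it makes no network request. The variable names are arbitrary.

```perl
use strict;
use warnings;
use HTTP::Message;
use LWP::UserAgent;

# Content codings that decoded_content() can handle on this system;
# the list depends on which compression modules are installed.
my @codings = HTTP::Message::decodable();

my $ua = LWP::UserAgent->new;

# Only advertise codings when there is something to advertise.
$ua->default_header('Accept-Encoding' => join(', ', @codings)) if @codings;

my $accept = $ua->default_header('Accept-Encoding');
print defined $accept
    ? "Accept-Encoding: $accept\n"
    : "no decodable content codings found\n";
```

Every request sent through this $ua would then carry the header, while other user agents in the same program are left alone.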

> I'm not clear if there's a need to also specify a quality for the
> encodings in the Accept-Encoding header.

I don't think we need to worry about this initially.

>> This can't be the default as it would break existing users.
>
> I often notice code that uses $res->content instead of
> $res->decoded_content.  Most of the time it seems like users really
> want the decoded content.
>
> I kind of wonder why $res->content is not decoded by default (and
> provide $res->raw_content for those that need it).

It's mostly because of history and compatibility with the original
content() method.  Both are useful in different contexts.  I don't
find the current situation bad. Since decoded_content() can be
expensive and can fail, I think the longer name makes it obvious what's
going on and how you should use it.
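
To make the "can fail" part concrete, here is a small sketch. As far as I understand the documented behaviour, decoded_content() returns undef and leaves the reason in $@ when it cannot decode. The response is built by hand and the coding name 'x-made-up' is invented for the example.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built response with a Content-Encoding that nothing can decode
# ('x-made-up' is a made-up name used only for this illustration).
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/plain', 'Content-Encoding' => 'x-made-up' ],
    'payload',
);

my $content = $res->decoded_content;
my $failed  = !defined $content;

if ($failed) {
    # The reason for the failure is left in $@.
    warn "decoded_content failed: $@";
    # Fall back to the raw, undecoded body.
    $content = $res->content;
}

print "$content\n";
```

Passing raise_error => 1 to decoded_content() should turn the undef return into an exception instead, if that suits the caller better.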

--Gisle


Re: Accept-Encoding defaults

2008-07-06 Thread Bill Moseley
On Sun, Jul 06, 2008 at 02:36:10PM +0200, Gisle Aas wrote:
>> $req->set_default_accept_encoding;
>
> I don't like defaults to be set at that level given that we already
> have a $ua->default_header() method, so I think it should be something
> like:
>
>    $ua->default_header('Accept-Encoding', join(',',
>        HTTP::Message::decodable()));

I also saw this yesterday, about how, in the absence of an
Accept-Encoding header, the server MAY assume that the client accepts
any content coding:

http://use.perl.org/~rhesa/journal/25952

That may be one argument for having a default, but in practice I'd
expect it to be very rare for a server to compress w/o an
Accept-Encoding header sent by the client.

>> I'm not clear if there's a need to also specify a quality for the
>> encodings in the Accept-Encoding header.
>
> I don't think we need to worry about this initially.

And the RFC says qvalues are not permitted with x-gzip and x-compress.

>> I kind of wonder why $res->content is not decoded by default (and
>> provide $res->raw_content for those that need it).
>
> It's mostly because of history and compatibility with the original
> content() method.  Both are useful in different contexts.  I don't
> find the current situation bad. Since decoded_content() can be
> expensive and can fail, I think the longer name makes it obvious what's
> going on and how you should use it.

Agreed, it's not something that could change.  I was just lamenting
how often I see $res->content used in existing programs and modules.

I don't see using $res->decoded_content as more expensive.  If you
need decoded content (which is likely the typical use) then you have
to decode it -- no way around that.


I can only guess that beginners are more likely to use
$res->content directly (as that's the example in the SYNOPSIS) and
they perhaps are on slower connections where compression would help
both the server and client.  But, it's not breaking anything to not
use compression.

Ignoring decoding (charset), on the other hand, is probably wrong in
most cases -- even though it's easy to ignore.


You have this in the SYNOPSIS of LWP::UserAgent:

    if ($response->is_success) {
        print $response->content;  # or whatever
    }

which is, perhaps accidentally, correct since you are printing
un-decoded (charset) content.  But, I doubt most users are just using
LWP to print content out directly.

How would you feel about providing new users with more guidance in the
SYNOPSIS?  That is, use decoded_content in the synopsis for those of
us that often don't get past that section of the man page.

    if ($response->is_success) {
        $content = $response->decoded_content;
    }

Now, I suspect that LWP::Simple really should be returning
decoded_content -- but again, I don't know how to change that one
without breaking a large number of existing scripts.


I think I asked about this some time ago, but it might be good for
HTTP::Message to have decoded_content wrap two methods, one for
uncompressing and one for the charset decoding.  There might be a case
where we would want uncompressing but not decoding.
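
A rough sketch of the three views of a response body follows. As far as I can tell from the HTTP::Message documentation, current releases let decoded_content() skip the charset step via charset => 'none', which seems to cover the "uncompress but don't decode" case. The response is built by hand, so nothing touches the network, and the sample text is arbitrary.

```perl
use strict;
use warnings;
use Encode qw(encode);
use IO::Compress::Gzip qw(gzip $GzipError);
use HTTP::Response;

# Four characters, five bytes once encoded as UTF-8.
my $text  = "caf\x{e9}";
my $bytes = encode('UTF-8', $text);

# Compress the encoded bytes the way a server would.
gzip(\$bytes => \my $gz) or die "gzip failed: $GzipError";

my $res = HTTP::Response->new(
    200, 'OK',
    [
        'Content-Type'     => 'text/plain; charset=UTF-8',
        'Content-Encoding' => 'gzip',
    ],
    $gz,
);

my $raw          = $res->content;                             # gzip bytes as received
my $uncompressed = $res->decoded_content(charset => 'none');  # UTF-8 bytes, not decoded
my $decoded      = $res->decoded_content;                     # Perl character string

printf "raw=%d bytes, uncompressed=%d bytes, decoded=%d chars\n",
    length($raw), length($uncompressed), length($decoded);
```

Only the last form is safe to treat as text; the middle form is what you would hand to something that wants to do its own charset handling.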

Hmm, I'm not clear about this, but I wonder: if the response content is
XML that will be passed to, say, XML::LibXML, should it be passed
decoded or not?



-- 
Bill Moseley
[EMAIL PROTECTED]
Sent from my iMutt