On 9/22/07, Bjoern Hoehrmann <[EMAIL PROTECTED]> wrote:
>
> * Bill Moseley wrote:
> >If you have the response object:
> >    $response->decoded_content;
>
> That removes content encodings like gzip and deflate, but David is
> asking about character encodings like utf-8 and iso-8859-1. Content
> encodings are applied after character encodings.
>

After reading Bill's response I thought the same thing, but added,
"...though that sounds like it would be the perfect place to
implement this."

After checking the code, decoded_content does indeed decode character
encodings and returns text instead of octets!  I don't think it used to do
that, but that's great.
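
A minimal sketch of the difference, assuming a reasonably recent HTTP::Message. The response is built by hand so that no network access is needed; the body and the Content-Type header are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Response;
use Encode qw(encode);

# Illustrative body: the four characters "caf\x{e9}" encoded as UTF-8 octets.
my $body = encode('UTF-8', "caf\x{e9}");

# Hand-built response; the charset parameter tells decoded_content how to decode.
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/plain; charset=UTF-8' ],
    $body,
);

my $octets = $res->content;          # raw octets, as sent on the wire
my $text   = $res->decoded_content;  # character string, or undef if decoding fails

printf "content:         %d octets\n",     length $octets;
printf "decoded_content: %d characters\n", length $text;
```

The two lengths differ because the accented character takes two octets in UTF-8 but counts as a single character once decoded.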

It still doesn't help in the LWP::Simple case, though, and if someone is
actually using LWP::Simple for their application, they probably aren't going
to spend the time needed to ensure the octets they get back are meaningful
text either.  But this certainly simplifies the problem.

What would people think about just changing LWP::Simple to use
decoded_content instead of content?
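
Roughly what that change would amount to, as a sketch only. The function name get_text is hypothetical and this is not LWP::Simple's actual code; it only assumes LWP::UserAgent and the decoded_content method discussed above.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical stand-in for LWP::Simple's get(): same calling convention
# (URL in, body out, undef on failure), but it returns decoded text
# rather than raw octets.
sub get_text {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get($url);
    return undef unless $res->is_success;
    return $res->decoded_content;   # may itself be undef if decoding fails
}
```

A caller would use it just like get, for example my $html = get_text($url), and would still need to check for undef.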

David
