On 9/22/07, Bjoern Hoehrmann <[EMAIL PROTECTED]> wrote:
>
> * Bill Moseley wrote:
> >If you have the response object:
> >    $response->decoded_content;
>
> That removes content encodings like gzip and deflate, but David is
> asking about character encodings like utf-8 and iso-8859-1. Content
> encodings are applied after character encodings.
>

After reading Bill's response I thought the same thing, but added,
"...though that sounds like it would be the perfect place to implement
this." Having checked the code, I see that decoded_content does indeed
decode character encodings and returns text instead of octets! I don't
think it used to do that, but that's great.

It still doesn't help in the LWP::Simple case, though. If someone is
actually using LWP::Simple for their application, they probably aren't
going to spend the time needed to turn the octets they get back into
meaningful text either.

But this certainly simplifies the problem. What would people think
about just changing LWP::Simple to use decoded_content instead of
content?

David
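
P.S. To make the content vs. decoded_content difference concrete, here
is a minimal sketch. It assumes HTTP::Response (part of the libwww-perl
family of modules) is installed. The hand-built response and its body
are made up purely for illustration, and nothing is fetched over the
network.

```perl
use strict;
use warnings;
use HTTP::Response;

# Build a response by hand (illustrative only, no network access).
# The body is the UTF-8 encoding of "cafe" with an accented e:
# 5 octets on the wire, 4 characters once decoded.
my $response = HTTP::Response->new(200, 'OK');
$response->header('Content-Type' => 'text/plain; charset=UTF-8');
$response->content("caf\xC3\xA9");

# content returns the raw octets exactly as they were stored.
my $octets = $response->content;

# decoded_content undoes any Content-Encoding and, for text types,
# also decodes the charset named in Content-Type into Perl characters.
my $text = $response->decoded_content;

printf "content:         %d\n", length $octets;   # 5
printf "decoded_content: %d\n", length $text;     # 4
```

If LWP::Simple were changed as suggested, callers of get() would
presumably receive the second form (text) rather than the first
(octets).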