Re: Byte Order Mark mucks up headers

Gisle Aas Thu, 07 Oct 2004 02:27:06 -0700

"Phil Archer" <[EMAIL PROTECTED]> writes:

> I've read Sean Burke's book, I've looked through the archives of this
> list and done other searches but can't find an answer to a problem I
> have found with LWP. If the character coding for a website has a byte
> order mark (things like utf-16, all that "big endian/little endian"
> stuff) then LWP can't interpret HTML headers in the usual way. Does
> anyone know a way around this?


HTML::HeadParser needs to be fixed.  It will assume that there is no
<head> section when it sees text before anything else.  The part of
the code responsible for this currently allows whitespace, but needs
to be taught that a BOM is harmless too.  Look at the 'text' method.
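
As a rough illustration of the check described above (a sketch in
Python, not the module's actual Perl code): the function name
is_ignorable_text is hypothetical, and the sketch assumes the text has
already been decoded, so that the BOM shows up as the single character
U+FEFF at the start of the first chunk.

```python
BOM = "\ufeff"  # byte order mark as it appears in decoded text


def is_ignorable_text(text: str) -> bool:
    """Return True if text seen before any tag should NOT end the head.

    Hypothetical helper: whitespace is ignorable, and so is one
    leading BOM followed by nothing but whitespace.
    """
    # Drop a single leading BOM, if present.
    if text.startswith(BOM):
        text = text[len(BOM):]
    # What remains is harmless only if it is empty or all whitespace.
    return text.strip() == ""


if __name__ == "__main__":
    for sample in ("  \n", BOM + "\n  ", BOM + "Hello", "Hello"):
        print(repr(sample), is_ignorable_text(sample))
```

A parser that only allowed whitespace would treat the second sample as
body text and give up on the <head> section; with the BOM stripped
first, it is treated like plain whitespace.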

Do you want to try to provide a patch?

Regards,
Gisle
