>> -override-charset maybe?
> Or, heck, that!
:-)
> But editing a document to get it -dump'ed out correctly, that is no
> good.
Well...the real offender here is whatever led to serving 8859-* data
but mislabeling it as UTF-8. I have mixed feelings about making it
easy to work around brokenness
Steffen Nurpmeso dixit:
>> But editing a document to get it -dump'ed out correctly, that is no
>> good.
You’d be surprised at the number of changes I have to make to some
pages for them to lead to correct results…
Mouse dixit:
>Well...the real offender here is whatever led to serving 8859-*
Hello!
I stumbled over the above, i.e.,

  curl -o x.html https://www.google.com

from here in Germany (in the C.UTF-8 locale; a PLD-Linux developer
reports the same from Poland) fetches a document that declares itself
as UTF-8 but in fact contains ISO-8859-1 (or -2).
I had to copy the document and modify it,
> I stumbled over the above, [...]
IMO this is correct. I would say there could be a place for a flag
that overrides the document's declared charset, but I don't think
"assume" would be a good verb to use. -override-charset maybe?
/~\ The ASCII Mouse
\ / Ribbon
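
For illustration only, the situation discussed here (bytes labeled as
UTF-8 that are really ISO-8859-1) and the idea of overriding the
declared charset can be sketched in Python. This is a minimal sketch,
not taken from any browser's source: the function name
decode_with_override and its default arguments are assumptions made
up for the example.

```python
from typing import Tuple


def decode_with_override(
    data: bytes,
    declared: str = "utf-8",
    override: str = "iso-8859-1",
) -> Tuple[str, str]:
    """Decode data, returning (text, charset_actually_used).

    Hypothetical helper: try the charset the document declares and,
    if the bytes are not valid in it, fall back to the override.
    """
    try:
        # Strict decoding raises on byte sequences invalid in `declared`.
        return data.decode(declared), declared
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this cannot fail.
        return data.decode(override), override


if __name__ == "__main__":
    # Bytes that are ISO-8859-1 but would be served with a UTF-8 label.
    mislabeled = "Grüße".encode("iso-8859-1")
    text, used = decode_with_override(mislabeled)
    print(used, text)
```

Note that the fallback only triggers when the bytes are invalid in the
declared charset; an explicit override flag, as suggested above, would
skip the first attempt entirely.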
Mouse wrote in
<202401172011.paa04...@stone.rodents-montreal.org>:
|> I stumbled over the above, [...]
|
|IMO this is correct. I would say there could be a place for a flag
|that overrides the document's declared charset, but I don't think
|"assume" would be a good verb to use.
Thorsten Glaser wrote:
|Steffen Nurpmeso dixit:
|>> But editing a document to get it -dump'ed out correctly, that is no
|>> good.
|
|You’d be surprised at the number of changes I have to make to some
|pages for them to lead to correct results…
Likely so.
--steffen
|
|Der Kragenbaer,