On 8/31/13 12:58 PM +0900, MCBastos wrote:
Interviewed by CNN on 30/08/2013 02:34, Ken F told the world:
I will experiment with that. Doing the actual html codes could be a good basic quick
fix for me to do for restoring copyright symbols, and other isolated symbols on
webpages. It may be more difficult for the much older story webpages from 10 years
ago, where the left-quotes and right-quotes and apostrophes are also now displaying
as <?>.
There is software that will do the conversion for you. HTMLtidy, for
instance, is able to take all your Win-1252 and convert it into
US-ASCII, changing all those curly quotes into appropriate HTML entities
such as &lsquo;, &rsquo; and such.
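As a rough illustration of what that conversion amounts to, here is a minimal Python sketch. It is not Tidy and makes no claim about how Tidy works internally; it only shows the idea of turning Win-1252 bytes into plain US-ASCII with named entities. It assumes the input really is Windows-1252, and the function name is made up for this example.

```python
from html.entities import codepoint2name


def cp1252_to_ascii_entities(raw: bytes) -> str:
    """Decode Windows-1252 bytes and return pure-ASCII text in which every
    non-ASCII character is written as an HTML entity.

    Hypothetical helper for illustration only. Existing markup, including
    any '&' already in the page, is left untouched.
    """
    # Bytes that Windows-1252 leaves undefined become U+FFFD.
    text = raw.decode("cp1252", errors="replace")
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            pieces.append(ch)                           # already ASCII
        elif code in codepoint2name:
            pieces.append(f"&{codepoint2name[code]};")  # named entity
        else:
            pieces.append(f"&#{code};")                 # numeric fallback
    return "".join(pieces)
```

For example, the bytes 0x91 and 0x92 (the Win-1252 curly single quotes) come out as &lsquo; and &rsquo;, and 0xA9 comes out as &copy;.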
Note: there are three major "generations" of HTMLtidy around the web:
- The original one by Dave Raggett, which is very old (there are still
some sites offering that one)
- The "official" updated version at Sourceforge, which is improved but
has no support for HTML5 stuff:
http://tidy.sourceforge.net/
(This page also has a lot of documentation)
- And Tidy-HTML5, a new (unstable) fork which is adding support for
HTML5, but for which it's a bit hard to find standalone binaries -- the
most recent one I found was here: http://tidybatchfiles.info/
Since you are mostly dealing with oldish pages, with no HTML5, your best
bet is probably the Sourceforge page.
The good points about Tidy: it's a command-line tool, therefore it's not
that hard to find some way to run it on all your HTML files in a batch.
Also, it will find (and attempt to fix) a lot of HTML syntax errors.
The bad points about Tidy: it's a command-line tool, so there is some
effort involved in learning to use it. Also, you don't see what it does
as it is doing it, and its syntax-fixing algorithm is prone to doing stupid
things. I mostly use it as an error FINDER, not an error FIXER; I run it
(taking care to keep a backup of the original file), check the error
log, revert to the backup file and fix the errors by hand.
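The error-finder workflow can be sketched as a small batch script. This is only a sketch under some assumptions: that a classic `tidy` binary is on the PATH, that the pages are Win-1252, and that the options used (`-e` for errors only, `-q` for quiet, `-win1252` for the input encoding, `-f` for the error file) behave as the Tidy documentation describes. The function names and the `.tidy.log` naming are made up for this example. Because `-e` only reports and does not write a fixed file, the originals are left alone.

```python
import subprocess
from pathlib import Path


def tidy_check_command(html_file, log_file):
    """Build a tidy command line that only reports problems.

    Hypothetical helper: -e = errors and warnings only, -q = quiet,
    -win1252 = treat input as Windows-1252, -f = write report to log_file.
    """
    return ["tidy", "-e", "-q", "-win1252", "-f", str(log_file), str(html_file)]


def find_html_files(root):
    """Return all .htm/.html files under root, sorted for stable order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in {".htm", ".html"}
    )


def check_tree(root):
    """Run tidy in report-only mode on every page under root."""
    for html_file in find_html_files(root):
        log_file = html_file.with_name(html_file.name + ".tidy.log")
        # check=False: tidy exits non-zero when it finds warnings or errors,
        # which is expected here and should not abort the batch.
        subprocess.run(tidy_check_command(html_file, log_file), check=False)
```

Calling `check_tree("path/to/site")` would leave one `.tidy.log` file next to each page, to be read through before fixing anything by hand.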
But if your code is reasonably clean and free of stupid errors like
forgetting to close <b> tags, the auto-fixing might work for you. You
should check the pages afterwards just in case, to make sure Tidy didn't
make a mess of them, as sometimes happens.
SeaMonkey will also very nicely change the encoding in Composer.
--
/////////////////////////////////////////////////////////
// Trane Francks Tokyo, Japan
// Practice random kindness and senseless acts of beauty.