OK, here's the algorithm I've come up with. The first rule that matches wins:

1. If an understandable charset is specified in an HTTP header for the page, use that.
2. Otherwise, if the page is text/html and specifies the charset internally with a <META> tag, use that.
3. Otherwise, if the user has explicitly specified a charset, either in the .pluckerrc or .ini file or on the command line, use that charset.
4. Otherwise, if the page's URL starts with 'http:' or 'https:', use the HTTP default charset of ISO-8859-1.
5. Otherwise, if a locale-specific charset (obtained by using the Python locale module) is both specified and understandable, use that. This step mainly matters for file: and plucker: URLs, where it seems just right.
6. Otherwise, mark the charset for the page as unknown.

Bill
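
P.S. To make the order of the checks concrete, here is a rough Python sketch of the fallback chain. It is only an illustration under some assumptions: the names choose_charset and is_understandable, the parameter names, and the META regular expression are all made up for this example and are not taken from the Plucker sources. "Understandable" is assumed to mean "Python has a codec for it", and the locale charset is assumed to come from locale.getpreferredencoding().

```python
import codecs
import locale
import re


def is_understandable(charset):
    """Assumption: a charset is 'understandable' if Python has a codec for it."""
    if not charset:
        return False
    try:
        codecs.lookup(charset)
        return True
    except LookupError:
        return False


# Hypothetical, deliberately loose pattern for a charset inside a <META> tag.
_META_CHARSET = re.compile(
    r'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)


def choose_charset(url, content_type, header_charset=None, body="",
                   user_charset=None):
    """Return a charset name, or None if the charset is unknown."""
    # 1. An understandable charset from the HTTP header wins.
    if is_understandable(header_charset):
        return header_charset

    # 2. For text/html, a charset given in a <META> tag.
    if content_type == "text/html":
        match = _META_CHARSET.search(body)
        if match:
            return match.group(1)

    # 3. A charset the user set explicitly (config file or command line).
    if user_charset:
        return user_charset

    # 4. The HTTP default for http: and https: URLs.
    if url.startswith(("http:", "https:")):
        return "ISO-8859-1"

    # 5. The locale's charset, if it is both specified and understandable.
    locale_charset = locale.getpreferredencoding(False)
    if is_understandable(locale_charset):
        return locale_charset

    # 6. Nothing matched: the charset is unknown.
    return None
```

For example, choose_charset("https://example.com/a.txt", "text/plain") falls through to step 4 and returns "ISO-8859-1", while a file: URL with no other hints ends up with whatever the locale reports, or None.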
