Dr.Ruud wrote:
Erik wrote:
iconv converts 'å' into "Ã¥". Then code2html recognizes à but not ¥ as
a letter, so the output of "Aål" is:
<font color="#2040a0">AÃ</font>¥<font color="#2040a0">l</font>
OK, that's a bug in code2html. Also, the produced HTML doesn't signal
that it is in UTF-8.
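To make the "Ã¥" effect concrete, here is a minimal sketch. It assumes the iconv run was an ISO-8859-1 to UTF-8 conversion (the exact command is not given in the thread), and hex_bytes and latin1_to_utf8 are names made up for illustration. The single Latin-1 byte for 'å' becomes two UTF-8 bytes, and a tool that still treats every byte as one Latin-1 character sees them as 'Ã' (0xC3) and '¥' (0xA5).

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: hex dump of a byte string, e.g. "C3 A5".
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

# Hypothetical helper: re-encode ISO-8859-1 bytes as UTF-8.
# Assumption: this mirrors "iconv -f ISO-8859-1 -t UTF-8".
sub latin1_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('ISO-8859-1', $bytes));
}

my $latin1 = "\xE5";                    # 'å' as one ISO-8859-1 byte
my $utf8   = latin1_to_utf8($latin1);   # the same letter as UTF-8 bytes

print 'ISO-8859-1: ', hex_bytes($latin1), "\n";   # E5
print 'UTF-8:      ', hex_bytes($utf8),   "\n";   # C3 A5
```

Read one byte at a time as Latin-1, 0xC3 is a letter but 0xA5 is the yen sign, which fits the split seen in the code2html output.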
I just figured out a way to get a result that is correctly displayed in
the browser and does not give error messages from code2html!
Here is what I had to do:
1. Add "use locale;" to code2html.
2. Make sure that LANG is set to swedish (I executed "export
LANG=swedish" in the shell. To make it permanent in Gentoo I added
"LANG=swedish" to /etc/env.d/02locale and executed env-update.)
3. Execute "code2html prov.adb prov.adb.html".
Here is what I had to avoid doing:
* Adding "use encoding 'latin1';" to code2html. With this, code2html still
seems to produce correct output, but it shows error messages on the
konsole ("Malformed UTF-8 character ...").
* Converting the file with iconv first. This will produce "<font
color="#2040a0">AÃ</font>¥<font color="#2040a0">l</font>".
It may be relevant that my /usr/share/locale/locale.alias contains the
line "swedish sv_SE.ISO-8859-1".
It works for me now, but it would be preferable if code2html always
behaved as if LANG were set to swedish, so that it just works for
everyone everywhere.
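One possible direction for making this independent of LANG, sketched under assumptions: decode the input as ISO-8859-1 with the Encode module before matching, so that Perl applies character semantics instead of consulting the locale. This is not what code2html or the attached patch does, and all_letters is a hypothetical helper used only for the demonstration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: 1 if every character in $text is a letter, else 0.
sub all_letters {
    my ($text) = @_;
    return $text =~ /\A[[:alpha:]]+\z/ ? 1 : 0;
}

my $bytes = "A\xE5l";                       # "Aål" as ISO-8859-1 bytes
my $chars = decode('ISO-8859-1', $bytes);   # the same text as Perl characters

# On a Perl without "use locale" or the unicode_strings feature,
# the raw byte 0xE5 is normally not treated as a letter (expected 0).
printf "raw bytes: %d\n", all_letters($bytes);

# After decoding, the letter check no longer depends on the environment.
printf "decoded:   %d\n", all_letters($chars);
```

The trade-off is that the input encoding then has to be known or guessed, whereas "use locale" picks it up from the environment.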
And since HTML should have things like "&aring;", an output layer is also
needed. Seems like a lot of work.
The default of HTML was once Latin1 (ISO-8859-1), so either "&aring;" or
a byte with value 229 should be equivalent.
But it is of course much better to follow this:
http://www.w3.org/TR/REC-html40/charset.html
and include a proper header.
Yes, it should include <meta http-equiv="Content-Type"
content="text/html; charset=ISO-8859-1"> in the <head>. I added it to
all the HTML templates in code2html (patch attached). But even with
that, the output is far from valid HTML according to the validator.
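For the output layer mentioned earlier in the thread, a minimal sketch of what it might look like, assuming the input is ISO-8859-1: every non-ASCII character is written as a numeric character reference, so the result is plain ASCII and displays the same whatever charset the page declares. latin1_to_ascii_html is a hypothetical name, not part of code2html.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: ISO-8859-1 bytes in, ASCII-only HTML text out.
# It only rewrites non-ASCII characters; escaping of &, < and > is
# assumed to happen elsewhere.
sub latin1_to_ascii_html {
    my ($bytes) = @_;
    my $text = decode('ISO-8859-1', $bytes);
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

print latin1_to_ascii_html("A\xE5l"), "\n";   # prints A&#229;l
```

With the meta tag from the patch in place this is optional, since the raw byte 229 is then declared correctly.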
--- code2html 2006-05-13 03:17:35.000000000 +0200
+++ code2html.ada_identifiers_fixed 2006-05-14 12:19:23.000000000 +0200
@@ -1,3 +1,4 @@
#!/usr/bin/perl -w
+use locale;
my $vernr = "0.9.1";
my $monthshort = "Jan";
@@ -1326,4 +1327,5 @@
'<html>
<head>
+ <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>%%title%%</title>
</head>
@@ -1474,4 +1476,5 @@
${ $STYLESHEET{'html-nobg'}} {'template'} = '<html>
<head>
+ <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>%%title%%</title>
</head>
@@ -1494,4 +1497,5 @@
${ $STYLESHEET{'html-dark'}} {'template'} = '<html>
<head>
+ <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>%%title%%</title>
</head>
@@ -1751,4 +1755,5 @@
${ $STYLESHEET{'html-simple'}} {'template'} = '<html>
<head>
+ <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>%%title%%</title>
</head>
@@ -1774,4 +1779,5 @@
${ $STYLESHEET{'html-fntlck'}} {'template'} = '<html>
<head>
+ <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>%%title%%</title>
</head>
@@ -1968,5 +1974,5 @@
{
'name' => 'Identifiers',
- 'regex' => '\\b[a-zA-Z][a-zA-Z0-9_]*\\b',
+ 'regex' => '\\b[[:alpha:]](?:_?[^\W_])*\\b',
'style' => 'identifier',
'childregex' => []
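
A quick way to see what the new 'Identifiers' pattern accepts, as a sketch: the regex below is the one from the patch after Perl's single-quote unescaping, and first_identifier is a hypothetical helper. The non-ASCII sample is decoded first so the result does not depend on the locale. In code2html itself the pattern sees raw bytes and relies on "use locale".

```perl
use strict;
use warnings;
use Encode qw(decode);

# The pattern from the patch: a letter, then letters or digits,
# each optionally preceded by a single underscore.
my $ADA_IDENTIFIER = qr/\b[[:alpha:]](?:_?[^\W_])*\b/;

# Hypothetical helper: first identifier found in $text, or undef.
sub first_identifier {
    my ($text) = @_;
    return $text =~ /($ADA_IDENTIFIER)/ ? $1 : undef;
}

for my $sample ('Get_Line', 'X1', 'foo__bar', 'trailing_') {
    my $found = first_identifier($sample);
    printf "%-10s %s\n", $sample,
        defined $found ? "matches $found" : 'no match';
}

# "Aål" given as ISO-8859-1 bytes, decoded to characters before matching.
my $aal = first_identifier(decode('ISO-8859-1', "A\xE5l"));
printf "A\\xE5l: matched %d characters\n", defined $aal ? length($aal) : 0;
```

Doubled and trailing underscores are rejected outright, because "_" counts as a word character and so "\b" cannot match next to it.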