Dr.Ruud wrote:

> Erik wrote:
>> iconv converts 'å' into "Ã¥". Then code2html recognizes Ã but not ¥ as
>> a letter, so the output of "Aål" is:
>> <font color="#2040a0">AÃ</font>¥<font color="#2040a0">l</font>
>
> OK, that's a bug of code2html. Also, the produced html doesn't signal
> that it is in UTF-8.

I just figured out a way to get a result that is correctly displayed in the browser and does not give error messages from code2html!

Here is what I had to do:
1. Add "use locale;" to code2html.
2. Make sure that LANG is set to swedish (I executed "export LANG=swedish" in the shell. To make it permanent in Gentoo I added "LANG=swedish" to /etc/env.d/02locale and executed env-update.)
3. Execute "code2html prov.adb prov.adb.html".

Here is what I had to avoid doing:
* Add "use encoding 'latin1';" to code2html. With this, code2html still seems to produce correct output, but it shows error messages on the konsole ("Malformed UTF-8 character ...").
* Convert the file with iconv first. This will produce "<font color="#2040a0">AÃ</font>¥<font color="#2040a0">l</font>".
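
To see why converting with iconv first goes wrong, here is a minimal sketch. It assumes the source file is ISO-8859-1 and that iconv, od and tr are available; the three-character sample string stands in for the file contents.

```shell
# "Aål" in ISO-8859-1 is three bytes: 41 e5 6c (\345 is octal for 0xe5).
latin1_hex=$(printf 'A\345l' | od -An -tx1 | tr -d ' \n')

# iconv turns the single byte e5 into the two-byte UTF-8 sequence c3 a5.
utf8_hex=$(printf 'A\345l' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

echo "ISO-8859-1: $latin1_hex"
echo "UTF-8:      $utf8_hex"
```

This prints 41e56c for the original and 41c3a56c after conversion. A program that reads those bytes as ISO-8859-1 sees c3 as Ã and a5 as ¥, which matches the split in the output above.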

It may be relevant that my /usr/share/locale/locale.alias contains the line "swedish sv_SE.ISO-8859-1".

It works for me now, but it would be preferable if code2html always behaved as if LANG were set to swedish, so that it just works for everyone everywhere.
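
Until then, a possible stopgap is to set LANG for the one command only instead of system-wide. This is just a sketch: the helper name with_swedish_lang is made up, and it assumes the "swedish" alias exists on the machine and that code2html is on the PATH.

```shell
# Hypothetical helper (not part of code2html): run one command with LANG
# forced to "swedish" without touching the rest of the shell session.
with_swedish_lang() {
    LANG=swedish "$@"
}

# Intended use, assuming code2html is on the PATH:
#   with_swedish_lang code2html prov.adb prov.adb.html
```

Note that LC_ALL or LC_CTYPE, if set in the environment, take precedence over LANG, so they would have to be unset or overridden as well.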


>> And since HTML should have things like "&aring;", an output layer is
>> also needed. Seems like a lot of work.
>
> The default of HTML was once Latin1 (ISO-8859-1), so either "&aring;" or
> a byte with value 229 should be equivalent.
> But it is of course much better to follow this:
> http://www.w3.org/TR/REC-html40/charset.html
> and include a proper header.

Yes, it should include <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> in the <head>. I added it to all the HTML templates in code2html (patch attached). But even with that, it is far from correct HTML according to the validator.

--- code2html	2006-05-13 03:17:35.000000000 +0200
+++ code2html.ada_identifiers_fixed	2006-05-14 12:19:23.000000000 +0200
@@ -1,3 +1,4 @@
 #!/usr/bin/perl -w
+use locale;
 my $vernr = "0.9.1";
 my $monthshort = "Jan";
@@ -1326,4 +1327,5 @@
 '<html>
 <head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <title>%%title%%</title>
 </head>
@@ -1474,4 +1476,5 @@
 ${ $STYLESHEET{'html-nobg'}} {'template'} = '<html>
 <head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <title>%%title%%</title>
 </head>
@@ -1494,4 +1497,5 @@
 ${ $STYLESHEET{'html-dark'}} {'template'} = '<html>
 <head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <title>%%title%%</title>
 </head>
@@ -1751,4 +1755,5 @@
 ${ $STYLESHEET{'html-simple'}} {'template'} = '<html>
   <head>
+    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
     <title>%%title%%</title>
   </head>
@@ -1774,4 +1779,5 @@
 ${ $STYLESHEET{'html-fntlck'}} {'template'} = '<html>
 <head>
+  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <title>%%title%%</title>
 </head>
@@ -1968,5 +1974,5 @@
                                               {
                                                 'name'       => 'Identifiers',
-                                                'regex'      => '\\b[a-zA-Z][a-zA-Z0-9_]*\\b',
+                                                'regex'      => '\\b[[:alpha:]](?:_?[^\W_])*\\b',
 					        'style'      => 'identifier',
                                                 'childregex' => []
