On Wed, 2007-04-04 at 14:53 +0200, Jérôme Haguet wrote:
> I think I start to get it
>
> + My data inside Bugzilla are Latin-1 encoded
> + Changing the UTF-8 flag in Bugzilla will not change that, it will only
> result in bad display in my browser for existing data
> + integration_get_vdd get the data from Bugzilla as it is, without changing
> the encoding.
>
> So, I see 2 solutions
> 1/ Converting my Bugzilla database to UTF-8 (I do not know how to do that
> right now, maybe exporting and importing)

Other people may have the same problem and converting may not be an
option for them.

> 2/
> 2.a/ Adding a parameter in Scmbug to know how original data are encoded

I like this option. The documentation of XML::Simple, which the daemon
uses to convert the data into an XML file, appears to allow choosing the
encoding as well:

http://search.cpan.org/dist/XML-Simple/lib/XML/Simple.pm

e.g.

    open my $fh, '>:encoding(iso-8859-1)', $path or die "open($path): $!";
    XMLout($ref, OutputFile => $fh);

which works when actually opening a file (we don't do that -- we produce
the xml string and transmit it across a socket).

So, preferably with:

    my $xml = XML::Simple->new(
        NoAttr  => 1,
        XMLDecl => "<?xml version='1.0' encoding='iso-8859-1'?>",
    );
    my $ret_message = $xml->XMLout( $vdd );

which will provide the encoding when the XML file is created.

So, this encoding value could be listed in the daemon.conf file. How does
this sound?

> 2.b.1/ Adding that parameter on the first line of the vdd.xml (as I did it
> for testing purpose)

We are doing this from the user-side when we call the vdd generator, so we
don't have a configuration file around that will contain the description of
the encoding. We will have to use a command-line argument to the vdd
generator to explicitly specify it. I'm not against this either. Perhaps
for some data in your database, the encoding is Latin-1, but after today
you turn on UTF-8 in Bugzilla (or use some other encoding) and all your
new data are encoded in UTF-8, hence you might still need to explicitly
specify this to the vdd generator.

> 2.b.1/ Converting data to UTF-8 at Fetch time (I do not know how to do that
> right now)

This seems similar to option 1, just applied when needed.

Another question: Do we need the encoding explicitly listed in the html
version of the vdd? It looks like it's displayed well in your browser so
far, but is still missing from the .html file. Is it safe to assume that
this is a totally separate problem (perhaps a bug with docbook2html)?

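On converting at fetch time (option 2.b above): a minimal sketch, assuming
the data arrives from Bugzilla as raw Latin-1 bytes, could use Perl's core
Encode module. The helper name to_utf8 and the sample string are
hypothetical and not part of Scmbug.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: re-encode raw bytes from $from_encoding as UTF-8 bytes.
sub to_utf8 {
    my ($bytes, $from_encoding) = @_;
    # decode() turns the raw bytes into Perl characters. FB_CROAK makes it
    # die on malformed input; LEAVE_SRC keeps $bytes from being modified.
    my $chars = decode($from_encoding, $bytes,
                       Encode::FB_CROAK | Encode::LEAVE_SRC);
    # encode() turns the characters back into bytes, this time as UTF-8.
    return encode('UTF-8', $chars);
}

# "Jérôme" as Latin-1 bytes: é is 0xE9, ô is 0xF4.
my $latin1 = "J\xE9r\xF4me";
my $utf8   = to_utf8($latin1, 'iso-8859-1');

# Show the resulting bytes in hex.
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
# prints: 4A C3 A9 72 C3 B4 6D 65
```

The source encoding is passed in as an argument so that it could come from
a configuration value or a command-line option, as discussed above.
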
I think I prefer option 2a. How about you, Jérôme?
