On Wed, 2007-04-04 at 14:53 +0200, Jérôme Haguet wrote:
> I think I am starting to get it
> 
> + My data inside Bugzilla are Latin-1 encoded
> + Changing the UTF-8 flag in Bugzilla will not change that, it will only 
> result in bad display in my browser for existing data
> + integration_get_vdd gets the data from Bugzilla as it is, without
> changing the encoding.
> 
> So, I see 2 solutions
> 1/ Converting my Bugzilla database to UTF-8 (I do not know how to do that 
> right now, maybe exporting and importing)

Other people may have the same problem and converting may not be an
option for them.

> 2/ 
> 2.a/ Adding a parameter in Scmbug to indicate how the original data are encoded

I like this option. According to its documentation, XML::Simple, which
the daemon uses to convert the data to XML, also lets us choose the
encoding:

http://search.cpan.org/dist/XML-Simple/lib/XML/Simple.pm

e.g.

  open my $fh, '>:encoding(iso-8859-1)', $path or die "open($path): $!";
  XMLout($ref, OutputFile => $fh);

which works when actually opening a file (we don't do that -- we
produce the xml string and transmit it across a socket)

So, preferably with:

    # XMLDecl takes the complete declaration string, not just the encoding name.
    my $xml = XML::Simple->new(
        NoAttr  => 1,
        XMLDecl => "<?xml version='1.0' encoding='iso-8859-1'?>",
    );
    my $ret_message = $xml->XMLout( $vdd );

which declares the encoding in the XML declaration when the XML is
generated. So, this encoding value could be listed in the daemon.conf
file. How does this sound?
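
As a rough sketch of how a configured encoding could be turned into the
declaration string that XMLDecl expects: the helper name xml_decl_for is
hypothetical and is not part of Scmbug, and no such setting exists in
daemon.conf yet. It only uses the core Encode module to reject unknown
encoding names.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Scmbug): build the XML declaration
# for an encoding name that would be read from the configuration.
sub xml_decl_for {
    my ($encoding) = @_;

    # find_encoding() returns undef for names that Encode does not know.
    my $enc = Encode::find_encoding($encoding)
        or die "Unknown encoding '$encoding'\n";

    # Prefer the canonical MIME name; fall back to the name as given.
    my $name = $enc->mime_name || $encoding;

    return "<?xml version='1.0' encoding='$name'?>";
}

# The result could then be passed to XML::Simple as XMLDecl => $decl.
my $decl = xml_decl_for('iso-8859-1');
print "$decl\n";
```

Note that this only declares the encoding. The bytes in the data are
passed through as they are.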


> 2.b.1/ Adding that parameter on the first line of the vdd.xml (as I did
> for testing purposes)

We are doing this from the user-side when we call the vdd generator, so
we don't have a configuration file around that will contain the
description of the encoding. We will have to use a command-line argument
to the vdd generator to explicitly specify it. I'm not against this
either. Perhaps some of the data in your database is encoded in Latin-1,
but from today on you turn on UTF-8 in Bugzilla (or use some other
encoding) and all your new data is encoded in UTF-8. In that case you
would still need to explicitly specify this to the vdd generator.
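
A minimal sketch of such a command-line argument, using the core
Getopt::Long module. The option name --encoding, the helper name
parse_encoding_option and the UTF-8 default are all assumptions made for
illustration; the vdd generator has no such option today.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: extract an --encoding option from an argument
# list, defaulting to UTF-8 when the option is not given.
sub parse_encoding_option {
    my @args = @_;
    my $encoding = 'UTF-8';

    GetOptionsFromArray(\@args, 'encoding=s' => \$encoding)
        or die "Usage: [--encoding NAME]\n";

    return $encoding;
}

# In the real tool this would be called with @ARGV.
my $encoding = parse_encoding_option('--encoding', 'iso-8859-1');
print "Source encoding: $encoding\n";
```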

> 2.b.2/ Converting data to UTF-8 at Fetch time (I do not know how to do that
> right now)

This seems similar to option 1, just applied when needed.
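
For what it's worth, the conversion itself is short with the core Encode
module. This is only a sketch: the helper name to_utf8 is hypothetical,
and it assumes we already know the source encoding of the bytes fetched
from the bug tracker.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: re-encode fetched bytes from a known source
# encoding into UTF-8.
sub to_utf8 {
    my ($bytes, $source_encoding) = @_;

    my $chars = decode($source_encoding, $bytes);   # bytes -> characters
    return encode('UTF-8', $chars);                 # characters -> UTF-8 bytes
}

# "Jérôme" as Latin-1 bytes: e-acute is 0xE9, o-circumflex is 0xF4.
my $latin1 = "J\xE9r\xF4me";
my $utf8   = to_utf8($latin1, 'iso-8859-1');
```

Each accented Latin-1 byte becomes a two-byte UTF-8 sequence, so the
converted string is longer than the original.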

Another question: do we need the encoding explicitly listed in the HTML
version of the vdd? It looks like it's displayed well in your browser
so far, but the encoding is still missing from the .html file. Is it
safe to assume that this is a totally separate problem (perhaps a bug
with docbook2html)?

I think I prefer option 2a. How about you, Jérôme?


