Package: graphviz
Version: 2.2.1-1sarge1
Severity: normal
Steps to reproduce:
1) cat > hello.dot << EOF
digraph g {
a -> b;
b [label="testiä"];
}
EOF
2) dot -Tsvg hello.dot > hello.svg
3) inkscape hello.svg
Expected results:
3) if step 2 completed successfully, "hello.svg" should be a valid SVG file
and inkscape should open it.
Actual results:
3) Inkscape fails to open the file and shows the following error:
hello.svg:17: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xE4 0x3C 0x2F 0x74
<text text-anchor="middle" x="33" y="99">testiä</text>
First line of hello.svg is
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
If I change this to
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
then inkscape is able to open the file correctly. I suggest that
either
a) dot should only accept UTF-8 input and refuse to continue if it
reads something else,
b) dot should support specifying charset with a command line option,
or c) dot should support specifying both input and output charset and do
conversions between these (this might be overkill)
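To illustrate what option a) would involve, here is a minimal sketch of a
UTF-8 well-formedness check. The function name is_valid_utf8 is made up for
this report and is not part of graphviz; where such a check would be called
from in dot is left open. It rejects the byte sequence from the parser error
above, because 0xE4 announces a three-byte sequence but the next byte (0x3C)
is not a continuation byte.

```c
#include <stddef.h>

/* Hypothetical helper, not graphviz code.
 * Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* number of continuation bytes expected */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point valid for this length */

        if (c < 0x80) {     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;       /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return 0;       /* sequence truncated at end of input */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return 0;       /* overlong encoding */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;       /* out of range or surrogate */

        i += need + 1;
    }
    return 1;
}
```

With a check like this, dot could stop with an error instead of writing an
SVG file whose declared encoding does not match its contents.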
At least b) should be very easy to do with something like
--- ./orig/graphviz-2.2.1/dotneato/common/svggen.c 2004-12-11 21:26:05.000000000 +0200
+++ ./graphviz-2.2.1/dotneato/common/svggen.c 2005-10-26 12:25:41.000000000 +0300
@@ -475,8 +475,12 @@
/* Pages = pages; */
N_pages = pages.x * pages.y;
- svg_fputs
- ("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n");
+ svg_fputs("<?xml version=\"1.0\" encoding=\"");
+ if ((s = agget(g, "encoding")) && s[0])
+ svg_fputs(s);
+ else
+ svg_fputs("UTF-8");
+ svg_fputs("\" standalone=\"no\"?>\n");
if ((s = agget(g, "stylesheet")) && s[0]) {
svg_fputs("<?xml-stylesheet href=\"");
svg_fputs(s);
and then use dot -Gencoding=iso-8859-1 -Tsvg hello.dot > hello.svg
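For comparison, the conversion that option c) would need is small in the
ISO-8859-1 to UTF-8 direction, since every ISO-8859-1 byte maps directly to
the Unicode code point of the same value. The following is only a sketch
under that assumption; latin1_to_utf8 is a hypothetical name, not an existing
graphviz function, and other input charsets would need a real conversion
library.

```c
#include <stddef.h>

/* Hypothetical helper, not graphviz code.
 * Converts len bytes of ISO-8859-1 text to UTF-8.
 * The caller must provide an output buffer of at least 2 * len bytes.
 * Returns the number of bytes written to out. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII is identical in both encodings */
            out[n++] = c;
        } else {
            /* U+0080..U+00FF become a two-byte sequence: 110xxxxx 10xxxxxx */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return n;
}
```

Applied to the label above, the byte 0xE4 becomes the two bytes 0xC3 0xA4,
which matches the UTF-8 declaration that dot already writes.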
-- System Information:
Debian Release: 3.1
Architecture: i386 (i686)
Kernel: Linux 2.4.27-2-k7
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Versions of packages graphviz depends on:
ii libc6 2.3.2.ds1-22 GNU C Library: Shared libraries an
ii libexpat1 1.95.8-3 XML parsing C library - runtime li
ii libfontconfig1 2.3.1-2 generic font configuration library
ii libfreetype6 2.1.7-2.4 FreeType 2 font engine, shared lib
ii libice6 4.3.0.dfsg.1-14sarge1 Inter-Client Exchange library
ii libjpeg62 6b-10 The Independent JPEG Group's JPEG
ii libpng12-0 1.2.8rel-1 PNG library - runtime
ii libsm6 4.3.0.dfsg.1-14sarge1 X Window System Session Management
ii libx11-6 4.3.0.dfsg.1-14sarge1 X Window System protocol client li
ii libxaw7 4.3.0.dfsg.1-14sarge1 X Athena widget set library
ii libxext6 4.3.0.dfsg.1-14sarge1 X Window System miscellaneous exte
ii libxmu6 4.3.0.dfsg.1-14sarge1 X Window System miscellaneous util
ii libxpm4 4.3.0.dfsg.1-14sarge1 X pixmap library
ii libxt6 4.3.0.dfsg.1-14sarge1 X Toolkit Intrinsics
ii tcl8.4 8.4.9-1 Tcl (the Tool Command Language) v8
ii tk8.4 8.4.9-1 Tk toolkit for Tcl and X11, v8.4 -
ii xlibs 4.3.0.dfsg.1-14sarge1 X Keyboard Extension (XKB) configu
ii zlib1g 1:1.2.2-4.sarge.2 compression library - runtime
-- no debconf information