Package: docbook2odf
Version: 0.244-1
Severity: normal

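When a DocBook XML file contains non-breaking spaces (NBSP, U+00A0),
docbook2odf generates files that OpenOffice.org can't read.  Here's a
diff between an input that triggers the problem (article2.xml) and one
that doesn't (article3.xml); the only change is that the NBSP before
the colon is replaced with a regular space: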

$ diff -u article2.xml article3.xml
--- article2.xml        2007-08-06 20:09:24.000000000 +0200
+++ article3.xml        2007-08-06 20:08:59.000000000 +0200
@@ -106,7 +106,7 @@
       <tip>
 <!--   <title>Astuce</title> -->
        <para>Pour taper des équations, on peut utiliser la syntaxe
-       LaTeX : <inlineequation>
+       LaTeX : <inlineequation>
            <alt>$ \phi = \frac{\sqrt{5}-1}{2} $</alt>
            <graphic/>
          </inlineequation> sum dolor sit amet, consectetuer adipiscing elit, sed

  When I try to load article2.odt, I get an error message that says:
,----
| Read-Error.
| Format error discovered in the file in sub-document content.xml at
| 623,26(row,col).
`----
  article3.odt works fine.  If I unzip article2.odt, here's what
content.xml contains around line 623:

$ nl -ba content.xml | grep -C5 '^ *623'
   618          <text:p text:style-name="para-padding">important</text:p>
   619        
   620        
   621
   622          <text:p text:style-name="para-padding">Pour taper des équations, on peut utiliser la syntaxe
   623          LaTeX<text:s text:c="1"/>�: 
   624              $ \phi = \frac{\sqrt{5}-1}{2} $
   625              
   626             sum dolor sit amet, consectetuer adipiscing elit, sed
   627          diam nonummy nibh euismod tincidunt ut lsum dolor sit amet, consectetuer adipiscing elit, sed
   628          diam nonummy nibh euismod tincidunt ut lsum dolor sit amet, consectetuer adipiscing elit, sed

  The funky character before the colon at line 623 seems to be my
non-breaking space, only not properly encoded:

$ nl -ba content.xml | grep -C5 '^ *623' | recode l1..u8
   618          <text:p text:style-name="para-padding">important</text:p>
   619        
   620        
   621
   622          <text:p text:style-name="para-padding">Pour taper des Ã©quations, on peut utiliser la syntaxe
   623          LaTeX<text:s text:c="1"/> : 
   624              $ \phi = \frac{\sqrt{5}-1}{2} $
   625              
   626             sum dolor sit amet, consectetuer adipiscing elit, sed
   627          diam nonummy nibh euismod tincidunt ut lsum dolor sit amet, consectetuer adipiscing elit, sed
   628          diam nonummy nibh euismod tincidunt ut lsum dolor sit amet, consectetuer adipiscing elit, sed

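  Note the "Ã©" sequence at the start of "équations" on line 622,
which is what "é" looks like when it is encoded from Latin-1 to UTF-8
once too often.  The NBSP on line 623, however, does come out properly
this time, which suggests that it was written as Latin-1 while the
rest of the file is UTF-8.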
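
  For what it's worth, the byte-level picture can be reproduced in a
few lines of Python.  This is only a sketch of what I believe is going
on: the sample bytes are made up to resemble the excerpt above (they
are not taken from docbook2odf's output), and first_invalid_utf8 is a
hypothetical helper name.

```python
# Hypothetical sample resembling line 622-623: UTF-8 text in which the
# no-break space was written as the single Latin-1 byte 0xA0 instead
# of the UTF-8 sequence 0xC2 0xA0.  "\u00e9" is the letter e-acute.
sample = "\u00e9quations LaTeX".encode("utf-8") + b"\xa0: "


def first_invalid_utf8(data):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# A strict UTF-8 reader (such as an XML parser) stops at the lone 0xA0.
bad_offset = first_invalid_utf8(sample)

# Equivalent of `recode l1..u8`: read every byte as Latin-1, write UTF-8.
recoded = sample.decode("latin-1").encode("utf-8")
text = recoded.decode("utf-8")

print("first invalid byte at offset", bad_offset)
print(repr(text))
```

  After the Latin-1 to UTF-8 pass, the lone 0xA0 becomes a valid NBSP,
while the e-acute that was already UTF-8 gets encoded a second time
and shows up as two characters, which matches what I see above.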

Roland.

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.22-1-k7 (SMP w/1 CPU core)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages docbook2odf depends on:
ii  libarchive-zip-perl    1.18-1            Module for manipulation of ZIP arc
ii  libxml-sablot-perl     1.0-2             encapsulation of the Sablotron XSL
ii  perl                   5.8.8-7           Larry Wall's Practical Extraction 
ii  perlmagick             7:6.2.4.5.dfsg1-1 A perl interface to the libMagick 
ii  zip                    2.32-1            Archiver for .zip files

docbook2odf recommends no packages.

-- no debconf information
