https://bugs.documentfoundation.org/show_bug.cgi?id=161367

            Bug ID: 161367
           Summary: Excessive generation of <SPAN> tags in EPUB export
           Product: LibreOffice
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: medium
         Component: Writer
          Assignee: [email protected]
          Reporter: [email protected]

Exporting from .ODT to .EPUB format generates unnecessarily voluminous XHTML
that is difficult to read and edit, because it contains vast numbers of
redundant <span>s with identical properties.

For example, the following paragraph, which contains no formatting except for
the rendering of 82nd:

“One in particular, a Ranger, uh, force known as the 82nd Airborne, had a
particular nickname, and a specific song that they took as their own, that
invoked that nickname.  The rhyme of the song does not work in Saamen, but the
part I can remember of the song goes like this:

Results in the following XHTML:

  <p class="para12"><span class="span15">“One in particular, </span><span
class="span15">a Ranger, uh, force </span><span class="span15">known as
</span><span class="span15">the 82</span><span class="span41">nd</span><span
class="span15"> Airborne, had a particular nickname, and a specific song that
they took as their own, that invoked that nickname. </span><span
class="span15"> </span><span class="span15">The rhyme of the song does not work
in Saamen, but the </span><span class="span15">part I can remember of the
</span><span class="span15">song goes like this:</span></p>

When what it SHOULD produce is this:

  <p class="para12"><span class="span15">“One in particular, a Ranger, uh,
force known as the 82</span><span class="span41">nd</span><span class="span15">
Airborne, had a particular nickname, and a specific song that they took as
their own, that invoked that nickname.  The rhyme of the song does not work in
Saamen, but the part I can remember of the song goes like this:</span></p>

No fewer than SEVEN TIMES in that one paragraph, LibreOffice *closes* a span of
class span15 only to immediately begin a new span *also* of class span15.  I
can find no clear reason why it is generating so many redundant spans.  My
hypothesis would be that it is because the source .ODT document ITSELF contains
many such redundant and unnecessary duplicated formatting codes.
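
One rough way to check that hypothesis is to count directly adjacent
<text:span> elements that share a style name in the content.xml inside the
.ODT (which is a zip archive). The sketch below is only an illustration: the
function names are made up for this example, it assumes spans are flat (not
nested) and carry only a text:style-name attribute, and it will not notice
spans whose automatic styles have different names but identical properties.

```python
import re
import zipfile

# Assumption: flat (non-nested) spans whose only attribute is text:style-name.
ODT_SPAN = re.compile(
    r'<text:span\s+text:style-name="([^"]*)">(.*?)</text:span>', re.DOTALL
)

def count_redundant_odt_spans(content_xml: str) -> int:
    """Count spans that start right where a same-styled span ended."""
    count = 0
    prev_end = None
    prev_style = None
    for m in ODT_SPAN.finditer(content_xml):
        if m.start() == prev_end and m.group(1) == prev_style:
            count += 1
        prev_end, prev_style = m.end(), m.group(1)
    return count

def count_in_odt(path: str) -> int:
    """Hypothetical helper: run the count on content.xml of an .ODT file."""
    with zipfile.ZipFile(path) as odt:
        return count_redundant_odt_spans(odt.read("content.xml").decode("utf-8"))
```

A count well above zero on the source document would suggest the redundancy
originates in the .ODT rather than in the EPUB export filter alone.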

This is wasteful and unnecessary, and results in XHTML documents much larger
than they need to be, that probably also take much longer to *render* than they
need to.  The output should probably be considered malformed.

LibreOffice should automatically collapse adjacent spans (and its own
formatting regions) of the same type.  Currently I have to use a custom Perl
script to perform this cleanup.  The resulting reduction in the uncompressed
size of the XHTML files within the epub is as much as 30%.
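
For illustration, the kind of cleanup described above could look something
like the following Python sketch. This is not the Perl script mentioned, and
not how LibreOffice would necessarily implement it; the function name is
invented for the example, and it assumes the spans are flat (not nested) and
carry only a class attribute, as in the XHTML quoted earlier.

```python
import re

# Assumption: flat (non-nested) spans whose only attribute is class.
# \s+ tolerates a line break between "<span" and "class=".
SPAN = re.compile(r'<span\s+class="([^"]*)">(.*?)</span>', re.DOTALL)

def collapse_adjacent_spans(xhtml: str) -> str:
    """Merge directly adjacent <span> elements that share the same class."""
    out = []
    pos = 0            # end of the last span consumed so far
    prev_class = None  # class of the span currently left open in `out`
    for m in SPAN.finditer(xhtml):
        cls, body = m.group(1), m.group(2)
        if m.start() == pos and cls == prev_class:
            # Same class, nothing in between: extend the open span.
            out.append(body)
        else:
            if prev_class is not None:
                out.append("</span>")
            out.append(xhtml[pos:m.start()])  # text between spans, if any
            out.append(f'<span class="{cls}">')
            out.append(body)
        prev_class = cls
        pos = m.end()
    if prev_class is not None:
        out.append("</span>")
    out.append(xhtml[pos:])
    return "".join(out)
```

Applied to the paragraph above, this would leave three spans (span15, span41,
span15) instead of ten. A regex is used only to keep the sketch short; a real
fix would more likely merge the runs before the XHTML is serialized.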
