Bug#502702: chm2pdf is nowhere close to releasable

Thomas Viehmann Sun, 19 Oct 2008 03:45:05 -0700

X-Debbugs-RC: [EMAIL PROTECTED]
Package: chm2pdf
Severity: grave
Justification: is between unusable and dangerous for non-trivial chm files
Version: 0.9-2


Hi,

I've been looking at chm2pdf's RC bug.
There are so many more problems that releasing it with lenny seems totally out of
the question for quality reasons.
Some items in addition to the tempfile RC bug (only the ones that turned up while
trying to get it to convert some chm to pdf; I have not specifically gone looking
for anything):
- the calls to subprograms using os.system are not using proper
  escaping, this is a security hazard (Raphael's patch proposed to fix the
  rm -f, but the rest is just as bad),
- the code loses perfectly good data in a way that cannot be turned off (note that
  the typical chm files I found that relate to free software were API docs, which
  lose a great deal without links to the right place on the page in per-module
  lists of functions):
  # Replace links of the form "somefile.html#894" with "somefile0206.html"
  # The following will match anchors like '<a href="temp0206.html#894"' and will
  # store the 'temp0206.html' in backreference 1.
  # The replace string will then replace it with '<a href="temp0206.html"', i.e.
  # it will take away the '#894' part.
  # This is because the numbers after the '#' are often wrong or non-existent.
  # It is better to link to an existing chapter than to a non-existent part of
  # an existing chapter.
- The implementation is unacceptably inefficient: the convert_to_pdf function
  uses the following (match_strings and replace_garbled_strings each have
  length = number of pages), and this sits in a loop over all pages. This is
  completely bogus (in terms of what they explain at length they are trying to
  achieve) and unacceptably inefficient; it can readily be implemented in
  linear time without effort.

     # Substitutions in 1st pass: we replace the original filenames with their
     # corresponding "garbled" equivalents.
        for match_string in  match_strings:
            replace_string = replace_garbled_strings[match_strings.index(match_string)]
            page = re.sub(match_string, replace_string, page)
     # Substitutuions in the 2nd pass: we replace the garbled filenames with the
     # correct ones.
        for match_string in  replace_garbled_strings:
            replace_string = replace_strings[replace_garbled_strings.index(match_string)]
            page = re.sub(match_string, replace_string, page)
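
To illustrate the efficiency point, here is a rough sketch of how the renaming
could be done with one scan per page. It is not code from chm2pdf: build_renamer
is a hypothetical name, and I am assuming match_strings and replace_strings are
parallel lists of plain filenames. Because each position in the page is matched
at most once, a freshly inserted name is never renamed again, so the "garbled"
intermediate pass is not needed, and the dict lookup replaces the list.index
searches.

```python
import re


def build_renamer(match_strings, replace_strings):
    """Return a function that renames every known filename in a single scan."""
    mapping = dict(zip(match_strings, replace_strings))
    if not mapping:
        # Nothing to rename; an empty alternation would match everywhere.
        return lambda page: page
    # Longest names first, so a name that is a prefix of another one
    # cannot shadow the longer name in the alternation.
    names = sorted(mapping, key=len, reverse=True)
    # re.escape makes '.' and friends match literally instead of as regex syntax.
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return lambda page: pattern.sub(lambda m: mapping[m.group(0)], page)


# Hypothetical data: a.html becomes b.html while b.html becomes c.html.
rename = build_renamer(["a.html", "b.html"], ["b.html", "c.html"])
print(rename('<a href="a.html">A</a> <a href="b.html">B</a>'))
```

The renamer is built once and then applied to each page, so the per-page cost no
longer includes a loop over all names with an index search inside it.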

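On the os.system point, a minimal sketch of the usual alternative, assuming a
current Python 3 (subprocess.run did not exist in 2008; subprocess.call with an
argument list was the equivalent then). run_tool is a hypothetical helper, and
the Python interpreter stands in for the real subprograms purely so the example
is self-contained. With an argument list and no shell, a filename containing
spaces or shell metacharacters reaches the child as one literal argument.

```python
import subprocess
import sys


def run_tool(argv):
    """Run an external program without a shell and return its stdout."""
    # argv is a list, and shell defaults to False, so no shell ever parses
    # the filename; check=True raises if the program exits with an error.
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


# A filename that would inject a command if pasted into an os.system string.
hostile = "my file; echo INJECTED.chm"
out = run_tool([sys.executable, "-c", "import sys; print(sys.argv[1])", hostile])
print(out.strip())
```

Where a shell command string really cannot be avoided, shlex.quote from the
standard library can be applied to each interpolated filename instead.
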
chm2pdf has never been in a Debian release, and it should not be until it gets
better.
Please remove it from lenny.

Kind regards

T.
-- 
Thomas Viehmann, http://thomas.viehmann.net/


