https://issues.apache.org/ooo/show_bug.cgi?id=124081

            Bug ID: 124081
        Issue Type: DEFECT
           Summary: import of Microsoft's OOXML formats: unify (generated)
                    XML token IDs
           Product: General
           Version: 3.4.0
          Hardware: All
                OS: All
            Status: CONFIRMED
          Severity: normal
          Priority: P3
         Component: code
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]

While hunting for the cause of bug 123723, it turned out that both the import
of Microsoft Excel and PowerPoint documents in OOXML format (import code in
module oox) and the import of Microsoft Word documents in OOXML format (import
code in module writerfilter) rely on the text file
main/oox/source/token/token.txt. This text file contains the XML tokens that
are relevant for these import filters.
Unfortunately, modules oox and writerfilter each have their own algorithm for
creating unique IDs for these XML tokens. Module oox seems to sort the tokens
and then number them in ascending order. Module writerfilter takes the list as
it is and numbers the tokens in ascending order. While module oox uses type
<sal_Int32>, module writerfilter uses its own type <Token_t> (which is only a
typedef for <sal_Int32>).
--> This is duplicate work which should be avoided.
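
To illustrate why the two numbering schemes disagree, here is a minimal
sketch. It is not the actual generator code of either module: the function
names and the alias TokenId (standing in for sal_Int32 / Token_t) are
hypothetical, and the scheme attributed to oox is only the one it "seems" to
use according to the description above.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for sal_Int32 / Token_t.
using TokenId = std::int32_t;

// Number the tokens in the order given, starting at 0
// (the scheme described for module writerfilter).
std::map<std::string, TokenId> numberInFileOrder(const std::vector<std::string>& tokens)
{
    std::map<std::string, TokenId> ids;
    TokenId next = 0;
    for (const auto& token : tokens)
        ids.emplace(token, next++);
    return ids;
}

// Sort the tokens first, then number them in ascending order
// (the scheme module oox seems to use; an assumption).
std::map<std::string, TokenId> numberSorted(std::vector<std::string> tokens)
{
    std::sort(tokens.begin(), tokens.end());
    return numberInFileOrder(tokens);
}
```

Whenever the token list is not already sorted, the same token name can map to
different IDs in the two results, so an ID produced under one scheme is
misread under the other.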

The bigger mistake regarding these token IDs is that module writerfilter
uses module oox (e.g. for the import of graphics), so token IDs are exchanged
between the code of these two modules. Thus, the token IDs must be the same in
both modules. But this is not ensured when different algorithms generate the
token IDs. E.g. an inappropriate change made for bug 123528 caused bug 123723.
--> Generate the token IDs only once, maybe using an enumeration type.
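
A rough sketch of what "generate once" could look like, assuming a single
generated header that both modules include. The namespace shared_tokens, the
XML_ prefix, the token names and both functions are illustrative assumptions,
not the existing oox or writerfilter code.

```cpp
#include <cstdint>

// Hypothetical header generated once from token.txt and included by both
// modules. The token names are illustrative only.
namespace shared_tokens {
enum TokenId : std::int32_t {
    XML_blip,
    XML_graphic,
    XML_pic,
    XML_TOKEN_COUNT // number of tokens, not a token itself
};
}

// Stand-in for code in module writerfilter that hands a token ID to oox.
shared_tokens::TokenId writerfilterEmitsGraphic()
{
    return shared_tokens::XML_graphic;
}

// Stand-in for code in module oox that receives the token ID.
bool ooxHandlesGraphic(shared_tokens::TokenId id)
{
    return id == shared_tokens::XML_graphic;
}
```

Because both sides compile against the same enumeration, the numeric values
cannot drift apart, and the enumeration type makes it harder to pass an
unrelated integer by accident.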
