https://bz.apache.org/ooo/show_bug.cgi?id=128549

--- Comment #6 from dam...@apache.org ---
(In reply to damjan from comment #5)
> But SvRTFParser::Continue() must be getting called after the constructor,
> and it seems to set the "mac" encoding, so why is the wrong encoding still
> used?

Putting a breakpoint on SvParser::SetSrcEncoding(), and backtracing when it's
called, shows it's called from the following places, in order:

1. The constructor, with "eEnc=1" meaning RTL_TEXTENCODING_MS_1252:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at
source/svrtf/svparser.cxx:142


2. The CallParser() method, itself called from the RtfReader::Read() method in
editeng/source/rtf/svxrtf.cxx, again with "eEnc=1":

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc7a85 in SvRTFParser::CallParser() (this=0x80db69e10) at
source/svrtf/parrtf.cxx:593


3. The Continue() method when it finds the "\mac" instruction, now with
"eEnc=2" meaning (the good) RTL_TEXTENCODING_APPLE_ROMAN:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=2) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short)
(this=0x80db69e10, eEnc=2) at source/svrtf/parrtf.cxx:688
#2  0x0000000801dc7d02 in SvRTFParser::Continue(int) (this=0x80db69e10,
nToken=262) at source/svrtf/parrtf.cxx:655


4. SvxRTFParser::ReadColorTable() still with the good "eEnc=2":

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=2) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc6a76 in SvRTFParser::_GetNextToken() (this=0x80db69e10) at
source/svrtf/parrtf.cxx:268
#2  0x0000000801dcd1bf in SvParser::GetNextToken() (this=0x80db69e10) at
source/svrtf/svparser.cxx:439
#3  0x00000008045fbd04 in SvxRTFParser::ReadColorTable() (this=0x80db69e10) at
source/rtf/svxrtf.cxx:464


5. SvxRTFParser::ReadFontTable(), now back to the bad "eEnc=1":

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short)
(this=0x80db69e10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2  0x00000008045fbd83 in SvxRTFParser::ReadFontTable() (this=0x80db69e10) at
source/rtf/svxrtf.cxx:513

6. SvxRTFParser::ReadFontTable() again.
7. SvxRTFParser::ReadFontTable() again.
8. SvxRTFParser::ReadFontTable() again but now with eEnc=2.

9. SvxRTFParser::ReadStyleTable() with eEnc=2:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=2) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc6a76 in SvRTFParser::_GetNextToken() (this=0x80db69e10) at
source/svrtf/parrtf.cxx:268
#2  0x0000000801dcd1bf in SvParser::GetNextToken() (this=0x80db69e10) at
source/svrtf/svparser.cxx:439
#3  0x00000008045fc241 in SvxRTFParser::ReadStyleTable() (this=0x80db69e10) at
source/rtf/svxrtf.cxx:362


10. SvxRTFParser::ReadStyleTable() with eEnc=2.
11. SvxRTFParser::RTFPardPlain() with eEnc=1:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short)
(this=0x80db69e10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2  0x00000008045f851e in SvxRTFParser::RTFPardPlain(int, SfxItemSet**)
(this=this@entry=0x80db69e10, bPard=bPard@entry=0,
ppSet=ppSet@entry=0x7fffffffc248) at source/rtf/rtfitem.cxx:1969

12. SvxRTFParser::ReadAttr() with eEnc=1:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short)
(this=0x80db69e10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2  0x00000008045f6238 in SvxRTFParser::ReadAttr(int, SfxItemSet*)
(this=0x80db69e10, nToken=1801, pSet=<optimized out>) at
source/rtf/rtfitem.cxx:692

13. ReadBmpData() methods with eEnc=1:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at
source/svrtf/svparser.cxx:142
#1  0x00000008045f455b in SvxRTFParser::ReadBmpData(Graphic&,
SvxRTFPictureType&) (this=0x80db69e10, rGrf=..., rPicType=...) at
source/rtf/rtfgrf.cxx:304
#2  0x000000081010c3a2 in SwRTFParser::ReadBitmapData() (this=0x80db69e10) at
source/filter/rtf/rtffly.cxx:1492


These are followed by calls from a large number of other methods, ending with
one that sets it back to eEnc=2:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db6ad10, eEnc=2) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc6a97 in SvRTFParser::_GetNextToken() (this=0x80db6ad10) at
source/svrtf/parrtf.cxx:273
#2  0x0000000801dcd1bf in SvParser::GetNextToken() (this=0x80db6ad10) at
source/svrtf/svparser.cxx:439
#3  0x0000000801dc7da8 in SvRTFParser::Continue(int) (this=0x80db6ad10,
nToken=2059) at source/svrtf/parrtf.cxx:675
#4  0x00000008045fb6b4 in SvxRTFParser::Continue(int) (this=0x80db6ad10,
nToken=2) at source/rtf/svxrtf.cxx:175
#5  0x000000081011afe3 in SwRTFParser::Continue(int) (this=0x80db6ad10,
nToken=0) at source/filter/rtf/swparrtf.cxx:337
#6  0x0000000801dc7ad3 in SvRTFParser::CallParser() (this=0x80db6ad10) at
source/svrtf/parrtf.cxx:600
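
The ordering above came from a debugger breakpoint. Where a debugger is not
at hand, roughly the same information could be captured by logging inside the
setter. The following is only a minimal sketch of that idea: EncCall, EncTrace
and replayShortenedOrder() are hypothetical names that do not exist in the
code base, and the replay covers only the first five calls listed above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One recorded call to a (hypothetical) instrumented encoding setter.
struct EncCall {
    std::string caller;  // label supplied by the call site
    unsigned short enc;  // value passed as eEnc
};

// Hypothetical trace object; the real parser has nothing like this.
class EncTrace {
public:
    // Would be invoked wherever the real code sets the source encoding.
    void record(const std::string& caller, unsigned short enc) {
        calls_.push_back({caller, enc});
    }

    // The setting in effect right now is the most recent recorded call.
    // Returns nullptr when nothing has been recorded yet.
    const EncCall* mostRecent() const {
        return calls_.empty() ? nullptr : &calls_.back();
    }

    std::size_t size() const { return calls_.size(); }

private:
    std::vector<EncCall> calls_;
};

// Replays a shortened version of the order seen in the backtraces.
inline EncTrace replayShortenedOrder() {
    EncTrace t;
    t.record("constructor", 1);
    t.record("CallParser", 1);
    t.record("Continue", 2);        // the "\mac" instruction
    t.record("ReadColorTable", 2);
    t.record("ReadFontTable", 1);   // switched back to the bad value
    return t;
}
```

Asking the trace for mostRecent() at the point where a "\'xx" escape is
decoded would answer the same question as the breakpoint: which caller set the
encoding last.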




Now if we also put a breakpoint on SvRTFParser::GetHexValue() to check when the
\'8e is parsed relative to the setting of the text encoding, we see the most
recent call set eEnc to 1:

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db6ad10, eEnc=1) at
source/svrtf/svparser.cxx:142
#1  0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short)
(this=0x80db6ad10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2  0x00000008045f6238 in SvxRTFParser::ReadAttr(int, SfxItemSet*)
(this=0x80db6ad10, nToken=1801, pSet=<optimized out>) at
source/rtf/rtfitem.cxx:692

And then another call from SvxRTFParser::ReadBmpData():

#0  SvParser::SetSrcEncoding(unsigned short) (this=0x80db6ad10, eEnc=1) at
source/svrtf/svparser.cxx:142
#1  0x00000008045f4f31 in SvxRTFParser::ReadBmpData(Graphic&,
SvxRTFPictureType&) (this=0x80db6ad10, rGrf=..., rPicType=...) at
source/rtf/rtfgrf.cxx:578

That call also sets it to 1. It appears that the last call to SetSrcEncoding(),
the one that sets it back to 2, only happens after all the "\'xx" style text
has already been parsed.

So that explains why the wrong code page is used: almost every damn method in
SvRTFParser's editeng subclass, SvxRTFParser, ends up calling SetSrcEncoding()
with RTL_TEXTENCODING_MS_1252. Next let's explore that class to find out why.
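
To make the effect concrete, here is a toy model of the behaviour described
above. It is a sketch under assumptions, not the real parser: ToyParser, Enc
and the helper functions are hypothetical, and only the byte 0x8E is decoded.
That byte is "é" (U+00E9) in Mac OS Roman and "Ž" (U+017D) in Windows-1252, so
a helper that resets the encoding before the "\'8e" escape is read changes the
resulting character.

```cpp
#include <cstdint>

// Numeric values mirror the eEnc values in the backtraces; the type itself
// is a hypothetical stand-in for the real text-encoding constants.
enum class Enc : std::uint16_t { Ms1252 = 1, AppleRoman = 2 };

// Decodes only the byte 0x8E; every other byte is out of scope here.
inline char32_t decodeByte8E(Enc enc) {
    return enc == Enc::AppleRoman ? char32_t{0x00E9}   // e with acute
                                  : char32_t{0x017D};  // Z with caron
}

// Toy parser with one mutable "current source encoding", like the real one.
class ToyParser {
public:
    void setSrcEncoding(Enc e) { srcEnc_ = e; }

    // Models a helper that unconditionally resets the encoding to MS-1252.
    void readAttr() { setSrcEncoding(Enc::Ms1252); }

    // Models decoding a \'8e escape with whatever encoding is current.
    char32_t readHex8E() const { return decodeByte8E(srcEnc_); }

private:
    Enc srcEnc_ = Enc::Ms1252;  // default set by the constructor
};

// "\mac" is seen, a helper resets the encoding, then the escape is decoded.
inline char32_t decodeAfterReset() {
    ToyParser p;
    p.setSrcEncoding(Enc::AppleRoman);
    p.readAttr();
    return p.readHex8E();
}

// Same sequence without the resetting helper in between.
inline char32_t decodeWithoutReset() {
    ToyParser p;
    p.setSrcEncoding(Enc::AppleRoman);
    return p.readHex8E();
}
```

In this model decodeAfterReset() yields U+017D while decodeWithoutReset()
yields U+00E9, which is the kind of mismatch the breakpoints point to.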
