https://bz.apache.org/ooo/show_bug.cgi?id=128549
--- Comment #6 from dam...@apache.org ---
(In reply to damjan from comment #5)
> But SvRTFParser::Continue() must be getting called after the constructor,
> and it seems to set the "mac" encoding, so why is the wrong encoding still
> used?

Putting a breakpoint on SvParser::SetSrcEncoding(), and backtracing when it's called, shows it's called from the following places, in order:

1. The constructor, with "eEnc=1" meaning RTL_TEXTENCODING_MS_1252:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/svparser.cxx:142

2. The CallParser() method, itself called from editeng/source/rtf/svxrtf.cxx method RtfReader::Read():
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc7a85 in SvRTFParser::CallParser() (this=0x80db69e10) at source/svrtf/parrtf.cxx:593

3. The Continue() method when it finds the "\mac" instruction, now with "eEnc=2" meaning (the good) RTL_TEXTENCODING_APPLE_ROMAN:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=2) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short) (this=0x80db69e10, eEnc=2) at source/svrtf/parrtf.cxx:688
#2 0x0000000801dc7d02 in SvRTFParser::Continue(int) (this=0x80db69e10, nToken=262) at source/svrtf/parrtf.cxx:655

4. SvxRTFParser::ReadColorTable() still with the good "eEnc=2":
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=2) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc6a76 in SvRTFParser::_GetNextToken() (this=0x80db69e10) at source/svrtf/parrtf.cxx:268
#2 0x0000000801dcd1bf in SvParser::GetNextToken() (this=0x80db69e10) at source/svrtf/svparser.cxx:439
#3 0x00000008045fbd04 in SvxRTFParser::ReadColorTable() (this=0x80db69e10) at source/rtf/svxrtf.cxx:464

5. SvxRTFParser::ReadFontTable(), now with THE BAD "eEnc=1" !!!!!!!!!!!!!!
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2 0x00000008045fbd83 in SvxRTFParser::ReadFontTable() (this=0x80db69e10) at source/rtf/svxrtf.cxx:513

6. SvxRTFParser::ReadFontTable() again.

7. SvxRTFParser::ReadFontTable() again.

8. SvxRTFParser::ReadFontTable() again, but now with eEnc=2.

9. SvxRTFParser::ReadStyleTable() with eEnc=2:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=2) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc6a76 in SvRTFParser::_GetNextToken() (this=0x80db69e10) at source/svrtf/parrtf.cxx:268
#2 0x0000000801dcd1bf in SvParser::GetNextToken() (this=0x80db69e10) at source/svrtf/svparser.cxx:439
#3 0x00000008045fc241 in SvxRTFParser::ReadStyleTable() (this=0x80db69e10) at source/rtf/svxrtf.cxx:362

10. SvxRTFParser::ReadStyleTable() with eEnc=2.

11. SvxRTFParser::RTFPardPlain() with eEnc=1:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2 0x00000008045f851e in SvxRTFParser::RTFPardPlain(int, SfxItemSet**) (this=this@entry=0x80db69e10, bPard=bPard@entry=0, ppSet=ppSet@entry=0x7fffffffc248) at source/rtf/rtfitem.cxx:1969

12. SvxRTFParser::ReadAttr() with eEnc=1:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2 0x00000008045f6238 in SvxRTFParser::ReadAttr(int, SfxItemSet*) (this=0x80db69e10, nToken=1801, pSet=<optimized out>) at source/rtf/rtfitem.cxx:692

13. ReadBmpData() methods with eEnc=1:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db69e10, eEnc=1) at source/svrtf/svparser.cxx:142
#1 0x00000008045f455b in SvxRTFParser::ReadBmpData(Graphic&, SvxRTFPictureType&) (this=0x80db69e10, rGrf=..., rPicType=...) at source/rtf/rtfgrf.cxx:304
#2 0x000000081010c3a2 in SwRTFParser::ReadBitmapData() (this=0x80db69e10) at source/filter/rtf/rtffly.cxx:1492

And a large number of other methods, ending with one that sets it back to eEnc=2:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db6ad10, eEnc=2) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc6a97 in SvRTFParser::_GetNextToken() (this=0x80db6ad10) at source/svrtf/parrtf.cxx:273
#2 0x0000000801dcd1bf in SvParser::GetNextToken() (this=0x80db6ad10) at source/svrtf/svparser.cxx:439
#3 0x0000000801dc7da8 in SvRTFParser::Continue(int) (this=0x80db6ad10, nToken=2059) at source/svrtf/parrtf.cxx:675
#4 0x00000008045fb6b4 in SvxRTFParser::Continue(int) (this=0x80db6ad10, nToken=2) at source/rtf/svxrtf.cxx:175
#5 0x000000081011afe3 in SwRTFParser::Continue(int) (this=0x80db6ad10, nToken=0) at source/filter/rtf/swparrtf.cxx:337
#6 0x0000000801dc7ad3 in SvRTFParser::CallParser() (this=0x80db6ad10) at source/svrtf/parrtf.cxx:600

Now if we also put a breakpoint on SvRTFParser::GetHexValue() to check when the \'8e is parsed relative to the setting of the text encoding, we see that the most recent call set eEnc to 1:
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db6ad10, eEnc=1) at source/svrtf/svparser.cxx:142
#1 0x0000000801dc7e63 in SvRTFParser::SetEncoding(unsigned short) (this=0x80db6ad10, eEnc=1) at source/svrtf/parrtf.cxx:688
#2 0x00000008045f6238 in SvxRTFParser::ReadAttr(int, SfxItemSet*) (this=0x80db6ad10, nToken=1801, pSet=<optimized out>) at source/rtf/rtfitem.cxx:692

And then another call, from SvxRTFParser::ReadBmpData():
#0 SvParser::SetSrcEncoding(unsigned short) (this=0x80db6ad10, eEnc=1) at source/svrtf/svparser.cxx:142
#1 0x00000008045f4f31 in SvxRTFParser::ReadBmpData(Graphic&, SvxRTFPictureType&) (this=0x80db6ad10, rGrf=..., rPicType=...) at source/rtf/rtfgrf.cxx:578

also sets it to 1. It appears that the last call to SetSrcEncoding(), the one that sets it back to 2, only happens after all the "\'xx" style text is already parsed.

So that explains why the wrong code page is used: almost every damn method from its editeng subclass SvxRTFParser calls SetSrcEncoding() with RTL_TEXTENCODING_MS_1252. Next let's explore that class to find out why.
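To make the effect concrete, here is a minimal toy model in Python, not the actual OpenOffice code. The class ToyRtfParser and its methods are hypothetical and only echo the names seen in the backtraces. Python's "cp1252" and "mac_roman" codecs stand in for RTL_TEXTENCODING_MS_1252 (eEnc=1) and RTL_TEXTENCODING_APPLE_ROMAN (eEnc=2). The assumption being sketched is simply that a later handler overwrites the encoding chosen by "\mac" before the "\'8e" escape is decoded.

```python
# Toy model only: all names here are hypothetical stand-ins for the real parser.
MS_1252 = "cp1252"         # stands in for RTL_TEXTENCODING_MS_1252 (eEnc=1)
APPLE_ROMAN = "mac_roman"  # stands in for RTL_TEXTENCODING_APPLE_ROMAN (eEnc=2)


class ToyRtfParser:
    def __init__(self):
        self.calls = []  # (caller, encoding) log, like the breakpoint hits
        self.set_src_encoding(MS_1252, "constructor")

    def set_src_encoding(self, encoding, caller):
        # Every caller overwrites the single shared encoding state.
        self.calls.append((caller, encoding))
        self.src_encoding = encoding

    def on_mac_keyword(self):
        # The document header says "\mac": switch to Apple Roman.
        self.set_src_encoding(APPLE_ROMAN, "Continue")

    def read_attr(self):
        # Assumed behaviour: an attribute handler resets to the default.
        self.set_src_encoding(MS_1252, "ReadAttr")

    def get_hex_value(self, hex_pair):
        # Decode one "\'xx" escape with whatever encoding is current.
        return bytes([int(hex_pair, 16)]).decode(self.src_encoding)


def decode_8e(reset_in_handler):
    parser = ToyRtfParser()
    parser.on_mac_keyword()
    if reset_in_handler:
        parser.read_attr()
    return parser.get_hex_value("8e"), parser.calls


if __name__ == "__main__":
    for reset in (False, True):
        text, calls = decode_8e(reset)
        print(reset, repr(text), calls[-1])
```

Without the reset, byte 0x8e decodes as "é" (Apple Roman). With the reset, it decodes as "Ž" (Windows-1252), and the last logged call shows which handler clobbered the encoding, which is the same information the breakpoint on SetSrcEncoding() gave above.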