[dev] Re: __attribute__((packed)) for enum
Hi Niklas,

>> On 18.05.2011 13:44, tora - Takamichi Akiyama wrote:
>> The motivation of this topic comes from a different point. I have been looking for a way to prevent a well-known phenomenon: a single sheet that fits on one A4 page in Excel turns into two or four A4 pages in Calc.

On 2011/05/18 22:18, Niklas Nebel wrote:
> Four pages means that both width and height are wrong? They are handled quite differently. Column widths in Calc are static (stored in twips internally). The conversion from Excel's character-based units is done during import.

Yeap, and 888 ?

> Automatic row heights are updated based on cell formats and contents. The edit engine is used only for complex cell contents. With simple cells, we cheat a bit and avoid the OutputDevice::SetFont call, for performance reasons. The calculation is based on the height of the default font (determined from an OutputDevice once), the direct value from the font height format, and some tweaking, see lcl_GetAttribHeight in sc/source/core/data/column2.cxx. This obviously leaves lots of room to arrive at different values from Excel. In some cases even correctly so, because optimal height is supposed to fit the cell content with your current setup (system, installed fonts).

I appreciate your explanation of the inside of Calc. As you pointed out, the installed fonts might be one of the factors. The Excel files are prepared on Windows, while I am trying to open them on CentOS and/or Solaris. Those systems have a different font set.

Another point that I have suspected since OpenOffice.org 2.x is the artificial ascendant. The vcl module has implemented a feature that mathematically produces an artificial ascendant for glyphs. Compared with typical Western font files, which usually have a certain amount of ascendant, typical Japanese font files have an ascendant of zero, probably for historical reasons. To simplify the implementation of the upper-layer applications such as Writer, Calc, and Impress, the underlying module, vcl, tries to take care of the differences internally. That is good, but I feel that the artificial ascendant, and thus the virtual text height, might be slightly larger than expected. That might cause a slight, unnecessary increase in row height.

BTW, apart from the topic of the artificial ascendant, what I am currently aiming at is how to create a preview image of Microsoft Office files on back-end servers without any user interaction. For that purpose, which might be better?

(a) One spreadsheet is converted into a single too-small image file.
(b) One spreadsheet is converted into two or four image files.

Reducing the font size right before calling OutputDevice::SetFont seems to work. I will keep trying this approach for a while.

Regards,
Tora

--
To unsubscribe send email to dev-unsubscr...@openoffice.org
For additional commands send email to sy...@openoffice.org with Subject: help
[dev] Re: __attribute__((packed)) for enum
On 19.05.2011 16:40, tora - Takamichi Akiyama wrote:
> As you pointed out, the installed fonts might be one of the factors. The Excel files are prepared on Windows, while I am trying to open them on CentOS and/or Solaris. Those systems have a different font set.
>
> Another point that I have suspected since OpenOffice.org 2.x is the artificial ascendant. The vcl module has implemented a feature that mathematically produces an artificial ascendant for glyphs. Compared with typical Western font files, which usually have a certain amount of ascendant, typical Japanese font files have an ascendant of zero, probably for historical reasons. To simplify the implementation of the upper-layer applications such as Writer, Calc, and Impress, the underlying module, vcl, tries to take care of the differences internally. That is good, but I feel that the artificial ascendant, and thus the virtual text height, might be slightly larger than expected. That might cause a slight, unnecessary increase in row height.

There's been some tweaking of CJK font metrics in VCL, but I'm not really familiar with that.

> BTW, apart from the topic of the artificial ascendant, what I am currently aiming at is how to create a preview image of Microsoft Office files on back-end servers without any user interaction. For that purpose, which might be better?
>
> (a) One spreadsheet is converted into a single too-small image file.
> (b) One spreadsheet is converted into two or four image files.
>
> Reducing the font size right before calling OutputDevice::SetFont seems to work. I will keep trying this approach for a while.

If you want to shrink everything to make sure you don't generate too many pages, perhaps it's better to reduce the print scale (ATTR_PAGE_SCALE, or PageScale via API). That would apply to simple cells and EditEngine content equally.

Niklas
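Editorial sketch of the arithmetic behind the print-scale suggestion: given a content size and a printable page size, both in twips (1/1440 inch), find the largest whole percentage at which the content fits on one page. The helper names mmToTwips and fitScalePercent are hypothetical and are not part of Calc or its API; how Calc itself applies ATTR_PAGE_SCALE is not shown here, and page margins are ignored.

```cpp
#include <algorithm>
#include <cassert>

// A twip is 1/1440 inch (1/20 point).
const long TWIPS_PER_INCH = 1440;

// Convert millimetres to twips, rounded to the nearest twip.
inline long mmToTwips(double mm) {
    return static_cast<long>(mm / 25.4 * TWIPS_PER_INCH + 0.5);
}

// Hypothetical helper: largest whole percentage (never above 100) at which
// content of the given size fits into the printable area. All sizes in twips.
inline int fitScalePercent(long contentW, long contentH, long pageW, long pageH) {
    assert(contentW > 0 && contentH > 0 && pageW > 0 && pageH > 0);
    long byWidth  = pageW * 100 / contentW;   // integer division rounds down
    long byHeight = pageH * 100 / contentH;
    return static_cast<int>(std::min(100L, std::min(byWidth, byHeight)));
}
```

For example, content that is twice as wide as the printable area but no taller than it yields 50, which is the kind of value one might then try as the print scale.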
[dev] Re: __attribute__((packed)) for enum
On Wed, May 18, 2011 at 3:40 AM, tora - Takamichi Akiyama t...@openoffice.org wrote:
> How much benefit could we get when such structure or class instances are produced in large numbers?

The "produced in large numbers" (plus "processed in bursts of large chunks," I would say) is the crucial point, indeed. Saving a few bytes here and there would probably not improve overall OOo performance; that's what I wanted to say with my response. But, of course, ultimately, only trying it out would tell...

-Stephan
[dev] Re: __attribute__((packed)) for enum
On 18.05.2011 03:40, tora - Takamichi Akiyama wrote:
> The size of class Impl_Font would reduce to 50 bytes or so from 88 bytes. How much benefit could we get when such structure or class instances are produced in large numbers?

Generally, this is a very valid concern. In some places where we found out (or just assumed) that it makes a difference, we currently use smaller integer types directly for member variables, instead of enum. For example, see eCellType in ScBaseCell (sc/inc/cell.hxx). Or FormulaToken (formula/inc/formula/token.hxx), where the type of eOp and eType even depends on DBG_UTIL.

I don't know if Font objects are constructed in very large numbers.

Niklas
[dev] Re: __attribute__((packed)) for enum
On 2011/05/18 18:06, Niklas Nebel wrote:
> Generally, this is a very valid concern. In some places where we found out (or just assumed) that it makes a difference, we currently use smaller integer types directly for member variables, instead of enum. For example, see eCellType in ScBaseCell (sc/inc/cell.hxx). Or FormulaToken (formula/inc/formula/token.hxx), where the type of eOp and eType even depends on DBG_UTIL.

formula/inc/formula/opcode.hxx

    enum OpCodeEnum { ... };

    #ifndef DBG_UTIL
    // save memory since compilers tend to int an enum
    typedef USHORT OpCode;
    #else
    // have enum names in debugger
    typedef OpCodeEnum OpCode;
    #endif

That is a great idea!!!

> I don't know if Font objects are constructed in very large numbers.

That might not be as big a concern as it is in Calc, I think. Font objects tend to share the same instance using a reference counter. When a small change such as a font size is made, the entire object is duplicated first and then the change is applied to the new instance.

I have just tried to determine how __attribute__((packed)) works.

https://bitbucket.org/tora/ooo-enum-attribute-packed-experiment-ooo330_m20-vcl/changeset/6f5ec89f0a56
https://bitbucket.org/tora/ooo-enum-attribute-packed-experiment-ooo330_m20-vcl/changeset/110df3d51a23

gcc (GCC) 4.2.3 running on CentOS release 5.4 (Final)

With the original source code of OOO330_m20 aka OpenOffice.org 3.3.0:
    sizeof(Impl_Font) = 88
    sizeof(FontFamily) = 4

With the patch applied:
    sizeof(Impl_Font) = 56
    sizeof(FontFamily) = 1

That works well so far. But, unfortunately, __attribute__((packed)) seems relatively new.

The motivation of this topic comes from a different point. I have been looking for a way to prevent a well-known phenomenon: a single sheet that fits on one A4 page in Excel turns into two or four A4 pages in Calc.

Tweaking nPPTX and nPPTY?
Reducing the font size just before calling OutputDevice::SetFont() from inside sc and editeng?

Best regards,
Tora
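Editorial sketch of the opcode.hxx pattern quoted above, reduced to a self-contained form. The enumerators, the Token struct, and the helpers makeToken and opOf are hypothetical stand-ins; std::uint16_t is used in place of USHORT, on the assumption that USHORT is a 16-bit unsigned type. It is an illustration of the idea, not the actual FormulaToken code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for OpCodeEnum; the enumerators are invented.
enum OpCodeEnum { ocFirst, ocSecond, ocThird };

#ifndef DBG_UTIL
// Release build: 16-bit storage (std::uint16_t stands in for USHORT).
typedef std::uint16_t OpCode;
#else
// Debug build: keep the enum type so a debugger shows enumerator names.
typedef OpCodeEnum OpCode;
#endif

// Hypothetical token with two small members, loosely modeled on eOp/eType.
struct Token {
    OpCode eOp;
    OpCode eType;
};

// Store enumerators in the compact members; the values must fit in 16 bits.
inline Token makeToken(OpCodeEnum op, OpCodeEnum type) {
    assert(static_cast<unsigned>(op) <= 0xFFFFu);
    assert(static_cast<unsigned>(type) <= 0xFFFFu);
    Token t;
    t.eOp = static_cast<OpCode>(op);
    t.eType = static_cast<OpCode>(type);
    return t;
}

// Convert the stored value back to the enum, e.g. for a switch statement.
inline OpCodeEnum opOf(const Token& t) {
    return static_cast<OpCodeEnum>(t.eOp);
}
```

Without DBG_UTIL defined, each member occupies two bytes instead of the four that an int-sized enum would typically take; with DBG_UTIL defined, the same code compiles with the enum type and a debugger can show enumerator names.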
[dev] Re: __attribute__((packed)) for enum
On 18.05.2011 13:44, tora - Takamichi Akiyama wrote:
> The motivation of this topic comes from a different point. I have been looking for a way to prevent a well-known phenomenon: a single sheet that fits on one A4 page in Excel turns into two or four A4 pages in Calc.
>
> Tweaking nPPTX and nPPTY?
> Reducing the font size just before calling OutputDevice::SetFont() from inside sc and editeng?

Four pages means that both width and height are wrong? They are handled quite differently. Column widths in Calc are static (stored in twips internally). The conversion from Excel's character-based units is done during import.

Automatic row heights are updated based on cell formats and contents. The edit engine is used only for complex cell contents. With simple cells, we cheat a bit and avoid the OutputDevice::SetFont call, for performance reasons. The calculation is based on the height of the default font (determined from an OutputDevice once), the direct value from the font height format, and some tweaking, see lcl_GetAttribHeight in sc/source/core/data/column2.cxx. This obviously leaves lots of room to arrive at different values from Excel. In some cases even correctly so, because optimal height is supposed to fit the cell content with your current setup (system, installed fonts).

Niklas
[dev] Re: __attribute__((packed)) for enum
On Tue, May 17, 2011 at 8:44 PM, tora - Takamichi Akiyama t...@openoffice.org wrote:
> Any thoughts?

Binary UNO requires that its enums are of specific size, but that should be taken care of by the dummy max-value element in each enum. Likewise for enums in the C/C++ URE ABI.

Whether smaller enums have positive or negative runtime impact is hard to tell up front, I would say---and hard to measure, I would guess, as I assume the impact is negligible overall. Similarly, I would assume the space savings to be negligible, too.

-Stephan
[dev] Re: __attribute__((packed)) for enum
Hi Stephan,

Thank you for your comments.

On 2011/05/18 4:11, Stephan Bergmann wrote:
> Binary UNO requires that its enums are of specific size, but that should be taken care of by the dummy max-value element in each enum. Likewise for enums in the C/C++ URE ABI.
>
> Whether smaller enums have positive or negative runtime impact is hard to tell up front, I would say---and hard to measure, I would guess, as I assume the impact is negligible overall. Similarly, I would assume the space savings to be negligible, too.

I used to think the same way as you do. But when I read some articles offered by Intel in order to learn the basic concepts of performance improvement, I was surprised and learned the following:

Smaller data size improves the cache hit rate of the processor.
Smaller data size decreases the likelihood of misses on virtual memory pages.

When programming application software at the level of the C++ language, we might never consider the underlying architecture. When you jump into the world of what a DDR2-800 2GB memory module is, what a 512-KB L2 cache is, what a Translation Lookaside Buffer (TLB) is, what the OS has to do when a miss occurs, ... you might be encouraged to explore the new world.

I have tried to find those articles again, but have not found them so far. This web page, however, might help you enjoy the new concepts.
http://www.intel.com/products/processor/manuals/

Just an example from http://developer.intel.com/Assets/PDF/manual/248966.pdf

Example 3-44. Rearranging a Data Structure

    struct unpacked { /* Fits in 20 bytes due to padding */
        int a;
        char b;
        int c;
        char d;
        int e;
    };

    struct packed { /* Fits in 16 bytes */
        int a;
        int c;
        int e;
        char b;
        char d;
    };

http://hg.services.openoffice.org/OOO330/file/OOO330_m20/vcl/inc/vcl/impfont.hxx

    class Impl_Font /* Fits in 88 bytes */
    {
        ... several enum ...
    };

http://hg.services.openoffice.org/OOO330/file/8601acbe0e6c/vcl/inc/vcl/vclenum.hxx

    enum FontItalic { ITALIC_NONE, ITALIC_OBLIQUE, ITALIC_NORMAL, ITALIC_DONTKNOW, FontItalic_FORCE_EQUAL_SIZE=SAL_MAX_ENUM };

could be rewritten in the following way:

    #if defined( THIS_COMPILER ) ( VERSION_OF_THE_COMPILER = 0x )
    #define SAL_ATTRIBUTE_PACKED __attribute__ ((packed))
    #else
    #define SAL_ATTRIBUTE_PACKED
    #endif

    enum FontItalic { ITALIC_NONE, ITALIC_OBLIQUE, ITALIC_NORMAL, ITALIC_DONTKNOW, FontItalic_FORCE_EQUAL_SIZE=SAL_MAX_ENUM } SAL_ATTRIBUTE_PACKED;

The size of class Impl_Font would reduce to 50 bytes or so from 88 bytes. How much benefit could we get when such structure or class instances are produced in large numbers?

Best regards,
Tora
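Editorial sketch for checking the sizes discussed in this message on a given compiler. The two structs are the quoted Intel layouts; the namespaces, the FontFlags holder structs, and the simplified macro condition (a plain __GNUC__ test) are assumptions made for illustration. The sketch deliberately omits the FontItalic_FORCE_EQUAL_SIZE=SAL_MAX_ENUM enumerator: a packed enum still has to be wide enough for its largest enumerator, so, assuming SAL_MAX_ENUM is a large int value, keeping it would presumably prevent the enum from shrinking.

```cpp
#include <cstddef>

// Layouts from the quoted Intel example (sizes assume a 4-byte int).
struct unpacked { int a; char b; int c; char d; int e; };
struct packed   { int a; int c; int e; char b; char d; };

// Simplified condition: the attribute is a GCC extension (Clang accepts it too).
#if defined(__GNUC__)
#define SAL_ATTRIBUTE_PACKED __attribute__((packed))
#else
#define SAL_ATTRIBUTE_PACKED
#endif

namespace plain {
// Ordinary enum: typically stored as an int.
enum FontItalic { ITALIC_NONE, ITALIC_OBLIQUE, ITALIC_NORMAL, ITALIC_DONTKNOW };
// Hypothetical holder with several enum members, standing in for Impl_Font.
struct FontFlags { FontItalic a, b, c, d; };
}

namespace small {
// Packed enum: the compiler picks the smallest type that holds all values.
enum FontItalic { ITALIC_NONE, ITALIC_OBLIQUE, ITALIC_NORMAL, ITALIC_DONTKNOW }
    SAL_ATTRIBUTE_PACKED;
struct FontFlags { FontItalic a, b, c, d; };
}

// Bytes saved per holder object by packing its enum members.
inline std::size_t bytesSavedPerObject() {
    return sizeof(plain::FontFlags) - sizeof(small::FontFlags);
}
```

On a typical GCC or Clang target with a 4-byte int, the packed enum occupies one byte, so the four-member holder shrinks from 16 bytes to 4. On compilers without the attribute the macro expands to nothing and both holders have the same size.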