[dev] Re: __attribute__((packed)) for enum

2011-05-19 Thread tora - Takamichi Akiyama

Hi Niklas,


On 18.05.2011 13:44, tora - Takamichi Akiyama wrote:

The motivation of this topic comes from a different point. I have been
looking for a way to prevent a well-known phenomenon: a single sheet
that fits on one A4 page in Excel turns into two or four A4 pages in
Calc.


On 2011/05/18 22:18, Niklas Nebel wrote:

Four pages means that both width and height are wrong? They are handled quite 
differently.

Column widths in Calc are static (stored in twips internally). The conversion 
from Excel's character-based units is done during import.


Yeap,  and 888 ?


Automatic row heights are updated based on cell formats and contents. The edit engine is 
used only for complex cell contents. With simple cells, we cheat a bit and 
avoid the OutputDevice::SetFont call, for performance reasons. The calculation is based 
on the height of the default font (determined from an OutputDevice once), the direct 
value from the font height format, and some tweaking, see lcl_GetAttribHeight in 
sc/source/core/data/column2.cxx. This obviously leaves lots of room to arrive at 
different values from Excel. In some cases even correctly so, because optimal height is 
supposed to fit the cell content with your current setup (system, installed fonts).


I appreciate your explanation on the inside of Calc.

As you pointed out, the installed font might be one of the factors. Excel files 
are prepared on Windows while I am trying to open them on CentOS and/or 
Solaris. Those systems have a different font set.

Another point that I have been suspecting since OpenOffice.org 2.x is the artificial 
ascendant. The vcl module has implemented a feature that mathematically produces an 
artificial ascendant for glyphs.

Compared with typical Western font files, which usually have a certain amount of 
ascendant, typical Japanese font files have an ascendant of zero, probably for 
historical reasons.

To simplify the implementation of upper-layer applications such as Writer, Calc, and 
Impress, the underlying module, vcl, tries to take care of these differences internally.

That is good, but I feel the artificial ascendant, and thus the virtual text height, 
might be slightly larger than it should be. That might unnecessarily increase the 
row height slightly.


BTW, apart from the topic of the artificial ascendant, what I am currently aiming 
at is creating preview images of Microsoft Office files on back-end servers 
without any user interaction.

For that purpose, which would be better?
 (a) One spreadsheet is converted into a single too-small image file.
 (b) One spreadsheet is converted into two or four image files.

Reducing the font size right before calling OutputDevice::SetFont seems to work. 
I will keep trying this approach for a while.

Regards,
Tora
--
-
To unsubscribe send email to dev-unsubscr...@openoffice.org
For additional commands send email to sy...@openoffice.org
with Subject: help


[dev] Re: __attribute__((packed)) for enum

2011-05-19 Thread Niklas Nebel

On 19.05.2011 16:40, tora - Takamichi Akiyama wrote:

As you pointed out, the installed font might be one of the factors.
Excel files are prepared on Windows while I am trying to open them on
CentOS and/or Solaris. Those systems have a different font set.

Another point that I have been suspecting since OpenOffice.org 2.x is
the artificial ascendant. The vcl module has implemented a feature that
mathematically produces an artificial ascendant for glyphs.

Compared with typical Western font files, which usually have a certain
amount of ascendant, typical Japanese font files have an ascendant of
zero, probably for historical reasons.

To simplify the implementation of upper-layer applications such as
Writer, Calc, and Impress, the underlying module, vcl, tries to take
care of these differences internally.

That is good, but I feel the artificial ascendant, and thus the virtual
text height, might be slightly larger than it should be. That might
unnecessarily increase the row height slightly.


There's been some tweaking of CJK font metrics in VCL, but I'm not 
really familiar with that.



BTW, apart from the topic of the artificial ascendant, what I am
currently aiming at is creating preview images of Microsoft Office
files on back-end servers without any user interaction.

For that purpose, which would be better?
(a) One spreadsheet is converted into a single too-small image file.
(b) One spreadsheet is converted into two or four image files.

Reducing the font size right before calling OutputDevice::SetFont seems
to work. I will keep trying this approach for a while.


If you want to shrink everything to make sure you don't generate too 
many pages, perhaps it's better to reduce the print scale 
(ATTR_PAGE_SCALE, or PageScale via API). That would apply to simple 
cells and EditEngine content equally.
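
A minimal sketch of how such a scale could be chosen, assuming the print scale is given in percent and the size of the used cell range and of the printable page area are already known in the same unit. The function name fitToPageScalePercent, its parameters, and the lower bound of 10 percent are hypothetical and not taken from the OOo sources.

```cpp
#include <algorithm>

// Hypothetical helper: returns a print scale in percent so that content of
// the given size fits into the printable area of a single page.
// All four lengths must use the same unit (for example twips).
int fitToPageScalePercent(long contentWidth, long contentHeight,
                          long pageWidth, long pageHeight)
{
    // Nothing to shrink if the content has no extent.
    if (contentWidth <= 0 || contentHeight <= 0)
        return 100;

    // Largest percentage (rounded down) that still fits in each direction.
    const long scaleX = pageWidth * 100 / contentWidth;
    const long scaleY = pageHeight * 100 / contentHeight;

    // The tighter direction decides. Never enlarge beyond 100 percent, and
    // keep an arbitrary lower bound (assumption) so the result stays legible.
    const long scale = std::min(scaleX, scaleY);
    return static_cast<int>(std::max(10L, std::min(100L, scale)));
}
```

Because the scale is applied at print level, one value would cover simple cells and EditEngine content alike, which is the point made above.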


Niklas


[dev] Re: __attribute__((packed)) for enum

2011-05-18 Thread Stephan Bergmann
On Wed, May 18, 2011 at 3:40 AM, tora - Takamichi Akiyama 
t...@openoffice.org wrote:

 How much benefit could we get when instances of such structures or classes
 are produced in large numbers?


The "produced in large numbers" (plus "processed in bursts of large chunks,"
I would say) is the crucial point, indeed.  Saving a few bytes here and
there would probably not improve overall OOo performance; that's what I
wanted to say with my response.  But, of course, ultimately, only trying it
out would tell...

-Stephan


[dev] Re: __attribute__((packed)) for enum

2011-05-18 Thread Niklas Nebel

On 18.05.2011 03:40, tora - Takamichi Akiyama wrote:

The size of class Impl_Font would reduce to 50 bytes or so from 88 bytes.

How much benefit could we get when instances of such structures or classes
are produced in large numbers?


Generally, this is a very valid concern. In some places where we found 
out (or just assumed) that it makes a difference, we currently use 
smaller integer types directly for member variables, instead of enum. 
For example, see eCellType in ScBaseCell (sc/inc/cell.hxx). Or 
FormulaToken (formula/inc/formula/token.hxx), where the type of eOp and 
eType even depends on DBG_UTIL.
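
A minimal sketch of that technique, assuming an enum with only a handful of values. The names CellKind, CellWithEnum, and CellWithByte are hypothetical and are not the actual declarations from sc/inc/cell.hxx.

```cpp
#include <cstdint>

// Hypothetical enum with few values; compilers typically store it as an int.
enum CellKind { KIND_NONE, KIND_VALUE, KIND_STRING, KIND_FORMULA };

// Variant 1: the enum is a member directly, followed by padding.
struct CellWithEnum
{
    CellKind     eKind;
    std::uint8_t nFlags;
};

// Variant 2: the same information kept in one-byte integer members.
// The accessors convert between the small integer and the enum type.
class CellWithByte
{
    std::uint8_t mnKind;
    std::uint8_t mnFlags;

public:
    explicit CellWithByte(CellKind eKind)
        : mnKind(static_cast<std::uint8_t>(eKind)), mnFlags(0) {}

    CellKind     GetKind() const  { return static_cast<CellKind>(mnKind); }
    std::uint8_t GetFlags() const { return mnFlags; }
};
```

The price is that a debugger shows a number instead of the enum name, which is presumably why the FormulaToken case switches the type depending on DBG_UTIL.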


I don't know if Font objects are constructed in very large numbers.

Niklas


[dev] Re: __attribute__((packed)) for enum

2011-05-18 Thread tora - Takamichi Akiyama

On 2011/05/18 18:06, Niklas Nebel wrote:
 Generally, this is a very valid concern. In some places where we found out 
(or just assumed) that it makes a difference, we currently use smaller integer 
types directly for member variables, instead of enum. For example, see eCellType 
in ScBaseCell (sc/inc/cell.hxx). Or FormulaToken (formula/inc/formula/token.hxx), 
where the type of eOp and eType even depends on DBG_UTIL.

formula/inc/formula/opcode.hxx
enum OpCodeEnum
{
...
};

#ifndef DBG_UTIL
// save memory since compilers tend to int an enum
typedef USHORT OpCode;
#else
// have enum names in debugger
typedef OpCodeEnum OpCode;
#endif

That is a great idea!!!

 I don't know if Font objects are constructed in very large numbers.

That might be less of a concern than in Calc, I think. Font objects tend to 
share the same instance through a reference counter. When a small change such 
as a different font size is made, the entire object is duplicated first and 
then the change is applied to the new instance.
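
A minimal single-threaded sketch of that copy-on-write behaviour, using std::shared_ptr as the reference counter. CowFont and FontData are hypothetical names for illustration and do not reproduce the actual Font / Impl_Font code in vcl.

```cpp
#include <memory>
#include <string>

// Hypothetical shared payload, standing in for something like Impl_Font.
struct FontData
{
    std::string name;
    int         height;
};

class CowFont
{
    std::shared_ptr<FontData> mpImpl;

    // Duplicate the payload if another CowFont still refers to it.
    void MakeUnique()
    {
        if (mpImpl.use_count() > 1)
            mpImpl = std::make_shared<FontData>(*mpImpl);
    }

public:
    CowFont(const std::string& rName, int nHeight)
        : mpImpl(std::make_shared<FontData>(FontData{rName, nHeight})) {}

    void SetHeight(int nHeight)
    {
        if (mpImpl->height == nHeight)
            return;                 // no change, keep sharing
        MakeUnique();               // copy first ...
        mpImpl->height = nHeight;   // ... then modify the private copy
    }

    int  GetHeight() const { return mpImpl->height; }
    bool SharesDataWith(const CowFont& rOther) const
    {
        return mpImpl == rOther.mpImpl;
    }
};
```

Copying a CowFont only copies the pointer, so the payload size matters mainly when many distinct payloads exist at the same time.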

I have just tried to determine how __attribute__((packed)) works.
https://bitbucket.org/tora/ooo-enum-attribute-packed-experiment-ooo330_m20-vcl/changeset/6f5ec89f0a56
https://bitbucket.org/tora/ooo-enum-attribute-packed-experiment-ooo330_m20-vcl/changeset/110df3d51a23

gcc (GCC) 4.2.3 running on CentOS release 5.4 (Final)
With the original source code of OOO330_m20 aka OpenOffice.org 3.3.0
sizeof(Impl_Font) = 88
sizeof(FontFamily) = 4

With the patched source code
sizeof(Impl_Font) = 56
sizeof(FontFamily) = 1

That works well so far. But, unfortunately, __attribute__((packed)) seems 
relatively new.

The motivation of this topic comes from a different point. I have been looking for a way 
to prevent a well-known phenomenon: a single sheet that fits on one A4 page in 
Excel turns into two or four A4 pages in Calc.
Tweaking nPPTX and nPPTY?
Reducing the font size just before calling OutputDevice::SetFont() from 
inside sc and editeng?

Best regards,
Tora


[dev] Re: __attribute__((packed)) for enum

2011-05-18 Thread Niklas Nebel

On 18.05.2011 13:44, tora - Takamichi Akiyama wrote:

The motivation of this topic comes from a different point. I have been
looking for a way to prevent a well-known phenomenon: a single sheet
that fits on one A4 page in Excel turns into two or four A4 pages in
Calc.
Tweaking nPPTX and nPPTY?
Reducing the font size just before calling OutputDevice::SetFont() from
inside sc and editeng?


Four pages means that both width and height are wrong? They are handled 
quite differently.


Column widths in Calc are static (stored in twips internally). The 
conversion from Excel's character-based units is done during import.
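
For orientation, a small sketch of the twips unit itself: one twip is 1/20 of a point, so there are 1440 twips per inch. The helper names are hypothetical, and the sketch does not attempt to reproduce the import filter's conversion from Excel's character-based units.

```cpp
#include <cmath>

// 1 point = 20 twips, 1 inch = 72 points = 1440 twips, 1 inch = 25.4 mm.
const double TWIPS_PER_POINT = 20.0;
const double TWIPS_PER_INCH  = 1440.0;
const double MM_PER_INCH     = 25.4;

// Hypothetical helpers for converting a static width stored in twips.
inline double twipsToPoints(long nTwips)
{
    return nTwips / TWIPS_PER_POINT;
}

inline double twipsToMm(long nTwips)
{
    return nTwips * MM_PER_INCH / TWIPS_PER_INCH;
}

// Rounds to the nearest whole twip.
inline long mmToTwips(double fMm)
{
    return std::lround(fMm * TWIPS_PER_INCH / MM_PER_INCH);
}
```

With these helpers, an A4 page of 210 mm by 297 mm comes out at roughly 11906 by 16838 twips, which is the budget the column widths and row heights have to fit into before margins are subtracted.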


Automatic row heights are updated based on cell formats and contents. 
The edit engine is used only for complex cell contents. With simple 
cells, we cheat a bit and avoid the OutputDevice::SetFont call, for 
performance reasons. The calculation is based on the height of the 
default font (determined from an OutputDevice once), the direct value 
from the font height format, and some tweaking, see lcl_GetAttribHeight 
in sc/source/core/data/column2.cxx. This obviously leaves lots of room 
to arrive at different values from Excel. In some cases even correctly 
so, because optimal height is supposed to fit the cell content with your 
current setup (system, installed fonts).


Niklas


[dev] Re: __attribute__((packed)) for enum

2011-05-17 Thread Stephan Bergmann
On Tue, May 17, 2011 at 8:44 PM, tora - Takamichi Akiyama 
t...@openoffice.org wrote:

 Any thoughts?


Binary UNO requires that its enums are of specific size, but that should be
taken care of by the dummy max-value element in each enum.  Likewise for
enums in the C/C++ URE ABI.

Whether smaller enums have positive or negative runtime impact is hard to
tell up front, I would say---and hard to measure, I would guess, as I assume
the impact is negligible overall.  Similarly, I would assume the space
savings to be negligible, too.

-Stephan


[dev] Re: __attribute__((packed)) for enum

2011-05-17 Thread tora - Takamichi Akiyama

Hi Stephan,

Thank you for your comments.

On 2011/05/18 4:11, Stephan Bergmann wrote:
 Binary UNO requires that its enums are of specific size, but that should be 
taken care of by the dummy max-value element in each enum.  Likewise for enums in 
the C/C++ URE ABI.

 Whether smaller enums have positive or negative runtime impact is hard to 
tell up front, I would say---and hard to measure, I would guess, as I assume the 
impact is negligible overall.  Similarly, I would assume the space savings to be 
negligible, too.

I used to think the same way as you do. But when I read some articles 
published by Intel to learn the basic concepts of performance improvement, I 
was surprised and learned a lot.

Smaller data size improves the cache hit rate of the processor.
Smaller data size reduces the likelihood of virtual memory page misses.
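
A rough arithmetic sketch of the cache argument, assuming a 64-byte cache line and objects stored back to back starting at a line boundary. Both are assumptions for illustration, and cacheLinesNeeded is a hypothetical helper; real heap-allocated objects are usually not laid out this neatly.

```cpp
#include <cstddef>

// Number of cache lines touched by `count` objects of `objectSize` bytes
// stored contiguously, starting at a cache line boundary (rounded up).
inline std::size_t cacheLinesNeeded(std::size_t count,
                                    std::size_t objectSize,
                                    std::size_t lineSize = 64)
{
    return (count * objectSize + lineSize - 1) / lineSize;
}
```

For 1000 objects, the two Impl_Font sizes reported in this thread give cacheLinesNeeded(1000, 88) = 1375 lines versus cacheLinesNeeded(1000, 56) = 875 lines. Whether that difference is measurable in practice is exactly the open question.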

When programming application software at the level of the C++ language, we might 
never consider the underlying architecture.

When you dive into the world of what a DDR2-800 2GB memory module is, what a 512-KB L2 
cache is, what a Translation Lookaside Buffer (TLB) is, what the OS has to do when a 
miss occurs, ... you might be encouraged to explore this new world.

I have tried to find those articles again, but have not found them so far. This web 
page, however, might help you get familiar with these concepts.
http://www.intel.com/products/processor/manuals/

Just an example from http://developer.intel.com/Assets/PDF/manual/248966.pdf
Example 3-44. Rearranging a Data Structure

struct unpacked { /* Fits in 20 bytes due to padding */
    int a;
    char b;
    int c;
    char d;
    int e;
};

struct packed { /* Fits in 16 bytes */
    int a;
    int c;
    int e;
    char b;
    char d;
};
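
The sizes in that example can be checked with a short sketch like the following, assuming a typical ABI with a 4-byte int that is aligned to 4 bytes. The type and function names are hypothetical; the member order follows the Intel example.

```cpp
#include <cstddef>

// Members in declaration order: each char is followed by 3 bytes of padding
// so that the next int is aligned.
struct Unpacked
{
    int  a;
    char b;
    int  c;
    char d;
    int  e;
};

// Same members, ints first: only 2 bytes of tail padding remain.
struct Rearranged
{
    int  a;
    int  c;
    int  e;
    char b;
    char d;
};

// Reordering must never make the structure larger.
static_assert(sizeof(Rearranged) <= sizeof(Unpacked),
              "rearranged layout should not be larger");

// Bytes saved per object by the reordering (4 on the assumed ABI).
inline std::size_t bytesSavedPerObject()
{
    return sizeof(Unpacked) - sizeof(Rearranged);
}
```

Note that this saving comes purely from member order; it needs no compiler-specific attribute.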

http://hg.services.openoffice.org/OOO330/file/OOO330_m20/vcl/inc/vcl/impfont.hxx
class Impl_Font  /* Fits in 88 bytes */
{
  ... several enum ...
};

http://hg.services.openoffice.org/OOO330/file/8601acbe0e6c/vcl/inc/vcl/vclenum.hxx
enum FontItalic { ITALIC_NONE, ITALIC_OBLIQUE, ITALIC_NORMAL, ITALIC_DONTKNOW, 
FontItalic_FORCE_EQUAL_SIZE=SAL_MAX_ENUM };

could be rewritten in the following way:

#if defined( THIS_COMPILER ) && ( VERSION_OF_THE_COMPILER >= 0x )
#define SAL_ATTRIBUTE_PACKED __attribute__ ((packed))
#else
#define SAL_ATTRIBUTE_PACKED
#endif

enum FontItalic { ITALIC_NONE, ITALIC_OBLIQUE, ITALIC_NORMAL, ITALIC_DONTKNOW, 
FontItalic_FORCE_EQUAL_SIZE=SAL_MAX_ENUM } SAL_ATTRIBUTE_PACKED;

The size of class Impl_Font would reduce to 50 bytes or so from 88 bytes.
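
A minimal sketch for checking the effect on the enum itself, assuming GCC or a compatible compiler for the attribute variant. The enum names and the EXAMPLE_ATTRIBUTE_PACKED macro are hypothetical. The sketch deliberately leaves out the FORCE_EQUAL_SIZE element: a value as large as SAL_MAX_ENUM would presumably keep the enum at four bytes even when packed. A C++11 fixed underlying type is shown as a portable alternative, which was not an option for the compilers discussed here.

```cpp
#include <cstddef>

// Plain enum: compilers typically use the size of an int.
enum FontItalicPlain { PLAIN_NONE, PLAIN_OBLIQUE, PLAIN_NORMAL, PLAIN_DONTKNOW };

// Portable C++11 alternative: the underlying type is fixed to one byte.
enum FontItalicByte : unsigned char
{
    BYTE_NONE, BYTE_OBLIQUE, BYTE_NORMAL, BYTE_DONTKNOW
};

// Compiler-specific variant, guarded so other compilers still build.
#if defined(__GNUC__)
#define EXAMPLE_ATTRIBUTE_PACKED __attribute__((packed))
#else
#define EXAMPLE_ATTRIBUTE_PACKED
#endif

enum FontItalicPacked
{
    PACKED_NONE, PACKED_OBLIQUE, PACKED_NORMAL, PACKED_DONTKNOW
} EXAMPLE_ATTRIBUTE_PACKED;

static_assert(sizeof(FontItalicByte) == 1, "fixed underlying type is one byte");

// Bytes saved per enum member by the packed variant (0 where unsupported).
inline std::size_t packedEnumSaving()
{
    return sizeof(FontItalicPlain) - sizeof(FontItalicPacked);
}
```

How much of that saving survives inside a class depends on member order and alignment, so the per-object figure has to be measured with sizeof on the real class.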

How much benefit could we get when instances of such structures or classes are 
produced in large numbers?

Best regards,
Tora