Re: [HACKERS] database size estimates

2007-04-25 Thread Gregory Stark

"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:

> Francois Deliege wrote:
>> Hi,
>>
>> I am trying to estimate the size of a table composed of 51754000 rows.
>> Each row has 31 attributes: 16 x bit(24) and 15 x bit(8)
>>
>> So, the payload should be:
>> 51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MB
>
> What data types are those exactly? If those 24-bit fields are in fact text,
> varchar, char(x) or other data types that are stored as variable length 
> fields,

And sadly that includes bit() if you're being literal.

As of Postgres 8.1 you can see how much space a column is taking up using the
pg_column_size() function. This won't include alignment padding but will
include the length header for that column.

You can see how much a given row is taking up by passing the entire row to
og_column_size with something like pg_column_size(tab.*)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] database size estimates

2007-04-25 Thread Heikki Linnakangas

Francois Deliege wrote:

Hi,

I am trying to estimate the size of a table composed of 51754000 rows.
Each row has 31 attributes: 16 x bit(24) and 15 x bit(8)

So, the payload should be:
51754000 x ( 16 x 24 + 15 x 24 ) bits = 3115 MB


What data types are those exactly? If those 24-bit fields are in fact 
text, varchar, char(x) or other data types that are stored as variable 
length fields, the varlen header will take 4 bytes. And then there's 
alignment, those 24-bit fields are most almost certainly 4-byte aligned, 
which means that there'll be one byte of padding between them.


The upcoming 8.3 release will be much better in that respect, It'll use 
just a 1 byte varlen header per field instead of 4 bytes for small 
values like yours. You might want to test a CVS snapshot.


Now, from what I understand from postgresql manual is that the overhead 
is composed of

32 bytes per row + 20 bytes per page.
This leads me to approx. 1700 MB overhead.
Therefore a total table size of 4900 MB.


In addition, there will be on average 1/2 rows worth of wasted space on 
every page.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings