Thanks for the response.


The use-case I'm targeting is a schema with multiple tables of ~800 
columns each, where most rows have only the first 50 or so columns set. 800 
columns require 800 bits in the null bitmap, which is 100 bytes. With 8-byte 
alignment the bitmap takes up 104 bytes in the current implementation. If 
only the first 50 or so columns are actually non-null, the bitmap needs only 
7 bytes, which pads to 8, so the proposed change would save 96 bytes per row. 
For the data set I have in mind, roughly 90% of the rows would need only 8 
bytes for the null bitmap.
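
To make the arithmetic concrete, here is a rough sketch of the calculation 
in Python. It assumes one bit per column and an 8-byte MAXALIGN, and it pads 
the bitmap on its own, as in the figures above. The real on-disk layout 
aligns the tuple header and the bitmap together, so exact numbers may differ 
slightly. The helper names are made up for illustration and are not from the 
patch.

```python
MAXALIGN = 8  # assumed alignment in bytes (typical on 64-bit platforms)


def bitmap_bytes(ncols: int) -> int:
    """Bytes needed for a one-bit-per-column null bitmap, before padding."""
    return (ncols + 7) // 8


def aligned(nbytes: int, align: int = MAXALIGN) -> int:
    """Round nbytes up to the next multiple of align."""
    return (nbytes + align - 1) // align * align


# Current behaviour: the bitmap covers all 800 columns.
full = aligned(bitmap_bytes(800))

# Proposed behaviour: the bitmap covers only the first 50 columns.
truncated = aligned(bitmap_bytes(50))

savings = full - truncated
print(full, truncated, savings)
```

Under these assumptions the three values come out as 104, 8, and 96 bytes.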


What kind of test results would prove that this is a net win (or not a net 
loss) for typical cases? Are you interested in some insert performance tests? 
Also, how would you define a typical case (e.g. what kind of data shape)?

Thanks.
-jamie


________________________________
 From: Tom Lane <t...@sss.pgh.pa.us>
To: Jameison Martin <jameis...@yahoo.com> 
Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org> 
Sent: Tuesday, April 17, 2012 9:38 AM
Subject: Re: [HACKERS] patch submission: truncate trailing nulls from heap rows 
to reduce the size of the null bitmap 
 
Jameison Martin <jameis...@yahoo.com> writes:
> The following patch truncates trailing null attributes from heap rows to 
> reduce the size of the row bitmap. 

This has been discussed before, but it always seemed that the
cost-benefit ratio was exceedingly questionable.  You don't get any
savings whatsoever unless you reduce the size of the null bitmap across
a MAXALIGN boundary, which more and more often is 64 bits, so that the
frequency with which the optimization wins anything doesn't look likely
to be that high.  And on the other side of the coin, you're adding
cycles to every single tuple-construction operation to make this work.
The introduction of bugs doesn't seem improbable either.  (Just because
tuples in user tables might have unexpected natts values doesn't mean
that the code is, or should be, prepared to tolerate that in system
tables or plan-constructed tuples.)

So what I'd like to see is some concrete test results proving that this
is a net win, or at least not a net loss, for typical cases.  Just
asserting that it might be a win for certain usage patterns doesn't do
it for me.

            regards, tom lane
