Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-07-01 Thread Zdenek Kotala

Tom Lane wrote:


Cutting a third off the size of a system index has got to be worth
something, but is it worth a hack as ugly as this one?



The problem I see is how this fits with in-place upgrade. The catalog should 
be generated from scratch, but if somebody uses name in a regular table, it 
triggers a request for a reindex.


Zdenek

--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-07-01 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Cutting a third off the size of a system index has got to be worth
 something, but is it worth a hack as ugly as this one?

 The problem I see is how this fits with in-place upgrade. The catalog
 should be generated from scratch, but if somebody uses name in a regular
 table, it triggers a request for a reindex.

Actually, an existing index stored as name would continue to work
fine, so I think this could be worked around.   But in any case name is
deprecated for user use, so anyone who suffers a reindex has only
themselves to blame.

regards, tom lane



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-07-01 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala [EMAIL PROTECTED] writes:

Tom Lane wrote:

Cutting a third off the size of a system index has got to be worth
something, but is it worth a hack as ugly as this one?


The problem I see is how this fits with in-place upgrade. The catalog should 
be generated from scratch, but if somebody uses name in a regular table, it 
triggers a request for a reindex.


Actually, an existing index stored as name would continue to work
fine, so I think this could be worked around.   But in any case name is
deprecated for user use, so anyone who suffers a reindex has only
themselves to blame.


Yes, it is deprecated, but you know users. Give one a loaded shotgun and he 
will start playing golf with it :-).


However, a reindex is an acceptable penalty for a user who relies on deprecated things.

Zdenek



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-25 Thread Josh Berkus

Mark,

Not that I disagree with your change, but  5 Mbytes in 4 Gbytes of RAM 
for my main PostgreSQL system that I manage seems like a drop in the 
bucket. Even if 40% of pg_class_relname and pg_proc_proname indices was 
saved - we're talking about 154 Kbytes saved on both those indices 
combined. Minor? Major? I bet I wouldn't notice unless my database 
requirements used up all RAM, and even then I'm suspecting it wouldn't 
matter except for border line cases (like all pages required for 
everything else happened to equal 4 Gbytes near exactly).


Again, I think the best way to test this would be to create an 
installation with more than 100,000 tables and views. That's not 
hypothetical; I've already encountered it twice among production users.


--Josh Berkus



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Teodor Sigaev

dead easy to implement this: effectively, we just decree that the
index column storage type for NAME is always CSTRING.  Because the


Isn't this a reason to add the STORAGE option of CREATE OPERATOR CLASS to BTree, 
as is already done for GiST and GIN indexes?


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Shane Ambler

Mark Mielke wrote:

Not that I disagree with your change, but  5 Mbytes in 4 Gbytes of RAM 
for my main PostgreSQL system that I manage seems like a drop in the 
bucket. Even if 40% of pg_class_relname and pg_proc_proname indices was 
saved - we're talking about 154 Kbytes saved on both those indices 
combined. Minor? Major? I bet I wouldn't notice unless my database 
requirements used up all RAM, and even then I'm suspecting it wouldn't 
matter except for border line cases (like all pages required for 
everything else happened to equal 4 Gbytes near exactly).


Guess the mileage will vary depending on the complexity of the db 
structure. Shorter names will also benefit more than longer ones.



The performance impact is probably going to be limited by our extensive
use of catalog caches --- once a desired row is in a backend's catcache,
it doesn't take a btree search to fetch it again.  Still, the system
indexes are probably hot enough to stay in shared buffers most of the
time, and the smaller they are the more space will be left for other
stuff, so I think there should be a distributed benefit.
  


My question is whether this is limited to system catalogs, or whether it will 
also benefit a char() index on any table. The latter would make it more 
worthwhile.




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Heikki Linnakangas

Shane Ambler wrote:
My question is whether this is limited to system catalogs, or whether it will 
also benefit a char() index on any table. The latter would make it more 
worthwhile.


char(n) fields are already stored as variable-length on disk. This isn't 
applicable to them.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 dead easy to implement this: effectively, we just decree that the
 index column storage type for NAME is always CSTRING.  Because the

 Isn't this a reason to add the STORAGE option of CREATE OPERATOR CLASS to BTree,
 as is already done for GiST and GIN indexes?

Hmm ... I don't see a point in exposing that as a user-level facility,
unless you can point to other use-cases besides NAME.  But it would be
cute to implement the hack by changing the initial contents of
pg_opclass instead of inserting code in the backend.  I'll give that
a try.

regards, tom lane



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Josh Berkus

Shane Ambler wrote:

Mark Mielke wrote:

Not that I disagree with your change, but  5 Mbytes in 4 Gbytes of 
RAM for my main PostgreSQL system that I manage seems like a drop in 
the bucket. Even if 40% of pg_class_relname and pg_proc_proname 
indices was saved - we're talking about 154 Kbytes saved on both those 
indices combined. Minor? Major? I bet I wouldn't notice unless my 
database requirements used up all RAM, and even then I'm suspecting it 
wouldn't matter except for border line cases (like all pages required 
for everything else happened to equal 4 Gbytes near exactly).


Guess the mileage will vary depending on the complexity of the db 
structure. Shorter names will also benefit more than longer ones.


There are PostgreSQL users out there with more than 100,000 tables per 
server instance.  This will make more of a difference to them.


--Josh




Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Joshua D. Drake

Josh Berkus wrote:

Shane Ambler wrote:

Mark Mielke wrote:

Not that I disagree with your change, but  5 Mbytes in 4 Gbytes of 
RAM for my main PostgreSQL system that I manage seems like a drop in 
the bucket. Even if 40% of pg_class_relname and pg_proc_proname 
indices was saved - we're talking about 154 Kbytes saved on both 
those indices combined. Minor? Major? I bet I wouldn't notice unless 
my database requirements used up all RAM, and even then I'm 
suspecting it wouldn't matter except for border line cases (like all 
pages required for everything else happened to equal 4 Gbytes near 
exactly).


Guess the mileage will vary depending on the complexity of the db 
structure. Shorter names will also benefit more than longer ones.


There are PostgreSQL users out there with more than 100,000 tables per 
server instance.  This will make more of a difference to them.


More than I think people realize.

Joshua D. Drake



--Josh







Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Stephen R. van den Berg
Mark Mielke wrote:
saved - we're talking about 154 Kbytes saved on both those indices 
combined. Minor? Major? I bet I wouldn't notice unless my database 
requirements used up all RAM, and even then I'm suspecting it wouldn't 
matter except for border line cases (like all pages required for 
everything else happened to equal 4 Gbytes near exactly).

There is always only so much of 1st level and 2nd level cache; for those
the savings might well make a difference, even on multigigabyte
databases.
-- 
Sincerely,
   Stephen R. van den Berg.

Life is that brief interlude between nothingness and eternity.



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-23 Thread Bruce Momjian

I would mention in the C comment that we are doing this for space
savings, but other than that, it seems fine.

---

Tom Lane wrote:
 I was thinking a bit about how we pad columns of type NAME to
 fixed-width, even though they're semantically equivalent to C strings.
 The reason for wasting that space is that it makes it possible to
 overlay a C struct onto the leading columns of most system catalogs.
 I don't wish to propose changing that (at least not today), but it
 struck me that there is no reason to overlay a C struct onto index
 entries, and that getting rid of the padding space would be even more
 useful in an index than in the catalog itself.  It turns out to be
 dead easy to implement this: effectively, we just decree that the
 index column storage type for NAME is always CSTRING.  Because the
 two types are effectively binary-compatible as long as you don't
 look at the padding, the attached ugly-but-impressively-short patch
 seems to accomplish this.  It passes the regression tests anyway.
 Here are some numbers about the space savings in a virgin database:
 
                                                     CVS HEAD   w/patch   savings

 pg_database_size('postgres')                         4439752   4071112      8.3%
 pg_relation_size('pg_class_relname_nsp_index')         57344     40960       28%
 pg_relation_size('pg_proc_proname_args_nsp_index')    319488    204800       35%
 
 Cutting a third off the size of a system index has got to be worth
 something, but is it worth a hack as ugly as this one?
 
   regards, tom lane
 
 

Content-Description: index-name-as-cstring.patch

 Index: src/backend/catalog/index.c
 ===
 RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v
 retrieving revision 1.300
 diff -c -r1.300 index.c
 *** src/backend/catalog/index.c   19 Jun 2008 00:46:04 -  1.300
 --- src/backend/catalog/index.c   23 Jun 2008 19:34:54 -
 ***************
 *** 262,267 ****
 --- 262,278 ----
   
   ReleaseSysCache(tuple);
   }
 + 
 + /*
 +  * For an index on NAME, force the index storage to be CSTRING,
 +  * rather than padded to fixed length.
 +  */
 + if (to->atttypid == NAMEOID)
 + {
 + to->atttypid = CSTRINGOID;
 + to->attlen = -2;
 + to->attalign = 'c';
 + }
   }
   
   return indexTupDesc;

 

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-23 Thread Andrew Dunstan



Tom Lane wrote:

Cutting a third off the size of a system index has got to be worth
something, but is it worth a hack as ugly as this one?
  
  


I think so.

cheers

andrew



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-23 Thread Simon Riggs

On Mon, 2008-06-23 at 15:52 -0400, Tom Lane wrote:
                                                     CVS HEAD   w/patch   savings

 pg_database_size('postgres')                         4439752   4071112      8.3%
 pg_relation_size('pg_class_relname_nsp_index')         57344     40960       28%
 pg_relation_size('pg_proc_proname_args_nsp_index')    319488    204800       35%
 
 Cutting a third off the size of a system index has got to be worth
 something, but is it worth a hack as ugly as this one?

Not doing it would be more ugly, unless there is some negative
side-effect?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-23 Thread Mark Mielke

Andrew Dunstan wrote:

Tom Lane wrote:

Cutting a third off the size of a system index has got to be worth
something, but is it worth a hack as ugly as this one?



I think so.


Were you able to time any speedup? Is this something that would benefit 
installations with a lot of metadata? I presume most of this information 
normally easily fits in cache most of the time?


I am trying to understand what exactly it is worth... :-)

Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]




Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-23 Thread Tom Lane
Mark Mielke [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Cutting a third off the size of a system index has got to be worth
 something, but is it worth a hack as ugly as this one?

 Were you able to time any speedup?

I didn't try; can you suggest any suitable benchmark?

The performance impact is probably going to be limited by our extensive
use of catalog caches --- once a desired row is in a backend's catcache,
it doesn't take a btree search to fetch it again.  Still, the system
indexes are probably hot enough to stay in shared buffers most of the
time, and the smaller they are the more space will be left for other
stuff, so I think there should be a distributed benefit.

regards, tom lane



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Mon, 2008-06-23 at 15:52 -0400, Tom Lane wrote:
 Cutting a third off the size of a system index has got to be worth
 something, but is it worth a hack as ugly as this one?

 Not doing it would be more ugly, unless there is some negative
 side-effect?

I thought some more about why this seems ugly to me, and realized that a
lot of it has to do with the change in typalign.  Currently, a compiler
is entitled to assume that a pointer to Name is 4-byte aligned; thus
for instance it could generate word-wide instructions for copying a Name
from one place to another.  A Name that is stored as just CSTRING
might break that.  We are already at risk of this, really, because of
all the places where we gaily pass plain old C strings to syscache and
index searches on Name columns.  I think the only reason we've not been
burnt is that it's hard to optimize strcmp() into word-wide operations.

However the solution to that seems fairly obvious: let's downgrade Name
to typalign 1 instead of 4.

regards, tom lane



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-23 Thread Mark Mielke

Tom Lane wrote:

Were you able to time any speedup?



I didn't try; can you suggest any suitable benchmark?

  


Unfortunately - no. I kind of think it won't benefit any of my databases 
in any noticeable way. My numbers are similar to yours:



pccyber=# select pg_database_size('postgres');
  4468332

pccyber=# select pg_relation_size('pg_class_relname_nsp_index');
90112

pccyber=# select pg_relation_size('pg_proc_proname_args_nsp_index');
   294912


Not that I disagree with your change, but  5 Mbytes in 4 Gbytes of RAM 
for my main PostgreSQL system that I manage seems like a drop in the 
bucket. Even if 40% of pg_class_relname and pg_proc_proname indices was 
saved - we're talking about 154 Kbytes saved on both those indices 
combined. Minor? Major? I bet I wouldn't notice unless my database 
requirements used up all RAM, and even then I'm suspecting it wouldn't 
matter except for border line cases (like all pages required for 
everything else happened to equal 4 Gbytes near exactly).



The performance impact is probably going to be limited by our extensive
use of catalog caches --- once a desired row is in a backend's catcache,
it doesn't take a btree search to fetch it again.  Still, the system
indexes are probably hot enough to stay in shared buffers most of the
time, and the smaller they are the more space will be left for other
stuff, so I think there should be a distributed benefit.
  


In my opinion it is a matter of 'do the right thing' rather than a performance 
question. It seems to me that an index keeping track of space 
characters at the end of a name, char, varchar, or text does not make 
sense, and the right thing may be to do a more generic version of your 
patch? In the few cases where space at the end matters, couldn't that be 
determined by re-checking the table row after querying it?


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]