Re: [HACKERS] ToDo List Item - System Table Index Clustering
Robert Haas wrote: On Thu, Jan 20, 2011 at 4:40 PM, Simone Aiken sai...@quietlycompetent.com wrote: After playing with this in benchmarks and researching the weird results I got I'm going to advise dropping the todo for now unless something happens to change how postgres handles clustering. I agree, let's remove it. That having been said, analyzing TODO items to figure out which ones are worthless is a useful thing to do, so please feel free to keep at it. OK, removed. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Wed, Jan 19, 2011 at 4:27 PM, Simone Aiken sai...@ulfheim.net wrote: In my experience size increases related to documentation are almost always worth it. So I'm prejudiced right out of the gate. I was wondering if every pg_ table gets copied out to every database .. if there is already a mechanism for not replicating all of them we could utilize views or re-writes rules to merge a single copy of catalog comments in a separate table with each deployed database's pg_descriptions. All of them get copied, except for a handful of so-called shared catalogs. Changing that would be difficult. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
After playing with this in benchmarks and researching the weird results I got I'm going to advise dropping the todo for now unless something happens to change how postgres handles clustering. You guys probably already grokked this so I am just recording it for the list archives. The primary factor here is that postgres doesn't maintain clustered indexes. Clustering is a one-time operation that clusters the table at this current point in time. Basically, there really isn't any such thing in postgres as a clustered index. There is an operation - Cluster - which takes an index and a table as input and re-orders the table according to the index. But it is borderline fiction to call the index used clustered because the next row inserted will pop in at the end of the table instead of slipping into the middle of the table per the desired ordering. All the pg_table cluster candidates are candidates because they have a row per table column and we expect that a query will want to get several of these rows at once. These rows are naturally clustered because the scripts that create them insert their information into the catalog contiguously. When you create a catalog table the pg_attribute rows for its columns are inserted together. When you then create all its triggers they too are put into pg_triggers one after the other. So calling the Cluster operation after dbinit doesn't help anything. Over time table alterations can fragment this information. If a user loads a bunch of tables, then alters them over time the columns added later on will have their metadata stored separately from the columns created originally. Which gets us to the down and dirty of how the Cluster function works. It puts an access exclusive lock on the entire table - blocking all attempts to read and write to the table - creates a copy of the table in the desired order, drops the original, and renames the copy. Doing this to a catalog table that is relevant to queries pretty much brings everything else in the database to a halt while the system table is locked up. And the brute force logic makes this time consuming even if the table is perfectly ordered already. Additionally, snapshots taken of the table during the Cluster operation make the table appear to be empty which introduces the possibility of system table corruption if transactions are run concurrently with a Cluster operation. So basically, the Cluster operation in its current form is not something you want running automatically on a bunch of system table as it is currently implemented. It gives your system the hiccups. You would only want to run it manually during downtime. And you can do that just as easily with or without any preparation during dbinit. Thanks everyone, -Simone Aiken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Thu, Jan 20, 2011 at 4:40 PM, Simone Aiken sai...@quietlycompetent.com wrote: After playing with this in benchmarks and researching the weird results I got I'm going to advise dropping the todo for now unless something happens to change how postgres handles clustering. I agree, let's remove it. That having been said, analyzing TODO items to figure out which ones are worthless is a useful thing to do, so please feel free to keep at it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Robert I think the first thing to do would be to try to come up with a reproducible test case where clustering the tables improves performance. On that note, is there any standard way you guys do benchmarks? Bruce I think CLUSTER is a win when you are looking up multiple rows in the same table, either using a non-unique index or a range search. What places do such lookups? Having them all in adjacent pages would be a win --- single-row lookups are usually not. Mostly the tables that track column level data. Typically you will want to grab rows for multiple columns for a given table at once so it would be helpful to have them be contiguous on disk. I could design a benchmark to display this by building a thousand tables one column at a time using 'alter add column' to scatter the catalog rows for the tables across many blocks. So they'll be a range with column 1 for each table and column 2 for each table and column three for each table. Then fill a couple data tables with a lot of data and set some noise makers to loop through them over and over with full table scans ... filling up cache with unrelated data and hopefully ageing out the cache of the pg_tables. Then do some benchmark index lookup queries to see the retrieval time before and after clustering the pg_ctalog tables to record a difference. If the criteria is doesn't hurt anything and helps a little I think this passes. Esp since clusters aren't maintained automatically so adding them has no negative impact on insert or update. It'd just be a nice thing to do if you know it can be done that doesn't harm anyone who doesn't know. -Simone Aiken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Tue, Jan 18, 2011 at 6:49 PM, Simone Aiken sai...@quietlycompetent.com wrote: Pages like this one have column comments for the system tables: http://www.psql.it/manuale/8.3/catalog-pg-attribute.html Oh, I see. I don't think we want to go there. We'd need some kind of system for keeping the two places in sync. And there'd be no easy way to upgrade the in-database descriptions when we upgraded to a newer minor release, supposing they'd changed in the meantime. And some of the descriptions are quite long, so they wouldn't fit nicely in the amount of space you typically have available when you run \d+. And it would enlarge the size of an empty database by however much was required to store all those comments, which could be an issue for PostgreSQL instances that have many small databases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Robert Haas robertmh...@gmail.com writes: On Tue, Jan 18, 2011 at 6:49 PM, Simone Aiken sai...@quietlycompetent.com wrote: Pages like this one have column comments for the system tables: http://www.psql.it/manuale/8.3/catalog-pg-attribute.html Oh, I see. I don't think we want to go there. We'd need some kind of system for keeping the two places in sync. I seem to recall some muttering about teaching genbki to extract such comments from the SGML sources or perhaps the C header files. I tend to agree though that it would be a lot more work than it's worth. And as you say, pg_description entries aren't free. Which brings up another point though. I have a personal TODO item to make the comments for operator support functions more consistent: http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us Should we consider removing those comments altogether, instead? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Excerpts from Robert Haas's message of mié ene 19 15:25:00 -0300 2011: Oh, I see. I don't think we want to go there. We'd need some kind of system for keeping the two places in sync. Maybe autogenerate both the .sgml and the postgres.description files from a single source. And there'd be no easy way to upgrade the in-database descriptions when we upgraded to a newer minor release, supposing they'd changed in the meantime. I wouldn't worry about this issue. We don't do many catalog changes in minor releases anyway. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Wed, Jan 19, 2011 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Jan 18, 2011 at 6:49 PM, Simone Aiken sai...@quietlycompetent.com wrote: Pages like this one have column comments for the system tables: http://www.psql.it/manuale/8.3/catalog-pg-attribute.html Oh, I see. I don't think we want to go there. We'd need some kind of system for keeping the two places in sync. I seem to recall some muttering about teaching genbki to extract such comments from the SGML sources or perhaps the C header files. I tend to agree though that it would be a lot more work than it's worth. And as you say, pg_description entries aren't free. Which brings up another point though. I have a personal TODO item to make the comments for operator support functions more consistent: http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us Should we consider removing those comments altogether, instead? I could go either way on that. Most of those comments are pretty short, aren't they? How much storage are they really costing us? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Robert Haas robertmh...@gmail.com writes: On Wed, Jan 19, 2011 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Which brings up another point though. I have a personal TODO item to make the comments for operator support functions more consistent: http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us Should we consider removing those comments altogether, instead? I could go either way on that. Most of those comments are pretty short, aren't they? How much storage are they really costing us? Well, on my machine pg_description is about 210K (per database) as of HEAD. 90% of its contents are pg_proc entries, though I have no good fix on how much of that is for internal-use-only functions. A very rough estimate from counting pg_proc and pg_operator entries suggests that the answer might be about a third. So if we do what was said in the above-cited thread, ie move existing comments to pg_operator and add boilerplate ones to pg_proc, we probably would pay 100K for it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Wed, Jan 19, 2011 at 3:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jan 19, 2011 at 2:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Which brings up another point though. I have a personal TODO item to make the comments for operator support functions more consistent: http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us Should we consider removing those comments altogether, instead? I could go either way on that. Most of those comments are pretty short, aren't they? How much storage are they really costing us? Well, on my machine pg_description is about 210K (per database) as of HEAD. 90% of its contents are pg_proc entries, though I have no good fix on how much of that is for internal-use-only functions. A very rough estimate from counting pg_proc and pg_operator entries suggests that the answer might be about a third. So if we do what was said in the above-cited thread, ie move existing comments to pg_operator and add boilerplate ones to pg_proc, we probably would pay 100K for it. I guess that's not enormously expensive, but it's not insignificant either. On my machine, a template database is 5.5MB. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Robert Haas robertmh...@gmail.com writes: On Wed, Jan 19, 2011 at 3:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: Well, on my machine pg_description is about 210K (per database) as of HEAD. 90% of its contents are pg_proc entries, though I have no good fix on how much of that is for internal-use-only functions. A very rough estimate from counting pg_proc and pg_operator entries suggests that the answer might be about a third. So if we do what was said in the above-cited thread, ie move existing comments to pg_operator and add boilerplate ones to pg_proc, we probably would pay 100K for it. I guess that's not enormously expensive, but it's not insignificant either. On my machine, a template database is 5.5MB. The implementation I was thinking about was to have initdb run a SQL command that would do something like INSERT INTO pg_description SELECT oprcode, 'pg_proc'::regclass, 0, 'implementation of ' || oprname FROM pg_operator WHERE theres-not-already-a-description-of-the-oprcode-function So it would be minimal work to either provide or omit the boilerplate descriptions. I think we can postpone the decision till we have a closer fix on the number of entries we're talking about. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
I seem to recall some muttering about teaching genbki to extract such comments from the SGML sources or perhaps the C header files. I tend to agree though that it would be a lot more work than it's worth. And as you say, pg_description entries aren't free. I know I can't do all of the work, any submission requires review etc, but it is worth it to me provided it does no harm to the codebase. So the only outstanding question is the impact of increased size. In my experience size increases related to documentation are almost always worth it. So I'm prejudiced right out of the gate. I was wondering if every pg_ table gets copied out to every database .. if there is already a mechanism for not replicating all of them we could utilize views or re-writes rules to merge a single copy of catalog comments in a separate table with each deployed database's pg_descriptions. If all catalog descriptions were handled this way it would actually decrease the size of a deployed database ( by 210K? ) by absorbing the pg_descriptions that are currently being duplicated. Since users shouldn't be messing with them anyway and they are purely for humans to refer to - not computers to calculate explain plans with - there shouldn't be anything inherently wrong with moving static descriptions out of user space. In theory at least. -Simone Aiken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Excerpts from Simone Aiken's message of dom ene 16 02:11:26 -0300 2011: Hello Postgres Hackers, In reference to this todo item about clustering system table indexes, ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php ) I have been studying the system tables to see which would benefit from clustering. I have some index suggestions and a question if you have a moment. Wow, this is really old stuff. I don't know if this is really of any benefit, given that these catalogs are loaded into syscaches anyway. Furthermore, if you cluster at initdb time, they will soon lose the ordering, given that updates move tuples around and inserts put them anywhere. So you'd need the catalogs to be re-clustered once in a while, and I don't see how you'd do that (except by asking the user to do it, which doesn't sound so great). I think you need some more discussion on the operational details before engaging in the bootstrap bison stuff (unless you just want to play with Bison for educational purposes, of course, which is always a good thing to do). -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Excerpts from Simone Aiken's message of dom ene 16 02:11:26 -0300 2011: Hello Postgres Hackers, BTW whatever you do, don't start a new thread by replying to an existing message and just changing the subject line. It will mess up the threading for some readers, and some might not even see your message. Compose a fresh message instead. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Tue, Jan 18, 2011 at 8:35 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Simone Aiken's message of dom ene 16 02:11:26 -0300 2011: Hello Postgres Hackers, In reference to this todo item about clustering system table indexes, ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php ) I have been studying the system tables to see which would benefit from clustering. I have some index suggestions and a question if you have a moment. Wow, this is really old stuff. I don't know if this is really of any benefit, given that these catalogs are loaded into syscaches anyway. Furthermore, if you cluster at initdb time, they will soon lose the ordering, given that updates move tuples around and inserts put them anywhere. So you'd need the catalogs to be re-clustered once in a while, and I don't see how you'd do that (except by asking the user to do it, which doesn't sound so great). The idea of the TODO seems to have been to set the default clustering to something reasonable. That doesn't necessarily seem like a bad idea even if we can't automatically maintain the cluster order, but there's some question in my mind whether we'd get any measurable benefit from the clustering. Even on a database with a gigantic number of tables, it seems likely that the relevant system catalogs will stay fully cached and, as you point out, the system caches will further blunt the impact of any work in this area. I think the first thing to do would be to try to come up with a reproducible test case where clustering the tables improves performance. If we can't, that might mean it's time to remove this TODO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Jan 18, 2011, at 6:35 AM, Alvaro Herrera wrote: Wow, this is really old stuff. I don't know if this is really of any benefit, given that these catalogs are loaded into syscaches anyway. The benefit is educational primarily. I was looking for a todo list item that would expose me to the system tables. Learning the data model of a new system is always step 1 for me. So that one was perfect as it would have me study and consider each one to determine if there was any benefit from clustering on its initial load into cache. Furthermore, if you cluster at initdb time, they will soon lose the ordering, given that updates move tuples around and inserts put them anywhere. So you'd need the catalogs to be re-clustered once in a while, and I don't see how you'd do that (except by asking the user to do it, which doesn't sound so great). I did discover that last night. I'm used to databases that keep up their clustering. One that falls apart over time is distinctly strange. And the way you guys do your re-clustering logic is overkill if just a few rows are out of place. On the upside, a call to mass re-clustering goes and updates all the clustered indexes in the system and that includes these tables. Will have to study auto-vacuum as well to consider that. (unless you just want to play with Bison for educational purposes, of course, which is always a good thing to do). Pretty much, yeah. - Simone Aiken
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Tue, Jan 18, 2011 at 8:35 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Simone Aiken's message of dom ene 16 02:11:26 -0300 2011: Hello Postgres Hackers, In reference to this todo item about clustering system table indexes, ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php ) Wow, this is really old stuff. I don't know if this is really of any If we can't, that might mean it's time to remove this TODO. When I'm learning a new system I like to first learn how to use it, second learn its data model, third start seriously looking at the code. So that Todo is ideal for my learning method. If there is something else that would also involve studying all the system tables it would also be great. For example, I noticed we have column level comments on the web but not in the database itself. This seems silly. Why not have the comments in the database and have the web query the tables of template databases for the given versions? That way \d+ pg_tablename would provide instant gratification for users. And we all like our gratification to be instant. They could be worked into The .h files though as inserts to pg_description they wouldn't provide an excuse to learn bison. I'm open to other suggestions as well. -Simone Aiken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Robert Haas wrote: On Tue, Jan 18, 2011 at 8:35 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Simone Aiken's message of dom ene 16 02:11:26 -0300 2011: Hello Postgres Hackers, In reference to this todo item about clustering system table indexes, ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php ) I have been studying the system tables to see which would benefit ?from clustering. ?I have some index suggestions and a question if you have a moment. Wow, this is really old stuff. ?I don't know if this is really of any benefit, given that these catalogs are loaded into syscaches anyway. Furthermore, if you cluster at initdb time, they will soon lose the ordering, given that updates move tuples around and inserts put them anywhere. ?So you'd need the catalogs to be re-clustered once in a while, and I don't see how you'd do that (except by asking the user to do it, which doesn't sound so great). The idea of the TODO seems to have been to set the default clustering to something reasonable. That doesn't necessarily seem like a bad idea even if we can't automatically maintain the cluster order, but there's some question in my mind whether we'd get any measurable benefit from the clustering. Even on a database with a gigantic number of tables, it seems likely that the relevant system catalogs will stay fully cached and, as you point out, the system caches will further blunt the impact of any work in this area. I think the first thing to do would be to try to come up with a reproducible test case where clustering the tables improves performance. If we can't, that might mean it's time to remove this TODO. I think CLUSTER is a win when you are looking up multiple rows in the same table, either using a non-unique index or a range search. What places do such lookups? Having them all in adjacent pages would be a win --- single-row lookups are usually not. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
On Tue, Jan 18, 2011 at 12:16 PM, Simone Aiken sai...@ulfheim.net wrote: When I'm learning a new system I like to first learn how to use it, second learn its data model, third start seriously looking at the code. So that Todo is ideal for my learning method. Sure - my point is just that we usually have as a criteria for any performance related patch that it actually does improve performance. So, we'd need a test case. If there is something else that would also involve studying all the system tables it would also be great. For example, I noticed we have column level comments on the web but not in the database itself. This seems silly. Why not have the comments in the database and have the web query the tables of template databases for the given versions? Uh... I don't know what this means. I'm open to other suggestions as well. Here are a few TODO items that look relatively easy to me (they may not actually be easy when you dig in, of course): Clear table counters on TRUNCATE Allow the clearing of cluster-level statistics Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME Allow ALTER TABLE to change constraint deferrability and actions Unfortunately we don't have a lot of easy TODOs. People keep doing the ones we think up... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
-Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Tuesday, January 18, 2011 2:53 PM To: Simone Aiken Cc: Alvaro Herrera; pgsql-hackers Subject: Re: [HACKERS] ToDo List Item - System Table Index Clustering Sure - my point is just that we usually have as a criteria for any performance related patch that it actually does improve performance. Sorry wasn't arguing your point. Conceding it actually. =) I wasn't explaining why I chose it anyway to contest your statements, but as an invitation for you to point me towards something more useful that fit what I was looking for in a task. Uh... I don't know what this means. Pages like this one have column comments for the system tables: http://www.psql.it/manuale/8.3/catalog-pg-attribute.html But in my database when I look for comments they aren't there: qcc= \d+ pg_attribute Table pg_catalog.pg_attribute Column | Type | Modifiers | Description ---+--+---+- attrelid | oid | not null | attname | name | not null | atttypid | oid | not null | attstattarget | integer | not null | attlen| smallint | not null | attnum| smallint | not null | attndims | integer | not null | attcacheoff | integer | not null | atttypmod | integer | not null | attbyval | boolean | not null | attstorage| char | not null | attalign | char | not null | attnotnull| boolean | not null | atthasdef | boolean | not null | attisdropped | boolean | not null | attislocal| boolean | not null | attinhcount | integer | not null | So I have to fire up a web browser and start googling to learn about the columns. Putting them in pg_description would be more handy, no? -Simone Aiken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Followup on System Table Index clustering ToDo - It looks like to implement this I need to do the following: 1 - Add statements to indexing.h to cluster the selected indexes. A do-nothing define at the top to suppress warnings and then lines below for perl to parse out. #define DECLARE_CLUSTER_INDEX(table,index) ... ( add the defines under the index declarations ). 2 - Alter genbki.pl to produce the appropriate statements in postgres.bki when it reads the new lines in indexing.h. Will hold them in memory until the end of the file so they will come in after 'Build Indices' is called. CLUSTER tablename USING indexname 3 - Initdb will pipe the commands in postgres.bki to the postgres executable running in --boot mode. Code will need to be added to bootparse.y to recognize this new command and resolve it into a call to cluster_rel( tabOID, indOID, 0, 0, -1, -1 ); Speak now before I learn Bison ... actually I should probably learn Bison anyway. After ProC other pre-compilation languages can't be that bad. Sound all right? Thanks, -Simone Aiken On Jan 15, 2011, at 10:11 PM, Simone Aiken wrote: Hello Postgres Hackers, In reference to this todo item about clustering system table indexes, ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php ) I have been studying the system tables to see which would benefit from clustering. I have some index suggestions and a question if you have a moment. Cluster Candidates: pg_attribute: Make the existing index ( attrelid, attnum ) clustered to order it by table and column. pg_attrdef: Existing index ( adrelid, adnum ) clustered to order it by table and column. pg_constraint: Existing index ( conrelid ) clustered to get table constraints contiguous. pg_depend: Existing Index (refclassid, refobjid, refobjsubid) clustered to so that when the referenced object is changed its dependencies arevcontiguous. pg_description: Make the existing index ( Objoid, classoid, objsubid ) clustered to order it by entity, catalog, and optional column. * reversing the first two columns makes more sense to me ... catalog, object, column or since object implies catalog ( right? ) just dispensing with catalog altogether, but that would mean creating a new index. pg_shdependent: Existing index (refclassid, refobjid) clustered for same reason as pg_depend. pg_statistic: Existing index (starelid, staattnum) clustered to order it by table and column. pg_trigger: Make the existing index ( tgrelid, tgname ) clustered to order it by table then name getting all the triggers on a table together. Maybe Cluster: pg_rewrite: Not sure about this one ... The existing index ( ev_class, rulename ) seems logical to cluster to get all the rewrite rules for a given table contiguous but in the db's available to me virtually every table only has one rewrite rule. pg_auth_members: We could order it by role or by member of that role. Not sure which would be more valuable. Stupid newbie question: is there a way to make queries on the system tables show me what is actually there when I'm poking around? So for example: Select * from pg_type limit 1; tells me that the typoutput is 'boolout'. An english string rather than a number. So even though the documentation says that column maps to pg_proc.oid I can't then write: Select * from pg_proc where oid = 'boolout'; It would be very helpful if I wasn't learning the system but since I am I'd like to turn it off for now. Fewer layers of abstraction. Thanks, Simone Aiken 303-956-7188 Quietly Competent Consulting -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] ToDo List Item - System Table Index Clustering
Hello Postgres Hackers, In reference to this todo item about clustering system table indexes, ( http://archives.postgresql.org/pgsql-hackers/2004-05/msg00989.php ) I have been studying the system tables to see which would benefit from clustering. I have some index suggestions and a question if you have a moment. Cluster Candidates: pg_attribute: Make the existing index ( attrelid, attnum ) clustered to order it by table and column. pg_attrdef: Existing index ( adrelid, adnum ) clustered to order it by table and column. pg_constraint: Existing index ( conrelid ) clustered to get table constraints contiguous. pg_depend: Existing Index (refclassid, refobjid, refobjsubid) clustered to so that when the referenced object is changed its dependencies arevcontiguous. pg_description: Make the existing index ( Objoid, classoid, objsubid ) clustered to order it by entity, catalog, and optional column. * reversing the first two columns makes more sense to me ... catalog, object, column or since object implies catalog ( right? ) just dispensing with catalog altogether, but that would mean creating a new index. pg_shdependent: Existing index (refclassid, refobjid) clustered for same reason as pg_depend. pg_statistic: Existing index (starelid, staattnum) clustered to order it by table and column. pg_trigger: Make the existing index ( tgrelid, tgname ) clustered to order it by table then name getting all the triggers on a table together. Maybe Cluster: pg_rewrite: Not sure about this one ... The existing index ( ev_class, rulename ) seems logical to cluster to get all the rewrite rules for a given table contiguous but in the db's available to me virtually every table only has one rewrite rule. pg_auth_members: We could order it by role or by member of that role. Not sure which would be more valuable. Stupid newbie question: is there a way to make queries on the system tables show me what is actually there when I'm poking around? So for example: Select * from pg_type limit 1; tells me that the typoutput is 'boolout'. An english string rather than a number. So even though the documentation says that column maps to pg_proc.oid I can't then write: Select * from pg_proc where oid = 'boolout'; It would be very helpful if I wasn't learning the system but since I am I'd like to turn it off for now. Fewer layers of abstraction. Thanks, Simone Aiken 303-956-7188 Quietly Competent Consulting -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
2011/1/16 Simone Aiken sai...@ulfheim.net: is there a way to make queries on the system tables show me what is actually there when I'm poking around? So for example: Select * from pg_type limit 1; tells me that the typoutput is 'boolout'. An english string rather than a number. So even though the documentation says that column maps to pg_proc.oid I can't then write: Select * from pg_proc where oid = 'boolout'; Type type of typoutput is regproc, which is really an oid with a different output function. To get the numeric value, do: Select typoutput::oid from pg_type limit 1; Nicolas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Nicolas Barbier nicolas.barb...@gmail.com writes: 2011/1/16 Simone Aiken sai...@ulfheim.net: ... So even though the documentation says that column maps to pg_proc.oid I can't then write: Select * from pg_proc where oid = 'boolout'; Type type of typoutput is regproc, which is really an oid with a different output function. To get the numeric value, do: Select typoutput::oid from pg_type limit 1; Also, you *can* go back the other way. It's very common to write Select * from pg_proc where oid = 'boolout'::regproc rather than looking up the OID first. There are similar pseudotypes for relation and operator names; see Object Identifier Types in the manual. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ToDo List Item - System Table Index Clustering
Select typoutput::oid from pg_type limit 1; Also, you *can* go back the other way. It's very common to write Select * from pg_proc where oid = 'boolout'::regproc rather than looking up the OID first. see Object Identifier Types in the manual. Many thanks to you both, that helps tremendously. - Simone Aiken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers