Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Tom Lane wrote:
> Alvaro Herrera alvhe...@commandprompt.com writes:
>> toruvinn wrote:
>>> I was always wondering, though, why PostgreSQL uses this approach and not its catalogs.
>> It does use the catalog for most things. The flatfile is used for the situations where the catalogs are not yet ready to be read.
> Now that we have SQL-level CONNECT privilege, I wonder just how much functionality would be lost if we got rid of the flat files and told people they had to use CONNECT to do any per-user or per-database access control. The main point I can see offhand is that password checking would have to be done a lot later in the startup sequence, with correspondingly more cycles wasted to reject bad passwords.

Is this a TODO?

-- 
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
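[For reference, the SQL-level per-database access control Tom mentions can already be expressed with the CONNECT privilege; the database and role names below are made up for illustration:]

```sql
-- Revoke the default right for everyone to connect, then grant it
-- back selectively: only members of reporting_role may connect.
REVOKE CONNECT ON DATABASE appdb FROM PUBLIC;
GRANT CONNECT ON DATABASE appdb TO reporting_role;
```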
[PERFORM] ORDER BY ... LIMIT and JOIN
Hello, I'm trying to optimize the following query, which returns the top users ordered by ranking. I'll show you my schema and an EXPLAIN ANALYZE for each case. So, I'm asking two things: 1) Why is the ranking index not used in the second query when sorting? 2) Am I missing some obvious optimization like a missing index? :)

Schemas:

# \d ranking
        Table "public.ranking"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 ranking   | bigint                |
 score     | double precision      |
 username  | character varying(20) | not null
 variation | bigint                |
Indexes:
    ranking_tmp_pkey1 PRIMARY KEY, btree (username)
    idxrank_6057 btree (ranking) CLUSTER

# \d user
        Table "public.user"
   Column   |         Type          |                     Modifiers
------------+-----------------------+---------------------------------------------------
 id         | integer               | not null default nextval('user_id_seq'::regclass)
 username   | character varying(20) | not null
 about      | text                  |
 name       | character varying(50) |
 photo      | text                  |
 country_id | integer               |
Indexes:
    user_pkey PRIMARY KEY, btree (username)
    country_ranking_user_idx btree (country_id)

Explain:

# explain analyze SELECT * FROM ranking INNER JOIN user ON (ranking.username = user.username) WHERE user.country_id = 1 ORDER BY ranking.ranking ASC LIMIT 100;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=13.03..13.04 rows=1 width=180) (actual time=965.229..965.302 rows=100 loops=1)
   ->  Sort  (cost=13.03..13.04 rows=1 width=180) (actual time=965.227..965.256 rows=100 loops=1)
         Sort Key: ranking.ranking
         Sort Method: top-N heapsort  Memory: 56kB
         ->  Nested Loop  (cost=0.00..13.02 rows=1 width=180) (actual time=0.049..900.847 rows=57309 loops=1)
               ->  Index Scan using country_ranking_user_idx on user  (cost=0.00..6.49 rows=1 width=145) (actual time=0.023..57.633 rows=57309 loops=1)
                     Index Cond: (country_id = 1)
               ->  Index Scan using ranking_tmp_pkey1 on ranking  (cost=0.00..6.52 rows=1 width=35) (actual time=0.013..0.013 rows=1 loops=57309)
                     Index Cond: ((ranking.username)::text = (user.username)::text)
 Total runtime: 965.412 ms
(10 rows)

# explain analyze SELECT * FROM ranking INNER JOIN user ON (ranking.username = user.username) ORDER BY ranking.ranking ASC LIMIT 100;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..137.02 rows=100 width=180) (actual time=0.056..1.973 rows=100 loops=1)
   ->  Nested Loop  (cost=0.00..3081316.65 rows=2248753 width=180) (actual time=0.055..1.921 rows=100 loops=1)
         ->  Index Scan using idxrank_6057 on ranking  (cost=0.00..70735.73 rows=2248753 width=35) (actual time=0.021..0.076 rows=100 loops=1)
         ->  Index Scan using user_pkey on user  (cost=0.00..1.33 rows=1 width=145) (actual time=0.016..0.017 rows=1 loops=100)
               Index Cond: ((user.username)::text = (ranking.username)::text)
 Total runtime: 2.043 ms
(6 rows)

Thanks!

Fz
Re: [PERFORM] PG-related ACM Article: The Pathologies of Big Data
> I don't see how on any recent hardware, random access to RAM is slower than sequential from disk. RAM access, random or not, is measured in GB/sec...

I don't think anybody's arguing that.

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2795&p=5

These guys mention about 50 ns memory latency; this would translate into 20 million memory seeks per second, which is in the same ballpark as the numbers given by the article... If you count 10 GB/s bandwidth, 50 ns is the time to fetch 500 bytes.
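[The arithmetic behind those two figures can be checked directly; a quick sketch:]

```python
# Back-of-envelope check of the figures above: ~50 ns DRAM latency
# and ~10 GB/s memory bandwidth.
latency_s = 50e-9      # 50 ns per dependent random access
bandwidth_bps = 10e9   # 10 GB/s sequential bandwidth

seeks_per_sec = 1 / latency_s                   # random accesses per second
bytes_per_latency = bandwidth_bps * latency_s   # bytes streamed in one latency window

print(round(seeks_per_sec))     # 20 million seeks/s
print(round(bytes_per_latency)) # 500 bytes
```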
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Bruce Momjian br...@momjian.us writes:
> Tom Lane wrote:
>> Now that we have SQL-level CONNECT privilege, I wonder just how much functionality would be lost if we got rid of the flat files and told people they had to use CONNECT to do any per-user or per-database access control. The main point I can see offhand is that password checking would have to be done a lot later in the startup sequence, with correspondingly more cycles wasted to reject bad passwords.
> Is this a TODO?

Well, it's a TO-THINK-ABOUT anyway. I think the appropriate next step would not be to write code, but to do a detailed investigation of what would be gained or lost. I don't remember exactly what we do with the flat-file contents.

regards, tom lane
Re: [PERFORM] ORDER BY ... LIMIT and JOIN
On Saturday 08 August 2009 08:02:47 Fizu wrote:
> ->  Index Scan using country_ranking_user_idx on user  (cost=0.00..6.49 rows=1 width=145) (actual time=0.023..57.633 rows=57309 loops=1)
>       Index Cond: (country_id = 1)

The planner is expecting one user with country_id = 1, but instead there are 57309. Have you analyzed recently? Maybe increasing the statistics target will help.

/Michael
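[One way to act on that advice, using the column from the schema shown earlier (the target value 1000 is just an example; pick something proportionate to the number of distinct countries):]

```sql
-- Raise the per-column statistics target for the skewed column,
-- then re-gather statistics so the planner sees the real distribution.
ALTER TABLE "user" ALTER COLUMN country_id SET STATISTICS 1000;
ANALYZE "user";
```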
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Tom Lane wrote:
> Well, it's a TO-THINK-ABOUT anyway. I think the appropriate next step would not be to write code, but to do a detailed investigation of what would be gained or lost. I don't remember exactly what we do with the flat-file contents.

Maybe what we need is not to get rid of the flat files, but to speed them up. If we're worried about speed in the pg_authid flatfile, and come up with a solution to that problem, what will we do with the pg_database flatfile when it grows too large? We can't just get rid of it, because autovacuum needs to access it.

-- 
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Alvaro Herrera alvhe...@commandprompt.com writes:
> Tom Lane wrote:
>> ... I don't remember exactly what we do with the flat-file contents.
> Maybe what we need is not to get rid of the flat files, but to speed them up. If we're worried about speed in the pg_authid flatfile, and come up with a solution to that problem, what will we do with the pg_database flatfile when it grows too large? We can't just get rid of it, because autovacuum needs to access it.

Well, one of the components of the TODO would have to be to figure out a way to fix autovacuum to avoid that. Or we could consider getting rid of the pg_auth flatfile while keeping the pg_database one, which would fix the issue for role names anyway... but it's certainly an ugly compromise.

regards, tom lane
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
I wrote:
> Alvaro Herrera alvhe...@commandprompt.com writes:
>> Tom Lane wrote:
>>> ... I don't remember exactly what we do with the flat-file contents.
>> Maybe what we need is not to get rid of the flat files, but to speed them up. If we're worried about speed in the pg_authid flatfile, and come up with a solution to that problem, what will we do with the pg_database flatfile when it grows too large? We can't just get rid of it, because autovacuum needs to access it.
> Well, one of the components of the TODO would have to be to figure out a way to fix autovacuum to avoid that.

Actually, I had forgotten that we were using the pg_database flatfile for purposes other than authentication checks. In particular, we need it during backend startup to map from database name to database OID, without which it's impossible to locate the system catalogs for the target database. It's pretty hard to see a way around that one. We could grovel through pg_database itself, as indeed is done to rebuild the flatfile during system start. But that's certainly not going to be fast in cases where there are enough DBs to make the flatfile slow.

So on third thought, Alvaro's right: the only real solution here is to adopt a more efficient representation of the flat files. Maybe some sort of simple hashtable arrangement would work. (Rendering them not so flat anymore...)

regards, tom lane
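[What a "not so flat" hashed representation buys is constant-time name lookups instead of scanning the whole file at startup. A minimal sketch of the idea; the file format and names here are invented for illustration, not the actual flatfile layout:]

```python
# Sketch: parse a pg_database-style flat file ("name oid" per line)
# into a hash table, so mapping a database name to its OID no longer
# costs a scan proportional to the number of databases.
def load_db_map(lines):
    """Build a name -> OID mapping from flat-file lines."""
    db_map = {}
    for line in lines:
        name, oid = line.split()
        db_map[name] = int(oid)
    return db_map

flatfile = ["template1 1", "postgres 11511", "appdb 16384"]  # invented sample
db_map = load_db_map(flatfile)
print(db_map["appdb"])  # 16384
```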
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Tom Lane wrote:
> Actually, I had forgotten that we were using the pg_database flatfile for purposes other than authentication checks. In particular, we need it during backend startup to map from database name to database OID, without which it's impossible to locate the system catalogs for the target database. It's pretty hard to see a way around that one. We could grovel through pg_database itself, as indeed is done to rebuild the flatfile during system start. But that's certainly not going to be fast in cases where there are enough DBs to make the flatfile slow.

Also, IIRC flatfiles were introduced precisely to avoid having to read the catalogs manually.

> So on third thought, Alvaro's right: the only real solution here is to adopt a more efficient representation of the flat files. Maybe some sort of simple hashtable arrangement would work. (Rendering them not so flat anymore...)

As long as there's a simple API, there should be no problem. (Except that it would be nice to be able to build the file incrementally ... If we have to write out a million lines each time a millionth user is created, there will still be a bottleneck at CREATE USER time.)

-- 
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
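[The CREATE USER bottleneck Alvaro describes is easy to quantify: rewriting the whole file on each of n creations writes O(n^2) lines in total, while an incremental (append-only) format writes O(n). A toy illustration:]

```python
# Lines written to the flat file over n CREATE USER commands, if each
# command rewrites the entire file (current behavior) versus appending
# one line (the incremental scheme Alvaro wishes for).
def rewrite_cost(n_users):
    # the k-th CREATE USER rewrites all k lines
    return sum(range(1, n_users + 1))

def append_cost(n_users):
    return n_users

print(rewrite_cost(1000))  # 500500 lines written in total...
print(append_cost(1000))   # ...versus 1000 with appends
```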
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Alvaro Herrera alvhe...@commandprompt.com writes:
> Tom Lane wrote:
>> So on third thought, Alvaro's right: the only real solution here is to adopt a more efficient representation of the flat files. Maybe some sort of simple hashtable arrangement would work. (Rendering them not so flat anymore...)
> As long as there's a simple API, there should be no problem. (Except that it would be nice to be able to build the file incrementally ... If we have to write out a million lines each time a millionth user is created, there will still be a bottleneck at CREATE USER time.)

If we allow the requirements to creep on this, we'll soon find ourselves either using or reinventing BDB for the flatfiles. Ick.

[ thinks for awhile... ] In some sense this is a bootstrap problem: what does it take to get to the point of being able to read pg_database and its indexes? That is necessarily not dependent on the particular database we want to join. Maybe we could solve it by having the relcache write a global cache file containing only entries for the global tables, and load that before we have identified the database we want to join (after which, we'll load another cache file for the local entries). It would doubtless take some rearrangement of the backend startup sequence, but it doesn't seem obviously impossible.

regards, tom lane
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Tom Lane wrote:
> In some sense this is a bootstrap problem: what does it take to get to the point of being able to read pg_database and its indexes? That is necessarily not dependent on the particular database we want to join. Maybe we could solve it by having the relcache write a global cache file containing only entries for the global tables, and load that before we have identified the database we want to join (after which, we'll load another cache file for the local entries).

This sounds good, because autovacuum could probably use this too.

-- 
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Alvaro Herrera alvhe...@commandprompt.com writes:
> Tom Lane wrote:
>> In some sense this is a bootstrap problem: what does it take to get to the point of being able to read pg_database and its indexes? That is necessarily not dependent on the particular database we want to join. Maybe we could solve it by having the relcache write a global cache file containing only entries for the global tables, and load that before we have identified the database we want to join (after which, we'll load another cache file for the local entries).
> This sounds good, because autovacuum could probably use this too.

Maybe I'll look at this after commitfest is over. I haven't messed with the bootstrap sequence in awhile, but I used to remember how it worked ...

As far as AV is concerned, taking this approach would likely mean turning the launcher into a full-fledged backend just like the workers. Do you see any problem with that?

regards, tom lane
Re: [PERFORM] [BUGS] BUG #4919: CREATE USER command slows down system performance
Tom Lane wrote:
> Bruce Momjian br...@momjian.us writes:
>> Tom Lane wrote:
>>> Now that we have SQL-level CONNECT privilege, I wonder just how much functionality would be lost if we got rid of the flat files and told people they had to use CONNECT to do any per-user or per-database access control. The main point I can see offhand is that password checking would have to be done a lot later in the startup sequence, with correspondingly more cycles wasted to reject bad passwords.
>> Is this a TODO?
> Well, it's a TO-THINK-ABOUT anyway. I think the appropriate next step would not be to write code, but to do a detailed investigation of what would be gained or lost. I don't remember exactly what we do with the flat-file contents.

The flat file is the username/password sorted list. We load that info into the postmaster in an array that we can binary search. I wonder how big the postmaster process address space was when handling 2 billion users: http://archives.postgresql.org/pgsql-bugs/2009-07/msg00176.php

It seems just storing many users in the postmaster could be burdensome, but fixing that would be creating something like a database, which we already have. ;-)

-- 
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
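[The in-memory structure Bruce describes, a sorted username/password array probed per connection attempt, can be sketched like so; the field layout and hash values are made up for illustration:]

```python
import bisect

# Sorted (username, password-hash) pairs, as the postmaster might hold them.
auth_list = sorted([
    ("alice", "md5a..."),
    ("bob", "md5b..."),
    ("carol", "md5c..."),
])
usernames = [u for u, _ in auth_list]

def lookup(username):
    """Binary-search the sorted array: O(log n) per connection attempt,
    but the whole array lives in postmaster memory."""
    i = bisect.bisect_left(usernames, username)
    if i < len(usernames) and usernames[i] == username:
        return auth_list[i][1]
    return None

print(lookup("bob"))     # md5b...
print(lookup("mallory")) # None
```

The address-space concern follows directly: with two billion entries, even a few dozen bytes per entry puts the array far beyond what a single process can comfortably hold.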
Re: [PERFORM] Need suggestions on kernel settings for dedicated FreeBSD/Postgresql machine
On Fri, Aug 7, 2009 at 5:24 PM, Culley Harrelson harrel...@gmail.com wrote:
> Hi Everyone, I manage a freeBSD server that is dedicated to postgresql. The machine has 4 gigs of ram and there is a single database powering a web application that is hosted on a neighboring machine. The web application is mostly reading the database but there are considerable writes and I don't want to tune the machine exclusively for writes. I realize more information would be needed to optimally tune the machine but I am seeking advice on making some sane kernel settings for a general purpose database on a dedicated system. Currently I have:
>
> $ cat /etc/sysctl.conf
> kern.ipc.shmmax=268435456
> kern.ipc.shmall=65536
>
> and
>
> $ cat /boot/loader.conf
> kern.ipc.semmni=256
> kern.ipc.semmns=512
> kern.ipc.semmnu=256
>
> In postgresql.conf I have:
> max_connections = 180
> shared_buffers = 28MB
>
> I would like to increase this to 256 connections and make sure the kernel settings are giving postgresql enough breathing room without. I suspect my settings are conservative and since the machine is dedicated to postgresql I would like to give it more resources if they could be used. Any suggestions?

This might be worth a look, for starters. http://pgfoundry.org/projects/pgtune/

...Robert
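[As a rough sanity check on the semaphore settings for 256 connections, the PostgreSQL manual's "Managing Kernel Resources" chapter gives sizing formulas along these lines (the exact constants and the autovacuum worker count vary by server version, so treat this as an estimate, not a spec):]

```python
import math

# Approximate SysV semaphore needs for the proposed configuration,
# following the shape of the formulas in the PostgreSQL docs:
# one 16-semaphore set per 16 allowed connections (plus workers/slack).
max_connections = 256
autovacuum_max_workers = 3  # assumed default; version-dependent

sem_sets = math.ceil((max_connections + autovacuum_max_workers + 4) / 16)
semmns = sem_sets * 17  # each set carries one extra bookkeeping semaphore

print(sem_sets)  # semaphore sets needed (compare kern.ipc.semmni=256)
print(semmns)    # total semaphores (compare kern.ipc.semmns=512)
```

By this estimate the existing semmni/semmns values already have headroom; the tighter constraint when raising shared_buffers on a 4 GB machine is kern.ipc.shmmax, currently capped at 256 MB.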