Re: [PERFORM] security for row level but not based on Database user's
Friends wrote:
> Hi I need to set security for row level but not based on Database user's login. It should be based on the user table login. For the particular user I need to allow only the particular records to access insert, update, delete and select.

Well, the data access stuff is all manageable via views, which is the standard way to do this. You don't say which version of PostgreSQL you are using, but I'd be tempted just to switch to a different user after connecting and use the session_user system function to control what is visible in the view. For example:

CREATE VIEW my_contacts AS SELECT * FROM contacts WHERE owner = session_user;

If that's not practical then you'll need to write some functions to simulate your own session_user (say application_user()). This is easiest to write in plperl/pltcl or some other interpreted language - check the list archives for plenty of discussion.
-- Richard Huxton Archonet Ltd
---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
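Richard's application_user() suggestion can also be sketched in plain plpgsql/SQL rather than plperl/pltcl. This is only an illustrative sketch, not from the original post: the function and table names are made up, and it assumes a reasonably recent PostgreSQL where a per-connection temp table can stand in for a session variable.

```sql
-- Hypothetical sketch of the application_user() idea. A per-connection
-- temp table holds the application-level login; the view filters on it.
CREATE FUNCTION set_application_user(p_login text) RETURNS void AS $$
BEGIN
    -- On older versions, create the temp table once at connect time
    -- instead of using IF NOT EXISTS.
    CREATE TEMP TABLE IF NOT EXISTS app_session (login text);
    DELETE FROM app_session;
    INSERT INTO app_session VALUES (p_login);
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION application_user() RETURNS text AS $$
    SELECT login FROM app_session;
$$ LANGUAGE sql;

-- The view then restricts rows to the current application user:
CREATE VIEW my_contacts AS
    SELECT * FROM contacts WHERE owner = application_user();
```

The application would call set_application_user('alice') right after connecting, and all access would go through the view.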
Re: [PERFORM] security for row level but not based on Database user's
Ramasamy wrote:
> Hi Richard, Very good Day. Great information that you given to me.

Glad you think it's useful. Oh, don't forget to cc: the mailing-list - that's the convention around here.

> I will try in your idea. Here I am using SQL Server 2000 (even I can use SQL Server 2005 too if needed). We are dealing with lot of databases with lot of business logic. I think your information will be great for me.

Ah - you're not using PostgreSQL? Well, the principle is the same, but you should be aware that this is a PostgreSQL list so the details will be different.

> Let I try and I will back with you. Thanks and hands of too you. But I don't have any idea of getting Session from databases (I think you are meaning that user can be handled with session). If you have some more idea then it will be great full to me.

With MS-SQL you'll probably want to look at T-SQL variables, although I don't know if they last longer than a single transaction. Failing that, functions or temporary tables might be a help.
-- Richard Huxton Archonet Ltd
[PERFORM] Query on postgresql 7.4.2 not using index
Hi all,

I have the following running on postgresql version 7.4.2:

CREATE SEQUENCE agenda_user_group_id_seq MINVALUE 1 MAXVALUE 9223372036854775807 CYCLE INCREMENT 1 START 1;

CREATE TABLE AGENDA_USERS_GROUPS (
  AGENDA_USER_GROUP_ID INT8 CONSTRAINT pk_agndusrgrp_usergroup PRIMARY KEY DEFAULT NEXTVAL('agenda_user_group_id_seq'),
  USER_ID NUMERIC(10) CONSTRAINT fk_agenda_uid REFERENCES AGENDA_USERS (USER_ID) ON DELETE CASCADE NOT NULL,
  GROUP_ID NUMERIC(10) CONSTRAINT fk_agenda_gid REFERENCES AGENDA_GROUPS (GROUP_ID) ON DELETE CASCADE NOT NULL,
  CREATION_DATE DATE DEFAULT CURRENT_DATE,
  CONSTRAINT un_agndusrgrp_usergroup UNIQUE(USER_ID, GROUP_ID)
);

CREATE INDEX i_agnusrsgrs_userid ON AGENDA_USERS_GROUPS ( USER_ID );
CREATE INDEX i_agnusrsgrs_groupid ON AGENDA_USERS_GROUPS ( GROUP_ID );

When I execute:

EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = 9;

it does a sequential scan and doesn't use the index and I don't understand why, any idea? I have the same in postgresql 8.1 and it uses the index :-|

Thanks
-- Arnau
Re: [PERFORM] Query on postgresql 7.4.2 not using index
On 4/25/06, Arnau [EMAIL PROTECTED] wrote:
> Hi all, I have the following running on postgresql version 7.4.2:
> [schema snipped - see the original post above]
> When I execute:
> EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = 9;

Try

EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = 9;

or

EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = '9';

and let us know what happens.
-- Postgresql php tutorials http://www.designmagick.com/
Re: [PERFORM] Query on postgresql 7.4.2 not using index
chris smith wrote:
> On 4/25/06, Arnau [EMAIL PROTECTED] wrote:
>> Hi all, I have the following running on postgresql version 7.4.2:
>> [schema snipped - see the original post above]
>> When I execute:
>> EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = 9;
> Try
> EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = 9;
> or
> EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = '9';
> and let us know what happens.
The same, the table has 2547556 entries:

espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = 9;
 Seq Scan on agenda_users_groups (cost=0.00..59477.34 rows=12738 width=8) (actual time=3409.541..11818.794 rows=367026 loops=1)
   Filter: ((group_id)::bigint = 9)
 Total runtime: 13452.114 ms
(3 filas)

espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = '9';
 Seq Scan on agenda_users_groups (cost=0.00..53108.45 rows=339675 width=8) (actual time=916.903..5763.830 rows=367026 loops=1)
   Filter: (group_id = 9::numeric)
 Total runtime: 7259.861 ms
(3 filas)

espsm_moviltelevision=# select count(*) from agenda_users_groups;
  count
---------
 2547556

Thanks
-- Arnau
Re: [PERFORM] Query on postgresql 7.4.2 not using index
On 4/25/06, Arnau [EMAIL PROTECTED] wrote:
> espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = '9';
> Seq Scan on agenda_users_groups (cost=0.00..53108.45 rows=339675 width=8) (actual time=916.903..5763.830 rows=367026 loops=1)
> Filter: (group_id = 9::numeric)
> Total runtime: 7259.861 ms

Arnau,

Why do you use a numeric instead of an integer/bigint? IIRC, there were a few problems with indexes on numeric columns on older versions of PostgreSQL.

You can't change the type of a column with 7.4, so create a new integer column, copy the values into this new column, drop the old one, rename the new one. Run VACUUM ANALYZE and recreate your index. It should work far better with an int. Note that you will have to update all the tables referencing this key...
-- Guillaume
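Guillaume's migration steps could be sketched roughly like this. This is only a sketch, not from the original post: it uses the column and index names from the schema above, and it glosses over the foreign keys referencing group_id, which would have to be dropped and recreated around the change as he notes.

```sql
-- Rough sketch of the suggested 7.4-era type change (illustrative only).
BEGIN;
ALTER TABLE agenda_users_groups ADD COLUMN group_id_new bigint;
UPDATE agenda_users_groups SET group_id_new = group_id;
ALTER TABLE agenda_users_groups DROP COLUMN group_id;
ALTER TABLE agenda_users_groups RENAME COLUMN group_id_new TO group_id;
ALTER TABLE agenda_users_groups ALTER COLUMN group_id SET NOT NULL;
COMMIT;

-- Refresh statistics and rebuild the index on the new integer column.
VACUUM ANALYZE agenda_users_groups;
CREATE INDEX i_agnusrsgrs_groupid ON agenda_users_groups (group_id);
```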
Re: [PERFORM] Query on postgresql 7.4.2 not using index
On Tue, 2006-04-25 at 08:49, Arnau wrote:
> chris smith wrote:
>> [schema and suggested queries SNIPped - see the original post above]
> The same, the table has 2547556 entries:
> espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = 9;
> Seq Scan on agenda_users_groups (cost=0.00..59477.34 rows=12738 width=8) (actual time=3409.541..11818.794 rows=367026 loops=1)
> Filter: ((group_id)::bigint = 9)
> Total runtime: 13452.114 ms
> espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = '9';
> Seq Scan on agenda_users_groups (cost=0.00..53108.45 rows=339675 width=8) (actual time=916.903..5763.830 rows=367026 loops=1)
> Filter: (group_id = 9::numeric)
> Total runtime: 7259.861 ms
> espsm_moviltelevision=# select count(*) from agenda_users_groups;
> count: 2547556

OK, a few points.

1: 7.4.2 is WAY out of date for the 7.4 series. The 7.4 series is also a bit out of date, and many issues in terms of performance have been enhanced in the 8.x series.
You absolutely should update to the latest 7.4 series, as there are known data loss bugs and other issues in the 7.4.2 version.

2: An index scan isn't always faster. In this instance, it looks like the number of rows that match in the last version of your query is well over 10% of the rows. Assuming your average row takes up 10% or so of a block, which is pretty common, then you're going to have to hit almost every block anyway to get your data. So, an index scan is no win.

3: To test whether or not an index scan IS a win, you can use the enable_xxx settings to prove it to yourself:

set enable_seqscan = off;
explain analyze <your query here>;

and compare. Note that the enable_seqscan = off thing is a sledge hammer, not a nudge, and generally should NOT be used in production. If an index scan is generally a win for you, but the database isn't using it, you might need to tune the database for your machine. Note that you should NOT tune your database based on a single query. You'll need to reach a compromise on your settings that makes all your queries run reasonably fast without the planner making insane decisions. One of the better postgresql tuning docs out there is the one at: http://www.varlena.com/GeneralBits/Tidbits/perf.html . Good luck.
Re: [PERFORM] Query on postgresql 7.4.2 not using index
Arnau [EMAIL PROTECTED] writes:
> Seq Scan on agenda_users_groups (cost=0.00..53108.45 rows=339675 width=8) (actual time=916.903..5763.830 rows=367026 loops=1)
> Filter: (group_id = 9::numeric)
> Total runtime: 7259.861 ms
> espsm_moviltelevision=# select count(*) from agenda_users_groups;
> count: 2547556

So the SELECT is fetching nearly 15% of the rows in the table. The planner is doing *the right thing* to use a seqscan, at least for this particular group_id value.

regards, tom lane
Re: [PERFORM] Query on postgresql 7.4.2 not using index
Tom Lane wrote:
> Arnau [EMAIL PROTECTED] writes:
>> Seq Scan on agenda_users_groups (cost=0.00..53108.45 rows=339675 width=8) (actual time=916.903..5763.830 rows=367026 loops=1)
>> Filter: (group_id = 9::numeric)
>> Total runtime: 7259.861 ms
> So the SELECT is fetching nearly 15% of the rows in the table. The planner is doing *the right thing* to use a seqscan, at least for this particular group_id value.

I have done the same tests on 8.1.0.

espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = 9;
 Bitmap Heap Scan on agenda_users_groups (cost=2722.26..30341.78 rows=400361 width=8) (actual time=145.533..680.839 rows=367026 loops=1)
   Recheck Cond: (group_id = 9::numeric)
   -> Bitmap Index Scan on i_agnusrsgrs_groupid (cost=0.00..2722.26 rows=400361 width=0) (actual time=142.958..142.958 rows=367026 loops=1)
     Index Cond: (group_id = 9::numeric)
 Total runtime: 1004.966 ms
(5 rows)

espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = 9;
 Seq Scan on agenda_users_groups (cost=0.00..60947.43 rows=12777 width=8) (actual time=457.963..2244.928 rows=367026 loops=1)
   Filter: ((group_id)::bigint = 9)
 Total runtime: 2571.496 ms
(3 rows)

espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = '9';
 Seq Scan on agenda_users_groups (cost=0.00..60947.43 rows=12777 width=8) (actual time=407.193..2182.880 rows=367026 loops=1)
   Filter: ((group_id)::bigint = 9::bigint)
 Total runtime: 2506.998 ms
(3 rows)

espsm_moviltelevision=# select count(*) from agenda_users_groups;
  count
---------
 2555437
(1 row)

Postgresql now uses the index, and I don't understand why. On this server I tried to tune the configuration - is it because of the tuning, or because it's a newer version of postgresql?
Thanks for all the replies
-- Arnau
Re: [PERFORM] Query on postgresql 7.4.2 not using index
On Tue, 2006-04-25 at 10:47, Arnau wrote:
> Tom Lane wrote:
>> So the SELECT is fetching nearly 15% of the rows in the table. The planner is doing *the right thing* to use a seqscan, at least for this particular group_id value.
> I have done the same tests on 8.1.0.
> espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = 9;
> Bitmap Heap Scan on agenda_users_groups (cost=2722.26..30341.78 rows=400361 width=8) (actual time=145.533..680.839 rows=367026 loops=1)
>   Recheck Cond: (group_id = 9::numeric)
>   -> Bitmap Index Scan on i_agnusrsgrs_groupid (cost=0.00..2722.26 rows=400361 width=0) (actual time=142.958..142.958 rows=367026 loops=1)
>     Index Cond: (group_id = 9::numeric)
> Total runtime: 1004.966 ms

How big are these individual records? I'm guessing a fairly good size, since an index scan is winning.

> espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = 9;
> Seq Scan on agenda_users_groups (cost=0.00..60947.43 rows=12777 width=8) (actual time=457.963..2244.928 rows=367026 loops=1)
>   Filter: ((group_id)::bigint = 9)
> Total runtime: 2571.496 ms

OK. Stop and think about what you're telling postgresql to do here. You're telling it to cast the field group_id to int8, then compare it to 9. How can it cast the group_id to int8 without fetching it? That's right, you're ensuring a seq scan. You need to put the int8 cast on the other side of that equality comparison, like:

where group_id = 9::int8
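The point about which side of the comparison gets the cast can be summed up in two queries (an illustrative sketch using the schema from earlier in the thread):

```sql
-- Casting the COLUMN forces a per-row cast, which defeats the index on
-- group_id, so the planner has to read every row:
SELECT agenda_user_group_id FROM agenda_users_groups
WHERE group_id::int8 = 9;   -- seq scan

-- Casting the CONSTANT instead leaves the indexed column untouched, so
-- the planner is free to use the index (if selectivity warrants it):
SELECT agenda_user_group_id FROM agenda_users_groups
WHERE group_id = 9::int8;   -- index/bitmap scan possible
```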
Re: [PERFORM] Query on postgresql 7.4.2 not using index
Arnau [EMAIL PROTECTED] writes:
> I have done the same tests on 8.1.0.

Bitmap scans are a totally different animal that doesn't exist in 7.4. A plain indexscan, such as 7.4 knows about, is generally not effective for fetching more than a percent or two of the table. The crossover point for a bitmap scan is much higher (don't know exactly, but probably something like 30-50%).

regards, tom lane
Re: [PERFORM] Query on postgresql 7.4.2 not using index
> I have done the same tests on 8.1.0.
> espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = 9;
> Bitmap Heap Scan on agenda_users_groups (cost=2722.26..30341.78 rows=400361 width=8) (actual time=145.533..680.839 rows=367026 loops=1)
>   Recheck Cond: (group_id = 9::numeric)
>   -> Bitmap Index Scan on i_agnusrsgrs_groupid (cost=0.00..2722.26 rows=400361 width=0) (actual time=142.958..142.958 rows=367026 loops=1)
>     Index Cond: (group_id = 9::numeric)
> Total runtime: 1004.966 ms

>> How big are these individual records? I'm guessing a fairly good size, since an index scan is winning.

How could I know the size of an individual record?

>> OK. Stop and think about what you're telling postgresql to do here. You're telling it to cast the field group_id to int8, then compare it to 9. How can it cast the group_id to int8 without fetching it? That's right, you're ensuring a seq scan. You need to put the int8 cast on the other side of that equality comparison, like: where group_id = 9::int8

I just did what Chris Smith asked me to do :), here I paste the results I get when I change the cast.
espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = 9::int8;
 Bitmap Heap Scan on agenda_users_groups (cost=2722.33..30343.06 rows=400379 width=8) (actual time=147.723..714.473 rows=367026 loops=1)
   Recheck Cond: (group_id = 9::numeric)
   -> Bitmap Index Scan on i_agnusrsgrs_groupid (cost=0.00..2722.33 rows=400379 width=0) (actual time=145.015..145.015 rows=367026 loops=1)
     Index Cond: (group_id = 9::numeric)
 Total runtime: 1038.537 ms
(5 rows)

espsm_moviltelevision=# EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = '9'::int8;
 Bitmap Heap Scan on agenda_users_groups (cost=2722.33..30343.06 rows=400379 width=8) (actual time=153.858..1192.838 rows=367026 loops=1)
   Recheck Cond: (group_id = 9::numeric)
   -> Bitmap Index Scan on i_agnusrsgrs_groupid (cost=0.00..2722.33 rows=400379 width=0) (actual time=151.298..151.298 rows=367026 loops=1)
     Index Cond: (group_id = 9::numeric)
 Total runtime: 1527.039 ms
(5 rows)

Thanks
-- Arnau
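As an aside, one way to answer the "how big is an individual record" question (a sketch, not from the thread; pg_relation_size() and pg_column_size() exist from 8.1 on, while on 7.4 you would compute pg_class.relpages * 8192 instead):

```sql
-- Approximate average row width: on-disk table size divided by row count.
SELECT pg_relation_size('agenda_users_groups') / count(*) AS avg_row_bytes
FROM agenda_users_groups;

-- Or inspect the stored size of a single column's values:
SELECT avg(pg_column_size(group_id)) AS avg_group_id_bytes
FROM agenda_users_groups;
```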
[PERFORM] PL/pgSQL Loop Vs. Batch Update
Fellow PostgreSQLers,

This post is longish and has a bit of code, but here's my question up-front: Why are batch queries in my PL/pgSQL functions no faster than iterating over a loop and executing a series of queries for each iteration of the loop? Are batch queries or array or series generation slow in PL/pgSQL?

Now, for the details of my problem. For managing an ordered many-to-many relation between blog entries and tags, I created these tables:

CREATE TABLE entry (
    id SERIAL PRIMARY KEY,
    title text,
    content text
);

CREATE TABLE tag (
    id SERIAL PRIMARY KEY,
    name text
);

CREATE TABLE entry_coll_tag (
    entry_id integer REFERENCES entry(id) ON UPDATE CASCADE ON DELETE CASCADE,
    tag_id integer REFERENCES tag(id) ON UPDATE CASCADE ON DELETE CASCADE,
    ord smallint,
    PRIMARY KEY (entry_id, tag_id)
);

To make it easy to associate an entry with a bunch of tags in a single query, I wrote this PL/pgSQL function:

CREATE OR REPLACE FUNCTION entry_coll_tag_set (
    obj_id integer,
    coll_ids integer[]
) RETURNS VOID AS $$
DECLARE
    -- For checking to see if a tuple was updated.
    update_count smallint;
    -- For looping through the array.
    iloop integer := 1;
BEGIN
    -- Lock the containing object tuple to prevent inserts into the
    -- collection table.
    PERFORM true FROM entry WHERE id = obj_id FOR UPDATE;

    -- Update possible existing records with the current sequence so
    -- as to avoid unique constraint violations. We just set it to a
    -- negative number, since negative numbers are never used for ord.
    UPDATE entry_coll_tag
    SET    ord = -ord
    WHERE  entry_id = obj_id;

    -- Loop through the tag IDs to associate with the entry ID.
    WHILE coll_ids[iloop] IS NOT NULL LOOP
        -- Update possible existing collection record.
        UPDATE entry_coll_tag
        SET    ord = iloop
        WHERE  entry_id = obj_id AND tag_id = coll_ids[iloop];

        -- Only insert the item if nothing was updated.
        IF FOUND IS false THEN
            -- Insert a new record.
            INSERT INTO entry_coll_tag (entry_id, tag_id, ord)
            VALUES (obj_id, coll_ids[iloop], iloop);
        END IF;
        iloop := iloop + 1;
    END LOOP;

    -- Delete any remaining tuples.
    DELETE FROM entry_coll_tag
    WHERE  entry_id = obj_id AND ord < 0;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

Josh Berkus kindly reviewed it, and suggested that I take advantage of generate_series() and batch updates instead of looping through the results. Here's the revised version:

CREATE OR REPLACE FUNCTION entry_coll_tag_set (
    obj_id integer,
    coll_ids integer[]
) RETURNS VOID AS $$
BEGIN
    -- Lock the containing object tuple to prevent inserts into the
    -- collection table.
    PERFORM true FROM entry WHERE id = obj_id FOR UPDATE;

    -- First set all ords to a negative value to prevent unique index
    -- violations.
    UPDATE entry_coll_tag
    SET    ord = -ord
    WHERE  entry_id = obj_id;

    IF FOUND IS false THEN
        -- There are no existing tuples, so just insert the new ones.
        INSERT INTO entry_coll_tag (entry_id, tag_id, ord)
        SELECT obj_id, coll_ids[gs.ser], gs.ser
        FROM   generate_series(1, array_upper(coll_ids, 1)) AS gs(ser);
    ELSE
        -- First, update the existing tuples to new ord values.
        UPDATE entry_coll_tag
        SET    ord = ser
        FROM (
            SELECT gs.ser, coll_ids[gs.ser] AS move_tag
            FROM   generate_series(1, array_upper(coll_ids, 1)) AS gs(ser)
        ) AS expansion
        WHERE  move_tag = entry_coll_tag.tag_id
          AND  entry_id = obj_id;

        -- Now insert the new tuples.
        INSERT INTO entry_coll_tag (entry_id, tag_id, ord)
        SELECT obj_id, coll_ids[gs.ser], gs.ser
        FROM   generate_series(1, array_upper(coll_ids, 1)) AS gs(ser)
        WHERE  coll_ids[gs.ser] NOT IN (
            SELECT tag_id FROM entry_coll_tag ect2 WHERE entry_id = obj_id
        );

        -- Delete any remaining tuples.
        DELETE FROM entry_coll_tag
        WHERE  entry_id = obj_id AND ord < 0;
    END IF;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

Josh thought that the replacement of the loop with a couple of batch queries would be an order of magnitude faster. So I happily ran my benchmark comparing the two approaches.
The benchmark actually tests two different functions that had a similar refactoring, as well as two other functions that are quite simple. Before the tests, I inserted 100,000 entry records, and a random number of tag records (1-10 for each entry) and related
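The generate_series() trick used in the batch version above - expanding an array into (position, value) rows, in the days before unnest() WITH ORDINALITY - can be seen in isolation in this small standalone sketch:

```sql
-- Expand an array into rows carrying each element's ordinal position.
SELECT gs.ser AS ord,
       (ARRAY[30, 10, 20])[gs.ser] AS tag_id
FROM generate_series(1, array_upper(ARRAY[30, 10, 20], 1)) AS gs(ser);
--  ord | tag_id
-- -----+--------
--    1 |     30
--    2 |     10
--    3 |     20
```

This is the expansion that both the batch INSERT and the batch UPDATE in the revised function join against.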
[PERFORM] Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)
Hello,

The performance comparison saga of the last month continues (see list archive). After some time experimenting on Windows, the conclusion is clear: Windows is likely crap for databases other than MS-SQL. I guess that MS-SQL uses a lot of undocumented API calls, may run in kernel mode, ring 0, and a lot of dirty tricks to get some reasonable performance.

Then, I asked my colleague to send a new FB dump and a Pg dump to try at my desktop machine. This time, the database is somewhat bigger: around 20 million records. The timings are attached. I tried to follow the same query sequence on both files.

Both databases are much faster on linux than on windows, and the desktop machine is not dedicated and tuned (no scsi, no raid, many services enabled, ext3 fs, etc). At many of the queries, postgresql is faster, sometimes way MUCH faster. But Firebird has very good defaults out of the box, and a few of the queries are really a pain in Postgresql. Please see the abysmal timing differences at the last 2 queries, for example. They used 100% cpu, almost no disk activity, no iowait cpu, for a long time to complete. Maybe these queries bring into the light some instruction weaknesses, or bad tuning. Do you have some suggestions?

Regards.
Andre Felipe Machado http://www.techforce.com.br linux blog

queries_timing_firebird.txt.tar.gz (GNU Zip compressed data)
queries_timing_postgresql.txt.tar.gz (GNU Zip compressed data)
[PERFORM] planner not using index for like operator
For the query:

Select col1 from table1 Where col1 like '172.%'

The table has 133 million unique ip addresses. Col1 is indexed. The optimizer is using a sequential scan. This is the explain analyze output:

Seq Scan on table1 (cost=0.00..2529284.80 rows=1 width=15) (actual time=307591.339..565251.775 rows=524288 loops=1)
  Filter: ((col1)::text ~~ '172.%'::text)
Total runtime: 565501.873 ms

The number of affected rows (500K) is a small fraction of the total row count.
[PERFORM] Slow queries salad ;)
Here is a simple test case for this strange behaviour:

annonces=> CREATE TABLE test.current (id INTEGER PRIMARY KEY, description TEXT);
INFO: CREATE TABLE / PRIMARY KEY will create implicit index current_pkey for table current
CREATE TABLE
annonces=> CREATE TABLE test.archive (id INTEGER PRIMARY KEY, description TEXT);
INFO: CREATE TABLE / PRIMARY KEY will create implicit index archive_pkey for table archive
CREATE TABLE
annonces=> CREATE VIEW test.all AS SELECT * FROM test.archive UNION ALL SELECT * FROM test.current;
CREATE VIEW

let's populate...

annonces=> INSERT INTO test.current SELECT id, description FROM annonces;
INSERT 0 11524
annonces=> INSERT INTO test.archive SELECT id, description FROM archive_ext;
INSERT 0 437351
annonces=> ANALYZE test.archive;
ANALYZE
annonces=> ANALYZE test.current;
ANALYZE

This is the bookmarks table...

SELECT count(*), list_id FROM bookmarks GROUP BY list_id;
 count | list_id
-------+---------
    15 |       7
     5 |       5
   150 |       4
     3 |       3
    15 |       2
     2 |       1
     6 |       8

I want to list the stuff I bookmarked:

annonces=> EXPLAIN ANALYZE SELECT * FROM test.current WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4'));
 Hash Join (cost=6.58..532.84 rows=140 width=203) (actual time=0.747..5.283 rows=150 loops=1)
   Hash Cond: ("outer".id = "inner".annonce_id)
   -> Seq Scan on current (cost=0.00..467.24 rows=11524 width=203) (actual time=0.006..3.191 rows=11524 loops=1)
   -> Hash (cost=6.23..6.23 rows=140 width=4) (actual time=0.244..0.244 rows=150 loops=1)
     -> HashAggregate (cost=4.83..6.23 rows=140 width=4) (actual time=0.155..0.184 rows=150 loops=1)
       -> Seq Scan on bookmarks (cost=0.00..4.45 rows=150 width=4) (actual time=0.008..0.097 rows=150 loops=1)
         Filter: (list_id = 4)
 Total runtime: 5.343 ms
(8 rows)

annonces=> set enable_hashjoin TO 0;
SET
annonces=> EXPLAIN ANALYZE SELECT * FROM test.current WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4'));
 Nested Loop (cost=4.83..824.22 rows=140 width=203) (actual time=0.219..1.034 rows=150
loops=1)
   -> HashAggregate (cost=4.83..6.23 rows=140 width=4) (actual time=0.158..0.199 rows=150 loops=1)
     -> Seq Scan on bookmarks (cost=0.00..4.45 rows=150 width=4) (actual time=0.011..0.096 rows=150 loops=1)
       Filter: (list_id = 4)
   -> Index Scan using current_pkey on current (cost=0.00..5.83 rows=1 width=203) (actual time=0.005..0.005 rows=1 loops=150)
     Index Cond: (current.id = "outer".annonce_id)
 Total runtime: 1.108 ms
(7 rows)

Hm, the row estimates on the bookmarks table are spot on; why did it choose the hash join?

Now, if I want to access the all view which contains the union of the current and archive tables:

annonces=> set enable_hashjoin TO 1;
SET
annonces=> EXPLAIN ANALYZE SELECT * FROM test.all WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4'));
 Hash Join (cost=6.58..33484.41 rows=314397 width=36) (actual time=8300.484..8311.784 rows=150 loops=1)
   Hash Cond: ("outer"."?column1?" = "inner".annonce_id)
   -> Append (cost=0.00..23596.78 rows=449139 width=219) (actual time=6.390..8230.821 rows=448875 loops=1)
     -> Seq Scan on archive (cost=0.00..18638.15 rows=437615 width=219) (actual time=6.389..8175.491 rows=437351 loops=1)
     -> Seq Scan on current (cost=0.00..467.24 rows=11524 width=203) (actual time=0.022..8.985 rows=11524 loops=1)
   -> Hash (cost=6.23..6.23 rows=140 width=4) (actual time=0.255..0.255 rows=150 loops=1)
     -> HashAggregate (cost=4.83..6.23 rows=140 width=4) (actual time=0.168..0.197 rows=150 loops=1)
       -> Seq Scan on bookmarks (cost=0.00..4.45 rows=150 width=4) (actual time=0.015..0.102 rows=150 loops=1)
         Filter: (list_id = 4)
 Total runtime: 8311.870 ms
(10 rows)

annonces=> set enable_hashjoin TO 0;
SET
annonces=> EXPLAIN ANALYZE SELECT * FROM test.all WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4'));
 Merge Join (cost=79604.61..84994.98 rows=314397
Re: [PERFORM] planner not using index for like operator
If you are using a locale other than the C locale, you need to create the index with an operator class to get index scans with like. See here for details: http://www.postgresql.org/docs/8.1/interactive/indexes-opclass.html

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Sriram Dandapani
Sent: Tuesday, April 25, 2006 12:08 PM
To: Pgsql-Performance (E-mail)
Subject: [PERFORM] planner not using index for like operator

> [original message quoted above - snipped]
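The operator-class fix referred to here looks like the following. This is a sketch based on the linked documentation, using the table and column names from the question (which may differ from the real schema); for a varchar column the class is varchar_pattern_ops, for text it is text_pattern_ops.

```sql
-- With a non-C locale, a plain btree index cannot serve LIKE 'prefix%'.
-- An index built with a pattern_ops operator class can.
CREATE INDEX i_table1_col1_pattern ON table1 (col1 varchar_pattern_ops);

-- The prefix match can now use the index:
SELECT col1 FROM table1 WHERE col1 LIKE '172.%';
```

Note the pattern_ops index only helps LIKE/regex prefix matches; ordinary `<`, `>`, ORDER BY comparisons still need a regular index.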
[PERFORM] Large (8M) cache vs. dual-core CPUs
I've been given the task of making some hardware recommendations for the next round of server purchases. The machines to be purchased will be running FreeBSD and PostgreSQL.

Where I'm stuck is in deciding whether we want to go with dual-core Pentiums with 2M cache, or with HT Pentiums with 8M cache. Both of these are expensive bits of hardware, and I'm trying to gather as much evidence as possible before making a recommendation. The FreeBSD community seems pretty divided over which is likely to be better, and I have been unable to discover a method for estimating how much of the 2M cache on our existing systems is being used.

Does anyone in the PostgreSQL community have any experience with large caches or dual-core Pentiums that could make any recommendations? Our current Dell 2850 systems are CPU bound - i.e. they have enough RAM, and fast enough disks, that the CPUs seem to be the limiting factor. As a result, this decision on what kind of CPUs to get in the next round of servers is pretty important. Any advice is much appreciated.
-- Bill Moran Collaborative Fusion Inc.
Re: [PERFORM] Slow queries salad ;)
PFC [EMAIL PROTECTED] writes: The IN() is quite small (150 values), but the two large tables are seq-scanned... is there a way to avoid this ? Not in 8.1. HEAD is a bit smarter about joins to Append relations. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] planner not using index for like operator
On Tue, Apr 25, 2006 at 10:08:02AM -0700, Sriram Dandapani wrote: Here's the key: Filter: ((col1)::text ~~ '172.%'::text) In order to do a like comparison, it has to convert col1 (which I'm guessing is of type 'inet') to text, so there's no way it can use an index on col1 (maybe a function index, but that's a different story). Is there some reason you're not doing WHERE col1 <<= '172/8'::inet ? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] planner not using index for like operator
The col is a varchar. I am currently testing with the inet data type (and also the ipv4 pgfoundry data type). Due to time constraints, I am trying to minimize code changes. What kind of index do I need to create to enable efficient range scans (e.g. anything between 172.16.x.x thru 172.31.x.x) on the inet data type? Thanks Sriram -Original Message- From: Jim C. Nasby [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 25, 2006 11:25 AM To: Sriram Dandapani Cc: Pgsql-Performance (E-mail) Subject: Re: [PERFORM] planner not using index for like operator On Tue, Apr 25, 2006 at 10:08:02AM -0700, Sriram Dandapani wrote: Here's the key: Filter: ((col1)::text ~~ '172.%'::text) In order to do a like comparison, it has to convert col1 (which I'm guessing is of type 'inet') to text, so there's no way it can use an index on col1 (maybe a function index, but that's a different story). Is there some reason you're not doing WHERE col1 <<= '172/8'::inet ? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
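For the inet route being tested here, a minimal sketch (table and column names are invented for illustration; the explicit range form is the version-safe way to get 172.16.x.x through 172.31.x.x out of a plain btree index):

```sql
-- Hypothetical table with an inet column and an ordinary btree index.
CREATE TABLE hosts (col1 inet);
CREATE INDEX hosts_col1_idx ON hosts (col1);

-- 172.16.0.0/12 covers exactly 172.16.0.0 through 172.31.255.255.
SELECT * FROM hosts WHERE col1 <<= inet '172.16.0.0/12';

-- Equivalent explicit range; a btree index can use this directly,
-- which matters on older releases where <<= is not index-assisted.
SELECT * FROM hosts
 WHERE col1 >= inet '172.16.0.0'
   AND col1 <= inet '172.31.255.255';
```

Whether the containment operator itself uses the index varies by release, so comparing EXPLAIN output for both forms is worthwhile.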
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, 2006-04-25 at 13:14, Bill Moran wrote: I've been given the task of making some hardware recommendations for the next round of server purchases. The machines to be purchased will be running FreeBSD and PostgreSQL. Where I'm stuck is in deciding whether we want to go with dual-core pentiums with 2M cache, or with HT pentiums with 8M cache. Given a choice between those two processors, I'd choose the AMD 64 x 2 CPU. It's a significantly better processor than either of the Intel choices. And if you get the HT processor, you might as well turn off HT on a PostgreSQL machine. I've yet to see it make postgresql run faster, but I've certainly seen HT make it run slower. If you can't run AMD in your shop due to bigotry (let's call a spade a spade) then I'd recommend the real dual core CPU with 2M cache. Most of what makes a database slow is memory and disk bandwidth. Few datasets are gonna fit in that 8M cache, and when they do, they'll get flushed right out by the next request anyway. Does anyone in the PostgreSQL community have any experience with large caches or dual-core pentiums that could make any recommendations? Our current Dell 2850 systems are CPU bound - i.e. they have enough RAM, and fast enough disks that the CPUs seem to be the limiting factor. As a result, this decision on what kind of CPUs to get in the next round of servers is pretty important. If the CPUs are running at 100% then you're likely not memory I/O bound, but processing speed bound. The dual core will definitely be the better option in that case. I take it you work at a Dell Only place, hence no AMD for you... Sad, cause the AMD is, on a price / performance scale, twice the processor for the same money as the Intel. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, 25 Apr 2006 14:14:35 -0400 Bill Moran [EMAIL PROTECTED] wrote: Does anyone in the PostgreSQL community have any experience with large caches or dual-core pentiums that could make any recommendations? Heh :) You're in the position I was in about a year ago - we naturally replaced our old Dell 2650 with £14k of Dell 6850 Quad Xeon with 8M cache, and TBH the performance is woeful :/ Having gone through Postgres consultancy, been through IBM 8-way POWER4 hardware, discovered a bit of a shortcoming in PG on N-way hardware (where N is large) [1], I have been able to try out a dual-dual-core Opteron machine, and it flies. In fact, it flies so well that we ordered one that day. So, in short £3k's worth of dual-opteron beat the living daylights out of our Xeon monster. I can't praise the Opteron enough, and I've always been a firm Intel pedant - the HyperTransport stuff must really be doing wonders. (I typically see 500ms searches on it instead of 1000-2000ms on the Xeon.) As it stands, I've had to borrow this Opteron so much (and send live searches across the net to the remote box) because otherwise we simply don't have enough CPU power to run the website (!) Cheers, Gavin. [1] Simon Riggs + Tom Lane are currently involved in optimisation work for this - it turns out our extremely read-heavy load pattern reveals some buffer locking issues in PG. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Slow queries salad ;)
On Tue, Apr 25, 2006 at 07:53:15PM +0200, PFC wrote: What version is this?? annonces= EXPLAIN ANALYZE SELECT * FROM test.current WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4')); QUERY PLAN - Hash Join (cost=6.58..532.84 rows=140 width=203) (actual time=0.747..5.283 rows=150 loops=1) Hash Cond: (outer.id = inner.annonce_id) - Seq Scan on current (cost=0.00..467.24 rows=11524 width=203) (actual time=0.006..3.191 rows=11524 loops=1) - Hash (cost=6.23..6.23 rows=140 width=4) (actual time=0.244..0.244 rows=150 loops=1) - HashAggregate (cost=4.83..6.23 rows=140 width=4) (actual time=0.155..0.184 rows=150 loops=1) - Seq Scan on bookmarks (cost=0.00..4.45 rows=150 width=4) (actual time=0.008..0.097 rows=150 loops=1) Filter: (list_id = 4) Total runtime: 5.343 ms (8 lignes) annonces= set enable_hashjoin TO 0; SET annonces= EXPLAIN ANALYZE SELECT * FROM test.current WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4')); QUERY PLAN Nested Loop (cost=4.83..824.22 rows=140 width=203) (actual time=0.219..1.034 rows=150 loops=1) - HashAggregate (cost=4.83..6.23 rows=140 width=4) (actual time=0.158..0.199 rows=150 loops=1) - Seq Scan on bookmarks (cost=0.00..4.45 rows=150 width=4) (actual time=0.011..0.096 rows=150 loops=1) Filter: (list_id = 4) - Index Scan using current_pkey on current (cost=0.00..5.83 rows=1 width=203) (actual time=0.005..0.005 rows=1 loops=150) Index Cond: (current.id = outer.annonce_id) Total runtime: 1.108 ms (7 lignes) Hm, the row estimates on the bookmarks table are spot on ; why did it choose the hash join ? Because it thought it would be cheaper; see the estimates. Increasing effective_cache_size or decreasing random_page_cost would favor the index scan. 
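The knobs mentioned can be tried per-session before touching postgresql.conf; the values below are illustrative, not recommendations:

```sql
-- Make cached index access look cheaper to the planner, then re-test.
SET effective_cache_size = 50000;  -- in 8k pages; size this to the box's RAM
SET random_page_cost = 2;          -- default is 4; lower favors index scans
EXPLAIN ANALYZE SELECT * FROM test.current
 WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4'));
```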
Now, if I want to access the all view which contains the union of the current and archive table : annonces= set enable_hashjoin TO 1; SET annonces= EXPLAIN ANALYZE SELECT * FROM test.all WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4')); QUERY PLAN Hash Join (cost=6.58..33484.41 rows=314397 width=36) (actual time=8300.484..8311.784 rows=150 loops=1) Hash Cond: (outer.?column1? = inner.annonce_id) - Append (cost=0.00..23596.78 rows=449139 width=219) (actual time=6.390..8230.821 rows=448875 loops=1) - Seq Scan on archive (cost=0.00..18638.15 rows=437615 width=219) (actual time=6.389..8175.491 rows=437351 loops=1) - Seq Scan on current (cost=0.00..467.24 rows=11524 width=203) (actual time=0.022..8.985 rows=11524 loops=1) - Hash (cost=6.23..6.23 rows=140 width=4) (actual time=0.255..0.255 rows=150 loops=1) - HashAggregate (cost=4.83..6.23 rows=140 width=4) (actual time=0.168..0.197 rows=150 loops=1) - Seq Scan on bookmarks (cost=0.00..4.45 rows=150 width=4) (actual time=0.015..0.102 rows=150 loops=1) Filter: (list_id = 4) Total runtime: 8311.870 ms (10 lignes) annonces= set enable_hashjoin TO 0; SET annonces= EXPLAIN ANALYZE SELECT * FROM test.all WHERE id IN (SELECT annonce_id FROM bookmarks WHERE list_id IN ('4')); QUERY PLAN - Merge Join (cost=79604.61..84994.98 rows=314397 width=36) (actual time=6944.229..7109.371 rows=150 loops=1) Merge Cond: (outer.annonce_id = inner.id) - Sort (cost=11.22..11.57 rows=140 width=4) (actual time=0.326..0.355 rows=150 loops=1) Sort Key: bookmarks.annonce_id - HashAggregate (cost=4.83..6.23 rows=140 width=4) (actual time=0.187..0.218 rows=150 loops=1) - Seq Scan on bookmarks (cost=0.00..4.45 rows=150 width=4) (actual time=0.028..0.126 rows=150 loops=1) Filter: (list_id = 4) - Sort (cost=79593.40..80716.25 rows=449139 width=36) (actual time=6789.786..7014.815 rows=448625 loops=1) Sort Key: all.id - Append (cost=0.00..23596.78 rows=449139 width=219) (actual time=0.013..391.447 rows=448875 loops=1) - Seq Scan on archive (cost=0.00..18638.15 rows=437615
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, 2006-04-25 at 13:14, Bill Moran wrote: I've been given the task of making some hardware recommendations for the next round of server purchases. The machines to be purchased will be running FreeBSD PostgreSQL. Where I'm stuck is in deciding whether we want to go with dual-core pentiums with 2M cache, or with HT pentiums with 8M cache. BTW: For an interesting article on why the dual core Opterons are so much better than their Intel cousins, read this article: http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=1 Enlightening read. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: Sad, cause the AMD is, on a price / performance scale, twice the processor for the same money as the Intel. Maybe a year or two ago. Prices are all coming down. Intel more than AMD. AMD still seems better - but not X2, and it depends on the workload. X2 sounds like bigotry against Intel... :-) Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] | Neighbourhood Coder | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, 2006-04-25 at 13:38, [EMAIL PROTECTED] wrote: On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: Sad, cause the AMD is, on a price / performance scale, twice the processor for the same money as the Intel. Maybe a year or two ago. Prices are all coming down. Intel more than AMD. AMD still seems better - but not X2, and it depends on the workload. X2 sounds like bigotry against Intel... :-) Actually, that was from an article from this last month that compared the dual core Intel to the AMD. For every dollar spent on the Intel, you got about half the performance of the AMD. Not bigotry. Fact. But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
Bill Moran wrote: I've been given the task of making some hardware recommendations for the next round of server purchases. The machines to be purchased will be running FreeBSD PostgreSQL. Where I'm stuck is in deciding whether we want to go with dual-core pentiums with 2M cache, or with HT pentiums with 8M cache. Dual Core Opterons :) Joshua D. Drake Both of these are expensive bits of hardware, and I'm trying to gather as much evidence as possible before making a recommendation. The FreeBSD community seems pretty divided over which is likely to be better, and I have been unable to discover a method for estimating how much of the 2M cache on our existing systems is being used. Does anyone in the PostgreSQL community have any experience with large caches or dual-core pentiums that could make any recommendations? Our current Dell 2850 systems are CPU bound - i.e. they have enough RAM, and fast enough disks that the CPUs seem to be the limiting factor. As a result, this decision on what kind of CPUs to get in the next round of servers is pretty important. Any advice is much appreciated. -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] planner not using index for like operator
Using an index on col1 with the operator class varchar_pattern_ops, I was able to get a 3 second response time. That will work for me. I used a like '172.%' and an extra pattern matching condition to restrict between 172.16.x.x and 172.31.x.x Thanks for the input... I will also test the inet data type to see if there are differences. Sriram -Original Message- From: Jim C. Nasby [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 25, 2006 11:25 AM To: Sriram Dandapani Cc: Pgsql-Performance (E-mail) Subject: Re: [PERFORM] planner not using index for like operator On Tue, Apr 25, 2006 at 10:08:02AM -0700, Sriram Dandapani wrote: Here's the key: Filter: ((col1)::text ~~ '172.%'::text) In order to do a like comparison, it has to convert col1 (which I'm guessing is of type 'inet') to text, so there's no way it can use an index on col1 (maybe a function index, but that's a different story). Is there some reason you're not doing WHERE col1 <<= '172/8'::inet ? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
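A sketch of the approach described above, with hypothetical table/column names; split_part is used for the second-octet restriction so one-digit octets (e.g. 172.1.x.x) don't break the integer cast:

```sql
-- Index with varchar_pattern_ops so a left-anchored LIKE can use it.
CREATE INDEX hosts_col1_like_idx ON hosts (col1 varchar_pattern_ops);

-- Narrow '172.%' down to 172.16.x.x .. 172.31.x.x:
SELECT * FROM hosts
 WHERE col1 LIKE '172.%'
   AND split_part(col1, '.', 2)::int BETWEEN 16 AND 31;
```

The LIKE condition drives the index scan; the split_part condition is applied as a filter on the rows it returns.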
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back. To be more detailed... AMD Opteron has some specific technical advantages to their design over Intel when it comes to performing for a database. Specifically no front side bus :) Also it is widely known and documented (just review the archives) that AMD performs better than the equivalent Intel CPU, dollar for dollar. Lastly it is also known that Dell, frankly, sucks for PostgreSQL. Again, check the archives. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] ip address data type
* Sriram Dandapani: Does the inet data type offer comparison/search performance benefits over plain text for ip addresses.. Queries like WHERE host << '192.168.17.192/28'::inet can use an available index on the host column. In theory, you could do this with LIKE and strings, but this gets pretty messy and needs a lot of client-side logic. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
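As a minimal sketch of the containment query described here (table name is made up for illustration):

```sql
-- Hypothetical log table keyed on an inet column.
CREATE TABLE access_log (host inet);
CREATE INDEX access_log_host_idx ON access_log (host);

-- All addresses inside the /28, i.e. 192.168.17.192 .. 192.168.17.207:
SELECT * FROM access_log
 WHERE host << inet '192.168.17.192/28';
```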
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
Actually, that was from an article from this last month that compared the dual core intel to the amd. for every dollar spent on the intel, you got about half the performance of the amd. Not bigotry. fact. But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back. I've been doing plenty of performance evaluation on a parallel application we're developing here : on Dual Core Opterons, P4, P4D. I can say that the Opterons open up a can of wupass on the Intel processors. Almost 2x the performance on our application vs. what the SpecCPU numbers would suggest.
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
David Boreham wrote: Actually, that was from an article from this last month that compared the dual core intel to the amd. for every dollar spent on the intel, you got about half the performance of the amd. Not bigotry. fact. But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back. I've been doing plenty of performance evaluation on a parallel application we're developing here : on Dual Core Opterons, P4, P4D. I can say that the Opterons open up a can of wupass on the Intel processors. Almost 2x the performance on our application vs. what the SpecCPU numbers would suggest. Because Stone Cold Said So! -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
Joshua D. Drake wrote: David Boreham wrote: Actually, that was from an article from this last month that compared the dual core intel to the amd. for every dollar spent on the intel, you got about half the performance of the amd. Not bigotry. fact. But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back. I've been doing plenty of performance evaluation on a parallel application we're developing here : on Dual Core Opterons, P4, P4D. I can say that the Opterons open up a can of wupass on the Intel processors. Almost 2x the performance on our application vs. what the SpecCPU numbers would suggest. Because Stone Cold Said So! I'll believe someone who uses 'wupass' in a sentence any day! -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
As others have noted, the current price/performance sweet spot for DB servers is 2S 2C AMD CPUs. These CPUs are also the highest performing x86 compatible solution for pg. If you must go Intel for some reason, then wait until the new NGMA CPU's (Conroe, Merom, Woodcrest) come out and see how they bench on DB workloads. Preliminary benches on these chips look good, but I would not recommend making a purchase decision based on just preliminary benches of unreleased products. If you must buy soon, then the decision is clear cut from anything except possibly a political/religious standpoint. The NetBurst based Pentium and Xeon solutions are simply not worth the money spent or the PITA they will put you through compared to the AMD dual cores. The new Intel NGMA CPUs may be different, but all the pertinent evidence is not yet available. My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. Less money than the comparable Intel solution and _far_ more performance. ...and even if you do buy Intel, =DONT= buy Dell unless you like causing trouble for yourself. Bad experiences with Dell in general and their poor PERC RAID controllers in specific are all over this and other DB forums. Ron -Original Message- From: Bill Moran [EMAIL PROTECTED] Sent: Apr 25, 2006 2:14 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] Large (8M) cache vs. dual-core CPUs I've been given the task of making some hardware recommendations for the next round of server purchases. The machines to be purchased will be running FreeBSD and PostgreSQL. Where I'm stuck is in deciding whether we want to go with dual-core pentiums with 2M cache, or with HT pentiums with 8M cache. Both of these are expensive bits of hardware, and I'm trying to gather as much evidence as possible before making a recommendation. 
The FreeBSD community seems pretty divided over which is likely to be better, and I have been unable to discover a method for estimating how much of the 2M cache on our existing systems is being used. Does anyone in the PostgreSQL community have any experience with large caches or dual-core pentiums that could make any recommendations? Our current Dell 2850 systems are CPU bound - i.e. they have enough RAM, and fast enough disks that the CPUs seem to be the limiting factor. As a result, this decision on what kind of CPUs to get in the next round of servers is pretty important. Any advice is much appreciated. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. Right. We've been buying Tyan bare-bones boxes like this. It's better to go with bare-bones than building boxes from bare metal because the cooling issues are addressed correctly. Note that if you need a large number of machines, then Intel Core Duo may give the best overall price/performance because they're cheaper to run and cool. ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] slow deletes on pgsql 7.4
Hi all, we encounter issues when deleting from a table based on id (primary key). On certain 'id', it took forever to delete and the i/o is 100% busy. Table scenario has around 1400 entries. It is the parent of 3 other tables. Table public.scenario Column | Type | Modifiers -+---+ id | bigint | not null default nextval('scenario_seq'::text) name | character varying(50) | description | text | subscriber_id | bigint | organization_id | bigint | schedule_id | bigint | Indexes: scenario_pkey primary key, btree (id) org_ind_scenario_index btree (organization_id) sch_ind_scenario_index btree (schedule_id) sub_ind_scenario_index btree (subscriber_id) Check constraints: $3 CHECK (schedule_id = 0) $2 CHECK (organization_id = 0) $1 CHECK (subscriber_id = 0) Foreign-key constraints: 0_4774 FOREIGN KEY (schedule_id) REFERENCES schedule(id) ON DELETE CASCADE 0_4773 FOREIGN KEY (organization_id) REFERENCES organization(id) ON DELETE CASCADE 0_4772 FOREIGN KEY (subscriber_id) REFERENCES subscriber(id) ON DELETE CASCADE In all the child tables, the foreign key columns have the same data type and are indexed. When I do delete from scenario where id='1023', it takes less than 200 ms. But when I do delete from scenario where id='1099', it took forever (more than 10 minutes), so I decided to cancel it.
I can't do explain analyze, but here is the explain: MONSOON=# begin; BEGIN MONSOON=# explain delete from scenario where id='1099'; QUERY PLAN -- Index Scan using scenario_pkey on scenario (cost=0.00..3.14 rows=1 width=6) Index Cond: (id = 1099::bigint) (2 rows) MONSOON=# explain delete from scenario where id='1023'; QUERY PLAN -- Index Scan using scenario_pkey on scenario (cost=0.00..3.14 rows=1 width=6) Index Cond: (id = 1023::bigint) (2 rows) MONSOON=# explain analyze delete from scenario where id='1023'; QUERY PLAN Index Scan using scenario_pkey on scenario (cost=0.00..3.14 rows=1 width=6) (actual time=0.028..0.030 rows=1 loops=1) Index Cond: (id = 1023::bigint) Total runtime: 0.174 ms (3 rows) I have also tried increasing statistics on both parent and child tables to 100, vacuum analyze parent and all child tables. But still the same slowness. The o/s is Solaris 10, with fsync = true. Any ideas what's going on? Thanks in advance, J
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
I've had intermittent freeze and reboot and, worse, just plain freeze problems with the Core Duo's I've been testing. I have not been able to narrow it down so I do not know if it is a platform issue or a CPU issue. It appears to be HW, not SW, related since I have experienced the problem both under M$ and Linux 2.6 based OS's. I have not tested the Core Duo's under *BSD. Also, being that they are only 32b Core Duo's have limited utility for a present day DB server. Power and space critical applications where 64b is not required may be a reasonable place for them... ...if the present reliability problems I'm seeing go away. Ron -Original Message- From: David Boreham [EMAIL PROTECTED] Sent: Apr 25, 2006 5:15 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Large (8M) cache vs. dual-core CPUs My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. Right. We've been buying Tyan bare-bones boxes like this. It's better to go with bare-bones than building boxes from bare metal because the cooling issues are addressed correctly. Note that if you need a large number of machines, then Intel Core Duo may give the best overall price/performance because they're cheaper to run and cool. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] slow deletes on pgsql 7.4
I should also mention that select ... for update is fast: MONSOON=# begin;explain analyze select * from SCENARIO WHERE id = '1099' FOR UPDATE; BEGIN QUERY PLAN - Index Scan using scenario_pkey on scenario (cost=0.00..3.17 rows=1 width=64) (actual time=0.016..0.017 rows=1 loops=1) Index Cond: (id = 1099::bigint) Total runtime: 0.072 ms (3 rows) On 4/25/06, Junaili Lie [EMAIL PROTECTED] wrote: Hi all, we encounter issues when deleting from a table based on id (primary key). On certain 'id', it took forever to delete and the i/o is 100% busy. Table scenario has around 1400 entries. It is the parent of 3 other table. Table public.scenario Column | Type | Modifiers -+---+ id | bigint | not null default nextval('scenario_seq'::text) name | character varying(50) | description | text | subscriber_id | bigint | organization_id | bigint | schedule_id | bigint | Indexes: scenario_pkey primary key, btree (id) org_ind_scenario_index btree (organization_id) sch_ind_scenario_index btree (schedule_id) sub_ind_scenario_index btree (subscriber_id) Check constraints: $3 CHECK (schedule_id = 0) $2 CHECK (organization_id = 0) $1 CHECK (subscriber_id = 0) Foreign-key constraints: 0_4774 FOREIGN KEY (schedule_id) REFERENCES schedule(id) ON DELETE CASCADE 0_4773 FOREIGN KEY (organization_id) REFERENCES organization(id) ON DELETE CASCADE 0_4772 FOREIGN KEY (subscriber_id) REFERENCES subscriber(id) ON DELETE CASCADE In all the child tables, the foreign key has the same data type and are indexed. When I do delete from scenario where id='1023', it takes less than 200 ms. But when i do delete from scenario where id='1099', it took forever (more than 10 minutes that i decided to cancel it. 
I can't do explain analyze, but here is the explain: MONSOON=# begin; BEGIN MONSOON=# explain delete from scenario where id='1099'; QUERY PLAN -- Index Scan using scenario_pkey on scenario (cost=0.00..3.14 rows=1 width=6) Index Cond: (id = 1099::bigint) (2 rows) MONSOON=# explain delete from scenario where id='1023'; QUERY PLAN -- Index Scan using scenario_pkey on scenario (cost=0.00..3.14 rows=1 width=6) Index Cond: (id = 1023::bigint) (2 rows) MONSOON=# explain analyze delete from scenario where id='1023'; QUERY PLAN Index Scan using scenario_pkey on scenario (cost=0.00..3.14 rows=1 width=6) (actual time=0.028..0.030 rows=1 loops=1) Index Cond: (id = 1023::bigint) Total runtime: 0.174 ms (3 rows) I have also tried increasing statistics on both parent and child tables to 100, vacuum analyze parent and all child tables. But still the same slowness. The o/s is Solaris 10, with fsync = true. Any ideas what's going on? Thanks in advance, J
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
Ron Peacetree wrote: As others have noted, the current price/performance sweet spot for DB servers is 2S 2C AMD CPUs. These CPUs are also the highest performing x86 compatible solution for pg. If you must go Intel for some reason, then wait until the new NGMA CPU's (Conroe, Merom, Woodcrest) come out and see how they bench on DB workloads. Preliminary benches on these chips look good, but I would not recommend making a purchase decision based on just preliminary benches of unreleased products. If you must buy soon, then the decision is clear cut from anything except possinly a political/religious standpoint. The NetBurst based Pentium and Xeon solutions are simply not worth the money spent or the PITA they will put you through compared to the AMD dual cores. The new Intel NGMA CPUs may be different, but all the pertinent evidence is not yet available. My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. Less money than the comparable Intel solution and _far_ more performance. ...and even if you do buy Intel, =DONT= buy Dell unless you like causing trouble for yourself. Bad experiences with Dell in general and their poor PERC RAID controllers in specific are all over this and other DB forums. Ron To add to this... the HP DL 385 is a pretty nice dual core capable opteron box. Just don't buy the extra ram from HP (they like to charge entirely too much). Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] slow deletes on pgsql 7.4
Junaili Lie [EMAIL PROTECTED] writes: we encounter issues when deleting from a table based on id (primary key). On certain 'id', it took forever to delete and the i/o is 100% busy. Almost always, if delete is slow when selecting the same rows is fast, it's because you've got a trigger performance problem --- most commonly, there are foreign keys referencing this table from other tables and you don't have the referencing columns indexed. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
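Tom's diagnosis translates to something like the following sketch; the child table and column names here are placeholders, not taken from the original post, since only the parent side was shown:

```sql
-- Each table that references scenario(id) with ON DELETE CASCADE needs an
-- index on its referencing column; otherwise every parent-row delete
-- triggers a sequential scan of that child table.
CREATE INDEX child_a_scenario_idx ON child_a (scenario_id);
CREATE INDEX child_b_scenario_idx ON child_b (scenario_id);

-- Then re-check the slow case:
EXPLAIN ANALYZE DELETE FROM scenario WHERE id = '1099';
```

Note that EXPLAIN on the DELETE alone will not show the trigger cost; on 7.4 the per-row cascade time only shows up in the actual runtime.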
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: On Tue, 2006-04-25 at 13:14, Bill Moran wrote: I've been given the task of making some hardware recommendations for the next round of server purchases. The machines to be purchased will be running FreeBSD and PostgreSQL. Where I'm stuck is in deciding whether we want to go with dual-core pentiums with 2M cache, or with HT pentiums with 8M cache. Given a choice between those two processors, I'd choose the AMD 64 x 2 CPU. It's a significantly better processor than either of the Intel choices. And if you get the HT processor, you might as well turn off HT on a PostgreSQL machine. I've yet to see it make postgresql run faster, but I've certainly seen HT make it run slower. Actually, believe it or not, a coworker just saw HT double the performance of pgbench on his desktop machine. Granted, not really a representative test case, but it still blew my mind. This was with a database that fit in his 1G of memory, and running windows XP. Both cases were newly minted pgbench databases with a scale of 40. Testing was 40 connections and 100 transactions. With HT he saw 47.6 TPS, without it was 21.1. I actually had IT put a w2k3 server build on a HT box specifically so I could do more testing. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
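For reference, the test described above comes down to roughly the following commands (a scale-40 pgbench database is on the order of 600 MB, which is why it fits in the 1G of RAM mentioned; the database name "bench" is made up):

```
# Initialize a pgbench database at scale factor 40, then run
# 40 client connections x 100 transactions each:
pgbench -i -s 40 bench
pgbench -c 40 -t 100 bench
```

With so few transactions per client, results are noisy; repeating the run a few times before drawing conclusions would be prudent.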
Re: [PERFORM] Slow deletes in 8.1 when FKs are involved
I did double check for indexes on the referenced and referencing columns, and even though this database is restored and vacuum analyzed nightly the issue remains. Using explain analyze in postgresql 8.1, I was able to see where the problem lies. For performance reasons on our 7.4 server, we removed one of the 3 RI triggers for some constraints (the RI trigger that performs the SELECT ... FOR UPDATE to prevent modifications) and replaced it with a trigger to just prevent deletes on this data indefinitely (the data never gets deleted or updated in our app). This works great in postgresql 7.4 and nearly eliminated our performance issue, but when that database is restored to postgresql 8.1 one of the remaining two RI triggers does not perform well at all when you try to delete from that table (even though it's fine in postgresql 7.4). On the 8.1 server I dropped the remaining two RI triggers, and added the constraint back to recreate the 3 RI triggers. After that the delete performed fine. So it looks like the 7.4 RI triggers that carried over to the 8.1 server don't perform very well. I'm hoping that the SELECT ... FOR SHARE functionality in 8.1 will allow us to re-add our constraints and not suffer from the locking issues we had in postgresql 7.4. Will Reese -- http://blog.rezra.com On Apr 23, 2006, at 10:32 PM, Tom Lane wrote: Will Reese [EMAIL PROTECTED] writes: ... Both servers have identical postgresql.conf settings and were restored from the same 7.4 backup. Almost everything is faster on the 8.1 server (mostly due to hardware), except one thing...deletes from tables with many foreign keys pointing to them. I think it's unquestionable that you have a bad FK plan in use on the 8.1 server. Double check that you have suitable indexes on the referencing (not referenced) columns, that you've ANALYZEd all the tables involved, and that you've started a fresh psql session (remember the backend tends to cache FK plans for the life of the connection). 
It might help to EXPLAIN ANALYZE one of the slow deletes --- 8.1 will break out the time spent in FK triggers, which would let you see which one(s) are the culprit. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
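Will's remedy — making 8.1 regenerate its own RI triggers in place of the ones carried over from the 7.4 dump — looks roughly like this. Constraint, table, and column names are hypothetical:

```sql
BEGIN;
-- Drop the FK constraint restored from 7.4 (this removes its old
-- RI triggers), then recreate it so 8.1 builds fresh triggers.
ALTER TABLE line_items DROP CONSTRAINT line_items_order_fk;
ALTER TABLE line_items
    ADD CONSTRAINT line_items_order_fk
    FOREIGN KEY (order_id) REFERENCES orders (id);
COMMIT;
```

Note that re-adding a foreign key rechecks every existing row against the referenced table, so expect one slow pass per constraint at restore time.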
Re: [PERFORM] slow deletes on pgsql 7.4
hi, Thanks for the answer. I have double checked that all the foreign keys that are referencing id on scenario are indexed. I have even vacuum analyzed the scenario table and all the tables that reference this table. Something that is interesting is that it only happens for certain values. ie. delete from scenario where id='1023' is very fast, but delete from scenario where id='1099' is running forever. Any ideas? J On 4/25/06, Tom Lane [EMAIL PROTECTED] wrote: Junaili Lie [EMAIL PROTECTED] writes: we encounter issues when deleting from a table based on id (primary key). On certain 'id', it took forever to delete and the i/o is 100% busy. Almost always, if delete is slow when selecting the same rows is fast, it's because you've got a trigger performance problem --- most commonly, there are foreign keys referencing this table from other tables and you don't have the referencing columns indexed. regards, tom lane
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, Apr 25, 2006 at 01:42:31PM -0500, Scott Marlowe wrote: On Tue, 2006-04-25 at 13:38, [EMAIL PROTECTED] wrote: On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote: Sad, cause the AMD is, on a price/performance scale, twice the processor for the same money as the Intel. Maybe a year or two ago. Prices are all coming down. Intel more than AMD. AMD still seems better - but not 2X, and it depends on the workload. 2X sounds like bigotry against Intel... :-) Actually, that was from an article from this last month that compared the dual-core Intel to the AMD: for every dollar spent on the Intel, you got about half the performance of the AMD. Not bigotry. Fact. But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back. AMD Opteron vs Intel Xeon is different than AMD X2 vs Pentium D. For AMD X2 vs Pentium D - I have both - in a similar price range, and of similar speed. I choose to use the AMD X2 as my server, and the Pentium D as my Windows desktop. They're both quite fast. I made the choice I describe based on a lot of research. I was going to go both Intel, until I noticed that the Intel prices were dropping fast. 30% price cut in 2 months. AMD didn't drop at all during the same time. There are plenty of reasons to choose one over the other. Generally the AMD comes out on top. It is *not* 2X though. Anybody who claims this is being highly selective about which benchmarks they consider. One article is nothing. There is a lot of hype these days. AMD is winning the elite market, which means that they are able to continue to sell high. Intel, losing this market, is cutting its prices to compete. And they do compete. Quite well. Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] | Neighbourhood Coder | Ottawa, Ontario, Canada | One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] slow deletes on pgsql 7.4
Junaili Lie [EMAIL PROTECTED] writes: ie. delete from scenario where id='1023' is very fast, but delete from scenario where id='1099' is running forever. What does EXPLAIN show for each of those cases? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
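The comparison Tom is asking for could be gathered like this; note that EXPLAIN ANALYZE (as opposed to plain EXPLAIN) actually executes the DELETE, so wrap it in a transaction you roll back:

```sql
BEGIN;
EXPLAIN ANALYZE DELETE FROM scenario WHERE id = '1023';  -- the fast case
ROLLBACK;

BEGIN;
EXPLAIN ANALYZE DELETE FROM scenario WHERE id = '1099';  -- the slow case
ROLLBACK;
```

On 7.4 the time spent in FK triggers is not broken out separately (that arrived in 8.1), but a difference in plan shape or total runtime between the two ids is still informative.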
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
On Tue, Apr 25, 2006 at 08:54:40PM -0400, [EMAIL PROTECTED] wrote: I made the choice I describe based on a lot of research. I was going to go both Intel, until I noticed that the Intel prices were dropping fast. 30% price cut in 2 months. AMD didn't drop at all during the same time. Errr.. big mistake. That was going to be - I was going to go both AMD. There are plenty of reasons to choose one over the other. Generally the AMD comes out on top. It is *not* 2X though. Anybody who claims this is being highly selective about which benchmarks they consider. I have an Intel Pentium D 920, and an AMD X2 3800+. These are very close in performance. The retail price difference is: the Intel Pentium D 920 is selling for $310 CDN; the AMD X2 3800+ is selling for $347 CDN. Another benefit of the Pentium D over the AMD X2, at least until AMD chooses to switch, is that the Pentium D supports DDR2, whereas AMD only supports DDR. There are a lot of technical pros and cons to each - with claims from AMD that DDR2 can be slower than DDR - but one claim that isn't often made, but that helped me make my choice: 1) DDR2 supports higher transfer speeds. I'm using DDR2 5400 on the Intel. I think I'm at 3200 or so on the AMD X2. 2) DDR2 is cheaper. I purchased 1 Gbyte of DDR2 5400 for $147 CDN. 1 Gbyte of DDR 3200 starts at around the same price, and stretches into $200 - $300 CDN. Now, granted, the Intel 920 requires more electricity to run. Running 24/7 for a year might make the difference in cost. It doesn't address point 1) though. I like my DDR2 5400. So, unfortunately, I won't be able to do a good test for you to prove that my Windows Pentium D box is not only cheaper to buy, but faster, because the specs aren't exactly equivalent. In the mean time, I'm quite enjoying my 3d games while doing other things at the same time. I imagine my desktop load approaches that of a CPU-bound database load. 3d games require significant I/O and CPU. 
Anybody who claims that Intel is 2X more expensive for the same performance isn't considering all factors. No question at all - the Opteron is good, and the Xeon isn't - but the original poster didn't ask about Opteron or Xeon, did he? For the desktop lines - X2 is not double Pentium D. Maybe 10%. Maybe not at all. Especially now that Intel is dropping its prices due to overstock. Cheers, mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] http://mark.mielke.cc/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PL/pgSQL Loop Vs. Batch Update
David Wheeler [EMAIL PROTECTED] writes: This post is longish and has a bit of code, but here's my question up- front: Why are batch queries in my PL/pgSQL functions no faster than iterating over a loop and executing a series of queries for each iteration of the loop? You'd really have to look at the plans generated for each of the commands in the functions to be sure. A knee-jerk reaction is to suggest that that NOT IN might be the core of the problem, but it's only a guess. It's a bit tricky to examine the behavior of a parameterized query, which is what these will all be since they depend on local variables of the plpgsql function (which are passed as parameters to the main SQL executor). The basic idea is PREPARE foo(datatype, datatype, ...) AS SELECT ... $1 ... $2 ... EXPLAIN ANALYZE EXECUTE foo(value, value) regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
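A concrete instance of Tom's pattern, shaped like the query under discussion; the table and column names are hypothetical stand-ins for David's schema:

```sql
-- Replace the plpgsql variables with numbered parameters, then
-- profile the statement with representative values.
PREPARE tag_cleanup (integer, integer[]) AS
    DELETE FROM entry_coll_tag
    WHERE entry_id = $1
      AND tag_id <> ALL ($2);   -- stands in for the NOT IN test

EXPLAIN ANALYZE EXECUTE tag_cleanup(1, ARRAY[61, 62, 63, 64, 65, 66, 67]);
DEALLOCATE tag_cleanup;
```

The point of PREPARE here is that the planner sees $1 and $2 as parameters of unknown value, exactly as it does inside the plpgsql function, so the plan you examine is the plan the function actually runs.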
Re: [PERFORM] Query on postgresql 7.4.2 not using index
OK. Stop and think about what you're telling postgresql to do here. You're telling it to cast the field group_id to int8, then compare it to 9. How can it cast the group_id to int8 without fetching it? That's right, you're ensuring a seq scan. You need to put the int8 cast on the other side of that equality comparison, like: Yeh, that one was my fault :) I couldn't remember which way it went and if 7.4.x had issues with int8 indexes. -- Postgresql php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
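The quoted advice is cut off after "like:". The usual pre-8.0 idiom being referred to — a reconstruction, since the original snippet is missing here, with a hypothetical table name — casts the constant instead of the column:

```sql
-- Casting the constant to match the int8 column lets 7.4 use the index:
SELECT * FROM some_table WHERE group_id = 9::int8;

-- Quoting the literal works too, because an untyped string constant
-- takes on the column's type:
SELECT * FROM some_table WHERE group_id = '9';

-- Whereas CAST(group_id AS int8) = 9 applies a function to the column,
-- which defeats the index and forces a sequential scan.
```

From 8.0 onward the planner handles cross-type comparisons like int8col = 9 on its own, so the workaround is only needed on 7.4.x and earlier.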
Re: [PERFORM] Firebird 1.5.3 X Postgresql 8.1.3 (linux and windows)
andremachado [EMAIL PROTECTED] writes: After some time experimenting on windows, the conclusion is clear: windows is likely crap for databases other than MS-SQL. Maybe. One thing that comes to mind is that you really should do some performance tuning experiments. In particular it'd be a good idea to increase checkpoint_segments and try other settings for wal_sync_method. Your fifth query, bddnf=# explain analyze update NOTA_FISCAL set VA_TOTAL_ITENSDNF = (select sum(ITEM_NOTA.VA_TOTAL) from ITEM_NOTA where ITEM_NOTA.ID_NF = NOTA_FISCAL.ID_NF) where ID_NF in (select NF2.ID_NF from DECLARACAO DE inner join CADASTRO CAD on (CAD.ID_DECLARACAO=DE.ID_DECLARACAO) inner join NOTA_FISCAL NF2 on (NF2.ID_CADASTRO=CAD.ID_CADASTRO) where DE.ID_ARQUIVO in (1) ); shows runtime of the plan proper as 158 seconds but total runtime as 746 seconds --- the discrepancy has to be associated with writing out the updated rows, which there are a lot of (719746) in this query, but still we should be able to do it faster than that. So I surmise a bottleneck in pushing WAL updates to disk. The last two queries are interesting. Does Firebird have any equivalent of EXPLAIN, ie a way to see what sort of query plan they are using? I suspect they are being more aggressive about optimizing the max() functions in the sub-selects than we are. In particular, the 8.1 code for optimizing min/max just punts if it sees any sub-selects in the WHERE clause, which prevents us from doing anything with these examples. /* * Also reject cases with subplans or volatile functions in WHERE. This * may be overly paranoid, but it's not entirely clear if the * transformation is safe then. */ if (contain_subplans(parse->jointree->quals) || contain_volatile_functions(parse->jointree->quals)) return NULL; This is something I'd wanted to go back and look at more carefully, but never got around to. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
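The tuning Tom suggests lives in postgresql.conf. The values below are illustrative starting points for an 8.1-era write-heavy workload, not figures from the thread; benchmark each wal_sync_method the documentation lists for your platform:

```
# postgresql.conf (8.1) - illustrative values, not recommendations
checkpoint_segments = 32       # default is 3; raise for bulk UPDATE/load work
wal_sync_method = fdatasync    # try the other methods listed in the docs,
                               # especially on Windows, and compare
```

Both settings can be applied with a configuration reload; raising checkpoint_segments spreads checkpoints out so a 700k-row UPDATE isn't repeatedly stalled by checkpoint I/O.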
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
[EMAIL PROTECTED] wrote: Another benefit of Pentium D over AMD X2, at least until AMD chooses to switch, is that Pentium D supports DDR2, whereas AMD only supports DDR. There are a lot of technical pros and cons to each - with claims from AMD that DDR2 can be slower than DDR - but one claim that isn't often made, but that helped me make my choice: They're switching quite soon though -- within the next month now it seems, after moving up their earlier plans to launch in June: http://www.dailytech.com/article.aspx?newsid=1854 This Anandtech article shows the kind of performance increase we can expect with DDR2 on AMD's new socket: http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2741 The short version is that it's an improvement, but not an enormous one, and you need to spend quite a bit of cash on 800Mhz (PC6400) DDR2 sticks to see the most benefit. Some brief local (Australian) price comparisons show 1GB PC-3200 DDR sticks starting at just over AU$100, with 1GB PC2-4200 DDR2 sticks around the same price, though Anandtech's tests showed PC2-4200 DDR2 benching generally slower than PC-3200 DDR, probably due to the increased latency in DDR2. Comparing reasonable quality matched pairs of 1GB sticks, PC-3200 DDR still seems generally cheaper than PC2-5300 DDR2, though not by a lot, and I'm sure the DDR2 will start dropping even further as AMD systems start using it in the next month or so. One thing's for sure though -- Intel's Pentium D prices are remarkably low, and at the lower end of the price range AMD has nothing that's even remotely competitive in terms of price/performance. The Pentium D 805, for instance, with its dual 2.67Ghz cores, costs just AU$180. The X2 3800+ is a far better chip, but it's also two-and-a-half times the price. 
None of this really matters much in the server space though, where Opteron's real advantage over Xeon is not its greater raw CPU power, or its better dual-core implementation (though both would be hard to dispute), but the improved system bandwidth provided by Hypertransport. Even with Intel's next-gen CPUs, which look set to address the first two points quite well, they still won't have an interconnect technology that can really compete with AMD's. Thanks Leigh ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PL/pgSQL Loop Vs. Batch Update
On Apr 25, 2006, at 18:19, Tom Lane wrote: You'd really have to look at the plans generated for each of the commands in the functions to be sure. A knee-jerk reaction is to suggest that that NOT IN might be the core of the problem, but it's only a guess. Well, the rows are indexed (I forgot to include the indexes in my first post), and given that each entry_id has no more than ten associated tag_ids, I would expect it to be quite fast, relying on the primary key index to look up the entry_id first, and then the associated tag_ids. But that's just a guess on my part, too. Perhaps I should try a left outer join with tag_id IS NULL? It's a bit tricky to examine the behavior of a parameterized query, which is what these will all be since they depend on local variables of the plpgsql function (which are passed as parameters to the main SQL executor). Right, that makes sense. The basic idea is PREPARE foo(datatype, datatype, ...) AS SELECT ... $1 ... $2 ... EXPLAIN ANALYZE EXECUTE foo(value, value) Just on a lark, I tried to get this to work:

try=# explain analyze EXECUTE foo(1, ARRAY[61,62,63,64,65,66,67]);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=26.241..26.251 rows=1 loops=1)
 Total runtime: 27.512 ms
(2 rows)

That's not much use. Is there no way to EXPLAIN ANALYZE this stuff? Thanks Tom. Best, David ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PL/pgSQL Loop Vs. Batch Update
David Wheeler [EMAIL PROTECTED] writes: Just on a lark, I tried to get this to work:

try=# explain analyze EXECUTE foo(1, ARRAY[61,62,63,64,65,66,67]);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=26.241..26.251 rows=1 loops=1)
 Total runtime: 27.512 ms
(2 rows)

That's not much use. It looks like you had something trivial as the definition of foo(). Try one of the actual queries from the plpgsql function. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
Another benefit of Pentium D over AMD X2, at least until AMD chooses to switch, is that Pentium D supports DDR2, whereas AMD only supports DDR. There are a lot of technical pros and cons to each - with claims from AMD that DDR2 can be slower than DDR - but one claim that isn't often made, but that helped me make my choice: 1) DDR2 supports higher transfer speeds. I'm using DDR2 5400 on the Intel. I think I'm at 3200 or so on the AMD X2. 2) DDR2 is cheaper. I purchased 1 Gbyte DDR2 5400 for $147 CDN. 1 Gbyte of DDR 3200 starts at around the same price, and stretches into $200 - $300 CDN. There's a logical fallacy here that needs to be noted. THROUGHPUT is better with DDR2 if and only if there is enough data to be fetched in a serial fashion from memory. LATENCY however is dependent on the base clock rate of the RAM involved. So PC3200, 200MHz x2, is going to actually perform better than PC2-5400, 166MHz x4, for almost any memory access pattern except those that are highly sequential. In fact, even PC2-6400, 200MHz x4, has a disadvantage compared to 200MHz x2 memory. The minimum latency of the two types of memory in clock cycles is always going to be higher for the memory type that multiplies its base clock rate by the most. For the mostly random memory access patterns that comprise many DB applications, the base latency of the RAM involved is going to matter more than the peak throughput AKA the bandwidth of that RAM. The big message here is that despite engineering tricks and marketing claims, the base clock rate of the RAM you use matters. A minor point to be noted in addition here is that most DB servers under load are limited by their physical IO subsystem, their HDs, and not the speed of their RAM. All of the above comments about the relative performance of different RAM types become insignificant when performance is gated by the HD subsystem. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
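The latency argument above can be sanity-checked with simple arithmetic; the CAS cycle counts below are assumptions for illustration, not figures from the thread:

```sql
-- Base-clock cycle time in ns, plus a hypothetical first-word CAS latency
-- (3 cycles assumed for DDR, 4 for DDR2):
SELECT round(1000.0 / 200, 1)     AS ddr_cycle_ns,   -- 5.0  (PC-3200, 200 MHz base)
       round(1000.0 / 166, 1)     AS ddr2_cycle_ns,  -- 6.0  (PC2-5300, 166 MHz base)
       round(3 * 1000.0 / 200, 1) AS ddr_cas3_ns,    -- 15.0
       round(4 * 1000.0 / 166, 1) AS ddr2_cas4_ns;   -- 24.1
```

Under these assumed figures the DDR2 part trails on first-word latency even though its peak bandwidth is higher, which is exactly the trade-off described above for random-access database workloads.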
Re: [PERFORM] PL/pgSQL Loop Vs. Batch Update
On Apr 25, 2006, at 19:36, Tom Lane wrote: It looks like you had something trivial as the definition of foo(). Yeah, the function call. :-) Try one of the actual queries from the plpgsql function. Oh. Duh. Will do. Tomorrow. Best, David ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq