Re: [PERFORM] Millions of tables
Greg, sorry for the resend: I had forgotten to include the list.

On Wed, Oct 5, 2016 at 2:34 PM, Greg Spiegelberg wrote:
> Data is not static. The 4M tables fall into one of two groups.
>
> Group A contains 2M tables. INSERT will occur ~100 times/day and maximum
> number of records anticipated will be 200k. Periodic DELETEs will occur
> removing "old" records. Age is something the client sets and I have no
> way of saying 1 or 10k records will be removed.

The ~100 times/day are per table, I assume. Also, I assume the DELETEs will delete batches (because the time criterion catches several records).

> Group B contains the other 2M tables. Maximum records ~140k and UPSERT
> will be the only mechanism used to populate and maintain. Periodic
> DELETEs may run on these tables as well removing "old" records.

So there will be inserts and updates. Either I missed it or you did not mention the criteria for placing a record in one of the 4M buckets. Can you shed light on what the criteria are? That would obviously suggest what indexing could be done.

It would also be interesting to see the results of your tests with btree on really large tables, as Stephen had suggested. I know it is not the primary test you want to do, but I would rather first explore a "traditional" schema (see the sketch below) before venturing into the unknown of the multi million dollar, pardon, table schema.

Kind regards
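For illustration, a minimal sketch of such a "traditional" schema - all names are hypothetical and this is only meant to make the alternative concrete:

  -- One table instead of millions; bucket_id identifies what would
  -- otherwise be a separate table.
  CREATE TABLE measurements (
    bucket_id  bigint      NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    payload    text
  );

  -- Composite btree so per-bucket access and age-based DELETEs stay indexed.
  CREATE INDEX measurements_bucket_created_idx
    ON measurements (bucket_id, created_at);

  -- Age-based cleanup for one bucket, analogous to the per-table DELETEs:
  DELETE FROM measurements
   WHERE bucket_id = 42
     AND created_at < now() - interval '30 days';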
Re: [PERFORM] Seeing execution plan of foreign key constraint check?
On Fri, Jul 22, 2016 at 12:14 AM, Jim Nasby wrote:
> On 7/21/16 4:59 PM, Tom Lane wrote:
>>> As for function plans, ISTM that could be added to the PL handlers if
>>> we wanted to (allow a function invocation to return an array of
>>> explain outputs).
>>
>> Where would you put those, particularly for functions executed many
>> times in the query? Would it include sub-functions recursively?
>> I mean, yeah, in principle we could do something roughly like that,
>> but it's not easy and presenting the results intelligibly seems
>> almost impossible.
>
> Yeah, it'd certainly need to be handled internally in a
> machine-understandable form that got aggregated before presentation (or
> with non-text output formats we could provide the raw data). Or just
> punt and don't capture the data unless you're using an alternative
> output format.

I'd imagine the output to simply list all "recursive" execution plans that were executed, probably along with indicators for how much IO and/or CPU they were responsible for. The "recursive" plans could also be sorted in decreasing order of total (i.e. across all individual invocations) time spent, so you see the most impacting plan first. All of that would lose the display of calling relationships, at the advantage of a simpler presentation. I think the relationship of which statement / function invoked which other one could be determined by looking at the statements / functions. And I guess often it will be apparent from the names already.

I am wondering what to do if the same statement has multiple execution plans, if that is possible in such a scenario. Present all the plans or just the one with the highest impact? Show them next to each other so the user is immediately aware that all these plans originated from the same piece of SQL?

Kind regards

robert
[PERFORM] Seeing execution plan of foreign key constraint check?
Hi,

I was wondering whether there are any plans to include the plan of the FK check in EXPLAIN output. Or is there a different way to get to see all the plans of triggers as well as of the main SQL?

When researching I found this thread from 2011, and the output format does not seem to have changed since then:
https://www.postgresql.org/message-id/flat/3798971.mRNc5JcYXj%40moltowork#3798971.mRNc5JcYXj@moltowork

Kind regards

robert
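One partial workaround I am aware of (hedged - behavior may vary across versions): the contrib module auto_explain can log the plans of statements executed inside functions, and plain EXPLAIN ANALYZE at least reports the total time spent in each FK trigger. A sketch:

  LOAD 'auto_explain';
  SET auto_explain.log_min_duration = 0;        -- log every plan
  SET auto_explain.log_analyze = on;
  SET auto_explain.log_nested_statements = on;  -- include statements run inside functions

  -- Plans of nested statements now go to the server log; whether the
  -- internal FK check queries show up there is an assumption to verify.
  INSERT INTO child_table VALUES (1, 42);       -- hypothetical table with an FK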
Re: [PERFORM] Database transaction with intermittent slow responses
On Sat, May 14, 2016 at 1:11 AM, Gerardo Herzig wrote:
> Oh, so *all* the transactions are being slowed down at that point...
> What about CPU IO wait% at that moment? Could be some other processes
> stressing the system out?

Or the database has just grown past the size where disk caching is efficient. Usually these are nonlinear processes, i.e. it works well until a certain moment and then the cache hit ratio decreases dramatically because all of a sudden content starts heavily competing for space in the cache.

> Now im thinking about hard disk issues... maybe some "smart" messages?
>
> Have some other hardware to give it a try?

Can we please see the full DDL of the table and indexes? The only additional idea I can throw in at the moment is looking at a combined index on (primarykeyofchange, category) - maybe replacing changehistory_idx4 - but that of course depends on the other SQL used against that table.

Kind regards

robert
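As a quick check of that theory (pg_stat_database is standard; the interpretation threshold is mine): a rough buffer cache hit ratio can be computed like this, and a value that used to be near 1.0 and now degrades would support the "grown past cache size" explanation.

  SELECT datname,
         blks_hit,
         blks_read,
         round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4) AS hit_ratio
  FROM pg_stat_database
  WHERE datname = current_database();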
Re: [PERFORM] LIKE pattern
On Mon, May 9, 2016 at 11:41 PM, SoDupuDupu wrote:
> Владимир-3 wrote
>> It seems my quite complex query runs 10 times faster on "some_column
>> LIKE '%test_1' " vs "some_column LIKE 'test_1' "
>> So I just add "%" to the pattern...
>
> Keep in mind that LIKE '%test_1' and LIKE 'test_1' are not equivalent:
> using the % as a prefix to the argument means that the scan only has to
> confirm that the value ends in 'test_1', whereas forgoing the % entirely
> means that you are essentially saying some_column='test_1'.

Yes, but wouldn't the latter test usually be more efficient, since it tests against a prefix - at least with a regular index?

Kind regards

robert
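For reference - this is my assumption about the setup, not taken from the thread: an anchored pattern can use a btree index (under a non-C locale via the text_pattern_ops operator class), while a pattern with a leading wildcard needs something like pg_trgm:

  -- Anchored patterns (equality / prefix, e.g. LIKE 'test_1%'):
  CREATE INDEX t_some_column_pattern_idx
    ON t (some_column text_pattern_ops);  -- hypothetical table t

  -- Leading-wildcard patterns such as LIKE '%test_1':
  CREATE EXTENSION IF NOT EXISTS pg_trgm;
  CREATE INDEX t_some_column_trgm_idx
    ON t USING gin (some_column gin_trgm_ops);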
Re: [PERFORM] Index Scan Backward Slow
On 01.05.2015 13:06, David Osborne wrote:
> Simple... that did it... thanks!
>
> dev=> create index on table(code,row_id);
> CREATE INDEX
> Time: 38088.482 ms
> dev=> explain (analyse,buffers) select row_id as last_row_id from table
> where code='XX' order by row_id desc limit 1;

Just out of curiosity: Is there a particular reason why you do not use

  select max(row_id) as last_row_id
  from table
  where code='XX'

?

Kind regards

robert
Re: [PERFORM] Why don't use index on x when ORDER BY x, y?
On Mon, Nov 24, 2014 at 12:02 PM, Vlad Arkhipov wrote:
> Hello,
>
> I wonder why Postgres does not use the index in the query below? It is a
> quite common use-case when you want to sort records by an arbitrary set
> of columns but do not want to create a lot of compound indexes for all
> possible combinations of them. It seems that if, for instance, your
> query's ORDER BY is x, y, z then any of these indexes could be used to
> improve the performance: (x); (x, y); (x, y, z).
>
> create temp table t as
> select s as x, s % 10 as y, s % 100 as z
> from generate_series(1, 1000000) s;
> analyze t;
> create index idx1 on t (y);
> select * from t order by y desc, x limit 10;

For the record, here's the plan:

rklemme=> explain analyze
rklemme-> select *
rklemme-> from t
rklemme-> order by y desc, x
rklemme-> limit 10;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=36511.64..36511.67 rows=10 width=12) (actual time=4058.863..4058.917 rows=10 loops=1)
   ->  Sort  (cost=36511.64..39011.64 rows=1000000 width=12) (actual time=4058.849..4058.868 rows=10 loops=1)
         Sort Key: y, x
         Sort Method: top-N heapsort  Memory: 17kB
         ->  Seq Scan on t  (cost=0.00..14902.00 rows=1000000 width=12) (actual time=0.031..2025.639 rows=1000000 loops=1)
 Total runtime: 4058.992 ms
(6 rows)

Your index does not cover y AND x. If you remove "order by x" the index will be used:

rklemme=> explain analyze select * from t order by y desc limit 10;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..0.86 rows=10 width=12) (actual time=0.081..0.284 rows=10 loops=1)
   ->  Index Scan Backward using idx1 on t  (cost=0.42..43268.33 rows=1000000 width=12) (actual time=0.066..0.125 rows=10 loops=1)
 Total runtime: 0.403 ms
(3 rows)

Now, with a different index:

rklemme=> create index idx2 on t (y desc, x);
CREATE INDEX
rklemme=> explain analyze select * from t order by y desc, x limit 10;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..0.88 rows=10 width=12) (actual time=0.127..0.290 rows=10 loops=1)
   ->  Index Scan using idx2 on t  (cost=0.42..45514.12 rows=1000000 width=12) (actual time=0.112..0.165 rows=10 loops=1)
 Total runtime: 0.404 ms
(3 rows)

rklemme=> select version();
                                            version
------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.5 on i686-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 32-bit
(1 row)

Kind regards
Re: [PERFORM] Postgres slave not catching up (on 9.2)
On Sat, Nov 8, 2014 at 2:11 PM, Ruben Domingo Gaspar Aparicio wrote:
> The slave (I don't have control on the master) is using 2 NFS file
> systems, one for WALs and another one for the data, on Netapp
> controllers:
>
> dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)
> dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)

You should use noatime to avoid unnecessary IO.

> The master produces quite a lot of WALs. This is what I get on the slave
> (number of WAL files, date-hour, total size in MB), so per day it is
> more than 400GB:
>
> I tried to play with how the IO is handled, making it less strict by
> setting synchronous_commit and fsync to off, with not much success. I
> have also done a second test increasing shared_buffers from 12GB to 24GB
> (we are running on a 48GB, 8 core server).
>
> Please let me know if you can see something obvious I am missing.

Your IO system needs to be able to deliver sustained IO bandwidth at least as large as you need to read and write all the changes. What raw IO bandwidth do those NFS file systems deliver _long term_? I am not talking about spikes, because there are buffers. I am talking about the minimum of the network throughput on one hand and the raw disk IO those boxes can do on the other hand. Then, how much of it is available to your slave?

Did you do the math to ensure that the IO bandwidth you have available on the slave is at least as high as what is needed? Note that it's not simply the WAL size that needs to be written and read but also data pages.

Kind regards

robert
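To put a rough number on it (based only on the 400GB/day figure quoted above): 400 GB / 86,400 s ≈ 4.6 MB/s of WAL alone, sustained around the clock. Replay additionally has to write the modified data pages, so the slave's volumes must sustain a multiple of that - and comfortably above it, since a slave that can only just keep up never catches up once it falls behind.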
Re: [PERFORM] Advice on optimizing select/index
On 22.05.2013 16:37, Niels Kristian Schjødt wrote:
> In reality the adverts that are selected are all 'active'. I'm hence
> wondering if it theoretically (and in reality of course) would make my
> query faster if I did something like:
>
> SELECT .* FROM cars LEFT OUTER JOIN adverts ON cars.id = adverts.car_id
> WHERE cars.brand = 'Audi' AND adverts.state = 'active'
>
> with a partial index on adverts (car_id) WHERE state = 'active'?

That sounds like a reasonable thing to do. If you have enough bandwidth on your production database, why not just try it out? Otherwise you could do this on a test database and see how it goes and what plan you get.

Btw, did you look at the original plan?

Cheers

robert
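For concreteness, a sketch of the partial index under discussion (my phrasing, using the names from the quoted query):

  CREATE INDEX adverts_active_car_id_idx
    ON adverts (car_id)
    WHERE state = 'active';

Note that the planner will only consider it when the query's WHERE clause implies state = 'active', as it does in the query above.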
Re: [PERFORM] Query with limit goes from few ms to hours
Hi Henk,

On Sun, Oct 14, 2012 at 9:04 AM, henk de wit henk53...@hotmail.com wrote:
> Hi,
>
> For some reason the mailing list software seems to block the email as
> soon as the planner details are in it, so I pasted those on pastebin.com:
> http://pastebin.com/T5JTwh5T

Just an additional data point: for whatever reason your email was placed in my GMail spam folder.

Kind regards

robert
Re: [PERFORM] Deferred constraints performance impact ?
On Fri, Jul 20, 2012 at 4:27 AM, mark dvlh...@gmail.com wrote:
> We have put some deferred constraints (some initially immediate, some
> initially deferred) into our database for testing with our applications.
>
> I understand a lot more may have to be tracked through a transaction and
> there could be some impact from that. Similar to an after update
> trigger? Or are the two not comparable in terms of impact from what is
> tracked and then checked.

Another factor might be the number of constraint violations you expect: if there are many, then deferring the check can create much more work for the DB, because you issue more DML than you would with a non-deferred constraint, which would raise the error much sooner and hence make you stop sending DML earlier. See the sketch below.

Kind regards

robert
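A sketch of the mechanism being discussed (table and constraint names hypothetical):

  ALTER TABLE orders
    ADD CONSTRAINT orders_customer_fkey
    FOREIGN KEY (customer_id) REFERENCES customers (id)
    DEFERRABLE INITIALLY IMMEDIATE;

  BEGIN;
  SET CONSTRAINTS orders_customer_fkey DEFERRED;
  -- bulk DML here; the FK is checked only at COMMIT, so a violation
  -- surfaces after all statements have already been executed
  COMMIT;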
Re: [PERFORM] query using incorrect index
On Thu, Aug 2, 2012 at 4:54 PM, Russell Keane russell.ke...@inps.co.uk wrote:
> Using PG 9.0 and given 2 queries (messageq_current is a view on the
> messageq_table):
>
> select entity_id from messageq_current where entity_id = 123456;
>
> select entity_id from messageq_current
> where incoming = true
> and inactive = false
> and staff_ty = 2
> and staff_id = 2
> order by entity_id desc limit 1;
>
> and 2 indexes (there are 15 indexes in total but they are left out here
> for brevity):
>
> messageq1:
> CREATE INDEX messageq1 ON messageq_table USING btree (entity_id);
>
> and messageq4:
> CREATE INDEX messageq4 ON messageq_table USING btree (inactive, staff_ty, staff_id, incoming, tran_dt);

Of course *a lot* of detail is missing (full schema of the table, all the other indexes), but with "inactive" a boolean column I suspect selectivity might not be too good here, and so having it as the first column in a covering index is at least questionable. If query 2 is frequent you might also want to consider creating a partial index only on (staff_ty, staff_id), with filtering criteria on incoming and inactive as present in query 2 (see the sketch below).

Btw, why don't you formulate query 2 as a max query?

select max(entity_id) as entity_id
from messageq_current
where incoming = true
and inactive = false
and staff_ty = 2
and staff_id = 2;

> With the messageq1 index present, query 1 is very quick (0.094ms) and
> query 2 is very slow (241.515ms). If I remove messageq1 then query 2
> uses messageq4 and is very quick (0.098ms) but then query 1 must use a
> different index and is therefore slower (> 5ms).
>
> So, to the Query plans:

Of which query? Shouldn't there be four plans in total? I'd post plans here: http://explain.depesz.com/

> With messageq1:
>
> Limit (cost=0.00..2670.50 rows=1 width=4) (actual time=241.481..241.481 rows=0 loops=1)
>   Output: messageq_table.entity_id
>   Buffers: shared hit=32 read=18870 written=12
>   ->  Index Scan Backward using messageq1 on prac_live_10112.messageq_table (cost=0.00..66762.53 rows=25 width=4) (actual time=241.479..241.479 rows=0 loops=1)
>         Output: messageq_table.entity_id
>         Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (aud_status_to_flag(messageq_table.aud_status) = 1))
>         Buffers: shared hit=32 read=18870 written=12
> Total runtime: 241.515 ms
>
> Without messageq1:
>
> Limit (cost=12534.45..12534.45 rows=1 width=4) (actual time=0.055..0.055 rows=0 loops=1)
>   Output: messageq_table.entity_id
>   Buffers: shared read=3
>   ->  Sort (cost=12534.45..12534.51 rows=25 width=4) (actual time=0.054..0.054 rows=0 loops=1)
>         Output: messageq_table.entity_id
>         Sort Key: messageq_table.entity_id
>         Sort Method: quicksort  Memory: 17kB
>         ->  Bitmap Heap Scan on prac_live_10112.messageq_table (cost=174.09..12534.32 rows=25 width=4) (actual time=0.043..0.043 rows=0 loops=1)
>               Output: messageq_table.entity_id
>               Recheck Cond: ((messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2))
>               Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (aud_status_to_flag(messageq_table.aud_status) = 1))
>               Buffers: shared read=3
>               ->  Bitmap Index Scan on messageq4 (cost=0.00..174.08 rows=4920 width=0) (actual time=0.040..0.040 rows=0 loops=1)
>                     Index Cond: ((messageq_table.inactive = false) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (messageq_table.incoming = true))
>                     Buffers: shared read=3
> Total runtime: 0.098 ms
>
> Clearly the statistics are off somehow but I really don't know where to
> start.
>
> Any help you can give me would be very much appreciated.

Kind regards

robert
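A sketch of the partial index suggested above (keeping the column names from the thread):

  CREATE INDEX messageq_staff_partial_idx
    ON messageq_table (staff_ty, staff_id)
    WHERE incoming AND NOT inactive;

Appending entity_id as a trailing column would - assuming the max() formulation - additionally let the largest entity_id be picked out of the index directly.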
Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.
On Sat, Jul 14, 2012 at 11:50 AM, B Sreejith bsreejit...@gmail.com wrote:
> Dear All,
> Thanks a lot for all the invaluable comments.

Additionally to Craig's excellent advice on measurements there's something else you can do: with the knowledge of the queries your application fires against the database you can evaluate your schema and index definitions. While there is no guarantee that your application will scale well if all indexes you believe need to be present based on that inspection are in place, you can pretty easily identify tables which can be improved. These are tables which

a) are known to grow large, and
b) have no indexes, or no indexes which support the queries your
   application runs against them, which will result in full table scans.

Any database which scales in size will sooner or later hit a point where full table scans of these large tables become extremely slow. If these queries are done during regular operation (and not in nightly maintenance windows, for example) then you have pretty surely identified a show stopper.

Kind regards

robert
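Such tables can also be spotted from runtime statistics; a sketch (the view and columns are standard, the cut-off of 20 is arbitrary):

  SELECT relname, seq_scan, seq_tup_read, idx_scan,
         pg_size_pretty(pg_total_relation_size(relid)) AS total_size
  FROM pg_stat_user_tables
  WHERE seq_scan > 0
  ORDER BY seq_tup_read DESC
  LIMIT 20;

Large tables showing high seq_tup_read and low idx_scan during regular operation are the prime candidates described above.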
Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.
On Tue, Jul 10, 2012 at 10:21 AM, Sreejith Balakrishnan sreejith.balakrish...@tcs.com wrote:
> Is there any tool or some sort of script available, for PostgreSQL,
> which can be used to measure scalability of an application's database.
> Or is there any guideline on how to do this.

I am a bit confused about the concept of measuring scalability of an application's database. You cannot measure the scalability of a database as such. You need to know the nature of the load (i.e. the operations executed against the DB - how many INSERT, UPDATE, DELETE and SELECT, against which tables, and with what frequency and criteria). And then, as Sergey said, you need to define whether you want to scale up the load or the size - or both.

Kind regards

robert
Re: [PERFORM] The need for clustered indexes to boost TPC-V performance
On Wed, Jul 4, 2012 at 1:13 AM, Reza Taheri rtah...@vmware.com wrote:
> Checking online, the subject of clustered indexes for PostgreSQL comes
> up often. PGSQL does have a concept called "clustered table", which
> means a table has been organized in the order of an index. This would
> help with sequential accesses to a table, but has nothing to do with
> this problem. PGSQL folks sometimes refer to what we want as "integrated
> index".

Do I understand correctly that we are speaking about the concept which is known under the term "index organized table" (IOT) in Oracle land?
http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm#CBBJEBIH

Kind regards

robert
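For context, the "clustered table" mentioned above is a one-time physical reordering, not a maintained structure; a sketch (table and index names hypothetical):

  -- Rewrites the table in index order; the ordering decays as rows are
  -- inserted/updated afterwards, so it has to be repeated periodically.
  CLUSTER trade USING trade_pkey;
  ANALYZE trade;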
Re: [PERFORM] partitioning performance question
On Sat, Jun 9, 2012 at 7:58 PM, Kevin Kempter cs_...@consistentstate.com wrote:
> Hi All;
>
> We have a client that has a table where large blobs (bytea) are stored.
> The table has a key column that is numbers (like 112362) but
> unfortunately it's a varchar column, so the blobs are accessed via
> queries like:
>
> select * from bigtable where keycol = '217765'
>
> The primary reason we want to partition the table is for maintenance;
> the table is well over 1.2 Terabytes in size and they have never
> successfully vacuumed it. However I don't want to make performance even
> worse. The table does have a serial key; I'm thinking the best options
> will be to partition by range on the serial key, or maybe via the keycol
> character column using an "in" statement on the check constraints, thus
> allowing the planner to actually leverage the above SQL. I suspect doing
> a typecast to integer in the check constraints will prove to be a bad
> idea if the keycol column remains a varchar.
>
> Thoughts?
>
> Here's the table:
>
>                              Table "problemchild"
>     Column    |           Type           |                           Modifiers
> --------------+--------------------------+----------------------------------------------------------------
>  keycol       | character varying        |
>  blob_data    | bytea                    |
>  removed_date | timestamp with time zone |
>  alt_key      | bigint                   | not null default nextval('problemchild_alt_key_seq'::regclass)
> Indexes:
>     "pc_pkey" PRIMARY KEY, btree (alt_key)
>     "key2" btree (keycol)

I find it odd that you have a column keycol which is not the PK and your PK column is named alt_key. Is keycol always the character representation of alt_key? Are they unrelated?

It would also help to know how the data in this table changes. Do you only ever add data? Is some data removed from time to time (maybe based on the removed_date)?

If the table grows continually then range partitioning sounds good. However, I think you should really make keycol a number type, because otherwise range partitioning will be a pain (you would need to include the length of the string in the criterion if you want your varchar ranges to mimic number ranges).

However, if you are deleting from time to time, and hence the table does not grow in the long run, then hash partitioning might be a better idea because then you do not need to create new partitions all the time. Example on alt_key:

create table problemchild (
  keycol varchar(100),
  blob_data bytea,
  removed_date timestamp with time zone,
  alt_key bigint primary key
);
create table problemchild_00 ( check ( alt_key % 16 = 0 ) ) inherits (problemchild);
create table problemchild_01 ( check ( alt_key % 16 = 1 ) ) inherits (problemchild);
create table problemchild_02 ( check ( alt_key % 16 = 2 ) ) inherits (problemchild);
...

Kind regards

robert
Re: [PERFORM] Multiple Concurrent Updates of Shared Resource Counter
On Thu, Jun 7, 2012 at 9:53 AM, Nir Zilberman n...@checkpoint.com wrote:
> We are handling multiple concurrent clients connecting to our system -
> trying to get a license seat (each license has an initial capacity of
> seats).
> We have a table which keeps count of the acquired seats for each
> license. When a client tries to acquire a seat we first make sure that
> the number of acquired seats is less than the license capacity. We then
> increase the number of acquired seats by 1.
>
> Our main problem here is with the acquired seats table. It is actually a
> shared resource which needs to be updated concurrently by multiple
> transactions. When multiple transactions are running concurrently - each
> transaction takes a long time to complete because it waits on the lock
> for the shared resource table.
>
> Any suggestions for better implementation/design of this feature would
> be much appreciated.

Well, there are the usual suspects for lock contention:

1. Reduce the time a lock needs to be held.
2. Increase the granularity of locking.

ad 1) It sounds as if you need two statements for check and increase. That can easily be done with a single statement if you use check constraints. Example:

$ psql -ef seats.sql
drop table licenses;
DROP TABLE
create table licenses (
  name varchar(200) primary key,
  max_seats integer not null check ( max_seats >= 0 ),
  current_seats integer not null default 0
    check ( current_seats >= 0 and current_seats <= max_seats )
);
psql:seats.sql:6: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "licenses_pkey" for table "licenses"
CREATE TABLE
insert into licenses (name, max_seats) values ('foo', 4);
INSERT 0 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
psql:seats.sql:12: ERROR:  new row for relation "licenses" violates check constraint "licenses_check"
update licenses set current_seats = current_seats - 1 where name = 'foo';
UPDATE 1

The increase will fail and you can react on that. Another scheme is to use

update licenses set current_seats = current_seats + 1
where name = 'foo' and current_seats < max_seats;

and check how many rows were changed.

If, however, your transaction covers the increase of the used seat count, other work and finally the decrease of the used seat count, you need to change your transaction handling. You rather want three TX:

start TX
  update licenses set current_seats = current_seats + 1 where name = 'foo';
commit

if OK
  start TX
    main work
  commit / rollback

  start TX
    update licenses set current_seats = current_seats - 1 where name = 'foo';
  commit
end

ad 2) At the moment I don't see a mechanism how that could be achieved in your case. Distribution of the counter of a single license across multiple rows and checking via SUM(current_seats) is not concurrency safe because of MVCC.

Generally, checking licenses via a relational database does neither seem very robust nor secure. As long as someone has administrative access to the database, or regular access to the particular database, limits and counts can be arbitrarily manipulated. License servers I have seen usually work by managing seats in memory and counting license usage via network connections. That has the advantage that the OS quite reliably informs the license server if a client dies.

Kind regards

robert
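A refinement of the row-count variant above - RETURNING (available since 8.2) reports the outcome in the same round trip:

  UPDATE licenses
     SET current_seats = current_seats + 1
   WHERE name = 'foo'
     AND current_seats < max_seats
  RETURNING current_seats;
  -- zero rows returned means no seat was available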
Re: [PERFORM] pg_dump and thousands of schemas
On Thu, May 31, 2012 at 10:45 AM, Tatsuo Ishii is...@postgresql.org wrote:
> Just for the record, I reran the test again with my single-LOCK patch,
> and now the total runtime of pg_dump is 113 minutes.
> 188 minutes (9.0) -> 125 minutes (git master) -> 113 minutes (with my
> patch). So far, I'm glad to see 40% time savings at this point.

I see only 9.6% savings (100 * (113/125 - 1)). What am I missing?

Cheers

robert
Re: [PERFORM] pg_dump and thousands of schemas
On Thu, May 31, 2012 at 4:07 PM, Tatsuo Ishii is...@postgresql.org wrote:
>>> Just for the record, I reran the test again with my single-LOCK patch,
>>> and now the total runtime of pg_dump is 113 minutes.
>>> 188 minutes (9.0) -> 125 minutes (git master) -> 113 minutes (with my
>>> patch). So far, I'm glad to see 40% time savings at this point.
>>
>> I see only 9.6% savings (100 * (113/125 - 1)). What am I missing?
>
> What I meant was (100 * (113/188 - 1)).

OK, my fault was to assume you wanted to measure only your part, while apparently you meant overall savings (113/188 ≈ 0.60, i.e. roughly 40% saved). But Tom had asked for separate measurements, if I understood him correctly. Also, that measurement of your change would go after the O(N^2) fix. It could actually turn out to be much more than 9%, because the overall time would be reduced even more dramatically. So it might actually be good for your fix to wait a bit. ;-)

Kind regards

robert
Re: [PERFORM] Configuration Recommendations
On Tue, May 15, 2012 at 7:53 PM, Greg Sabino Mullane g...@turnstep.com wrote:
>>>> Is it established practice in the Postgres world to separate indexes
>>>> from tables? I would assume that the reasoning of Richard Foote -
>>>> albeit for Oracle databases - is also true for Postgres:
>>>
>>> Yes, it's an established practice. I'd call it something just short of
>>> a best practice though, as it really depends on your situation.
>>
>> What are the benefits?
>
> Disk seeks, basically. Yes, there are a lot of complications regarding
> all the various hardware/OS/PG level cachings, but at the end of the
> day, it's less work to have each drive concentrate on a single area

Hmm... I see your point. OTOH, the whole purpose of using NAS or SAN with cache, logical volumes and multiple spindles per volume is to reduce the impact of slow disk operations like seeks. If, in such a situation, your database operations are impacted by those seek operations then the setup does not seem optimal anyway. Bottom line: with a setup properly tailored to the workload there should be no seeks visible to the database.

> (especially as we always require a heap scan at the moment).

Are you referring to the scan along tuple versions?
http://facility9.com/2011/03/postgresql-row-storage-fundamentals/

>>> I also find his examples a bit contrived, and the whole multi-user
>>> argument irrelevant for common cases.
>>
>> Why is that?
>
> Because most Postgres servers are dedicated to serving the same data or
> sets of data, and the number of other users calling ad-hoc queries
> against lots of different tables (per his example) is small.

I don't see how it is relevant for this discussion whether the selects are ad hoc or not. The mere fact of concurrent accesses to the same set of tables and indexes - albeit to different data (keys) - is sufficient to create a potential for seeks, even if the disks for index and table are separated. And this will typically happen in a multiuser application, even if all users use the same set of queries.

> So this sentence just doesn't ring true to me:
>
>> ... by the time we've read the index leaf block, processed and read all
>> the associated table blocks referenced by the index leaf block, the
>> chances of there being no subsequent physical activity in the index
>> tablespace due to another user session is virtually nil. We would still
>> need to re-scan the disk to physically access the next index leaf block
>> (or table block) anyways.
>
> That's certainly not true for Postgres servers, and I doubt if it is
> quite that bad on Oracle either.

I don't think this has much to do with the brand. Richard just describes logical consequences of concurrent access (see my attempt at an explanation above). The fact remains that concurrent accesses rarely target the same data, and because of that you would see quite erratic access patterns to blocks. How they translate to actual disk accesses depends on the various caching mechanisms in place and the physical distribution of the data across disks (RAID). But I think we cannot ignore the fact that the data requested by concurrent queries most likely resides on different blocks.

>>> I lean towards using separate tablespaces in Postgres, as the
>>> performance outweighs the additional complexity.
>>
>> What about his argument with regard to access patterns (i.e.
>> interleaving index and table access during an index scan)? Also,
>> Shaun's advice to have more spindles available sounds convincing to me,
>> too.
>
> I don't buy his arguments. To do so, you'd have to buy a key point:
>
>> when most physical I/Os in both index and table segments are
>> effectively random, single block reads
>
> They are not; hence, the rest of his argument falls apart. Certainly, if
> things were as truly random and uncached as he states, there would be no
> benefit to separation.

Your argument with seeks also only works in the absence of caching (see above). I think Richard was mainly pointing out that /in the absence of caching/ different blocks need to be accessed here.

> As far as spindles, yes: like RAM, it's seldom the case to have too
> little :) But as with all things, one should get some benchmarks on your
> specific workload before making hardware changes. (Well, RAM may be an
> exception to that, up to a point).

Can you share some measurement data which backs the thesis that the distribution of index and table to different disks is advantageous? That would be interesting to see. Then one could also balance performance benefits against other effects (manageability etc.) and see on which side the advantage comes out.

Even though I'm not convinced: thank you for the interesting discussion!

Cheers

robert
Re: [PERFORM] Maximum number of sequences that can be created
Hi,

On Tue, May 15, 2012 at 12:57 PM, Andres Freund and...@anarazel.de wrote:
> I would rather suggest going with a summing table if you need to do
> something like that:
>
> sequence_id | value
> ------------+--------
>           1 | 3434334
>           1 |       1
>           1 |      -1
>           1 |       1
>           1 |       1
> ...
>
> You then can get the current value with SELECT SUM(value) WHERE
> sequence_id = 1. For garbage collection you can delete those values and
> insert the newly summed up value again.
> That solution won't ever block if done right.

I was going to suggest another variant which would not need GC but would also increase concurrency:

sequence_id | hash | value
------------+------+------
          1 |    0 |     3
          1 |    1 |     9
          1 |    2 |     0
          1 |    3 |     2
...

with PK = (sequence_id, hash) and hash in a fixed range (say 0..15). The value would be obtained the same way, i.e. via

SELECT SUM(value) FROM T WHERE sequence_id = 1

The hash value would have to be calculated

- at session start time (cheap, but might reduce concurrency due to the
  small number of different values), or
- at TX start time (more expensive, but probably better concurrency due
  to the higher change rate).

Kind regards

robert
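A sketch of that hashed variant (all names hypothetical; the per-session hash is derived from the backend PID here, and the 16 slot rows are assumed to have been inserted up front):

  CREATE TABLE seq_counter (
    sequence_id bigint  NOT NULL,
    hash        integer NOT NULL,
    value       bigint  NOT NULL DEFAULT 0,
    PRIMARY KEY (sequence_id, hash)
  );

  -- Increment: concurrent sessions usually hit different slot rows,
  -- which reduces row-level lock contention.
  UPDATE seq_counter
     SET value = value + 1
   WHERE sequence_id = 1
     AND hash = pg_backend_pid() % 16;

  -- Current value:
  SELECT sum(value) FROM seq_counter WHERE sequence_id = 1;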
Re: [PERFORM] Maximum number of sequences that can be created
On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин tiv...@gmail.com wrote:
> 2012/5/11 Robert Klemme shortcut...@googlemail.com
>> On the contrary: what would be the /advantage/ of being able to create
>> millions of sequences? What's the use case?
>
> We are using sequences as statistics counters - they produce almost no
> performance impact and we can tolerate their non-transactional nature. I
> can imagine someone who wants to have a sequence per user or other
> relation row.

I can almost see the point. But my natural choice in that case would be a table with two columns. Would that actually be so much less efficient? Of course you'd have fully transactional behavior and thus locking.

Kind regards

robert
Re: [PERFORM] Maximum number of sequences that can be created
On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre meetvbon...@gmail.com wrote:
> Is there any max limit set on sequences that can be created on the
> database? Also would like to know if we create millions of sequences in
> a single db what is the downside of it.

On the contrary: what would be the /advantage/ of being able to create millions of sequences? What's the use case?

Cheers

robert
Re: [PERFORM] scale up (postgresql vs mssql)
On Fri, May 4, 2012 at 3:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
> let's see the query plan... when you turned it off, did it go faster?
> put your suspicious plans here: http://explain.depesz.com/

I suggest posting three plans:

1. insert into temp table
2. access to temp table before analyze
3. access to temp table after analyze

Maybe getting rid of the temp table (e.g. using a view or even an inline view) is even better than optimizing temp table access.

Kind regards

robert
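The reason the before/after distinction matters (my assumption about the scenario, with hypothetical names): temp tables are not visible to autovacuum, so they have no statistics until analyzed explicitly.

  CREATE TEMP TABLE tmp_orders AS
  SELECT * FROM orders WHERE created_at > now() - interval '1 day';

  EXPLAIN SELECT count(*) FROM tmp_orders WHERE customer_id = 42;  -- default estimates

  ANALYZE tmp_orders;

  EXPLAIN SELECT count(*) FROM tmp_orders WHERE customer_id = 42;  -- real statistics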
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 3:58 PM, Merlin Moncure mmonc...@gmail.com wrote:
> On Wed, May 9, 2012 at 8:06 AM, MauMau maumau...@gmail.com wrote:
>> I've heard from some people that synchronous streaming replication has
>> severe performance impact on the primary. They said that the
>> transaction throughput of a TPC-C like benchmark (perhaps DBT-2)
>> decreased by 50%. I'm sorry I haven't asked them about their testing
>> environment, because they just gave me their experience. They think
>> that this result is much worse than some commercial database.
>
> I can't speak for other databases, but it's only natural to assume that
> tps must drop. At minimum, you have to add the latency of communication
> and remote sync operation to your transaction time. For very short
> transactions this adds up to a lot of extra work relative to the
> transaction itself.

Actually I would expect 50% degradation if both databases run on identical hardware: the second instance needs to do the same work (i.e. write WAL AND ensure it reached the disk) before it can acknowledge.

"When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server."
http://www.postgresql.org/docs/9.1/static/warm-standby.html#SYNCHRONOUS-REPLICATION

I am not sure whether the replica can be triggered to commit to disk before the commit to disk on the master has succeeded; if that were the case there would be true serialization and hence the ~50% drop. This sounds like it could actually be the case (note the "after it commits"):

"When synchronous replication is requested the transaction will wait after it commits until it receives confirmation that the transfer has been successful."
http://wiki.postgresql.org/wiki/Synchronous_replication

Kind regards

robert
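For reference, a sketch of the knobs involved on a 9.1-era primary (the standby name is hypothetical; synchronous_standby_names lives in postgresql.conf and is only shown as a comment here):

  -- postgresql.conf: synchronous_standby_names = 'standby1'

  SET synchronous_commit = on;     -- commit waits for the standby's flush
  SET synchronous_commit = local;  -- commit waits only for the local WAL flush

Since synchronous_commit can be set per transaction, only the transactions that really need the durability guarantee have to pay the replication round trip.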
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 5:45 PM, Claudio Freire klaussfre...@gmail.com wrote:
> On Wed, May 9, 2012 at 12:41 PM, Robert Klemme
> shortcut...@googlemail.com wrote:
>> I am not sure whether the replica can be triggered to commit to disk
>> before the commit to disk on the master has succeeded; if that were the
>> case there would be true serialization and hence the ~50% drop.
>
> That should only happen for very short transactions. IIRC, WAL records
> can be sent to the slaves before the transaction in the master commits,
> so bigger transactions would see higher parallelism.

I considered that as well. But the question is: when are they written to disk on the slave? If they sit in the buffer cache until the data is synched to disk, then you only gain a bit of advantage from the earlier sending (i.e. network latency). Assuming a high bandwidth and low latency network (which you want to have in this case anyway), that gain is probably not big compared to the time it takes to ensure WAL is written to disk.

I do not know the implementation details, but *if* the server triggers the sync only after its own sync has succeeded, *then* you basically have serialization and you need to wait twice the time. For small TX, OTOH, the network overhead might be so large relative to the WAL IO (for example with a battery backed cache in the controller) that it shows.

Since we do not know the test cases which led to the 50% statement we can probably only speculate. Ultimately each individual setup and workload has to be tested.

Kind regards

robert
Re: [PERFORM] Result Set over Network Question
On Mon, May 7, 2012 at 4:25 PM, Thomas Kellerer spam_ea...@gmx.net wrote:
> That seems to be a documentation bug. I tried it, and it definitely does
> not work (or I am missing something).

Apparently I am the one who is missing something. :-) Their release notes at
http://www.oracle.com/technetwork/developer-tools/sql-developer/sqldev31-ea-relnotes-487612.html
state:

"Third Party Databases
SQL Developer supports IBM DB2 UDB LUW, Microsoft SQL Server and Microsoft Access, MySQL, Sybase Adaptive Server and Teradata. See Supported Platforms for details on all third party database releases supported."

Right you are, Thomas! Thank you! Sorry for the confusion.

Kind regards

robert
Re: [PERFORM] Result Set over Network Question
On Thu, May 3, 2012 at 5:40 PM, Merlin Moncure mmonc...@gmail.com wrote:
> On Thu, May 3, 2012 at 10:28 AM, Ronald Hahn, DOCFOCUS INC.
> rh...@docfocus.ca wrote:
>> After some testing using Wireshark (poor man's profiler) to see what
>> was going on with the network I found that it was the tools I've been
>> using. Both Aqua and PGAdminIII have a large overhead per column to get
>> the meta data. MSSQL sends that data up front so the impact isn't as
>> bad. I'm not sure if it's a limitation of the pgsql protocol vs tds or
>> a limitation of Aqua or a combination of both. At any rate it turns out
>> not to be part of the problem I'm having with my software stalling out,
>> so I'm back to square one with my problem.
>
> So, Ronald, are you saying the different approach to meta data transfer
> is _not_ the issue? OK, let's figure out what the issue is then. First,
> let's make sure it isn't the server that's stalling: configure
> log_min_duration_statement with an appropriate value so you start
> catching queries that are taking longer than you think they should be.
> Also some client side logging directly wrapping the SQL invocation
> couldn't hurt. Is your application JDBC?

Ronald said ODBC in his first posting. But since ADS seems to support JDBC as well, trying that might be a good test to get another data point.

Alternative tools for JDBC tests:
http://squirrel-sql.sourceforge.net/
http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html

Using the PG client remotely with \timing on might be an even better idea.

Kind regards

robert
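A sketch of the suggested server-side check (the value is made up; set it in postgresql.conf, or per session as superuser):

  SET log_min_duration_statement = 250;  -- ms; 0 logs everything, -1 disables

Every statement exceeding the threshold then shows up in the server log with its duration, which quickly tells whether the stall happens in the server or in the client/driver.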
Re: [PERFORM] Result Set over Network Question
On Mon, May 7, 2012 at 2:11 PM, Thomas Kellerer spam_ea...@gmx.net wrote:
> Robert Klemme, 07.05.2012 14:03:
>> Alternative tools for JDBC tests:
>> http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html
>
> SQL Developer does not support PostgreSQL

Last time I checked (quite a while ago) you could use arbitrary JDBC drivers. There's also
http://docs.oracle.com/cd/E25259_01/appdev.31/e24285/intro.htm#sthref306

And this seems to indicate that it's still the case:

"[...] or another third-party driver. [...]
JDBC URL (Other Third Party Driver): URL for connecting directly from Java to the database; overrides any other connection type specification."
http://docs.oracle.com/cd/E25259_01/appdev.31/e24285/dialogs.htm#BACDGCIA

I assume Oracle is not interested in aggressively advertising this feature though.

Kind regards

robert
Re: [PERFORM] Configuration Recommendations
Hi Jan,

On Thu, May 3, 2012 at 4:10 AM, Jan Nielsen jan.sture.niel...@gmail.com wrote:
> Below is the hardware, firmware, OS, and PG configuration pieces that
> I'm settling in on. As was noted, the local storage used for OS is
> actually two disks with RAID 10. If anything appears like a mistake or
> something is missing, I'd appreciate the feedback.

You should quickly patent this solution. As far as I know you need at least four disks for RAID 10. :-)
http://en.wikipedia.org/wiki/RAID#Nested_.28hybrid.29_RAID
Or did you mean RAID 1?

> I'm still working on the benchmark scripts and I don't have
> good/reliable numbers yet since our SAN is still very busy reconfiguring
> from the 2x4 to 1x8. I'm hoping to get them running tomorrow when the
> SAN should complete its 60 hours of reconfiguration.

Yeah, it does not seem to make a lot of sense to test during this phase.

> Thanks, again, for all the great feedback.

You're welcome!

> 300GB RAID10 2x15k drive for OS on local storage
>   /dev/sda1 RA: 4096
>   /dev/sda1 FS: ext4
>   /dev/sda1 MO:
>   IO Scheduler sda: noop anticipatory deadline [cfq]

See above.

> 600GB RAID 10 8x15k drive for $PGDATA on SAN
>   /dev/sdb1 RA: 4096
>   /dev/sdb1 FS: xfs
>   /dev/sdb1 MO: allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime
>   IO Scheduler sdb: noop anticipatory deadline [cfq]
>
> 300GB RAID 10 2x15k drive for $PGDATA/pg_xlog on SAN
>   /dev/sde1 RA: 4096
>   /dev/sde1 FS: xfs
>   /dev/sde1 MO: allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime
>   IO Scheduler sde: noop anticipatory deadline [cfq]

See above. With regard to the scheduler, I have frequently read that [deadline] and [noop] perform better for PG loads. Fortunately this can be changed easily. Maybe this also has some additional input:
http://www.fccps.cz/download/adv/frr/hdd/hdd.html

On Thu, May 3, 2012 at 8:54 AM, John Lister john.lis...@kickstone.co.uk wrote:
> I was wondering if it would be better to put the xlog on the same disk
> as the OS? Apart from the occasional log writes I'd have thought most OS
> data is loaded into cache at the beginning, so you effectively have an
> unused disk. This gives you another spindle (mirrored) for your data. Or
> have I missed something fundamental?

Separating them avoids interference between OS and WAL logging (e.g. a script running berserk and filling the OS file system). It is also easier to manage (e.g. in case of relocation to another volume etc.). And you can have different mount options (e.g. you might want atime for the OS volume).

Kind regards

robert
Re: [PERFORM] Configuration Recommendations
On Wed, Apr 25, 2012 at 7:08 PM, Greg Sabino Mullane g...@turnstep.com wrote:
>> Is it established practice in the Postgres world to separate indexes
>> from tables? I would assume that the reasoning of Richard Foote -
>> albeit for Oracle databases - is also true for Postgres:
>
> Yes, it's an established practice. I'd call it something just short of a
> best practice though, as it really depends on your situation.

What are the benefits?

> I'd take those articles with a grain of salt, as they are very
> Oracle-specific (e.g. we do not have fat indexes (yet!), nor segments).

True. As far as I understand disk layout, segments in Oracle serve the purpose of clustering data for a database object. With that feature missing, the situation would be worse in Postgres - unless you manually do something similar by using tablespaces for that.

> I also find his examples a bit contrived, and the whole multi-user
> argument irrelevant for common cases.

Why is that?

> I lean towards using separate tablespaces in Postgres, as the
> performance outweighs the additional complexity.

What about his argument with regard to access patterns (i.e. interleaving index and table access during an index scan)? Also, Shaun's advice to have more spindles available sounds convincing to me, too.

> It's down on the tuning list however: much more important is getting
> your kernel/volumes configured correctly, allocating shared_buffers
> sanely, separating pg_xlog, etc.

That does make a lot of sense. Separating pg_xlog would probably be the first thing I'd do, especially since the IO pattern is so dramatically different from tablespace IO access patterns.

Kind regards

robert
Re: [PERFORM] Configuration Recommendations
On Tue, Apr 24, 2012 at 4:56 AM, Jan Nielsen jan.sture.niel...@gmail.com wrote:
> We are considering the following drive allocations:
>
> * 4 x 15k SAS drives, XFS, RAID 10 on SAN for PG data
> * 4 x 15k SAS drives, XFS, RAID 10 on SAN for PG indexes
> * 2 x 15k SAS drives, XFS, RAID 1 on SAN for PG xlog
> * 1 x 15k SAS drive, XFS, on local storage for OS

Is it established practice in the Postgres world to separate indexes from tables? I would assume that the reasoning of Richard Foote - albeit for Oracle databases - is also true for Postgres:

http://richardfoote.wordpress.com/2008/04/16/separate-indexes-from-tables-some-thoughts-part-i-everything-in-its-right-place/
http://richardfoote.wordpress.com/2008/04/18/separate-indexes-from-tables-some-thoughts-part-ii-there-there/
http://richardfoote.wordpress.com/2008/04/28/indexes-in-their-own-tablespace-availabilty-advantages-is-there-anybody-out-there/

Conversely, if you lump both on a single volume you have more flexibility with regard to usage - unless of course you can dynamically resize volumes.

To me it also seems like a good idea to mirror the local disk with OS and database software, because if that fails you'll get downtime as well. As of now you have a single point of failure there.

Kind regards

robert
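For completeness, a sketch of how such a separation is expressed (paths and names hypothetical; note that tablespace names beginning with pg_ are reserved):

  CREATE TABLESPACE ts_data  LOCATION '/san/pg_data_vol';
  CREATE TABLESPACE ts_index LOCATION '/san/pg_index_vol';

  CREATE TABLE measurements (id bigint NOT NULL) TABLESPACE ts_data;
  CREATE INDEX measurements_id_idx ON measurements (id) TABLESPACE ts_index;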
Re: [PERFORM] Tablespace files deleted automatically.
On Fri, Oct 14, 2011 at 8:19 PM, Josh Berkus j...@agliodbs.com wrote:
> Vishnu,
>
>> I am using PostgreSQL 8.4 in Windows. I have created a database and
>> some tables on it. Also created a table space and some tables in it. My
>> application inserts data into these tables every second. The
>> application is a continuously running application. My issue is that
>> after 20-30 days of continuous running (machine logged off 2 times),
>> some of the files in the \16384 folder of the newly created table space
>> are dropped automatically. Can you tell me the reason for this? How can
>> we recover from this?
>
> PostgreSQL creates and deletes files all the time. I don't understand
> why this is a problem.

Also it does not seem clear whether we are talking about a performance issue (what this ML is about) or a bug (lost data).

Kind regards

robert
Re: [PERFORM] Window functions and index usage
On Tue, Oct 4, 2011 at 11:39 AM, Anssi Kääriäinen anssi.kaariai...@thl.fi wrote:
> I have the following setup:
>
> create table test(id integer, seq integer);
> insert into test select generate_series(0, 100), generate_series(0, 1000);
> create unique index test_idx on test(id, seq);
> analyze test;
>
> Now I try to fetch the latest 5 values per id, ordered by seq from the
> table:
>
> select * from (
>   select id, seq, row_number() over (partition by id order by seq)
>   from test
>   where id in (1, 2, 3)
> ) where row_number() <= 5;

It seems this fetches the *first* 5 values per id - and not the latest. For that you would need to "order by seq desc" in the window and probably also in the index.

> This does not use the index on id, seq for sorting the data. It uses a
> bitmap index scan, and a sequential scan when issued SET
> enable_bitmapscan to false. Tested both on git head and 8.4.8. See end
> of post for plans. It seems it would be possible to fetch the first
> values per id using the index, or at least skip the sorting.

Just guessing: since row_number is an analytic function and it can be combined with any type of windowing, only in rare cases do the ordering of index columns and the ordering in the window align. So while your particular use case could benefit from this optimization, the overall judgement might be that it's not worthwhile to make the optimizer more complex to cover this case - or I fail to see the more general pattern here. :-)

> Is there some other way to run the query so that it would use the index?
> Are there plans to support the index usage for the above query assuming
> that it is possible to use the index for that query?
>
> The real world use case would be to fetch the latest 5 threads per
> discussion forum in one query. Or fetch the 3 latest comments for all
> patches in a given commit fest in a single query.

Is it really that realistic that someone wants the latest n entries for *all* threads / patches? It seems since this can result in a very large data set this is probably not the type of query which is done often.

Kind regards

robert
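For what it's worth, a sketch of a formulation that can walk the (id, seq) index once per id - it relies on LATERAL, which arrived only in 9.3, so this is an assumption about later versions rather than an option on the releases discussed here:

  SELECT t.*
  FROM (VALUES (1), (2), (3)) AS ids (id),
  LATERAL (
    SELECT id, seq
    FROM test
    WHERE test.id = ids.id
    ORDER BY seq DESC
    LIMIT 5
  ) t;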
Re: [PERFORM] Window functions and index usage
On Tue, Oct 4, 2011 at 4:06 PM, Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: On 10/04/2011 04:27 PM, Robert Klemme wrote: On Tue, Oct 4, 2011 at 11:39 AM, Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: I have the following setup:

create table test(id integer, seq integer);
insert into test select generate_series(0, 100), generate_series(0, 1000);
create unique index test_idx on test(id, seq);
analyze test;

Now I try to fetch the latest 5 values per id, ordered by seq from the table:

select * from (
  select id, seq, row_number() over (partition by id order by seq)
  from test
  where id in (1, 2, 3)
) where row_number() <= 5;

It seems this fetches the *first* 5 values per id - and not the latest. For that you would need to order by seq desc in the window and probably also in the index. Yeah. Sorry, wrong order. And the last line is wrong, it should be ") tmp where row_number <= 5;". This does not use the index on id, seq for sorting the data. It uses a bitmap index scan, and a sequential scan when issued SET enable_bitmapscan to false. Tested both on git head and 8.4.8. See end of post for plans. It seems it would be possible to fetch the first values per id using the index, or at least skip the sorting. Just guessing: since row_number is an analytic function and it can be combined with any type of windowing, only in rare cases do the ordering of index columns and the ordering in the window align. So while your particular use case could benefit from this optimization the overall judgement might be that it's not worthwhile to make the optimizer more complex to cover this case - or I fail to see the more general pattern here. :-) I think there are common use cases for this - see end of message for an example. Is there some other way to run the query so that it would use the index? Are there plans to support the index usage for the above query, assuming that it is possible to use the index for that query? The real world use case would be to fetch the latest 5 threads per discussion forum in one query. Or fetch the 3 latest comments for all patches in a given commit fest in a single query. Is it really that realistic that someone wants the latest n entries for *all* threads / patches? It seems since this can result in a very large data set this is probably not the type of query which is done often. The idea is that the dataset isn't that large. But then why do you require using the second index column in the first place? If the data set is small then the query is likely fast if the selection via id can use any index. And you don't have to fetch them for all threads / patches. You might fetch them only for patches in the currently viewed commit fest. See https://commitfest.postgresql.org/action/commitfest_view?id=12 for one such use. What I have in mind is fetching first all the patches in the commit fest in one go.
Then issue a query which would look something like:

select * from (
  select comment_data,
         row_number() over (partition by patch_id order by comment_date desc)
  from patch_comments
  where patch_id in (list of patch_ids fetched in first query)
) tmp where row_number <= 3;

Interesting: I notice that the query cannot successfully be simplified on 8.4:

rklemme=> select *,
row_number() over (partition by id order by seq desc) as rn
from test
where id in (1,2,3)
and rn <= 3;
ERROR: column "rn" does not exist
LINE 5: and rn <= 3
            ^
rklemme=> select *,
row_number() over (partition by id order by seq desc) as rn
from test
where id in (1,2,3)
and row_number() <= 3;
ERROR: window function call requires an OVER clause
LINE 5: and row_number() <= 3;
            ^
rklemme=> select *,
row_number() over (partition by id order by seq desc) as rn
from test
where id in (1,2,3)
and row_number() over (partition by id order by seq desc) <= 3;
ERROR: window functions not allowed in WHERE clause
LINE 5: and row_number() over (partition by id order by seq desc) <=...
            ^
rklemme=>

I think I need to switch to 9.1 soon. :-) Now you have all the data needed for the first column on the above mentioned page. I guess the commit fest application is fetching all the comments for the patches in the commit fest in one query, and then doing the limit in application code. This is fine because there aren't that many comments per patch. But if you have dozens of forums and thousands of threads per forum you can't do that. This is useful in any situation where you want to show the n latest entries instead of the last entry in a list view. Latest modifications to an object, latest commits to a file, latest messages to a discussion thread or latest payments per project. Or the 5 most popular videos per category, the 10 highest paid employees per department and so on. Again, what is easy for you as a human will likely be quite complex for the optimizer (knowing that the order by and the row_number output align). Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/
Re: [PERFORM] Slow query with self-join, group by, 100m rows
On Tue, Sep 20, 2011 at 7:43 PM, Thomas Kappler tkapp...@googlemail.com wrote: [please CC, I'm not on the list] Hi all, we have one table that basically uses Postgres as a key-value store.

      Table "public.termindex"
   Column   |  Type   | Modifiers
------------+---------+-----------
 subject_id | integer |
 indextype  | integer |
 cid        | integer |

This is with Postgres 9.0. The table has 96 million rows and an index on each column. It contains no NULLs and has no triggers. subject_id has about 2m distinct values, cid about 200k, and indextype only six. The table is *read-only* after the initial load. The query we want to do is (with example values):

select t2.cid, count(distinct t1.subject_id)
from termindex t1, termindex t2
where t1.cid=20642
  and t1.indextype=636
  and t2.indextype=636
  and t2.subject_id=t1.subject_id
group by t2.cid;

Do you have any multi column indexes? From the text of your query it seems it could benefit from these two indexes: (cid, indextype) (subject_id, indextype) I do not know whether PostgreSQL can avoid the table if you make the first index (cid, indextype, subject_id), in other words: append all the columns needed for the join. In theory the query could then be satisfied from the indexes. Pasting the explain analyze output into http://explain.depesz.com/s/Yr4 we see that Postgres is doing an external sort using about 150MB of data. Now, we're not Postgres experts, or even great at relational design. Are there better ways of doing that query, or designing the table? For the latter we do have a number of constraints, though, that I don't want to outline now because this mail is already long enough. Those are probably important to know. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
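Spelled out, the indexes Robert proposes might look like this - a sketch with made-up index names; whether the heap access can actually be skipped for the covering variant depends on the PostgreSQL version (index-only scans only arrived in a later release than the 9.0 used here):

create index termindex_subj_type on termindex (subject_id, indextype);

-- covering variant: append the join column so the index alone
-- carries everything the query needs from t1
create index termindex_cid_type_subj on termindex (cid, indextype, subject_id);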
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller sfkel...@gmail.com wrote: I'm simply referring to literature (like the intro by Ramakrishnan & Gehrke). I just know that Oracle and MySQL actually do have them too and use them without those current implementation-specific restrictions in Postgres. Where exactly do you take it from that Oracle has hash indexes? I can't seem to find them: http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#sthref293 Are you mixing this up with hash partitioning? http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#sthref443 Or am I missing something? IMHO by design hash indexes (e.g. linear hashing) work best when:
1. only equality (=) tests are used (on whole values)
2. columns (key values) have very high cardinality
And ideally, but not necessarily, when index values do not change and the number of rows is known ahead of time (avoiding the O(N) worst case - but there are approaches to chaining with dynamic resizing). I just collected this to encourage ourselves that enhancing hash indexes could be worthwhile. There's still the locking issue Jeff mentioned. At least every time a table resize occurs the whole index must be locked. Or is there a more fine granular locking strategy which I am overlooking? Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure mmonc...@gmail.com wrote: On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller sfkel...@gmail.com wrote: Merlin and Jeff, General remark again: It's hard for me to imagine that btree is superior for all the issues mentioned before. I still believe in hash indexes for primary keys and certain unique constraints where you need equality search and don't need ordering or range search. It is -- but please understand I'm talking about int32 tree vs hash. Hashing as a technique of course is absolutely going to cream btree for all kinds of data because of the advantages of working with decomposed data -- we are all taught that in comp-sci 101 :-). The debate here is not about the advantages of hashing, but the specific implementation of the hash index used. Postgres's hash index implementation used to be pretty horrible -- it stored the pre-hashed datum in the index which, while making it easier to do certain things, made it horribly slow, and, for all intents and purposes, useless. Somewhat recently, a lot of work was put in to fix that -- the index now packs the hash code only, which made it competitive with btree and superior for larger keys. However, certain technical limitations like lack of WAL logging and uniqueness hold hash indexing back from being used like it really should be. In cases where I really *do* need hash indexing, I do it in userland.

create table foo (a_long_field text);
create index on foo (hash(a_long_field));

select * from foo
where hash(a_long_field) = hash(some_value)
  and a_long_field = some_value;

This technique works fine -- the main disadvantage is that enforcing uniqueness is a PITA but since the standard index doesn't support it either it's no great loss. I also have the option of getting 'uniqueness' and being able to skip the equality operation if I sacrifice some performance and choose a strong digest. Until the hash index issues are worked out, I submit that this remains the go-to method to do this. Is this approach (storing the hash code in a btree) really faster than a regular btree index on a_long_field? And if so, for which kind of data and load? Now, my point here is that I've noticed that even with the latest optimizations btree seems to still be superior to the hash indexing by most metrics, so that:

create table foo (
  an_int_field int,
  a_long_field text
);
create index on foo (an_int_field);
create index on foo using hash (a_long_field);

On performance grounds alone, the btree index seems to be (from my very limited testing) a better bet. So I'm conjecturing that the current hash implementation should be replaced with a formalization of the userland technique shown above -- when you request a hash index, the database will silently hash your field and weave it into a btree. It's a hybrid: a hashed btree. I'd rather call it a btreefied hash because you are storing a hash but in a btree structure. :-) But that's a detail. What I find worrying is that then there is a certain level of obscuring the real nature since create index ... using hash is not exactly creating a hash table. To truly demonstrate if the technique was effective though, it would have to be coded up -- it's only fair to compare if the btree based hash is also double checking the value in the heap which the standard hash index must do. Right. The other way to go of course is to try and fix up the existing hash index code -- add wal logging, etc.
In theory, a customized hash structure should be able to beat btree all day long, which argues to continue in this direction. I still haven't seen a solution to locking when a hash table needs resizing. All hashing algorithms I can think of at the moment would require a lock on the whole beast during the resize, which makes this type of index impractical for certain loads (heavy updating). One solution would be to apply partitioning to the hash table itself (e.g. have four partitions for the two least significant bits or 16 for the four least significant bits) and treat them independently. How that would interact with WAL I have no idea though. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
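Merlin's userland technique, made concrete: PostgreSQL has no generic SQL-level hash() function, so the sketch below substitutes the built-in hashtext() - that substitution is my assumption, not part of the original posting. The equality recheck on the raw value is what guards against hash collisions:

create table foo (a_long_field text);
create index foo_hash_idx on foo (hashtext(a_long_field));

select *
from foo
where hashtext(a_long_field) = hashtext('some value')
  and a_long_field = 'some value';  -- recheck, since different strings may hash alike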
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On Tue, Sep 13, 2011 at 5:13 PM, Marti Raudsepp ma...@juffo.org wrote: On Tue, Sep 13, 2011 at 00:26, Robert Klemme shortcut...@googlemail.com wrote: In the case of PG this particular example will work:
1. TX inserts new PK row
2. TX tries to insert same PK row => blocks
1. TX commits
2. TX fails with PK violation
2. TX does the update (if the error is caught)
That goes against the point I was making in my earlier comment. In order to implement this error-catching logic, you'll have to allocate a new subtransaction (transaction ID) for EVERY ROW you insert. I don't think so. You only need to catch the error (see attachment). Or does this create a subtransaction? If you're going to be loading billions of rows this way, you will invoke the wrath of the vacuum freeze process, which will seq-scan all older tables and re-write every row that it hasn't touched yet. You'll survive it if your database is a few GB in size, but in the terabyte land that's unacceptable. Transaction IDs are a scarce resource there. Certainly. But it's not needed as far as I can see. In addition, such blocking will limit the parallelism you will get from multiple inserters. Yes, I mentioned the speed issue. But regardless of the solution for MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you will have the locking problem anyhow if you plan to insert concurrently into the same table and be robust. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ ins.sql Description: Binary data -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On 13.09.2011 20:11, Marti Raudsepp wrote: On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com wrote: I don't think so. You only need to catch the error (see attachment). Or does this create a subtransaction? Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a SAVEPOINT it can roll back to in case of an error. Ouch! Learn something new every day. Thanks for the update! http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html Side note: it seems that Oracle handles this differently (i.e. no subtransaction but the INSERT would be rolled back) making the pattern pretty usable for this particular situation. Also, I have never heard that TX ids are such a scarce resource over there. Would anybody think it a good idea to optionally have a BEGIN EXCEPTION block without the current TX semantics? In the absence of that what would be a better pattern to do it (other than UPDATE and INSERT if not found)? Yes, I mentioned the speed issue. But regardless of the solution for MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you will have the locking problem anyhow if you plan to insert concurrently into the same table and be robust. In a mass-loading application you can often divide the work between threads in a manner that doesn't cause conflicts. Yeah, but concurrency might not be the only reason to optionally update. If the data is there you might rather want to overwrite it instead of failing. Kind regards robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On 13.09.2011 20:57, Stefan Keller wrote: Interesting debate. Indeed. 2011/9/13 Marti Raudsepp ma...@juffo.org: Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a SAVEPOINT it can roll back to in case of an error. Are you sure? In theory I always understood that there are no subtransactions. What theory are you referring to? In fact when looking at the docs there is chapter 39.6.6 saying "By default, any error occurring in a PL/pgSQL function aborts execution of the function, and indeed of the surrounding transaction as well. You can trap errors and recover from them by using a BEGIN block with an EXCEPTION clause." (http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html) So the doc isn't totally explicit about this. But whatever: What would be the function of a subtransaction? To give the possibility to recover and continue within the surrounding transaction? I find this pretty explicit: "It is important not to confuse the use of BEGIN/END for grouping statements in PL/pgSQL with the similarly-named SQL commands for transaction control. PL/pgSQL's BEGIN/END are only for grouping; they do not start or end a transaction. Functions and trigger procedures are always executed within a transaction established by an outer query — they cannot start or commit that transaction, since there would be no context for them to execute in. However, a block containing an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affecting the outer transaction. For more about that see Section 38.6.5." http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html Cheers robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
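What the quoted documentation describes can be mimicked at the plain SQL level with an explicit savepoint - a minimal sketch, table and values made up:

create table foo (id integer primary key, value text);

begin;
insert into foo values (1, 'first');
savepoint s;
insert into foo values (1, 'again');  -- fails with a unique violation
rollback to savepoint s;              -- the outer transaction survives
update foo set value = 'again' where id = 1;
commit;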
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On 11.09.2011 22:10, Scott Marlowe wrote: Another data point. We had a big Oracle installation at my last job, and OLAP queries were killing it midday, so I built a simple replication system to grab rows from the big iron Oracle SUN box and shove into a single core P IV 2.xGHz machine with 4 120G SATA drives in SW RAID-10. That machine handily beat the big iron Oracle machine at OLAP queries, running in 20 minutes what was taking well over an hour for the big Oracle machine to do, even during its (Oracle machine) off peak load times. Um, that sounds as if the SUN setup was really bad. Do you remember any details about the drive configuration there? Kind regards robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On 11.09.2011 19:02, Marti Raudsepp wrote: On Sun, Sep 11, 2011 at 17:23, Andy Colson a...@squeakycode.net wrote: On 09/11/2011 08:59 AM, Igor Chudov wrote: By the way, does that INSERT UPDATE functionality or something like this exist in Postgres? You have two options: 1) write a function like: create function doinsert(_id integer, _value text) returns void as 2) use two sql statements: Unfortunately both of these options have caveats. Depending on your I/O speed, you might need to use multiple loader threads to saturate the write bandwidth. However, neither option is safe from race conditions. If you need to load data from multiple threads at the same time, they won't see each other's inserts (until commit) and thus cause unique violations. If you could somehow partition their operation by some key, so threads are guaranteed not to conflict each other, then that would be perfect. The 2nd option given by Andy is probably faster. You *could* code a race-condition-safe function, but that would be a no-go on a data warehouse, since each call needs a separate subtransaction which involves allocating a transaction ID. Wouldn't it be sufficient to reverse the order for race condition safety? Pseudo code:

begin
  insert ...
catch
  update ...
  if not found error
end

Speed is another matter though... Kind regards robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
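One way Robert's pseudo code could be fleshed out as runnable PL/pgSQL - a sketch reusing the doinsert signature quoted above; table t and its columns are made up. Note that, as discussed elsewhere in the thread, the EXCEPTION block costs a subtransaction per call:

create table t (id integer primary key, value text);

create or replace function doinsert(_id integer, _value text) returns void as $$
begin
  insert into t (id, value) values (_id, _value);
exception
  when unique_violation then
    update t set value = _value where id = _id;
    if not found then
      raise exception 'row % disappeared concurrently', _id;
    end if;
end;
$$ language plpgsql;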
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On 12.09.2011 19:22, Andy Colson wrote: On 9/12/2011 12:15 PM, Robert Klemme wrote: On 11.09.2011 19:02, Marti Raudsepp wrote: On Sun, Sep 11, 2011 at 17:23, Andy Colson a...@squeakycode.net wrote: On 09/11/2011 08:59 AM, Igor Chudov wrote: By the way, does that INSERT UPDATE functionality or something like this exist in Postgres? You have two options: 1) write a function like: create function doinsert(_id integer, _value text) returns void as 2) use two sql statements: Unfortunately both of these options have caveats. Depending on your I/O speed, you might need to use multiple loader threads to saturate the write bandwidth. However, neither option is safe from race conditions. If you need to load data from multiple threads at the same time, they won't see each other's inserts (until commit) and thus cause unique violations. If you could somehow partition their operation by some key, so threads are guaranteed not to conflict each other, then that would be perfect. The 2nd option given by Andy is probably faster. You *could* code a race-condition-safe function, but that would be a no-go on a data warehouse, since each call needs a separate subtransaction which involves allocating a transaction ID. Wouldn't it be sufficient to reverse the order for race condition safety? Pseudo code:

begin
  insert ...
catch
  update ...
  if not found error
end

Speed is another matter though... No, I don't think so, if you had two loaders, both would start a transaction, then neither could see what the other was doing. It depends. But the point is that not both INSERTs can succeed. The one which fails will attempt the UPDATE and - depending on isolation level and DB implementation - will be blocked or fail. In the case of PG this particular example will work:
1. TX inserts new PK row
2. TX tries to insert same PK row => blocks
1. TX commits
2. TX fails with PK violation
2. TX does the update (if the error is caught)
There are transaction isolation levels, but they are like playing with fire. (in my opinion). You make them sound like witchcraft. But they are clearly defined - even standardized. Granted, different RDBMS might implement them in different ways - here's PG's view of TX isolation: http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html In my opinion anybody working with RDBMS should make himself familiar with this concept - at least know about it - because it is one of the fundamental features of RDBMS and certainly needs consideration in applications with highly concurrent DB activity. Kind regards robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
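For reference, the isolation level is picked per transaction; a minimal sketch (note that before 9.1 PostgreSQL's SERIALIZABLE actually provided snapshot isolation; true serializability via SSI arrived with 9.1):

begin;
set transaction isolation level serializable;
-- ... statements that must not see anomalies from concurrent transactions ...
commit;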
Re: [PERFORM] settings input for upgrade
On Sat, Aug 20, 2011 at 8:33 PM, Midge Brown midg...@sbcglobal.net wrote: Robert, I was largely looking for input on whether I may have inadvertently shot myself in the foot with some of the choices I made when setting up postgresql 9.0, which is on different hardware than was the 7.4 setup. OK, I thought the config change was the diff for the other two databases and not for 9.0. The splitting of one table to two separate databases was done on 7.4 and did make a positive change in write performance. I was including that information only in an attempt to provide as much detail as possible. Good to know! Thanks! Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] settings input for upgrade
On Thu, Aug 18, 2011 at 11:55 PM, Midge Brown midg...@sbcglobal.net wrote: I'm in the process of upgrading from postgres 7.4.8 to 9.0.4 and wanted to run my decisions past some folks who can give me some input on whether my decisions make sense or not. I am not sure what decisions you actually refer to here: in your posting I can only see a description of the current setup but no decisions for the upgrade (i.e. changed parameters, other physical layout etc.). The others are very write-heavy, started as one table within the original DB, and were split out on an odd/even id # in an effort to get better performance: Did it pay off? I mean, you planned to increase performance and did this actually happen? Apart from reserving IO bandwidth (which you achieved by placing data on different disks) you basically only added reserved memory for each instance by separating them. Or are there any other effects achieved by separating (like reduced lock contention on some globally shared resource, distribution of CPU for logging)? Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance penalty when using WITH
On Tue, Aug 2, 2011 at 11:48 PM, Merlin Moncure mmonc...@gmail.com wrote: On Sat, Jul 30, 2011 at 8:10 AM, Robert Klemme shortcut...@googlemail.com wrote: On Thu, Jul 28, 2011 at 11:00 PM, Li Jin l...@tripadvisor.com wrote: I met with the problem that when I was using a WITH clause to reuse a subquery, I got a huge performance penalty because of the query planner. Here are the details, the original query is

EXPLAIN ANALYZE WITH latest_identities AS (
  SELECT DISTINCT ON (memberid) memberid, username, changedate
  FROM t_username_history
  WHERE memberid IN (SELECT memberid FROM t_member
                     WHERE firstname || ' ' || substring(lastname,1,1) = 'Eddie T')
  ORDER BY memberid, changedate DESC
)

Another observation: That criterion looks suspicious to me. I would expect any RDBMS to be better able to optimize this: WHERE firstname = 'Eddie' AND lastname like 'T%' I know it's semantically not the same but I would assume this is good enough for the common use case. Plus, if there is an index on (firstname, lastname) then that could be used. disagree. just one of the ways that could be stymied would be to change the function behind the '||' operator. I don't understand what you mean. Can you please elaborate? To explain my point a bit: I meant that by querying individual fields separately instead of applying a criterion on a function of the two the RDBMS has a better chance to use indexes and come up with a better plan for this part of the query. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance penalty when using WITH
On Wed, Aug 3, 2011 at 3:27 PM, Li Jin l...@tripadvisor.com wrote: Robert, I've built an index on this expression firstname || ' ' || substring(lastname,1,1). I believe this is the best index for this particular query. Correct me if I am wrong. Maybe, maybe not. Difficult to tell from a distance. I would have an index on (firstname, lastname). You could try that and look at the plan for the other query. That's the only ultimate test which will give you hard facts. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance penalty when using WITH
On Wed, Aug 3, 2011 at 6:24 PM, Merlin Moncure mmonc...@gmail.com wrote: On Wed, Aug 3, 2011 at 2:18 AM, Robert Klemme shortcut...@googlemail.com wrote: Another observation: That criterion looks suspicious to me. I would expect any RDBMS to be better able to optimize this: WHERE firstname = 'Eddie' AND lastname like 'T%' I know it's semantically not the same but I would assume this is good enough for the common use case. Plus, if there is an index on (firstname, lastname) then that could be used. disagree. just one of the ways that could be stymied would be to change the function behind the '||' operator. I don't understand what you mean. Can you please elaborate? To explain my point a bit: I meant that by querying individual fields separately instead of applying a criterion on a function of the two the RDBMS has a better chance to use indexes and come up with a better plan for this part of the query. Yes, but you're assuming that it is safe and generally advantageous to do that. Both assumptions I think are false. I am not sure why you say I assume this is _safe_. I said it is good enough for the common use case. And it is certainly good enough for this particular query. As for the generally advantageous I'd say that an index on raw column values is usually useful for more queries than an index on a specific function. That's why I'd say generally an index on column values is more versatile and I would prefer it. Of course you might achieve orders of magnitude of speedup for individual queries with an index on a function tailored to that particular query, but if you need to do that for multiple queries you pay a higher penalty for updates. The || operator is trivially hacked:

create or replace function funky_concat(l text, r text) returns text as $$
  select textcat(textcat($1, 'abc'), $2);
$$ language sql immutable;

update pg_operator set oprcode = 'funky_concat' where oid = 654;

postgres=# select 'a' || 'b';
 ?column?
----------
 aabcb
(1 row)

Also even ignoring the above it's not free to have the database try and analyze every instance of the || operator to see if it can be decomposed to boolean field operations. Even with your hacked operator you would need an index on the expression to make it efficient. That could be done with the original || as well. But my point was to query

WHERE a = 'foo' and b like 'b%'

instead of

WHERE a || ' ' || substring(b, 1, 1) = 'foo b'

to use an index on (a,b). That index would also be useful for queries like

WHERE a = 'foo'
WHERE a like 'fo%'
WHERE a = 'foo' and b = 'bar'

and probably also

WHERE a > 'foo'
WHERE a > 'foo' and b like 'b%'
WHERE a > 'foo' and b = 'bar'

Kind regards robert PS: Sorry for the earlier duplicate. Gmail had a hiccup. -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
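A sketch of the raw-column index the argument above advocates, with two of the queries from the list (table and column names taken from the example; note that for the LIKE prefix tests to use a btree under a non-C locale the index would additionally need text_pattern_ops):

create index foo_a_b_idx on foo (a, b);

select * from foo where a = 'foo' and b like 'b%';
select * from foo where a > 'foo' and b = 'bar';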
Re: [PERFORM] Trigger or Function
On Sat, Jul 30, 2011 at 3:01 AM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: On 24/07/11 03:58, alan wrote: My first approach would be to remove WeekAvg and MonthAvg from the table and create a view which calculates appropriate values. Thanks Robert, I had to upgrade to 9.0.4 to use the extended windowing features. Here is how I set it up. If anyone sees an issue, please let me know. I'm new to postgres. Basically, my daily_vals table contains HOST, DATE, VALUE columns. What I wanted was a way to automatically populate a 4th column called rolling_average, which would be the sum of n preceding columns. There seems to be a contradiction in the naming here. Did you mean avg of n preceding columns? I created a view called weekly_average using this VIEW statement.

CREATE OR REPLACE VIEW weekly_average AS
SELECT *, sum(value) OVER (PARTITION BY host
                           ORDER BY rid
                           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as rolling_average
FROM daily_vals;

The above gives just the rolling sum; you need to divide by the number of rows in the sum to get the average (I assume you want the arithmetic mean, as there are many types of average!).

CREATE OR REPLACE VIEW weekly_average AS
SELECT *, round((sum(value) OVER mywindow / LEAST(6, (row_number() OVER mywindow))), 4) AS rolling_average
FROM daily_vals
WINDOW mywindow AS (
  PARTITION BY host
  ORDER BY rid
  ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
);

Why not

CREATE OR REPLACE VIEW weekly_average AS
SELECT *, avg(value) OVER (PARTITION BY host
                           ORDER BY rid
                           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as rolling_average
FROM daily_vals;

What did I miss? Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance penalty when using WITH
On Thu, Jul 28, 2011 at 11:00 PM, Li Jin l...@tripadvisor.com wrote: I met with the problem that when I was using a WITH clause to reuse a subquery, I got a huge performance penalty because of the query planner. Here are the details, the original query is

EXPLAIN ANALYZE WITH latest_identities AS (
  SELECT DISTINCT ON (memberid) memberid, username, changedate
  FROM t_username_history
  WHERE memberid IN (SELECT memberid FROM t_member
                     WHERE firstname || ' ' || substring(lastname,1,1) = 'Eddie T')
  ORDER BY memberid, changedate DESC
)

Another observation: That criterion looks suspicious to me. I would expect any RDBMS to be better able to optimize this: WHERE firstname = 'Eddie' AND lastname like 'T%' I know it's semantically not the same but I would assume this is good enough for the common use case. Plus, if there is an index on (firstname, lastname) then that could be used. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
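The two candidate indexes side by side - a sketch with made-up index names; the expression index matches only the original concatenated criterion, while the two-column index serves the rewritten criterion and other queries on the same columns:

-- matches the original criterion exactly
create index t_member_expr_idx
  on t_member ((firstname || ' ' || substring(lastname, 1, 1)));

-- usable by the rewritten criterion below and by other queries
create index t_member_name_idx on t_member (firstname, lastname);

select memberid
from t_member
where firstname = 'Eddie'
  and lastname like 'T%';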
Re: [PERFORM] Trigger or Function
On Tue, Jul 12, 2011 at 9:41 AM, alan alan.mill...@gmail.com wrote: Hello, I'm a postgres newbie and am wondering what's the best way to do this. I am gathering some data and will be inserting to a table once daily. The table is quite simple but I want the updates to be as efficient as possible since this db is part of a big data project. Say I have a table with these columns:
| Date | Hostname | DayVal | WeekAvg | MonthAvg |
When I insert a new row I have the values for Date, Hostname, DayVal. Is it possible to define the table in such a way that the WeekAvg and MonthAvg are automatically updated as follows? WeekAvg = current row's DayVal plus the sum of DayVal for the previous 6 rows. MonthAvg = current row's DayVal plus the sum of DayVal for the previous 29 rows. Should I place the logic in a Trigger or in a Function? Does someone have an example or a link showing how I could set this up? IMHO that design does not fit the relational model well because you are trying to store multirow aggregate values in individual rows. For example, your values will be wrong if you insert rows in the wrong order (i.e. today's data before yesterday's data). My first approach would be to remove WeekAvg and MonthAvg from the table and create a view which calculates appropriate values. If that proves too inefficient (e.g. because the data set is too huge and too much data is queried for individual queries) we can start optimizing. One approach to optimizing would be to have secondary tables
| Week | Hostname | WeekAvg |
| Month | Hostname | MonthAvg |
and update them with an insert trigger and probably also with an update and delete trigger. If you actually need increasing values (i.e. running totals) you can use windowing functions (analytic SQL in Oracle). View definitions then of course need to change. http://www.postgresql.org/docs/9.0/interactive/queries-table-expressions.html#QUERIES-WINDOW Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
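A sketch of the view-based approach using window functions, with the columns from the posting (table and view names are made up; the ROWS frame syntax requires 8.4 or later):

create table daily_vals (
  date     date,
  hostname text,
  dayval   numeric
);

create view daily_avgs as
select date, hostname, dayval,
       avg(dayval) over (partition by hostname order by date
                         rows between 6 preceding and current row) as weekavg,
       avg(dayval) over (partition by hostname order by date
                         rows between 29 preceding and current row) as monthavg
from daily_vals;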
Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database
On Thu, Jul 14, 2011 at 4:05 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Tom Lane t...@sss.pgh.pa.us wrote: It seems like we ought to distinguish heap cleanup activities from user-visible semantics (IOW, users shouldn't care if a HOT cleanup has to be done over after restart, so if the transaction only wrote such records there's no need to flush). This'd require more process-global state than we keep now, I'm afraid. That makes sense, and seems like the right long-term fix. It seems like a boolean might do it; the trick would be setting it (or not) in all the right places. I also believe this is the right way to go. I think the crucial thing is in "distinguish heap cleanup activities from user-visible semantics" - basically this is what happens with auto vacuum: it does work concurrently that you do not want to burden user transactions with. Another approach we could take (also nontrivial) is to prevent select-only queries from doing HOT cleanups. You said upthread that there were alleged performance benefits from aggressive cleanup, but IMO that can charitably be described as unproven. The real reason it happens is that we didn't see a simple way for page fetches to know soon enough whether a tuple update would be likely to happen later, so they just do cleanups unconditionally. Hmm. One trivial change could be to skip it when the top level transaction is declared to be READ ONLY. At least that would give people a way to work around it for now. Of course, that can't be back-patched before 9.1 because subtransactions could override READ ONLY before that. What I don't like about this approach is that it a) increases complexity for the user, b) might not work for everyone (i.e. tools like OR mappers which do not allow such setting of the TX, or cases where you do not know what type of TX this is when you start it), and c) the performance penalty is still there and may suddenly come to haunt a different TX. I can only speculate whether the latter might actually cause other people to run into issues because their usage patterns currently force the cleanout activities into an unimportant TX while the workaround would suddenly have the cleanout delay show up in an important TX which used to be fast. This is also hard to debug since you would normally only look at the slow TX before you realize you need to look elsewhere (the length of this thread is kind of proof of this already :-)). My 0.01 EUR... Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
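For reference, the READ ONLY declaration the workaround would hang off of looks like this at the SQL level (a sketch; some_table is made up):

begin transaction read only;
select count(*) from some_table;  -- reads are fine; any write fails in this TX
commit;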
Re: [PERFORM] 100% CPU Utilization when we run queries.
On Wed, Jul 6, 2011 at 9:04 PM, Tomas Vondra t...@fuzzy.cz wrote: On 6.7.2011 15:30, bakkiya wrote: Any help, please? According to the EXPLAIN ANALYZE output (please, don't post it to the mailing list directly - use something like explain.depesz.com, I've done that for you this time: http://explain.depesz.com/s/HMN), you're doing a UNIQUE over a lot of data (2 million rows, 1.5GB). That is done by sorting the data, and sorting is usually a very CPU-intensive task. So the fact that the CPU is 100% utilized is kind of expected in this case. So that's a feature, not a bug. In general each process is hitting some bottleneck. It might be an I/O, it might be a CPU, it might be something less visible (memory bandwidth or something like that). But I've noticed one thing in your query - you're doing a UNIQUE in the view (probably, we don't know the definition) and then once again in the query (but using just one column from the view). The problem is the inner sort does not remove any rows (1979735 rows in/out). Why do you do the UNIQUE in the view? Do you really need it there? I guess removing it might significantly improve the plan. Try to do the query without the view - it seems it's just a union of current tables and a history (both partitioned, so do something like this)

SELECT DISTINCT init_service_comp FROM (
  SELECT init_service_comp FROM events
  UNION
  SELECT init_service_comp FROM hist_events
) t

or maybe even

SELECT DISTINCT init_service_comp FROM (
  SELECT DISTINCT init_service_comp FROM events
  UNION
  SELECT DISTINCT init_service_comp FROM hist_events
) t

Let's see how that works - post EXPLAIN ANALYZE using explain.depesz.com In this case UNION ALL is probably more appropriate than UNION - and may have different performance characteristics (saving the UNIQUE?). Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
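Robert's UNION ALL variant spelled out - the outer DISTINCT still deduplicates, but the branches are merely concatenated instead of being sorted and deduplicated an extra time:

SELECT DISTINCT init_service_comp
FROM (
  SELECT init_service_comp FROM events
  UNION ALL
  SELECT init_service_comp FROM hist_events
) t;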
Re: [PERFORM] Long Running Update - My Solution
On Mon, Jun 27, 2011 at 5:37 PM, t...@fuzzy.cz wrote: The mystery remains, for me: why updating 100,000 records could complete in as quickly as 5 seconds, whereas an attempt to update a million records was still running after 25 minutes before we killed it? Hi, there's a lot of possible causes. Usually this is caused by a plan change - imagine for example that you need to sort a table and the amount of data just fits into work_mem, so that it can be sorted in memory. If you need to perform the same query with 10x the data, you'll have to sort the data on disk. Which is way slower, of course. And there are other such problems ... I would rather assume it is one of the other problems, typically related to handling the TX (e.g. checkpoints, WAL, creating copies of modified records and adjusting indexes...). Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] bitmask index
On 06/22/2011 11:42 PM, Greg Smith wrote: On 06/22/2011 05:27 PM, Marcus Engene wrote: I have some tables with bitmask integers. Set bits are the interesting ones. Usually they are sparse. If it's sparse, create a partial index that just includes rows where the bit is set: http://www.postgresql.org/docs/current/static/indexes-partial.html That would mean that if different bits are queried there would need to be several of those indexes. Maybe it's an alternative to index all rows where misc_bits <> 0 and include that criterion in the query. Kind regards robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
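Both alternatives as sketches - the table is made up, misc_bits is the bitmask column from the posting, and bit 2 (mask 4) stands in for whichever bit is interesting:

-- Greg's suggestion: one partial index per queried bit
create index foo_bit2_idx on foo (misc_bits) where (misc_bits & 4) <> 0;

-- the alternative above: a single partial index for rows with any bit set
create index foo_anybit_idx on foo (misc_bits) where misc_bits <> 0;

select *
from foo
where misc_bits <> 0          -- matches the partial index predicate
  and (misc_bits & 4) <> 0;   -- the bit actually wanted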
Re: [GENERAL] [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
On Thu, Jun 9, 2011 at 7:44 PM, Greg Smith g...@2ndquadrant.com wrote: On 06/09/2011 07:43 AM, Willy-Bas Loos wrote: Well, after reading your article I have been reading some material about it on the internet, stating that separating indexes from data for performance benefits is a myth. I found your comment "So then a single query will only ever access one of both at a time." very smart (no sarcasm there). I also found a thread http://asktom.oracle.com/pls/asktom/f?p=100:11:0P11_QUESTION_ID:901906930328 on AskTom that said mainly the goal is to achieve even IO (that makes absolute sense). The idea that separating indexes and tables from one another via a tablespace is inherently good is a myth. Particularly nowadays, where the fastest part of a drive is nearly twice as fast as the slowest one in sequential transfers, and the ratio between sequential and random I/O is huge. Trying to get clever about breaking out a tablespace is unlikely to outsmart what you'd get if you just let the OS deal with that stuff. What is true is that when you have multiple tiers of storage speeds available, allocating the indexes and tables among them optimally is both difficult and potentially worthwhile. A customer of mine has two drive arrays, one of which is about 50% faster than the other; the second was added as an expansion once the first filled. Nowadays, both are 75% full, and I/O on each has to be carefully balanced. Making sure the heavily hit indexes are on the fast array, and that less critical things are not placed there, is the difference between that site staying up or going down. The hidden surprise in this problem for most people is the day they discover that *the* most popular indexes, the ones they figured had to go on the fastest storage around, are actually sitting in RAM all the time anyway. It's always fun and sad at the same time to watch someone spend a fortune on some small expensive storage solution, move their most performance critical data to it, and discover nothing changed. Some days that works great; others, it's no faster at all, because that data was already in memory. http://www.2ndQuadrant.com/books Adding a few more thoughts to this: it is important to understand the very different nature of read and write IO. While write IO usually can be done concurrently to different IO channels (devices), for read IO there are typically dependencies, e.g. you need to read the index before you know which part of the table you need to read. Thus both cannot be done concurrently for a *single select* unless the whole query is partitioned and executed in parallel (Oracle can do that for example). Even then each parallel executor has this dependency between index and table data. That's the same situation as with concurrent queries into the same table and index. There are of course exceptions, e.g. during a sequential table scan you can know beforehand which blocks need to be read next and fetch them while processing the current block(s). The buffering strategy also plays an important role here. Bottom line: one needs to look at each case individually, do the math and ideally also measurements. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/
Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...
On Thu, Feb 10, 2011 at 7:32 PM, Craig James craig_ja...@emolecules.com wrote: On 2/10/11 9:21 AM, Kevin Grittner wrote: Shaun Thomas stho...@peak6.com wrote: how difficult would it be to add that syntax to the JOIN statement, for example? Something like this syntax?: JOIN WITH (correlation_factor=0.3) Where 1.0 might mean that for each value on the left there was only one distinct value on the right, and 0.0 would mean that they were entirely independent? (Just as an off-the-cuff example -- I'm not at all sure that this makes sense, let alone is the best thing to specify. I'm trying to get at *syntax* here, not particular knobs.) There are two types of problems:
1. The optimizer is imperfect and makes a sub-optimal choice.
2. There are theoretical reasons why it's hard for the optimizer. For example, in a table with 50 columns, there is a staggering number of possible correlations. An optimizer can't possibly figure this out, but a human might know them from the start. The City/Postal-code correlation is a good example.
For #1, Postgres should never offer any sort of hint mechanism. As many have pointed out, it's far better to spend the time fixing the optimizer than adding hacks. For #2, it might make sense to give a designer a way to tell Postgres stuff that it couldn't possibly figure out. But ... not until the problem is clearly defined. What should happen is that someone writes with an example query, and the community realizes that no amount of cleverness from Postgres could ever solve it (for solid theoretical reasons). Only then, when the problem is clearly defined, should we talk about solutions and SQL extensions. I don't have one such query handy. However, I think your posting is a good starting point for a discussion of how to figure out what we need and what a good solution could look like. For example, one thing I dislike about hints is that they go into the query. There are a few drawbacks of this approach:
- Applications need to be changed to benefit, which is not always possible.
- One important class of such applications are those that use OR mappers - hinting would then have to be buried in OR mapper code or configuration.
- Hints in the query work only for exactly that query (this might be an advantage depending on point of view).
I think the solution should rather be to tell Postgres what it couldn't possibly figure out. I imagine that could be some form of description of the distribution of data in columns and / or correlations between columns. The advantage would be that the optimizer gets additional input which it can use (i.e. the usage can change between releases), the information is separate from queries (more like meta data for tables) and thus all queries using a particular table which was augmented with this meta data would benefit. Usage of this meta data could be controlled by a flag per session (as well as globally) so it would be relatively easy to find out whether this meta data has become obsolete (because data changed or a new release of the database is in use). Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
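As an aside: years later PostgreSQL did grow something along these lines - extended statistics (version 10 and up) are exactly such separate, per-table metadata about correlations between columns, picked up automatically by the planner. A sketch with a made-up table, for the City/Postal-code example above:

create statistics addr_city_zip_stats (dependencies)
  on city, postal_code from addresses;
analyze addresses;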
Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL?
On Mon, May 23, 2011 at 5:54 PM, Dave Johansen davejohan...@gmail.com wrote: I apologize for the multiple posts. I sent this email right after joining the list and after it hadn't shown up a day later I figured that it had been lost or something and sent the other one. Sorry for the nitpicking but I even see _three_ instances of this posting (first on May 18th). Also, the database I posted this about does not have internet access and so I'm working on getting it moved over to a machine that does or getting it the info onto a machine where I can post the pertinent information about the schema and explain outputs. Great! Cheers robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL?
Dave, how often do you want to repeat that posting? What about instead replying to the answers you got so far? Cheers robert On Tue, May 17, 2011 at 5:31 PM, Dave Johansen davejohan...@gmail.com wrote: I am using Postgres 8.3 and I have an issue very closely related to the one described here: http://archives.postgresql.org/pgsql-general/2005-06/msg00488.php Basically, I have a VIEW which is a UNION ALL of two tables but when I do a select on the view using a LIMIT, it scans the entire tables and takes significantly longer than writing out the query with the LIMITs in the sub-queries themselves. Is there a solution to get the view to perform like the sub-query version? Thanks, Dave -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] LIMIT and UNION ALL
On Wed, May 18, 2011 at 5:26 PM, Dave Johansen davejohan...@gmail.com wrote: I am using Postgres 8.3.3 and I have a VIEW which is a UNION ALL of two tables but when I do a select on the view using a LIMIT, it scans the entire tables and takes significantly longer than writing out the query with the LIMITs in the sub-queries themselves. Is there a solution to get the view to perform like the query with the LIMIT explicitly placed in the sub-queries? Can you show DDL and queries? The query with the LIMIT on the subqueries and the one with the LIMIT on the overall query are not semantically equivalent. Since you can have an ORDER BY before the LIMIT, in the query with the LIMIT on the view the database must have all the rows before it can apply the ordering and properly determine the limit. Although it might be possible to determine under particular circumstances that only one of the tables needs to be queried, or that the tables need only be queried partially, I deem that quite complex. I do not know whether Postgres can do such optimizations but for that we would certainly need to see the concrete example (including constraints and indexes). Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
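A sketch of the two forms being compared - all names are made up; the rewritten version is not equivalent in general, but each branch can satisfy its LIMIT cheaply, e.g. via an index on ts:

create view combined as
select * from current_data
union all
select * from historic_data;

-- LIMIT over the view: the ordering forces all rows to be considered
select * from combined order by ts desc limit 10;

-- LIMIT pushed into the branches
select * from (
  (select * from current_data  order by ts desc limit 10)
  union all
  (select * from historic_data order by ts desc limit 10)
) t
order by ts desc
limit 10;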
Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
On Mon, May 16, 2011 at 4:31 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, May 16, 2011 at 4:19 AM, Robert Klemme shortcut...@googlemail.com wrote: - If the planner chooses a Bitmap Index Scan, it effectively scans the index to figure out which table blocks to read, and then reads those table blocks in block number order, so that the I/O is sequential, with skips. Are these two separate phases (i.e. first scan index completely, then access table)? Yes. So then a single query will only ever access one of both at a time. Separating indexes and tables might not be a totally good idea generally. Richard Foote has an excellent article about Oracle but I assume at least a few things do apply to PostgreSQL as well - it's at least worth checking PostgreSQL's access patterns against it: http://richardfoote.wordpress.com/2008/04/16/separate-indexes-from-tables-some-thoughts-part-i-everything-in-its-right-place/ I would probably rather try to separate data by the nature and frequency of accesses. One reasonable separation would be to leave all frequently accessed tables *and* their indexes on local RAID and to move less frequently accessed data to the SAN. This separation could be easily identified if you have separate tables for current and historic data. Yeah, I think the idea of putting tables and indexes in separate tablespaces is mostly to bring more I/O bandwidth to bear on the same data. Richard commented on that as well, I believe it was in http://richardfoote.wordpress.com/2008/04/18/separate-indexes-from-tables-some-thoughts-part-ii-there-there/ The main point is that you do not benefit from the larger IO bandwidth if access patterns do not permit parallel access to both disks (e.g. because you first need to read index blocks in order to know the table blocks to read). The story might be different though if you have a lot of concurrent accesses. But even then, if the table is a hotspot chances are that index blocks are cached and you only need physical IO for table blocks... But there are other reasonable things you might do also - e.g. put the indexes on an SSD, and the tables on a spinning disk, figuring that the SSD is less reliable but you can always rebuild the index if you need to... Richard commented on that theory as well: http://richardfoote.wordpress.com/2008/05/02/indexes-in-their-own-tablespace-recoverability-advantages-get-back/ The point: if you do the math you might figure that lost indexes lead to so much downtime that you don't want to risk that, and the rebuild isn't all that simple (in terms of time). For a reasonably sized database recovery might be significantly faster than rebuilding. Also, a lot of people have reported big speedups from putting pg_xlog on a dedicated RAID 1 pair, or moving the PostgreSQL logs off the data partition. So those sorts of divisions should be considered also. Now, this is something I'd seriously consider because access patterns to pg_xlog are vastly different than those of table and index data! So you want to have pg_xlog on a device with high reliability and high write speed. Your idea of dividing things by access frequency is another good thought. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
On Tue, May 17, 2011 at 11:47 AM, Craig Ringer cr...@postnewspapers.com.au wrote:
> On 05/17/2011 03:00 PM, Robert Klemme wrote:
>> The main point is that you do not benefit from the larger I/O bandwidth
>> if access patterns do not permit parallel access to both disks (e.g.
>> because you first need to read index blocks in order to know which
>> table blocks to read).
>
> This makes me wonder if Pg attempts to pre-fetch blocks of interest for
> areas where I/O needs can be known in advance, while there's still other
> work or other I/O to do. For example, pre-fetching for the next iteration
> of a nested loop while still executing the prior one. Is it even
> possible?
>
> I'm guessing not, because (AFAIK) Pg uses only synchronous blocking I/O,
> and with that there isn't really a way to pre-fetch w/o threads or helper
> processes. Linux (at least) supports buffered async I/O, so it'd be
> possible to submit such prefetch requests ... on modern Linux kernels.
> Portably doing so, though - not so much.

There is a much more serious obstacle than the merely technical one (if that was one at all): prefetching is only reasonable if you can predict which data you will need with high probability (say >= 80%). If you can't, you'll incur much more I/O than without prefetching, and overall performance will likely suffer. Naturally, that probability depends on the data at hand and the access pattern. As Cédric wrote, there seems to be at least one case where it's done.

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
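As far as I know, the one case alluded to is the bitmap heap scan, whose prefetching is controlled by effective_io_concurrency (available from 8.4 on); a small sketch, with an invented table and column:

  -- Prefetching of heap pages during bitmap heap scans is governed by
  -- effective_io_concurrency; 0 disables it. A common rule of thumb is
  -- to set it roughly to the number of spindles in the stripe.
  SET effective_io_concurrency = 4;   -- e.g. for a 4-disk stripe

  -- A query likely to use a bitmap heap scan can then issue advisory
  -- prefetch hints (posix_fadvise) for upcoming heap pages.
  EXPLAIN SELECT * FROM orders WHERE customer_id BETWEEN 100 AND 200;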
Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move
On Fri, May 13, 2011 at 9:04 PM, Robert Haas robertmh...@gmail.com wrote:
> On Wed, May 4, 2011 at 6:31 AM, Willy-Bas Loos willy...@gmail.com wrote:
>> I'm asking them for (real) benchmarks, thanks for the advice. (fio is
>> not available for us now to do it myself, grmbl)
>>
>> It just occurred to me that it is not necessarily the case that reading
>> the indexes causes a lot of random I/O (on the index itself). I mean,
>> maybe the index is generally read sequentially and then, when retrieving
>> the data, there is a lot of random I/O. If it's a long story, any tips
>> for info about this (book or web site)?
>
> If you don't do anything special, and if the query plan says Index Scan
> rather than Bitmap Index Scan, then both the index I/O and the table I/O
> are likely to be fairly random. However there are a number of cases in
> which you can expect the table I/O to be sequential:
>
> - In some cases, you may happen to insert rows with an ordering that
> matches the index. For example, if you have a table with not too many
> updates and deletes, and an index on a serial column, then new rows will
> have a higher value in that column than old rows, and will also typically
> be physically after older rows in the file. Or you might be inserting
> timestamped data from oldest to newest.
>
> - If the planner chooses a Bitmap Index Scan, it effectively scans the
> index to figure out which table blocks to read, and then reads those
> table blocks in block number order, so that the I/O is sequential, with
> skips.

Are these two separate phases (i.e. first scan the index completely, then access the table)?

> - If you CLUSTER the table on a particular index, it will be physically
> ordered to match the index's key ordering. As the table is further
> modified the degree of clustering will gradually decline; eventually you
> may wish to re-CLUSTER.
>
> It's also worth keeping in mind that the index itself won't necessarily
> be accessed in physically sequential order. The point of the index is to
> emit the rows in key order, but if the table is heavily updated, it won't
> necessarily be the case that a page containing lower-valued keys
> physically precedes a page containing higher-valued keys. I'm actually
> somewhat fuzzy on how this works, and to what extent it's a problem in
> practice, but I am fairly sure it can happen.

Separating index and tables might not be such a good idea generally. Richard Foote has an excellent article about Oracle, but I assume at least a few things apply to PostgreSQL as well - it is at least worth checking PostgreSQL's access patterns against it:

http://richardfoote.wordpress.com/2008/04/16/separate-indexes-from-tables-some-thoughts-part-i-everything-in-its-right-place/

I would probably rather try to separate data by the nature and frequency of accesses. One reasonable separation would be to leave all frequently accessed tables *and* their indexes on local RAID and move less frequently accessed data to the SAN. This separation is easy to identify if you have separate tables for current and historic data.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
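A small sketch of the CLUSTER route and of how to watch the clustering decay (names are invented; pg_stats.correlation approximates how well heap order still matches the column order):

  -- Physically rewrite the table in index order (takes an exclusive
  -- lock for the duration; table and index names are hypothetical).
  CLUSTER orders USING orders_created_at_idx;

  -- correlation near 1.0 means the heap still matches the column
  -- order well; as it decays toward 0, a re-CLUSTER may pay off.
  SELECT attname, correlation
  FROM pg_stats
  WHERE tablename = 'orders' AND attname = 'created_at';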
Re: [PERFORM] big distinct clause vs. group by
On Mon, Apr 18, 2011 at 7:13 PM, Uwe Bartels uwe.bart...@gmail.com wrote:
> the aggregate function I was talking about is the function I need to use
> for the non-group-by columns, like min() in my example. There are of
> course several functions to choose from, and I wanted to know which one
> consumes the fewest resources.

If you do not care about the output of the non-key columns, why do you include them in the query at all? That would certainly be the cheapest option. If you need _any_ column value, you can use a constant:

rklemme=> select * from t1;
 k | v
---+---
 0 | 0
 0 | 1
 1 | 2
 1 | 3
 2 | 4
 2 | 5
 3 | 6
 3 | 7
 4 | 8
 4 | 9
(10 rows)

rklemme=> select k, 99 as v from t1 group by k order by k;
 k | v
---+----
 0 | 99
 1 | 99
 2 | 99
 3 | 99
 4 | 99
(5 rows)

Greetings from Paderborn

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] How to configure a read-only database server?
On Tue, Apr 19, 2011 at 12:08 AM, Stefan Keller sfkel...@gmail.com wrote:
> I browsed the FAQ and looked at PostgreSQL performance books but I could
> not find the obvious: How to configure a read-only database server?
>
> I have a single-disk virtual Linux system and a read-only dataset which
> is exposed to the internet and completely replaced from time to time.
>
> This is what I found so far:
>
> * Disabling autovacuum daemon.

I guess this will give you only small benefits, as the daemon won't find any tables with modifications anyway.

> * Setting postgresql.conf parameters:
>     fsync=off
>     synchronous_commit=off

Since you don't commit changes, the effect of these might be small as well.

>     full_page_writes=off
>
> * For the session:
>     SET transaction_read_only TO FALSE;

Did you mean TRUE?

>     SET TRANSACTION READ ONLY;

What about

  ALTER DATABASE x SET default_transaction_read_only = on;

?

> * What about wal_level and archive_mode?
>
> => Any comments on speeding up/optimizing such a database server?

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
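Collecting the candidates above into one place, a sketch of such a configuration might look like this (untested; the memory figures are placeholders to adjust for the machine, not recommendations):

  # postgresql.conf sketch for a read-only, periodically replaced dataset
  autovacuum = off                   # nothing to vacuum between reloads
  fsync = off                        # only safe because data can be rebuilt
  synchronous_commit = off
  full_page_writes = off
  default_transaction_read_only = on
  shared_buffers = 512MB             # size to the working set
  effective_cache_size = 2GB         # reflect the OS cache on the box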
Re: [PERFORM] big distinct clause vs. group by
On Tue, Apr 19, 2011 at 10:47 AM, Uwe Bartels uwe.bart...@gmail.com wrote:
> Oh, I do care about these columns. But when using a group by on the key
> columns, I cannot select the other columns as they are - otherwise you
> get an error message. So I have to use an aggregate function like min().

I find that slightly contradictory: if you do care about the values, then your business requirements dictate the aggregate function, because SELECT a, MAX(b) ... GROUP BY a and SELECT a, MIN(b) ... GROUP BY a are not equivalent. If you only want to pick some value actually present in the table but do not care which one (e.g. MIN, MAX or any other), then you don't actually care about the value - and then there is probably no point in selecting the column at all. At best you could use a constant standing in for any legal value.

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
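With the t1 table from my earlier message (two v values per k), the non-equivalence is easy to see:

  -- MIN and MAX disagree on every group:
  SELECT k, MIN(v) FROM t1 GROUP BY k ORDER BY k;
  -- yields (0,0) (1,2) (2,4) (3,6) (4,8)

  SELECT k, MAX(v) FROM t1 GROUP BY k ORDER BY k;
  -- yields (0,1) (1,3) (2,5) (3,7) (4,9)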
Re: [PERFORM] Which gives good performance? separate database vs separate schema
On Thu, Nov 25, 2010 at 4:46 PM, t...@fuzzy.cz wrote:
>> I am not facing any issues, but yes I want to have optimal performance
>> for SELECT and INSERT, especially when I am doing these ops repeatedly.
>> Actually I am porting from Oracle to PG. Oracle starts a lot of
>> processes when it needs to run many schemas. I do not think PG would
>> need much more resources (mem, cpu) if I go for a different database
>> for each process..? Also, is there any limit on the number of databases
>> I can start using a PG server?
>
> Hm, I would try to run that using a single cluster, and only if that
> does not perform well I'd try multiple clusters. Yes, Oracle starts a
> lot of processes for an instance, and then some processes for each
> connection.
>
> But again - in PostgreSQL, you do not start databases. You start a
> cluster, containing databases, and then there are connections. This is
> similar to Oracle, where you start instances (something like a cluster
> in PostgreSQL) containing schemas (something like databases in
> PostgreSQL). And then you create connections, which are the objects
> consuming processes and memory. PostgreSQL will create one process for
> each connection (roughly the same as Oracle in case of a dedicated
> server). And yes, the number of connections is limited - see the
> max_connections parameter in postgresql.conf.

I think this is a pretty common trade-off: basically the question is whether one wants to reserve resources or share resources. In this case the resources are memory and maybe also disk I/O.

With two separate clusters, each one has its own memory. This means that if one cluster is idle and the other one is under high load, the idle cluster's memory cannot be used by the other one. With a single cluster all the memory is shared, which has the downside that high load on one database can eat into the memory available to the others.

Which way to go depends on the usage patterns (load) and the user's policy. Since the OP mentioned many instances, the overhead of many clusters (even if idle) may come into play as well. Plus, a single cluster is likely easier to administer than multiple ones. But of course the more databases there are in a single cluster, the higher the likelihood of bottlenecks (see the other thread "Performance under contention").

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] best db schema for time series data?
On Fri, Nov 19, 2010 at 10:50 AM, Louis-David Mitterrand vindex+lists-pgsql-performa...@apartia.org wrote:
> On Tue, Nov 16, 2010 at 05:28:19PM +0100, Harald Fuchs wrote:
>> In article 4ce2688b.2050...@tweakers.net, Arjen van der Meijden
>> acmmail...@tweakers.net writes:
>>> On 16-11-2010 11:50, Louis-David Mitterrand wrote:
>>>> I have to collect lots of prices from web sites and keep track of
>>>> their changes. What is the best option?
>>>>
>>>> 1) one 'price' row per price change:
>>>>
>>>> create table price (
>>>>     id_price primary key,
>>>>     id_product integer references product,
>>>>     price integer
>>>> );
>>>>
>>>> 2) a single 'price' row containing all the changes:
>>>>
>>>> create table price (
>>>>     id_price primary key,
>>>>     id_product integer references product,
>>>>     price integer[] -- prices are 'pushed' on this array as they change
>>>> );
>>>>
>>>> Which is bound to give the best performance, knowing I will often
>>>> need to access the latest and next-to-latest prices?
>>>
>>> If you mostly need the last few prices, I'd definitely go with the
>>> first approach, it's much cleaner. Besides, you can store a date/time
>>> per price, so you know when it changed. With the array approach that's
>>> a bit harder to do.
>>
>> I'd probably use a variant of this:
>>
>> CREATE TABLE prices (
>>     pid int NOT NULL REFERENCES products,
>>     validTil timestamp(0) NULL,
>>     price int NOT NULL,
>>     UNIQUE (pid, validTil)
>> );
>>
>> The current price of a product is always the row with validTil IS NULL.
>> The lookup should be pretty fast because it can use the index of the
>> UNIQUE constraint.

Even better: with a partial index the lookup should be more efficient, and it will probably stay that way even when the number of prices increases (while the number of products stays the same). With

CREATE UNIQUE INDEX current_prices ON prices ( pid ) WHERE validTil IS NULL;

I get

robert=> explain select price from prices where pid = 12344 and validTil is null;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using current_prices on prices  (cost=0.00..8.28 rows=1 width=4)
   Index Cond: (pid = 12344)
(2 rows)

The index can actually be used here. (see attachment)

> The validTil idea is nice, but you have to manage that field with a
> trigger, right?

Well, you don't need to. You can always do

begin;
update prices set validTil = current_timestamp
  where pid = 123 and validTil is NULL;
insert into prices values ( 123, null, 94 );
commit;

But with a trigger it would be more convenient, of course.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

\timing on

DROP TABLE prices;

CREATE TABLE prices (
    pid int NOT NULL, -- REFERENCES products,
    validTil timestamp(0) NULL,
    price int NOT NULL,
    UNIQUE (pid, validTil)
);

CREATE UNIQUE INDEX current_prices ON prices ( pid ) WHERE validTil IS NULL;

INSERT INTO prices
SELECT generate_series,
       CASE MOD(generate_series, 10)
         WHEN 0 THEN NULL
         ELSE current_timestamp - MOD(generate_series, 10) * interval '1' day
       END,
       123
FROM generate_series(1,100);

explain select price from prices where pid = 12344 and validTil is null;

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
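A sketch of what such a trigger might look like - untested, and assuming the prices table from above; the function and trigger names are invented:

  -- Hypothetical convenience trigger: expire the current price row
  -- automatically whenever a new current price (validTil IS NULL)
  -- is inserted, so a plain INSERT suffices.
  CREATE OR REPLACE FUNCTION expire_current_price() RETURNS trigger AS $$
  BEGIN
    IF NEW.validTil IS NULL THEN
      UPDATE prices
         SET validTil = current_timestamp
       WHERE pid = NEW.pid
         AND validTil IS NULL;
    END IF;
    RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER trg_expire_current_price
  BEFORE INSERT ON prices
  FOR EACH ROW EXECUTE PROCEDURE expire_current_price();

  -- With the trigger in place, adding a new current price is just:
  insert into prices values ( 123, null, 94 );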
Re: [PERFORM] Simple (hopefully) throughput question?
On 11/03/2010 04:52 PM, Nick Matheson wrote:
> We have an application that needs to do bulk reads of ENTIRE Postgres
> tables very quickly (i.e. select * from table). We have observed that
> such sequential scans run more than an order of magnitude slower than
> observed raw disk reads (5 MB/s versus 100 MB/s). Part of this is due to
> the storage overhead we have observed in Postgres. In the example below,
> it takes 1 GB to store 350 MB of nominal data. However, that still
> suggests we should expect bulk read rates of 35 MB/s.
>
> Our business logic does operations on the resulting data such that the
> output is several orders of magnitude smaller than the input. So we had
> hoped that by putting our business logic into stored procedures (and
> thus drastically reducing the amount of data flowing to the client) our
> throughput would go way up. This did not happen.

Can you disclose what kinds of manipulations you want to do on the data? I am asking because maybe there is a fancy query (possibly using windowing functions and / or aggregate functions) that gets you the speed you need without transferring the whole data set to the client.

> So our questions are as follows:
>
> Is there any way using stored procedures (maybe C code that calls SPI
> directly) or some other approach to get close to the expected 35 MB/s
> doing these bulk reads? Or is this the price we have to pay for using
> SQL instead of some NoSQL solution? (We actually tried Tokyo Cabinet and
> found it to perform quite well. However it does not measure up to
> Postgres in terms of replication, data interrogation, community support,
> acceptance, etc.)

Kind regards

robert

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
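To illustrate what such a "fancy query" could look like: if - purely as an invented example - the business logic boiled down to the latest reading per station, a window function would keep the reduction server-side and ship only the small result:

  -- Hypothetical table readings(station_id, ts, value); only one row
  -- per station leaves the server instead of the whole table.
  SELECT station_id, ts, value
  FROM (
    SELECT station_id, ts, value,
           row_number() OVER (PARTITION BY station_id
                              ORDER BY ts DESC) AS rn
    FROM readings
  ) t
  WHERE rn = 1;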