Re: [PERFORM] Millions of tables

2016-11-25 Thread Robert Klemme
Greg, sorry for the resent: I had forgotten to include the list.

On Wed, Oct 5, 2016 at 2:34 PM, Greg Spiegelberg  wrote:

> Data is not static.  The 4M tables fall into one of two groups.
>
> Group A contains 2M tables.  INSERT will occur ~100 times/day and maximum
> number of records anticipated will be 200k.  Periodic DELETE's will occur
> removing "old" records.  Age is something the client sets and I have no way
> of saying 1 or 10k records will be removed.

The ~100 times / day are per table, I assume. Also, I assume the DELETEs
will typically remove batches (because the age criterion catches
several records at once).

> Group B contains the other 2M tables.  Maximum records ~140k and UPSERT will
> be the only mechanism used to populate and maintain.  Periodic DELETE's may
> run on these tables as well removing "old" records.

So there will be inserts and updates.

Either I missed it or you did not mention the criteria for placing a
record in one of the 4M buckets. Can you shed light on what the
criteria are? That would obviously suggest what indexing could be
done.

Also it would be interesting to see the results of your tests with a btree
index on really large tables, as Stephen had suggested. I know these are not
the primary tests you want to do, but I would rather first explore a
"traditional" schema before venturing into the unknown of the multi
million dollar, pardon, table schema.
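
As a rough illustration, a test along those lines could look like this
(table, column and size choices here are invented for the example, not
taken from the thread):

-- one consolidated table instead of millions of per-bucket tables
create table readings (
  bucket_id  integer     not null,   -- which of the ~4M logical buckets
  created_at timestamptz not null default now(),
  payload    text
);
create index readings_bucket_created_idx on readings (bucket_id, created_at);

insert into readings (bucket_id, created_at, payload)
select g % 100000, now() - (g % 1000) * interval '1 minute', 'x'
from generate_series(1, 10000000) g;
analyze readings;

-- the per-bucket access path and the periodic age-based cleanup
explain (analyze, buffers)
  select * from readings where bucket_id = 42 order by created_at desc limit 100;
delete from readings
 where bucket_id = 42 and created_at < now() - interval '30 days';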

Kind regards


-- 
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/




Re: [PERFORM] Seeing execution plan of foreign key constraint check?

2016-07-22 Thread Robert Klemme
On Fri, Jul 22, 2016 at 12:14 AM, Jim Nasby  wrote:
> On 7/21/16 4:59 PM, Tom Lane wrote:
>>>
>>> > As for function plans, ISTM that could be added to the PL handlers if
>>> > we
>>> > wanted to (allow a function invocation to return an array of explain
>>> > outputs).
>>
>> Where would you put those, particularly for functions executed many
>> times in the query?  Would it include sub-functions recursively?
>> I mean, yeah, in principle we could do something roughly like that,
>> but it's not easy and presenting the results intelligibly seems
>> almost impossible.
>
>
> Yeah, it'd certainly need to be handled internally in a
> machine-understandable form that got aggregated before presentation (or with
> non-text output formats we could provide the raw data). Or just punt and
> don't capture the data unless you're using an alternative output format.

I'd imagine the output would just list all "recursive" execution plans
that were executed, probably along with indicators for how much IO and/or
CPU they were responsible for. The "recursive" plans could also be sorted
in decreasing order of total (i.e. across all individual invocations)
time spent, so you see the plan with the most impact first. All of that
would lose the display of calling relationships, in exchange for a simpler
presentation. I think the relationship of which statement / function
invoked which other one could be determined by looking at the statements /
functions themselves. And I guess it will often be apparent from the names already.

I am wondering what to do if the same statement has multiple execution
plans if that is possible in such a scenario. Present all the plans or
just the one with the highest impact? Show them next to each other so
the user is immediately aware that all these plans originated from the
same piece of SQL?
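
For what it's worth, one familiar way in which a single statement already
ends up with more than one plan is a prepared statement alternating between
custom and generic plans; a small self-contained sketch (table and data
invented for the example):

create table accounts (id serial primary key, owner text);
insert into accounts (owner) select 'user' || g from generate_series(1, 100000) g;
analyze accounts;

prepare get_acct(int) as select * from accounts where id = $1;
explain execute get_acct(42);
-- early executions are typically planned with the concrete parameter value;
-- after a number of executions the planner may switch to a cached generic
-- plan, so the same piece of SQL legitimately has several plans over time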

Kind regards

robert

-- 
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/




[PERFORM] Seeing execution plan of foreign key constraint check?

2016-07-05 Thread Robert Klemme
Hi,

I was wondering whether there are any plans to include the plan of the
FK check in EXPLAIN output. Or is there a different way to get to see
all the plans of triggers as well as of the main SQL?

When researching I found this thread from 2011 and the output format
does not seem to have changed since then:

https://www.postgresql.org/message-id/flat/3798971.mRNc5JcYXj%40moltowork#3798971.mRNc5JcYXj@moltowork
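
Partial answers that already existed at the time, for reference: EXPLAIN
ANALYZE of the DML statement reports the time spent in the FK (RI) triggers,
though not their plans, and the auto_explain contrib module can log plans of
nested statements to the server log (whether a particular internal check is
captured that way depends on how it is executed). A sketch with invented
table names:

create table parent (id integer primary key);
create table child (parent_id integer references parent (id));
insert into parent values (1);

-- trigger timing appears at the end of the EXPLAIN ANALYZE output, e.g.
--   Trigger for constraint child_parent_id_fkey: time=0.123 calls=1
explain analyze insert into child values (1);

-- logging plans of nested statements (those run inside functions/triggers)
load 'auto_explain';
set auto_explain.log_min_duration = 0;
set auto_explain.log_nested_statements = on;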

Kind regards

robert

-- 
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/




Re: [PERFORM] Database transaction with intermittent slow responses

2016-05-14 Thread Robert Klemme
On Sat, May 14, 2016 at 1:11 AM, Gerardo Herzig wrote:
> Oh, so *all* the transactions are being slowed down at that point...What 
> about CPU IO Wait% at that moment? Could be some other processes stressing 
> the system out?

Or the database has just grown past the size where disk caching is
efficient. Usually these are nonlinear processes, i.e. things work well
until a certain point and then the cache hit ratio drops dramatically
because all of a sudden content starts competing heavily for space
in the cache.

> Now im thinking about hard disk issues...maybe some "smart" messages?
>
> Have some other hardware to give it a try?

Can we please see the full DDL of table and indexes?

The only additional idea I can throw in at the moment is looking at a
combined index on (primarykeyofchange, category) - maybe replacing
changehistory_idx4 - but that of course depends on the other SQL used
against that table.
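
A sketch of that index - the table name changehistory is only inferred from
the index name changehistory_idx4 and may not match the actual schema:

create index changehistory_pk_cat_idx
  on changehistory (primarykeyofchange, category);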

Kind regards

robert

-- 
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/




Re: [PERFORM] LIKE pattern

2016-05-12 Thread Robert Klemme
On Mon, May 9, 2016 at 11:41 PM, SoDupuDupu wrote:
> Владимир-3 wrote
>> It seems my quite complex query runs 10 times faster on "some_column
>> LIKE '%test_1' " vs "some_column LIKE 'test_1' "
>> So I just add "%"  to the pattern...
>
> Keep in mind that LIKE '%test_1' and LIKE 'test_1' are not equivalent, using
> the % as a prefix to the argument means that the scan only has to confirm
> that the value ends in 'test_1' where forgoing the % entirely means that you
> are essentially saying some_column='test_1'.

Yes, but wouldn't the latter test be more efficient usually since it
tests against a prefix - at least with a regular index?
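
A toy illustration of that point (names invented; an anchored pattern can use
a btree index - given the C locale or a text_pattern_ops index - while a
leading wildcard generally cannot):

create table t_like (some_column text);
insert into t_like select 'test_' || g from generate_series(1, 100000) g;
create index t_like_idx on t_like (some_column text_pattern_ops);
analyze t_like;

explain select * from t_like where some_column like 'test_1%';  -- can use the index
explain select * from t_like where some_column like '%test_1';  -- has to scan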

Kind regards

robert

-- 
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/




Re: [PERFORM] Index Scan Backward Slow

2015-05-02 Thread Robert Klemme

On 01.05.2015 13:06, David Osborne wrote:

Simple... that did it... thanks!

dev=> create index on table(code,row_id);
CREATE INDEX
Time: 38088.482 ms
dev=> explain (analyse,buffers)  select row_id as last_row_id from table
where code='XX' order by row_id desc limit 1;


Just out of curiosity: Is there a particular reason why you do not use

select max(row_id) as last_row_id
from table
where code='XX'

?

Kind regards

robert






Re: [PERFORM] Why don't use index on x when ORDER BY x, y?

2014-11-24 Thread Robert Klemme
On Mon, Nov 24, 2014 at 12:02 PM, Vlad Arkhipov wrote:
 Hello,

 I wonder why Postgres does not use index in the query below? It is a quite
 common use-case when you  want to sort records by an arbitrary set of
 columns but do not want to create a lot of compound indexes for all possible
 combinations of them. It seems that if, for instance, your query's ORDER BY
 is x, y, z then any of these indexes could be used to improve the
 performance: (x); (x, y); (x, y, z).

 create temp table t as
 select s as x, s % 10 as y, s % 100 as z
 from generate_series(1, 100) s;

 analyze t;
 create index idx1 on t (y);

 select *
 from t
 order by y desc, x
 limit 10;

For the record, here's the plan:

rklemme=> explain analyze
rklemme-> select *
rklemme-> from t
rklemme-> order by y desc, x
rklemme-> limit 10;
QUERY PLAN
---
 Limit  (cost=36511.64..36511.67 rows=10 width=12) (actual
time=4058.863..4058.917 rows=10 loops=1)
   ->  Sort  (cost=36511.64..39011.64 rows=100 width=12) (actual
time=4058.849..4058.868 rows=10 loops=1)
 Sort Key: y, x
 Sort Method: top-N heapsort  Memory: 17kB
 ->  Seq Scan on t  (cost=0.00..14902.00 rows=100
width=12) (actual time=0.031..2025.639 rows=100 loops=1)
 Total runtime: 4058.992 ms
(6 rows)

Your index does not cover y AND x. If you remove order by x the
index will be used:

rklemme=> explain analyze
select *
from t
order by y desc
limit 10;
QUERY PLAN
---
 Limit  (cost=0.42..0.86 rows=10 width=12) (actual time=0.081..0.284
rows=10 loops=1)
   ->  Index Scan Backward using idx1 on t  (cost=0.42..43268.33
rows=100 width=12) (actual time=0.066..0.125 rows=10 loops=1)
 Total runtime: 0.403 ms
(3 rows)


Now, with a different index:

rklemme=> create index idx2 on t (y desc, x);
CREATE INDEX
rklemme=> explain analyze
select *
from t
order by y desc, x
limit 10;
QUERY PLAN
--
 Limit  (cost=0.42..0.88 rows=10 width=12) (actual time=0.127..0.290
rows=10 loops=1)
   ->  Index Scan using idx2 on t  (cost=0.42..45514.12 rows=100
width=12) (actual time=0.112..0.165 rows=10 loops=1)
 Total runtime: 0.404 ms
(3 rows)


rklemme=> select version();
version
---
 PostgreSQL 9.3.5 on i686-pc-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 32-bit
(1 row)

Kind regards


-- 
[guy, jim].each {|him| remember.him do |as, often| as.you_can - without end}
http://blog.rubybestpractices.com/




Re: [PERFORM] Postgres slave not catching up (on 9.2)

2014-11-08 Thread Robert Klemme
On Sat, Nov 8, 2014 at 2:11 PM, Ruben Domingo Gaspar Aparicio wrote:

 The slave (I don't have control on the master) is using 2 NFS file systems,
 one for WALs and another one for the data, on Netapp controllers:

 dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
 (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)

 dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
 (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)

You should use noatime to avoid unnecessary IO.

 The master produces quite a lot of WALs. This is what I get on the slave
 (number of WAL files, date-hour, Total size in MB), so per day is more than
 400GB:

 I tried to play with how the IO is handled, making it less strict setting
 synchronous_commit and fsync to off with not much success.

 I have also done a second test increasing shared_buffers from 12GB to 24GB
 (we are running on a 48GB, 8 cores server).

 Please let me know if you can see something obvious I am missing.

Your IO system needs to be able to deliver sustained IO bandwidth at
least as large as you need to read and write all the changes. What raw
IO bandwidth do those NFS file systems deliver _long term_? I am not
talking about spikes because there are buffers. I am talking about the
minimum of network throughput on one hand and raw disk IO those boxes
can do on the other hand. Then, how much of it is available to your
slave? Did you do the math to ensure that the IO bandwidth you have
available on the slave is at least as high as what is needed? Note
that it's not simply the WAL size that needs to be written and read
but also data pages.
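
For a rough sense of the figures quoted above: 400 GB of WAL per day already
averages out to roughly 400 * 1024 / 86400 MB/s, i.e. about 4.7 MB/s of
sustained write bandwidth for WAL alone, before data pages, read traffic and
peak load are taken into account.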

Kind regards

robert

-- 
[guy, jim].each {|him| remember.him do |as, often| as.you_can - without end}
http://blog.rubybestpractices.com/




Re: [PERFORM] Advice on optimizing select/index

2013-05-26 Thread Robert Klemme

On 22.05.2013 16:37, Niels Kristian Schjødt wrote:


In reality the adverts that are selected are all 'active'. I'm hence
wondering if it theoretically (and in reality of course) would make my
query faster if I did something like:  SELECT .* FROM cars LEFT
OUTER JOIN adverts on cars.id = adverts.car_id WHERE cars.brand =
'Audi' AND adverts.state = 'active' with a partial index on INDEX
adverts ON (car_id) WHERE state = 'active'?


That sounds reasonable to do.  If you have enough bandwidth on your 
production database why not just try it out?  Otherwise you could do 
this on a test database and see how it goes and what plan you get.  Btw. 
did you look at the original plan?
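
A sketch of the partial index from the quoted idea (index name invented, the
rest taken from the query above):

CREATE INDEX adverts_active_car_id_idx
  ON adverts (car_id)
  WHERE state = 'active';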


Cheers

robert






Re: [PERFORM] Query with limit goes from few ms to hours

2012-10-15 Thread Robert Klemme
Hi Henk,

On Sun, Oct 14, 2012 at 9:04 AM, henk de wit henk53...@hotmail.com wrote:
 Hi,

 For some reason the mailinglist software seems to block the email as soon as
 the planner details are in it, so I pasted those on pastebin.com:
 http://pastebin.com/T5JTwh5T

Just an additional data point: for whatever reason your email was
placed in my GMail spam folder.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/




Re: [PERFORM] Deferred constraints performance impact ?

2012-08-13 Thread Robert Klemme
On Fri, Jul 20, 2012 at 4:27 AM, mark dvlh...@gmail.com wrote:
 We have put some deferred constraints (some initially immediate, some
 initially deferred) into our database for testing with our applications.

 I understand a lot more may have to be tracked through a transaction and
 there could be some impact from that. Similar to an after update trigger? Or
 are the two not comparable in terms of impact from what is tracked and then
 checked.

Another factor might be the number of constraint violations you
expect: if there are many, then deferring the check can create much
more work for the DB, because you issue more DML than with a non-deferred
constraint, which could raise an error much sooner and hence make you
stop sending DML earlier.
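
A small self-contained example of the difference (toy tables, not from the
original post):

create table parent (id integer primary key);
create table child (
  id integer primary key,
  parent_id integer references parent (id) deferrable initially deferred
);

begin;
insert into child values (1, 10);  -- accepted: the FK check is deferred
insert into parent values (10);
commit;                            -- all deferred checks run here and pass

begin;
insert into child values (2, 99);  -- also accepted for now ...
-- ... arbitrarily more DML could be issued before anything fails ...
commit;                            -- ERROR: the violation only surfaces at commit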

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/




Re: [PERFORM] query using incorrect index

2012-08-03 Thread Robert Klemme
On Thu, Aug 2, 2012 at 4:54 PM, Russell Keane russell.ke...@inps.co.ukwrote:

 Using PG 9.0 and given 2 queries (messageq_current is a view on the
 messageq_table):

 select entity_id from messageq_current
 where entity_id = 123456;

 select entity_id from messageq_current
 where incoming = true
 and inactive = false
 and staff_ty = 2
 and staff_id = 2
 order by entity_id desc
 limit 1;

 and 2 indexes (there are 15 indexes in total but they are left out here
 for brevity):

 messageq1:

 CREATE INDEX messageq1
   ON messageq_table
   USING btree
   (entity_id);

 And messageq4:

 CREATE INDEX messageq4
   ON messageq_table
   USING btree
   (inactive, staff_ty, staff_id, incoming, tran_dt);


Of course *a lot* of detail is missing (the full schema of the table, all the
other indexes), but with inactive being a boolean column I suspect its
selectivity might not be too good here, so having it as the first column in a
covering index is at least questionable.  If query 2 is frequent you might
also want to consider creating a partial index only on (staff_ty, staff_id)
with filtering criteria on incoming and inactive as present in query 2.
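
A sketch of that partial index (column names taken from the quoted queries,
index name invented):

CREATE INDEX messageq_staff_partial
  ON messageq_table (staff_ty, staff_id)
  WHERE incoming = true AND inactive = false;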

Btw, why don't you formulate query 2 as a max query?

select max(entity_id) as entity_id
from messageq_current
where incoming = true
and inactive = false
and staff_ty = 2
and staff_id = 2;

 With the messageq1 index present, query 1 is very quick (0.094ms) and
 query 2 is very slow (241.515ms).

 If I remove messageq1 then query 2 uses messageq4 and is very quick
 (0.098ms) but then query 1 must use a different index and is therefore
 slower (> 5ms).


 So, to the Query plans:


Of which query?  Shouldn't there be four plans in total?  I'd post plans
here:
http://explain.depesz.com/


 With messageq1:

 Limit  (cost=0.00..2670.50 rows=1 width=4) (actual time=241.481..241.481
 rows=0 loops=1)

   Output: messageq_table.entity_id

   Buffers: shared hit=32 read=18870 written=12

   ->  Index Scan Backward using messageq1 on
 prac_live_10112.messageq_table  (cost=0.00..66762.53 rows=25 width=4)
 (actual time=241.479..241.479 rows=0 loops=1)

 Output: messageq_table.entity_id

 Filter: (messageq_table.incoming AND (NOT
 messageq_table.inactive) AND (messageq_table.staff_ty = 2) AND
 (messageq_table.staff_id = 2) AND
 (aud_status_to_flag(messageq_table.aud_status) = 1))

 Buffers: shared hit=32 read=18870 written=12

 Total runtime: 241.515 ms


 Without messageq1:

 Limit  (cost=12534.45..12534.45 rows=1 width=4) (actual time=0.055..0.055
 rows=0 loops=1)

   Output: messageq_table.entity_id

   Buffers: shared read=3

   ->  Sort  (cost=12534.45..12534.51 rows=25 width=4) (actual
 time=0.054..0.054 rows=0 loops=1)

 Output: messageq_table.entity_id

 Sort Key: messageq_table.entity_id

 Sort Method:  quicksort  Memory: 17kB

 ->  Bitmap Heap Scan on prac_live_10112.messageq_table
 (cost=174.09..12534.32 rows=25 width=4) (actual time=0.043..0.043 rows=0
 loops=1)

   Output: messageq_table.entity_id

   Recheck Cond: ((messageq_table.staff_ty = 2) AND
 (messageq_table.staff_id = 2))

   Filter: (messageq_table.incoming AND (NOT
 messageq_table.inactive) AND (aud_status_to_flag(messageq_table.aud_status)
 = 1))

   Buffers: shared read=3

   ->  Bitmap Index Scan on messageq4  (cost=0.00..174.08
 rows=4920 width=0) (actual time=0.040..0.040 rows=0 loops=1)

 Index Cond: ((messageq_table.inactive = false) AND
 (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND
 (messageq_table.incoming = true))

 Buffers: shared read=3

 Total runtime: 0.098 ms


 Clearly the statistics are off somehow but I really don’t know where to
 start.


 Any help you can give me would be very much appreciated.


Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-14 Thread Robert Klemme
On Sat, Jul 14, 2012 at 11:50 AM, B Sreejith bsreejit...@gmail.com wrote:
 Dear All,
 Thanks alot for all the invaluable comments.

In addition to Craig's excellent advice on measurements there's
something else you can do: with the knowledge of the queries your
application fires against the database you can evaluate your schema
and index definitions.  While there is no guarantee that your
application will scale well if all the indexes you believe to be
necessary are present based on that inspection, you can pretty easily
identify tables which can be improved.  These are tables which a) are
known to grow large and b) have no indexes, or no indexes which
support the queries your application runs against them, so that those
queries will result in full table scans.  Any database which grows in
size will sooner or later hit a point where full table scans of these
large tables become extremely slow.  If these queries are run during
regular operation (and not in nightly maintenance windows, for example)
then you have pretty surely identified a show stopper.
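
One cheap way to spot such tables on a running system is the statistics
collector; a sketch using the standard pg_stat_user_tables view:

select relname, n_live_tup, seq_scan, seq_tup_read, idx_scan
from pg_stat_user_tables
order by seq_tup_read desc
limit 20;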

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Any tool/script available which can be used to measure scalability of an application's database.

2012-07-13 Thread Robert Klemme
On Tue, Jul 10, 2012 at 10:21 AM, Sreejith Balakrishnan
sreejith.balakrish...@tcs.com wrote:
 Is there any tool or some sort of script available, for PostgreSQL, which
 can be used to measure scalability of an application's database. Or is there
 any guideline on how to do this.

 I am a bit confused about the concept of measuring scalability of an
 application's database.

You cannot measure scalability of a database as such.  You need to
know the nature of the load (i.e. operations executed against the DB -
how many INSERT, UPDATE, DELETE and SELECT, against which tables and
with what frequency and criteria).  And then, as Sergey said, you need
to define whether you want to scale up the load or the size - or both.
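
If the application is already running somewhere, the pg_stat_statements
contrib module is one way to capture that mix of operations; a sketch
(column names as of the 9.x releases):

-- requires shared_preload_libraries = 'pg_stat_statements'
create extension pg_stat_statements;

select query, calls, total_time, rows
from pg_stat_statements
order by total_time desc
limit 20;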

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] The need for clustered indexes to boost TPC-V performance

2012-07-05 Thread Robert Klemme
On Wed, Jul 4, 2012 at 1:13 AM, Reza Taheri rtah...@vmware.com wrote:

 Checking online, the subject of clustered indexes for PostgreSQL comes up
 often. PGSQL does have a concept called “clustered table”, which means a
 table has been organized in the order of an index. This would help with
 sequential accesses to a table, but has nothing to do with this problem.
 PGSQL folks sometimes refer to what we want as “integrated index”.

Do I understand correctly that we are speaking about the concept
which is known under the term "index organized table" (IOT) in Oracle
land?

http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm#CBBJEBIH
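
For completeness: the closest built-in approximation in PostgreSQL is the
one-off CLUSTER command, which rewrites a table in the order of an index but
does not maintain that order afterwards (table and index names below are
invented):

CLUSTER customer USING customer_pkey;  -- takes an exclusive lock on the table
-- ordering degrades again as rows are inserted and updated, so repeat as needed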

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] partitioning performance question

2012-06-10 Thread Robert Klemme
On Sat, Jun 9, 2012 at 7:58 PM, Kevin Kempter
cs_...@consistentstate.com wrote:
 Hi All;

 We have a client that has a table where large blobs (bytea) are stored. the
 table has a key column that is numbers (like 112362) but unfortunately it's
 a varchar column so the blobs are accessed via queries like:

 select * from bigtable where keycol = '217765'

 The primary reason we want to partition the table is for maintenance, the
 table is well over 1.2 Terabytes in size and they have never successfully
 vacuumed it. However I don't want to make performance even worse. The table
 does have a serial key, I'm thinking the best options will be to partition
 by range on the serial key, or maybe via the keycol character column via
 using an in statement on the check constraints, thus allowing the planner to
 actually leverage the above sql.  I suspect doing a typecast to integer in
 the check constraints will prove to be a bad idea if the keycol column
 remains a varchar.

 Thoughts?

 Here's the table:


                                  Table problemchild
      Column    |           Type           |                   Modifiers
  --------------+--------------------------+---------------------------------------------------------------
   keycol       | character varying        |
   blob_data    | bytea                    |
   removed_date | timestamp with time zone |
   alt_key      | bigint                   | not null default nextval('problemchild_alt_key_seq'::regclass)
  Indexes:
     pc_pkey PRIMARY KEY, btree (alt_key)
     key2 btree (keycol)

I find it odd that you have a column keycol which is not the PK and
your PK column is named alt_key.  Is keycol always the character
representation of alt_key?  Are they unrelated?

It would also help to know how the data in this table changes.  Do you
only ever add data?  Is some data removed from time to time (maybe
based on the removed_date)?

If the table grows continually then range partitioning sounds good.
However, I think you should really make keycol a number type because
otherwise range partitioning will be a pain (you would need to include
the length of the string in the criterion if you want your varchar
ranges to mimic number ranges).

However, if you are deleting from time to time and hence the table
does not grow in the long run then hash partitioning might be a better
idea because then you do not need to create new partitions all the
time.  Example on alt_key

create table problemchild (
  keycol varchar(100),
  blob_data bytea,
  removed_date timestamp with time zone,
  alt_key bigint primary key
);
create table problemchild_00 (
  check ( alt_key % 16 = 0 )
) inherits (problemchild);
create table problemchild_01 (
  check ( alt_key % 16 = 1 )
) inherits (problemchild);
create table problemchild_02 (
  check ( alt_key % 16 = 2 )
) inherits (problemchild);
...

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Multiple Concurrent Updates of Shared Resource Counter

2012-06-07 Thread Robert Klemme
On Thu, Jun 7, 2012 at 9:53 AM, Nir Zilberman n...@checkpoint.com wrote:
 We are handling multiple concurrent clients connecting to our system -
 trying to get a license seat (each license has an initial capacity of
 seats).
 We have a table which keeps count of the acquired seats for each license.
 When a client tries to acquire a seat we first make sure that the number of
 acquired seats is less than the license capacity.
 We then increase the number of acquired seats by 1.

 Our main problem here is with the acquired seats table.
 It is actually a shared resource which needs to be updated concurrently by
 multiple transactions.

 When multiple transactions are running concurrently - each transaction takes
 a long time to complete because it waits on the lock for the shared resource
 table.

 Any suggestions for better implementation/design of this feature would be
 much appreciated.

Well, there are the usual suspects for lock contention

1. Reduce time a lock needs to be held.
2. Increase granularity of locking.

ad 1)

It sounds as if you need two statements for check and increase.  That
can easily be done with a single statement if you use check
constraints.  Example:

$ psql -ef seats.sql
drop table licenses;
DROP TABLE
create table licenses (
  name varchar(200) primary key,
  max_seats integer not null check ( max_seats >= 0 ),
  current_seats integer not null default 0 check ( current_seats >= 0
and current_seats <= max_seats )
);
psql:seats.sql:6: NOTICE:  CREATE TABLE / PRIMARY KEY will create
implicit index licenses_pkey for table licenses
CREATE TABLE
insert into licenses (name, max_seats) values ('foo', 4);
INSERT 0 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
UPDATE 1
update licenses set current_seats = current_seats + 1 where name = 'foo';
psql:seats.sql:12: ERROR:  new row for relation licenses violates
check constraint licenses_check
update licenses set current_seats = current_seats - 1 where name = 'foo';
UPDATE 1

The increase will fail and you can react to that.  Another scheme is to use

update licenses set current_seats = current_seats + 1
where name = 'foo' and current_seats < max_seats;

and check how many rows were changed.
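
Checking the affected row count can be done via the command status in the
client, or inside a PL/pgSQL function; a minimal sketch of the latter
(function name invented):

create or replace function acquire_seat(p_name text) returns boolean as $$
declare
  n bigint;
begin
  update licenses set current_seats = current_seats + 1
   where name = p_name and current_seats < max_seats;
  get diagnostics n = row_count;
  return n = 1;  -- true if a seat was acquired, false if the license is full
end;
$$ language plpgsql;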

If, however, your transaction covers increasing the used seat count,
doing other work, and finally decreasing the used seat count, then you
need to change your transaction handling.  You rather want three TX:

start TX
update licenses set current_seats = current_seats + 1 where name = 'foo';
commit

if OK
  start TX
  main work
  commit / rollback

  start TX
  update licenses set current_seats = current_seats - 1 where name = 'foo';
  commit
end

ad 2)
At the moment I don't see a mechanism by which that could be achieved
in your case.  Distributing the counters of a single license across
multiple rows and checking via SUM(current_seats) is not concurrency
safe because of MVCC.

Generally, checking licenses via a relational database seems neither
very robust nor secure.  As long as someone has administrative
access to the database, or regular access to the particular database,
limits and counts can be manipulated arbitrarily.  License servers I
have seen usually work by managing seats in memory and counting
license usage via network connections.  That has the advantage that
the OS quite reliably informs the license server if a client dies.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Robert Klemme
On Thu, May 31, 2012 at 10:45 AM, Tatsuo Ishii is...@postgresql.org wrote:
 Just for record, I rerun the test again with my single-LOCK patch, and
 now total runtime of pg_dump is 113 minutes.
 188 minutes(9.0) -> 125 minutes(git master) -> 113 minutes(with my patch).

 So far, I'm glad to see 40% time savings at this point.

I see only 9.6% savings (100 * (113/125 - 1)).  What am I missing?

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Robert Klemme
On Thu, May 31, 2012 at 4:07 PM, Tatsuo Ishii is...@postgresql.org wrote:
 On Thu, May 31, 2012 at 10:45 AM, Tatsuo Ishii is...@postgresql.org wrote:
 Just for record, I rerun the test again with my single-LOCK patch, and
 now total runtime of pg_dump is 113 minutes.
 188 minutes(9.0) -> 125 minutes(git master) -> 113 minutes(with my patch).

 So far, I'm glad to see 40% time savings at this point.

 I see only 9.6% savings (100 * (113/125 - 1)).  What am I missing?

 What I meant was (100 * (113/188 - 1)).

OK, my fault was to assume you wanted to measure only your part, while
apparently you meant overall savings.  But Tom had asked for separate
measurements, if I understood him correctly.  Also, that measurement of
your change would go after the O(N^2) fix.  It could actually turn out
to be much more than 9% because the overall time would be reduced even
more dramatically.  So it might actually be good for your fix to wait a
bit. ;-)

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Configuration Recommendations

2012-05-16 Thread Robert Klemme
On Tue, May 15, 2012 at 7:53 PM, Greg Sabino Mullane g...@turnstep.com wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: RIPEMD160


 Is it established practice in the Postgres world to separate indexes
 from tables? I would assume that the reasoning of Richard Foote -
 albeit for Oracle databases - is also true for Postgres:

 Yes, it's an established practice. I'd call it something just short of
 a best practice though, as it really depends on your situation.

 What are the benefits?

 Disk seeks, basically. Yes, there are a lot of complications regarding
 all the various hardware/OS/PG level cachings, but at the end of the
 day, it's less work to have each drive concentrate on a single area

Hmm...  I see your point.  OTOH, the whole purpose of using NAS or SAN
with cache, logical volumes and multiple spindles per volume is to
reduce the impact of slow disk operations like seeks.  If in such a
situation your database operations are impacted by those seek
operations then the setup does not seem optimal anyway.  Bottom line
is: with a setup properly tailored to the workload there should be no
seeks visible to the database.

 (especially as we always require a heap scan at the moment).

Are you referring to the scan along tuple versions?
http://facility9.com/2011/03/postgresql-row-storage-fundamentals/

 I also find his examples a bit contrived, and the whole multi-user
 argument irrelevant for common cases.

 Why is that?

 Because most Postgres servers are dedicated to serving the same data
 or sets of data, and the number of other users calling ad-hoc queries
 against lots of different tables (per his example) is small.

I don't see how it should be relevant for this discussion whether
selects are ad hoc or not.  The mere fact that there are concurrent
accesses to the same set of tables and indexes - albeit to different
data (keys) - is sufficient to create a potential for seeks, even if
disks for index and table are separated.  And this will typically happen
in a multiuser application - even if all users use the same set of queries.

 So this sentence just doesn't ring true to me:

     ... by the time we've read the index leaf block, processed and
    read all the associated table blocks referenced by the index leaf
    block, the chances of there being no subsequent physical activity
    in the index tablespace due to another user session is virtually
    nil. We would still need to re-scan the disk to physically access
    the next index leaf block (or table block) anyways.

 That's certainly not true for Postgres servers, and I doubt if it
 is quite that bad on Oracle either.

I don't think this has much to do with the brand.  Richard just
describes logical consequences of concurrent access (see my attempt at
explanation above).  Fact remains that concurrent accesses rarely
target for the same data and because of that you would see quite
erratic access patterns to blocks.  How they translate to actual disk
accesses depends on various caching mechanisms in place and the
physical distribution of data across disks (RAID).  But I think we
cannot ignore the fact that the data requested by concurrent queries
most likely resides on different blocks.

 I lean towards using separate tablespaces in Postgres, as the
 performance outweighs the additional complexity.

 What about his argument with regards to access patterns (i.e.
 interleaving index and table access during an index scan)?  Also,
 Shaun's advice to have more spindles available sounds convincing to
 me, too.

 I don't buy his arguments. To do so, you'd have to buy a key point:

    when most physical I/Os in both index and table segments are
     effectively random, single block reads

 They are not; hence, the rest of his argument falls apart. Certainly,
 if things were as truly random and uncached as he states, there would
 be no benefit to separation.

Your argument with seeks also only works in absence of caching (see
above).  I think Richard was mainly pointing out that /in absence of
caching/ different blocks need to be accessed here.

 As far as spindles, yes: like RAM, it's seldom the case to have
 too litte :) But as with all things, one should get some benchmarks
 on your specific workload before making hardware changes. (Well, RAM
 may be an exception to that, up to a point).

Can you share some measurement data which backs the thesis that the
distribution of index and table to different disks is advantageous?
That would be interesting to see.  Then one could also balance
performance benefits against other effects (manageability etc.) and
see on which side the advantage comes out.

Even though I'm not convinced: Thank you for the interesting discussion!

Cheers

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Maximum number of sequences that can be created

2012-05-15 Thread Robert Klemme
Hi,

On Tue, May 15, 2012 at 12:57 PM, Andres Freund and...@anarazel.de wrote:

 I would rather suggest going with a suming table if you need to do something
 like that:

 sequence_id | value
 1 | 3434334
 1 | 1
 1 | -1
 1 | 1
 1 | 1
 ...

 You then can get the current value with SELECT SUM(value) WHERE sequence_id =
 1. For garbage collection you can delete those values and insert the newly
 summed up value again.
 That solution won't ever block if done right.

I was going to suggest another variant which would not need GC but
would also increase concurrency:

sequence_id | hash | value
1 | 0 | 3
1 | 1 | 9
1 | 2 | 0
1 | 3 | 2
...

with PK = (sequence_id, hash) and hash in a fixed range (say 0..15).

Value would be obtained the same way, i.e. via
SELECT SUM(value) FROM T WHERE sequence_id = 1

The hash value would have to be calculated

 - at session start time (cheap but might reduce concurrency due to
small number of changes) or
 - at TX start time (more expensive but probably better concurrency
due to higher change rate)

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Maximum number of sequences that can be created

2012-05-13 Thread Robert Klemme
On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин tiv...@gmail.com wrote:
 2012/5/11 Robert Klemme shortcut...@googlemail.com

 On the contrary: what would be the /advantage/ of being able to create
 millions of sequences?  What's the use case?

 We are using sequences as statistics counters - they produce almost no
 performance impact and we can tolerate it's non-transactional nature. I can
 imaging someone who wants to have a  sequence per user or other relation
 row.

I can almost see the point. But my natural choice in that case would
be a table with two columns.  Would that actually be so much less
efficient? Of course you'd have fully transactional behavior and thus
locking.

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Maximum number of sequences that can be created

2012-05-11 Thread Robert Klemme
On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre meetvbon...@gmail.com wrote:
 Is there any max limit set on sequences that can be created on the database
 ? Also would like to know if we create millions of sequences in a single db
 what is the downside of it.

On the contrary: what would be the /advantage/ of being able to create
millions of sequences?  What's the use case?

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] scale up (postgresql vs mssql)

2012-05-09 Thread Robert Klemme
On Fri, May 4, 2012 at 3:04 PM, Merlin Moncure mmonc...@gmail.com wrote:

 let's see the query plan...when you turned it off, did it go faster?
 put your suspicious plans here: http://explain.depesz.com/

I suggest to post three plans:

1. insert into temp table
2. access to temp table before analyze
3. access to temp table after analyze

Maybe getting rid of the temp table (e.g. using a view or even an
inline view) is even better than optimizing temp table access.
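
A self-contained toy version of those three steps, just to show the shape of
the test (table and query invented):

create temp table tmp_t as                                   -- 1. populate
  select g as id, g % 100 as grp from generate_series(1, 100000) g;
explain analyze select count(*) from tmp_t where grp = 42;   -- 2. before analyze
analyze tmp_t;
explain analyze select count(*) from tmp_t where grp = 42;   -- 3. after analyze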

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?

2012-05-09 Thread Robert Klemme
On Wed, May 9, 2012 at 3:58 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, May 9, 2012 at 8:06 AM, MauMau maumau...@gmail.com wrote:
 I've heard from some people that synchronous streaming replication has
 severe performance impact on the primary. They said that the transaction
 throughput of TPC-C like benchmark (perhaps DBT-2) decreased by 50%. I'm
 sorry I haven't asked them about their testing environment, because they
 just gave me their experience. They think that this result is much worse
 than some commercial database.

 I can't speak for other databases, but it's only natural to assume
 that tps must drop.  At minimum, you have to add the latency of
 communication and remote sync operation to your transaction time.  For
 very short transactions this adds up to a lot of extra work relative
 to the transaction itself.

Actually I would expect 50% degradation if both databases run on
identical hardware: the second instance needs to do the same work
(i.e. write WAL AND ensure it reached the disk) before it can
acknowledge.

When requesting synchronous replication, each commit of a write
transaction will wait until confirmation is received that the commit
has been written to the transaction log on disk of both the primary
and standby server.
http://www.postgresql.org/docs/9.1/static/warm-standby.html#SYNCHRONOUS-REPLICATION

I am not sure whether the replicant can be triggered to commit to disk
before the commit to disk on the master has succeeded; if that was the
case there would be true serialization => 50%.

This sounds like it could actually be the case (note the "after it commits"):
When synchronous replication is requested the transaction will wait
after it commits until it receives confirmation that the transfer has
been successful.
http://wiki.postgresql.org/wiki/Synchronous_replication

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?

2012-05-09 Thread Robert Klemme
On Wed, May 9, 2012 at 5:45 PM, Claudio Freire klaussfre...@gmail.com wrote:
 On Wed, May 9, 2012 at 12:41 PM, Robert Klemme
 shortcut...@googlemail.com wrote:
 I am not sure whether the replicant can be triggered to commit to disk
 before the commit to disk on the master has succeeded; if that was the
 case there would be true serialization => 50%.

 This sounds like it could actually be the case (note the after it commits):
 When synchronous replication is requested the transaction will wait
 after it commits until it receives confirmation that the transfer has
 been successful.
 http://wiki.postgresql.org/wiki/Synchronous_replication

 That should only happen for very short transactions.
 IIRC, WAL records can be sent to the slaves before the transaction in
 the master commits, so bigger transactions would see higher
 parallelism.

I considered that as well.  But the question is: when are they written
to disk on the slave?  If they sit in the buffer cache until data is
synched to disk then you only gain a bit of an advantage by sending
them earlier (i.e. network latency).  Assuming a high bandwidth and low
latency network (which you want to have in this case anyway) that gain
is probably not big compared to the time it takes to ensure WAL is
written to disk.  I do not know the implementation details, but *if* the
master triggers the slave's sync only after its own sync has succeeded
*then* you basically have serialization and you need to wait twice the
time.

For small TX, OTOH, the network overhead might be so large relative to
WAL IO (for example with a battery backed cache in the controller)
that it shows.  Since we do not know the test cases which lead to the
50% statement we can probably only speculate.  Ultimately each
individual setup and workload has to be tested.

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Result Set over Network Question

2012-05-08 Thread Robert Klemme
On Mon, May 7, 2012 at 4:25 PM, Thomas Kellerer spam_ea...@gmx.net wrote:
 That seems to be a documentation bug.
 I tried it, and it definitely does not work (or I am missing something).

Apparently I am the one who is missing something. :-)

 Their release notes at:
 http://www.oracle.com/technetwork/developer-tools/sql-developer/sqldev31-ea-relnotes-487612.html

 state:

 Third Party Databases

    SQL Developer supports IBM DB2 UDB LUW , Microsoft SQL Server and
 Microsoft Access, MySQL, Sybase Adaptive Server and Teradata.
    See Supported Platforms for details on all third party database releases
 supported.

Right you are, Thomas!  Thank you!  Sorry for the confusion.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Result Set over Network Question

2012-05-07 Thread Robert Klemme
On Thu, May 3, 2012 at 5:40 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Thu, May 3, 2012 at 10:28 AM, Ronald Hahn, DOCFOCUS INC.
 rh...@docfocus.ca wrote:
 After some testing using wiershark (poor mans profiler) to see what was
 going on with the network I found that it was the tools I've been using.
 Both Aqua and PGadminIII have a large overhead per column to get the meta
 data. MSSQL sends that data upfront so the impact isn't as bad. I'm not sure
 if it's a limitation of the pgsql protocol vs tds or a limitation of Aqua or
 a combination of both. At any rate it turns out not to be part of the
 problem I'm having with my software stalling out so I'm back to Square one
 with my problem.

So, Ronald, are you saying the different approach to meta data
transfer is _not_ the issue?

 ok, let's figure out what the issue is then.  first, let's make sure
 it isn't the server that's stalling: configure
 log_min_duration_statement with an appropriate value so you start
 catching queries that are taking longer then you think the should be.
  also some client side logging directly wrapping the SQL invocation
 couldn't hurt.   is your application jdbc?

Ronald said ODBC in his first posting.  But since ADS seems to support
JDBC as well trying that might be a good test to get another data
point.  Alternative tools for JDBC tests:

http://squirrel-sql.sourceforge.net/
http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html

Using the PG client remotely with \timing on might be an even better idea.
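
E.g., from a remote psql session (host and table names are placeholders):

$ psql -h dbhost mydb
mydb=> \timing
Timing is on.
mydb=> select count(*) from some_table;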

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Result Set over Network Question

2012-05-07 Thread Robert Klemme
On Mon, May 7, 2012 at 2:11 PM, Thomas Kellerer spam_ea...@gmx.net wrote:
 Robert Klemme, 07.05.2012 14:03:

 Alternative tools for JDBC tests:

 http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html

 SQL Developer does not support PostgreSQL

Last time I checked (quite a while ago) you could use arbitrary JDBC
drivers.  There's also
http://docs.oracle.com/cd/E25259_01/appdev.31/e24285/intro.htm#sthref306

And this seems to indicate that it's still the case: [...] or another
third-party driver. [...]
JDBC URL (Other Third Party Driver): URL for connecting directly from
Java to the database; overrides any other connection type
specification.
http://docs.oracle.com/cd/E25259_01/appdev.31/e24285/dialogs.htm#BACDGCIA

I assume Oracle is not interested in aggressively advertising this
feature, though.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Configuration Recommendations

2012-05-03 Thread Robert Klemme
Hi Jan,

On Thu, May 3, 2012 at 4:10 AM, Jan Nielsen jan.sture.niel...@gmail.com wrote:
 Below is the hardware, firmware, OS, and PG configuration pieces that I'm
 settling in on. As was noted, the local storage used for OS is actually two
 disks with RAID 10. If anything appears like a mistake or something is
 missing, I'd appreciate the feedback.

You should quickly patent this solution.  As far as I know you need at
least four disks for RAID 10. :-)
http://en.wikipedia.org/wiki/RAID#Nested_.28hybrid.29_RAID

Or did you mean RAID 1?

 I'm still working on the benchmarks scripts and I don't have good/reliable
 numbers yet since our SAN is still very busy reconfiguring from the 2x4 to
 1x8. I'm hoping to get them running tomorrow when the SAN should complete
 its 60 hours of reconfiguration.

Yeah, does not seem to make a lot of sense to test during this phase.

 Thanks, again, for all the great feedback.

You're welcome!

 300GB RAID10 2x15k drive for OS on local storage
 */dev/sda1 RA*    4096
 */dev/sda1 FS*    ext4
 */dev/sda1 MO*

See above.

 600GB RAID 10 8x15k drive for $PGDATA on SAN
 *IO Scheduler sda*    noop anticipatory deadline [cfq]
 */dev/sdb1 RA*    4096
 */dev/sdb1 FS* xfs
 */dev/sdb1 MO*
 allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime

 300GB RAID 10 2x15k drive for $PGDATA/pg_xlog on SAN
 *IO Scheduler sdb*    noop anticipatory deadline [cfq]
 */dev/sde1 RA*    4096
 */dev/sde1 FS* xfs
 */dev/sde1 MO*    allocsize=256m,attr2,logbufs=8,logbsize=256k,noatime
 *IO Scheduler sde*    noop anticipatory deadline [cfq]

See above.

With regard to the scheduler, I have frequently read that [deadline]
and [noop] perform better for PG loads.  Fortunately this can be
easily changed.

Maybe this also has some additional input:
http://www.fccps.cz/download/adv/frr/hdd/hdd.html

On Thu, May 3, 2012 at 8:54 AM, John Lister john.lis...@kickstone.co.uk wrote:
 I was wondering if it would be better to put the xlog on the same disk as
 the OS? Apart from the occasional log writes I'd have thought most OS data
 is loaded into cache at the beginning, so you effectively have an unused
 disk. This gives you another spindle (mirrored) for your data.

 Or have I missed something fundamental?

Separating them avoids interference between the OS and WAL logging (e.g. a
script running berserk and filling the OS filesystem).  It is also easier
to manage (e.g. in case of relocation to another volume etc.).  And
you can have different mount options (e.g. you might want to have atime
for the OS volume).

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Configuration Recommendations

2012-04-25 Thread Robert Klemme
On Wed, Apr 25, 2012 at 7:08 PM, Greg Sabino Mullane g...@turnstep.com wrote:

 Is it established practice in the Postgres world to separate indexes
 from tables?  I would assume that the reasoning of Richard Foote -
 albeit for Oracle databases - is also true for Postgres:

 Yes, it's an established practice. I'd call it something just short of
 a best practice though, as it really depends on your situation.

What are the benefits?

 I'd
 take those articles with a grain of salt, as they are very
 Oracle-specific (e.g. we do not have fat indexes (yet!), nor segments).

True.  As far as I understand, segments in Oracle's disk layout serve
the purpose of clustering data for a database object.  With that feature
missing, the situation would be worse in Postgres - unless you manually
do something similar by using tablespaces for that.

 I also find his examples a bit contrived, and the whole multi-user
 argument irrelevant for common cases.

Why is that?

 I lean towards using separate
 tablespaces in Postgres, as the performance outweighs the additional
 complexity.

What about his argument with regards to access patterns (i.e.
interleaving index and table access during an index scan)?  Also,
Shaun's advice to have more spindles available sounds convincing to
me, too.

 It's down on the tuning list however: much more important
 is getting your kernel/volumes configured correctly, allocating
 shared_buffers sanely, separating pg_xlog, etc.

That does make a lot of sense.  Separating pg_xlog would probably be the
first thing I'd do, especially since its IO pattern is so dramatically
different from tablespace IO access patterns.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Configuration Recommendations

2012-04-23 Thread Robert Klemme
On Tue, Apr 24, 2012 at 4:56 AM, Jan Nielsen
jan.sture.niel...@gmail.com wrote:
 We are considering the following drive allocations:

  * 4 x 15k SAS drives, XFS, RAID 10 on SAN for PG data
  * 4 x 15k SAS drives, XFS, RAID 10 on SAN  for PG indexes
  * 2 x 15k SAS drives, XFS, RAID 1 on SAN  for PG xlog
  * 1 x 15k SAS drive, XFS, on local storage for OS

Is it established practice in the Postgres world to separate indexes
from tables?  I would assume that the reasoning of Richard Foote -
albeit for Oracle databases - is also true for Postgres:

http://richardfoote.wordpress.com/2008/04/16/separate-indexes-from-tables-some-thoughts-part-i-everything-in-its-right-place/
http://richardfoote.wordpress.com/2008/04/18/separate-indexes-from-tables-some-thoughts-part-ii-there-there/
http://richardfoote.wordpress.com/2008/04/28/indexes-in-their-own-tablespace-availabilty-advantages-is-there-anybody-out-there/

Conversely if you lump both on a single volume you have more
flexibility with regard to usage - unless of course you can
dynamically resize volumes.

To me it also seems like a good idea to mirror the local disk holding the
OS and database software, because if that fails you'll get downtime as well.
As of now you have a single point of failure there.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



Re: [PERFORM] Tablespace files deleted automatically.

2011-10-17 Thread Robert Klemme
On Fri, Oct 14, 2011 at 8:19 PM, Josh Berkus j...@agliodbs.com wrote:
 Vishnu,

 I am using PostgreSQL 8.4 in windows.  I have  created a database and
 some tables on it. Also created a table space and some tables in it. My
 application inserts data into these tables in every second. The
 application is a continuous running application. My issue is that after
 a 20-30 days continuous run ( Machine logged off  2 times), some of the
 files in the  \16384 folder of newly created table space are dropped
 automatically. Can  you tell me the reason for this.?  How can we
 recover from this?

 PostgreSQL creates and deletes files all the time.  I don't understand
 why this is a problem.

Also it does not seem clear whether we are talking about a performance
issue (what this ML is about) or a bug (lost data).

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Window functions and index usage

2011-10-04 Thread Robert Klemme
On Tue, Oct 4, 2011 at 11:39 AM, Anssi Kääriäinen
anssi.kaariai...@thl.fi wrote:
 I have the following setup:

 create table test(id integer, seq integer);
 insert into test select generate_series(0, 100), generate_series(0, 1000);
 create unique index test_idx on test(id, seq);
 analyze test;

 Now I try to fetch the latest 5 values per id, ordered by seq from the
 table:

 select * from (
 select id, seq, row_number() over (partition by id order by seq)
  from test
  where id in (1, 2, 3)
 ) where row_number() = 5;

It seems this fetches the *first* 5 values per id - and not the
latest.  For that you would need to order by seq desc in the window
and probably also in the index.
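
Just to illustrate (an untested sketch): with the window and the index
both ordered descending on seq, fetching the latest 5 rows per id would
look roughly like this

create index test_idx_desc on test (id, seq desc);

select id, seq
from (
    select id, seq,
           row_number() over (partition by id order by seq desc) as rn
    from test
    where id in (1, 2, 3)
) tmp
where rn <= 5;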

 This does not use the index on id, seq for sorting the data. It uses a
 bitmap index scan, and sequential scan when issued SET enable_bitmapscan to
 false. Tested both on git head and 8.4.8. See end of post for plans. It
 seems it would be possible to fetch the first values per id using the index,
 or at least skip the sorting.

Just guessing: since row_number is an analytic function and it can be
combined with any type of windowing, only in rare cases do the
ordering of index columns and the ordering in the window align.  So
while your particular use case could benefit from this optimization
the overall judgement might be that it's not worthwhile to make the
optimizer more complex to cover this case - or I fail to see the more
general pattern here. :-)

 Is there some other way to run the query so that it would use the index? Is
 there plans to support the index usage for the above query assuming that it
 is possible to use the index for that query?

 The real world use case would be to fetch latest 5 threads per discussion
 forum in one query. Or fetch 3 latest comments for all patches in given
 commit fest in single query.

Is it really that realistic that someone wants the latest n entries
for *all* threads / patches?  It seems since this can result in a very
large data set this is probably not the type of query which is done
often.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Window functions and index usage

2011-10-04 Thread Robert Klemme
On Tue, Oct 4, 2011 at 4:06 PM, Anssi Kääriäinen
anssi.kaariai...@thl.fi wrote:
 On 10/04/2011 04:27 PM, Robert Klemme wrote:

 On Tue, Oct 4, 2011 at 11:39 AM, Anssi Kääriäinen
 anssi.kaariai...@thl.fi  wrote:

 I have the following setup:

 create table test(id integer, seq integer);
 insert into test select generate_series(0, 100), generate_series(0,
 1000);
 create unique index test_idx on test(id, seq);
 analyze test;

 Now I try to fetch the latest 5 values per id, ordered by seq from the
 table:

 select * from (
 select id, seq, row_number() over (partition by id order by seq)
  from test
  where id in (1, 2, 3)
 ) where row_number()= 5;

 It seems this fetches the *first* 5 values per id - and not the
 latest.  For that you would need to order by seq desc in the window
 and probably also in the index.

 Yeah. Sorry, wrong order. And the last line is wrong, it should be ) tmp
 where row_number = 5;.

 This does not use the index on id, seq for sorting the data. It uses a
 bitmap index scan, and sequential scan when issued SET enable_bitmapscan
 to
 false. Tested both on git head and 8.4.8. See end of post for plans. It
 seems it would be possible to fetch the first values per id using the
 index,
 or at least skip the sorting.

 Just guessing: since row_number is an analytic function and it can be
 combined with any type of windowing, only in rare cases do the
 ordering of index columns and the ordering in the window align.  So
 while your particular use case could benefit from this optimization
 the overall judgement might be that it's not worthwhile to make the
 optimizer more complex to cover this case - or I fail to see the more
 general pattern here. :-)

 I think there are common use cases for this - see end of message for an
 example.

 Is there some other way to run the query so that it would use the index?
 Is
 there plans to support the index usage for the above query assuming that
 it
 is possible to use the index for that query?

 The real world use case would be to fetch latest 5 threads per discussion
 forum in one query. Or fetch 3 latest comments for all patches in given
 commit fest in single query.

 Is it really that realistic that someone wants the latest n entries
 for *all* threads / patches?  It seems since this can result in a very
 large data set this is probably not the type of query which is done
 often.

 The idea is that the dataset isn't that large.

But then why do require using the second index column in the first
place?  If the data set is small then the query is likely fast if the
selection via id can use any index.

 And you don't have to fetch
 them for all threads / patches. You might fetch them only for patches in
 currently viewed commit fest. See
 https://commitfest.postgresql.org/action/commitfest_view?id=12 for one such
 use. What I have in mind is fetching first all the patches in the commit
 fest in one go. Then issue query which would look something like:
  select * from
    (select comment_data, row_number() over (partition by patch_id order by
 comment_date desc)
       from patch_comments
     where patch_id in (list of patch_ids fetched in first query)
   ) tmp where row_number = 3;

Interesting: I notice that the query cannot successfully be simplified on 8.4:

rklemme=> select *,
row_number() over (partition by id order by seq desc) as rn
from test
where id in (1,2,3)
and rn = 3
;
ERROR:  column rn does not exist
LINE 5: and rn = 3
^
rklemme=> select *,
row_number() over (partition by id order by seq desc) as rn
from test
where id in (1,2,3)
and row_number() = 3;
ERROR:  window function call requires an OVER clause
LINE 5: and row_number() = 3;
^
rklemme=> select *,
row_number() over (partition by id order by seq desc) as rn
from test
where id in (1,2,3)
and row_number() over (partition by id order by seq desc) = 3;
ERROR:  window functions not allowed in WHERE clause
LINE 5: and row_number() over (partition by id order by seq desc) =...
^
rklemme=>

I think I need to switch to 9.1 soon. :-)

 Now you have all the data needed for the first column in the above mentioned
 page.

 I guess the commit fest application is fetching all the comments for the
 patches in the commit fest in one query, and then doing the limit in
 application code. This is fine because there aren't that many comments per
 patch. But if you have dozens of forums and thousands of threads per forum
 you can't do that.

 This is useful in any situation where you want to show n latest entries
 instead of the last entry in a list view. Latest modifications to an object,
 latest commits to a file, latest messages to a discussion thread or latest
 payments per project. Or 5 most popular videos per category, 10 highest paid
 employees per department and so on.

Again, what is easy for you as a human will likely be quite complex
for the optimizer (knowing that the order by and the row_number output
align).

Kind regards

robert

-- 
remember.guy do

Re: [PERFORM] Slow query with self-join, group by, 100m rows

2011-09-21 Thread Robert Klemme
On Tue, Sep 20, 2011 at 7:43 PM, Thomas Kappler tkapp...@googlemail.com wrote:
 [please CC, I'm not on the list]

 Hi all,

 we have one table that basically uses Postgres as a key-value store.

     Table public.termindex
 Column   |  Type   | Modifiers
 -+-+---
  subject_id | integer |
  indextype  | integer |
  cid        | integer |

 This is with Postgres 9.0.

 The table has 96 million rows and an index on each column. It contains
 no NULLs and has no triggers.

 subject_id has about 2m distinct values, cid about 200k, and indextype only 
 six.

 The table is *read-only* after the initial load.

 The query we want to do is (with example values):

 select t.cid, count(distinct t1.subject_id)
 from termindex t1, termindex t2
 where t1.cid=20642 and t1.indextype=636 and t2.indextype=636 and
 t2.subject_id=t1.subject_id
 group by t2.cid;

Do you have any multi column indexes?  From the text of your query it
seems it could benefit from these two indexes:

(cid, indextype)
(subject_id, indextype)

I do not know whether PostgreSQL can avoid the table if you make the first index
(cid, indextype, subject_id)
in other words: append all the columns needed for the join.  In theory
the query could then be satisfied from the indexes.
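
In concrete terms, the indexes I have in mind would be created like this
(untested sketch):

create index termindex_cid_type_subj on termindex (cid, indextype, subject_id);
create index termindex_subj_type on termindex (subject_id, indextype);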

 Pasting the explain analyze output into
 http://explain.depesz.com/s/Yr4 we see that Postgres is doing an
 external sort using about 150MB of data.

 Now, we're not Postgres experts, or even great at relational design.
 Are there better ways of doing that query, or designing the table? For
 the latter we do have a number of constraints, though, that I don't
 want to outline now because this mail is already long enough.

Those are probably important to know.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller sfkel...@gmail.com wrote:
 I'm simply referring to literature (like the intro Ramakrishnan & Gehrke).
 I just know that Oracle and MySQL actually do have them too and use it
 without those current implementation specific restrictions in
 Postgres.

Where exactly do you take that from that Oracle has hash indexes?  I
can't seem to find them:
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#sthref293

Are you mixing this up with hash partitioning?
http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#sthref443

Or am I missing something?

 IMHO by design Hash Index (e.g. linear hashing) work best when:
 1. only equal (=) tests are used (on whole values)
 2. columns (key values) have very-high cardinality

 And ideally but not necessarily when index values do not change and
 number of rows are known ahead of time (avoiding O(N) worst case - but
 there are approaches to chaining with dynamic resizing).

 I just collected this to encourage ourselves that enhancing hash
 indexes could be worthwhile.

There's still the locking issue Jeff mentioned.  At least every time a
table resize occurs the whole index must be locked.  Or is there a
more fine granular locking strategy which I am overlooking?

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller sfkel...@gmail.com wrote:
 Merlin and Jeff,

 General remark again: It's hard for me to imagine that btree is
 superior for all the issues mentioned before. I still believe in hash
 index for primary keys and certain unique constraints where you need
 equality search and don't need ordering or range search.

 It is -- but please understand I'm talking about int32 tree vs hash.
 Hashing as a technique of course is absolutely going to cream btree
 for all kinds of data because of the advantages of working with
 decomposed data -- we are all taught that in comp-sci 101 :-).  The
 debate here is not about the advantages of hashing, but the specific
 implementation of the hash index used.

 Postgres's hash index implementation used to be pretty horrible -- it
 stored the pre-hashed datum in the index which, while making it easier
 to do certain things,  made it horribly slow, and, for all intents and
 purposes, useless.  Somewhat recently,a lot of work was put in to fix
 that -- the index now packs the hash code only which made it
 competitive with btree and superior for larger keys.  However, certain
 technical limitations like lack of WAL logging and uniqueness hold
 hash indexing back from being used like it really should be.  In cases
 where I really *do* need hash indexing, I do it in userland.

 create table foo
 (
  a_long_field text;
 );
 create index on foo(hash(a_long_field));

 select * from foo where hash(a_long_field) = hash(some_value) and
 a_long_field = some_value;

 This technique works fine -- the main disadvantage is that enforcing
 uniqueness is a PITA but since the standard index doesn't support it
 either it's no great loss.  I also have the option of getting
 'uniqueness' and being able to skip the equality operation if I
 sacrifice some performance and choose a strong digest.  Until the hash
 index issues are worked out, I submit that this remains the go-to
 method to do this.

Is this approach (storing the hash code in a btree) really faster than
a regular btree index on a_long_field?  And if so, for which kind of
data and load?

 Now, my point here is that I've noticed that even with the latest
 optimizations btree seems to still be superior to the hash indexing by
 most metrics, so that:
 create table foo
 (
  an_int_field int;
  a_long_field text;
 );

 create index on foo(an_int_field);
 create index on foo using hash(a_long_field);

 On performance grounds alone, the btree index seems to be (from my
 very limited testing) a better bet.  So I'm conjecturing that the
 current hash implementation should be replaced with a formalization of
 the userland technique shown above -- when you request a hash index,
 the database will silently hash your field and weave it into a btree.
 It's a hybrid: a hashed btree.

I'd rather call it a btreefied hash because you are storing a hash
but in a btree structure. :-) But that's a detail.  What I find
worrying is that then there is a certain level of obscuring the real
nature since create index ... using hash is not exactly creating a
hash table.

 To truly demonstrate if the technique
 was effective though, it would have to be coded up -- it's only fair
 to compare if the btree based hash is also double checking the value
 in the heap which the standard hash index must do.

Right.

 The other way to go of course is to try and fix up the existing hash
 index code -- add wal logging, etc. In theory, a customized hash
 structure should be able to beat btree all day long which argues to
 continue in this direction.

I still haven't seen a solution to locking when a hash table needs
resizing.  All hashing algorithms I can think of at the moment would
require a lock on the whole beast during the resize which makes this
type of index impractical for certain loads (heavy updating).

One solution would be to apply partitioning to the hash table itself
(e.g. have four partitions for the two least significant bits or 16
for the four lest significant bits) and treat them independently.  How
that would interact with WAL I have no idea though.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Robert Klemme
On Tue, Sep 13, 2011 at 5:13 PM, Marti Raudsepp ma...@juffo.org wrote:
 On Tue, Sep 13, 2011 at 00:26, Robert Klemme shortcut...@googlemail.com 
 wrote:
 In the case of PG this particular example will work:
 1. TX inserts new PK row
 2. TX tries to insert same PK row => blocks
 1. TX commits
 2. TX fails with PK violation
 2. TX does the update (if the error is caught)

 That goes against the point I was making in my earlier comment. In
 order to implement this error-catching logic, you'll have to allocate
 a new subtransaction (transaction ID) for EVERY ROW you insert.

I don't think so.  You only need to catch the error (see attachment).
Or does this create a sub transaction?

 If
 you're going to be loading billions of rows this way, you will invoke
 the wrath of the vacuum freeze process, which will seq-scan all
 older tables and re-write every row that it hasn't touched yet. You'll
 survive it if your database is a few GB in size, but in the terabyte
 land that's unacceptable. Transaction IDs are a scarce resource there.

Certainly.  But it's not needed as far as I can see.

 In addition, such blocking will limit the parallelism you will get
 from multiple inserters.

Yes, I mentioned the speed issue.  But regardless of the solution for
MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you
will have the locking problem anyhow if you plan to insert
concurrently into the same table and be robust.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


ins.sql
Description: Binary data

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Robert Klemme

On 13.09.2011 20:11, Marti Raudsepp wrote:

On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com wrote:

I don't think so.  You only need to catch the error (see attachment).
Or does this create a sub transaction?


Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
SAVEPOINT it can roll back to in case of an error.


Ouch!  Learn something new every day.  Thanks for the update!

http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html

Side note: it seems that Oracle handles this differently (i.e. no 
subtransaction but the INSERT would be rolled back) making the pattern 
pretty usable for this particular situation.  Also, I have never heard 
that TX ids are such a scarce resource over there.


Would anybody think it a good idea to optionally have a BEGIN EXCEPTION 
block without the current TX semantics?  In absence of that what would 
be a better pattern to do it (other than UPDATE and INSERT if not found)?



Yes, I mentioned the speed issue.  But regardless of the solution for
MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned you
will have the locking problem anyhow if you plan to insert
concurrently into the same table and be robust.


In a mass-loading application you can often divide the work between
threads in a manner that doesn't cause conflicts.


Yeah, but concurrency might not be the only reason to optionally update.
If the data is there you might rather want to overwrite it instead of
failing.


Kind regards

robert


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-13 Thread Robert Klemme

On 13.09.2011 20:57, Stefan Keller wrote:

Interesting debate.


Indeed.


2011/9/13 Marti Raudsepp ma...@juffo.org:

Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
SAVEPOINT it can roll back to in case of an error.


Are you sure? In theory I always understood that there are no
subtransactions.


What theory are you referring to?


In fact when looking at the docs there is chapter 39.6.6. saying By
default, any error occurring in a PL/pgSQL function aborts execution
of the function, and indeed of the surrounding transaction as well.
You can trap errors and recover from them by using a BEGIN block with
an EXCEPTION clause.
(http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html
)

So the doc isn't totally explicit about this. But whatever: What would
be the the function of a subtransaction? To give the possibility to
recover and continue within the surrounding transaction?


I find this pretty explicit:

It is important not to confuse the use of BEGIN/END for grouping 
statements in PL/pgSQL with the similarly-named SQL commands for 
transaction control. PL/pgSQL's BEGIN/END are only for grouping; they do 
not start or end a transaction. Functions and trigger procedures are 
always executed within a transaction established by an outer query — 
they cannot start or commit that transaction, since there would be no 
context for them to execute in. However, a block containing an EXCEPTION 
clause effectively forms a subtransaction that can be rolled back 
without affecting the outer transaction. For more about that see Section 
38.6.5.


http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html

Cheers

robert




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-12 Thread Robert Klemme

On 11.09.2011 22:10, Scott Marlowe wrote:


Another data point.  We had a big Oracle installation at my last job,
and OLAP queries were killing it midday, so I built a simple
replication system to grab rows from the big iron Oracle SUN box and
shove into a single core P IV 2.xGHz machine with 4 120G SATA drives
in SW RAID-10.

That machine handily beat the big iron Oracle machine at OLAP queries,
running in 20 minutes what was taking well over an hour for the big
Oracle machine to do, even during its (Oracle machine) off peak load
times.


Um, that sounds as if the SUN setup was really bad.  Do you remember any 
details about the drive configuration there?


Kind regards

robert




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-12 Thread Robert Klemme

On 11.09.2011 19:02, Marti Raudsepp wrote:

On Sun, Sep 11, 2011 at 17:23, Andy Colson a...@squeakycode.net wrote:

On 09/11/2011 08:59 AM, Igor Chudov wrote:

By the way, does that INSERT UPDATE functionality or something like this exist 
in Postgres?

You have two options:
1) write a function like:
create function doinsert(_id integer, _value text) returns void as
2) use two sql statements:


Unfortunately both of these options have caveats. Depending on your
I/O speed, you might need to use multiple loader threads to saturate
the write bandwidth.

However, neither option is safe from race conditions. If you need to
load data from multiple threads at the same time, they won't see each
other's inserts (until commit) and thus cause unique violations. If
you could somehow partition their operation by some key, so threads
are guaranteed not to conflict each other, then that would be perfect.
The 2nd option given by Andy is probably faster.

You *could* code a race-condition-safe function, but that would be a
no-go on a data warehouse, since each call needs a separate
subtransaction which involves allocating a transaction ID.


Wouldn't it be sufficient to reverse order for race condition safety? 
Pseudo code:


begin
  insert ...
catch
  update ...
  if not found error
end

Speed is another matter though...
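
A minimal PL/pgSQL sketch of that pattern (assuming a table
t(id integer primary key, val text); note that the BEGIN/EXCEPTION block
opens a subtransaction, as pointed out elsewhere in this thread):

create or replace function doinsert(_id integer, _val text) returns void as $$
begin
  begin
    insert into t (id, val) values (_id, _val);
  exception when unique_violation then
    update t set val = _val where id = _id;
    if not found then
      raise exception 'id % neither inserted nor updated', _id;
    end if;
  end;
end;
$$ language plpgsql;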

Kind regards

robert



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres for a data warehouse, 5-10 TB

2011-09-12 Thread Robert Klemme

On 12.09.2011 19:22, Andy Colson wrote:

On 9/12/2011 12:15 PM, Robert Klemme wrote:

On 11.09.2011 19:02, Marti Raudsepp wrote:

On Sun, Sep 11, 2011 at 17:23, Andy Colson a...@squeakycode.net wrote:

On 09/11/2011 08:59 AM, Igor Chudov wrote:

By the way, does that INSERT UPDATE functionality or something like
this exist in Postgres?

You have two options:
1) write a function like:
create function doinsert(_id integer, _value text) returns void as
2) use two sql statements:


Unfortunately both of these options have caveats. Depending on your
I/O speed, you might need to use multiple loader threads to saturate
the write bandwidth.

However, neither option is safe from race conditions. If you need to
load data from multiple threads at the same time, they won't see each
other's inserts (until commit) and thus cause unique violations. If
you could somehow partition their operation by some key, so threads
are guaranteed not to conflict each other, then that would be perfect.
The 2nd option given by Andy is probably faster.

You *could* code a race-condition-safe function, but that would be a
no-go on a data warehouse, since each call needs a separate
subtransaction which involves allocating a transaction ID.


Wouldn't it be sufficient to reverse order for race condition safety?
Pseudo code:

begin
insert ...
catch
update ...
if not found error
end

Speed is another matter though...



No, I dont think so, if you had two loaders, both would start a
transaction, then neither could see what the other was doing.


It depends.  But the point is that not both INSERTS can succeed.  The 
one which fails will attempt the UPDATE and - depending on isolation 
level and DB implementation - will be blocked or fail.


In the case of PG this particular example will work:

1. TX inserts new PK row
2. TX tries to insert same PK row => blocks
1. TX commits
2. TX fails with PK violation
2. TX does the update (if the error is caught)


There are
transaction isolation levels, but they are like playing with fire. (in
my opinion).


You make them sound like witchcraft.  But they are clearly defined - 
even standardized.  Granted, different RDBMS might implement them in 
different ways - here's PG's view of TX isolation:


http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html

In my opinion anybody working with RDBMS should make himself familiar 
with this concept - at least know about it - because it is one of the 
fundamental features of RDBMS and certainly needs consideration in 
applications with highly concurrent DB activity.


Kind regards

robert


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] settings input for upgrade

2011-08-21 Thread Robert Klemme
On Sat, Aug 20, 2011 at 8:33 PM, Midge Brown midg...@sbcglobal.net wrote:
 Robert,

 I was largely looking for input on whether I may have inadvertently shot
 myself in the foot with some of the choices I made when setting up
 postgresql 9.0, which is on different hardware than was the 7.4 setup.

OK, I thought the config change was the diff for the other two databases
and not for 9.0.

 The splitting of one table to two separate databases was done on 7.4 and did
 make a positive change in write performance. I was including that
 information only in an attempt to provide as much detail as possible.

Good to know!  Thanks!

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] settings input for upgrade

2011-08-20 Thread Robert Klemme
On Thu, Aug 18, 2011 at 11:55 PM, Midge Brown midg...@sbcglobal.net wrote:
 I'm in the process of upgrading from postgres 7.4.8 to 9.0.4 and wanted to
 run my decisions past some folks who can give me some input on whether my
 decisions make sense or not.

I am not sure what decisions you actually refer to here: in your
posting I can only see description of the current setup but no
decisions for the upgrade (i.e. changed parameters, other physical
layout etc.).

 The others are very write-heavy, started as one table within the original
 DB, and were split out on an odd/even id # in an effort to get better
 performance:

Did it pay off?  I mean you planned to increase performance and did
this actually happen?  Apart from reserving IO bandwidth (which you
achieved by placing data on different disks) you basically only added
reserved memory for each instance by separating them.  Or are there
any other effects achieved by separating (like reduced lock contention
on some globally shared resource, distribution of CPU for logging)?

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance penalty when using WITH

2011-08-03 Thread Robert Klemme
On Tue, Aug 2, 2011 at 11:48 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Sat, Jul 30, 2011 at 8:10 AM, Robert Klemme
 shortcut...@googlemail.com wrote:
 On Thu, Jul 28, 2011 at 11:00 PM, Li Jin l...@tripadvisor.com wrote:
 I met with the problem that when I was using WITH clause to reuse a
 subquery, I got a huge performance penalty because of query planner.
 Here are the details, the original query is
 EXPLAIN ANALYZE WITH latest_identities AS
 (
     SELECT DISTINCT ON (memberid) memberid, username, changedate
     FROM t_username_history
     WHERE memberid IN (SELECT memberid FROM t_member WHERE firstname || ' '
 || substring(lastname,1,1) = 'Eddie T')
     ORDER BY memberid, changedate DESC
 )

 Another observation: That criterion looks suspicious to me. I would
 expect any RDBMS to be better able to optimize this:

 WHERE firstname = 'Eddie' AND lastname like 'T%'

 I know it's semantically not the same but I would assume this is good
 enough for the common usecase.  Plus, if there is an index on
 (firstname, lastname) then that could be used.

 disagree. just one of the ways that could be stymied would be to change
 the function behind the '||' operator.

I don't understand what you mean.  Can you please elaborate?

To explain my point a bit: I meant that by querying individual fields
separately instead of applying a criterion on a function of the two
the RDBMS has a better chance to use indexes and come up with a better
plan for this part of the query.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance penalty when using WITH

2011-08-03 Thread Robert Klemme
On Wed, Aug 3, 2011 at 3:27 PM, Li Jin l...@tripadvisor.com wrote:
 Robert,
 I've built an index on this expression firstname || ' ' ||
 substring(lastname,1,1). I believe this is the best index for this
 particular query. Correct me if I am wrong.

Maybe, maybe not.  Difficult to tell from a distance.  I would have an
index on (firstname, lastname).  You could try that and look at the
plan for the other query.  That's the only ultimate test which will
give you hard facts.

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance penalty when using WITH

2011-08-03 Thread Robert Klemme
On Wed, Aug 3, 2011 at 6:24 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Aug 3, 2011 at 2:18 AM, Robert Klemme
 shortcut...@googlemail.com wrote:
 Another observation: That criterion looks suspicious to me. I would
 expect any RDBMS to be better able to optimize this:

 WHERE firstname = 'Eddie' AND lastname like 'T%'

 I know it's semantically not the same but I would assume this is good
 enough for the common usecase.  Plus, if there is an index on
 (firstname, lastname) then that could be used.

  disagree. just one of the ways that could be stymied would be to change
  the function behind the '||' operator.

 I don't understand what you mean.  Can you please elaborate?

 To explain my point a bit: I meant that by querying individual fields
 separately instead of applying a criterion on a function of the two
 the RDBMS has a better chance to use indexes and come up with a better
 plan for this part of the query.

 Yes, but your assuming that it is safe and generally advantageous to
 do that.  Both assumptions I think are false.

I am not sure why you say I assume this is _safe_.  I said it is good
enough for the common usecase.  And it is certainly good enough for
this particular query.

As for the generally advantageous I'd say that an index on raw
column values is usually useful for more queries than an index on a
specific function.  That's why I'd say generally an index on column
values is more versatile and I would prefer it.  Of course you might
achieve orders of magnitude of speedup for individual queries with an
index on a function tailored to that particular query but if you need
to do that for multiple queries you pay a higher penalty for updates.

 The || operator is trivially hacked:
 create or replace function funky_concat(l text, r text) returns text as
 $$
  select textcat(textcat($1, 'abc'), $2);
 $$ language sql immutable ;

 update pg_operator set oprcode = 'funky_concat' where oid = 654;

 postgres=# select 'a' || 'b';
 ?column?
 --
  aabcb
 (1 row)

 Also even ignoring the above it's not free to have the database try
 and analyze every instance of the || operator to see if it can be
 decomposed to boolean field operations.

Even with your hacked operator you would need an index on the
expression to make it efficient.  That could be done with the original
|| as well.  But my point was to query

WHERE a = 'foo' and b like 'b%'
instead of WHERE a || ' ' || substring(b, 1, 1) = 'foo b'

to use an index on (a,b).  That index would also be useful for queries like

WHERE a = 'foo'
WHERE a like 'fo%'
WHERE a = 'foo' and b = 'bar'

and probably also

WHERE a > 'foo'
WHERE a > 'foo' and b like 'b%'
WHERE a > 'foo' and b = 'bar'
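
To make the comparison concrete (sketch; t_member and its columns are
taken from the query quoted earlier in the thread):

create index t_member_name_expr_idx
    on t_member ((firstname || ' ' || substring(lastname, 1, 1)));
create index t_member_name_idx
    on t_member (firstname, lastname);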

Kind regards

robert


PS: Sorry for the earlier duplicate.  Gmail had a hiccup.

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Trigger or Function

2011-08-01 Thread Robert Klemme
On Sat, Jul 30, 2011 at 3:01 AM, Gavin Flower
gavinflo...@archidevsys.co.nz wrote:
 On 24/07/11 03:58, alan wrote:

 My first approach would be to remove WeekAvg and MonthAvg from the
 table and create a view which calculates appropriate values.

 Thanks Robert, I had to upgrade to 9.0.4 to use the extended windowing
 features.
 Here is how I set it up. If anyone sees an issue, please let me know.
 I'm new to postgres.

 Basically, my daily_vals table contains HOST, DATE,  VALUE columns.
 What I wanted was a way to automatically populate a 4th column
 called rolling_average, which would be the sum of n preceding
 columns.

There seems to be a contradiction in the naming here.  Did you mean "avg
of n preceding columns"?

 I created a view called weekly_average using this VIEW statement.

 CREATE OR REPLACE
   VIEW weekly_average
     AS SELECT *, sum(value) OVER (PARTITION BY host
             ORDER BY rid
             ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) as rolling_average FROM daily_vals;

 The above gives just the rolling sum, you need to divide by the number of
 rows in the sum to get the average (I assume you want the arithmetic mean,
  as there are many types of average!).

 CREATE OR REPLACE
  VIEW weekly_average
    AS SELECT
        *,
        round((sum(value) OVER mywindow / LEAST(6, (row_number() OVER
 mywindow))), 4) AS rolling_average
    FROM daily_vals
    WINDOW mywindow AS
    (
        PARTITION BY host
        ORDER BY rid
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    );

Why not

CREATE OR REPLACE
  VIEW weekly_average
AS SELECT *, avg(value) OVER (PARTITION BY host
ORDER BY rid
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) as rolling_average FROM daily_vals;

What did I miss?

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance penalty when using WITH

2011-07-30 Thread Robert Klemme
On Thu, Jul 28, 2011 at 11:00 PM, Li Jin l...@tripadvisor.com wrote:
 I met with the problem that when I was using WITH clause to reuse a
 subquery, I got a huge performance penalty because of query planner.
 Here are the details, the original query is
 EXPLAIN ANALYZE WITH latest_identities AS
 (
     SELECT DISTINCT ON (memberid) memberid, username, changedate
     FROM t_username_history
     WHERE memberid IN (SELECT memberid FROM t_member WHERE firstname || ' '
 || substring(lastname,1,1) = 'Eddie T')
     ORDER BY memberid, changedate DESC
 )

Another observation: That criterion looks suspicious to me. I would
expect any RDBMS to be better able to optimize this:

WHERE firstname = 'Eddie' AND lastname like 'T%'

I know it's semantically not the same but I would assume this is good
enough for the common usecase.  Plus, if there is an index on
(firstname, lastname) then that could be used.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Trigger or Function

2011-07-14 Thread Robert Klemme
On Tue, Jul 12, 2011 at 9:41 AM, alan alan.mill...@gmail.com wrote:
 Hello,
 I'm a postgres newbie and am wondering what's the best way to do this.

 I am gathering some data and will be inserting to a table once daily.
 The table is quite simple but I want the updates to be as efficient as
 possible since
 this db is part of a big data project.

 Say I have a table with these columns:
 | Date | Hostname | DayVal | WeekAvg | MonthAvg |

 When I insert a new row I have the values for Date, Hostname, DayVal.
 Is it possible to define the table is such a way that the WeekAvg and
 MonthAvg
 are automatically updated as follows?
  WeekAvg = current rows DayVal plus the sum of DayVal for the
 previous 6 rows.
  MonthAvg = current row's DayVal plus the sum of DayVal for the
 previous 29 rows.

 Should I place the logic in a Trigger or in a Function?
 Does someone have an example or a link showing how I could set this
 up?

IMHO that design does not fit the relational model well because you
are trying to store multirow aggregate values in individual rows.  For
example, your values will be wrong if you insert rows in the wrong
order (i.e. today's data before yesterday's data).

My first approach would be to remove WeekAvg and MonthAvg from the
table and create a view which calculates appropriate values.
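
For illustration, such a view could look roughly like this (untested
sketch; it assumes columns date, hostname and dayval with one row per
host and day, and uses avg() - use sum() if a rolling sum is really what
is wanted):

create view daily_vals_with_avgs as
select date, hostname, dayval,
       avg(dayval) over w_week  as weekavg,
       avg(dayval) over w_month as monthavg
from daily_vals
window w_week  as (partition by hostname order by date
                   rows between 6 preceding and current row),
       w_month as (partition by hostname order by date
                   rows between 29 preceding and current row);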

If that proves too inefficient (e.g. because the data set is too huge
and too much data is queried for individual queries) we can start
optimizing.  One approach to optimizing would be to have secondary
tables

| Week | Hostname | WeekAvg |
| Month | Hostname | MonthAvg |

and update them with an insert trigger and probably also with an
update and delete trigger.

If you actually need increasing values (i.e. running totals) you can
use windowing functions (analytic SQL in Oracle).  View definitions
then of course need to change.
http://www.postgresql.org/docs/9.0/interactive/queries-table-expressions.html#QUERIES-WINDOW

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] UPDATEDs slowing SELECTs in a fully cached database

2011-07-14 Thread Robert Klemme
On Thu, Jul 14, 2011 at 4:05 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 It seems like we ought to distinguish heap cleanup activities from
 user-visible semantics (IOW, users shouldn't care if a HOT cleanup
 has to be done over after restart, so if the transaction only
 wrote such records there's no need to flush).  This'd require more
 process-global state than we keep now, I'm afraid.

 That makes sense, and seems like the right long-term fix.  It seems
 like a boolean might do it; the trick would be setting it (or not)
 in all the right places.

I also believe this is the right way to go.  I think the crucial thing
is to "distinguish heap cleanup activities from user-visible
semantics" - basically this is what happens with auto vacuum: it does
work concurrently that you do not want to burden on user transactions.

 Another approach we could take (also nontrivial) is to prevent
 select-only queries from doing HOT cleanups.  You said upthread
 that there were alleged performance benefits from aggressive
 cleanup, but IMO that can charitably be described as unproven.
 The real reason it happens is that we didn't see a simple way for
 page fetches to know soon enough whether a tuple update would be
 likely to happen later, so they just do cleanups unconditionally.

 Hmm.  One trivial change could be to skip it when the top level
 transaction is declared to be READ ONLY.  At least that would give
 people a way to work around it for now.  Of course, that can't be
 back-patched before 9.1 because subtransactions could override READ
 ONLY before that.

What I don't like about this approach is that it a) increases
complexity for the user, b) might not be for everyone (i.e. tools like
OR mappers which do not allow such setting of the TX or cases where
you do not know what type of TX this is when you start it) and c) it
still keeps the performance penalty to suddenly come to haunt a
different TX.

I can only speculate whether the latter might actually cause other
people to run into issues because their usage patterns currently force
the cleanout activities into an unimportant TX while the workaround
would suddenly have the cleanout delay show up in an important TX
which used to be fast.  This is also hard to debug since you would
normally only look at the slow TX before you realize you need to look
elsewhere (the length of this thread is kind of proof of this already
:-)).

My 0.01 EUR...

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] 100% CPU Utilization when we run queries.

2011-07-07 Thread Robert Klemme
On Wed, Jul 6, 2011 at 9:04 PM, Tomas Vondra t...@fuzzy.cz wrote:
 Dne 6.7.2011 15:30, bakkiya napsal(a):
 Any help, please?

 According to the EXPLAIN ANALYZE output (please, don't post it to the
 mailing list directly - use something like explain.depesz.com, I've done
 that for you this time: http://explain.depesz.com/s/HMN), you're doing a
 UNIQUE over a lot of data (2 million rows, 1.5GB).

 That is done by sorting the data, and sorting is very CPU intensive task
 usually. So the fact that the CPU is 100% utilized is kind of expected
 in this case. So that's a feature, not a bug.

 In general each process is hitting some bottleneck. It might be an I/O,
 it might be a CPU, it might be something less visible (memory bandwidth
 or something like that).

 But I've noticed one thing in your query - you're doing a UNIQUE in the
 view (probably, we don't know the definition) and then once again in the
 query (but using just one column from the view).

 The problem is the inner sort does not remove any rows (1979735 rows
 in/out). Why do you do the UNIQUE in the view? Do you really need it
 there? I guess removing it might significantly improve the plan.

 Try to do the query without the view - it seems it's just an union of
 current tables and a history (both partitioned, so do something like this)

 SELECT DISTINCT init_service_comp FROM (
  SELECT init_service_comp FROM events
  UNION
  SELECT init_service_comp FROM hist_events
 )

 or maybe even

 SELECT DISTINCT init_service_comp FROM (
  SELECT DISTINCT init_service_comp FROM events
  UNION
  SELECT DISTINCT init_service_comp FROM hist_events
 )

 Let's see how that works - post EXPLAIN ANALYZE using explain.depesz.com

In this case UNION ALL is probably more appropriate than UNION - and
may have different performance characteristics (saving the UNIQUE?).
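
For illustration (sketch only; note the derived table also needs an alias):

SELECT DISTINCT init_service_comp FROM (
  SELECT init_service_comp FROM events
  UNION ALL
  SELECT init_service_comp FROM hist_events
) t;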

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Long Running Update - My Solution

2011-06-27 Thread Robert Klemme
On Mon, Jun 27, 2011 at 5:37 PM,  t...@fuzzy.cz wrote:
 The mystery remains, for me: why updating 100,000 records could complete
 in as quickly as 5 seconds, whereas an attempt to update a million
 records was still running after 25 minutes before we killed it?

 Hi, there's a lot of possible causes. Usually this is caused by a plan
 change - imagine for example that you need to sort a table and the amount
 of data just fits into work_mem, so that it can be sorted in memory. If
 you need to perform the same query with 10x the data, you'll have to sort
 the data on disk. Which is way slower, of course.

 And there are other such problems ...

I would rather assume it is one of the other problems, typically
related to handling the TX (e.g. checkpoints, WAL, creating copies of
modified records and adjusting indexes...).

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bitmask index

2011-06-23 Thread Robert Klemme

On 06/22/2011 11:42 PM, Greg Smith wrote:

On 06/22/2011 05:27 PM, Marcus Engene wrote:

I have some tables with bitmask integers. Set bits are the interesting
ones. Usually they are sparse.


If it's sparse, create a partial index that just includes rows where the
bit is set:
http://www.postgresql.org/docs/current/static/indexes-partial.html


That would mean that if different bits are queried there would need to 
be several of those indexes.


Maybe it's an alternative to index all rows where misc_bits <> 0 and
include that criterion in the query.
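
A sketch of that alternative (table and column names assumed):

create index t_misc_bits_nonzero on t (misc_bits) where misc_bits <> 0;

-- the query repeats the index predicate so the partial index qualifies
select * from t where misc_bits <> 0 and (misc_bits & 4) <> 0;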


Kind regards

robert


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [GENERAL] [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move

2011-06-10 Thread Robert Klemme
On Thu, Jun 9, 2011 at 7:44 PM, Greg Smith g...@2ndquadrant.com wrote:

 **
 On 06/09/2011 07:43 AM, Willy-Bas Loos wrote:

 Well, after reading your article i have been reading some materail about it
 on the internet, stating that separating indexes from data for performance
 benefits is a myth.
 I found your comment  So then a single query will only ever access one of
 both at a time. very smart (no sarcasm there).
 I also found a thread
 http://asktom.oracle.com/pls/asktom/f?p=100:11:0P11_QUESTION_ID:901906930328
 on AskTom that said mainly the goal is to achieve even io. (that makes
 absolute sense)


 The idea that separating indexes and tables from one another via a
 tablespace is inherently good is a myth.  Particularly nowadays, where the
 fastest part of a drive is nearly twice as fast as the slowest one in
 sequential transfers, and the ratio between sequential and random I/O is
 huge.  Trying to get clever about breaking out a tablespace is unlikely to
 outsmart what you'd get if you just let the OS deal with that stuff.

 What is true is that when you have multiple tiers of storage speeds
 available, allocating the indexes and tables among them optimally is both
 difficult and potentially worthwhile.  A customer of mine has two drive
 arrays, one of which is about 50% faster than the other; second was added as
 expansion once the first filled.  Nowadays, both are 75% full, and I/O on
 each has to be carefully balanced.  Making sure the heavily hit indexes are
 on the fast array, and that less critical things are not placed there, is
 the difference between that site staying up or going down.

 The hidden surprise in this problem for most people is the day they
 discover that *the* most popular indexes, the ones they figured had to go on
 the fastest storage around, are actually sitting in RAM all the time
 anyway.  It's always fun and sad at the same time to watch someone spend a
 fortune on some small expensive storage solution, move their most
 performance critical data to it, and discover nothing changed.  Some days
 that works great; others, it's no faster all, because that data was already
 in memory.
 http://www.2ndQuadrant.com/books


Adding a few more thoughts to this: it is important to understand the very
different nature of read and write IO.  While write IO usually can be done
concurrently to different IO channels (devices) for read IO there are
typically dependencies, e.g. you need to read the index before you know
which part of the table you need to read.  Thus both cannot be done
concurrently for a *single select* unless the whole query is partitioned and
executed in parallel (Oracle can do that for example).  Even then each
parallel executor has this dependency between index and table data.  That's
the same situation as with concurrent queries into the same table and
index.  There are of course exceptions, e.g. during a sequential table scan
you can know beforehand which blocks need to be read next and fetch them
while processing the current block(s).  The buffering strategy also plays an
important role here.

Bottom line: one needs to look at each case individually, do the math and
ideally also measurements.

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


Re: [PERFORM] Why we don't want hints Was: Slow count(*) again...

2011-06-06 Thread Robert Klemme
On Thu, Feb 10, 2011 at 7:32 PM, Craig James craig_ja...@emolecules.com wrote:
 On 2/10/11 9:21 AM, Kevin Grittner wrote:

 Shaun Thomas stho...@peak6.com wrote:

 how difficult would it be to add that syntax to the JOIN
 statement, for example?

 Something like this syntax?:

 JOIN WITH (correlation_factor=0.3)

 Where 1.0 might mean that for each value on the left there was only
 one distinct value on the right, and 0.0 would mean that they were
 entirely independent?  (Just as an off-the-cuff example -- I'm not
 at all sure that this makes sense, let alone is the best thing to
 specify.  I'm trying to get at *syntax* here, not particular knobs.)

 There are two types of problems:

 1. The optimizer is imperfect and makes a sub-optimal choice.

 2. There is theoretical reasons why it's hard for the optimizer. For
 example, in a table with 50 columns, there is a staggering number of
 possible correlations.  An optimizer can't possibly figure this out, but a
 human might know them from the start.  The City/Postal-code correlation is a
 good example.

 For #1, Postgres should never offer any sort of hint mechanism.  As many
 have pointed out, it's far better to spend the time fixing the optimizer
 than adding hacks.

 For #2, it might make sense to give a designer a way to tell Postgres stuff
 that it couldn't possibly figure out. But ... not until the problem is
 clearly defined.

 What should happen is that someone writes with an example query, and the
 community realizes that no amount of cleverness from Postgres could ever
 solve it (for solid theoretical reasons). Only then, when the problem is
 clearly defined, should we talk about solutions and SQL extensions.

I don't have one such query handy.  However, I think your posting is a
good starting point for a discussion how to figure out what we need
and how a good solution could look like.  For example, one thing I
dislike about hints is that they go into the query.  There are a few
drawbacks of this approach

- Applications need to be changed to benefit which is not always possible.
- One important class of such applications are those that use OR
mappers - hinting then would have to be buried in OR mapper code or
configuration.
- Hints in the query work only for exactly that query (this might be
an advantage depending on point of view).

I think the solution should rather be to tell Postgres what it
couldn't possibly figure out.  I imagine that could be some form of
description of the distribution of data in columns and / or
correlations between columns.  Advantage would be that the optimizer
gets additional input which it can use (i.e. the usage can change
between releases), the information is separate from queries (more like
meta data for tables) and thus all queries using a particular table
which was augmented with this meta data would benefit.  Usage of this
meta data could be controlled by a flag per session (as well as
globally) so it would be relatively easy to find out whether this meta
data has become obsolete (because data changed or a new release of the
database is in use).

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL?

2011-05-23 Thread Robert Klemme
On Mon, May 23, 2011 at 5:54 PM, Dave Johansen davejohan...@gmail.com wrote:
 I apologize for the multiple posts. I sent this email right after joining
 the list and after it hadn't shown up a day later I figured that it had been
 lost or something and sent the other one.

Sorry for the nitpicking but I even see _three_ instances of this
posting (first on May 18th).

 Also, the database I posted this about does not have internet access and so
 I'm working on getting it moved over to a machine that does or getting the
 info onto a machine where I can post the pertinent information about the
 schema and explain outputs.

Great!

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL?

2011-05-22 Thread Robert Klemme
Dave,

how often do you want to repeat that posting?  What about instead
replying to the answers you got so far?

Cheers

robert


On Tue, May 17, 2011 at 5:31 PM, Dave Johansen davejohan...@gmail.com wrote:
 I am using Postgres 8.3 and I have an issue very closely related to the one
 described here:
 http://archives.postgresql.org/pgsql-general/2005-06/msg00488.php

 Basically, I have a VIEW which is a UNION ALL of two tables but when I do a
 select on the view using a LIMIT, it scans the entire tables and takes
 significantly longer than writing out the query with the LIMITs in the
 sub-queries themselves. Is there a solution to get the view to perform like
 the sub-query version?

 Thanks,
 Dave



-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] LIMIT and UNION ALL

2011-05-18 Thread Robert Klemme
On Wed, May 18, 2011 at 5:26 PM, Dave Johansen davejohan...@gmail.com wrote:
 I am using Postgres 8.3.3 and I have a VIEW which is a UNION ALL of two
 tables but when I do a select on the view using a LIMIT, it scans the entire
 tables and takes significantly longer than writing out the query with the
 LIMITs in the sub-queries themselves. Is there a solution to get the view to
 perform like the query with the LIMIT explicitly placed in the sub-queries?

Can you show DDL and queries?

The query with the LIMIT on the subqueries and the one with the LIMIT
on the overall query are not semantically equivalent.  Since the query
with the LIMIT on the view can have an ORDER BY before the LIMIT, the
database must see all the rows before it can apply the ordering and
properly determine the limit.  Although it might be possible to
determine under particular circumstances that only one of the tables
needs to be queried, or that the tables need only be queried
partially, I deem that quite complex.  I do not know whether Postgres
can do such optimizations, but for that we would certainly need to see
the concrete example (including constraints and indexes).
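
To make the non-equivalence concrete, here is a minimal sketch (table,
view and column names are made up):

CREATE VIEW v AS
  SELECT id FROM t1
  UNION ALL
  SELECT id FROM t2;

-- LIMIT on the view: the globally smallest ids can come from either
-- table, so in general both tables have to be considered in full
SELECT id FROM v ORDER BY id LIMIT 10;

-- naive push-down: each branch is cut off before the union is
-- ordered, so rows belonging in the overall top 10 can be missed
-- unless each branch is itself ordered the same way
SELECT id FROM (
  (SELECT id FROM t1 LIMIT 10)
  UNION ALL
  (SELECT id FROM t2 LIMIT 10)
) s
ORDER BY id LIMIT 10;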

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move

2011-05-17 Thread Robert Klemme
On Mon, May 16, 2011 at 4:31 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, May 16, 2011 at 4:19 AM, Robert Klemme
 shortcut...@googlemail.com wrote:
 - If the planner chooses a Bitmap Index Scan, it effectively scans the
 index to figure out which table blocks to read, and then reads those
 table blocks in block number order, so that the I/O is sequential,
 with skips.

 Are these two separate phases (i.e. first scan index completely, then
 access table)?

 Yes.

So a single query will only ever access one of the two at a time.

 Separating indexes and tables might not generally be a good idea.
 Richard Foote has an excellent article about Oracle, but I assume at
 least a few things do apply to PostgreSQL as well - it's at least
 worth using as something to check PostgreSQL's access patterns
 against:

 http://richardfoote.wordpress.com/2008/04/16/separate-indexes-from-tables-some-thoughts-part-i-everything-in-its-right-place/

 I would probably rather try to separate data by the nature and
 frequency of accesses.  One reasonable separation would be to leave
 all frequently accessed tables *and* their indexes on local RAID and
 move less frequently accessed data to the SAN.  This separation is
 easy to identify if you have separate tables for current and historic
 data.

 Yeah, I think the idea of putting tables and indexes in separate
 tablespaces is mostly to bring more I/O bandwidth to bear on the same
 data.

Richard commented on that as well, I believe it was in
http://richardfoote.wordpress.com/2008/04/18/separate-indexes-from-tables-some-thoughts-part-ii-there-there/

The main point is that you do not benefit from the larger IO bandwidth
if access patterns do not permit parallel access to both disks (e.g.
because you first need to read index blocks in order to know the table
blocks to read).  The story might be different though if you have a
lot of concurrent accesses.  But even then, if the table is a hotspot
chances are that index blocks are cached and you only need physical IO
for table blocks...

  But there are other reasonable things you might do also - e.g.
 put the indexes on an SSD, and the tables on a spinning disk, figuring
 that the SSD is less reliable but you can always rebuild the index if
 you need to...

Richard commented on that theory as well:
http://richardfoote.wordpress.com/2008/05/02/indexes-in-their-own-tablespace-recoverability-advantages-get-back/

The point: if you do the math you might figure that lost indexes lead
to so much downtime that you don't want to risk it, and the rebuild
isn't all that simple (in terms of time).  For a reasonably sized
database, recovery might be significantly faster than rebuilding.

 Also, a lot of people have reported big speedups from putting pg_xlog
 on a dedicated RAID 1 pair, or moving the PostgreSQL logs off the data
 partition.  So those sorts of divisions should be considered also.

Now, this is something I'd seriously consider because access patterns
to pg_xlog are vastly different from those of table and index data!
So you want to have pg_xlog on a device with high reliability and high
write speed.

 Your idea of dividing things by access frequency is another good
 thought.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move

2011-05-17 Thread Robert Klemme
On Tue, May 17, 2011 at 11:47 AM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 On 05/17/2011 03:00 PM, Robert Klemme wrote:

 The main point is that you do not benefit from the larger IO bandwidth
 if access patterns do not permit parallel access to both disks (e.g.
 because you first need to read index blocks in order to know the table
 blocks to read).

 This makes me wonder if Pg attempts to pre-fetch blocks of interest for
 areas where I/O needs can be known in advance, while there's still other
 works or other I/O to do. For example, pre-fetching for the next iteration
 of a nested loop while still executing the prior one. Is it even possible?

 I'm guessing not, because (AFAIK) Pg uses only synchronous blocking I/O, and
 with that there isn't really a way to pre-fetch w/o threads or helper
 processes. Linux (at least) supports buffered async I/O, so it'd be possible
 to submit such prefetch requests ... on modern Linux kernels. Portably doing
 so, though - not so much.

There is a much more serious obstacle than the merely technical one
(if that was one at all): prefetching is only reasonable if you can
predict which data you need with high probability (say >= 80%).  If
you can't, you'll have much more IO than without prefetching, and
overall performance will likely suffer.  Naturally that probability
depends on the data at hand and the access pattern.  As Cédric wrote,
there seems to be at least one case where it's done.
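
If the case in question is the prefetching done for bitmap heap scans
(an assumption on my part - I don't have Cédric's mail in front of
me), it is controlled by effective_io_concurrency, available since
8.4:

-- number of concurrent prefetch requests a bitmap heap scan may
-- issue; 0 disables prefetching
SET effective_io_concurrency = 8;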

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move

2011-05-16 Thread Robert Klemme
On Fri, May 13, 2011 at 9:04 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, May 4, 2011 at 6:31 AM, Willy-Bas Loos willy...@gmail.com wrote:
 I'm asking them for (real) benchmarks, thanks for the advice. (fio is not
 available for us now to do it myself, grmbl)
 It just occurred to me that it is not necessarily the case that reading the
 indexes causes a lot of random I/O (on the index itself).
 I mean, maybe the index is generally read sequentially and then, when
 retrieving the data, there is a lot of random I/O.
 if it's a long story, any tips for info about this (book or web site)?

 If you don't do anything special, and if the query plan says Index
 Scan rather than Bitmap Index Scan, then both the index I/O and the
 table I/O are likely to be fairly random.  However there are a number
 of cases in which you can expect the table I/O to be sequential:

 - In some cases, you may happen to insert rows with an ordering that
 matches the index.  For example, if you have a table with not too many
 updates and deletes, and an index on a serial column, then new rows
 will have a higher value in that column than old rows, and will also
 typically be physically after older rows in the file.  Or you might be
 inserting timestamped data from oldest to newest.
 - If the planner chooses a Bitmap Index Scan, it effectively scans the
 index to figure out which table blocks to read, and then reads those
 table blocks in block number order, so that the I/O is sequential,
 with skips.

Are these two separate phases (i.e. first scan index completely, then
access table)?

 - If you CLUSTER the table on a particular index, it will be
 physically ordered to match the index's key ordering.  As the table is
 further modified the degree of clustering will gradually decline;
 eventually you may wish to re-CLUSTER.

 It's also worth keeping in mind that the index itself won't
 necessarily be accessed in physically sequential order.  The point of
 the index is to emit the rows in key order, but if the table is
 heavily updated, it won't necessarily be the case that a page
 containing lower-valued keys physically precedes a page containing
 higher-valued keys.  I'm actually somewhat fuzzy on how this works,
 and to what extent it's a problem in practice, but I am fairly sure it
 can happen.

Separating indexes and tables might not generally be a good idea.
Richard Foote has an excellent article about Oracle, but I assume at
least a few things do apply to PostgreSQL as well - it's at least
worth using as something to check PostgreSQL's access patterns
against:

http://richardfoote.wordpress.com/2008/04/16/separate-indexes-from-tables-some-thoughts-part-i-everything-in-its-right-place/

I would probably rather try to separate data by the nature and
frequency of accesses.  One reasonable separation would be to leave
all frequently accessed tables *and* their indexes on local RAID and
move less frequently accessed data to the SAN.  This separation is
easy to identify if you have separate tables for current and historic
data.
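
That separation could be expressed with tablespaces; a rough sketch
(the path and table names are invented):

CREATE TABLESPACE san_ts LOCATION '/mnt/san/pgdata';

-- move rarely touched historic data (and its index) to the SAN,
-- keep the hot current tables on the local RAID (default tablespace)
ALTER TABLE orders_history SET TABLESPACE san_ts;
ALTER INDEX orders_history_pkey SET TABLESPACE san_ts;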

Kind regards

robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] big distinct clause vs. group by

2011-04-19 Thread Robert Klemme
On Mon, Apr 18, 2011 at 7:13 PM, Uwe Bartels uwe.bart...@gmail.com wrote:
 the aggregate function I was talking about is the function I need to use for
 the non-group-by columns, like min() in my example.
 There are of course several functions to choose from, and I wanted to know
 which one uses as few resources as possible.

If you do not care about the output of the non-key columns, why do you
include them in the query at all?  That would certainly be the
cheapest option.

If you need _any_ column value you can use a constant.

rklemme= select * from t1;
 k | v
---+---
 0 | 0
 0 | 1
 1 | 2
 1 | 3
 2 | 4
 2 | 5
 3 | 6
 3 | 7
 4 | 8
 4 | 9
(10 rows)

rklemme= select k, 99 as v from t1 group by k order by k;
 k | v
---+----
 0 | 99
 1 | 99
 2 | 99
 3 | 99
 4 | 99
(5 rows)

rklemme=

Greetings from Paderborn

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] How to configure a read-only database server?

2011-04-19 Thread Robert Klemme
On Tue, Apr 19, 2011 at 12:08 AM, Stefan Keller sfkel...@gmail.com wrote:
 I browsed the faq and looked at PostgreSQL performance books but I
 could not find the obvious:
 How to configure a read-only database server?

 I have a single-disk virtual Linux system and a read-only dataset
 which is exposed to internet and completely replaced from time to
 time.

 This is what I found so far:

 * Disabling autovacuum daemon.

I guess this will give you only small benefits as the daemon won't
find any tables with modifications.

 * Setting postgresql.conf parameters:
   fsync=off
   synchronous_commit=off

Since you don't commit changes, the effect of this might be small as well.

   full_page_writes=off

 * For the session:
   SET transaction_read_only TO FALSE;

Did you mean TRUE?

   SET TRANSACTION READ ONLY;

What about

ALTER DATABASE x SET default_transaction_read_only = on;

?

 * What about wal_level and archive_mode?

 = Any comments on speeding up/optimizing such database server?

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] big distinct clause vs. group by

2011-04-19 Thread Robert Klemme
On Tue, Apr 19, 2011 at 10:47 AM, Uwe Bartels uwe.bart...@gmail.com wrote:
 Oh, I do care about these columns.
 But by using an group by on the key columns, I cannot select the columns as
 they are. Otherwise you get an error message.
 So I have to use an aggregate functionlike min().

I find that slightly contradictory: either you do care about the
values, in which case your business requirements dictate the aggregate
function; or you only want to pick some value actually present in the
table without caring which one (e.g. MIN or MAX or any other), in
which case you don't actually care about the value.  Note that SELECT
a, MAX(b) ... GROUP BY a and SELECT a, MIN(b) ... GROUP BY a are not
equivalent.  And if you do not care, there is probably no point in
selecting those columns at all; at best you could use a constant
standing in for any legal value.
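
A tiny illustration with a made-up table t(a, b): the two aggregate
queries below return different values of b whenever a group contains
more than one distinct b, so the choice has to come from the
requirements; the third is the constant variant:

SELECT a, min(b) FROM t GROUP BY a;   -- smallest b per a
SELECT a, max(b) FROM t GROUP BY a;   -- largest b per a
SELECT a, 0 AS b FROM t GROUP BY a;   -- if the value truly does not matter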

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Which gives good performance? separate database vs separate schema

2010-11-26 Thread Robert Klemme
On Thu, Nov 25, 2010 at 4:46 PM,  t...@fuzzy.cz wrote:
 I am not facing any issues, but yes I want to have optimal performance for
 SELECT and INSERT, especially when I am doing these ops repeatedly.
 Actually I am porting from Oracle to PG. Oracle starts a lot of processes
 when
 it needs to run many schemas. I do not think PG would need much more
 resources
 (mem, cpu) if I go for different database for each process..? Also, is
 there any
 limit on number of databases I can start using a PG server?

 Hm, I would try to run that using single cluster, and only if that does
 not perform well I'd try multiple clusters. Yes, Oracle starts a lot of
 processes for an instance, and then some processes for each connection.

 But again - in PostgreSQL, you do not start databases. You start a
 cluster, containing databases and then there are connections. This is
 similar to Oracle where you start instances (something like cluster in
 PostgreSQL) containing schemas (something like databases in PostgreSQL).
 And then you create connections, which is the object consuming processes
 and memory.

 PostgreSQL will create one process for each connection (roughly the same
 as Oracle in case of dedicated server). And yes, the number of connections
 is limited - see max_connections parameter in postgresql.conf.

I think this is a pretty common trade-off: basically the question is
whether one wants to reserve resources or share them.  In this case
the resources would be memory and maybe also disk IO.  With two
separate clusters each one has its own memory, which means that if one
instance is idle and the other one is under high load, the idle
instance's memory cannot be used by the other one.  With a single
cluster all the memory is shared, which has the downside that high
load on one database can affect the others.

It depends on the usage patterns (load) and the user's policy which
way to go.  Since the OP mentioned many instances, the overhead of
many clusters (even if idle) may come into play as well.  Plus, a
single cluster is likely easier to administer than multiple ones.  But
of course the more databases there are in a single cluster, the higher
the likelihood of bottlenecks (see the other thread "Performance under
contention").

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] best db schema for time series data?

2010-11-19 Thread Robert Klemme
On Fri, Nov 19, 2010 at 10:50 AM, Louis-David Mitterrand
vindex+lists-pgsql-performa...@apartia.org wrote:
 On Tue, Nov 16, 2010 at 05:28:19PM +0100, Harald Fuchs wrote:
 In article 4ce2688b.2050...@tweakers.net,
 Arjen van der Meijden acmmail...@tweakers.net writes:

  On 16-11-2010 11:50, Louis-David Mitterrand wrote:
  I have to collect lots of prices from web sites and keep track of their
  changes. What is the best option?
 
  1) one 'price' row per price change:
 
  create table price (
  id_price primary key,
  id_product integer references product,
  price integer
  );
 
  2) a single 'price' row containing all the changes:
 
  create table price (
  id_price primary key,
  id_product integer references product,
  price integer[] -- prices are 'pushed' on this array as they change
  );
 
  Which is bound to give the best performance, knowing I will often need
  to access the latest and next-to-latest prices?

  If you mostly need the last few prices, I'd definitaly go with the
  first aproach, its much cleaner. Besides, you can store a date/time
  per price, so you know when it changed. With the array-approach that's
  a bit harder to do.

 I'd probably use a variant of this:

   CREATE TABLE prices (
     pid int NOT NULL REFERENCES products,
     validTil timestamp(0) NULL,
     price int NOT NULL,
     UNIQUE (pid, validTil)
   );

 The current price of a product is always the row with validTil IS NULL.
 The lookup should be pretty fast because it can use the index of the
 UNIQUE constraint.

Even better: with a partial index the lookup should be more efficient
and will probably stay that way even when the number of prices
increases (and the number of products stays the same).  With

CREATE UNIQUE INDEX current_prices
ON prices (
  pid
)
WHERE validTil IS NULL;

I get

robert= explain select price from prices where pid = 12344 and
validTil is null;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using current_prices on prices  (cost=0.00..8.28 rows=1 width=4)
   Index Cond: (pid = 12344)
(2 rows)

The index can actually be used here.

(see attachment)

 The validTil idea is nice, but you have to manage that field with a
 trigger, right?

Well, you don't need to.  You can always do

begin;
update prices set validTil = current_timestamp
  where pid = 123 and validTil is NULL;
insert into prices values ( 123, null, 94 );
commit;

But with a trigger it would be more convenient of course.
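
A minimal sketch of such a trigger (assuming the prices table from
above and PL/pgSQL):

CREATE FUNCTION close_current_price() RETURNS trigger AS $$
BEGIN
  IF NEW.validTil IS NULL THEN
    -- close the previous "current" row of this product, if any
    UPDATE prices SET validTil = current_timestamp
     WHERE pid = NEW.pid AND validTil IS NULL;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER prices_close_current
BEFORE INSERT ON prices
FOR EACH ROW EXECUTE PROCEDURE close_current_price();

With that in place, inserting the new price row alone is enough:

INSERT INTO prices VALUES (123, NULL, 94);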

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
\timing on

DROP TABLE prices;

CREATE TABLE prices (
  pid int NOT NULL, -- REFERENCES products,
  validTil timestamp(0) NULL,
  price int NOT NULL,
  UNIQUE (pid, validTil)
);

CREATE UNIQUE INDEX current_prices
ON prices (
  pid
)
WHERE validTil IS NULL;

INSERT INTO prices
SELECT generate_series,
CASE MOD(generate_series,10)
WHEN 0 THEN NULL
ELSE current_timestamp - MOD(generate_series,10) * interval '1' day
END,
123
FROM generate_series(1,100);

explain select price from prices where pid = 12344 and validTil is null;


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-05 Thread Robert Klemme

On 11/03/2010 04:52 PM, Nick Matheson wrote:


We have an application that needs to do bulk reads of ENTIRE
Postgres tables very quickly (i.e. select * from table). We have
observed that such sequential scans run two orders of magnitude slower
than observed raw disk reads (5 MB/s versus 100 MB/s). Part of this is
due to the storage overhead we have observed in Postgres. In the
example below, it takes 1 GB to store 350 MB of nominal data. However
that suggests we would expect to get 35 MB/s bulk read rates.



Our business logic does operations on the resulting data such that
the output is several orders of magnitude smaller than the input. So
we had hoped that by putting our business logic into stored procedures
(and thus drastically reducing the amount of data flowing to the
client) our throughput would go way up. This did not happen.


Can you disclose what kinds of manipulations you want to do on the data? 
 I am asking because maybe there is a fancy query (possibly using 
windowing functions and / or aggregation functions) that gets you the 
speed that you need without transferring the whole data set to the client.
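
For instance, if the reduction were something like "latest reading per
key", a window function could do it entirely on the server (table and
column names are made up):

-- ship only one row per station instead of the full table
SELECT station_id, ts, value
FROM (
  SELECT station_id, ts, value,
         row_number() OVER (PARTITION BY station_id ORDER BY ts DESC) AS rn
  FROM measurements
) t
WHERE rn = 1;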



So our questions are as follows:

Is there any way using stored procedures (maybe C code that calls
SPI directly) or some other approach to get close to the expected 35
MB/s doing these bulk reads? Or is this the price we have to pay for
using SQL instead of some NoSQL solution. (We actually tried Tokyo
Cabinet and found it to perform quite well. However it does not measure
up to Postgres in terms of replication, data interrogation, community
support, acceptance, etc).


Kind regards

robert


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance