Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-06 Thread Scott Marlowe
On Thu, Jan 7, 2010 at 12:17 AM, Carlo Stonebanks wrote: > Our DB has an audit table which is 500M rows and growing. (FYI the objects being audited are grouped semantically, not individual field values.) Recently we wanted to add a new feature and we altered the table to add a new column.

Re: [PERFORM] Optimizer use of index slows down query by factor

2010-01-06 Thread Michael Ruf
Hi, Tom Lane wrote: > I think you need to see about getting this rowcount estimate to be more accurate: -> Index Scan using idx_link_1 on link (cost=0.00..680.51 rows=13477 width=26) (actual time=5.707..12.043 rows=126 loops=1)
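A common first step toward fixing a rowcount misestimate like the one Tom points at is raising the statistics target on the filtered column and re-analyzing. A sketch, assuming the misestimated predicate is on a column of link — the column name link_type here is illustrative, not from the thread:

```sql
-- Raise the per-column statistics target so ANALYZE samples more values
-- (link_type is a hypothetical name; use the column from the WHERE clause).
ALTER TABLE link ALTER COLUMN link_type SET STATISTICS 1000;
ANALYZE link;
-- Then re-run EXPLAIN ANALYZE and compare estimated vs. actual row counts.
```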

[PERFORM] Massive table (500M rows) update nightmare

2010-01-06 Thread Carlo Stonebanks
Our DB has an audit table which is 500M rows and growing. (FYI the objects being audited are grouped semantically, not individual field values). Recently we wanted to add a new feature and we altered the table to add a new column. We are backfilling this varchar(255) column by writing a TCL sc
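The usual advice on this list for a backfill at this scale is to update in bounded key ranges, with vacuuming between rounds, rather than one 500M-row UPDATE. A minimal sketch, assuming an integer primary key and a fill expression — audit_id, new_col, and compute_value() are illustrative names, not from the thread:

```sql
-- Backfill one bounded batch; keeps each transaction small so dead
-- tuples can be reclaimed between rounds instead of bloating the table.
UPDATE audit_table
   SET new_col = compute_value(old_col)
 WHERE audit_id BETWEEN 1 AND 100000
   AND new_col IS NULL;

-- Repeat with the next key range; VACUUM audit_table periodically.
```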

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Craig Ringer
On 7/01/2010 10:44 AM, Dmitri Girski wrote: Hi everybody, Many thanks to everyone who replied, I think we are on the right way. I've used tcpdump to generate the logs and there are a lot of dropped packets due to bad checksums. The network guy is currently looking at the problem and most likely this

Re: [PERFORM] Digesting explain analyze

2010-01-06 Thread Greg Smith
Jesper Krogh wrote: Is it possible to get PG to tell me how many rows fit in a disk page? All columns are sitting in "plain" storage according to \d+ on the table. select relname, round(reltuples / relpages) as "avg_rows_per_page" from pg_class where relpages > 0; -- Greg Smith 2nd
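Greg's one-liner, restricted to a single table and with the server block size shown alongside for context (the table name mytable is illustrative):

```sql
-- Average rows per page for one table, from planner statistics;
-- reltuples/relpages are estimates maintained by ANALYZE, not exact counts.
SELECT relname,
       current_setting('block_size') AS block_size,
       round(reltuples / relpages)   AS avg_rows_per_page
  FROM pg_class
 WHERE relname = 'mytable'
   AND relpages > 0;
```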

Re: [PERFORM] Digesting explain analyze

2010-01-06 Thread Jesper Krogh
Ron Mayer wrote: >> ...The inner sets are on average 3.000 for both id1 and id2 and a typical limit would be 100, so if I could convince postgresql to not fetch all of them then I would reduce the set retrieved by around 60. The dataset is quite large so the random query is not very lik

[PERFORM] Joining on text field VS int

2010-01-06 Thread Radhika S
Hi, I am going to test this out, but it would be good to know anyway. A large table is joined to a tiny table (8 rows) on a text field. Should I be joining on an int field (e.g. recid) instead of name? Is the performance affected in either case? Thanks.
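With only 8 rows, the lookup table fits in a single page either way, so the join key's type rarely dominates; what a surrogate int key buys is a narrower join column and smaller indexes on the big table. A sketch of the two variants (all table and column names are illustrative):

```sql
-- Variant 1: join on the text name stored in the big table.
SELECT b.*, s.name
  FROM big_table b
  JOIN small_lookup s ON s.name = b.category_name;

-- Variant 2: big_table stores a surrogate int key (recid) instead of the
-- name; the join compares 4-byte ints rather than variable-length text.
SELECT b.*, s.name
  FROM big_table b
  JOIN small_lookup s ON s.recid = b.category_recid;
```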

Re: [PERFORM] noob inheritance question

2010-01-06 Thread Nikolas Everett
Inheritance would only make sense if each of your categories had more columns. Say you had a "wines" category and only wines had a year column. It's probably not worth it for one or two columns, but if you've got a big crazy heterogeneous tree of stuff then it's probably appropriate. I'm with Ric
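Nikolas's wines example, spelled out as PostgreSQL table inheritance — a sketch with illustrative names:

```sql
CREATE TABLE products (
    id    serial PRIMARY KEY,
    name  text NOT NULL,
    price numeric
);

-- Only wines carry a vintage year; they inherit the common columns.
CREATE TABLE wines (
    year int
) INHERITS (products);

-- A scan of products also returns rows from wines (an Append over both
-- tables in the plan). Note that indexes and PRIMARY KEY constraints on
-- the parent are not inherited by the child table.
SELECT name, price FROM products;
```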

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Scott Marlowe
On Wed, Jan 6, 2010 at 7:44 PM, Dmitri Girski wrote: > Hi everybody, Many thanks to everyone who replied, I think we are on the right way. I've used tcpdump to generate the logs and there are a lot of dropped packets due to bad checksums. The network guy is currently looking at the problem and

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Dmitri Girski
Hi everybody, Many thanks to everyone who replied, I think we are on the right way. I've used tcpdump to generate the logs and there are a lot of dropped packets due to bad checksums. The network guy is currently looking at the problem and most likely this is a hardware issue. Cheers, Dmitri. On Tue, J

Re: [PERFORM] Digesting explain analyze

2010-01-06 Thread Robert Haas
On Wed, Jan 6, 2010 at 2:10 PM, Jesper Krogh wrote: > Hi. I have a table on the order of 100.000.000 rows, where each row is a tuple (id1, id2, evalue). Then I'd like to speed up a query like this: explain analyze select id from table where id1 =

Re: [PERFORM] noob inheritance question

2010-01-06 Thread Richard Neill
Zintrigue wrote: I'm hoping the inheritance feature will be a nice alternative method for me to implement categories in a particular database of products I need to keep updated. I suppose in MySQL I would probably do this by creating, for example, one table for the products, and then a table(s)

Re: [PERFORM] noob inheritance question

2010-01-06 Thread Richard Broersma
On Wed, Jan 6, 2010 at 3:53 PM, Zintrigue wrote: > I'm wondering if there's any performance penalty here, analogous to the penalty of JOINs in a regular RDBMS (versus an ORDBMS). If anyone can offer any insight as to how inheritance is actually executed (compared to JOINs especially), I'

[PERFORM] noob inheritance question

2010-01-06 Thread Zintrigue
Hello, I am a complete noob to Postgres and to this list, and I hope this will be the appropriate list for this question. I'm hoping the inheritance feature will be a nice alternative method for me to implement categories in a particular database of products I need to keep updated. I suppose in MySQL

Re: [PERFORM] Digesting explain analyze

2010-01-06 Thread Ron Mayer
Jesper Krogh wrote: > I have a table on the order of 100.000.000 rows, where each row is a tuple (id1, id2, evalue). Then I'd like to speed up a query like this: explain analyze select id from table where id1 = 2067 or id2 = 2067 order by evalue

[PERFORM] Digesting explain analyze

2010-01-06 Thread Jesper Krogh
Hi. I have a table on the order of 100.000.000 rows, where each row is a tuple (id1, id2, evalue). Then I'd like to speed up a query like this: explain analyze select id from table where id1 = 2067 or id2 = 2067 order by evalue asc limit 100;
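One standard rewrite for this shape of query is to split the OR into two index-friendly arms, limit each arm, and merge — with a composite index covering each arm so the per-arm ORDER BY/LIMIT can walk an index. A sketch (the table name t is illustrative; UNION rather than UNION ALL drops rows matched identically by both arms when id1 = id2 = 2067):

```sql
-- One composite index per arm, ordered by the sort key.
CREATE INDEX t_id1_evalue ON t (id1, evalue);
CREATE INDEX t_id2_evalue ON t (id2, evalue);

SELECT id
  FROM (
        (SELECT id, evalue FROM t WHERE id1 = 2067
          ORDER BY evalue ASC LIMIT 100)
        UNION
        (SELECT id, evalue FROM t WHERE id2 = 2067
          ORDER BY evalue ASC LIMIT 100)
       ) AS both_arms
 ORDER BY evalue ASC
 LIMIT 100;   -- at most 200 candidate rows ever get sorted here
```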

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Craig James
Dave Crooke wrote: The fact that the delays are clustered at (3 + 0.2 n) seconds, rather than a distributed range, strongly indicates a timeout and not (directly) a resource issue. 3 seconds is too fast for a timeout on almost any DNS operation, unless it has been modified, so I'd suspect it'

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Dave Crooke
The fact that the delays are clustered at (3 + 0.2 n) seconds, rather than a distributed range, strongly indicates a timeout and not (directly) a resource issue. 3 seconds is too fast for a timeout on almost any DNS operation, unless it has been modified, so I'd suspect it's the TCP layer, e.g. pe

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Robert Haas
On Tue, Jan 5, 2010 at 11:50 PM, Craig Ringer wrote: > Wireshark is your friend. +1. I think if you put a packet sniffer on the interface you are connecting from, it will become clear what the problem is in short order. ...Robert

Re: [PERFORM] pg_connect takes 3.0 seconds

2010-01-06 Thread Matthew Wakeling
On Wed, 6 Jan 2010, Dmitri Girski wrote: On the other hand, if I use IP addresses, this should not attract any possible issues with DNS, right? Not true. It is likely that the server program you are connecting to will perform a reverse DNS lookup to work out who the client is, for logging or
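Matthew's point about server-side reverse DNS is configurable in PostgreSQL: the log_hostname setting controls whether the server resolves client IP addresses to hostnames for logging, and turning it off removes that lookup from the connection path. A sketch (on the 8.x servers of this era the setting is edited in postgresql.conf, not via SQL):

```sql
-- Check whether connection logging currently resolves client hostnames.
SHOW log_hostname;

-- In postgresql.conf:
--   log_hostname = off
-- then reload the server configuration:
SELECT pg_reload_conf();
```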