Re: [PERFORM] How to unique-ify HUGE table?

2008-12-24 Thread Kynn Jones
Thank you all for the very helpful advice. Upping work_mem made it possible for me to generate the table within this century without bringing the server to a near standstill. I have not yet experimented with GROUP BY, but I'll do this next. Cheers, Kynn

[PERFORM] How to unique-ify HUGE table?

2008-12-23 Thread Kynn Jones
Hi everyone! I have a very large 2-column table (about 500M records) from which I want to remove duplicate records. I have tried many approaches, but they all take forever. The table's definition consists of two short TEXT columns. It is a temporary table generated from a query: CREATE TEMP

[PERFORM] How to profile an SQL script?

2008-12-03 Thread Kynn Jones
Hi. I have a longish collection of SQL statements stored in a file that I run periodically via cron. Running this script takes a bit too long, even for a cron job, and I would like to streamline it. Is there a way to tell Postgres to print out, after each SQL statement is executed, how long it

Re: [PERFORM] How to profile an SQL script?

2008-12-03 Thread Kynn Jones
Andreas, Heikki: Thanks! Kynn

[PERFORM] Performance and IN clauses

2008-11-18 Thread Kynn Jones
Hi. I have a Perl script whose main loop generates thousands of SQL updates of the form UPDATE edge SET keep = true WHERE node1 IN ( $node_list ) AND node2 = $node_id; ...where here $node_list stands for a comma-separated list of integers, and $node_id stands for some integer. The list

[PERFORM] The many nulls problem

2008-03-14 Thread Kynn Jones
It often happens that a particular pieces of information is non-null for a small minority of cases. A superficially different manifestation of this is when two pieces of information are identical in all but a small minority of cases. This can be easily mapped to the previous description by

Re: [PERFORM] The many nulls problem

2008-03-14 Thread Kynn Jones
On Fri, Mar 14, 2008 at 3:46 PM, Heikki Linnakangas [EMAIL PROTECTED] wrote: tons of useful info snipped From performance point of view, I would go with a single table with NULL fields on PostgreSQL. Wow. I'm so glad I asked! Thank you very much! Kynn

Re: [PERFORM] The many nulls problem

2008-03-14 Thread Kynn Jones
On Fri, Mar 14, 2008 at 2:59 PM, Oleg Bartunov [EMAIL PROTECTED] wrote: have you seen contrib/hstore ? You can have one table with common attributes and hide others in hstore That's interesting. I'll check it out. Thanks! Kynn

Re: [PERFORM] Joins and DELETE FROM

2008-03-11 Thread Kynn Jones
Thank you for your post. I finally spent some quality time with the query planner section in the docs' server config chapter. Very instructive, even considering that most of it went over my head! On Sat, Mar 8, 2008 at 4:08 PM, Tom Lane [EMAIL PROTECTED] wrote: ...have you got

Re: [PERFORM] Q on views and performance

2008-02-26 Thread Kynn Jones
On Mon, Feb 25, 2008 at 11:56 AM, Matthew [EMAIL PROTECTED] wrote: On Mon, 25 Feb 2008, Kynn Jones wrote: This is just GREAT!!! It fits the problem to a tee. It makes the queries quick then? It is good that you ask. Clearly you know the story: a brilliant-sounding optimization

Re: [PERFORM] Q on views and performance

2008-02-25 Thread Kynn Jones
On Mon, Feb 25, 2008 at 8:45 AM, Matthew [EMAIL PROTECTED] wrote: On Fri, 22 Feb 2008, Kynn Jones wrote: Hi. I'm trying to optimize... (Q1) SELECT a1.word, a2.word FROM T a1 JOIN T a2 USING ( zipk ) WHERE a1.type = int1 AND a2.type = int2; Okay, try

Re: [PERFORM] Q on views and performance

2008-02-23 Thread Kynn Jones
On Fri, Feb 22, 2008 at 8:48 PM, Dean Gibson (DB Administrator) [EMAIL PROTECTED] wrote: On 2008-02-22 12:49, Kynn Jones wrote: Of course, I expect that using views Vint1 and Vint2... would result in a loss in performance relative to a version that used bona fide tables Tint1 and Tint2

Re: [PERFORM] Q on views and performance

2008-02-23 Thread Kynn Jones
On Fri, Feb 22, 2008 at 8:48 PM, Dean Gibson (DB Administrator) [EMAIL PROTECTED] wrote: On 2008-02-22 12:49, Kynn Jones wrote: Of course, I expect that using views Vint1 and Vint2... would result in a loss in performance relative to a version that used bona fide tables Tint1 and Tint2

Re: [PERFORM] Q on views and performance

2008-02-23 Thread Kynn Jones
Hi, Dean. The system I'm working with is very similar in spirit to a large multilingual dictionary covering 100 languages. Using this analogy, the type column would correspond to the language, and the zipk column would correspond to some language-independent key associated with a concept

[PERFORM] Q on views and performance

2008-02-22 Thread Kynn Jones
Hi. I'm trying to optimize the performance of a database whose main purpose is to support two (rather similar) kinds of queries. The first kind, which is expected to be the most common (I estimate it will account for about 90% of all the queries performed on this DB), has the following general

[PERFORM] Basic Q on superfluous primary keys

2007-04-14 Thread Kynn Jones
Consider these two very similar schemas: Schema 1: CREATE TABLE foo ( id serial PRIMARY KEY, frobnitz character(varying 100) NOT NULL UNIQUE ); CREATE TABLE bar ( id serial PRIMARY KEY, foo_id int REFERENCES foo(id) ) Schema 2: CREATE TABLE foo ( frobnitz character(varying 100)

Re: [PERFORM] Optimizing a huge_table/tiny_table join

2006-05-30 Thread Kynn Jones
On 5/24/06, Tom Lane [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] writes:Limit(cost=19676.75..21327.99 rows=6000 width=84)-Hash Join(cost=19676.75..1062244.81 rows=3788315 width=84)Hash Cond: (upper((outer.id)::text) = upper((inner.id)::text))-Seq Scan on huge_table h(cost= 0.00..51292.43