Re: [PERFORM] big joins not converging

2011-03-11 Thread fork
... tables, and try again with something fuzzy. If you build the indices and use = and it is still slow, ask again here -- that shouldn't happen. "And you're right, fork: Record Linkage is in fact an entire academic discipline!" Indeed. Look into blocking and editing your data first, I think. I ...
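A sketch of that two-pass approach (exact match first, then fuzzy), assuming hypothetical tables voters and counties that share a countyname text column; pg_trgm is the contrib module that supplies the % similarity operator (CREATE EXTENSION needs 9.1+, older releases load the contrib SQL script instead):

    -- Pass 1: normalize and index both sides, then join with plain =.
    CREATE INDEX voters_county_idx   ON voters   (lower(trim(countyname)));
    CREATE INDEX counties_county_idx ON counties (lower(trim(countyname)));

    SELECT v.id, c.id
    FROM voters v
    JOIN counties c
      ON lower(trim(v.countyname)) = lower(trim(c.countyname));

    -- Pass 2: fuzzy-match only what pass 1 left unmatched.
    CREATE EXTENSION pg_trgm;
    CREATE INDEX counties_trgm_idx ON counties USING gin (countyname gin_trgm_ops);

    SELECT v.id, c.id, similarity(v.countyname, c.countyname) AS sim
    FROM voters v
    JOIN counties c ON v.countyname % c.countyname;  -- % means "similar enough"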

Re: [PERFORM] Tuning massive UPDATES and GROUP BY's?

2011-03-11 Thread fork
Marti Raudsepp (marti at juffo.org) writes: "If you don't mind long recovery times in case of a crash, set checkpoint_segments to ~100 and checkpoint_completion_target = 0.9; this will improve write throughput significantly." Sounds good. Also, if you don't mind CORRUPTing your database after a ...
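As a postgresql.conf excerpt, the two settings Marti names (the values are his bulk-load suggestions, not defaults); the corruption trade-off the preview cuts off at is presumably fsync = off, which is never worth it for data you want to keep:

    # Fewer, larger checkpoints: better bulk-write throughput,
    # at the cost of longer crash recovery.
    checkpoint_segments = 100
    # Spread each checkpoint's writes over 90% of the interval.
    checkpoint_completion_target = 0.9
    # fsync = off    # the "don't mind CORRUPTing" option; leave it on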

[PERFORM] Tuning massive UPDATES and GROUP BY's?

2011-03-10 Thread fork
Given that a massive UPDATE ... SET foo = bar || ' ' || baz; on a 12-million-row table (with about 100 columns -- the US Census PUMS for the 2005-2009 ACS) is never going to be that fast, what should one do to make it faster? I set work_mem to 2048MB, but it is currently using only a little bit ...
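One point worth making explicit: work_mem only sizes sorts and hash tables, so it cannot speed up the row rewrite itself; it does matter for the GROUP BY half of the question. A sketch with stand-in names, since the message is cut off before the details:

    -- The UPDATE rewrites all ~12M rows regardless of work_mem
    -- (assume a table pums with text columns foo, bar, baz):
    UPDATE pums SET foo = bar || ' ' || baz;

    -- work_mem does help a hash aggregate stay out of temp files:
    SET work_mem = '2GB';
    SELECT bar, count(*) FROM pums GROUP BY bar;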

Re: [PERFORM] Tuning massive UPDATES and GROUP BY's?

2011-03-10 Thread fork
Merlin Moncure (mmoncure at gmail.com) writes: "I am loath to create a new table from a select, since the indexes themselves take a really long time to build." You are aware that updating the field for the entire table, especially if there is an index on it (or any field being updated), ...
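The alternative Merlin is arguing for, sketched with the same stand-in names: write the new version of the table once, build the indexes at the end, then swap. An in-place UPDATE leaves a dead copy of every row and maintains the indexes as it goes; this pattern pays the index-build cost only once, against packed data.

    BEGIN;
    CREATE TABLE pums_new AS
    SELECT bar, baz,                 -- ...and the other ~100 columns...
           bar || ' ' || baz AS foo  -- compute the new value during the copy
    FROM pums;

    SET LOCAL maintenance_work_mem = '1GB';  -- speeds up the index build
    CREATE INDEX pums_new_foo_idx ON pums_new (foo);

    DROP TABLE pums;
    ALTER TABLE pums_new RENAME TO pums;
    COMMIT;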

Re: [PERFORM] big joins not converging

2011-03-10 Thread fork
Steve Atkins (steve at blighty.com) writes: On Mar 10, 2011, at 1:25 PM, Dan Ancona wrote: "Hi postgressers - As part of my work with voter file data, I pretty regularly have to join one large-ish (over 500k rows) table to another. Sometimes this is via a text field (countyname) + ..."