1) OS/Configuration64-bit Windows 7 Enterprise with 8G RAM and a Dual Core 2.67 
Ghz Intel CPUpostgresql-x64-9.0 (PostgreSQL 9.0.1, compiled by Visual C++ build 
1500, 64-bit)work_mem  2GBshared_buffers = 22) 
Datasetname,pages,tuples,pg_size_pretty"pivotbad";1870;93496;"15 
MB""pivotgood";5025;251212;"39 MB"
3) EXPLAIN (ANALYZE ON, BUFFERS ON)"Hash Join  (cost=16212.30..52586.43 
rows=92869 width=17) (actual time=25814.222..32296.765 rows=3163 loops=1)""  
Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = 
(pg.question)::text))""  Join Filter: ((COALESCE(pb.response, 
'MISSING'::character varying))::text <> (COALESCE(pg.response, 
'MISSING'::character varying))::text)""  Buffers: shared hit=384 read=6511, 
temp read=6444 written=6318""  ->  Seq Scan on pivotbad pb  (cost=0.00..2804.96 
rows=93496 width=134) (actual time=0.069..37.143 rows=93496 loops=1)""        
Buffers: shared hit=192 read=1678""  ->  Hash  (cost=7537.12..7537.12 
rows=251212 width=134) (actual time=24621.752..24621.752 rows=251212 loops=1)"" 
       Buckets: 1024  Batches: 64  Memory Usage: 650kB""        Buffers: shared 
hit=192 read=4833, temp written=4524""        ->  Seq Scan on pivotgood pg  
(cost=0.00..7537.12 rows=251212 width=134) (actual time=0.038..117.780 
rows=251212 loops=1)""              Buffers: shared hit=192 read=4833""Total 
runtime: 32297.305 ms"
4) INDEXESI can certainly add an index but given the table sizes I am not sure 
if that is a factor. This by no means is a large dataset less than 350,000 rows 
in total and 3 columns. Also this was just a quick dump of data for comparison 
purpose. When I saw the poor performance on the COALESCE, I pointed the data 
load to SQL Server and ran the same query except with the TSQL specific ISNULL 
function.
                                          

Reply via email to