On 6 November 2012 13:38, Gunnar "Nick" Bluth <gunnar.bl...@pro-open.de>wrote:

>  Am 06.11.2012 18:38, schrieb Petr Praus:
>
>
>  Yes, but note that this happens only in Linux. Increasing work_mem on my
> iMac increases performance (but the queries are slower under OSX than on
> virtualized Ubuntu on the same machine). Over the weekend, I tried the same
> test on my Ubuntu home machine with Ivy Bridge i5 3570K and it also slows
> down (from ~900ms with work_mem=1MB to ~1200ms with work_mem=96MB).
>
>
> OS X is rather different from a memory access point of view, IIRC. So the
> direct comparison actually only shows how well the Linux FS cache works
> (for the temp files created with small work_mem ;-).
>
> The i5 puzzles me a bit though...
>
>
>
>> I'm pretty sure you're hitting some subtle, memory-access-related
>> cornercase here.
>>
>> The L2 cache of your X7350 CPUs is 2MB, could you run the tests with,
>> say, 1, 2, 4 and 8MB of work_mem and post the results?
>>
> I made a pgbench test with the same query and run it 25 times (5 clients,
> 5 transactions each):
> work_mem   speed
> 1MB        1794ms
> 2MB        1877ms
> 4MB        2084ms
> 8MB        2141ms
> 10MB       2124ms
> 12MB       3018ms
> 16MB       3004ms
> 32MB       2999ms
> 64MB       3015ms
>
>  It seems that there is some sort of "plateau".
>
> Two, afaics. The 1->2 change hints towards occasionally breaching your L2
> cache, so it can probably be ignored. The actual plateaus thus seem to be
> 0-2, 2-12, >= 12.
> It'd be interesting to see the EXPLAIN ANALYSE outputs for these levels,
> the buckets and batches in particular. I'd reckon we'll see significant
> changes at 2->4 and 10->12MB work_mem.
>

Here are the explains, I run the query a few times before actually taking
the explain to warm up the caches. (I also noticed that explain slows down
the query execution which is probably to be expected.)

2MB: http://explain.depesz.com/s/ul1
4MB: http://explain.depesz.com/s/IlVu
10MB: http://explain.depesz.com/s/afx3
12MB: http://explain.depesz.com/s/i0vQ

 So, to sum this up (and make someone more competent bite on it maybe ;-),
> on your SMP, FSB, "fake-multicore" system all "hash"-related works that
> potentially switch to different implementations internally (but w/out
> telling us so) when given more work_mem are slower.
>
> See other post... it actually does tell us (# of buckets/batches).
> However, the result is not good and could potentially be improved be
> twealing the statistic_targets of the joined tables/columns.
>
> I wonder why noone actually understanding the implementation chipped in
> yet... Andres, Greg, Tom, whoever actually understands what's happening
> here, anyone reading this? ;-)
>
> Cheers,
>
> --
> Gunnar "Nick" Bluth
> RHCE/SCLA
>
> Mobil   +49 172 8853339
> Email: gunnar.bl...@pro-open.de
> __________________________________________________________________________
> In 1984 mainstream users were choosing VMS over UNIX.  Ten years later
> they are choosing Windows over UNIX.  What part of that message aren't you
> getting? - Tom Payne
>
>

Reply via email to