One of our customers noticed a high number of NUMA cache misses on a quad-core Opteron system running Bizgres MPP, resulting in about a 15% performance hit. We use a process-based parallelization approach, and our guess is that there's heavy context switching due to the high degree of pipeline parallelism in our execution plans. Each context switch likely moves a process away from the CPU with its local memory, resulting in the NUMA cache misses.
The answer for us is to bind each process to a CPU. Might that help in running DBT-2?

- Luke

On 10/10/06 9:40 AM, "Mark Wong" <[EMAIL PROTECTED]> wrote:

> Luke Lonergan wrote:
>> +1
>>
>> Mark, can you quantify the impact of not running with IRQ balancing enabled?
>
> Whoops, look like performance was due more to enabling the
> --enable-thread-safe flag.
>
> IRQ balancing on : 7086.75
> http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/158/
> IRQ balancing off: 7057.90
> http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/163/
>
> The interrupt charts look completely different. There's too much stuff
> on the chart to determine what interrupts are from what though. :( It
> needs to be redone per processor (as opposed to per interrupt per
> processor) to be more useful in determining if one processor is
> overloaded due to interrupts.
>
> http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/158/report/sar/sar-intr.png
> http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/163/report/sar/sar-intr.png
>
> But the sum of all the interrupts handled are close between tests so it
> seems clear no single processor was overloaded:
>
> http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/158/report/sar/sar-intr_s.png
> http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/163/report/sar/sar-intr_s.png
>
> Mark
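
P.S. To make the "bind each process to a CPU" idea at the top of this message concrete, here is a minimal sketch. It assumes Linux and uses Python's os.sched_setaffinity; the helper name pin_to_cpu and the round-robin choice by worker index are hypothetical, for illustration only, and are not how Bizgres MPP or DBT-2 actually do it.

```python
import os


def pin_to_cpu(worker_index):
    """Pin the calling process to one CPU from its currently allowed set.

    Hypothetical helper: the CPU is chosen round-robin by worker_index.
    Returns the chosen CPU id, or None where the affinity API is missing.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # os.sched_setaffinity is only available on some Unix platforms

    # CPUs this process may run on right now (pid 0 means "this process").
    allowed = sorted(os.sched_getaffinity(0))
    cpu = allowed[worker_index % len(allowed)]

    # Restrict this process to that single CPU so the scheduler cannot
    # migrate it to a core on a different NUMA node.
    os.sched_setaffinity(0, {cpu})
    return cpu
```

Each worker would call this once, right after it is started, with its own index. Calling it again in the same process only sees the single CPU it was already pinned to. Whether this recovers any of the lost performance would have to be measured.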