On Monday 13 March 2006 04:44, Glenn Enright <[EMAIL PROTECTED]> wrote about 'Re: [gentoo-user] Mobo/proc combination':
> On Monday 13 March 2006 21:47, Boyd Stephen Smith Jr. wrote:
> > Hyper-Transport is a way for CPUs to exchange data directly rather
> > than going through a memory controller, thus allowing limited
> > resources (L1/2/3 cache) to be used more effectively. In particular,
> > process migration causes fewer cache misses.
> >
> > Hyper-Threading is a way for a CPU to pretend to be two, thus causing
> > the system to request/require more resources than are available.
> >
> > Hyper-Transport attempts to alleviate a bottleneck, while
> > Hyper-Threading increases the load on an existing one.
>
> I appreciate that AMD certainly seem to have the memory
> bandwidth/throughput thing nailed, and their processors stand tall as a
> result, but I doubt that a P4 would perform near as well without a large
> part of the engineered parallelism that comes as part of HThreading,
> compared to a purely serial system.

My characterization is mostly correct. However, there are mitigating circumstances on the P4.

Intel made the instruction pipeline so long that they were losing a /large/ number of cycles whenever they had to flush it, and the pipeline basically has to be flushed any time the branch predictor guesses wrong. With HT there's a separate pipeline that can be independently filled and flushed while sharing the same compute devices. This allows the chip to continue processing unless both branch predictors go wrong. For certain pairs of processes this will increase performance, because the (on-chip) scheduler overhead is outweighed by the reduction in cycles wasted on single-pipeline flushes.

The additional pipeline is a good idea, but I think it would have been better used as a parallel pipeline so that the branch predictor /is/ the scheduler. When a branch is encountered, the pipeline is duplicated and the branch the predictor chose is given priority for compute resources. A branch can only go two ways, so one of the pipelines is correct and a bad guess by the branch predictor will only flush one.

The problem with my approach [1] is handling the case where the pipeline ends up with multiple branch instructions in it. Then you don't have enough pipelines to follow all the branches simultaneously, and you can run into the same issue of having to flush all the pipelines.

I'm sure the designers at Intel weighed my approach against the HT approach they took and found theirs superior, given the size of the pipeline and the statistics available on branch instruction probability. I just hope that HT was the winner because of technical superiority and not because they couldn't find a cool name for my approach. [My choice: Quantum-Prediction.]

In short, going with H-Thr vs. non-H-Thr will probably buy you performance, but it could be the wrong solution to a problem that shouldn't have existed in the first place.
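
To put rough numbers on the comparison above, here is a toy cycle-count model in Python. It is only a sketch under assumptions that are mine and not from any Intel documentation: one instruction issues per cycle, a mispredicted branch costs `depth` cycles of flush, the HT case lets a second thread use the issue slot while the first is refilling, and the duplicated-pipeline case can cover only one unresolved branch at a time. The function names (`make_trace`, `serial_cycles`, `smt_cycles`, `dual_path_cycles`) and all parameter values are made up for illustration; none of this models a real P4.

```python
import random


def make_trace(n, branch_frac, mispredict_rate, rng):
    """Build a list of (is_branch, mispredicted) flags, one per instruction."""
    trace = []
    for _ in range(n):
        is_branch = rng.random() < branch_frac
        mispredicted = is_branch and rng.random() < mispredict_rate
        trace.append((is_branch, mispredicted))
    return trace


def serial_cycles(trace, depth):
    """One thread, one pipeline: every mispredict costs a full flush."""
    return len(trace) + depth * sum(1 for _, bad in trace if bad)


def smt_cycles(trace_a, trace_b, depth):
    """Two threads share one issue slot; a flush stalls only its own thread."""
    traces = [trace_a, trace_b]
    pos = [0, 0]    # next instruction index per thread
    stall = [0, 0]  # flush cycles still pending per thread
    cycles = 0
    turn = 0        # thread that gets first pick this cycle
    while pos[0] < len(trace_a) or pos[1] < len(trace_b):
        cycles += 1
        order = (turn, 1 - turn)
        ready = [t for t in order
                 if pos[t] < len(traces[t]) and stall[t] == 0]
        for t in (0, 1):
            if stall[t] > 0:
                stall[t] -= 1
        if ready:
            t = ready[0]
            _, mispredicted = traces[t][pos[t]]
            pos[t] += 1
            if mispredicted:
                stall[t] = depth
            turn = 1 - t
    return cycles


def dual_path_cycles(trace, depth):
    """Follow both sides of a branch, but only one unresolved branch at a time."""
    cycles = 0
    busy_until = 0  # cycle at which the spare pipeline is free again
    for is_branch, mispredicted in trace:
        cycles += 1
        if not is_branch:
            continue
        if cycles >= busy_until:
            # Spare pipeline takes the other path, so a bad guess costs nothing.
            busy_until = cycles + depth
        elif mispredicted:
            # No spare pipeline left: pay the full flush.
            cycles += depth
    return cycles


if __name__ == "__main__":
    rng = random.Random(2006)
    depth = 20  # assumed flush penalty in cycles, purely illustrative
    a = make_trace(10000, 0.2, 0.1, rng)
    b = make_trace(10000, 0.2, 0.1, rng)
    print("two threads, back to back:", serial_cycles(a, depth) + serial_cycles(b, depth))
    print("two threads, HT-style:    ", smt_cycles(a, b, depth))
    print("one thread, plain:        ", serial_cycles(a, depth))
    print("one thread, dual-path:    ", dual_path_cycles(a, depth))
```

In this model the HT-style run can never take longer than running the two threads back to back, and the dual-path run can never take longer than the plain one. How much either actually saves depends entirely on the assumed branch frequency, mispredict rate and `depth`, so treat the output as a feel for the trade-off rather than a measurement.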

(The first thing you learn about pipelines is the flush overhead associated with long ones.)

--
"If there's one thing we've established over the years, it's that the vast majority of our users don't have the slightest clue what's best for them in terms of package stability."
-- Gentoo Developer Ciaran McCreesh

[1] I call it "my approach" for two reasons: a) I wrote it in the email and b) I can't remember the actual name of it. It's been used before and may be in use currently; it's not original or anything.