> On our cell blade + PCI-e Mellanox.

I don't see anything in arch/powerpc that looks like
dma_alloc_coherent() will do anything other than allocate some memory
and map it with DMA_BIDIRECTIONAL. So how does this Altix fix help in
your situation? Am I misreading the Cell IOMMU code?
Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 01:40:36 PM:

> Shirley, can you clarify why doing dma_alloc_coherent() in the kernel
> helps on your Cell blade? It really seems that dma_alloc_coherent()
> just allocates some memory and then does dma_map(DMA_BIDIRECTIONAL),
> which would be ...
> That would be great. We hit a similar problem in our cluster test --
> data corruption because of this race.

On what platform?

 - R.
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
> Hmm, OK. Then I will do my best to make sure we get a fix for this
> into 2.6.22.

That would be great. We hit a similar problem in our cluster test -- data
corruption because of this race.

Thanks
Shirley Ma
Roland Dreier [EMAIL PROTECTED] wrote on 02/26/2007 02:09:48 PM:

> > That would be great. We hit a similar problem in our cluster test --
> > data corruption because of this race.
>
> On what platform?
>
>  - R.

On our cell blade + PCI-e Mellanox.

Thanks
Shirley
A first cut at a patch was sent out, some very reasonable
objections were raised, and the thread fizzled out.

Sorry, I meant to respond again, but I never got around to it.

The biggest concern with the earlier patch seemed to be
backward compatibility. There was a stab at addressing ...
On Thu, Feb 22, 2007 at 10:34:16AM -0800, Roland Dreier wrote:

> I actually have a vague plan for a somewhat cleaner way to get this
> fix. For a variety of reasons, I am planning on changing the way the
> kernel handles memory registration so that low-level drivers have more
> control over what ...
We found this accidentally, running a normal MPI job, on a
normally sized machine (i.e., tens, not hundreds of
processors). It appears to be more easily reproduced than
we'd expected, and we consider it to be a severe problem.

Hmm, OK. Then I will do my best to make sure we get a fix ...
In:
http://openib.org/pipermail/openib-general/2006-December/030251.html
I described a potential race between DMA and CQ updates on
Altix systems. At that time the bug hadn't been observed,
but was expected to be possible on large NUMA systems.