Johannes Erdfelt wrote:
On Mon, Jun 14, 2004, David Brownell <[EMAIL PROTECTED]> wrote:
Alan Cox wrote:

x86 systems are supposed to be cache coherent in this situation. The bridge will untangle the mess as necessary. The PC is the unusual one here - most other platforms (e.g. mips) behave like the PPC

Seems like the dma_alloc_coherent() API spec can't be implemented on such machines then, since it's defined to return memory(*) such that:

 ... a write by either the device or the processor
 can immediately be read by the processor or device
 without having to worry about caching effects.

...


I'm having trouble understanding exactly what the problem is.

I'm reading it as: on the PPC boxes in question, dma_alloc_coherent() -- used indirectly to allocate the memory in question -- is returning memory that doesn't work as required by the text I quoted.

Specifically, caching effects DO exist, ones where the CPU's L1 cache
line size matters.  X86 systems use cache snooping to let coherent
memory be cached.


Is the problem that the device (or maybe PCI bridge) is caching data and
then overwriting stale data to the software portion of the QH descriptor?

I didn't think it was write buffering, since the CPU cache line size mattered. Nicolas says that padding out the QH structure is needed:

    struct uhci_qh {
        /* Hardware fields */
        __u32 link;                     /* Next queue */
        __u32 element;                  /* Queue element pointer */

        /* Software fields */
        // PATCH
        dma_addr_t dma_handle __attribute__((aligned(UHCI_ALIGN)));
        // Ensure that the software fields are outside any cache line
        // or burst write, so they can't be overwritten by a DMA operation

        struct usb_device *dev;
        struct urb_priv *urbp;

        struct list_head list;          /* P: uhci->frame_list_lock */
        struct list_head remove_list;   /* P: uhci->remove_list_lock */
    } __attribute__((aligned(UHCI_ALIGN)));

If the QH memory were truly "coherent", that patch could never
matter.  Someone who knows that CPU better should probably
answer more detailed questions.  For example, I'd think that
using uncached mappings for that memory should ensure that the
accesses are "coherent" enough.

On the other hand, it's easy to end up with subtle ordering
assumptions when the HC and HCD are both updating the same
data structures.  Write buffers can pop up in both directions,
some CPUs will re-order accesses, the HC silicon may not even
have a clearly specified sequence to violate, the HCD might
be missing some barriers, and so on.

- Dave

