On Thursday 23 April 2009 3:59:28 pm Peter Jeremy wrote:
> I'm currently trying to port some code that uses bus_dma(9) from
> OpenBSD to FreeBSD and am having some difficulties in following the
> way bus_dma is intended to be used on FreeBSD (and how it differs from
> Net/OpenBSD). Other than the man page and existing FreeBSD drivers, I
> am unable to locate any information on bus_dma care and feeding. Has
> anyone written any tutorial guide to using bus_dma?
>
> The OpenBSD man page provides pseudo-code showing the basic cycle.
> Unfortunately, FreeBSD doesn't provide any similar pseudo-code and
> the functionality is distributed somewhat differently amongst the
> functions (and the drivers I've looked at tend to use a different
> order of calls).
>
> So far, I've hit a number of issues that I'd like some advice on:
>
> Firstly, the OpenBSD model only provides a single DMA tag for the
> device at attach() time, whereas FreeBSD provides the parent's DMA tag
> at attach time and allows the driver to create multiple tags. Rather
> than just creating a single tag for a device, many drivers create a
> device tag which is only used as the parent for additional tags to
> handle receive, transmit etc. Whilst the need for multiple tags is
> probably a consequence of moving much of the dmamap information from
> OpenBSD bus_dmamap_create() into FreeBSD bus_dma_tag_create(), the
> rationale behind multiple levels of tags is unclear. Is this solely
> to provide a single point where overall device DMA characteristics &
> limitations can be specified or is there another reason?
Yes. Many drivers create a parent "driver" tag specifically to have a
single point at which the device-wide DMA constraints are specified.
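As a rough sketch of that layout (the foo_ names, the 32-bit addressing
limit, and the MCLBYTES buffer size are made up for illustration and are not
taken from any particular driver; this is a kernel fragment rather than a
complete driver, and it assumes every later load uses BUS_DMA_NOWAIT so no
lock function is needed):

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bus.h>
#include <machine/bus.h>

/* Hypothetical softc; only the DMA tags are shown. */
struct foo_softc {
	bus_dma_tag_t	parent_tag;	/* device-wide limits */
	bus_dma_tag_t	rx_tag;		/* child tag for RX buffers */
};

static int
foo_create_tags(device_t dev, struct foo_softc *sc)
{
	int error;

	/* Device-wide constraints in one place: 32-bit addressing here. */
	error = bus_dma_tag_create(bus_get_dma_tag(dev),	/* parent */
	    1, 0,				/* alignment, boundary */
	    BUS_SPACE_MAXADDR_32BIT,		/* lowaddr */
	    BUS_SPACE_MAXADDR,			/* highaddr */
	    NULL, NULL,				/* filter, filterarg */
	    BUS_SPACE_MAXSIZE_32BIT,		/* maxsize */
	    BUS_SPACE_UNRESTRICTED,		/* nsegments */
	    BUS_SPACE_MAXSIZE_32BIT,		/* maxsegsize */
	    0,					/* flags */
	    NULL, NULL,				/* lockfunc, lockarg */
	    &sc->parent_tag);
	if (error != 0)
		return (error);

	/* The child only describes what is specific to RX buffers. */
	error = bus_dma_tag_create(sc->parent_tag,	/* parent */
	    1, 0,				/* alignment, boundary */
	    BUS_SPACE_MAXADDR,			/* lowaddr */
	    BUS_SPACE_MAXADDR,			/* highaddr */
	    NULL, NULL,				/* filter, filterarg */
	    MCLBYTES, 1, MCLBYTES,		/* maxsize, nsegments, maxsegsize */
	    0,					/* flags */
	    NULL, NULL,				/* lockfunc, lockarg */
	    &sc->rx_tag);
	if (error != 0) {
		bus_dma_tag_destroy(sc->parent_tag);
		sc->parent_tag = NULL;
	}
	return (error);
}
```

The child tag leaves lowaddr unrestricted and relies on the parent for the
32-bit limit, so that limit only has to be changed in one place.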
> Secondly, bus_dma_tag_create() supports a BUS_DMA_ALLOCNOW flag that
> "pre-allocates enough resources to handle at least one map load
> operation on this tag". However it also states "[t]his should not be
> used for tags that only describe buffers that will be allocated with
> bus_dmamem_alloc()" - does this mean that only one of bus_dmamap_load()
> or bus_dmamem_alloc() should be used on a tag/mapping? Or is the
> sense backwards (ie "don't specify BUS_DMA_ALLOCNOW for tags that are
> only used as the parent for other tags and never mapped themselves")?
> Or is there some other explanation?
What usually happens now is that each region you pre-allocate with
bus_dmamem_alloc() (such as a descriptor ring) gets its own tag. This is
more or less mandated by the fact that bus_dmamem_alloc() doesn't take a
size argument; it gets the size to allocate from the tag. So a NIC driver
will typically have three tags: one for the RX ring, one for the TX ring,
and one for packet data. Some drivers use two tags for packet data, one for
TX buffers and one for RX buffers.
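To make the "size comes from the tag" point concrete, here is a minimal
sketch of a per-ring tag. FOO_RX_RING_BYTES, struct foo_ring, and the
page alignment are assumptions for the example, not requirements of the
API, and the fragment only makes sense inside a kernel driver:

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bus.h>
#include <machine/bus.h>

#define	FOO_RX_RING_BYTES	4096	/* hypothetical ring size */

/* Hypothetical per-ring bookkeeping. */
struct foo_ring {
	bus_dma_tag_t	tag;
	bus_dmamap_t	map;
	void		*vaddr;
};

static int
foo_ring_alloc(bus_dma_tag_t parent, struct foo_ring *r)
{
	int error;

	/* maxsize in this tag is what bus_dmamem_alloc() will allocate. */
	error = bus_dma_tag_create(parent,
	    PAGE_SIZE, 0,			/* alignment, boundary */
	    BUS_SPACE_MAXADDR,			/* lowaddr */
	    BUS_SPACE_MAXADDR,			/* highaddr */
	    NULL, NULL,				/* filter, filterarg */
	    FOO_RX_RING_BYTES,			/* maxsize */
	    1,					/* nsegments */
	    FOO_RX_RING_BYTES,			/* maxsegsize */
	    0,					/* flags */
	    NULL, NULL,				/* lockfunc, lockarg */
	    &r->tag);
	if (error != 0)
		return (error);

	/* Note that there is no size argument here. */
	error = bus_dmamem_alloc(r->tag, &r->vaddr,
	    BUS_DMA_NOWAIT | BUS_DMA_ZERO, &r->map);
	if (error != 0) {
		bus_dma_tag_destroy(r->tag);
		r->tag = NULL;
	}
	return (error);
}
```

A second ring of a different size needs a second tag, which is why the tag
count grows with the number of pre-allocated regions.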
> Thirdly, bus_dmamap_load() uses a callback function to return
> the actual mapping details. According to the man page, there is no
> way to ensure that the callback occurs synchronously - a caller can
> only request that bus_dmamap_load() fail if resources are not
> immediately available. Despite this, many drivers pass 0 for flags
> (allowing an asynchronous invocation of the callback) and then fail
> (and cleanup) if bus_dmamap_load() returns EINPROGRESS. This appears
> to open a race condition where the callback and cleanup could occur
> simultaneously. Mitigating the race condition seems to rely on one of
> the following two behaviours:
>
> a) The system is implicitly single-threaded when bus_dmamap_load() is
> called (generally as part of the device attach() function). Whilst
> this is true at boot time, it would not be true for a dynamically
> loaded module.
>
> b) Passing BUS_DMA_ALLOCNOW to bus_dma_tag_create() guarantees that
> the first bus_dmamap_load() on that tag will be synchronous. Is this
> true? Whilst it appears to be implied, it's not explicitly stated.
BUS_DMA_ALLOCNOW doesn't really guarantee that either, since the pool of
bounce pages can be shared across multiple tags. I think what you might be
missing is this:
c) bus_dmamap_load() of a map returned from bus_dmamem_alloc() will always
succeed synchronously.
That is the only case other than BUS_DMA_NOWAIT where one can assume that
the callback is invoked synchronously. Also, some bus_dma calls, such as
bus_dmamap_load_mbuf() and bus_dmamap_load_mbuf_sg(), basically assume
BUS_DMA_NOWAIT.
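A sketch of how a driver can lean on that, loading with BUS_DMA_NOWAIT so
the callback has either run or will never run by the time bus_dmamap_load()
returns. The foo_ names and struct foo_dma_arg are hypothetical, and the
tag is assumed to have been created with nsegments set to 1:

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bus.h>
#include <machine/bus.h>

/* Hypothetical callback argument: the result of a single-segment load. */
struct foo_dma_arg {
	bus_addr_t	paddr;
	int		error;
};

static void
foo_dmamap_cb(void *arg, bus_dma_segment_t *segs, int nseg, int error)
{
	struct foo_dma_arg *da = arg;

	da->error = error;
	if (error != 0)
		return;
	KASSERT(nseg == 1, ("%s: got %d segments", __func__, nseg));
	da->paddr = segs[0].ds_addr;
}

static int
foo_load(bus_dma_tag_t tag, bus_dmamap_t map, void *buf, bus_size_t len,
    bus_addr_t *paddrp)
{
	struct foo_dma_arg da;
	int error;

	da.paddr = 0;
	da.error = 0;
	/*
	 * With BUS_DMA_NOWAIT the load is never deferred, so pointing the
	 * callback at a stack variable is safe.
	 */
	error = bus_dmamap_load(tag, map, buf, len, foo_dmamap_cb, &da,
	    BUS_DMA_NOWAIT);
	/* Check the error handed to the callback as well as the return. */
	if (error == 0)
		error = da.error;
	if (error != 0)
		return (error);
	*paddrp = da.paddr;
	return (0);
}
```

Without BUS_DMA_NOWAIT the same code would have to treat EINPROGRESS as
"callback still pending" and keep the argument alive until it fires.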
> Finally, what are the ordering requirements between the alloc, create,
> load and sync functions? OpenBSD implies that the normal ordering is
> create, alloc, load, sync whilst several FreeBSD drivers use
> tag_create, alloc, load and then create.
FreeBSD uses the same ordering as OpenBSD. I think you might be confused by
the bus_dmamem_alloc() case. There are basically two cases. The first is
preallocating a block of RAM to use for a descriptor or command ring:
alloc_ring:
bus_dma_tag_create(..., &ring_tag);
/* Creates a map internally. */
bus_dmamem_alloc(ring_tag, &p, ..., &ring_map);
/* Will not fail wit