On May 29, 2007, at 10:58 AM, David Acker wrote:

Ok, I finally got some time to code this out and study it and Ihave some
questions.

Milton Miller wrote:
We add two flags to struct rx: one says this packet is EL, and one says
it is or was size 0.   We create a function, find_mark_el(struct nic,
is_running) that is called after initial alloc and/or after refilling
skb list. In find_mark_el, we start with rx_to_use (lets rename this
rx_to_fill), and go back two entries, call this rx_to_mark.   If
is_running and rx_to_mark->prev->el then just return, leave EL alone.
So if we start clean and then one packet completes, we can not clear the
old marked entry?  We must then add a per nic pointer to the current
marked entry so that when the next packet completes, we can avoid
scanning for it.  Why do we need to check this again?

I don't think we need a per-nic pointer, we just need to check if our
to_mark->prev has the "This is EL" flag set.  the to_mark is the
to_fill->prev->prev  (I think -- to_use in the code, which is the
next to be allocated, not last allocated? I haven't studied that detail).

We need this to avoid our refill code running in lock step with the
hardware and exposing two packets in a row with a size that it reads
as 0.

This means we never need to scan ahead more than one packet.



Otherwise, set EL and size to 0, set the two flags in the rx struct,
sync the RFD, then search for the current EL (we could save the pointer, but its always the odl rx_to_use (fill) or its ->prev). Clear RFD->EL, sync, clear rx->el. Set size non-zero, sync, Leave the was_0 flag set
if is_running (clear it only if we know reciever is stopped).

At this point, if the receiver was stopped we can restart it,.  We
should do so in the caller.  (We always restart at rx_to_clean).
Restart should ack the RNR status before issuing the ru_start command to
avoid a spurious RNR interrupt.

In the receive loop, if RFD->C is not set, rx->was_0 is set and el
is not set, then we need to check rx->next->RFD->C bit (after
pci_sync_for_cpu).   If the next packet C bit is set then we consider
this skb as used, and just complete it with no data to the stack.
If the C-bit is not set, we can read the status to see if the RU is
running (we cleared the EL bit before it read it but it has not found
another packet yet) or not (it read the el-bit before we cleared it).
This read lets us avoid going through an enable interrupts, wait for the
RNR interrupt cycle.

Yes agreed.  But I was pointing out if we never had set size to 0
on this packet (ie rx->was_0 is clear) we dont' even need to poll
the device (saves a slow mmio load) or check the next packet (saves
a pci_sync_for_cpu ie a cache line invalidate, and check for c bit).

Because find_mark_el will only advance EL if the receiver was stopped
or we had more than 1 packet added, we can guarantee that if packet
N has was_0 set, then packet N+1 will not have it set, so we have
bounded lookahead.
Is this meant to be 1 packet added at any given call or 1 packet added
since the last time we cleared?

This would be we allocated one packet since the last time we moved EL.

This adds two flags to struct rx, but that is allocated as a single
array and the array size is filled out as forward and back pointers.
Rx clean only has to look at was_0 on the last packet when it
decides C is not set, and only if both are set does it peek at the
next packet.  In other words, we shouldn't worry about the added
flags making it a non-power-of-2 size.

By setting size != 0 after we have modified all other fields, the
device will only write to this packet if we are done writing.  If
we loose the race and only partially complete, it will just go on
to the next packet and we find it.  If we totally loose, then the
receiver will go RNR and we can reclaim all the buffer space we
have allocated.

Comments?  Questions?



Say we have an 8 buffer receive pool (0-7) at start.  rx[6] is marked.

hardware consumes rx[0]
software sees one rx complete without an s-bit set.

wihtout el bit set?

software sees one packet complete, checks next and find was_el is not set so it assumes it is still pending and the device is running.


software notes that rx[6] is marked
software allocates new buffer for rx[0]
software runs find_mark_el with rx_to_use == rx[1], rx_to_mark == rx[7], is_running == true, marked_rx == rx[6] find_mark_el finds rx_to_mark->prev->el set and is_running true, and returns.

hardware consumes rx[1]
software sees one rx complete without an s-bit set.

software sees one packet complete, checks next and find was_0 is not set so it assumes it is still pending and the device is running.

software notes that rx[6] is still marked
software allocates new buffer for rx[1]
software runs find_mark_el with rx_to_use == rx[2], rx_to_mark == rx[0], is_running == true, marked_rx == rx[6]
find_mark_el does not find rx_to_mark->prev->el set so continues
find_mark_el sets rx[0]->rfd->el rx[0]->rfd->size = 0, rx[0]->el, rx[0]->size0; find_mark_el clears rx[6]->rfd->el, syncs, sets rx[6]->rfd->size, syncs, clears rx[6]->rfd->el but leaves rx[6]->rfd->size0 set.

Is this the correct flow?


Without thinking about the ordering of these updates, yes I think that is the right flow.

although the "rx[6] is still marked" is not really a step, but notices
that [0]->prev === [7]->el is not set.


actually, lets do second part again with what I was thinking:

hardware consumes rx[1]
software sees one rx complete without an s-bit set.

software notes that rx_to_use (fill) is rx[1]

software allocates new buffer for rx[1], and moves rx_to_use to rx[2]

software runs with find_mark_el with rx_to_use = rx[2], rx_was_to_use = rx[1], is_running = true

find_mark_el walks backward and calculates rx_to_mark as rx[0]. it calculates rx_to_unmark as rx_was_to_use->prev == rx[7] initially, but since ->el is not set it backs up once more to rx[6]. It then compares rx_to_unmark is != rx_to_mark->prev and continues as above. Note that if is_running == true we would also continue; this should be the second clause in the || obviously.

Your logic works, this adds some conditional branching but saves a pointer, not sure which is better. Mine can be used for initializing without special casing since it will just calculate rx_to_unmark as rx[n-2] and rx_to_mark as rx[n-2] != rx[n-2]->prev; unmarking a never marked still works, where as for yours the "was marked" must be explicitly set to something (hmm, rxs = rx[0] might work for that initial value until we mark -2??)

again, the compare of rx->el instead of rx->rfd->el is to save cache traffic to the rfd under io. The rx->was_0 is required, so the el flag is free.

milton

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to