Re: [vbox-dev] Handling of race in Linux e1000 driver

2018-03-13 Thread Aleksey Ilyushin
There are quite a lot of issues with the patch provided. See below.

On 12 Mar 2018, at 07:07, Sylvia Else  wrote:

> There's a race condition in the Linux e1000 driver, in that it enables 
> receiving on the interface before it initialises the buffer size and 
> addresses. The window is short, of the order of a millisecond, and probably 
> does little harm with real hardware.
The buffer size, which is set via RCTL.BSIZE, has a default value of 2048. 
Under no circumstances it can be zero. So I assume that you are talking about 
e1000 driver enabling receiving before allocating rx buffers and assigning them 
to rx descriptors. Indeed, e1000_configure allocates rx buffers after it 
enables receiving via e1000_configure_rx, but there is a catch: 
e1000_configure_rx sets both RDH and RDT to zeros, which should totally prevent 
entering the rx descriptor processing loop. Only when an actual buffer 
allocation function (like e1000_allloc_rx_buffer) advances RDT the rx 
descriptor, for which the buffer has just been allocated, will be read by e1000 
(hardware or emulated). 
> 
> Unfortunately, when in bridge networking mode, if there are any broadcast 
> packets around, the VirtualBox DevE1000 module will try to send one during 
> that window. The hardware specification is silent as to what the hardware 
> should do in this situation, but the effect of the interaction between 
> VirtualBox  and the driver is to render the network non-functional. The 
> dependence on broadcast packets to trigger the failure is no doubt why many 
> people have no problem with it.
I am not trying to say that there is no problem, but your explanation of the 
cause does not convince me. The idea of improperly initialised descriptors 
getting stuck in rx descriptor cache seems very plausible to me, but I doubt 
that e1000 driver initialisation sequence is to blame, hence I do not believe 
the proposed fix solves the actual issue. With what version/flavour of linux 
guest have you seen this issue? I am looking at 
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L1871
> 
> While this is a bug in the driver, rather than VirtualBox, the latter could 
> better handle the situation by discarding packets when the buffer size or 
> addresses of zero rather than attempting (but ultimately failing) to deliver 
> it.
As I explained earlier, u16RxBSize cannot be zero under any circumstances, so 
your patch essentially introduces two things: (1) discarding rx descriptor 
cache if a descriptor with zero buffer address is seen, and (2) not writing 
back the processed descriptor with DD (descriptor done) bit set. Both things go 
against e1000 spec. The spec explicitly allows descriptors with zero buffer 
addresses, see section 3.2.5.2 “Null Descriptor Padding”, and prescribes the 
proper course of action for hardware. I quote,

"Hardware stores no data in descriptors with a null data address. Software can 
make use of this property to cause the first condition under receive descriptor 
packing to occur early. Hardware writes back null descriptors with the DD bit 
set in the status byte and all other bits unchanged.”

This is precisely what the existing code does. Another issue with the patch is 
that it won’t compile without E1K_WITH_RXD_CACHE defined.

Please let me know if there are any special steps that could help me to 
re-produce this issue. I’ll try to pester an Ubuntu guest by spamming 
broadcasts in host’s LAN, but I vaguely remember trying it before without any 
luck. Perhaps I had done it before I introduced rx cache, or it could have been 
a different kind of guest, not Linux. Let me try again.
___
vbox-dev mailing list
vbox-dev@virtualbox.org
https://www.virtualbox.org/mailman/listinfo/vbox-dev


Re: [vbox-dev] Handling of race in Linux e1000 driver

2018-03-12 Thread Kalogrianitis Socratis

> I propose the following patch, which I provide under the MIT license. I have 
> confirmed that this allows the network to function despite the error in the 
> driver.

Hi Sylvia,

Is your patch in addition to the script ( 
https://forums.virtualbox.org/viewtopic.php?f=3&t=87111 ) that you posted in 
the forums? Or the patch makes the script obsolete?

Socratis
___
vbox-dev mailing list
vbox-dev@virtualbox.org
https://www.virtualbox.org/mailman/listinfo/vbox-dev


[vbox-dev] Handling of race in Linux e1000 driver

2018-03-11 Thread Sylvia Else
There's a race condition in the Linux e1000 driver, in that it enables 
receiving on the interface before it initialises the buffer size and 
addresses. The window is short, of the order of a millisecond, and probably 
does little harm with real hardware.


Unfortunately, when in bridge networking mode, if there are any broadcast 
packets around, the VirtualBox DevE1000 module will try to send one during 
that window. The hardware specification is silent as to what the hardware 
should do in this situation, but the effect of the interaction between 
VirtualBox  and the driver is to render the network non-functional. The 
dependence on broadcast packets to trigger the failure is no doubt why many 
people have no problem with it.


While this is a bug in the driver, rather than VirtualBox, the latter could 
better handle the situation by discarding packets when the buffer size or 
addresses of zero rather than attempting (but ultimately failing) to 
deliver it.


I propose the following patch, which I provide under the MIT license. I 
have confirmed that this allows the network to function despite the error 
in the driver.



--- orig/VirtualBox-5.2.8/src/VBox/Devices/Network/DevE1000.cpp 2018-02-27 
03:02:05.0 +1100
+++ VirtualBox-5.2.8/src/VBox/Devices/Network/DevE1000.cpp  2018-03-12 
14:45:03.028671162 +1100

@@ -2449,7 +2449,36 @@
 PDMDevHlpPhysRead(pThis->CTX_SUFF(pDevIns), e1kDescAddr(RDBAH, 
RDBAL, RDH),

   &desc, sizeof(desc));
 # endif /* !E1K_WITH_RXD_CACHE */
-if (pDesc->u64BufAddr)
+if (!pDesc->u64BufAddr || pThis->u16RxBSize == 0)
+{
+/*
+ * Some drivers misguidedly enable the receive function before 
completing
+ * initialisation of the buffer size and/or addresses, leading 
to a race which
+ * can prevent the network coming up properly if we don't 
handle it. If that's the
+ * case then drop the packet. But first, make sure we're not 
using a stale copy
+ * of the offending descriptor. (If the buffer size is zero, 
then this won't

+ * help, but it does no harm.)
+ */
+pThis->iRxDCurrent = pThis->nRxDFetched = 0;
+e1kRxDPrefetch(pThis);
+pDesc = e1kRxDGet(pThis);
+if(pDesc == NULL)
+{
+/* I doubt this can happen, since a buffer was determined 
to exist above. */

+E1kLog(("%s Out of receive buffers, dropping the packet\n",
+pThis->szPrf));
+break;
+}
+
+if(!pDesc->u64BufAddr || pThis->u16RxBSize == 0)
+{
+E1kLog(("%s Null or zero sized received buffer, dropping 
the packet\n",

+pThis->szPrf));
+break;
+}
+}
+
+/* Was previously a conditional block. */
 {
 /* Update descriptor */
 pDesc->status= status;

___
vbox-dev mailing list
vbox-dev@virtualbox.org
https://www.virtualbox.org/mailman/listinfo/vbox-dev