Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-20 Thread Akihiro Suda
Hello, Ladi,

Thank you very much for the analysis.

> At this point we should engage somebody from Google to help us understand
why the host behaves this way, i.e. why it doesn't let us use the device in
"simple" mode with nothing but VIRTIO_NET_F_MAC.

This site seems like the right place for that:
https://googlecloudplatform.uservoice.com/forums/302595-compute-engine/filters/top


Since you understand the issue better than I do, could you please
consider posting there?

Thank you again



2016-12-21 1:28 GMT+09:00 Ladi Prosek :

> Hi Akihiro,
>
> On Tue, Dec 20, 2016 at 2:58 AM, Akihiro Suda 
> wrote:
>
>> Hello, Ladi,
>>
>> > Could you please help me set up a GCE instance with iPXE? I haven't
>> been able to make it produce any output. Do you build iPXE
>> with CONSOLE_SERIAL? How do you create the .tar.gz for GCE to import the
>> image from? Thanks!
>>
>> Yes, I built the image with CONSOLE_SERIAL.
>> For creating .tar.gz, I did:
>>
>>   cp bin/ipxe.usb /tmp/disk.raw
>>   ( cd /tmp; tar Sczvf ipxe.tar.gz disk.raw )
>>   gsutil cp /tmp/ipxe.tar.gz gs://blahblahblah
>>
>> Also, for clarity, I attached a complete script I'm using.
>>
>> Thank you for looking into this!
>>
>>
> Many thanks for the script, it made testing so much easier!
>
> The problem is that GCE hosts use advanced virtio-net features such as
> host-side checksum offload (VIRTIO_NET_F_CSUM) even though the iPXE driver
> has not accepted them. The driver uses only one static instance of the
> virtio-net header, virtnet_nic.empty_header, for both tx and rx buffers.
> The rationale is that without advanced features the header is never
> modified and stays all zeros, so it can be shared. Here's the comment in
> virtio-net.c:
>
> /* Share a single zeroed virtio net header between all
> * rx and tx packets.  This works because this driver
> * does not use any advanced features so none of the
> * header fields get used.
> */
>
> But I see this buffer being written to by the host. After receiving a DHCP
> packet, the flags field is updated to 0x2 (VIRTIO_NET_HDR_F_DATA_VALID),
> and receiving non-UDP, non-TCP traffic resets it back to 0x0. This explains
> why pinging the instance helps. Without pinging, the first packet is sent
> OK, the first received packet sets VIRTIO_NET_HDR_F_DATA_VALID, and
> subsequent sends fail because the flag stays set, which is illegal (virtio
> spec 5.1.6.2.1, Driver Requirements: Packet Transmission: "The driver MUST
> NOT set the VIRTIO_NET_HDR_F_DATA_VALID bit in flags."). With pinging,
> the incoming ICMP packets keep resetting the flags field back to 0x0,
> making successful transmission possible.
>
> Even if we allocated separate headers for each packet to work around this,
> we'd run into checksumming issues with UDP and TCP, because the host
> expects us to use its precomputed partial checksums.
>
> virtio_net_hdr
> {
>flags = VIRTIO_NET_HDR_F_NEEDS_CSUM
>gso_type = VIRTIO_NET_HDR_GSO_TCPV4
>hdr_len = 0x42
> ...
> }
>
> IPv4 RX 10.132.0.2<-90.130.70.73 len 1500 (1500) proto 6 id 8f25 csum 00a6
> TCP checksum incorrect (is 3268 including checksum field, should be )
>
> I have also tried reordering the virtio initialization steps
> in virtnet_open_legacy hoping that it would help, but no joy there either.
>
> At this point we should engage somebody from Google to help us understand
> why the host behaves this way, i.e. why it doesn't let us use the device in
> "simple" mode with nothing but VIRTIO_NET_F_MAC.
>
> Thanks,
> Ladi
>
>
>
>> 2016-12-20 1:21 GMT+09:00 Ladi Prosek :
>>
>>> On Mon, Dec 19, 2016 at 10:40 AM, Akihiro Suda 
>>> wrote:
>>>
 Hello, Ladi,

Thank you very much for providing the patch.
However, unfortunately, it does not seem to have any effect.

BTW, while I was tackling the issue without your patch, I found
some interesting behavior:

- Pinging from another (Linux) instance to an iPXE instance always seems
to work as expected (i.e. iPXE can successfully "pong" by
transmitting Ethernet packets).
- When an iPXE instance is being pinged, it can successfully send
non-"pong" packets as well, so `dhcp` and `imgfetch
http://10.146.0.2/a.ipxe` work!

So, I guess the issue is somehow related to interrupts.
When iPXE is frequently interrupted due to RX, the interrupts seem to
trigger TX as well.
However, TX seems to fail when it is not triggered by frequent RX
interrupts.
(Please correct me if my guess is wrong 😅)


>>> iPXE doesn't rely on interrupts; virtqueues are polled in a loop. The
>>> log you posted suggests that the host really is transmitting, or at least
>>> accepting outbound packets for transmission. I'm not sure what guest driver
>>> issue could explain this behavior :(
>>>
>>> Could you please help me set up a GCE instance with iPXE? I haven't been
>>> able to make it produce any output. Do you build iP

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-20 Thread Ladi Prosek
Hi Akihiro,

On Tue, Dec 20, 2016 at 2:58 AM, Akihiro Suda  wrote:

> Hello, Ladi,
>
> > Could you please help me set up a GCE instance with iPXE? I haven't
> been able to make it produce any output. Do you build iPXE
> with CONSOLE_SERIAL? How do you create the .tar.gz for GCE to import the
> image from? Thanks!
>
> Yes, I built the image with CONSOLE_SERIAL.
> For creating .tar.gz, I did:
>
>   cp bin/ipxe.usb /tmp/disk.raw
>   ( cd /tmp; tar Sczvf ipxe.tar.gz disk.raw )
>   gsutil cp /tmp/ipxe.tar.gz gs://blahblahblah
>
> Also, for clarity, I attached a complete script I'm using.
>
> Thank you for looking into this!
>
>
Many thanks for the script, it made testing so much easier!

The problem is that GCE hosts use advanced virtio-net features such as
host-side checksum offload (VIRTIO_NET_F_CSUM) even though the iPXE driver
has not accepted them. The driver uses only one static instance of the
virtio-net header, virtnet_nic.empty_header, for both tx and rx buffers.
The rationale is that without advanced features the header is never
modified and stays all zeros, so it can be shared. Here's the comment in
virtio-net.c:

/* Share a single zeroed virtio net header between all
* rx and tx packets.  This works because this driver
* does not use any advanced features so none of the
* header fields get used.
*/

But I see this buffer being written to by the host. After receiving a DHCP
packet, the flags field is updated to 0x2 (VIRTIO_NET_HDR_F_DATA_VALID),
and receiving non-UDP, non-TCP traffic resets it back to 0x0. This explains
why pinging the instance helps. Without pinging, the first packet is sent
OK, the first received packet sets VIRTIO_NET_HDR_F_DATA_VALID, and
subsequent sends fail because the flag stays set, which is illegal (virtio
spec 5.1.6.2.1, Driver Requirements: Packet Transmission: "The driver MUST
NOT set the VIRTIO_NET_HDR_F_DATA_VALID bit in flags."). With pinging, the
incoming ICMP packets keep resetting the flags field back to 0x0, making
successful transmission possible.

Even if we allocated separate headers for each packet to work around this,
we'd run into checksumming issues with UDP and TCP, because the host
expects us to use its precomputed partial checksums.

virtio_net_hdr
{
   flags = VIRTIO_NET_HDR_F_NEEDS_CSUM
   gso_type = VIRTIO_NET_HDR_GSO_TCPV4
   hdr_len = 0x42
...
}

IPv4 RX 10.132.0.2<-90.130.70.73 len 1500 (1500) proto 6 id 8f25 csum 00a6
TCP checksum incorrect (is 3268 including checksum field, should be )

I have also tried reordering the virtio initialization steps
in virtnet_open_legacy hoping that it would help, but no joy there either.

At this point we should engage somebody from Google to help us understand
why the host behaves this way, i.e. why it doesn't let us use the device in
"simple" mode with nothing but VIRTIO_NET_F_MAC.

Thanks,
Ladi



> 2016-12-20 1:21 GMT+09:00 Ladi Prosek :
>
>> On Mon, Dec 19, 2016 at 10:40 AM, Akihiro Suda 
>> wrote:
>>
>>> Hello, Ladi,
>>>
>>> Thank you very much for providing the patch.
>>> However, unfortunately, it does not seem to have any effect.
>>>
>>> BTW, while I was tackling the issue without your patch, I found some
>>> interesting behavior:
>>>
>>> - Pinging from another (Linux) instance to an iPXE instance always seems
>>> to work as expected (i.e. iPXE can successfully "pong" by transmitting
>>> Ethernet packets).
>>> - When an iPXE instance is being pinged, it can successfully send
>>> non-"pong" packets as well, so `dhcp` and `imgfetch
>>> http://10.146.0.2/a.ipxe` work!
>>>
>>> So, I guess the issue is somehow related to interrupts.
>>> When iPXE is frequently interrupted due to RX, the interrupts seem to
>>> trigger TX as well.
>>> However, TX seems to fail when it is not triggered by frequent RX
>>> interrupts.
>>> (Please correct me if my guess is wrong 😅)
>>>
>>>
>> iPXE doesn't rely on interrupts; virtqueues are polled in a loop. The log
>> you posted suggests that the host really is transmitting, or at least
>> accepting outbound packets for transmission. I'm not sure what guest driver
>> issue could explain this behavior :(
>>
>> Could you please help me set up a GCE instance with iPXE? I haven't been
>> able to make it produce any output. Do you build iPXE with CONSOLE_SERIAL?
>> How do you create the .tar.gz for GCE to import the image from? Thanks!
>>
>>
>>
>>>
>>> 2016-12-19 18:00 GMT+09:00 Ladi Prosek :
>>>
 Hi Akihiro,

 On Mon, Dec 19, 2016 at 7:17 AM, Akihiro Suda 
 wrote:
 > Hello Ladi,
 >
 > Thank you a lot for looking into this issue and writing patches about
 the
 > queue size. The patch set looks good to me.
 >
 >>> I haven't captured DHCP packets (because I'm unsure how to do so in
 GCE's
 >>> fully managed DHCP server), but I guess DHCPACK is actually sent
 from the
 >>> DHCP server but somehow iPXE missed it.
 >>
 >> Do you see any interesting log output with
 DEBU

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-19 Thread Akihiro Suda
Hello, Ladi,

> Could you please help me set up a GCE instance with iPXE? I haven't been
able to make it produce any output. Do you build iPXE with CONSOLE_SERIAL?
How do you create the .tar.gz for GCE to import the image from? Thanks!

Yes, I built the image with CONSOLE_SERIAL.
To create the .tar.gz, I did:

  cp bin/ipxe.usb /tmp/disk.raw
  ( cd /tmp; tar Sczvf ipxe.tar.gz disk.raw )
  gsutil cp /tmp/ipxe.tar.gz gs://blahblahblah

Also, for clarity, I have attached the complete script I'm using.

Thank you for looking into this!


2016-12-20 1:21 GMT+09:00 Ladi Prosek :

> On Mon, Dec 19, 2016 at 10:40 AM, Akihiro Suda 
> wrote:
>
>> Hello, Ladi,
>>
>> Thank you very much for providing the patch.
>> However, unfortunately, it does not seem to have any effect.
>>
>> BTW, while I was tackling the issue without your patch, I found some
>> interesting behavior:
>>
>> - Pinging from another (Linux) instance to an iPXE instance always seems
>> to work as expected (i.e. iPXE can successfully "pong" by transmitting
>> Ethernet packets).
>> - When an iPXE instance is being pinged, it can successfully send
>> non-"pong" packets as well, so `dhcp` and `imgfetch
>> http://10.146.0.2/a.ipxe` work!
>>
>> So, I guess the issue is somehow related to interrupts.
>> When iPXE is frequently interrupted due to RX, the interrupts seem to
>> trigger TX as well.
>> However, TX seems to fail when it is not triggered by frequent RX
>> interrupts.
>> (Please correct me if my guess is wrong 😅)
>>
>>
> iPXE doesn't rely on interrupts; virtqueues are polled in a loop. The log
> you posted suggests that the host really is transmitting, or at least
> accepting outbound packets for transmission. I'm not sure what guest driver
> issue could explain this behavior :(
>
> Could you please help me set up a GCE instance with iPXE? I haven't been
> able to make it produce any output. Do you build iPXE with CONSOLE_SERIAL?
> How do you create the .tar.gz for GCE to import the image from? Thanks!
>
>
>
>>
>> 2016-12-19 18:00 GMT+09:00 Ladi Prosek :
>>
>>> Hi Akihiro,
>>>
>>> On Mon, Dec 19, 2016 at 7:17 AM, Akihiro Suda 
>>> wrote:
>>> > Hello Ladi,
>>> >
>>> > Thank you a lot for looking into this issue and writing patches about
>>> the
>>> > queue size. The patch set looks good to me.
>>> >
>>> >>> I haven't captured DHCP packets (because I'm unsure how to do so in
>>> GCE's
>>> >>> fully managed DHCP server), but I guess DHCPACK is actually sent
>>> from the
>>> >>> DHCP server but somehow iPXE missed it.
>>> >>
>>> >> Do you see any interesting log output with
>>> DEBUG=virtio-net,virtio-pci ?
>>> >
>>> > I did not see any interesting log output. No error about TX is
>>> > printed; it just fails to receive DHCPACK.
>>> >
>>> > So, to make packet capture easier for debugging, I tried `imgfetch
>>> > http://...` with a static IP address (`set net0/ip`) instead.
>>> >
>>> > The result implies that iPXE believes it has sent the second Ethernet
>>> > packet without an error, but it is not actually sent.
>>> >
>>> > This is consistent with my report about DHCP: iPXE can successfully
>>> > send the 1st packet (DHCPDISCOVER) but seems to fail to send the 2nd
>>> > packet (DHCPREQUEST), which would explain why it cannot receive
>>> > DHCPACK.
>>> >
>>> >
>>> > Here is the log. In the client log, we can see "TX ... PSH ACK", but
>>> it is
>>> > not received on the server side.
>>> >
>>> > === iPXE Client 10.146.0.10 (DEBUG=virtio-net:7,virtio-pci
>>> :7,tcp:7,ipv4:7
>>> > CONFIG=cloud, 26050fd4 with Ladi's 4 great patches) ===
>>> >
>>> > iPXE> imgfetch http://10.146.0.2/a.ipxe
>>> > http://10.146.0.2/a.ipxe...TCP 0x12eda4 allocated
>>> > TCP 0x12eda4 transitioned from CLOSED to SYN_SENT
>>> > TCP 0x12eda4 bound to port 46673
>>> > TCP 0x12eda4 timer fired in SYN_SENT for 609cb55e..609cb55e
>>> 
>>> > TCP 0x12eda4 TX 46673->80 609cb55e..609cb55f   
>>> 0 SYN
>>> > IPv4 TX 10.146.0.10->10.146.0.2 len 64 proto 6 id 0100 csum 6489
>>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ef90 on vq 1
>>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ef90
>>> > VIRTIO-NET 0xee704 rx complete iobuf 0x12f5f4 len 74
>>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1335f4 on vq 0
>>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
>>> > TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc
>>> 0 SYN
>>> > ACK
>>> > TCP 0x12eda4 using timestamps, SACK, TX window x128, RX window x512
>>> > TCP 0x12eda4 transitioned from SYN_SENT to ESTABLISHED
>>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc
>>>  98 PSH
>>> > ACK
>>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0201 csum 6332
>>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
>>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
>>> > VIRTIO-NET 0xee704 rx complete iobuf 0x12fdf4 len 74
>>> > VIRTIO-

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-19 Thread Ladi Prosek
On Mon, Dec 19, 2016 at 10:40 AM, Akihiro Suda  wrote:

> Hello, Ladi,
>
> Thank you very much for providing the patch.
> However, unfortunately, it does not seem to have any effect.
>
> BTW, while I was tackling the issue without your patch, I found some
> interesting behavior:
>
> - Pinging from another (Linux) instance to an iPXE instance always seems
> to work as expected (i.e. iPXE can successfully "pong" by transmitting
> Ethernet packets).
> - When an iPXE instance is being pinged, it can successfully send
> non-"pong" packets as well, so `dhcp` and `imgfetch
> http://10.146.0.2/a.ipxe` work!
>
> So, I guess the issue is somehow related to interrupts.
> When iPXE is frequently interrupted due to RX, the interrupts seem to
> trigger TX as well.
> However, TX seems to fail when it is not triggered by frequent RX
> interrupts.
> (Please correct me if my guess is wrong 😅)
>
>
iPXE doesn't rely on interrupts; virtqueues are polled in a loop. The log
you posted suggests that the host really is transmitting, or at least
accepting outbound packets for transmission. I'm not sure what guest driver
issue could explain this behavior :(

Could you please help me set up a GCE instance with iPXE? I haven't been
able to make it produce any output. Do you build iPXE with CONSOLE_SERIAL?
How do you create the .tar.gz for GCE to import the image from? Thanks!



>
> 2016-12-19 18:00 GMT+09:00 Ladi Prosek :
>
>> Hi Akihiro,
>>
>> On Mon, Dec 19, 2016 at 7:17 AM, Akihiro Suda 
>> wrote:
>> > Hello Ladi,
>> >
>> > Thank you a lot for looking into this issue and writing patches about
>> the
>> > queue size. The patch set looks good to me.
>> >
>> >>> I haven't captured DHCP packets (because I'm unsure how to do so in
>> GCE's
>> >>> fully managed DHCP server), but I guess DHCPACK is actually sent from
>> the
>> >>> DHCP server but somehow iPXE missed it.
>> >>
>> >> Do you see any interesting log output with DEBUG=virtio-net,virtio-pci
>> ?
>> >
>> > I did not see any interesting log output. No error about TX is
>> > printed; it just fails to receive DHCPACK.
>> >
>> > So, to make packet capture easier for debugging, I tried `imgfetch
>> > http://...` with a static IP address (`set net0/ip`) instead.
>> >
>> > The result implies that iPXE believes it has sent the second Ethernet
>> > packet without an error, but it is not actually sent.
>> >
>> > This is consistent with my report about DHCP: iPXE can successfully
>> > send the 1st packet (DHCPDISCOVER) but seems to fail to send the 2nd
>> > packet (DHCPREQUEST), which would explain why it cannot receive
>> > DHCPACK.
>> >
>> >
>> > Here is the log. In the client log, we can see "TX ... PSH ACK", but it
>> is
>> > not received on the server side.
>> >
>> > === iPXE Client 10.146.0.10 (DEBUG=virtio-net:7,virtio-pci
>> :7,tcp:7,ipv4:7
>> > CONFIG=cloud, 26050fd4 with Ladi's 4 great patches) ===
>> >
>> > iPXE> imgfetch http://10.146.0.2/a.ipxe
>> > http://10.146.0.2/a.ipxe...TCP 0x12eda4 allocated
>> > TCP 0x12eda4 transitioned from CLOSED to SYN_SENT
>> > TCP 0x12eda4 bound to port 46673
>> > TCP 0x12eda4 timer fired in SYN_SENT for 609cb55e..609cb55e 
>> > TCP 0x12eda4 TX 46673->80 609cb55e..609cb55f   
>> 0 SYN
>> > IPv4 TX 10.146.0.10->10.146.0.2 len 64 proto 6 id 0100 csum 6489
>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ef90 on vq 1
>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ef90
>> > VIRTIO-NET 0xee704 rx complete iobuf 0x12f5f4 len 74
>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1335f4 on vq 0
>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
>> > TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc
>> 0 SYN
>> > ACK
>> > TCP 0x12eda4 using timestamps, SACK, TX window x128, RX window x512
>> > TCP 0x12eda4 transitioned from SYN_SENT to ESTABLISHED
>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc
>>  98 PSH
>> > ACK
>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0201 csum 6332
>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
>> > VIRTIO-NET 0xee704 rx complete iobuf 0x12fdf4 len 74
>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12f5f4 on vq 0
>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
>> > TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc
>> 0 SYN
>> > ACK
>> > TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1
>> 5ae4dacc
>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc
>>  98 PSH
>> > ACK
>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0302 csum 6231
>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
>> > .TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1
>> 5ae4dacc
>> > TCP 0x12eda4 TX 46673->80 

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-19 Thread Akihiro Suda
Hello, Ladi,

Thank you very much for providing the patch.
However, unfortunately, it does not seem to have any effect.

BTW, while I was tackling the issue without your patch, I found some
interesting behavior:

- Pinging from another (Linux) instance to an iPXE instance always seems
to work as expected (i.e. iPXE can successfully "pong" by transmitting
Ethernet packets).
- When an iPXE instance is being pinged, it can successfully send
non-"pong" packets as well, so `dhcp` and `imgfetch
http://10.146.0.2/a.ipxe` work!

So, I guess the issue is somehow related to interrupts.
When iPXE is frequently interrupted due to RX, the interrupts seem to
trigger TX as well.
However, TX seems to fail when it is not triggered by frequent RX
interrupts.
(Please correct me if my guess is wrong 😅)


2016-12-19 18:00 GMT+09:00 Ladi Prosek :

> Hi Akihiro,
>
> On Mon, Dec 19, 2016 at 7:17 AM, Akihiro Suda 
> wrote:
> > Hello Ladi,
> >
> > Thank you a lot for looking into this issue and writing patches about the
> > queue size. The patch set looks good to me.
> >
> >>> I haven't captured DHCP packets (because I'm unsure how to do so in
> GCE's
> >>> fully managed DHCP server), but I guess DHCPACK is actually sent from
> the
> >>> DHCP server but somehow iPXE missed it.
> >>
> >> Do you see any interesting log output with DEBUG=virtio-net,virtio-pci ?
> >
> > I did not see any interesting log output. No error about TX is
> > printed; it just fails to receive DHCPACK.
> >
> > So, to make packet capture easier for debugging, I tried `imgfetch
> > http://...` with a static IP address (`set net0/ip`) instead.
> >
> > The result implies that iPXE believes it has sent the second Ethernet
> > packet without an error, but it is not actually sent.
> >
> > This is consistent with my report about DHCP: iPXE can successfully
> > send the 1st packet (DHCPDISCOVER) but seems to fail to send the 2nd
> > packet (DHCPREQUEST), which would explain why it cannot receive
> > DHCPACK.
> >
> >
> > Here is the log. In the client log, we can see "TX ... PSH ACK", but it
> is
> > not received on the server side.
> >
> > === iPXE Client 10.146.0.10 (DEBUG=virtio-net:7,virtio-
> pci:7,tcp:7,ipv4:7
> > CONFIG=cloud, 26050fd4 with Ladi's 4 great patches) ===
> >
> > iPXE> imgfetch http://10.146.0.2/a.ipxe
> > http://10.146.0.2/a.ipxe...TCP 0x12eda4 allocated
> > TCP 0x12eda4 transitioned from CLOSED to SYN_SENT
> > TCP 0x12eda4 bound to port 46673
> > TCP 0x12eda4 timer fired in SYN_SENT for 609cb55e..609cb55e 
> > TCP 0x12eda4 TX 46673->80 609cb55e..609cb55f   0
> SYN
> > IPv4 TX 10.146.0.10->10.146.0.2 len 64 proto 6 id 0100 csum 6489
> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ef90 on vq 1
> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ef90
> > VIRTIO-NET 0xee704 rx complete iobuf 0x12f5f4 len 74
> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1335f4 on vq 0
> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
> > TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0
> SYN
> > ACK
> > TCP 0x12eda4 using timestamps, SACK, TX window x128, RX window x512
> > TCP 0x12eda4 transitioned from SYN_SENT to ESTABLISHED
> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98
> PSH
> > ACK
> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0201 csum 6332
> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> > VIRTIO-NET 0xee704 rx complete iobuf 0x12fdf4 len 74
> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12f5f4 on vq 0
> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
> > TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0
> SYN
> > ACK
> > TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1
> 5ae4dacc
> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98
> PSH
> > ACK
> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0302 csum 6231
> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> > .TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1
> 5ae4dacc
> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98
> PSH
> > ACK
> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0402 csum 6131
> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> > VIRTIO-NET 0xee704 rx complete iobuf 0x1305f4 len 74
> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12fdf4 on vq 0
> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
> > TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0
> SYN
> > ACK
> > ..TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1
> > 5ae4dacc
> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   9

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-19 Thread Ladi Prosek
Hi Akihiro,

On Mon, Dec 19, 2016 at 7:17 AM, Akihiro Suda  wrote:
> Hello Ladi,
>
> Thank you a lot for looking into this issue and writing patches about the
> queue size. The patch set looks good to me.
>
>>> I haven't captured DHCP packets (because I'm unsure how to do so in GCE's
>>> fully managed DHCP server), but I guess DHCPACK is actually sent from the
>>> DHCP server but somehow iPXE missed it.
>>
>> Do you see any interesting log output with DEBUG=virtio-net,virtio-pci ?
>
> I did not see any interesting log output. No error about TX is printed;
> it just fails to receive DHCPACK.
>
> So, to make packet capture easier for debugging, I tried `imgfetch
> http://...` with a static IP address (`set net0/ip`) instead.
>
> The result implies that iPXE believes it has sent the second Ethernet
> packet without an error, but it is not actually sent.
>
> This is consistent with my report about DHCP: iPXE can successfully send
> the 1st packet (DHCPDISCOVER) but seems to fail to send the 2nd packet
> (DHCPREQUEST), which would explain why it cannot receive DHCPACK.
>
>
> Here is the log. In the client log, we can see "TX ... PSH ACK", but it is
> not received on the server side.
>
> === iPXE Client 10.146.0.10 (DEBUG=virtio-net:7,virtio-pci:7,tcp:7,ipv4:7
> CONFIG=cloud, 26050fd4 with Ladi's 4 great patches) ===
>
> iPXE> imgfetch http://10.146.0.2/a.ipxe
> http://10.146.0.2/a.ipxe...TCP 0x12eda4 allocated
> TCP 0x12eda4 transitioned from CLOSED to SYN_SENT
> TCP 0x12eda4 bound to port 46673
> TCP 0x12eda4 timer fired in SYN_SENT for 609cb55e..609cb55e 
> TCP 0x12eda4 TX 46673->80 609cb55e..609cb55f   0 SYN
> IPv4 TX 10.146.0.10->10.146.0.2 len 64 proto 6 id 0100 csum 6489
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12ef90 on vq 1
> VIRTIO-NET 0xee704 tx complete iobuf 0x12ef90
> VIRTIO-NET 0xee704 rx complete iobuf 0x12f5f4 len 74
> VIRTIO-NET 0xee704 enqueuing iobuf 0x1335f4 on vq 0
> IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
> TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0 SYN
> ACK
> TCP 0x12eda4 using timestamps, SACK, TX window x128, RX window x512
> TCP 0x12eda4 transitioned from SYN_SENT to ESTABLISHED
> TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH
> ACK
> IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0201 csum 6332
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> VIRTIO-NET 0xee704 rx complete iobuf 0x12fdf4 len 74
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12f5f4 on vq 0
> IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
> TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0 SYN
> ACK
> TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
> TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH
> ACK
> IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0302 csum 6231
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> .TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
> TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH
> ACK
> IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0402 csum 6131
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> VIRTIO-NET 0xee704 rx complete iobuf 0x1305f4 len 74
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12fdf4 on vq 0
> IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
> TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0 SYN
> ACK
> ..TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1
> 5ae4dacc
> TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH
> ACK
> IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0503 csum 6030
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> ..VIRTIO-NET 0xee704 rx complete iobuf 0x130df4 len 74
> VIRTIO-NET 0xee704 enqueuing iobuf 0x1305f4 on vq 0
> IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
> TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0 SYN
> ACK
> .TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
> TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH
> ACK
> IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0604 csum 5f2f
> VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
> VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
> ..TCP 0x12eda4 timer expired in ESTABLISHED for 609cb55f..609cb5c1
> 5ae4dacc
> TCP 0x12eda4 transitioned from ESTABLISHED to CLOSED
> TCP 0x12eda4 connection deleted
>  Connection timed out (http://ipxe.or

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-18 Thread Akihiro Suda
Hello Ladi,

Thank you very much for looking into this issue and for writing the
patches for the queue size. The patch set looks good to me.

>> I haven't captured DHCP packets (because I'm unsure how to do so in GCE's
>> fully managed DHCP server), but I guess DHCPACK is actually sent from the
>> DHCP server but somehow iPXE missed it.
>
> Do you see any interesting log output with DEBUG=virtio-net,virtio-pci ?

I did not see any interesting log output. No error about TX is printed;
it just fails to receive DHCPACK.

So, to make packet capture easier for debugging, I tried `imgfetch
http://...` with a static IP address (`set net0/ip`) instead.

The result implies that iPXE believes it has sent the second Ethernet
packet without an error, but it is not actually sent.

This is consistent with my report about DHCP: iPXE can successfully send
the 1st packet (DHCPDISCOVER) but seems to fail to send the 2nd packet
(DHCPREQUEST), which would explain why it cannot receive DHCPACK.


Here is the log. In the client log, we can see "TX ... PSH ACK", but it is
not received on the server side.

=== iPXE Client 10.146.0.10 (DEBUG=virtio-net:7,virtio-pci:7,tcp:7,ipv4:7
CONFIG=cloud, 26050fd4 with Ladi's 4 great patches) ===

iPXE> imgfetch http://10.146.0.2/a.ipxe
http://10.146.0.2/a.ipxe...TCP 0x12eda4 allocated
TCP 0x12eda4 transitioned from CLOSED to SYN_SENT
TCP 0x12eda4 bound to port 46673
TCP 0x12eda4 timer fired in SYN_SENT for 609cb55e..609cb55e 
TCP 0x12eda4 TX 46673->80 609cb55e..609cb55f   0 SYN
IPv4 TX 10.146.0.10->10.146.0.2 len 64 proto 6 id 0100 csum 6489
VIRTIO-NET 0xee704 enqueuing iobuf 0x12ef90 on vq 1
VIRTIO-NET 0xee704 tx complete iobuf 0x12ef90
VIRTIO-NET 0xee704 rx complete iobuf 0x12f5f4 len 74
VIRTIO-NET 0xee704 enqueuing iobuf 0x1335f4 on vq 0
IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc0
SYN ACK
TCP 0x12eda4 using timestamps, SACK, TX window x128, RX window x512
TCP 0x12eda4 transitioned from SYN_SENT to ESTABLISHED
TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98
PSH ACK
IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0201 csum 6332
VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
VIRTIO-NET 0xee704 rx complete iobuf 0x12fdf4 len 74
VIRTIO-NET 0xee704 enqueuing iobuf 0x12f5f4 on vq 0
IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc 0 SYN ACK
TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH ACK
IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0302 csum 6231
VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
.TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH ACK
IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0402 csum 6131
VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
VIRTIO-NET 0xee704 rx complete iobuf 0x1305f4 len 74
VIRTIO-NET 0xee704 enqueuing iobuf 0x12fdf4 on vq 0
IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc 0 SYN ACK
..TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH ACK
IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0503 csum 6030
VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
..VIRTIO-NET 0xee704 rx complete iobuf 0x130df4 len 74
VIRTIO-NET 0xee704 enqueuing iobuf 0x1305f4 on vq 0
IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
TCP 0x12eda4 RX 46673<-80   609cb55f 5ae4dacb..5ae4dacc 0 SYN ACK
.TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1   5ae4dacc   98 PSH ACK
IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0604 csum 5f2f
VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1
VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94
..TCP 0x12eda4 timer expired in ESTABLISHED for 609cb55f..609cb5c1 5ae4dacc
TCP 0x12eda4 transitioned from ESTABLISHED to CLOSED
TCP 0x12eda4 connection deleted
 Connection timed out (http://ipxe.org/4c0a6035)
iPXE> VIRTIO-NET 0xee704 rx complete iobuf 0x1315f4 len 74
VIRTIO-NET 0xee704 enqueuing iobuf 0x130df4 on vq 0
IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id  csum 258d
TCP 0x0 RX 46673<-80   609cb55f 5ae4

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-14 Thread Ladi Prosek
Hi Akihiro,

On Wed, Dec 14, 2016 at 8:51 AM, Akihiro Suda  wrote:
> Hi Christian,
>
> Thank you for the suggestion, I'll try to report this to google.
>
> I found that the cause of this issue is because GCE's VIRTIO_PCI_QUEUE_NUM
> is 4096, which is larger than iPXE's MAX_QUEUE_NUM (256).
>
> https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b2ec91703676e211:/src/drivers/bus/virtio-pci.c#l43
> https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b2ec91703676e211:/src/include/ipxe/virtio-ring.h#l24
>
>
> IIUC this issue will be resolved if GCE supports virtio 1.0, which would
> allow iPXE to use custom queue num smaller than the virtio device's max
> value.

The iPXE virtio 1.0 code does not cap the queue size to MAX_QUEUE_NUM. In
fact, it does not check it against any maximum and still uses fixed-size
virtqueue data structures, just like the legacy code path. This is a
bug.

> However, it would be great if iPXE can support setting MAX_QUEUE_NUM to 4096
> (via `config/general.h`, probably).

Even better, iPXE should try to use the host-provided size in legacy
mode, i.e. no MAX_QUEUE_NUM needed. The few extra dynamic allocations
shouldn't hurt anything.

> Actually I tried to set MAX_QUEUE_NUM to 4096 and got virtnet_open_legacy()
> succeeded, but it still doesn't work:
>
> iPXE> dhcp
> DHCP 0x120e44 entering discovery state
> Configuring (net0 42:01:0a:92:00:0a)...DHCP 0x120e44 DHCPDISCOVER
> DHCP 0x120e44 DHCPOFFER from 169.254.169.254:67 for 10.146.0.10
> .DHCP 0x120e44 DHCPDISCOVER
> ..DHCP 0x120e44 entering request state
> DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
> DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
> .DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
> .DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
> ...DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
> ... No configuration methods succeeded (http://ipxe.org/040ee119)
> iPXE>
>
> I haven't captured DHCP packets (because I'm unsure how to do so in GCE's
> fully managed DHCP server), but I guess DHCPACK is actually sent from the
> DHCP server but somehow iPXE missed it.

Do you see any interesting log output with DEBUG=virtio-net,virtio-pci ?

> I'd appreciate if anyone can support me to set MAX_QUEUE_NUM to 4096.

I'll look into it. My proposal is to keep using 128 as the cap in
virtio 1.0 and support any queue size in legacy, only subject to
available memory.

> 2016-12-14 16:19 GMT+09:00 Christian Nilsson :
>>
>> You might want to report this to google as well if you haven't already.
>>
>> On Wed, Dec 14, 2016 at 7:16 AM, Akihiro Suda 
>> wrote:
>> > Hello,
>> >
>> > I tried to boot iPXE on a Google Compute Engine instance, but the virtio
>> > driver doesn't work because virtnet_open_legacy() fails to find
>> > virtqueues.
>> >
>> > Please let me know what I can do for providing further information to
>> > debug
>> > the issue.
>> >
>> > The serial console output with "DEBUG=virtio-net" is as follows:
>> >
>> > iPXE initialising devices...Common virtio capability not found!
>> > VIRTIO-NET 0xdf584 busaddr=:00:04.0 ioaddr=0xc040 irq=11
>> > VIRTIO-NET 0xdf584 mac=42:01:0a:92:00:0a
>> > ok
>> >
>> > iPXE 1.0.0+ (26050) -- Open Source Network Boot Firmware --
>> > http://ipxe.org
>> > Features: DNS HTTP iSCSI TFTP SRP AoE ELF MBOOT PXE bzImage Menu
>> > PXEXT
>> > iPXE> ifstat
>> > net0: 42:01:0a:92:00:0a using virtio-net on :00:04.0 (closed)
>> >   [Link:up, TX:0 TXE:0 RX:0 RXE:0]
>> > iPXE> dhcp
>> > VIRTIO-NET 0xdf584 cannot register queue 0
>> > Could not open net0: No such file or directory
>> > (http://ipxe.org/2d5c403b)
>> > iPXE>
>> >
>> >
>> > This ENOENT is raised from virtnet_open_legacy():
>> >
>> > https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b2ec91703676e211:/src/drivers/net/virtio-net.c#l205
>> >
>> >
>> > I also tested some previous commits, but all of them failed, so the
>> > issue is
>> > not a regression in some recent commit:
>> >
>> >  * 26050fd (Dec  8, 2016): the last commit when I tested
>> >  * 7b499f8 (Apr 11, 2016): the last commit without support for virtio
>> > 1.0
>> >  * e4419ff (Jul  2, 2010):  the first commit with the "native iPXE
>> > driver"
>> > (The commit date is older than 232c208, but actually newer than 232c208)
>> >  * 232c208 (Jul 11, 2010): the last commit without the "native iPXE
>> > driver".
>> > The error message is "ERROR: queue size 4096 > 512", "Cannot register
>> > queue
>> > #0"
>> >
>> >
>> > Regards,
>> >
>> > Akihiro Suda
>> >
>> >
>> > ___
>> > ipxe-devel mailing list
>> > ipxe-devel@lists.ipxe.org
>> > https://lists.ipxe.org/mailman/listinfo.cgi/ipxe-devel
>> >
>
>
>
__

Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-13 Thread Akihiro Suda
Hi Christian,

Thank you for the suggestion; I'll try to report this to Google.

I found that the cause of this issue is that GCE's VIRTIO_PCI_QUEUE_NUM
is 4096, which is larger than iPXE's MAX_QUEUE_NUM (256).

https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b2ec91703676e211:/src/drivers/bus/virtio-pci.c#l43
https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b2ec91703676e211:/src/include/ipxe/virtio-ring.h#l24


IIUC, this issue would be resolved if GCE supported virtio 1.0, which
would allow iPXE to negotiate a custom queue size smaller than the
device's maximum.

However, it would be great if iPXE could support setting MAX_QUEUE_NUM to
4096 (via `config/general.h`, probably).
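
Concretely, the experiment described below amounts to bumping a single
compile-time constant (a sketch of the change; judging from the
virtio-ring.h link above, that header is where the constant lives):

```c
/* src/include/ipxe/virtio-ring.h (sketch of the change tried below) */
#define MAX_QUEUE_NUM 4096 /* was 256; must cover GCE's reported 4096 */
```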

I actually tried setting MAX_QUEUE_NUM to 4096, and virtnet_open_legacy()
then succeeded, but networking still doesn't work:

iPXE> dhcp
DHCP 0x120e44 entering discovery state
Configuring (net0 42:01:0a:92:00:0a)...DHCP 0x120e44 DHCPDISCOVER
DHCP 0x120e44 DHCPOFFER from 169.254.169.254:67 for 10.146.0.10
.DHCP 0x120e44 DHCPDISCOVER
..DHCP 0x120e44 entering request state
DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
.DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
.DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
...DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10
... No configuration methods succeeded (http://ipxe.org/040ee119)
iPXE>

I haven't captured the DHCP packets (because I'm unsure how to do so with
GCE's fully managed DHCP server), but I suspect the DHCPACK is actually
sent by the DHCP server and iPXE somehow misses it.

I'd appreciate it if anyone could help me set MAX_QUEUE_NUM to 4096.


2016-12-14 16:19 GMT+09:00 Christian Nilsson :

> You might want to report this to google as well if you haven't already.
>
> On Wed, Dec 14, 2016 at 7:16 AM, Akihiro Suda 
> wrote:
> > Hello,
> >
> > I tried to boot iPXE on a Google Compute Engine instance, but the virtio
> > driver doesn't work because virtnet_open_legacy() fails to find
> virtqueues.
> >
> > Please let me know what I can do for providing further information to
> debug
> > the issue.
> >
> > The serial console output with "DEBUG=virtio-net" is as follows:
> >
> > iPXE initialising devices...Common virtio capability not found!
> > VIRTIO-NET 0xdf584 busaddr=:00:04.0 ioaddr=0xc040 irq=11
> > VIRTIO-NET 0xdf584 mac=42:01:0a:92:00:0a
> > ok
> >
> > iPXE 1.0.0+ (26050) -- Open Source Network Boot Firmware --
> > http://ipxe.org
> > Features: DNS HTTP iSCSI TFTP SRP AoE ELF MBOOT PXE bzImage Menu
> PXEXT
> > iPXE> ifstat
> > net0: 42:01:0a:92:00:0a using virtio-net on :00:04.0 (closed)
> >   [Link:up, TX:0 TXE:0 RX:0 RXE:0]
> > iPXE> dhcp
> > VIRTIO-NET 0xdf584 cannot register queue 0
> > Could not open net0: No such file or directory
> > (http://ipxe.org/2d5c403b)
> > iPXE>
> >
> >
> > This ENOENT is raised from virtnet_open_legacy():
> > https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b2ec91
> 703676e211:/src/drivers/net/virtio-net.c#l205
> >
> >
> > I also tested some previous commits, but all of them failed, so the
> issue is
> > not a regression in some recent commit:
> >
> >  * 26050fd (Dec  8, 2016): the last commit when I tested
> >  * 7b499f8 (Apr 11, 2016): the last commit without support for virtio 1.0
> >  * e4419ff (Jul  2, 2010):  the first commit with the "native iPXE
> driver"
> > (The commit date is older than 232c208, but actually newer than 232c208)
> >  * 232c208 (Jul 11, 2010): the last commit without the "native iPXE
> driver".
> > The error message is "ERROR: queue size 4096 > 512", "Cannot register
> queue
> > #0"
> >
> >
> > Regards,
> >
> > Akihiro Suda
> >
> >
>
___
ipxe-devel mailing list
ipxe-devel@lists.ipxe.org
https://lists.ipxe.org/mailman/listinfo.cgi/ipxe-devel


Re: [ipxe-devel] virtnet_open_legacy() fails to find virtqueues on Google Compute Engine

2016-12-13 Thread Christian Nilsson
You might want to report this to Google as well if you haven't already.

On Wed, Dec 14, 2016 at 7:16 AM, Akihiro Suda  wrote:
> Hello,
>
> I tried to boot iPXE on a Google Compute Engine instance, but the virtio
> driver doesn't work because virtnet_open_legacy() fails to find virtqueues.
>
> Please let me know what I can do for providing further information to debug
> the issue.
>
> The serial console output with "DEBUG=virtio-net" is as follows:
>
> iPXE initialising devices...Common virtio capability not found!
> VIRTIO-NET 0xdf584 busaddr=:00:04.0 ioaddr=0xc040 irq=11
> VIRTIO-NET 0xdf584 mac=42:01:0a:92:00:0a
> ok
>
> iPXE 1.0.0+ (26050) -- Open Source Network Boot Firmware --
> http://ipxe.org
> Features: DNS HTTP iSCSI TFTP SRP AoE ELF MBOOT PXE bzImage Menu PXEXT
> iPXE> ifstat
> net0: 42:01:0a:92:00:0a using virtio-net on :00:04.0 (closed)
>   [Link:up, TX:0 TXE:0 RX:0 RXE:0]
> iPXE> dhcp
> VIRTIO-NET 0xdf584 cannot register queue 0
> Could not open net0: No such file or directory
> (http://ipxe.org/2d5c403b)
> iPXE>
>
>
> This ENOENT is raised from virtnet_open_legacy():
> https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b2ec91703676e211:/src/drivers/net/virtio-net.c#l205
>
>
> I also tested some previous commits, but all of them failed, so the issue is
> not a regression in some recent commit:
>
>  * 26050fd (Dec  8, 2016): the last commit when I tested
>  * 7b499f8 (Apr 11, 2016): the last commit without support for virtio 1.0
>  * e4419ff (Jul  2, 2010): the first commit with the "native iPXE driver"
> (its commit date is older than 232c208's, but it is actually the newer commit)
>  * 232c208 (Jul 11, 2010): the last commit without the "native iPXE driver".
> The error message is "ERROR: queue size 4096 > 512", "Cannot register queue
> #0"
>
>
> Regards,
>
> Akihiro Suda
>
>
___
ipxe-devel mailing list
ipxe-devel@lists.ipxe.org
https://lists.ipxe.org/mailman/listinfo.cgi/ipxe-devel