Hello, Ladi, > Could you please help me set up a GCE instance with iPXE? I haven't been able to make it produce any output. Do you build iPXE with CONSOLE_SERIAL? How do you create the .tar.gz for GCE to import the image from? Thanks!
Yes, I built the image with CONSOLE_SERIAL. For creating .tar.gz, I did: cp bin/ipxe.usb /tmp/disk.raw ( cd /tmp; tar Sczvf ipxe.tar.gz disk.raw ) gsutil cp /tmp/ipxe.tar.gz gs://blahblahblah Also, for clarity, I attached a complete script I'm using. Thank you for looking into this! 2016-12-20 1:21 GMT+09:00 Ladi Prosek <lpro...@redhat.com>: > On Mon, Dec 19, 2016 at 10:40 AM, Akihiro Suda <suda.ky...@gmail.com> > wrote: > >> Hello, Ladi, >> >> Thank you a lot for providing the patch. >> However, unfortunately, it does not seem to have any effect. >> >> BTW, when I was tackling with the issue without your patch, I found some >> interesting behavior: >> >> - Pinging from another instance (Linux) to an iPXE instance seems always >> working as expected. (i.e. iPXE can successfully "pong" by transmitting >> Ethernet packets) >> - When an iPXE instance is being pinged, it can successfully send >> non-"pong" packets as well. So `dhcp` and `imgfetch >> http://10.146.0.2/a.ipxe` <http://10.146.0.2/a.ipxe> work! >> >> So, I guess the issue is somehow related to interruption. >> When iPXE is frequently interrupted due to RX, the interruptions seems >> triggering TX as well. >> However, TX seems failing when it is not triggered by frequent RX >> interruptions. >> (Please correct me if my guess is wrong 😅) >> >> > iPXE doesn't rely on interrupts, virtqueues are polled in a loop. The log > you posted suggests that the host is really transmitting, or at least > accepting outbound packets for transmission. I'm not sure what guest driver > issue would explain this behavior :( > > Could you please help me set up a GCE instance with iPXE? I haven't been > able to make it produce any output. Do you build iPXE with CONSOLE_SERIAL? > How do you create the .tar.gz for GCE to import the image from? Thanks! > > > >> >> 2016-12-19 18:00 GMT+09:00 Ladi Prosek <lpro...@redhat.com>: >> >>> Hi Akihiro, >>> >>> On Mon, Dec 19, 2016 at 7:17 AM, Akihiro Suda <suda.ky...@gmail.com> >>> wrote: >>> > Hello Ladi, >>> > >>> > Thank you a lot for looking into this issue and writing patches about >>> the >>> > queue size. The patch set looks good to me. >>> > >>> >>> I haven't captured DHCP packets (because I'm unsure how to do so in >>> GCE's >>> >>> fully managed DHCP server), but I guess DHCPACK is actually sent >>> from the >>> >>> DHCP server but somehow iPXE missed it. >>> >> >>> >> Do you see any interesting log output with >>> DEBUG=virtio-net,virtio-pci ? >>> > >>> > Did not see any interesting log. No error about TX is printed, and it >>> just >>> > fails to receive DHCPACK. >>> > >>> > So, for ease of debugging with capturing packets, I tried to do >>> `imgfetch >>> > http://...` with a static IP address (`set net0/ip`) instead. >>> > >>> > The result implies that "iPXE believes it has sent the second Ethernet >>> > packet without an error, but it is not sent actually". >>> > >>> > This assumption is consistent with my report about DHCP. >>> > iPXE can successfully send the 1st packet (DHCPDISCOVER), but seems >>> failing >>> > to send the 2nd packet (DHCPREQUEST). It should be the reason why it >>> cannot >>> > receive DHCPACK. >>> > >>> > >>> > Here is the log. In the client log, we can see "TX ... PSH ACK", but >>> it is >>> > not received on the server side. >>> > >>> > === iPXE Client 10.146.0.10 (DEBUG=virtio-net:7,virtio-pci >>> :7,tcp:7,ipv4:7 >>> > CONFIG=cloud, 26050fd4 with Ladi's 4 great patches) === >>> > >>> > iPXE> imgfetch http://10.146.0.2/a.ipxe >>> > http://10.146.0.2/a.ipxe...TCP 0x12eda4 allocated >>> > TCP 0x12eda4 transitioned from CLOSED to SYN_SENT >>> > TCP 0x12eda4 bound to port 46673 >>> > TCP 0x12eda4 timer fired in SYN_SENT for 609cb55e..609cb55e >>> 00000000 >>> > TCP 0x12eda4 TX 46673->80 609cb55e..609cb55f 00000000 >>> 0 SYN >>> > IPv4 TX 10.146.0.10->10.146.0.2 len 64 proto 6 id 0100 csum 6489 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ef90 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ef90 >>> > VIRTIO-NET 0xee704 rx complete iobuf 0x12f5f4 len 74 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1335f4 on vq 0 >>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id 0000 csum 258d >>> > TCP 0x12eda4 RX 46673<-80 609cb55f 5ae4dacb..5ae4dacc >>> 0 SYN >>> > ACK >>> > TCP 0x12eda4 using timestamps, SACK, TX window x128, RX window x512 >>> > TCP 0x12eda4 transitioned from SYN_SENT to ESTABLISHED >>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1 5ae4dacc >>> 98 PSH >>> > ACK >>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0201 csum 6332 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94 >>> > VIRTIO-NET 0xee704 rx complete iobuf 0x12fdf4 len 74 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12f5f4 on vq 0 >>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id 0000 csum 258d >>> > TCP 0x12eda4 RX 46673<-80 609cb55f 5ae4dacb..5ae4dacc >>> 0 SYN >>> > ACK >>> > TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 >>> 5ae4dacc >>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1 5ae4dacc >>> 98 PSH >>> > ACK >>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0302 csum 6231 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94 >>> > .TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 >>> 5ae4dacc >>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1 5ae4dacc >>> 98 PSH >>> > ACK >>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0402 csum 6131 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94 >>> > VIRTIO-NET 0xee704 rx complete iobuf 0x1305f4 len 74 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12fdf4 on vq 0 >>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id 0000 csum 258d >>> > TCP 0x12eda4 RX 46673<-80 609cb55f 5ae4dacb..5ae4dacc >>> 0 SYN >>> > ACK >>> > ..TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 >>> > 5ae4dacc >>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1 5ae4dacc >>> 98 PSH >>> > ACK >>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0503 csum 6030 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94 >>> > ..VIRTIO-NET 0xee704 rx complete iobuf 0x130df4 len 74 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1305f4 on vq 0 >>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id 0000 csum 258d >>> > TCP 0x12eda4 RX 46673<-80 609cb55f 5ae4dacb..5ae4dacc >>> 0 SYN >>> > ACK >>> > .TCP 0x12eda4 timer fired in ESTABLISHED for 609cb55f..609cb5c1 >>> 5ae4dacc >>> > TCP 0x12eda4 TX 46673->80 609cb55f..609cb5c1 5ae4dacc >>> 98 PSH >>> > ACK >>> > IPv4 TX 10.146.0.10->10.146.0.2 len 150 proto 6 id 0604 csum 5f2f >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x12ec94 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x12ec94 >>> > ......TCP 0x12eda4 timer expired in ESTABLISHED for >>> 609cb55f..609cb5c1 >>> > 5ae4dacc >>> > TCP 0x12eda4 transitioned from ESTABLISHED to CLOSED >>> > TCP 0x12eda4 connection deleted >>> > Connection timed out (http://ipxe.org/4c0a6035) >>> > iPXE> VIRTIO-NET 0xee704 rx complete iobuf 0x1315f4 len 74 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x130df4 on vq 0 >>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id 0000 csum 258d >>> > TCP 0x0 RX 46673<-80 609cb55f 5ae4dacb..5ae4dacc 0 >>> SYN ACK >>> > IPv4 received packet rejected by stack: The socket is not connected >>> > (http://ipxe.org/380a6001) >>> > >>> > iPXE> VIRTIO-NET 0xee704 rx complete iobuf 0x131df4 len 74 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1315f4 on vq 0 >>> > IPv4 RX 10.146.0.10<-10.146.0.2 len 60 proto 6 id 0000 csum 258d >>> > TCP 0x0 RX 46673<-80 609cb55f 5ae4dacb..5ae4dacc 0 >>> SYN ACK >>> > IPv4 received packet rejected by stack: The socket is not connected >>> > (http://ipxe.org/380a6001) >>> > VIRTIO-NET 0xee704 rx complete iobuf 0x1325f4 len 42 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x131df4 on vq 0 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1325f4 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x1325f4 >>> > VIRTIO-NET 0xee704 rx complete iobuf 0x132df4 len 62 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1325f4 on vq 0 >>> > IPv4 RX 10.146.0.10<-221.204.224.56 len 48 proto 6 id 6aac csum >>> 167b >>> > TCP 0x0 RX 22<-2281 0a4bb0cf 3368312b..3368312c 0 SYN >>> > IPv4 received packet rejected by stack: The socket is not connected >>> > (http://ipxe.org/380a6001) >>> > VIRTIO-NET 0xee704 rx complete iobuf 0x1335f4 len 42 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x132df4 on vq 0 >>> > VIRTIO-NET 0xee704 enqueuing iobuf 0x1335f4 on vq 1 >>> > VIRTIO-NET 0xee704 tx complete iobuf 0x1335f4 >>> > >>> > >>> > >>> > === HTTP server 10.146.0.2 (ws01) === >>> > >>> > $ sudo tcpdump host 10.146.0.10 >>> > tcpdump: verbose output suppressed, use -v or -vv for full protocol >>> > decode >>> > listening on ens4, link-type EN10MB (Ethernet), capture size 262144 >>> > bytes >>> > 04:55:43.715483 IP ipxe01.c.suda-main.internal.46673 > >>> > ws01.c.suda-main.internal.http: Flags [S], seq 1620882782, win 65532, >>> > options [nop,nop,TS val 323011 ecr 0,nop,nop,sackOK,nop,wscale 9,mss >>> 1460], >>> > length 0 >>> > 04:55:43.715549 IP ws01.c.suda-main.internal.http > >>> > ipxe01.c.suda-main.internal.46673: Flags [S.], seq 1524947659, ack >>> > 1620882783, win 28160, options [mss 1420,sackOK,TS val 77061640 ecr >>> > 323011,nop,wscale 7], length 0 >>> > 04:55:44.712811 IP ws01.c.suda-main.internal.http > >>> > ipxe01.c.suda-main.internal.46673: Flags [S.], seq 1524947659, ack >>> > 1620882783, win 28160, options [mss 1420,sackOK,TS val 77061890 ecr >>> > 323011,nop,wscale 7], length 0 >>> > 04:55:46.712829 IP ws01.c.suda-main.internal.http > >>> > ipxe01.c.suda-main.internal.46673: Flags [S.], seq 1524947659, ack >>> > 1620882783, win 28160, options [mss 1420,sackOK,TS val 77062390 ecr >>> > 323011,nop,wscale 7], length 0 >>> > 04:55:50.712864 IP ws01.c.suda-main.internal.http > >>> > ipxe01.c.suda-main.internal.46673: Flags [S.], seq 1524947659, ack >>> > 1620882783, win 28160, options [mss 1420,sackOK,TS val 77063390 ecr >>> > 323011,nop,wscale 7], length 0 >>> > 04:55:58.712842 IP ws01.c.suda-main.internal.http > >>> > ipxe01.c.suda-main.internal.46673: Flags [S.], seq 1524947659, ack >>> > 1620882783, win 28160, options [mss 1420,sackOK,TS val 77065390 ecr >>> > 323011,nop,wscale 7], length 0 >>> > 04:56:14.712836 IP ws01.c.suda-main.internal.http > >>> > ipxe01.c.suda-main.internal.46673: Flags [S.], seq 1524947659, ack >>> > 1620882783, win 28160, options [mss 1420,sackOK,TS val 77069390 ecr >>> > 323011,nop,wscale 7], length 0 >>> >>> Thanks for the logs. Can you please try the following patch? It's a >>> crazy workaround to reinitialize the driver on each transmitted >>> packet. If it works, we'll go from there and try something more >>> sophisticated. Thanks! >>> >>> >>> --- a/src/drivers/net/virtio-net.c >>> +++ b/src/drivers/net/virtio-net.c >>> @@ -345,6 +345,8 @@ static void virtnet_close ( struct net_device >>> *netdev ) { >>> */ >>> static int virtnet_transmit ( struct net_device *netdev, >>> struct io_buffer *iobuf ) { >>> + virtnet_close ( netdev ); >>> + virtnet_open ( netdev ); >>> virtnet_enqueue_iob ( netdev, TX_INDEX, iobuf ); >>> return 0; >>> } >>> >>> >>> > 2016-12-14 18:17 GMT+09:00 Ladi Prosek <lpro...@redhat.com>: >>> >> >>> >> Hi Akihiro, >>> >> >>> >> On Wed, Dec 14, 2016 at 8:51 AM, Akihiro Suda <suda.ky...@gmail.com> >>> >> wrote: >>> >> > Hi Christian, >>> >> > >>> >> > Thank you for the suggestion, I'll try to report this to google. >>> >> > >>> >> > I found that the cause of this issue is because GCE's >>> >> > VIRTIO_PCI_QUEUE_NUM >>> >> > is 4096, which is larger than iPXE's MAX_QUEUE_NUM (256). >>> >> > >>> >> > >>> >> > https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b >>> 2ec91703676e211:/src/drivers/bus/virtio-pci.c#l43 >>> >> > >>> >> > https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b >>> 2ec91703676e211:/src/include/ipxe/virtio-ring.h#l24 >>> >> > >>> >> > >>> >> > IIUC this issue will be resolved if GCE supports virtio 1.0, which >>> would >>> >> > allow iPXE to use custom queue num smaller than the virtio device's >>> max >>> >> > value. >>> >> >>> >> iPXE virtio 1.0 code does not cap the queue size to MAX_QUEUE_NUM. In >>> >> fact, it does not check it against any maximum and still uses >>> >> fixed-sized virtqueue data structures, just like legacy. This is a >>> >> bug. >>> >> >>> >> > However, it would be great if iPXE can support setting >>> MAX_QUEUE_NUM to >>> >> > 4096 >>> >> > (via `config/general.h`, probably). >>> >> >>> >> Even better, iPXE should try to use the host provided size in legacy >>> >> mode, i.e. no MAX_QUEUE_NUM needed. The few extra dynamic allocations >>> >> shouldn't hurt anything. >>> >> >>> >> > Actually I tried to set MAX_QUEUE_NUM to 4096 and got >>> >> > virtnet_open_legacy() >>> >> > succeeded, but it still doesn't work: >>> >> > >>> >> > iPXE> dhcp >>> >> > DHCP 0x120e44 entering discovery state >>> >> > Configuring (net0 42:01:0a:92:00:0a)...DHCP 0x120e44 >>> DHCPDISCOVER >>> >> > DHCP 0x120e44 DHCPOFFER from 169.254.169.254:67 for 10.146.0.10 >>> >> > .DHCP 0x120e44 DHCPDISCOVER >>> >> > ..DHCP 0x120e44 entering request state >>> >> > DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10 >>> >> > DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for 10.146.0.10 >>> >> > .DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for >>> 10.146.0.10 >>> >> > .DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for >>> 10.146.0.10 >>> >> > ...DHCP 0x120e44 DHCPREQUEST to 169.254.169.254:67 for >>> 10.146.0.10 >>> >> > ....... No configuration methods succeeded >>> >> > (http://ipxe.org/040ee119) >>> >> > iPXE> >>> >> > >>> >> > I haven't captured DHCP packets (because I'm unsure how to do so in >>> >> > GCE's >>> >> > fully managed DHCP server), but I guess DHCPACK is actually sent >>> from >>> >> > the >>> >> > DHCP server but somehow iPXE missed it. >>> >> >>> >> Do you see any interesting log output with >>> DEBUG=virtio-net,virtio-pci ? >>> >> >>> >> > I'd appreciate if anyone can support me to set MAX_QUEUE_NUM to >>> 4096. >>> >> >>> >> I'll look into it. My proposal is to keep using 128 as the cap in >>> >> virtio 1.0 and support any queue size in legacy, only subject to >>> >> available memory. >>> >> >>> >> > 2016-12-14 16:19 GMT+09:00 Christian Nilsson <nik...@gmail.com>: >>> >> >> >>> >> >> You might want to report this to google as well if you haven't >>> already. >>> >> >> >>> >> >> On Wed, Dec 14, 2016 at 7:16 AM, Akihiro Suda < >>> suda.ky...@gmail.com> >>> >> >> wrote: >>> >> >> > Hello, >>> >> >> > >>> >> >> > I tried to boot iPXE on a Google Compute Engine instance, but the >>> >> >> > virtio >>> >> >> > driver doesn't work because virtnet_open_legacy() fails to find >>> >> >> > virtqueues. >>> >> >> > >>> >> >> > Please let me know what I can do for providing further >>> information to >>> >> >> > debug >>> >> >> > the issue. >>> >> >> > >>> >> >> > The serial console output with "DEBUG=virtio-net" is as follows: >>> >> >> > >>> >> >> > iPXE initialising devices...Common virtio capability not >>> found! >>> >> >> > VIRTIO-NET 0xdf584 busaddr=0000:00:04.0 ioaddr=0xc040 irq=11 >>> >> >> > VIRTIO-NET 0xdf584 mac=42:01:0a:92:00:0a >>> >> >> > ok >>> >> >> > >>> >> >> > iPXE 1.0.0+ (26050) -- Open Source Network Boot Firmware -- >>> >> >> > http://ipxe.org >>> >> >> > Features: DNS HTTP iSCSI TFTP SRP AoE ELF MBOOT PXE bzImage >>> Menu >>> >> >> > PXEXT >>> >> >> > iPXE> ifstat >>> >> >> > net0: 42:01:0a:92:00:0a using virtio-net on 0000:00:04.0 >>> (closed) >>> >> >> > [Link:up, TX:0 TXE:0 RX:0 RXE:0] >>> >> >> > iPXE> dhcp >>> >> >> > VIRTIO-NET 0xdf584 cannot register queue 0 >>> >> >> > Could not open net0: No such file or directory >>> >> >> > (http://ipxe.org/2d5c403b) >>> >> >> > iPXE> >>> >> >> > >>> >> >> > >>> >> >> > This ENOENT is raised from virtnet_open_legacy(): >>> >> >> > >>> >> >> > >>> >> >> > https://git.ipxe.org/ipxe.git/blob/26050fd4c87c50503d5bd573b >>> 2ec91703676e211:/src/drivers/net/virtio-net.c#l205 >>> >> >> > >>> >> >> > >>> >> >> > I also tested some previous commits, but all of them failed, so >>> the >>> >> >> > issue is >>> >> >> > not a regression in some recent commit: >>> >> >> > >>> >> >> > * 26050fd (Dec 8, 2016): the last commit when I tested >>> >> >> > * 7b499f8 (Apr 11, 2016): the last commit without support for >>> virtio >>> >> >> > 1.0 >>> >> >> > * e4419ff (Jul 2, 2010): the first commit with the "native >>> iPXE >>> >> >> > driver" >>> >> >> > (The commit date is older than 232c208, but actually newer than >>> >> >> > 232c208) >>> >> >> > * 232c208 (Jul 11, 2010): the last commit without the "native >>> iPXE >>> >> >> > driver". >>> >> >> > The error message is "ERROR: queue size 4096 > 512", "Cannot >>> register >>> >> >> > queue >>> >> >> > #0" >>> >> >> > >>> >> >> > >>> >> >> > Regards, >>> >> >> > >>> >> >> > Akihiro Suda >>> >> >> > >>> >> >> > >>> >> >> > _______________________________________________ >>> >> >> > ipxe-devel mailing list >>> >> >> > ipxe-devel@lists.ipxe.org >>> >> >> > https://lists.ipxe.org/mailman/listinfo.cgi/ipxe-devel >>> >> >> > >>> >> > >>> >> > >>> >> > >>> >> > _______________________________________________ >>> >> > ipxe-devel mailing list >>> >> > ipxe-devel@lists.ipxe.org >>> >> > https://lists.ipxe.org/mailman/listinfo.cgi/ipxe-devel >>> >> > >>> > >>> > >>> >> >> >
ipxe-deploy.sh
Description: Bourne shell script
_______________________________________________ ipxe-devel mailing list ipxe-devel@lists.ipxe.org https://lists.ipxe.org/mailman/listinfo.cgi/ipxe-devel