Hi,

I have some nodes with Mellanox ConnectX-4 NICs in them, and when I boot them with iPXE the links seem (based on observed speed) to be negotiating 100 Mbps when they come up. I enabled debugging (make DEBUG=golan ...), which produces the following output:

NBP file downloaded successfully.
iPXE initialising devices...golan_probe: start
golan_probe: Using NODNIC driver
golan_probe: rc = 0
golan_probe: start
golan_probe: Using NODNIC driver
golan_probe: rc = 0
golan_probe: start
golan_probe: Using normal driver
golan_bring_up
golan_cmd_init Command interface was initialized
golan_core_enable_hca
golan_handle_pages
golan_handle_pages pages needed: 6
golan_provide_pages
golan_provide_pages Pages handled
golan_set_hca_cap
golan_set_hca_cap caps.uar_sz = 5
golan_set_hca_cap caps.log_pg_sz = 12
golan_set_hca_cap caps.log_uar_sz = 0
golan_handle_pages
golan_handle_pages pages needed: 10024
golan_provide_pages
golan_provide_pages Pages handled
golan_hca_init
golan_alloc_uar: UAR allocated with index 0x80
UAR idx 80 (BE 80000003)
golan_create_eq: Event queue created (EQN = 0x4)
golan_alloc_pd: Protection domain created (PDN = 0x11)
golan_create_mkey: Got DMA Key for local access read/write (MKEY = 0x1000)
golan_bring_down: start
golan_destroy_mkey DMA Key (0x1000) for local access write was destroyed
golan_dealloc_pd in
golan_dealloc_pd Protection domain (0x11) was destroyed
golan_destory_eq in
golan_destory_eq Event queue (0x4) was destroyed
golan_dealloc_uar in
golan_dealloc_uar UAR (0x80) was destroyed
golan_teardown_hca in
golan_teardown_hca HCA teardown compleated
golan_handle_pages
golan_handle_pages pages needed: -10024
golan_take_pages
golan_take_pages Pages handled
golan_bring_down: end
golan_probe: rc = 0
ok
[[ normal boot stuff is here ]]
golan_remove: start
golan_remove: Using NODNIC driver remove
golan_remove: end
golan_remove: start
golan_remove: Using NODNIC driver remove
golan_remove: end
golan_remove: start
golan_remove: Using normal driver remove
golan_remove_normal
[[ node boots ]]

Is there something in that output which confirms 100 Mbps or gives a clue as to why these NICs are so slow? Once booted into an OS the NICs work fine; I only see this slowdown while iPXE is fetching the kernel and initrd.

I don't (yet) have easy access to watch this at the switch, but will get that soon. For now, my guess that the link is auto-negotiating 100 Mbps is based on watching the host that sends the images max out at 11 MB/s whenever one of these nodes is fetching an image. Is there a way to get more debugging output here or, better, to force the link to come up at a higher speed?

Best,
griznog
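
P.S. For reference, the arithmetic behind the 100 Mbps guess can be sketched as below. This is only a rough sanity check, not a measurement: it assumes 1 MB = 10^6 bytes, ignores protocol overhead, and the list of nominal link rates and the helper names (`to_mbps`, `smallest_rate_covering`) are illustrative choices, not anything read from the NIC or from iPXE.

```python
# Nominal Ethernet link rates in Mbit/s (illustrative list, not read from the NIC).
NOMINAL_RATES_MBPS = [10, 100, 1000, 10000, 25000, 40000, 50000, 100000]


def to_mbps(mbytes_per_s):
    """Convert MB/s to Mbit/s, taking 1 MB = 10**6 bytes."""
    return mbytes_per_s * 8


def smallest_rate_covering(mbps):
    """Return the lowest nominal rate that could carry the given throughput."""
    for rate in NOMINAL_RATES_MBPS:
        if rate >= mbps:
            return rate
    return None  # faster than any rate in the list


observed_mbps = to_mbps(11)  # the 11 MB/s ceiling seen on the image server
print(observed_mbps, smallest_rate_covering(observed_mbps))
```

A ceiling of about 88 Mbit/s is what a saturated 100 Mbps link would look like, but it does not prove that is what was negotiated; a bottleneck elsewhere in the transfer path could produce a similar number.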

_______________________________________________
ipxe-devel mailing list
ipxe-devel@lists.ipxe.org
https://lists.ipxe.org/mailman/listinfo/ipxe-devel