Hi,

That is great to know. I will report this behavior internally to our DPDK team, but I already have a patch to change our default target to native-linuxapp (https://review.openstack.org/#/c/246375/), which should merge shortly.

I'm glad it is now working for you.

From: Prathyusha Guduri [mailto:prathyushaconne...@gmail.com]
Sent: Wednesday, November 18, 2015 6:13 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]

Thanks a lot Sean, that was helpful. Changing the target from ivshmem to native-linuxapp removed the error, and it no longer hangs when creating the external bridge. All processes (nova-api, neutron, ovs-vswitchd, etc.) started.

Thanks,
Prathyusha

On Tue, Nov 17, 2015 at 7:57 PM, Mooney, Sean K <sean.k.moo...@intel.com> wrote:

We mainly test with 2M hugepages, not 1G; however, our CI does use 1G pages. We recently noticed a different issue with using the ivshmem target when building DPDK (https://bugs.launchpad.net/networking-ovs-dpdk/+bug/1517032).

Instead of modifying DPDK, can you try changing the default DPDK build target to x86_64-native-linuxapp-gcc? This can be done by adding

RTE_TARGET=x86_64-native-linuxapp-gcc

to the local.conf and removing the following file to force a rebuild:

/opt/stack/ovs/BUILD_COMPLETE

I agree with your assessment, though: this appears to be a timing issue in DPDK 2.0.

From: Prathyusha Guduri [mailto:prathyushaconne...@gmail.com]
Sent: Tuesday, November 17, 2015 1:42 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]

Here is the stack.sh log:

2015-11-17 13:38:50.010 | Loading uio module
2015-11-17 13:38:50.028 | Loading DPDK UIO module
2015-11-17 13:38:50.038 | starting ovs db
2015-11-17 13:38:50.038 | binding nics
2015-11-17 13:38:50.039 | starting vswitchd
2015-11-17 13:38:50.190 | sudo RTE_SDK=/opt/stack/DPDK-v2.0.0 RTE_TARGET=build /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py -b igb_uio 0000:07:00.0
2015-11-17 13:38:50.527 | sudo ovs-vsctl --no-wait --may-exist add-port br-eth1 dpdk0 -- set Interface dpdk0 type=dpdk
2015-11-17 13:38:51.671 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:52.685 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:53.702 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:54.720 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:55.733 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:56.749 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:57.768 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:58.787 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:59.802 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:00.818 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:01.836 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:02.849 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:03.866 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:04.884 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:05.905 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:06.923 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:07.937 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:08.956 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:09.973 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:10.988 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:12.004 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:13.022 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:14.040 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:15.060 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:16.073 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:17.089 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:18.108 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:19.121 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:20.138 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:21.156 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:22.169 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:23.185 | Waiting for ovs-vswitchd to start...
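
One way to see how long stack.sh sat in this wait loop is to diff the timestamps in the log. What follows is a minimal Python sketch, assuming every line of interest starts with the `YYYY-MM-DD HH:MM:SS.mmm | ` prefix seen in the stack.sh output. The function names are hypothetical helpers for illustration and are not part of devstack or networking-ovs-dpdk.

```python
from datetime import datetime

# Timestamp layout assumed from the stack.sh excerpts, e.g. "2015-11-17 13:38:51.671"
PREFIX_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_entries(log_text):
    """Return (timestamp, message) pairs for lines shaped like 'TIMESTAMP | message'."""
    entries = []
    for line in log_text.splitlines():
        stamp, sep, message = line.partition(" | ")
        if not sep:
            continue  # no stack.sh-style separator on this line
        try:
            when = datetime.strptime(stamp.strip(), PREFIX_FORMAT)
        except ValueError:
            continue  # the text before the separator is not a timestamp
        entries.append((when, message.strip()))
    return entries


def wait_summary(log_text, needle="Waiting for ovs-vswitchd to start"):
    """Count the matching lines and the seconds between the first and last of them."""
    stamps = [when for when, message in parse_entries(log_text) if needle in message]
    if not stamps:
        return 0, 0.0
    return len(stamps), (stamps[-1] - stamps[0]).total_seconds()


def largest_gap(log_text):
    """Return (seconds, message) for the entry that followed the longest silence."""
    entries = parse_entries(log_text)
    gaps = [
        ((current[0] - previous[0]).total_seconds(), current[1])
        for previous, current in zip(entries, entries[1:])
    ]
    return max(gaps, default=(0.0, ""))
```

Run over the stack.sh excerpt in this message, `wait_summary` should report 32 waiting lines spread over about 31.5 seconds. `largest_gap` is more useful on the timestamped EAL output elsewhere in this thread, where single steps take minutes.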

On Tue, Nov 17, 2015 at 6:50 PM, Prathyusha Guduri <prathyushaconne...@gmail.com> wrote:

Hi Sean,

Here is ovs-vswitchd.log:

2015-11-13T12:48:01Z|00001|dpdk|INFO|User-provided -vhost_sock_dir in use: /var/run/openvswitch
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 1 on socket 0
EAL: Detected lcore 8 as core 2 on socket 0
EAL: Detected lcore 9 as core 3 on socket 0
EAL: Detected lcore 10 as core 4 on socket 0
EAL: Detected lcore 11 as core 5 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 12 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Searching for IVSHMEM devices...
EAL: No IVSHMEM configuration found!
EAL: Setting up memory...
EAL: Ask a virtual area of 0x180000000 bytes
EAL: Virtual area found at 0x7f1e00000000 (size = 0x180000000)
EAL: remap_all_hugepages(): mmap failed: Cannot allocate memory
EAL: Failed to remap 1024 MB pages
PANIC in rte_eal_init(): Cannot init memory
7: [/usr/sbin/ovs-vswitchd() [0x40b803]]
6: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f1fb52d3ec5]]
5: [/usr/sbin/ovs-vswitchd() [0x40a822]]
4: [/usr/sbin/ovs-vswitchd() [0x675432]]
3: [/usr/sbin/ovs-vswitchd() [0x442155]]
2: [/usr/sbin/ovs-vswitchd() [0x407c9f]]
1: [/usr/sbin/ovs-vswitchd() [0x447828]]

Before this, hugepages were free and port binding was also done. So I suspected that this is a DPDK-specific issue and found that the problem is in remap_all_hugepages() in /opt/stack/DPDK-v2.0.0/lib/librte_eal/linuxapp/eal/eal_memory.c, which first unmaps and then mmaps; the mmap there fails.

On the DPDK mailing list I found that the unmap takes a long time, because of which the mmap fails, so putting a sleep(1) between the unmap and the mmap is supposed to solve the issue. Please check the link below:

https://lists.01.org/pipermail/dpdk-ovs/2014-April/000864.html

After making that change, the ovs-vswitchd command hangs at this place:

2015-11-17T10:52:38Z|00001|dpdk|INFO|User-provided -vhost_sock_dir in use: /var/run/openvswitch
2015-11-17 10:52:38.680 | EAL: Detected lcore 0 as core 0 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 1 as core 1 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 2 as core 2 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 3 as core 3 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 4 as core 4 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 5 as core 5 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 6 as core 0 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 7 as core 1 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 8 as core 2 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 9 as core 3 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 10 as core 4 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 11 as core 5 on socket 0
2015-11-17 10:52:38.680 | EAL: Support maximum 128 logical core(s) by configuration.
2015-11-17 10:52:38.680 | EAL: Detected 12 lcore(s)
2015-11-17 10:52:38.687 | EAL: VFIO modules not all loaded, skip VFIO support...
2015-11-17 10:52:38.687 | EAL: Searching for IVSHMEM devices...
2015-11-17 10:52:38.687 | EAL: No IVSHMEM configuration found!
2015-11-17 10:52:38.687 | EAL: Setting up memory...
2015-11-17 10:52:39.252 | EAL: Ask a virtual area of 0x1c00000 bytes 2015-11-17 10:52:39.252 | EAL: Virtual area found at 0x7fcab3a00000 (size = 0x1c00000) 2015-11-17 10:52:53.265 | EAL: Ask a virtual area of 0x200000 bytes 2015-11-17 10:52:53.266 | EAL: Virtual area found at 0x7fcab3600000 (size = 0x200000) 2015-11-17 10:52:54.266 | EAL: Ask a virtual area of 0x200000 bytes 2015-11-17 10:52:54.266 | EAL: Virtual area found at 0x7fcab3200000 (size = 0x200000) 2015-11-17 10:52:55.267 | EAL: Ask a virtual area of 0x22c00000 bytes 2015-11-17 10:52:55.267 | EAL: Virtual area found at 0x7fca90400000 (size = 0x22c00000) 2015-11-17 10:57:33.574 | EAL: Ask a virtual area of 0x1800000 bytes 2015-11-17 10:57:33.574 | EAL: Virtual area found at 0x7fca8ea00000 (size = 0x1800000) 2015-11-17 10:57:45.585 | EAL: Ask a virtual area of 0xd9800000 bytes 2015-11-17 10:57:45.585 | EAL: Virtual area found at 0x7fc9b5000000 (size = 0xd9800000) 2015-11-17 11:26:50.605 | EAL: Ask a virtual area of 0x200000 bytes 2015-11-17 11:26:50.605 | EAL: Virtual area found at 0x7fc9b4c00000 (size = 0x200000) 2015-11-17 11:26:51.606 | EAL: Ask a virtual area of 0x200000 bytes 2015-11-17 11:26:51.606 | EAL: Virtual area found at 0x7fc9b4800000 (size = 0x200000) 2015-11-17 11:26:52.608 | EAL: Requesting 1024 pages of size 2MB from socket 0 2015-11-17 11:26:53.111 | EAL: TSC frequency is ~3491914 KHz 2015-11-17 11:26:53.111 | EAL: Master lcore 1 is ready (tid=b73cd700;cpuset=[1]) 2015-11-17 11:26:53.111 | PMD: ENICPMD trace: rte_enic_pmd_init 2015-11-17 11:26:53.111 | EAL: PCI device 0000:07:00.0 on NUMA socket 0 2015-11-17 11:26:53.111 | EAL: probe driver: 8086:10d3 rte_em_pmd 2015-11-17 11:26:53.111 | EAL: PCI memory mapped at 0x7fcab5600000 2015-11-17 11:26:53.111 | EAL: PCI memory mapped at 0x7fcab730f000 2015-11-17 11:26:53.111 | EAL: PCI memory mapped at 0x7fcab73d6000 2015-11-17 11:26:53.189 | PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x10d3 2015-11-17 11:26:53.190 | 
2015-11-17T11:26:53Z|00002|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 0 2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00003|ovs_numa|INFO|Discovered 1 NUMA nodes and 12 CPU cores 2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00004|memory|INFO|10680 kB peak resident set size after 2054.5 seconds 2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting... 2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00006|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected 2015-11-17 11:26:53.194 | 2015-11-17T11:26:53Z|00007|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports recirculation 2015-11-17 11:26:53.194 | 2015-11-17T11:26:53Z|00008|ofproto_dpif|INFO|netdev@ovs-netdev: MPLS label stack length probed as 3 2015-11-17 11:26:53.194 | 2015-11-17T11:26:53Z|00009|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports unique flow ids 2015-11-17 11:26:53.195 | 2015-11-17T11:26:53Z|00010|bridge|INFO|bridge br-eth1: added interface br-eth1 on port 65534 2015-11-17 11:26:53.197 | 2015-11-17T11:26:53Z|00011|dpif_netlink|ERR|Generic Netlink family 'ovs_datapath' does not exist. The Open vSwitch kernel module is probably not loaded. 
2015-11-17 11:26:53.287 | Zone 0: name:<MALLOC_S0_HEAP_0>, phys:0x9b600000, len:0xb00000, virt:0x7fca8ea00000, socket_id:0, flags:0 2015-11-17 11:26:53.287 | Zone 1: name:<RG_MP_log_history>, phys:0x36600000, len:0x2080, virt:0x7fcab3600000, socket_id:0, flags:0 2015-11-17 11:26:53.287 | Zone 2: name:<MP_log_history>, phys:0x9c100000, len:0x28a0c0, virt:0x7fca8f500000, socket_id:0, flags:0 2015-11-17 11:26:53.287 | Zone 3: name:<rte_eth_dev_data>, phys:0x36602080, len:0x1f400, virt:0x7fcab3602080, socket_id:0, flags:0 2015-11-17 11:26:53.287 | PMD: eth_em_tx_queue_setup(): sw_ring=0x7fca8f4efd40 hw_ring=0x7fcab3621480 dma_addr=0x36621480 2015-11-17 11:26:53.287 | PMD: eth_em_rx_queue_setup(): sw_ring=0x7fca8f4ebc40 hw_ring=0x7fcab3631480 dma_addr=0x36631480 2015-11-17 11:26:53.368 | PMD: eth_em_start(): << 2015-11-17 11:26:53.368 | 2015-11-17T11:26:53Z|00012|dpdk|INFO|Port 0: 68:05:ca:1b:ca:c9 2015-11-17 11:26:53.405 | PMD: eth_em_tx_queue_setup(): sw_ring=0x7fca8f4efe00 hw_ring=0x7fcab3621480 dma_addr=0x36621480 2015-11-17 11:26:53.405 | PMD: eth_em_rx_queue_setup(): sw_ring=0x7fca8f4ebdc0 hw_ring=0x7fcab3631480 dma_addr=0x36631480 2015-11-17 11:26:53.486 | PMD: eth_em_start(): << 2015-11-17 11:26:53.486 | 2015-11-17T11:26:53Z|00013|dpdk|INFO|Port 0: 68:05:ca:1b:ca:c9 2015-11-17 11:26:53.487 | 2015-11-17T11:26:53Z|00014|dpif_netdev|INFO|Created 1 pmd threads on numa node 0 2015-11-17 11:26:53.487 | 2015-11-17T11:26:53Z|00001|dpif_netdev(pmd10)|INFO|Core 0 processing port 'dpdk0' 2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00002|dpif_netdev(pmd10)|INFO|Core 0 processing port 'dpdk0' 2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00015|bridge|INFO|bridge br-eth1: added interface dpdk0 on port 1 2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00016|bridge|INFO|bridge br-int: added interface br-int on port 65534 2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00017|bridge|INFO|bridge br-eth1: using datapath ID 00006805ca1bcac9 2015-11-17 11:26:53.488 | 
2015-11-17T11:26:53Z|00018|connmgr|INFO|br-eth1: added service controller "punix:/var/run/openvswitch/br-eth1.mgmt"
2015-11-17 11:26:53.489 | 2015-11-17T11:26:53Z|00019|bridge|INFO|bridge br-int: using datapath ID 00002ef7b66a8742
2015-11-17 11:26:53.489 | 2015-11-17T11:26:53Z|00020|connmgr|INFO|br-int: added service controller "punix:/var/run/openvswitch/br-int.mgmt"
2015-11-17 11:26:53.490 | 2015-11-17T11:26:53Z|00021|dpif_netdev|INFO|Created 2 pmd threads on numa node 0
2015-11-17 11:26:53.492 | 2015-11-17T11:26:53Z|00022|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.4.90
2015-11-17 11:26:53.493 | 2015-11-17T11:26:53Z|00001|dpif_netdev(pmd23)|INFO|Core 2 processing port 'dpdk0'
2015-11-17 11:27:03.494 | 2015-11-17T11:27:03Z|00023|memory|INFO|peak resident set size grew 93% in last 10.3 seconds, from 10680 kB to 20572 kB
2015-11-17 11:27:03.494 | 2015-11-17T11:27:03Z|00024|memory|INFO|handlers:4 ports:3 revalidators:2 rules:10

ubuntu@ubuntu-Precision-Tower-5810:/opt/stack/DPDK-v2.0.0/lib/librte_eal/linuxapp/eal$ ps -Al | grep ovs
5 S 0 1681 2595 0 80 0 - 4433 poll_s ? 00:00:00 ovsdb-server
4 S 0 1716 1715 0 80 0 - 4636 wait pts/3 00:00:00 ovs-dpdk
4 S 0 2124 1716 99 80 0 - 870841 poll_s pts/3 03:42:31 ovs-vswitchd

So now ovs-vswitchd runs, unlike the last time. I really don't understand what I am missing.

On Tue, Nov 17, 2015 at 5:14 PM, Mooney, Sean K <sean.k.moo...@intel.com> wrote:

Can you provide the ovs-vswitchd log from ${OVS_LOG_DIR}/ovs-vswitchd.log (/tmp/ovs-vswitchd.log in your case)? If the vswitch fails to start, we clean up by unmounting the hugepages.
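
When an ovs-vswitchd.log is as long as the ones in this thread, it can help to pull out only the lines that signal a failure before reading the rest. A minimal Python sketch, assuming the markers of interest are the ones that actually appear in the excerpts in this thread (`PANIC`, `|ERR|`, `mmap failed`, `Cannot init memory`). The function name and the marker list are illustrative choices, not part of OVS or DPDK.

```python
# Substrings taken from the log excerpts in this thread; extend as needed.
FAILURE_MARKERS = ("PANIC", "|ERR|", "mmap failed", "Cannot init memory")


def failure_lines(log_text, markers=FAILURE_MARKERS):
    """Return (line_number, line) for every line that contains one of the markers."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits


# Example use (the path is the one mentioned for this setup, OVS_LOG_DIR=/tmp):
#   with open("/tmp/ovs-vswitchd.log", errors="replace") as handle:
#       for number, line in failure_lines(handle.read()):
#           print(f"{number}: {line}")
```

A hit is a pointer to read the surrounding context, not proof of the cause: the `dpif_netlink|ERR|` line also appears in the run in this thread where ovs-vswitchd did eventually come up.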

From: Prathyusha Guduri [mailto:prathyushaconne...@gmail.com]
Sent: Tuesday, November 17, 2015 7:37 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]

Hi Sean,

While debugging the ovs-dpdk-init script I realised that the main issue is with the following command:

$ screen -dms ovs-vswitchd sudo sg $qemu_group -c "umask 002; ${OVS_INSTALL_DIR}/sbin/ovs-vswitchd --dpdk -vhost_sock_dir $OVS_DB_SOCKET_DIR -c $OVS_CORE_MASK -n $OVS_MEM_CHANNELS --proc-type primary --huge-dir $OVS_HUGEPAGE_MOUNT --socket-mem $OVS_SOCKET_MEM $pciAddressWhitelist -- unix:$OVS_DB_SOCKET 2>&1 | tee ${OVS_LOG_DIR}/ovs-vswitchd.log"

which I guess starts the ovs-vswitchd application. Before this command, the hugepages are mounted and port binding is also done, but the screen command still fails. I verified the db.sock and conf.db files. Any help is highly appreciated.

Thanks,
Prathyusha

On Mon, Nov 16, 2015 at 5:12 PM, Prathyusha Guduri <prathyushaconne...@gmail.com> wrote:

Hi Sean,

Thanks for your response.

> in your case though you are using 1GB hugepages so I don't think this is related to memory fragmentation or a lack of free hugepages.
> to use preallocated 1GB page with ovs you should instead set the following in your local.conf
> OVS_HUGEPAGE_MOUNT_PAGESIZE=1G
> OVS_ALLOCATE_HUGEPAGES=False

I added the above two parameters to the local.conf. The same problem occurs again. Basically it throws this error:

2015-11-16 11:31:44.741 | starting vswitchd
2015-11-16 11:31:44.863 | sudo RTE_SDK=/opt/stack/DPDK-v2.0.0 RTE_TARGET=build /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py -b igb_uio 0000:07:00.0
2015-11-16 11:31:45.169 | sudo ovs-vsctl --no-wait --may-exist add-port br-eth1 dpdk0 -- set Interface dpdk0 type=dpdk
2015-11-16 11:31:46.314 | Waiting for ovs-vswitchd to start...
2015-11-16 11:31:47.442 | libvirt-bin stop/waiting
2015-11-16 11:31:49.473 | libvirt-bin start/running, process 2255
2015-11-16 11:31:49.477 | [ERROR] /etc/init.d/ovs-dpdk:563 ovs-vswitchd application failed to start

Manually mounting /mnt/huge and then commenting out that part of the /etc/init.d/ovs-dpdk script also throws the same error. Using a 1G hugepage size should not cause any memory-related problem, so I don't understand why it is not mounting.

Here is /opt/stack/networking-ovs-dpdk/devstack/ovs-dpdk/ovs-dpdk.conf:

RTE_SDK=${RTE_SDK:-/opt/stack/DPDK}
RTE_TARGET=${RTE_TARGET:-x86_64-ivshmem-linuxapp-gcc}
OVS_INSTALL_DIR=/usr
OVS_DB_CONF_DIR=/etc/openvswitch
OVS_DB_SOCKET_DIR=/var/run/openvswitch
OVS_DB_CONF=$OVS_DB_CONF_DIR/conf.db
OVS_DB_SOCKET=OVS_DB_SOCKET_DIR/db.sock
OVS_SOCKET_MEM=2048,2048
OVS_MEM_CHANNELS=4
OVS_CORE_MASK=${OVS_CORE_MASK:-2}
OVS_PMD_CORE_MASK=${OVS_PMD_CORE_MASK:-4}
OVS_LOG_DIR=/tmp
OVS_LOCK_DIR=''
OVS_SRC_DIR=/opt/stack/ovs
OVS_DIR=${OVS_DIR:-${OVS_SRC_DIR}}
OVS_UTILS=${OVS_DIR}/utilities/
OVS_DB_UTILS=${OVS_DIR}/ovsdb/
OVS_DPDK_DIR=$RTE_SDK
OVS_NUM_HUGEPAGES=${OVS_NUM_HUGEPAGES:-5}
OVS_HUGEPAGE_MOUNT=${OVS_HUGEPAGE_MOUNT:-/mnt/huge}
OVS_HUGEPAGE_MOUNT_PAGESIZE=''
OVS_BOND_MODE=$OVS_BOND_MODE
OVS_BOND_PORTS=$OVS_BOND_PORTS
OVS_BRIDGE_MAPPINGS=eth1
OVS_PCI_MAPPINGS=0000:07:00.0#eth1
OVS_DPDK_PORT_MAPPINGS=''
OVS_TUNNEL_CIDR_MAPPING=''
OVS_ALLOCATE_HUGEPAGES=True
OVS_INTERFACE_DRIVER='igb_uio'

I verified OVS_DB_SOCKET_DIR and all the others; conf.db and db.sock exist. So why is ovs-vswitchd failing to start? Am I missing something?

Thanks,
Prathyusha

On Mon, Nov 16, 2015 at 4:39 PM, Mooney, Sean K <sean.k.moo...@intel.com> wrote:

Hi,

Yes, sorry for the delay in responding to you and Samta.

In your case, assuming you are using 2MB hugepages, it is easy to hit DPDK's default maximum number of memory segments. This can be changed by setting

OVS_DPDK_MEM_SEGMENTS=<arbitrary large number that you will never hit>

in the local.conf and recompiling. To do this, simply remove the build complete file in /opt/stack/ovs:

rm -f /opt/stack/ovs/BUILD_COMPLETE

In your case, though, you are using 1GB hugepages, so I don't think this is related to memory fragmentation or a lack of free hugepages. To use preallocated 1GB pages with OVS you should instead set the following in your local.conf:

OVS_HUGEPAGE_MOUNT_PAGESIZE=1G
OVS_ALLOCATE_HUGEPAGES=False

Regards,
Sean

From: Prathyusha Guduri [mailto:prathyushaconne...@gmail.com]
Sent: Monday, November 16, 2015 6:20 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]

Hi all,

I have a similar problem to Samta's and am stuck at the same place. The following command

$ sudo ovs-vsctl br-set-external-id br-ex bridge-id br-ex

hangs forever. As Sean said, it might be because of the ovs-vswitchd process.

> The vswitchd process may exit if it failed to allocate memory (due to memory
> fragmentation or lack of free hugepages).
> If the ovs-vswitchd.log is not available, can you check that the hugepage mount
> point was created in /mnt/huge and that it is mounted?
> Run
> ls -al /mnt/huge
> and
> mount

$ mount
/dev/sda6 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,user=ubuntu)

/mnt/huge is my mount point, and it does not appear in this output, so no mounting is happening.
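
The check requested here (is a hugetlbfs filesystem mounted at /mnt/huge?) can be made mechanical instead of scanning the `mount` output by eye. A minimal Python sketch, assuming the classic `device on mountpoint type fstype (options)` layout that `mount` prints in this thread and mount points without spaces. The helper names are hypothetical.

```python
def hugetlbfs_mounts(mount_output):
    """Return {mount_point: options} for the hugetlbfs entries in `mount` output.

    Expects lines shaped like:
        nodev on /mnt/huge type hugetlbfs (rw,uid=106,gid=106)
    """
    mounts = {}
    for line in mount_output.splitlines():
        parts = line.split()
        # parts: device, "on", mount point, "type", fstype, "(options)"
        if len(parts) >= 5 and parts[1] == "on" and parts[3] == "type" and parts[4] == "hugetlbfs":
            options = parts[5].strip("()") if len(parts) > 5 else ""
            mounts[parts[2]] = options
    return mounts


def is_hugepage_mount_present(mount_output, mount_point="/mnt/huge"):
    """True if a hugetlbfs filesystem is mounted at mount_point."""
    return mount_point in hugetlbfs_mounts(mount_output)
```

Note that the `cgroup ... hugetlb` line is a cgroup controller, not a hugetlbfs mount, which is why the sketch matches on the filesystem type field only. As mentioned elsewhere in this thread, the init script unmounts the hugepages when the vswitch fails to start, so a missing mount after a failed run is expected and does not show that the mount step was skipped.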

ovs-vswitchd.log says:

2015-11-13T12:48:01Z|00001|dpdk|INFO|User-provided -vhost_sock_dir in use: /var/run/openvswitch
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 1 on socket 0
EAL: Detected lcore 8 as core 2 on socket 0
EAL: Detected lcore 9 as core 3 on socket 0
EAL: Detected lcore 10 as core 4 on socket 0
EAL: Detected lcore 11 as core 5 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 12 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Searching for IVSHMEM devices...
EAL: No IVSHMEM configuration found!
EAL: Setting up memory...
EAL: Ask a virtual area of 0x180000000 bytes
EAL: Virtual area found at 0x7f1e00000000 (size = 0x180000000)
EAL: remap_all_hugepages(): mmap failed: Cannot allocate memory
EAL: Failed to remap 1024 MB pages
PANIC in rte_eal_init(): Cannot init memory
7: [/usr/sbin/ovs-vswitchd() [0x40b803]]
6: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f1fb52d3ec5]]
5: [/usr/sbin/ovs-vswitchd() [0x40a822]]
4: [/usr/sbin/ovs-vswitchd() [0x675432]]
3: [/usr/sbin/ovs-vswitchd() [0x442155]]
2: [/usr/sbin/ovs-vswitchd() [0x407c9f]]
1: [/usr/sbin/ovs-vswitchd() [0x447828]]

I have configured hugepages in the /boot/grub/grub.cfg file, so there are free hugepages:

AnonHugePages: 378880 kB
HugePages_Total: 6
HugePages_Free: 6
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB

It failed to allocate memory because the mounting was not done. I do not understand why the mounting is not done when there are free hugepages. Also, the DPDK binding did happen.
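
The numbers in this message can be cross-checked against each other: the `Ask a virtual area of 0x180000000 bytes` line is a 6 GiB request, which is exactly `HugePages_Total: 6` pages at `Hugepagesize: 1048576 kB`. A minimal Python sketch of that arithmetic, assuming the standard /proc/meminfo field layout shown in this message. The helper names are hypothetical.

```python
def hugepage_counters(meminfo_text):
    """Parse the HugePages_* and Hugepagesize fields of /proc/meminfo into integers."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        key = key.strip()
        if sep and (key.startswith("HugePages_") or key == "Hugepagesize"):
            counters[key] = int(rest.split()[0])  # Hugepagesize is reported in kB
    return counters


def free_hugepage_bytes(meminfo_text):
    """Bytes of hugepage memory currently free."""
    counters = hugepage_counters(meminfo_text)
    return counters["HugePages_Free"] * counters["Hugepagesize"] * 1024


def pages_needed(request_bytes, meminfo_text):
    """Hugepages of the configured size needed to back request_bytes (rounded up)."""
    page_bytes = hugepage_counters(meminfo_text)["Hugepagesize"] * 1024
    return -(-request_bytes // page_bytes)  # ceiling division
```

With the values from this message, `pages_needed(0x180000000, meminfo)` gives 6 and `free_hugepage_bytes(meminfo)` gives 6 GiB, so the request covers every free page but does not exceed what is available. That is consistent with the view in this thread that a shortage of free hugepages is unlikely to be the cause of the mmap failure.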

$ ../DPDK-v2.0.0/tools/dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver
============================================
0000:07:00.0 '82574L Gigabit Network Connection' unused=igb_uio

Network devices using kernel driver
===================================
0000:00:19.0 'Ethernet Connection I217-LM' if=eth0 drv=e1000e unused=igb_uio *Active*
0000:06:02.0 '82540EM Gigabit Ethernet Controller' if=eth2 drv=e1000 unused=igb_uio

Other network devices
=====================
None

I am using a 1G NIC for the port (eth1) that is bound to DPDK. Is that a problem? Does the port bound to DPDK necessarily need a 10G NIC? I don't think it is a problem anyway, because the binding is done. Please correct me if I am going wrong.

Thanks,
Prathyusha

On Wed, Nov 11, 2015 at 3:52 PM, Samta Rangare <samtarang...@gmail.com> wrote:

Hi Sean,

Thanks for replying; my responses are inline.

On Mon, Nov 9, 2015 at 8:24 PM, Mooney, Sean K <sean.k.moo...@intel.com> wrote:
> Hi
> Can you provide some more information regarding your deployment?
>
> Can you check which kernel you are using.
>
> uname -a

Linux ubuntu 3.16.0-50-generic #67~14.04.1-Ubuntu SMP Fri Oct 2 22:07:51 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

>
> If you are using a 3.19 kernel, changes to some locking code in the kernel
> broke synchronization in DPDK 2.0, and DPDK 2.1 must be used instead.
> In general it is not advisable to use a 3.19 kernel with DPDK as it can lead
> to non-deterministic behavior.
>
> When devstack hangs can you connect with a second ssh session and run
> sudo service ovs-dpdk status
> and
> ps aux | grep ovs
>

sudo service ovs-dpdk status
sourcing config
/opt/stack/logs/ovs-vswitchd.pid is not running
Not all processes are running restart!!!
1

ubuntu@ubuntu:~/samta/devstack$ ps -ef | grep ovs
root 13385 1 0 15:17 ? 
00:00:00 /usr/sbin/ovsdb-server --detach --pidfile=/opt/stack/logs/ovsdb-server.pid --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options ubuntu 24451 12855 0 15:45 pts/0 00:00:00 grep --color=auto ovs > > When the deployment hangs at sudo ovs-vsctl br-set-external-id br-ex > bridge-id br-ex > It usually means that the ovs-vswitchd process has exited. > The above result shows that ovs-vswitchd is not running. > This can happen for a number of reasons. > The vswitchd process may exit if it failed to allocate memory (due to memory > fragmentation or lack of free hugepages) > if the ovs-vswitchd.log is not available can you check the the hugepage mount > point was created in > /mnt/huge And that Iis mounted > Run > ls -al /mnt/huge > and > mount > ls -al /mnt/huge total 4 drwxr-xr-x 2 libvirt-qemu kvm 0 Nov 11 15:18 . drwxr-xr-x 3 root root 4096 May 15 00:09 .. ubuntu@ubuntu:~/samta/devstack$ mount /dev/mapper/ubuntu--vg-root on / type ext4 (rw,errors=remount-ro) proc on /proc type proc (rw,noexec,nosuid,nodev) sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) none on /sys/fs/cgroup type tmpfs (rw) none on /sys/fs/fuse/connections type fusectl (rw) none on /sys/kernel/debug type debugfs (rw) none on /sys/kernel/security type securityfs (rw) udev on /dev type devtmpfs (rw,mode=0755) devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) none on /run/shm type tmpfs (rw,nosuid,nodev) none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755) none on /sys/fs/pstore type pstore (rw) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset) cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu) cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct) cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory) cgroup on 
/sys/fs/cgroup/devices type cgroup (rw,relatime,devices) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer) cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,relatime,net_cls) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event) cgroup on /sys/fs/cgroup/net_prio type cgroup (rw,relatime,net_prio) cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb) /dev/sda1 on /boot type ext2 (rw) systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd) hugetlbfs-kvm on /run/hugepages/kvm type hugetlbfs (rw,mode=775,gid=106) nodev on /mnt/huge type hugetlbfs (rw,uid=106,gid=106) nodev on /mnt/huge type hugetlbfs (rw,uid=106,gid=106) > then checkout how many hugepages are mounted > > cat /proc/meminfo | grep huge > cat /proc/meminfo | grep Huge AnonHugePages: 292864 kB HugePages_Total: 5 HugePages_Free: 5 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 1048576 kB > > the vswitchd process may also exit if it failed to initializes dpdk > interfaces. 
> This can happen if no interface is compatible with the igb-uio or vfio-pci > drivers > (note in the vfio-pci case all interface in the same iommu group must be > bound to the vfio-pci driver and > The iommu must be enabled in the kernel command line with VT-d enabled in the > bios) > > Can you check which interface are bound to the dpdk driver by running the > following command > > /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py --status > /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py --status Network devices using DPDK-compatible driver ============================================ <none> Network devices using kernel driver =================================== 0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' if=p1p1 drv=ixgbe unused=igb_uio 0000:02:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=p4p1 drv=i40e unused=igb_uio 0000:03:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=p2p1 drv=i40e unused=igb_uio 0000:06:00.0 'I350 Gigabit Network Connection' if=em1 drv=igb unused=igb_uio *Active* 0000:06:00.1 'I350 Gigabit Network Connection' if=em2 drv=igb unused=igb_uio Other network devices ===================== 0000:01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' unused=igb_uio > > Finally can you confim that ovs-dpdk compiled successfully by either check > the xstack.log or > Checking for the BUILD_COMPLETE file in /opt/stack/ovs BUILD_COMPLETE exist in /opt/stack/ovs though its empty. > > Regards > sean > > > > > -----Original Message----- > From: Samta Rangare > [mailto:samtarang...@gmail.com<mailto:samtarang...@gmail.com>] > Sent: Monday, November 9, 2015 2:31 PM > To: Czesnowicz, Przemyslaw > Cc: OpenStack Development Mailing List (not for usage questions) > Subject: Re: [openstack-dev] [networking-ovs-dpdk] > > Thanks for replying Przemyslaw, there is no ovs-vswitchd.log in > /opt/stack/logs/. This is all contains inside (ovsdb-server.pid, screen). 
>
> When I cancel stack.sh (Ctrl-C) and try to rerun
> $ sudo ovs-vsctl br-set-external-id br-ex bridge-id br-ex
> it doesn't hang. That means the vSwitch was running, doesn't it?
>
> But rerunning stack.sh after unstack hangs again.
>
> Thanks,
> Samta
>
> On Mon, Nov 9, 2015 at 7:50 PM, Czesnowicz, Przemyslaw
> <przemyslaw.czesnow...@intel.com> wrote:
>> Hi Samta,
>>
>> This usually means that the vSwitch is not running or has crashed.
>> Can you check /opt/stack/logs/ovs-vswitchd.log? There should be an error
>> message there.
>>
>> Regards
>> Przemek
>>
>>> -----Original Message-----
>>> From: Samta Rangare [mailto:samtarang...@gmail.com]
>>> Sent: Monday, November 9, 2015 1:51 PM
>>> To: OpenStack Development Mailing List (not for usage questions)
>>> Subject: [openstack-dev] [networking-ovs-dpdk]
>>>
>>> Hello Everyone,
>>>
>>> I am installing devstack with networking-ovs-dpdk. The local.conf
>>> looks exactly like the one available in
>>> /opt/stack/networking-ovs-dpdk/doc/source/_downloads/local.conf.single_node.
>>> So I believe all the necessary configuration will be taken care of.
>>>
>>> However, I am stuck at the place where devstack tries to set the
>>> external-id ($ sudo ovs-vsctl br-set-external-id br-ex bridge-id
>>> br-ex). As soon as it reaches this point it just hangs forever. I
>>> tried commenting out this line in
>>> lib/neutron_plugin/ml2 (I know this is wrong) and then all services
>>> came up except the ovs-dpdk agent and the ovs agent.
>>>
>>> BTW I am deploying it on Ubuntu 14.04. Any pointers would be really helpful.
>>>
>>> Thanks,
>>> Samta
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: OpenStack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev