This actually worked. Will this patch be merged into the master branch? Does it have any impact on performance?

Thanks & Regards,
P Gyanesh Kumar Patra

On Thu, Apr 12, 2018 at 7:31 AM, Elo, Matias (Nokia - FI/Espoo) <matias....@nokia.com> wrote:

> This patch should hopefully fix the bug:
> https://github.com/matiaselo/odp/commit/c32baeb1796636adfd12fd3f785e10929984ccc3
>
> It would be great if you could verify that the patch works since I cannot
> repeat the original issue on my test system.
>
> -Matias
>
> > On 12 Apr 2018, at 10:53, Elo, Matias (Nokia - FI/Espoo) <matias....@nokia.com> wrote:
> >
> > Still one more thing, the argument '-m' should be replaced with '--socket-mem'.
> >
> >> On 12 Apr 2018, at 10:42, Elo, Matias (Nokia - FI/Espoo) <matias....@nokia.com> wrote:
> >>
> >> Hi,
> >>
> >> I may have figured out the issue here. Currently, the ODP DPDK pktio
> >> implementation configures DPDK to allocate memory only for socket 0.
> >>
> >> Could you please try running ODP again with the environment variable
> >> ODP_PKTIO_DPDK_PARAMS="-m 512,512" set.
> >>
> >> E.g.
> >> sudo ODP_PKTIO_DPDK_PARAMS="-m 512,512" ./odp_l2fwd -c 1 -i 0,1
> >>
> >> If this doesn't help you could test this code change:
> >>
> >> diff --git a/platform/linux-generic/pktio/dpdk.c b/platform/linux-generic/pktio/dpdk.c
> >> index 7bccab8..2b8b8e4 100644
> >> --- a/platform/linux-generic/pktio/dpdk.c
> >> +++ b/platform/linux-generic/pktio/dpdk.c
> >> @@ -1120,7 +1120,8 @@ static int dpdk_pktio_init(void)
> >>                  return -1;
> >>          }
> >>
> >> -        mem_str_len = snprintf(NULL, 0, "%d", DPDK_MEMORY_MB);
> >> +        mem_str_len = snprintf(NULL, 0, "%d,%d", DPDK_MEMORY_MB,
> >> +                               DPDK_MEMORY_MB);
> >>
> >>          cmdline = getenv("ODP_PKTIO_DPDK_PARAMS");
> >>          if (cmdline == NULL)
> >> @@ -1133,8 +1134,8 @@ static int dpdk_pktio_init(void)
> >>          char full_cmd[cmd_len];
> >>
> >>          /* first argument is facility log, simply bind it to odpdpdk for now.*/
> >> -        cmd_len = snprintf(full_cmd, cmd_len, "odpdpdk -c %s -m %d %s",
> >> -                           mask_str, DPDK_MEMORY_MB, cmdline);
> >> +        cmd_len = snprintf(full_cmd, cmd_len, "odpdpdk -c %s -m %d,%d %s",
> >> +                           mask_str, DPDK_MEMORY_MB, DPDK_MEMORY_MB, cmdline);
> >>
> >>          for (i = 0, dpdk_argc = 1; i < cmd_len; ++i) {
> >>                  if (isspace(full_cmd[i]))
> >>
> >> -Matias
> >>
> >>> On 10 Apr 2018, at 21:37, gyanesh patra <pgyanesh.pa...@gmail.com> wrote:
> >>>
> >>> Hi Matias,
> >>>
> >>> The Mellanox interfaces are mapped to NUMA node 1 (device id: 81:00.x).
> >>> We have free hugepages on both node 0 and node 1, as shown below.
> >>>
> >>> root# cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages
> >>> 77
> >>> root# cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/free_hugepages
> >>> 83
> >>>
> >>> The ODP application is using a CPU/lcore associated with NUMA node 1 too.
> >>> I have also tried dpdk-17.11.1, without success.
> >>> The issue may be somewhere else.
> >>>
> >>> Regarding the usage of 2M pages (1024 x 2M pages):
> >>> - I unmounted the 1G hugepages and then set up 1024 x 2M pages using the
> >>>   dpdk-setup.sh script.
> >>> - But this setup failed with the same error as before.
> >>>
> >>> Let me know if there is any other option we can try.
> >>>
> >>> Thanks,
> >>> P Gyanesh Kumar Patra
> >>>
> >>> On Thu, Mar 29, 2018 at 4:46 AM, Elo, Matias (Nokia - FI/Espoo) <matias....@nokia.com> wrote:
> >>> A second thing to try. Since you seem to have a NUMA system, the ODP
> >>> application should be run on the same NUMA socket as the NIC (e.g. using
> >>> taskset if necessary). If the sockets differ, both sockets should
> >>> have huge pages mapped.
> >>>
> >>> -Matias
> >>>
> >>>> On 29 Mar 2018, at 10:00, Elo, Matias (Nokia - FI/Espoo) <matias....@nokia.com> wrote:
> >>>>
> >>>> Hi Gyanesh,
> >>>>
> >>>> It seems you are using 1G huge pages. Have you tried using 2M pages
> >>>> (1024 x 2M pages should be enough)? As Bill noted, this seems like a
> >>>> memory related issue.
> >>>>
> >>>> -Matias
> >>>>
> >>>>> On 28 Mar 2018, at 18:15, gyanesh patra <pgyanesh.pa...@gmail.com> wrote:
> >>>>>
> >>>>> Yes, it is.
> >>>>> The error is the same. I did reply that the only differences I see
> >>>>> are the Ubuntu version and a different minor version of the Mellanox driver.
> >>>>>
> >>>>> On Wed, Mar 28, 2018, 07:29 Bill Fischofer <bill.fischo...@linaro.org> wrote:
> >>>>> Thanks for the update. Sounds like you're already using DPDK 17.11?
> >>>>> What about Mellanox driver level? Is the failure the same as you
> >>>>> originally reported?
> >>>>>
> >>>>> From the reported error:
> >>>>>
> >>>>> pktio/dpdk.c:1538:dpdk_start():Queue setup failed: err=-12, port=0
> >>>>> odp_l2fwd.c:1671:main():Error: unable to start 0
> >>>>>
> >>>>> This is a DPDK PMD driver error reported by rte_eth_rx_queue_setup().
> >>>>> In the Mellanox PMD (drivers/net/mlx5/mlx5_rxq.c) this is the
> >>>>> mlx5_rx_queue_setup() routine. The relevant code seems to be this:
> >>>>>
> >>>>> if (rxq != NULL) {
> >>>>>         DEBUG("%p: reusing already allocated queue index %u (%p)",
> >>>>>               (void *)dev, idx, (void *)rxq);
> >>>>>         if (priv->started) {
> >>>>>                 priv_unlock(priv);
> >>>>>                 return -EEXIST;
> >>>>>         }
> >>>>>         (*priv->rxqs)[idx] = NULL;
> >>>>>         rxq_cleanup(rxq_ctrl);
> >>>>>         /* Resize if rxq size is changed. */
> >>>>>         if (rxq_ctrl->rxq.elts_n != log2above(desc)) {
> >>>>>                 rxq_ctrl = rte_realloc(rxq_ctrl,
> >>>>>                                        sizeof(*rxq_ctrl) +
> >>>>>                                        (desc + desc_pad) *
> >>>>>                                        sizeof(struct rte_mbuf *),
> >>>>>                                        RTE_CACHE_LINE_SIZE);
> >>>>>                 if (!rxq_ctrl) {
> >>>>>                         ERROR("%p: unable to reallocate queue index %u",
> >>>>>                               (void *)dev, idx);
> >>>>>                         priv_unlock(priv);
> >>>>>                         return -ENOMEM;
> >>>>>                 }
> >>>>>         }
> >>>>> } else {
> >>>>>         rxq_ctrl = rte_calloc_socket("RXQ", 1, sizeof(*rxq_ctrl) +
> >>>>>                                      (desc + desc_pad) *
> >>>>>                                      sizeof(struct rte_mbuf *),
> >>>>>                                      0, socket);
> >>>>>         if (rxq_ctrl == NULL) {
> >>>>>                 ERROR("%p: unable to allocate queue index %u",
> >>>>>                       (void *)dev, idx);
> >>>>>                 priv_unlock(priv);
> >>>>>                 return -ENOMEM;
> >>>>>         }
> >>>>> }
> >>>>>
> >>>>> The reported -12 error code is -ENOMEM so I'd say the issue is some
> >>>>> sort of memory allocation failure.
> >>>>>
> >>>>> On Wed, Mar 28, 2018 at 8:43 AM, gyanesh patra <pgyanesh.pa...@gmail.com> wrote:
> >>>>>> Hi Bill,
> >>>>>> I tried Matias' suggestions, but without success.
> >>>>>>
> >>>>>> P Gyanesh Kumar Patra
> >>>>>>
> >>>>>> On Mon, Mar 26, 2018 at 4:16 PM, Bill Fischofer <bill.fischo...@linaro.org>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Hi Gyanesh,
> >>>>>>>
> >>>>>>> Have you had a chance to look at
> >>>>>>> https://bugs.linaro.org/show_bug.cgi?id=3657 and see if Matias'
> >>>>>>> suggestions are helpful to you?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Bill
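
For reference, the direction discussed in this thread (request memory on both NUMA sockets, and pass it with '--socket-mem' rather than '-m') can be sketched roughly as below. This is only an illustration under stated assumptions, not the committed patch: build_eal_cmdline() and SKETCH_MEMORY_MB are hypothetical names, a two-socket system is assumed, and the string is heap-allocated here instead of using a stack buffer as dpdk_pktio_init() does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant for this sketch: MB of hugepage memory per socket. */
#define SKETCH_MEMORY_MB 512

/*
 * Hypothetical helper: build an EAL-style argument string that requests
 * memory on two NUMA sockets via --socket-mem.
 * Returns a malloc()ed string the caller must free(), or NULL on failure.
 */
static char *build_eal_cmdline(const char *mask_str, const char *extra)
{
	static const char fmt[] = "odpdpdk -c %s --socket-mem %d,%d %s";
	char *cmd;
	int len;

	assert(mask_str != NULL);
	if (extra == NULL)
		extra = ""; /* no user-supplied extra parameters */

	/* First pass: snprintf(NULL, 0, ...) only reports the needed length. */
	len = snprintf(NULL, 0, fmt, mask_str, SKETCH_MEMORY_MB,
		       SKETCH_MEMORY_MB, extra);
	if (len < 0)
		return NULL;

	cmd = malloc((size_t)len + 1);
	if (cmd == NULL)
		return NULL;

	/* Second pass: write the string into the exactly sized buffer. */
	snprintf(cmd, (size_t)len + 1, fmt, mask_str, SKETCH_MEMORY_MB,
		 SKETCH_MEMORY_MB, extra);
	return cmd;
}
```

With a core mask of "0x3" and no extra parameters, the helper yields a string starting with "odpdpdk -c 0x3 --socket-mem 512,512". A system with more than two sockets would need one value per socket, which this sketch does not handle.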