This patch should hopefully fix the bug: 
https://github.com/matiaselo/odp/commit/c32baeb1796636adfd12fd3f785e10929984ccc3

It would be great if you could verify that the patch works since I cannot 
repeat the original issue on my test system.

-Matias


> On 12 Apr 2018, at 10:53, Elo, Matias (Nokia - FI/Espoo) 
> <matias....@nokia.com> wrote:
> 
> One more thing: the argument '-m' should be replaced with 
> '--socket-mem'.
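
For anyone following along, here is a minimal C sketch of how a per-socket
memory argument could be assembled once '-m' is replaced with '--socket-mem'.
It is not the code from the linked commit. build_eal_cmdline() is a
hypothetical helper, and passing the same value twice assumes a two-socket
system, as in the quoted "512,512" example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative format only: one --socket-mem value per NUMA socket. */
#define EAL_FMT "odpdpdk -c %s --socket-mem %d,%d %s"

/*
 * Hypothetical helper (not part of ODP): build the EAL command line on the
 * heap. Returns NULL on failure; the caller frees the result.
 */
char *build_eal_cmdline(const char *mask_str, int mem_mb, const char *extra)
{
	int len;
	char *cmd;

	if (mask_str == NULL)
		return NULL;
	if (extra == NULL)
		extra = "";

	/* First pass: snprintf(NULL, 0, ...) returns the length needed. */
	len = snprintf(NULL, 0, EAL_FMT, mask_str, mem_mb, mem_mb, extra);
	if (len < 0)
		return NULL;

	cmd = malloc((size_t)len + 1);
	if (cmd == NULL)
		return NULL;

	/* Second pass: format into the exactly sized buffer. */
	snprintf(cmd, (size_t)len + 1, EAL_FMT, mask_str, mem_mb, mem_mb, extra);
	return cmd;
}
```

The two-pass snprintf() mirrors the sizing approach in the quoted diff. A
real implementation would need one value per socket actually present rather
than a fixed pair.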
> 
> 
>> On 12 Apr 2018, at 10:42, Elo, Matias (Nokia - FI/Espoo) 
>> <matias....@nokia.com> wrote:
>> 
>> Hi,
>> 
>> I may have figured out the issue here. Currently, the ODP DPDK pktio 
>> implementation configures DPDK to allocate memory only for socket 0. 
>> 
>> Could you please try running ODP again with the environment variable 
>> ODP_PKTIO_DPDK_PARAMS="-m 512,512" set.
>> 
>> E.g.
>> sudo ODP_PKTIO_DPDK_PARAMS="-m 512,512"  ./odp_l2fwd -c 1 -i 0,1
>> 
>> 
>> If this doesn't help you could test this code change:
>> 
>> diff --git a/platform/linux-generic/pktio/dpdk.c b/platform/linux-generic/pktio/dpdk.c
>> index 7bccab8..2b8b8e4 100644
>> --- a/platform/linux-generic/pktio/dpdk.c
>> +++ b/platform/linux-generic/pktio/dpdk.c
>> @@ -1120,7 +1120,8 @@ static int dpdk_pktio_init(void)
>>               return -1;
>>       }
>> 
>> -       mem_str_len = snprintf(NULL, 0, "%d", DPDK_MEMORY_MB);
>> +       mem_str_len = snprintf(NULL, 0, "%d,%d", DPDK_MEMORY_MB,
>> +                              DPDK_MEMORY_MB);
>> 
>>       cmdline = getenv("ODP_PKTIO_DPDK_PARAMS");
>>       if (cmdline == NULL)
>> @@ -1133,8 +1134,8 @@ static int dpdk_pktio_init(void)
>>       char full_cmd[cmd_len];
>> 
>>       /* first argument is facility log, simply bind it to odpdpdk for now.*/
>> -       cmd_len = snprintf(full_cmd, cmd_len, "odpdpdk -c %s -m %d %s",
>> -                          mask_str, DPDK_MEMORY_MB, cmdline);
>> +       cmd_len = snprintf(full_cmd, cmd_len, "odpdpdk -c %s -m %d,%d %s",
>> +                          mask_str, DPDK_MEMORY_MB, DPDK_MEMORY_MB, cmdline);
>> 
>>       for (i = 0, dpdk_argc = 1; i < cmd_len; ++i) {
>>               if (isspace(full_cmd[i]))
>> 
>> 
>> -Matias
>> 
>> 
>>> On 10 Apr 2018, at 21:37, gyanesh patra <pgyanesh.pa...@gmail.com> wrote:
>>> 
>>> Hi Matias,
>>> 
>>> The Mellanox interfaces are mapped to NUMA node 1 (device ID: 81:00.x).
>>> We have free hugepages on both Node0 and Node1 as identified below.
>>> 
>>> root# cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages
>>>  77
>>> root# cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/free_hugepages
>>>  83
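
The same per-node check can be done programmatically. Below is a minimal C
sketch, assuming the sysfs layout shown in the two commands above and 1G
pages. Both function names are made up for illustration and are not part of
ODP or DPDK.

```c
#include <stdio.h>

/*
 * Read a single decimal integer from a sysfs-style text file.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
int read_sysfs_long(const char *path, long *out)
{
	FILE *f;
	int ok;

	if (path == NULL || out == NULL)
		return -1;
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	/* %ld skips leading whitespace, so " 77\n" parses as 77. */
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Free 1G hugepages on one NUMA node; path layout taken from the thread. */
int free_1g_hugepages(int node, long *out)
{
	char path[128];
	int n = snprintf(path, sizeof(path),
			 "/sys/devices/system/node/node%d/hugepages/"
			 "hugepages-1048576kB/free_hugepages", node);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return read_sysfs_long(path, out);
}
```

Calling free_1g_hugepages(0, &n) and free_1g_hugepages(1, &n) would report
the two values listed above on this system. A failure return means the node
or page size is not present.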
>>> 
>>> The ODP application is also using a CPU/lcore associated with NUMA node 1.
>>> I have also tried dpdk-17.11.1, without success.
>>> The issue may be somewhere else.
>>> 
>>> Regarding the usage of 2M pages (1024 x 2M pages):
>>> - I unmounted the 1G hugepages and then configured 1024 x 2M pages using 
>>> the dpdk-setup.sh script.
>>> - But this setup failed with the same error as before.
>>> 
>>> Let me know if there is any other option we can try.
>>> 
>>> Thanks,
>>> P Gyanesh Kumar Patra
>>> 
>>> On Thu, Mar 29, 2018 at 4:46 AM, Elo, Matias (Nokia - FI/Espoo) 
>>> <matias....@nokia.com> wrote:
>>> A second thing to try. Since you seem to have a NUMA system, the ODP 
>>> application should be run on the same NUMA socket as the NIC (e.g. using 
>>> taskset if necessary). In case of different sockets, both sockets should 
>>> have huge pages mapped.
>>> 
>>> -Matias
>>> 
>>>> On 29 Mar 2018, at 10:00, Elo, Matias (Nokia - FI/Espoo) 
>>>> <matias....@nokia.com> wrote:
>>>> 
>>>> Hi Gyanesh,
>>>> 
>>>> It seems you are using 1G huge pages. Have you tried using 2M pages 
>>>> (1024 x 2M pages should be enough)? As Bill noted, this seems like a 
>>>> memory related issue.
>>>> 
>>>> -Matias
>>>> 
>>>> 
>>>>> On 28 Mar 2018, at 18:15, gyanesh patra <pgyanesh.pa...@gmail.com> wrote:
>>>>> 
>>>>> Yes, it is.
>>>>> The error is the same. I did reply that the only differences I see are 
>>>>> the Ubuntu version and a different minor version of the Mellanox driver.
>>>>> 
>>>>> On Wed, Mar 28, 2018, 07:29 Bill Fischofer <bill.fischo...@linaro.org> 
>>>>> wrote:
>>>>> Thanks for the update. Sounds like you're already using DPDK 17.11?
>>>>> What about Mellanox driver level? Is the failure the same as you
>>>>> originally reported?
>>>>> 
>>>>> From the reported error:
>>>>> 
>>>>> pktio/dpdk.c:1538:dpdk_start():Queue setup failed: err=-12, port=0
>>>>> odp_l2fwd.c:1671:main():Error: unable to start 0
>>>>> 
>>>>> This is a DPDK PMD driver error reported by rte_eth_rx_queue_setup().
>>>>> In the Mellanox PMD (drivers/net/mlx5/mlx5_rxq.c) this is the
>>>>> mlx5_rx_queue_setup() routine. The relevant code seems to be this:
>>>>> 
>>>>> if (rxq != NULL) {
>>>>>     DEBUG("%p: reusing already allocated queue index %u (%p)",
>>>>>           (void *)dev, idx, (void *)rxq);
>>>>>     if (priv->started) {
>>>>>         priv_unlock(priv);
>>>>>         return -EEXIST;
>>>>>     }
>>>>>     (*priv->rxqs)[idx] = NULL;
>>>>>     rxq_cleanup(rxq_ctrl);
>>>>>     /* Resize if rxq size is changed. */
>>>>>     if (rxq_ctrl->rxq.elts_n != log2above(desc)) {
>>>>>         rxq_ctrl = rte_realloc(rxq_ctrl,
>>>>>                                sizeof(*rxq_ctrl) +
>>>>>                                (desc + desc_pad) *
>>>>>                                sizeof(struct rte_mbuf *),
>>>>>                                RTE_CACHE_LINE_SIZE);
>>>>>         if (!rxq_ctrl) {
>>>>>             ERROR("%p: unable to reallocate queue index %u",
>>>>>                   (void *)dev, idx);
>>>>>             priv_unlock(priv);
>>>>>             return -ENOMEM;
>>>>>         }
>>>>>     }
>>>>> } else {
>>>>>     rxq_ctrl = rte_calloc_socket("RXQ", 1, sizeof(*rxq_ctrl) +
>>>>>                                  (desc + desc_pad) *
>>>>>                                  sizeof(struct rte_mbuf *),
>>>>>                                  0, socket);
>>>>>     if (rxq_ctrl == NULL) {
>>>>>         ERROR("%p: unable to allocate queue index %u",
>>>>>               (void *)dev, idx);
>>>>>         priv_unlock(priv);
>>>>>         return -ENOMEM;
>>>>>     }
>>>>> }
>>>>> 
>>>>> The reported -12 error code is -ENOMEM so I'd say the issue is some
>>>>> sort of memory allocation failure.
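
As a quick way to confirm the mapping from err=-12 to ENOMEM, here is a
small C sketch that turns a negative errno-style return value into text.
neg_errno_str() is a hypothetical helper, and the numeric value 12 for
ENOMEM is the Linux one.

```c
#include <errno.h>
#include <string.h>

/*
 * Setup calls such as rte_eth_rx_queue_setup() report failure as a
 * negative errno value. Return the text for it, or "success" for >= 0.
 */
const char *neg_errno_str(int ret)
{
	if (ret >= 0)
		return "success";
	/* Negate to get the positive errno code that strerror() expects. */
	return strerror(-ret);
}
```

Passing the err value from the log line (-12) yields the system's message
for ENOMEM, which matches the -ENOMEM returns in the excerpt above.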
>>>>> 
>>>>> 
>>>>> On Wed, Mar 28, 2018 at 8:43 AM, gyanesh patra <pgyanesh.pa...@gmail.com> 
>>>>> wrote:
>>>>>> Hi Bill,
>>>>>> I tried with Matias' suggestions but without success.
>>>>>> 
>>>>>> P Gyanesh Kumar Patra
>>>>>> 
>>>>>> On Mon, Mar 26, 2018 at 4:16 PM, Bill Fischofer 
>>>>>> <bill.fischo...@linaro.org>
>>>>>> wrote:
>>>>>>> 
>>>>>>> Hi Gyanesh,
>>>>>>> 
>>>>>>> Have you had a chance to look at
>>>>>>> https://bugs.linaro.org/show_bug.cgi?id=3657 and see if Matias' 
>>>>>>> suggestions
>>>>>>> are helpful to you?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Bill
>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
