Hi,

Have you tested the latest odp-dpdk code? It uses a different shm implementation,
so at least we could rule that one out.

-Matias


> On 10 Apr 2018, at 21:37, gyanesh patra <pgyanesh.pa...@gmail.com> wrote:
> 
> Hi Matias,
> 
> The Mellanox interfaces are mapped to NUMA node 1 (device id: 81:00.x).
> We have free hugepages on both node 0 and node 1, as shown below.
> 
>   root# cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages
>    77
>   root# cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/free_hugepages
>    83
> 
> The ODP application is using a CPU/lcore associated with NUMA node 1 too.
> I have also tried dpdk-17.11.1, without success.
> The issue may be somewhere else.
> 
> Regarding the usage of 2M pages (1024 x 2M pages):
>  - I unmounted the 1G hugepages and then set up 1024 x 2M pages using the
> dpdk-setup.sh script.
>  - But this setup failed with the same error as before.
> 
> Let me know if there is any other option we can try.
> 
> Thanks,
> P Gyanesh Kumar Patra
> 
> On Thu, Mar 29, 2018 at 4:46 AM, Elo, Matias (Nokia - FI/Espoo) 
> <matias....@nokia.com> wrote:
> A second thing to try. Since you seem to have a NUMA system, the ODP 
> application should be run on the same NUMA socket as the NIC (e.g. using 
> taskset if necessary). If they are on different sockets, both sockets should 
> have huge pages mapped.
> 
> -Matias
> 
> > On 29 Mar 2018, at 10:00, Elo, Matias (Nokia - FI/Espoo) 
> > <matias....@nokia.com> wrote:
> >
> > Hi Gyanesh,
> >
> > It seems you are using 1G huge pages. Have you tried using 2M pages (1024 
> > x 2M pages should be enough)? As Bill noted, this seems like a 
> > memory-related issue.
> >
> > -Matias
> >
> >
> >> On 28 Mar 2018, at 18:15, gyanesh patra <pgyanesh.pa...@gmail.com> wrote:
> >>
> >> Yes, it is.
> >> The error is the same. I did reply that the only differences I see are 
> >> the Ubuntu version and a different minor version of the Mellanox driver.
> >>
> >> On Wed, Mar 28, 2018, 07:29 Bill Fischofer <bill.fischo...@linaro.org> 
> >> wrote:
> >> Thanks for the update. Sounds like you're already using DPDK 17.11?
> >> What about the Mellanox driver level? Is the failure the same as you
> >> originally reported?
> >>
> >> From the reported error:
> >>
> >> pktio/dpdk.c:1538:dpdk_start():Queue setup failed: err=-12, port=0
> >> odp_l2fwd.c:1671:main():Error: unable to start 0
> >>
> >> This is a DPDK PMD error reported by rte_eth_rx_queue_setup().
> >> In the Mellanox PMD (drivers/net/mlx5/mlx5_rxq.c) this is the
> >> mlx5_rx_queue_setup() routine. The relevant code seems to be this:
> >>
> >> if (rxq != NULL) {
> >>         DEBUG("%p: reusing already allocated queue index %u (%p)",
> >>               (void *)dev, idx, (void *)rxq);
> >>         if (priv->started) {
> >>                 priv_unlock(priv);
> >>                 return -EEXIST;
> >>         }
> >>         (*priv->rxqs)[idx] = NULL;
> >>         rxq_cleanup(rxq_ctrl);
> >>         /* Resize if rxq size is changed. */
> >>         if (rxq_ctrl->rxq.elts_n != log2above(desc)) {
> >>                 rxq_ctrl = rte_realloc(rxq_ctrl,
> >>                                        sizeof(*rxq_ctrl) +
> >>                                        (desc + desc_pad) *
> >>                                        sizeof(struct rte_mbuf *),
> >>                                        RTE_CACHE_LINE_SIZE);
> >>                 if (!rxq_ctrl) {
> >>                         ERROR("%p: unable to reallocate queue index %u",
> >>                               (void *)dev, idx);
> >>                         priv_unlock(priv);
> >>                         return -ENOMEM;
> >>                 }
> >>         }
> >> } else {
> >>         rxq_ctrl = rte_calloc_socket("RXQ", 1,
> >>                                      sizeof(*rxq_ctrl) +
> >>                                      (desc + desc_pad) *
> >>                                      sizeof(struct rte_mbuf *),
> >>                                      0, socket);
> >>         if (rxq_ctrl == NULL) {
> >>                 ERROR("%p: unable to allocate queue index %u",
> >>                       (void *)dev, idx);
> >>                 priv_unlock(priv);
> >>                 return -ENOMEM;
> >>         }
> >> }
> >>
> >> The reported -12 error code is -ENOMEM so I'd say the issue is some
> >> sort of memory allocation failure.
> >>
> >>
> >> On Wed, Mar 28, 2018 at 8:43 AM, gyanesh patra <pgyanesh.pa...@gmail.com> 
> >> wrote:
> >>> Hi Bill,
> >>> I tried with Matias' suggestions but without success.
> >>>
> >>> P Gyanesh Kumar Patra
> >>>
> >>> On Mon, Mar 26, 2018 at 4:16 PM, Bill Fischofer 
> >>> <bill.fischo...@linaro.org>
> >>> wrote:
> >>>>
> >>>> Hi Gyanesh,
> >>>>
> >>>> Have you had a chance to look at
> >>>> https://bugs.linaro.org/show_bug.cgi?id=3657 and see if Matias' 
> >>>> suggestions
> >>>> are helpful to you?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Regards,
> >>>> Bill
> >>>
> >>>
> >
> 
> 
