Re: [dpdk-dev] [PATCH] net/iavf: release port upon close

2020-09-13 Thread Thomas Monjalon
Hi,

SteveX Yang  wrote:
> Set RTE_ETH_DEV_CLOSE_REMOVE upon probe so all the private resources
> for the port can be freed by rte_eth_dev_close().
> 
> Signed-off-by: SteveX Yang 

I guess the X is not part of your name.

[...]
> -static int
> -iavf_dev_uninit(struct rte_eth_dev *dev)
> -{
> -   struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
> -
> -   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> -   return -EPERM;

All the code below is moved from iavf_dev_uninit() where it was
restricted to the primary process, to iavf_dev_close() which can be
called from the primary process.
It looks inconsistent.

> 
> dev->dev_ops = NULL;
> dev->rx_pkt_burst = NULL;
> dev->tx_pkt_burst = NULL;
> 
> -   iavf_dev_close(dev);
> +
> +   if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_RSS_PF) {
> +   if (vf->rss_lut) {
> +   rte_free(vf->rss_lut);
> +   vf->rss_lut = NULL;
> +   }
> +   if (vf->rss_key) {
> +   rte_free(vf->rss_key);
> +   vf->rss_key = NULL;
> +   }
> +   }
> 
> rte_free(vf->vf_res);
> vf->vsi_res = NULL;
> 
> @@ -1449,15 +1456,15 @@ iavf_dev_uninit(struct rte_eth_dev *dev)
> 
> rte_free(vf->aq_resp);
> vf->aq_resp = NULL;
> 
> +}
> 
> -   if (vf->rss_lut) {
> -   rte_free(vf->rss_lut);
> -   vf->rss_lut = NULL;
> -   }
> -   if (vf->rss_key) {
> -   rte_free(vf->rss_key);
> -   vf->rss_key = NULL;
> -   }
> +static int
> +iavf_dev_uninit(struct rte_eth_dev *dev)
> +{
> +   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +   return -EPERM;
> +
> +   iavf_dev_close(dev);

Are you sure about what should be freed in the secondary process?
If iavf_dev_close() should not be called by the secondary process,
you should move the check inside the function,
because iavf_dev_close() is also called by rte_eth_dev_close().

> 
> return 0;
>  
>  }




Re: [dpdk-dev] [PATCH] net/tap: release port upon close

2020-09-13 Thread Thomas Monjalon
Hi,

28/08/2020 14:37, wangyunjian:
> @@ -1078,6 +1085,23 @@ tap_dev_close(struct rte_eth_dev *dev)
> +
> + /* mac_addrs must not be freed alone because part of dev_private */
> + dev->data->mac_addrs = NULL;
> +
> + internals = dev->data->dev_private;
> + TAP_LOG(DEBUG, "Closing %s Ethernet device on numa %u",
> + tuntap_types[internals->type], rte_socket_id());
> +
> + if (internals->ioctl_sock != -1) {
> + close(internals->ioctl_sock);
> + internals->ioctl_sock = -1;
> + }
> + rte_free(dev->process_private);
> + dev->process_private = NULL;
> + if (tap_devices_count == 1)
> + rte_mp_action_unregister(TAP_MP_KEY);
> + tap_devices_count--;
>  }
[...]
> @@ -2446,12 +2466,11 @@ static int
>  rte_pmd_tap_remove(struct rte_vdev_device *dev)
>  {
>  
> struct rte_eth_dev *eth_dev = NULL;
> 
> -   struct pmd_internals *internals;
> 
> /* find the ethdev entry */
> eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
> if (!eth_dev)
> 
> -   return -ENODEV;
> +   return 0;
> 
> /* mac_addrs must not be freed alone because part of dev_private */
> eth_dev->data->mac_addrs = NULL;

The line above can be removed because mac_addrs is not freed
in secondary process, and it is redundant with tap_dev_close().

>  
>   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>   return rte_eth_dev_release_port(eth_dev);

There is an inconsistency in secondary process
if tap_dev_close() is not called from .remove (unplug)
but can be called from .dev_close (rte_eth_dev_close).

I think the process type check must be done inside tap_dev_close(),
so the process private resources can be freed.

>   
>   tap_dev_close(eth_dev);
>  

This blank line can be removed in my opinion.

> - internals = eth_dev->data->dev_private;
> - TAP_LOG(DEBUG, "Closing %s Ethernet device on numa %u",
> - tuntap_types[internals->type], rte_socket_id());
> -
> - close(internals->ioctl_sock);
> - rte_free(eth_dev->process_private);
> - if (tap_devices_count == 1)
> - rte_mp_action_unregister(TAP_MP_KEY);
> - tap_devices_count--;
>   rte_eth_dev_release_port(eth_dev);
>  
>   return 0;





Re: [dpdk-dev] [PATCH] app/testpmd: fix flow rules list after port stop

2020-09-13 Thread Ori Kam
Hi Ferruh,
Can we proceed with this patch?

Thanks,
Ori
> -Original Message-
> From: Ori Kam
> 
> Hi Ferruh,
> 
> > -Original Message-
> > From: Ferruh Yigit 
> >
> > On 8/20/2020 9:40 AM, Gregory Etelson wrote:
> > > Hello,
> > >
> > > Is this patch scheduled for merge with dpdk.org ?
> > > Please update me.
> > >
> > > Regards,
> > > Gregory
> > >
> > >> -Original Message-
> > >> From: Gregory Etelson 
> > >>
> > >> According to current RTE API, port flow rules must not be kept after port
> > >> stop.
> >
> > Hi Gregory, Ori,
> >
> > Can you please point where this is documented?
> >
> From: rte_ethdev.h
> "Please note that some configuration is not stored between calls to
>  rte_eth_dev_stop()/rte_eth_dev_start(). The following configuration will
>  be retained:
> 
>  - MTU
>  - flow control settings
>  - receive mode configuration (promiscuous mode, all-multicast mode,
>hardware checksum mode, RSS/VMDQ settings etc.)
>  - VLAN filtering configuration
>  - default MAC address
>  - MAC addresses supplied to MAC address array
>  - flow director filtering mode (but not filtering rules)
>  - NIC queue statistics mappings"
> 
> From my understanding this means that flows should not be stored on device
> stop.
> 
> 
> > >>
> > >> Testpmd did not flush port flow rules after `port stop' command was
> called.
> > >> As the result, after the port was restarted, it showed bogus flow rules.
> >
> > There are two issues,
> >
> > 1) According what I see in the rte_flow documentation, not sure if the "port
> > stop" should clear the rules:
> > "
> > PMDs, not applications, are responsible for maintaining flow rules
> > configuration
> > when stopping and restarting a port or performing other actions which may
> > affect
> > them. They can only be destroyed explicitly by applications.
> > "
> >
> Good catch I think this part should be removed, since it has many issues. The
> application is the only
> one that can be responsible for the rules.
> 
> Thinks about the following scenario: application configures 2 queues 0 and 1.
> It insert flow with queue action 1.
> It stops the port and remove queue 1. What should the PMD do?
> What happens if he changed some thing else in configuration that make
> the actions invalid?
> 
> For those reason (the description in rte_ethdev.h and the above issues with
> keeping the rules)
> we (Mellanox) modified our code to remove the flows in stop function from the
> device.
> This code was inserted to DPDK in 20.05 release.
> One more reason is that saving the flows also waste a lot of memory
> which is very costly to many applications.
> 
> 
> > As I tested with i40e, it keeps the rules after stop/start, cc'ing @Jeff,
> > @Beilei & @Qi if this is done intentionally.
> >
> >
> > 2) From the perspective of the testers, users of the testpmd. If they are
> > testing a complex set of filter rules, stopping and starting the port 
> > flushing
> > all rules may be troublesome.
> > Since there is explicit command to remove a rte_flow rule or to remove them
> > all,
> > user may prefer to call it when required to delete the rules, instead of 
> > this is
> > done implicitly in port stop.
> >
> > Btw, this is based on PMD should handle the rules on stop/start, we need to
> > agree on it first, but even that is not the case, we are in the application
> > domain now and we can apply the rules back again in the 'start' if it serves
> > better to the user.
> >
> First like I said above I think we should agree that it is the application
> responsibility to manage the rules and not the PMD, and first thing to do it
> update the rte_flow doc.
> 
> Second I agree that we should discuss if test-pmd should keep the rules and
> reapply them,
> but just like for the PMD the user may create invalid configuration, so re-
> applying the rules
> maybe incorrect.
> Currently test-pmd is not build to support large number of rules, unless 
> using a
> script, and if the user uses a script
> he can reuse this script.
> 
> 
> 
> Best,
> Ori



Re: [dpdk-dev] [PATCH] app/testpmd: fix flow rules list after port stop

2020-09-13 Thread Thomas Monjalon
13/09/2020 14:12, Ori Kam:
> Hi Ferruh,
> Can we proceed with this patch?

Below, you said "first thing to do it update the rte_flow doc".
So I am expecting a patch on the doc to start this discussion.
This testpmd patch is on hold in my understanding.


> From: Ori Kam
> > From: Ferruh Yigit 
> > > On 8/20/2020 9:40 AM, Gregory Etelson wrote:
> > > > Hello,
> > > >
> > > > Is this patch scheduled for merge with dpdk.org ?
> > > > Please update me.
> > > >
> > > >> From: Gregory Etelson 
> > > >>
> > > >> According to current RTE API, port flow rules must not be kept after 
> > > >> port
> > > >> stop.
> > >
> > > Hi Gregory, Ori,
> > >
> > > Can you please point where this is documented?
> > >
> > From: rte_ethdev.h
> > "Please note that some configuration is not stored between calls to
> >  rte_eth_dev_stop()/rte_eth_dev_start(). The following configuration will
> >  be retained:
> > 
> >  - MTU
> >  - flow control settings
> >  - receive mode configuration (promiscuous mode, all-multicast mode,
> >hardware checksum mode, RSS/VMDQ settings etc.)
> >  - VLAN filtering configuration
> >  - default MAC address
> >  - MAC addresses supplied to MAC address array
> >  - flow director filtering mode (but not filtering rules)
> >  - NIC queue statistics mappings"
> > 
> > From my understanding this means that flows should not be stored on device
> > stop.
> > 
> > 
> > > >>
> > > >> Testpmd did not flush port flow rules after `port stop' command was
> > called.
> > > >> As the result, after the port was restarted, it showed bogus flow 
> > > >> rules.
> > >
> > > There are two issues,
> > >
> > > 1) According what I see in the rte_flow documentation, not sure if the 
> > > "port
> > > stop" should clear the rules:
> > > "
> > > PMDs, not applications, are responsible for maintaining flow rules
> > > configuration
> > > when stopping and restarting a port or performing other actions which may
> > > affect
> > > them. They can only be destroyed explicitly by applications.
> > > "
> > >
> > Good catch I think this part should be removed, since it has many issues. 
> > The
> > application is the only
> > one that can be responsible for the rules.
> > 
> > Thinks about the following scenario: application configures 2 queues 0 and 
> > 1.
> > It insert flow with queue action 1.
> > It stops the port and remove queue 1. What should the PMD do?
> > What happens if he changed some thing else in configuration that make
> > the actions invalid?
> > 
> > For those reason (the description in rte_ethdev.h and the above issues with
> > keeping the rules)
> > we (Mellanox) modified our code to remove the flows in stop function from 
> > the
> > device.
> > This code was inserted to DPDK in 20.05 release.
> > One more reason is that saving the flows also waste a lot of memory
> > which is very costly to many applications.
> > 
> > 
> > > As I tested with i40e, it keeps the rules after stop/start, cc'ing @Jeff,
> > > @Beilei & @Qi if this is done intentionally.
> > >
> > >
> > > 2) From the perspective of the testers, users of the testpmd. If they are
> > > testing a complex set of filter rules, stopping and starting the port 
> > > flushing
> > > all rules may be troublesome.
> > > Since there is explicit command to remove a rte_flow rule or to remove 
> > > them
> > > all,
> > > user may prefer to call it when required to delete the rules, instead of 
> > > this is
> > > done implicitly in port stop.
> > >
> > > Btw, this is based on PMD should handle the rules on stop/start, we need 
> > > to
> > > agree on it first, but even that is not the case, we are in the 
> > > application
> > > domain now and we can apply the rules back again in the 'start' if it 
> > > serves
> > > better to the user.
> > >
> > First like I said above I think we should agree that it is the application
> > responsibility to manage the rules and not the PMD, and first thing to do it
> > update the rte_flow doc.
> > 
> > Second I agree that we should discuss if test-pmd should keep the rules and
> > reapply them,
> > but just like for the PMD the user may create invalid configuration, so re-
> > applying the rules
> > maybe incorrect.
> > Currently test-pmd is not build to support large number of rules, unless 
> > using a
> > script, and if the user uses a script
> > he can reuse this script.





Re: [dpdk-dev] [PATCH] bus/pci: support segment value as address domain on Windows

2020-09-13 Thread Tal Shnaiderman
> Subject: Re: [PATCH] bus/pci: support segment value as address domain on
> Windows
> 
> Thanks for the explanation, Tal.
> 
> I had always been curious how Windows stores the PCIe segment (domain)
> number.
> 
> On VMs hosted on Hyper-V, the VF segment numbers are always in the high
> 16-bit values.
> 
> Is this documented somewhere, or did you find this by experimentation?
> 
Hi Ranjit,

I didn’t find documentation on it.
I found it by experimentation on a Windows VM with several VFs using the same 
virtual switch, Linux VFs showed the same behavior but they are being detected 
using the domain value which is different.
> 
> ranjit m.



Re: [dpdk-dev] [PATCH] bus/pci: support segment value as address domain on Windows

2020-09-13 Thread Tal Shnaiderman
> Subject: Re: [PATCH] bus/pci: support segment value as address domain on
> Windows
> 
> On Thu, 10 Sep 2020 07:30:39 +, Tal Shnaiderman wrote:
> > Right, it can happen in virtualization setups when several virtual functions
> can have the same BDF, e.g.:
> >
> > PS  > Get-NetAdapterHardwareInfo
> >
> > Name   Segment Bus Device Function Slot NumaNode
> PcieLinkSpeed
> >    --- --- --    
> > -
> > Ethernet   0   0 10   0 
> >  Unknown
> > Ethernet 4   58601   0  20 0
> >Unknown
> > Ethernet 5   52956   0  20 0
> >Unknown
> >
> > DPDK currently can detect either Ethernet 4 or ethernet 5 if only BDF is
> checked.
> > Unix uses the Domain value, the equivalent value for Windows is Segment.
> 
> Hi Tal,
> 
> I wonder how exactly this setup can be reproduced, i.e. could you share
> relevant QEMU options, VMX file or some other config you're using?
> Patch idea and code look clear, however, I never managed to build QEMU
> PCIe hierarchy to see it working.

Hi Dmitry,

I'm not sure it is relevant to all NICs but for Mellanox you can recreate it on 
a Hyper-V Windows VM with several SR-IOV VFs using the same virtual switch.


[dpdk-dev] [RFC PATCH v2 3/4] ethdev: add hairpin bind APIs

2020-09-13 Thread Bing Zhao
In single port hairpin mode, all the hairpin TX and RX queues belong
to the same device. After the queues are set up properly, there is
no other dependency between the TX queue and its RX peer queue. The
binding process that connected the TX and RX queues together from
hardware level will be done automatically during the device start
procedure. Everything required for binding will be configured and
initialized before.
But in two ports hairpin mode, there will be some cross-dependences
between two different ports. Usually, the ports will be initialized
serially by the master thread but not in parallel. The earlier port
will not be able to enable the bind if the following peer port is
not configured done with HW resources. What's more, if one port is
detached / attached dynamically, it would introduce more trouble
for the hairpin binding.
To overcome these, new APIs for binding and unbinding are added.
During startup, only the hairpin TX and RX peer queues will be set
up. Nothing will be done when starting the device if the queues are
without auto bind attribute. Only after the required ports pair
started, the rte_eth_hairpin_bind() API can be called to bind the
all TX queues of the egress port to the RX queues of the peer port.
The connection between the egress and ingress ports pair will be
established.
rte_eth_hairpin_unbind() API could be used to disconnect the egress
and the peer ingress ports. This should only be called before the
device is closed if needed.

Signed-off-by: Bing Zhao 
---
 lib/librte_ethdev/rte_ethdev.c| 100 ++
 lib/librte_ethdev/rte_ethdev.h|  51 +
 lib/librte_ethdev/rte_ethdev_driver.h |  52 ++
 3 files changed, 203 insertions(+)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 066751f..8a3dc73 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -2175,6 +2175,106 @@ rte_eth_tx_hairpin_queue_setup(uint16_t port_id, 
uint16_t tx_queue_id,
return eth_err(port_id, ret);
 }
 
+int
+rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port)
+{
+   struct rte_eth_dev *dev;
+   struct rte_eth_dev *rdev;
+   uint16_t p;
+   uint16_t rp;
+   int ret = 0;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(tx_port, -EINVAL);
+   dev = &rte_eth_devices[tx_port];
+   if (!dev->data->dev_started) {
+   RTE_ETHDEV_LOG(ERR, "TX port %d is not started", tx_port);
+   return -EBUSY;
+   }
+
+   if (rx_port == RTE_MAX_ETHPORTS) {
+   RTE_ETH_FOREACH_DEV(p) {
+   rdev = &rte_eth_devices[p];
+   if (!rdev->data->dev_started) {
+   RTE_ETHDEV_LOG(ERR,
+  "RX port %d is not started", p);
+   ret = -EBUSY;
+   goto error;
+   }
+   ret = (*dev->dev_ops->hairpin_bind)(dev, p);
+   if (ret) {
+   RTE_ETHDEV_LOG(ERR, "Failed to bind hairpin TX "
+  "%d to RX %d", tx_port, p);
+   goto error;
+   }
+   }
+   } else {
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(rx_port, -EINVAL);
+   rdev = &rte_eth_devices[rx_port];
+   if (!rdev->data->dev_started) {
+   RTE_ETHDEV_LOG(ERR,
+  "RX port %d is not started", rx_port);
+   return -EBUSY;
+   }
+   ret = (*dev->dev_ops->hairpin_bind)(dev, rx_port);
+   if (ret)
+   RTE_ETHDEV_LOG(ERR, "Failed to bind hairpin TX %d "
+  "to RX %d", tx_port, rx_port);
+   }
+
+   return ret;
+
+error:
+   RTE_ETH_FOREACH_DEV(rp) {
+   if (rp < p)
+   (*dev->dev_ops->hairpin_unbind)(dev, rp);
+   }
+   return ret;
+}
+
+int
+rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port)
+{
+   struct rte_eth_dev *dev;
+   struct rte_eth_dev *rdev;
+   uint16_t p;
+   uint16_t rp;
+   int ret = 0;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(tx_port, -EINVAL);
+   dev = &rte_eth_devices[tx_port];
+   if (!dev->data->dev_started) {
+   RTE_ETHDEV_LOG(ERR, "TX port %d is stopped", tx_port);
+   return -EBUSY;
+   }
+
+   if (rx_port == RTE_MAX_ETHPORTS) {
+   RTE_ETH_FOREACH_DEV(p) {
+   rdev = &rte_eth_devices[p];
+   if (!rdev->data->dev_started) {
+   RTE_ETHDEV_LOG(ERR, "RX port %d is stopped", p);
+   ret = -EBUSY;
+   break;
+   }
+   ret = (*dev

[dpdk-dev] [RFC PATCH v2 1/4] ethdev: add support for flow item transmit queue

2020-09-13 Thread Bing Zhao
New rte_flow_item_tx_queue is introduced to support matching on the
traffic from a specific transmit queue. This is only for the egress
direction. For ingress, the receive queue index will be part of the
actions with ACTION QUEUE or RSS and there is no needs to match.

By adding this flow item, it will be easy for the application to
create some wildcard rules to distribute the egress traffic for the
further handling.

Normally, all the traffic from software to wire will have the same
wildcard rule. If the packets need to be handled in different ways,
packets' headers and metadata, etc. will be used for matching.

But in some cases, for example:
1. Packets from different TX queues will have different next common
   behaviors.
2. Hairpin TX traffic should have different behavior from software
   egress traffic.
Matching on the TX queue will help to reduce the rules number and
simplify the rules management.

The support for this new item will be decided by the PMD driver and
the capacity of the NIC.

Signed-off-by: Bing Zhao 
---
 lib/librte_ethdev/rte_flow.c |  1 +
 lib/librte_ethdev/rte_flow.h | 30 ++
 2 files changed, 31 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index f8fdd68..4600e27 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -96,6 +96,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = 
{
MK_FLOW_ITEM(L2TPV3OIP, sizeof(struct rte_flow_item_l2tpv3oip)),
MK_FLOW_ITEM(PFCP, sizeof(struct rte_flow_item_pfcp)),
MK_FLOW_ITEM(ECPRI, sizeof(struct rte_flow_item_ecpri)),
+   MK_FLOW_ITEM(TX_QUEUE, sizeof(struct rte_flow_item_tx_queue)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..264755a 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -537,6 +537,13 @@ enum rte_flow_item_type {
 */
RTE_FLOW_ITEM_TYPE_ECPRI,
 
+   /**
+* Matches TX queue ID of a specific port for egress traffic.
+*
+* See struct rte_flow_item_tx_queue.
+*/
+   RTE_FLOW_ITEM_TYPE_TX_QUEUE,
+
 };
 
 /**
@@ -1580,6 +1587,29 @@ static const struct rte_flow_item_ecpri 
rte_flow_item_ecpri_mask = {
 #endif
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_TX_QUEUE
+ *
+ * Match egress traffic originating from a TX queue of a port. Port refers to
+ * a struct rte_eth_dev object on the application side and the transmit queue
+ * index must be in the range [0, nb_tx_queue - 1] previously supplied to
+ * rte_eth_dev_configure(). Normally only supported if the TX queue ID is known
+ * by the underlying PMD.
+ */
+struct rte_flow_item_tx_queue {
+   uint32_t queue; /**< DPDK TX queue ID of a specific port. */
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TX_QUEUE. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tx_queue rte_flow_item_tx_queue_mask = {
+   .queue = 0x,
+};
+#endif
+
+/**
  * Matching pattern item definition.
  *
  * A pattern is formed by stacking items starting from the lowest protocol
-- 
2.5.5



[dpdk-dev] [RFC PATCH v2 0/4] introduce support for hairpin between two ports

2020-09-13 Thread Bing Zhao
Hairpin functionality only supports one single port mode (e.g testpmd
application) in the current implementation. It means that the traffic
will be sent out from the same port it comes. There is no such
restriction for some NICs, and strong demand to support two ports
hairpin mode in real-life cases.
Two ports hairpin mode does not really mean hairpin will only support
two ports in a single application. Indeed, it also needs to support
the single port hairpin today for compatibility. In the meanwhile,
'two ports' means the ingress and egress ports of the traffic could
Be different. And also, there is no restriction that
  1. traffic from the same ingress port must go to the same egress
 port
  2. traffic from the port that as 'egress' for other traffic flows
 must go to their 'ingress' port
The configuration should be flexible and the behavior of traffic will
be decided by the rte flows.

Usually, during the startup phase, all the hairpin configurations
except flows should be done. It means that hairpin TXQ and peer RXQ
should be bound together. It is feasible in single port mode and
transparent to the application. In two ports mode, there may be some
problems for the queues configuring and binding.
  1. Once TXQ & RXQ belong to different ports, it would be hard to
 configure the first port when the initialization of the second
 port is not done. Also, it is not proper to configure the first
 port during the second one starting.
  2. The port could be attached and detached dynamically. Hairpin
 between these ports should support dynamic configuration.

In two ports hairpin mode, since the TXQ and RXQ belong to different
ports. If some actions need to be done in the TX part, the egress flow
could be inserted explicitly and managed separately from the RX part.
What's more, one egress flow could be shared for different ingress
flows from the same or different ports.

In order to satisfy these, some changes on the current rte ethdev and
flow APIs are needed and some new APIs will be introduced.

1. Data structures in 'rte_ethdev.h'
Two new members are added.
struct rte_eth_hairpin_conf {
uint16_t peer_count; /**< The number of peers. */
struct rte_eth_hairpin_peer peers[RTE_ETH_MAX_HAIRPIN_PEERS];
uint16_t tx_explicit;
uint16_t manual_bind;
};
'tx_explicit': If 0, PMD will help to insert the egress flow in a
implicit way. If 1, the application will insert it by itself.
'manual_bind': If 0, PMD will try to bind hairpin TXQ and RXQ peer
automatically, like in today's single port hairpin mode and this is
for backward compatibility. If 1, then manual bind API will be called.
The application should ensure there is no conflict for the hairpin
peer configurations between TX & RX as today and PMD could check
them inside. For new member 'tx_explicit', all queue pairs from one
ingress port to the same egress are suggested to have the same value
in order not to create chaos, like in RSS cases.
For new member 'manual_bind', the same suggestion is applicable.
The support for the new members will be decided by the NICs' capacity
and real-life usage from the application.

2. New macros in 'rte_ethdev.h'
RTE_ETH_HAIRPIN_BIND_AUTO (0)
RTE_ETH_HAIRPIN_BIND_MANUAL (1)
RTE_ETH_HAIRPIN_TXRULE_IMPLICIT (0)
RTE_ETH_HAIRPIN_TXRULE_EXPLICIT (1)
These are used for the new members in 'struct rte_eth_hairpin_conf'.

3. New function APIs in 'rte_ethdev.h'
* int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port)
* typedef int (*eth_hairpin_bind)(struct rte_eth_dev *dev,
uint16_t rx_port);
This function will be used to bind one port egress to the peer port
ingress. If 'rx_port' is equal to RTE_MAX_ETHPORTS, then all the ports
will be traversed to bind hairpin egress queues to all of their
ingress queues configured. The application needs to call it repeatedly
to bind all egress ports.
This should be called after the hairpin queues are set up and devices
are started. If 'manual_bind' is not specified, no need to call this
API. A function pointer with 'eth_hairpin_bind' type should be
provided by the PMD to execute the hardware setting in the driver.
0 return value means success and a negative value will be returned to
indicate the actual failure.

* int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port)
* typedef int (*eth_hairpin_unbind)(struct rte_eth_dev *dev,
uint16_t rx_port);
This function will unbind one port egress to the peer port ingress,
only one direction hairpin will be unbound. Unbinding of the opposite
direction needs another call of this API.
If 'rx_port' is equal to RTE_MAX_ETHPORTS, all the ports will be
traversed to do the queues unbind (if any). The application needs to
call it repeatedly to unbind all egress ports.
The API could be called without stopping or closing the eth device,
but the application should ensure the flows inserted for the hairpin
port pairs be handled properly. The t

[dpdk-dev] [RFC PATCH v2 2/4] testpmd: add item transmit queue in flow CLI

2020-09-13 Thread Bing Zhao
In the testpmd command line for flow creation, add the keyword
'tx_queue' in the pattern scope so the item transmit queue ID
could be specified during the flow creation.
The index of the transmit queue should be in the range
[0, nb_tx_queue - 1] supplied to rte_eth_dev_configure() during
initialization. Normal TX queues and hairpin TX queues will have a
unified index number sequence.

Signed-off-by: Bing Zhao 
---
 app/test-pmd/cmdline_flow.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 6263d30..f7816fc 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -109,6 +109,8 @@ enum index {
ITEM_PHY_PORT_INDEX,
ITEM_PORT_ID,
ITEM_PORT_ID_ID,
+   ITEM_TX_QUEUE,
+   ITEM_TX_QUEUE_ID,
ITEM_MARK,
ITEM_MARK_ID,
ITEM_RAW,
@@ -759,6 +761,7 @@ static const enum index next_item[] = {
ITEM_VF,
ITEM_PHY_PORT,
ITEM_PORT_ID,
+   ITEM_TX_QUEUE,
ITEM_MARK,
ITEM_RAW,
ITEM_ETH,
@@ -1954,6 +1957,21 @@ static const struct token token_list[] = {
.next = NEXT(item_port_id, NEXT_ENTRY(UNSIGNED), item_param),
.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port_id, id)),
},
+   [ITEM_TX_QUEUE] = {
+   .name = "tx_queue",
+   .help = "match traffic from a given transmit queue",
+   .priv = PRIV_ITEM(QUEUE,
+ sizeof(struct rte_flow_item_tx_queue)),
+   .next = NEXT(NEXT_ENTRY(ITEM_TX_QUEUE_ID)),
+   .call = parse_vc,
+   },
+   [ITEM_TX_QUEUE_ID] = {
+   .name = "index",
+   .help = "TX queue index of this port",
+   .next = NEXT(NEXT_ENTRY(ITEM_NEXT), NEXT_ENTRY(UNSIGNED),
+item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_tx_queue, queue)),
+   },
[ITEM_MARK] = {
.name = "mark",
.help = "match traffic against value set in previously matched 
rule",
-- 
2.5.5



[dpdk-dev] [RFC PATCH v2 4/4] ethdev: add new attributes to hairpin queues config

2020-09-13 Thread Bing Zhao
To support two ports hairpin mode and keep the backward compatibility
for application, two new attribute members of hairpin queue config
structure are added.

`tx_explicit` means if PMD or application itself will insert the TX
part flow rules.
`manual_bind` means if the hairpin TX queue and peer RX queue will be
bound automatically during device start stage.

Different TX and RX queue pairs could have different values, but it
is highly recommend that all paired queues between one egress and its
peer ingress ports have the same values, in order not to bring any
chaos to the system. The actual support of these attribute parameters
will be checked and decided by the PMD driver.

In a single port hairpin, if both are zero without any setting, the
behavior will remain the same as before. It means no bind API needs
to be called and no TX flow rules need to be inserted manually by
the application.

Signed-off-by: Bing Zhao 
---
 lib/librte_ethdev/rte_ethdev.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index fb217b4..9560d60 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -996,6 +996,21 @@ struct rte_eth_hairpin_cap {
 
 #define RTE_ETH_MAX_HAIRPIN_PEERS 32
 
+/*
+ * Hairpin queue attribute parameters.
+ * Each TX queue and peer RX queue should have the same value.
+ * Default value 0 is for backward-compatibility, the same behaviors should
+ * remain if the value is not set (0).
+ */
+/**< Hairpin queues will be bound automatically */
+#define RTE_ETH_HAIRPIN_BIND_AUTO  (0)
+/**< Hairpin queues will be bound manually with bind API */
+#define RTE_ETH_HAIRPIN_BIND_MANUAL(1)
+/**< Hairpin TX part flow rule will be inserted implicitly by PMD */
+#define RTE_ETH_HAIRPIN_TXRULE_IMPLICIT(0)
+/**< Hairpin TX part flow rule will be inserted explicitly by APP */
+#define RTE_ETH_HAIRPIN_TXRULE_EXPLICIT(1)
+
 /**
  * @warning
  * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
@@ -1016,6 +1031,8 @@ struct rte_eth_hairpin_peer {
 struct rte_eth_hairpin_conf {
uint16_t peer_count; /**< The number of peers. */
struct rte_eth_hairpin_peer peers[RTE_ETH_MAX_HAIRPIN_PEERS];
+   uint16_t tx_explicit; /**< Explicit TX flow rule by APP. */
+   uint16_t manual_bind; /**< Manually binding hairpin queues. */
 };
 
 /**
-- 
2.5.5



Re: [dpdk-dev] [PATCH] bus/pci: netuio interface for windows

2020-09-13 Thread Tal Shnaiderman
> Subject: [PATCH] bus/pci: netuio interface for windows
> 
> This patch adds implementations to probe PCI devices bound to netuio with
> the help of "netuio" class device changes.
> Now Windows will support both "netuio" and "net" device class and can set
> kernel driver type based on the device class selection.
> 
> Note: Few definitions and structures have been copied from
> netuio_interface.h file from ("[v3] windows/netuio: add Windows NetUIO
> kernel driver") series and this will be fixed once the exact path for netuio
> source code is known.
> 
> Signed-off-by: Pallavi Kadam 
> Reviewed-by: Ranjit Menon 
> ---
>  drivers/bus/pci/windows/pci.c | 293 +--
> ---
>  1 file changed, 257 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/bus/pci/windows/pci.c b/drivers/bus/pci/windows/pci.c
> index c80bd5571..38db87a2a 100644
> --- a/drivers/bus/pci/windows/pci.c
> +++ b/drivers/bus/pci/windows/pci.c

I think it will be better to split the pci implementation into 2 files and add 
pci_netuio.c
(similarly to Linux with pci_uio.c and pci_vfio.c), moving all of the 
netuio-specific definitions and functions in this patch to the new file.

[snip]

> -int
> -rte_pci_scan(void)
> +static int
> +pci_scan_device_class(const GUID *guid)
>  {
> int   ret = -1;
> DWORD device_index = 0, found_device = 0;
> HDEVINFO dev_info;
> SP_DEVINFO_DATA device_info_data;
> +   const char *class;
> 
> -   /* for debug purposes, PCI can be disabled */
> -   if (!rte_eal_has_pci())
> -   return 0;
> +   if (IsEqualGUID((const void *)guid,
> +   (const void *)&GUID_DEVCLASS_NETUIO))
> +   class = netuio_class;
> +   else
> +   class = net_class;
> 
> -   dev_info = SetupDiGetClassDevs(&GUID_DEVCLASS_NET, TEXT("PCI"),
> NULL,
> -   DIGCF_PRESENT);
> +   dev_info = SetupDiGetClassDevs(guid, TEXT("PCI"), NULL,
> + DIGCF_PRESENT);

Is there a way to get both GUID_DEVCLASS_NET and GUID_DEVCLASS_NETUIO in a 
single call here?
We could avoid calling this function twice if we could get all of the devices.

> if (dev_info == INVALID_HANDLE_VALUE) {
> RTE_LOG_WIN32_ERR("SetupDiGetClassDevs(pci_scan)");
> -   RTE_LOG(ERR, EAL, "Unable to enumerate PCI devices.\n");
> +   RTE_LOG(ERR, EAL, "Unable to enumerate %s PCI devices.\n",
> +   class);
> goto end;
> }
> 
> @@ -415,7 +614,8 @@ rte_pci_scan(void)
> device_info_data.cbSize = sizeof(SP_DEVINFO_DATA);
> }
> 
> -   RTE_LOG(DEBUG, EAL, "PCI scan found %lu devices\n", found_device);
> +   RTE_LOG(DEBUG, EAL, "PCI scan found %lu %s devices\n",
> +   found_device, class);
> ret = 0;
>  end:
> if (dev_info != INVALID_HANDLE_VALUE) @@ -423,3 +623,24 @@
> rte_pci_scan(void)
> 
> return ret;
>  }
> +
> +/*
> + * Scan the contents of the PCI bus looking for devices  */ int
> +rte_pci_scan(void)
> +{
> +   int   ret = -1;
> +
> +   /* for debug purposes, PCI can be disabled */
> +   if (!rte_eal_has_pci())
> +   return 0;
> +
> +   /* first, scan for netUIO class devices */
> +   ret = pci_scan_device_class(&GUID_DEVCLASS_NETUIO);

ret is overwritten later on.

> +
> +   /* then, scan for the standard net class devices */
> +   ret = pci_scan_device_class(&GUID_DEVCLASS_NET);
> +
> +   return ret;
> +}
> --
> 2.18.0.windows.1



[dpdk-dev] [PATCH v1 1/2] net/mlx5: fix Rx objects creator selection

2020-09-13 Thread Michael Baum
There are 2 creators for Rx objects, DevX and Verbs.
There are supported DR versions when a DevX destination TIR flow action
creation cannot be supported, using this versions the TIR object should
be created by Verbs, what forces all the Rx objects to be created by
Verbs.

The selection of the Rx objects creator, wrongly, didn't take into
account the destination TIR action support what caused a failure in the
Rx flows creation.

Select Verbs creator when destination TIR action creation is not
supported by the DR version.

Fixes: 71dee7694a70 ("net/mlx5: separate Rx queue object creations")

Signed-off-by: Michael Baum 
Acked-by: Matan Azrad 
---
 drivers/net/mlx5/linux/mlx5_os.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 41db75e..0511a55 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1272,7 +1272,7 @@
goto error;
}
}
-   if (config->devx && config->dv_flow_en) {
+   if (config->devx && config->dv_flow_en && config->dest_tir) {
priv->obj_ops = devx_obj_ops;
priv->obj_ops.drop_action_create =
ibv_obj_ops.drop_action_create;
-- 
1.8.3.1



[dpdk-dev] [PATCH v1 2/2] net/mlx5: fix using hairpin without dest DevX TIR support

2020-09-13 Thread Michael Baum
The PMD supports hairpin only if DevX is supported and DV flow is enable.

When destination DevX TIR is not supported, the PMD tries to create TIR
action, and fails.

Avoid supporting hairpin when destination DevX TIR is not supported.

Fixes: b6b3bf86bd1a ("net/mlx5: get hairpin capabilities")
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Acked-by: Matan Azrad 
---
 drivers/net/mlx5/mlx5_ethdev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index cefb450..a7924b1 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -569,12 +569,12 @@ struct mlx5_priv *
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 int
-mlx5_hairpin_cap_get(struct rte_eth_dev *dev,
-struct rte_eth_hairpin_cap *cap)
+mlx5_hairpin_cap_get(struct rte_eth_dev *dev, struct rte_eth_hairpin_cap *cap)
 {
struct mlx5_priv *priv = dev->data->dev_private;
+   struct mlx5_dev_config *config = &priv->config;
 
-   if (priv->sh->devx == 0) {
+   if (!priv->sh->devx || !config->dest_tir || !config->dv_flow_en) {
rte_errno = ENOTSUP;
return -rte_errno;
}
-- 
1.8.3.1



[dpdk-dev] [PATCH] net/mlx5: fix vectorized Rx burst check

2020-09-13 Thread Viacheslav Ovsiienko
The Rx queue start/stop feature is not supported if vectorized
rx_burst routine is engaged. There was a routine address typo
and rx_burst type check was wrong.

Fixes: 161d103b231c ("net/mlx5: add queue start and stop")

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_rxq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 0b3e813..1aee6d1 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -562,7 +562,7 @@
 * The routine pointer depends on the process
 * type, should perform check there.
 */
-   if (pkt_burst == mlx5_rx_burst) {
+   if (pkt_burst == mlx5_rx_burst_vec) {
DRV_LOG(ERR, "Rx queue stop is not supported "
"for vectorized Rx");
rte_errno = EINVAL;
-- 
1.8.3.1



[dpdk-dev] [PATCH] common/mlx5: fix PCI location address routine

2020-09-13 Thread Viacheslav Ovsiienko
mlx5 PMDs use the mlx5_dev_to_pci_addr() routine to convert
Infiniband device name to the Bus-Device-Function location
on the PCI bus. The routine returned success even in case of
not found identification string. On caller side it likely
caused the wrong match with the BDF of previous device
resulting in wrong representor and master recognitions.

Fixes: 79aa430721b1 ("common/mlx5: split common file under Linux directory")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c 
b/drivers/common/mlx5/linux/mlx5_common_os.c
index 7bb3ba6..0edd78e 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -39,6 +39,7 @@
 {
FILE *file;
char line[32];
+   int rc = -ENOENT;
MKSTR(path, "%s/device/uevent", dev_path);
 
file = fopen(path, "rb");
@@ -48,16 +49,19 @@
}
while (fgets(line, sizeof(line), file) == line) {
size_t len = strlen(line);
-   int ret;
 
/* Truncate long lines. */
-   if (len == (sizeof(line) - 1))
+   if (len == (sizeof(line) - 1)) {
while (line[(len - 1)] != '\n') {
-   ret = fgetc(file);
+   int ret = fgetc(file);
+
if (ret == EOF)
-   break;
+   goto exit;
line[(len - 1)] = ret;
}
+   /* No match for long lines. */
+   continue;
+   }
/* Extract information. */
if (sscanf(line,
   "PCI_SLOT_NAME="
@@ -66,11 +70,15 @@
   &pci_addr->bus,
   &pci_addr->devid,
   &pci_addr->function) == 4) {
+   rc = 0;
break;
}
}
+exit:
fclose(file);
-   return 0;
+   if (rc)
+   rte_errno = -rc;
+   return rc;
 }
 
 /**
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] eal: add new prefetch0_write variant

2020-09-13 Thread Pavan Nikhilesh Bhagavatula
>This commit adds a new rte_prefetch0_write() variant, suggests to the
>compiler to use a prefetch instruction with intention to write. As a
>compiler builtin, the compiler can choose based on compilation target
>what the best implementation for this instruction is.  

Why not have the other variants too i.e. l2/l3/temporal store prefetches too?


>
>Signed-off-by: Harry van Haaren 
>
>---
>
>The integer constants passed to the builtin are not available as
>a #define value, and doing #defines just for this write variant
>does not seems a nice solution to me... particularly for those using
>IDEs where any #define value is auto-hinted for code-completion.
>
>---
> lib/librte_eal/include/generic/rte_prefetch.h | 16 
> 1 file changed, 16 insertions(+)
>
>diff --git a/lib/librte_eal/include/generic/rte_prefetch.h
>b/lib/librte_eal/include/generic/rte_prefetch.h
>index 6e47bdfbad..44e2e9abfc 100644
>--- a/lib/librte_eal/include/generic/rte_prefetch.h
>+++ b/lib/librte_eal/include/generic/rte_prefetch.h
>@@ -51,4 +51,20 @@ static inline void rte_prefetch2(const volatile
>void *p);
>  */
> static inline void rte_prefetch_non_temporal(const volatile void *p);
>
>+/**
>+ * Prefetch a cache line into all cache levels, with intention to write.
>This
>+ * prefetch variant hints to the CPU that the program is expecting to
>write to
>+ * the cache line being prefetched.
>+ *
>+ * @param p Address to prefetch
>+ */
>+static inline void rte_prefetch0_write(const void *p)
>+{
>+  /* 1 indicates intention to write, 3 sets target cache level to L1.
>See
>+   * GCC docs where these integer constants are described in
>more detail:
>+   *  https://urldefense.proofpoint.com/v2/url?u=https-
>3A__gcc.gnu.org_onlinedocs_gcc_Other-
>2DBuiltins.html&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=1cjuAHrG
>h745jHNmj2fD85sUMIJ2IPIDsIJzo6FN6Z0&m=OMPc_t21vsWlzVl0UjdGB
>hTTv1Gngqfk8vth9UQtUSA&s=51jgfiV2UC5B3xxBE-
>4gRAte3lCZP3v370U7Fpn6LaE&e=
>+   */
>+  __builtin_prefetch(p, 1, 3);
>+}
>+
> #endif /* _RTE_PREFETCH_H_ */
>--
>2.17.1



Re: [dpdk-dev] [PATCH v3] windows/netuio: add Windows NetUIO kernel driver

2020-09-13 Thread Dmitry Kozlyuk
On Wed,  9 Sep 2020 11:41:23 -0700, Narcisa Ana Maria Vasile wrote:
> From: Narcisa Vasile 

Thanks for the hard work! A few comments inline worth checking, but this
version improved much anyway.

> The Windows netuio kernel driver provides the DPDK userspace application
> with direct access to hardware, by mapping the HW registers in userspace
> and allowing read/write operations from/to the device
> configuration space.
> 
> Two IOCTLs are defined by the netuio interface:
>   * IOCTL_NETUIO_MAP_HW_INTO_USERMODE
>   - used for mapping the device registers into userspace
> 
>   Description of output:
>   -   the physical address, virtual address and the size of the
> memory region where the BARs were mapped.
> 
>   * IOCTL_NETUIO_PCI_CONFIG_IO
>   - used to read/write from/into the device configuration space
> 
>   Description of input:
>   -   the operation type (read/write)
>   -   the offset into the device data where the operation begins
>   -   the length of data to read/write.
>   -   in case of a write operation, the data to be written to
> the device configuration space.
>   Description of output:
>   -   in case of a read operation, the output buffer is filled
> with the data read from the configuration space.

Detailed parameter descriptions would better suit netuio_interface.h, where
the IOCTLs are defined. Short ones are useful here, though.

>  windows/netuio/README.rst |  58 +

Nit: "git am" complains about spaces at end of a few lines in this file.

[snip]
> +static NTSTATUS
> +get_pci_device_info(_In_ WDFOBJECT device)
> +{
[snip]
> +// Retrieve the B:D:F details of our device
> +PDEVICE_OBJECT pdo = NULL;
> +pdo = WdfDeviceWdmGetPhysicalDevice(device);
> +if (pdo) {
> +ULONG prop = 0, length = 0;
> +status = IoGetDeviceProperty(pdo, DevicePropertyBusNumber, 
> sizeof(ULONG), (PVOID)&ctx->addr.bus_num, &length);

Status is not checked and is immediately overwritten.

> +status = IoGetDeviceProperty(pdo, DevicePropertyAddress, 
> sizeof(ULONG), (PVOID)&prop, &length);
> +
> +if (NT_SUCCESS(status)) {
> +ctx->addr.func_num = prop & 0x;
> +ctx->addr.dev_num = ((prop >> 16) & 0x);
> +}
> +}
> +
> +return status;
> +}
> +
> +static NTSTATUS
> +create_device_specific_symbolic_link(_In_ WDFOBJECT device)
> +{
> +NTSTATUS status = STATUS_UNSUCCESSFUL;
> +UNICODE_STRING netuio_symbolic_link;
> +
> +PNETUIO_CONTEXT_DATA  ctx;
> +ctx = netuio_get_context_data(device);
> +
> +if (!ctx)
> +return status;

Can "ctx" really be NULL? Same applies to file object context. If I
get WdfObjectGetTypedContext() description right, it cannot, so you could
save a lot of checks.

> +
> +// Build symbolic link name as _BDF  
> (bus/device/func)
> +CHAR  symbolic_link[64] = { 0 };
> +sprintf_s(symbolic_link, sizeof(symbolic_link), "%s_%04d%02d%02d",
> +NETUIO_DEVICE_SYMBOLIC_LINK_ANSI, ctx->addr.bus_num,
> +ctx->addr.dev_num, ctx->addr.func_num);

You probably want to use %x, and not %d for PCI addresses, because decimal
device numbers may not fit 2 digits and it will be hard to recognize. The
other place is DbgPrintEx() below. No strong opinion, however, because Device
Manager seems to use decimal.

> +
> +ANSI_STRING ansi_symbolic_link;
> +RtlInitAnsiString(&ansi_symbolic_link, symbolic_link);
> +
> +status = RtlAnsiStringToUnicodeString(&netuio_symbolic_link, 
> &ansi_symbolic_link, TRUE);
> +if (!NT_SUCCESS(status))
> +return status;
> +
> +status = WdfDeviceCreateSymbolicLink(device, &netuio_symbolic_link);
> +
> +RtlFreeUnicodeString(&netuio_symbolic_link);
> +
> +return status;
> +}

[snip]
> +
> +_Use_decl_annotations_
> +NTSTATUS
> +netuio_map_hw_resources(WDFDEVICE Device, WDFCMRESLIST Resources, 
> WDFCMRESLIST ResourcesTranslated)
> +{

Could you please document the solution here? Something along these lines:

ResourcesTranslated report MMIO BARs in the correct order, but their
indices differ from physical ones. Traverse PCI configuration
manually to maintain physical index, searching for the next MMIO
resource each time.

When I first implemented a netuio-like driver, I fell into this trap with
indices. Worse, I did work sometimes, e.g. with virtio-pci (legacy),that's
why I consider this non-obvious.

[snip]
> +for (INT bar_index = 0; bar_index < PCI_MAX_BAR; bar_index++) {
> +prev_bar = curr_bar;
> +curr_bar = pci_config.u.type0.BaseAddresses[bar_index];
> +if (curr_bar == 0 || (prev_bar & PCI_TYPE_64BIT)) {
> +// Skip this bar
> +ctx->bar[bar_index].base_addr.QuadPart = 0;
> +ctx->bar[bar_index].size = 0;
> +ctx->bar[bar_index].virt_addr = 0;
> +
> +ctx->dpdk_hw[bar_index].mdl = NULL;
> 

Re: [dpdk-dev] [PATCH v2] windows/netuio: add Windows NetUIO kernel driver

2020-09-13 Thread Dmitry Kozlyuk
On Wed, 9 Sep 2020 11:53:53 -0700, Narcisa Ana Maria Vasile wrote:
> On Mon, Aug 24, 2020 at 11:53:44PM +0300, Dmitry Kozlyuk wrote:
> > On Thu, 20 Aug 2020 15:23:55 -0700, Narcisa Ana Maria Vasile wrote:  
[snip]
> > > +ClassName = "Windows UIO"
> > > +DiskName = "DPDK netUIO Installation Disk"
> > > +netuio.DeviceDesc = "netuio Device"
> > > +netuio.SVCDESC = "netuio Service"
> > > +
> > > +// Build symbolic link name as _BDF  
> > > (bus/device/func)
> > > +CHAR  symbolic_link[64] = { 0 };
> > > +sprintf_s(symbolic_link, sizeof(symbolic_link), "%s_%04d%02d%02d",
> > > +NETUIO_DEVICE_SYMBOLIC_LINK_ANSI, 
> > > netuio_contextdata->addr.bus_num,
> > > +netuio_contextdata->addr.dev_num, 
> > > netuio_contextdata->addr.func_num);
> > > +
> > > +ANSI_STRING ansi_symbolic_link;
> > > +RtlInitAnsiString(&ansi_symbolic_link, symbolic_link);
> > > +
> > > +status = RtlAnsiStringToUnicodeString(&netuio_symbolic_link, 
> > > &ansi_symbolic_link, TRUE);
> > > +if (!NT_SUCCESS(status))
> > > +return status;  
> > 
> > Why not use Unicode directly?
> >   
> It looks like either way, a cast will be needed (for example, if I use wchar 
> for symbolic_link and then
> RtlInitUnicodeString()). I've left it as is, but let me know if there's an 
> elegant solution that I didn't see.

It's not that important, really, you can leave the code as-is. FWIW, I meant
something like this:

DECLARE_UNICODE_STRING_SIZE(symbolic_link_name, NETUIO_MAX_SYMLINK_LEN);
RtlUnicodeStringPrintf(
symbolic_link_name,
NETUIO_DEVICE_SYMBOLIC_LINK_UNICODE,
...);

https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntstrsafe/nf-ntstrsafe-rtlunicodestringprintf
 



[dpdk-dev] [PATCH 01/20] ethdev: reset device and interrupt pointers on release

2020-09-13 Thread Thomas Monjalon
The pointers .device and .intr_handle were already reset by the helper
rte_eth_dev_pci_generic_remove().
It is now made part of rte_eth_dev_release_port().

It makes rte_eth_dev_pci_release() meaningless,
so it is replaced with a call to rte_eth_dev_release_port().

Signed-off-by: Thomas Monjalon 
---
 drivers/net/ark/ark_ethdev.c|  2 +-
 drivers/net/octeontx2/otx2_ethdev.c |  2 +-
 drivers/net/szedata2/rte_eth_szedata2.c |  6 +++---
 lib/librte_ethdev/rte_ethdev.c  |  2 ++
 lib/librte_ethdev/rte_ethdev_pci.h  | 14 ++
 5 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index b32ccd8677..0321ec1264 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -105,7 +105,7 @@ eth_ark_pci_probe(struct rte_pci_driver *pci_drv 
__rte_unused,
 
ret = eth_ark_dev_init(eth_dev);
if (ret)
-   rte_eth_dev_pci_release(eth_dev);
+   rte_eth_dev_release_port(eth_dev);
 
return ret;
 }
diff --git a/drivers/net/octeontx2/otx2_ethdev.c 
b/drivers/net/octeontx2/otx2_ethdev.c
index 33b72bd4db..7425ee55be 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -2668,7 +2668,7 @@ nix_remove(struct rte_pci_device *pci_dev)
if (rc)
return rc;
 
-   rte_eth_dev_pci_release(eth_dev);
+   rte_eth_dev_release_port(eth_dev);
}
 
/* Nothing to be done for secondary processes */
diff --git a/drivers/net/szedata2/rte_eth_szedata2.c 
b/drivers/net/szedata2/rte_eth_szedata2.c
index 30c888cd96..a17c53577c 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1802,7 +1802,7 @@ szedata2_eth_dev_release_interval(struct rte_eth_dev 
**eth_devs,
 
for (i = from; i < to; i++) {
rte_szedata2_eth_dev_uninit(eth_devs[i]);
-   rte_eth_dev_pci_release(eth_devs[i]);
+   rte_eth_dev_release_port(eth_devs[i]);
}
 }
 
@@ -1853,7 +1853,7 @@ static int szedata2_eth_pci_probe(struct rte_pci_driver 
*pci_drv __rte_unused,
if (ret != 0) {
PMD_INIT_LOG(ERR, "Failed to init eth_dev for port %u",
i);
-   rte_eth_dev_pci_release(eth_devs[i]);
+   rte_eth_dev_release_port(eth_devs[i]);
szedata2_eth_dev_release_interval(eth_devs, 0, i);
rte_free(list_entry);
return ret;
@@ -1922,7 +1922,7 @@ static int szedata2_eth_pci_remove(struct rte_pci_device 
*pci_dev)
retval = retval ? retval : ret;
}
 
-   rte_eth_dev_pci_release(eth_dev);
+   rte_eth_dev_release_port(eth_dev);
}
 
return retval;
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 7858ad5f11..3617f39ca0 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -555,6 +555,8 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
eth_dev->state = RTE_ETH_DEV_UNUSED;
+   eth_dev->device = NULL;
+   eth_dev->intr_handle = NULL;
 
if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
rte_free(eth_dev->data->rx_queues);
diff --git a/lib/librte_ethdev/rte_ethdev_pci.h 
b/lib/librte_ethdev/rte_ethdev_pci.h
index a999602fdd..114da807ff 100644
--- a/lib/librte_ethdev/rte_ethdev_pci.h
+++ b/lib/librte_ethdev/rte_ethdev_pci.h
@@ -134,16 +134,6 @@ rte_eth_dev_pci_allocate(struct rte_pci_device *dev, 
size_t private_data_size)
return eth_dev;
 }
 
-static inline void
-rte_eth_dev_pci_release(struct rte_eth_dev *eth_dev)
-{
-   eth_dev->device = NULL;
-   eth_dev->intr_handle = NULL;
-
-   /* free ether device */
-   rte_eth_dev_release_port(eth_dev);
-}
-
 typedef int (*eth_dev_pci_callback_t)(struct rte_eth_dev *eth_dev);
 
 /**
@@ -165,7 +155,7 @@ rte_eth_dev_pci_generic_probe(struct rte_pci_device 
*pci_dev,
RTE_FUNC_PTR_OR_ERR_RET(*dev_init, -EINVAL);
ret = dev_init(eth_dev);
if (ret)
-   rte_eth_dev_pci_release(eth_dev);
+   rte_eth_dev_release_port(eth_dev);
else
rte_eth_dev_probing_finish(eth_dev);
 
@@ -194,7 +184,7 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device 
*pci_dev,
return ret;
}
 
-   rte_eth_dev_pci_release(eth_dev);
+   rte_eth_dev_release_port(eth_dev);
return 0;
 }
 
-- 
2.28.0



[dpdk-dev] [PATCH 03/20] net/af_packet: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/af_packet/rte_eth_af_packet.c | 56 ---
 1 file changed, 29 insertions(+), 27 deletions(-)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index 7d0ff1cbb3..3ab6f4d388 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -377,8 +377,32 @@ eth_stats_reset(struct rte_eth_dev *dev)
 }
 
 static int
-eth_dev_close(struct rte_eth_dev *dev __rte_unused)
+eth_dev_close(struct rte_eth_dev *dev)
 {
+   struct pmd_internals *internals;
+   struct tpacket_req *req;
+   unsigned q;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
+   PMD_LOG(INFO, "Closing AF_PACKET ethdev on NUMA socket %u",
+   rte_socket_id());
+
+   internals = dev->data->dev_private;
+   req = &internals->req;
+   for (q = 0; q < internals->nb_queues; q++) {
+   munmap(internals->rx_queue[q].map,
+   2 * req->tp_block_size * req->tp_block_nr);
+   rte_free(internals->rx_queue[q].rd);
+   rte_free(internals->tx_queue[q].rd);
+   }
+   free(internals->if_name);
+   rte_free(internals->rx_queue);
+   rte_free(internals->tx_queue);
+
+   /* mac_addrs must not be freed alone because part of dev_private */
+   dev->data->mac_addrs = NULL;
return 0;
 }
 
@@ -835,6 +859,7 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
data->nb_tx_queues = (uint16_t)nb_queues;
data->dev_link = pmd_link;
data->mac_addrs = &(*internals)->eth_addr;
+   data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
 
(*eth_dev)->dev_ops = &ops;
 
@@ -1033,13 +1058,7 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
 static int
 rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
 {
-   struct rte_eth_dev *eth_dev = NULL;
-   struct pmd_internals *internals;
-   struct tpacket_req *req;
-   unsigned q;
-
-   PMD_LOG(INFO, "Closing AF_PACKET ethdev on numa socket %u",
-   rte_socket_id());
+   struct rte_eth_dev *eth_dev;
 
if (dev == NULL)
return -1;
@@ -1047,26 +1066,9 @@ rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
/* find the ethdev entry */
eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
if (eth_dev == NULL)
-   return -1;
-
-   /* mac_addrs must not be freed alone because part of dev_private */
-   eth_dev->data->mac_addrs = NULL;
-
-   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-   return rte_eth_dev_release_port(eth_dev);
-
-   internals = eth_dev->data->dev_private;
-   req = &internals->req;
-   for (q = 0; q < internals->nb_queues; q++) {
-   munmap(internals->rx_queue[q].map,
-   2 * req->tp_block_size * req->tp_block_nr);
-   rte_free(internals->rx_queue[q].rd);
-   rte_free(internals->tx_queue[q].rd);
-   }
-   free(internals->if_name);
-   rte_free(internals->rx_queue);
-   rte_free(internals->tx_queue);
+   return 0; /* port already released */
 
+   eth_dev_close(eth_dev);
rte_eth_dev_release_port(eth_dev);
 
return 0;
-- 
2.28.0



[dpdk-dev] [PATCH 05/20] net/axgbe: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.
The ".dev_close" callback is also called as part of the ".remove" one.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/axgbe/axgbe_ethdev.c | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index be6f7cbda6..c5e70ea0a1 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -10,7 +10,6 @@
 #include "axgbe_regs.h"
 
 static int eth_axgbe_dev_init(struct rte_eth_dev *eth_dev);
-static int eth_axgbe_dev_uninit(struct rte_eth_dev *eth_dev);
 static int  axgbe_dev_configure(struct rte_eth_dev *dev);
 static int  axgbe_dev_start(struct rte_eth_dev *dev);
 static void axgbe_dev_stop(struct rte_eth_dev *dev);
@@ -378,14 +377,6 @@ axgbe_dev_stop(struct rte_eth_dev *dev)
rte_bit_relaxed_set32(AXGBE_DOWN, &pdata->dev_state);
 }
 
-/* Clear all resources like TX/RX queues. */
-static int
-axgbe_dev_close(struct rte_eth_dev *dev)
-{
-   axgbe_dev_clear_queues(dev);
-   return 0;
-}
-
 static int
 axgbe_dev_promiscuous_enable(struct rte_eth_dev *dev)
 {
@@ -1632,6 +1623,7 @@ eth_axgbe_dev_init(struct rte_eth_dev *eth_dev)
int ret;
 
eth_dev->dev_ops = &axgbe_eth_dev_ops;
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
 
/*
 * For secondary processes, we don't initialise any further as primary
@@ -1794,7 +1786,7 @@ eth_axgbe_dev_init(struct rte_eth_dev *eth_dev)
 }
 
 static int
-eth_axgbe_dev_uninit(struct rte_eth_dev *eth_dev)
+axgbe_dev_close(struct rte_eth_dev *eth_dev)
 {
struct rte_pci_device *pci_dev;
 
@@ -1827,7 +1819,7 @@ static int eth_axgbe_pci_probe(struct rte_pci_driver 
*pci_drv __rte_unused,
 
 static int eth_axgbe_pci_remove(struct rte_pci_device *pci_dev)
 {
-   return rte_eth_dev_pci_generic_remove(pci_dev, eth_axgbe_dev_uninit);
+   return rte_eth_dev_pci_generic_remove(pci_dev, axgbe_dev_close);
 }
 
 static struct rte_pci_driver rte_axgbe_pmd = {
-- 
2.28.0



[dpdk-dev] [PATCH 04/20] net/atlantic: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/atlantic/atl_ethdev.c | 62 ---
 1 file changed, 24 insertions(+), 38 deletions(-)

diff --git a/drivers/net/atlantic/atl_ethdev.c 
b/drivers/net/atlantic/atl_ethdev.c
index 1d76883c52..16721a011f 100644
--- a/drivers/net/atlantic/atl_ethdev.c
+++ b/drivers/net/atlantic/atl_ethdev.c
@@ -15,8 +15,6 @@
 #include "hw_atl/hw_atl_b0_internal.h"
 
 static int eth_atl_dev_init(struct rte_eth_dev *eth_dev);
-static int eth_atl_dev_uninit(struct rte_eth_dev *eth_dev);
-
 static int  atl_dev_configure(struct rte_eth_dev *dev);
 static int  atl_dev_start(struct rte_eth_dev *dev);
 static void atl_dev_stop(struct rte_eth_dev *dev);
@@ -381,6 +379,8 @@ eth_atl_dev_init(struct rte_eth_dev *eth_dev)
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
return 0;
 
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
+
/* Vendor and Device ID need to be set before init of shared code */
hw->device_id = pci_dev->id.device_id;
hw->vendor_id = pci_dev->id.vendor_id;
@@ -441,40 +441,6 @@ eth_atl_dev_init(struct rte_eth_dev *eth_dev)
return err;
 }
 
-static int
-eth_atl_dev_uninit(struct rte_eth_dev *eth_dev)
-{
-   struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
-   struct rte_intr_handle *intr_handle = &pci_dev->intr_handle;
-   struct aq_hw_s *hw;
-
-   PMD_INIT_FUNC_TRACE();
-
-   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-   return -EPERM;
-
-   hw = ATL_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
-
-   if (hw->adapter_stopped == 0)
-   atl_dev_close(eth_dev);
-
-   eth_dev->dev_ops = NULL;
-   eth_dev->rx_pkt_burst = NULL;
-   eth_dev->tx_pkt_burst = NULL;
-
-   /* disable uio intr before callback unregister */
-   rte_intr_disable(intr_handle);
-   rte_intr_callback_unregister(intr_handle,
-atl_dev_interrupt_handler, eth_dev);
-
-   rte_free(eth_dev->data->mac_addrs);
-   eth_dev->data->mac_addrs = NULL;
-
-   pthread_mutex_destroy(&hw->mbox_mutex);
-
-   return 0;
-}
-
 static int
 eth_atl_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
struct rte_pci_device *pci_dev)
@@ -486,7 +452,7 @@ eth_atl_pci_probe(struct rte_pci_driver *pci_drv 
__rte_unused,
 static int
 eth_atl_pci_remove(struct rte_pci_device *pci_dev)
 {
-   return rte_eth_dev_pci_generic_remove(pci_dev, eth_atl_dev_uninit);
+   return rte_eth_dev_pci_generic_remove(pci_dev, atl_dev_close);
 }
 
 static int
@@ -721,12 +687,32 @@ atl_dev_set_link_down(struct rte_eth_dev *dev)
 static int
 atl_dev_close(struct rte_eth_dev *dev)
 {
+   struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+   struct rte_intr_handle *intr_handle = &pci_dev->intr_handle;
+   struct aq_hw_s *hw;
+
PMD_INIT_FUNC_TRACE();
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return -EPERM;
+
+   hw = ATL_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
atl_dev_stop(dev);
 
atl_free_queues(dev);
 
+   dev->dev_ops = NULL;
+   dev->rx_pkt_burst = NULL;
+   dev->tx_pkt_burst = NULL;
+
+   /* disable uio intr before callback unregister */
+   rte_intr_disable(intr_handle);
+   rte_intr_callback_unregister(intr_handle,
+atl_dev_interrupt_handler, dev);
+
+   pthread_mutex_destroy(&hw->mbox_mutex);
+
return 0;
 }
 
@@ -735,7 +721,7 @@ atl_dev_reset(struct rte_eth_dev *dev)
 {
int ret;
 
-   ret = eth_atl_dev_uninit(dev);
+   ret = atl_dev_close(dev);
if (ret)
return ret;
 
-- 
2.28.0



[dpdk-dev] [PATCH 00/20] cleanup ethdev close operation

2020-09-13 Thread Thomas Monjalon
This is the end of a process started two years ago,
to have a close which reliably releases an ethdev port
without the need of removing the device (which can have more ports).

Unfortunately, some drivers might be broken because did not follow
the migration recommendations. We cannot wait more,
this should be merged before the 20.11-rc1 release.


Steve Yang (1):
  net/iavf: release port upon close

Thomas Monjalon (18):
  ethdev: reset device and interrupt pointers on release
  ethdev: allow drivers to return error on close
  net/af_packet: release port upon close
  net/atlantic: release port upon close
  net/axgbe: release port upon close
  net/bonding: release port upon close
  net/failsafe: release port upon close
  net/mlx4: release port upon close
  net/null: release port upon close
  net/octeontx: release port upon close
  net/pcap: release port upon close
  net/ring: release port upon close
  net/softnic: release port upon close
  ethdev: remove old close behaviour
  drivers/net: accept removing device without any port
  drivers/net: remove redundant MAC addresses freeing
  app/testpmd: reset port status on close notification
  app/testpmd: align behaviour of multi-port detach

Yunjian Wang (1):
  net/tap: release port upon close

 MAINTAINERS   | 22 
 app/test-pmd/config.c |  7 ++-
 app/test-pmd/testpmd.c| 56 +++
 app/test/virtual_pmd.c|  6 ++-
 doc/guides/rel_notes/deprecation.rst  |  6 ---
 drivers/net/af_packet/rte_eth_af_packet.c | 58 ++--
 drivers/net/af_xdp/rte_eth_af_xdp.c   |  6 +--
 drivers/net/ark/ark_ethdev.c  | 11 ++--
 drivers/net/atlantic/atl_ethdev.c | 66 +--
 drivers/net/avp/avp_ethdev.c  |  7 ++-
 drivers/net/axgbe/axgbe_ethdev.c  | 14 ++---
 drivers/net/bnx2x/bnx2x_ethdev.c  |  4 +-
 drivers/net/bnxt/bnxt_ethdev.c|  9 ++--
 drivers/net/bnxt/bnxt_reps.c  |  7 +--
 drivers/net/bnxt/bnxt_reps.h  |  2 +-
 drivers/net/bonding/eth_bond_private.h|  2 +-
 drivers/net/bonding/rte_eth_bond_pmd.c| 37 +++--
 drivers/net/cxgbe/cxgbe_ethdev.c  |  6 ++-
 drivers/net/cxgbe/cxgbe_pfvf.h|  2 +-
 drivers/net/dpaa/dpaa_ethdev.c|  4 +-
 drivers/net/dpaa2/dpaa2_ethdev.c  |  6 ++-
 drivers/net/e1000/em_ethdev.c | 11 ++--
 drivers/net/e1000/igb_ethdev.c| 22 +++-
 drivers/net/ena/ena_ethdev.c  | 12 ++---
 drivers/net/enetc/enetc_ethdev.c  |  4 +-
 drivers/net/enic/enic_ethdev.c|  6 +--
 drivers/net/failsafe/failsafe.c   | 24 ++---
 drivers/net/failsafe/failsafe_ops.c   | 60 +
 drivers/net/failsafe/failsafe_private.h   |  1 +
 drivers/net/fm10k/fm10k_ethdev.c  |  9 ++--
 drivers/net/hinic/hinic_pmd_ethdev.c  | 15 ++
 drivers/net/hns3/hns3_ethdev.c| 11 ++--
 drivers/net/hns3/hns3_ethdev_vf.c | 11 ++--
 drivers/net/i40e/i40e_ethdev.c| 10 ++--
 drivers/net/i40e/i40e_ethdev_vf.c | 10 ++--
 drivers/net/iavf/iavf_ethdev.c| 44 ---
 drivers/net/ice/ice_dcf_ethdev.c  |  8 +--
 drivers/net/ice/ice_ethdev.c  | 14 ++---
 drivers/net/igc/igc_ethdev.c  | 11 ++--
 drivers/net/ionic/ionic_ethdev.c  | 10 ++--
 drivers/net/ipn3ke/ipn3ke_ethdev.c|  6 +--
 drivers/net/ipn3ke/ipn3ke_representor.c   |  4 +-
 drivers/net/ixgbe/ixgbe_ethdev.c  | 21 +++-
 drivers/net/kni/rte_eth_kni.c | 22 
 drivers/net/liquidio/lio_ethdev.c |  4 +-
 drivers/net/memif/rte_eth_memif.c |  7 ++-
 drivers/net/mlx4/mlx4.c   |  5 +-
 drivers/net/mlx5/linux/mlx5_os.c  |  2 -
 drivers/net/mlx5/mlx5.c   |  9 ++--
 drivers/net/mlx5/mlx5.h   |  2 +-
 drivers/net/mvneta/mvneta_ethdev.c|  7 ++-
 drivers/net/mvpp2/mrvl_ethdev.c   |  7 ++-
 drivers/net/netvsc/hn_ethdev.c|  9 ++--
 drivers/net/nfb/nfb_ethdev.c  |  8 +--
 drivers/net/nfp/nfp_net.c |  8 ++-
 drivers/net/null/rte_eth_null.c   | 23 +---
 drivers/net/octeontx/octeontx_ethdev.c| 14 ++---
 drivers/net/octeontx2/otx2_ethdev.c   |  8 +--
 drivers/net/pcap/rte_eth_pcap.c   | 31 ++-
 drivers/net/pfe/pfe_ethdev.c  | 14 ++---
 drivers/net/qede/qede_ethdev.c|  4 +-
 drivers/net/ring/rte_eth_ring.c   | 50 ++---
 drivers/net/sfc/sfc_ethdev.c  |  8 +--
 drivers/net/softnic/rte_eth_softnic.c | 64 +++---
 drivers/net/szedata2/rte_eth_szedata2.c   | 20 +++
 drivers/net/tap/rte_eth_tap.c | 48 ++---
 drivers/net/thunderx/nicvf_ethdev.c   |  4 +-
 drivers/net/vhost/rte_eth_vhost.c |  8 +--
 drivers/net

[dpdk-dev] [PATCH 07/20] net/failsafe: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/failsafe/failsafe.c | 25 ++
 drivers/net/failsafe/failsafe_ops.c | 61 +++--
 drivers/net/failsafe/failsafe_private.h |  1 +
 3 files changed, 42 insertions(+), 45 deletions(-)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 4a4b7ceab6..44d47e8f72 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -60,12 +60,6 @@ fs_sub_device_alloc(struct rte_eth_dev *dev,
return 0;
 }
 
-static void
-fs_sub_device_free(struct rte_eth_dev *dev)
-{
-   rte_free(PRIV(dev)->subs);
-}
-
 static void fs_hotplug_alarm(void *arg);
 
 int
@@ -186,6 +180,7 @@ fs_eth_dev_create(struct rte_vdev_device *vdev)
ERROR("Unable to allocate rte_eth_dev");
return -1;
}
+   dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
priv = PRIV(dev);
priv->data = dev->data;
priv->rxp = FS_RX_PROXY_INIT;
@@ -285,7 +280,7 @@ fs_eth_dev_create(struct rte_vdev_device *vdev)
 free_args:
failsafe_args_free(dev);
 free_subs:
-   fs_sub_device_free(dev);
+   rte_free(PRIV(dev)->subs);
 free_dev:
/* mac_addrs must not be freed alone because part of dev_private */
dev->data->mac_addrs = NULL;
@@ -301,20 +296,8 @@ fs_rte_eth_free(const char *name)
 
dev = rte_eth_dev_allocated(name);
if (dev == NULL)
-   return -ENODEV;
-   rte_eth_dev_callback_unregister(RTE_ETH_ALL, RTE_ETH_EVENT_NEW,
-   failsafe_eth_new_event_callback, dev);
-   ret = failsafe_eal_uninit(dev);
-   if (ret)
-   ERROR("Error while uninitializing sub-EAL");
-   failsafe_args_free(dev);
-   fs_sub_device_free(dev);
-   ret = pthread_mutex_destroy(&PRIV(dev)->hotplug_mutex);
-   if (ret)
-   ERROR("Error while destroying hotplug mutex");
-   rte_free(PRIV(dev)->mcast_addrs);
-   /* mac_addrs must not be freed alone because part of dev_private */
-   dev->data->mac_addrs = NULL;
+   return 0; /* port already released */
+   ret = failsafe_eth_dev_close(dev);
rte_eth_dev_release_port(dev);
return ret;
 }
diff --git a/drivers/net/failsafe/failsafe_ops.c 
b/drivers/net/failsafe/failsafe_ops.c
index 93ebd09114..fc8c11b28c 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -239,29 +239,6 @@ fs_dev_set_link_down(struct rte_eth_dev *dev)
return 0;
 }
 
-static void fs_dev_free_queues(struct rte_eth_dev *dev);
-static int
-fs_dev_close(struct rte_eth_dev *dev)
-{
-   struct sub_device *sdev;
-   uint8_t i;
-
-   fs_lock(dev, 0);
-   failsafe_hotplug_alarm_cancel(dev);
-   if (PRIV(dev)->state == DEV_STARTED)
-   dev->dev_ops->dev_stop(dev);
-   PRIV(dev)->state = DEV_ACTIVE - 1;
-   FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
-   DEBUG("Closing sub_device %d", i);
-   failsafe_eth_dev_unregister_callbacks(sdev);
-   rte_eth_dev_close(PORT_ID(sdev));
-   sdev->state = DEV_ACTIVE - 1;
-   }
-   fs_dev_free_queues(dev);
-   fs_unlock(dev, 0);
-   return 0;
-}
-
 static int
 fs_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 {
@@ -656,6 +633,42 @@ fs_dev_free_queues(struct rte_eth_dev *dev)
dev->data->nb_tx_queues = 0;
 }
 
+int
+failsafe_eth_dev_close(struct rte_eth_dev *dev)
+{
+   struct sub_device *sdev;
+   uint8_t i;
+   int ret;
+
+   fs_lock(dev, 0);
+   failsafe_hotplug_alarm_cancel(dev);
+   if (PRIV(dev)->state == DEV_STARTED)
+   dev->dev_ops->dev_stop(dev);
+   PRIV(dev)->state = DEV_ACTIVE - 1;
+   FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+   DEBUG("Closing sub_device %d", i);
+   failsafe_eth_dev_unregister_callbacks(sdev);
+   rte_eth_dev_close(PORT_ID(sdev));
+   sdev->state = DEV_ACTIVE - 1;
+   }
+   fs_dev_free_queues(dev);
+   rte_eth_dev_callback_unregister(RTE_ETH_ALL, RTE_ETH_EVENT_NEW,
+   failsafe_eth_new_event_callback, dev);
+   ret = failsafe_eal_uninit(dev);
+   if (ret)
+   ERROR("Error while uninitializing sub-EAL");
+   failsafe_args_free(dev);
+   rte_free(PRIV(dev)->subs);
+   ret = pthread_mutex_destroy(&PRIV(dev)->hotplug_mutex);
+   if (ret)
+   ERROR("Error while destroying hotplug mutex");
+   rte_free(PRIV(dev)->mcast_addrs);
+   /* mac_addrs must not be freed alone because part of dev_private */
+   dev->data->mac_addrs = NULL;
+   fs_unlock(dev, 0);
+

[dpdk-dev] [PATCH 06/20] net/bonding: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 34 --
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index e1123deb83..726b47055b 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -2103,6 +2103,9 @@ bond_ethdev_close(struct rte_eth_dev *dev)
int skipped = 0;
struct rte_flow_error ferror;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
RTE_BOND_LOG(INFO, "Closing bonded device %s", dev->device->name);
while (internals->slave_count != skipped) {
uint16_t port_id = internals->slaves[skipped].port_id;
@@ -2119,6 +2122,17 @@ bond_ethdev_close(struct rte_eth_dev *dev)
bond_flow_ops.flush(dev, &ferror);
bond_ethdev_free_queues(dev);
rte_bitmap_reset(internals->vlan_filter_bmp);
+   rte_bitmap_free(internals->vlan_filter_bmp);
+   rte_free(internals->vlan_filter_bmpmem);
+
+   /* Try to release mempool used in mode6. If the bond
+* device is not mode6, free the NULL is not problem.
+*/
+   rte_mempool_free(internals->mode6.mempool);
+
+   dev->dev_ops = NULL;
+   dev->rx_pkt_burst = NULL;
+   dev->tx_pkt_burst = NULL;
 
return 0;
 }
@@ -3195,6 +3209,7 @@ bond_alloc(struct rte_vdev_device *dev, uint8_t mode)
}
 
internals = eth_dev->data->dev_private;
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
eth_dev->data->nb_rx_queues = (uint16_t)1;
eth_dev->data->nb_tx_queues = (uint16_t)1;
 
@@ -3414,14 +3429,10 @@ bond_remove(struct rte_vdev_device *dev)
name = rte_vdev_device_name(dev);
RTE_BOND_LOG(INFO, "Uninitializing pmd_bond for %s", name);
 
-   /* now free all data allocation - for eth_dev structure,
-* dummy pci driver and internal (private) data
-*/
-
/* find an ethdev entry */
eth_dev = rte_eth_dev_allocated(name);
if (eth_dev == NULL)
-   return -ENODEV;
+   return 0; /* port already released */
 
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
return rte_eth_dev_release_port(eth_dev);
@@ -3436,19 +3447,6 @@ bond_remove(struct rte_vdev_device *dev)
bond_ethdev_stop(eth_dev);
bond_ethdev_close(eth_dev);
}
-
-   eth_dev->dev_ops = NULL;
-   eth_dev->rx_pkt_burst = NULL;
-   eth_dev->tx_pkt_burst = NULL;
-
-   internals = eth_dev->data->dev_private;
-   /* Try to release mempool used in mode6. If the bond
-* device is not mode6, free the NULL is not problem.
-*/
-   rte_mempool_free(internals->mode6.mempool);
-   rte_bitmap_free(internals->vlan_filter_bmp);
-   rte_free(internals->vlan_filter_bmpmem);
-
rte_eth_dev_release_port(eth_dev);
 
return 0;
-- 
2.28.0



[dpdk-dev] [PATCH 08/20] net/iavf: release port upon close

2020-09-13 Thread Thomas Monjalon
From: Steve Yang 

The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Steve Yang 
---
[Thomas] Note: freeing in secondary process is inconsistent
between .dev_close and .remove operations.
---
 drivers/net/iavf/iavf_ethdev.c | 46 --
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 4860a94d4c..f7552f33db 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -1362,6 +1362,11 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
adapter->eth_dev = eth_dev;
adapter->stopped = 1;
 
+   /* Pass the information to the rte_eth_dev_close() that it should also
+* release the private port resources.
+*/
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
+
if (iavf_init_vf(eth_dev) != 0) {
PMD_INIT_LOG(ERR, "Init vf failed");
return -1;
@@ -1416,6 +1421,7 @@ iavf_dev_close(struct rte_eth_dev *dev)
struct rte_intr_handle *intr_handle = &pci_dev->intr_handle;
struct iavf_adapter *adapter =
IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+   struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
 
iavf_dev_stop(dev);
iavf_flow_flush(dev, NULL);
@@ -1428,21 +1434,21 @@ iavf_dev_close(struct rte_eth_dev *dev)
rte_intr_callback_unregister(intr_handle,
 iavf_dev_interrupt_handler, dev);
iavf_disable_irq0(hw);
-   return 0;
-}
-
-static int
-iavf_dev_uninit(struct rte_eth_dev *dev)
-{
-   struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
-
-   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-   return -EPERM;
 
dev->dev_ops = NULL;
dev->rx_pkt_burst = NULL;
dev->tx_pkt_burst = NULL;
-   iavf_dev_close(dev);
+
+   if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_RSS_PF) {
+   if (vf->rss_lut) {
+   rte_free(vf->rss_lut);
+   vf->rss_lut = NULL;
+   }
+   if (vf->rss_key) {
+   rte_free(vf->rss_key);
+   vf->rss_key = NULL;
+   }
+   }
 
rte_free(vf->vf_res);
vf->vsi_res = NULL;
@@ -1451,14 +1457,16 @@ iavf_dev_uninit(struct rte_eth_dev *dev)
rte_free(vf->aq_resp);
vf->aq_resp = NULL;
 
-   if (vf->rss_lut) {
-   rte_free(vf->rss_lut);
-   vf->rss_lut = NULL;
-   }
-   if (vf->rss_key) {
-   rte_free(vf->rss_key);
-   vf->rss_key = NULL;
-   }
+   return 0;
+}
+
+static int
+iavf_dev_uninit(struct rte_eth_dev *dev)
+{
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return -EPERM;
+
+   iavf_dev_close(dev);
 
return 0;
 }
-- 
2.28.0



[dpdk-dev] [PATCH 11/20] net/octeontx: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

The callback ".dev_close(port)" is called also
from the ".remove(device)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/octeontx/octeontx_ethdev.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/net/octeontx/octeontx_ethdev.c 
b/drivers/net/octeontx/octeontx_ethdev.c
index 946844ca0b..6713c51504 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -509,10 +509,6 @@ octeontx_dev_close(struct rte_eth_dev *dev)
rte_free(txq);
}
 
-   /* Free MAC address table */
-   rte_free(dev->data->mac_addrs);
-   dev->data->mac_addrs = NULL;
-
octeontx_port_close(nic);
 
dev->tx_pkt_burst = NULL;
@@ -1379,6 +1375,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, 
uint8_t evdev,
data->promiscuous = 0;
data->all_multicast = 0;
data->scattered_rx = 0;
+   data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
 
/* Get maximum number of supported MAC entries */
max_entries = octeontx_bgx_port_mac_entries_get(nic->port_id);
@@ -1466,10 +1463,9 @@ octeontx_remove(struct rte_vdev_device *dev)
for (i = 0; i < OCTEONTX_VDEV_DEFAULT_MAX_NR_PORT; i++) {
sprintf(octtx_name, "eth_octeontx_%d", i);
 
-   /* reserve an ethdev entry */
eth_dev = rte_eth_dev_allocated(octtx_name);
if (eth_dev == NULL)
-   return -ENODEV;
+   continue; /* port already released */
 
if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
rte_eth_dev_release_port(eth_dev);
@@ -1479,9 +1475,8 @@ octeontx_remove(struct rte_vdev_device *dev)
nic = octeontx_pmd_priv(eth_dev);
rte_event_dev_stop(nic->evdev);
PMD_INIT_LOG(INFO, "Closing octeontx device %s", octtx_name);
-
+   octeontx_dev_close(eth_dev);
rte_eth_dev_release_port(eth_dev);
-   rte_event_dev_close(nic->evdev);
}
 
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-- 
2.28.0



[dpdk-dev] [PATCH 09/20] net/mlx4: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Signed-off-by: Thomas Monjalon 
---
 drivers/net/mlx4/mlx4.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index df59314b66..ad7c805d67 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -400,6 +400,8 @@ mlx4_dev_close(struct rte_eth_dev *dev)
MLX4_ASSERT(priv->ctx == NULL);
mlx4_intr_uninstall(priv);
memset(priv, 0, sizeof(*priv));
+   /* mac_addrs must not be freed because part of dev_private */
+   dev->data->mac_addrs = NULL;
return 0;
 }
 
@@ -1025,6 +1027,7 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
ERROR("can not allocate rte ethdev");
goto port_error;
}
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
eth_dev->data->dev_private = priv;
eth_dev->data->mac_addrs = priv->mac;
eth_dev->device = &pci_dev->device;
-- 
2.28.0



[dpdk-dev] [PATCH 12/20] net/pcap: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/pcap/rte_eth_pcap.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 76e704a65a..a946fa9a1a 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -734,6 +734,12 @@ eth_dev_close(struct rte_eth_dev *dev)
unsigned int i;
struct pmd_internals *internals = dev->data->dev_private;
 
+   if (internals == NULL)
+   return 0;
+
+   PMD_LOG(INFO, "Closing pcap ethdev on NUMA socket %d",
+   rte_socket_id());
+
/* Device wide flag, but cleanup must be performed per queue. */
if (internals->infinite_rx) {
for (i = 0; i < dev->data->nb_rx_queues; i++) {
@@ -748,6 +754,12 @@ eth_dev_close(struct rte_eth_dev *dev)
}
}
 
+   rte_free(dev->process_private);
+
+   if (internals->phy_mac == 0)
+   /* not dynamically allocated, must not be freed */
+   dev->data->mac_addrs = NULL;
+
return 0;
 }
 
@@ -1322,6 +1334,7 @@ eth_from_pcaps(struct rte_vdev_device *vdev,
else
eth_dev->tx_pkt_burst = eth_tx_drop;
 
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
rte_eth_dev_probing_finish(eth_dev);
return 0;
 }
@@ -1544,30 +1557,16 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
 static int
 pmd_pcap_remove(struct rte_vdev_device *dev)
 {
-   struct pmd_internals *internals = NULL;
struct rte_eth_dev *eth_dev = NULL;
 
-   PMD_LOG(INFO, "Closing pcap ethdev on numa socket %d",
-   rte_socket_id());
-
if (!dev)
return -1;
 
-   /* reserve an ethdev entry */
eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
if (eth_dev == NULL)
-   return -1;
-
-   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-   internals = eth_dev->data->dev_private;
-   if (internals != NULL && internals->phy_mac == 0)
-   /* not dynamically allocated, must not be freed */
-   eth_dev->data->mac_addrs = NULL;
-   }
+   return 0; /* port already released */
 
eth_dev_close(eth_dev);
-
-   rte_free(eth_dev->process_private);
rte_eth_dev_release_port(eth_dev);
 
return 0;
-- 
2.28.0



[dpdk-dev] [PATCH 10/20] net/null: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Signed-off-by: Thomas Monjalon 
---
 drivers/net/null/rte_eth_null.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 0ce073fa4b..33997013e4 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -458,7 +458,20 @@ eth_mac_address_set(__rte_unused struct rte_eth_dev *dev,
return 0;
 }
 
+static int
+eth_dev_close(struct rte_eth_dev *dev)
+{
+   PMD_LOG(INFO, "Closing null ethdev on NUMA socket %u",
+   rte_socket_id());
+
+   /* mac_addrs must not be freed alone because part of dev_private */
+   dev->data->mac_addrs = NULL;
+
+   return 0;
+}
+
 static const struct eth_dev_ops ops = {
+   .dev_close = eth_dev_close,
.dev_start = eth_dev_start,
.dev_stop = eth_dev_stop,
.dev_configure = eth_dev_configure,
@@ -532,6 +545,7 @@ eth_dev_null_create(struct rte_vdev_device *dev, struct 
pmd_options *args)
data->mac_addrs = &internals->eth_addr;
data->promiscuous = 1;
data->all_multicast = 1;
+   data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
 
eth_dev->dev_ops = &ops;
 
@@ -701,18 +715,12 @@ rte_pmd_null_remove(struct rte_vdev_device *dev)
if (!dev)
return -EINVAL;
 
-   PMD_LOG(INFO, "Closing null ethdev on numa socket %u",
-   rte_socket_id());
-
/* find the ethdev entry */
eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
if (eth_dev == NULL)
-   return -1;
-
-   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-   /* mac_addrs must not be freed alone because part of 
dev_private */
-   eth_dev->data->mac_addrs = NULL;
+   return 0; /* port already released */
 
+   eth_dev_close(eth_dev);
rte_eth_dev_release_port(eth_dev);
 
return 0;
-- 
2.28.0



[dpdk-dev] [PATCH 19/20] app/testpmd: reset port status on close notification

2020-09-13 Thread Thomas Monjalon
Since rte_eth_dev_release_port() is called on all port close operations,
the event RTE_ETH_EVENT_DESTROY can be reliably used for resetting
the port status on the application side.

The intermediate state RTE_PORT_HANDLING is removed in close_port()
because a port can also be closed by a PMD in a device remove operation.

In case multiple ports are closed, calling remove_invalid_ports()
only once is enough.

Signed-off-by: Thomas Monjalon 
---
 app/test-pmd/testpmd.c | 34 --
 1 file changed, 8 insertions(+), 26 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 7842c3b781..31dc97239b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2694,23 +2694,12 @@ close_port(portid_t pid)
continue;
}
 
-   if (rte_atomic16_cmpset(&(port->port_status),
-   RTE_PORT_STOPPED, RTE_PORT_HANDLING) == 0) {
-   printf("Port %d is now not stopped\n", pi);
-   continue;
-   }
-
if (port->flow_list)
port_flow_flush(pi);
rte_eth_dev_close(pi);
-
-   remove_invalid_ports();
-
-   if (rte_atomic16_cmpset(&(port->port_status),
-   RTE_PORT_HANDLING, RTE_PORT_CLOSED) == 0)
-   printf("Port %d cannot be set to closed\n", pi);
}
 
+   remove_invalid_ports();
printf("Done\n");
 }
 
@@ -2841,12 +2830,7 @@ detach_device(struct rte_device *dev)
return;
}
RTE_ETH_FOREACH_DEV_OF(sibling, dev) {
-   /* reset mapping between old ports and removed device */
-   rte_eth_devices[sibling].device = NULL;
if (ports[sibling].port_status != RTE_PORT_CLOSED) {
-   /* sibling ports are forced to be closed */
-   ports[sibling].port_status = RTE_PORT_CLOSED;
-   printf("Port %u is closed\n", sibling);
}
}
 
@@ -2902,11 +2886,8 @@ detach_devargs(char *identifier)
return;
}
 
-   /* sibling ports are forced to be closed */
if (ports[port_id].flow_list)
port_flow_flush(port_id);
-   ports[port_id].port_status = RTE_PORT_CLOSED;
-   printf("Port %u is now closed\n", port_id);
}
}
 
@@ -3055,12 +3036,6 @@ check_all_ports_link_status(uint32_t port_mask)
}
 }
 
-/*
- * This callback is for remove a port for a device. It has limitation because
- * it is not for multiple port removal for a device.
- * TODO: the device detach invoke will plan to be removed from user side to
- * eal. And convert all PMDs to free port resources on ether device closing.
- */
 static void
 rmv_port_callback(void *arg)
 {
@@ -3118,6 +3093,13 @@ eth_event_callback(portid_t port_id, enum 
rte_eth_event_type type, void *param,
rmv_port_callback, (void *)(intptr_t)port_id))
fprintf(stderr, "Could not set up deferred device 
removal\n");
break;
+   case RTE_ETH_EVENT_DESTROY:
+   if (rte_atomic16_cmpset(&(ports[port_id].port_status),
+   RTE_PORT_STOPPED,
+   RTE_PORT_CLOSED) == 0)
+   printf("Port %d cannot be set to closed\n", port_id);
+   printf("Port %u is closed\n", port_id);
+   break;
default:
break;
}
-- 
2.28.0



[dpdk-dev] [PATCH 15/20] net/tap: release port upon close

2020-09-13 Thread Thomas Monjalon
From: Yunjian Wang 

The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Yunjian Wang 
---
[Thomas] Note: freeing in secondary process is inconsistent
between .dev_close and .remove operations.
---
 drivers/net/tap/rte_eth_tap.c | 46 ---
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 6bce90c531..2e325f55aa 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -72,6 +72,10 @@
 
 static int tap_devices_count;
 
+static const char *tuntap_types[ETH_TUNTAP_TYPE_MAX] = {
+   "UNKNOWN", "TUN", "TAP"
+};
+
 static const char *valid_arguments[] = {
ETH_TAP_IFACE_ARG,
ETH_TAP_REMOTE_ARG,
@@ -1040,6 +1044,9 @@ tap_dev_close(struct rte_eth_dev *dev)
struct pmd_process_private *process_private = dev->process_private;
struct rx_queue *rxq;
 
+   if (process_private == NULL)
+   return 0;
+
tap_link_set_down(dev);
if (internals->nlsk_fd != -1) {
tap_flow_flush(dev, NULL);
@@ -1079,6 +1086,23 @@ tap_dev_close(struct rte_eth_dev *dev)
 * it will be removed from kernel
 */
 
+   /* mac_addrs must not be freed alone because part of dev_private */
+   dev->data->mac_addrs = NULL;
+
+   internals = dev->data->dev_private;
+   TAP_LOG(DEBUG, "Closing %s Ethernet device on NUMA %u",
+   tuntap_types[internals->type], rte_socket_id());
+
+   if (internals->ioctl_sock != -1) {
+   close(internals->ioctl_sock);
+   internals->ioctl_sock = -1;
+   }
+   rte_free(dev->process_private);
+   dev->process_private = NULL;
+   if (tap_devices_count == 1)
+   rte_mp_action_unregister(TAP_MP_KEY);
+   tap_devices_count--;
+
return 0;
 }
 
@@ -1802,10 +1826,6 @@ static const struct eth_dev_ops ops = {
.filter_ctrl= tap_dev_filter_ctrl,
 };
 
-static const char *tuntap_types[ETH_TUNTAP_TYPE_MAX] = {
-   "UNKNOWN", "TUN", "TAP"
-};
-
 static int
 eth_dev_tap_create(struct rte_vdev_device *vdev, const char *tap_name,
   char *remote_iface, struct rte_ether_addr *mac_addr,
@@ -1856,7 +1876,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, const 
char *tap_name,
/* Setup some default values */
data = dev->data;
data->dev_private = pmd;
-   data->dev_flags = RTE_ETH_DEV_INTR_LSC;
+   data->dev_flags = RTE_ETH_DEV_INTR_LSC | RTE_ETH_DEV_CLOSE_REMOVE;
data->numa_node = numa_node;
 
data->dev_link = pmd_link;
@@ -2448,30 +2468,16 @@ static int
 rte_pmd_tap_remove(struct rte_vdev_device *dev)
 {
struct rte_eth_dev *eth_dev = NULL;
-   struct pmd_internals *internals;
 
/* find the ethdev entry */
eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
if (!eth_dev)
-   return -ENODEV;
-
-   /* mac_addrs must not be freed alone because part of dev_private */
-   eth_dev->data->mac_addrs = NULL;
+   return 0;
 
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
return rte_eth_dev_release_port(eth_dev);
 
tap_dev_close(eth_dev);
-
-   internals = eth_dev->data->dev_private;
-   TAP_LOG(DEBUG, "Closing %s Ethernet device on numa %u",
-   tuntap_types[internals->type], rte_socket_id());
-
-   close(internals->ioctl_sock);
-   rte_free(eth_dev->process_private);
-   if (tap_devices_count == 1)
-   rte_mp_action_unregister(TAP_MP_KEY);
-   tap_devices_count--;
rte_eth_dev_release_port(eth_dev);
 
return 0;
-- 
2.28.0



[dpdk-dev] [PATCH 13/20] net/ring: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/ring/rte_eth_ring.c | 51 -
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index 733c898259..afe3ec46cc 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -226,7 +226,35 @@ static int
 eth_link_update(struct rte_eth_dev *dev __rte_unused,
int wait_to_complete __rte_unused) { return 0; }
 
+static int
+eth_dev_close(struct rte_eth_dev *dev)
+{
+   struct pmd_internals *internals = NULL;
+   struct ring_queue *r = NULL;
+   uint16_t i;
+
+   eth_dev_stop(dev);
+
+   internals = dev->data->dev_private;
+   if (internals->action == DEV_CREATE) {
+   /*
+* it is only necessary to delete the rings in rx_queues because
+* they are the same used in tx_queues
+*/
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   r = dev->data->rx_queues[i];
+   rte_ring_free(r->rng);
+   }
+   }
+
+   /* mac_addrs must not be freed alone because part of dev_private */
+   dev->data->mac_addrs = NULL;
+
+   return 0;
+}
+
 static const struct eth_dev_ops ops = {
+   .dev_close = eth_dev_close,
.dev_start = eth_dev_start,
.dev_stop = eth_dev_stop,
.dev_set_link_up = eth_dev_set_link_up,
@@ -328,6 +356,7 @@ do_eth_dev_ring_create(const char *name,
eth_dev->dev_ops = &ops;
data->kdrv = RTE_KDRV_NONE;
data->numa_node = numa_node;
+   data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
 
/* finally assign rx and tx ops */
eth_dev->rx_pkt_burst = eth_ring_rx;
@@ -659,9 +688,6 @@ rte_pmd_ring_remove(struct rte_vdev_device *dev)
 {
const char *name = rte_vdev_device_name(dev);
struct rte_eth_dev *eth_dev = NULL;
-   struct pmd_internals *internals = NULL;
-   struct ring_queue *r = NULL;
-   uint16_t i;
 
PMD_LOG(INFO, "Un-Initializing pmd_ring for %s", name);
 
@@ -671,24 +697,9 @@ rte_pmd_ring_remove(struct rte_vdev_device *dev)
/* find an ethdev entry */
eth_dev = rte_eth_dev_allocated(name);
if (eth_dev == NULL)
-   return -ENODEV;
+   return 0; /* port already released */
 
-   eth_dev_stop(eth_dev);
-
-   internals = eth_dev->data->dev_private;
-   if (internals->action == DEV_CREATE) {
-   /*
-* it is only necessary to delete the rings in rx_queues because
-* they are the same used in tx_queues
-*/
-   for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
-   r = eth_dev->data->rx_queues[i];
-   rte_ring_free(r->rng);
-   }
-   }
-
-   /* mac_addrs must not be freed alone because part of dev_private */
-   eth_dev->data->mac_addrs = NULL;
+   eth_dev_close(eth_dev);
rte_eth_dev_release_port(eth_dev);
return 0;
 }
-- 
2.28.0



[dpdk-dev] [PATCH 20/20] app/testpmd: align behaviour of multi-port detach

2020-09-13 Thread Thomas Monjalon
A port can be closed in multiple situations:
- close command calling close_port() -> rte_eth_dev_close()
- exit calling close_port() -> rte_eth_dev_close()
- hotplug calling close_port() -> rte_eth_dev_close()
- hotplug calling detach_device() -> rte_dev_remove()
- port detach command, detach_device() -> rte_dev_remove()
- device detach command, detach_devargs() -> rte_eal_hotplug_remove()

The flow rules are flushed before each close.
It was already done in close_port(), detach_devargs() and
detach_port_device() which calls detach_device(),
but not in detach_device(). As a consequence, it was missing for siblings
of port detach command and unplugged device.
The check before calling port_flow_flush() is moved inside the function.

The state of the port to close is checked to be stopped.
As above, this check was missing in detach_device(),
impacting the cases of a multi-port device unplugged or detached
with the port detach command.

Signed-off-by: Thomas Monjalon 
---
 app/test-pmd/config.c  |  7 +--
 app/test-pmd/testpmd.c | 22 +++---
 2 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 30bee33248..21513a2978 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1588,9 +1588,12 @@ int
 port_flow_flush(portid_t port_id)
 {
struct rte_flow_error error;
-   struct rte_port *port;
+   struct rte_port *port = &ports[port_id];
int ret = 0;
 
+   if (port->flow_list == NULL)
+   return ret;
+
/* Poisoning to make sure PMDs update it in case of error. */
memset(&error, 0x44, sizeof(error));
if (rte_flow_flush(port_id, &error)) {
@@ -1599,7 +1602,7 @@ port_flow_flush(portid_t port_id)
port_id == (portid_t)RTE_PORT_ALL)
return ret;
}
-   port = &ports[port_id];
+
while (port->flow_list) {
struct port_flow *pf = port->flow_list->next;
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 31dc97239b..a76e9877f6 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2694,8 +2694,7 @@ close_port(portid_t pid)
continue;
}
 
-   if (port->flow_list)
-   port_flow_flush(pi);
+   port_flow_flush(pi);
rte_eth_dev_close(pi);
}
 
@@ -2825,15 +2824,20 @@ detach_device(struct rte_device *dev)
 
printf("Removing a device...\n");
 
-   if (rte_dev_remove(dev) < 0) {
-   TESTPMD_LOG(ERR, "Failed to detach device %s\n", dev->name);
-   return;
-   }
RTE_ETH_FOREACH_DEV_OF(sibling, dev) {
if (ports[sibling].port_status != RTE_PORT_CLOSED) {
+   if (ports[sibling].port_status != RTE_PORT_STOPPED) {
+   printf("Port %u not stopped\n", sibling);
+   return;
+   }
+   port_flow_flush(sibling);
}
}
 
+   if (rte_dev_remove(dev) < 0) {
+   TESTPMD_LOG(ERR, "Failed to detach device %s\n", dev->name);
+   return;
+   }
remove_invalid_ports();
 
printf("Device is detached\n");
@@ -2854,8 +2858,6 @@ detach_port_device(portid_t port_id)
return;
}
printf("Port was not closed\n");
-   if (ports[port_id].flow_list)
-   port_flow_flush(port_id);
}
 
detach_device(rte_eth_devices[port_id].device);
@@ -2885,9 +2887,7 @@ detach_devargs(char *identifier)
rte_eth_iterator_cleanup(&iterator);
return;
}
-
-   if (ports[port_id].flow_list)
-   port_flow_flush(port_id);
+   port_flow_flush(port_id);
}
}
 
-- 
2.28.0



[dpdk-dev] [PATCH 18/20] drivers/net: remove redundant MAC addresses freeing

2020-09-13 Thread Thomas Monjalon
The MAC addresses array is already freed by rte_eth_dev_release_port().
The redundant freeing can be removed from the PMD port closing functions.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/ark/ark_ethdev.c| 3 ---
 drivers/net/hinic/hinic_pmd_ethdev.c| 3 ---
 drivers/net/ice/ice_ethdev.c| 3 ---
 drivers/net/nfb/nfb_ethdev.c| 3 ---
 drivers/net/szedata2/rte_eth_szedata2.c | 3 ---
 5 files changed, 15 deletions(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index 8012af75ee..641a694717 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -707,9 +707,6 @@ eth_ark_dev_close(struct rte_eth_dev *dev)
dev->data->rx_queues[i] = 0;
}
 
-   rte_free(dev->data->mac_addrs);
-   dev->data->mac_addrs = 0;
-
return 0;
 }
 
diff --git a/drivers/net/hinic/hinic_pmd_ethdev.c 
b/drivers/net/hinic/hinic_pmd_ethdev.c
index 623534fda4..460093bf95 100644
--- a/drivers/net/hinic/hinic_pmd_ethdev.c
+++ b/drivers/net/hinic/hinic_pmd_ethdev.c
@@ -3225,9 +3225,6 @@ static int hinic_dev_uninit(struct rte_eth_dev *dev)
 
rte_free(nic_dev->mc_list);
 
-   rte_free(dev->data->mac_addrs);
-   dev->data->mac_addrs = NULL;
-
return HINIC_OK;
 }
 
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index f0cb05cbc9..1a7896f0da 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -2423,9 +2423,6 @@ ice_dev_close(struct rte_eth_dev *dev)
dev->rx_pkt_burst = NULL;
dev->tx_pkt_burst = NULL;
 
-   rte_free(dev->data->mac_addrs);
-   dev->data->mac_addrs = NULL;
-
/* disable uio intr before callback unregister */
rte_intr_disable(intr_handle);
 
diff --git a/drivers/net/nfb/nfb_ethdev.c b/drivers/net/nfb/nfb_ethdev.c
index d937ac6922..6fe7500475 100644
--- a/drivers/net/nfb/nfb_ethdev.c
+++ b/drivers/net/nfb/nfb_ethdev.c
@@ -233,9 +233,6 @@ nfb_eth_dev_close(struct rte_eth_dev *dev)
}
dev->data->nb_tx_queues = 0;
 
-   rte_free(dev->data->mac_addrs);
-   dev->data->mac_addrs = NULL;
-
return 0;
 }
 
diff --git a/drivers/net/szedata2/rte_eth_szedata2.c 
b/drivers/net/szedata2/rte_eth_szedata2.c
index 5f589dfa4c..9ae653e1bd 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1178,9 +1178,6 @@ eth_dev_close(struct rte_eth_dev *dev)
}
dev->data->nb_tx_queues = 0;
 
-   rte_free(dev->data->mac_addrs);
-   dev->data->mac_addrs = NULL;
-
return 0;
 }
 
-- 
2.28.0



[dpdk-dev] [PATCH 14/20] net/softnic: release port upon close

2020-09-13 Thread Thomas Monjalon
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/softnic/rte_eth_softnic.c | 63 ++-
 1 file changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/net/softnic/rte_eth_softnic.c 
b/drivers/net/softnic/rte_eth_softnic.c
index 491a308c11..5b94e250d8 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -201,9 +201,37 @@ pmd_dev_stop(struct rte_eth_dev *dev)
softnic_mtr_free(p);
 }
 
+static void
+pmd_free(struct pmd_internals *p)
+{
+   if (p == NULL)
+   return;
+
+   if (p->params.conn_port)
+   softnic_conn_free(p->conn);
+
+   softnic_thread_free(p);
+   softnic_pipeline_free(p);
+   softnic_table_action_profile_free(p);
+   softnic_port_in_action_profile_free(p);
+   softnic_tap_free(p);
+   softnic_tmgr_free(p);
+   softnic_link_free(p);
+   softnic_swq_free(p);
+   softnic_mempool_free(p);
+
+   tm_hierarchy_free(p);
+   softnic_mtr_free(p);
+
+   rte_free(p);
+}
+
 static int
-pmd_dev_close(struct rte_eth_dev *dev __rte_unused)
+pmd_dev_close(struct rte_eth_dev *dev)
 {
+   pmd_free(dev->data->dev_private);
+   dev->data->dev_private = NULL; /* already freed */
+   dev->data->mac_addrs = NULL; /* statically allocated */
return 0;
 }
 
@@ -335,31 +363,6 @@ pmd_init(struct pmd_params *params)
return p;
 }
 
-static void
-pmd_free(struct pmd_internals *p)
-{
-   if (p == NULL)
-   return;
-
-   if (p->params.conn_port)
-   softnic_conn_free(p->conn);
-
-   softnic_thread_free(p);
-   softnic_pipeline_free(p);
-   softnic_table_action_profile_free(p);
-   softnic_port_in_action_profile_free(p);
-   softnic_tap_free(p);
-   softnic_tmgr_free(p);
-   softnic_link_free(p);
-   softnic_swq_free(p);
-   softnic_mempool_free(p);
-
-   tm_hierarchy_free(p);
-   softnic_mtr_free(p);
-
-   rte_free(p);
-}
-
 static struct rte_ether_addr eth_addr = {
.addr_bytes = {0},
 };
@@ -384,6 +387,7 @@ pmd_ethdev_register(struct rte_vdev_device *vdev,
dev->device = &vdev->device;
 
/* dev->data */
+   dev->data->dev_flags |= RTE_ETH_DEV_CLOSE_REMOVE;
dev->data->dev_private = dev_private;
dev->data->dev_link.link_speed = ETH_SPEED_NUM_100G;
dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
@@ -653,12 +657,9 @@ pmd_remove(struct rte_vdev_device *vdev)
/* Find the ethdev entry */
dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
if (dev == NULL)
-   return -ENODEV;
+   return 0; /* port already released */
 
-   /* Free device data structures*/
-   pmd_free(dev->data->dev_private);
-   dev->data->dev_private = NULL; /* already freed */
-   dev->data->mac_addrs = NULL; /* statically allocated */
+   pmd_dev_close(dev);
rte_eth_dev_release_port(dev);
 
return 0;
-- 
2.28.0



[dpdk-dev] [PATCH 17/20] drivers/net: accept removing device without any port

2020-09-13 Thread Thomas Monjalon
The ports can be closed (i.e. completely released)
before removing the whole device.
Such case was wrongly considered an error by some drivers.

If the device supports only one port, there is nothing much
to free after the port is closed.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/ipn3ke/ipn3ke_ethdev.c  |  6 ++
 drivers/net/kni/rte_eth_kni.c   | 16 +++-
 drivers/net/netvsc/hn_ethdev.c  |  2 +-
 drivers/net/nfp/nfp_net.c   |  2 ++
 drivers/net/pfe/pfe_ethdev.c|  6 ++
 drivers/net/szedata2/rte_eth_szedata2.c |  6 ++
 6 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ipn3ke/ipn3ke_ethdev.c 
b/drivers/net/ipn3ke/ipn3ke_ethdev.c
index 027be29bd8..4446d2af9e 100644
--- a/drivers/net/ipn3ke/ipn3ke_ethdev.c
+++ b/drivers/net/ipn3ke/ipn3ke_ethdev.c
@@ -562,10 +562,8 @@ static int ipn3ke_vswitch_remove(struct rte_afu_device 
*afu_dev)
afu_dev->device.name, i);
 
ethdev = rte_eth_dev_allocated(afu_dev->device.name);
-   if (!ethdev)
-   return -ENODEV;
-
-   rte_eth_dev_destroy(ethdev, ipn3ke_rpst_uninit);
+   if (ethdev != NULL)
+   rte_eth_dev_destroy(ethdev, ipn3ke_rpst_uninit);
}
 
ret = rte_eth_switch_domain_free(hw->switch_domain_id);
diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index 45ab1b17a8..2a4058f7b0 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -488,17 +488,15 @@ eth_kni_remove(struct rte_vdev_device *vdev)
 
/* find the ethdev entry */
eth_dev = rte_eth_dev_allocated(name);
-   if (eth_dev == NULL)
-   return -1;
-
-   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
-   eth_kni_dev_stop(eth_dev);
-   return rte_eth_dev_release_port(eth_dev);
+   if (eth_dev != NULL) {
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   eth_kni_dev_stop(eth_dev);
+   return rte_eth_dev_release_port(eth_dev);
+   }
+   eth_kni_close(eth_dev);
+   rte_eth_dev_release_port(eth_dev);
}
 
-   eth_kni_close(eth_dev);
-   rte_eth_dev_release_port(eth_dev);
-
is_kni_initialized--;
if (is_kni_initialized == 0)
rte_kni_close();
diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index c4a2dd9f4a..4c37dc520c 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -1092,7 +1092,7 @@ static int eth_hn_remove(struct rte_vmbus_device *dev)
 
eth_dev = rte_eth_dev_allocated(dev->device.name);
if (!eth_dev)
-   return -ENODEV;
+   return 0; /* port already released */
 
ret = eth_hn_dev_uninit(eth_dev);
if (ret)
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index c20d71cdc3..2f108099a3 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -3721,6 +3721,8 @@ static int eth_nfp_pci_remove(struct rte_pci_device 
*pci_dev)
int port = 0;
 
eth_dev = rte_eth_dev_allocated(pci_dev->device.name);
+   if (eth_dev == NULL)
+   return 0; /* port already released */
if ((pci_dev->id.device_id == PCI_DEVICE_ID_NFP4000_PF_NIC) ||
(pci_dev->id.device_id == PCI_DEVICE_ID_NFP6000_PF_NIC)) {
port = get_pf_port_number(eth_dev->data->name);
diff --git a/drivers/net/pfe/pfe_ethdev.c b/drivers/net/pfe/pfe_ethdev.c
index 8867b61a9d..7a36303785 100644
--- a/drivers/net/pfe/pfe_ethdev.c
+++ b/drivers/net/pfe/pfe_ethdev.c
@@ -1158,10 +1158,8 @@ pmd_pfe_remove(struct rte_vdev_device *vdev)
return 0;
 
eth_dev = rte_eth_dev_allocated(name);
-   if (eth_dev == NULL)
-   return -ENODEV;
-
-   pfe_eth_exit(eth_dev, g_pfe);
+   if (eth_dev != NULL)
+   pfe_eth_exit(eth_dev, g_pfe);
munmap(g_pfe->cbus_baseaddr, g_pfe->cbus_size);
 
if (g_pfe->nb_devs == 0) {
diff --git a/drivers/net/szedata2/rte_eth_szedata2.c 
b/drivers/net/szedata2/rte_eth_szedata2.c
index 4325b9a30d..5f589dfa4c 100644
--- a/drivers/net/szedata2/rte_eth_szedata2.c
+++ b/drivers/net/szedata2/rte_eth_szedata2.c
@@ -1910,10 +1910,8 @@ static int szedata2_eth_pci_remove(struct rte_pci_device 
*pci_dev)
pci_dev->device.name, i);
PMD_DRV_LOG(DEBUG, "Removing eth_dev %s", name);
eth_dev = rte_eth_dev_allocated(name);
-   if (!eth_dev) {
-   PMD_DRV_LOG(ERR, "eth_dev %s not found", name);
-   retval = retval ? retval : -ENODEV;
-   }
+   if (eth_dev == NULL)
+   continue; /* port already released */
 
ret = rte_szedata2_eth_dev_uninit(eth_dev);
   

[dpdk-dev] [PATCH 16/20] ethdev: remove old close behaviour

2020-09-13 Thread Thomas Monjalon
The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.

The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
- trigger event callback
- reset state and few pointers
- free all generic port resources

The private port resources must be released in the .dev_close callback.

The .remove callback should:
- call .dev_close callback
- call rte_eth_dev_release_port()
- free multi-port device shared resources

Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only last solution is to declare these drivers as unmaintained:
bnx2x, cxgbe, dpaa, dpaa2, enetc, ionic,
ipn3ke, liquidio, nfp, pfe, qede
Below is a summary of what to implement in those drivers.

* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.

* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.

* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.

* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.

* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is free to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.

Cc: Rahul Lakkireddy 
Cc: Rosen Xu 
Cc: Shijith Thotton 
Cc: Srisivasubramanian Srinivasan 
Cc: Rasesh Mody 
Cc: Shahed Shaikh 
Cc: Heinrich Kuhn 
Cc: Hemant Agrawal 
Cc: Sachin Saxena 
Cc: Gagandeep Singh 
Cc: Akhil Goyal 
Cc: Alfredo Cardigliano 

Signed-off-by: Thomas Monjalon 
---
 MAINTAINERS   | 22 +++---
 doc/guides/rel_notes/deprecation.rst  |  6 --
 drivers/net/af_packet/rte_eth_af_packet.c |  1 -
 drivers/net/af_xdp/rte_eth_af_xdp.c   |  2 --
 drivers/net/ark/ark_ethdev.c  |  2 --
 drivers/net/atlantic/atl_ethdev.c |  2 --
 drivers/net/avp/avp_ethdev.c  |  2 --
 drivers/net/axgbe/axgbe_ethdev.c  |  1 -
 drivers/net/bnxt/bnxt_ethdev.c|  5 -
 drivers/net/bnxt/bnxt_reps.c  |  4 
 drivers/net/bonding/rte_eth_bond_pmd.c|  1 -
 drivers/net/e1000/em_ethdev.c |  5 -
 drivers/net/e1000/igb_ethdev.c| 10 --
 drivers/net/ena/ena_ethdev.c  |  6 --
 drivers/net/enic/enic_ethdev.c|  2 --
 drivers/net/failsafe/failsafe.c   |  1 -
 drivers/net/fm10k/fm10k_ethdev.c  |  5 -
 drivers/net/hinic/hinic_pmd_ethdev.c  |  6 --
 drivers/net/hns3/hns3_ethdev.c|  5 -
 drivers/net/hns3/hns3_ethdev_vf.c |  5 -
 drivers/net/i40e/i40e_ethdev.c|  5 -
 drivers/net/i40e/i40e_ethdev_vf.c |  5 -
 drivers/net/iavf/iavf_ethdev.c|  5 -
 drivers/net/ice/ice_dcf_ethdev.c  |  2 --
 drivers/net/ice/ice_ethdev.c  |  5 -
 drivers/net/igc/igc_ethdev.c  |  5 -
 drivers/net/ixgbe/ixgbe_ethdev.c  | 10 --
 drivers/net/kni/rte_eth_kni.c |  2 --
 drivers/net/memif/rte_eth_memif.c |  3 ---
 drivers/net/mlx4/mlx4.c   |  1 -
 drivers/net/mlx5/linux/mlx5_os.c  |  2 --
 drivers/net/mvneta/mvneta_ethdev.c|  3 ---
 drivers/net/mvpp2/mrvl_ethdev.c   |  3 ---
 drivers/net/netvsc/hn_ethdev.c|  3 ---
 drivers/net/nfb/nfb_ethdev.c  |  3 ---
 drivers/net/null/rte_eth_null.c   |  1 -
 drivers/net/octeontx/octeontx_ethdev.c|  1 -
 drivers/net/octeontx2/otx2_ethdev.c   |  1 -
 drivers/net/pcap/rte_eth_pcap.c   |  1 -
 drivers/net/ring/rte_eth_ring.c   |  1 -
 drivers/net/sfc/sfc_ethdev.c  |  4 +---
 drivers/net/softnic/rte_eth_softnic.c |  1 -
 drivers/net/szedata2/rte_eth_szedata2.c   |  3 ---
 drivers/net/tap/rte_eth_tap.c |  2 +-
 drivers/net/vhost/rte_eth_vhost.c |  2 +-
 drivers/net/virtio/virtio_ethdev.c|  5 -
 drivers/net/vmxnet3/vmxnet3_ethdev.c  |  3 ---
 lib/librte_ethdev/rte_ethdev.c| 17 +
 lib/librte_ethdev/rte_ethdev.h|  8 +---
 49 files changed, 16 insertions(+), 184 deletions(-)

di

[dpdk-dev] [PATCH 02/20] ethdev: allow drivers to return error on close

2020-09-13 Thread Thomas Monjalon
The device operation .dev_close was returning void.
This driver interface is changed to return an int.

Note that the API rte_eth_dev_close() is still returning void,
although a deprecation notice is pending to change it as well.

Signed-off-by: Thomas Monjalon 
---
 app/test/virtual_pmd.c|  6 --
 drivers/net/af_packet/rte_eth_af_packet.c |  3 ++-
 drivers/net/af_xdp/rte_eth_af_xdp.c   |  4 +++-
 drivers/net/ark/ark_ethdev.c  |  6 --
 drivers/net/atlantic/atl_ethdev.c |  6 --
 drivers/net/avp/avp_ethdev.c  |  5 +++--
 drivers/net/axgbe/axgbe_ethdev.c  |  5 +++--
 drivers/net/bnx2x/bnx2x_ethdev.c  |  4 +++-
 drivers/net/bnxt/bnxt_ethdev.c|  4 +++-
 drivers/net/bnxt/bnxt_reps.c  |  3 ++-
 drivers/net/bnxt/bnxt_reps.h  |  2 +-
 drivers/net/bonding/eth_bond_private.h|  2 +-
 drivers/net/bonding/rte_eth_bond_pmd.c|  4 +++-
 drivers/net/cxgbe/cxgbe_ethdev.c  |  6 --
 drivers/net/cxgbe/cxgbe_pfvf.h|  2 +-
 drivers/net/dpaa/dpaa_ethdev.c|  4 +++-
 drivers/net/dpaa2/dpaa2_ethdev.c  |  6 --
 drivers/net/e1000/em_ethdev.c |  6 --
 drivers/net/e1000/igb_ethdev.c| 12 
 drivers/net/ena/ena_ethdev.c  |  6 --
 drivers/net/enetc/enetc_ethdev.c  |  4 +++-
 drivers/net/enic/enic_ethdev.c|  4 +++-
 drivers/net/failsafe/failsafe_ops.c   |  3 ++-
 drivers/net/fm10k/fm10k_ethdev.c  |  4 +++-
 drivers/net/hinic/hinic_pmd_ethdev.c  |  6 --
 drivers/net/hns3/hns3_ethdev.c|  6 --
 drivers/net/hns3/hns3_ethdev_vf.c |  6 --
 drivers/net/i40e/i40e_ethdev.c|  5 +++--
 drivers/net/i40e/i40e_ethdev_vf.c |  5 +++--
 drivers/net/iavf/iavf_ethdev.c|  5 +++--
 drivers/net/ice/ice_dcf_ethdev.c  |  6 --
 drivers/net/ice/ice_ethdev.c  |  6 --
 drivers/net/igc/igc_ethdev.c  |  6 --
 drivers/net/ionic/ionic_ethdev.c  | 10 ++
 drivers/net/ipn3ke/ipn3ke_representor.c   |  4 +++-
 drivers/net/ixgbe/ixgbe_ethdev.c  | 11 +++
 drivers/net/kni/rte_eth_kni.c |  4 +++-
 drivers/net/liquidio/lio_ethdev.c |  4 +++-
 drivers/net/memif/rte_eth_memif.c |  4 +++-
 drivers/net/mlx4/mlx4.c   |  3 ++-
 drivers/net/mlx5/mlx5.c   |  9 +
 drivers/net/mlx5/mlx5.h   |  2 +-
 drivers/net/mvneta/mvneta_ethdev.c|  4 +++-
 drivers/net/mvpp2/mrvl_ethdev.c   |  4 +++-
 drivers/net/netvsc/hn_ethdev.c|  4 +++-
 drivers/net/nfb/nfb_ethdev.c  |  4 +++-
 drivers/net/nfp/nfp_net.c |  6 --
 drivers/net/octeontx/octeontx_ethdev.c|  4 +++-
 drivers/net/octeontx2/otx2_ethdev.c   |  5 +++--
 drivers/net/pcap/rte_eth_pcap.c   |  3 ++-
 drivers/net/pfe/pfe_ethdev.c  |  8 +---
 drivers/net/qede/qede_ethdev.c|  4 +++-
 drivers/net/sfc/sfc_ethdev.c  |  4 +++-
 drivers/net/softnic/rte_eth_softnic.c |  4 ++--
 drivers/net/szedata2/rte_eth_szedata2.c   |  4 +++-
 drivers/net/tap/rte_eth_tap.c |  4 +++-
 drivers/net/thunderx/nicvf_ethdev.c   |  4 +++-
 drivers/net/vhost/rte_eth_vhost.c |  6 --
 drivers/net/virtio/virtio_ethdev.c|  6 --
 drivers/net/vmxnet3/vmxnet3_ethdev.c  |  6 --
 lib/librte_ethdev/rte_ethdev_core.h   |  2 +-
 61 files changed, 200 insertions(+), 99 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index 79156cb85a..596b608f4c 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -62,9 +62,11 @@ static void  virtual_ethdev_stop(struct rte_eth_dev *eth_dev 
__rte_unused)
rte_pktmbuf_free(pkt);
 }
 
-static void
+static int
 virtual_ethdev_close(struct rte_eth_dev *dev __rte_unused)
-{}
+{
+   return 0;
+}
 
 static int
 virtual_ethdev_configure_success(struct rte_eth_dev *dev __rte_unused)
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index b9723e9619..7d0ff1cbb3 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -376,9 +376,10 @@ eth_stats_reset(struct rte_eth_dev *dev)
return 0;
 }
 
-static void
+static int
 eth_dev_close(struct rte_eth_dev *dev __rte_unused)
 {
+   return 0;
 }
 
 static void
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 936d4a7d5f..bac9729bf8 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -701,7 +701,7 @@ xdp_umem_destroy(struct xsk_umem_info *umem)
umem = NULL;
 }
 
-static void
+static int
 eth_dev_close(struct rte_eth_dev *dev)
 {
struct pmd_internals *internals = dev->data->dev_private;
@@ -731,6 +731,8 @@ eth_dev_c

Re: [dpdk-dev] CALL to eth PMD maintainers: complete closing of port

2020-09-13 Thread Thomas Monjalon
The patches for removing RTE_ETH_DEV_CLOSE_REMOVE are sent:
https://patches.dpdk.org/project/dpdk/list/?series=12173

11 drivers are not supporting the new behaviour correctly:
bnx2x, cxgbe, dpaa, dpaa2, enetc, ionic,
ipn3ke, liquidio, nfp, pfe, qede

If you are the maintainer of one of these drivers,
you can still consider fixing it in the next days.


12/09/2020 13:25, Thomas Monjalon:
> 03/08/2020 20:50, Thomas Monjalon:
> > 18/04/2019 12:59, Thomas Monjalon:
> > > Hi all,
> > > 
> > > Since DPDK 18.11, the behaviour of the close operation is changed
> > > if RTE_ETH_DEV_CLOSE_REMOVE is enabled in the driver:
> > > port is released (i.e. totally freed and data erased) on close.
> > > This new behaviour is enabled per driver for a migration period.
> > > 
> > > Looking at the code, you can see these comments:
> > > /* old behaviour: only free queue arrays */
> > > RTE_ETHDEV_LOG(DEBUG, "Port closing is using an old behaviour.\n"
> > >   "The driver %s should migrate to the new behaviour.\n",
> > > /* new behaviour: send event + reset state + free all data */
> > > 
> > > You can find an advice in the commit:
> > >   http://git.dpdk.org/dpdk/commit/?id=23ea57a2a
> > > "
> > > When enabling RTE_ETH_DEV_CLOSE_REMOVE,
> > > the PMD must free all its private resources for the port,
> > > in its dev_close function.
> > > It is advised to call the dev_close function in the remove function
> > > in order to support removing a device without closing its ports.
> > > "
> > > 
> > > It would be great to complete this migration for the next LTS
> > > version, which will be 19.11.
> > 
> > For the record, it did not happen in 19.11.
> > 
> > > Following drivers should be migrated:
> > > ( find drivers/net -mindepth 1 -maxdepth 1 -type d | cut -d/ -f3 ; git 
> > > grep -l RTE_ETH_DEV_CLOSE_REMOVE drivers | cut -d/ -f3 ) | sort | uniq -u
> > [...]
> > 
> > The progress in April 2019 was 4 of 46 (9%).
> > 
> > > Please let's progress smoothly on this topic, thanks.
> > 
> > More than one year later, the progress is 26 of 53 (49%).
> > 
> > > The concerned maintainers (Cc) can be found with the following command:
> > > devtools/get-maintainer.sh $(( find drivers/net -mindepth 1 -maxdepth 1 
> > > -type d | cut -d/ -f-3 ; git grep -l RTE_ETH_DEV_CLOSE_REMOVE drivers ) | 
> > > sort | uniq -u)
> > 
> > We cannot wait forever. Temporary cannot be longer than 2 years.
> > I am going to send a deprecation notice to remove the "temporary"
> > flag RTE_ETH_DEV_CLOSE_REMOVE.
> 
> The deprecation notice was merged in 20.08:
>   http://mails.dpdk.org/archives/dev/2020-August/177314.html
> 
> > It will break drivers which are not migrated.
> > It will probably help to find motivation in new priorities.
> > 
> > More details on what to do can be found in this mail thread:
> > http://inbox.dpdk.org/dev/1748144.UFpUr2FPnr@xps/
> 
> Summary:
> 
> * The freeing of private port resources must be moved in the PMD
> from the ".remove(device)" function to the ".dev_close(port)" function.
> 
> * If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
> it must be set to NULL in ".dev_close" PMD function to protect from
> subsequent rte_eth_dev_release_port() freeing.
> 
> * Note 1:
> The generic resources are freed in rte_eth_dev_release_port(),
> after ".dev_close" is called in rte_eth_dev_close(), but not when
> calling ".dev_close" directly from the ".remove" PMD function.
> That's why rte_eth_dev_release_port() must still be called explicitly
> from ".remove(device)" after calling the ".dev_close" PMD function.
> 
> * Note 2:
> If a device can have multiple ports, the common resources must be freed
> only in the ".remove(device)" function.
> 
> * Note 3:
> The port is supposed to be in a stopped state when it is closed.
> If it is not the case, it is free to the PMD implementation
> how to react when trying to close a non-stopped port:
> either try to stop it automatically or just return an error.





Re: [dpdk-dev] [PATCH] pci: support both io and mmio bar for legacy virtio on x86

2020-09-13 Thread 谢华伟(此时此刻)

Hi Ferruh:
This patch is to support both MMIO and PIO bar for legacy virtio. We use MM bar 
for legacy virtio on x86 because we create lots of virtio devices as PIO 
resource is very limited.
virtio on x86 is the only PMD which might use PIO. kernel virtio-pci driver 
supports both PIO bar and MMIO bar. This patch handles MMIO and PIO in the same 
way.



--
From:Chris 
Sent At:2020 Sep. 11 (Fri.) 20:57
To:ferruh.yigit 
Cc:dev ; "Wang, Zhihong" ; "Xia, Chenbo" 

Subject:[PATCH] pci: support both io and mmio bar for legacy virtio on x86

Signed-off-by: huawei.xhw 

---

 drivers/bus/pci/linux/pci.c |  71 --

 drivers/bus/pci/linux/pci_uio.c | 154 +++-

 2 files changed, 106 insertions(+), 119 deletions(-)



diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c

index a2198ab..619dfec 100644

--- a/drivers/bus/pci/linux/pci.c

+++ b/drivers/bus/pci/linux/pci.c

@@ -687,71 +687,6 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

}

 }



-#if defined(RTE_ARCH_X86)

-static int

-pci_ioport_map(struct rte_pci_device *dev, int bar __rte_unused,

-   struct rte_pci_ioport *p)

-{

-   uint16_t start, end;

-   FILE *fp;

-   char *line = NULL;

-   char pci_id[16];

-   int found = 0;

-   size_t linesz;

-

-   if (rte_eal_iopl_init() != 0) {

-   RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for 
PCI device %s\n",

-   __func__, dev->name);

-   return -1;

-   }

-

-   snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,

-dev->addr.domain, dev->addr.bus,

-dev->addr.devid, dev->addr.function);

-

-   fp = fopen("/proc/ioports", "r");

-   if (fp == NULL) {

-   RTE_LOG(ERR, EAL, "%s(): can't open ioports\n", __func__);

-   return -1;

-   }

-

-   while (getdelim(&line, &linesz, '\n', fp) > 0) {

-   char *ptr = line;

-   char *left;

-   int n;

-

-   n = strcspn(ptr, ":");

-   ptr[n] = 0;

-   left = &ptr[n + 1];

-

-   while (*left && isspace(*left))

-   left++;

-

-   if (!strncmp(left, pci_id, strlen(pci_id))) {

-   found = 1;

-

-   while (*ptr && isspace(*ptr))

-   ptr++;

-

-   sscanf(ptr, "%04hx-%04hx", &start, &end);

-

-   break;

-   }

-   }

-

-   free(line);

-   fclose(fp);

-

-   if (!found)

-   return -1;

-

-   p->base = start;

-   RTE_LOG(DEBUG, EAL, "PCI Port IO found start=0x%x\n", start);

-

-   return 0;

-}

-#endif

-

 int

 rte_pci_ioport_map(struct rte_pci_device *dev, int bar,

struct rte_pci_ioport *p)

@@ -766,14 +701,8 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

break;

 #endif

case RTE_KDRV_IGB_UIO:

-   ret = pci_uio_ioport_map(dev, bar, p);

-   break;

case RTE_KDRV_UIO_GENERIC:

-#if defined(RTE_ARCH_X86)

-   ret = pci_ioport_map(dev, bar, p);

-#else

ret = pci_uio_ioport_map(dev, bar, p);

-#endif

break;

default:

break;

diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c

index 097dc19..67ddbc0 100644

--- a/drivers/bus/pci/linux/pci_uio.c

+++ b/drivers/bus/pci/linux/pci_uio.c

@@ -372,12 +372,50 @@

 pci_uio_ioport_map(struct rte_pci_device *dev, int bar,

   struct rte_pci_ioport *p)

 {

+   FILE *f;

char dirname[PATH_MAX];

char filename[PATH_MAX];

+   char buf[BUFSIZ];

+   uint64_t phys_addr, end_addr, flags;

int uio_num;

-   unsigned long start;

+   unsigned long base;

+   bool iobar;

+   int i;



-   if (rte_eal_iopl_init() != 0) {

+   /* open and read addresses of the corresponding resource in sysfs */

+   snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT "/resource",

+   rte_pci_get_sysfs_path(), dev->addr.domain, dev->addr.bus,

+   dev->addr.devid, dev->addr.function);

+   f = fopen(filename, "r");

+   if (f == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot open sysfs resource: %s\n",

+   strerror(errno));

+   return -1;

+   }

+   for (i = 0; i < bar + 1; i++) {

+   if (fgets(buf, sizeof(buf), f) == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot read sysfs resource\n");

+   goto error;

+   }

+   }

+   if (pci_parse_one_sysfs_resource(buf, sizeof(buf), &phys_addr,

+   &end_addr, &flags

[dpdk-dev] [PATCH] pci: support both io and mmio bar for legacy virtio on x86

2020-09-13 Thread 谢华伟(此时此刻)
Signed-off-by: huawei.xhw 

---

 drivers/bus/pci/linux/pci.c |  71 --

 drivers/bus/pci/linux/pci_uio.c | 154 +++-

 2 files changed, 106 insertions(+), 119 deletions(-)



diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c

index a2198ab..619dfec 100644

--- a/drivers/bus/pci/linux/pci.c

+++ b/drivers/bus/pci/linux/pci.c

@@ -687,71 +687,6 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

}

 }



-#if defined(RTE_ARCH_X86)

-static int

-pci_ioport_map(struct rte_pci_device *dev, int bar __rte_unused,

-   struct rte_pci_ioport *p)

-{

-   uint16_t start, end;

-   FILE *fp;

-   char *line = NULL;

-   char pci_id[16];

-   int found = 0;

-   size_t linesz;

-

-   if (rte_eal_iopl_init() != 0) {

-   RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for 
PCI device %s\n",

-   __func__, dev->name);

-   return -1;

-   }

-

-   snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,

-dev->addr.domain, dev->addr.bus,

-dev->addr.devid, dev->addr.function);

-

-   fp = fopen("/proc/ioports", "r");

-   if (fp == NULL) {

-   RTE_LOG(ERR, EAL, "%s(): can't open ioports\n", __func__);

-   return -1;

-   }

-

-   while (getdelim(&line, &linesz, '\n', fp) > 0) {

-   char *ptr = line;

-   char *left;

-   int n;

-

-   n = strcspn(ptr, ":");

-   ptr[n] = 0;

-   left = &ptr[n + 1];

-

-   while (*left && isspace(*left))

-   left++;

-

-   if (!strncmp(left, pci_id, strlen(pci_id))) {

-   found = 1;

-

-   while (*ptr && isspace(*ptr))

-   ptr++;

-

-   sscanf(ptr, "%04hx-%04hx", &start, &end);

-

-   break;

-   }

-   }

-

-   free(line);

-   fclose(fp);

-

-   if (!found)

-   return -1;

-

-   p->base = start;

-   RTE_LOG(DEBUG, EAL, "PCI Port IO found start=0x%x\n", start);

-

-   return 0;

-}

-#endif

-

 int

 rte_pci_ioport_map(struct rte_pci_device *dev, int bar,

struct rte_pci_ioport *p)

@@ -766,14 +701,8 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

break;

 #endif

case RTE_KDRV_IGB_UIO:

-   ret = pci_uio_ioport_map(dev, bar, p);

-   break;

case RTE_KDRV_UIO_GENERIC:

-#if defined(RTE_ARCH_X86)

-   ret = pci_ioport_map(dev, bar, p);

-#else

ret = pci_uio_ioport_map(dev, bar, p);

-#endif

break;

default:

break;

diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c

index 097dc19..67ddbc0 100644

--- a/drivers/bus/pci/linux/pci_uio.c

+++ b/drivers/bus/pci/linux/pci_uio.c

@@ -372,12 +372,50 @@

 pci_uio_ioport_map(struct rte_pci_device *dev, int bar,

   struct rte_pci_ioport *p)

 {

+   FILE *f;

char dirname[PATH_MAX];

char filename[PATH_MAX];

+   char buf[BUFSIZ];

+   uint64_t phys_addr, end_addr, flags;

int uio_num;

-   unsigned long start;

+   unsigned long base;

+   bool iobar;

+   int i;



-   if (rte_eal_iopl_init() != 0) {

+   /* open and read addresses of the corresponding resource in sysfs */

+   snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT "/resource",

+   rte_pci_get_sysfs_path(), dev->addr.domain, dev->addr.bus,

+   dev->addr.devid, dev->addr.function);

+   f = fopen(filename, "r");

+   if (f == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot open sysfs resource: %s\n",

+   strerror(errno));

+   return -1;

+   }

+   for (i = 0; i < bar + 1; i++) {

+   if (fgets(buf, sizeof(buf), f) == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot read sysfs resource\n");

+   goto error;

+   }

+   }

+   if (pci_parse_one_sysfs_resource(buf, sizeof(buf), &phys_addr,

+   &end_addr, &flags) < 0)

+   goto error;

+

+   if (flags & IORESOURCE_IO) {

+   iobar = 1;

+   base = (unsigned long)phys_addr;

+   RTE_LOG(INFO, EAL, "%s(): iobar %08lx detected\n", __func__, 
base);

+   } else if (flags & IORESOURCE_MEM) {

+   iobar = 0;

+   base = (unsigned long)dev->mem_resource[bar].addr;

+   RTE_LOG(INFO, EAL, "%s():membar %08lx detected\n", __func__, 
base);

+   } else {

+   RTE_LOG(ERR, EAL, "%s(): unknow bar type\n", __func__);

+   goto error;

+   }

+

+   

Re: [dpdk-dev] [PATCH] net/iavf: release port upon close

2020-09-13 Thread Thomas Monjalon
As you may notice, I have included a slightly modified version
of this patch in my series in order to cover the full picture:
https://patches.dpdk.org/patch/77559/

Feel free to continue improving your patch in this thread or the other,
as you prefer, as long as the secondary process issue is adressed.

Thanks


13/09/2020 10:53, Thomas Monjalon:
> Hi,
> 
> SteveX Yang  wrote:
> > Set RTE_ETH_DEV_CLOSE_REMOVE upon probe so all the private resources
> > for the port can be freed by rte_eth_dev_close().
> > 
> > Signed-off-by: SteveX Yang 
> 
> I guess the X is not part of your name.
> 
> [...]
> > -static int
> > -iavf_dev_uninit(struct rte_eth_dev *dev)
> > -{
> > -   struct iavf_info *vf = 
> > IAVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
> > -
> > -   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > -   return -EPERM;
> 
> All the code below is moved from iavf_dev_uninit() where it was
> restricted to the primary process, to iavf_dev_close() which can be
> called from the primary process.
> It looks inconsistent.
> 
> > 
> > dev->dev_ops = NULL;
> > dev->rx_pkt_burst = NULL;
> > dev->tx_pkt_burst = NULL;
> > 
> > -   iavf_dev_close(dev);
> > +
> > +   if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_RSS_PF) {
> > +   if (vf->rss_lut) {
> > +   rte_free(vf->rss_lut);
> > +   vf->rss_lut = NULL;
> > +   }
> > +   if (vf->rss_key) {
> > +   rte_free(vf->rss_key);
> > +   vf->rss_key = NULL;
> > +   }
> > +   }
> > 
> > rte_free(vf->vf_res);
> > vf->vsi_res = NULL;
> > 
> > @@ -1449,15 +1456,15 @@ iavf_dev_uninit(struct rte_eth_dev *dev)
> > 
> > rte_free(vf->aq_resp);
> > vf->aq_resp = NULL;
> > 
> > +}
> > 
> > -   if (vf->rss_lut) {
> > -   rte_free(vf->rss_lut);
> > -   vf->rss_lut = NULL;
> > -   }
> > -   if (vf->rss_key) {
> > -   rte_free(vf->rss_key);
> > -   vf->rss_key = NULL;
> > -   }
> > +static int
> > +iavf_dev_uninit(struct rte_eth_dev *dev)
> > +{
> > +   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > +   return -EPERM;
> > +
> > +   iavf_dev_close(dev);
> 
> Are you sure about what should be freed in the secondary process?
> If iavf_dev_close() should not be called by the secondary process,
> you should move the check inside the function,
> because iavf_dev_close() is also called by rte_eth_dev_close().
> 
> > 
> > return 0;
> >  
> >  }






Re: [dpdk-dev] [PATCH] net/tap: release port upon close

2020-09-13 Thread Thomas Monjalon
As you may notice, I have included a slightly modified version
of this patch in my series in order to cover the full picture:
https://patches.dpdk.org/patch/77566/

Feel free to continue improving your patch in this thread or the other,
as you prefer, as long as the secondary process issue is adressed.

Thanks


13/09/2020 11:16, Thomas Monjalon:
> Hi,
> 
> 28/08/2020 14:37, wangyunjian:
> > @@ -1078,6 +1085,23 @@ tap_dev_close(struct rte_eth_dev *dev)
> > +
> > +   /* mac_addrs must not be freed alone because part of dev_private */
> > +   dev->data->mac_addrs = NULL;
> > +
> > +   internals = dev->data->dev_private;
> > +   TAP_LOG(DEBUG, "Closing %s Ethernet device on numa %u",
> > +   tuntap_types[internals->type], rte_socket_id());
> > +
> > +   if (internals->ioctl_sock != -1) {
> > +   close(internals->ioctl_sock);
> > +   internals->ioctl_sock = -1;
> > +   }
> > +   rte_free(dev->process_private);
> > +   dev->process_private = NULL;
> > +   if (tap_devices_count == 1)
> > +   rte_mp_action_unregister(TAP_MP_KEY);
> > +   tap_devices_count--;
> >  }
> [...]
> > @@ -2446,12 +2466,11 @@ static int
> >  rte_pmd_tap_remove(struct rte_vdev_device *dev)
> >  {
> >  
> > struct rte_eth_dev *eth_dev = NULL;
> > 
> > -   struct pmd_internals *internals;
> > 
> > /* find the ethdev entry */
> > eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
> > if (!eth_dev)
> > 
> > -   return -ENODEV;
> > +   return 0;
> > 
> > /* mac_addrs must not be freed alone because part of dev_private */
> > eth_dev->data->mac_addrs = NULL;
> 
> The line above can be removed because mac_addrs is not freed
> in secondary process, and it is redundant with tap_dev_close().
> 
> >  
> > if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > return rte_eth_dev_release_port(eth_dev);
> 
> There is an inconsistency in secondary process
> if tap_dev_close() is not called from .remove (unplug)
> but can be called from .dev_close (rte_eth_dev_close).
> 
> I think the process type check must be done inside tap_dev_close(),
> so the process private resources can be freed.
> 
> > 
> > tap_dev_close(eth_dev);
> >  
> 
> This blank line can be removed in my opinion.
> 
> > -   internals = eth_dev->data->dev_private;
> > -   TAP_LOG(DEBUG, "Closing %s Ethernet device on numa %u",
> > -   tuntap_types[internals->type], rte_socket_id());
> > -
> > -   close(internals->ioctl_sock);
> > -   rte_free(eth_dev->process_private);
> > -   if (tap_devices_count == 1)
> > -   rte_mp_action_unregister(TAP_MP_KEY);
> > -   tap_devices_count--;
> > rte_eth_dev_release_port(eth_dev);
> >  
> > return 0;





[dpdk-dev] [PATCH v1 0/2] refactor and clean FDIR

2020-09-13 Thread Zhirun Yan
V2:
Simplified code.

Zhirun Yan (2):
  net/ice: refactor FDIR set conf function
  net/ice: merge inner/outer flow seg info for FDIR

 drivers/net/ice/ice_ethdev.h  |  1 +
 drivers/net/ice/ice_fdir_filter.c | 93 +--
 2 files changed, 52 insertions(+), 42 deletions(-)

-- 
2.25.1



[dpdk-dev] [PATCH v2 1/2] net/ice: refactor FDIR set conf function

2020-09-13 Thread Zhirun Yan
The original set conf function in FDIR was very long. Refactor to
increase readability to make it clearer and allow for more convenient
further changes.

No functional change here.

Signed-off-by: Zhirun Yan 
---
 drivers/net/ice/ice_fdir_filter.c | 54 ++-
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ice/ice_fdir_filter.c 
b/drivers/net/ice/ice_fdir_filter.c
index 88c9bb03d..593dfd0e2 100644
--- a/drivers/net/ice/ice_fdir_filter.c
+++ b/drivers/net/ice/ice_fdir_filter.c
@@ -964,30 +964,10 @@ ice_fdir_input_set_parse(uint64_t inset, enum 
ice_flow_field *field)
}
 }
 
-static int
-ice_fdir_input_set_conf(struct ice_pf *pf, enum ice_fltr_ptype flow,
-   uint64_t input_set, enum ice_fdir_tunnel_type ttype)
+static void
+ice_fdir_input_set_hdrs(enum ice_fltr_ptype flow, struct ice_flow_seg_info 
*seg,
+   enum ice_fdir_tunnel_type ttype)
 {
-   struct ice_flow_seg_info *seg;
-   struct ice_flow_seg_info *seg_tun = NULL;
-   enum ice_flow_field field[ICE_FLOW_FIELD_IDX_MAX];
-   bool is_tunnel;
-   int i, ret;
-
-   if (!input_set)
-   return -EINVAL;
-
-   seg = (struct ice_flow_seg_info *)
-   ice_malloc(hw, sizeof(*seg));
-   if (!seg) {
-   PMD_DRV_LOG(ERR, "No memory can be allocated");
-   return -ENOMEM;
-   }
-
-   for (i = 0; i < ICE_FLOW_FIELD_IDX_MAX; i++)
-   field[i] = ICE_FLOW_FIELD_IDX_MAX;
-   ice_fdir_input_set_parse(input_set, field);
-
switch (flow) {
case ICE_FLTR_PTYPE_NONF_IPV4_UDP:
ICE_FLOW_SET_HDRS(seg, ICE_FLOW_SEG_HDR_UDP |
@@ -1063,6 +1043,34 @@ ice_fdir_input_set_conf(struct ice_pf *pf, enum 
ice_fltr_ptype flow,
PMD_DRV_LOG(ERR, "not supported filter type.");
break;
}
+}
+
+static int
+ice_fdir_input_set_conf(struct ice_pf *pf, enum ice_fltr_ptype flow,
+   uint64_t input_set, enum ice_fdir_tunnel_type ttype)
+{
+   struct ice_flow_seg_info *seg;
+   struct ice_flow_seg_info *seg_tun = NULL;
+   enum ice_flow_field field[ICE_FLOW_FIELD_IDX_MAX];
+   bool is_tunnel;
+   int i, ret;
+
+   if (!input_set)
+   return -EINVAL;
+
+   seg = (struct ice_flow_seg_info *)
+   ice_malloc(hw, sizeof(*seg));
+   if (!seg) {
+   PMD_DRV_LOG(ERR, "No memory can be allocated");
+   return -ENOMEM;
+   }
+
+   for (i = 0; i < ICE_FLOW_FIELD_IDX_MAX; i++)
+   field[i] = ICE_FLOW_FIELD_IDX_MAX;
+
+   ice_fdir_input_set_parse(input_set, field);
+
+   ice_fdir_input_set_hdrs(flow, seg, ttype);
 
for (i = 0; field[i] != ICE_FLOW_FIELD_IDX_MAX; i++) {
ice_flow_set_fld(seg, field[i],
-- 
2.25.1



[dpdk-dev] [PATCH v2 2/2] net/ice: merge inner/outer flow seg info for FDIR

2020-09-13 Thread Zhirun Yan
For tunnel and non-tunnel packets, it can share the same seg_tun info.
seg_tun[1] can be used for supporting inner fields with tunnel flow rule
or for non-tunnel packets, seg_tun[0] only used for tunnel outer part.
Add outer_input_set to distinguish inner/outer input set. So we can
identify different fields in outer or inner part.

Signed-off-by: Zhirun Yan 
---
 drivers/net/ice/ice_ethdev.h  |  1 +
 drivers/net/ice/ice_fdir_filter.c | 59 ---
 2 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index 393dfeab1..6bc6dfbfb 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -285,6 +285,7 @@ struct ice_fdir_filter_conf {
struct rte_flow_action_count act_count;
 
uint64_t input_set;
+   uint64_t outer_input_set; /* only for tunnel packets outer fields */
 };
 
 #define ICE_MAX_FDIR_FILTER_NUM(1024 * 16)
diff --git a/drivers/net/ice/ice_fdir_filter.c 
b/drivers/net/ice/ice_fdir_filter.c
index 593dfd0e2..9caab05e3 100644
--- a/drivers/net/ice/ice_fdir_filter.c
+++ b/drivers/net/ice/ice_fdir_filter.c
@@ -1047,51 +1047,53 @@ ice_fdir_input_set_hdrs(enum ice_fltr_ptype flow, 
struct ice_flow_seg_info *seg,
 
 static int
 ice_fdir_input_set_conf(struct ice_pf *pf, enum ice_fltr_ptype flow,
-   uint64_t input_set, enum ice_fdir_tunnel_type ttype)
+   uint64_t inner_input_set, uint64_t outer_input_set,
+   enum ice_fdir_tunnel_type ttype)
 {
struct ice_flow_seg_info *seg;
struct ice_flow_seg_info *seg_tun = NULL;
enum ice_flow_field field[ICE_FLOW_FIELD_IDX_MAX];
+   uint64_t input_set;
bool is_tunnel;
-   int i, ret;
+   int k, i, ret = 0;
 
-   if (!input_set)
+   if (!(inner_input_set | outer_input_set))
return -EINVAL;
 
-   seg = (struct ice_flow_seg_info *)
-   ice_malloc(hw, sizeof(*seg));
-   if (!seg) {
+   seg_tun = (struct ice_flow_seg_info *)
+   ice_malloc(hw, sizeof(*seg_tun) * ICE_FD_HW_SEG_MAX);
+   if (!seg_tun) {
PMD_DRV_LOG(ERR, "No memory can be allocated");
return -ENOMEM;
}
 
-   for (i = 0; i < ICE_FLOW_FIELD_IDX_MAX; i++)
-   field[i] = ICE_FLOW_FIELD_IDX_MAX;
+   /* use seg_tun[1] to record tunnel inner part or non-tunnel */
+   for (k = 0; k <= ICE_FD_HW_SEG_TUN; k++) {
+   seg = &seg_tun[k];
+   input_set = (k == ICE_FD_HW_SEG_TUN) ? inner_input_set : 
outer_input_set;
+   if (input_set == 0)
+   continue;
+
+   for (i = 0; i < ICE_FLOW_FIELD_IDX_MAX; i++)
+   field[i] = ICE_FLOW_FIELD_IDX_MAX;
 
-   ice_fdir_input_set_parse(input_set, field);
+   ice_fdir_input_set_parse(input_set, field);
 
-   ice_fdir_input_set_hdrs(flow, seg, ttype);
+   ice_fdir_input_set_hdrs(flow, seg, ttype);
 
-   for (i = 0; field[i] != ICE_FLOW_FIELD_IDX_MAX; i++) {
-   ice_flow_set_fld(seg, field[i],
-ICE_FLOW_FLD_OFF_INVAL,
-ICE_FLOW_FLD_OFF_INVAL,
-ICE_FLOW_FLD_OFF_INVAL, false);
+   for (i = 0; field[i] != ICE_FLOW_FIELD_IDX_MAX; i++) {
+   ice_flow_set_fld(seg, field[i],
+ICE_FLOW_FLD_OFF_INVAL,
+ICE_FLOW_FLD_OFF_INVAL,
+ICE_FLOW_FLD_OFF_INVAL, false);
+   }
}
 
is_tunnel = ice_fdir_is_tunnel_profile(ttype);
if (!is_tunnel) {
ret = ice_fdir_hw_tbl_conf(pf, pf->main_vsi, pf->fdir.fdir_vsi,
-  seg, flow, false);
+  seg_tun + 1, flow, false);
} else {
-   seg_tun = (struct ice_flow_seg_info *)
-   ice_malloc(hw, sizeof(*seg) * ICE_FD_HW_SEG_MAX);
-   if (!seg_tun) {
-   PMD_DRV_LOG(ERR, "No memory can be allocated");
-   rte_free(seg);
-   return -ENOMEM;
-   }
-   rte_memcpy(&seg_tun[1], seg, sizeof(*seg));
ret = ice_fdir_hw_tbl_conf(pf, pf->main_vsi, pf->fdir.fdir_vsi,
   seg_tun, flow, true);
}
@@ -1099,9 +1101,7 @@ ice_fdir_input_set_conf(struct ice_pf *pf, enum 
ice_fltr_ptype flow,
if (!ret) {
return ret;
} else if (ret < 0) {
-   rte_free(seg);
-   if (is_tunnel)
-   rte_free(seg_tun);
+   rte_free(seg_tun);
return (ret == -EEXIST) ? 0 : ret;
} else {
return ret;
@@ -1311,7 +1311,8 @

[dpdk-dev] [PATCH] pci: support both io and mmio bar for legacy virtio on x86

2020-09-13 Thread 谢华伟(此时此刻)
Signed-off-by: huawei.xhw 
---

 drivers/bus/pci/linux/pci.c |  71 --

 drivers/bus/pci/linux/pci_uio.c | 154 +++-

 2 files changed, 106 insertions(+), 119 deletions(-)



diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c

index a2198ab..619dfec 100644

--- a/drivers/bus/pci/linux/pci.c

+++ b/drivers/bus/pci/linux/pci.c

@@ -687,71 +687,6 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

}

 }



-#if defined(RTE_ARCH_X86)

-static int

-pci_ioport_map(struct rte_pci_device *dev, int bar __rte_unused,

-   struct rte_pci_ioport *p)

-{

-   uint16_t start, end;

-   FILE *fp;

-   char *line = NULL;

-   char pci_id[16];

-   int found = 0;

-   size_t linesz;

-

-   if (rte_eal_iopl_init() != 0) {

-   RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for 
PCI device %s\n",

-   __func__, dev->name);

-   return -1;

-   }

-

-   snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,

-dev->addr.domain, dev->addr.bus,

-dev->addr.devid, dev->addr.function);

-

-   fp = fopen("/proc/ioports", "r");

-   if (fp == NULL) {

-   RTE_LOG(ERR, EAL, "%s(): can't open ioports\n", __func__);

-   return -1;

-   }

-

-   while (getdelim(&line, &linesz, '\n', fp) > 0) {

-   char *ptr = line;

-   char *left;

-   int n;

-

-   n = strcspn(ptr, ":");

-   ptr[n] = 0;

-   left = &ptr[n + 1];

-

-   while (*left && isspace(*left))

-   left++;

-

-   if (!strncmp(left, pci_id, strlen(pci_id))) {

-   found = 1;

-

-   while (*ptr && isspace(*ptr))

-   ptr++;

-

-   sscanf(ptr, "%04hx-%04hx", &start, &end);

-

-   break;

-   }

-   }

-

-   free(line);

-   fclose(fp);

-

-   if (!found)

-   return -1;

-

-   p->base = start;

-   RTE_LOG(DEBUG, EAL, "PCI Port IO found start=0x%x\n", start);

-

-   return 0;

-}

-#endif

-

 int

 rte_pci_ioport_map(struct rte_pci_device *dev, int bar,

struct rte_pci_ioport *p)

@@ -766,14 +701,8 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

break;

 #endif

case RTE_KDRV_IGB_UIO:

-   ret = pci_uio_ioport_map(dev, bar, p);

-   break;

case RTE_KDRV_UIO_GENERIC:

-#if defined(RTE_ARCH_X86)

-   ret = pci_ioport_map(dev, bar, p);

-#else

ret = pci_uio_ioport_map(dev, bar, p);

-#endif

break;

default:

break;

diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c

index 097dc19..67ddbc0 100644

--- a/drivers/bus/pci/linux/pci_uio.c

+++ b/drivers/bus/pci/linux/pci_uio.c

@@ -372,12 +372,50 @@

 pci_uio_ioport_map(struct rte_pci_device *dev, int bar,

   struct rte_pci_ioport *p)

 {

+   FILE *f;

char dirname[PATH_MAX];

char filename[PATH_MAX];

+   char buf[BUFSIZ];

+   uint64_t phys_addr, end_addr, flags;

int uio_num;

-   unsigned long start;

+   unsigned long base;

+   bool iobar;

+   int i;



-   if (rte_eal_iopl_init() != 0) {

+   /* open and read addresses of the corresponding resource in sysfs */

+   snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT "/resource",

+   rte_pci_get_sysfs_path(), dev->addr.domain, dev->addr.bus,

+   dev->addr.devid, dev->addr.function);

+   f = fopen(filename, "r");

+   if (f == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot open sysfs resource: %s\n",

+   strerror(errno));

+   return -1;

+   }

+   for (i = 0; i < bar + 1; i++) {

+   if (fgets(buf, sizeof(buf), f) == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot read sysfs resource\n");

+   goto error;

+   }

+   }

+   if (pci_parse_one_sysfs_resource(buf, sizeof(buf), &phys_addr,

+   &end_addr, &flags) < 0)

+   goto error;

+

+   if (flags & IORESOURCE_IO) {

+   iobar = 1;

+   base = (unsigned long)phys_addr;

+   RTE_LOG(INFO, EAL, "%s(): iobar %08lx detected\n", __func__, 
base);

+   } else if (flags & IORESOURCE_MEM) {

+   iobar = 0;

+   base = (unsigned long)dev->mem_resource[bar].addr;

+   RTE_LOG(INFO, EAL, "%s():membar %08lx detected\n", __func__, 
base);

+   } else {

+   RTE_LOG(ERR, EAL, "%s(): unknow bar type\n", __func__);

+   goto error;

+   }

+

+

Re: [dpdk-dev] [PATCH] pci: support both io and mmio bar for legacy virtio on x86

2020-09-13 Thread 谢华伟(此时此刻)


Hi Ferruy:
Resent the patch.
This patch is to support both MMIO and PIO bar for legacy virtio. We use MMIO 
bar for legacy virtio on x86 because we create lots of virtio devices as PIO 
resource is very limited.
virtio is the only PMD which might use PIO. kernel virtio-pci driver supports 
both PIO bar and MMIO bar. This patch handles MMIO and PIO in the same way.

--
From:Chris 
Sent At:2020 Sep. 14 (Mon.) 11:19
To:ferruh.yigit ; "Wang, Zhihong" 
; "Xia, Chenbo" 
Cc:dev 
Subject:[PATCH] pci: support both io and mmio bar for legacy virtio on x86


Signed-off-by: huawei.xhw 
---

 drivers/bus/pci/linux/pci.c |  71 --

 drivers/bus/pci/linux/pci_uio.c | 154 +++-

 2 files changed, 106 insertions(+), 119 deletions(-)



diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c

index a2198ab..619dfec 100644

--- a/drivers/bus/pci/linux/pci.c

+++ b/drivers/bus/pci/linux/pci.c

@@ -687,71 +687,6 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

}

 }



-#if defined(RTE_ARCH_X86)

-static int

-pci_ioport_map(struct rte_pci_device *dev, int bar __rte_unused,

-   struct rte_pci_ioport *p)

-{

-   uint16_t start, end;

-   FILE *fp;

-   char *line = NULL;

-   char pci_id[16];

-   int found = 0;

-   size_t linesz;

-

-   if (rte_eal_iopl_init() != 0) {

-   RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for 
PCI device %s\n",

-   __func__, dev->name);

-   return -1;

-   }

-

-   snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,

-dev->addr.domain, dev->addr.bus,

-dev->addr.devid, dev->addr.function);

-

-   fp = fopen("/proc/ioports", "r");

-   if (fp == NULL) {

-   RTE_LOG(ERR, EAL, "%s(): can't open ioports\n", __func__);

-   return -1;

-   }

-

-   while (getdelim(&line, &linesz, '\n', fp) > 0) {

-   char *ptr = line;

-   char *left;

-   int n;

-

-   n = strcspn(ptr, ":");

-   ptr[n] = 0;

-   left = &ptr[n + 1];

-

-   while (*left && isspace(*left))

-   left++;

-

-   if (!strncmp(left, pci_id, strlen(pci_id))) {

-   found = 1;

-

-   while (*ptr && isspace(*ptr))

-   ptr++;

-

-   sscanf(ptr, "%04hx-%04hx", &start, &end);

-

-   break;

-   }

-   }

-

-   free(line);

-   fclose(fp);

-

-   if (!found)

-   return -1;

-

-   p->base = start;

-   RTE_LOG(DEBUG, EAL, "PCI Port IO found start=0x%x\n", start);

-

-   return 0;

-}

-#endif

-

 int

 rte_pci_ioport_map(struct rte_pci_device *dev, int bar,

struct rte_pci_ioport *p)

@@ -766,14 +701,8 @@ int rte_pci_write_config(const struct rte_pci_device 
*device,

break;

 #endif

case RTE_KDRV_IGB_UIO:

-   ret = pci_uio_ioport_map(dev, bar, p);

-   break;

case RTE_KDRV_UIO_GENERIC:

-#if defined(RTE_ARCH_X86)

-   ret = pci_ioport_map(dev, bar, p);

-#else

ret = pci_uio_ioport_map(dev, bar, p);

-#endif

break;

default:

break;

diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c

index 097dc19..67ddbc0 100644

--- a/drivers/bus/pci/linux/pci_uio.c

+++ b/drivers/bus/pci/linux/pci_uio.c

@@ -372,12 +372,50 @@

 pci_uio_ioport_map(struct rte_pci_device *dev, int bar,

   struct rte_pci_ioport *p)

 {

+   FILE *f;

char dirname[PATH_MAX];

char filename[PATH_MAX];

+   char buf[BUFSIZ];

+   uint64_t phys_addr, end_addr, flags;

int uio_num;

-   unsigned long start;

+   unsigned long base;

+   bool iobar;

+   int i;



-   if (rte_eal_iopl_init() != 0) {

+   /* open and read addresses of the corresponding resource in sysfs */

+   snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT "/resource",

+   rte_pci_get_sysfs_path(), dev->addr.domain, dev->addr.bus,

+   dev->addr.devid, dev->addr.function);

+   f = fopen(filename, "r");

+   if (f == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot open sysfs resource: %s\n",

+   strerror(errno));

+   return -1;

+   }

+   for (i = 0; i < bar + 1; i++) {

+   if (fgets(buf, sizeof(buf), f) == NULL) {

+   RTE_LOG(ERR, EAL, "Cannot read sysfs resource\n");

+   goto error;

+   }

+   }

+   if (pci_parse_one_sysfs_resource(buf, sizeof(buf), &phys_addr,

+   &end_a

Re: [dpdk-dev] [PATCH v5 2/2] gro: add VXLAN UDP GRO support

2020-09-13 Thread Jiayu Hu
Replies are inline.

BTW, you need to update the programmer guide
doc/guides/prog_guide/generic_receive_offload_lib.rst and
release note doc/guides/rel_notes/release_20_11.rst.

Thanks,
Jiayu

On Mon, Sep 14, 2020 at 10:13:44AM +0800, yang_y_yi wrote:
> Jiayu, thank you so much, please check my replies for some of your comments,
> here is incremental patch. I built it by meson this time :-)
> 
> diff --git a/lib/librte_gro/gro_vxlan_udp4.c b/lib/librte_gro/gro_vxlan_udp4.c
> index 9e9df72..1fcfaf1 100644
> --- a/lib/librte_gro/gro_vxlan_udp4.c
> +++ b/lib/librte_gro/gro_vxlan_udp4.c
> @@ -203,7 +203,7 @@
>  }
> 
>  static inline int
> -udp_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item,
> +udp4_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item,
> uint16_t frag_offset,
> uint16_t ip_dl)
>  {
> @@ -213,7 +213,7 @@
> int ret = 0;
> 
> l2_offset = pkt->outer_l2_len + pkt->outer_l3_len;
> -   cmp = udp_check_neighbor(&item->inner_item, frag_offset, ip_dl,
> +   cmp = udp4_check_neighbor(&item->inner_item, frag_offset, ip_dl,
> l2_offset);
> /* VXLAN outer IP ID is out of order, so don't touch it and
>  * don't compare it.
> @@ -379,7 +379,7 @@
> item_idx = insert_new_item(tbl, pkt, start_time,
> INVALID_ARRAY_INDEX, frag_offset,
> is_last_frag, outer_ip_id, outer_is_atomic);
> -   if (item_idx == INVALID_ARRAY_INDEX)
> +   if (unlikely(item_idx == INVALID_ARRAY_INDEX))
> return -1;
> if (insert_new_flow(tbl, &key, item_idx) ==
> INVALID_ARRAY_INDEX) {
> @@ -397,7 +397,7 @@
> cur_idx = tbl->flows[i].start_index;
> prev_idx = cur_idx;
> do {
> -   cmp = udp_check_vxlan_neighbor(&(tbl->items[cur_idx]),
> +   cmp = udp4_check_vxlan_neighbor(&(tbl->items[cur_idx]),
> frag_offset, ip_dl);
> if (cmp) {
> if (merge_two_vxlan_udp4_packets(
> @@ -436,7 +436,7 @@
> item_idx = insert_new_item(tbl, pkt, start_time,
> INVALID_ARRAY_INDEX, frag_offset,
> is_last_frag, outer_ip_id, outer_is_atomic);
> -   if (item_idx == INVALID_ARRAY_INDEX)
> +   if (unlikely(item_idx == INVALID_ARRAY_INDEX))
> return -1;
> tbl->items[item_idx].inner_item.next_pkt_idx = cur_idx;
> tbl->flows[i].start_index = item_idx;
> @@ -470,7 +470,7 @@
> ip_dl = pkt->pkt_len - hdr_len;
> diff --git a/lib/librte_gro/gro_vxlan_udp4.c b/lib/librte_gro/gro_vxlan_udp4.c
> index 9e9df72..1fcfaf1 100644
> --- a/lib/librte_gro/gro_vxlan_udp4.c
> +++ b/lib/librte_gro/gro_vxlan_udp4.c
> @@ -203,7 +203,7 @@
>  }
> 
>  static inline int
> -udp_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item,
> +udp4_check_vxlan_neighbor(struct gro_vxlan_udp4_item *item,
> uint16_t frag_offset,
> uint16_t ip_dl)
>  {
> @@ -213,7 +213,7 @@
> int ret = 0;
> 
> l2_offset = pkt->outer_l2_len + pkt->outer_l3_len;
> -   cmp = udp_check_neighbor(&item->inner_item, frag_offset, ip_dl,
> +   cmp = udp4_check_neighbor(&item->inner_item, frag_offset, ip_dl,
> l2_offset);
> /* VXLAN outer IP ID is out of order, so don't touch it and
>  * don't compare it.
> @@ -379,7 +379,7 @@
> item_idx = insert_new_item(tbl, pkt, start_time,
> INVALID_ARRAY_INDEX, frag_offset,
> is_last_frag, outer_ip_id, outer_is_atomic);
> -   if (item_idx == INVALID_ARRAY_INDEX)
> +   if (unlikely(item_idx == INVALID_ARRAY_INDEX))
> return -1;
> if (insert_new_flow(tbl, &key, item_idx) ==
> INVALID_ARRAY_INDEX) {
> @@ -397,7 +397,7 @@
> cur_idx = tbl->flows[i].start_index;
> prev_idx = cur_idx;
> do {
> -   cmp = udp_check_vxlan_neighbor(&(tbl->items[cur_idx]),
> +   cmp = udp4_check_vxlan_neighbor(&(tbl->items[cur_idx]),
> frag_offset, ip_dl);
> if (cmp) {
> if (merge_two_vxlan_udp4_packets(
> @@ -436,7 +436,7 @@
> item_idx = insert_new_item(tbl, pkt, start_time,
> INVALID_ARRAY_INDEX, frag_offset,
> is_last_frag, outer_ip_id, outer_is_atomic);
> -   if (item_idx == INVALID_ARRAY_INDEX)
> +   if (unlikely(item_idx == INVALID_ARRAY_INDEX))
> return -1;
> tbl->items[item_idx].inner_item.next_pkt

Re: [dpdk-dev] [PATCH v3 1/6] test/ring: add check to validate dequeued objects

2020-09-13 Thread Honnappa Nagarahalli


> 
> Add check in test_ring_basic_ex and test_ring_with_exact_size for single
> element enqueue and dequeue operations to validate the dequeued objects.
> 
> Signed-off-by: Feifei Wang 
> Reviewed-by: Ruifeng Wang 
> Reviewed-by: Phil Yang 
> Reviewed-by: Dharmik Thakkar 
Looks good.
Reviewed-by: Honnappa Nagarahalli 

> ---
>  app/test/test_ring.c | 135 +++
>  1 file changed, 99 insertions(+), 36 deletions(-)
> 
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c index
> 0ae97d341..6e48cc0b1 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -780,15 +780,9 @@ test_ring_basic_ex(void)
>   int ret = -1;
>   unsigned int i, j;
>   struct rte_ring *rp = NULL;
> - void *obj = NULL;
> + void **src = NULL, **cur_src = NULL, **dst = NULL, **cur_dst = NULL;
> 
>   for (i = 0; i < RTE_DIM(esize); i++) {
> - obj = test_ring_calloc(RING_SIZE, esize[i]);
> - if (obj == NULL) {
> - printf("%s: failed to alloc memory\n", __func__);
> - goto fail_test;
> - }
> -
>   rp = test_ring_create("test_ring_basic_ex", esize[i], RING_SIZE,
>   SOCKET_ID_ANY,
>   RING_F_SP_ENQ | RING_F_SC_DEQ);
> @@ -797,6 +791,23 @@ test_ring_basic_ex(void)
>   goto fail_test;
>   }
> 
> + /* alloc dummy object pointers */
> + src = test_ring_calloc(RING_SIZE, esize[i]);
> + if (src == NULL) {
> + printf("%s: failed to alloc src memory\n", __func__);
> + goto fail_test;
> + }
> + test_ring_mem_init(src, RING_SIZE, esize[i]);
> + cur_src = src;
> +
> + /* alloc some room for copied objects */
> + dst = test_ring_calloc(RING_SIZE, esize[i]);
> + if (dst == NULL) {
> + printf("%s: failed to alloc dst memory\n", __func__);
> + goto fail_test;
> + }
> + cur_dst = dst;
> +
>   if (rte_ring_lookup("test_ring_basic_ex") != rp) {
>   printf("%s: failed to find ring\n", __func__);
>   goto fail_test;
> @@ -812,8 +823,9 @@ test_ring_basic_ex(void)
>   rte_ring_free_count(rp));
> 
>   for (j = 0; j < RING_SIZE; j++) {
> - test_ring_enqueue(rp, obj, esize[i], 1,
> + test_ring_enqueue(rp, cur_src, esize[i], 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
> + cur_src = test_ring_inc_ptr(cur_src, esize[i], 1);
>   }
> 
>   if (rte_ring_full(rp) != 1) {
> @@ -823,8 +835,9 @@ test_ring_basic_ex(void)
>   }
> 
>   for (j = 0; j < RING_SIZE; j++) {
> - test_ring_dequeue(rp, obj, esize[i], 1,
> + test_ring_dequeue(rp, cur_dst, esize[i], 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
> + cur_dst = test_ring_inc_ptr(cur_dst, esize[i], 1);
>   }
> 
>   if (rte_ring_empty(rp) != 1) {
> @@ -833,52 +846,80 @@ test_ring_basic_ex(void)
>   goto fail_test;
>   }
> 
> + /* check data */
> + if (memcmp(src, dst, cur_src - src)) {
> + rte_hexdump(stdout, "src", src, cur_src - src);
> + rte_hexdump(stdout, "dst", dst, cur_dst - dst);
> + printf("data after dequeue is not the same\n");
> + goto fail_test;
> + }
> +
>   /* Following tests use the configured flags to decide
>* SP/SC or MP/MC.
>*/
> + /* reset memory of dst */
> + memset(dst, 0, RTE_PTR_DIFF(cur_src, src));
> +
> + /* reset cur_src and cur_dst */
> + cur_src = src;
> + cur_dst = dst;
> +
>   /* Covering the ring burst operation */
> - ret = test_ring_enqueue(rp, obj, esize[i], 2,
> + ret = test_ring_enqueue(rp, cur_src, esize[i], 2,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_BURST);
>   if (ret != 2) {
>   printf("%s: rte_ring_enqueue_burst fails\n", __func__);
>   goto fail_test;
>   }
> + cur_src = test_ring_inc_ptr(cur_src, esize[i], 2);
> 
> - ret = test_ring_dequeue(rp, obj, esize[i], 2,
> + ret = test_ring_dequeue(rp, cur_dst, esize[i], 2,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_BURST);
>   if (ret != 2) {
>   printf("%s: rte_ring_dequeue_burst fails\n", __func__);
>   goto fail_test;
>  

Re: [dpdk-dev] [PATCH v3 2/6] test/ring: fix wrong parameter passed to the enqueue APIs

2020-09-13 Thread Honnappa Nagarahalli


> 
> When enqueue one element to ring in the performance test, a pointer should
> be passed to rte_ring_[sp|mp]enqueue APIs, not the pointer to a table of
> void *pointers.
> 
> Fixes: a9fe152363e2 ("test/ring: add custom element size functional tests")
> Cc: honnappa.nagaraha...@arm.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Feifei Wang 
> Reviewed-by: Ruifeng Wang 
> Reviewed-by: Phil Yang 
Reviewed-by: Honnappa Nagarahalli 

> ---
>  app/test/test_ring.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/app/test/test_ring.h b/app/test/test_ring.h index
> aa6ae67ca..d4b15af7c 100644
> --- a/app/test/test_ring.h
> +++ b/app/test/test_ring.h
> @@ -50,11 +50,11 @@ test_ring_enqueue(struct rte_ring *r, void **obj, int
> esize, unsigned int n,
>   if ((esize) == -1)
>   switch (api_type) {
>   case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_SINGLE):
> - return rte_ring_enqueue(r, obj);
> + return rte_ring_enqueue(r, *obj);
>   case (TEST_RING_THREAD_SPSC | TEST_RING_ELEM_SINGLE):
> - return rte_ring_sp_enqueue(r, obj);
> + return rte_ring_sp_enqueue(r, *obj);
>   case (TEST_RING_THREAD_MPMC | TEST_RING_ELEM_SINGLE):
> - return rte_ring_mp_enqueue(r, obj);
> + return rte_ring_mp_enqueue(r, *obj);
>   case (TEST_RING_THREAD_DEF | TEST_RING_ELEM_BULK):
>   return rte_ring_enqueue_bulk(r, obj, n, NULL);
>   case (TEST_RING_THREAD_SPSC | TEST_RING_ELEM_BULK):
> --
> 2.17.1



Re: [dpdk-dev] [PATCH v3 3/6] test/ring: validate the return value of enq/deq elements

2020-09-13 Thread Honnappa Nagarahalli


> 
> Validate the return value of single element enqueue/dequeue operation in
> the test.
> 
> Suggested-by: Honnappa Nagarahalli 
> Signed-off-by: Feifei Wang 
> Reviewed-by: Phil Yang 
Reviewed-by: Honnappa Nagarahalli 

> ---
>  app/test/test_ring.c | 26 ++
>  1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c index
> 6e48cc0b1..63d44d85e 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -823,8 +823,13 @@ test_ring_basic_ex(void)
>   rte_ring_free_count(rp));
> 
>   for (j = 0; j < RING_SIZE; j++) {
> - test_ring_enqueue(rp, cur_src, esize[i], 1,
> + ret = test_ring_enqueue(rp, cur_src, esize[i], 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
> + if (ret != 0) {
> + printf("%s: rte_ring_enqueue fails\n",
> + __func__);
> + goto fail_test;
> + }
>   cur_src = test_ring_inc_ptr(cur_src, esize[i], 1);
>   }
> 
> @@ -835,8 +840,13 @@ test_ring_basic_ex(void)
>   }
> 
>   for (j = 0; j < RING_SIZE; j++) {
> - test_ring_dequeue(rp, cur_dst, esize[i], 1,
> + ret = test_ring_dequeue(rp, cur_dst, esize[i], 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
> + if (ret != 0) {
> + printf("%s: rte_ring_dequeue fails\n",
> + __func__);
> + goto fail_test;
> + }
>   cur_dst = test_ring_inc_ptr(cur_dst, esize[i], 1);
>   }
> 
> @@ -990,10 +1000,18 @@ test_ring_with_exact_size(void)
>* than the standard ring. (16 vs 15 elements)
>*/
>   for (j = 0; j < ring_sz - 1; j++) {
> - test_ring_enqueue(std_r, cur_src, esize[i], 1,
> + ret = test_ring_enqueue(std_r, cur_src, esize[i], 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
> - test_ring_enqueue(exact_sz_r, cur_src, esize[i], 1,
> + if (ret != 0) {
> + printf("%s: error, enqueue failed\n",
> __func__);
> + goto test_fail;
> + }
> + ret = test_ring_enqueue(exact_sz_r, cur_src, esize[i], 
> 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
> + if (ret != 0) {
> + printf("%s: error, enqueue failed\n",
> __func__);
> + goto test_fail;
> + }
>   cur_src = test_ring_inc_ptr(cur_src, esize[i], 1);
>   }
>   ret = test_ring_enqueue(std_r, cur_src, esize[i], 1,
> --
> 2.17.1



Re: [dpdk-dev] [PATCH v3 4/6] test/ring: fix wrong number of enq/deq elements

2020-09-13 Thread Honnappa Nagarahalli


> 
> The ring capacity is (RING_SIZE - 1), thus only (RING_SIZE - 1) number of
> elements can be enqueued into the ring.
> 
> Fixes: af75078fece3 ("first public release")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Feifei Wang 
> Reviewed-by: Ruifeng Wang 
> Reviewed-by: Phil Yang 
Looks good
Reviewed-by: Honnappa Nagarahalli 

> ---
>  app/test/test_ring.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c index
> 63d44d85e..811adc523 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -822,7 +822,7 @@ test_ring_basic_ex(void)
>   printf("%u ring entries are now free\n",
>   rte_ring_free_count(rp));
> 
> - for (j = 0; j < RING_SIZE; j++) {
> + for (j = 0; j < RING_SIZE - 1; j++) {
>   ret = test_ring_enqueue(rp, cur_src, esize[i], 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
>   if (ret != 0) {
> @@ -839,7 +839,7 @@ test_ring_basic_ex(void)
>   goto fail_test;
>   }
> 
> - for (j = 0; j < RING_SIZE; j++) {
> + for (j = 0; j < RING_SIZE - 1; j++) {
>   ret = test_ring_dequeue(rp, cur_dst, esize[i], 1,
>   TEST_RING_THREAD_DEF |
> TEST_RING_ELEM_SINGLE);
>   if (ret != 0) {
> --
> 2.17.1



Re: [dpdk-dev] [PATCH v3 5/6] test/ring: fix wrong size used in memcmp

2020-09-13 Thread Honnappa Nagarahalli


> 
> When using memcmp function to check data, the third param should be the
> size of all elements, rather than the number of the elements.
> 
> Furthermore, do code clean up by moving repeated code inside
> 'test_ring_mem_cmp' function to validate data and print information of
> enqueue/dequeue elements if validation fails.
This patch is fixing 2 things. But only the memcmp issue need to be backported 
to stable. I would prefer to split this into 2 and mark only the memcmp commit 
to be backported.

Hi David,
Do you have a different opinion here?

> 
> Fixes: a9fe152363e2 ("test/ring: add custom element size functional tests")
> Cc: honnappa.nagaraha...@arm.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Feifei Wang 
> Reviewed-by: Ruifeng Wang 
> Reviewed-by: Phil Yang 
> Reviewed-by: Dharmik Thakkar 
> ---
>  app/test/test_ring.c | 59 
>  1 file changed, 27 insertions(+), 32 deletions(-)
> 
> diff --git a/app/test/test_ring.c b/app/test/test_ring.c index
> 811adc523..4a2bd39fc 100644
> --- a/app/test/test_ring.c
> +++ b/app/test/test_ring.c
> @@ -258,6 +258,21 @@ test_ring_mem_init(void *obj, unsigned int count, int
> esize)
>   ((uint32_t *)obj)[i] = i;
>  }
> 
> +static int
> +test_ring_mem_cmp(void *src, void *dst, unsigned int size) {
> + int ret;
> +
> + ret = memcmp(src, dst, size);
> + if (ret) {
> + rte_hexdump(stdout, "src", src, size);
> + rte_hexdump(stdout, "dst", dst, size);
> + printf("data after dequeue is not the same\n");
> + }
> +
> + return ret;
> +}
> +
>  static void
>  test_ring_print_test_string(const char *istr, unsigned int api_type, int 
> esize)
> { @@ -383,7 +398,7 @@ test_ring_burst_bulk_tests1(unsigned int test_idx)
>   struct rte_ring *r;
>   void **src = NULL, **cur_src = NULL, **dst = NULL, **cur_dst = NULL;
>   int ret;
> - unsigned int i, j;
> + unsigned int i, j, temp_sz;
>   int rand;
>   const unsigned int rsz = RING_SIZE - 1;
> 
> @@ -444,7 +459,11 @@ test_ring_burst_bulk_tests1(unsigned int test_idx)
>   TEST_RING_VERIFY(rte_ring_empty(r));
> 
>   /* check data */
> - TEST_RING_VERIFY(memcmp(src, dst, rsz) == 0);
> + temp_sz = rsz * sizeof(void *);
> + if (esize[i] != -1)
> + temp_sz = rsz * esize[i];
> + TEST_RING_VERIFY(test_ring_mem_cmp(src, dst,
> + temp_sz) == 0);
>   }
> 
>   /* Free memory before test completed */ @@ -538,12 +557,8
> @@ test_ring_burst_bulk_tests2(unsigned int test_idx)
>   cur_dst = test_ring_inc_ptr(cur_dst, esize[i], MAX_BULK);
> 
>   /* check data */
> - if (memcmp(src, dst, cur_dst - dst)) {
> - rte_hexdump(stdout, "src", src, cur_src - src);
> - rte_hexdump(stdout, "dst", dst, cur_dst - dst);
> - printf("data after dequeue is not the same\n");
> + if (test_ring_mem_cmp(src, dst, RTE_PTR_DIFF(cur_dst, dst)))
>   goto fail;
> - }
> 
>   /* Free memory before test completed */
>   rte_ring_free(r);
> @@ -614,12 +629,8 @@ test_ring_burst_bulk_tests3(unsigned int test_idx)
>   }
> 
>   /* check data */
> - if (memcmp(src, dst, cur_dst - dst)) {
> - rte_hexdump(stdout, "src", src, cur_src - src);
> - rte_hexdump(stdout, "dst", dst, cur_dst - dst);
> - printf("data after dequeue is not the same\n");
> + if (test_ring_mem_cmp(src, dst, RTE_PTR_DIFF(cur_dst, dst)))
>   goto fail;
> - }
> 
>   /* Free memory before test completed */
>   rte_ring_free(r);
> @@ -747,12 +758,8 @@ test_ring_burst_bulk_tests4(unsigned int test_idx)
>   goto fail;
> 
>   /* check data */
> - if (memcmp(src, dst, cur_dst - dst)) {
> - rte_hexdump(stdout, "src", src, cur_src - src);
> - rte_hexdump(stdout, "dst", dst, cur_dst - dst);
> - printf("data after dequeue is not the same\n");
> + if (test_ring_mem_cmp(src, dst, RTE_PTR_DIFF(cur_dst, dst)))
>   goto fail;
> - }
> 
>   /* Free memory before test completed */
>   rte_ring_free(r);
> @@ -857,12 +864,8 @@ test_ring_basic_ex(void)
>   }
> 
>   /* check data */
> - if (memcmp(src, dst, cur_src - src)) {
> - rte_hexdump(stdout, "src", src, cur_src - src);
> - rte_hexdump(stdout, "dst", dst, cur_dst - dst);
> - printf("data after dequeue is not the same\n");
> +   

Re: [dpdk-dev] [PATCH 1/2] config: add Graviton2(arm64) meson configuration

2020-09-13 Thread Honnappa Nagarahalli


> 
> On 9/11/20 8:23 PM, Honnappa Nagarahalli wrote:
> >
> > +Jerin, Hemant, Dharmik
> >
> > 
> > Hi Vimal,
> > Few comments inline.
> >
> >>
> >> Add meson build configuration for Graviton2 platform with 64-bit ARM
> >> Neoverse N1 cores. This patch makes the following changes to generic
> >> Neoverse N1 config:
> >>
> >> 1. increase lcore limit to 64
> >> 2. increase memory support to 1TB
> > There will be multiple SoCs with N1 cores. All of them will have the same
> implementor ID and part number. But, they will have different values for
> these configurable parameters.
> > IMO, from usage perspective, we have 2 cases:
> > 1) Ability to build a portable binary that can run on multiple Arm
> > SoCs (for ex: BlueField, thunderx1, thunderx2, N1SDP, Graviton2 etc)
> > 2) Ability to build a binary which would run only on a SoC it was compiled
> for and provide the most optimized binary for that SoC. But, this may not be
> portable.
> >
> > For 1) we have default march.
> >
> > For 2) we do not have the capability today in meson build (at least, this is
> my understanding, please correct me if I am wrong). In this case, the user
> knows the target platform for compilation. IMO, we should add the capability
> to take the target platform as an input from the user (similar to the make
> build system) and Graviton2 can be one such target platform.
> 
> My intention was to have parameters that work for both N1SDP and
> Graviton2 rather than 2). Does the change to RTE_MAX_LCORE and
> RTE_MAX_MEM_MB make them incompatible with N1SDP?
They are not optimal for N1SDP. In the future these parameters might have to be 
changed. For ex: if there a N1 based SoC with more than 64 CPU cores.

> 
> I'm not sure if taking target platform from user is the best option here.
> Would this be specific to N1 since other platforms like thunderx differentiate
> the flags with part number?
This issue is specific to Arm CPU cores in general. So, it applies to N1 too.

> 
> >
> >> 3. remove +crc from -march as that is default when setting armv8.2
> >>
> >> For more information about Graviton2 platform, refer to:
> >> https://aws.amazon.com/ec2/graviton/
> >>
> >> Signed-off-by: Vimal Chungath 
> >> ---
> >>  config/arm/arm64_graviton2_linux_gcc | 17 +
> >>  config/arm/meson.build   | 12 +++-
> >>  2 files changed, 28 insertions(+), 1 deletion(-)  create mode 100644
> >> config/arm/arm64_graviton2_linux_gcc
> >>
> >> diff --git a/config/arm/arm64_graviton2_linux_gcc
> >> b/config/arm/arm64_graviton2_linux_gcc
> >> new file mode 100644
> >> index 0..022e06303
> >> --- /dev/null
> >> +++ b/config/arm/arm64_graviton2_linux_gcc
> >> @@ -0,0 +1,17 @@
> >> +[binaries]
> >> +c = 'aarch64-linux-gnu-gcc'
> >> +cpp = 'aarch64-linux-gnu-cpp'
> >> +ar = 'aarch64-linux-gnu-gcc-ar'
> >> +strip = 'aarch64-linux-gnu-strip'
> >> +pkgconfig = 'aarch64-linux-gnu-pkg-config'
> >> +pcap-config = ''
> >> +
> >> +[host_machine]
> >> +system = 'linux'
> >> +cpu_family = 'aarch64'
> >> +cpu = 'armv8-a'
> >> +endian = 'little'
> >> +
> >> +[properties]
> >> +implementor_id = '0x41'
> >> +implementor_pn = '0xd0c'
> >> diff --git a/config/arm/meson.build b/config/arm/meson.build index
> >> 8728051d5..64e277ebc 100644
> >> --- a/config/arm/meson.build
> >> +++ b/config/arm/meson.build
> >> @@ -86,6 +86,16 @@ flags_octeontx2_extra = [
> >>   ['RTE_ARM_FEATURE_ATOMICS', true],
> >>   ['RTE_EAL_IGB_UIO', false],
> >>   ['RTE_USE_C11_MEM_MODEL', true]]
> >> +flags_n1generic_extra = [
> >> + ['RTE_MACHINE', '"neoverse-n1"'],
> >> + ['RTE_MAX_LCORE', 64],
> >> + ['RTE_CACHE_LINE_SIZE', 64],
> >> + ['RTE_ARM_FEATURE_ATOMICS', true],
> >> + ['RTE_USE_C11_MEM_MODEL', true],
> >> + ['RTE_MAX_MEM_MB', 1048576],
> >> + ['RTE_MAX_NUMA_NODES', 1],
> >> + ['RTE_EAL_NUMA_AWARE_HUGEPAGES', false],
> >> + ['RTE_LIBRTE_VHOST_NUMA', false]]
> >>
> >>  machine_args_generic = [
> >>   ['default', ['-march=armv8-a+crc']], @@ -97,7 +107,7 @@
> >> machine_args_generic = [
> >>   ['0xd09', ['-mcpu=cortex-a73']],
> >>   ['0xd0a', ['-mcpu=cortex-a75']],
> >>   ['0xd0b', ['-mcpu=cortex-a76']],
> >> - ['0xd0c', ['-march=armv8.2-a+crc+crypto', '-mcpu=neoverse-n1'],
> >> flags_n1sdp_extra]]
> >> + ['0xd0c', ['-march=armv8.2-a+crypto', '-mcpu=neoverse-n1'],
> >> +flags_n1generic_extra]]
> >>
> >>  machine_args_cavium = [
> >>   ['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
> >> --
> >> 2.16.6
> >



Re: [dpdk-dev] [PATCH v1 0/2] refactor and clean FDIR

2020-09-13 Thread Zhang, Qi Z



> -Original Message-
> From: Yan, Zhirun 
> Sent: Monday, September 14, 2020 11:05 AM
> To: Zhang, Qi Z ; dev@dpdk.org
> Cc: Cao, Yahui ; Wang, Xiao W
> ; Su, Simei ; Guo, Junfeng
> ; Yan, Zhirun 
> Subject: [PATCH v1 0/2] refactor and clean FDIR
> 
> V2:
> Simplified code.
> 
> Zhirun Yan (2):
>   net/ice: refactor FDIR set conf function
>   net/ice: merge inner/outer flow seg info for FDIR
> 
>  drivers/net/ice/ice_ethdev.h  |  1 +
>  drivers/net/ice/ice_fdir_filter.c | 93 +--
>  2 files changed, 52 insertions(+), 42 deletions(-)
> 
> --
> 2.25.1

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi



Re: [dpdk-dev] [PATCH v6 0/2] update CPU flags for arm64 platform

2020-09-13 Thread Wei Hu (Xavier)

Hi, all

  Are there any other comments?

Thanks

Xavier

On 2020/8/19 18:56, Wei Hu (Xavier) wrote:

This series updates CPU flags for arm64 platform.

Wei Hu (Xavier) (2):
   eal/arm64: update CPU flags
   test/cpuflag: add new flags for ARM64 platform

  app/test/test_cpuflags.c | 39 
  lib/librte_eal/arm/include/rte_cpuflags_64.h | 13 +++
  lib/librte_eal/arm/rte_cpuflags.c| 13 +++
  3 files changed, 65 insertions(+)