[dpdk-dev] [PATCH v5 01/10] qede: Add maintainers
> 2016-03-31 19:15, Rasesh Mody:
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -371,6 +371,13 @@ M: Declan Doherty
>>  F: drivers/crypto/aesni_gcm/
>>  F: doc/guides/cryptodevs/aesni_gcm.rst
>>
>> +QLogic qede PMD
>> +M: Harish Patil
>> +M: Rasesh Mody
>> +M: Sony Chacko
>> +F: drivers/net/qede/
>> +F: doc/guides/nics/qede.rst
>
> Please keep the logical order by adding qede below bnx2x.
>
> This patch may be merged with documentation (and license).

Sure.

Thanks,
Harish
[dpdk-dev] Questions about reading/writing/modifying packet header.
Dear DPDK experts,

I am Ick-Sung Choi, living in South Korea.

Thank you very much for your contributions. I have learned a lot from your source code.

However, there is a lot of code and there are many algorithms that I cannot understand. Some of the code seems to be incomplete, but it works in my test case. For example, the worker assignment method using & (not %) in load balancing has not been fixed yet. There is a lot of similar code, such as in rte_distributor_process() in the distributor.

I have a few questions about reading, writing, and modifying packet headers. I know it is complex. I would really appreciate answers and some example code.

Question #1) I would like to know how I can read/write/modify TCP/UDP/ICMP/IGMP/... headers of a packet in an rte_mbuf. I would really appreciate an example. I guess it would be somewhat complex.

Question #2) The IP checksum does not include the 6th 16-bit word: ptr16[5] is missing in the example code. Is that right? (See "ip_cksum += ptr16[5];" in the following code.)

The following code reads headers and writes modified headers (as used in the DPDK source code).

    void swap_header_in_a_packet(struct rte_mbuf *buf)
    {
        struct ether_hdr *eth_hdr;
        struct ether_addr eth_src_addr, eth_dest_addr;
        struct ipv4_hdr *ip_hdr;
        uint32_t ip_src_addr, ip_dest_addr;

        // Read Ethernet header.
        eth_hdr = rte_pktmbuf_mtod(buf, struct ether_hdr *);

        // Extract MAC addresses.
        ether_addr_copy(&eth_hdr->s_addr, &eth_src_addr);
        ether_addr_copy(&eth_hdr->d_addr, &eth_dest_addr);

        // Swap MAC addresses.
        ether_addr_copy(&eth_src_addr, &eth_hdr->d_addr);
        ether_addr_copy(&eth_dest_addr, &eth_hdr->s_addr);

        // Swap IP addresses.
        ip_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(buf, unsigned char *) + sizeof(struct ether_hdr));
        ip_src_addr = (uint32_t) ip_hdr->src_addr;   // source IP address.
        ip_dest_addr = (uint32_t) ip_hdr->dst_addr;  // destination IP address.

        ip_hdr->src_addr = (uint32_t) ip_dest_addr;
        ip_hdr->dst_addr = (uint32_t) ip_src_addr;

        setup_simple_example_pkt_ip_headers((char *) ip_hdr);

        copy_buf_to_pkt(eth_hdr, sizeof(eth_hdr), buf, 0);
        copy_buf_to_pkt(ip_hdr, sizeof(ip_hdr), buf, sizeof(struct ether_hdr));
    }

    static void setup_simple_example_pkt_ip_headers(char *ip_hdr)
    {
        uint16_t *ptr16;
        uint32_t ip_cksum;
        // uint16_t pkt_len;
        struct ipv4_hdr *ip_hdr1 = (struct ipv4_hdr *) ip_hdr;

        // Initialize UDP header.
        /*
        pkt_len = (uint16_t) (pkt_data_len + sizeof(struct udp_hdr));
        udp_hdr->src_port = rte_cpu_to_be_16(UDP_SRC_PORT);
        udp_hdr->dst_port = rte_cpu_to_be_16(UDP_DST_PORT);
        udp_hdr->dgram_len = RTE_CPU_TO_BE_16(pkt_len);
        udp_hdr->dgram_cksum = 0; // No UDP checksum.
        */

        // Compute IP header checksum.
        ptr16 = (uint16_t *) ip_hdr1;
        ip_cksum = 0;
        ip_cksum += ptr16[0]; ip_cksum += ptr16[1];
        ip_cksum += ptr16[2]; ip_cksum += ptr16[3];
        ip_cksum += ptr16[4];
        ip_cksum += ptr16[5]; ? // 6th word (ptr16[5]) is missing in the example code. Is it right?
        ip_cksum += ptr16[6]; ip_cksum += ptr16[7];
        ip_cksum += ptr16[8]; ip_cksum += ptr16[9];

        // Reduce 32 bit checksum to 16 bits and complement it.
        ip_cksum = ((ip_cksum & 0xFFFF0000) >> 16) + (ip_cksum & 0x0000FFFF);
        if (ip_cksum > 65535)
            ip_cksum -= 65535;
        ip_cksum = (~ip_cksum) & 0x0000FFFF;
        if (ip_cksum == 0)
            ip_cksum = 0xFFFF;
        ip_hdr1->hdr_checksum = (uint16_t) ip_cksum;
    }

Is Ben really a coding machine (as in the presentation)? ^^

Thank you very much.

Sincerely Yours,
Ick-Sung Choi.
[dpdk-dev] Memory leak when adding/removing vhost_user ports
I assume there is a leak somewhere on adding/removing vhost_user ports, although it could also be "only" a fragmentation issue.

Reproduction is easy:
I set up a pair of nicely working OVS-DPDK connected KVM guests.
Then in a loop I
- add up to 512 more ports
- test connectivity between the two guests
- remove up to 512 ports

Depending on memory and the number of multiqueue/rxq I use, the exact point where it breaks changes slightly. But for my default setup of 4 queues and 5G of hugepages initialized by DPDK it always breaks at the sixth iteration.
Here is a link to the stack trace indicating a memory shortage (TBC):
https://launchpadlibrarian.net/253916410/apport-retrace.log

Known todos:
- I want to track it down further, and will try to come up with a looping test case that is not based on Openvswitch and might show it as well, to simplify debugging.
- In use were Openvswitch-dpdk 2.5 and DPDK 2.2; a retest with DPDK 16.04 and Openvswitch master is planned.

I will go on debugging this and let you know, but I wanted to give a heads-up to everyone.
In case this is a known issue for some of you, please let me know.

Kind Regards,
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

P.S. I think it is a DPDK issue, but I am adding Daniele on CC to represent ovs-dpdk as well.
[dpdk-dev] [PATCH v2] i40e: dereference before null check
Fix issue reported by Coverity.

Coverity ID 13302:
There may be a null pointer dereference, or else the comparison against
null is unnecessary.

In i40evf_config_vlan_pvid: All paths that lead to this null pointer
comparison already dereference the pointer earlier

Fixes: 2b12431b5369 ("i40e: add vlan stripping and insertion to VF")

Signed-off-by: Daniel Mrzyglod
---
 drivers/net/i40e/i40e_ethdev_vf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 2bce69b..2d75b96 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -538,7 +538,7 @@ i40evf_config_vlan_pvid(struct rte_eth_dev *dev,
     struct vf_cmd_info args;
     struct i40e_virtchnl_pvid_info tpid_info;

-    if (dev == NULL || info == NULL) {
+    if (info == NULL) {
         PMD_DRV_LOG(ERR, "invalid parameters");
         return I40E_ERR_PARAM;
     }
--
2.5.5
[dpdk-dev] [PATCH] i40e: dereference before null check
Fix issue reported by Coverity.

Coverity ID 13302:
There may be a null pointer dereference, or else the comparison against
null is unnecessary.

In i40evf_config_vlan_pvid: All paths that lead to this null pointer
comparison already dereference the pointer earlier

Fixes: 2b12431b5369 ("i40e: add vlan stripping and insertion to VF")

Signed-off-by: Daniel Mrzyglod
---
 drivers/net/i40e/i40e_ethdev_vf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 2bce69b..0d69322 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -533,7 +533,7 @@ static int
 i40evf_config_vlan_pvid(struct rte_eth_dev *dev,
                 struct i40e_vsi_vlan_pvid_info *info)
 {
-    struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
+    struct i40e_vf *vf = NULL;
     int err;
     struct vf_cmd_info args;
     struct i40e_virtchnl_pvid_info tpid_info;
@@ -542,6 +542,7 @@ i40evf_config_vlan_pvid(struct rte_eth_dev *dev,
         PMD_DRV_LOG(ERR, "invalid parameters");
         return I40E_ERR_PARAM;
     }
+    vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);

     memset(&tpid_info, 0, sizeof(tpid_info));
     tpid_info.vsi_id = vf->vsi_res->vsi_id;
--
2.5.5
[dpdk-dev] [PATCH] i40evf: Ignore disabled HW CRC strip for Linux PF hosts
>> Not sure this is the right way to handle it. The driver should
>> return an error rather than silently discard what the
>> application asked.
>
> I also think it should return an error with checking if the
> host is kernel driver, and crc strip is disabled in VF. Thank
> you David!

Thanks for reviewing my patch, Helin and David.

I agree that it's subtle to ignore the error and just log it. This is how ixgbevf behaves (refer to ixgbevf_dev_configure), so I figured that i40evf should behave analogously.

I'll submit a v2 of the patch that returns EINVAL and logs the failure.

Would it make sense to change ixgbevf_dev_configure as well, in a separate patch?

>> --
>> Intel Sweden AB
>> Registered Office: Isafjordsgatan 30B, 164 40 Kista, Stockholm, Sweden
>> Registration Number: 556189-6027
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>
> Please, remove this.

Noted. Will make sure to fix that for future revisions.

Thanks!
Björn
[dpdk-dev] [PATCH v5 01/10] qede: Add maintainers
2016-03-31 19:15, Rasesh Mody:
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -371,6 +371,13 @@ M: Declan Doherty
>  F: drivers/crypto/aesni_gcm/
>  F: doc/guides/cryptodevs/aesni_gcm.rst
>
> +QLogic qede PMD
> +M: Harish Patil
> +M: Rasesh Mody
> +M: Sony Chacko
> +F: drivers/net/qede/
> +F: doc/guides/nics/qede.rst

Please keep the logical order by adding qede below bnx2x.

This patch may be merged with documentation (and license).
[dpdk-dev] Questions about reading/writing/modifying packet header.
Hi Ick-Sung,
Please see inline.

On Mon, Apr 18, 2016 at 2:14 PM, ??? wrote:
> If I take an example, the worker assignment method using & (not %) in
> load balancing was not fixed yet.

If the code works, there is nothing to fix, right? ;)

> Question #1) I would like to know how can I read/write/modify
> TCP/UDP/ICMP/IGMP/... headers from packet in rte_mbuf.
> I will really appreciate if I can be given an example code. I guess it
> would be somewhat complex.

For an example please have a look at parse_ethernet() in test-pmd:
http://dpdk.org/browse/dpdk/tree/app/test-pmd/csumonly.c#n171

The example usage is in the same file:

    eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
    parse_ethernet(eth_hdr, &info);
    l3_hdr = (char *)eth_hdr + info.l2_len;

    if (info.l4_proto == IPPROTO_UDP) {
        udp_hdr = (struct udp_hdr *)((char *)l3_hdr + info.l3_len);
        udp_hdr->dst_port = ...
    }

Then you might need to recalculate the L4 checksum, so have a look at
rte_ipv4_udptcp_cksum().

> Question #2) The IP checksum does not include 6 the ptr. 6 th ptr (ptr16[5])
> is missing in the example code. Is it right?
> ( ip_cksum += ptr16[5]; in the following code.)

The code seems fine: ptr16[5] is the checksum itself. It should be zero, so we can skip it.

There is a users at dpdk.org mailing list now, so please use it for your further questions. Here is the link for your convenience:
http://dpdk.org/ml

Regards,
Andriy
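
To illustrate the answer to Question #2, here is a minimal, self-contained sketch in plain C (no DPDK headers) of the IPv4 header checksum. It assumes a 20-byte header without options and reads the header as big-endian bytes; the names ipv4_hdr_cksum(), ipv4_hdr_set_cksum() and ipv4_hdr_cksum_ok() are hypothetical helpers made up for this example, not DPDK API. Skipping word 5 (bytes 10 and 11) gives the same result as summing a zeroed checksum field, which is the point made in the reply above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR_LEN  20  /* assumption: header without options */
#define IPV4_CKSUM_OFFSET 10  /* byte offset of the checksum field (16-bit word 5) */

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Compute the header checksum, skipping the checksum word itself. */
static uint16_t ipv4_hdr_cksum(const uint8_t *hdr)
{
    uint32_t sum = 0;
    size_t i;

    assert(hdr != NULL);
    for (i = 0; i < IPV4_MIN_HDR_LEN; i += 2) {
        if (i == IPV4_CKSUM_OFFSET)
            continue; /* same effect as adding a zeroed checksum field */
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    }
    return (uint16_t)~fold16(sum);
}

/* Store the checksum into the header in network byte order. */
static void ipv4_hdr_set_cksum(uint8_t *hdr)
{
    uint16_t ck = ipv4_hdr_cksum(hdr);

    hdr[IPV4_CKSUM_OFFSET] = (uint8_t)(ck >> 8);
    hdr[IPV4_CKSUM_OFFSET + 1] = (uint8_t)(ck & 0xFFu);
}

/* A header is valid when all ten words, checksum included, sum to 0xFFFF. */
static int ipv4_hdr_cksum_ok(const uint8_t *hdr)
{
    uint32_t sum = 0;
    size_t i;

    assert(hdr != NULL);
    for (i = 0; i < IPV4_MIN_HDR_LEN; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    return fold16(sum) == 0xFFFFu;
}
```

After any header field is modified (for example, after swapping the source and destination addresses), calling ipv4_hdr_set_cksum() again recomputes the field; because the old checksum word is skipped, it does not need to be zeroed first.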
[dpdk-dev] [PATCH v5 10/10] qede: Enable PMD build
2016-03-31 19:15, Rasesh Mody:
> --- a/config/common_base
> +++ b/config/common_base
> +CONFIG_RTE_LIBRTE_QEDE_RX_COAL_US=24
> +CONFIG_RTE_LIBRTE_QEDE_TX_COAL_US=48

This looks like tuning that could be done at runtime, doesn't it?

> +CONFIG_RTE_LIBRTE_QEDE_TX_SWITCHING=y

Is it possible to enable this feature at runtime?

> +#Provides path/name of the firmware file
> +CONFIG_RTE_LIBRTE_QEDE_FW=n

Should we replace n by a string? Why not default to an empty string?
[dpdk-dev] [PATCH v5 10/10] qede: Enable PMD build
2016-03-31 19:15, Rasesh Mody:
> --- a/scripts/test-build.sh
> +++ b/scripts/test-build.sh
> @@ -116,6 +116,7 @@ config () #
>      test "$DPDK_DEP_ZLIB" != y || \
>      sed -ri 's,(BNX2X_PMD=)n,\1y,' $1/.config
>      sed -ri 's,(NFP_PMD=)n,\1y,' $1/.config
> +    sed -ri 's,(QEDE_PMD=)n,\1y,' $1/.config

As QEDE is enabled in common_base, we do not need to add it in test-build.sh.
[dpdk-dev] [PATCH] i40e: dereference before null check
> -----Original Message-----
> From: Mrzyglod, DanielX T
> Sent: Tuesday, April 19, 2016 12:49 AM
> To: Chen, Jing D; Wu, Jingjing; Zhang, Helin
> Cc: dev at dpdk.org; Mrzyglod, DanielX T
> Subject: [PATCH] i40e: dereference before null check
>
> Fix issue reported by Coverity.
> Coverity ID 13302:
> There may be a null pointer dereference, or else the comparison against null
> is unnecessary.

Does that really happen? I guess not.
If I am correct, I'd suggest just removing the check "if (dev == NULL)", as it is not needed.

/Helin

>
> In i40evf_config_vlan_pvid: All paths that lead to this null pointer
> comparison already dereference the pointer earlier
>
> Fixes: 2b12431b5369 ("i40e: add vlan stripping and insertion to VF")
>
> Signed-off-by: Daniel Mrzyglod
> ---
>  drivers/net/i40e/i40e_ethdev_vf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c
> b/drivers/net/i40e/i40e_ethdev_vf.c
> index 2bce69b..0d69322 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -533,7 +533,7 @@ static int
>  i40evf_config_vlan_pvid(struct rte_eth_dev *dev,
>                  struct i40e_vsi_vlan_pvid_info *info)
>  {
> -    struct i40e_vf *vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
> +    struct i40e_vf *vf = NULL;
>      int err;
>      struct vf_cmd_info args;
>      struct i40e_virtchnl_pvid_info tpid_info;
> @@ -542,6 +542,7 @@ i40evf_config_vlan_pvid(struct rte_eth_dev *dev,
>          PMD_DRV_LOG(ERR, "invalid parameters");
>          return I40E_ERR_PARAM;
>      }
> +    vf = I40EVF_DEV_PRIVATE_TO_VF(dev->data->dev_private);
>
>      memset(&tpid_info, 0, sizeof(tpid_info));
>      tpid_info.vsi_id = vf->vsi_res->vsi_id;
> --
> 2.5.5
[dpdk-dev] [PATCH v2] fm10k: set packet type for multi-segment packets
Hi,

> -----Original Message-----
> From: Michael Frasca [mailto:michael.frasca at oracle.com]
> Sent: Monday, April 18, 2016 8:52 PM
> To: Chen, Jing D
> Cc: dev at dpdk.org; Michael Frasca
> Subject: [PATCH v2] fm10k: set packet type for multi-segment packets
>
> When building a chain of mbufs for a multi-segment packet, the packet_type
> field resides at the end of the chain. It should be copied forward to the head
> of the list.
>
> Also, uses RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE to guard packet-type
> computation. The mbuf fields are not copied when this define is not set.
>
> Fixes: fe65e1e1ce61 ("fm10k: add vector scatter Rx")
>
> Signed-off-by: Michael Frasca

Acked-by: Jing Chen
[dpdk-dev] [PATCH v2 1/2] mempool: allow for user-owned mempool caches
Hi Lazaros,

Looks OK to me in general; a few comments below.
One more generic question: did you observe any performance impact caused by these changes?
Konstantin

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Lazaros Koromilas
> Sent: Monday, April 04, 2016 4:43 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 1/2] mempool: allow for user-owned mempool
> caches
>
> The mempool cache is only available to EAL threads as a per-lcore
> resource. Change this so that the user can create and provide their own
> cache on mempool get and put operations. This works with non-EAL threads
> too. This commit introduces the new API calls:
>
>     rte_mempool_cache_create(size, socket_id)
>     rte_mempool_cache_flush(cache, mp)
>     rte_mempool_cache_free(cache)
>     rte_mempool_default_cache(mp, lcore_id)
>     rte_mempool_generic_put(mp, obj_table, n, cache, is_mp)
>     rte_mempool_generic_get(mp, obj_table, n, cache, is_mc)
>
> Removes the API calls:
>
>     rte_mempool_sp_put_bulk(mp, obj_table, n)
>     rte_mempool_sc_get_bulk(mp, obj_table, n)
>     rte_mempool_sp_put(mp, obj)
>     rte_mempool_sc_get(mp, obj)

Hmm, shouldn't we deprecate these for a release first, before removing them completely?
Let's say for now you can just make them macros that call the remaining functions, or so.

>
> And the remaining API calls use the per-lcore default local cache:
>
>     rte_mempool_put_bulk(mp, obj_table, n)
>     rte_mempool_get_bulk(mp, obj_table, n)
>     rte_mempool_put(mp, obj)
>     rte_mempool_get(mp, obj)
>
> Signed-off-by: Lazaros Koromilas
> ---
>  app/test/test_mempool.c                |  58 +--
>  app/test/test_mempool_perf.c           |  46 +-
>  lib/librte_eal/common/eal_common_log.c |   8 +-
>  lib/librte_mempool/rte_mempool.c       |  76 -
>  lib/librte_mempool/rte_mempool.h       | 291 +
>  5 files changed, 275 insertions(+), 204 deletions(-)
>
>
> diff --git a/lib/librte_mempool/rte_mempool.c
> b/lib/librte_mempool/rte_mempool.c
> index 73ca770..4d977c1 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -375,6 +375,63 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
> size_t elt_sz,
>      return usz;
>  }
>
> +static void
> +mempool_cache_init(struct rte_mempool_cache *cache, uint32_t size)
> +{
> +    cache->size = size;
> +    cache->flushthresh = CALC_CACHE_FLUSHTHRESH(size);
> +    cache->len = 0;
> +}
> +
> +/*
> + * Create and initialize a cache for objects that are retrieved from and
> + * returned to an underlying mempool. This structure is identical to the
> + * local_cache[lcore_id] pointed to by the mempool structure.
> + */
> +struct rte_mempool_cache *
> +rte_mempool_cache_create(uint32_t size, int socket_id)
> +{
> +    struct rte_mempool_cache *cache;
> +
> +    if (size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> +        rte_errno = EINVAL;
> +        return NULL;
> +    }
> +
> +    cache = rte_zmalloc_socket("MEMPOOL_CACHE", sizeof(*cache),
> +                   RTE_CACHE_LINE_SIZE, socket_id);
> +    if (cache == NULL) {
> +        RTE_LOG(ERR, MEMPOOL, "Cannot allocate mempool cache!\n");
> +        rte_errno = ENOMEM;
> +        return NULL;
> +    }
> +
> +    mempool_cache_init(cache, size);
> +
> +    return cache;
> +}
> +
> +/*
> + * Free a cache. It's the responsibility of the user to make sure that any
> + * remaining objects in the cache are flushed to the corresponding
> + * mempool.
> + */
> +void
> +rte_mempool_cache_free(struct rte_mempool_cache *cache)
> +{
> +    rte_free(cache);
> +}
> +
> +/*
> + * Put all objects in the cache to the specified mempool's ring.
> + */
> +void
> +rte_mempool_cache_flush(struct rte_mempool_cache *cache,
> +            struct rte_mempool *mp)
> +{
> +    rte_ring_enqueue_bulk(mp->ring, cache->objs, cache->len);

Shouldn't you also reset cache->len here?
cache->len = 0;

Another thought: maybe this function deserves to be an inline one.

> +}
> +
>  #ifndef RTE_LIBRTE_XEN_DOM0
>  /* stub if DOM0 support not configured */
>  struct rte_mempool *
> @@ -448,6 +505,7 @@ rte_mempool_xmem_create(const char *name, unsigned n,
> unsigned elt_size,
>      struct rte_mempool_objsz objsz;
>      void *startaddr;
>      int page_size = getpagesize();
> +    unsigned lcore_id;
>
>      /* compilation-time checks */
>      RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
> @@ -583,8 +641,8 @@ rte_mempool_xmem_create(const char *name, unsigned n,
> unsigned elt_size,
>      mp->elt_size = objsz.elt_size;
>      mp->header_size = objsz.header_size;
>      mp->trailer_size = objsz.trailer_size;
> +    /* Size of default caches, zero means disabled. */
>      mp->cache_size = cache_size;
> -    mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
>      mp->private_data_size = private_data_size;
>
>      /*
> @@ -594,6
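
The review comment about resetting cache->len in the flush function can be illustrated with a small self-contained sketch. This is a toy model in plain C, not the mempool code from the patch: struct toy_pool, struct toy_cache, toy_pool_enqueue_bulk() and toy_cache_flush() are hypothetical names, and the fixed array sizes are arbitrary assumptions. It only shows why the length has to be cleared once the objects have been handed back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CACHE_MAX 8   /* arbitrary sizes for the sketch */
#define TOY_POOL_MAX  32

/* Toy stand-in for the mempool's backing ring. */
struct toy_pool {
    void *objs[TOY_POOL_MAX];
    unsigned len;
};

/* Toy stand-in for a user-owned cache. */
struct toy_cache {
    void *objs[TOY_CACHE_MAX];
    unsigned len;
};

/* Append n objects to the pool; fail without side effects if they do not fit. */
static int toy_pool_enqueue_bulk(struct toy_pool *p, void * const *objs, unsigned n)
{
    if (n > TOY_POOL_MAX - p->len)
        return -1;
    memcpy(&p->objs[p->len], objs, n * sizeof(*objs));
    p->len += n;
    return 0;
}

/*
 * Return every cached object to the pool. Without the "c->len = 0" line,
 * a second flush (or a later get from the cache) would hand out the same
 * objects again, so they would end up owned by two parties.
 */
static int toy_cache_flush(struct toy_cache *c, struct toy_pool *p)
{
    assert(c != NULL && p != NULL);
    if (toy_pool_enqueue_bulk(p, c->objs, c->len) != 0)
        return -1;
    c->len = 0;
    return 0;
}
```

With the reset in place, flushing an already empty cache is a no-op, which also makes it safe to flush right before freeing the cache.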
[dpdk-dev] [Announce] A new tree for vhost/virtio
Hi,

Here I'd like to announce a new tree for vhost/virtio[0], and I'm going to be the maintainer.

[0]: http://dpdk.org/browse/next/dpdk-next-virtio/

This was done through a private request to Thomas a few days ago (well, I confess this should have been a public request/discussion; you can find it at the end of this email). The goal is to merge patches a bit faster, especially fix and cleanup patches. Note that it still takes time to merge feature patches, even those from myself.

Another note is that this tree is meant to be rebaseable. You are encouraged to base virtio/vhost patches on this tree, to reduce the chance of patch conflicts.

@Tetsuya, do you mind if I take over patches for vhost-pmd as well?

Thanks.

    --yliu

----- Forwarded message from Yuanhan Liu -----

Date: Wed, 13 Apr 2016 01:53:41 +0800
From: Yuanhan Liu
To: Thomas Monjalon
Cc: Bruce Richardson, "Zhu, Heqing", Yuanhan Liu
Subject: A request to take over vhost/virtio patches (was Re: [dpdk-dev] dpdk: vhost/virtio staging/testing tree)
User-Agent: Mutt/1.5.23 (2014-03-12)

Hi Thomas,

Due to the nature of vhost/virtio (being an open standard), we have more external contributors (not from Intel) than other components. I just wrote a script to count that (the number is the count of contributors not from Intel, and the data is taken from this release only):

    vhost|virtio: 10
    doc: 10
    ethdev: 9
    eal: 9
    mlx: 8
    mk: 7
    mlx5: 6
    app/test: 6
    examples: 5
    config: 5
    tools: 4
    mlx4: 4
    eal/linux: 4
    bonding: 4
    vmxnet3: 3
    nfp: 3
    mbuf: 3
    lpm: 3
    ixgbe: 3
    igb: 3
    i40e: 3

As you can see, vhost/virtio is at the top of the list, which is a great thing: it means we have a healthy community. We have done well to achieve that. However, I think we could do better: be more active in reviewing and merging patches, and try to solve some of the problems I described in my email below.

Therefore, I'd like to request again to take over all vhost/virtio patches. In other words, I'd like to maintain another tree, like Bruce does for the dpdk-next-net tree, and to apply patches in time.

And now I'd like to introduce myself a bit; hopefully this will convince you that I'm competent for the committer role, though you might already know that from my recent work :)

I have been working on open source projects since 2009, so by now I have about 7 years of experience in open source. My first project was Syslinux; later on I worked on a few more projects, including the Linux kernel, Mesa and so on.

Therefore, I'm sure that my experience in open source makes me capable of the new role.

Thanks.

    --yliu

On Tue, Feb 16, 2016 at 12:02:42PM +0800, Yuanhan Liu wrote:
> On Fri, Feb 12, 2016 at 01:54:21PM +0200, Victor Kaplansky wrote:
> > Hi!
>
> Hi Victor,
>
> > Since I was maintaining an internal tree with patches related to
> > vhost/virtio, I decided to make this staging tree public. It is
> > useful to me and I hope it will be useful to others.
> >
> > Collecting patches like this allows tracking dependencies between
> > patches, their improvement etc. I also rebase the tree so
> > contributors don't have to.
>
> I had same thoughts, before, aiming to speed the patch review and
> merge process.
>
> DPDK community, likely, has a culture of very slow patch review and
> merge process: I often saw patches not get reviewed for weeks, even
> months! I also saw that a patch has been ACK-ed, but not get merged
> until few weeks has been passed. While I am inside the team, I
> understand it's a very reasonable phenomenon: every one of us has
> lots of tasks to do, and we intend to do the review after all tasks
> have been finished.
>
> Despite the fact, I was thinking that I could maintain a tree, so
> that I could apply all virtio/vhost patches that has been ACKed in
> the first time. Later, I will send pull request to Thomas, from
> time to time. Thomas, on the other hand, only need to have a double
> check of the patches from my request. If he has any concerns on
> some specific patch (or patch set), I will drop them, and let the
> author to send a new version.
>
> Put simply, it's a similar style Linux kernel (and QEMU) takes.
>
> Another thing worthy noting is that Bruce started to maintain
> a such tree recently:
>
>     http://dpdk.org/browse/next/dpdk-next-net/
>
> So, as long as Bruce merges patches quickly, it should not matter.
>
> > Before publishing, I test the tree so it can serve as a known
> > good state for people interested in preliminary testing of
> > patches that aren't yet upstream, improving testing/validation as
> > multiple people can test the same code.
>
> I was thinking to build a very rough and simple test bot to
> achieve that; however, no time for that.
>
>     --yliu

----- End forwarded message -----
[dpdk-dev] Memory leak when adding/removing vhost_user ports
On Mon, Apr 18, 2016 at 10:46:50AM -0700, Yuanhan Liu wrote:
> On Mon, Apr 18, 2016 at 07:18:05PM +0200, Christian Ehrhardt wrote:
> > I assume there is a leak somewhere on adding/removing vhost_user ports.
> > Although it could also be "only" a fragmentation issue.
> >
> > Reproduction is easy:
> > I set up a pair of nicely working OVS-DPDK connected KVM Guests.
> > Then in a loop I
> >    - add up to more 512 ports
> >    - test connectivity between the two guests
> >    - remove up to 512 ports
> >
> > Depending on memory and the amount of multiqueue/rxq I use it seems to
> > slightly change when exactly it breaks. But for my default setup of 4
> > queues and 5G Hugepages initialized by DPDK it always breaks at the sixth
> > iteration.
> > Here a link to the stack trace indicating a memory shortage (TBC):
> > https://launchpadlibrarian.net/253916410/apport-retrace.log
> >
> > Known Todos:
> > - I want to track it down more, and will try to come up with a non
> > openvswitch based looping testcase that might show it as well to simplify
> > debugging.
> > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and
> > Openvswitch master is planned.
> >
> > I will go on debugging this and let you know, but I wanted to give a heads
> > up to everyone.
>
> Thanks for the report.
>
> > In case this is a known issue for some of you please let me know.
>
> Yeah, it might be. I'm wondering that virtio_net struct is not freed.
> It will be freed only (if I'm not mistaken) when guest quits, by far.

Would you try the following diff to see if it fixes your issue?

    --yliu

---
 lib/librte_vhost/vhost_user/vhost-net-user.c | 6 ++
 lib/librte_vhost/vhost_user/vhost-net-user.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index df2bd64..8f7ebd7 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -309,6 +309,7 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused int *remove)
     }

     vdev_ctx.fh = fh;
+    vserver->fh = fh;
     size = strnlen(vserver->path, PATH_MAX);
     vhost_set_ifname(vdev_ctx, vserver->path,
         size);
@@ -501,6 +502,11 @@ rte_vhost_driver_unregister(const char *path)

     for (i = 0; i < g_vhost_server.vserver_cnt; i++) {
         if (!strcmp(g_vhost_server.server[i]->path, path)) {
+            struct vhost_device_ctx ctx;
+
+            ctx.fh = g_vhost_server.server[i]->fh;
+            vhost_destroy_device(ctx);
+
             fdset_del(&g_vhost_server.fdset,
                 g_vhost_server.server[i]->listenfd);

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.h b/lib/librte_vhost/vhost_user/vhost-net-user.h
index e3bb413..7cf21db 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.h
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.h
@@ -43,6 +43,7 @@
 struct vhost_server {
     char *path; /**< The path the uds is bind to. */
     int listenfd;     /**< The listener sockfd. */
+    uint32_t fh;
 };

 /* refer to hw/virtio/vhost-user.c */
--
1.9.3
[dpdk-dev] ethtool doesnt work on some interface after unbinding dpdk
On 15/04/2016 23:56, Gopakumar Choorakkot Edakkunni wrote:
> This time the problem statement is more narrowed down.
>
> 1. dpdk is enabled on the interface, interfaces bound to igb_uio
> 3. kill the process using dpdk
> 3. rmmod rte_kni
> 4. rmmod igb_uio
> 5. bind interface to igb
> 6. ethtool, ifconfig up/down etc.. works for approximately 30 seconds,
> and then stops working

Hmm.. can you try that but with rte_kni left out completely? KNI hooks into the Linux network stack, and I think it at least needs eliminating as a causal factor. Can you also try using uio_pci_generic rather than igb_uio?

Those aside, I suspect driver issues, so I am seeing if I can get one of the driver test guys to have a look at this.

Regards,

..Remy
[dpdk-dev] Memory leak when adding/removing vhost_user ports
On Mon, Apr 18, 2016 at 07:18:05PM +0200, Christian Ehrhardt wrote:
> I assume there is a leak somewhere on adding/removing vhost_user ports.
> Although it could also be "only" a fragmentation issue.
>
> Reproduction is easy:
> I set up a pair of nicely working OVS-DPDK connected KVM Guests.
> Then in a loop I
>    - add up to more 512 ports
>    - test connectivity between the two guests
>    - remove up to 512 ports
>
> Depending on memory and the amount of multiqueue/rxq I use it seems to
> slightly change when exactly it breaks. But for my default setup of 4
> queues and 5G Hugepages initialized by DPDK it always breaks at the sixth
> iteration.
> Here a link to the stack trace indicating a memory shortage (TBC):
> https://launchpadlibrarian.net/253916410/apport-retrace.log
>
> Known Todos:
> - I want to track it down more, and will try to come up with a non
> openvswitch based looping testcase that might show it as well to simplify
> debugging.
> - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and
> Openvswitch master is planned.
>
> I will go on debugging this and let you know, but I wanted to give a heads
> up to everyone.

Thanks for the report.

> In case this is a known issue for some of you please let me know.

Yeah, it might be. I suspect that the virtio_net struct is not freed. So far it is freed only (if I'm not mistaken) when the guest quits.

BTW, could you dump the ovs-dpdk log?

    --yliu
[dpdk-dev] About bond api lacp problem.
Hi,
Basically, you have to make sure you call rte_eth_tx_burst() at least once every 100 ms in your forwarding loop. Here is such an example:

    const uint64_t bond_tx_cycles = (rte_get_timer_hz() + MS_PER_S - 1) * 100 / MS_PER_S;
    uint64_t cur_bond_cycles, diff_cycles;
    uint64_t last_bond_tx_cycles = 0;

    /* Inside your forwarding loop: */
    cur_bond_cycles = rte_get_timer_cycles();

    diff_cycles = cur_bond_cycles - last_bond_tx_cycles;
    if (diff_cycles > bond_tx_cycles) {
        last_bond_tx_cycles = cur_bond_cycles;
        rte_eth_tx_burst(bond_port_id, 0, NULL, 0);
    }

There is a users at dpdk.org mailing list; please address such questions there.

Regards,
Andriy

On Sat, Apr 16, 2016 at 11:41 AM, yangbo wrote:
> Hi,
>
> How to understand bond api comments:
>
> for LACP mode to work the rx/tx burst functions must be invoked at least once
> every 100ms, otherwise the out-of-band LACP messages will not be handled with
> the expected latency and this may cause the link status to be incorrectly
> marked as down or failure to correctly negotiate with peers.
>
>
> can any one give me example or more detail info ?
>
> I am extremely grateful for it.

--
Andriy Berestovskyy
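
The timing logic in the reply above can be tried outside DPDK with a small self-contained sketch. Everything here is an assumption made for illustration: struct periodic, periodic_init() and periodic_due() are hypothetical names, the timer frequency and the current cycle count are passed in by the caller instead of being read from rte_get_timer_hz() and rte_get_timer_cycles(), and the period is computed by rounding the cycles per millisecond up first and then multiplying, which differs slightly from the expression in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MS_PER_S 1000u

/* State for an action that must run at least once per period. */
struct periodic {
    uint64_t period_cycles; /* period expressed in timer cycles */
    uint64_t last_cycles;   /* cycle count when the action last ran */
};

/* Convert a period in milliseconds to timer cycles (cycles per ms rounded up). */
static void periodic_init(struct periodic *p, uint64_t timer_hz, uint64_t period_ms)
{
    assert(p != NULL);
    p->period_cycles = (timer_hz + MS_PER_S - 1) / MS_PER_S * period_ms;
    p->last_cycles = 0;
}

/*
 * Return 1 when more than one period has elapsed since the last trigger,
 * and remember the new trigger time. In a DPDK forwarding loop, now_cycles
 * would come from rte_get_timer_cycles(), and a return value of 1 would be
 * followed by the empty rte_eth_tx_burst(bond_port_id, 0, NULL, 0) call
 * shown in the message above.
 */
static int periodic_due(struct periodic *p, uint64_t now_cycles)
{
    assert(p != NULL);
    if (now_cycles - p->last_cycles > p->period_cycles) {
        p->last_cycles = now_cycles;
        return 1;
    }
    return 0;
}
```

Because the check only fires after the period has already elapsed, the action runs slightly later than every 100 ms; a caller that needs a hard 100 ms bound might initialize the helper with a shorter period.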
[dpdk-dev] [PATCH v2] fm10k: set packet type for multi-segment packets
When building a chain of mbufs for a multi-segment packet, the packet_type
field resides at the end of the chain. It should be copied forward to the
head of the list.

Also, uses RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE to guard packet-type
computation. The mbuf fields are not copied when this define is not set.

Fixes: fe65e1e1ce61 ("fm10k: add vector scatter Rx")

Signed-off-by: Michael Frasca
---
v2:
- Only copy hash, ol_flags, and packet_type to 'start' when
  RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE is defined.

 drivers/net/fm10k/fm10k_rxtx_vec.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c
index f8efe8f..03e4a5c 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -606,8 +606,11 @@ fm10k_reassemble_packets(struct fm10k_rx_queue *rxq,

             if (!split_flags[buf_idx]) {
                 /* it's the last packet of the set */
+#ifdef RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE
                 start->hash = end->hash;
                 start->ol_flags = end->ol_flags;
+                start->packet_type = end->packet_type;
+#endif
                 pkts[pkt_idx++] = start;
                 start = end = NULL;
             }
--
2.5.0
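
As a rough illustration of what the patch does, here is a self-contained sketch with a toy segment type. It is not the fm10k driver code: struct toy_seg, toy_finish_packet() and the TOY_RX_OLFLAGS_ENABLE macro are hypothetical stand-ins for rte_mbuf, the tail of the reassembly loop, and RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE. It shows the metadata parsed into the last segment being copied forward to the head only when the guard macro is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE. */
#define TOY_RX_OLFLAGS_ENABLE 1

/* Toy segment holding only the fields discussed in the patch. */
struct toy_seg {
    uint32_t hash;
    uint64_t ol_flags;
    uint32_t packet_type;
    struct toy_seg *next;
};

/*
 * Called when `end` is the last segment of the packet that begins at
 * `start`. Applications inspect the head of the chain, so the metadata
 * that ended up in the tail is copied forward. When the guard macro is
 * not defined the fields were never computed, so nothing is copied.
 */
static void toy_finish_packet(struct toy_seg *start, const struct toy_seg *end)
{
    assert(start != NULL && end != NULL);
#ifdef TOY_RX_OLFLAGS_ENABLE
    start->hash = end->hash;
    start->ol_flags = end->ol_flags;
    start->packet_type = end->packet_type;
#endif
}
```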
[dpdk-dev] [PATCH] fm10k: set packet type for multi-segment packets
Hi, Frasca,

> -----Original Message-----
> From: Michael Frasca [mailto:michael.frasca at oracle.com]
> Sent: Friday, April 15, 2016 3:32 AM
> To: Chen, Jing D
> Cc: dev at dpdk.org; Michael Frasca
> Subject: [PATCH] fm10k: set packet type for multi-segment packets
>
> When building a chain of mbufs for a multi-segment packet, the
> packet_type field resides at the end of the chain. It should be
> copied forward to the head of the list.
>
> Fixes: fe65e1e1ce61 ("fm10k: add vector scatter Rx")
>
> Signed-off-by: Michael Frasca
> ---
>  drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c
> b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index f8efe8f..66f126f 100644
> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> @@ -608,6 +608,7 @@ fm10k_reassemble_packets(struct fm10k_rx_queue
> *rxq,
>              /* it's the last packet of the set */
>              start->hash = end->hash;
>              start->ol_flags = end->ol_flags;
> +            start->packet_type = end->packet_type;
>              pkts[pkt_idx++] = start;
>              start = end = NULL;
>          }
> --
> 2.5.0

Good catch. Just one comment. We only parse the packet type when "RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE" is set. Can we add this macro around your change? The same goes for "hash" and "ol_flags".

Best Regards,
Mark
[dpdk-dev] [PATCH] fm10k: set packet type for multi-segment packets
Hi Mark,

Not a problem. I'll post a v2 change with a check for RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE.

Thanks,
Michael

> On Apr 18, 2016, at 4:40 AM, Chen, Jing D wrote:
>
> Hi, Frasca,
>
>> -----Original Message-----
>> From: Michael Frasca [mailto:michael.frasca at oracle.com]
>> Sent: Friday, April 15, 2016 3:32 AM
>> To: Chen, Jing D
>> Cc: dev at dpdk.org; Michael Frasca
>> Subject: [PATCH] fm10k: set packet type for multi-segment packets
>>
>> When building a chain of mbufs for a multi-segment packet, the
>> packet_type field resides at the end of the chain. It should be
>> copied forward to the head of the list.
>>
>> Fixes: fe65e1e1ce61 ("fm10k: add vector scatter Rx")
>>
>> Signed-off-by: Michael Frasca
>> ---
>> drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c
>> index f8efe8f..66f126f 100644
>> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
>> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
>> @@ -608,6 +608,7 @@ fm10k_reassemble_packets(struct fm10k_rx_queue *rxq,
>>  /* it's the last packet of the set */
>>  start->hash = end->hash;
>>  start->ol_flags = end->ol_flags;
>> + start->packet_type = end->packet_type;
>>  pkts[pkt_idx++] = start;
>>  start = end = NULL;
>>  }
>> --
>> 2.5.0
>
> Good catch. Just one comment. We only parse the packet type when
> "RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE" is defined. Can we add this macro
> around your change? The same goes for "hash" and "ol_flags".
>
> Best Regards,
> Mark
[dpdk-dev] compile error on ubuntu 14.4.4 kernel 4.2.0-27-generic in qemu
It is hard for me to spot what exactly is missing, but although it was never intended for trusty, the version we have for xenial should get you going.

# all kinds of dependencies
sudo apt-get install build-essential ubuntu-dev-tools debhelper dh-python dh-systemd doxygen graphviz inkscape libcap-dev libpcap-dev libxen-dev libxenstore3.0 python python-sphinx texlive-fonts-recommended texlive-latex-extra
# get the version we currently package for xenial
pull-lp-source dpdk xenial
# run the packaged build which takes care of most enabling/disabling and such
cd dpdk-2.2.0
./debian/rules build

Also, you need at least some SSE (the level depends on your config), which isn't in the guest by default. Take a look at this, although I'm not 100% sure how much of it works back on trusty.
https://help.ubuntu.com/16.04/serverguide/DPDK.html#dpdk-in-guest

From there you can derive whatever you need. Or just take the build process as the starting point for your own.

On the other hand, I encourage you to go to 16.04 and just use the packaged version if that would suit your needs.

Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

On Sun, Apr 17, 2016 at 8:41 PM, Masaru OKI wrote:
> On 2016/04/18 3:27, Sharath wrote:
>
>> I am facing a few compile errors while compiling dpdk. The env is ubuntu
>> running as a VM, VM is started by qemu. How do I fix the compile errors?
>
> Default qemu virtual cpu does not support SSE4.2.
> Try qemu -cpu help, and specify a suitable CPU model.
[dpdk-dev] [PATCH] cfgfile: fix return value comment
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Dmitriy Yakovlev
> Sent: Friday, April 15, 2016 11:59 PM
> To: dev at dpdk.org
> Cc: Dmitriy Yakovlev
> Subject: [dpdk-dev] [PATCH] cfgfile: fix return value comment
>
> Function rte_cfgfile_load can return NULL value, when something goes
> wrong.
>
> Signed-off-by: Dmitriy Yakovlev
> ---
> lib/librte_cfgfile/rte_cfgfile.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/librte_cfgfile/rte_cfgfile.h b/lib/librte_cfgfile/rte_cfgfile.h
> index 834f828..f649836 100644
> --- a/lib/librte_cfgfile/rte_cfgfile.h
> +++ b/lib/librte_cfgfile/rte_cfgfile.h
> @@ -72,7 +72,7 @@ struct rte_cfgfile_entry {
>  * @param flags
>  * Config file flags, Reserved for future use. Must be set to 0.
>  * @return
> -* Handle to configuration file
> +* Handle to configuration file on success, NULL otherwise
>  */
>  struct rte_cfgfile *rte_cfgfile_load(const char *filename, int flags);
>
> --
> 2.6.2.windows.1

Acked-by: Cristian Dumitrescu
[dpdk-dev] compile error on ubuntu 14.4.4 kernel 4.2.0-27-generic in qemu
On 2016/04/18 3:27, Sharath wrote:
> I am facing a few compile errors while compiling dpdk. The env is ubuntu
> running as a VM, VM is started by qemu. How do I fix the compile errors?

Default qemu virtual cpu does not support SSE4.2.
Try qemu -cpu help, and specify a suitable CPU model.
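Whether the guest CPU actually exposes the needed instruction sets can be checked from inside the VM before building. The following is a rough sketch only: `sketch_has_sse()` is a hypothetical helper name, and it relies on the GCC/Clang builtins `__builtin_cpu_init()` and `__builtin_cpu_supports()`, which are only available on x86 targets. It checks SSSE3 (the instruction set that provides `_mm_alignr_epi8`) and SSE4.2 (the level mentioned above); which level a given DPDK configuration really needs is an assumption that depends on the build config.

```c
/*
 * Hypothetical helper: returns 1 when the CPU the program runs on
 * reports both SSSE3 and SSE4.2, and 0 otherwise (including on
 * non-x86 targets or compilers without the builtins).
 */
static int
sketch_has_sse(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
	(defined(__GNUC__) || defined(__clang__))
	/* Initialise the CPU feature data used by __builtin_cpu_supports(). */
	__builtin_cpu_init();
	return __builtin_cpu_supports("ssse3") &&
	       __builtin_cpu_supports("sse4.2");
#else
	return 0;
#endif
}
```

If this returns 0 inside the guest, the CPU model passed to qemu is the first thing to look at.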
[dpdk-dev] [PATCH] ixgbe: fix bad shift operation in ixgbe_set_pool_tx
Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tomasz Kulasek
> Sent: Friday, April 15, 2016 10:33 PM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin
> Subject: [dpdk-dev] [PATCH] ixgbe: fix bad shift operation in ixgbe_set_pool_tx
>
> CID 13190 (#1 of 1): Bad bit shift operation (BAD_SHIFT)
> large_shift: In expression 1 << pool, left shifting by more than 31 bits has
> undefined behavior. The shift amount, pool, is at least 32.
>
> This patch limits mask shift to be in range of 32 bit PFVFTE[1] register, for
> pool > 31.
>
> Fixes: fe3a45fd4104 ("ixgbe: add VMDq support")
>
> Signed-off-by: Tomasz Kulasek

Acked-by: Wenzhuo Lu
[dpdk-dev] [PATCH] ixgbe: fix bad shift operation in ixgbe_set_pool_rx
Hi Tomasz,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tomasz Kulasek
> Sent: Friday, April 15, 2016 9:39 PM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin
> Subject: [dpdk-dev] [PATCH] ixgbe: fix bad shift operation in ixgbe_set_pool_rx
>
> CID 13193 (#1 of 1): Bad bit shift operation (BAD_SHIFT)
> large_shift: In expression 1 << pool, left shifting by more than 31 bits has
> undefined behavior. The shift amount, pool, is at least 32.
>
> This patch limits mask shift to be in range of 32 bit PFVFRE[1] register, for
> pool > 31.
>
> Fixes: fe3a45fd4104 ("ixgbe: add VMDq support")
>
> Signed-off-by: Tomasz Kulasek

Acked-by: Wenzhuo Lu
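The defect class reported by the checker in both ixgbe patches can be illustrated on its own. This is a minimal sketch, not the ixgbe code: `sketch_pool_mask()` is a hypothetical helper, and the layout it assumes (64 pools spread over two 32-bit registers, index 0 for pools 0..31 and index 1 for pools 32..63) is inferred from the mention of the `[1]` register for pool > 31. The point is that the shift amount is reduced modulo 32, so it can never reach the width of the type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a pool index (assumed range 0..63) to the
 * index of the 32-bit register that holds its bit, and return the bit
 * mask within that register.
 */
static uint32_t
sketch_pool_mask(uint16_t pool, unsigned int *reg_idx)
{
	assert(pool < 64 && reg_idx != NULL);
	*reg_idx = pool / 32;              /* 0 for pools 0..31, 1 for pools 32..63 */
	return UINT32_C(1) << (pool % 32); /* shift amount always stays below 32 */
}
```

With this split, pool 32 maps to bit 0 of register 1 instead of an undefined `1 << 32`.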
[dpdk-dev] [PATCH] xenvirt: support dynamic page size
Fix the build failure caused by PAGE_SIZE not being defined on ARM. Multiple page sizes are possible there, so the page size in use has to be queried dynamically.

Signed-off-by: Ricardo Salveti
---
 drivers/net/xenvirt/rte_eth_xenvirt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.c b/drivers/net/xenvirt/rte_eth_xenvirt.c
index b9638d9..afc0193 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.c
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.c
@@ -39,6 +39,9 @@
 #include
 #include
 #include
+#ifndef PAGE_SIZE
+#define PAGE_SIZE sysconf(_SC_PAGE_SIZE)
+#endif
 #include
 #include
 #if __XEN_LATEST_INTERFACE_VERSION__ < 0x00040200
--
2.7.4
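The run-time query that the patch falls back to can be sketched separately. `sketch_page_size()` is a hypothetical helper name used only for illustration; `sysconf(_SC_PAGE_SIZE)` is the POSIX call the patch itself uses. One consequence worth keeping in mind: with the fallback macro, PAGE_SIZE expands to a function call, so it is no longer a compile-time constant and cannot be used, for example, as a static array size.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the page size of the running system,
 * queried at run time rather than taken from a PAGE_SIZE constant.
 */
static size_t
sketch_page_size(void)
{
	long sz = sysconf(_SC_PAGE_SIZE);

	/* sysconf() returns -1 on error; treat that as a programming error here. */
	assert(sz > 0);
	return (size_t)sz;
}
```

Code that needs the value repeatedly on a hot path may prefer to call this once and keep the result in a variable.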
[dpdk-dev] compile error on ubuntu 14.4.4 kernel 4.2.0-27-generic in qemu
Hi,

I am facing a few compile errors while compiling dpdk. The environment is Ubuntu running as a VM, and the VM is started by qemu. How do I fix the compile errors?

In file included from /home/vpptest/vpp/build-root/build-vpp-native/dpdk/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_pci.c:42:0:
/home/vpptest/vpp/build-root/install-vpp-native/dpdk/include/rte_memcpy.h: In function 'rte_memcpy':
/home/vpptest/vpp/build-root/install-vpp-native/dpdk/include/rte_memcpy.h:625:2: error: implicit declaration of function '_mm_alignr_epi8' [-Werror=implicit-function-declaration]
  MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
  ^
/home/vpptest/vpp/build-root/install-vpp-native/dpdk/include/rte_memcpy.h:625:2: error: nested extern declaration of '_mm_alignr_epi8' [-Werror=nested-externs]
/home/vpptest/vpp/build-root/install-vpp-native/dpdk/include/rte_memcpy.h:625:2: error: incompatible type for argument 2 of '_mm_storeu_si128'
In file included from /home/vpptest/vpp/build-root/install-vpp-native/dpdk/include/rte_common.h:289:0,
                 from /home/vpptest/vpp/build-root/install-vpp-native/dpdk/include/rte_memory.h:55,
                 from /home/vpptest/vpp/build-root/install-vpp-native/dpdk/include/rte_eal_memconfig.h:38,
                 from /home/vpptest/vpp/build-root/build-vpp-native/dpdk/dpdk-2.2.0/lib/librte_eal/linuxapp/eal/eal_pci.c:39:
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/emmintrin.h:700:1: note: expected '__m128i' but argument is of type 'int'

Thanks
Sharath