[dpdk-dev] Inputs needed - testing l2fwd
On Wed, 25 Mar 2015 09:40:47 +0530 Shankari Vaidyalingam wrote:
> Hi,
>
> Can anyone please help me whether I'm missing something in the below
> exercise
>
> Regards
> Shankari.V
>
> On Wed, Mar 25, 2015 at 12:25 AM, Shankari Vaidyalingam <
> shankari.v2k6 at gmail.com> wrote:
>
> > I'm trying to run sample DPDK applications by injecting packets from an
> > external traffic generator.
> > I'm not able to get the frames to reach the RX queue of the corresponding
> > port and make the PMD detect those packets and get the application process
> > them.
> >
> > My configuration is:
> >
> > I'm using Oracle VirtualBox to have 2 virtual network adapters for use in
> > the sample application. I have configured both the network adapters in the
> > "Bridged Networking" mode.
> > When I open the VM I can see the interfaces - eth0 and eth1
> > I bound the 2 interfaces to igb_uio driver and then configured the
> > hugepage mapping.
> > When I try sending the frames from an external source I had set the dest
> > MAC to the MAC addr of the ports bound to DPDK.
> > But still I'm not able to see either the "Frames received" or "Frames
> > dropped" count getting incremented.
> >
> > I tried googling and tried various options for networking mode as
> > (Bridged/Host Only...) but none of them seem to work fine.
> >
> > I'm sure that I'm missing something.
> > Can you please help me in this regard.
> >
> > Regards
> > Shankari.V
> >
> >

VirtualBox has its own version of virtio which is not compatible with the current version of the DPDK virtio driver.
[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'
On Tue, Mar 24, 2015 at 6:20 PM, Xie, Huawei wrote:

> On 3/24/2015 7:10 PM, Pavel Boldin wrote:
>
>
> On Tue, Mar 24, 2015 at 8:28 AM, Xie, Huawei <huawei.xie at intel.com> wrote:
> On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> > Due to increased `struct file's reference counter subsequent call
> > to `filp_close' does not free the `struct file'. Prepend `fput' call
> > to decrease the reference counter.
> >
> > Signed-off-by: Pavel Boldin <pboldin at mirantis.com>
> > ---
> > lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > index 7755dd6..62c45c8 100644
> > --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> > +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> > @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int
> ioctl, unsigned long arg)
> > 	 * Release the existing eventfd in the source process
> > 	 */
> > 	spin_lock(&files->file_lock);
> > +	fput(file);
> > 	filp_close(file, files);
> > 	fdt = files_fdtable(files);
> > 	fdt->fd[eventfd_copy.source_fd] = NULL;
> Acked-by Huawei Xie <huawei.xie at intel.com>
>
> In future, we should remove the allocation of src eventfd, allocate a
> free fd from kernel, and install it with target fd.
>
> Well, I don't think this is correct, because this will put too much job
> for the kernel module making it a combiner.
>
> At the moment I propose to accept module refactoring patch to the upstream.
>
> After that the module can be reworked along with the application code. The
> reason is I can't work on DPDK since I have no hardware to test application
> at so I can't help with patch that touches application code.
>
> Pavel
> Pavel:
> I am not asking to do it right now but in future, and i agree we accept
> the refactoring patch now, so i ack the patch.
>

Huawei, this is the refactoring patch [1]. This patch is merely a bugfix.

[1] http://dpdk.org/dev/patchwork/patch/4113/

> Huawei
>
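The reference-count imbalance that the patch description talks about can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_file`, `toy_get`, `toy_put` and `toy_close` are hypothetical stand-ins, not kernel APIs, and the model assumes that the earlier lookup in the ioctl took one extra reference and that closing drops exactly one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct file. */
struct toy_file {
	int refcount;	/* number of outstanding references */
	bool freed;	/* set once the last reference is dropped */
};

/* Models a lookup that takes an extra reference on the file. */
static void toy_get(struct toy_file *f)
{
	assert(!f->freed);
	f->refcount++;
}

/* Models fput(): drop one reference, release on the last one. */
static void toy_put(struct toy_file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->freed = true;
}

/* Models filp_close() as dropping the reference held by the fd table. */
static void toy_close(struct toy_file *f)
{
	toy_put(f);
}

/* Returns true if the file is released after lookup followed by close. */
static bool released_after_close(bool with_extra_put)
{
	struct toy_file f = { .refcount = 1, .freed = false };	/* fd table ref */

	toy_get(&f);		/* the lookup takes a second reference */
	if (with_extra_put)
		toy_put(&f);	/* the fix: balance the lookup */
	toy_close(&f);
	return f.freed;
}
```

In this model `released_after_close(false)` leaves the object alive with one stray reference, while `released_after_close(true)` releases it, which is the behaviour the one-line `fput(file)` addition is after.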
[dpdk-dev] [PATCH v2 2/6] eal: Close file descriptor of uio configuration
On Wed, 25 Mar 2015 12:17:32 +0900 Tetsuya Mukawa wrote: > On 2015/03/25 3:33, Stephen Hemminger wrote: > > On Tue, 24 Mar 2015 13:18:33 +0900 > > Tetsuya Mukawa wrote: > > > >> When pci_uio_unmap_resource() is called, a file descriptor that is used > >> for uio configuration should be closed. > >> > >> Signed-off-by: Tetsuya Mukawa > >> --- > >> lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 6 +- > >> 1 file changed, 5 insertions(+), 1 deletion(-) > >> > >> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > >> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > >> index 9cdf24f..f0277be 100644 > >> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > >> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > >> @@ -459,8 +459,12 @@ pci_uio_unmap_resource(struct rte_pci_device *dev) > >> > >>/* close fd if in primary process */ > >>close(dev->intr_handle.fd); > >> - > >>dev->intr_handle.fd = -1; > >> + > >> + /* close cfg_fd if in primary process */ > >> + close(dev->intr_handle.uio_cfg_fd); > >> + dev->intr_handle.uio_cfg_fd = -1; > >> + > >>dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN; > >> } > >> #endif /* RTE_LIBRTE_EAL_HOTPLUG */ > > > > For the Qlogic/Broadcom driver it needed the config fd handle, and I added > > generic config space access functions. > > Hi Stephen, > > Is this the patch you mentioned? > http://dpdk.org/dev/patchwork/patch/3024/ > > > Hi David, Bernard, Stephen > > I guess here are works we will need to do. > 1. Add close(dev->config_fd) in Stephen's patch. > 2. Write a patch for uio to merge "dev->intr_handle->uio_cfg_fd" and > "dev->config_fd". > 3. Write a patch for vfio to merge "dev->intr_handle->vfio_cfg_fd" and > "dev->config_fd". > > If we already have these patches, I guess it may be nice to merge above > patches first. > Do you have a suggestion how to merge patches related with pci config fd? > > Thanks, > Tetsuya > Yeah, that is the patch. It reopens config fd, it seems to overlap this.
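The close-then-reset pattern used in the patch above can be wrapped in a small helper. This is a minimal sketch, assuming a POSIX system; `close_and_reset` is a hypothetical name and not a DPDK function, and unlike the patch it skips `close()` when the descriptor is already -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: close *fd if it refers to an open descriptor and
 * mark it as closed, so a second call becomes a harmless no-op.
 * Returns the result of close(), or 0 if there was nothing to close.
 */
static int close_and_reset(int *fd)
{
	int ret = 0;

	assert(fd != NULL);
	if (*fd >= 0)
		ret = close(*fd);
	*fd = -1;
	return ret;
}
```

With a helper like this, the interrupt fd and the uio config fd could both be released through the same path, and an accidental double unmap would not close an unrelated descriptor that happened to reuse the number.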
[dpdk-dev] [PATCH] cast used->idx to volatile
On 2015/3/24 18:06, Xie, Huawei wrote:
> On 3/24/2015 3:44 PM, Linhaifeng wrote:
>>
>> On 2015/3/24 9:53, Xie, Huawei wrote:
>>> On 3/24/2015 9:00 AM, Linhaifeng wrote:
On 2015/3/23 20:54, Xie, Huawei wrote:
>> -----Original Message-----
>> From: Linhaifeng [mailto:haifeng.lin at huawei.com]
>> Sent: Monday, March 23, 2015 8:24 PM
>> To: dev at dpdk.org
>> Cc: Ouyang, Changchun; Xie, Huawei
>> Subject: Re: [dpdk-dev] [PATCH] cast used->idx to volatile
>>
>>
>>
>> On 2015/3/21 16:07, linhaifeng wrote:
>>> From: Linhaifeng
>>>
>>> Same as rte_vhost_enqueue_burst we should cast used->idx
>>> to volatile before notify guest.
>>>
>>> Signed-off-by: Linhaifeng
>>> ---
>>> lib/librte_vhost/vhost_rxtx.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_rxtx.c
>>> b/lib/librte_vhost/vhost_rxtx.c
>>> index 535c7a1..8d674d1 100644
>>> --- a/lib/librte_vhost/vhost_rxtx.c
>>> +++ b/lib/librte_vhost/vhost_rxtx.c
>>> @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev,
>> uint16_t queue_id,
>>> 	}
>>>
>>> 	rte_compiler_barrier();
>>> -	vq->used->idx += entry_success;
>>> +	*(volatile uint16_t *)&vq->used->idx += entry_success;
> Haifeng:
> We have compiler barrier before and an external function call behind, so
> we don't need volatile here.
> Do you meet issue?
>
Tx_q is sometimes stopped when we use virtio_net. Because vhost thought there are no buffers in tx_q and virtio_net though vhost haven't handle all packets so we have to restart VM to restore work.

The status in VM is:
Mar 18 17:11:10 linux-b2ij kernel: [46337.246687] net eth7: virtnet_poll
Mar 18 17:11:10 linux-b2ij kernel: [46337.246690] net eth7: receive_buf
Mar 18 17:11:10 linux-b2ij kernel: [46337.246693] net eth7: vi->num=239
Mar 18 17:11:10 linux-b2ij kernel: [46337.246695] net eth7: svq:avail->idx=52939 used->idx=52939 num_free=18 num_added=0 svq->last_used_idx=52820
Mar 18 17:11:10 linux-b2ij kernel: [46337.246699] net eth7: rvq:avail->idx=36215 used->idx=35977 num_free=18 num_added=0 rvq->last_used_idx=35977
Mar 18 17:11:11 linux-b2ij kernel: [46337.901038] net eth7: dev_queue_xmit, qdisc->flags=4, qdisc->state deactiveed=0
Mar 18 17:11:12 linux-b2ij kernel: [46337.901042] net eth7: dev_queue_xmit, txq->state=1, stopped=1

Why compiler barrier not take effect in our case? Is compiler barrier depended on -O3 option? We use -O2 option.
>>> compiler barrier always works regardless of the optimization option.
>>> I don't get your story, but the key thing is, do you check the asm code?
>>> If called from outside as an API, how is it possible it is optimized?
>>> there is only one update to used->idx in that function.
>>
>> Do you mean rte_vhost_enqueue_burst also not need cast used->idx to volatile
>> ? Why not remove it?
> I checked the code. Seems we can remove. That is another issue.
> For your issue, you meet problem, and submit this this patch, but i am a
> bit confused it is the root cause. Do you check the asm code that
> volatile is optimized?

I'm not sure about this too. How to check volatile is optimized?

>
>>> 	/* Kick guest if required. */
>>> 	if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
>>> 		eventfd_write((int)vq->callfd, 1);
>>>
>> --
>> Regards,
>> Haifeng
>
>>>
>>> .
>>>
>
>
> .
>

--
Regards,
Haifeng
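For readers following the barrier-versus-volatile question in this thread, the two variants under discussion can be put side by side in a standalone file. This is a sketch only: `toy_used_ring`, `publish_plain` and `publish_volatile` are hypothetical names, and `toy_compiler_barrier()` is an assumed GCC/Clang-style equivalent of the idea behind `rte_compiler_barrier()`, not DPDK's own definition.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed GCC/Clang-style compiler barrier: it emits no instruction but
 * tells the compiler not to move memory accesses across this point.
 */
#define toy_compiler_barrier() __asm__ volatile("" ::: "memory")

/* Hypothetical stand-in for the used ring; only the index matters here. */
struct toy_used_ring {
	uint16_t idx;
};

/* Variant 1: compiler barrier followed by a plain read-modify-write. */
static void publish_plain(struct toy_used_ring *used, uint16_t n)
{
	assert(used != 0);
	toy_compiler_barrier();
	used->idx += n;
}

/* Variant 2: the same update through a volatile-qualified lvalue. */
static void publish_volatile(struct toy_used_ring *used, uint16_t n)
{
	assert(used != 0);
	toy_compiler_barrier();
	*(volatile uint16_t *)&used->idx += n;
}
```

One common way to answer the "how to check" question is to compile such a file with the project's own flags, for example `gcc -O2 -S`, and compare the assembly emitted for the two functions: if both contain the load, add and store of `idx` after the barrier, the cast changes nothing for that build.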
[dpdk-dev] [PATCH v2] hash: fix breaking strict-aliasing rules
Fix rte_hash_crc() function by making use of uintptr_t variable
to hold a pointer to data being hashed. In this way, casting a uint64_t
pointer to uint32_t is avoided.

Signed-off-by: Yerden Zhumabekov
---
 lib/librte_hash/rte_hash_crc.h | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/lib/librte_hash/rte_hash_crc.h b/lib/librte_hash/rte_hash_crc.h
index 1cd626c..abdbd9a 100644
--- a/lib/librte_hash/rte_hash_crc.h
+++ b/lib/librte_hash/rte_hash_crc.h
@@ -513,35 +513,36 @@ rte_hash_crc(const void *data, uint32_t data_len, uint32_t init_val)
 {
 	unsigned i;
 	uint64_t temp = 0;
-	const uint64_t *p64 = (const uint64_t *)data;
+	uintptr_t pd = (uintptr_t) data;
 
 	for (i = 0; i < data_len / 8; i++) {
-		init_val = rte_hash_crc_8byte(*p64++, init_val);
+		init_val = rte_hash_crc_8byte(*(const uint64_t *)pd, init_val);
+		pd += 8;
 	}
 
 	switch (7 - (data_len & 0x07)) {
 	case 0:
-		temp |= (uint64_t) *((const uint8_t *)p64 + 6) << 48;
+		temp |= (uint64_t) *((const uint8_t *)pd + 6) << 48;
 		/* Fallthrough */
 	case 1:
-		temp |= (uint64_t) *((const uint8_t *)p64 + 5) << 40;
+		temp |= (uint64_t) *((const uint8_t *)pd + 5) << 40;
 		/* Fallthrough */
 	case 2:
-		temp |= (uint64_t) *((const uint8_t *)p64 + 4) << 32;
-		temp |= *((const uint32_t *)p64);
+		temp |= (uint64_t) *((const uint8_t *)pd + 4) << 32;
+		temp |= *(const uint32_t *)pd;
 		init_val = rte_hash_crc_8byte(temp, init_val);
 		break;
 	case 3:
-		init_val = rte_hash_crc_4byte(*(const uint32_t *)p64, init_val);
+		init_val = rte_hash_crc_4byte(*(const uint32_t *)pd, init_val);
 		break;
 	case 4:
-		temp |= *((const uint8_t *)p64 + 2) << 16;
+		temp |= *((const uint8_t *)pd + 2) << 16;
 		/* Fallthrough */
 	case 5:
-		temp |= *((const uint8_t *)p64 + 1) << 8;
+		temp |= *((const uint8_t *)pd + 1) << 8;
 		/* Fallthrough */
 	case 6:
-		temp |= *((const uint8_t *)p64);
+		temp |= *(const uint8_t *)pd;
 		init_val = rte_hash_crc_4byte(temp, init_val);
 		/* Fallthrough */
 	default:
-- 
1.7.9.5
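As a side note on the aliasing problem this patch addresses: a commonly used alternative to pointer casts is to copy the bytes into a properly typed variable with `memcpy`. The sketch below shows that alternative, which is not what the patch does; `load_u64` and `xor_words` are hypothetical names, and the XOR fold merely stands in for a real hash step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Load 8 bytes from an arbitrary address without casting the pointer to
 * a different object type. Compilers typically turn this fixed-size
 * memcpy into a single load.
 */
static uint64_t load_u64(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical consumer: XOR together all full 8-byte words of a buffer. */
static uint64_t xor_words(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint64_t acc = 0;
	size_t i;

	assert(data != NULL || len == 0);
	for (i = 0; i + 8 <= len; i += 8)
		acc ^= load_u64(p + i);
	return acc;
}
```

The same helper also works on unaligned input, since it never dereferences a `uint64_t` pointer into the caller's buffer.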
[dpdk-dev] ovs-dpdk: placing the metadata
Hi,

I've noticed in lib/netdev-dpdk.c that __rte_pktmbuf_init() stores the packet metadata right after "struct rte_mbuf", and before the buffer data:

	/* start of buffer is just after mbuf structure */
	m->buf_addr = (char *)m + sizeof(struct dp_packet);

(struct dp_packet has the rte_mbuf as its first member if DPDK is enabled)

However, lib/librte_mbuf/rte_mbuf.h seems to codify that the buffer should start right after the rte_mbuf:

	/**
	 * Given the buf_addr returns the pointer to corresponding mbuf.
	 */
	#define RTE_MBUF_FROM_BADDR(ba) (((struct rte_mbuf *)(ba)) - 1)

	/**
	 * Given the pointer to mbuf returns an address where it's buf_addr
	 * should point to.
	 */
	#define RTE_MBUF_TO_BADDR(mb) (((struct rte_mbuf *)(mb)) + 1)

These macros are used for attaching/detaching mbufs to each other. This is how the code retrieves the direct buffer from an indirect one, and vice versa. I think if we want to keep the metadata feature (which I guess is quite important), we need to add a pointer to rte_mbuf that helps the direct and indirect structs find each other. Something like:

	struct rte_mbuf *attach; /**< Points to the other buffer if this one is (in)direct. Otherwise NULL. */

What do you think?

Regards,

Zoltan Kiss
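The mismatch described in this message can be reproduced with toy types. This is a sketch under stated assumptions: `toy_mbuf` and `toy_packet` are hypothetical stand-ins for `struct rte_mbuf` and `struct dp_packet`, the 64-byte metadata size is arbitrary, and the two macros copy only the pointer arithmetic of the macros quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct toy_mbuf {
	void *buf_addr;
};

/* Hypothetical stand-in for struct dp_packet: mbuf first, then metadata. */
struct toy_packet {
	struct toy_mbuf mbuf;
	char metadata[64];
};

/* Same arithmetic as the RTE_MBUF_FROM_BADDR / RTE_MBUF_TO_BADDR macros. */
#define TOY_MBUF_FROM_BADDR(ba) (((struct toy_mbuf *)(ba)) - 1)
#define TOY_MBUF_TO_BADDR(mb)   (((struct toy_mbuf *)(mb)) + 1)

/* True if the mbuf can be recovered from its own buffer address. */
static bool roundtrip_ok(struct toy_mbuf *m)
{
	assert(m != NULL && m->buf_addr != NULL);
	return TOY_MBUF_FROM_BADDR(m->buf_addr) == m;
}

/* Layout A: the data buffer starts right after the mbuf header. */
static bool layout_plain_ok(void)
{
	static struct { struct toy_mbuf m; char data[128]; } obj;

	obj.m.buf_addr = TOY_MBUF_TO_BADDR(&obj.m);
	return roundtrip_ok(&obj.m);
}

/* Layout B: the data buffer starts after the mbuf plus the metadata. */
static bool layout_metadata_ok(void)
{
	static struct { struct toy_packet p; char data[128]; } obj;

	obj.p.mbuf.buf_addr = (char *)&obj.p + sizeof(struct toy_packet);
	return roundtrip_ok(&obj.p.mbuf);
}
```

In layout B the "minus one mbuf" step lands inside the metadata area instead of on the header, which is why the message proposes an explicit back pointer rather than relying on address arithmetic.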
[dpdk-dev] Need some info on the impact of BIOS setting
Hi,

I'd like to know the impact of the virtualization setting (Intel VT set to enabled in BIOS) on the execution of DPDK applications. Do I need to disable this setting while executing DPDK applications in virtual machines in VirtualBox?

Any help in this regard is appreciated.

Regards
Shankari.V
[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst
On Tue, Mar 24, 2015 at 04:10:18PM +0200, Dor Green wrote: > 1 . The eth_conf is: > > static struct rte_eth_conf const ethconf = { > .link_speed = 0, > .link_duplex = 0, > > .rxmode = { > .mq_mode = ETH_MQ_RX_RSS, > .max_rx_pkt_len = ETHER_MAX_LEN, > .split_hdr_size = 0, > .header_split = 0, > .hw_ip_checksum = 0, > .hw_vlan_filter = 0, > .jumbo_frame = 0, > .hw_strip_crc = 0, /**< CRC stripped by hardware */ > }, > > .txmode = { > }, > > .rx_adv_conf = { > .rss_conf = { > .rss_key = NULL, > .rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6, > } > }, > > .fdir_conf = { > .mode = RTE_FDIR_MODE_SIGNATURE, > > }, > > .intr_conf = { > .lsc = 0, > }, > }; > > I've tried setting jumbo frames on with a larger packet length and > even turning off RSS/FDIR. No luck. > > I don't see anything relating to the port in the initial prints, what > are you looking for? I'm looking for the PMD initialization text, like that shown below (from testpmd): Configuring Port 0 (socket 0) PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f9ba08cd700 hw_ring=0x7f9ba0b00080 dma_addr=0x36d00080 PMD: ixgbe_set_tx_function(): Using simple tx code path PMD: ixgbe_set_tx_function(): Vector tx enabled. PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f9ba08cce80 hw_ring=0x7f9ba0b10080 dma_addr=0x36d10080 PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size no less than 32. Port 0: 68:05:CA:04:51:3A Configuring Port 1 (socket 0) PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f9ba08cab40 hw_ring=0x7f9ba0b20100 dma_addr=0x36d20100 PMD: ixgbe_set_tx_function(): Using simple tx code path PMD: ixgbe_set_tx_function(): Vector tx enabled. PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f9ba08ca2c0 hw_ring=0x7f9ba0b30100 dma_addr=0x36d30100 PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size no less than 32. Port 1: 68:05:CA:04:51:38 This tells us what RX and TX functions are going to be used for each port. > > 2. 
The packet is a normal, albeit somewhat large (1239 bytes) TCP data > packet (SSL certificate data, specifically). > One important thing of note that I've just realised is that it's not > this "packet of death" which causes the segmentation fault (i.e. has > an out-of-bounds address for its data), but the packet afterwards-- no > matter what packet it is. > Can this problem be reproduced using testpmd or any of the standard dpdk example apps, by sending in the same packet sequence? Is there anything unusual being done in the setup of the mempool used for the packet buffers? /Bruce > > On Tue, Mar 24, 2015 at 3:17 PM, Bruce Richardson > wrote: > > On Tue, Mar 24, 2015 at 12:54:14PM +0200, Dor Green wrote: > >> I've managed to fix it so 1.8 works, and the segmentation fault still > >> occurs. > >> > >> On Tue, Mar 24, 2015 at 11:55 AM, Dor Green wrote: > >> > I tried 1.8, but that fails to initialize my device and fails at the pci > >> > probe: > >> > "Cause: Requested device :04:00.1 cannot be used" > >> > Can't even compile 2.0rc2 atm, getting: > >> > "/usr/lib/gcc/x86_64-linux-gnu/4.6/include/emmintrin.h:701:1: note: > >> > expected '__m128i' but argument is of type 'int'" > >> > For reasons I don't understand. > >> > > >> > As for the example apps (in 1.7), I can run them properly but I don't > >> > think any of them do the same processing as I do. Note that mine does > >> > work with most packets. > >> > > >> > > > > > Couple of further questions: > > 1. What config options are being used to configure the port and what is the > > output printed at port initialization time? This is needed to let us track > > down > > what specific RX path is being used inside the ixgbe driver > > 2. What type of packets specifically cause problems? Is it reproducible with > > one particular packet, or packet type? Are you sending in jumbo-frames? 
> > > > Regards, > > /Bruce > > > >> > On Mon, Mar 23, 2015 at 11:24 PM, Matthew Hall >> > mhcomputing.net> wrote: > >> >> On Mon, Mar 23, 2015 at 05:19:00PM +0200, Dor Green wrote: > >> >>> I changed it to free and it still happens. Note that the segmentation > >> >>> fault > >> >>> happens before that anyway. > >> >>> > >> >>> I am using 1.7.1 at the moment. I can try using a newer version. > >> >> > >> >> I'm using 1.7.X in my open-source DPDK-based app and it works, but I > >> >> have an > >> >> IGB 1-gigabit NIC though, and how RX / TX work are quite driver > >> >> specific of > >> >> course. > >> >> > >> >> I suspect there's some issue with how things are working in your IXGBE > >> >> NIC > >> >> driver / setup. Do the same failures occur inside of the DPDK's own > >> >> sample > >> >> apps? > >> >> > >> >> Matthew.
[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'
On 3/24/2015 7:10 PM, Pavel Boldin wrote:

On Tue, Mar 24, 2015 at 8:28 AM, Xie, Huawei <huawei.xie at intel.com> wrote:
On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> Due to increased `struct file's reference counter subsequent call
> to `filp_close' does not free the `struct file'. Prepend `fput' call
> to decrease the reference counter.
>
> Signed-off-by: Pavel Boldin <pboldin at mirantis.com>
> ---
> lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> index 7755dd6..62c45c8 100644
> --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl,
> unsigned long arg)
> 	 * Release the existing eventfd in the source process
> 	 */
> 	spin_lock(&files->file_lock);
> +	fput(file);
> 	filp_close(file, files);
> 	fdt = files_fdtable(files);
> 	fdt->fd[eventfd_copy.source_fd] = NULL;
Acked-by Huawei Xie <huawei.xie at intel.com>

In future, we should remove the allocation of src eventfd, allocate a free fd from kernel, and install it with target fd.

Well, I don't think this is correct, because this will put too much job for the kernel module making it a combiner.

At the moment I propose to accept module refactoring patch to the upstream.

After that the module can be reworked along with the application code. The reason is I can't work on DPDK since I have no hardware to test application at so I can't help with patch that touches application code.

Pavel

Pavel:
I am not asking to do it right now but in future, and i agree we accept the refactoring patch now, so i ack the patch.

Huawei
[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst
1 . The eth_conf is: static struct rte_eth_conf const ethconf = { .link_speed = 0, .link_duplex = 0, .rxmode = { .mq_mode = ETH_MQ_RX_RSS, .max_rx_pkt_len = ETHER_MAX_LEN, .split_hdr_size = 0, .header_split = 0, .hw_ip_checksum = 0, .hw_vlan_filter = 0, .jumbo_frame = 0, .hw_strip_crc = 0, /**< CRC stripped by hardware */ }, .txmode = { }, .rx_adv_conf = { .rss_conf = { .rss_key = NULL, .rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6, } }, .fdir_conf = { .mode = RTE_FDIR_MODE_SIGNATURE, }, .intr_conf = { .lsc = 0, }, }; I've tried setting jumbo frames on with a larger packet length and even turning off RSS/FDIR. No luck. I don't see anything relating to the port in the initial prints, what are you looking for? 2. The packet is a normal, albeit somewhat large (1239 bytes) TCP data packet (SSL certificate data, specifically). One important thing of note that I've just realised is that it's not this "packet of death" which causes the segmentation fault (i.e. has an out-of-bounds address for its data), but the packet afterwards-- no matter what packet it is. On Tue, Mar 24, 2015 at 3:17 PM, Bruce Richardson wrote: > On Tue, Mar 24, 2015 at 12:54:14PM +0200, Dor Green wrote: >> I've managed to fix it so 1.8 works, and the segmentation fault still occurs. >> >> On Tue, Mar 24, 2015 at 11:55 AM, Dor Green wrote: >> > I tried 1.8, but that fails to initialize my device and fails at the pci >> > probe: >> > "Cause: Requested device :04:00.1 cannot be used" >> > Can't even compile 2.0rc2 atm, getting: >> > "/usr/lib/gcc/x86_64-linux-gnu/4.6/include/emmintrin.h:701:1: note: >> > expected '__m128i' but argument is of type 'int'" >> > For reasons I don't understand. >> > >> > As for the example apps (in 1.7), I can run them properly but I don't >> > think any of them do the same processing as I do. Note that mine does >> > work with most packets. >> > >> > > > Couple of further questions: > 1. 
What config options are being used to configure the port and what is the > output printed at port initialization time? This is needed to let us track > down > what specific RX path is being used inside the ixgbe driver > 2. What type of packets specifically cause problems? Is it reproducible with > one particular packet, or packet type? Are you sending in jumbo-frames? > > Regards, > /Bruce > >> > On Mon, Mar 23, 2015 at 11:24 PM, Matthew Hall >> > wrote: >> >> On Mon, Mar 23, 2015 at 05:19:00PM +0200, Dor Green wrote: >> >>> I changed it to free and it still happens. Note that the segmentation >> >>> fault >> >>> happens before that anyway. >> >>> >> >>> I am using 1.7.1 at the moment. I can try using a newer version. >> >> >> >> I'm using 1.7.X in my open-source DPDK-based app and it works, but I have >> >> an >> >> IGB 1-gigabit NIC though, and how RX / TX work are quite driver specific >> >> of >> >> course. >> >> >> >> I suspect there's some issue with how things are working in your IXGBE NIC >> >> driver / setup. Do the same failures occur inside of the DPDK's own sample >> >> apps? >> >> >> >> Matthew.
[dpdk-dev] [PATCH v3] lib/librte_vhost: update used->idx when allocation of mbuf fails
On 2015/3/24 15:14, Xie, Huawei wrote: > On 3/22/2015 8:08 PM, Ouyang, Changchun wrote: >> >>> -Original Message- >>> From: linhaifeng [mailto:haifeng.lin at huawei.com] >>> Sent: Saturday, March 21, 2015 9:47 AM >>> To: dev at dpdk.org >>> Cc: Ouyang, Changchun; Xie, Huawei >>> Subject: [PATCH v3] lib/librte_vhost: update used->idx when allocation of >>> mbuf fails >>> >>> From: Linhaifeng >>> >>> When failed to malloc buffer from mempool we just update last_used_idx >>> but not used->idx so after many times vhost thought have handle all packets >>> but virtio_net thought vhost have not handle all packets and will not update >>> avail->idx. >>> >>> Signed-off-by: Linhaifeng >> Acked-by: Changchun Ouyang >> >> >> > Acked-by: Huawei Xie > > This patch fix the issue. > Simple solution like other PMDs is before processing one descriptor, > ensure allocation of new mbuf is successfull, and then immediately > refill after receiving the packet from the descriptor. > In future, we should consider optimized bulk allocation strategy with > threshold. > Hi, huawei THis is patch is for librte_vhost. Do you want to ack for the other patch for virtio-net-pmd? > > > > > -- Regards, Haifeng
[dpdk-dev] [PATCH] cast used->idx to volatile
On 2015/3/24 9:53, Xie, Huawei wrote:
> On 3/24/2015 9:00 AM, Linhaifeng wrote:
>>
>> On 2015/3/23 20:54, Xie, Huawei wrote:
>>>
-----Original Message-----
From: Linhaifeng [mailto:haifeng.lin at huawei.com]
Sent: Monday, March 23, 2015 8:24 PM
To: dev at dpdk.org
Cc: Ouyang, Changchun; Xie, Huawei
Subject: Re: [dpdk-dev] [PATCH] cast used->idx to volatile

On 2015/3/21 16:07, linhaifeng wrote:
> From: Linhaifeng
>
> Same as rte_vhost_enqueue_burst we should cast used->idx
> to volatile before notify guest.
>
> Signed-off-by: Linhaifeng
> ---
> lib/librte_vhost/vhost_rxtx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 535c7a1..8d674d1 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
> 	}
>
> 	rte_compiler_barrier();
> -	vq->used->idx += entry_success;
> +	*(volatile uint16_t *)&vq->used->idx += entry_success;
>>>
>>> Haifeng:
>>> We have compiler barrier before and an external function call behind, so we
>>> don't need volatile here.
>>> Do you meet issue?
>>>
>> Tx_q is sometimes stopped when we use virtio_net. Because vhost thought
>> there are no buffers in tx_q and virtio_net
>> though vhost haven't handle all packets so we have to restart VM to restore
>> work.
>>
>> The status in VM is:
>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246687] net eth7: virtnet_poll
>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246690] net eth7: receive_buf
>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246693] net eth7: vi->num=239
>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246695] net eth7:
>> svq:avail->idx=52939 used->idx=52939 num_free=18 num_added=0
>> svq->last_used_idx=52820
>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246699] net eth7:
>> rvq:avail->idx=36215 used->idx=35977 num_free=18 num_added=0
>> rvq->last_used_idx=35977
>> Mar 18 17:11:11 linux-b2ij kernel: [46337.901038] net eth7: dev_queue_xmit,
>> qdisc->flags=4, qdisc->state deactiveed=0
>> Mar 18 17:11:12 linux-b2ij kernel: [46337.901042] net eth7: dev_queue_xmit,
>> txq->state=1, stopped=1
>>
>> Why compiler barrier not take effect in our case? Is compiler barrier
>> depended on -O3 option? We use -O2 option.
> compiler barrier always works regardless of the optimization option.
> I don't get your story, but the key thing is, do you check the asm code?
> If called from outside as an API, how is it possible it is optimized?
> there is only one update to used->idx in that function.

Do you mean rte_vhost_enqueue_burst also not need cast used->idx to volatile ? Why not remove it?

>>
> 	/* Kick guest if required. */
> 	if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
> 		eventfd_write((int)vq->callfd, 1);
>
--
Regards,
Haifeng
>>>
>>>
>
>
> .
>

--
Regards,
Haifeng
[dpdk-dev] [PATCH] mk: added make target to print out system info
Added a 'make system_info' target to print out system info related to DPDK. This is intended as output that can be attached to bug reports. --- mk/rte.sdkroot.mk | 33 + 1 file changed, 33 insertions(+) diff --git a/mk/rte.sdkroot.mk b/mk/rte.sdkroot.mk index e8423b0..b477d09 100644 --- a/mk/rte.sdkroot.mk +++ b/mk/rte.sdkroot.mk @@ -123,3 +123,36 @@ examples examples_clean: %: $(Q)$(MAKE) -f $(RTE_SDK)/mk/rte.sdkconfig.mk checkconfig $(Q)$(MAKE) -f $(RTE_SDK)/mk/rte.sdkbuild.mk $@ + +.PHONY: system_info +system_info: + $(Q)echo + $(Q)echo "CC version" + $(Q)echo "==" + $(Q)$(CC) --version + $(Q)echo + + $(Q)echo "DPDK version" + $(Q)echo "" + $(Q)$(MAKE) showversion + $(Q)echo + + $(Q)echo "Git commit" + $(Q)echo "==" + $(Q)git log --pretty=format:'%H' -1 + $(Q)echo + + $(Q)echo "Uname" + $(Q)echo "=" + $(Q)uname -srvmpio + $(Q)echo + + $(Q)echo "Hugepages" + $(Q)echo "=" + $(Q)grep -i huge /proc/meminfo + $(Q)echo + + $(Q)tools/cpu_layout.py + + $(Q)tools/dpdk_nic_bind.py --status + $(Q)echo -- 1.8.1.4
[dpdk-dev] [PATCH] mk: added make target to print out system info
Added a 'make system_info' target to print out system info related to DPDK. This is intended as output that can be attached to bug reports. This is related to the recent call for tools brainstorming by Thomas. http://dpdk.org/ml/archives/dev/2015-March/015499.html Bug reports to the DPDK mailing list rarely have enough information and require several follow-up questions to determine the version of software, the OS, the compiler etc. This 'make' target prints out some useful system information that can be attached to an email. There is no guarantee that the end user will do this but at least it can be documented and we can point them to it. Suggestions for other useful information to be output welcome. Untested on FreeBSD for now. Sample output: $ make system_info CC version == cc (GCC) 4.7.2 20121109 (Red Hat 4.7.2-8) Copyright (C) 2012 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
DPDK version 2.0.0-rc2 Git commit == 0431d322b80858b9efa1b43480be6d4ddccf4a66 Uname = Linux 3.6.10-4.fc18.x86_64 #1 SMP Tue Dec 11 18:01:27 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux Hugepages = AnonHugePages:102400 kB HugePages_Total:1024 HugePages_Free:0 HugePages_Rsvd:0 HugePages_Surp:0 Hugepagesize: 2048 kB Core and Socket Information (as reported by '/proc/cpuinfo') cores = [0, 1, 2, 3, 4, 5, 6, 7] sockets = [0, 1] Socket 0Socket 1 Core 0 [0, 16] [8, 24] Core 1 [1, 17] [9, 25] Core 2 [2, 18] [10, 26] Core 3 [3, 19] [11, 27] Core 4 [4, 20] [12, 28] Core 5 [5, 21] [13, 29] Core 6 [6, 22] [14, 30] Core 7 [7, 23] [15, 31] Network devices using DPDK-compatible driver :84:00.0 '82599EB 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused= :86:00.0 '82599EB 10-Gigabit SFI/SFP+ Network Connection' drv=igb_uio unused= Network devices using kernel driver === :03:00.0 'RTL8111/8168 PCI Express Gigabit Ethernet controller' if=p260p1 drv=r8169 unused=igb_uio :06:00.0 'I350 Gigabit Network Connection' if=em0 drv=igb unused=igb_uio :06:00.1 'I350 Gigabit Network Connection' if=eth0 drv=igb unused=igb_uio :81:00.0 '82599EB 10-Gigabit SFI/SFP+ Network Connection' if=p264p1 drv=ixgbe unused=igb_uio :81:00.1 '82599EB 10-Gigabit SFI/SFP+ Network Connection' if=p264p2 drv=ixgbe unused=igb_uio :84:00.1 '82599EB 10-Gigabit SFI/SFP+ Network Connection' if=p263p2 drv=ixgbe unused=igb_uio :86:00.1 '82599EB 10-Gigabit SFI/SFP+ Network Connection' if=p262p2 drv=ixgbe unused=igb_uio :88:00.0 '82599EB 10-Gigabit SFI/SFP+ Network Connection' if=p261p1 drv=ixgbe unused=igb_uio :88:00.1 '82599EB 10-Gigabit SFI/SFP+ Network Connection' if=p261p2 drv=ixgbe unused=igb_uio Other network devices = John McNamara (1): mk: added make target to print out system info mk/rte.sdkroot.mk | 33 + 1 file changed, 33 insertions(+) -- 1.8.1.4
[dpdk-dev] [PATCH v2] ixgbe: fix ixgbe access endian issue on bigendian arch
From: Xuelin Shi

Enforce rules for exchanging data between the cpu and ixgbe:
1. When the cpu uses data owned by ixgbe, it must use rte_le_to_cpu_xx(...).
2. When the cpu fills data for ixgbe, it must use rte_cpu_to_le_xx(...).

Signed-off-by: Xuelin Shi
---
changes for v2:
  rebased on latest head.
  fix some style issue detected by checkpatch.pl

 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 104
 1 file changed, 66 insertions(+), 38 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 9da2c7e..961ac08 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -128,7 +128,7 @@ ixgbe_tx_free_bufs(struct ixgbe_tx_queue *txq)
 	int i;

 	/* check DD bit on threshold descriptor */
-	status = txq->tx_ring[txq->tx_next_dd].wb.status;
+	status = rte_le_to_cpu_32(txq->tx_ring[txq->tx_next_dd].wb.status);
 	if (! (status & IXGBE_ADVTXD_STAT_DD))
 		return 0;

@@ -174,11 +174,14 @@ tx4(volatile union ixgbe_adv_tx_desc *txdp, struct rte_mbuf **pkts)
 		pkt_len = (*pkts)->data_len;

 		/* write data to descriptor */
-		txdp->read.buffer_addr = buf_dma_addr;
+		txdp->read.buffer_addr = rte_cpu_to_le_64(buf_dma_addr);
+
 		txdp->read.cmd_type_len =
-				((uint32_t)DCMD_DTYP_FLAGS | pkt_len);
+			rte_cpu_to_le_32((uint32_t)DCMD_DTYP_FLAGS | pkt_len);
+
 		txdp->read.olinfo_status =
-				(pkt_len << IXGBE_ADVTXD_PAYLEN_SHIFT);
+			rte_cpu_to_le_32(pkt_len << IXGBE_ADVTXD_PAYLEN_SHIFT);
+
 		rte_prefetch0(&(*pkts)->pool);
 	}
 }
@@ -194,11 +197,14 @@ tx1(volatile union ixgbe_adv_tx_desc *txdp, struct rte_mbuf **pkts)
 	pkt_len = (*pkts)->data_len;

 	/* write data to descriptor */
-	txdp->read.buffer_addr = buf_dma_addr;
+	txdp->read.buffer_addr = rte_cpu_to_le_64(buf_dma_addr);
+
 	txdp->read.cmd_type_len =
-			((uint32_t)DCMD_DTYP_FLAGS | pkt_len);
+		rte_cpu_to_le_32((uint32_t)DCMD_DTYP_FLAGS | pkt_len);
+
 	txdp->read.olinfo_status =
-			(pkt_len << IXGBE_ADVTXD_PAYLEN_SHIFT);
+		rte_cpu_to_le_32(pkt_len << IXGBE_ADVTXD_PAYLEN_SHIFT);
+
 	rte_prefetch0(&(*pkts)->pool);
 }

@@ -285,7 +291,7 @@ tx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 		 * a divisor of the ring size
 		 */
 		tx_r[txq->tx_next_rs].read.cmd_type_len |=
-				rte_cpu_to_le_32(IXGBE_ADVTXD_DCMD_RS);
+			rte_cpu_to_le_32(IXGBE_ADVTXD_DCMD_RS);
 		txq->tx_next_rs = (uint16_t)(txq->tx_rs_thresh - 1);

 		txq->tx_tail = 0;
@@ -304,7 +310,7 @@ tx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	 */
 	if (txq->tx_tail > txq->tx_next_rs) {
 		tx_r[txq->tx_next_rs].read.cmd_type_len |=
-				rte_cpu_to_le_32(IXGBE_ADVTXD_DCMD_RS);
+			rte_cpu_to_le_32(IXGBE_ADVTXD_DCMD_RS);
 		txq->tx_next_rs = (uint16_t)(txq->tx_next_rs +
 						txq->tx_rs_thresh);
 		if (txq->tx_next_rs >= txq->nb_tx_desc)
@@ -505,6 +511,7 @@ ixgbe_xmit_cleanup(struct ixgbe_tx_queue *txq)
 	uint16_t nb_tx_desc = txq->nb_tx_desc;
 	uint16_t desc_to_clean_to;
 	uint16_t nb_tx_to_clean;
+	uint32_t stat;

 	/* Determine the last descriptor needing to be cleaned */
 	desc_to_clean_to = (uint16_t)(last_desc_cleaned + txq->tx_rs_thresh);
@@ -513,7 +520,9 @@ ixgbe_xmit_cleanup(struct ixgbe_tx_queue *txq)

 	/* Check to make sure the last descriptor to clean is done */
 	desc_to_clean_to = sw_ring[desc_to_clean_to].last_id;
-	if (! (txr[desc_to_clean_to].wb.status & IXGBE_TXD_STAT_DD))
+
+	stat = rte_le_to_cpu_32(txr[desc_to_clean_to].wb.status);
+	if (!(stat & IXGBE_TXD_STAT_DD))
 	{
 		PMD_TX_FREE_LOG(DEBUG,
 				"TX descriptor %4u is not done"
@@ -544,7 +553,7 @@ ixgbe_xmit_cleanup(struct ixgbe_tx_queue *txq)
 	 * up to the last descriptor with the RS bit set
 	 * are done. Only reset the threshold descriptor.
 	 */
-	txr[desc_to_clean_to].wb.status = 0;
+	txr[desc_to_clean_to].wb.status = rte_cpu_to_le_32(0);

 	/* Update the txq to reflect the last descriptor that was cleaned */
 	txq->last_desc_cleaned = desc_to_clean_to;
@@ -801,12 +810,14 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 			 */
 			slen = m_seg->data_len;
 			buf_dma_addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+			txd->read.buffer_addr =
-
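For readers without a big-endian box at hand, the two rules from the commit message can be tried outside DPDK. This is only a sketch: demo_cpu_to_le32() and demo_le32_to_cpu() are hypothetical stand-ins for rte_cpu_to_le_32() and rte_le_to_cpu_32(), and struct demo_desc and DEMO_STAT_DD are made up for illustration, not the real ixgbe descriptor layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative "descriptor done" bit; not the real ixgbe value. */
#define DEMO_STAT_DD 0x00000001u

/* A descriptor field that the device always stores little-endian. */
struct demo_desc {
	uint32_t status;
};

/* Stand-in for rte_cpu_to_le_32(): the returned value has the
 * little-endian encoding of v in memory, whatever the host order. */
static uint32_t demo_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24)
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Stand-in for rte_le_to_cpu_32(): decode little-endian bytes. */
static uint32_t demo_le32_to_cpu(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Rule 1: convert device-owned data before the cpu tests a bit in it. */
static int demo_desc_done(const struct demo_desc *d)
{
	assert(d != NULL);
	return (demo_le32_to_cpu(d->status) & DEMO_STAT_DD) != 0;
}

/* Rule 2: convert cpu data before storing it for the device. */
static void demo_desc_set_status(struct demo_desc *d, uint32_t status)
{
	assert(d != NULL);
	d->status = demo_cpu_to_le32(status);
}
```

On a little-endian host both helpers leave the value unchanged, which is why the missing conversions go unnoticed on x86.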
[dpdk-dev] [PATCH v2] librte_lpm: define tbl entry reversely for big endian
From: Xuelin Shi

This module uses type conversion between struct and int. Also truncation
and comparison is used with this int. It is not safe for different endian
arch. Add ifdef for big endian struct to fix this issue.

Signed-off-by: Xuelin Shi
---
changes for v2: add

 lib/librte_lpm/rte_lpm.h | 20
 1 file changed, 20 insertions(+)

diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index 1af150c..6cbcd4c 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -44,6 +44,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -96,6 +97,7 @@ extern "C" {
 /** Bitmask used to indicate successful lookup */
 #define RTE_LPM_LOOKUP_SUCCESS 0x0100

+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
 /** @internal Tbl24 entry structure. */
 struct rte_lpm_tbl24_entry {
 	/* Stores Next hop or group index (i.e. gindex)into tbl8. */
@@ -117,6 +119,24 @@ struct rte_lpm_tbl8_entry {
 	uint8_t valid_group :1; /**< Group validation flag. */
 	uint8_t depth :6; /**< Rule depth. */
 };
+#else
+struct rte_lpm_tbl24_entry {
+	uint8_t depth :6;
+	uint8_t ext_entry :1;
+	uint8_t valid :1;
+	union {
+		uint8_t tbl8_gindex;
+		uint8_t next_hop;
+	};
+};
+
+struct rte_lpm_tbl8_entry {
+	uint8_t depth :6;
+	uint8_t valid_group :1;
+	uint8_t valid :1;
+	uint8_t next_hop;
+};
+#endif

 /** @internal Rule structure. */
 struct rte_lpm_rule {
--
1.9.1
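To show why the struct-to-int conversion depends on byte order, here is a sketch that builds the same 16-bit entry with explicit shifts and masks instead of bit-fields. It is not what the patch does (the patch reverses the field order under an ifdef). The bit positions are an assumption taken from the usual little-endian bit-field layout, where the valid flag lands on bit 8 and so matches RTE_LPM_LOOKUP_SUCCESS (0x0100); all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as RTE_LPM_LOOKUP_SUCCESS in the patch context. */
#define DEMO_LOOKUP_SUCCESS 0x0100u

/* Pack a tbl24-style entry: next hop in bits 0-7, valid in bit 8,
 * ext_entry in bit 9, depth in bits 10-15 (assumed layout). */
static uint16_t demo_tbl24_pack(uint8_t next_hop, unsigned valid,
				unsigned ext_entry, unsigned depth)
{
	assert(depth <= 0x3Fu);
	return (uint16_t)(next_hop |
			  ((valid & 1u) << 8) |
			  ((ext_entry & 1u) << 9) |
			  ((depth & 0x3Fu) << 10));
}

/* The lookup test is a plain mask on the integer value, so it gives
 * the same answer on any host byte order. */
static int demo_tbl24_hit(uint16_t entry)
{
	return (entry & DEMO_LOOKUP_SUCCESS) != 0;
}

static uint8_t demo_tbl24_next_hop(uint16_t entry)
{
	return (uint8_t)(entry & 0xFFu);
}

static unsigned demo_tbl24_depth(uint16_t entry)
{
	return (entry >> 10) & 0x3Fu;
}
```

With bit-fields the compiler picks the bit positions, so the mask only lines up when the struct is declared in the order that suits the target; that is the reason for the reversed definition in the #else branch.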
[dpdk-dev] [PATCH v2 6/6] eal: Fix interface of pci_map_resource()
The function is implemented in both linuxapp and bsdapp, but interface is different. The patch fixes the function of bsdapp to do same as linuxapp. After applying it, file descriptor should be opened and closed out of pci_map_resource(). Also, remove redundant error messages from linuxapp. Signed-off-by: Tetsuya Mukawa --- lib/librte_eal/bsdapp/eal/eal_pci.c | 111 ++ lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 21 +++--- 2 files changed, 77 insertions(+), 55 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c index 08b91b4..d83916b 100644 --- a/lib/librte_eal/bsdapp/eal/eal_pci.c +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c @@ -100,7 +100,7 @@ struct mapped_pci_resource { struct rte_pci_addr pci_addr; char path[PATH_MAX]; - size_t nb_maps; + int nb_maps; struct pci_map maps[PCI_MAX_RESOURCE]; }; @@ -122,47 +122,30 @@ pci_unbind_kernel_driver(struct rte_pci_device *dev __rte_unused) /* map a particular resource from a file */ static void * -pci_map_resource(void *requested_addr, const char *devname, off_t offset, -size_t size) +pci_map_resource(void *requested_addr, int fd, off_t offset, size_t size, +int additional_flags) { - int fd; void *mapaddr; - /* -* open devname, to mmap it -*/ - fd = open(devname, O_RDWR); - if (fd < 0) { - RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", - devname, strerror(errno)); - goto fail; - } - /* Map the PCI memory resource of device */ mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE, - MAP_SHARED, fd, offset); - close(fd); - if (mapaddr == MAP_FAILED || - (requested_addr != NULL && mapaddr != requested_addr)) { - RTE_LOG(ERR, EAL, "%s(): cannot mmap(%s(%d), %p, 0x%lx, 0x%lx):" - " %s (%p)\n", __func__, devname, fd, requested_addr, + MAP_SHARED | additional_flags, fd, offset); + if (mapaddr == MAP_FAILED) { + RTE_LOG(ERR, EAL, + "%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s (%p)\n", + __func__, fd, requested_addr, (unsigned long)size, (unsigned long)offset, strerror(errno), mapaddr); 
- goto fail; - } - - RTE_LOG(DEBUG, EAL, " PCI memory mapped at %p\n", mapaddr); + } else + RTE_LOG(DEBUG, EAL, " PCI memory mapped at %p\n", mapaddr); return mapaddr; - -fail: - return NULL; } static int pci_uio_map_secondary(struct rte_pci_device *dev) { - size_t i; + int i, fd; struct mapped_pci_resource *uio_res; struct mapped_pci_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list); @@ -170,19 +153,34 @@ pci_uio_map_secondary(struct rte_pci_device *dev) TAILQ_FOREACH(uio_res, uio_res_list, next) { /* skip this element if it doesn't match our PCI address */ - if (memcmp(_res->pci_addr, >addr, sizeof(dev->addr))) + if (rte_eal_compare_pci_addr(_res->pci_addr, >addr)) continue; for (i = 0; i != uio_res->nb_maps; i++) { - if (pci_map_resource(uio_res->maps[i].addr, -uio_res->path, -(off_t)uio_res->maps[i].offset, -(size_t)uio_res->maps[i].size) - != uio_res->maps[i].addr) { + /* +* open devname, to mmap it +*/ + fd = open(uio_res->maps[i].path, O_RDWR); + if (fd < 0) { + RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", + uio_res->maps[i].path, strerror(errno)); + return -1; + } + + void *mapaddr = pci_map_resource(uio_res->maps[i].addr, + fd, (off_t)uio_res->maps[i].offset, + (size_t)uio_res->maps[i].size, 0); + if (mapaddr != uio_res->maps[i].addr) { RTE_LOG(ERR, EAL, - "Cannot mmap device resource\n"); + "Cannot mmap device resource " + "file %s to address: %p\n", + uio_res->maps[i].path, + uio_res->maps[i].addr); + close(fd);
[dpdk-dev] [PATCH v2 5/6] eal: Use map_idx in pci_uio_map_resource() of bsdapp to work same as linuxapp
This patch changes the code that maps pci resources in bsdapp. Linuxapp
has almost the same code. To consolidate both, fix the implementation of
bsdapp to work the same as linuxapp.

Signed-off-by: Tetsuya Mukawa
---
 lib/librte_eal/bsdapp/eal/eal_pci.c | 24
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 85f8671..08b91b4 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -195,7 +195,7 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
 static int
 pci_uio_map_resource(struct rte_pci_device *dev)
 {
-	int i, j;
+	int i, map_idx;
 	char devname[PATH_MAX]; /* contains the /dev/uioX */
 	void *mapaddr;
 	uint64_t phaddr;
@@ -247,31 +247,31 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	pagesz = sysconf(_SC_PAGESIZE);

 	maps = uio_res->maps;
-	for (i = uio_res->nb_maps = 0; i != PCI_MAX_RESOURCE; i++) {
+	for (i = 0, map_idx = 0; i != PCI_MAX_RESOURCE; i++) {

-		j = uio_res->nb_maps;
 		/* skip empty BAR */
 		if ((phaddr = dev->mem_resource[i].phys_addr) == 0)
 			continue;

 		/* if matching map is found, then use it */
 		offset = i * pagesz;
-		maps[j].offset = offset;
-		maps[j].phaddr = dev->mem_resource[i].phys_addr;
-		maps[j].size = dev->mem_resource[i].len;
-		if (maps[j].addr != NULL ||
-		    (mapaddr = pci_map_resource(NULL, devname, (off_t)offset,
-						(size_t)maps[j].size)
-		    ) == NULL) {
+		maps[map_idx].offset = offset;
+		maps[map_idx].phaddr = dev->mem_resource[i].phys_addr;
+		maps[map_idx].size = dev->mem_resource[i].len;
+		mapaddr = pci_map_resource(NULL, devname, (off_t)offset,
+					(size_t)maps[map_idx].size);
+		if ((maps[map_idx].addr != NULL) || (mapaddr == NULL)) {
 			rte_free(uio_res);
 			return -1;
 		}

-		maps[j].addr = mapaddr;
-		uio_res->nb_maps++;
+		maps[map_idx].addr = mapaddr;
+		map_idx++;
 		dev->mem_resource[i].addr = mapaddr;
 	}

+	uio_res->nb_maps = map_idx;
+
 	TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);

 	return 0;
--
1.9.1
[dpdk-dev] [PATCH v2 4/6] eal/bsdapp: Change names of pci related data structure
To merge pci code of linuxapp and bsdapp, this patch changes names like below. - uio_map to pci_map - uio_resource to mapped_pci_resource - uio_res_list to mapped_pci_res_list Also, add 'path' variable to pci_map of bsdapp. Signed-off-by: Tetsuya Mukawa --- lib/librte_eal/bsdapp/eal/eal_pci.c | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c index 3a22b49..85f8671 100644 --- a/lib/librte_eal/bsdapp/eal/eal_pci.c +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c @@ -83,8 +83,9 @@ * enabling bus master. */ -struct uio_map { +struct pci_map { void *addr; + char *path; uint64_t offset; uint64_t size; uint64_t phaddr; @@ -94,16 +95,16 @@ struct uio_map { * For multi-process we need to reproduce all PCI mappings in secondary * processes, so save them in a tailq. */ -struct uio_resource { - TAILQ_ENTRY(uio_resource) next; +struct mapped_pci_resource { + TAILQ_ENTRY(mapped_pci_resource) next; struct rte_pci_addr pci_addr; char path[PATH_MAX]; size_t nb_maps; - struct uio_map maps[PCI_MAX_RESOURCE]; + struct pci_map maps[PCI_MAX_RESOURCE]; }; -TAILQ_HEAD(uio_res_list, uio_resource); +TAILQ_HEAD(mapped_pci_res_list, mapped_pci_resource); static struct rte_tailq_elem rte_uio_tailq = { .name = "UIO_RESOURCE_LIST", @@ -162,9 +163,9 @@ static int pci_uio_map_secondary(struct rte_pci_device *dev) { size_t i; - struct uio_resource *uio_res; - struct uio_res_list *uio_res_list = - RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list); + struct mapped_pci_resource *uio_res; + struct mapped_pci_res_list *uio_res_list = + RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list); TAILQ_FOREACH(uio_res, uio_res_list, next) { @@ -201,10 +202,10 @@ pci_uio_map_resource(struct rte_pci_device *dev) uint64_t offset; uint64_t pagesz; struct rte_pci_addr *loc = >addr; - struct uio_resource *uio_res; - struct uio_res_list *uio_res_list = - RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list); - struct uio_map *maps; + 
struct mapped_pci_resource *uio_res; + struct mapped_pci_res_list *uio_res_list = + RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list); + struct pci_map *maps; dev->intr_handle.fd = -1; dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN; -- 1.9.1
[dpdk-dev] [PATCH v2 3/6] eal: Fix memory leaks and needless increment of pci_map_addr
This patch fixes the following memory leaks.
- When pci_map_resource() fails but path has been allocated correctly,
  path is not freed in pci_uio_map_resource().
- When open() fails, uio_res is not freed in pci_uio_map_resource().
- When pci_uio_unmap() is called, path should be freed.

It also fixes the following.
- When pci_map_resource() fails, mapaddr will be MAP_FAILED.
  In this case, pci_map_addr should not be incremented in
  pci_uio_map_resource().
- To shrink the code, move close().
- Remove the fail variable.

Signed-off-by: Tetsuya Mukawa
---
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 35
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index f0277be..0128cec 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -333,7 +333,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	maps = uio_res->maps;
 	for (i = 0, map_idx = 0; i != PCI_MAX_RESOURCE; i++) {
 		int fd;
-		int fail = 0;

 		/* skip empty BAR */
 		phaddr = dev->mem_resource[i].phys_addr;
@@ -347,6 +346,11 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 				loc->domain, loc->bus, loc->devid, loc->function,
 				i);

+		/* allocate memory to keep path */
+		maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
+		if (maps[map_idx].path == NULL)
+			goto fail0;
+
 		/*
 		 * open resource file, to mmap it
 		 */
@@ -354,7 +358,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 		if (fd < 0) {
 			RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
 					devname, strerror(errno));
-			return -1;
+			goto fail1;
 		}

 		/* try mapping somewhere close to the end of hugepages */
@@ -363,23 +367,13 @@ pci_uio_map_resource(struct rte_pci_device *dev)

 		mapaddr = pci_map_resource(pci_map_addr, fd, 0,
 				(size_t)dev->mem_resource[i].len, 0);
+		close(fd);
 		if (mapaddr == MAP_FAILED)
-			fail = 1;
+			goto fail1;

 		pci_map_addr = RTE_PTR_ADD(mapaddr,
 				(size_t)dev->mem_resource[i].len);

-		maps[map_idx].path = rte_malloc(NULL, strlen(devname) + 1, 0);
-		if (maps[map_idx].path == NULL)
-			fail = 1;
-
-		if (fail) {
-			rte_free(uio_res);
-			close(fd);
-			return -1;
-		}
-		close(fd);
-
 		maps[map_idx].phaddr = dev->mem_resource[i].phys_addr;
 		maps[map_idx].size = dev->mem_resource[i].len;
 		maps[map_idx].addr = mapaddr;
@@ -394,6 +388,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
 	TAILQ_INSERT_TAIL(uio_res_list, uio_res, next);

 	return 0;
+
+fail1:
+	rte_free(maps[map_idx].path);
+fail0:
+	for (i = 0; i < map_idx; i++)
+		rte_free(maps[i].path);
+	rte_free(uio_res);
+
+	return -1;
 }

 #ifdef RTE_LIBRTE_EAL_HOTPLUG
@@ -405,9 +408,11 @@ pci_uio_unmap(struct mapped_pci_resource *uio_res)
 	if (uio_res == NULL)
 		return;

-	for (i = 0; i != uio_res->nb_maps; i++)
+	for (i = 0; i != uio_res->nb_maps; i++) {
 		pci_unmap_resource(uio_res->maps[i].addr,
 				(size_t)uio_res->maps[i].size);
+		rte_free(uio_res->maps[i].path);
+	}
 }

 static struct mapped_pci_resource *
--
1.9.1
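The fail1/fail0 unwinding used in this patch can be tried as a standalone sketch. Assumptions: plain malloc()/free() replace rte_malloc()/rte_free(), the fail_at parameter is a made-up hook that simulates open() or mmap failing for one entry, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_MAPS 6

struct demo_path_map {
	char *path;
};

struct demo_res {
	int nb_maps;
	struct demo_path_map maps[DEMO_MAX_MAPS];
};

/* Build a resource holding a copy of each name. fail_at is the index
 * at which a later step is pretended to fail, or -1 for no failure. */
static struct demo_res *demo_map_all(const char *const names[], int n,
				     int fail_at)
{
	struct demo_res *res;
	int i, map_idx = 0;

	assert(names != NULL && n >= 0 && n <= DEMO_MAX_MAPS);

	res = calloc(1, sizeof(*res));
	if (res == NULL)
		return NULL;

	for (i = 0; i < n; i++) {
		size_t len = strlen(names[i]) + 1;

		/* allocate memory to keep path */
		res->maps[map_idx].path = malloc(len);
		if (res->maps[map_idx].path == NULL)
			goto fail0;
		memcpy(res->maps[map_idx].path, names[i], len);

		/* simulated open()/mmap failure for this entry */
		if (i == fail_at)
			goto fail1;

		map_idx++;
	}
	res->nb_maps = map_idx;
	return res;

fail1:
	/* the current entry's path was allocated but not yet counted */
	free(res->maps[map_idx].path);
fail0:
	/* paths of all entries completed before the failure */
	for (i = 0; i < map_idx; i++)
		free(res->maps[i].path);
	free(res);
	return NULL;
}

/* Counterpart of pci_uio_unmap(): every recorded path is freed too. */
static void demo_unmap_all(struct demo_res *res)
{
	int i;

	if (res == NULL)
		return;
	for (i = 0; i != res->nb_maps; i++)
		free(res->maps[i].path);
	free(res);
}
```

Falling through from fail1 into fail0 keeps a single exit path, so each allocation has exactly one place where it is released on error.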
[dpdk-dev] [PATCH v2 2/6] eal: Close file descriptor of uio configuration
When pci_uio_unmap_resource() is called, a file descriptor that is used
for uio configuration should be closed.

Signed-off-by: Tetsuya Mukawa
---
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 6
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 9cdf24f..f0277be 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -459,8 +459,12 @@ pci_uio_unmap_resource(struct rte_pci_device *dev)

 	/* close fd if in primary process */
 	close(dev->intr_handle.fd);
-
 	dev->intr_handle.fd = -1;
+
+	/* close cfg_fd if in primary process */
+	close(dev->intr_handle.uio_cfg_fd);
+	dev->intr_handle.uio_cfg_fd = -1;
+
 	dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
 }
 #endif /* RTE_LIBRTE_EAL_HOTPLUG */
--
1.9.1
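The close-then-reset idiom applied here to both descriptors could be factored as in the sketch below. struct demo_intr_handle is a hypothetical two-field stand-in for the real interrupt handle, and the check for a negative descriptor is an addition of this sketch, not part of the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in holding only the two descriptors of interest. */
struct demo_intr_handle {
	int fd;
	int uio_cfg_fd;
};

/* Close *fd if it is open and mark it as closed, so that a second
 * call cannot close an unrelated descriptor that reused the number. */
static void demo_close_and_reset(int *fd)
{
	assert(fd != NULL);
	if (*fd >= 0)
		close(*fd);
	*fd = -1;
}

static void demo_unmap_handle(struct demo_intr_handle *h)
{
	assert(h != NULL);
	demo_close_and_reset(&h->fd);
	demo_close_and_reset(&h->uio_cfg_fd);
}
```

Resetting to -1 right after close() is what makes a repeated unmap harmless.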
[dpdk-dev] [PATCH v2 1/6] eal: Fix coding style of eal_pci.c and eal_pci_uio.c
This patch fixes coding style of below files in linuxapp and bsdapp. - eal_pci.c - eal_pci_uio.c Signed-off-by: Tetsuya Mukawa --- lib/librte_eal/bsdapp/eal/eal_pci.c | 24 +--- lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 12 2 files changed, 21 insertions(+), 15 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c index fe3ef86..3a22b49 100644 --- a/lib/librte_eal/bsdapp/eal/eal_pci.c +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c @@ -161,9 +161,10 @@ fail: static int pci_uio_map_secondary(struct rte_pci_device *dev) { -size_t i; -struct uio_resource *uio_res; - struct uio_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list); + size_t i; + struct uio_resource *uio_res; + struct uio_res_list *uio_res_list = + RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list); TAILQ_FOREACH(uio_res, uio_res_list, next) { @@ -179,10 +180,10 @@ pci_uio_map_secondary(struct rte_pci_device *dev) != uio_res->maps[i].addr) { RTE_LOG(ERR, EAL, "Cannot mmap device resource\n"); - return (-1); + return -1; } } - return (0); + return 0; } RTE_LOG(ERR, EAL, "Cannot find resource for device\n"); @@ -201,7 +202,8 @@ pci_uio_map_resource(struct rte_pci_device *dev) uint64_t pagesz; struct rte_pci_addr *loc = >addr; struct uio_resource *uio_res; - struct uio_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list); + struct uio_res_list *uio_res_list = + RTE_TAILQ_CAST(rte_uio_tailq.head, uio_res_list); struct uio_map *maps; dev->intr_handle.fd = -1; @@ -209,7 +211,7 @@ pci_uio_map_resource(struct rte_pci_device *dev) /* secondary processes - use already recorded details */ if (rte_eal_process_type() != RTE_PROC_PRIMARY) - return (pci_uio_map_secondary(dev)); + return pci_uio_map_secondary(dev); snprintf(devname, sizeof(devname), "/dev/uio at pci:%u:%u:%u", dev->addr.bus, dev->addr.devid, dev->addr.function); @@ -233,7 +235,7 @@ pci_uio_map_resource(struct rte_pci_device *dev) if ((uio_res = rte_zmalloc("UIO_RES", sizeof 
(*uio_res), 0)) == NULL) { RTE_LOG(ERR, EAL, "%s(): cannot store uio mmap details\n", __func__); - return (-1); + return -1; } snprintf(uio_res->path, sizeof(uio_res->path), "%s", devname); @@ -261,7 +263,7 @@ pci_uio_map_resource(struct rte_pci_device *dev) (size_t)maps[j].size) ) == NULL) { rte_free(uio_res); - return (-1); + return -1; } maps[j].addr = mapaddr; @@ -271,7 +273,7 @@ pci_uio_map_resource(struct rte_pci_device *dev) TAILQ_INSERT_TAIL(uio_res_list, uio_res, next); - return (0); + return 0; } /* Scan one pci sysfs entry, and fill the devices list from it. */ @@ -311,7 +313,7 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf) /* FreeBSD has no NUMA support (yet) */ dev->numa_node = 0; -/* parse resources */ + /* parse resources */ switch (conf->pc_hdr & PCIM_HDRTYPE) { case PCIM_HDRTYPE_NORMAL: max = PCIR_MAX_BAR_0; diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c index 2d1c69b..9cdf24f 100644 --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c @@ -92,7 +92,8 @@ pci_uio_map_secondary(struct rte_pci_device *dev) { int fd, i; struct mapped_pci_resource *uio_res; - struct mapped_pci_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list); + struct mapped_pci_res_list *uio_res_list = + RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list); TAILQ_FOREACH(uio_res, uio_res_list, next) { @@ -272,7 +273,8 @@ pci_uio_map_resource(struct rte_pci_device *dev) uint64_t phaddr; struct rte_pci_addr *loc = >addr; struct mapped_pci_resource *uio_res; - struct mapped_pci_res_list *uio_res_list = RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list); + struct mapped_pci_res_list *uio_res_list = + RTE_TAILQ_CAST(rte_uio_tailq.head, mapped_pci_res_list); struct pci_map *maps; dev->intr_handle.fd = -1; @@ -412,7 +414,8 @@ static struct mapped_pci_resource * pci_uio_find_resource(struct rte_pci_device *dev) { struct mapped_pci_resource
[dpdk-dev] [PATCH 6/6] eal: Fix interface of pci_map_resource()
On 2015/03/20 18:53, Iremonger, Bernard wrote: >> -Original Message- >> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp] >> Sent: Tuesday, March 17, 2015 9:31 AM >> To: dev at dpdk.org >> Cc: Iremonger, Bernard; Richardson, Bruce; Tetsuya Mukawa >> Subject: [PATCH 6/6] eal: Fix interface of pci_map_resource() >> >> The function is implemented in both linuxapp and bsdapp, but interface is >> different. The patch fixes >> the function of bsdapp to do same as linuxapp. After applying it, file >> descriptor should be opened and >> closed out of pci_map_resource(). >> Also, remove redundant error messages from linuxapp. >> >> Signed-off-by: Tetsuya Mukawa >> --- >> lib/librte_eal/bsdapp/eal/eal_pci.c | 109 >> ++ >> lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 21 +++--- >> 2 files changed, 75 insertions(+), 55 deletions(-) >> >> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c >> b/lib/librte_eal/bsdapp/eal/eal_pci.c >> index 08b91b4..21a3d79 100644 >> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c >> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c >> @@ -100,7 +100,7 @@ struct mapped_pci_resource { >> >> struct rte_pci_addr pci_addr; >> char path[PATH_MAX]; >> -size_t nb_maps; >> +int nb_maps; >> struct pci_map maps[PCI_MAX_RESOURCE]; }; >> >> @@ -122,47 +122,30 @@ pci_unbind_kernel_driver(struct rte_pci_device *dev >> __rte_unused) >> >> /* map a particular resource from a file */ static void * >> -pci_map_resource(void *requested_addr, >> const char *devname, off_t offset, >> - size_t size) >> +pci_map_resource(void *requested_addr, int fd, off_t offset, size_t size, >> + int additional_flags) >> { >> -int fd; >> void *mapaddr; >> >> -/* >> - * open devname, to mmap it >> - */ >> -fd = open(devname, O_RDWR); >> -if (fd < 0) { >> -RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", >> -devname, strerror(errno)); >> -goto fail; >> -} >> - >> /* Map the PCI memory resource of device */ >> mapaddr = mmap(requested_addr, size, PROT_READ | PROT_WRITE, >> -MAP_SHARED, fd, offset); >> 
-close(fd); >> -if (mapaddr == MAP_FAILED || >> -(requested_addr != NULL && mapaddr != requested_addr)) { >> -RTE_LOG(ERR, EAL, "%s(): cannot mmap(%s(%d), %p, 0x%lx, 0x%lx):" >> -" %s (%p)\n", __func__, devname, fd, requested_addr, >> +MAP_SHARED | additional_flags, fd, offset); >> +if (mapaddr == MAP_FAILED) { >> +RTE_LOG(ERR, EAL, >> +"%s(): cannot mmap(%d, %p, 0x%lx, 0x%lx): %s (%p)\n", >> +__func__, fd, requested_addr, >> (unsigned long)size, (unsigned long)offset, >> strerror(errno), mapaddr); >> -goto fail; > Hi Tetsuya, > > Previously pci_map_resource() returned NULL when MAP_FAILED occurred. > It now returns MAP_FAILED. > Where pci_map_resource() is called it should now check for MAP_FAILED instead > of NULL, or else the return value of NULL should be restored. > There is another comment below. Hi Bernard, I appreciate for your comment. Yes it should be. I will fix it. Regards, Tetsuya > > >> -} >> - >> -RTE_LOG(DEBUG, EAL, " PCI memory mapped at %p\n", mapaddr); >> +} else >> +RTE_LOG(DEBUG, EAL, " PCI memory mapped at %p\n", mapaddr); >> >> return mapaddr; >> - >> -fail: >> -return NULL; >> } >> >> static int >> pci_uio_map_secondary(struct rte_pci_device *dev) { >> -size_t i; >> +int i, fd; >> struct mapped_pci_resource *uio_res; >> struct mapped_pci_res_list *uio_res_list = >> RTE_TAILQ_CAST(rte_uio_tailq.head, >> mapped_pci_res_list); @@ -170,19 >> +153,34 @@ pci_uio_map_secondary(struct rte_pci_device *dev) >> TAILQ_FOREACH(uio_res, uio_res_list, next) { >> >> /* skip this element if it doesn't match our PCI address */ >> -if (memcmp(_res->pci_addr, >addr, sizeof(dev->addr))) >> +if (rte_eal_compare_pci_addr(_res->pci_addr, >addr)) >> continue; >> >> for (i = 0; i != uio_res->nb_maps; i++) { >> -if (pci_map_resource(uio_res->maps[i].addr, >> - uio_res->path, >> - (off_t)uio_res->maps[i].offset, >> - (size_t)uio_res->maps[i].size) >> -!= uio_res->maps[i].addr) { >> +/* >> + * open devname, to mmap it >> + */ >> +fd = open(uio_res->maps[i].path, 
O_RDWR); >> +if (fd < 0) { >> +RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", >> +uio_res->maps[i].path, strerror(errno)); >> +
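The point raised in the review above (once the wrapper returns the mmap() result directly, callers have to compare against MAP_FAILED and not NULL) can be checked with a small sketch. Assumptions: demo_map_resource() only mirrors the new pci_map_resource() signature, and a temporary file stands in for the uio device node; both demo_* functions are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Same shape as the reworked pci_map_resource(): the caller owns fd,
 * and the raw mmap() result is returned, so failure is MAP_FAILED. */
static void *demo_map_resource(void *requested_addr, int fd, off_t offset,
			       size_t size, int additional_flags)
{
	return mmap(requested_addr, size, PROT_READ | PROT_WRITE,
		    MAP_SHARED | additional_flags, fd, offset);
}

/* Caller side: open, map, close, then test against MAP_FAILED. */
static int demo_map_tmpfile(size_t size, void **out)
{
	FILE *f;
	void *addr;
	int fd;

	assert(out != NULL);

	f = tmpfile();
	if (f == NULL)
		return -1;
	fd = fileno(f);
	if (ftruncate(fd, (off_t)size) != 0) {
		fclose(f);
		return -1;
	}

	addr = demo_map_resource(NULL, fd, 0, size, 0);
	fclose(f); /* the mapping stays valid after the fd is closed */
	if (addr == MAP_FAILED)
		return -1;

	*out = addr;
	return 0;
}
```

A caller that still tests for NULL would treat MAP_FAILED, which is (void *)-1, as a valid address.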
[dpdk-dev] [PATCH v2 0/6] Clean up pci uio implementations
This patch set cleans up the pci uio implementation. The cleanups are for
consolidating the pci uio implementations of linuxapp and bsdapp, and for
moving the consolidated functions into eal common.
Because of the above, this patch set tries to make the linuxapp and bsdapp
implementations almost the same.
The actual consolidation will be done in a later patch set.

PATCH v2 changes:
- Move 'if-condition' to later patch series.
- Fix memory leaks of path.
- Fix typos.
  (Thanks to David Marchand)
- Fix commit title and body.
- Fix pci_map_resource() to handle MAP_FAILED.
  (Thanks to Iremonger, Bernard)

Changes:
- This patch set is derived from below.
  "[PATCH v2] eal: Port Hotplug support for BSD"
- Set cfg_fd as -1, when cfg_fd is closed.
  (Thanks to Iremonger, Bernard)
- Remove needless coding style fixings.
- Fix coding style of if-else condition.
  (Thanks to Richardson, Bruce)

Tetsuya Mukawa (6):
  eal: Fix coding style of eal_pci.c and eal_pci_uio.c
  eal: Close file descriptor of uio configuration
  eal: Fix memory leaks and needless increment of pci_map_addr
  eal/bsdapp: Change names of pci related data structure
  eal: Use map_idx in pci_uio_map_resource() of bsdapp to work same as
    linuxapp
  eal: Fix interface of pci_map_resource()

 lib/librte_eal/bsdapp/eal/eal_pci.c       | 162
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  72
 2 files changed, 136 insertions(+), 98 deletions(-)

--
1.9.1
[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst
On Tue, Mar 24, 2015 at 12:54:14PM +0200, Dor Green wrote: > I've managed to fix it so 1.8 works, and the segmentation fault still occurs. > > On Tue, Mar 24, 2015 at 11:55 AM, Dor Green wrote: > > I tried 1.8, but that fails to initialize my device and fails at the pci > > probe: > > "Cause: Requested device :04:00.1 cannot be used" > > Can't even compile 2.0rc2 atm, getting: > > "/usr/lib/gcc/x86_64-linux-gnu/4.6/include/emmintrin.h:701:1: note: > > expected '__m128i' but argument is of type 'int'" > > For reasons I don't understand. > > > > As for the example apps (in 1.7), I can run them properly but I don't > > think any of them do the same processing as I do. Note that mine does > > work with most packets. > > > > Couple of further questions: 1. What config options are being used to configure the port and what is the output printed at port initialization time? This is needed to let us track down what specific RX path is being used inside the ixgbe driver 2. What type of packets specifically cause problems? Is it reproducible with one particular packet, or packet type? Are you sending in jumbo-frames? Regards, /Bruce > > On Mon, Mar 23, 2015 at 11:24 PM, Matthew Hall > > wrote: > >> On Mon, Mar 23, 2015 at 05:19:00PM +0200, Dor Green wrote: > >>> I changed it to free and it still happens. Note that the segmentation > >>> fault > >>> happens before that anyway. > >>> > >>> I am using 1.7.1 at the moment. I can try using a newer version. > >> > >> I'm using 1.7.X in my open-source DPDK-based app and it works, but I have > >> an > >> IGB 1-gigabit NIC though, and how RX / TX work are quite driver specific of > >> course. > >> > >> I suspect there's some issue with how things are working in your IXGBE NIC > >> driver / setup. Do the same failures occur inside of the DPDK's own sample > >> apps? > >> > >> Matthew.
[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'
On Tue, Mar 24, 2015 at 8:28 AM, Xie, Huawei wrote: > On 3/23/2015 8:54 PM, Pavel Boldin wrote: > > Due to increased `struct file's reference counter subsequent call > > to `filp_close' does not free the `struct file'. Prepend `fput' call > > to decrease the reference counter. > > > > Signed-off-by: Pavel Boldin > > --- > > lib/librte_vhost/eventfd_link/eventfd_link.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c > b/lib/librte_vhost/eventfd_link/eventfd_link.c > > index 7755dd6..62c45c8 100644 > > --- a/lib/librte_vhost/eventfd_link/eventfd_link.c > > +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c > > @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int > ioctl, unsigned long arg) > >* Release the existing eventfd in the source process > >*/ > > spin_lock(>file_lock); > > + fput(file); > > filp_close(file, files); > > fdt = files_fdtable(files); > > fdt->fd[eventfd_copy.source_fd] = NULL; > Acked-by Huawei Xie > > In future, we should remove the allocation of src eventfd, allocate a > free fd from kernel, and install it with target fd. > Well, I don't think this is correct, because this will put too much job for the kernel module making it a combiner. At the moment I propose to accept module refactoring patch to the upstream. After that the module can be reworked along with the application code. The reason is I can't work on DPDK since I have no hardware to test application at so I can't help with patch that touches application code. Pavel
[dpdk-dev] [PATCH] mk: added make target to print out system info
On Tue, Mar 24, 2015 at 02:52:59PM +, John McNamara wrote:
> Added a 'make system_info' target to print out system info
> related to DPDK. This is intended as output that can be
> attached to bug reports.
> ---
>  mk/rte.sdkroot.mk | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/mk/rte.sdkroot.mk b/mk/rte.sdkroot.mk
> index e8423b0..b477d09 100644
> --- a/mk/rte.sdkroot.mk
> +++ b/mk/rte.sdkroot.mk
> @@ -123,3 +123,36 @@ examples examples_clean:
>  %:
>  	$(Q)$(MAKE) -f $(RTE_SDK)/mk/rte.sdkconfig.mk checkconfig
>  	$(Q)$(MAKE) -f $(RTE_SDK)/mk/rte.sdkbuild.mk $@
> +
> +.PHONY: system_info
> +system_info:
> +	$(Q)echo
> +	$(Q)echo "CC version"
> +	$(Q)echo "=="
> +	$(Q)$(CC) --version
> +	$(Q)echo
> +
> +	$(Q)echo "DPDK version"
> +	$(Q)echo ""
> +	$(Q)$(MAKE) showversion
> +	$(Q)echo
> +
> +	$(Q)echo "Git commit"
> +	$(Q)echo "=="
> +	$(Q)git log --pretty=format:'%H' -1
> +	$(Q)echo
> +
> +	$(Q)echo "Uname"
> +	$(Q)echo "="
> +	$(Q)uname -srvmpio
> +	$(Q)echo
> +
> +	$(Q)echo "Hugepages"
> +	$(Q)echo "="
> +	$(Q)grep -i huge /proc/meminfo
> +	$(Q)echo
> +
> +	$(Q)tools/cpu_layout.py
> +
> +	$(Q)tools/dpdk_nic_bind.py --status
> +	$(Q)echo
> --
> 1.8.1.4
>
>

Nak, for a few reasons:

1) While this target is in a common makefile, at least some of the
information it gathers is operating system specific (e.g. /proc/meminfo).
This isn't going to work on BSD, or other operating systems that we might
support in the future.

2) This is tied to the build system. There's no guarantee that users will
diagnose problems only on the system that they built the DPDK on.

A better solution might be to simply document the sort of information that
a bug reporter is expected to gather, along with some sample tools for doing
so. There are numerous tools to get the above information, both in isolation
and in aggregate.

Neil
[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst
I've managed to fix it so 1.8 works, and the segmentation fault still occurs. On Tue, Mar 24, 2015 at 11:55 AM, Dor Green wrote: > I tried 1.8, but that fails to initialize my device and fails at the pci > probe: > "Cause: Requested device :04:00.1 cannot be used" > Can't even compile 2.0rc2 atm, getting: > "/usr/lib/gcc/x86_64-linux-gnu/4.6/include/emmintrin.h:701:1: note: > expected '__m128i' but argument is of type 'int'" > For reasons I don't understand. > > As for the example apps (in 1.7), I can run them properly but I don't > think any of them do the same processing as I do. Note that mine does > work with most packets. > > > On Mon, Mar 23, 2015 at 11:24 PM, Matthew Hall > wrote: >> On Mon, Mar 23, 2015 at 05:19:00PM +0200, Dor Green wrote: >>> I changed it to free and it still happens. Note that the segmentation fault >>> happens before that anyway. >>> >>> I am using 1.7.1 at the moment. I can try using a newer version. >> >> I'm using 1.7.X in my open-source DPDK-based app and it works, but I have an >> IGB 1-gigabit NIC though, and how RX / TX work are quite driver specific of >> course. >> >> I suspect there's some issue with how things are working in your IXGBE NIC >> driver / setup. Do the same failures occur inside of the DPDK's own sample >> apps? >> >> Matthew.
[dpdk-dev] [PATCH] cmdline: fix type format from unsigned to size_t for buffer size
Hi, On 03/24/2015 11:48 AM, Jastrzebski, MichalX K wrote: >>> On 02/20/2015 05:18 PM, Daniel Mrzyglod wrote: Function match_inst is used to take buffor using sizeof() which is size_t >> type. This modification also involved changing '%u' to '%zu' in printf function. Signed-off-by: Daniel Mrzyglod [...] >>> >>> Did you see a specific issue with the current code? (maybe a compilation >>> issue or a klocwork issue?) > Hi Olivier, Thomas > Yes, this is an issue reported by the static analysis tool. >>> >>> I think this patch is ok, but there are many places where this kind >>> of fixes should be applied in cmdline (cmdline_parse_*(), >>> cmdline_get_help_*(), etc.). Is there a motivation for changing it >>> only there? > The tool we use didn't reported other places, > that is why Daniel send only this one change. I think it would be great to look at other places to check if the same change is required. Regards, Olivier
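For anyone checking the other cmdline call sites, the kind of mismatch the static analysis tool flags looks like the sketch below. demo_format_size() and the message text are hypothetical and not taken from librte_cmdline; the only point is that a value obtained from sizeof() has type size_t and is printed with '%zu'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format a buffer size. With '%u' the argument would be read as an
 * unsigned int, which is undefined behaviour wherever size_t is
 * wider (e.g. 64-bit Linux); '%zu' matches size_t everywhere. */
static int demo_format_size(char *out, size_t out_size, size_t buf_size)
{
	assert(out != NULL && out_size > 0);
	return snprintf(out, out_size, "buffer size: %zu", buf_size);
}
```

The same rule applies when the size is only passed on as a parameter: the parameter should be declared size_t so that no narrowing happens on the way to the format string.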
[dpdk-dev] Packet data out of bounds after rte_eth_rx_burst
I tried 1.8, but that fails to initialize my device and fails at the pci probe: "Cause: Requested device :04:00.1 cannot be used" Can't even compile 2.0rc2 atm, getting: "/usr/lib/gcc/x86_64-linux-gnu/4.6/include/emmintrin.h:701:1: note: expected '__m128i' but argument is of type 'int'" For reasons I don't understand. As for the example apps (in 1.7), I can run them properly but I don't think any of them do the same processing as I do. Note that mine does work with most packets. On Mon, Mar 23, 2015 at 11:24 PM, Matthew Hall wrote: > On Mon, Mar 23, 2015 at 05:19:00PM +0200, Dor Green wrote: >> I changed it to free and it still happens. Note that the segmentation fault >> happens before that anyway. >> >> I am using 1.7.1 at the moment. I can try using a newer version. > > I'm using 1.7.X in my open-source DPDK-based app and it works, but I have an > IGB 1-gigabit NIC though, and how RX / TX work are quite driver specific of > course. > > I suspect there's some issue with how things are working in your IXGBE NIC > driver / setup. Do the same failures occur inside of the DPDK's own sample > apps? > > Matthew.
[dpdk-dev] [PATCH v2 2/6] eal: Close file descriptor of uio configuration
On Tue, 24 Mar 2015 13:18:33 +0900 Tetsuya Mukawa wrote: > When pci_uio_unmap_resource() is called, a file descriptor that is used > for uio configuration should be closed. > > Signed-off-by: Tetsuya Mukawa > --- > lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > index 9cdf24f..f0277be 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c > @@ -459,8 +459,12 @@ pci_uio_unmap_resource(struct rte_pci_device *dev) > > /* close fd if in primary process */ > close(dev->intr_handle.fd); > - > dev->intr_handle.fd = -1; > + > + /* close cfg_fd if in primary process */ > + close(dev->intr_handle.uio_cfg_fd); > + dev->intr_handle.uio_cfg_fd = -1; > + > dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN; > } > #endif /* RTE_LIBRTE_EAL_HOTPLUG */ For the Qlogic/Broadcom driver it needed the config fd handle, and I added generic config space access functions.
[dpdk-dev] dpdk kni ping answer
The source of my problem seems to be that Ethtool is not supported in VMs (VF).
Is there another solution to give an ip to a dpdk port?

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yaron Illouz
Sent: Monday, March 23, 2015 2:08 PM
To: dev at dpdk.org; dev
Subject: [dpdk-dev] dpdk kni ping answer

Hi

I want to give an ip to a dpdk port, because I need to receive traffic by gre replication.
I thought I could do it with dpdk kni (http://dpdk.org/doc/guides/prog_guide/kernel_nic_interface.html).
I am able to give an ip to interface vEth0, but I can't ping or ssh to the ip address.
I set the ip with the following command:
ifconfig vEth0 192.168.69.4 netmask 255.255.255.0 up

Is this something possible to do with dpdk kni? If not, what could be another solution for this?

Regards
[dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions added + unittest
Hi Vadim, > From: Vadim Suraev [mailto:vadim.suraev at gmail.com] > Sent: Tuesday, March 24, 2015 7:53 AM > To: Ananyev, Konstantin > Cc: Olivier MATZ; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions > added + unittest > > Hi, Konstantin, > > >Though from my point, such function should be generic as > >rte_pktmbuf_free_chain() - > >no special limitations like all mbufs from one pool, refcnt==1, etc. > I misunderstood, I'll rework. Ok, thanks. Konstantin > Regards, > ?Vadim. > > On Tue, Mar 24, 2015 at 1:48 AM, Ananyev, Konstantin intel.com> wrote: > Hi Vadim, > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vadim Suraev > > Sent: Monday, March 23, 2015 5:31 PM > > To: Olivier MATZ > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions > > added + unittest > > > > Hi, Olivier, > > No, I personally need to free a chain using mempool bulk. If I'm not > > mistaken, it has been proposed by one of reviewers to have lower level > > function, so it was done (I'm sorry if misunderstood) > > Was it me? > As I remember, I said it would be good to create rte_pktmbuf_bulk_free() or > rte_pktmbuf_seg_bulk_free() - > that would free a bulk of mbufs segments in the same manner as > rte_pktmbuf_free_chain() does: > count number of consecutive mbufs from the same mempool to be freed and then > put them back into the pool at one go. > Such function would be useful inside PMD code. > In fact we already using analogy of such function inside vPMD TX code. > Though from my point, such function should be generic as > rte_pktmbuf_free_chain() - > no special limitations like all mbufs from one pool, refcnt==1, etc. > So if it was me who confused you - I am sorry. > Konstantin > > > Regards, > > Vadim. 
> > On Mar 23, 2015 8:44 PM, "Olivier MATZ" wrote: > > > > > Hi Neil, > > > > > > On 03/19/2015 02:16 PM, Neil Horman wrote: > > > >> On 03/18/2015 09:58 PM, Neil Horman wrote: > > > +/** > > > + * Free a bulk of mbufs into its original mempool. > > > + * This function assumes: > > > + * - refcnt equals 1 > > > + * - mbufs are direct > > > + * - all mbufs must belong to the same mempool > > > + * > > > + * @param mbufs > > > + *   Array of pointers to mbuf > > > + * @param count > > > + *   Array size > > > + */ > > > +static inline void rte_pktmbuf_bulk_free(struct rte_mbuf **mbufs, > > > + unsigned count) > > > +{ > > > + unsigned idx; > > > + > > > + RTE_MBUF_ASSERT(count > 0); > > > + > > > + for (idx = 0; idx < count; idx++) { > > > + RTE_MBUF_ASSERT(mbufs[idx]->pool == mbufs[0]->pool); > > > + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 1); > > > + rte_mbuf_refcnt_set(mbufs[idx], 0); > > > >>> This is really a misuse of the API. The entire point of reference > > > counting is > > > >>> to know when an mbuf has no more references and can be freed. By > > > forcing all > > > >>> the reference counts to zero here, you allow the refcnt infrastructure > > > to be > > > >>> circumvented, causing memory leaks. > > > >>> > > > >>> I think what you need to do here is enhance the underlying pktmbuf > > > interface > > > >>> such that an rte_mbuf structure has a destructor method association > > > with it > > > >>> which is called when its refcnt reaches zero. That way the > > > >>> rte_pktmbuf_bulk_free function can just decrement the refcnt on each > > > >>> mbuf_structure, and the pool as a whole can be returned when the > > > destructor > > > >>> function discovers that all mbufs in that bulk pool are freed. > > > >> > > > >> I don't really understand what's the problem here.
The API explicitly > > > >> describes the conditions for calling this functions: the segments are > > > >> directs, they belong to the same mempool, and their refcnt is 1. > > > >> > > > >> This function could be useful in a driver which knows that the mbuf > > > >> it allocated matches this conditions. I think an application that > > > >> only uses direct mbufs and one mempool could also use this function. > > > > > > > > > > > > That last condition is my issue with this patch, that the user has to > > > know what > > > > refcnts are.? It makes this api useful for little more than the test > > > case that > > > > is provided with it.? Its irritating enough that for singly allocated > > > mbufs the > > > > user has to know what the refcount of a buffer is before freeing, but at > > > least > > > > they can macrotize a {rte_pktmbuf_refcnt_update; > > > if(rte_pktmbuf_refct_read) then > > > > free} operation. > > > > > > > > With this, you've placed the user in charge of not only doing that, but > > > also of > > > > managing the relationship between pktmbufs and the pool they came from. > > > while > > > > that makes
[dpdk-dev] [PATCH] cmdline: fix type format from unsigned to size_t for buffer size
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > Sent: Monday, March 23, 2015 12:47 PM > To: Mrzyglod, DanielX T > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH] cmdline: fix type format from unsigned to > size_t for buffer size > > Daniel, > Without answer to the question from Olivier, this patch is going to be > ignored. > > 2015-02-24 12:03, Olivier MATZ: > > Hi Daniel, > > > > On 02/20/2015 05:18 PM, Daniel Mrzyglod wrote: > > > Function match_inst is used to take buffor using sizeof() which is size_t > type. > > > This modification also involved changing '%u' to '%zu' in printf function. > > > > > > Signed-off-by: Daniel Mrzyglod > > > --- > > > lib/librte_cmdline/cmdline_parse.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/lib/librte_cmdline/cmdline_parse.c > b/lib/librte_cmdline/cmdline_parse.c > > > index dfc885c..0821791 100644 > > > --- a/lib/librte_cmdline/cmdline_parse.c > > > +++ b/lib/librte_cmdline/cmdline_parse.c > > > @@ -138,7 +138,7 @@ nb_common_chars(const char * s1, const char * > s2) > > >*/ > > > static int > > > match_inst(cmdline_parse_inst_t *inst, const char *buf, > > > -unsigned int nb_match_token, void *resbuf, unsigned resbuf_size) > > > +unsigned int nb_match_token, void *resbuf, size_t resbuf_size) > > > { > > > unsigned int token_num=0; > > > cmdline_parse_token_hdr_t * token_p; > > > @@ -169,7 +169,7 @@ match_inst(cmdline_parse_inst_t *inst, const > char *buf, > > > > > > if (token_hdr.offset > resbuf_size) { > > > printf("Parse error(%s:%d): Token > > > offset(%u) > " > > > - "exceeds maximum size(%u)\n", > > > + "exceeds maximum size(%zu)\n", > > > __FILE__, __LINE__, > > > token_hdr.offset, resbuf_size); > > > return -ENOBUFS; > > > > > > > > > Did you see a specific issue with the current code? (maybe a compilation > > issue or a klocwork issue?) Hi Olivier, Thomas Yes, this is an issue reported by the static analysis tool. 
> > > > I think this patch is ok, but there are many places where this kind > > of fixes should be applied in cmdline (cmdline_parse_*(), > > cmdline_get_help_*(), etc.). Is there a motivation for changing it > > only there? The tool we use didn't reported other places, that is why Daniel send only this one change. > > > > Regards, > > Olivier > > >
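The format-specifier point in this thread can be illustrated with a minimal sketch. The helper name format_overflow_msg is hypothetical and is not part of librte_cmdline; it only shows that a size_t argument is paired with %zu while the unsigned offset keeps %u.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the cmdline library code: builds the same kind
 * of message as the patched printf(). The size_t argument uses %zu; using
 * %u for it would be incorrect wherever size_t is wider than unsigned int.
 */
static int format_overflow_msg(char *out, size_t out_len,
                               unsigned int offset, size_t resbuf_size)
{
    return snprintf(out, out_len,
                    "Token offset(%u) exceeds maximum size(%zu)",
                    offset, resbuf_size);
}
```

Passing sizeof(buffer) straight into such a function needs no cast once the parameter type is size_t, which is presumably what the static analysis tool was pointing at.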
[dpdk-dev] [PATCH] cast used->idx to volatile
On 3/24/2015 3:44 PM, Linhaifeng wrote: > > On 2015/3/24 9:53, Xie, Huawei wrote: >> On 3/24/2015 9:00 AM, Linhaifeng wrote: >>> On 2015/3/23 20:54, Xie, Huawei wrote: > -Original Message- > From: Linhaifeng [mailto:haifeng.lin at huawei.com] > Sent: Monday, March 23, 2015 8:24 PM > To: dev at dpdk.org > Cc: Ouyang, Changchun; Xie, Huawei > Subject: Re: [dpdk-dev] [PATCH] cast used->idx to volatile > > > > On 2015/3/21 16:07, linhaifeng wrote: >> From: Linhaifeng >> >> Same as rte_vhost_enqueue_burst we should cast used->idx >> to volatile before notify guest. >> >> Signed-off-by: Linhaifeng >> --- >> lib/librte_vhost/vhost_rxtx.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/lib/librte_vhost/vhost_rxtx.c >> b/lib/librte_vhost/vhost_rxtx.c >> index 535c7a1..8d674d1 100644 >> --- a/lib/librte_vhost/vhost_rxtx.c >> +++ b/lib/librte_vhost/vhost_rxtx.c >> @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, > uint16_t queue_id, >> } >> >> rte_compiler_barrier(); >> -vq->used->idx += entry_success; >> +*(volatile uint16_t *)&vq->used->idx += entry_success; Haifeng: We have compiler barrier before and an external function call behind, so we don't need volatile here. Do you meet issue? >>> Tx_q is sometimes stopped when we use virtio_net. Because vhost thought >>> there are no buffers in tx_q and virtio_net >>> thought vhost haven't handle all packets so we have to restart VM to restore >>> work.
>>> >>> The status in VM is: >>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246687] net eth7: virtnet_poll >>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246690] net eth7: receive_buf >>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246693] net eth7: vi->num=239 >>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246695] net eth7: >>> svq:avail->idx=52939 used->idx=52939 num_free=18 num_added=0 >>> svq->last_used_idx=52820 >>> Mar 18 17:11:10 linux-b2ij kernel: [46337.246699] net eth7: >>> rvq:avail->idx=36215 used->idx=35977 num_free=18 num_added=0 >>> rvq->last_used_idx=35977 >>> Mar 18 17:11:11 linux-b2ij kernel: [46337.901038] net eth7: dev_queue_xmit, >>> qdisc->flags=4, qdisc->state deactiveed=0 >>> Mar 18 17:11:12 linux-b2ij kernel: [46337.901042] net eth7: dev_queue_xmit, >>> txq->state=1, stopped=1 >>> >>> Why compiler barrier not take effect in our case? Is compiler barrier >>> depended on -O3 option? We use -O2 option. >> compiler barrier always works regardless of the optimization option. >> I don't get your story, but the key thing is, do you check the asm code? >> If called from outside as an API, how is it possible it is optimized? >> there is only one update to used->idx in that function. > > Do you mean rte_vhost_enqueue_burst also not need cast used->idx to volatile > ? Why not remove it? I checked the code. Seems we can remove. That is another issue. For your issue, you meet problem, and submit this this patch, but i am a bit confused it is the root cause. Do you check the asm code that volatile is optimized? >> /* Kick guest if required. */ >> if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) >> eventfd_write((int)vq->callfd, 1); >> > -- > Regards, > Haifeng >> >> . >>
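The two variants debated in this thread can be put side by side in a small sketch. Everything here is an assumption for illustration: the struct is a simplified stand-in for the vring used ring, and compiler_barrier() is a GCC/Clang-style barrier assumed to behave like rte_compiler_barrier(). Neither variant emits a CPU memory barrier.

```c
#include <stdint.h>

/* GCC/Clang compiler barrier: forbids the compiler from moving or caching
 * memory accesses across this point. It generates no instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Simplified stand-in for the used ring header (hypothetical). */
struct used_ring {
    uint16_t flags;
    uint16_t idx;
};

/* Variant 1: plain store after a compiler barrier. */
static void publish_used(struct used_ring *used, uint16_t entries)
{
    /* ... descriptor writes would happen before this point ... */
    compiler_barrier();
    used->idx += entries;
}

/* Variant 2: same, but the update goes through a volatile lvalue, so the
 * compiler must perform this access exactly as written. */
static void publish_used_volatile(struct used_ring *used, uint16_t entries)
{
    compiler_barrier();
    *(volatile uint16_t *)&used->idx += entries;
}
```

Both functions produce the same value in idx; whether the generated code differs depends on the compiler and on what follows the store, which is why comparing the assembly output, as suggested in the thread, is the way to settle it.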
[dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free functions added + unittest
Hi, Konstantin, >Though from my point, such function should be generic as rte_pktmbuf_free_chain() - >no special limitations like all mbufs from one pool, refcnt==1, etc. I misunderstood, I'll rework. Regards, Vadim. On Tue, Mar 24, 2015 at 1:48 AM, Ananyev, Konstantin < konstantin.ananyev at intel.com> wrote: > Hi Vadim, > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vadim Suraev > > Sent: Monday, March 23, 2015 5:31 PM > > To: Olivier MATZ > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH v2] rte_mbuf: mbuf bulk alloc/free > functions added + unittest > > > > Hi, Olivier, > > No, I personally need to free a chain using mempool bulk. If I'm not > > mistaken, it has been proposed by one of reviewers to have lower level > > function, so it was done (I'm sorry if misunderstood) > > Was it me? > As I remember, I said it would be good to create rte_pktmbuf_bulk_free() > or rte_pktmbuf_seg_bulk_free() - > that would free a bulk of mbufs segments in the same manner as > rte_pktmbuf_free_chain() does: > count number of consecutive mbufs from the same mempool to be freed and > then put them back into the pool at one go. > Such function would be useful inside PMD code. > In fact we already using analogy of such function inside vPMD TX code. > Though from my point, such function should be generic as > rte_pktmbuf_free_chain() - > no special limitations like all mbufs from one pool, refcnt==1, etc. > So if it was me who confused you - I am sorry. > Konstantin > > > Regards, > > Vadim. > > On Mar 23, 2015 8:44 PM, "Olivier MATZ" wrote: > > > > > Hi Neil, > > > > > > On 03/19/2015 02:16 PM, Neil Horman wrote: > > > >> On 03/18/2015 09:58 PM, Neil Horman wrote: > > > +/** > > > + * Free a bulk of mbufs into its original mempool. 
> > > + * This function assumes: > > > + * - refcnt equals 1 > > > + * - mbufs are direct > > > + * - all mbufs must belong to the same mempool > > > + * > > > + * @param mbufs > > > + *Array of pointers to mbuf > > > + * @param count > > > + *Array size > > > + */ > > > +static inline void rte_pktmbuf_bulk_free(struct rte_mbuf **mbufs, > > > + unsigned count) > > > +{ > > > + unsigned idx; > > > + > > > + RTE_MBUF_ASSERT(count > 0); > > > + > > > + for (idx = 0; idx < count; idx++) { > > > + RTE_MBUF_ASSERT(mbufs[idx]->pool == mbufs[0]->pool); > > > + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 1); > > > + rte_mbuf_refcnt_set(mbufs[idx], 0); > > > >>> This is really a misuse of the API. The entire point of reference > > > counting is > > > >>> to know when an mbuf has no more references and can be freed. By > > > forcing all > > > >>> the reference counts to zero here, you allow the refcnt > infrastructure > > > to be > > > >>> circumvented, causing memory leaks. > > > >>> > > > >>> I think what you need to do here is enhance the underlying pktmbuf > > > interface > > > >>> such that an rte_mbuf structure has a destructor method association > > > with it > > > >>> which is called when its refcnt reaches zero. That way the > > > >>> rte_pktmbuf_bulk_free function can just decrement the refcnt on > each > > > >>> mbuf_structure, and the pool as a whole can be returned when the > > > destructor > > > >>> function discovers that all mbufs in that bulk pool are freed. > > > >> > > > >> I don't really understand what's the problem here. The API > explicitly > > > >> describes the conditions for calling this functions: the segments > are > > > >> directs, they belong to the same mempool, and their refcnt is 1. > > > >> > > > >> This function could be useful in a driver which knows that the mbuf > > > >> it allocated matches this conditions. I think an application that > > > >> only uses direct mbufs and one mempool could also use this function. 
> > > > > > > > > > > > That last condition is my issue with this patch, that the user has to > > > know what > > > > refcnts are. It makes this api useful for little more than the test > > > case that > > > > is provided with it. Its irritating enough that for singly allocated > > > mbufs the > > > > user has to know what the refcount of a buffer is before freeing, > but at > > > least > > > > they can macrotize a {rte_pktmbuf_refcnt_update; > > > if(rte_pktmbuf_refct_read) then > > > > free} operation. > > > > > > > > With this, you've placed the user in charge of not only doing that, > but > > > also of > > > > managing the relationship between pktmbufs and the pool they came > from. > > > while > > > > that makes sense for the test case, it really doesn't in any general > use > > > case in > > > > which packet processing is ever deferred or queued, because it means > > > that the > > > > application is now responsible for holding a pointer to every packet > it > > > > allocates and checking its
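The generic behaviour Konstantin describes (count consecutive segments from the same pool, then return them in one go) can be sketched as follows. This is not DPDK code: struct pool, struct seg, pool_put_bulk() and BULK_MAX are hypothetical stand-ins, and reference counting and indirect mbufs are deliberately left out.

```c
#include <stddef.h>

#define BULK_MAX 8   /* hypothetical batch size */

/* Hypothetical pool that only records what was returned to it. */
struct pool {
    unsigned freed;      /* objects returned so far */
    unsigned put_calls;  /* number of bulk returns */
};

/* Hypothetical segment: owning pool plus link to the next segment. */
struct seg {
    struct pool *pool;
    struct seg *next;
};

/* Stand-in for a bulk "put" on a mempool. */
static void pool_put_bulk(struct pool *p, struct seg **objs, unsigned n)
{
    (void)objs;
    p->freed += n;
    p->put_calls++;
}

/* Walk a chain, batching consecutive segments that share a pool. */
static void seg_chain_bulk_free(struct seg *head)
{
    struct seg *batch[BULK_MAX];
    unsigned n = 0;

    while (head != NULL) {
        struct seg *next = head->next;

        /* Flush when the pool changes or the batch is full. */
        if (n > 0 && (batch[0]->pool != head->pool || n == BULK_MAX)) {
            pool_put_bulk(batch[0]->pool, batch, n);
            n = 0;
        }
        batch[n++] = head;
        head = next;
    }
    if (n > 0)
        pool_put_bulk(batch[0]->pool, batch, n);
}
```

A chain with segments from pools A, A, B, B, B, A would result in three bulk returns, with no requirement that all segments share one pool.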
[dpdk-dev] [RFC PATCHv2 5/8] add OSv support
On Sat, Mar 21, 2015 at 12:23:02PM +0900, Takuya ASADA wrote: > Adding OSv support. > Based on Linux/FreeBSD EAL, but calling OSv kernel APIs to access devices, > allocate contiguous memory, etc. > > Signed-off-by: Takuya ASADA > --- > diff --git a/lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h > b/lib/librte_eal/osvapp/eal/include/exec-env/rte_interrupts.h > similarity index 100% > copy from lib/librte_eal/bsdapp/eal/include/exec-env/rte_interrupts.h > copy to lib/librte_eal/osvapp/eal/include/exec-env/rte_interrupts.h > diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map > b/lib/librte_eal/osvapp/eal/rte_eal_version.map > similarity index 92% > copy from lib/librte_eal/bsdapp/eal/rte_eal_version.map > copy to lib/librte_eal/osvapp/eal/rte_eal_version.map > index 67b6a6c..ebb584a 100644 > --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map > +++ b/lib/librte_eal/osvapp/eal/rte_eal_version.map > @@ -31,24 +31,26 @@ DPDK_2.0 { > rte_eal_get_physmem_layout; > rte_eal_get_physmem_size; > rte_eal_has_hugepages; > - rte_eal_hpet_init; > rte_eal_init; > rte_eal_iopl_init; > rte_eal_lcore_role; > rte_eal_mp_remote_launch; > rte_eal_mp_wait_lcore; > + rte_eal_parse_devargs_str; > + rte_eal_pci_close_one; > rte_eal_pci_dump; > rte_eal_pci_probe; > + rte_eal_pci_probe_one; > rte_eal_pci_register; > rte_eal_pci_unregister; > rte_eal_process_type; > rte_eal_remote_launch; > rte_eal_tailq_lookup; > rte_eal_tailq_register; > + rte_eal_vdev_init; > + rte_eal_vdev_uninit; > rte_eal_wait_lcore; > rte_exit; > - rte_get_hpet_cycles; > - rte_get_hpet_hz; > rte_get_tsc_hz; > rte_hexdump; > rte_intr_callback_register; > @@ -86,9 +88,6 @@ DPDK_2.0 { > rte_thread_get_affinity; > rte_thread_set_affinity; > rte_vlog; > - rte_xen_dom0_memory_attach; > - rte_xen_dom0_memory_init; > - test_mp_secondary; > I understand removing symbols that you don't support, but why are you adding the vdev, pci and parsing symbols? I don't see you using them anywhere in your code... Neil
[dpdk-dev] tools brainstorming
On 2015-03-23 17:29, Thomas Monjalon wrote:
> 2015-03-20 16:18, Simon Kågström:
>>> - make autotests easier and faster to run for smoke testing
>>> - automated basic testpmd check
>>
>> Code coverage for automated tests can be useful as well.
>>
>> In a way I'm speaking in my own interests here since I've written a tool
>> to do just this (and produce nice HTML etc output), kcov, that can be
>> found at github (https://github.com/SimonKagstrom/kcov).
>
> Feel free to do some DPDK integration for your kcov tool.
> It could definitely help to write good tests.

Well, there's not that much integration needed for kcov. It needs debugging information in the binary, so something like the patch below is needed (and the same for clang, not sure about icc).

Apart from that, I just ran it as

kcov --exclude-path=/usr /tmp/kcov-dpdk ./basicfwd --no-huge -c 1 -n 1

to collect coverage info and generate a report. The report for this particular case can be seen here:

http://www.c64-network.org/kcov-dpdk/index.html

I'm not sure how your automatic test suite works, but just running it like you do today but through kcov like above should do the trick.

// Simon

diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 88f235c..d3cd9e5 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -60,7 +60,7 @@ endif
 HOSTAS= as
 TOOLCHAIN_ASFLAGS =
-TOOLCHAIN_CFLAGS =
+TOOLCHAIN_CFLAGS = -g
 TOOLCHAIN_LDFLAGS =
 ifeq ($(CONFIG_RTE_LIBRTE_GCOV),y)
[dpdk-dev] [PATCH] cast used->idx to volatile
On 2015/3/23 20:54, Xie, Huawei wrote: > > >> -Original Message- >> From: Linhaifeng [mailto:haifeng.lin at huawei.com] >> Sent: Monday, March 23, 2015 8:24 PM >> To: dev at dpdk.org >> Cc: Ouyang, Changchun; Xie, Huawei >> Subject: Re: [dpdk-dev] [PATCH] cast used->idx to volatile >> >> >> >> On 2015/3/21 16:07, linhaifeng wrote: >>> From: Linhaifeng >>> >>> Same as rte_vhost_enqueue_burst we should cast used->idx >>> to volatile before notify guest. >>> >>> Signed-off-by: Linhaifeng >>> --- >>> lib/librte_vhost/vhost_rxtx.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c >>> index 535c7a1..8d674d1 100644 >>> --- a/lib/librte_vhost/vhost_rxtx.c >>> +++ b/lib/librte_vhost/vhost_rxtx.c >>> @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, >> uint16_t queue_id, >>> } >>> >>> rte_compiler_barrier(); >>> - vq->used->idx += entry_success; >>> + *(volatile uint16_t *)&vq->used->idx += entry_success; > > > Haifeng: > We have compiler barrier before and an external function call behind, so we > don't need volatile here. > Do you meet issue? > Tx_q is sometimes stopped when we use virtio_net. Because vhost thought there are no buffers in tx_q and virtio_net thought vhost haven't handle all packets so we have to restart VM to restore work.
The status in VM is: Mar 18 17:11:10 linux-b2ij kernel: [46337.246687] net eth7: virtnet_poll Mar 18 17:11:10 linux-b2ij kernel: [46337.246690] net eth7: receive_buf Mar 18 17:11:10 linux-b2ij kernel: [46337.246693] net eth7: vi->num=239 Mar 18 17:11:10 linux-b2ij kernel: [46337.246695] net eth7: svq:avail->idx=52939 used->idx=52939 num_free=18 num_added=0 svq->last_used_idx=52820 Mar 18 17:11:10 linux-b2ij kernel: [46337.246699] net eth7: rvq:avail->idx=36215 used->idx=35977 num_free=18 num_added=0 rvq->last_used_idx=35977 Mar 18 17:11:11 linux-b2ij kernel: [46337.901038] net eth7: dev_queue_xmit, qdisc->flags=4, qdisc->state deactiveed=0 Mar 18 17:11:12 linux-b2ij kernel: [46337.901042] net eth7: dev_queue_xmit, txq->state=1, stopped=1 Why compiler barrier not take effect in our case? Is compiler barrier depended on -O3 option? We use -O2 option. >>> /* Kick guest if required. */ >>> if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) >>> eventfd_write((int)vq->callfd, 1); >>> >> >> -- >> Regards, >> Haifeng > > > -- Regards, Haifeng
[dpdk-dev] [PATCH] ixgbe: do not include CRC in Tx byte count
On Mon, 23 Mar 2015 16:45:44 + "Ananyev, Konstantin" wrote:

> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of stephen at networkplumber.org
> > Sent: Friday, January 23, 2015 6:24 AM
> > To: dev at dpdk.org
> > Cc: Stephen Hemminger
> > Subject: [dpdk-dev] [PATCH] ixgbe: do not include CRC in Tx byte count
> >
> > From: Stephen Hemminger
> >
> > The ixgbe driver was including CRC in the transmit packet byte
> > count, but not for packets received.
> > This was noticed when forwarding and
> > the number of bytes received was greater than the number of bytes transmitted
> > for the same number of packets. Make the driver behave like other
> > virtual devices and not include CRC in byte count. Use the same queue
> > counters already computed and used for Rx.
>
> About RX side stats - as I remember it depends on what value hw_strip_crc is set to at configure().
> If hw_strip_crc==1, then, yes CRC bytes are not included into QBRC value.
> If hw_strip_crc==0, then CRC bytes are included into QBRC.

That is an additional bug!
* CRC should never be included in the byte count. This is not how Linux or other drivers report byte count.
* the byte count must be symmetrical Rx == Tx

The Brocade router always set strip_crc to 1. So did not see the additional bug of CRC being included in byte count.
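The accounting rule argued for here (byte counts never include the 4-byte Ethernet CRC, so Rx and Tx stay symmetrical) can be shown with a little arithmetic. This is only an illustration: bytes_without_crc() and CRC_LEN are hypothetical names, and the patch itself reuses the per-queue counters rather than subtracting.

```c
#include <stdint.h>

#define CRC_LEN 4u  /* Ethernet frame check sequence length in bytes */

/*
 * Hypothetical helper: normalise a hardware byte counter so that it never
 * includes the CRC. counter_includes_crc would be non-zero when the
 * hardware counted frames with the CRC still attached.
 */
static uint64_t bytes_without_crc(uint64_t hw_bytes, uint64_t hw_packets,
                                  int counter_includes_crc)
{
    if (!counter_includes_crc)
        return hw_bytes;
    return hw_bytes - hw_packets * CRC_LEN;
}
```

For example, 10 minimum-size frames counted with CRC give 640 bytes on the wire, which this helper reports as 600, the same figure a CRC-stripping receive path would report for them.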
[dpdk-dev] [PATCH] ixgbe: fix data access on big endian cpu
Hi Thomas, Done. http://patchwork.dpdk.org/dev/patchwork/patch/4123/ Thanks, Xuelin Shi > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Monday, March 23, 2015 22:02 > To: Shi Xuelin-B29237 > Cc: dev at dpdk.org; konstantin.ananyev at intel.com; helin.zhang at intel.com > Subject: Re: [PATCH] ixgbe: fix data access on big endian cpu > > 2015-03-03 16:27, xuelin.shi at freescale.com: > > From: Xuelin Shi > > > > enforce rules for cpu and ixgbe exchanging data. > > 1. cpu use data owned by ixgbe must use rte_le_to_cpu_xx(...) 2. cpu > > fill data to ixgbe must use rte_cpu_to_le_xx(...) > > > > Signed-off-by: Xuelin Shi > > Please Xuelin, could you rebase on HEAD and fix these checkpatch errors? > > ERROR:SPACING: space prohibited after that '!' (ctx:BxW) > > ERROR:CODE_INDENT: code indent should use tabs where possible > +^I^I ^I ^I IXGBE_RXDADV_STAT_DD)) {$ > > Thanks
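The rule quoted in this patch description (data owned by the NIC is little endian, so the CPU converts on every read and write) can be sketched without any DPDK headers. load_le32() and store_le32() below are hypothetical byte-wise helpers, not the rte_le_to_cpu_xx / rte_cpu_to_le_xx macros the patch refers to; they give the same result on little-endian and big-endian CPUs.

```c
#include <stdint.h>

/* Read a 32-bit little-endian field from device-owned memory. */
static uint32_t load_le32(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;

    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Write a CPU-order value as a 32-bit little-endian field. */
static void store_le32(void *p, uint32_t v)
{
    uint8_t *b = (uint8_t *)p;

    b[0] = (uint8_t)(v & 0xff);
    b[1] = (uint8_t)((v >> 8) & 0xff);
    b[2] = (uint8_t)((v >> 16) & 0xff);
    b[3] = (uint8_t)((v >> 24) & 0xff);
}
```

On a little-endian host a conversion macro typically compiles to nothing, which is why a missing conversion goes unnoticed until the code runs on a big-endian CPU.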
[dpdk-dev] [PATCH] librte_lpm: define tbl entry reversely for big endian
Hi Thomas, Done. http://patchwork.dpdk.org/dev/patchwork/patch/4122/ Thanks, Xuelin Shi > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Monday, March 23, 2015 22:04 > To: Shi Xuelin-B29237 > Cc: Bruce Richardson; dev at dpdk.org > Subject: Re: [PATCH] librte_lpm: define tbl entry reversely for big > endian > > 2015-03-09 14:02, Bruce Richardson: > > On Wed, Mar 04, 2015 at 02:34:12PM +0800, xuelin.shi at freescale.com > wrote: > > > From: Xuelin Shi > > > > > > This module uses type conversion between struct and int. > > > Also truncation and comparison is used with this int. > > > It is not safe for different endian arch. > > > > > > Add ifdef for big endian struct to fix this issue. > > > > > > Signed-off-by: Xuelin Shi > > > > Get an error compiling this up (using clang on FreeBSD). > > > > CC rte_lpm.o > > In file included from > /usr/home/bruce/dpdk.org/lib/librte_lpm/rte_lpm.c:57: > > /usr/home/bruce/dpdk.org/lib/librte_lpm/rte_lpm.h:99:5: fatal error: > > 'RTE_BYTE_ORDER' is not defined, evaluates to 0 [-Wundef] #if > RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN > > ^ > > 1 error generated. > > > > Adding "#include " should fix the issue. > > Please Xuelin, could you submit a v2? > Thanks > > > Existing unit tests on IA (little endian) pass fine there-after, but I > > think for this patch it would be good to have an ack from someone who > > can validate on a big endian system, since this is what this patch is > meant to enable. > > > > /Bruce > > >
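The idea of declaring the table entry "reversely" for big endian can be sketched as below. This is an assumption-laden illustration, not the rte_lpm definition: struct tbl_entry and entry_as_u16() are hypothetical, the byte-order test uses the GCC/Clang predefined __BYTE_ORDER__ macros instead of RTE_BYTE_ORDER, and the bitfield layout relied on is the usual GCC/Clang one (bitfield layout is implementation-defined in C).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-bit table entry that is also accessed as an integer. */
struct tbl_entry {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Big endian: fields declared in reverse order. */
    uint8_t depth : 6;
    uint8_t ext : 1;
    uint8_t valid : 1;
    uint8_t next_hop;
#else
    /* Little endian: next_hop ends up in the low byte. */
    uint8_t next_hop;
    uint8_t valid : 1;
    uint8_t ext : 1;
    uint8_t depth : 6;
#endif
};

_Static_assert(sizeof(struct tbl_entry) == 2, "entry must fit in 16 bits");

/* Reinterpret the entry as the integer the lookup code compares against. */
static uint16_t entry_as_u16(struct tbl_entry e)
{
    uint16_t v;

    memcpy(&v, &e, sizeof(v));
    return v;
}
```

With both layouts the integer view is intended to be next_hop | valid << 8 | ext << 9 | depth << 10, so truncation and comparison on the integer behave the same on either byte order. As Bruce notes, this still needs confirming on a real big-endian system.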
[dpdk-dev] [PATCH v3] lib/librte_vhost: update used->idx when allocation of mbuf fails
On 3/22/2015 8:08 PM, Ouyang, Changchun wrote:
>
>> -Original Message-
>> From: linhaifeng [mailto:haifeng.lin at huawei.com]
>> Sent: Saturday, March 21, 2015 9:47 AM
>> To: dev at dpdk.org
>> Cc: Ouyang, Changchun; Xie, Huawei
>> Subject: [PATCH v3] lib/librte_vhost: update used->idx when allocation of
>> mbuf fails
>>
>> From: Linhaifeng
>>
>> When failed to malloc buffer from mempool we just update last_used_idx
>> but not used->idx so after many times vhost thought have handle all packets
>> but virtio_net thought vhost have not handle all packets and will not update
>> avail->idx.
>>
>> Signed-off-by: Linhaifeng
> Acked-by: Changchun Ouyang
>
>
Acked-by: Huawei Xie

This patch fixes the issue. A simple solution, as in other PMDs, is to ensure that allocation of a new mbuf succeeds before processing a descriptor, and then to refill immediately after receiving the packet from the descriptor. In future, we should consider an optimized bulk allocation strategy with a threshold.
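The allocate-before-consume approach described above can be sketched with a toy ring. None of this is librte_vhost code: struct ring_state, struct budget_pool and dequeue_burst() are hypothetical, and the pool simply runs dry after a fixed number of allocations to model mempool exhaustion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ring state. */
struct ring_state {
    uint16_t avail_idx;      /* descriptors made available by the guest */
    uint16_t last_used_idx;  /* private consumer position */
    uint16_t used_idx;       /* position published to the guest */
};

/* Hypothetical allocator that can run dry, standing in for a mempool. */
struct budget_pool {
    unsigned remaining;
    char storage;            /* dummy object handed out on success */
};

static void *budget_pool_get(struct budget_pool *p)
{
    if (p->remaining == 0)
        return NULL;
    p->remaining--;
    return &p->storage;
}

/* Consume a descriptor only after its buffer has been allocated, and always
 * publish exactly what was consumed, so the two indexes cannot drift apart. */
static uint16_t dequeue_burst(struct ring_state *r, struct budget_pool *p,
                              void **pkts, uint16_t max)
{
    uint16_t n = 0;

    while (n < max && r->last_used_idx != r->avail_idx) {
        void *buf = budget_pool_get(p);

        if (buf == NULL)
            break;           /* leave the descriptor unconsumed */
        pkts[n++] = buf;
        r->last_used_idx++;
    }
    r->used_idx = r->last_used_idx;
    return n;
}
```

With five descriptors available and only three buffers left, the sketch consumes three descriptors and publishes three, leaving the remaining two for a later call instead of silently skipping them.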
[dpdk-dev] [PATCH] vhost: Fix `struct file' leakage in `eventfd_link'
On 3/23/2015 8:54 PM, Pavel Boldin wrote:
> Due to increased `struct file's reference counter subsequent call
> to `filp_close' does not free the `struct file'. Prepend `fput' call
> to decrease the reference counter.
>
> Signed-off-by: Pavel Boldin
> ---
> lib/librte_vhost/eventfd_link/eventfd_link.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c b/lib/librte_vhost/eventfd_link/eventfd_link.c
> index 7755dd6..62c45c8 100644
> --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> @@ -117,6 +117,7 @@ eventfd_link_ioctl(struct file *f, unsigned int ioctl, unsigned long arg)
>  * Release the existing eventfd in the source process
>  */
> spin_lock(&files->file_lock);
> + fput(file);
> filp_close(file, files);
> fdt = files_fdtable(files);
> fdt->fd[eventfd_copy.source_fd] = NULL;

Acked-by Huawei Xie

In future, we should remove the allocation of src eventfd, allocate a free fd from kernel, and install it with target fd.
[dpdk-dev] [PATCH] i40e: remove ALLOW_LB flag on SRIOV vsi
> -Original Message- > From: Wu, Jingjing > Sent: Friday, March 20, 2015 3:32 PM > To: dev at dpdk.org > Cc: Wu, Jingjing; Xu, HuilongX; Zhang, Helin > Subject: [PATCH] i40e: remove ALLOW_LB flag on SRIOV vsi > > Disable VEB switching by removing ALLOW_LB on SRIOV vsi. > > If the source mac address of packet sent from VF is not listed in the VEB's > mac > table, the VEB will switch the packet back to the VF. > It's a hardware issue. Enabling ALLOW_LB flag will block VF functions. > > Signed-off-by: Jingjing Wu Acked-by: Helin Zhang > --- > lib/librte_pmd_i40e/i40e_ethdev.c | 14 +- > 1 file changed, 9 insertions(+), 5 deletions(-) > > diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c > b/lib/librte_pmd_i40e/i40e_ethdev.c > index cf6685e..28ea5dc 100644 > --- a/lib/librte_pmd_i40e/i40e_ethdev.c > +++ b/lib/librte_pmd_i40e/i40e_ethdev.c > @@ -3059,11 +3059,15 @@ i40e_vsi_setup(struct i40e_pf *pf, > ctxt.connection_type = 0x1; > ctxt.flags = I40E_AQ_VSI_TYPE_VF; > > - /* Configure switch ID */ > - ctxt.info.valid_sections |= > - rte_cpu_to_le_16(I40E_AQ_VSI_PROP_SWITCH_VALID); > - ctxt.info.switch_id = > - rte_cpu_to_le_16(I40E_AQ_VSI_SW_ID_FLAG_ALLOW_LB); > + /** > + * Do not configure switch ID to enable VEB switch by > + * I40E_AQ_VSI_SW_ID_FLAG_ALLOW_LB. Because in Fortville, > + * if the source mac address of packet sent from VF is not > + * listed in the VEB's mac table, the VEB will switch the > + * packet back to the VF. Need to enable it when HW issue > + * is fixed. > + */ > + > /* Configure port/vlan */ > ctxt.info.valid_sections |= > rte_cpu_to_le_16(I40E_AQ_VSI_PROP_VLAN_VALID); > -- > 1.9.3
[dpdk-dev] [PATCH] kni/ethtool/ixgbe: enforce access between ixgbe PCI and CPU
Hi Xuelin

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of xuelin.shi at freescale.com
> Sent: Wednesday, February 11, 2015 2:50 PM
> To: thomas.monjalon at 6wind.com
> Cc: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] kni/ethtool/ixgbe: enforce access between ixgbe PCI and CPU
>
> From: Xuelin Shi
>
> make sure:
> CPU read from ixgbe with IXGBE_LE32_TO_CPUS
> CPU write to ixgbe with IXGBE_CPU_TO_LE32
>
> otherwise, there is endian issue for ixgbe on BIG_ENDIAN CPU.

I understand your concern, but the changes are in the wrong place. Source files in linux/kni/ are compiled into a kernel module, so the endian issue is already handled by kernel functions such as writel and readl. Your modifications are therefore not needed for the KNI kernel module. However, I think similar changes are needed in librte_pmd_e1000, librte_pmd_ixgbe, librte_pmd_i40e, etc.

Regards,
Helin

>
> Signed-off-by: Xuelin Shi
> ---
>  .../linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h | 24 --
>  1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h
> index d161600..0612632 100644
> --- a/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h
> +++ b/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h
> @@ -53,6 +53,16 @@
>
>  #undef ASSERT
>
> +static inline uint32_t ixgbe_read_addr(volatile void* addr) {
> +    return IXGBE_LE32_TO_CPUS(*((volatile uint32_t *)addr)); }
> +
> +static inline uint32_t ixgbe_write_addr(u32 value, volatile void* addr)
> +{
> +    return writel(IXGBE_CPU_TO_LE32(value), addr); }
> +
>  #ifdef DBG
>  #define hw_dbg(hw, S, A...) printk(KERN_DEBUG S, ## A)
>  #else
> @@ -91,19 +101,20 @@
>      default: \
>          break; \
>      } \
> -    writel((value), ((a)->hw_addr + (reg))); \
> +    ixgbe_write_addr((value), ((a)->hw_addr + (reg))); \
>  } while (0)
>  #else
> -#define IXGBE_WRITE_REG(a, reg, value) writel((value), ((a)->hw_addr + (reg)))
> +#define IXGBE_WRITE_REG(a, reg, value) \
> +    ixgbe_write_addr((value), ((a)->hw_addr + (reg)))
>  #endif
>
> -#define IXGBE_READ_REG(a, reg) readl((a)->hw_addr + (reg))
> +#define IXGBE_READ_REG(a, reg) ixgbe_read_addr((a)->hw_addr + (reg))
>
>  #define IXGBE_WRITE_REG_ARRAY(a, reg, offset, value) ( \
> -    writel((value), ((a)->hw_addr + (reg) + ((offset) << 2))))
> +    ixgbe_write_addr((value), ((a)->hw_addr + (reg) + ((offset) << 2))))
>
>  #define IXGBE_READ_REG_ARRAY(a, reg, offset) ( \
> -    readl((a)->hw_addr + (reg) + ((offset) << 2)))
> +    ixgbe_read_addr((a)->hw_addr + (reg) + ((offset) << 2)))
>
>  #ifndef writeq
>  #define writeq(val, addr)do { writel((u32) (val), addr); \
> @@ -111,7 +122,8 @@
>      } while (0);
>  #endif
>
> -#define IXGBE_WRITE_REG64(a, reg, value) writeq((value), ((a)->hw_addr + (reg)))
> +#define IXGBE_WRITE_REG64(a, reg, value) \
> +    writeq((cpu_to_le64(value)), ((a)->hw_addr + (reg)))
>
>  #define IXGBE_WRITE_FLUSH(a) IXGBE_READ_REG(a, IXGBE_STATUS)
>  struct ixgbe_hw;
> --
> 1.9.1
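The endian issue discussed in this thread can be shown with a small byte-order model. The following is a rough Python sketch rather than driver code: `swab32`, `le32_to_cpu` and `naive_load` are hypothetical helper names, the register value is made up, and the only assumption carried over from the thread is that the device presents its registers in little-endian byte order.

```python
import struct


def swab32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


def le32_to_cpu(value, cpu_is_big_endian):
    """Toy LE32-to-CPU conversion: swap only on big-endian CPUs."""
    return swab32(value) if cpu_is_big_endian else value


def naive_load(reg_bytes, cpu_is_big_endian):
    """What a plain 32-bit load returns for these bytes on each kind of CPU."""
    fmt = ">I" if cpu_is_big_endian else "<I"
    return struct.unpack(fmt, reg_bytes)[0]


REGISTER_VALUE = 0x12345678                     # made-up register content
reg_bytes = struct.pack("<I", REGISTER_VALUE)   # device stores it little-endian

naive_be = naive_load(reg_bytes, cpu_is_big_endian=True)   # byte-swapped garbage
fixed_be = le32_to_cpu(naive_be, cpu_is_big_endian=True)   # conversion repairs it
fixed_le = le32_to_cpu(naive_load(reg_bytes, False), cpu_is_big_endian=False)
```

On a little-endian CPU the conversion is a no-op, which is why a missing conversion goes unnoticed there and only shows up on big-endian hardware.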
[dpdk-dev] [PATCH] cast used->idx to volatile
On 3/24/2015 9:00 AM, Linhaifeng wrote:
>
> On 2015/3/23 20:54, Xie, Huawei wrote:
>>
>>> -----Original Message-----
>>> From: Linhaifeng [mailto:haifeng.lin at huawei.com]
>>> Sent: Monday, March 23, 2015 8:24 PM
>>> To: dev at dpdk.org
>>> Cc: Ouyang, Changchun; Xie, Huawei
>>> Subject: Re: [dpdk-dev] [PATCH] cast used->idx to volatile
>>>
>>>
>>>
>>> On 2015/3/21 16:07, linhaifeng wrote:
>>>> From: Linhaifeng
>>>>
>>>> Same as rte_vhost_enqueue_burst we should cast used->idx
>>>> to volatile before notify guest.
>>>>
>>>> Signed-off-by: Linhaifeng
>>>> ---
>>>>  lib/librte_vhost/vhost_rxtx.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
>>>> index 535c7a1..8d674d1 100644
>>>> --- a/lib/librte_vhost/vhost_rxtx.c
>>>> +++ b/lib/librte_vhost/vhost_rxtx.c
>>>> @@ -722,7 +722,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>>>>      }
>>>>
>>>>      rte_compiler_barrier();
>>>> -    vq->used->idx += entry_success;
>>>> +    *(volatile uint16_t *)&vq->used->idx += entry_success;
>>
>> Haifeng:
>> We have compiler barrier before and an external function call behind, so we
>> don't need volatile here.
>> Do you meet issue?
>>
> Tx_q is sometimes stopped when we use virtio_net. Because vhost thought there
> are no buffers in tx_q and virtio_net thought vhost hadn't handled all packets,
> we have to restart the VM to restore work.
>
> The status in VM is:
> Mar 18 17:11:10 linux-b2ij kernel: [46337.246687] net eth7: virtnet_poll
> Mar 18 17:11:10 linux-b2ij kernel: [46337.246690] net eth7: receive_buf
> Mar 18 17:11:10 linux-b2ij kernel: [46337.246693] net eth7: vi->num=239
> Mar 18 17:11:10 linux-b2ij kernel: [46337.246695] net eth7: svq:avail->idx=52939 used->idx=52939 num_free=18 num_added=0 svq->last_used_idx=52820
> Mar 18 17:11:10 linux-b2ij kernel: [46337.246699] net eth7: rvq:avail->idx=36215 used->idx=35977 num_free=18 num_added=0 rvq->last_used_idx=35977
> Mar 18 17:11:11 linux-b2ij kernel: [46337.901038] net eth7: dev_queue_xmit, qdisc->flags=4, qdisc->state deactiveed=0
> Mar 18 17:11:12 linux-b2ij kernel: [46337.901042] net eth7: dev_queue_xmit, txq->state=1, stopped=1
>
> Why compiler barrier not take effect in our case? Is compiler barrier
> depended on -O3 option? We use -O2 option.

A compiler barrier always works, regardless of the optimization option.
I don't follow your story, but the key question is: did you check the asm code?
If the function is called from outside as an API, how could the update be optimized away? There is only one update to used->idx in that function.

>
>>>>      /* Kick guest if required. */
>>>>      if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
>>>>          eventfd_write((int)vq->callfd, 1);
>>> --
>>> Regards,
>>> Haifeng
>>
>>
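The ring indices in the guest log quoted above are free-running 16-bit counters, so differences between them have to be taken modulo 2^16. Below is a minimal sketch of that arithmetic using the values from the log; the helper name `ring_delta` is hypothetical, and reading each difference as "entries pending" is an interpretation, not something the thread states.

```python
def ring_delta(newer, older):
    """Distance between two free-running 16-bit ring indices, allowing wraparound."""
    return (newer - older) & 0xFFFF


# Values copied from the guest log (svq = send queue, rvq = receive queue).
svq_used_not_reclaimed = ring_delta(52939, 52820)  # used->idx - last_used_idx
svq_avail_not_used = ring_delta(52939, 52939)      # avail->idx - used->idx
rvq_avail_not_used = ring_delta(36215, 35977)      # avail->idx - used->idx

# Made-up wraparound case: the newer index has already wrapped past 65535.
wrapped = ring_delta(3, 65533)
```

Read this way, the send queue shows no buffers outstanding on the host side but 119 used entries that the guest has not reclaimed, which would be consistent with a guest still waiting for a notification.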
[dpdk-dev] [PATCH] scripts: enable extended tag of PCIe
Hi Thomas,

Zhida was our intern and has already gone back to university. I think Yong might have reviewed it.
It is a good supplement for setting the extended tag on Linux, though not necessary. I am OK with it being merged or not. Thanks!

Marvin, could you help to ack it, as I know you have reviewed it?

Regards,
Helin

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, March 23, 2015 7:53 PM
> To: Zang, Zhida
> Cc: dev at dpdk.org; Butler, Siobhan A; Zhang, Helin
> Subject: Re: [dpdk-dev] [PATCH] scripts: enable extended tag of PCIe
>
> Hi,
>
> This patch needs review and documentation.
> It's going to be dropped if nobody cares.
>
> There were some previous discussions about it:
> http://dpdk.org/ml/archives/dev/2015-February/012708.html
>
>
> 2015-01-30 12:57, zhida zang:
> > As 'extended tag' of PCIe needs to be enabled for i40e high
> > performance, Linux command of 'setpci' can be used to check and set
> > the corresponding bit of 'extended tag' of PCIe configuration space.
> > The script is to check and set the right bit in PCIe configuration space to enable 'extended tag'.
> >
> > Signed-off-by: Zhida Zang
> > ---
> >  tools/set_pci.py | 124 +++
> >  1 file changed, 124 insertions(+)
> >  create mode 100755 tools/set_pci.py
> >
> > diff --git a/tools/set_pci.py b/tools/set_pci.py
> > new file mode 100755
> > index 000..e242efb
> > --- /dev/null
> > +++ b/tools/set_pci.py
> > @@ -0,0 +1,124 @@
> > +#! /usr/bin/python
> > +import sys
> > +import os
> > +import subprocess
> > +import getopt
> > +from os.path import basename
> > +
> > +# The register to check if extended tag is supported or not.
> > +PCI_DEV_CAP_REG = 0xA4
> > +# The control register which contains the bit to enable/disable 'extended tag'.
> > +PCI_DEV_CTRL_REG = 0xA8
> > +# The mask of 'extended tag' in capability register.
> > +PCI_DEV_CAP_EXT_TAG_MASK = 0x20
> > +# The mask of 'extended tag' in control register.
> > +PCI_DEV_CTRL_EXT_TAG_MASK = 0x100
> > +
> > +dev_ids = {}
> > +flag = "Set"
> > +
> > +
> > +def usage():
> > +    '''Print usage information for the program'''
> > +    argv0 = basename(sys.argv[0])
> > +    print """
> > +Usage:
> > +--
> > +
> > +%(argv0)s [options] DEVICE1 DEVICE2
> > +
> > +where DEVICE1, DEVICE2 etc, are specified via PCI
> > +"domain:bus:slot.func" syntax or "bus:slot.func" syntax. For devices
> > +bound to Linux kernel drivers, they may also be referred to by Linux interface name e.g. eth0, eth1, em0, em1, etc.
> > +
> > +Options:
> > +--help, --usage:
> > +Display usage information and quit
> > +
> > +-s --set:
> > +Set the following pci device
> > +
> > +-u --Unset:
> > +Unset the following pci device
> > +
> > +Examples:
> > +-
> > +To set pci 0a:00.0
> > +%(argv0)s -s 0a:00.0
> > +%(argv0)s --set 0a:00.0
> > +
> > +To unset :01:00.0
> > +%(argv0)s -u :01:00.0
> > +%(argv0)s --unset :01:00.0
> > +
> > +To set :02:00.0 and :02:00.1
> > +%(argv0)s -s 02:00.0 02:00.1
> > +
> > +""" % locals() # replace items from local variables
> > +
> > +
> > +def parse_args():
> > +    global flag
> > +    global dev_ids
> > +    if len(sys.argv) <= 1:
> > +        usage()
> > +        sys.exit(0)
> > +    try:
> > +        opts, dev_ids = getopt.getopt(
> > +            sys.argv[1:],
> > +            "su",
> > +            ["help", "usage", "set", "unset"]
> > +        )
> > +    except getopt.GetoptError, error:
> > +        print str(error)
> > +        print "Run '%s --usage' for further information" % sys.argv[0]
> > +        sys.exit(1)
> > +
> > +    for opt, arg in opts:
> > +        if opt == "--help" or opt == "--usage":
> > +            usage()
> > +            sys.exit(0)
> > +        if opt == "-s" or opt == "--set":
> > +            flag = "Set"
> > +        if opt == "-u" or opt == "--unset":
> > +            flag = "Unset"
> > +
> > +
> > +def check_output(args, stderr=None):
> > +    '''Run a command and capture its output'''
> > +    return subprocess.Popen(
> > +        args,
> > +        stdout=subprocess.PIPE,
> > +        stderr=stderr
> > +    ).communicate()[0]
> > +
> > +
> > +def set_pci():
> > +    if len(dev_ids) == 0:
> > +        print "Error: No devices specified."
> > +        print "Run '%s --usage' for further information" % sys.argv[0]
> > +        sys.exit(1)
> > +    param_cap = "%x.W" % PCI_DEV_CAP_REG
> > +    for k in range(len(dev_ids)):
> > +        val = check_output(["setpci", "-s", dev_ids[k], param_cap])
> > +        if (not (int(val, 16) & PCI_DEV_CAP_EXT_TAG_MASK)):
> > +            print dev_ids[k], "Not supported"
> > +            continue
> > +        if (int(val, 16) & PCI_DEV_CTRL_EXT_TAG_MASK):
> > +            continue
> > +        param_ctrl = "%x.W" % PCI_DEV_CTRL_REG
> > +        val =
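The bit tests at the heart of the quoted script can be separated from the `setpci` calls and checked on their own. The sketch below is written for Python 3 and is only an illustration: the two masks are taken from the patch, while the helper names and the sample register words are hypothetical and do not come from real hardware.

```python
PCI_DEV_CAP_EXT_TAG_MASK = 0x20    # 'extended tag' bit in the capability register (from the patch)
PCI_DEV_CTRL_EXT_TAG_MASK = 0x100  # 'extended tag' bit in the control register (from the patch)


def parse_setpci_word(text):
    """Convert a hexadecimal word such as '2810\\n' into an integer."""
    return int(text.strip(), 16)


def ext_tag_supported(cap_value):
    """True if the capability word advertises extended tag support."""
    return bool(cap_value & PCI_DEV_CAP_EXT_TAG_MASK)


def ext_tag_enabled(ctrl_value):
    """True if the control word has extended tag switched on."""
    return bool(ctrl_value & PCI_DEV_CTRL_EXT_TAG_MASK)


def with_ext_tag(ctrl_value, enable):
    """Return the 16-bit control word with the extended tag bit set or cleared."""
    if enable:
        return ctrl_value | PCI_DEV_CTRL_EXT_TAG_MASK
    return ctrl_value & ~PCI_DEV_CTRL_EXT_TAG_MASK & 0xFFFF


# Made-up sample words for illustration only.
cap = parse_setpci_word("8fe4\n")
ctrl = parse_setpci_word("2810\n")
new_ctrl = with_ext_tag(ctrl, enable=True)
```

Keeping the capability test and the control test as separate functions makes it explicit which register each mask applies to.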
[dpdk-dev] [PATCH] eal: remove unnecessary #ifdef CONFIG_BQL
Hi Alex

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Gartrell
> Sent: Saturday, March 14, 2015 3:22 AM
> To: dev at dpdk.org
> Cc: kernel-team at fb.com
> Subject: [dpdk-dev] [PATCH] eal: remove unnecessary #ifdef CONFIG_BQL
>
> I couldn't figure out why this #ifdef existed so I looked around upstream and it's
> not there. It seems to build just fine without it.

See the following code section in igb/kcompat.h:

#ifndef CONFIG_BQL
#define netdev_tx_completed_queue(_q, _p, _b) do {} while (0)
#define netdev_completed_queue(_n, _p, _b) do {} while (0)
#define netdev_tx_sent_queue(_q, _b) do {} while (0)
#define netdev_sent_queue(_n, _b) do {} while (0)
#define netdev_tx_reset_queue(_q) do {} while (0)
#define netdev_reset_queue(_n) do {} while (0)
#endif

That means all callers of txring_txq() are conditionally compiled. I don't think your modifications are needed.

Regards,
Helin

>
> Signed-off-by: Alex Gartrell
> ---
>  lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h b/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h
> index a582f52..d58e1f3 100644
> --- a/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h
> +++ b/lib/librte_eal/linuxapp/kni/ethtool/igb/igb.h
> @@ -472,12 +472,10 @@ static inline u16 igb_desc_unused(const struct igb_ring *ring)
>      return ((ntc > ntu) ? 0 : ring->count) + ntc - ntu - 1; }
>
> -#ifdef CONFIG_BQL
>  static inline struct netdev_queue *txring_txq(const struct igb_ring *tx_ring)
>  {
>      return netdev_get_tx_queue(tx_ring->netdev, tx_ring->queue_index); }
> -#endif /* CONFIG_BQL */
>
>  // #ifdef EXT_THERMAL_SENSOR_SUPPORT
>  // #ifdef IGB_PROCFS
> --
> Alex Gartrell
[dpdk-dev] Need info on --vdev option
Hi Olivier,

Thanks a lot. I changed the ordering and it worked.
But the pcap file was empty even when I specified the rx interface.

I have a basic question. When the ethernet interfaces are bound to the igb_uio driver, they become user-space interfaces and are no longer shown in the "ifconfig" command output.
By what name can we identify those interfaces after they are bound to the igb_uio driver and become DPDK interfaces?

Regards
Shankari.V

On Mon, Mar 23, 2015 at 9:31 PM, Olivier MATZ wrote:
> Hi Shankari,
>
> On 03/23/2015 04:54 PM, Shankari Vaidyalingam wrote:
> > Hi
> >
> > I'm trying to do a packet capture on the DPDK interface while running l2fwd
> > sample application and injecting packets from a traffic generator.
> > I'm getting the below error when I give this command:
> > sudo ./build/l2fwd -c 0x03 -n 2 -- -p 0x03 --vdev 'eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap'
> >
> > [...]
> > EAL: :00:11.0 not managed by UIO driver, skipping
> > ./build/l2fwd: unrecognized option '--vdev'
> > ./build/l2fwd [EAL options] -- -p PORTMASK [-q NQ]
> >   -p PORTMASK: hexadecimal bitmask of ports to configure
> >   -q NQ: number of queue (=ports) per lcore (default is 1)
> >   -T PERIOD: statistics will be refreshed each PERIOD seconds (0 to disable, 10 default, 86400 maximum)
> > EAL: Error - exiting with code: 1
> >   Cause: Invalid L2FWD arguments
>
> Please note the position of "--" in your command line. This is
> used to separate the arguments of eal and application. As --vdev
> is an eal argument, it should go before the "--".
>
> Regards,
> Olivier
>
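Olivier's point about the position of "--" can be pictured as a simple split of the argument list. The following is a rough Python model, not EAL's actual parser: `split_eal_args` is a hypothetical helper, and the two argument lists are the failing command from the thread and the same command with `--vdev` moved in front of the separator.

```python
def split_eal_args(argv):
    """Split an argument list (without the program name) at the first '--'.

    Returns (eal_args, app_args); everything after the separator is left
    for the application to parse.
    """
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return list(argv), []


VDEV = "eth_pcap0,tx_pcap=/home/controller/pkt_capt/try.pcap"

# Failing order from the thread: --vdev ends up among the application arguments.
failing = ["-c", "0x03", "-n", "2", "--", "-p", "0x03", "--vdev", VDEV]
# Same options with --vdev placed before the separator.
working = ["-c", "0x03", "-n", "2", "--vdev", VDEV, "--", "-p", "0x03"]

failing_eal, failing_app = split_eal_args(failing)
working_eal, working_app = split_eal_args(working)
```

In the failing order `--vdev` lands in the application's half, which matches the "unrecognized option '--vdev'" message printed by l2fwd.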