[dpdk-dev] [PATCH] mbuf: rte_pktmbuf_dump: don't add 0x when using %p in format strings
I.e., avoid dump messages with double 0x0x, e.g.,

  dump mbuf at 0x0x7fac7b17c800, phys=17b17c880, buf_len=2176
   pkt_len=2064, ol_flags=0, nb_segs=1, in_port=255
   segment at 0x0x7fac7b17c800, data=0x0x7fac7b17c8f0, data_len=2064

Signed-off-by: Simon Kagstrom
---
 lib/librte_mbuf/rte_mbuf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index dc0467c..523e219 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -218,7 +218,7 @@ rte_pktmbuf_dump(FILE *f, const struct rte_mbuf *m, unsigned dump_len)

 	__rte_mbuf_sanity_check(m, 1);

-	fprintf(f, "dump mbuf at 0x%p, phys=%"PRIx64", buf_len=%u\n",
+	fprintf(f, "dump mbuf at %p, phys=%"PRIx64", buf_len=%u\n",
 	       m, (uint64_t)m->buf_physaddr, (unsigned)m->buf_len);
 	fprintf(f, " pkt_len=%"PRIu32", ol_flags=%"PRIx64", nb_segs=%u, "
 	       "in_port=%u\n", m->pkt_len, m->ol_flags,
@@ -228,7 +228,7 @@ rte_pktmbuf_dump(FILE *f, const struct rte_mbuf *m, unsigned dump_len)
 	while (m && nb_segs != 0) {
 		__rte_mbuf_sanity_check(m, 0);

-		fprintf(f, " segment at 0x%p, data=0x%p, data_len=%u\n",
+		fprintf(f, " segment at %p, data=%p, data_len=%u\n",
 		       m, rte_pktmbuf_mtod(m, void *), (unsigned)m->data_len);
 		len = dump_len;
 		if (len > m->data_len)
--
1.9.1
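For readers who want to reproduce the doubled prefix outside DPDK, here is a minimal sketch in plain C. The helper name `has_double_prefix` is made up for this illustration and is not part of DPDK. The text produced by `%p` is implementation-defined; the sketch assumes a libc such as glibc, which already prints a leading `0x` for non-NULL pointers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a DPDK function.
 *
 * Formats the same pointer twice: once with a bare "%p" and once with
 * the "0x%p" pattern that the patch removes. Returns 1 if the second
 * form starts with a doubled "0x0x" prefix, 0 otherwise.
 */
static int
has_double_prefix(void *ptr, char *plain, char *prefixed, size_t len)
{
	assert(plain != NULL && prefixed != NULL && len > 0);

	snprintf(plain, len, "%p", ptr);      /* e.g. 0x7fac7b17c800 on glibc */
	snprintf(prefixed, len, "0x%p", ptr); /* e.g. 0x0x7fac7b17c800 on glibc */

	return strncmp(prefixed, "0x0x", 4) == 0;
}
```

Calling it with the address of any local variable and two 64-byte buffers should report a doubled prefix on libcs that add `0x` themselves, and no doubled prefix on those that do not.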
[dpdk-dev] [PATCH / RFC] sched: Correct subport calculation
Signed-off-by: Simon Kagstrom
---
I'm a total newbie to the rte_sched design and implementation, so I've
added the RFC. We get crashes (at other places in the scheduler) without
this code.

 lib/librte_sched/rte_sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 1609ea8..b46ecfb 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1869,7 +1869,7 @@ grinder_next_pipe(struct rte_sched_port *port, uint32_t pos)

 	/* Install new pipe in the grinder */
 	grinder->pindex = pipe_qindex >> 4;
-	grinder->subport = port->subport + (grinder->pindex / port->n_pipes_per_subport);
+	grinder->subport = port->subport + (grinder->pindex / port->n_subports_per_port);
 	grinder->pipe = port->pipe + grinder->pindex;
 	grinder->pipe_params = NULL; /* to be set after the pipe structure is prefetched */
 	grinder->productive = 0;
--
1.9.1
[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues
This allows releasing RX/TX queue memory.
---
We're using DPDK 16.04 and have a test suite which performs a sequence of
separate tests of the type

    allocate mempool
    rte_eth_dev_configure(port, n_rxq, n_txq, ...)
    setup rx/tx queues
    rte_eth_dev_start(port)
    stop rx/tx queues
    rte_eth_dev_stop(port)
 -> rte_eth_dev_configure(port, 0, 0, ...)
    check that there are no leaks from the mempool

The crucial point is the marked line above. This is done so that the
rx_queue_release/tx_queue_release callbacks in the PMD are called, so
that mbufs allocated by the driver are released.

Without this patch, this explicitly isn't allowed. Is there a particular
reason why it shouldn't be? It was introduced in

  d505ba80a165a9735f3d9d3c6ab68a7bd85f271b "ethdev: support unidirectional configuration"

 lib/librte_ether/rte_ethdev.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a31018e..5481d45 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 	 */
 	(*dev->dev_ops->dev_infos_get)(dev, &dev_info);

-	if (nb_rx_q == 0 && nb_tx_q == 0) {
-		RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx queue cannot be 0\n", port_id);
-		return -EINVAL;
-	}
-
 	if (nb_rx_q > dev_info.max_rx_queues) {
 		RTE_PMD_DEBUG_TRACE("ethdev port_id=%d nb_rx_queues=%d > %d\n",
 				port_id, nb_rx_q, dev_info.max_rx_queues);
--
1.9.1
[dpdk-dev] [PATCH v2] mk: pass EXTRA_CFLAGS to AUTO_CPUFLAGS to enable local modifications
We have encountered a CPU where the AES-NI instruction set is disabled
due to export restrictions. Since the build machine and target machine
are different, using -native configs doesn't work, and on this CPU, the
application refuses to run because the AES CPU flag is missing.

The patch passes EXTRA_CFLAGS to the figure-out-cpu-flags helper, which
allows us to add -mno-aes to the compile flags and resolve this problem.

Signed-off-by: Simon Kagstrom
---
ChangeLog:
v2:
 * Put EXTRA_CFLAGS after MACHINE_CFLAGS to enable overriding values

 mk/rte.cpuflags.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index bec7bdd..4b30c6b 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -33,7 +33,7 @@
 # used to set the RTE_CPUFLAG_* environment variables giving details
 # of what instruction sets the target cpu supports.

-AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) -dM -E - < /dev/null)
+AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) $(EXTRA_CFLAGS) -dM -E - < /dev/null)

 # adding flags to CPUFLAGS

--
1.9.1
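As a small illustration of what the `-dM -E` probe is looking at: GCC and Clang predefine the macro `__AES__` when AES-NI code generation is enabled (for example through `-maes` or a `-march` value that implies it). The sketch below reports whether that macro was set for the current compilation. The function name `describe_aes_flag` is invented for this example, and the remark about flag order assumes the usual GCC behaviour that the last of `-maes` / `-mno-aes` on the command line wins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of DPDK.
 *
 * Writes a short description into buf telling whether the compiler
 * predefined __AES__ for this translation unit. Returns 1 when the
 * macro was defined, 0 otherwise.
 */
static int
describe_aes_flag(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
#ifdef __AES__
	snprintf(buf, len, "__AES__ defined");
	return 1;
#else
	snprintf(buf, len, "__AES__ not defined");
	return 0;
#endif
}
```

Building this once with `-maes` and once with `-maes -mno-aes` should, under the assumption above, flip the result from 1 to 0, which is the effect the v2 ordering (EXTRA_CFLAGS after MACHINE_CFLAGS) relies on.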
[dpdk-dev] [PATCH] mk: pass EXTRA_CFLAGS to AUTO_CPUFLAGS to enable local modifications
We have encountered a CPU where the AES-NI instruction set is disabled
due to export restrictions. Since the build machine and target machine
are different, using -native configs doesn't work, and on this CPU, the
application refuses to run because the AES CPU flag is missing.

The patch passes EXTRA_CFLAGS to the figure-out-cpu-flags helper, which
allows us to add -mno-aes to the compile flags and resolve this problem.

Signed-off-by: Simon Kagstrom
---
 mk/rte.cpuflags.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index f595cd0..17da810 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -33,7 +33,7 @@
 # used to set the RTE_CPUFLAG_* environment variables giving details
 # of what instruction sets the target cpu supports.

-AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) -dM -E - < /dev/null)
+AUTO_CPUFLAGS := $(shell $(CC) $(EXTRA_CFLAGS) $(MACHINE_CFLAGS) -dM -E - < /dev/null)

 # adding flags to CPUFLAGS

--
1.9.1
[dpdk-dev] [PATCH v2] rte_sched: release enqueued mbufs on rte_sched_port_free()
Otherwise mbufs will leak when the port is destroyed. The
rte_sched_port_qbase() and rte_sched_port_qsize() functions are used in
free now, so move them up.

Signed-off-by: Simon Kagstrom
---
ChangeLog:
v2:
 * Break long line in rte_sched_port_qbase()
 * Provide some air after variable in rte_sched_port_free()

- I did not provide an API to free the buffers without freeing the port
  since I'm unsure how to manually flush the queue (without breaking the
  rest of the functionality!)

Sorry about the delay, I missed Stephen's review!

 lib/librte_sched/rte_sched.c | 46
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 9c9419d..c66415d 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -312,6 +312,24 @@ rte_sched_port_queues_per_port(struct rte_sched_port *port)
 	return RTE_SCHED_QUEUES_PER_PIPE * port->n_pipes_per_subport * port->n_subports_per_port;
 }

+static inline struct rte_mbuf **
+rte_sched_port_qbase(struct rte_sched_port *port, uint32_t qindex)
+{
+	uint32_t pindex = qindex >> 4;
+	uint32_t qpos = qindex & 0xF;
+
+	return (port->queue_array + pindex *
+		port->qsize_sum + port->qsize_add[qpos]);
+}
+
+static inline uint16_t
+rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
+{
+	uint32_t tc = (qindex >> 2) & 0x3;
+
+	return port->qsize[tc];
+}
+
 static int
 rte_sched_port_check_params(struct rte_sched_port_params *params)
 {
@@ -717,11 +735,22 @@ rte_sched_port_config(struct rte_sched_port_params *params)
 void
 rte_sched_port_free(struct rte_sched_port *port)
 {
+	unsigned int queue;
+
 	/* Check user parameters */
 	if (port == NULL){
 		return;
 	}

+	/* Free enqueued mbufs */
+	for (queue = 0; queue < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; queue++) {
+		struct rte_mbuf **mbufs = rte_sched_port_qbase(port, queue);
+		unsigned int i;
+
+		for (i = 0; i < rte_sched_port_qsize(port, queue); i++)
+			rte_pktmbuf_free(mbufs[i]);
+	}
+
 	rte_bitmap_free(port->bmp);
 	rte_free(port);
 }
@@ -1032,23 +1061,6 @@ rte_sched_port_qindex(struct rte_sched_port *port, uint32_t subport, uint32_t pi
 	return result;
 }

-static inline struct rte_mbuf **
-rte_sched_port_qbase(struct rte_sched_port *port, uint32_t qindex)
-{
-	uint32_t pindex = qindex >> 4;
-	uint32_t qpos = qindex & 0xF;
-
-	return (port->queue_array + pindex * port->qsize_sum + port->qsize_add[qpos]);
-}
-
-static inline uint16_t
-rte_sched_port_qsize(struct rte_sched_port *port, uint32_t qindex)
-{
-	uint32_t tc = (qindex >> 2) & 0x3;
-
-	return port->qsize[tc];
-}
-
 #if RTE_SCHED_DEBUG

 static inline int
--
1.9.1
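The two moved helpers both start by pulling fields out of a queue index with shifts and masks. The sketch below isolates just that bit arithmetic so it can be checked by hand. The struct `qindex_parts` and the function `split_qindex` are hypothetical names for this illustration. Only the three expressions are taken from the patch, and the "16 queues per pipe, 4 traffic classes" reading is inferred from the masks rather than stated in the mail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the fields derived from a queue index. */
struct qindex_parts {
	uint32_t pindex; /* pipe index: qindex >> 4 */
	uint32_t qpos;   /* queue position inside the pipe: qindex & 0xF */
	uint32_t tc;     /* traffic class: (qindex >> 2) & 0x3 */
};

/*
 * Split a queue index the same way rte_sched_port_qbase() and
 * rte_sched_port_qsize() do in the patch above.
 */
static struct qindex_parts
split_qindex(uint32_t qindex)
{
	struct qindex_parts p;

	p.pindex = qindex >> 4;
	p.qpos = qindex & 0xF;
	p.tc = (qindex >> 2) & 0x3;

	/* The masks imply at most 16 queue slots and 4 traffic classes. */
	assert(p.qpos < 16 && p.tc < 4);
	return p;
}
```

For instance, queue index 43 (binary 10 1011) splits into pipe 2, queue position 11 and traffic class 2.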
[dpdk-dev] [PATCH] rte_ether: clarify rte_eth_set_queue_rate_limit tx_rate parameter
The tx_rate unit is Mbps. Gleaned from the ixgbe implementation, the
82599 datasheet and the use in test-pmd.

Signed-off-by: Simon Kagstrom
---
 lib/librte_ether/rte_ethdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 8a8c82b..ff9aab7 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3021,7 +3021,7 @@ int rte_eth_mirror_rule_reset(uint8_t port_id,
  * @param queue_idx
  *   The queue id.
  * @param tx_rate
- *   The tx rate allocated from the total link speed for this queue.
+ *   The tx rate in Mbps. Allocated from the total port link speed.
  * @return
  *   - (0) if successful.
  *   - (-ENOTSUP) if hardware doesn't support this feature.
--
1.9.1
[dpdk-dev] [PATCH v4] mbuf/ip_frag: move mbuf chaining to common code
Chaining/segmenting mbufs can be useful in many places, so make it
global.

Signed-off-by: Simon Kagstrom
Signed-off-by: Johan Faltstrom
---
ChangeLog:
v4:
 * Line up doxygen comments better (Olivier Matz)
 * Correct patch name (Thomas Monjalon)
v3:
 * Describe performance implications of linear search
 * Correct check-for-out-of-bounds (Konstantin Ananyev)
v2:
 * Check for nb_segs byte overflow (Olivier MATZ)
 * Don't reset nb_segs in tail (Olivier MATZ)

 lib/librte_ip_frag/ip_frag_common.h      | 23 ---
 lib/librte_ip_frag/rte_ipv4_reassembly.c | 7 --
 lib/librte_ip_frag/rte_ipv6_reassembly.c | 7 --
 lib/librte_mbuf/rte_mbuf.h               | 38
 4 files changed, 48 insertions(+), 27 deletions(-)

diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
index 6b2acee..cde6ed4 100644
--- a/lib/librte_ip_frag/ip_frag_common.h
+++ b/lib/librte_ip_frag/ip_frag_common.h
@@ -166,27 +166,4 @@ ip_frag_reset(struct ip_frag_pkt *fp, uint64_t tms)
 	fp->frags[IP_FIRST_FRAG_IDX] = zero_frag;
 }

-/* chain two mbufs */
-static inline void
-ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
-{
-	struct rte_mbuf *ms;
-
-	/* adjust start of the last fragment data. */
-	rte_pktmbuf_adj(mp, (uint16_t)(mp->l2_len + mp->l3_len));
-
-	/* chain two fragments. */
-	ms = rte_pktmbuf_lastseg(mn);
-	ms->next = mp;
-
-	/* accumulate number of segments and total length. */
-	mn->nb_segs = (uint8_t)(mn->nb_segs + mp->nb_segs);
-	mn->pkt_len += mp->pkt_len;
-
-	/* reset pkt_len and nb_segs for chained fragment. */
-	mp->pkt_len = mp->data_len;
-	mp->nb_segs = 1;
-}
-
-
 #endif /* _IP_FRAG_COMMON_H_ */
diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
index 5d24843..26d07f9 100644
--- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
+++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
@@ -63,7 +63,9 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
 			/* previous fragment found. */
 			if(fp->frags[i].ofs + fp->frags[i].len == ofs) {

-				ip_frag_chain(fp->frags[i].mb, m);
+				/* adjust start of the last fragment data. */
+				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+				rte_pktmbuf_chain(fp->frags[i].mb, m);

 				/* update our last fragment and offset. */
 				m = fp->frags[i].mb;
@@ -78,7 +80,8 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
 	}

 	/* chain with the first fragment. */
-	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
 	m = fp->frags[IP_FIRST_FRAG_IDX].mb;

 	/* update mbuf fields for reassembled packet. */
diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
index 1f1c172..5969b4a 100644
--- a/lib/librte_ip_frag/rte_ipv6_reassembly.c
+++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
@@ -86,7 +86,9 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
 			/* previous fragment found. */
 			if (fp->frags[i].ofs + fp->frags[i].len == ofs) {

-				ip_frag_chain(fp->frags[i].mb, m);
+				/* adjust start of the last fragment data. */
+				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+				rte_pktmbuf_chain(fp->frags[i].mb, m);

 				/* update our last fragment and offset. */
 				m = fp->frags[i].mb;
@@ -101,7 +103,8 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
 	}

 	/* chain with the first fragment. */
-	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
 	m = fp->frags[IP_FIRST_FRAG_IDX].mb;

 	/* update mbuf fields for reassembled packet. */
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index d7c9030..4a93189 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1775,6 +1775,44 @@ static inline int rte_pktmbuf_is_contiguous(const struct rte_mbuf *m)
 }

 /**
+ * Chain an mbuf to another, thereby creating a segmented packet.
+ *
+ * Note: The implementation will do a linear walk over the segments to find
+ * the tail entry. For cases when there are many segments, it's better to
+ * chain the entries manually.
+ *
+ * @param head
+ *   The head of the mbu
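Since the rte_mbuf.h hunk is cut off in this archive, the following is only a toy sketch of the chaining idea described in the ChangeLog, written against a stand-in struct rather than the real `struct rte_mbuf`. The names `toy_mbuf` and `toy_chain` are invented here, the exact overflow bound is an assumption (the real check is not visible in the truncated text), and the linear walk to the tail is what the doxygen note warns about for long chains.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct rte_mbuf; only the fields the sketch needs. */
struct toy_mbuf {
	struct toy_mbuf *next; /* next segment, NULL at the end */
	uint32_t pkt_len;      /* total packet length, valid in the head */
	uint16_t data_len;     /* bytes held in this segment */
	uint8_t nb_segs;       /* number of segments, valid in the head */
};

/*
 * Append the chain starting at 'tail' to the chain starting at 'head'.
 * Returns 0 on success or -EOVERFLOW when the combined segment count
 * would not fit in the one-byte nb_segs field (assumed bound).
 */
static int
toy_chain(struct toy_mbuf *head, struct toy_mbuf *tail)
{
	struct toy_mbuf *cur_tail;

	assert(head != NULL && tail != NULL);

	/* Refuse before touching anything if nb_segs would wrap. */
	if ((unsigned int)head->nb_segs + tail->nb_segs > UINT8_MAX)
		return -EOVERFLOW;

	/* Linear walk to the last segment of 'head'. */
	cur_tail = head;
	while (cur_tail->next != NULL)
		cur_tail = cur_tail->next;
	cur_tail->next = tail;

	/* Accumulate segment count and total length in the head. */
	head->nb_segs = (uint8_t)(head->nb_segs + tail->nb_segs);
	head->pkt_len += tail->pkt_len;

	/* pkt_len is only meaningful in the head; nb_segs in the tail
	 * is left alone, following the v2 ChangeLog entry. */
	tail->pkt_len = tail->data_len;
	return 0;
}
```

A caller that builds long chains can avoid the repeated walk by remembering the last segment itself, which is the "chain the entries manually" advice in the comment above.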
[dpdk-dev] [PATCH v2] kni: Use utsrelease.h to determine Ubuntu kernel version
Ping?

// Simon

On Thu, 20 Aug 2015 08:51:06 +0200
Simon Kagstrom wrote:

> /proc/version_signature is the version for the host machine, but in
> e.g., chroots, this does not necessarily match that DPDK is built
> for. DPDK will then build for the wrong kernel version - that of the
> server, and not that installed in the (build) chroot.
>
> The patch uses utsrelease.h from the kernel sources instead and fakes
> the upload version.
>
> Tested on a server with Ubuntu 12.04, building in a chroot for Ubuntu
> 14.04.
>
> Signed-off-by: Simon Kagstrom
> Signed-off-by: Johan Faltstrom
> ---
> ChangeLog:
>
> v2: Improve description and motivation for the patch.
>
>  lib/librte_eal/linuxapp/kni/Makefile | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/kni/Makefile b/lib/librte_eal/linuxapp/kni/Makefile
> index fb673d9..ac99d3f 100644
> --- a/lib/librte_eal/linuxapp/kni/Makefile
> +++ b/lib/librte_eal/linuxapp/kni/Makefile
> @@ -44,10 +44,10 @@ MODULE_CFLAGS += -I$(RTE_OUTPUT)/include -I$(SRCDIR)/ethtool/ixgbe -I$(SRCDIR)/e
>  MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h
>  MODULE_CFLAGS += -Wall -Werror
>
> -ifeq ($(shell test -f /proc/version_signature && lsb_release -si 2>/dev/null),Ubuntu)
> +ifeq ($(shell lsb_release -si 2>/dev/null),Ubuntu)
>  MODULE_CFLAGS += -DUBUNTU_RELEASE_CODE=$(shell lsb_release -sr | tr -d .)
> -UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature | \
> -cut -d'~' -f1 | cut -d- -f1,2 | tr .- $(comma))
> +UBUNTU_KERNEL_CODE := $(shell echo `grep UTS_RELEASE $(RTE_KERNELDIR)/include/generated/utsrelease.h \
> +	 | cut -d '"' -f2 | cut -d- -f1,2 | tr .- $(comma)`,1)
>  MODULE_CFLAGS += -D"UBUNTU_KERNEL_CODE=UBUNTU_KERNEL_VERSION($(UBUNTU_KERNEL_CODE))"
>  endif
>
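To make the new shell pipeline easier to follow, here is a rough C equivalent of the text transformation it performs on the UTS_RELEASE string: keep the first two '-'-separated fields, turn '.' and '-' into commas, and append the faked upload version ",1". The function name `ubuntu_kernel_code` is made up for this sketch, and the example release string is the kernel version that appears elsewhere in this archive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper mirroring
 *   cut -d- -f1,2 | tr .- ,   followed by appending ",1".
 *
 * Example: "3.13.0-63-generic" becomes "3,13,0,63,1".
 * Returns 0 on success, -1 if 'out' is too small.
 */
static int
ubuntu_kernel_code(const char *uts_release, char *out, size_t len)
{
	size_t i, n = 0;
	int dashes = 0;

	assert(uts_release != NULL && out != NULL);

	for (i = 0; uts_release[i] != '\0'; i++) {
		char c = uts_release[i];

		/* Stop at the second '-', like cut -d- -f1,2. */
		if (c == '-' && ++dashes == 2)
			break;
		/* Map '.' and '-' to ',', like tr .- , */
		if (c == '.' || c == '-')
			c = ',';
		/* Keep one byte free for the terminating NUL. */
		if (n + 1 >= len)
			return -1;
		out[n++] = c;
	}

	/* Append the faked upload version plus the terminating NUL. */
	if (n + 3 > len)
		return -1;
	memcpy(out + n, ",1", 3);
	return 0;
}
```

The resulting comma-separated list is what gets handed to the UBUNTU_KERNEL_VERSION() macro as its arguments.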
[dpdk-dev] [PATCH v3] mbuf/ip_frag: Move mbuf chaining to common code
Ping?

// Simon

On Mon, 7 Sep 2015 14:50:09 +0200
Simon Kagstrom wrote:

> Chaining/segmenting mbufs can be useful in many places, so make it
> global.
>
> Signed-off-by: Simon Kagstrom
> Signed-off-by: Johan Faltstrom
> ---
> ChangeLog:
> v2:
>  * Check for nb_segs byte overflow (Olivier MATZ)
>  * Don't reset nb_segs in tail (Olivier MATZ)
> v3:
>  * Describe performance implications of linear search
>  * Correct check-for-out-of-bounds (Konstantin Ananyev)
>
>  lib/librte_ip_frag/ip_frag_common.h      | 23 -
>  lib/librte_ip_frag/rte_ipv4_reassembly.c | 7 +--
>  lib/librte_ip_frag/rte_ipv6_reassembly.c | 7 +--
>  lib/librte_mbuf/rte_mbuf.h               | 34
>  4 files changed, 44 insertions(+), 27 deletions(-)
>
> diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
> index 6b2acee..cde6ed4 100644
> --- a/lib/librte_ip_frag/ip_frag_common.h
> +++ b/lib/librte_ip_frag/ip_frag_common.h
> @@ -166,27 +166,4 @@ ip_frag_reset(struct ip_frag_pkt *fp, uint64_t tms)
>  	fp->frags[IP_FIRST_FRAG_IDX] = zero_frag;
>  }
>
> -/* chain two mbufs */
> -static inline void
> -ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
> -{
> -	struct rte_mbuf *ms;
> -
> -	/* adjust start of the last fragment data. */
> -	rte_pktmbuf_adj(mp, (uint16_t)(mp->l2_len + mp->l3_len));
> -
> -	/* chain two fragments. */
> -	ms = rte_pktmbuf_lastseg(mn);
> -	ms->next = mp;
> -
> -	/* accumulate number of segments and total length. */
> -	mn->nb_segs = (uint8_t)(mn->nb_segs + mp->nb_segs);
> -	mn->pkt_len += mp->pkt_len;
> -
> -	/* reset pkt_len and nb_segs for chained fragment. */
> -	mp->pkt_len = mp->data_len;
> -	mp->nb_segs = 1;
> -}
> -
> -
>  #endif /* _IP_FRAG_COMMON_H_ */
> diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> index 5d24843..26d07f9 100644
> --- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
> +++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> @@ -63,7 +63,9 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
>  			/* previous fragment found. */
>  			if(fp->frags[i].ofs + fp->frags[i].len == ofs) {
>
> -				ip_frag_chain(fp->frags[i].mb, m);
> +				/* adjust start of the last fragment data. */
> +				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
> +				rte_pktmbuf_chain(fp->frags[i].mb, m);
>
>  				/* update our last fragment and offset. */
>  				m = fp->frags[i].mb;
> @@ -78,7 +80,8 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
>  	}
>
>  	/* chain with the first fragment. */
> -	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
> +	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
> +	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
>  	m = fp->frags[IP_FIRST_FRAG_IDX].mb;
>
>  	/* update mbuf fields for reassembled packet. */
> diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> index 1f1c172..5969b4a 100644
> --- a/lib/librte_ip_frag/rte_ipv6_reassembly.c
> +++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> @@ -86,7 +86,9 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
>  			/* previous fragment found. */
>  			if (fp->frags[i].ofs + fp->frags[i].len == ofs) {
>
> -				ip_frag_chain(fp->frags[i].mb, m);
> +				/* adjust start of the last fragment data. */
> +				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
> +				rte_pktmbuf_chain(fp->frags[i].mb, m);
>
>  				/* update our last fragment and offset. */
>  				m = fp->frags[i].mb;
> @@ -101,7 +103,8 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
>  	}
>
>  	/* chain with the first fragment. */
> -	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
> +	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
> +	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
>  	m = fp->frags[IP_FIRST_FRAG_IDX].mb;
>
>  	/* update mbuf fields for reassembled packet. */
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index d7c9030..f1f1400 100644
> ---
[dpdk-dev] [PATCH] mk: Quote $(KERNELCC) to allow ccache builds
Ping?

This is one of three outstanding DPDK patches of mine which haven't seen
any activity in a while. Is there a list of pending patches somewhere to
monitor activity?

// Simon

On Thu, 24 Sep 2015 09:43:28 +0200
Simon Kagstrom wrote:

> Otherwise building with KERNELCC="ccache gcc" will fail:
>
>   == Build lib/librte_eal/linuxapp/igb_uio
>   /usr/src/linux-headers-3.13.0-63-generic/arch/x86/Makefile:98: stack protector enabled but no compiler support
>   /usr/src/linux-headers-3.13.0-63-generic/arch/x86/Makefile:113: CONFIG_X86_X32 enabled but no binutils support
>   ccache: invalid option -- 'p'
>   Usage:
>       ccache [options]
>       ccache compiler [compiler options]
>       compiler [compiler options]          (via symbolic link)
>
>   Options:
>       -c, --cleanup         delete old files and recalculate size counters
>                             (normally not needed as this is done automatically)
>       -C, --clear           clear the cache completely
>       -F, --max-files=N     set maximum number of files in cache to N (use 0 for
>                             no limit)
>       -M, --max-size=SIZE   set maximum size of cache to SIZE (use 0 for no
>                             limit; available suffixes: G, M and K; default
>                             suffix: G)
>       -s, --show-stats      show statistics summary
>       -z, --zero-stats      zero statistics counters
>
>       -h, --help            print this help text
>       -V, --version         print version and copyright information
>
> Signed-off-by: Simon Kagstrom
> ---
>  mk/rte.module.mk | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mk/rte.module.mk b/mk/rte.module.mk
> index 7bf77c1..53ed4fe 100644
> --- a/mk/rte.module.mk
> +++ b/mk/rte.module.mk
> @@ -78,7 +78,7 @@ build: _postbuild
>  $(MODULE).ko: $(SRCS_LINKS)
>  	@if [ ! -f $(notdir Makefile) ]; then ln -nfs $(SRCDIR)/Makefile . ; fi
>  	@$(MAKE) -C $(RTE_KERNELDIR) M=$(CURDIR) O=$(RTE_KERNELDIR) \
> -	CC=$(KERNELCC) CROSS_COMPILE=$(CROSS) V=$(if $V,1,0)
> +	CC="$(KERNELCC)" CROSS_COMPILE=$(CROSS) V=$(if $V,1,0)
>
>  # install module in $(RTE_OUTPUT)/kmod
>  $(RTE_OUTPUT)/kmod/$(MODULE).ko: $(MODULE).ko
[dpdk-dev] [PATCH] mk: Quote $(KERNELCC) to allow ccache builds
Otherwise building with KERNELCC="ccache gcc" will fail:

  == Build lib/librte_eal/linuxapp/igb_uio
  /usr/src/linux-headers-3.13.0-63-generic/arch/x86/Makefile:98: stack protector enabled but no compiler support
  /usr/src/linux-headers-3.13.0-63-generic/arch/x86/Makefile:113: CONFIG_X86_X32 enabled but no binutils support
  ccache: invalid option -- 'p'
  Usage:
      ccache [options]
      ccache compiler [compiler options]
      compiler [compiler options]          (via symbolic link)

  Options:
      -c, --cleanup         delete old files and recalculate size counters
                            (normally not needed as this is done automatically)
      -C, --clear           clear the cache completely
      -F, --max-files=N     set maximum number of files in cache to N (use 0 for
                            no limit)
      -M, --max-size=SIZE   set maximum size of cache to SIZE (use 0 for no
                            limit; available suffixes: G, M and K; default
                            suffix: G)
      -s, --show-stats      show statistics summary
      -z, --zero-stats      zero statistics counters

      -h, --help            print this help text
      -V, --version         print version and copyright information

Signed-off-by: Simon Kagstrom
---
 mk/rte.module.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.module.mk b/mk/rte.module.mk
index 7bf77c1..53ed4fe 100644
--- a/mk/rte.module.mk
+++ b/mk/rte.module.mk
@@ -78,7 +78,7 @@ build: _postbuild
 $(MODULE).ko: $(SRCS_LINKS)
 	@if [ ! -f $(notdir Makefile) ]; then ln -nfs $(SRCDIR)/Makefile . ; fi
 	@$(MAKE) -C $(RTE_KERNELDIR) M=$(CURDIR) O=$(RTE_KERNELDIR) \
-	CC=$(KERNELCC) CROSS_COMPILE=$(CROSS) V=$(if $V,1,0)
+	CC="$(KERNELCC)" CROSS_COMPILE=$(CROSS) V=$(if $V,1,0)

 # install module in $(RTE_OUTPUT)/kmod
 $(RTE_OUTPUT)/kmod/$(MODULE).ko: $(MODULE).ko
--
1.9.1
[dpdk-dev] [PATCH v3] mbuf/ip_frag: Move mbuf chaining to common code
Chaining/segmenting mbufs can be useful in many places, so make it
global.

Signed-off-by: Simon Kagstrom
Signed-off-by: Johan Faltstrom
---
ChangeLog:
v2:
 * Check for nb_segs byte overflow (Olivier MATZ)
 * Don't reset nb_segs in tail (Olivier MATZ)
v3:
 * Describe performance implications of linear search
 * Correct check-for-out-of-bounds (Konstantin Ananyev)

 lib/librte_ip_frag/ip_frag_common.h      | 23 -
 lib/librte_ip_frag/rte_ipv4_reassembly.c | 7 +--
 lib/librte_ip_frag/rte_ipv6_reassembly.c | 7 +--
 lib/librte_mbuf/rte_mbuf.h               | 34
 4 files changed, 44 insertions(+), 27 deletions(-)

diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
index 6b2acee..cde6ed4 100644
--- a/lib/librte_ip_frag/ip_frag_common.h
+++ b/lib/librte_ip_frag/ip_frag_common.h
@@ -166,27 +166,4 @@ ip_frag_reset(struct ip_frag_pkt *fp, uint64_t tms)
 	fp->frags[IP_FIRST_FRAG_IDX] = zero_frag;
 }

-/* chain two mbufs */
-static inline void
-ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
-{
-	struct rte_mbuf *ms;
-
-	/* adjust start of the last fragment data. */
-	rte_pktmbuf_adj(mp, (uint16_t)(mp->l2_len + mp->l3_len));
-
-	/* chain two fragments. */
-	ms = rte_pktmbuf_lastseg(mn);
-	ms->next = mp;
-
-	/* accumulate number of segments and total length. */
-	mn->nb_segs = (uint8_t)(mn->nb_segs + mp->nb_segs);
-	mn->pkt_len += mp->pkt_len;
-
-	/* reset pkt_len and nb_segs for chained fragment. */
-	mp->pkt_len = mp->data_len;
-	mp->nb_segs = 1;
-}
-
-
 #endif /* _IP_FRAG_COMMON_H_ */
diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
index 5d24843..26d07f9 100644
--- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
+++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
@@ -63,7 +63,9 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
 			/* previous fragment found. */
 			if(fp->frags[i].ofs + fp->frags[i].len == ofs) {

-				ip_frag_chain(fp->frags[i].mb, m);
+				/* adjust start of the last fragment data. */
+				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+				rte_pktmbuf_chain(fp->frags[i].mb, m);

 				/* update our last fragment and offset. */
 				m = fp->frags[i].mb;
@@ -78,7 +80,8 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
 	}

 	/* chain with the first fragment. */
-	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
 	m = fp->frags[IP_FIRST_FRAG_IDX].mb;

 	/* update mbuf fields for reassembled packet. */
diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
index 1f1c172..5969b4a 100644
--- a/lib/librte_ip_frag/rte_ipv6_reassembly.c
+++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
@@ -86,7 +86,9 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
 			/* previous fragment found. */
 			if (fp->frags[i].ofs + fp->frags[i].len == ofs) {

-				ip_frag_chain(fp->frags[i].mb, m);
+				/* adjust start of the last fragment data. */
+				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+				rte_pktmbuf_chain(fp->frags[i].mb, m);

 				/* update our last fragment and offset. */
 				m = fp->frags[i].mb;
@@ -101,7 +103,8 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
 	}

 	/* chain with the first fragment. */
-	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
 	m = fp->frags[IP_FIRST_FRAG_IDX].mb;

 	/* update mbuf fields for reassembled packet. */
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index d7c9030..f1f1400 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1775,6 +1775,40 @@ static inline int rte_pktmbuf_is_contiguous(const struct rte_mbuf *m)
 }

 /**
+ * Chain an mbuf to another, thereby creating a segmented packet.
+ *
+ * Note: The implementation will do a linear walk over the segments to find
+ * the tail entry. For cases when there are many segments, it's better to
+ * chain the entries manually.
+ *
+ * @param head the head of the mbuf chain (the first packet)
+ * @param tail the mbuf to put last in the chain
+ *
+ * @return 0 on succ
[dpdk-dev] [PATCH v2] mbuf/ip_frag: Move mbuf chaining to common code
Chaining/segmenting mbufs can be useful in many places, so make it
global.

Signed-off-by: Simon Kagstrom
Signed-off-by: Johan Faltstrom
---
ChangeLog:
v2:
 * Check for nb_segs byte overflow (Olivier MATZ)
 * Don't reset nb_segs in tail (Olivier MATZ)

 lib/librte_ip_frag/ip_frag_common.h      | 23 ---
 lib/librte_ip_frag/rte_ipv4_reassembly.c | 7 +--
 lib/librte_ip_frag/rte_ipv6_reassembly.c | 7 +--
 lib/librte_mbuf/rte_mbuf.h               | 30 ++
 4 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
index 6b2acee..cde6ed4 100644
--- a/lib/librte_ip_frag/ip_frag_common.h
+++ b/lib/librte_ip_frag/ip_frag_common.h
@@ -166,27 +166,4 @@ ip_frag_reset(struct ip_frag_pkt *fp, uint64_t tms)
 	fp->frags[IP_FIRST_FRAG_IDX] = zero_frag;
 }

-/* chain two mbufs */
-static inline void
-ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp)
-{
-	struct rte_mbuf *ms;
-
-	/* adjust start of the last fragment data. */
-	rte_pktmbuf_adj(mp, (uint16_t)(mp->l2_len + mp->l3_len));
-
-	/* chain two fragments. */
-	ms = rte_pktmbuf_lastseg(mn);
-	ms->next = mp;
-
-	/* accumulate number of segments and total length. */
-	mn->nb_segs = (uint8_t)(mn->nb_segs + mp->nb_segs);
-	mn->pkt_len += mp->pkt_len;
-
-	/* reset pkt_len and nb_segs for chained fragment. */
-	mp->pkt_len = mp->data_len;
-	mp->nb_segs = 1;
-}
-
-
 #endif /* _IP_FRAG_COMMON_H_ */
diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
index 5d24843..26d07f9 100644
--- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
+++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
@@ -63,7 +63,9 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
 			/* previous fragment found. */
 			if(fp->frags[i].ofs + fp->frags[i].len == ofs) {

-				ip_frag_chain(fp->frags[i].mb, m);
+				/* adjust start of the last fragment data. */
+				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+				rte_pktmbuf_chain(fp->frags[i].mb, m);

 				/* update our last fragment and offset. */
 				m = fp->frags[i].mb;
@@ -78,7 +80,8 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp)
 	}

 	/* chain with the first fragment. */
-	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
 	m = fp->frags[IP_FIRST_FRAG_IDX].mb;

 	/* update mbuf fields for reassembled packet. */
diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
index 1f1c172..5969b4a 100644
--- a/lib/librte_ip_frag/rte_ipv6_reassembly.c
+++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
@@ -86,7 +86,9 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
 			/* previous fragment found. */
 			if (fp->frags[i].ofs + fp->frags[i].len == ofs) {

-				ip_frag_chain(fp->frags[i].mb, m);
+				/* adjust start of the last fragment data. */
+				rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+				rte_pktmbuf_chain(fp->frags[i].mb, m);

 				/* update our last fragment and offset. */
 				m = fp->frags[i].mb;
@@ -101,7 +103,8 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp)
 	}

 	/* chain with the first fragment. */
-	ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
+	rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len));
+	rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m);
 	m = fp->frags[IP_FIRST_FRAG_IDX].mb;

 	/* update mbuf fields for reassembled packet. */
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index d7c9030..19a4bb5 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1775,6 +1775,36 @@ static inline int rte_pktmbuf_is_contiguous(const struct rte_mbuf *m)
 }

 /**
+ * Chain an mbuf to another, thereby creating a segmented packet.
+ *
+ * @param head the head of the mbuf chain (the first packet)
+ * @param tail the mbuf to put last in the chain
+ *
+ * @return 0 on success, -EOVERFLOW if the chain is full (256 entries)
+ */
+static inline int rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail)
+{
+	struct rte_mbuf *cur_tail;
+
+	/* Check for number-of-segments-overflow */
+	if (head->nb_segs + tail->nb_segs >= sizeof(he
[dpdk-dev] [PATCH RFC] mbuf/ip_frag: Move mbuf chaining to common code
Chaining/segmenting mbufs can be useful in many places, so make it global. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- NOTE! Only compile-tested. We were looking for packet segmenting functionality in the MBUF API but didn't find it. This patch moves the implementation, apart from the things which look ip_frag-specific. lib/librte_ip_frag/ip_frag_common.h | 23 --- lib/librte_ip_frag/rte_ipv4_reassembly.c | 7 +-- lib/librte_ip_frag/rte_ipv6_reassembly.c | 7 +-- lib/librte_mbuf/rte_mbuf.h | 23 +++ 4 files changed, 33 insertions(+), 27 deletions(-) diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h index 6b2acee..cde6ed4 100644 --- a/lib/librte_ip_frag/ip_frag_common.h +++ b/lib/librte_ip_frag/ip_frag_common.h @@ -166,27 +166,4 @@ ip_frag_reset(struct ip_frag_pkt *fp, uint64_t tms) fp->frags[IP_FIRST_FRAG_IDX] = zero_frag; } -/* chain two mbufs */ -static inline void -ip_frag_chain(struct rte_mbuf *mn, struct rte_mbuf *mp) -{ - struct rte_mbuf *ms; - - /* adjust start of the last fragment data. */ - rte_pktmbuf_adj(mp, (uint16_t)(mp->l2_len + mp->l3_len)); - - /* chain two fragments. */ - ms = rte_pktmbuf_lastseg(mn); - ms->next = mp; - - /* accumulate number of segments and total length. */ - mn->nb_segs = (uint8_t)(mn->nb_segs + mp->nb_segs); - mn->pkt_len += mp->pkt_len; - - /* reset pkt_len and nb_segs for chained fragment. */ - mp->pkt_len = mp->data_len; - mp->nb_segs = 1; -} - - #endif /* _IP_FRAG_COMMON_H_ */ diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c index 5d24843..26d07f9 100644 --- a/lib/librte_ip_frag/rte_ipv4_reassembly.c +++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c @@ -63,7 +63,9 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp) /* previous fragment found. */ if(fp->frags[i].ofs + fp->frags[i].len == ofs) { - ip_frag_chain(fp->frags[i].mb, m); + /* adjust start of the last fragment data. 
*/ + rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len)); + rte_pktmbuf_chain(fp->frags[i].mb, m); /* update our last fragment and offset. */ m = fp->frags[i].mb; @@ -78,7 +80,8 @@ ipv4_frag_reassemble(const struct ip_frag_pkt *fp) } /* chain with the first fragment. */ - ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m); + rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len)); + rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m); m = fp->frags[IP_FIRST_FRAG_IDX].mb; /* update mbuf fields for reassembled packet. */ diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c index 1f1c172..5969b4a 100644 --- a/lib/librte_ip_frag/rte_ipv6_reassembly.c +++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c @@ -86,7 +86,9 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp) /* previous fragment found. */ if (fp->frags[i].ofs + fp->frags[i].len == ofs) { - ip_frag_chain(fp->frags[i].mb, m); + /* adjust start of the last fragment data. */ + rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len)); + rte_pktmbuf_chain(fp->frags[i].mb, m); /* update our last fragment and offset. */ m = fp->frags[i].mb; @@ -101,7 +103,8 @@ ipv6_frag_reassemble(const struct ip_frag_pkt *fp) } /* chain with the first fragment. */ - ip_frag_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m); + rte_pktmbuf_adj(m, (uint16_t)(m->l2_len + m->l3_len)); + rte_pktmbuf_chain(fp->frags[IP_FIRST_FRAG_IDX].mb, m); m = fp->frags[IP_FIRST_FRAG_IDX].mb; /* update mbuf fields for reassembled packet. */ diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index 8c2db1b..ef47256 100644 --- a/lib/librte_mbuf/rte_mbuf.h +++ b/lib/librte_mbuf/rte_mbuf.h @@ -1801,6 +1801,29 @@ static inline int rte_pktmbuf_is_contiguous(const struct rte_mbuf *m) } /** + * Chain an mbuf to another, thereby creating a segmented packet. 
+ * + * @param head the head of the mbuf chain (the first packet) + * @param tail the mbuf to put last in the chain + */ +static inline void rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail) +{ + struct rte_mbuf *cur_tail; + + /* Chain 'tail' onto the old tail */ + cur_tail = rte_pktmbuf_lastseg(head); + cur_tail-&g
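The chaining step that this patch moves into the mbuf API can be sketched in isolation. This is a minimal sketch, assuming a toy structure: `struct toy_mbuf`, `toy_lastseg` and `toy_chain` are hypothetical stand-ins rather than DPDK symbols, and the bookkeeping mirrors the removed `ip_frag_chain()` without the ip_frag-specific `rte_pktmbuf_adj()` call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct rte_mbuf: only the fields the chaining logic touches. */
struct toy_mbuf {
	struct toy_mbuf *next;  /* next segment, NULL for the last one */
	uint32_t pkt_len;       /* total length of the packet (head only) */
	uint16_t data_len;      /* length of the data in this segment */
	uint8_t nb_segs;        /* number of segments (head only) */
};

/* Walk to the last segment of a chain (hypothetical analogue of rte_pktmbuf_lastseg). */
static struct toy_mbuf *toy_lastseg(struct toy_mbuf *m)
{
	assert(m != NULL);
	while (m->next != NULL)
		m = m->next;
	return m;
}

/* Append 'tail' to the chain starting at 'head'. */
static void toy_chain(struct toy_mbuf *head, struct toy_mbuf *tail)
{
	struct toy_mbuf *cur_tail;

	assert(head != NULL && tail != NULL);

	/* Chain 'tail' onto the old tail. */
	cur_tail = toy_lastseg(head);
	cur_tail->next = tail;

	/* Accumulate number of segments and total length in the head. */
	head->nb_segs = (uint8_t)(head->nb_segs + tail->nb_segs);
	head->pkt_len += tail->pkt_len;

	/* Reset pkt_len and nb_segs for the chained segment. */
	tail->pkt_len = tail->data_len;
	tail->nb_segs = 1;
}
```

Only the head carries the aggregate `pkt_len` and `nb_segs`, which is why the chained segment has its own values reset afterwards.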
[dpdk-dev] [PATCH v2] kni: Use utsrelease.h to determine Ubuntu kernel version
/proc/version_signature reports the kernel version of the host machine, but in e.g. chroots this does not necessarily match the kernel that DPDK is built for. DPDK will then build for the wrong kernel version: that of the host server, and not the one installed in the (build) chroot. The patch uses utsrelease.h from the kernel sources instead and fakes the upload version. Tested on a server with Ubuntu 12.04, building in a chroot for Ubuntu 14.04. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- ChangeLog: v2: Improve description and motivation for the patch. lib/librte_eal/linuxapp/kni/Makefile | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/kni/Makefile b/lib/librte_eal/linuxapp/kni/Makefile index fb673d9..ac99d3f 100644 --- a/lib/librte_eal/linuxapp/kni/Makefile +++ b/lib/librte_eal/linuxapp/kni/Makefile @@ -44,10 +44,10 @@ MODULE_CFLAGS += -I$(RTE_OUTPUT)/include -I$(SRCDIR)/ethtool/ixgbe -I$(SRCDIR)/e MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h MODULE_CFLAGS += -Wall -Werror -ifeq ($(shell test -f /proc/version_signature && lsb_release -si 2>/dev/null),Ubuntu) +ifeq ($(shell lsb_release -si 2>/dev/null),Ubuntu) MODULE_CFLAGS += -DUBUNTU_RELEASE_CODE=$(shell lsb_release -sr | tr -d .) -UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature | \ -cut -d'~' -f1 | cut -d- -f1,2 | tr .- $(comma)) +UBUNTU_KERNEL_CODE := $(shell echo `grep UTS_RELEASE $(RTE_KERNELDIR)/include/generated/utsrelease.h \ +| cut -d '"' -f2 | cut -d- -f1,2 | tr .- $(comma)`,1) MODULE_CFLAGS += -D"UBUNTU_KERNEL_CODE=UBUNTU_KERNEL_VERSION($(UBUNTU_KERNEL_CODE))" endif -- 1.9.1
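To make the string transformation in the new Makefile pipeline easier to follow, here is a rough C equivalent of the `cut`/`tr` steps. It is only a sketch: `uts_release_to_code` is a hypothetical helper, and the release string used in the usage note is an assumed example of the Ubuntu format, not a value taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a UTS_RELEASE value into the comma-separated argument list built
 * by the Makefile: keep everything before the second '-', turn '.' and '-'
 * into ',', then append ",1" as the faked upload number.
 * Returns 0 on success, -1 if 'out' is too small.
 */
static int uts_release_to_code(const char *release, char *out, size_t out_len)
{
	const char *suffix = ",1";
	const char *p;
	size_t n = 0;
	int dashes = 0;

	assert(release != NULL && out != NULL);

	for (p = release; *p != '\0'; p++) {
		char c = *p;

		if (c == '-' && ++dashes == 2)
			break;          /* like cut -d- -f1,2 */
		if (c == '.' || c == '-')
			c = ',';        /* like tr .- , */
		if (n + 1 >= out_len)
			return -1;
		out[n++] = c;
	}

	if (n + strlen(suffix) + 1 > out_len)
		return -1;
	memcpy(out + n, suffix, strlen(suffix) + 1);
	return 0;
}
```

With an assumed release string of `3.13.0-24-generic`, the helper writes `3,13,0,24,1` into the output buffer, which is the argument shape that `UBUNTU_KERNEL_VERSION(...)` receives in the patched Makefile.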
[dpdk-dev] [PATCH v2] mem: Warn once if /proc/self/pagemap is unreadable
Newer kernels make this unreadable for security reasons for non-roots. Running the application will then fill the logs with rte_mem_virt2phy: cannot open /proc/self/pagemap messages. However, there are cases when DPDK is and should be run as non-root, without the need for virtual-to-physical address translations: a typical example is when working with PCAP input/output. This patch adds a start-time check for /proc/self/pagemap readability, and directly returns an error code from rte_mem_virt2phy(). This way, there is only a one-time warning at startup instead of constant warnings all the time. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- ChangeLog: v2: Fix use with huge pages. lib/librte_eal/linuxapp/eal/eal_memory.c | 25 + 1 file changed, 25 insertions(+) diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 9b8d946..f4a1936 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -111,6 +111,8 @@ static uint64_t baseaddr_offset; +static unsigned proc_pagemap_readable; + #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space" /* Lock page in physical memory and prevent from swapping. 
*/ @@ -135,6 +137,10 @@ rte_mem_virt2phy(const void *virtaddr) int page_size; off_t offset; + /* Cannot parse /proc/self/pagemap, no need to log errors everywhere */ + if (!proc_pagemap_readable) + return RTE_BAD_PHYS_ADDR; + /* standard page size */ page_size = getpagesize(); @@ -1546,12 +1552,31 @@ rte_eal_memdevice_init(void) return 0; } +static int +test_proc_pagemap_readable(void) +{ + int fd = open("/proc/self/pagemap", O_RDONLY); + + if (fd < 0) + return 0; + /* Is readable */ + close(fd); + + return 1; +} /* init memory subsystem */ int rte_eal_memory_init(void) { RTE_LOG(INFO, EAL, "Setting up memory...\n"); + + proc_pagemap_readable = test_proc_pagemap_readable(); + if (!proc_pagemap_readable) + RTE_LOG(ERR, EAL, + "Cannot open /proc/self/pagemap: %s. virt2phys address translation will not work\n", + strerror(errno)); + const int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ? rte_eal_hugepage_init() : rte_eal_hugepage_attach(); -- 1.9.1
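The probe-once pattern used by the patch can be tried outside DPDK with a few lines of plain C. This is a sketch under stated assumptions: `path_readable`, `toy_memory_init` and `toy_virt2phy_allowed` are hypothetical names, and `fprintf` stands in for `RTE_LOG`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Cached result of the start-time probe; 0 until toy_memory_init() runs. */
static int toy_pagemap_readable;

/* Return 1 if 'path' can be opened for reading, 0 otherwise (errno is kept). */
static int path_readable(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}

/* Probe once at start-up and warn a single time if the file is unreadable. */
static void toy_memory_init(void)
{
	toy_pagemap_readable = path_readable("/proc/self/pagemap");
	if (!toy_pagemap_readable)
		fprintf(stderr,
			"Cannot open /proc/self/pagemap: %s. "
			"virt2phys address translation will not work\n",
			strerror(errno));
}

/* Hot-path check: no system call and no logging, just the cached flag. */
static int toy_virt2phy_allowed(void)
{
	return toy_pagemap_readable;
}
```

The translation path then only has to test the cached flag and return its error value early, so the log is written once rather than on every lookup.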
[dpdk-dev] [PATCH] mem: Warn once if /proc/self/pagemap is unreadable
Newer kernels make this unreadable for security reasons for non-roots. Running the application will then fill the logs with rte_mem_virt2phy: cannot open /proc/self/pagemap messages. However, there are cases when DPDK is and should be run as non-root, without the need for virtual-to-physical address translations: a typical example is when working with PCAP input/output. This patch adds a start-time check for /proc/self/pagemap readability, and directly returns an error code from rte_mem_virt2phy(). This way, there is only a one-time warning at startup instead of constant warnings all the time. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- lib/librte_eal/linuxapp/eal/eal_memory.c | 24 1 file changed, 24 insertions(+) diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 9b8d946..7274cb3 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -111,6 +111,8 @@ static uint64_t baseaddr_offset; +static unsigned proc_pagemap_readable; + #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space" /* Lock page in physical memory and prevent from swapping. 
*/ @@ -135,6 +137,10 @@ rte_mem_virt2phy(const void *virtaddr) int page_size; off_t offset; + /* Cannot parse /proc/self/pagemap, no need to log errors everywhere */ + if (!proc_pagemap_readable) + return RTE_BAD_PHYS_ADDR; + /* standard page size */ page_size = getpagesize(); @@ -1546,6 +1552,18 @@ rte_eal_memdevice_init(void) return 0; } +static int +test_proc_pagemap_readable(void) +{ + int fd = open("/proc/self/pagemap", O_RDONLY); + + if (fd < 0) + return 0; + /* Is readable */ + close(fd); + + return 1; +} /* init memory subsystem */ int @@ -1561,5 +1579,11 @@ rte_eal_memory_init(void) if (internal_config.no_shconf == 0 && rte_eal_memdevice_init() < 0) return -1; + proc_pagemap_readable = test_proc_pagemap_readable(); + if (!proc_pagemap_readable) + RTE_LOG(ERR, EAL, + "Cannot open /proc/self/pagemap: %s. virt2phys address translation will not work\n", + strerror(errno)); + return 0; } -- 1.9.1
[dpdk-dev] deadline notice
On Thu, 11 Jun 2015 10:33:21 +0200 Thomas Monjalon wrote:
> 2015-06-11 09:27, Simon Kagstrom:
> > On Wed, 10 Jun 2015 20:39:59 +0200
> > Thomas Monjalon wrote:
> >
> > I didn't find this in the list:
> >
> > rte_reorder: Allow sequence numbers > 0 as starting point
>
> It was considered more or less as a fix.
>
> > and I think it would be good to have as well.
>
> Yes of course.

OK, thanks! It would be good to have some sort of feedback on accepted patches though - in addition to the above one, I also sent a build-fix

kni: Use utsrelease.h to determine Ubuntu kernel version

which I also think would be good to have (basically, building in a chroot is broken otherwise). I haven't had time to test the patch on older Ubuntus though.

So, while this is an old discussion by now, I think DPDK is large enough to warrant some more infrastructure for patch submissions - at least in the form of build-tests for patches, and preferably some sort of status marker for accepted patches. Basically what Github/Travis-ci etc. already provide.

// Simon
[dpdk-dev] [PATCH v2] kni: Add set_rx_mode callback to handle multicast groups
We did some (very basic) tests with IGMP, which involves adding multicast addresses to ETH interfaces. This is done via the ip tool, an example can be found on e.g., http://superuser.com/questions/324824/linux-built-in-or-open-source-program-to-join-multicast-group and this will fail on KNI interfaces because of an unimplemented ioctl SIOCADDMULTI. The patch simply adds an empty callback for set_rx_mode (typically used for setting up hardware) so that the ioctl succeeds. This is the same thing as the Linux tap interface does. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- ChangeLog: v2: Improve motivation for the patch lib/librte_eal/linuxapp/kni/kni_net.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c b/lib/librte_eal/linuxapp/kni/kni_net.c index dd95db5..cf93c4b 100644 --- a/lib/librte_eal/linuxapp/kni/kni_net.c +++ b/lib/librte_eal/linuxapp/kni/kni_net.c @@ -495,6 +495,11 @@ kni_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) return 0; } +static void +kni_net_set_rx_mode(struct net_device *dev) +{ +} + static int kni_net_change_mtu(struct net_device *dev, int new_mtu) { @@ -645,6 +650,7 @@ static const struct net_device_ops kni_net_netdev_ops = { .ndo_start_xmit = kni_net_tx, .ndo_change_mtu = kni_net_change_mtu, .ndo_do_ioctl = kni_net_ioctl, + .ndo_set_rx_mode = kni_net_set_rx_mode, .ndo_get_stats = kni_net_stats, .ndo_tx_timeout = kni_net_tx_timeout, .ndo_set_mac_address = kni_net_set_mac, -- 1.9.1
[dpdk-dev] [PATCH v2] eal: Allow combining -m and --no-huge
On Wed, 27 May 2015 09:06:46 -0400 Thomas F Herbert wrote:
> I just tried applying v2 of the patch to master:
> git apply ../patches/eal_common_options.patch
> error: patch failed: lib/librte_eal/common/eal_common_options.c:850
> error: lib/librte_eal/common/eal_common_options.c: patch does not apply

Hmm... I tried it myself, but just saved the email and used

git am main.mbox

and that worked. I'm using claws-mail as a mailer.

// Simon
[dpdk-dev] [PATCH] kni: Use utsrelease.h to determine Ubuntu kernel version
/proc/version_signature reports the kernel version of the host machine, but in e.g. chroots this does not need to match the kernel that DPDK is built for. Use utsrelease.h from the kernel sources instead and fake the upload version. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- lib/librte_eal/linuxapp/kni/Makefile | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/kni/Makefile b/lib/librte_eal/linuxapp/kni/Makefile index fb673d9..ac99d3f 100644 --- a/lib/librte_eal/linuxapp/kni/Makefile +++ b/lib/librte_eal/linuxapp/kni/Makefile @@ -44,10 +44,10 @@ MODULE_CFLAGS += -I$(RTE_OUTPUT)/include -I$(SRCDIR)/ethtool/ixgbe -I$(SRCDIR)/e MODULE_CFLAGS += -include $(RTE_OUTPUT)/include/rte_config.h MODULE_CFLAGS += -Wall -Werror -ifeq ($(shell test -f /proc/version_signature && lsb_release -si 2>/dev/null),Ubuntu) +ifeq ($(shell lsb_release -si 2>/dev/null),Ubuntu) MODULE_CFLAGS += -DUBUNTU_RELEASE_CODE=$(shell lsb_release -sr | tr -d .) -UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature | \ -cut -d'~' -f1 | cut -d- -f1,2 | tr .- $(comma)) +UBUNTU_KERNEL_CODE := $(shell echo `grep UTS_RELEASE $(RTE_KERNELDIR)/include/generated/utsrelease.h \ +| cut -d '"' -f2 | cut -d- -f1,2 | tr .- $(comma)`,1) MODULE_CFLAGS += -D"UBUNTU_KERNEL_CODE=UBUNTU_KERNEL_VERSION($(UBUNTU_KERNEL_CODE))" endif -- 1.9.1
[dpdk-dev] [PATCH v2] eal: Allow combining -m and --no-huge
Needed to run as non-root but with higher memory allocations, and removes a constraint on no-huge mode being limited to 64M. A usage example is if running with file input with the pcap PMD, which can be done as non-root after this patch via e.g., ./test-dpdk --no-huge -m 1024 -l 0,1 -n3 --vdev 'eth_pcap0,rx_pcap=/tmp/eth-rx.pcap,tx_pcap=/tmp/eth-tx.pcap' Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- v2: * Remove unneeded parentheses and merge lines * Patch prefix now eal: * Add example and more description (from David Marchand) lib/librte_eal/common/eal_common_options.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c index 8fcb1ab..1f459ac 100644 --- a/lib/librte_eal/common/eal_common_options.c +++ b/lib/librte_eal/common/eal_common_options.c @@ -850,9 +850,8 @@ eal_check_common_options(struct internal_config *internal_cfg) "be specified at the same time\n"); return -1; } - if (internal_cfg->no_hugetlbfs && - (mem_parsed || internal_cfg->force_sockets == 1)) { - RTE_LOG(ERR, EAL, "Options -m or --"OPT_SOCKET_MEM" cannot " + if (internal_cfg->no_hugetlbfs && internal_cfg->force_sockets == 1) { + RTE_LOG(ERR, EAL, "Option --"OPT_SOCKET_MEM" cannot " "be specified together with --"OPT_NO_HUGE"\n"); return -1; } -- 1.9.1
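The behavioural change in the option check is small enough to show as a before/after pair. This is a minimal sketch, assuming a toy configuration structure: `struct toy_cfg`, `check_old` and `check_new` are hypothetical names that model only the condition touched by the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the two internal_config fields the check reads. */
struct toy_cfg {
	bool no_hugetlbfs;   /* --no-huge was given */
	bool force_sockets;  /* --socket-mem was given */
};

/* Condition before the patch: -m or --socket-mem conflict with --no-huge. */
static int check_old(const struct toy_cfg *cfg, bool mem_parsed)
{
	assert(cfg != NULL);
	if (cfg->no_hugetlbfs && (mem_parsed || cfg->force_sockets))
		return -1;
	return 0;
}

/* Condition after the patch: only --socket-mem conflicts with --no-huge. */
static int check_new(const struct toy_cfg *cfg)
{
	assert(cfg != NULL);
	if (cfg->no_hugetlbfs && cfg->force_sockets)
		return -1;
	return 0;
}
```

For a command line such as `--no-huge -m 1024`, `check_old` rejects the combination while `check_new` accepts it; `--no-huge` together with `--socket-mem` is rejected by both.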
[dpdk-dev] [PATCH] eal_common_options: Allow combining -m and --no-huge
Needed to run as non-root but with higher memory allocations. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- lib/librte_eal/common/eal_common_options.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c index 8fcb1ab..89b867d 100644 --- a/lib/librte_eal/common/eal_common_options.c +++ b/lib/librte_eal/common/eal_common_options.c @@ -851,8 +851,8 @@ eal_check_common_options(struct internal_config *internal_cfg) return -1; } if (internal_cfg->no_hugetlbfs && - (mem_parsed || internal_cfg->force_sockets == 1)) { - RTE_LOG(ERR, EAL, "Options -m or --"OPT_SOCKET_MEM" cannot " + (internal_cfg->force_sockets == 1)) { + RTE_LOG(ERR, EAL, "Option --"OPT_SOCKET_MEM" cannot " "be specified together with --"OPT_NO_HUGE"\n"); return -1; } -- 1.9.1
[dpdk-dev] [PATCH] rte_reorder: Allow sequence numbers > 0 as starting point
We use sequence numbers from a generator which has potentially started long before the receiver. Therefore, the first number will typically be > 0. The rte_reorder code will not work in this case, since the packet is seen as outside of the buffer. The patch instead records the first sequence number inserted as the starting point. Signed-off-by: Simon Kagstrom Signed-off-by: Johan Faltstrom --- lib/librte_reorder/rte_reorder.c | 8 1 file changed, 8 insertions(+) diff --git a/lib/librte_reorder/rte_reorder.c b/lib/librte_reorder/rte_reorder.c index dc0e806..4d6449e 100644 --- a/lib/librte_reorder/rte_reorder.c +++ b/lib/librte_reorder/rte_reorder.c @@ -73,6 +73,8 @@ struct rte_reorder_buffer { unsigned int memsize; /**< memory area size of reorder buffer */ struct cir_buffer ready_buf; /**< temp buffer for dequeued entries */ struct cir_buffer order_buf; /**< buffer used to reorder entries */ + + int is_initialized; } __rte_cache_aligned; static void @@ -325,6 +327,12 @@ rte_reorder_insert(struct rte_reorder_buffer *b, struct rte_mbuf *mbuf) uint32_t offset, position; struct cir_buffer *order_buf = &b->order_buf; + if (!b->is_initialized) { + b->min_seqn = mbuf->seqn; + + b->is_initialized = 1; + } + /* * calculate the offset from the head pointer we need to go. * The subtraction takes care of the sequence number wrapping. -- 1.9.1
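The effect of recording the first sequence number can be illustrated with a reduced model of the window test. This is a sketch only: `struct toy_reorder` and `toy_in_window` are hypothetical, and the model keeps just the offset calculation, leaving out the circular buffers and the early/late packet handling of the real library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the reorder window: start of window, size, and init flag. */
struct toy_reorder {
	uint32_t min_seqn;   /* lowest sequence number still expected */
	uint32_t size;       /* number of slots in the window */
	int is_initialized;  /* set once the first packet has been seen */
};

/* Return 1 if 'seqn' falls inside the window, 0 otherwise. */
static int toy_in_window(struct toy_reorder *b, uint32_t seqn)
{
	uint32_t offset;

	assert(b != NULL);

	/* The first packet seen defines where the window starts. */
	if (!b->is_initialized) {
		b->min_seqn = seqn;
		b->is_initialized = 1;
	}

	/* Unsigned subtraction takes care of sequence number wrapping. */
	offset = seqn - b->min_seqn;
	return offset < b->size;
}
```

If the window started at 0 with 64 slots, a first packet numbered 1000000 would fall outside it; with the initialization step the same packet becomes offset 0, and later packets are measured relative to it.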
[dpdk-dev] [PATCH / RFC] kni: Add set_rx_mode callback to handle multicast groups
This is needed to add / remove interfaces in multicast groups via the ip tool. The callback does nothing - the same as the kernel tun.c. Signed-off-by: Simon Kagstrom --- Marked RFC since I'm by no means an expert on this. We noticed this when playing with KNI and IGMP handling. lib/librte_eal/linuxapp/kni/kni_net.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c b/lib/librte_eal/linuxapp/kni/kni_net.c index dd95db5..cf93c4b 100644 --- a/lib/librte_eal/linuxapp/kni/kni_net.c +++ b/lib/librte_eal/linuxapp/kni/kni_net.c @@ -495,6 +495,11 @@ kni_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd) return 0; } +static void +kni_net_set_rx_mode(struct net_device *dev) +{ +} + static int kni_net_change_mtu(struct net_device *dev, int new_mtu) { @@ -645,6 +650,7 @@ static const struct net_device_ops kni_net_netdev_ops = { .ndo_start_xmit = kni_net_tx, .ndo_change_mtu = kni_net_change_mtu, .ndo_do_ioctl = kni_net_ioctl, + .ndo_set_rx_mode = kni_net_set_rx_mode, .ndo_get_stats = kni_net_stats, .ndo_tx_timeout = kni_net_tx_timeout, .ndo_set_mac_address = kni_net_set_mac, -- 1.9.1
[dpdk-dev] [PATCH] librte_eal: Allow combining --no-huge with -m XXX
Useful to run applications in usermode via a test driver. Signed-off-by: Simon Kagstrom --- Not sure if there are other implications of this, so please check! lib/librte_eal/common/eal_common_options.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c index 4319549..af865f5 100644 --- a/lib/librte_eal/common/eal_common_options.c +++ b/lib/librte_eal/common/eal_common_options.c @@ -850,7 +850,7 @@ eal_check_common_options(struct internal_config *internal_cfg) return -1; } if (internal_cfg->no_hugetlbfs && - (mem_parsed || internal_cfg->force_sockets == 1)) { + (internal_cfg->force_sockets == 1)) { RTE_LOG(ERR, EAL, "Options -m or --"OPT_SOCKET_MEM" cannot " "be specified together with --"OPT_NO_HUGE"\n"); return -1; -- 1.9.1