On Oct 7, 2014, at 10:42 AM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
> Hi Keith,
>
>> -----Original Message-----
>> From: Wiles, Roger Keith [mailto:keith.wiles at windriver.com]
>> Sent: Tuesday, October 07, 2014 3:22 PM
>> To: Ananyev, Konstantin
>> Cc: dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
>>
>> On Oct 7, 2014, at 4:09 AM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
>>
>>>> -----Original Message-----
>>>> From: Wiles, Roger Keith [mailto:keith.wiles at windriver.com]
>>>> Sent: Monday, October 06, 2014 9:08 PM
>>>> To: Ananyev, Konstantin
>>>> Cc: dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
>>>>
>>>> Attaching to the list does not work. If you want the code, let me know; it
>>>> is only about 5K in size.
>>>>
>>>> On Oct 6, 2014, at 2:45 PM, Wiles, Roger Keith <keith.wiles at windriver.com> wrote:
>>>>
>>>>> On Oct 6, 2014, at 11:13 AM, Wiles, Roger Keith <keith.wiles at windriver.com> wrote:
>>>>>
>>>>>> On Oct 6, 2014, at 10:54 AM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
>>>>>>
>>>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
>>>>>>>> Sent: Monday, October 06, 2014 3:54 PM
>>>>>>>> To: Wiles, Roger Keith (Wind River)
>>>>>>>> Cc: dev at dpdk.org
>>>>>>>> Subject: Re: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
>>>>>>>>
>>>>>>>> On Mon, Oct 06, 2014 at 03:50:38PM +0100, Wiles, Roger Keith wrote:
>>>>>>>>> Hi Bruce,
>>>>>>>>>
>>>>>>>>> Do I need to reject the new routines, or just make sure the
>>>>>>>>> vector driver does not get updated to use them?
>>>>>>>>>
>>>>>>>> The new routines are probably useful in the general case. I see no issue
>>>>>>>> with having them in the code, so long as the vector driver is not modified
>>>>>>>> to use them.
>>>>>>>
>>>>>>> I'd say the same thing for the non-vector RX/TX PMD code paths too.
>>>>>>>
>>>>>>> BTW, are the new functions' comments valid?
>>>>>>>
>>>>>>> + * @return
>>>>>>> + *   - 0 if the number of mbufs allocated was ok
>>>>>>> + *   - <0 is an ERROR.
>>>>>>> + */
>>>>>>> +static inline int __rte_mbuf_raw_alloc_bulk(
>>>>>>>
>>>>>>> Though, as I can see, __rte_mbuf_raw_alloc_bulk() returns either:
>>>>>>> - the number of allocated mbufs (cnt)
>>>>>>> - a negative error code
>>>>>>
>>>>>> Let me fix up the comments.
>>>>>>
>>>>>>> And:
>>>>>>> + * @return
>>>>>>> + *   - The number of valid mbufs pointers in the m_list array.
>>>>>>> + *   - Zero if the request cnt could not be allocated.
>>>>>>> + */
>>>>>>> +static inline int __attribute__((always_inline))
>>>>>>> +rte_pktmbuf_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int16_t cnt)
>>>>>>> +{
>>>>>>> +	return __rte_mbuf_raw_alloc_bulk(mp, m_list, cnt);
>>>>>>> +}
>>>>>>>
>>>>>>> Shouldn't it be "less than zero if the request cnt could not be allocated"?
>>>>>>>
>>>>>>> BTW, is there any point to having __rte_mbuf_raw_alloc_bulk() at all?
>>>>>>> After all, as you are calling rte_pktmbuf_reset() inside it, it doesn't
>>>>>>> look __raw__ any more.
>>>>>>> It might be better to just put its contents into rte_pktmbuf_alloc_bulk()
>>>>>>> and get rid of it.
>>>>>>
>>>>>> I was just following the style of the non-bulk routine __rte_mbuf_raw_alloc(),
>>>>>> but I can pull that into a single routine.
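(As an aside on the comment fix: since the routine returns either cnt or a negative error code from rte_mempool_get_bulk(), the corrected @return block would presumably read something like the sketch below; this is only illustrative wording, not the final patch text.)

	 * @return
	 *   - The number of mbufs allocated (cnt) on success.
	 *   - A negative error code if the cnt mbufs could not be allocated.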
>>>>>>> Also, I wonder: what is the advantage of having multiple counters inside
>>>>>>> the same loop?
>>>>>>> i.e.:
>>>>>>> +	for(i = 0; i < cnt; i++) {
>>>>>>> +		m = *m_list++;
>>>>>>>
>>>>>>> Why not just:
>>>>>>>
>>>>>>> 	for(i = 0; i < cnt; i++) {
>>>>>>> 		m = &m_list[i];
>>>>>>>
>>>>>>> Same for free:
>>>>>>> +	while(npkts--)
>>>>>>> +		rte_pktmbuf_free(*m_list++);
>>>>>>>
>>>>>>> Why not just:
>>>>>>> 	for (i = 0; i < npkts; i++)
>>>>>>> 		rte_pktmbuf_free(&m_list[i]);
>>>>>>
>>>>>> Maybe I have it wrong, or the compilers are doing the right thing now,
>>>>>> but at one point the &m_list[i] form would cause the compiler to
>>>>>> generate a shift or multiply of 'i' and then add it to the base of
>>>>>> m_list. If that is not the case anymore, then I can update the code as
>>>>>> you suggested. Using *m_list++ just adds the size of a pointer to a
>>>>>> register and continues.
>>>>>
>>>>> I compared the clang assembler (.s file) output from an example test
>>>>> program I wrote to see if there are any differences in the code generated
>>>>> for the two styles, and I found no difference; the code looked the same.
>>>>> I am not an Intel assembler expert and I would suggest someone else
>>>>> determine if it generates different code. I tried to compare the GCC
>>>>> outputs and they looked the same to me as well.
>>>
>>> That was my question:
>>> Modern compilers are able to generate good code for a simple loop like the
>>> one above.
>>> So what is the point of using two iterators inside the loop when just one
>>> is enough?
>>> Nothing wrong technically, but it makes the code a bit harder to follow.
>>> Plus, in general, it is good practice to minimise the number of iterators
>>> inside a loop when possible.
>>>
>>> Konstantin
>>
>> Hi Konstantin,
>>
>> I really do not understand the concern if the generated code is the same, as
>> the current patch appears to me to be very clean and simple. Maybe you have
>> not seen the v2 patch, and now the v3 patch I sent this morning, which
>> addresses Bruce's comment.
>>
>> For the case of the free routine, your suggestion would require an extra
>> counter/variable and a bit more code: a 'for' loop instead of a 'while' loop.
>
> My point was that just one iterator for both loops is enough.
> In general, it is good practice to minimise the number of iterators per loop
> if possible: in some cases the compiler might get confused and not be able to
> eliminate the redundant iterator itself.

I learned a while back not to be a compiler, but a programmer :-) Nowadays the
compilers handle the basic cases we have here, and for the special cases we
need to be aware of how the compiler generates code. I agree that having fewer
iterators per loop is cleaner, but in this case I do not think it matters; for
reference, a sketch of the index-based form is included below.

> Though yes - technically there is nothing wrong with your approach.
> So if you prefer to keep it as it is - I wouldn't insist.
>
> Konstantin

>> +static inline void __attribute__((always_inline))
>> +rte_pktmbuf_free_bulk(struct rte_mbuf *m_list[], int16_t npkts)
>> +{
>> +	while(npkts--)
>> +		rte_pktmbuf_free(*m_list++);
>> +}
>>
>> For the case of the alloc routine, I did remove the 'struct rte_mbuf *m'
>> variable, and I now believe it is very clean; changing it to use index
>> variables is just a personal preference. A personal preference of this type
>> is not a useful reason to change IMO, and the current code does not cause
>> any harm. Unless you can suggest a good technical reason to change it, I am
>> going to leave the patch as is.
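(For reference, the single-iterator, index-based form being discussed would look roughly like the sketch below. It is untested and only meant to illustrate the style; note that m_list is declared as an array of mbuf pointers, so the element m_list[i] itself is passed to rte_pktmbuf_reset()/rte_pktmbuf_free(), not its address.)

	int i;

	for (i = 0; i < cnt; i++) {
		struct rte_mbuf *m = m_list[i];	/* index form instead of *m_list++ */
#ifdef RTE_MBUF_REFCNT
		rte_mbuf_refcnt_set(m, 1);
#endif /* RTE_MBUF_REFCNT */
		rte_pktmbuf_reset(m);
	}

	for (i = 0; i < npkts; i++)
		rte_pktmbuf_free(m_list[i]);	/* the array element is already a pointer */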
>>
>> +static inline int __attribute__((always_inline))
>> +rte_pktmbuf_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int16_t cnt)
>> +{
>> +	int ret;
>> +
>> +	ret = rte_mempool_get_bulk(mp, (void **)m_list, cnt);
>> +	if ( ret == 0 ) {
>> +		ret = cnt;
>> +		while(cnt--) {
>> +#ifdef RTE_MBUF_REFCNT
>> +			rte_mbuf_refcnt_set(*m_list, 1);
>> +#endif /* RTE_MBUF_REFCNT */
>> +			rte_pktmbuf_reset(*m_list++);
>> +		}
>> +	}
>> +	return ret;
>> +}
>>
>>>>> I have attached the code and output; please let me know if I did
>>>>> something wrong, but as it stands, using the original style is what I
>>>>> want to go with.
>>>>>
>>>>>>> Konstantin
>>>>>>>
>>>>>>>> /Bruce
>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> ++Keith
>>>>>>>>>
>>>>>>>>> On Oct 6, 2014, at 3:56 AM, Richardson, Bruce <bruce.richardson at intel.com> wrote:
>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Keith Wiles
>>>>>>>>>>> Sent: Sunday, October 05, 2014 12:10 AM
>>>>>>>>>>> To: dev at dpdk.org
>>>>>>>>>>> Subject: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
>>>>>>>>>>>
>>>>>>>>>>> Minor helper routines to mirror the mempool routines and remove the
>>>>>>>>>>> code from applications. The ixgbe_rxtx_vec.c routine could be changed
>>>>>>>>>>> to use the rte_pktmbuf_alloc_bulk() routine in place of
>>>>>>>>>>> rte_mempool_get_bulk().
>>>>>>>>>>>
>>>>>>>>>> I believe such a change would cause a performance regression, as the
>>>>>>>>>> extra init code in the alloc_bulk() function would take additional
>>>>>>>>>> cycles and is not needed. The vector routines use the mempool function
>>>>>>>>>> directly, so that there is no overhead of mbuf initialization, as the
>>>>>>>>>> vector routines use their additional "knowledge" of what the mbufs will
>>>>>>>>>> be used for to init them in a faster manner than can be done inside the
>>>>>>>>>> mbuf library.
>>>>>>>>>>
>>>>>>>>>> /Bruce
>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Keith Wiles <keith.wiles at windriver.com>
>>>>>>>>>>> ---
>>>>>>>>>>>  lib/librte_mbuf/rte_mbuf.h | 77 ++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>  1 file changed, 77 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>>>>>>>>>>> index 1c6e115..f298621 100644
>>>>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>>>>>>>> @@ -546,6 +546,41 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  /**
>>>>>>>>>>> + * @internal Allocate a list of mbufs from mempool *mp*.
>>>>>>>>>>> + * The use of that function is reserved for RTE internal needs.
>>>>>>>>>>> + * Please use rte_pktmbuf_alloc_bulk().
>>>>>>>>>>> + *
>>>>>>>>>>> + * @param mp
>>>>>>>>>>> + *   The mempool from which mbuf is allocated.
>>>>>>>>>>> + * @param m_list
>>>>>>>>>>> + *   The array to place the allocated rte_mbufs pointers.
>>>>>>>>>>> + * @param cnt
>>>>>>>>>>> + *   The number of mbufs to allocate
>>>>>>>>>>> + * @return
>>>>>>>>>>> + *   - 0 if the number of mbufs allocated was ok
>>>>>>>>>>> + *   - <0 is an ERROR.
>>>>>>>>>>> + */
>>>>>>>>>>> +static inline int __rte_mbuf_raw_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int cnt)
>>>>>>>>>>> +{
>>>>>>>>>>> +	struct rte_mbuf *m;
>>>>>>>>>>> +	int ret;
>>>>>>>>>>> +
>>>>>>>>>>> +	ret = rte_mempool_get_bulk(mp, (void **)m_list, cnt);
>>>>>>>>>>> +	if ( ret == 0 ) {
>>>>>>>>>>> +		int i;
>>>>>>>>>>> +		for(i = 0; i < cnt; i++) {
>>>>>>>>>>> +			m = *m_list++;
>>>>>>>>>>> +#ifdef RTE_MBUF_REFCNT
>>>>>>>>>>> +			rte_mbuf_refcnt_set(m, 1);
>>>>>>>>>>> +#endif /* RTE_MBUF_REFCNT */
>>>>>>>>>>> +			rte_pktmbuf_reset(m);
>>>>>>>>>>> +		}
>>>>>>>>>>> +		ret = cnt;
>>>>>>>>>>> +	}
>>>>>>>>>>> +	return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>>  * Allocate a new mbuf from a mempool.
>>>>>>>>>>>  *
>>>>>>>>>>>  * This new mbuf contains one segment, which has a length of 0. The pointer
>>>>>>>>>>> @@ -671,6 +706,32 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  /**
>>>>>>>>>>> + * Allocate a list of mbufs from a mempool into a mbufs array.
>>>>>>>>>>> + *
>>>>>>>>>>> + * This mbuf list contains one segment per mbuf, which has a length of 0. The pointer
>>>>>>>>>>> + * to data is initialized to have some bytes of headroom in the buffer
>>>>>>>>>>> + * (if buffer size allows).
>>>>>>>>>>> + *
>>>>>>>>>>> + * The routine is just a simple wrapper routine to reduce code in the application and
>>>>>>>>>>> + * provide a cleaner API for multiple mbuf requests.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @param mp
>>>>>>>>>>> + *   The mempool from which the mbuf is allocated.
>>>>>>>>>>> + * @param m_list
>>>>>>>>>>> + *   An array of mbuf pointers, cnt must be less then or equal to the size of the list.
>>>>>>>>>>> + * @param cnt
>>>>>>>>>>> + *   Number of slots in the m_list array to fill.
>>>>>>>>>>> + * @return
>>>>>>>>>>> + *   - The number of valid mbufs pointers in the m_list array.
>>>>>>>>>>> + *   - Zero if the request cnt could not be allocated.
>>>>>>>>>>> + */
>>>>>>>>>>> +static inline int __attribute__((always_inline))
>>>>>>>>>>> +rte_pktmbuf_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int16_t cnt)
>>>>>>>>>>> +{
>>>>>>>>>>> +	return __rte_mbuf_raw_alloc_bulk(mp, m_list, cnt);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>>  * Free a segment of a packet mbuf into its original mempool.
>>>>>>>>>>>  *
>>>>>>>>>>>  * Free an mbuf, without parsing other segments in case of chained
>>>>>>>>>>> @@ -708,6 +769,22 @@ static inline void rte_pktmbuf_free(struct rte_mbuf *m)
>>>>>>>>>>>  	}
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>> +/**
>>>>>>>>>>> + * Free a list of packet mbufs back into its original mempool.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Free a list of mbufs by calling rte_pktmbuf_free() in a loop as a wrapper function.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @param m_list
>>>>>>>>>>> + *   An array of rte_mbuf pointers to be freed.
>>>>>>>>>>> + * @param npkts
>>>>>>>>>>> + *   Number of packets to free in list.
>>>>>>>>>>> + */
>>>>>>>>>>> +static inline void rte_pktmbuf_free_bulk(struct rte_mbuf *m_list[], int16_t npkts)
>>>>>>>>>>> +{
>>>>>>>>>>> +	while(npkts--)
>>>>>>>>>>> +		rte_pktmbuf_free(*m_list++);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>  #ifdef RTE_MBUF_REFCNT
>>>>>>>>>>>
>>>>>>>>>>>  /**
>>>>>>>>>>> --
>>>>>>>>>>> 2.1.0

Keith Wiles, Principal Technologist with CTO office, Wind River
mobile 972-213-5533
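(For context, typical application usage of the two helpers in this patch would be along the lines of the sketch below. The burst size of 32, the mempool pointer mp, and the surrounding logic are illustrative assumptions, not part of the patch.)

	struct rte_mbuf *pkts[32];
	int nb, i;

	/* Take a burst of mbufs from the pool in a single call. */
	nb = rte_pktmbuf_alloc_bulk(mp, pkts, 32);
	if (nb <= 0)
		return;		/* nothing allocated: pool empty or error */

	for (i = 0; i < nb; i++) {
		/* ... build packet i in pkts[i] ... */
	}

	/* If the burst ends up not being handed off, return it in one call. */
	rte_pktmbuf_free_bulk(pkts, nb);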