Re: [dpdk-users] Query on handling packets

2018-11-19 Thread Wiles, Keith



> On Nov 17, 2018, at 4:05 PM, Kyle Larose  wrote:
> 
> On Sat, Nov 17, 2018 at 5:22 AM Harsh Patel  wrote:
>> 
>> Hello,
>> Thanks a lot for going through the code and providing us with so much
>> information.
>> We removed all the memcpy/malloc from the data path as you suggested and
> ...
>> After removing this, we are able to see a performance gain but not as good
>> as raw socket.
>> 
> 
> You're using an unordered_map to map your buffer pointers back to the
> mbufs. While it may not do a memcpy all the time, it will likely end
> up doing a malloc arbitrarily when you insert or remove entries from
> the map. If it needs to resize the table, it'll be even worse. You may
> want to consider using librte_hash:
> https://doc.dpdk.org/api/rte__hash_8h.html instead. Or, even better,
> see if you can design the system to avoid needing to do a lookup like
> this. Can you return a handle with the mbuf pointer and the data
> together?
> 
> You're also using floating point math where it's unnecessary (the
> timing check). Just multiply the numerator by 100 prior to doing
> the division. I doubt you'll overflow a uint64_t with that. Floating
> point is not as efficient as integer math, though I'm not sure offhand
> it'd cause a major perf problem.
> 
> One final thing: using a raw socket, the kernel will take over
> transmitting and receiving to the NIC itself. That means it is free to
> use multiple CPUs for the rx and tx. I notice that you only have one
> rx/tx queue, meaning at most one CPU can send and receive packets.
> When running your performance test with the raw socket, you may want
> to see how busy the system is doing packet sends and receives. Is it
> using more than one CPU's worth of processing? Is it using less, but
> when combined with your main application's usage, the overall system
> is still using more than one?
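
The integer-math suggestion quoted above can be sketched roughly as follows.
The original timing check is not shown in this thread, so the helper names
(elapsed_percent, reached_percent) and their parameters are hypothetical; the
only point is that multiplying by 100 before dividing gives a whole-number
percentage without any floating point.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: elapsed time as a whole-number percentage of a budget.
// Multiplying first keeps the precision an early integer division would lose.
uint64_t elapsed_percent(uint64_t elapsed, uint64_t budget)
{
    assert(budget != 0);              // caller must supply a non-zero budget
    return (elapsed * 100) / budget;  // assumes elapsed * 100 fits in 64 bits
}

// Hypothetical threshold check with no division at all:
// true once `elapsed` has reached `pct` percent of `budget`.
bool reached_percent(uint64_t elapsed, uint64_t budget, uint64_t pct)
{
    return elapsed * 100 >= budget * pct;
}
```

If only a threshold comparison is needed, the second form avoids the divide
entirely, at the cost of one extra multiply.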

On the floating point math: I would remove all of it and use the rte_rdtsc()
function to work in cycles, using something like:

/* One 16th of a second; use a power of two (2/4/8/16/32) to keep the divide simple */
uint64_t cur_tsc, next_tsc, timo = (rte_get_timer_hz() / 16);

cur_tsc = rte_rdtsc();

next_tsc = cur_tsc + timo; /* next_tsc is now the next time to flush */

while(1) {
    cur_tsc = rte_rdtsc();
    if (cur_tsc >= next_tsc) {
        flush();
        next_tsc += timo;
    }
    /* Do other stuff */
}
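
For anyone who wants to try the deadline logic without DPDK, here is a minimal
stand-alone sketch of the same pattern. FlushTimer is a made-up name, and the
cycle count is passed in by the caller; in the real data path it would come
from rte_rdtsc() and the interval from the timer frequency.

```cpp
#include <cstdint>

// Hypothetical stand-in for the flush timer state (not a DPDK type).
struct FlushTimer {
    uint64_t next;      // cycle count at which the next flush is due
    uint64_t interval;  // cycles between flushes

    FlushTimer(uint64_t now, uint64_t interval_cycles)
        : next(now + interval_cycles), interval(interval_cycles) {}

    // Returns true when a flush is due and advances the deadline.
    bool expired(uint64_t now)
    {
        if (now < next)
            return false;
        next += interval;  // fixed schedule: deadline moves by one interval
        return true;
    }
};
```

Advancing with `next += interval` keeps a fixed cadence, but after a stall of
several intervals expired() will return true on consecutive calls until the
deadline catches up. Resetting with `next = now + interval` instead avoids
that burst at the cost of letting the schedule drift.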

For the m_bufPktMap I would use rte_hash, or not use a hash at all: take the
buffer address and subtract the mbuf header size and the headroom to get back
to the mbuf:

mbuf = (struct rte_mbuf *)RTE_PTR_SUB(buf, sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);


DpdkNetDevice::Write(uint8_t *buffer, size_t length)
{
    struct rte_mbuf *pkt;
    uint64_t cur_tsc;

    pkt = (struct rte_mbuf *)RTE_PTR_SUB(buffer, sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);

    /* No need to test pkt, but buffer may be tested to make sure it is
     * not null before the math above */

    pkt->pkt_len = length;
    pkt->data_len = length;

    rte_eth_tx_buffer(m_portId, 0, m_txBuffer, pkt);

    cur_tsc = rte_rdtsc();

    /* next_tsc is a private variable */
    if (cur_tsc >= next_tsc) {
        /* hardcoded the queue id, should be fixed */
        rte_eth_tx_buffer_flush(m_portId, 0, m_txBuffer);
        /* timo is a fixed number of cycles to wait */
        next_tsc = cur_tsc + timo;
    }
    return length;
}

DpdkNetDevice::Read()
{
    struct rte_mbuf *pkt;

    /* Refill only once every packet of the previous burst has been handed out */
    if (m_rxBuffer->next >= m_rxBuffer->length) {
        m_rxBuffer->next = 0;
        m_rxBuffer->length = rte_eth_rx_burst(m_portId, 0, m_rxBuffer->pkts, MAX_PKT_BURST);

        if (m_rxBuffer->length == 0)
            return std::make_pair(NULL, -1);
    }

    pkt = m_rxBuffer->pkts[m_rxBuffer->next++];

    /* do not use rte_pktmbuf_read() as it does a copy for the complete packet */

    return std::make_pair(rte_pktmbuf_mtod(pkt, char *), pkt->pkt_len);
}

void
DpdkNetDevice::FreeBuf(uint8_t *buf)
{
    struct rte_mbuf *pkt;

    if (!buf)
        return;
    pkt = (struct rte_mbuf *)RTE_PTR_SUB(buf, sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);

    rte_pktmbuf_free(pkt);
}

When your code is done with the buffer, convert the buffer address back to an
rte_mbuf pointer and call rte_pktmbuf_free(pkt). This should eliminate the
copy and floating point code. Converting my C code to C++: priceless :-)

Hopefully the buffer address passed in is the original buffer address and has
not been adjusted.
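
To make that caveat concrete, here is a small DPDK-free sketch of the pointer
arithmetic. MockMbuf, kHeadroom, kDataRoom and the mock_* helpers are all made
up for illustration and are not the real rte_mbuf layout. In real DPDK the
offset also depends on any per-mbuf private area configured on the mempool,
which this mock ignores.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Mock layout, NOT the real rte_mbuf: header, fixed headroom, then data.
struct MockMbuf {
    uint32_t pkt_len;
    uint16_t data_len;
};

constexpr std::size_t kHeadroom = 128;   // illustrative value only
constexpr std::size_t kDataRoom = 2048;  // illustrative value only

// Allocate one header + headroom + data area as a single block.
MockMbuf *mock_alloc()
{
    void *p = std::malloc(sizeof(MockMbuf) + kHeadroom + kDataRoom);
    return static_cast<MockMbuf *>(p);
}

// Start of packet data: just past the header and the headroom.
uint8_t *mock_data(MockMbuf *m)
{
    return reinterpret_cast<uint8_t *>(m) + sizeof(MockMbuf) + kHeadroom;
}

// Inverse of mock_data(); only valid for the unmodified data pointer.
MockMbuf *mock_from_data(uint8_t *buf)
{
    return reinterpret_cast<MockMbuf *>(buf - sizeof(MockMbuf) - kHeadroom);
}
```

The round trip only works for the exact pointer that mock_data() returned. If
the application advances the pointer, for example to skip a 14-byte Ethernet
header, the subtraction lands 14 bytes past the header and the result must not
be freed or dereferenced.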


Regards,
Keith



Re: [dpdk-users] [ovs-dev] Packet Drop Issue in OVS-DPDK L2FWD Application

2018-11-19 Thread Ian Stokes

On 11/18/2018 8:16 PM, vkrishnabhat k wrote:

Hi Team,

I am new to OVS and DPDK. While using the l2fwd application with OVS and
DPDK I am seeing packet drops in the OVS bridge.

Topology: I am running Ubuntu 18.04 LTS with Qemu-KVM 2.11.1 and OVS-DPDK.
Please find the detailed topology attached to this mail. I have bound two
NICs (Intel 82599ES 10-gigabit) to the DPDK IGB_UIO driver and added the
same ports to the OVS bridge "br0". I am trying to send bidirectional
traffic from both ports and measure the throughput of the l2fwd
application.




Could you please help me with the questions below to understand l2fwd better?



Hi, just a few questions to clarify. I assume you mean you are running
the DPDK sample app 'l2fwd' in a virtual machine that is also connected
to bridge br0 via a vhostuser port?



What is the reason for the packet drops in the OVS bridge?
What is the expected throughput for l2fwd?
How can I improve the performance of l2fwd to get better throughput?
Can l2fwd handle layer 7 traffic?

I tried tuning performance by adding more Rx queues and increasing the Rx
queue size as per the link
"http://docs.openvswitch.org/en/latest/intro/install/dpdk/", but it didn't
help much.



Can you say which versions of OVS and DPDK are being used in the host
and VM instances?



I have attached screen shots of the Topology, DPDK port statistics, OVS
configurations with this mail.


I don't see these attached; they may have been filtered. Could you copy
their output into the mail as text?


Thanks
Ian



It would be really great if you could help me with this.

Looking forward to hearing from you.

Thanks in advance.

Regards,
Venkat


