Hi Nagadheeraj,

I am no expert in crypto development, so please correct me if I am wrong: 
my impression from this series is that barriers are used too often, too 
heavily, and in places where they are not needed. 

For enqueue operations, as I understand it, the stores go to the DMA buffer; 
the crypto device fetches the queue entries and updates them after processing, 
and the entries are then dequeued by the other CPU cores. So enqueue needs an 
rte_io_wmb before ringing the doorbell, and an rte_smp_wmb to ensure the 
enqueue stores are complete before the consumer on the other side (the one 
that dequeues) sees the updated pending_count. For dequeue operations, an 
rte_smp_rmb is required after reading pending_count so that the queue content 
read afterwards is intact. (If a queue entry has not been handled by the 
crypto device yet, its status will show that; an rte_io_rmb may be required 
to ensure the status is read first.)
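
A rough sketch of the barrier placement described above, not taken from the 
driver. The struct, field and function names are hypothetical, and the 
sketch_* macros are stand-ins built on C11 fences so the example compiles 
without DPDK; real code would call rte_io_wmb(), rte_smp_wmb() and 
rte_smp_rmb(). The device status poll and ring-full handling are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-ins (assumption): real code would use the rte_* barriers. */
#define sketch_io_wmb()  atomic_thread_fence(memory_order_release)
#define sketch_smp_wmb() atomic_thread_fence(memory_order_release)
#define sketch_smp_rmb() atomic_thread_fence(memory_order_acquire)

#define RING_SIZE 8u

/* Hypothetical queue layout, only for illustration. */
struct sketch_queue {
    uint64_t ring[RING_SIZE];        /* stands in for the DMA command ring */
    volatile uint32_t doorbell;      /* stands in for the doorbell register */
    _Atomic uint32_t pending_count;  /* shared with the dequeuing core */
    uint32_t head;                   /* touched by the consumer only */
    uint32_t tail;                   /* touched by the producer only */
};

static void sketch_enqueue(struct sketch_queue *q, uint64_t cmd)
{
    /* The caller must guarantee a free slot; ordering for slot reuse is
     * outside the scope of this sketch. */
    assert(atomic_load_explicit(&q->pending_count,
                                memory_order_relaxed) < RING_SIZE);

    q->ring[q->tail % RING_SIZE] = cmd;  /* store to the DMA buffer */
    q->tail++;

    sketch_io_wmb();                     /* entry visible before doorbell */
    q->doorbell = 1;

    sketch_smp_wmb();                    /* entry visible before the count */
    atomic_fetch_add_explicit(&q->pending_count, 1, memory_order_relaxed);
}

static int sketch_dequeue(struct sketch_queue *q, uint64_t *out)
{
    if (atomic_load_explicit(&q->pending_count, memory_order_relaxed) == 0)
        return -1;                       /* nothing pending */

    sketch_smp_rmb();                    /* count read before queue content */
    *out = q->ring[q->head % RING_SIZE];
    q->head++;

    atomic_fetch_sub_explicit(&q->pending_count, 1, memory_order_relaxed);
    return 0;
}
```

The point is that each path needs only one barrier per ordering requirement: 
one before the doorbell, one before publishing the count, and one after 
observing it.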

The rte_smp_xmb barriers could even be replaced with C11 atomics, but that 
can be a next step. 
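
For illustration only, a minimal sketch of what that C11 variant might look 
like, assuming the counter is the only synchronization point between the 
enqueuing and dequeuing cores. The single-slot struct and the c11_* names are 
hypothetical and much simpler than a real queue pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical single-slot queue, only for illustration. */
struct c11_queue {
    uint64_t slot;                   /* payload written by the producer */
    _Atomic uint32_t pending_count;  /* 0 = empty, 1 = one entry pending */
};

static void c11_publish(struct c11_queue *q, uint64_t value)
{
    /* Single slot: the previous entry must have been consumed already. */
    assert(atomic_load_explicit(&q->pending_count,
                                memory_order_acquire) == 0);

    q->slot = value;
    /* Release replaces a separate write barrier: the slot store is
     * ordered before the counter update. */
    atomic_fetch_add_explicit(&q->pending_count, 1, memory_order_release);
}

static int c11_consume(struct c11_queue *q, uint64_t *out)
{
    /* Acquire replaces a separate read barrier: the slot load is
     * ordered after the counter load. */
    if (atomic_load_explicit(&q->pending_count, memory_order_acquire) == 0)
        return -1;

    *out = q->slot;
    /* Release so the slot read completes before the slot is handed back. */
    atomic_fetch_sub_explicit(&q->pending_count, 1, memory_order_release);
    return 0;
}
```

The ordering is attached to the counter accesses themselves, so no 
standalone rte_smp_wmb/rte_smp_rmb call is needed around them.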

Best Regards,
Gavin
