On Thu, Dec 13, 2018 at 08:14:57AM -0800, Sagi Grimberg wrote:
>>>> +  if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
>>>> +          bio->bi_opf &= ~REQ_HIPRI;
>>>> +
>>>
>>> Maybe we can simply check (q->queue_flags & (1 << QUEUE_FLAG_POLL)) and
>>> avoid the extra atomic operation in the host path?
>>>
>>> Would it make sense?
>>
>> test_bit is not usually implemented as an atomic operation.
>>
>> Take a look at e.g.
>>
>> arch/x86/include/asm/bitops.h:constant_test_bit()
>
> Ah.. But its still read from volatile argument so still more expensive?

I don't think the volatile should make a difference.  I actually
compiled both versions and the test_bit version generates a movq + testl
insted of testb:

-       movq    120(%rbx), %rdx # MEM[(const long unsigned int *)q_38 + 120B], 
_135
-       testl   $524288, %edx   #, _135
+       testb   $8, 122(%rbx)   #, q_40->queue_flags

But actually generates a larger object:

 36966     9470      88   46524    b5bc blk-core.o-opencode
 36956     9470      88   46514    b5b2 blk-core.o-test-bit

No idea what is going there.

Reply via email to