On Mon, May 24, 2010 at 09:37:05AM +0300, Avi Kivity wrote:
> On 05/23/2010 07:30 PM, Michael S. Tsirkin wrote:
>>
>>
>>>> Maybe we should use atomics on index then?
>>>>
>>>>
>>> This should only be helpful if you access the cacheline several times in
>>> a row. That's not the case in virtio (or here).
>>>
>> So why does it help?
>>
>
> We actually do access the cacheline several times in a row here (but not
> in virtio?):
>
>> case SHARE:
>> while (count < MAX_BOUNCES) {
>> /* Spin waiting for other side to change it. */
>> while (counter->cacheline1 != count);
>>
>
> Broadcast a read request.
>
>> count++;
>> counter->cacheline1 = count;
>>
>
> Broadcast an invalidate request.
>
>> count++;
>> }
>> break;
>>
>> case LOCKSHARE:
>> while (count < MAX_BOUNCES) {
>> /* Spin waiting for other side to change it. */
>> while (__sync_val_compare_and_swap(&counter->cacheline1, count, count+1) != count);
>>
>
> Broadcast a 'read for ownership' request.
>
>> count += 2;
>> }
>> break;
>>
>
> So RMW should certainly be faster using single-instruction RMW
> operations (or using prefetchw).
Okay, but why is lockunshare faster than unshare?
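
For anyone who wants to poke at the share/lockshare difference outside the
original test, here is a minimal sketch of the two quoted loops. It is not the
original benchmark. The names bounce_arg, bounce_thread and run_bounce are
made up for illustration, the max parameter stands in for MAX_BOUNCES, and C11
atomics are swapped in for the plain accesses and for
__sync_val_compare_and_swap so that the compiler cannot hoist the spin load.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; only the loop shapes follow the quoted test. */
enum mode { SHARE, LOCKSHARE };

struct bounce_arg {
    _Atomic long *line;  /* shared word, stands in for counter->cacheline1 */
    long start;          /* 0 for one side, 1 for the other */
    long max;            /* stands in for MAX_BOUNCES */
    enum mode mode;
};

static void *bounce_thread(void *p)
{
    struct bounce_arg *a = p;
    long count = a->start;

    while (count < a->max) {
        if (a->mode == SHARE) {
            /* Spin with plain loads, then do a separate store:
             * the line is first read, then written. */
            while (atomic_load_explicit(a->line, memory_order_acquire) != count)
                ;
            atomic_store_explicit(a->line, count + 1, memory_order_release);
        } else {
            /* Spin on a single compare-and-swap, so every attempt is a
             * read-modify-write on the line. */
            long expected = count;
            while (!atomic_compare_exchange_strong(a->line, &expected, count + 1))
                expected = count;  /* a failed CAS overwrites expected */
        }
        count += 2;  /* the other side handles the value in between */
    }
    return NULL;
}

/* Bounce the word between the calling thread and one helper thread.
 * Returns the final value of the word, or -1 if the helper cannot start. */
long run_bounce(enum mode mode, long max)
{
    _Atomic long line = 0;
    struct bounce_arg a = { &line, 0, max, mode };
    struct bounce_arg b = { &line, 1, max, mode };
    pthread_t helper;

    if (pthread_create(&helper, NULL, bounce_thread, &b) != 0)
        return -1;
    bounce_thread(&a);
    pthread_join(helper, NULL);
    return atomic_load(&line);
}
```

Timing run_bounce(SHARE, n) against run_bounce(LOCKSHARE, n), with the two
threads pinned to different cores, gives a rough version of the comparison.
The numbers will not necessarily match the ones discussed in this thread, and
the unshare/lockunshare cases are not covered by this sketch.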
> --
> Do not meddle in the internals of kernels, for they are subtle and quick to
> panic.