"Ronald S. Bultje" <[email protected]> writes:

> Hi,
>
> On Thu, Jul 26, 2012 at 9:05 AM, Måns Rullgård <[email protected]> wrote:
>> "Ronald S. Bultje" <[email protected]> writes:
>>> On Thu, Jul 26, 2012 at 7:30 AM, Martin Storsjö <[email protected]> wrote:
>>>> On Thu, 26 Jul 2012, Ronald S. Bultje wrote:
>>>>> On Thu, Jul 26, 2012 at 2:06 AM, Diego Biurrun <[email protected]> wrote:
>>>>>> On Thu, Jul 26, 2012 at 05:10:10AM +0200, Luca Barbato wrote:
>>>>>>> On 07/26/2012 04:27 AM, Ronald S. Bultje wrote:
>>>>>>>> From: "Ronald S. Bultje" <[email protected]>
>>>>>>>>
>>>>>>>> ---
>>>>>>>>  libswscale/swscale.c |    2 +-
>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>>
>>>>>>> Ok.
>>>>>>
>>>>>>
>>>>>> No, not OK.  This is just a repackaged piece of another patch that
>>>>>> has review questions that were never answered.  Until those questions
>>>>>> are settled, this cannot go in.
>>>>>
>>>>>
>>>>> I've looked at all emails in:
>>>>> http://comments.gmane.org/gmane.comp.video.libav.devel/28861
>>>>>
>>>>> including yours:
>>>>> http://permalink.gmane.org/gmane.comp.video.libav.devel/28871
>>>>>
>>>>> and Mans':
>>>>> http://permalink.gmane.org/gmane.comp.video.libav.devel/28863
>>>>>
>>>>> My original mail has the "fence" part in it (simply ctrl-F in your
>>>>> browser), and neither you nor Mans respond to that particular section.
>>>>> So I'm lost now. What is the specific comment you want me to respond
>>>>> to?
>>>>
>>>>
>>>> http://article.gmane.org/gmane.comp.video.libav.devel/30834
>>>
>>> If someone feels like rewriting swscale, I'm all supportive of that
>>> effort. For now, sws uses movntq in its inline assembly mmx/3dnow
>>> optimizations and we'll have to deal with it until someone changes it
>>> not to do that.
>>>
>>> Doing it in generic code is silly because in practice there is never
>>> any advantage to doing movntq. Thus, we should discourage its use.
>>> Adding generic versions of sfence does not contribute to that. The
>>> whole goal - back when I worked on sws - was to kill all these old
>>> mmx/3dnow optimizations and replace with modern sse2/avx, which would
>>> mean we don't need a call to sfence anymore anyways.
>>
>> I'm still missing an explanation of why sfence is needed here other than
>> movntq somehow being involved.
>
> My understanding is that if you use movntq and not sfence, the data
> may not be in the destination memory pointer by the time swScale()
> returns.

The movntq (and movntps) instructions do not write-allocate caches, the
theory being that if the data won't be needed again soon, bringing it
into the cache will do more harm than good (by evicting something
potentially more useful).  These stores are also weakly ordered with
respect to other memory operations as seen by other CPUs in the system.
If a CPU writes some data using movntq while signalling progress to
other CPUs by updating a shared variable, another CPU might observe the
progress update before the actual data unless a memory barrier (fence in
Intel parlance) is inserted between the two operations.
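
As an illustration only, here is a minimal C sketch of that
producer/consumer pattern.  It is not taken from swscale: the names
buf, ready, produce() and consume() are hypothetical, and it uses the
SSE2 intrinsic _mm_stream_si32 (movnti), which is assumed to stand in
for movntq since both are non-temporal stores.  On builds without SSE2
it falls back to plain stores.  The two functions are meant to run on
different threads; thread creation is left out.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_stream_si32, _mm_sfence */
#endif

#define N 1024

static int buf[N];              /* payload written by the producer */
static atomic_int ready;        /* shared "progress" variable */

/* Producer: write the payload with non-temporal stores, then publish. */
static void produce(int value)
{
    for (size_t i = 0; i < N; i++) {
#if defined(__SSE2__)
        _mm_stream_si32(&buf[i], value);   /* weakly ordered store */
#else
        buf[i] = value;                    /* ordinary store */
#endif
    }
#if defined(__SSE2__)
    /* Make the streaming stores visible before the flag update. */
    _mm_sfence();
#endif
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer: returns 1 and stores the payload sum once data is published,
 * 0 otherwise. */
static int consume(int64_t *sum)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return 0;
    int64_t s = 0;
    for (size_t i = 0; i < N; i++)
        s += buf[i];
    *sum = s;
    return 1;
}
```

Without the _mm_sfence() call, the consumer could in principle see
ready == 1 while some elements of buf still hold stale values.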

Portable multi-processing code must use synchronisation primitives such
as those provided by pthreads since most non-x86 architectures have much
weaker memory models and some kind of barrier is almost always needed.
Since ordinary x86 stores are already seen in program order by other
CPUs, store barriers are generally needed there only when movnt*
instructions are used, and it may well be that the usual synchronisation
primitives omit them so as not to hurt performance in the common case
where they are not required.  If this is
so, an sfence instruction must be manually inserted somewhere between
the last movntq and any possible inter-CPU communication.
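
One possible placement, sketched below under stated assumptions, is to
issue the fence at the end of the routine that used the streaming
stores, so that callers can rely on ordinary synchronisation afterwards.
The helper copy_streaming() is hypothetical and not part of swscale; it
uses the SSE2 intrinsic _mm_stream_si128 (movntdq) rather than movntq,
requires dst to be 16-byte aligned, and falls back to memcpy() where
SSE2 is not available.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_stream_si128, _mm_sfence */
#endif

/* Copy n bytes from src to dst.  dst must be 16-byte aligned. */
static void copy_streaming(uint8_t *dst, const uint8_t *src, size_t n)
{
#if defined(__SSE2__)
    size_t body = n & ~(size_t)15;         /* bytes handled 16 at a time */

    for (size_t i = 0; i < body; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_stream_si128((__m128i *)(dst + i), v);  /* non-temporal store */
    }
    /* Remaining tail bytes use ordinary stores. */
    memcpy(dst + body, src + body, n - body);

    /* Fence once, after the last streaming store and before returning,
     * so the fence lives next to the code that needs it. */
    _mm_sfence();
#else
    memcpy(dst, src, n);
#endif
}
```

With this arrangement the fence stays in the x86-specific path, and code
that never issues a non-temporal store does not pay for it.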

All in all, there seems to be a good chance the sfence is needed.
However, the question still remains why it is in generic code.

-- 
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
