Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Bernhard Praschinger
Hello

Christian Ebert wrote:
> * Trent Piepho on Friday, October 15, 2010 at 11:33:01 -0700
>> On Fri, Oct 15, 2010 at 10:58 AM, Christian Ebert wrote:
>>> Index: yuvdenoise/main.c
>>> ===================================================================
>>> RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
>>> retrieving revision 1.71
>>> diff -u -r1.71 main.c
>>> --- yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
>>> +++ yuvdenoise/main.c   15 Oct 2010 17:58:13 -0000
>>> @@ -1336,7 +1336,8 @@
>>>     mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
>>>     temporal_filter_planes = temporal_filter_planes_sse2;
>>>
>>> -   __asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");
>>> +/*__asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");*/
>>> +   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d) : "a"(0x80000001) : "ecx");
>>>
>>
>> Not quite right.
>
> Indeed not.
>
>> It should look exactly the same as the first one, with
>> "=&g"(tmp) as an output, except the number after the "a"
>> should still be 0x80000001.
>
> I corrected that in my other version which even seems to work in
> some small tests ;-) Thanks for your patience and explanations.

Thanks for finding the problem. I committed the second change to CVS. I
hope I did it right, so please check it. If it doesn't work for you,
please send me the version that does.

Hope to talk again soon,

Berni the Chaos of Woodquarter

Email: shadowl...@utanet.at
www: http://www.lysator.liu.se/~gz/bernhard

--
Download new Adobe(R) Flash(R) Builder(TM) 4
The new Adobe(R) Flex(R) 4 and Flash(R) Builder(TM) 4 (formerly 
Flex(R) Builder(TM)) enable the development of rich applications that run
across multiple browsers and platforms. Download your free trials today!
http://p.sf.net/sfu/adobe-dev2dev
___
Mjpeg-users mailing list
Mjpeg-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mjpeg-users


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Christian Ebert
* Trent Piepho on Friday, October 15, 2010 at 11:33:01 -0700
> On Fri, Oct 15, 2010 at 10:58 AM, Christian Ebert wrote:
>> Index: yuvdenoise/main.c
>> ===================================================================
>> RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
>> retrieving revision 1.71
>> diff -u -r1.71 main.c
>> --- yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
>> +++ yuvdenoise/main.c   15 Oct 2010 17:58:13 -0000
>> @@ -1336,7 +1336,8 @@
>>     mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
>>     temporal_filter_planes = temporal_filter_planes_sse2;
>>
>> -   __asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");
>> +/*__asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");*/
>> +   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d) : "a"(0x80000001) : "ecx");
>>
> 
> Not quite right.

Indeed not.

> It should look exactly the same as the first one, with
> "=&g"(tmp) as an output, except the number after the "a"
> should still be 0x80000001.

I corrected that in my other version which even seems to work in
some small tests ;-) Thanks for your patience and explanations.

c
-- 
\black\trash movie _SAME  TIME  SAME  PLACE_
 --->> http://www.blacktrash.org/underdogma/stsp.php
\black\trash audio   _ANOTHER  TIME  ANOTHER  PLACE_
--->> http://www.blacktrash.org/underdogma/atap.html



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Trent Piepho
On Fri, Oct 15, 2010 at 10:58 AM, Christian Ebert wrote:

> No! Trent nudged me in the right direction I believe. By applying
> the same fix to the other asm volatile line I was able to build
> it here as well. This change makes it build, but of course I've
> no idea what I was doing:
>
> Index: yuvdenoise/main.c
> ===================================================================
> RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
> retrieving revision 1.71
> diff -u -r1.71 main.c
> --- yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
> +++ yuvdenoise/main.c   15 Oct 2010 17:58:13 -0000
> @@ -1336,7 +1336,8 @@
>     mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
>     temporal_filter_planes = temporal_filter_planes_sse2;
>
> -   __asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");
> +/*__asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");*/
> +   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d) : "a"(0x80000001) : "ecx");
>

Not quite right.  It should look exactly the same as the first one, with
"=&g"(tmp) as an output, except the number after the "a"
should still be 0x80000001.
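
The reason for the extra output is how GCC numbers asm operands: outputs
come first, then inputs, counting from %0. With only "=d"(d) as an output,
%1 in the template names the "a" input, so ebx would be copied into eax
before cpuid runs. The toy function below (an illustration with made-up
names, not yuvdenoise code) shows the numbering with a scratch operand in
the same %1 position; the plain C branch is a stand-in for non-x86 targets.

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: copy `value` through a scratch operand. */
static uint32_t copy_via_scratch(uint32_t value)
{
    uint32_t result = 0, scratch = 0;
#if defined(__i386__) || defined(__x86_64__)
    /* %0 = result and %1 = scratch (outputs), %2 = value (input).
     * "&" (earlyclobber) keeps scratch out of any register used
     * for an input. */
    __asm__("movl %2, %1; movl %1, %0"
            : "=r"(result), "=&r"(scratch)
            : "r"(value));
#else
    scratch = value;
    result = scratch;
#endif
    return result;
}

/* Small self-check showing how the helper is meant to be called. */
static void check_operand_numbering(void)
{
    assert(copy_via_scratch(1u) == 1u);
}
```

Dropping the scratch output from the operand list would shift value into
the %1 slot, which is the same kind of mix-up as in the patch quoted above.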


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Christian Ebert
* Christian Ebert on Friday, October 15, 2010 at 19:49:02 +0200
> With this change, yuvdenoise builds again, but someone in the
> know should check whether I've broken other stuff:

And I did break it, but this seems to work:

Index: yuvdenoise/main.c
===================================================================
RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
retrieving revision 1.71
diff -u -r1.71 main.c
--- yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
+++ yuvdenoise/main.c   15 Oct 2010 18:24:12 -0000
@@ -1336,7 +1336,8 @@
    mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
    temporal_filter_planes = temporal_filter_planes_sse2;

-   __asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");
+/*__asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");*/
+   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d), "=&g"(tmp) : "a"(0x80000001) : "ecx");
    if ((d & (1 << 29))) {
        /* x86_64 processor */
        mjpeg_info("SETTING SSE2 for Median-Filter");
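
For reference, the two EDX bits tested in this part of main.c can be pulled
out with a small helper. This is only a sketch with made-up names: bit 26 of
CPUID leaf 1 is the SSE2 flag, and bit 29 of the extended leaf 0x80000001 is
the long-mode flag that the "x86_64 processor" comment refers to.

```c
#include <assert.h>
#include <stdint.h>

enum {
    EDX_BIT_SSE2 = 26,      /* CPUID leaf 1 */
    EDX_BIT_LONG_MODE = 29  /* CPUID leaf 0x80000001 */
};

/* Hypothetical helper: return bit `bit` (0..31) of an EDX value as 0 or 1. */
static int edx_bit(uint32_t edx, unsigned int bit)
{
    assert(bit < 32u);
    return (int)((edx >> bit) & 1u);
}
```

With such a helper the second test in the patch would read
edx_bit(d, EDX_BIT_LONG_MODE) rather than a bare (1 << 29) mask.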


-- 
\black\trash movie _SAME  TIME  SAME  PLACE_
 --->> http://www.blacktrash.org/underdogma/stsp.php
\black\trash audio   _ANOTHER  TIME  ANOTHER  PLACE_
--->> http://www.blacktrash.org/underdogma/atap.html



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Christian Ebert
* Christian Ebert on Friday, October 15, 2010 at 19:58:28 +0200
> * Bernhard Praschinger on Friday, October 15, 2010 at 19:17:13 +0200
>> I have at home a OSX 10.4.11 (PowerPC) and 10.6.4 (Intel) with gcc 4.2.1 
>> and it compiles without problems on both computers.
>> 
>> I did a "make clean" in the yuvdenoise directory and it compiled it 
>> (make) on both machines. So I'm a bit confused.
>> 
>> I need to look if I can install a 10.5.x using a virtual machine.
> 
> No! Trent nudged me in the right direction I believe. By applying
> the same fix to the other asm volatile line I was able to build
> it here as well. This change makes it build, but of course I've
> no idea what I was doing:
> 
> Index: yuvdenoise/main.c

This one actually seems to work:

Index: yuvdenoise/main.c
===================================================================
RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
retrieving revision 1.71
diff -u -r1.71 main.c
--- yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
+++ yuvdenoise/main.c   15 Oct 2010 18:22:09 -0000
@@ -1336,7 +1336,8 @@
    mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
    temporal_filter_planes = temporal_filter_planes_sse2;

-   __asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");
+/*__asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");*/
+   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d), "=&g"(tmp) : "a"(0x80000001) : "ecx");
    if ((d & (1 << 29))) {
        /* x86_64 processor */
        mjpeg_info("SETTING SSE2 for Median-Filter");

-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Christian Ebert
* Bernhard Praschinger on Friday, October 15, 2010 at 19:17:13 +0200
>>> I tested your better version. And it compiles here on my linux and Intel
>>> osx box. I did also a quick test with the new version on the linux box.
>>> And it works well.
>>> 
>>> So I would appreciate a feedback if it works on a mac.
>> 
>> Thanks for looking into this, but I get:
>> 
>> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
>> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
>> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
>> main.o main.c
>> main.c: In function ‘main’:
>> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
>> make: *** [main.o] Error 1
>> 
>> $ sw_vers
>> ProductName: Mac OS X
>> ProductVersion:  10.5.8
>> BuildVersion:9L30
> I have at home a OSX 10.4.11 (PowerPC) and 10.6.4 (Intel) with gcc 4.2.1 
> and it compiles without problems on both computers.
> 
> I did a "make clean" in the yuvdenoise directory and it compiled it 
> (make) on both machines. So I'm a bit confused.
> 
> I need to look if I can install a 10.5.x using a virtual machine.

No! Trent nudged me in the right direction I believe. By applying
the same fix to the other asm volatile line I was able to build
it here as well. This change makes it build, but of course I've
no idea what I was doing:

Index: yuvdenoise/main.c
===================================================================
RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
retrieving revision 1.71
diff -u -r1.71 main.c
--- yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
+++ yuvdenoise/main.c   15 Oct 2010 17:58:13 -0000
@@ -1336,7 +1336,8 @@
    mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
    temporal_filter_planes = temporal_filter_planes_sse2;

-   __asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");
+/*__asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");*/
+   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d) : "a"(0x80000001) : "ecx");
    if ((d & (1 << 29))) {
        /* x86_64 processor */
        mjpeg_info("SETTING SSE2 for Median-Filter");


-- 
\black\trash movie   _SAME  TIME  SAME  PLACE_
   New York, in the summer of 2001

--->> http://www.blacktrash.org/underdogma/stsp.php



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Christian Ebert
* Trent Piepho on Friday, October 15, 2010 at 10:19:20 -0700
> On Fri, Oct 15, 2010 at 12:40 AM, Christian Ebert wrote:
> 
>> * Trent Piepho on Thursday, October 14, 2010 at 17:38:39 -0700
>>> Looks like you didn't actually change the needed lines.
>> 
>> No, I didn't but Bernhard did:
>> +
>>  #if defined(__SSE2__)
>>     int d = 0;
>> -   __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx");
>> +/* __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx"); */
>> +   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d), "=&g"(tmp) : "a"(1) : "ecx");
>>     if ((d & (1 << 26))) {
>>         mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
>>         temporal_filter_planes = temporal_filter_planes_sse2;
>> 
> 
> The second cpuid call below this one needs to be fixed as well.

You're right of course; the error appeared for only one line as
well, should've noticed that.

With this change, yuvdenoise builds again, but someone in the
know should check whether I've broken other stuff:


Index: yuvdenoise/main.c
===================================================================
RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
retrieving revision 1.71
diff -u -r1.71 main.c
--- yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
+++ yuvdenoise/main.c   15 Oct 2010 17:48:20 -0000
@@ -1336,7 +1336,8 @@
    mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
    temporal_filter_planes = temporal_filter_planes_sse2;

-   __asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");
+/*__asm__ volatile("cpuid" : "=d"(d) : "a"(0x80000001) : "ebx", "ecx");*/
+   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d) : "a"(0x80000001) : "ecx");
    if ((d & (1 << 29))) {
        /* x86_64 processor */
        mjpeg_info("SETTING SSE2 for Median-Filter");


-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Trent Piepho
On Fri, Oct 15, 2010 at 12:40 AM, Christian Ebert wrote:

> * Trent Piepho on Thursday, October 14, 2010 at 17:38:39 -0700
> > On Thu, Oct 14, 2010 at 2:43 PM, Christian Ebert wrote:
> >
> >>>> Easy fix would just be the change the sse detection asm to save and
> >>>> restore ebx.
> >>>>
> >>>> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
> >>>> "ecx");
> >>>>
> >>>> or better
> >>>>
> >>>> uint32_t tmp;
> >>>> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
> >>>> "=&g"(tmp) : "a"(1) : "ecx");
> >>>>
> >>>> The latter is safer in general, as you can't use push or pop around any
> >>>> asm code that has a parameter with a constraint that allows memory
> >>>> references.  The memory reference might be relative to esp, in which
> >>>> case the push/pop would move it.  Or it might not be relative to esp, in
> >>>> which case the push/pop doesn't move it.  So there's no way to adjust
> >>>> for it.
> >>> I tested your better version. And it compiles here on my linux and Intel
> >>> osx box. I did also a quick test with the new version on the linux box.
> >>> And it works well.
> >>>
> >>> So I would appreciate a feedback if it works on a mac.
> >>
> >> Thanks for looking into this, but I get:
> >>
> >> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops
> >> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp
> >> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o
> >> main.o main.c
> >> main.c: In function ‘main’:
> >> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
> >> make: *** [main.o] Error 1
> >
> > Looks like you didn't actually change the needed lines.
>
> No, I didn't but Bernhard did:
> +
>  #if defined(__SSE2__)
>     int d = 0;
> -   __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx");
> +/* __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx"); */
> +   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d), "=&g"(tmp) : "a"(1) : "ecx");
>     if ((d & (1 << 26))) {
>         mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
>         temporal_filter_planes = temporal_filter_planes_sse2;
>

The second cpuid call below this one needs to be fixed as well.
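
One way to keep the two calls from drifting apart is to route both through
a single helper. The sketch below does that with __get_cpuid from the
<cpuid.h> header shipped with GCC and Clang, which is a different technique
from the hand-written asm in the patch. The struct and function names are
invented for this example; leaf 1, bit 26 is the SSE2 test from the patch,
and the extended leaf 0x80000001, bit 29 is the long-mode (x86_64) flag.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Hypothetical result type, not from yuvdenoise. */
struct accel_flags {
    int sse2;   /* CPUID leaf 1, EDX bit 26 */
    int x86_64; /* CPUID leaf 0x80000001, EDX bit 29 */
};

/* EDX for `leaf`, or 0 when the leaf (or CPUID itself) is unavailable. */
static unsigned int edx_for_leaf(unsigned int leaf)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(leaf, &eax, &ebx, &ecx, &edx))
        return edx;
#else
    (void)leaf;
#endif
    return 0;
}

/* Fill *out from both CPUID queries through the same code path. */
static void detect_accel(struct accel_flags *out)
{
    assert(out != NULL);
    out->sse2 = (int)((edx_for_leaf(1u) >> 26) & 1u);
    out->x86_64 = (int)((edx_for_leaf(0x80000001u) >> 29) & 1u);
}
```

Because every query goes through edx_for_leaf, a register-handling fix
only has to be made in one place.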


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Bernhard Praschinger
Hello


>> I tested your better version. And it compiles here on my linux and Intel
>> osx box. I did also a quick test with the new version on the linux box.
>> And it works well.
>>
>> So I would appreciate a feedback if it works on a mac.
>
> Thanks for looking into this, but I get:
>
> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
> main.o main.c
> main.c: In function ‘main’:
> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
> make: *** [main.o] Error 1
>
> $ sw_vers
> ProductName:  Mac OS X
> ProductVersion:   10.5.8
> BuildVersion: 9L30
At home I have OS X 10.4.11 (PowerPC) and 10.6.4 (Intel), both with gcc
4.2.1, and it compiles without problems on both computers.

I ran "make clean" in the yuvdenoise directory and then "make", and it
compiled on both machines. So I'm a bit confused.

I need to see whether I can install 10.5.x in a virtual machine.

Hope to talk again soon,

Berni the Chaos of Woodquarter

Email: shadowl...@utanet.at
www: http://www.lysator.liu.se/~gz/bernhard



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-15 Thread Christian Ebert
* Trent Piepho on Thursday, October 14, 2010 at 17:38:39 -0700
> On Thu, Oct 14, 2010 at 2:43 PM, Christian Ebert  wrote:
> 
>> * Bernhard Praschinger on Thursday, October 14, 2010 at 19:06:33 +0200
>>>> Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
>>>> checking if sse2 is available.  Using CPUID isn't enough, because sse
>>>> requires OS support that might not be there.  I.e., the cpu supports
>>>> sse2 but you're not actually able to use it.  Probably not much of issue
>>>> on OSX.
>>>>
>>>> Easy fix would just be the change the sse detection asm to save and
>>>> restore ebx.
>>>>
>>>> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
>>>> "ecx");
>>>>
>>>> or better
>>>>
>>>> uint32_t tmp;
>>>> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
>>>> "=&g"(tmp) : "a"(1) : "ecx");
>>>>
>>>> The latter is safer in general, as you can't use push or pop around any
>>>> asm code that has a parameter with a constraint that allows memory
>>>> references.  The memory reference might be relative to esp, in which
>>>> case the push/pop would move it.  Or it might not be relative to esp, in
>>>> which case the push/pop doesn't move it.  So there's no way to adjust
>>>> for it.
>>> I tested your better version. And it compiles here on my linux and Intel
>>> osx box. I did also a quick test with the new version on the linux box.
>>> And it works well.
>>> 
>>> So I would appreciate a feedback if it works on a mac.
>> 
>> Thanks for looking into this, but I get:
>> 
>> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops
>> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp
>> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o
>> main.o main.c
>> main.c: In function ‘main’:
>> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
>> make: *** [main.o] Error 1
> 
> Looks like you didn't actually change the needed lines.

No, I didn't but Bernhard did:

$ cvs status yuvdenoise/main.c
===================================================================
File: main.c            Status: Up-to-date

   Working revision:    1.71
   Repository revision: 1.71    /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
   Sticky Tag:          HEAD (revision: 1.71)
   Sticky Date:         (none)
   Sticky Options:      (none)

$ cvs diff -r 1.70 yuvdenoise/main.c
Index: yuvdenoise/main.c
===================================================================
RCS file: /cvsroot/mjpeg/mjpeg_play/yuvdenoise/main.c,v
retrieving revision 1.70
retrieving revision 1.71
diff -u -r1.70 -r1.71
--- yuvdenoise/main.c   10 Oct 2010 13:01:55 -0000  1.70
+++ yuvdenoise/main.c   14 Oct 2010 16:57:54 -0000  1.71
@@ -810,8 +810,8 @@
 /* 4 to 5 times faster */
 void filter_plane_median_sse2(uint8_t *plane, int w, int h, int level) {
    int i;
-   int avg;
-   int cnt;
+   /* int avg; should not be needed any more */
+   /* int cnt; should not be needed any more */
    uint8_t * p;
    uint8_t * d;

@@ -1326,10 +1326,12 @@
 static void init_accel() {
    filter_plane_median = filter_plane_median_p;
    temporal_filter_planes = temporal_filter_planes_p;
-   
+   uint32_t tmp;
+
 #if defined(__SSE2__)
    int d = 0;
-   __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx");
+/* __asm__ volatile("cpuid" : "=d"(d) : "a"(1) : "ebx", "ecx"); */
+   __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d), "=&g"(tmp) : "a"(1) : "ecx");
    if ((d & (1 << 26))) {
        mjpeg_info("SETTING SSE2 for standard Temporal-Noise-Filter");
        temporal_filter_planes = temporal_filter_planes_sse2;


c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Trent Piepho
On Thu, Oct 14, 2010 at 2:43 PM, Christian Ebert  wrote:

> * Bernhard Praschinger on Thursday, October 14, 2010 at 19:06:33 +0200
> >> Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
> >> checking if sse2 is available.  Using CPUID isn't enough, because sse
> >> requires OS support that might not be there.  I.e., the cpu supports
> >> sse2 but you're not actually able to use it.  Probably not much of issue
> >> on OSX.
> >>
> >> Easy fix would just be the change the sse detection asm to save and
> >> restore ebx.
> >>
> >> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
> >> "ecx");
> >>
> >> or better
> >>
> >> uint32_t tmp;
> >> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
> >> "=&g"(tmp) : "a"(1) : "ecx");
> >>
> >> The latter is safer in general, as you can't use push or pop around any
> >> asm code that has a parameter with a constraint that allows memory
> >> references.  The memory reference might be relative to esp, in which
> >> case the push/pop would move it.  Or it might not be relative to esp, in
> >> which case the push/pop doesn't move it.  So there's no way to adjust
> >> for it.
> > I tested your better version. And it compiles here on my linux and Intel
> > osx box. I did also a quick test with the new version on the linux box.
> > And it works well.
> >
> > So I would appreciate a feedback if it works on a mac.
>
> Thanks for looking into this, but I get:
>
> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o
> main.o main.c
> main.c: In function ‘main’:
> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
> make: *** [main.o] Error 1
>

Looks like you didn't actually change the needed lines.


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Christian Ebert
* Christian Ebert on Thursday, October 14, 2010 at 23:43:20 +0200
> * Bernhard Praschinger on Thursday, October 14, 2010 at 19:06:33 +0200
>>> Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
>>> checking if sse2 is available.  Using CPUID isn't enough, because sse
>>> requires OS support that might not be there.  I.e., the cpu supports
>>> sse2 but you're not actually able to use it.  Probably not much of issue
>>> on OSX.
>>> 
>>> Easy fix would just be the change the sse detection asm to save and
>>> restore ebx.
>>> 
>>> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
>>> "ecx");
>>> 
>>> or better
>>> 
>>> uint32_t tmp;
>>> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
>>> "=&g"(tmp) : "a"(1) : "ecx");
>>> 
>>> The latter is safer in general, as you can't use push or pop around any
>>> asm code that has a parameter with a constraint that allows memory
>>> references.  The memory reference might be relative to esp, in which
>>> case the push/pop would move it.  Or it might not be relative to esp, in
>>> which case the push/pop doesn't move it.  So there's no way to adjust
>>> for it.
>> I tested your better version. And it compiles here on my linux and Intel 
>> osx box. I did also a quick test with the new version on the linux box. 
>> And it works well.
>> 
>> So I would appreciate a feedback if it works on a mac.
> 
> Thanks for looking into this, but I get:
> 
> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
> main.o main.c
> main.c: In function ‘main’:
> main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
> make: *** [main.o] Error 1
> 
> $ sw_vers
> ProductName:  Mac OS X
> ProductVersion:   10.5.8
> BuildVersion: 9L30

Just came across this:

http://lists.mplayerhq.hu/pipermail/mplayer-users/2010-October/081276.html

c
-- 
\black\trash movie   _COWBOY  CANOE  COMA_
Ein deutscher Western/A German Western

--->> http://www.blacktrash.org/underdogma/ccc.php



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Christian Ebert
* Bernhard Praschinger on Thursday, October 14, 2010 at 19:06:33 +0200
>> Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
>> checking if sse2 is available.  Using CPUID isn't enough, because sse
>> requires OS support that might not be there.  I.e., the cpu supports
>> sse2 but you're not actually able to use it.  Probably not much of issue
>> on OSX.
>> 
>> Easy fix would just be the change the sse detection asm to save and
>> restore ebx.
>> 
>> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
>> "ecx");
>> 
>> or better
>> 
>> uint32_t tmp;
>> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
>> "=&g"(tmp) : "a"(1) : "ecx");
>> 
>> The latter is safer in general, as you can't use push or pop around any
>> asm code that has a parameter with a constraint that allows memory
>> references.  The memory reference might be relative to esp, in which
>> case the push/pop would move it.  Or it might not be relative to esp, in
>> which case the push/pop doesn't move it.  So there's no way to adjust
>> for it.
> I tested your better version. And it compiles here on my linux and Intel 
> osx box. I did also a quick test with the new version on the linux box. 
> And it works well.
> 
> So I would appreciate a feedback if it works on a mac.

Thanks for looking into this, but I get:

gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
-ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
-D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
main.o main.c
main.c: In function ‘main’:
main.c:1339: error: PIC register ‘ebx’ clobbered in ‘asm’
make: *** [main.o] Error 1

$ sw_vers
ProductName:Mac OS X
ProductVersion: 10.5.8
BuildVersion:   9L30

c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-14 Thread Bernhard Praschinger
Hello

> Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
> checking if sse2 is available.  Using CPUID isn't enough, because sse
> requires OS support that might not be there.  I.e., the cpu supports
> sse2 but you're not actually able to use it.  Probably not much of issue
> on OSX.
>
> Easy fix would just be the change the sse detection asm to save and
> restore ebx.
>
> __asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
> "ecx");
>
> or better
>
> uint32_t tmp;
> __asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
> "=&g"(tmp) : "a"(1) : "ecx");
>
> The latter is safer in general, as you can't use push or pop around any
> asm code that has a parameter with a constraint that allows memory
> references.  The memory reference might be relative to esp, in which
> case the push/pop would move it.  Or it might not be relative to esp, in
> which case the push/pop doesn't move it.  So there's no way to adjust
> for it.
I tested your better version. It compiles here on my Linux and Intel
OS X boxes. I also did a quick test with the new version on the Linux box,
and it works well.

So I would appreciate feedback on whether it works on a Mac.

Hope to talk again soon,

Berni the Chaos of Woodquarter

Email: shadowl...@utanet.at
www: http://www.lysator.liu.se/~gz/bernhard

--
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today.
http://p.sf.net/sfu/beautyoftheweb


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-13 Thread Trent Piepho
On Wed, Oct 13, 2010 at 12:11 AM, Christian Ebert wrote:

> * Trent Piepho on Tuesday, October 12, 2010 at 12:06:45 -0700
> > It looks like the only use of ebx is in the code to detect CPU features
> > using cpuid.  A better way to do it would be to read the /proc/cpuinfo
> file
> > and look for the sse2 or whatever flag.
>
> There is no /proc/ directory on MacOS X.
>

Oh yeah, this isn't on Linux.  OSX probably has some kind of API for
checking if sse2 is available.  Using CPUID isn't enough, because sse
requires OS support that might not be there.  I.e., the cpu supports sse2
but you're not actually able to use it.  Probably not much of an issue on OSX.
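
For the OS-side check on OSX, one option might be sysctlbyname(). The
sketch below assumes the "hw.optional.sse2" key that Intel Macs expose; the
function name macos_has_sse2 is made up for illustration and is not part
of mjpegtools. On anything other than OSX it queries nothing and returns -1.

```c
#include <stddef.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper, not from mjpegtools.
 * Returns 1 if the OS reports SSE2, 0 if it does not (or the key is
 * missing), and -1 when not built for macOS (nothing was queried). */
int macos_has_sse2(void)
{
#ifdef __APPLE__
    int flag = 0;
    size_t len = sizeof(flag);

    /* Assumption: the "hw.optional.sse2" key exists on Intel Macs. */
    if (sysctlbyname("hw.optional.sse2", &flag, &len, NULL, 0) != 0)
        return 0;
    return flag != 0;
#else
    return -1;
#endif
}
```

A caller would treat only a return value of 1 as permission to select the
SSE2 code path.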

An easy fix would be to change the sse detection asm to save and restore
ebx.

__asm__ volatile("pushl %%ebx ; cpuid ; popl %%ebx" : "=d"(d) : "a"(1) :
"ecx");

or better

uint32_t tmp;
__asm__ volatile("movl %%ebx, %1; cpuid; movl %1, %%ebx" : "=d"(d),
"=&g"(tmp) : "a"(1) : "ecx");

The latter is safer in general, as you can't use push or pop around any asm
code that has a parameter with a constraint that allows memory references.
The memory reference might be relative to esp, in which case the push/pop
would move it.  Or it might not be relative to esp, in which case the
push/pop doesn't move it.  So there's no way to adjust for it.
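
Put together as a function, the save/restore variant might look like the
sketch below. The names cpuid1_edx and cpu_has_sse2 are invented here, and
two things differ from the one-liner above: eax is declared "+a" because
cpuid also overwrites it, and there is a separate x86-64 branch, where ebx
is not the PIC register and can simply be listed as an output. Bit 26 of
edx in leaf 1 is the SSE2 flag. Treat it as an illustration, not as the
code that went into CVS.

```c
#include <stdint.h>

/* Hypothetical helper: returns EDX from CPUID leaf 1,
 * or 0 when not compiled for an x86 target. */
uint32_t cpuid1_edx(void)
{
#if defined(__i386__)
    uint32_t eax = 1, edx, saved_ebx;

    /* Park EBX (the PIC register) in a scratch operand around CPUID.
     * "+a" marks EAX as read and written, since CPUID overwrites it. */
    __asm__ volatile("movl %%ebx, %[save]\n\t"
                     "cpuid\n\t"
                     "movl %[save], %%ebx"
                     : "+a"(eax), "=d"(edx), [save] "=&g"(saved_ebx)
                     :
                     : "ecx");
    return edx;
#elif defined(__x86_64__)
    uint32_t eax = 1, ebx, ecx, edx;

    /* On x86-64 EBX is not reserved for PIC, so it can be an output. */
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx));
    (void)ebx;
    (void)ecx;
    return edx;
#else
    return 0;
#endif
}

/* CPUID leaf 1, EDX bit 26 is the SSE2 feature flag. */
int cpu_has_sse2(void)
{
    return (int)((cpuid1_edx() >> 26) & 1u);
}
```

This only answers whether the CPU advertises SSE2; the OS-support caveat
from earlier in this mail still applies.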


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-13 Thread Christian Ebert
* sfrase6 on Tuesday, October 12, 2010 at 18:06:17 +
>> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
>> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
>> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
>> main.o main.c
>> main.c: In function ‘filter_plane_median_sse2’:
>> main.c:814: warning: unused variable ‘cnt’
>> main.c:813: warning: unused variable ‘avg’
>> main.c: In function ‘main’:
>> main.c:1332: error: PIC register ‘ebx’ clobbered in ‘asm’
>> main.c:1337: error: PIC register ‘ebx’ clobbered in ‘asm’
>> make[2]: *** [main.o] Error 1
>> make[1]: *** [all-recursive] Error 1
>> make: *** [all] Error 2
> 
> I'm not sure how to fix your particular issue, but it seems
> that there are quite a few instances of that issue in Google
> that you might want to research:
> http://www.google.com/search?q=error%3A+PIC+register+%E2%80%98ebx%E2%80%99+clobbered+in+%E2%80%98asm%E2%80%99&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

I tried with --with-pic=no --enable-simd-accel=no but the error
persists -- apart from the fact that I probably wouldn't want to
lose the overall optimizations for a speedup in yuvdenoise.
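
Another way around the clobber error, without touching the PIC or SIMD
configure switches, might be to leave the cpuid call to the compiler's own
header. The sketch below assumes a GCC or Clang recent enough to ship
<cpuid.h> with __get_cpuid(); the stock compiler on 10.5 may well be too
old for that. The function name cpuid_h_has_sse2 is made up.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* assumption: provided by the compiler (GCC/Clang) */
#endif

/* Hypothetical helper: 1 if CPUID leaf 1 reports SSE2, else 0. */
int cpuid_h_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 26) & 1u);   /* EDX bit 26 = SSE2 */
#else
    return 0;
#endif
}
```

The header takes care of the ebx handling, so no hand-written asm has to
name that register.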

c
-- 
  Was heißt hier Dogma, ich bin Underdogma!
[ What the hell do you mean dogma, I am underdogma. ]
free movies   --->>> http://www.blacktrash.org/underdogma
http://itunes.apple.com/podcast/underdogma-movies/id363423596



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-13 Thread Christian Ebert
* Trent Piepho on Tuesday, October 12, 2010 at 12:06:45 -0700
> It looks like the only use of ebx is in the code to detect CPU features
> using cpuid.  A better way to do it would be to read the /proc/cpuinfo file
> and look for the sse2 or whatever flag.

There is no /proc/ directory on MacOS X.

c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-12 Thread Trent Piepho
It looks like the only use of ebx is in the code to detect CPU features
using cpuid.  A better way to do it would be to read the /proc/cpuinfo file
and look for the sse2 or whatever flag.
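
A rough sketch of what that could look like on Linux follows. The helper
names flags_line_has and proc_cpuinfo_has are invented for illustration.
The code assumes the x86 layout of /proc/cpuinfo, where the feature list
sits on a line starting with "flags", and it matches whole words only so
that "sse" does not match "sse2" or "ssse3".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: 1 if `flag` occurs in `line` as a whole
 * whitespace-delimited word, else 0. */
int flags_line_has(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' ||
                   p[n] == '\n';
        if (starts && ends)
            return 1;
        p++;   /* keep searching after a partial match */
    }
    return 0;
}

/* Hypothetical helper: 1 if the flag is listed, 0 if not,
 * -1 if /proc/cpuinfo cannot be opened (e.g. not on Linux). */
int proc_cpuinfo_has(const char *flag)
{
    char line[4096];
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0 && flags_line_has(line, flag)) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A caller would use proc_cpuinfo_has("sse2") and fall back to another
detection method when it returns -1.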

On Tue, Oct 12, 2010 at 11:06 AM, sfrase6  wrote:

>
> - Original Message -
> From: "Christian Ebert" 
> To: mjpeg-users@lists.sourceforge.net
> Sent: Tuesday, October 12, 2010 6:17:56 AM
> Subject: Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch
>
> > gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o
> main.o main.c
> > main.c: In function ‘filter_plane_median_sse2’:
> > main.c:814: warning: unused variable ‘cnt’
> > main.c:813: warning: unused variable ‘avg’
> > main.c: In function ‘main’:
> > main.c:1332: error: PIC register ‘ebx’ clobbered in ‘asm’
> > main.c:1337: error: PIC register ‘ebx’ clobbered in ‘asm’
> > make[2]: *** [main.o] Error 1
> > make[1]: *** [all-recursive] Error 1
> > make: *** [all] Error 2
>
> Hi Christian,
> I'm not sure how to fix your particular issue, but it seems that there are
> quite a few instances of that issue in Google that you might want to
> research:
>
> http://www.google.com/search?q=error%3A+PIC+register+%E2%80%98ebx%E2%80%99+clobbered+in+%E2%80%98asm%E2%80%99&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
>
> scott


Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-12 Thread sfrase6

- Original Message -
From: "Christian Ebert" 
To: mjpeg-users@lists.sourceforge.net
Sent: Tuesday, October 12, 2010 6:17:56 AM
Subject: Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
> main.o main.c
> main.c: In function ‘filter_plane_median_sse2’:
> main.c:814: warning: unused variable ‘cnt’
> main.c:813: warning: unused variable ‘avg’
> main.c: In function ‘main’:
> main.c:1332: error: PIC register ‘ebx’ clobbered in ‘asm’
> main.c:1337: error: PIC register ‘ebx’ clobbered in ‘asm’
> make[2]: *** [main.o] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all] Error 2

Hi Christian,
I'm not sure how to fix your particular issue, but it seems that there are 
quite a few instances of that issue in Google that you might want to research:
http://www.google.com/search?q=error%3A+PIC+register+%E2%80%98ebx%E2%80%99+clobbered+in+%E2%80%98asm%E2%80%99&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

scott



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-12 Thread Christian Ebert
* Christian Ebert on Sunday, October 10, 2010 at 14:51:22 +0100
> * Bernhard Praschinger on Sunday, October 10, 2010 at 15:14:35 +0200
>> I have tested your patch and sent it to CVS. So it will appear in the 
>> next version.
> 
> However, it breaks building on MacOS 10.5.8:
> 
> gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
> -ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
> -D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
> main.o main.c
> main.c: In function ‘filter_plane_median_sse2’:
> main.c:814: warning: unused variable ‘cnt’
> main.c:813: warning: unused variable ‘avg’
> main.c: In function ‘main’:
> main.c:1332: error: PIC register ‘ebx’ clobbered in ‘asm’
> main.c:1337: error: PIC register ‘ebx’ clobbered in ‘asm’
> make[2]: *** [main.o] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all] Error 2

Do you need any additional info?

c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-10 Thread Christian Ebert
* Bernhard Praschinger on Sunday, October 10, 2010 at 15:14:35 +0200
> I have tested your patch and sent it to CVS. So it will appear in the 
> next version.

However, it breaks building on MacOS 10.5.8:

gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../utils   -O3 -funroll-all-loops 
-ffast-math -march=nocona -mtune=nocona -g -O2 -I/sw/include -no-cpp-precomp 
-D_THREAD_SAFE  -Wall -Wunused -MT main.o -MD -MP -MF .deps/main.Tpo -c -o 
main.o main.c
main.c: In function ‘filter_plane_median_sse2’:
main.c:814: warning: unused variable ‘cnt’
main.c:813: warning: unused variable ‘avg’
main.c: In function ‘main’:
main.c:1332: error: PIC register ‘ebx’ clobbered in ‘asm’
main.c:1337: error: PIC register ‘ebx’ clobbered in ‘asm’
make[2]: *** [main.o] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

c
-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions



Re: [Mjpeg-users] [Mjpeg-developer] yuvdenoise performance patch

2010-10-10 Thread Bernhard Praschinger
Hello

> On Sat, 02 Oct 2010 07:12:39 +0200 Bernhard Praschinger
>> There are still people that monitor the list :)
>> My main problem is finding time to answer mails and do tests.
> I see, same here. I didn't know whether the project was still alive,
> since there's quite little traffic on this list and the last release
> some time ago...
The project is still active (but not that much) ;)
I was planning a release, but then Andrew added some features to mpeg2enc,
so I delayed it.
My current plan is to make an RC1 at the beginning of December and a
release at the end of December. No one has objected so far.

>>> Attached is a patch which contains SSE2-accelerated versions of the
>>> (non MC-) functions for temporal and spatial filtering (which I mainly
>>> use). Additionally I've reenabled the shortcircuiting of
>>> temporal_filter_planes_MC, otherwise it fails with divide by zero when
>>> using level 0 (e.g. to only filter the luma plane).
>> So I (and people on the list) can use the patch that came with this
>> mail. And ignore the other patch ?
> Yes, the other patch is obsoleted by the new one. Please feel free to
> use it.
I have tested your patch and sent it to CVS. So it will appear in the 
next version.

Your patch speeds up denoising with the -M option by a factor of
4! :-)  (from 89 sec down to 22 sec)

> I've noticed the utils/mmx.h header file, which contains all the
> MMX-commands as macros mapping to inline assembler. Can you tell me,
> why it's being used and not?
That is a good question.

A long time ago mjpegtools used jpeg-mmx to decode JPEG images
"fast" to YUV. I think these might be remnants that were missed
when jpeg-mmx was removed.

Hopefully see you soon,

Berni the Chaos of Woodquarter

Email: shadowl...@utanet.at
www: http://www.lysator.liu.se/~gz/bernhard
