Re: [FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-11 Thread Adrian Tong
On Mon, 10 Jun 2019 at 23:02, Lauri Kasanen wrote:
> On Mon, 10 Jun 2019 17:42:00 -0700
> Adrian Tong wrote:
>
> > I have been trying to implement yuv420_to_bgr24 using SSE2 instructions. I
> > ran into a case where the output of the C implementation of yuv420_to_bgr24
> > produces a slightly different

Re: [FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-11 Thread Lauri Kasanen
On Mon, 10 Jun 2019 17:42:00 -0700
Adrian Tong wrote:
> I have been trying to implement yuv420_to_bgr24 using SSE2 instructions. I
> ran into a case where the output of the C implementation of yuv420_to_bgr24
> produces a slightly different bgr24 image from the MMX implementation of
> yuv420_to_bgr24. Is this
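
When checking a new SIMD path against a scalar one, as described in the message above, it can help to measure how far the two outputs actually diverge. The following is a minimal sketch under stated assumptions, not FFmpeg's code: `yuv420_to_bgr24_ref` and `max_abs_diff` are hypothetical names, the coefficients are the commonly used 8-bit fixed-point approximation of the BT.601 limited-range matrix, and the planes are assumed to be tightly packed with no stride padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Convert a value scaled by 256 to an 8-bit sample, clamping to [0, 255].
 * Negative inputs are clamped before shifting to avoid right-shifting a
 * negative int, which is implementation-defined in C. */
static uint8_t fix8_to_u8(int v)
{
    if (v < 0)
        return 0;
    v >>= 8;
    return v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical scalar reference: planar YUV 4:2:0 to packed BGR24.
 * Assumes BT.601 limited range and tightly packed planes. */
static void yuv420_to_bgr24_ref(const uint8_t *y, const uint8_t *u,
                                const uint8_t *v, int width, int height,
                                uint8_t *dst)
{
    assert(width > 0 && height > 0);
    int chroma_width = (width + 1) / 2; /* one U/V sample per 2x2 block */

    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            int c = y[row * width + col] - 16;
            int d = u[(row / 2) * chroma_width + col / 2] - 128;
            int e = v[(row / 2) * chroma_width + col / 2] - 128;
            uint8_t *p = dst + 3 * ((size_t)row * width + col);

            p[0] = fix8_to_u8(298 * c + 516 * d + 128);           /* B */
            p[1] = fix8_to_u8(298 * c - 100 * d - 208 * e + 128); /* G */
            p[2] = fix8_to_u8(298 * c + 409 * e + 128);           /* R */
        }
    }
}

/* Largest per-byte difference between two buffers of n bytes. */
static int max_abs_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    int max = 0;
    for (size_t i = 0; i < n; i++) {
        int diff = abs((int)a[i] - (int)b[i]);
        if (diff > max)
            max = diff;
    }
    return max;
}
```

A candidate implementation can be run on the same input and its output passed to `max_abs_diff` together with the reference output. A result of 0 means the two are bit-exact; a small nonzero value points to rounding or precision differences rather than a logic error. Whether such a difference is acceptable is a project decision that this sketch does not settle.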

Re: [FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-10 Thread Adrian Tong
On Sat, 8 Jun 2019 at 09:42, Adrian Tong wrote:
> On Sat, 8 Jun 2019 at 09:38, Lauri Kasanen wrote:
>> On Sat, 8 Jun 2019 06:51:51 -0700
>> Adrian Tong wrote:
>>
>> > Hi Lauri.
>> >
>> > Thanks for the reply. Any reason why this has not been implemented before?
>> > It seems to me

Re: [FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-08 Thread Adrian Tong
On Sat, 8 Jun 2019 at 09:38, Lauri Kasanen wrote:
> On Sat, 8 Jun 2019 06:51:51 -0700
> Adrian Tong wrote:
>
> > Hi Lauri.
> >
> > Thanks for the reply. Any reason why this has not been implemented before?
> > It seems to me that this would be a pretty important/hot function.
>
> Just the

Re: [FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-08 Thread Lauri Kasanen
On Sat, 8 Jun 2019 06:51:51 -0700
Adrian Tong wrote:
> Hi Lauri.
>
> Thanks for the reply. Any reason why this has not been implemented before?
> It seems to me that this would be a pretty important/hot function.
Just the usual: nobody has had the interest. There are other places too where the

Re: [FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-08 Thread Adrian Tong
On Fri, 7 Jun 2019 at 23:20, Lauri Kasanen wrote:
> On Fri, 7 Jun 2019 08:38:35 -0700
> Adrian Tong wrote:
>
> > Hi
> >
> > I have a workload which spends a significant amount of time (~10%) in
> > the yuv420_bgr24_mmxext function in FFmpeg.
> >
> > I looked at the assembly and profile and see

Re: [FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-08 Thread Lauri Kasanen
On Fri, 7 Jun 2019 08:38:35 -0700
Adrian Tong wrote:
> Hi
>
> I have a workload which spends a significant amount of time (~10%) in
> the yuv420_bgr24_mmxext function in FFmpeg.
>
> I looked at the assembly and profile and see that MMX (64-bit) registers are
> used. I wonder whether we can have a

[FFmpeg-devel] yuv420_bgr24_mmxext conversion taking significant time

2019-06-07 Thread Adrian Tong
Hi, I have a workload which spends a significant amount of time (~10%) in the yuv420_bgr24_mmxext function in FFmpeg. I looked at the assembly and profile and see that MMX (64-bit) registers are used. I wonder whether we can have an SSE2 version, which has a register bit width of 128. I am very