Re: [Mlt-devel] [PATCH] use INT_MAX for default producer length?

2014-06-29 Thread Maksym Veremeyenko
27.06.14 20:30, Dan Dennedy wrote:

 On Thu, Jun 26, 2014 at 11:25 PM, Maksym Veremeyenko
 <ve...@m1stereo.tv> wrote:

 Hi,

 I found that specifying the in/out properties of a producer (with a still
 image) is not enough to play it for the specified time. In the
 attached test file the logo will disappear after 10 minutes because the
 default length of the producer is set to 15000 frames.

 Maybe we can replace the magic value 15000 with another large value like
 INT_MAX?


 OK, I have known about this since forever, and I have been on the fence
 about whether to change it. For the longest time my feeling was that
 apps should manage the length of image producers, and then this problem
 never really appears to the user. However, for the person using the
 command line, melted, or manually learning and authoring MLT XML, it is
 inconvenient to have to be reminded about this default length and not be
 able to simply set the out point. So, I can accept the patch after some
 testing in apps Shotcut, Flowblade, and Kdenlive. We have to test
 different kinds of producers to ensure the user does not end up
 accidentally adding some huge item to the timeline, potentially causing
 a crash due to exceeding some GUI canvas limitation. Therefore, I will
 apply the patch after the 0.9.2 release.
That patch would definitely break Shotcut, Flowblade, and Kdenlive
behaviour... Maybe we could then introduce an environment variable to
override the default value?
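
A minimal sketch of what such an override could look like, only to make the idea concrete. The variable name MLT_DEFAULT_PRODUCER_LENGTH and the helper default_producer_length() are hypothetical and not part of MLT; 15000 is the default quoted above.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* 15000 frames is the default length quoted in this thread. */
#define FALLBACK_LENGTH 15000

/* Hypothetical variable name, used only for this illustration. */
#define LENGTH_ENV_VAR "MLT_DEFAULT_PRODUCER_LENGTH"

/* Return the default producer length in frames: the value of the
 * environment variable when it is a positive integer that fits in an
 * int, otherwise the built-in fallback. */
static int default_producer_length(void)
{
	const char *text = getenv(LENGTH_ENV_VAR);
	char *end = NULL;
	long value;

	if (text == NULL || *text == '\0')
		return FALLBACK_LENGTH;

	errno = 0;
	value = strtol(text, &end, 10);
	if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX)
		return FALLBACK_LENGTH;

	return (int) value;
}
```

With this shape, applications that rely on the current default see no change unless the variable is set, while command-line users could raise the limit without editing their XML.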


-- 

Maksym Veremeyenko

--
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
___
Mlt-devel mailing list
Mlt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlt-devel


[Mlt-devel] [PATCH] implement SSE optimized luma copy/scale functions

2014-06-29 Thread Maksym Veremeyenko

Hi,

The attached set of patches optimizes luma scaling in the matte
transition.


The first patch changes the scaling equation to avoid division and use
only shift and multiplication.
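
The arithmetic can be checked with a small standalone sketch. The two formulas are taken from the patch; the function names are made up for this illustration. The shift form multiplies by 299/256 (about 1.168) instead of 255/219 (about 1.164), so it is an approximation rather than an exact replacement.

```c
#include <stdint.h>

/* Original form: clip to 16..235, then one integer division per pixel. */
static uint8_t scale_luma_div(int y)
{
	if (y < 16)
		y = 16;
	if (y > 235)
		y = 235;
	return (uint8_t) ((y - 16) * 255 / 219);
}

/* Patched form: (p * 256 + p * 43) / 256, i.e. p * 299 / 256. */
static uint8_t scale_luma_shift(int y)
{
	int p;

	if (y < 16)
		y = 16;
	if (y > 235)
		y = 235;
	p = y - 16;
	return (uint8_t) (((p << 8) + p * 43) >> 8);
}

/* Largest absolute difference between the two forms over all 8-bit inputs. */
static int max_scale_difference(void)
{
	int y, worst = 0;

	for (y = 0; y <= 255; y++) {
		int d = scale_luma_shift(y) - scale_luma_div(y);
		if (d < 0)
			d = -d;
		if (d > worst)
			worst = d;
	}
	return worst;
}
```

Both forms map 16 to 0 and 235 to 255, and over the whole input range they differ by at most one code value.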


The second patch implements SSE code for scaling and copying luma.
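
For readers who find inline assembly hard to follow, the plain copy case can be sketched with SSE2 intrinsics. This is not the code from the patch, only an equivalent illustration of the same mask-and-pack idea; the function name copy_y_to_alpha_row is made up, and it assumes packed YUV 4:2:2 with Y at even byte offsets, as in the scalar loop alpha_a[i] = image_b[2*i].

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Copy the Y bytes of one packed YUV 4:2:2 row into an 8-bit alpha row. */
static void copy_y_to_alpha_row(uint8_t *alpha, const uint8_t *yuv, int width)
{
	/* 0x00FF in every 16-bit lane keeps Y and drops the chroma byte. */
	const __m128i luma_mask = _mm_set1_epi16(0x00FF);
	int i = 0;

	/* 16 pixels (32 source bytes) per iteration. */
	for (; i + 16 <= width; i += 16) {
		__m128i lo = _mm_loadu_si128((const __m128i *) (yuv + 2 * i));
		__m128i hi = _mm_loadu_si128((const __m128i *) (yuv + 2 * i + 16));

		lo = _mm_and_si128(lo, luma_mask);
		hi = _mm_and_si128(hi, luma_mask);

		/* Pack the two sets of eight 16-bit values into 16 bytes. */
		_mm_storeu_si128((__m128i *) (alpha + i), _mm_packus_epi16(lo, hi));
	}

	/* Scalar tail for the remaining pixels. */
	for (; i < width; i++)
		alpha[i] = yuv[2 * i];
}
```

Intrinsics let the compiler handle register allocation and loop labels, at the cost of less direct control over the generated instructions.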

--

Maksym Veremeyenko

From fd58ca781aa2d8f68aaedabe7b0f428c5baf6853 Mon Sep 17 00:00:00 2001
From: Maksym Veremeyenko ve...@m1.tv
Date: Fri, 27 Jun 2014 12:14:45 +0300
Subject: [PATCH 1/2] update scaling equation to avoid division

---
 src/modules/core/transition_matte.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/src/modules/core/transition_matte.c b/src/modules/core/transition_matte.c
index c127761..25e93c4 100644
--- a/src/modules/core/transition_matte.c
+++ b/src/modules/core/transition_matte.c
@@ -57,7 +57,9 @@ static void copy_Y_to_A_scaled_luma(uint8_t* alpha_a, int stride_a, uint8_t* ima
 				p = 16;
 			if(p > 235)
 				p = 235;
-			p = (p - 16) * 255 / 219;
+			/* p = (p - 16) * 255 / 219; */
+			p -= 16;
+			p = ((p << 8) + (p * 43)) >> 8;
 
 			alpha_a[i] = p;
 		};
-- 
1.7.7.6

From 466e71bd8f7f9fd7ec7fb800bc312c5d0305b16b Mon Sep 17 00:00:00 2001
From: Maksym Veremeyenko ve...@m1.tv
Date: Fri, 27 Jun 2014 17:05:50 +0300
Subject: [PATCH 2/2] implement SSE optimized luma copy/scale functions

---
 src/modules/core/transition_matte.c |  151 ++-
 1 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/src/modules/core/transition_matte.c b/src/modules/core/transition_matte.c
index 25e93c4..2ea0acd 100644
--- a/src/modules/core/transition_matte.c
+++ b/src/modules/core/transition_matte.c
@@ -30,26 +30,173 @@
 
 typedef void ( *copy_luma_fn )(uint8_t* alpha_a, int stride_a, uint8_t* image_b, int stride_b, int width, int height);
 
+#if defined(USE_SSE)
+static void __attribute__((noinline)) copy_Y_to_A_full_luma_sse(uint8_t* alpha_a, uint8_t* image_b, int cnt)
+{
+	const static unsigned char const4[] =
+	{
+		255, 0, 255, 0, 255, 0, 255, 0, 255, 0, 255, 0, 255, 0, 255, 0
+	};
+
+	__asm__ volatile
+	(
+		"movdqu         (%[equ255]), %%xmm4     \n\t"   /* load bottom value 0xff */
+
+		"loop_start1:                           \n\t"
+
+		/* load pixels block 1 */
+		"movdqu         0(%[image_b]), %%xmm0   \n\t"
+		"add            $0x10, %[image_b]       \n\t"
+
+		/* load pixels block 2 */
+		"movdqu         0(%[image_b]), %%xmm1   \n\t"
+		"add            $0x10, %[image_b]       \n\t"
+
+		/* leave only Y */
+		"pand           %%xmm4, %%xmm0          \n\t"
+		"pand           %%xmm4, %%xmm1          \n\t"
+
+		/* pack to 8 bit value */
+		"packuswb       %%xmm1, %%xmm0          \n\t"
+
+		/* store */
+		"movdqu         %%xmm0, (%[alpha_a])    \n\t"
+		"add            $0x10, %[alpha_a]       \n\t"
+
+		/* loop if we done */
+		"dec            %[cnt]                  \n\t"
+		"jnz            loop_start1             \n\t"
+		:
+		: [cnt]"r" (cnt), [alpha_a]"r"(alpha_a), [image_b]"r"(image_b), [equ255]"r"(const4)
+	);
+};
+#endif
+
 static void copy_Y_to_A_full_luma(uint8_t* alpha_a, int stride_a, uint8_t* image_b, int stride_b, int width, int height)
 {
 	int i, j;
 
 	for(j = 0; j < height; j++)
 	{
-		for(i = 0; i < width; i++)
+		i = 0;
+#if defined(USE_SSE)
+		if(width >= 16)
+		{
+			copy_Y_to_A_full_luma_sse(alpha_a, image_b, width >> 4);
+			i = (width >> 4) << 4;
+		}
+#endif
+		for(; i < width; i++)
 			alpha_a[i] = image_b[2*i];
 		alpha_a += stride_a;
 		image_b += stride_b;
 	};
 };
 
+#if defined(USE_SSE)
+static void __attribute__((noinline)) copy_Y_to_A_scaled_luma_sse(uint8_t* alpha_a, uint8_t* image_b, int cnt)
+{
+	const static unsigned char const1[] =
+	{
+		43, 0, 43, 0, 43, 0, 43, 0, 43, 0, 43, 0, 43, 0, 43, 0
+	};
+	const static unsigned char const2[] =
+	{
+		16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0, 16, 0
+	};
+	const static unsigned char const3[] =
+	{
+		235, 0, 235, 0, 235, 0, 235, 0, 235, 0, 235, 0, 235, 0, 235, 0
+	};
+	const static unsigned char const4[] =
+	{
+		255, 0, 255, 0, 255, 0, 255, 0, 255, 0, 255, 0, 255, 0, 255, 0
+	};
+
+	__asm__ volatile
+	(
+		"movdqu         (%[equ43]), %%xmm7      \n\t"   /* load multiplier 43 */
+		"movdqu         (%[equ16]), %%xmm6      \n\t"   /* load bottom value 16 */
+		"movdqu         (%[equ235]), %%xmm5     \n\t"   /* load bottom value 235 */
+		"movdqu         (%[equ255]), %%xmm4     \n\t"   /* load bottom value 0xff */
+
+		"loop_start:                            \n\t"
+
+		/* load pixels block 1 */
+		"movdqu         0(%[image_b]), %%xmm0   \n\t"
+		"add            $0x10, %[image_b]       \n\t"
+
+		/* load pixels block 2 */
+		"movdqu         0(%[image_b]), %%xmm1   \n\t"
+		"add            $0x10, %[image_b]       \n\t"
+
+		/* leave only Y */
+		"pand           %%xmm4, %%xmm0          \n\t"
+		"pand           %%xmm4, %%xmm1          \n\t"
+
+		/* upper range clip */
+		"pminsw         %%xmm5, %%xmm0          \n\t"
+		"pminsw         %%xmm5, %%xmm1          \n\t"
+
+		/* upper range clip */
+		pmaxsw %%xmm6, %%xmm0

[Mlt-devel] [PATCH] avoid creating alpha channel if not required (review request)

2014-06-29 Thread Maksym Veremeyenko

Hi,

I was trying to reach realtime performance for some simple CG operations
and found that during the composite transition MLT creates an alpha
plane for a frame that does not have one and does not even require it
for further display. I.e., if you put a small CG over an HD frame, MLT
creates a full-frame alpha channel, and the subsequent blending
operations use its memory for no reason, IMHO.


So I implemented a function, mlt_frame_get_alpha_mask_nc, that does the
same as mlt_frame_get_alpha_mask but does not create an alpha channel if
none exists; it just returns NULL. Next, I replaced some parts of the
code to use mlt_frame_get_alpha_mask_nc and handle the returned NULL value.
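
The difference between the two getters can be pictured with a toy model. This is not the MLT implementation: struct toy_frame and the toy_* functions are stand-ins invented for this sketch, and filling the created plane with 255 (fully opaque) is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a frame; not the real mlt_frame structure. */
struct toy_frame {
	int width;
	int height;
	uint8_t *alpha; /* NULL when the frame carries no alpha plane */
};

/* Creating getter: allocates a fully opaque plane on demand. */
static uint8_t *toy_get_alpha_mask(struct toy_frame *f)
{
	if (f->alpha == NULL) {
		size_t size = (size_t) f->width * (size_t) f->height;

		f->alpha = malloc(size);
		if (f->alpha != NULL)
			memset(f->alpha, 255, size);
	}
	return f->alpha;
}

/* Non-creating getter: returns NULL when there is no plane. */
static uint8_t *toy_get_alpha_mask_nc(const struct toy_frame *f)
{
	return f->alpha;
}

/* A caller that handles NULL by treating the frame as fully opaque. */
static uint8_t toy_alpha_at(const struct toy_frame *f, int x, int y)
{
	const uint8_t *a = toy_get_alpha_mask_nc(f);

	return a != NULL ? a[y * f->width + x] : 255;
}
```

The caller pays for the NULL check, but a frame without alpha never triggers a full-frame allocation.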


Finally, the composite_line_yuv_sse2_simple function was split into 8 variants:

|0| dest_a == NULL | src_a == NULL | weight == 256 | blit
|1| dest_a == NULL | src_a == NULL | weight != 256 | blend: with given alpha
|2| dest_a == NULL | src_a != NULL | weight == 256 | blend: only src alpha
|3| dest_a == NULL | src_a != NULL | weight != 256 | blend: premultiply src alpha
|4| dest_a != NULL | src_a == NULL | weight == 256 | blit: blit and set dst alpha to FF
|5| dest_a != NULL | src_a == NULL | weight != 256 | blend: with given alpha
|6| dest_a != NULL | src_a != NULL | weight == 256 | blend: full blend without src alpha premultiply
|7| dest_a != NULL | src_a != NULL | weight != 256 | blend: full (original version)
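
The numbering in the table reads as a 3-bit index, which can be sketched as below. Whether the patch computes the variant this way is not shown in this message; composite_variant is a made-up helper for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Map the three conditions from the table to a variant number 0..7:
 * bit 2 = destination alpha present, bit 1 = source alpha present,
 * bit 0 = weight differs from 256. */
static int composite_variant(const uint8_t *dest_a, const uint8_t *src_a, int weight)
{
	return (dest_a != NULL ? 4 : 0)
	     | (src_a != NULL ? 2 : 0)
	     | (weight != 256 ? 1 : 0);
}
```

An index like this could select an entry in a table of eight function pointers, so the per-line code avoids branching on NULL inside the pixel loop.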


From my tests I did not find any visible regression. Maybe somebody else
could also review/test the proposed code.


--

Maksym Veremeyenko


From 2e973085a151bd43762b17bf37e802cdcb130167 Mon Sep 17 00:00:00 2001
From: Maksym Veremeyenko ve...@m1.tv
Date: Fri, 27 Jun 2014 18:02:16 +0300
Subject: [PATCH 1/6] rename arguments indexes to literal names

---
 src/modules/core/composite_line_yuv_sse2_simple.c |   30 ++--
 1 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/modules/core/composite_line_yuv_sse2_simple.c b/src/modules/core/composite_line_yuv_sse2_simple.c
index 04eb1ca..049ed9e 100644
--- a/src/modules/core/composite_line_yuv_sse2_simple.c
+++ b/src/modules/core/composite_line_yuv_sse2_simple.c
@@ -33,9 +33,9 @@ void composite_line_yuv_sse2_simple(uint8_t *dest, uint8_t *src, int width, uint
 __asm__ volatile
 (
         "pxor           %%xmm0, %%xmm0          \n\t"   /* clear zero register */
-        "movdqu         (%4), %%xmm9            \n\t"   /* load const1 */
-        "movdqu         (%7), %%xmm10           \n\t"   /* load const2 */
-        "movd           %0, %%xmm1              \n\t"   /* load weight and decompose */
+        "movdqu         (%[const1]), %%xmm9     \n\t"   /* load const1 */
+        "movdqu         (%[const2]), %%xmm10    \n\t"   /* load const2 */
+        "movd           %[weight], %%xmm1       \n\t"   /* load weight and decompose */
         "movlhps        %%xmm1, %%xmm1          \n\t"
         "pshuflw        $0, %%xmm1, %%xmm1      \n\t"
         "pshufhw        $0, %%xmm1, %%xmm1      \n\t"
@@ -46,7 +46,7 @@ void composite_line_yuv_sse2_simple(uint8_t *dest, uint8_t *src, int width, uint
 00  W 00  W 00  W 00  W 00  W 00  W 00  W 00  W
 */
         "loop_start:                            \n\t"
-        "movq           (%1), %%xmm2            \n\t"   /* load source alpha */
+        "movq           (%[alpha_b]), %%xmm2    \n\t"   /* load source alpha */
         "punpcklbw      %%xmm0, %%xmm2          \n\t"   /* unpack alpha 8 8-bits alphas to 8 16-bits values */
 
 /*
@@ -68,7 +68,7 @@ void composite_line_yuv_sse2_simple(uint8_t *dest, uint8_t *src, int width, uint
 /*
             DSTa = DSTa + (SRCa * (0xFF - DSTa)) >> 8
         */
-        "movq           (%5), %%xmm3            \n\t"   /* load dst alpha */
+        "movq           (%[alpha_a]), %%xmm3    \n\t"   /* load dst alpha */
         "punpcklbw      %%xmm0, %%xmm3          \n\t"   /* unpack dst 8 8-bits alphas to 8 16-bits values */
         "movdqa         %%xmm9, %%xmm4          \n\t"
         "psubw          %%xmm3, %%xmm4          \n\t"
@@ -80,10 +80,10 @@ void composite_line_yuv_sse2_simple(uint8_t *dest, uint8_t *src, int width, uint
         "psrlw          $8, %%xmm4              \n\t"
         "paddw          %%xmm4, %%xmm3          \n\t"
         "packuswb       %%xmm0, %%xmm3          \n\t"
-        "movq           %%xmm3, (%5)            \n\t"   /* save dst alpha */
+        "movq           %%xmm3, (%[alpha_a])    \n\t"   /* save dst alpha */
 
-        "movdqu         (%2), %%xmm3            \n\t"   /* load src */
-        "movdqu         (%3), %%xmm4            \n\t"   /* load dst */
+        "movdqu         (%[src]), %%xmm3        \n\t"   /* load src */
+        "movdqu         (%[dest]), %%xmm4       \n\t"   /* load dst */
         "movdqa         %%xmm3, %%xmm5          \n\t"   /* dub src */
         "movdqa         %%xmm4, %%xmm6          \n\t"   /* dub dst */
 
@@ -185,21 +185,21 @@ void composite_line_yuv_sse2_simple(uint8_t *dest, uint8_t *src, int width, uint
 
 

[Mlt-devel] [mltframework/mlt]

2014-06-29 Thread GitHub
  Branch: refs/tags/list
  Home:   https://github.com/mltframework/mlt


[Mlt-devel] [mltframework/mlt] 79908d: Add release notes for version 0.9.2.

2014-06-29 Thread GitHub
  Branch: refs/heads/master
  Home:   https://github.com/mltframework/mlt
  Commit: 79908de1ae56378d79c4e1a03c135c0a7bc813c7
  
https://github.com/mltframework/mlt/commit/79908de1ae56378d79c4e1a03c135c0a7bc813c7
  Author: Dan Dennedy d...@dennedy.org
  Date:   2014-06-29 (Sun, 29 Jun 2014)

  Changed paths:
M NEWS

  Log Message:
  ---
  Add release notes for version 0.9.2.


  Commit: 3922139532c6b2e4aa8bf7d466c48c36f1301ae9
  
https://github.com/mltframework/mlt/commit/3922139532c6b2e4aa8bf7d466c48c36f1301ae9
  Author: Dan Dennedy d...@dennedy.org
  Date:   2014-06-29 (Sun, 29 Jun 2014)

  Changed paths:
M Doxyfile
M configure
M docs/melt.1
M src/framework/mlt_version.h

  Log Message:
  ---
  Set version to 0.9.2.


  Commit: 3c93969da2de858ad8ba0a81a78e130cdbf3165a
  
https://github.com/mltframework/mlt/commit/3c93969da2de858ad8ba0a81a78e130cdbf3165a
  Author: Dan Dennedy d...@dennedy.org
  Date:   2014-06-29 (Sun, 29 Jun 2014)

  Changed paths:
M ChangeLog

  Log Message:
  ---
  Update ChangeLog for v0.9.2.


Compare: https://github.com/mltframework/mlt/compare/fa50c0b5895e...3c93969da2de


Re: [Mlt-devel] NDVI processing

2014-06-29 Thread Dan Dennedy
On Thu, Jun 19, 2014 at 2:45 AM, Stefan Gofferje
li...@home.gofferje.net wrote:
 On 06/19/2014 07:47 AM, Brian Matherly wrote:
 That's a long time to wait. What about one of the 54000 DIY-Drones
 members? Does anyone have any footage they could share?

 http://diydrones.com/profiles/blogs/development-of-ndvi-postprocessing-plugin-for-frei0r-the-mlt


 --
  (o_   Stefan Gofferje| SCLT, MCP, CCSA
  //\   Reg'd Linux User #247167   | VCP #2263
  V_/_  Heckler & Koch - the original point and click interface

Stefan, Brian reached a good working state on the NDVI plugin, and
today I pushed it into the upstream git repos. On Tuesday July 1,
there will be a new version of Shotcut (14.07) that will include the
filter. The filter is not yet exposed in the Shotcut GUI, but it
includes the repo head versions of MLT and frei0r. And it is
cross-platform for your diverse community. So, people will have a
fairly convenient way to run melt with the NDVI filter:

Linux console: Shotcut.app/melt infrablue.mp4 -attach frei0r.ndvi 0=heat
Windows cmd.exe: Program Files\Shotcut\melt infrablue.mp4 -attach frei0r.ndvi 0=heat
OS X terminal: /Applications/Shotcut.app/Contents/MacOS/melt infrablue.mp4 -attach frei0r.ndvi 0=heat

Parameter docs here:
http://www.mltframework.org/bin/view/MLT/FilterFrei0r-ndvi

Want to do something with the result? Add -consumer xml:ndvi.mlt to
the above command line and then open the .mlt file with Shotcut. You
can give the .mlt file any name you want; it is an MLT XML file.

-- 
+-DRD-+
