Re: [patch] autogain support for bayer10 format (was Re: [patch] propagating controls in libv4l2)

2017-07-13 Thread Pavel Machek
Hi!

> Oh, somehow I got confused that this is kernel code :)
> 
> >But I'd say NEON conversion is not neccessary anytime soon. First,
> >this is just trying to get average luminosity. We can easily skip
> >quite a lot of pixels, and still get reasonable answer.
> >
> >Second, omap3isp actually has a hardware block computing statistics
> >for us. We just don't use it for simplicity.
> >
> 
> Right, I forgot about that.
> 
> >(But if you want to play with camera, I'll get you patches; there's
> >ton of work to be done, both kernel and userspace :-).
> 
> Well, I saw a low hanging fruit I thought I can convert to NEON in a day or
> two, while having some rest from the huge "project" I am devoting all my
> spare time recently (rebasing hildon/maemo 5 on top of devuan Jessie).
> Still, if there is something relatively small to be done, just email me and
> I'll have a look.

Well, there's a ton of work on camera, and some work on
libcmtspeechdata. The later is rather self-contained.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [patch] autogain support for bayer10 format (was Re: [patch] propagating controls in libv4l2)

2017-05-03 Thread Pavel Machek
On Wed 2017-05-03 20:05:56, Russell King - ARM Linux wrote:
> On Wed, Apr 26, 2017 at 06:43:54PM +0300, Ivaylo Dimitrov wrote:
> > >+static int get_luminosity_bayer10(uint16_t *buf, const struct v4l2_format 
> > >*fmt)
> > >+{
> > >+  long long avg_lum = 0;
> > >+  int x, y;
> > >+  
> > >+  buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
> > >+  fmt->fmt.pix.width / 4;
> > >+
> > >+  for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
> > >+  for (x = 0; x < fmt->fmt.pix.width / 2; x++)
> > 
> > That would take some time :). AIUI, we have NEON support in ARM kernels
> > (CONFIG_KERNEL_MODE_NEON), I wonder if it makes sense (me) to convert the
> > above loop to NEON-optimized when it comes to it? Are there any drawbacks in
> > using NEON code in kernel?
> 
> Using neon without the VFP state saved and restored corrupts userspace's
> FP state.  So, you have to save the entire VFP state to use neon in kernel
> mode.  There are helper functions for this: kernel_neon_begin() and
> kernel_neon_end().
...
> Given that, do we really want to be walking over multi-megabytes of image
> data in the kernel with preemption disabled - it sounds like a recipe for
> a very sluggish system.  I think this should (and can only sensibly be
> done) in userspace.

The patch was for libv4l2. (And I explained why we don't need to
overoptimize this.)
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [patch] autogain support for bayer10 format (was Re: [patch] propagating controls in libv4l2)

2017-05-03 Thread Russell King - ARM Linux
On Wed, Apr 26, 2017 at 06:43:54PM +0300, Ivaylo Dimitrov wrote:
> >+static int get_luminosity_bayer10(uint16_t *buf, const struct v4l2_format 
> >*fmt)
> >+{
> >+long long avg_lum = 0;
> >+int x, y;
> >+
> >+buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
> >+fmt->fmt.pix.width / 4;
> >+
> >+for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
> >+for (x = 0; x < fmt->fmt.pix.width / 2; x++)
> 
> That would take some time :). AIUI, we have NEON support in ARM kernels
> (CONFIG_KERNEL_MODE_NEON), I wonder if it makes sense (me) to convert the
> above loop to NEON-optimized when it comes to it? Are there any drawbacks in
> using NEON code in kernel?

Using neon without the VFP state saved and restored corrupts userspace's
FP state.  So, you have to save the entire VFP state to use neon in kernel
mode.  There are helper functions for this: kernel_neon_begin() and
kernel_neon_end().

You can't build C code with the compiler believing that neon is available
as the compiler could emit neon instructions in unprotected kernel code.

Note that kernel_neon_begin() is only allowed to be called outside
interrupt context and with preemption disabled.

Given that, do we really want to be walking over multi-megabytes of image
data in the kernel with preemption disabled - it sounds like a recipe for
a very sluggish system.  I think this should (and can only sensibly be
done) in userspace.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [patch] autogain support for bayer10 format (was Re: [patch] propagating controls in libv4l2)

2017-04-30 Thread Pavel Machek
On Wed 2017-04-26 15:23:37, Pavel Machek wrote:
> Hi!
> 
> > > > I don't see why it would be hard to open files or have threads inside
> > > > a library. There are several libraries that do that already, specially
> > > > the ones designed to be used on multimidia apps.  
> > > 
> > > Well, This is what the libv4l2 says:
> > > 
> > >This file implements libv4l2, which offers v4l2_ prefixed versions
> > >of
> > >   open/close/etc. The API is 100% the same as directly opening
> > >/dev/videoX
> > >   using regular open/close/etc, the big difference is that format
> > >conversion
> > >
> > > but if I open additional files in v4l2_open(), API is no longer the
> > > same, as unix open() is defined to open just one file descriptor.
> > > 
> > > Now. There is autogain support in libv4lconvert, but it expects to use
> > > same fd for camera and for the gain... which does not work with
> > > subdevs.
> > > 
> > > Of course, opening subdevs by name like this is not really
> > > acceptable. But can you suggest a method that is?
> > 
> > There are two separate things here:
> > 
> > 1) Autofoucs for a device that doesn't use subdev API
> > 2) libv4l2 support for devices that require MC and subdev API
> 
> Actually there are three: 0) autogain. Unfortunately, I need autogain
> first before autofocus has a chance...
> 
> And that means... bayer10 support for autogain.
> 
> Plus, I changed avg_lum to long long. Quick calculation tells me int
> could overflow with few megapixel sensor.
> 
> Oh, btw http://ytse.tricolour.net/docs/LowLightOptimization.html no
> longer works.

Can I get some comments here? Patch will need fixup (constants need
adjusting), but is style/design acceptable?

Thanks,
Pavel

> diff --git a/lib/libv4lconvert/processing/autogain.c 
> b/lib/libv4lconvert/processing/autogain.c
> index c6866d6..0b52d0f 100644
> --- a/lib/libv4lconvert/processing/autogain.c
> +++ b/lib/libv4lconvert/processing/autogain.c
> @@ -68,6 +71,41 @@ static void autogain_adjust(struct v4l2_queryctrl *ctrl, 
> int *value,
>   }
>  }
>  
> +static int get_luminosity_bayer10(uint16_t *buf, const struct v4l2_format 
> *fmt)
> +{
> + long long avg_lum = 0;
> + int x, y;
> + 
> + buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
> + fmt->fmt.pix.width / 4;
> +
> + for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
> + for (x = 0; x < fmt->fmt.pix.width / 2; x++)
> + avg_lum += *buf++;
> + buf += fmt->fmt.pix.bytesperline - fmt->fmt.pix.width / 2;
> + }
> + avg_lum /= fmt->fmt.pix.height * fmt->fmt.pix.width / 4;
> + avg_lum /= 4;
> + return avg_lum;
> +}
> +
> +static int get_luminosity_bayer8(unsigned char *buf, const struct 
> v4l2_format *fmt)
> +{
> + long long avg_lum = 0;
> + int x, y;
> + 
> + buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
> + fmt->fmt.pix.width / 4;
> +
> + for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
> + for (x = 0; x < fmt->fmt.pix.width / 2; x++)
> + avg_lum += *buf++;
> + buf += fmt->fmt.pix.bytesperline - fmt->fmt.pix.width / 2;
> + }
> + avg_lum /= fmt->fmt.pix.height * fmt->fmt.pix.width / 4;
> + return avg_lum;
> +}
> +
>  /* auto gain and exposure algorithm based on the knee algorithm described 
> here:
>  http://ytse.tricolour.net/docs/LowLightOptimization.html */
>  static int autogain_calculate_lookup_tables(
> @@ -100,17 +142,16 @@ static int autogain_calculate_lookup_tables(
>   switch (fmt->fmt.pix.pixelformat) {
> + case V4L2_PIX_FMT_SGBRG10:
> + case V4L2_PIX_FMT_SGRBG10:
> + case V4L2_PIX_FMT_SBGGR10:
> + case V4L2_PIX_FMT_SRGGB10:
> + avg_lum = get_luminosity_bayer10((void *) buf, fmt);
> + break;
> +
>   case V4L2_PIX_FMT_SGBRG8:
>   case V4L2_PIX_FMT_SGRBG8:
>   case V4L2_PIX_FMT_SBGGR8:
>   case V4L2_PIX_FMT_SRGGB8:
> - buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
> - fmt->fmt.pix.width / 4;
> -
> - for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
> - for (x = 0; x < fmt->fmt.pix.width / 2; x++)
> - avg_lum += *buf++;
> - buf += fmt->fmt.pix.bytesperline - fmt->fmt.pix.width / 
> 2;
> - }
> - avg_lum /= fmt->fmt.pix.height * fmt->fmt.pix.width / 4;
> + avg_lum = get_luminosity_bayer8(buf, fmt);
>   break;
>  
>   case V4L2_PIX_FMT_RGB24:
> diff --git a/lib/libv4lconvert/processing/libv4lprocessing.c 
> b/lib/libv4lconvert/processing/libv4lprocessing.c
> index b061f50..b98d024 100644
> --- a/lib/libv4lconvert/processing/libv4lprocessing.c
> +++ b/lib/libv4lconvert/processing/libv4lprocessing.c
> @@ -164,6 +165,10 @@ void v4lprocessing_processing(struct 

Re: [patch] autogain support for bayer10 format (was Re: [patch] propagating controls in libv4l2)

2017-04-26 Thread Ivaylo Dimitrov



On 27.04.2017 01:51, Pavel Machek wrote:

Hi!


There are two separate things here:

1) Autofoucs for a device that doesn't use subdev API
2) libv4l2 support for devices that require MC and subdev API


Actually there are three: 0) autogain. Unfortunately, I need autogain
first before autofocus has a chance...

And that means... bayer10 support for autogain.

Plus, I changed avg_lum to long long. Quick calculation tells me int
could overflow with few megapixel sensor.

Oh, btw http://ytse.tricolour.net/docs/LowLightOptimization.html no
longer works.

Regards,
Pavel

diff --git a/lib/libv4lconvert/processing/autogain.c 
b/lib/libv4lconvert/processing/autogain.c
index c6866d6..0b52d0f 100644
--- a/lib/libv4lconvert/processing/autogain.c
+++ b/lib/libv4lconvert/processing/autogain.c
@@ -68,6 +71,41 @@ static void autogain_adjust(struct v4l2_queryctrl *ctrl, int 
*value,
}
}

+static int get_luminosity_bayer10(uint16_t *buf, const struct v4l2_format *fmt)
+{
+   long long avg_lum = 0;
+   int x, y;
+   
+   buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
+   fmt->fmt.pix.width / 4;
+
+   for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
+   for (x = 0; x < fmt->fmt.pix.width / 2; x++)


That would take some time :). AIUI, we have NEON support in ARM kernels
(CONFIG_KERNEL_MODE_NEON), I wonder if it makes sense (me) to convert the
above loop to NEON-optimized when it comes to it? Are there any drawbacks in
using NEON code in kernel?


Well, thanks for offer. This is actualy libv4l2.



Oh, somehow I got confused that this is kernel code :)


But I'd say NEON conversion is not neccessary anytime soon. First,
this is just trying to get average luminosity. We can easily skip
quite a lot of pixels, and still get reasonable answer.

Second, omap3isp actually has a hardware block computing statistics
for us. We just don't use it for simplicity.



Right, I forgot about that.


(But if you want to play with camera, I'll get you patches; there's
ton of work to be done, both kernel and userspace :-).


Well, I saw a low hanging fruit I thought I can convert to NEON in a day 
or two, while having some rest from the huge "project" I am devoting all 
my spare time recently (rebasing hildon/maemo 5 on top of devuan 
Jessie). Still, if there is something relatively small to be done, just 
email me and I'll have a look.


Regards,
Ivo


Re: [patch] autogain support for bayer10 format (was Re: [patch] propagating controls in libv4l2)

2017-04-26 Thread Pavel Machek
Hi!

> >>There are two separate things here:
> >>
> >>1) Autofoucs for a device that doesn't use subdev API
> >>2) libv4l2 support for devices that require MC and subdev API
> >
> >Actually there are three: 0) autogain. Unfortunately, I need autogain
> >first before autofocus has a chance...
> >
> >And that means... bayer10 support for autogain.
> >
> >Plus, I changed avg_lum to long long. Quick calculation tells me int
> >could overflow with few megapixel sensor.
> >
> >Oh, btw http://ytse.tricolour.net/docs/LowLightOptimization.html no
> >longer works.
> >
> >Regards,
> > Pavel
> >
> >diff --git a/lib/libv4lconvert/processing/autogain.c 
> >b/lib/libv4lconvert/processing/autogain.c
> >index c6866d6..0b52d0f 100644
> >--- a/lib/libv4lconvert/processing/autogain.c
> >+++ b/lib/libv4lconvert/processing/autogain.c
> >@@ -68,6 +71,41 @@ static void autogain_adjust(struct v4l2_queryctrl *ctrl, 
> >int *value,
> > }
> > }
> >
> >+static int get_luminosity_bayer10(uint16_t *buf, const struct v4l2_format 
> >*fmt)
> >+{
> >+long long avg_lum = 0;
> >+int x, y;
> >+
> >+buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
> >+fmt->fmt.pix.width / 4;
> >+
> >+for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
> >+for (x = 0; x < fmt->fmt.pix.width / 2; x++)
> 
> That would take some time :). AIUI, we have NEON support in ARM kernels
> (CONFIG_KERNEL_MODE_NEON), I wonder if it makes sense (me) to convert the
> above loop to NEON-optimized when it comes to it? Are there any drawbacks in
> using NEON code in kernel?

Well, thanks for offer. This is actualy libv4l2.

But I'd say NEON conversion is not neccessary anytime soon. First,
this is just trying to get average luminosity. We can easily skip
quite a lot of pixels, and still get reasonable answer.

Second, omap3isp actually has a hardware block computing statistics
for us. We just don't use it for simplicity.

(But if you want to play with camera, I'll get you patches; there's
ton of work to be done, both kernel and userspace :-).

Regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [patch] autogain support for bayer10 format (was Re: [patch] propagating controls in libv4l2)

2017-04-26 Thread Ivaylo Dimitrov

Hi,

On 26.04.2017 16:23, Pavel Machek wrote:

Hi!


I don't see why it would be hard to open files or have threads inside
a library. There are several libraries that do that already, specially
the ones designed to be used on multimidia apps.


Well, This is what the libv4l2 says:

   This file implements libv4l2, which offers v4l2_ prefixed versions
   of
  open/close/etc. The API is 100% the same as directly opening
   /dev/videoX
  using regular open/close/etc, the big difference is that format
   conversion

but if I open additional files in v4l2_open(), API is no longer the
same, as unix open() is defined to open just one file descriptor.

Now. There is autogain support in libv4lconvert, but it expects to use
same fd for camera and for the gain... which does not work with
subdevs.

Of course, opening subdevs by name like this is not really
acceptable. But can you suggest a method that is?


There are two separate things here:

1) Autofoucs for a device that doesn't use subdev API
2) libv4l2 support for devices that require MC and subdev API


Actually there are three: 0) autogain. Unfortunately, I need autogain
first before autofocus has a chance...

And that means... bayer10 support for autogain.

Plus, I changed avg_lum to long long. Quick calculation tells me int
could overflow with few megapixel sensor.

Oh, btw http://ytse.tricolour.net/docs/LowLightOptimization.html no
longer works.

Regards,
Pavel

diff --git a/lib/libv4lconvert/processing/autogain.c 
b/lib/libv4lconvert/processing/autogain.c
index c6866d6..0b52d0f 100644
--- a/lib/libv4lconvert/processing/autogain.c
+++ b/lib/libv4lconvert/processing/autogain.c
@@ -68,6 +71,41 @@ static void autogain_adjust(struct v4l2_queryctrl *ctrl, int 
*value,
}
 }

+static int get_luminosity_bayer10(uint16_t *buf, const struct v4l2_format *fmt)
+{
+   long long avg_lum = 0;
+   int x, y;
+   
+   buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
+   fmt->fmt.pix.width / 4;
+
+   for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
+   for (x = 0; x < fmt->fmt.pix.width / 2; x++)


That would take some time :). AIUI, we have NEON support in ARM kernels 
(CONFIG_KERNEL_MODE_NEON), I wonder if it makes sense (me) to convert 
the above loop to NEON-optimized when it comes to it? Are there any 
drawbacks in using NEON code in kernel?



+   avg_lum += *buf++;
+   buf += fmt->fmt.pix.bytesperline - fmt->fmt.pix.width / 2;
+   }
+   avg_lum /= fmt->fmt.pix.height * fmt->fmt.pix.width / 4;
+   avg_lum /= 4;
+   return avg_lum;
+}
+
+static int get_luminosity_bayer8(unsigned char *buf, const struct v4l2_format 
*fmt)
+{
+   long long avg_lum = 0;
+   int x, y;
+   
+   buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
+   fmt->fmt.pix.width / 4;
+
+   for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
+   for (x = 0; x < fmt->fmt.pix.width / 2; x++)


ditto.


+   avg_lum += *buf++;
+   buf += fmt->fmt.pix.bytesperline - fmt->fmt.pix.width / 2;
+   }
+   avg_lum /= fmt->fmt.pix.height * fmt->fmt.pix.width / 4;
+   return avg_lum;
+}
+
 /* auto gain and exposure algorithm based on the knee algorithm described here:
 http://ytse.tricolour.net/docs/LowLightOptimization.html */
 static int autogain_calculate_lookup_tables(
@@ -100,17 +142,16 @@ static int autogain_calculate_lookup_tables(
switch (fmt->fmt.pix.pixelformat) {
+   case V4L2_PIX_FMT_SGBRG10:
+   case V4L2_PIX_FMT_SGRBG10:
+   case V4L2_PIX_FMT_SBGGR10:
+   case V4L2_PIX_FMT_SRGGB10:
+   avg_lum = get_luminosity_bayer10((void *) buf, fmt);
+   break;
+
case V4L2_PIX_FMT_SGBRG8:
case V4L2_PIX_FMT_SGRBG8:
case V4L2_PIX_FMT_SBGGR8:
case V4L2_PIX_FMT_SRGGB8:
-   buf += fmt->fmt.pix.height * fmt->fmt.pix.bytesperline / 4 +
-   fmt->fmt.pix.width / 4;
-
-   for (y = 0; y < fmt->fmt.pix.height / 2; y++) {
-   for (x = 0; x < fmt->fmt.pix.width / 2; x++)
-   avg_lum += *buf++;
-   buf += fmt->fmt.pix.bytesperline - fmt->fmt.pix.width / 
2;
-   }
-   avg_lum /= fmt->fmt.pix.height * fmt->fmt.pix.width / 4;
+   avg_lum = get_luminosity_bayer8(buf, fmt);
break;

case V4L2_PIX_FMT_RGB24:
diff --git a/lib/libv4lconvert/processing/libv4lprocessing.c 
b/lib/libv4lconvert/processing/libv4lprocessing.c
index b061f50..b98d024 100644
--- a/lib/libv4lconvert/processing/libv4lprocessing.c
+++ b/lib/libv4lconvert/processing/libv4lprocessing.c
@@ -164,6 +165,10 @@ void v4lprocessing_processing(struct v4lprocessing_data 
*data,
case V4L2_PIX_FMT_SGRBG8:
case V4L2_PIX_FMT_SBGGR8:
case