Re: cx18 fix patches

2010-01-29 Thread Devin Heitmueller
Hi Andy,

On Thu, Jan 28, 2010 at 9:24 PM, Andy Walls awa...@radix.net wrote:
 Devin,

 I found interesting system interactions.  On my dual core x86_64 Fedora
 12 machine loading an HVR-1600 cold (no firmware has been loaded yet),
 the pulseaudio daemon opens up a CX23418 ALSA node almost immediately
 after it appears and has these effects:

 1. Pulseaudio tries to perform some sort of op that starts a capture on
 the PCM stream before the APU and CPU firmware has finished loading.
 This results in error messages in the log and probably an undesirable
 driver state, if there was never any firmware loaded prior - such as at
 power up.

I'm a little surprised by that, since the cx18-alsa module is only
initialized after the rest of the cx18 driver is loaded.

 2. Pulseaudio grabs the ALSA control node for the CX23418 and won't let
 go.  If I kill the Pulseaudio process that has the node open, it just
 respawns and grabs the control node again.  This prevents unloading the
 cx18-alsa and cx18 module.

As far as I know, this is one of those dumb Pulseaudio things.
Doesn't it do this with all PCI cards that provide ALSA?

 3. If Pulseaudio also keeps the PCM analog stream going, then TV image
 settings are fixed to the values at the time Pulseaudio started the
 stream.  I don't think it does, but I'm not sure yet.

I know that Pulseaudio binds to the device, but as far as I know it
does not actually open the PCM device for streaming.

 My off the cuff ideas for fixes are:

 1. Integrate cx18-alsa functions into the driver and no longer have it
 as a module, to better coordinate firmware loading with the ALSA nodes.
 (The modular architecture appears to have been a bad choice on my part.)

I'm not against merging the two into a single module, although it's
not clear to me that it will help with the issues you are seeing.

 2. Add a module option to disable setting up the cx18-alsa device nodes.

I can see some value in such an option in general for debugging
purposes, although I don't think it provides a whole lot of value for
regular users who would not normally have it enabled.


 I'll try to work on these this Friday and Saturday.

I will be out of town this weekend, but if you send me email I will
try to respond as promptly as possible.

Devin

-- 
Devin J. Heitmueller - Kernel Labs
http://www.kernellabs.com
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-29 Thread Mauro Carvalho Chehab
Devin Heitmueller wrote:
 Hi Andy,
 
 On Thu, Jan 28, 2010 at 9:24 PM, Andy Walls awa...@radix.net wrote:
 Devin,

 I found interesting system interactions.  On my dual core x86_64 Fedora
 12 machine loading an HVR-1600 cold (no firmware has been loaded yet),
 the pulseaudio daemon opens up a CX23418 ALSA node almost immediately
 after it appears and has these effects:

 1. Pulseaudio tries to perform some sort of op that starts a capture on
 the PCM stream before the APU and CPU firmware has finished loading.
 This results in error messages in the log and probably an undesirable
 driver state, if there was never any firmware loaded prior - such as at
 power up.
 
 I'm a little surprised by that, since the cx18-alsa module is only
 initialized after the rest of the cx18 driver is loaded.

This is a problem that may affect all drivers: just after registering a
device, udev (and other userspace tools) may try to use it. I doubt that making
cx18-alsa part of cx18 would fix this issue. Also, it tends to became worse:
as the number of CPU cores are increasing, the probability for reaching such 
race
condition increases.

The proper solution is to lock the driver while it is not completely 
initialized,
or to delay the alsa registration to happen after having all firmware loaded.

 2. Pulseaudio grabs the ALSA control node for the CX23418 and won't let
 go.  If I kill the Pulseaudio process that has the node open, it just
 respawns and grabs the control node again.  This prevents unloading the
 cx18-alsa and cx18 module.
 
 As far as I know, this is one of those dumb Pulseaudio things.
 Doesn't it do this with all PCI cards that provide ALSA?

All cards that provide alsa support have this trouble, even without pulseaudio.
kmixer does the same thing: when a new mixer is detected, it holds the mixer 
opened,
preventing module unloading.

 3. If Pulseaudio also keeps the PCM analog stream going, then TV image
 settings are fixed to the values at the time Pulseaudio started the
 stream.  I don't think it does, but I'm not sure yet.
 
 I know that Pulseaudio binds to the device, but as far as I know it
 does not actually open the PCM device for streaming.

Probably, it holds open just the mixer.

 My off the cuff ideas for fixes are:

 1. Integrate cx18-alsa functions into the driver and no longer have it
 as a module, to better coordinate firmware loading with the ALSA nodes.
 (The modular architecture appears to have been a bad choice on my part.)
 
 I'm not against merging the two into a single module, although it's
 not clear to me that it will help with the issues you are seeing.

I doubt it would solve. IMO, having it modular is good, since you may not
need cx18 alsa on all devices.
 
-- 

Cheers,
Mauro
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-29 Thread Devin Heitmueller
On Fri, Jan 29, 2010 at 1:40 PM, Mauro Carvalho Chehab
mche...@redhat.com wrote:
 I doubt it would solve. IMO, having it modular is good, since you may not
 need cx18 alsa on all devices.

Modularity is good, but we really need to rethink about the way we are
loading these modules (this applies to dvb as well).  For example, on
em28xx, the dvb module is often getting loaded while at the same that
hald is connecting to the v4l2 device (resulting in i2c errors while
attempting to talk to tvp5150).  A simple initialization lock would
seem like a good idea, except that doesn't really work because the
em28xx submodules get loaded asynchronously.  And the problem isn't
specific to em28xx by any means.  I've hit comparable bugs in cx88.

If we didn't load the modules asynchronously, then at least we would
be able to hold the lock throughout the entire device initialization
(ensuring that nobody can connect to the v4l2 device while the dvb and
alsa drivers are initializing).  Sure, it in theory adds a second or
two to the module load (depending on the device), but we would have a
much simpler model that would be less prone to race conditions.  We
would also lose the ability to modprobe the dvb module after-the-fact
(and expect it to bind to existing devices), but I don't really think
that would be a big deal since everything is auto-detected anyway.  In
fact, it might actually be a good thing given the number of times I've
had to explain to people that you cannot do modprobe em28xx-dvb on
an unsupported device and expect it to magically start working.

I didn't mean to hijack the thread, but I'm just trying to point out
that this is a pretty widespread problem, and not specific to cx18.

Devin

-- 
Devin J. Heitmueller - Kernel Labs
http://www.kernellabs.com
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-29 Thread Mauro Carvalho Chehab
Devin Heitmueller wrote:
 On Fri, Jan 29, 2010 at 1:40 PM, Mauro Carvalho Chehab
 mche...@redhat.com wrote:
 I doubt it would solve. IMO, having it modular is good, since you may not
 need cx18 alsa on all devices.
 
 Modularity is good, but we really need to rethink about the way we are
 loading these modules (this applies to dvb as well).  For example, on
 em28xx, the dvb module is often getting loaded while at the same that
 hald is connecting to the v4l2 device (resulting in i2c errors while
 attempting to talk to tvp5150).  A simple initialization lock would
 seem like a good idea, except that doesn't really work because the
 em28xx submodules get loaded asynchronously.  And the problem isn't
 specific to em28xx by any means.  I've hit comparable bugs in cx88.
 
 If we didn't load the modules asynchronously, then at least we would
 be able to hold the lock throughout the entire device initialization
 (ensuring that nobody can connect to the v4l2 device while the dvb and
 alsa drivers are initializing).  Sure, it in theory adds a second or
 two to the module load (depending on the device), but we would have a
 much simpler model that would be less prone to race conditions.

The asynchronous load were added not to improve the boot load time, but to
avoid some troubles that happens when the load is synchronous. 
I don't remember what were the exact trouble, but I suspect that it was
something related to i2c. The result was that, sometimes, the driver
used to enter into a deadlock state (something like driver A waits for driver B
to load, but, as driver B needs functions provided by driver A, both are put
into sleep).

Also, reducing the driver load time is a good thing. The asynchronous load 
is very interesting for devices where the firmware load takes a very long time.

I love the fact that new kernels with new distros boot the machine on a few
seconds. This is thanks to some asynchronous loads that happen on several 
drivers. The removal of KBL from open/close/ioct2 is one of the reasons for 
those improvements. Of course, if the driver is not properly locked, it will 
cause race conditions.

Maybe one alternative would be to register the interfaces asynchronously
also, as a deferred task that is started only after the driver enters into
a sane state. 

For example, a kref may be used to indicate that there are init
tasks pending. Only after having kref zeroed, the driver registers.
As kref_put() automatically calls a routine when the usage count reaches
zero, it shouldn't be hard to implement such locking schema.



As the problem is common, the better is to provide a global way to avoid
device open while the initialization is not complete, at the v4l core.

Cheers,
Mauro
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-29 Thread Devin Heitmueller
On Fri, Jan 29, 2010 at 2:59 PM, Mauro Carvalho Chehab
mche...@redhat.com wrote:
 The asynchronous load were added not to improve the boot load time, but to
 avoid some troubles that happens when the load is synchronous.
 I don't remember what were the exact trouble, but I suspect that it was
 something related to i2c. The result was that, sometimes, the driver
 used to enter into a deadlock state (something like driver A waits for driver 
 B
 to load, but, as driver B needs functions provided by driver A, both are put
 into sleep).

It would be good if you could locate some specifics in terms of what
prompted this.

 Also, reducing the driver load time is a good thing. The asynchronous load
 is very interesting for devices where the firmware load takes a very long 
 time.

I do not believe that loading the module synchronously will have any
impact on the actual load time, since other modules can be loading in
parallel to the initialization of the em28xx device (regardless of
whether it is a single module, or three modules loading synchronously
or asynchronously).

Also, for xc3028 in particular, we could defer firmware loading until
first use like we do with xc5000 - doing the firmware load at driver
init isn't very useful anyway since we load the firmware and then
immediately and put the device to sleep.

 Maybe one alternative would be to register the interfaces asynchronously
 also, as a deferred task that is started only after the driver enters into
 a sane state.

Potentially.  I feel this should really only be done though in
response to an actual problem/bug.  Otherwise it adds additional
complexity with no real benefit.

 As the problem is common, the better is to provide a global way to avoid
 device open while the initialization is not complete, at the v4l core.

I would be in favor of this, although I am not sure how practical it
is given the diversity in the way different bridges are implemented.
Also, we would need to take into account how this would work with DVB,
since many of the races we run into are applications attempting to use
both the v4l and dvb interfaces of a hybrid device.

Devin

-- 
Devin J. Heitmueller - Kernel Labs
http://www.kernellabs.com
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-28 Thread Andy Walls
On Thu, 2010-01-28 at 00:40 -0200, Mauro Carvalho Chehab wrote:
 Hi Andy,
 
 I've made two fix patches to solve the issues with cx18 compilation.
 My original intention were to send you an email for your ack.
 
 Unfortunately, those got added at the wrong branch and went upstream.
 
 That proofs that my scripts aren't reliable yet, and that I need
 an independent tree for such patches... I hope I have enough disk for all
 those trees...

I understand.


 As we can't rebase the -git tree without breaking the replicas,
 I'd like you to review the patches:
 
 http://git.linuxtv.org/v4l-dvb.git?a=commit;h=701ca4249401fe9705a66ad806e933f15cb42489
 http://git.linuxtv.org/v4l-dvb.git?a=commit;h=dd01705f6a6f732ca95d20959a90dd46482530df
 
 If a committed patch is bad, the remaining solution is to write a patch 
 reverting
 it, and generating some dirty at the git logs.
 
 So, I hope both patches are ok...
 
 Please test.

I had coordinated with Devin on IRC and was going to work up fixes
tonight. 

Now I'll just review and test tonight (some time between 6:00 - 10:30
p.m. EST)


 Sorry for the mess.

No problem.

Regards,
Andy

 Cheers,
 Mauro.


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-28 Thread Mauro Carvalho Chehab
Andy Walls wrote:
 Now I'll just review and test tonight (some time between 6:00 - 10:30
 p.m. EST)

One more error (on x86_64):

drivers/media/video/cx18/cx18-alsa-pcm.c: In function 
‘cx18_alsa_announce_pcm_data’:
drivers/media/video/cx18/cx18-alsa-pcm.c:82: warning: format ‘%d’ expects type 
‘int’, but argument 5 has type ‘size_t’

You should use %zu for size_t.

Cheers,
Mauro
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-28 Thread Andy Walls
On Thu, 2010-01-28 at 00:40 -0200, Mauro Carvalho Chehab wrote:
 Hi Andy,
 
 I've made two fix patches to solve the issues with cx18 compilation.
 My original intention were to send you an email for your ack.
 
 Unfortunately, those got added at the wrong branch and went upstream.
 
 That proofs that my scripts aren't reliable yet, and that I need
 an independent tree for such patches... I hope I have enough disk for all
 those trees...
 
 As we can't rebase the -git tree without breaking the replicas,
 I'd like you to review the patches:
 
 http://git.linuxtv.org/v4l-dvb.git?a=commit;h=701ca4249401fe9705a66ad806e933f15cb42489
 http://git.linuxtv.org/v4l-dvb.git?a=commit;h=dd01705f6a6f732ca95d20959a90dd46482530df
 
 If a committed patch is bad, the remaining solution is to write a patch 
 reverting
 it, and generating some dirty at the git logs.
 
 So, I hope both patches are ok...

Mauro,

By visual inspection, compilation test, and module loading test on a
kernel configured to be modular the patches are OK.

I did not test with them statically recompiled in the kernel, but by
inspection, they should be OK.


Devin,

I found interesting system interactions.  On my dual core x86_64 Fedora
12 machine loading an HVR-1600 cold (no firmware has been loaded yet),
the pulseaudio daemon opens up a CX23418 ALSA node almost immediately
after it appears and has these effects:

1. Pulseaudio tries to perform some sort of op that starts a capture on
the PCM stream before the APU and CPU firmware has finished loading.
This results in error messages in the log and probably an undesirable
driver state, if there was never any firmware loaded prior - such as at
power up.

2. Pulseaudio grabs the ALSA control node for the CX23418 and won't let
go.  If I kill the Pulseaudio process that has the node open, it just
respawns and grabs the control node again.  This prevents unloading the
cx18-alsa and cx18 module.

3. If Pulseaudio also keeps the PCM analog stream going, then TV image
settings are fixed to the values at the time Pulseaudio started the
stream.  I don't think it does, but I'm not sure yet.


My off the cuff ideas for fixes are:

1. Integrate cx18-alsa functions into the driver and no longer have it
as a module, to better coordinate firmware loading with the ALSA nodes.
(The modular architecture appears to have been a bad choice on my part.)

2. Add a module option to disable setting up the cx18-alsa device nodes.


I'll try to work on these this Friday and Saturday.

Regards,
Andy

--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-28 Thread Andy Walls
On Fri, 2010-01-29 at 00:08 -0200, Mauro Carvalho Chehab wrote:
 Andy Walls wrote:
  Now I'll just review and test tonight (some time between 6:00 - 10:30
  p.m. EST)
 
 One more error (on x86_64):
 
 drivers/media/video/cx18/cx18-alsa-pcm.c: In function 
 ‘cx18_alsa_announce_pcm_data’:
 drivers/media/video/cx18/cx18-alsa-pcm.c:82: warning: format ‘%d’ expects 
 type ‘int’, but argument 5 has type ‘size_t’
 
 You should use %zu for size_t.

Yes, I saw it.

I'll handle it this weekend with some other cx18 fixes.  I'll have to
give you changes via -hg or as patches posted to the list, as I don't
have a -git clone yet.

Regards,
Andy

 Cheers,
 Mauro
 

--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: cx18 fix patches

2010-01-28 Thread Mauro Carvalho Chehab
Andy Walls wrote:
 On Thu, 2010-01-28 at 00:40 -0200, Mauro Carvalho Chehab wrote:
 Hi Andy,

 I've made two fix patches to solve the issues with cx18 compilation.
 My original intention were to send you an email for your ack.

 Unfortunately, those got added at the wrong branch and went upstream.

 That proofs that my scripts aren't reliable yet, and that I need
 an independent tree for such patches... I hope I have enough disk for all
 those trees...

 As we can't rebase the -git tree without breaking the replicas,
 I'd like you to review the patches:

 http://git.linuxtv.org/v4l-dvb.git?a=commit;h=701ca4249401fe9705a66ad806e933f15cb42489
 http://git.linuxtv.org/v4l-dvb.git?a=commit;h=dd01705f6a6f732ca95d20959a90dd46482530df

 If a committed patch is bad, the remaining solution is to write a patch 
 reverting
 it, and generating some dirty at the git logs.

 So, I hope both patches are ok...
 
 Mauro,
 
 By visual inspection, compilation test, and module loading test on a
 kernel configured to be modular the patches are OK.
 
 I did not test with them statically recompiled in the kernel, but by
 inspection, they should be OK.

Thanks for the test.

I did the compilations and the errors disappeared. The only remaining
one is that %d instead of %zd that appears with x86_64 (I sent
you a report earlier today).

Cheers,
Mauro.
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html