Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
At Thu, 8 Nov 2012 10:55:50 -0500 (EST),
Alan Stern wrote:
>
> On Thu, 8 Nov 2012, Takashi Iwai wrote:
>
> > At Thu, 08 Nov 2012 08:31:35 +0100,
> > Daniel Mack wrote:
> > (snip)
> > > >> We can't simply stop both endpoints in the prepare callback.
> > > >
> > > > The new function doesn't stop the stream by itself but it just syncs
> > > > if the stream is being stopped beforehand.  So, it's safe to call it
> > > > there.
> > > >
> > > > Maybe the name was confusing.  It should have been like
> > > > snd_usb_endpoint_sync_pending_stop() or such.
> > >
> > > Ah, right. I was erroneously looking closer to Alan's patch but then
> > > replied to yours. Alright then - thanks for explaining :)
> >
> > OK, thanks for checking.
> >
> > FWIW, below is the patch I have now applied to the for-linus branch.
> > Renamed the function, added the comment and put a NULL check into the
> > function to simplify.
>
> Thanks for fixing this.  Is your patch marked for -stable?

Yes.  I'm going to send a pull request to Linus tomorrow.

> I have submitted a patch for ehci-hcd, so we should be all set.

OK, thanks!


Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Thu, 8 Nov 2012, Takashi Iwai wrote:

> At Thu, 08 Nov 2012 08:31:35 +0100,
> Daniel Mack wrote:
> (snip)
> > >> We can't simply stop both endpoints in the prepare callback.
> > >
> > > The new function doesn't stop the stream by itself but it just syncs
> > > if the stream is being stopped beforehand.  So, it's safe to call it
> > > there.
> > >
> > > Maybe the name was confusing.  It should have been like
> > > snd_usb_endpoint_sync_pending_stop() or such.
> >
> > Ah, right. I was erroneously looking closer to Alan's patch but then
> > replied to yours. Alright then - thanks for explaining :)
>
> OK, thanks for checking.
>
> FWIW, below is the patch I have now applied to the for-linus branch.
> Renamed the function, added the comment and put a NULL check into the
> function to simplify.

Thanks for fixing this.  Is your patch marked for -stable?

I have submitted a patch for ehci-hcd, so we should be all set.

Alan Stern
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
At Thu, 08 Nov 2012 08:31:35 +0100,
Daniel Mack wrote:
(snip)
> >> We can't simply stop both endpoints in the prepare callback.
> >
> > The new function doesn't stop the stream by itself but it just syncs
> > if the stream is being stopped beforehand.  So, it's safe to call it
> > there.
> >
> > Maybe the name was confusing.  It should have been like
> > snd_usb_endpoint_sync_pending_stop() or such.
>
> Ah, right. I was erroneously looking closer to Alan's patch but then
> replied to yours. Alright then - thanks for explaining :)

OK, thanks for checking.

FWIW, below is the patch I have now applied to the for-linus branch.
Renamed the function, added the comment and put a NULL check into the
function to simplify.


Takashi

---
From: Takashi Iwai <ti...@suse.de>
Subject: [PATCH] ALSA: usb-audio: Fix crash at re-preparing the PCM stream

There are bug reports of a crash with USB-audio devices when PCM
prepare is performed immediately after the stream is stopped via
trigger callback.  It turned out that the problem is that we don't
wait until all URBs are killed.

This patch adds a new function to synchronize the pending stop
operation on an endpoint, and calls it in the prepare callback to
avoid the crash above.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=49181
Reported-and-tested-by: Artem S. Tashkinov <t.ar...@lycos.com>
Cc: <sta...@vger.kernel.org> [v3.6]
Signed-off-by: Takashi Iwai <ti...@suse.de>
---
 sound/usb/endpoint.c | 13 +++++++++++++
 sound/usb/endpoint.h |  1 +
 sound/usb/pcm.c      |  3 +++
 3 files changed, 17 insertions(+)

diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 7f78c6d..34de6f2 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -35,6 +35,7 @@

 #define EP_FLAG_ACTIVATED	0
 #define EP_FLAG_RUNNING		1
+#define EP_FLAG_STOPPING	2

 /*
  * snd_usb_endpoint is a model that abstracts everything related to an
@@ -502,10 +503,20 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
 	if (alive)
 		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
 					alive, ep->ep_num);
+	clear_bit(EP_FLAG_STOPPING, &ep->flags);

 	return 0;
 }

+/* sync the pending stop operation;
+ * this function itself doesn't trigger the stop operation
+ */
+void snd_usb_endpoint_sync_pending_stop(struct snd_usb_endpoint *ep)
+{
+	if (ep && test_bit(EP_FLAG_STOPPING, &ep->flags))
+		wait_clear_urbs(ep);
+}
+
 /*
  * unlink active urbs.
  */
@@ -918,6 +929,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,

 		if (wait)
 			wait_clear_urbs(ep);
+		else
+			set_bit(EP_FLAG_STOPPING, &ep->flags);
 	}
 }

diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
index 6376ccf..3d4c970 100644
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -19,6 +19,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
 int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
 void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
 			   int force, int can_sleep, int wait);
+void snd_usb_endpoint_sync_pending_stop(struct snd_usb_endpoint *ep);
 int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
 int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_free(struct list_head *head);
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index 37428f7..5c12a3f 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -568,6 +568,9 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
 		goto unlock;
 	}

+	snd_usb_endpoint_sync_pending_stop(subs->sync_endpoint);
+	snd_usb_endpoint_sync_pending_stop(subs->data_endpoint);
+
 	ret = set_format(subs, subs->cur_audiofmt);
 	if (ret < 0)
 		goto unlock;
--
1.8.0
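The flag handling in the patch above can be modelled in plain userspace C. This is only a rough sketch, not kernel code: the `model_*` names, the `waits` counter and the simplified "all URBs complete at once" wait are assumptions made for illustration. It shows that a stop without wait leaves a sync owed, that the sync helper waits only when a stop is pending, and that a NULL endpoint is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; the model_* names are hypothetical, not kernel API. */
#define EP_FLAG_STOPPING 2

struct model_ep {
	unsigned long flags;	/* bit EP_FLAG_STOPPING: a stop is still pending */
	int active_urbs;	/* URBs not yet given back */
	int waits;		/* how many times we actually waited */
};

/* Stand-in for wait_clear_urbs(): pretend every URB has completed. */
static void model_wait_clear_urbs(struct model_ep *ep)
{
	assert(ep != NULL);
	ep->active_urbs = 0;
	ep->waits++;
	ep->flags &= ~(1UL << EP_FLAG_STOPPING);
}

/* A stop without wait only records that a sync is still owed. */
static void model_stop(struct model_ep *ep, bool wait)
{
	assert(ep != NULL);
	if (wait)
		model_wait_clear_urbs(ep);
	else
		ep->flags |= 1UL << EP_FLAG_STOPPING;
}

/* Waits only if a stop is pending; a NULL endpoint is ignored. */
static void model_sync_pending_stop(struct model_ep *ep)
{
	if (ep && (ep->flags & (1UL << EP_FLAG_STOPPING)))
		model_wait_clear_urbs(ep);
}

/* Refuses to resubmit while URBs are still outstanding. */
static bool model_start(struct model_ep *ep, int nurbs)
{
	assert(ep != NULL);
	if (ep->active_urbs > 0)
		return false;
	ep->active_urbs = nurbs;
	return true;
}
```

In this model, calling `model_stop(&ep, false)` followed directly by `model_start()` fails, while inserting `model_sync_pending_stop(&ep)` in between (the role the prepare callback plays in the patch) lets the restart succeed.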
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 08.11.2012 07:43, Takashi Iwai wrote:
> At Thu, 08 Nov 2012 01:42:59 +0100,
> Daniel Mack wrote:
>>
>> On 07.11.2012 20:19, Takashi Iwai wrote:
>>> At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
>>> Alan Stern wrote:
>>>>
>>>> On Mon, 5 Nov 2012, Christof Meerwald wrote:
>>>>
>>>>> BTW, I have been able to reproduce the problem on a completely
>>>>> different machine (also running Ubuntu 12.10, but different hardware).
>>>>> The important thing appears to be that the USB audio device is
>>>>> connected via a USB 2.0 hub (and then using the test code posted in
>>>>> http://pastebin.com/aHGe1S1X specifying the audio device as
>>>>> "plughw:Set" (or whatever it's called) seems to trigger the freeze).
>>>>
>>>> Christof: Thank you for that reference, it was a big help.  After
>>>> crashing my system many times I have tracked the problem, at least in
>>>> part.  The patch below should prevent your system from freezing.
>>>>
>>>>
>>>> Takashi: It turns out that the problem is triggered when the audio
>>>> subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
>>>> snd_usb_endpoint_start().  Since the driver doesn't wait for the
>>>> outstanding URBs to finish, it tries to submit them again while they
>>>> are still active.
>>>>
>>>> Normally the USB core would realize this and fail the submission, but a
>>>> bug in ehci-hcd prevented this from happening.  (That bug is what the
>>>> patch below fixes.)  The URB gets added to the active list twice,
>>>> resulting in list corruption and an oops in interrupt context, which
>>>> freezes the system.
>>>>
>>>> The user program that triggers the problem basically looks like this:
>>>>
>>>> 	snd_pcm_prepare(rec_pcm);
>>>> 	snd_pcm_start(rec_pcm);
>>>> 	snd_pcm_drop(rec_pcm);
>>>>
>>>> 	snd_pcm_prepare(rec_pcm);
>>>> 	snd_pcm_start(rec_pcm);
>>>>
>>>> The snd_pcm_drop call unlinks the URBs but does not wait for them to
>>>> finish.  Then the second snd_pcm_start call submits the URBs before
>>>> they have finished.
>>
>>
>> Thanks for investigating this and to everyone who so quickly tested
>> the provided patch. Seems like we got the right idea where the problem
>> really is.
>>
>> However, the proposed patch seems wrong to me (see below).
>>
>>>> What is the right solution for this problem?
>>>
>>> How about the patch below?  (It's for 3.6, and won't be applied cleanly
>>> to 3.7, but easy to adapt.)
>>>
>>>
>>> Takashi
>>>
>>> ---
>>> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
>>> index d9de667..38830e2 100644
>>> --- a/sound/usb/endpoint.c
>>> +++ b/sound/usb/endpoint.c
>>> @@ -35,6 +35,7 @@
>>>
>>>  #define EP_FLAG_ACTIVATED	0
>>>  #define EP_FLAG_RUNNING		1
>>> +#define EP_FLAG_STOPPING	2
>>>
>>>  /*
>>>   * snd_usb_endpoint is a model that abstracts everything related to an
>>> @@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
>>>  	if (alive)
>>>  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
>>>  					alive, ep->ep_num);
>>> +	clear_bit(EP_FLAG_STOPPING, &ep->flags);
>>>
>>>  	return 0;
>>>  }
>>>
>>> +/* wait until urbs are really dropped */
>>> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
>>> +{
>>> +	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
>>> +		wait_clear_urbs(ep);
>>> +}
>>> +
>>> +
>>>  /*
>>>   * unlink active urbs.
>>>   */
>>> @@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>>>
>>>  		if (wait)
>>>  			wait_clear_urbs(ep);
>>> +		else
>>> +			set_bit(EP_FLAG_STOPPING, &ep->flags);
>>>  	}
>>>  }
>>>
>>> diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
>>> index cbbbdf2..c1540a4 100644
>>> --- a/sound/usb/endpoint.h
>>> +++ b/sound/usb/endpoint.h
>>> @@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
>>>  int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
>>>  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>>>  			   int force, int can_sleep, int wait);
>>> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
>>>  int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
>>>  int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
>>>  void snd_usb_endpoint_free(struct list_head *head);
>>> diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
>>> index f782ce1..aee3ab0 100644
>>> --- a/sound/usb/pcm.c
>>> +++ b/sound/usb/pcm.c
>>> @@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
>>>  	if (snd_BUG_ON(!subs->data_endpoint))
>>>  		return -EIO;
>>>
>>> +	if (subs->sync_endpoint)
>>> +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
>>> +	if (subs->data_endpoint)
>>> +		snd_usb_endpoint_sync_stop(subs->data_endpoint);
>>
>> We can't simply stop both endpoints in the prepare callback.
>
> The new function doesn't stop the
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
At Wed, 7 Nov 2012 15:37:17 -0500 (EST),
Alan Stern wrote:
>
> On Wed, 7 Nov 2012, Takashi Iwai wrote:
>
> > > What is the right solution for this problem?
> >
> > How about the patch below?  (It's for 3.6, and won't be applied cleanly
> > to 3.7, but easy to adapt.)
>
> I simplified your patch a little.

You can't drop the check for a stopping endpoint.
As Daniel pointed out, endpoints might still be running when it's called.
I already made a similar mistake in the past, so this patch is a
revised version with the check for pending operations.

> This is for 3.7, not 3.6.  I
> verified that it does fix the problem raised by the test program.
>
> If you think this is okay, I'll submit it officially.

Don't worry, my patch is also based on 3.7 :)
The 3.6 patch was provided just for convenience; testers seemed to have
3.6 systems.


thanks,

Takashi

>
> Alan Stern
>
>
>
> Index: usb-3.7/sound/usb/endpoint.h
> ===================================================================
> --- usb-3.7.orig/sound/usb/endpoint.h
> +++ usb-3.7/sound/usb/endpoint.h
> @@ -19,6 +19,7 @@ int snd_usb_endpoint_set_params(struct s
>  int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
>  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>  			   int force, int can_sleep, int wait);
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
>  int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
>  int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
>  void snd_usb_endpoint_free(struct list_head *head);
> Index: usb-3.7/sound/usb/pcm.c
> ===================================================================
> --- usb-3.7.orig/sound/usb/pcm.c
> +++ usb-3.7/sound/usb/pcm.c
> @@ -576,6 +576,11 @@ static int snd_usb_pcm_prepare(struct sn
>  		subs->need_setup_ep = false;
>  	}
>
> +	if (subs->sync_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> +	if (subs->data_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->data_endpoint);
> +
>  	/* some unit conversions in runtime */
>  	subs->data_endpoint->maxframesize =
>  		bytes_to_frames(runtime, subs->data_endpoint->maxpacksize);
> Index: usb-3.7/sound/usb/endpoint.c
> ===================================================================
> --- usb-3.7.orig/sound/usb/endpoint.c
> +++ usb-3.7/sound/usb/endpoint.c
> @@ -481,7 +481,7 @@ __exit_unlock:
>  /*
>   *  wait until all urbs are processed.
>   */
> -static int wait_clear_urbs(struct snd_usb_endpoint *ep)
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
>  {
>  	unsigned long end_time = jiffies + msecs_to_jiffies(1000);
>  	unsigned int i;
> @@ -502,8 +502,6 @@ static int wait_clear_urbs(struct snd_us
>  	if (alive)
>  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
>  					alive, ep->ep_num);
> -
> -	return 0;
>  }
>
>  /*
> @@ -556,7 +554,7 @@ static void release_urbs(struct snd_usb_
>
>  	/* stop urbs */
>  	deactivate_urbs(ep, force, 1);
> -	wait_clear_urbs(ep);
> +	snd_usb_endpoint_sync_stop(ep);
>
>  	for (i = 0; i < ep->nurbs; i++)
>  		release_urb_ctx(&ep->urb[i]);
> @@ -833,7 +831,7 @@ int snd_usb_endpoint_start(struct snd_us
>  	/* just to be sure */
>  	deactivate_urbs(ep, 0, can_sleep);
>  	if (can_sleep)
> -		wait_clear_urbs(ep);
> +		snd_usb_endpoint_sync_stop(ep);
>
>  	ep->active_mask = 0;
>  	ep->unlink_mask = 0;
> @@ -917,7 +915,7 @@ void snd_usb_endpoint_stop(struct snd_us
>  		ep->prepare_data_urb = NULL;
>
>  		if (wait)
> -			wait_clear_urbs(ep);
> +			snd_usb_endpoint_sync_stop(ep);
>  	}
>  }
>
> @@ -940,7 +938,7 @@ int snd_usb_endpoint_deactivate(struct s
>  		return -EINVAL;
>
>  	deactivate_urbs(ep, 1, 1);
> -	wait_clear_urbs(ep);
> +	snd_usb_endpoint_sync_stop(ep);
>
>  	if (ep->use_count != 0)
>  		return 0;
>
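One plausible reading of the objection above, sketched in userspace C: the quoted wait routine is bounded by a timeout, so calling it unconditionally on an endpoint that another user is still running would burn the whole timeout, whereas checking a pending-stop flag first skips the wait. Everything here is an assumption for illustration (the `wait_ep` struct, the tick counter standing in for scheduler sleeps, and the idea that one unlinked URB completes per tick); it is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one "tick" stands in for one short sleep. */
struct wait_ep {
	bool stopping;		/* models EP_FLAG_STOPPING */
	bool still_running;	/* another user keeps the URBs active */
	int active_urbs;
};

/* Bounded wait; returns the number of ticks spent waiting. */
static int model_wait(struct wait_ep *ep, int max_ticks)
{
	int ticks = 0;

	assert(ep != NULL);
	while (ep->active_urbs > 0 && ticks < max_ticks) {
		if (!ep->still_running)
			ep->active_urbs--;	/* an unlinked URB completes */
		ticks++;
	}
	ep->stopping = false;
	return ticks;
}

/* Variant without the flag check: always waits. */
static int sync_unconditional(struct wait_ep *ep, int max_ticks)
{
	return model_wait(ep, max_ticks);
}

/* Variant with the flag check: waits only for a pending stop. */
static int sync_if_pending(struct wait_ep *ep, int max_ticks)
{
	assert(ep != NULL);
	return ep->stopping ? model_wait(ep, max_ticks) : 0;
}
```

With a running endpoint and no pending stop, `sync_unconditional()` in this model spends all `max_ticks`, while `sync_if_pending()` returns immediately.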
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
At Thu, 08 Nov 2012 01:42:59 +0100,
Daniel Mack wrote:
>
> On 07.11.2012 20:19, Takashi Iwai wrote:
> > At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
> > Alan Stern wrote:
> >>
> >> On Mon, 5 Nov 2012, Christof Meerwald wrote:
> >>
> >>> BTW, I have been able to reproduce the problem on a completely
> >>> different machine (also running Ubuntu 12.10, but different hardware).
> >>> The important thing appears to be that the USB audio device is
> >>> connected via a USB 2.0 hub (and then using the test code posted in
> >>> http://pastebin.com/aHGe1S1X specifying the audio device as
> >>> "plughw:Set" (or whatever it's called) seems to trigger the freeze).
> >>
> >> Christof: Thank you for that reference, it was a big help.  After
> >> crashing my system many times I have tracked the problem, at least in
> >> part.  The patch below should prevent your system from freezing.
> >>
> >>
> >> Takashi: It turns out that the problem is triggered when the audio
> >> subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
> >> snd_usb_endpoint_start().  Since the driver doesn't wait for the
> >> outstanding URBs to finish, it tries to submit them again while they
> >> are still active.
> >>
> >> Normally the USB core would realize this and fail the submission, but a
> >> bug in ehci-hcd prevented this from happening.  (That bug is what the
> >> patch below fixes.)  The URB gets added to the active list twice,
> >> resulting in list corruption and an oops in interrupt context, which
> >> freezes the system.
> >>
> >> The user program that triggers the problem basically looks like this:
> >>
> >> 	snd_pcm_prepare(rec_pcm);
> >> 	snd_pcm_start(rec_pcm);
> >> 	snd_pcm_drop(rec_pcm);
> >>
> >> 	snd_pcm_prepare(rec_pcm);
> >> 	snd_pcm_start(rec_pcm);
> >>
> >> The snd_pcm_drop call unlinks the URBs but does not wait for them to
> >> finish.  Then the second snd_pcm_start call submits the URBs before
> >> they have finished.
>
>
> Thanks for investigating this and to everyone who so quickly tested
> the provided patch. Seems like we got the right idea where the problem
> really is.
>
> However, the proposed patch seems wrong to me (see below).
>
> >> What is the right solution for this problem?
> >
> > How about the patch below?  (It's for 3.6, and won't be applied cleanly
> > to 3.7, but easy to adapt.)
> >
> >
> > Takashi
> >
> > ---
> > diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> > index d9de667..38830e2 100644
> > --- a/sound/usb/endpoint.c
> > +++ b/sound/usb/endpoint.c
> > @@ -35,6 +35,7 @@
> >
> >  #define EP_FLAG_ACTIVATED	0
> >  #define EP_FLAG_RUNNING		1
> > +#define EP_FLAG_STOPPING	2
> >
> >  /*
> >   * snd_usb_endpoint is a model that abstracts everything related to an
> > @@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
> >  	if (alive)
> >  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
> >  					alive, ep->ep_num);
> > +	clear_bit(EP_FLAG_STOPPING, &ep->flags);
> >
> >  	return 0;
> >  }
> >
> > +/* wait until urbs are really dropped */
> > +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
> > +{
> > +	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
> > +		wait_clear_urbs(ep);
> > +}
> > +
> > +
> >  /*
> >   * unlink active urbs.
> >   */
> > @@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
> >
> >  		if (wait)
> >  			wait_clear_urbs(ep);
> > +		else
> > +			set_bit(EP_FLAG_STOPPING, &ep->flags);
> >  	}
> >  }
> >
> > diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
> > index cbbbdf2..c1540a4 100644
> > --- a/sound/usb/endpoint.h
> > +++ b/sound/usb/endpoint.h
> > @@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
> >  int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
> >  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
> >  			   int force, int can_sleep, int wait);
> > +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
> >  int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
> >  int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
> >  void snd_usb_endpoint_free(struct list_head *head);
> > diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
> > index f782ce1..aee3ab0 100644
> > --- a/sound/usb/pcm.c
> > +++ b/sound/usb/pcm.c
> > @@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
> >  	if (snd_BUG_ON(!subs->data_endpoint))
> >  		return -EIO;
> >
> > +	if (subs->sync_endpoint)
> > +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> > +	if (subs->data_endpoint)
> > +		snd_usb_endpoint_sync_stop(subs->data_endpoint);
>
> We can't simply stop both endpoints in the prepare callback.

The new function doesn't stop the stream by itself but it just syncs
if the stream is
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 07.11.2012 20:19, Takashi Iwai wrote:
> At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
> Alan Stern wrote:
>>
>> On Mon, 5 Nov 2012, Christof Meerwald wrote:
>>
>>> BTW, I have been able to reproduce the problem on a completely
>>> different machine (also running Ubuntu 12.10, but different hardware).
>>> The important thing appears to be that the USB audio device is
>>> connected via a USB 2.0 hub (and then using the test code posted in
>>> http://pastebin.com/aHGe1S1X specifying the audio device as
>>> "plughw:Set" (or whatever it's called) seems to trigger the freeze).
>>
>> Christof: Thank you for that reference, it was a big help.  After
>> crashing my system many times I have tracked the problem, at least in
>> part.  The patch below should prevent your system from freezing.
>>
>>
>> Takashi: It turns out that the problem is triggered when the audio
>> subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
>> snd_usb_endpoint_start().  Since the driver doesn't wait for the
>> outstanding URBs to finish, it tries to submit them again while they
>> are still active.
>>
>> Normally the USB core would realize this and fail the submission, but a
>> bug in ehci-hcd prevented this from happening.  (That bug is what the
>> patch below fixes.)  The URB gets added to the active list twice,
>> resulting in list corruption and an oops in interrupt context, which
>> freezes the system.
>>
>> The user program that triggers the problem basically looks like this:
>>
>> 	snd_pcm_prepare(rec_pcm);
>> 	snd_pcm_start(rec_pcm);
>> 	snd_pcm_drop(rec_pcm);
>>
>> 	snd_pcm_prepare(rec_pcm);
>> 	snd_pcm_start(rec_pcm);
>>
>> The snd_pcm_drop call unlinks the URBs but does not wait for them to
>> finish.  Then the second snd_pcm_start call submits the URBs before
>> they have finished.


Thanks for investigating this and to everyone who so quickly tested
the provided patch. Seems like we got the right idea where the problem
really is.

However, the proposed patch seems wrong to me (see below).

>> What is the right solution for this problem?
>
> How about the patch below?  (It's for 3.6, and won't be applied cleanly
> to 3.7, but easy to adapt.)
>
>
> Takashi
>
> ---
> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> index d9de667..38830e2 100644
> --- a/sound/usb/endpoint.c
> +++ b/sound/usb/endpoint.c
> @@ -35,6 +35,7 @@
>
>  #define EP_FLAG_ACTIVATED	0
>  #define EP_FLAG_RUNNING		1
> +#define EP_FLAG_STOPPING	2
>
>  /*
>   * snd_usb_endpoint is a model that abstracts everything related to an
> @@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
>  	if (alive)
>  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
>  					alive, ep->ep_num);
> +	clear_bit(EP_FLAG_STOPPING, &ep->flags);
>
>  	return 0;
>  }
>
> +/* wait until urbs are really dropped */
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
> +{
> +	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
> +		wait_clear_urbs(ep);
> +}
> +
> +
>  /*
>   * unlink active urbs.
>   */
> @@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>
>  		if (wait)
>  			wait_clear_urbs(ep);
> +		else
> +			set_bit(EP_FLAG_STOPPING, &ep->flags);
>  	}
>  }
>
> diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
> index cbbbdf2..c1540a4 100644
> --- a/sound/usb/endpoint.h
> +++ b/sound/usb/endpoint.h
> @@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
>  int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
>  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>  			   int force, int can_sleep, int wait);
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
>  int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
>  int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
>  void snd_usb_endpoint_free(struct list_head *head);
> diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
> index f782ce1..aee3ab0 100644
> --- a/sound/usb/pcm.c
> +++ b/sound/usb/pcm.c
> @@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
>  	if (snd_BUG_ON(!subs->data_endpoint))
>  		return -EIO;
>
> +	if (subs->sync_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> +	if (subs->data_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->data_endpoint);

We can't simply stop both endpoints in the prepare callback.

The essence of the new reference-counting system is that we can use
endpoints from multiple contexts, and the logic inside endpoint.c takes
care of when to start up and take down the URBs.

The idea here is that endpoints can be run for many purposes, and the
new implementation allows capture endpoints to run purely as a timing
reference for playback.
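The reference-counting idea described above can be illustrated with a small userspace sketch. It is only a model under stated assumptions: the `rc_*` names and the `submits`/`teardowns` counters are invented for illustration, and only the `use_count` field name is taken from the quoted driver code. It shows that only the first user starts the URBs and only the last user takes them down, which is why one context must not unconditionally stop an endpoint that another context may still be using.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a shared endpoint; not the driver's struct. */
struct rc_ep {
	int use_count;	/* number of contexts currently using the endpoint */
	bool running;	/* URBs are submitted */
	int submits;	/* how often URBs were actually started */
	int teardowns;	/* how often URBs were actually taken down */
};

/* Only the first user really starts the URBs. */
static void rc_start(struct rc_ep *ep)
{
	assert(ep != NULL);
	if (++ep->use_count != 1)
		return;
	ep->running = true;
	ep->submits++;
}

/* Only the last user really takes the URBs down. */
static void rc_stop(struct rc_ep *ep)
{
	assert(ep != NULL && ep->use_count > 0);
	if (--ep->use_count != 0)
		return;
	ep->running = false;
	ep->teardowns++;
}
```

For example, if a capture endpoint is started once for recording and once as a timing reference, a single `rc_stop()` leaves it running; it goes down only after the second `rc_stop()`.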
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Nov 8, 2012, Takashi Iwai wrote:

> How about the patch below?  (It's for 3.6, and won't be applied cleanly
> to 3.7, but easy to adapt.)

This patch fixes my problem, thank you!

You can add me as "Tested by".

Artem
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Wed, Nov 07, 2012 at 08:19:19PM +0100, Takashi Iwai wrote:
> How about the patch below?  (It's for 3.6, and won't be applied cleanly
> to 3.7, but easy to adapt.)

Thanks, that patch seems to fix the problem.


Christof

--

http://cmeerw.org                              sip:cmeerw at cmeerw.org
mailto:cmeerw at cmeerw.org                   xmpp:cmeerw at cmeerw.org
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Wed, 7 Nov 2012, Takashi Iwai wrote:

> > What is the right solution for this problem?
>
> How about the patch below? (It's for 3.6, and won't be applied cleanly
> to 3.7, but easy to adapt.)

I simplified your patch a little. This is for 3.7, not 3.6. I verified
that it does fix the problem raised by the test program. If you think
this is okay, I'll submit it officially.

Alan Stern


Index: usb-3.7/sound/usb/endpoint.h
===
--- usb-3.7.orig/sound/usb/endpoint.h
+++ usb-3.7/sound/usb/endpoint.h
@@ -19,6 +19,7 @@ int snd_usb_endpoint_set_params(struct s
 int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
 void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
 			   int force, int can_sleep, int wait);
+void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
 int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
 int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_free(struct list_head *head);
Index: usb-3.7/sound/usb/pcm.c
===
--- usb-3.7.orig/sound/usb/pcm.c
+++ usb-3.7/sound/usb/pcm.c
@@ -576,6 +576,11 @@ static int snd_usb_pcm_prepare(struct sn
 		subs->need_setup_ep = false;
 	}
 
+	if (subs->sync_endpoint)
+		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
+	if (subs->data_endpoint)
+		snd_usb_endpoint_sync_stop(subs->data_endpoint);
+
 	/* some unit conversions in runtime */
 	subs->data_endpoint->maxframesize =
 		bytes_to_frames(runtime, subs->data_endpoint->maxpacksize);
Index: usb-3.7/sound/usb/endpoint.c
===
--- usb-3.7.orig/sound/usb/endpoint.c
+++ usb-3.7/sound/usb/endpoint.c
@@ -481,7 +481,7 @@ __exit_unlock:
 /*
  * wait until all urbs are processed.
  */
-static int wait_clear_urbs(struct snd_usb_endpoint *ep)
+void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
 {
 	unsigned long end_time = jiffies + msecs_to_jiffies(1000);
 	unsigned int i;
@@ -502,8 +502,6 @@ static int wait_clear_urbs(struct snd_us
 	if (alive)
 		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
 					alive, ep->ep_num);
-
-	return 0;
 }
 
 /*
@@ -556,7 +554,7 @@ static void release_urbs(struct snd_usb_
 
 	/* stop urbs */
 	deactivate_urbs(ep, force, 1);
-	wait_clear_urbs(ep);
+	snd_usb_endpoint_sync_stop(ep);
 
 	for (i = 0; i < ep->nurbs; i++)
 		release_urb_ctx(&ep->urb[i]);
@@ -833,7 +831,7 @@ int snd_usb_endpoint_start(struct snd_us
 	/* just to be sure */
 	deactivate_urbs(ep, 0, can_sleep);
 	if (can_sleep)
-		wait_clear_urbs(ep);
+		snd_usb_endpoint_sync_stop(ep);
 
 	ep->active_mask = 0;
 	ep->unlink_mask = 0;
@@ -917,7 +915,7 @@ void snd_usb_endpoint_stop(struct snd_us
 		ep->prepare_data_urb = NULL;
 
 		if (wait)
-			wait_clear_urbs(ep);
+			snd_usb_endpoint_sync_stop(ep);
 	}
 }
 
@@ -940,7 +938,7 @@ int snd_usb_endpoint_deactivate(struct s
 		return -EINVAL;
 
 	deactivate_urbs(ep, 1, 1);
-	wait_clear_urbs(ep);
+	snd_usb_endpoint_sync_stop(ep);
 
 	if (ep->use_count != 0)
 		return 0;
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
Alan Stern wrote:
>
> On Mon, 5 Nov 2012, Christof Meerwald wrote:
>
> > BTW, I have been able to reproduce the problem on a completely
> > different machine (also running Ubuntu 12.10, but different hardware).
> > The important thing appears to be that the USB audio device is
> > connected via a USB 2.0 hub (and then using the test code posted in
> > http://pastebin.com/aHGe1S1X specifying the audio device as
> > "plughw:Set" (or whatever it's called) seems to trigger the freeze).
>
> Christof: Thank you for that reference, it was a big help. After
> crashing my system many times I have tracked the problem, at least in
> part. The patch below should prevent your system from freezing.
>
>
> Takashi: It turns out that the problem is triggered when the audio
> subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
> snd_usb_endpoint_start(). Since the driver doesn't wait for the
> outstanding URBs to finish, it tries to submit them again while they
> are still active.
>
> Normally the USB core would realize this and fail the submission, but a
> bug in ehci-hcd prevented this from happening. (That bug is what the
> patch below fixes.) The URB gets added to the active list twice,
> resulting in list corruption and an oops in interrupt context, which
> freezes the system.
>
> The user program that triggers the problem basically looks like this:
>
> 	snd_pcm_prepare(rec_pcm);
> 	snd_pcm_start(rec_pcm);
> 	snd_pcm_drop(rec_pcm);
>
> 	snd_pcm_prepare(rec_pcm);
> 	snd_pcm_start(rec_pcm);
>
> The snd_pcm_drop call unlinks the URBs but does not wait for them to
> finish. Then the second snd_pcm_start call submits the URBs before
> they have finished.
>
> What is the right solution for this problem?

How about the patch below? (It's for 3.6, and won't be applied cleanly
to 3.7, but easy to adapt.)


Takashi

---
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index d9de667..38830e2 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -35,6 +35,7 @@
 
 #define EP_FLAG_ACTIVATED	0
 #define EP_FLAG_RUNNING		1
+#define EP_FLAG_STOPPING	2
 
 /*
  * snd_usb_endpoint is a model that abstracts everything related to an
@@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
 	if (alive)
 		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
 					alive, ep->ep_num);
+	clear_bit(EP_FLAG_STOPPING, &ep->flags);
 
 	return 0;
 }
 
+/* wait until urbs are really dropped */
+void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
+{
+	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
+		wait_clear_urbs(ep);
+}
+
+
 /*
  * unlink active urbs.
  */
@@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
 
 		if (wait)
 			wait_clear_urbs(ep);
+		else
+			set_bit(EP_FLAG_STOPPING, &ep->flags);
 	}
 }
 
diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
index cbbbdf2..c1540a4 100644
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
 int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
 void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
 			   int force, int can_sleep, int wait);
+void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
 int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
 int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_free(struct list_head *head);
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index f782ce1..aee3ab0 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
 	if (snd_BUG_ON(!subs->data_endpoint))
 		return -EIO;
 
+	if (subs->sync_endpoint)
+		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
+	if (subs->data_endpoint)
+		snd_usb_endpoint_sync_stop(subs->data_endpoint);
+
 	/* some unit conversions in runtime */
 	subs->data_endpoint->maxframesize =
 		bytes_to_frames(runtime, subs->data_endpoint->maxpacksize);
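The flag handshake in the patch above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not driver code: struct toy_ep and every toy_* function are hypothetical names, the "wait" simply zeroes a counter instead of sleeping, and the NULL check follows the later revision mentioned elsewhere in this thread rather than the 3.6 patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy endpoint; field names are illustrative, not the driver's struct. */
struct toy_ep {
	int  urbs_in_flight;  /* unlinked URBs that were not given back yet */
	bool stopping;        /* models EP_FLAG_STOPPING */
	int  waits;           /* how often we would really have blocked */
};

/* Models wait_clear_urbs(): wait for the URBs, then clear the flag. */
static void toy_wait_clear(struct toy_ep *ep)
{
	ep->waits++;
	ep->urbs_in_flight = 0;   /* pretend every completion has arrived */
	ep->stopping = false;
}

/* Models snd_usb_endpoint_stop(): wait now, or record that a wait is owed. */
void toy_stop(struct toy_ep *ep, bool wait)
{
	if (wait)
		toy_wait_clear(ep);
	else
		ep->stopping = true;
}

/* Models the sync helper: a no-op unless a deferred stop is pending. */
void toy_sync_pending_stop(struct toy_ep *ep)
{
	if (ep && ep->stopping)
		toy_wait_clear(ep);
}

/* Drop without waiting, then prepare twice; returns the number of waits. */
int toy_drop_then_prepare(void)
{
	struct toy_ep ep = { 4, false, 0 };

	toy_stop(&ep, false);         /* like snd_pcm_drop: unlink, no wait */
	toy_sync_pending_stop(&ep);   /* first prepare: waits once */
	toy_sync_pending_stop(&ep);   /* second prepare: nothing pending */
	toy_sync_pending_stop(NULL);  /* a missing sync endpoint is tolerated */
	assert(ep.urbs_in_flight == 0);
	return ep.waits;
}
```

In this model the prepare path pays for a wait only when an earlier stop skipped it, and an endpoint that was never stopped is left alone.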
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Mon, 5 Nov 2012, Christof Meerwald wrote:

> BTW, I have been able to reproduce the problem on a completely
> different machine (also running Ubuntu 12.10, but different hardware).
> The important thing appears to be that the USB audio device is
> connected via a USB 2.0 hub (and then using the test code posted in
> http://pastebin.com/aHGe1S1X specifying the audio device as
> "plughw:Set" (or whatever it's called) seems to trigger the freeze).

Christof: Thank you for that reference, it was a big help. After
crashing my system many times I have tracked the problem, at least in
part. The patch below should prevent your system from freezing.


Takashi: It turns out that the problem is triggered when the audio
subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
snd_usb_endpoint_start(). Since the driver doesn't wait for the
outstanding URBs to finish, it tries to submit them again while they
are still active.

Normally the USB core would realize this and fail the submission, but a
bug in ehci-hcd prevented this from happening. (That bug is what the
patch below fixes.) The URB gets added to the active list twice,
resulting in list corruption and an oops in interrupt context, which
freezes the system.

The user program that triggers the problem basically looks like this:

	snd_pcm_prepare(rec_pcm);
	snd_pcm_start(rec_pcm);
	snd_pcm_drop(rec_pcm);

	snd_pcm_prepare(rec_pcm);
	snd_pcm_start(rec_pcm);

The snd_pcm_drop call unlinks the URBs but does not wait for them to
finish. Then the second snd_pcm_start call submits the URBs before
they have finished.

What is the right solution for this problem?

Alan Stern


Index: usb-3.7/drivers/usb/host/ehci-sched.c
===
--- usb-3.7.orig/drivers/usb/host/ehci-sched.c
+++ usb-3.7/drivers/usb/host/ehci-sched.c
@@ -1632,7 +1632,7 @@ static void itd_link_urb(
 
 	/* don't need that schedule data any more */
 	iso_sched_free (stream, iso_sched);
-	urb->hcpriv = NULL;
+	urb->hcpriv = stream;
 
 	++ehci->isoc_count;
 	enable_periodic(ehci);
@@ -2031,7 +2031,7 @@ static void sitd_link_urb(
 
 	/* don't need that schedule data any more */
 	iso_sched_free (stream, sched);
-	urb->hcpriv = NULL;
+	urb->hcpriv = stream;
 
 	++ehci->isoc_count;
 	enable_periodic(ehci);
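The effect of the one-line ehci-hcd change can be illustrated with a toy model of the resubmission check. This is a sketch under assumptions, not kernel code: struct toy_urb, toy_submit() and toy_double_submit() are hypothetical names, the -1 return value is a placeholder, and the only thing borrowed from the patch is the idea that a non-NULL hcpriv marks a URB as still owned by the host controller.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a URB; not the kernel's struct urb. */
struct toy_urb {
	void *hcpriv;      /* non-NULL while the toy controller owns the URB */
	int   link_count;  /* how many times it sits on the active list */
};

/*
 * Toy submit path: refuse a URB that is still marked as owned.
 * hc_marks_owner == 0 models the old behaviour (hcpriv reset to NULL),
 * hc_marks_owner == 1 models the patched behaviour (hcpriv = stream).
 */
static int toy_submit(struct toy_urb *urb, void *stream, int hc_marks_owner)
{
	if (urb->hcpriv)
		return -1;   /* placeholder error: URB is still active */

	urb->hcpriv = hc_marks_owner ? stream : NULL;
	urb->link_count++;
	return 0;
}

/* Submit the same URB twice with no giveback in between. */
int toy_double_submit(int hc_marks_owner)
{
	struct toy_urb urb = { NULL, 0 };
	int stream = 0;   /* any non-NULL address works as an owner tag */
	int rc;

	rc = toy_submit(&urb, &stream, hc_marks_owner);
	assert(rc == 0);
	(void)rc;

	/* second start right after a drop that did not wait */
	toy_submit(&urb, &stream, hc_marks_owner);
	return urb.link_count;
}
```

With the owner mark missing, the second submission goes through and the URB ends up linked twice, which corresponds to the list corruption described above. With the mark in place, the second submission is refused and the URB stays linked once.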
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 07.11.2012 20:19, Takashi Iwai wrote:
> At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
> Alan Stern wrote:
> >
> > On Mon, 5 Nov 2012, Christof Meerwald wrote:
> >
> > > BTW, I have been able to reproduce the problem on a completely
> > > different machine (also running Ubuntu 12.10, but different hardware).
> > > The important thing appears to be that the USB audio device is
> > > connected via a USB 2.0 hub (and then using the test code posted in
> > > http://pastebin.com/aHGe1S1X specifying the audio device as
> > > "plughw:Set" (or whatever it's called) seems to trigger the freeze).
> >
> > Christof: Thank you for that reference, it was a big help. After
> > crashing my system many times I have tracked the problem, at least in
> > part. The patch below should prevent your system from freezing.
> >
> >
> > Takashi: It turns out that the problem is triggered when the audio
> > subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
> > snd_usb_endpoint_start(). Since the driver doesn't wait for the
> > outstanding URBs to finish, it tries to submit them again while they
> > are still active.
> >
> > Normally the USB core would realize this and fail the submission, but a
> > bug in ehci-hcd prevented this from happening. (That bug is what the
> > patch below fixes.) The URB gets added to the active list twice,
> > resulting in list corruption and an oops in interrupt context, which
> > freezes the system.
> >
> > The user program that triggers the problem basically looks like this:
> >
> > 	snd_pcm_prepare(rec_pcm);
> > 	snd_pcm_start(rec_pcm);
> > 	snd_pcm_drop(rec_pcm);
> >
> > 	snd_pcm_prepare(rec_pcm);
> > 	snd_pcm_start(rec_pcm);
> >
> > The snd_pcm_drop call unlinks the URBs but does not wait for them to
> > finish. Then the second snd_pcm_start call submits the URBs before
> > they have finished.

Thanks for investigating on this and to everyone who so quickly tested
the provided patch. Seems like we got the right idea where the problem
really is. However, the proposed patch seems wrong to me (see below).

> > What is the right solution for this problem?
>
> How about the patch below? (It's for 3.6, and won't be applied cleanly
> to 3.7, but easy to adapt.)
>
>
> Takashi
>
> ---
> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> index d9de667..38830e2 100644
> --- a/sound/usb/endpoint.c
> +++ b/sound/usb/endpoint.c
> @@ -35,6 +35,7 @@
>
>  #define EP_FLAG_ACTIVATED	0
>  #define EP_FLAG_RUNNING		1
> +#define EP_FLAG_STOPPING	2
>
>  /*
>   * snd_usb_endpoint is a model that abstracts everything related to an
> @@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
>  	if (alive)
>  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
>  					alive, ep->ep_num);
> +	clear_bit(EP_FLAG_STOPPING, &ep->flags);
>
>  	return 0;
>  }
>
> +/* wait until urbs are really dropped */
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
> +{
> +	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
> +		wait_clear_urbs(ep);
> +}
> +
> +
>  /*
>   * unlink active urbs.
>   */
> @@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>
>  		if (wait)
>  			wait_clear_urbs(ep);
> +		else
> +			set_bit(EP_FLAG_STOPPING, &ep->flags);
>  	}
>  }
>
> diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
> index cbbbdf2..c1540a4 100644
> --- a/sound/usb/endpoint.h
> +++ b/sound/usb/endpoint.h
> @@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
>  int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
>  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>  			   int force, int can_sleep, int wait);
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
>  int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
>  int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
>  void snd_usb_endpoint_free(struct list_head *head);
> diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
> index f782ce1..aee3ab0 100644
> --- a/sound/usb/pcm.c
> +++ b/sound/usb/pcm.c
> @@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
>  	if (snd_BUG_ON(!subs->data_endpoint))
>  		return -EIO;
>
> +	if (subs->sync_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> +	if (subs->data_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->data_endpoint);

We can't simply stop both endpoints in the prepare callback.

The essence of the new reference-counting system is that we can use
endpoints from multiple contexts, and the logic inside endpoint.c will
take care of when to start up and take down the urbs. The idea here is
that endpoints can be run for many purposes, and the new implementation
that was added allows capture endpoints to run purely as timing
reference for playback.

This bug needs to be fixed in the ehci controller, or we need some other
solution in the snd-usb-audio driver. I'll do some test once I'm back
from ELC.


Daniel
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
At Thu, 08 Nov 2012 01:42:59 +0100,
Daniel Mack wrote:
>
> On 07.11.2012 20:19, Takashi Iwai wrote:
> > At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
> > Alan Stern wrote:
> > >
> > > On Mon, 5 Nov 2012, Christof Meerwald wrote:
> > >
> > > > BTW, I have been able to reproduce the problem on a completely
> > > > different machine (also running Ubuntu 12.10, but different hardware).
> > > > The important thing appears to be that the USB audio device is
> > > > connected via a USB 2.0 hub (and then using the test code posted in
> > > > http://pastebin.com/aHGe1S1X specifying the audio device as
> > > > "plughw:Set" (or whatever it's called) seems to trigger the freeze).
> > >
> > > Christof: Thank you for that reference, it was a big help. After
> > > crashing my system many times I have tracked the problem, at least in
> > > part. The patch below should prevent your system from freezing.
> > >
> > >
> > > Takashi: It turns out that the problem is triggered when the audio
> > > subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
> > > snd_usb_endpoint_start(). Since the driver doesn't wait for the
> > > outstanding URBs to finish, it tries to submit them again while they
> > > are still active.
> > >
> > > Normally the USB core would realize this and fail the submission, but a
> > > bug in ehci-hcd prevented this from happening. (That bug is what the
> > > patch below fixes.) The URB gets added to the active list twice,
> > > resulting in list corruption and an oops in interrupt context, which
> > > freezes the system.
> > >
> > > The user program that triggers the problem basically looks like this:
> > >
> > > 	snd_pcm_prepare(rec_pcm);
> > > 	snd_pcm_start(rec_pcm);
> > > 	snd_pcm_drop(rec_pcm);
> > >
> > > 	snd_pcm_prepare(rec_pcm);
> > > 	snd_pcm_start(rec_pcm);
> > >
> > > The snd_pcm_drop call unlinks the URBs but does not wait for them to
> > > finish. Then the second snd_pcm_start call submits the URBs before
> > > they have finished.
>
> Thanks for investigating on this and to everyone who so quickly tested
> the provided patch. Seems like we got the right idea where the problem
> really is. However, the proposed patch seems wrong to me (see below).
>
> > > What is the right solution for this problem?
> >
> > How about the patch below? (It's for 3.6, and won't be applied cleanly
> > to 3.7, but easy to adapt.)
> >
> >
> > Takashi
> >
> > ---
> > diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> > index d9de667..38830e2 100644
> > --- a/sound/usb/endpoint.c
> > +++ b/sound/usb/endpoint.c
> > @@ -35,6 +35,7 @@
> >
> >  #define EP_FLAG_ACTIVATED	0
> >  #define EP_FLAG_RUNNING		1
> > +#define EP_FLAG_STOPPING	2
> >
> >  /*
> >   * snd_usb_endpoint is a model that abstracts everything related to an
> > @@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
> >  	if (alive)
> >  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
> >  					alive, ep->ep_num);
> > +	clear_bit(EP_FLAG_STOPPING, &ep->flags);
> >
> >  	return 0;
> >  }
> >
> > +/* wait until urbs are really dropped */
> > +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
> > +{
> > +	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
> > +		wait_clear_urbs(ep);
> > +}
> > +
> > +
> >  /*
> >   * unlink active urbs.
> >   */
> > @@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
> >
> >  		if (wait)
> >  			wait_clear_urbs(ep);
> > +		else
> > +			set_bit(EP_FLAG_STOPPING, &ep->flags);
> >  	}
> >  }
> >
> > diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
> > index cbbbdf2..c1540a4 100644
> > --- a/sound/usb/endpoint.h
> > +++ b/sound/usb/endpoint.h
> > @@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
> >  int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
> >  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
> >  			   int force, int can_sleep, int wait);
> > +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
> >  int snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
> >  int snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
> >  void snd_usb_endpoint_free(struct list_head *head);
> > diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
> > index f782ce1..aee3ab0 100644
> > --- a/sound/usb/pcm.c
> > +++ b/sound/usb/pcm.c
> > @@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
> >  	if (snd_BUG_ON(!subs->data_endpoint))
> >  		return -EIO;
> >
> > +	if (subs->sync_endpoint)
> > +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> > +	if (subs->data_endpoint)
> > +		snd_usb_endpoint_sync_stop(subs->data_endpoint);
>
> We can't simply stop both endpoints in the prepare callback.

The new function doesn't stop the stream by itself; it only syncs if a
stop of the stream is already pending. So, it's safe to call it there.

Maybe the name was confusing. It should have been something like
snd_usb_endpoint_sync_pending_stop().


Takashi

> The essence of the new reference-counting system is that we can use
> endpoints from multiple contexts, and the logic inside
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
At Wed, 7 Nov 2012 15:37:17 -0500 (EST),
Alan Stern wrote:
>
> On Wed, 7 Nov 2012, Takashi Iwai wrote:
>
> > > What is the right solution for this problem?
> >
> > How about the patch below?  (It's for 3.6, and won't be applied
> > cleanly to 3.7, but easy to adapt.)
>
> I simplified your patch a little.

You can't drop the check of stopping endpoint.
As Daniel pointed out, endpoints might be still running when it's called.
I already made a similar mistake in the past, so this patch is a
revised version with the check for pending operations.

> This is for 3.7, not 3.6.  I
> verified that it does fix the problem raised by the test program.  If
> you think this is okay, I'll submit it officially.

Don't worry, my patch is also based on 3.7, too :)
The 3.6 patch was provided just for convenience; testers seemed to have
3.6 systems.


thanks,

Takashi

> Alan Stern
>
> Index: usb-3.7/sound/usb/endpoint.h
> ===================================================================
> --- usb-3.7.orig/sound/usb/endpoint.h
> +++ usb-3.7/sound/usb/endpoint.h
> @@ -19,6 +19,7 @@ int snd_usb_endpoint_set_params(struct s
>  int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
>  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>  			   int force, int can_sleep, int wait);
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
>  int  snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
>  int  snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
>  void snd_usb_endpoint_free(struct list_head *head);
> Index: usb-3.7/sound/usb/pcm.c
> ===================================================================
> --- usb-3.7.orig/sound/usb/pcm.c
> +++ usb-3.7/sound/usb/pcm.c
> @@ -576,6 +576,11 @@ static int snd_usb_pcm_prepare(struct sn
>  		subs->need_setup_ep = false;
>  	}
>
> +	if (subs->sync_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> +	if (subs->data_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->data_endpoint);
> +
>  	/* some unit conversions in runtime */
>  	subs->data_endpoint->maxframesize =
>  		bytes_to_frames(runtime, subs->data_endpoint->maxpacksize);
> Index: usb-3.7/sound/usb/endpoint.c
> ===================================================================
> --- usb-3.7.orig/sound/usb/endpoint.c
> +++ usb-3.7/sound/usb/endpoint.c
> @@ -481,7 +481,7 @@ __exit_unlock:
>  /*
>   * wait until all urbs are processed.
>   */
> -static int wait_clear_urbs(struct snd_usb_endpoint *ep)
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
>  {
>  	unsigned long end_time = jiffies + msecs_to_jiffies(1000);
>  	unsigned int i;
> @@ -502,8 +502,6 @@ static int wait_clear_urbs(struct snd_us
>  	if (alive)
>  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
>  					alive, ep->ep_num);
> -
> -	return 0;
>  }
>
>  /*
> @@ -556,7 +554,7 @@ static void release_urbs(struct snd_usb_
>
>  	/* stop urbs */
>  	deactivate_urbs(ep, force, 1);
> -	wait_clear_urbs(ep);
> +	snd_usb_endpoint_sync_stop(ep);
>
>  	for (i = 0; i < ep->nurbs; i++)
>  		release_urb_ctx(&ep->urb[i]);
> @@ -833,7 +831,7 @@ int snd_usb_endpoint_start(struct snd_us
>  	/* just to be sure */
>  	deactivate_urbs(ep, 0, can_sleep);
>  	if (can_sleep)
> -		wait_clear_urbs(ep);
> +		snd_usb_endpoint_sync_stop(ep);
>
>  	ep->active_mask = 0;
>  	ep->unlink_mask = 0;
> @@ -917,7 +915,7 @@ void snd_usb_endpoint_stop(struct snd_us
>  		ep->prepare_data_urb = NULL;
>
>  		if (wait)
> -			wait_clear_urbs(ep);
> +			snd_usb_endpoint_sync_stop(ep);
>  	}
>  }
>
> @@ -940,7 +938,7 @@ int snd_usb_endpoint_deactivate(struct s
>  		return -EINVAL;
>
>  	deactivate_urbs(ep, 1, 1);
> -	wait_clear_urbs(ep);
> +	snd_usb_endpoint_sync_stop(ep);
>
>  	if (ep->use_count != 0)
>  		return 0;
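For readers who do not have the driver source at hand: judging from the context lines quoted in the patch, the function being renamed waits up to 1000 ms for all URBs of an endpoint to finish and logs how many are still active if the timeout expires. The following is only a user-space sketch of that "poll with a deadline" pattern, not the driver code. `fake_ep`, `count_alive`, `wait_clear` and `complete_one` are hypothetical names, and the `progress` callback merely simulates completions that the host controller would deliver asynchronously.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for an endpoint: one bit per in-flight transfer. */
struct fake_ep {
	unsigned long active_mask;
	unsigned int ntransfers;
};

/* Count the transfers whose bit is still set in active_mask. */
static int count_alive(const struct fake_ep *ep)
{
	int alive = 0;

	for (unsigned int i = 0; i < ep->ntransfers; i++)
		if (ep->active_mask & (1UL << i))
			alive++;
	return alive;
}

/* Milliseconds elapsed on the monotonic clock since *start. */
static long elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) * 1000L +
	       (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Poll until no transfer is active or timeout_ms has elapsed.
 * Returns the number of transfers that were still active when we gave up.
 */
static int wait_clear(struct fake_ep *ep, long timeout_ms,
		      void (*progress)(struct fake_ep *))
{
	struct timespec start, nap = { 0, 1000000L };	/* sleep 1 ms per poll */
	int alive;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (;;) {
		alive = count_alive(ep);
		if (!alive || elapsed_ms(&start) >= timeout_ms)
			break;
		if (progress)
			progress(ep);	/* simulated completion, assumption */
		nanosleep(&nap, NULL);
	}
	if (alive)
		fprintf(stderr, "timeout: still %d active transfers\n", alive);
	return alive;
}

/* Example progress callback: completes the lowest-numbered active transfer. */
static void complete_one(struct fake_ep *ep)
{
	ep->active_mask &= ep->active_mask - 1;
}
```

Calling `wait_clear(&ep, 1000, complete_one)` on an endpoint with three active bits returns 0 after a few polls, while passing `NULL` as the callback shows the timeout path.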
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 08.11.2012 07:43, Takashi Iwai wrote:
> At Thu, 08 Nov 2012 01:42:59 +0100,
> Daniel Mack wrote:
> >
> > On 07.11.2012 20:19, Takashi Iwai wrote:
> > > At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
> > > Alan Stern wrote:
> > > >
> > > > On Mon, 5 Nov 2012, Christof Meerwald wrote:
> > > >
> > > > > BTW, I have been able to reproduce the problem on a completely different
> > > > > machine (also running Ubuntu 12.10, but different hardware). The
> > > > > important thing appears to be that the USB audio device is connected via
> > > > > a USB 2.0 hub (and then using the test code posted in
> > > > > http://pastebin.com/aHGe1S1X specifying the audio device as
> > > > > "plughw:Set" (or whatever it's called) seems to trigger the freeze).
> > > >
> > > > Christof:
> > > >
> > > > Thank you for that reference, it was a big help.  After crashing my
> > > > system many times I have tracked the problem, at least in part.  The
> > > > patch below should prevent your system from freezing.
> > > >
> > > > Takashi:
> > > >
> > > > It turns out that the problem is triggered when the audio subsystem
> > > > calls snd_usb_endpoint_stop() with wait == 0 and then calls
> > > > snd_usb_endpoint_start().  Since the driver doesn't wait for the
> > > > outstanding URBs to finish, it tries to submit them again while they
> > > > are still active.
> > > >
> > > > Normally the USB core would realize this and fail the submission, but a
> > > > bug in ehci-hcd prevented this from happening.  (That bug is what the
> > > > patch below fixes.)  The URB gets added to the active list twice,
> > > > resulting in list corruption and an oops in interrupt context, which
> > > > freezes the system.
> > > >
> > > > The user program that triggers the problem basically looks like this:
> > > >
> > > > 	snd_pcm_prepare(rec_pcm);
> > > > 	snd_pcm_start(rec_pcm);
> > > > 	snd_pcm_drop(rec_pcm);
> > > > 	snd_pcm_prepare(rec_pcm);
> > > > 	snd_pcm_start(rec_pcm);
> > > >
> > > > The snd_pcm_drop call unlinks the URBs but does not wait for them to
> > > > finish.  Then the second snd_pcm_start call submits the URBs before
> > > > they have finished.
> >
> > Thanks for investigating this, and to everyone who so quickly tested
> > the provided patch. Seems like we got the right idea where the problem
> > really is.
> >
> > However, the proposed patch seems wrong to me (see below).
> >
> > > > What is the right solution for this problem?
> > >
> > > How about the patch below?  (It's for 3.6, and won't be applied
> > > cleanly to 3.7, but easy to adapt.)
> > >
> > >
> > > Takashi
> > >
> > > ---
> > > diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> > > index d9de667..38830e2 100644
> > > --- a/sound/usb/endpoint.c
> > > +++ b/sound/usb/endpoint.c
> > > @@ -35,6 +35,7 @@
> > >
> > >  #define EP_FLAG_ACTIVATED	0
> > >  #define EP_FLAG_RUNNING		1
> > > +#define EP_FLAG_STOPPING	2
> > >
> > >  /*
> > >   * snd_usb_endpoint is a model that abstracts everything related to an
> > > @@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
> > >  	if (alive)
> > >  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
> > >  					alive, ep->ep_num);
> > > +	clear_bit(EP_FLAG_STOPPING, &ep->flags);
> > >
> > >  	return 0;
> > >  }
> > >
> > > +/* wait until urbs are really dropped */
> > > +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
> > > +{
> > > +	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
> > > +		wait_clear_urbs(ep);
> > > +}
> > > +
> > > +
> > >  /*
> > >   * unlink active urbs.
> > >   */
> > > @@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
> > >
> > >  		if (wait)
> > >  			wait_clear_urbs(ep);
> > > +		else
> > > +			set_bit(EP_FLAG_STOPPING, &ep->flags);
> > >  	}
> > >  }
> > >
> > > diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
> > > index cbbbdf2..c1540a4 100644
> > > --- a/sound/usb/endpoint.h
> > > +++ b/sound/usb/endpoint.h
> > > @@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
> > >  int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
> > >  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
> > >  			   int force, int can_sleep, int wait);
> > > +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
> > >  int  snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
> > >  int  snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
> > >  void snd_usb_endpoint_free(struct list_head *head);
> > > diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
> > > index f782ce1..aee3ab0 100644
> > > --- a/sound/usb/pcm.c
> > > +++ b/sound/usb/pcm.c
> > > @@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
> > >  	if (snd_BUG_ON(!subs->data_endpoint))
> > >  		return -EIO;
> > >
> > > +	if (subs->sync_endpoint)
> > > +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> > > +	if (subs->data_endpoint)
> > > +		snd_usb_endpoint_sync_stop(subs->data_endpoint);
> >
> > We can't simply stop both endpoints in the prepare callback.
>
> The new function doesn't stop the stream by itself but it just syncs
> if the stream is being stopped beforehand.  So, it's safe to call it
> there.
>
> Maybe the name was confusing.  It should have been like
> snd_usb_endpoint_sync_pending_stop() or such.

Ah, right. I was erroneously looking closer at Alan's patch but then
replied to yours. Alright then - thanks for explaining :)


Daniel
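The "sync a pending stop" idea discussed in this message can be sketched in user space as follows. This is an illustration under stated assumptions, not the driver code: `toy_ep`, `toy_start`, `toy_stop`, `toy_wait_clear`, `toy_sync_pending_stop` and `EP_STOPPING` are hypothetical names, the blocking wait is replaced by an immediate "all transfers returned" step, and the `-1` from `toy_start` only stands in for the duplicate-submission check that, per Alan's explanation, the USB core would normally perform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag, mirroring the idea of EP_FLAG_STOPPING in the patch. */
#define EP_STOPPING (1u << 0)

struct toy_ep {
	unsigned int flags;
	int in_flight;	/* transfers submitted and not yet returned */
};

/* Stand-in for blocking until every transfer has been returned. */
static void toy_wait_clear(struct toy_ep *ep)
{
	ep->in_flight = 0;
	ep->flags &= ~EP_STOPPING;
}

/*
 * Submit ntransfers transfers. Returns 0 on success, or -1 when earlier
 * transfers are still in flight and a resubmission would queue them twice.
 */
static int toy_start(struct toy_ep *ep, int ntransfers)
{
	if (ep->in_flight > 0)
		return -1;
	ep->in_flight = ntransfers;
	return 0;
}

/* Request a stop; with wait == 0 only remember that a stop is pending. */
static void toy_stop(struct toy_ep *ep, int wait)
{
	if (wait)
		toy_wait_clear(ep);
	else
		ep->flags |= EP_STOPPING;
}

/* Does not stop anything itself: only finishes a stop that is pending. */
static void toy_sync_pending_stop(struct toy_ep *ep)
{
	if (ep && (ep->flags & EP_STOPPING))
		toy_wait_clear(ep);
}
```

With this model, `toy_stop(&ep, 0)` followed directly by `toy_start()` fails, while inserting `toy_sync_pending_stop(&ep)` between the two makes the restart succeed. On an endpoint with no pending stop the sync call is a no-op, which is why it would be harmless in a prepare-style callback.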
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Nov 03, 2012 at 03:16:36PM +0100, Daniel Mack wrote:
> On 03.11.2012 15:10, Christof Meerwald wrote:
> > http://comments.gmane.org/gmane.comp.voip.twinkle/3052 and
> > http://pastebin.com/aHGe1S1X for a self-contained C test.
> Some questions:
>
> - Are you seeing the same issue with 3.6.x?

I haven't tried it myself, but the other poster on
http://comments.gmane.org/gmane.comp.voip.twinkle/3052 mentions 3.6.2
(and 3.6.3)

> - If you can reproduce this issue, could you paste the messages in
> dmesg when this happens? Do they resemble the list corruption that
> was reported?

I am not seeing any kernel messages at all - the system just freezes
and not even the SysRq stuff works after that.

> - Do you see the same problem with 3.4?

I upgraded from Ubuntu 12.04 (Linux 3.2) where I didn't see the
problem. However,
http://www.linuxquestions.org/questions/linux-desktop-74/twinkle-causes-linux-freeze-kernel-3-6-2-a-4175433799/
mentions 3.4.0

> - Are you able to apply the patch Alan Stern posted in this thread earlier?

Unfortunately, I am not really in a position to apply kernel patches
at the moment.

> We should really sort this out, but I unfortunately lack a system or
> setup that shows the bug.

BTW, I have been able to reproduce the problem on a completely
different machine (also running Ubuntu 12.10, but different hardware).
The important thing appears to be that the USB audio device is
connected via a USB 2.0 hub (and then using the test code posted in
http://pastebin.com/aHGe1S1X specifying the audio device as
"plughw:Set" (or whatever it's called) seems to trigger the freeze).

So I guess another question is: do you have a USB headset connected
via a USB 2.0 hub and not seeing the problem or is your USB headset
not connected via a USB 2.0 hub?

(of course, it would also be useful if others could comment if they
are seeing the problem with that setup or not)


Christof

-- 

http://cmeerw.org                              sip:cmeerw at cmeerw.org
mailto:cmeerw at cmeerw.org                   xmpp:cmeerw at cmeerw.org
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, 3 Nov 2012, Daniel Mack wrote:

> On 03.11.2012 15:10, Christof Meerwald wrote:
> > On Sat, 20 Oct 2012 23:15:17 +0000 (GMT), Artem S. Tashkinov wrote:
> >> It's almost definitely either a USB driver bug or video4linux driver bug:
> >>
> >> I'm CC'ing linux-media and linux-usb mailing lists, the problem is
> >> described here:
> >> https://lkml.org/lkml/2012/10/20/35
> >> https://lkml.org/lkml/2012/10/20/148
> >
> > Not sure if it's related, but I am seeing a kernel freeze with a
> > usb-audio headset (connected via an external USB hub) on Linux 3.5.0
> > (Ubuntu 12.10) - see
>
> Does Ubuntu 12.10 really ship with 3.5.0? Not any more recent

They ship 3.5.7 plus some more fixes, but call it 3.5.0-18.29

c'ya
sven-haegar

-- 
Three may keep a secret, if two of them are dead.
- Ben F.
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 03.11.2012 15:10, Christof Meerwald wrote:
> On Sat, 20 Oct 2012 23:15:17 +0000 (GMT), Artem S. Tashkinov wrote:
>> It's almost definitely either a USB driver bug or video4linux driver bug:
>>
>> I'm CC'ing linux-media and linux-usb mailing lists, the problem is described
>> here:
>> https://lkml.org/lkml/2012/10/20/35
>> https://lkml.org/lkml/2012/10/20/148
>
> Not sure if it's related, but I am seeing a kernel freeze with a
> usb-audio headset (connected via an external USB hub) on Linux 3.5.0
> (Ubuntu 12.10) - see

Does Ubuntu 12.10 really ship with 3.5.0? Not any more recent

> http://comments.gmane.org/gmane.comp.voip.twinkle/3052 and
> http://pastebin.com/aHGe1S1X for a self-contained C test.

Some questions:

- Are you seeing the same issue with 3.6.x?
- If you can reproduce this issue, could you paste the messages in
  dmesg when this happens? Do they resemble the list corruption that
  was reported?
- Do you see the same problem with 3.4?
- Are you able to apply the patch Alan Stern posted in this thread earlier?

We should really sort this out, but I unfortunately lack a system or
setup that shows the bug.


Thanks,
Daniel
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, 20 Oct 2012 23:15:17 +0000 (GMT), Artem S. Tashkinov wrote:
> It's almost definitely either a USB driver bug or video4linux driver bug:
>
> I'm CC'ing linux-media and linux-usb mailing lists, the problem is described
> here:
> https://lkml.org/lkml/2012/10/20/35
> https://lkml.org/lkml/2012/10/20/148

Not sure if it's related, but I am seeing a kernel freeze with a
usb-audio headset (connected via an external USB hub) on Linux 3.5.0
(Ubuntu 12.10) - see
http://comments.gmane.org/gmane.comp.voip.twinkle/3052 and
http://pastebin.com/aHGe1S1X for a self-contained C test.


Christof

-- 

http://cmeerw.org                              sip:cmeerw at cmeerw.org
mailto:cmeerw at cmeerw.org                   xmpp:cmeerw at cmeerw.org
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Mon, 22 Oct 2012, Artem S. Tashkinov wrote:

> OK, here's what the kernel prints with your patch:
>
> usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4
>
> A small delay before I got thousands of list_del corruption messages would
> have been nice, but I managed to catch the message anyway.

All right.  Here's a new patch, which will print more information and
will provide a 10-second delay.  For this to be useful, you should
capture a usbmon trace at the same time.  The relevant entries will
show up in the trace shortly before _and_ shortly after the error
message appears.

Alan Stern

P.S.: It will help if you unplug as many of the other USB devices as
possible before running this test.



Index: usb-3.6/drivers/usb/core/hcd.c
===================================================================
--- usb-3.6.orig/drivers/usb/core/hcd.c
+++ usb-3.6/drivers/usb/core/hcd.c
@@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
 
 /*-------------------------------------------------------------------------*/
 
+static bool list_error;
+
 /**
  * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
  * @hcd: host controller to which @urb was submitted
@@ -1193,6 +1195,25 @@ void usb_hcd_unlink_urb_from_ep(struct u
 {
 	/* clear all state linking urb to this dev (and hcd) */
 	spin_lock(&hcd_urb_list_lock);
+	{
+		struct list_head *cur = &urb->urb_list;
+		struct list_head *prev = cur->prev;
+		struct list_head *next = cur->next;
+
+		if (prev->next != cur && !list_error) {
+			list_error = true;
+			dev_err(&urb->dev->dev,
+				"ep %x list del corruption prev: %p %p %p %p %p\n",
+				urb->ep->desc.bEndpointAddress,
+				cur, prev, prev->next, next, next->prev);
+			dev_err(&urb->dev->dev,
+				"head %p urb %p urbprev %p urbnext %p\n",
+				&urb->ep->urb_list, urb,
+				list_entry(prev, struct urb, urb_list),
+				list_entry(next, struct urb, urb_list));
+			mdelay(10000);
+		}
+	}
 	list_del_init(&urb->urb_list);
 	spin_unlock(&hcd_urb_list_lock);
 }
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 22, 2012, Alan Stern wrote:

> A BUG() at these points would crash the machine hard.  And where we
> came from doesn't matter; what matters is the values in the pointers.

OK, here's what the kernel prints with your patch:

usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4

A small delay before I got thousands of list_del corruption messages would
have been nice, but I managed to catch the message anyway.

Artem
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Mon, 22 Oct 2012, Daniel Mack wrote:

> On 22.10.2012 17:17, Alan Stern wrote:
> > On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:
> >
> >> dmesg messages up to a crash can be seen here:
> >> https://bugzilla.kernel.org/attachment.cgi?id=84221
> >
> > The first problem in the log is endpoint list corruption.  Here's a
> > debugging patch which should provide a little more information.
>
> Maybe add a BUG() after each of these dev_err() so we stop at the first
> occurrence and also see where we're coming from?

A BUG() at these points would crash the machine hard.  And where we
came from doesn't matter; what matters is the values in the pointers.

Alan Stern
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 22.10.2012 17:17, Alan Stern wrote:
> On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:
>
>> dmesg messages up to a crash can be seen here:
>> https://bugzilla.kernel.org/attachment.cgi?id=84221
>
> The first problem in the log is endpoint list corruption.  Here's a
> debugging patch which should provide a little more information.

Maybe add a BUG() after each of these dev_err() so we stop at the first
occurrence and also see where we're coming from?

>  drivers/usb/core/hcd.c |   36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
>
> Index: usb-3.6/drivers/usb/core/hcd.c
> ===================================================================
> --- usb-3.6.orig/drivers/usb/core/hcd.c
> +++ usb-3.6/drivers/usb/core/hcd.c
> @@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
>
>  /*-------------------------------------------------------------------------*/
>
> +static bool list_error;
> +
>  /**
>   * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
>   * @hcd: host controller to which @urb was submitted
> @@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc
>  	 */
>  	if (HCD_RH_RUNNING(hcd)) {
>  		urb->unlinked = 0;
> +
> +		{
> +			struct list_head *cur = &urb->ep->urb_list;
> +			struct list_head *prev = cur->prev;
> +
> +			if (prev->next != cur && !list_error) {
> +				list_error = true;
> +				dev_err(&urb->dev->dev,
> +					"ep %x list add corruption: %p %p %p\n",
> +					urb->ep->desc.bEndpointAddress,
> +					cur, prev, prev->next);
> +			}
> +		}
> +
>  		list_add_tail(&urb->urb_list, &urb->ep->urb_list);
>  	} else {
>  		rc = -ESHUTDOWN;
> @@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u
>  {
>  	/* clear all state linking urb to this dev (and hcd) */
>  	spin_lock(&hcd_urb_list_lock);
> +	{
> +		struct list_head *cur = &urb->urb_list;
> +		struct list_head *prev = cur->prev;
> +		struct list_head *next = cur->next;
> +
> +		if (prev->next != cur && !list_error) {
> +			list_error = true;
> +			dev_err(&urb->dev->dev,
> +				"ep %x list del corruption prev: %p %p %p\n",
> +				urb->ep->desc.bEndpointAddress,
> +				cur, prev, prev->next);
> +		}
> +		if (next->prev != cur && !list_error) {
> +			list_error = true;
> +			dev_err(&urb->dev->dev,
> +				"ep %x list del corruption next: %p %p %p\n",
> +				urb->ep->desc.bEndpointAddress,
> +				cur, next, next->prev);
> +		}
> +	}
>  	list_del_init(&urb->urb_list);
>  	spin_unlock(&hcd_urb_list_lock);
>  }
>
Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:

> dmesg messages up to a crash can be seen here:
> https://bugzilla.kernel.org/attachment.cgi?id=84221

The first problem in the log is endpoint list corruption.  Here's a
debugging patch which should provide a little more information.

Alan Stern



 drivers/usb/core/hcd.c |   36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

Index: usb-3.6/drivers/usb/core/hcd.c
===================================================================
--- usb-3.6.orig/drivers/usb/core/hcd.c
+++ usb-3.6/drivers/usb/core/hcd.c
@@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);
 
 /*-------------------------------------------------------------------------*/
 
+static bool list_error;
+
 /**
  * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
  * @hcd: host controller to which @urb was submitted
@@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc
 	 */
 	if (HCD_RH_RUNNING(hcd)) {
 		urb->unlinked = 0;
+
+		{
+			struct list_head *cur = &urb->ep->urb_list;
+			struct list_head *prev = cur->prev;
+
+			if (prev->next != cur && !list_error) {
+				list_error = true;
+				dev_err(&urb->dev->dev,
+					"ep %x list add corruption: %p %p %p\n",
+					urb->ep->desc.bEndpointAddress,
+					cur, prev, prev->next);
+			}
+		}
+
 		list_add_tail(&urb->urb_list, &urb->ep->urb_list);
 	} else {
 		rc = -ESHUTDOWN;
@@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u
 {
 	/* clear all state linking urb to this dev (and hcd) */
 	spin_lock(&hcd_urb_list_lock);
+	{
+		struct list_head *cur = &urb->urb_list;
+		struct list_head *prev = cur->prev;
+		struct list_head *next = cur->next;
+
+		if (prev->next != cur && !list_error) {
+			list_error = true;
+			dev_err(&urb->dev->dev,
+				"ep %x list del corruption prev: %p %p %p\n",
+				urb->ep->desc.bEndpointAddress,
+				cur, prev, prev->next);
+		}
+		if (next->prev != cur && !list_error) {
+			list_error = true;
+			dev_err(&urb->dev->dev,
+				"ep %x list del corruption next: %p %p %p\n",
+				urb->ep->desc.bEndpointAddress,
+				cur, next, next->prev);
+		}
+	}
 	list_del_init(&urb->urb_list);
 	spin_unlock(&hcd_urb_list_lock);
 }
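The consistency test used by the debugging patch (`prev->next != cur`, `next->prev != cur`) can be tried outside the kernel with a tiny circular doubly linked list. The sketch below is a user-space illustration only. `node`, `list_init`, `add_tail` and `links_ok` are hypothetical names that mimic the shape of the kernel's list helpers without using them, and the double insertion is just one assumed way for the neighbour links to become inconsistent.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list node, in the style of list_head. */
struct node {
	struct node *prev, *next;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Insert n just before head, i.e. at the tail of the list. */
static void add_tail(struct node *n, struct node *head)
{
	struct node *last = head->prev;

	n->prev = last;
	n->next = head;
	last->next = n;
	head->prev = n;
}

/* Same invariant the debugging patch checks before unlinking an entry. */
static bool links_ok(const struct node *cur)
{
	return cur->prev->next == cur && cur->next->prev == cur;
}
```

Adding `a` and then `b` to a list leaves every node consistent. Adding `a` a second time rewrites `a`'s own links, so `b.prev` still points at `a` while `a.next` now points at the head, and `links_ok(&b)` becomes false.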
Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, 21 Oct 2012, Artem S. Tashkinov wrote: dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221 The first problem in the log is endpoint list corruption. Here's a debugging patch which should provide a little more information. Alan Stern drivers/usb/core/hcd.c | 36 1 file changed, 36 insertions(+) Index: usb-3.6/drivers/usb/core/hcd.c === --- usb-3.6.orig/drivers/usb/core/hcd.c +++ usb-3.6/drivers/usb/core/hcd.c @@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time); /*-*/ +static bool list_error; + /** * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue * @hcd: host controller to which @urb was submitted @@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc */ if (HCD_RH_RUNNING(hcd)) { urb-unlinked = 0; + + { + struct list_head *cur = urb-ep-urb_list; + struct list_head *prev = cur-prev; + + if (prev-next != cur !list_error) { + list_error = true; + dev_err(urb-dev-dev, + ep %x list add corruption: %p %p %p\n, + urb-ep-desc.bEndpointAddress, + cur, prev, prev-next); + } + } + list_add_tail(urb-urb_list, urb-ep-urb_list); } else { rc = -ESHUTDOWN; @@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u { /* clear all state linking urb to this dev (and hcd) */ spin_lock(hcd_urb_list_lock); + { + struct list_head *cur = urb-urb_list; + struct list_head *prev = cur-prev; + struct list_head *next = cur-next; + + if (prev-next != cur !list_error) { + list_error = true; + dev_err(urb-dev-dev, + ep %x list del corruption prev: %p %p %p\n, + urb-ep-desc.bEndpointAddress, + cur, prev, prev-next); + } + if (next-prev != cur !list_error) { + list_error = true; + dev_err(urb-dev-dev, + ep %x list del corruption next: %p %p %p\n, + urb-ep-desc.bEndpointAddress, + cur, next, next-prev); + } + } list_del_init(urb-urb_list); spin_unlock(hcd_urb_list_lock); } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo 
info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 22.10.2012 17:17, Alan Stern wrote: On Sun, 21 Oct 2012, Artem S. Tashkinov wrote: dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221 The first problem in the log is endpoint list corruption. Here's a debugging patch which should provide a little more information. Maybe add a BUG() after each of these dev_err() so we stop at the first occurance and also see where we're coming from? drivers/usb/core/hcd.c | 36 1 file changed, 36 insertions(+) Index: usb-3.6/drivers/usb/core/hcd.c === --- usb-3.6.orig/drivers/usb/core/hcd.c +++ usb-3.6/drivers/usb/core/hcd.c @@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time); /*-*/ +static bool list_error; + /** * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue * @hcd: host controller to which @urb was submitted @@ -1126,6 +1128,20 @@ int usb_hcd_link_urb_to_ep(struct usb_hc */ if (HCD_RH_RUNNING(hcd)) { urb-unlinked = 0; + + { + struct list_head *cur = urb-ep-urb_list; + struct list_head *prev = cur-prev; + + if (prev-next != cur !list_error) { + list_error = true; + dev_err(urb-dev-dev, + ep %x list add corruption: %p %p %p\n, + urb-ep-desc.bEndpointAddress, + cur, prev, prev-next); + } + } + list_add_tail(urb-urb_list, urb-ep-urb_list); } else { rc = -ESHUTDOWN; @@ -1193,6 +1209,26 @@ void usb_hcd_unlink_urb_from_ep(struct u { /* clear all state linking urb to this dev (and hcd) */ spin_lock(hcd_urb_list_lock); + { + struct list_head *cur = urb-urb_list; + struct list_head *prev = cur-prev; + struct list_head *next = cur-next; + + if (prev-next != cur !list_error) { + list_error = true; + dev_err(urb-dev-dev, + ep %x list del corruption prev: %p %p %p\n, + urb-ep-desc.bEndpointAddress, + cur, prev, prev-next); + } + if (next-prev != cur !list_error) { + list_error = true; + dev_err(urb-dev-dev, + ep %x list del corruption next: %p %p %p\n, + urb-ep-desc.bEndpointAddress, + cur, next, next-prev); + } + } list_del_init(urb-urb_list); spin_unlock(hcd_urb_list_lock); 
} -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Mon, 22 Oct 2012, Daniel Mack wrote: On 22.10.2012 17:17, Alan Stern wrote: On Sun, 21 Oct 2012, Artem S. Tashkinov wrote: dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221 The first problem in the log is endpoint list corruption. Here's a debugging patch which should provide a little more information. Maybe add a BUG() after each of these dev_err() so we stop at the first occurance and also see where we're coming from? A BUG() at these points would crash the machine hard. And where we came from doesn't matter; what matters is the values in the pointers. Alan Stern -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 22, 2012, Alan Stern st...@rowland.harvard.edu wrote:

> A BUG() at these points would crash the machine hard. And where we
> came from doesn't matter; what matters is the values in the pointers.

OK, here's what the kernel prints with your patch:

usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4

A small delay before I got thousands of list_del corruption messages would
have been nice, but I managed to catch the message anyway.

Artem
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Mon, 22 Oct 2012, Artem S. Tashkinov wrote:

> OK, here's what the kernel prints with your patch:
>
> usb 6.1.4: ep 86 list del corruption prev: e5103b54 e5103a94 e51039d4
>
> A small delay before I got thousands of list_del corruption messages would
> have been nice, but I managed to catch the message anyway.

All right. Here's a new patch, which will print more information and
will provide a 10-second delay.

For this to be useful, you should capture a usbmon trace at the same
time. The relevant entries will show up in the trace shortly before
_and_ shortly after the error message appears.

Alan Stern

P.S.: It will help if you unplug as many of the other USB devices as
possible before running this test.

Index: usb-3.6/drivers/usb/core/hcd.c
===================================================================
--- usb-3.6.orig/drivers/usb/core/hcd.c
+++ usb-3.6/drivers/usb/core/hcd.c
@@ -1083,6 +1083,8 @@ EXPORT_SYMBOL_GPL(usb_calc_bus_time);

 /*-------------------------------------------------------------------------*/

+static bool list_error;
+
 /**
  * usb_hcd_link_urb_to_ep - add an URB to its endpoint queue
  * @hcd: host controller to which @urb was submitted
@@ -1193,6 +1195,25 @@ void usb_hcd_unlink_urb_from_ep(struct u
 {
 	/* clear all state linking urb to this dev (and hcd) */
 	spin_lock(&hcd_urb_list_lock);
+	{
+		struct list_head *cur = &urb->urb_list;
+		struct list_head *prev = cur->prev;
+		struct list_head *next = cur->next;
+
+		if (prev->next != cur && !list_error) {
+			list_error = true;
+			dev_err(&urb->dev->dev,
+				"ep %x list del corruption prev: %p %p %p %p %p\n",
+				urb->ep->desc.bEndpointAddress,
+				cur, prev, prev->next, next, next->prev);
+			dev_err(&urb->dev->dev,
+				"head %p urb %p urbprev %p urbnext %p\n",
+				&urb->ep->urb_list, urb,
+				list_entry(prev, struct urb, urb_list),
+				list_entry(next, struct urb, urb_list));
+			mdelay(10000);
+		}
+	}
 	list_del_init(&urb->urb_list);
 	spin_unlock(&hcd_urb_list_lock);
 }
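The second patch uses list_entry() to turn the neighbouring list nodes back into URB pointers. As a rough illustration of that conversion, here is a userspace sketch; `struct fake_urb` and `urb_from_node()` are hypothetical names invented for this example, and the macro is a simplified re-implementation rather than the kernel's own definition.

```c
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for struct urb; only the embedded node matters. */
struct fake_urb {
	int id;
	struct list_head urb_list;
};

/* Same idea as the kernel's list_entry(): member pointer -> containing struct. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the containing fake_urb from a pointer to its list node. */
struct fake_urb *urb_from_node(struct list_head *node)
{
	return list_entry(node, struct fake_urb, urb_list);
}
```

One caveat when reading the patch's output: the conversion is only meaningful for nodes that really are embedded in an URB. If prev or next happens to be the endpoint's list head, the printed "urbprev" or "urbnext" value is just the head address minus the member offset, not a real URB.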
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 22:43, Artem S. Tashkinov wrote:
>> Nice. Could you do that again with the patch applied I sent you some
>> hours ago?
>
> That patch was of no help - the system has crashed and I couldn't spot
> relevant messages.
>
> I've no idea what it means.

The sequence of driver callbacks issued on a stream start is

 .open()
 .hw_params()
 .prepare()
 .trigger()

If the ALSA part really causes this issue, the bad things happen either
in any of the driver callback functions or in the core underneath.

The patch I sent returns an error from the hw_params callback, and as
you still see the problem, that means that the crash happens before any
of the USB audio streaming really starts.

Could you try and return -EINVAL from snd_usb_capture_open() please?

If anyone has a better idea on how to debug this, please chime in.

Daniel
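The bisection-by-callback reasoning above can be sketched in plain C. This is a hypothetical model, not ALSA code: `struct stream_ops`, `start_stream()`, `step_ok()` and `step_fail()` are made-up names, and the real PCM callbacks take arguments that are omitted here. It only illustrates that forcing an early callback to fail keeps every later one from running.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PCM driver's callback table. */
struct stream_ops {
	int (*open)(void);
	int (*hw_params)(void);
	int (*prepare)(void);
	int (*trigger)(void);
};

/*
 * Call the callbacks in start-up order and stop at the first failure.
 * Returns the number of callbacks that completed successfully (0..4).
 */
int start_stream(const struct stream_ops *ops)
{
	int (*const steps[])(void) = {
		ops->open, ops->hw_params, ops->prepare, ops->trigger,
	};
	size_t i;

	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++)
		if (steps[i]() < 0)
			break;
	return (int)i;
}

/* Trivial callbacks for experimenting with the sequence. */
int step_ok(void)   { return 0; }
int step_fail(void) { return -EINVAL; }
```

With hw_params failing, only open has run; if the crash still occurs in that setup, prepare and trigger cannot be responsible. Making open itself fail, as requested for snd_usb_capture_open(), narrows the window one step further.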
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
> Nice. Could you do that again with the patch applied I sent you some
> hours ago?

That patch was of no help - the system has crashed and I couldn't spot
relevant messages.

I've no idea what it means.

Artem
Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, Oct 21, 2012 at 07:49:01PM +0000, Artem S. Tashkinov wrote:
> I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes,
> straight to a hdd partition to eliminate a FS cache)

Well, I'm no fs guy but this should still go through the buffer cache. I
think the O_SYNC flag makes sure it all lands on the partition in time.
Oh well, it doesn't matter.

> Don't judge me harshly - I'm not a programmer.

If you wrote that and you're not a programmer, it certainly looks cool,
good job!

[ Btw, don't forget to free(buffer) at the end. ]

Also, there was a patchset recently which added a blockconsole method to
the kernel with which you can do something like that in a generic way.

Back to the issue at hand: it looks like ehci_hcd is causing some list
corruptions, maybe coming from the uvcvideo or whatever. I think the usb
people will have a better idea.

Btw, is there any particular reason you're running a 32-bit kernel?

Thanks.

--
Regards/Gruss,
Boris.
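For reference, the two remarks above (rely on O_SYNC, and free the buffer) could be combined roughly as follows. This is a sketch under assumptions, not a replacement for the posted tool: `copy_sync()` is a hypothetical helper name, it uses read()/write() instead of fread(), and it adds O_CREAT so it can also be tried on a regular file rather than only on an existing partition.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define BUF_SIZE 4096

/*
 * Copy everything from in_fd to the file at path. The output is opened
 * with O_SYNC, so each write() is expected to return only after the data
 * has been pushed out to the underlying device.
 * Returns 0 on success, -1 on error.
 */
int copy_sync(int in_fd, const char *path)
{
	char *buf = malloc(BUF_SIZE);
	ssize_t n = -1;
	int out_fd, rc = -1;

	if (buf == NULL)
		return -1;
	out_fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
	if (out_fd == -1)
		goto out_free;

	while ((n = read(in_fd, buf, BUF_SIZE)) > 0) {
		ssize_t done = 0;

		/* write() may be partial, so loop until the chunk is out. */
		while (done < n) {
			ssize_t w = write(out_fd, buf + done, (size_t)(n - done));

			if (w < 0)
				goto out_close;
			done += w;
		}
	}
	if (n == 0)		/* clean end of input */
		rc = 0;
out_close:
	if (close(out_fd) == -1)
		rc = -1;
out_free:
	free(buf);		/* released on every path */
	return rc;
}
```

A caller would pass STDIN_FILENO as in_fd to get the same pipeline shape as the original invocation.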
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 21:49, Artem S. Tashkinov wrote:
>> On Oct 21, 2012, Borislav Petkov wrote:
>>
>> On Sun, Oct 21, 2012 at 11:59:36AM +0000, Artem S. Tashkinov wrote:
>>> http://imageshack.us/a/img685/9452/panicz.jpg
>>>
>>> list_del corruption. prev->next should be ... but was ...
>>
>> Btw, this is one of the debug options I told you to enable.
>>
>>> I cannot show you more as I have no serial console to use :( and the kernel
>>> doesn't have enough time to push error messages to rsyslog and fsync
>>> /var/log/messages
>>
>> I already told you how to catch that oops: boot with "pause_on_oops=600"
>> on the kernel command line and photograph the screen when the first oops
>> happens. This'll show us where the problem begins.
>
> This option didn't have any effect, or maybe it's because it's such a serious
> crash the kernel has no time to actually print an oops/panic message.
>
> dmesg messages up to a crash can be seen here:
> https://bugzilla.kernel.org/attachment.cgi?id=84221

Nice. Could you do that again with the patch applied I sent you some
hours ago?

Thanks,
Daniel
Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
> On Oct 21, 2012, Borislav Petkov wrote:
>
> On Sun, Oct 21, 2012 at 11:59:36AM +0000, Artem S. Tashkinov wrote:
> > http://imageshack.us/a/img685/9452/panicz.jpg
> >
> > list_del corruption. prev->next should be ... but was ...
>
> Btw, this is one of the debug options I told you to enable.
>
> > I cannot show you more as I have no serial console to use :( and the kernel
> > doesn't have enough time to push error messages to rsyslog and fsync
> > /var/log/messages
>
> I already told you how to catch that oops: boot with "pause_on_oops=600"
> on the kernel command line and photograph the screen when the first oops
> happens. This'll show us where the problem begins.

This option didn't have any effect, or maybe it's because it's such a serious
crash the kernel has no time to actually print an oops/panic message.

dmesg messages up to a crash can be seen here:
https://bugzilla.kernel.org/attachment.cgi?id=84221

I dumped them using this application:

$ cat scat.c

#define _GNU_SOURCE	/* needed for the open64() declaration */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

#ifndef O_LARGEFILE
#define O_LARGEFILE 0100000
#endif
#define BUFFER 4096
#define __USE_FILE_OFFSET64 1
#define __USE_LARGEFILE64 1

int main(int argc, char *argv[])
{
	int fd_out;
	int64_t bytes_read;
	void *buffer;

	if (argc!=2) {
		printf("Usage is: scat destination\n");
		return 1;
	}

	buffer = malloc(BUFFER * sizeof(char));
	if (buffer == NULL) {
		printf("Error: can't allocate buffers\n");
		return 2;
	}
	memset(buffer, 0, BUFFER);

	printf("Dumping to \"%s\" ... ", argv[1]);
	fflush(NULL);

	/* O_SYNC: every write() should reach the device before returning */
	if ((fd_out = open64(argv[1], O_WRONLY | O_LARGEFILE | O_SYNC | O_NOFOLLOW,
			     S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)) == -1) {
		printf("Error: destination file can't be created\n");
		perror("open() ");
		return 2;
	}

	/* copy stdin to the destination until fread() returns 0 */
	bytes_read = 1;
	while (bytes_read) {
		bytes_read = fread(buffer, sizeof(char), BUFFER, stdin);
		if (write(fd_out, (void *) buffer, bytes_read) != bytes_read) {
			printf("Error: can't write data to the destination file! Possibly a target disk is full\n");
			return 3;
		}
	}

	close(fd_out);
	printf(" OK\n");
	return 0;
}

I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes, straight
to a hdd partition to eliminate a FS cache)

Don't judge me harshly - I'm not a programmer.
Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, Oct 21, 2012 at 11:59:36AM +0000, Artem S. Tashkinov wrote:
> http://imageshack.us/a/img685/9452/panicz.jpg
>
> list_del corruption. prev->next should be ... but was ...

Btw, this is one of the debug options I told you to enable.

> I cannot show you more as I have no serial console to use :( and the kernel
> doesn't have enough time to push error messages to rsyslog and fsync
> /var/log/messages

I already told you how to catch that oops: boot with "pause_on_oops=600"
on the kernel command line and photograph the screen when the first oops
happens. This'll show us where the problem begins.

--
Regards/Gruss,
Boris.
Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, 21 Oct 2012, Daniel Mack wrote:

> As the usb list is still in Cc: - Artem's lspci dump shows that his
> machine features XHCI controllers. Can anyone think of a relation to
> this problem?
>
> And Artem, is there any way you boot your system on an older machine
> that only has EHCI ports? Thinking about it, I wonder whether the freeze
> in VBox and the crashes on native hardware have the same root cause. In
> that case, would it be possible to share that VBox image?

Don't grasp at straws. All of the kernel logs Artem has posted show
ehci-hcd; none of them show xhci-hcd. Therefore the xHCI controller is
highly unlikely to be involved.

Alan Stern
Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, 21 Oct 2012, Artem S. Tashkinov wrote:

> What I've found out is that my system crashes *only* when I try to enable
> usb-audio (from the same webcam) - I still have no idea how to capture a
> panic message, but I ran
>
> "while :; do dmesg -c; done" in xterm, then I got like thousands of messages
> and I photographed my monitor:
>
> http://imageshack.us/a/img685/9452/panicz.jpg
>
> list_del corruption. prev->next should be ... but was ...
>
> I cannot show you more as I have no serial console to use :( and the kernel
> doesn't have enough time to push error messages to rsyslog and fsync
> /var/log/messages

Is it possible to use netconsole? The screenshot above appears to be
the end of a long series of error messages, which isn't too useful.
The most important information is in the first error.

Alan Stern
Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 16:57, Artem S. Tashkinov wrote: >> On Oct 21, 2012, Daniel Mack wrote: >> >> [Cc: alsa-devel] >> >> On 21.10.2012 14:30, Artem S. Tashkinov wrote: >>> On Oct 21, 2012, Daniel Mack wrote: >>> A hint at least. How did you enable the audio record exactly? Can you reproduce this with arecord? What chipset are you on? Please provide both "lspci -v" and "lsusb -v" dumps. As I said, I fail to reproduce that issue on any of my machines. >>> >>> All other applications can read from the USB audio without problems, it's >>> just something in the way Adobe Flash polls my audio input which causes >>> a crash. >>> >>> Just video capture (without audio) works just fine in Adobe Flash. >> >> Ok, so that pretty much rules out the host controller. I just wonder why >> I still don't see it here, and I haven't heard of any such problem from >> anyone else. >> >> Some more questions: >> >> - Which version of Flash are you running? > > Google Chrome has its own version of Adobe Flash: > > Name: Shockwave Flash > Description: Shockwave Flash 11.4 r31 > Version: 11.4.31.110 So that's the same that I'm using. >> - Does this also happen with Firefox? > > No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102), > it shows > just two input devices instead of three which the newer Flash players sees. > > * HDA Intel PCH > * USB Device 0x46d:0x81d And that works, I assume? Does the second choice in the newer Flash version work maybe? >> - Does flash access the device directly or via PulseAudio? > > PA is not installed on my computer, so Flash accesses it directly via ALSA > calls. Ok, Same here. >> - Could you please apply the attached patch and see what it spits out to >> dmesg once Flash opens the device? It returns -EINVAL in the hw_params >> callback to prevent the actual streaming. On my machine with Flash >> 11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane. >> Or does your machine still crash before anything is written to the logs? 
> > I will try it a bit later. Yes, we need to trace the call chain and see at which point the trouble starts. What could help is tracing the google-chrome binary with strace maybe. At least we would see the ioctl command sequence, if the log file survives the crash. As the usb list is still in Cc: - Artem's lcpci dump shows that his machine features XHCI controllers. Can anyone think of a relation to this problem? And Artem, is there any way you boot your system on an older machine that only has EHCI ports? Thinking about it, I wonder whether the freeze in VBox and the crashes on native hardware have the same root cause. In that case, would it be possible to share that VBox image? Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
> On Oct 21, 2012, Daniel Mack wrote: > > [Cc: alsa-devel] > > On 21.10.2012 14:30, Artem S. Tashkinov wrote: > > On Oct 21, 2012, Daniel Mack wrote: > > > >> A hint at least. How did you enable the audio record exactly? Can you > >> reproduce this with arecord? > >> > >> What chipset are you on? Please provide both "lspci -v" and "lsusb -v" > >> dumps. As I said, I fail to reproduce that issue on any of my machines. > > > > All other applications can read from the USB audio without problems, it's > > just something in the way Adobe Flash polls my audio input which causes > > a crash. > > > > Just video capture (without audio) works just fine in Adobe Flash. > > Ok, so that pretty much rules out the host controller. I just wonder why > I still don't see it here, and I haven't heard of any such problem from > anyone else. > > Some more questions: > > - Which version of Flash are you running? Google Chrome has its own version of Adobe Flash: Name: Shockwave Flash Description:Shockwave Flash 11.4 r31 Version:11.4.31.110 > - Does this also happen with Firefox? No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102), it shows just two input devices instead of three which the newer Flash players sees. * HDA Intel PCH * USB Device 0x46d:0x81d > - Does flash access the device directly or via PulseAudio? PA is not installed on my computer, so Flash accesses it directly via ALSA calls. > - Could you please apply the attached patch and see what it spits out to > dmesg once Flash opens the device? It returns -EINVAL in the hw_params > callback to prevent the actual streaming. On my machine with Flash > 11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane. > Or does your machine still crash before anything is written to the logs? I will try it a bit later. > > Only and only when I choose to use > > > > USB Device 0x46d:0x81d my system crashes in Adobe Flash. 
> > > > See the screenshot: > > > > https://bugzilla.kernel.org/attachment.cgi?id=84151 > > When exactly does the crash happen? Right after you selected that entry > from the list? There's a little recording level meter in that dialog. > Does that show any input from the microphone? Yes, right after I select it and move the mouse cursor away from this combobox so that this selection becomes active. > > My hardware information can be fetched from here: > > > > https://bugzilla.kernel.org/show_bug.cgi?id=49181 > > > > On a second thought that can be even an ALSA crash or pretty much > > anything else. > > We'll see. Thanks for your help to sort this out! Thank you for your assistance! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
[Cc: alsa-devel]

On 21.10.2012 14:30, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Daniel Mack wrote:
>
>> A hint at least. How did you enable the audio record exactly? Can you
>> reproduce this with arecord?
>>
>> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
>> dumps. As I said, I fail to reproduce that issue on any of my machines.
>
> All other applications can read from the USB audio without problems, it's
> just something in the way Adobe Flash polls my audio input which causes
> a crash.
>
> Just video capture (without audio) works just fine in Adobe Flash.

Ok, so that pretty much rules out the host controller. I just wonder why
I still don't see it here, and I haven't heard of any such problem from
anyone else.

Some more questions:

- Which version of Flash are you running?
- Does this also happen with Firefox?
- Does flash access the device directly or via PulseAudio?
- Could you please apply the attached patch and see what it spits out to
dmesg once Flash opens the device? It returns -EINVAL in the hw_params
callback to prevent the actual streaming. On my machine with Flash
11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
Or does your machine still crash before anything is written to the logs?

> Only and only when I choose to use
>
> USB Device 0x46d:0x81d my system crashes in Adobe Flash.
>
> See the screenshot:
>
> https://bugzilla.kernel.org/attachment.cgi?id=84151

When exactly does the crash happen? Right after you selected that entry
from the list? There's a little recording level meter in that dialog.
Does that show any input from the microphone?

> My hardware information can be fetched from here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=49181
>
> On a second thought that can be even an ALSA crash or pretty much
> anything else.

We'll see. Thanks for your help to sort this out!

Daniel

diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index f782ce1..5664b45 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -453,6 +453,18 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream,
 	unsigned int channels, rate, format;
 	int ret, changed;
 
+
+	printk(">>> %s()\n", __func__);
+
+	printk("format: %d\n", params_format(hw_params));
+	printk("rate: %d\n", params_rate(hw_params));
+	printk("channels: %d\n", params_channels(hw_params));
+	printk("buffer bytes: %d\n", params_buffer_bytes(hw_params));
+	printk("period bytes: %d\n", params_period_bytes(hw_params));
+	printk("access: %d\n", params_access(hw_params));
+
+	return -EINVAL;
+
 	ret = snd_pcm_lib_alloc_vmalloc_buffer(substream,
 					       params_buffer_bytes(hw_params));
 	if (ret < 0)
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Daniel Mack wrote:

> A hint at least. How did you enable the audio record exactly? Can you
> reproduce this with arecord?
>
> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
> dumps. As I said, I fail to reproduce that issue on any of my machines.

All other applications can read from the USB audio without problems, it's
just something in the way Adobe Flash polls my audio input which causes
a crash.

Just video capture (without audio) works just fine in Adobe Flash.

Only and only when I choose to use

USB Device 0x46d:0x81d my system crashes in Adobe Flash.

See the screenshot:

https://bugzilla.kernel.org/attachment.cgi?id=84151

My hardware information can be fetched from here:

https://bugzilla.kernel.org/show_bug.cgi?id=49181

On a second thought that can be even an ALSA crash or pretty much
anything else.
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 13:59, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Borislav Petkov wrote:
>>
>> On Sun, Oct 21, 2012 at 01:57:21AM +0000, Artem S. Tashkinov wrote:
>>> The freeze happens on my *host* Linux PC. For an experiment I decided
>>> to check if I could reproduce the freeze under a virtual machine - it
>>> turns out the Linux kernel running under it also freezes.
>>
>> I know that - but a freeze != oops - at least not necessarily. Which
>> means it could very well be a different issue now that vbox is gone.
>>
>> Or, it could be the same issue with different incarnations: with vbox
>> you get the corruptions and without it, you get the freezes. I'm
>> assuming you do the same flash player thing in both cases?
>>
>> Here's a crazy idea: can you try to reproduce it in KVM?
>
> OK, dismiss VBox altogether - it has a very buggy USB implementation, thus
> it just hangs when trying to access my webcam.
>
> What I've found out is that my system crashes *only* when I try to enable
> usb-audio (from the same webcam)

It would also be interesting to know whether you have problems with
*only* the video capture, with some tool like "cheese". It might be
you're hitting a host controller issue here, and then isochronous input
packets on the video interface would most likely also trigger such an
effect. Actually, knowing whether that's the case would be crucial for
further debugging.

Daniel
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 13:59, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Borislav Petkov wrote:
>>
>> On Sun, Oct 21, 2012 at 01:57:21AM +0000, Artem S. Tashkinov wrote:
>>> The freeze happens on my *host* Linux PC. For an experiment I decided
>>> to check if I could reproduce the freeze under a virtual machine - it
>>> turns out the Linux kernel running under it also freezes.
>>
>> I know that - but a freeze != oops - at least not necessarily. Which
>> means it could very well be a different issue now that vbox is gone.
>>
>> Or, it could be the same issue with different incarnations: with vbox
>> you get the corruptions and without it, you get the freezes. I'm
>> assuming you do the same flash player thing in both cases?
>>
>> Here's a crazy idea: can you try to reproduce it in KVM?
>
> OK, dismiss VBox altogether - it has a very buggy USB implementation, thus
> it just hangs when trying to access my webcam.

Ok.

> What I've found out is that my system crashes *only* when I try to enable
> usb-audio (from the same webcam) - I still have no idea how to capture a
> panic message, but I ran
>
> "while :; do dmesg -c; done" in xterm, then I got like thousands of messages
> and I photographed my monitor:
>
> http://imageshack.us/a/img685/9452/panicz.jpg

A hint at least. How did you enable the audio record exactly? Can you
reproduce this with arecord?

What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
dumps. As I said, I fail to reproduce that issue on any of my machines.

Daniel
Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Borislav Petkov wrote:
>
> On Sun, Oct 21, 2012 at 01:57:21AM +0000, Artem S. Tashkinov wrote:
> > The freeze happens on my *host* Linux PC. For an experiment I decided
> > to check if I could reproduce the freeze under a virtual machine - it
> > turns out the Linux kernel running under it also freezes.
>
> I know that - but a freeze != oops - at least not necessarily. Which
> means it could very well be a different issue now that vbox is gone.
>
> Or, it could be the same issue with different incarnations: with vbox
> you get the corruptions and without it, you get the freezes. I'm
> assuming you do the same flash player thing in both cases?
>
> Here's a crazy idea: can you try to reproduce it in KVM?

OK, dismiss VBox altogether - it has a very buggy USB implementation, thus
it just hangs when trying to access my webcam.

What I've found out is that my system crashes *only* when I try to enable
usb-audio (from the same webcam) - I still have no idea how to capture a
panic message, but I ran

"while :; do dmesg -c; done" in xterm, then I got like thousands of messages
and I photographed my monitor:

http://imageshack.us/a/img685/9452/panicz.jpg

list_del corruption. prev->next should be ... but was ...

I cannot show you more as I have no serial console to use :( and the kernel
doesn't have enough time to push error messages to rsyslog and fsync
/var/log/messages
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 12:34, Daniel Mack wrote: > On 21.10.2012 01:15, Artem S. Tashkinov wrote: >> You don't get me - I have *no* VirtualBox (or any proprietary) modules >> running >> - but I can reproduce this problem using *the same system running under* >> VirtualBox >> in Windows 7 64. >> >> It's almost definitely either a USB driver bug or video4linux driver bug: >> >> I'm CC'ing linux-media and linux-usb mailing lists, the problem is described >> here: >> https://lkml.org/lkml/2012/10/20/35 >> https://lkml.org/lkml/2012/10/20/148 >> >> Here are the last lines from my dmesg (with usbmon loaded): >> >> [ 292.164833] hub 1-0:1.0: state 7 ports 8 chg evt 0002 >> [ 292.168091] ehci_hcd :00:1f.5: GetStatus port:1 status 00100a 0 ACK >> POWER sig=se0 PEC CSC >> [ 292.172063] hub 1-0:1.0: port 1, status 0100, change 0003, 12 Mb/s >> [ 292.174883] usb 1-1: USB disconnect, device number 2 >> [ 292.178045] usb 1-1: unregistering device >> [ 292.183539] usb 1-1: unregistering interface 1-1:1.0 >> [ 292.197034] usb 1-1: unregistering interface 1-1:1.1 >> [ 292.204317] usb 1-1: unregistering interface 1-1:1.2 >> [ 292.234519] usb 1-1: unregistering interface 1-1:1.3 >> [ 292.236175] usb 1-1: usb_disable_device nuking all URBs >> [ 292.364429] hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms >> status 0x100 >> [ 294.364279] hub 1-0:1.0: hub_suspend >> [ 294.366045] usb usb1: bus auto-suspend, wakeup 1 >> [ 294.367375] ehci_hcd :00:1f.5: suspend root hub >> [ 296.501084] usb usb1: usb wakeup-resume >> [ 296.508311] usb usb1: usb auto-resume >> [ 296.509833] ehci_hcd :00:1f.5: resume root hub >> [ 296.560149] hub 1-0:1.0: hub_resume >> [ 296.562240] ehci_hcd :00:1f.5: GetStatus port:1 status 001003 0 ACK >> POWER sig=se0 CSC CONNECT >> [ 296.566141] hub 1-0:1.0: port 1: status 0501 change 0001 >> [ 296.670413] hub 1-0:1.0: state 7 ports 8 chg 0002 evt >> [ 296.673222] hub 1-0:1.0: port 1, status 0501, change , 480 Mb/s >> [ 297.311720] usb 1-1: new high-speed USB device 
number 3 using ehci_hcd >> [ 300.547237] usb 1-1: skipped 1 descriptor after configuration >> [ 300.549443] usb 1-1: skipped 4 descriptors after interface >> [ 300.552273] usb 1-1: skipped 2 descriptors after interface >> [ 300.556499] usb 1-1: skipped 1 descriptor after endpoint >> [ 300.559392] usb 1-1: skipped 2 descriptors after interface >> [ 300.560960] usb 1-1: skipped 1 descriptor after endpoint >> [ 300.562169] usb 1-1: skipped 2 descriptors after interface >> [ 300.563440] usb 1-1: skipped 1 descriptor after endpoint >> [ 300.564639] usb 1-1: skipped 2 descriptors after interface >> [ 300.565828] usb 1-1: skipped 2 descriptors after endpoint >> [ 300.567084] usb 1-1: skipped 9 descriptors after interface >> [ 300.569205] usb 1-1: skipped 1 descriptor after endpoint >> [ 300.570484] usb 1-1: skipped 53 descriptors after interface >> [ 300.595843] usb 1-1: default language 0x0409 >> [ 300.602503] usb 1-1: USB interface quirks for this device: 2 >> [ 300.605700] usb 1-1: udev 3, busnum 1, minor = 2 >> [ 300.606959] usb 1-1: New USB device found, idVendor=046d, idProduct=081d >> [ 300.610298] usb 1-1: New USB device strings: Mfr=0, Product=0, >> SerialNumber=1 >> [ 300.613742] usb 1-1: SerialNumber: 48C5D2B0 >> [ 300.617703] usb 1-1: usb_probe_device >> [ 300.620594] usb 1-1: configuration #1 chosen from 1 choice >> [ 300.639218] usb 1-1: adding 1-1:1.0 (config #1, interface 0) >> [ 300.640736] snd-usb-audio 1-1:1.0: usb_probe_interface >> [ 300.642307] snd-usb-audio 1-1:1.0: usb_probe_interface - got id >> [ 301.050296] usb 1-1: adding 1-1:1.1 (config #1, interface 1) >> [ 301.054897] usb 1-1: adding 1-1:1.2 (config #1, interface 2) >> [ 301.056934] uvcvideo 1-1:1.2: usb_probe_interface >> [ 301.058072] uvcvideo 1-1:1.2: usb_probe_interface - got id >> [ 301.059395] uvcvideo: Found UVC 1.00 device (046d:081d) >> [ 301.090173] input: UVC Camera (046d:081d) as >> /devices/pci:00/:00:1f.5/usb1/1-1/1-1:1.2/input/input7 > > That seems to be a Logitech model. 
> >> [ 301.111289] usb 1-1: adding 1-1:1.3 (config #1, interface 3) >> [ 301.131207] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] >> [ 301.137066] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] >> [ 301.156451] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule >> [ 301.158310] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] >> [ 301.160238] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] >> [ 301.196606] set resolution quirk: cval->res = 384 >> [ 371.309569] e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow >> Control: RX >> [ 390.729568] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule >> f5ade900 2296555[ 390.730023] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 >> us] >> 437 S Ii:1:003:7[ 390.736394] usb 1-1: unlink qh16-0001/f48d64c0 start 2 >> [1/0 us] >> -115:128 16 < >> f5ade900 2296566256 C Ii:1:003:7 -2:128 0 >> [ 391.100896] ehci_hcd
Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, Oct 21, 2012 at 01:57:21AM +, Artem S. Tashkinov wrote:
> The freeze happens on my *host* Linux PC. For an experiment I decided
> to check if I could reproduce the freeze under a virtual machine - it
> turns out the Linux kernel running under it also freezes.

I know that - but a freeze != oops - at least not necessarily. Which
means it could very well be a different issue now that vbox is gone. Or,
it could be the same issue with different incarnations: with vbox you get
the corruptions and without it, you get the freezes.

I'm assuming you do the same flash player thing in both cases?

Here's a crazy idea: can you try to reproduce it in KVM?

Thanks.

--
Regards/Gruss,
Boris.
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 01:15, Artem S. Tashkinov wrote: > You don't get me - I have *no* VirtualBox (or any proprietary) modules running > - but I can reproduce this problem using *the same system running under* > VirtualBox > in Windows 7 64. > > It's almost definitely either a USB driver bug or video4linux driver bug: > > I'm CC'ing linux-media and linux-usb mailing lists, the problem is described > here: > https://lkml.org/lkml/2012/10/20/35 > https://lkml.org/lkml/2012/10/20/148 > > Here are the last lines from my dmesg (with usbmon loaded): > > [ 292.164833] hub 1-0:1.0: state 7 ports 8 chg evt 0002 > [ 292.168091] ehci_hcd :00:1f.5: GetStatus port:1 status 00100a 0 ACK > POWER sig=se0 PEC CSC > [ 292.172063] hub 1-0:1.0: port 1, status 0100, change 0003, 12 Mb/s > [ 292.174883] usb 1-1: USB disconnect, device number 2 > [ 292.178045] usb 1-1: unregistering device > [ 292.183539] usb 1-1: unregistering interface 1-1:1.0 > [ 292.197034] usb 1-1: unregistering interface 1-1:1.1 > [ 292.204317] usb 1-1: unregistering interface 1-1:1.2 > [ 292.234519] usb 1-1: unregistering interface 1-1:1.3 > [ 292.236175] usb 1-1: usb_disable_device nuking all URBs > [ 292.364429] hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status > 0x100 > [ 294.364279] hub 1-0:1.0: hub_suspend > [ 294.366045] usb usb1: bus auto-suspend, wakeup 1 > [ 294.367375] ehci_hcd :00:1f.5: suspend root hub > [ 296.501084] usb usb1: usb wakeup-resume > [ 296.508311] usb usb1: usb auto-resume > [ 296.509833] ehci_hcd :00:1f.5: resume root hub > [ 296.560149] hub 1-0:1.0: hub_resume > [ 296.562240] ehci_hcd :00:1f.5: GetStatus port:1 status 001003 0 ACK > POWER sig=se0 CSC CONNECT > [ 296.566141] hub 1-0:1.0: port 1: status 0501 change 0001 > [ 296.670413] hub 1-0:1.0: state 7 ports 8 chg 0002 evt > [ 296.673222] hub 1-0:1.0: port 1, status 0501, change , 480 Mb/s > [ 297.311720] usb 1-1: new high-speed USB device number 3 using ehci_hcd > [ 300.547237] usb 1-1: skipped 1 descriptor after configuration 
> [ 300.549443] usb 1-1: skipped 4 descriptors after interface > [ 300.552273] usb 1-1: skipped 2 descriptors after interface > [ 300.556499] usb 1-1: skipped 1 descriptor after endpoint > [ 300.559392] usb 1-1: skipped 2 descriptors after interface > [ 300.560960] usb 1-1: skipped 1 descriptor after endpoint > [ 300.562169] usb 1-1: skipped 2 descriptors after interface > [ 300.563440] usb 1-1: skipped 1 descriptor after endpoint > [ 300.564639] usb 1-1: skipped 2 descriptors after interface > [ 300.565828] usb 1-1: skipped 2 descriptors after endpoint > [ 300.567084] usb 1-1: skipped 9 descriptors after interface > [ 300.569205] usb 1-1: skipped 1 descriptor after endpoint > [ 300.570484] usb 1-1: skipped 53 descriptors after interface > [ 300.595843] usb 1-1: default language 0x0409 > [ 300.602503] usb 1-1: USB interface quirks for this device: 2 > [ 300.605700] usb 1-1: udev 3, busnum 1, minor = 2 > [ 300.606959] usb 1-1: New USB device found, idVendor=046d, idProduct=081d > [ 300.610298] usb 1-1: New USB device strings: Mfr=0, Product=0, > SerialNumber=1 > [ 300.613742] usb 1-1: SerialNumber: 48C5D2B0 > [ 300.617703] usb 1-1: usb_probe_device > [ 300.620594] usb 1-1: configuration #1 chosen from 1 choice > [ 300.639218] usb 1-1: adding 1-1:1.0 (config #1, interface 0) > [ 300.640736] snd-usb-audio 1-1:1.0: usb_probe_interface > [ 300.642307] snd-usb-audio 1-1:1.0: usb_probe_interface - got id > [ 301.050296] usb 1-1: adding 1-1:1.1 (config #1, interface 1) > [ 301.054897] usb 1-1: adding 1-1:1.2 (config #1, interface 2) > [ 301.056934] uvcvideo 1-1:1.2: usb_probe_interface > [ 301.058072] uvcvideo 1-1:1.2: usb_probe_interface - got id > [ 301.059395] uvcvideo: Found UVC 1.00 device (046d:081d) > [ 301.090173] input: UVC Camera (046d:081d) as > /devices/pci:00/:00:1f.5/usb1/1-1/1-1:1.2/input/input7 That seems to be a Logitech model. 
> [ 301.111289] usb 1-1: adding 1-1:1.3 (config #1, interface 3) > [ 301.131207] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] > [ 301.137066] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] > [ 301.156451] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule > [ 301.158310] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] > [ 301.160238] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] > [ 301.196606] set resolution quirk: cval->res = 384 > [ 371.309569] e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow > Control: RX > [ 390.729568] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule > f5ade900 2296555[ 390.730023] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 > us] > 437 S Ii:1:003:7[ 390.736394] usb 1-1: unlink qh16-0001/f48d64c0 start 2 > [1/0 us] > -115:128 16 < > f5ade900 2296566256 C Ii:1:003:7 -2:128 0 > [ 391.100896] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule > [ 391.103188] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] > f5ade900 2296926929 S Ii:1:003:7[
Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Borislav Petkov wrote:
> On Sun, Oct 21, 2012 at 01:57:21AM +, Artem S. Tashkinov wrote:
>> The freeze happens on my *host* Linux PC. For an experiment I decided
>> to check if I could reproduce the freeze under a virtual machine - it
>> turns out the Linux kernel running under it also freezes.
>
> I know that - but a freeze != oops - at least not necessarily. Which
> means it could very well be a different issue now that vbox is gone. Or,
> it could be the same issue with different incarnations: with vbox you get
> the corruptions and without it, you get the freezes.
>
> I'm assuming you do the same flash player thing in both cases?
>
> Here's a crazy idea: can you try to reproduce it in KVM?

OK, dismiss VBox altogether - it has a very buggy USB implementation, thus
it just hangs when trying to access my webcam.

What I've found out is that my system crashes *only* when I try to enable
usb-audio (from the same webcam) - I still have no idea how to capture a
panic message, but I ran

"while :; do dmesg -c; done" in xterm, then I got like thousands of
messages and I photographed my monitor:

http://imageshack.us/a/img685/9452/panicz.jpg

list_del corruption. prev->next should be ... but was ...

I cannot show you more as I have no serial console to use :( and the kernel
doesn't have enough time to push error messages to rsyslog and fsync
/var/log/messages
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 13:59, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Borislav Petkov wrote:
>> On Sun, Oct 21, 2012 at 01:57:21AM +, Artem S. Tashkinov wrote:
>>> The freeze happens on my *host* Linux PC. For an experiment I decided
>>> to check if I could reproduce the freeze under a virtual machine - it
>>> turns out the Linux kernel running under it also freezes.
>>
>> I know that - but a freeze != oops - at least not necessarily. Which
>> means it could very well be a different issue now that vbox is gone. Or,
>> it could be the same issue with different incarnations: with vbox you get
>> the corruptions and without it, you get the freezes.
>>
>> I'm assuming you do the same flash player thing in both cases?
>>
>> Here's a crazy idea: can you try to reproduce it in KVM?
>
> OK, dismiss VBox altogether - it has a very buggy USB implementation, thus
> it just hangs when trying to access my webcam.

Ok.

> What I've found out is that my system crashes *only* when I try to enable
> usb-audio (from the same webcam) - I still have no idea how to capture a
> panic message, but I ran
>
> "while :; do dmesg -c; done" in xterm, then I got like thousands of
> messages and I photographed my monitor:
>
> http://imageshack.us/a/img685/9452/panicz.jpg

A hint at least. How did you enable the audio record exactly? Can you
reproduce this with arecord?

What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
dumps. As I said, I fail to reproduce that issue on any of my machines.

Daniel
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 13:59, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Borislav Petkov wrote:
>> On Sun, Oct 21, 2012 at 01:57:21AM +, Artem S. Tashkinov wrote:
>>> The freeze happens on my *host* Linux PC. For an experiment I decided
>>> to check if I could reproduce the freeze under a virtual machine - it
>>> turns out the Linux kernel running under it also freezes.
>>
>> I know that - but a freeze != oops - at least not necessarily. Which
>> means it could very well be a different issue now that vbox is gone. Or,
>> it could be the same issue with different incarnations: with vbox you get
>> the corruptions and without it, you get the freezes.
>>
>> I'm assuming you do the same flash player thing in both cases?
>>
>> Here's a crazy idea: can you try to reproduce it in KVM?
>
> OK, dismiss VBox altogether - it has a very buggy USB implementation, thus
> it just hangs when trying to access my webcam.
>
> What I've found out is that my system crashes *only* when I try to enable
> usb-audio (from the same webcam)

It would also be interesting to know whether you have problems with *only*
the video capture, with some tool like cheese. It might be you're hitting a
host controller issue here, and then isochronous input packets on the video
interface would most likely also trigger such an effect. Actually, knowing
whether that's the case would be crucial for further debugging.

Daniel
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Daniel Mack wrote:
> A hint at least. How did you enable the audio record exactly? Can you
> reproduce this with arecord?
>
> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
> dumps. As I said, I fail to reproduce that issue on any of my machines.

All other applications can read from the USB audio without problems, it's
just something in the way Adobe Flash polls my audio input which causes a
crash.

Just video capture (without audio) works just fine in Adobe Flash.

Only and only when I choose to use

USB Device 0x46d:0x81d my system crashes in Adobe Flash.

See the screenshot:

https://bugzilla.kernel.org/attachment.cgi?id=84151

My hardware information can be fetched from here:

https://bugzilla.kernel.org/show_bug.cgi?id=49181

On a second thought that can be even an ALSA crash or pretty much anything
else.
Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
[Cc: alsa-devel]

On 21.10.2012 14:30, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Daniel Mack wrote:
>> A hint at least. How did you enable the audio record exactly? Can you
>> reproduce this with arecord?
>>
>> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
>> dumps. As I said, I fail to reproduce that issue on any of my machines.
>
> All other applications can read from the USB audio without problems, it's
> just something in the way Adobe Flash polls my audio input which causes a
> crash.
>
> Just video capture (without audio) works just fine in Adobe Flash.

Ok, so that pretty much rules out the host controller. I just wonder why I
still don't see it here, and I haven't heard of any such problem from
anyone else.

Some more questions:

- Which version of Flash are you running?
- Does this also happen with Firefox?
- Does flash access the device directly or via PulseAudio?
- Could you please apply the attached patch and see what it spits out to
  dmesg once Flash opens the device? It returns -EINVAL in the hw_params
  callback to prevent the actual streaming. On my machine with Flash
  11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
  Or does your machine still crash before anything is written to the logs?

> Only and only when I choose to use
>
> USB Device 0x46d:0x81d my system crashes in Adobe Flash.
>
> See the screenshot:
>
> https://bugzilla.kernel.org/attachment.cgi?id=84151

When exactly does the crash happen? Right after you selected that entry
from the list? There's a little recording level meter in that dialog. Does
that show any input from the microphone?

> My hardware information can be fetched from here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=49181
>
> On a second thought that can be even an ALSA crash or pretty much anything
> else.

We'll see. Thanks for your help to sort this out!
Daniel

diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index f782ce1..5664b45 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -453,6 +453,18 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream,
 	unsigned int channels, rate, format;
 	int ret, changed;
 
+
+	printk(" %s()\n", __func__);
+
+	printk("format: %d\n", params_format(hw_params));
+	printk("rate: %d\n", params_rate(hw_params));
+	printk("channels: %d\n", params_channels(hw_params));
+	printk("buffer bytes: %d\n", params_buffer_bytes(hw_params));
+	printk("period bytes: %d\n", params_period_bytes(hw_params));
+	printk("access: %d\n", params_access(hw_params));
+
+	return -EINVAL;
+
 	ret = snd_pcm_lib_alloc_vmalloc_buffer(substream,
 					       params_buffer_bytes(hw_params));
 	if (ret < 0)
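For comparing reports across machines, the six values printed by the debug patch above can be collected from dmesg output and joined in the same slash-separated order Daniel quotes. This is a minimal sketch and not part of the patch: the function names and the sample log lines (including their timestamps) are hypothetical, and it assumes the printk labels appear exactly as in the diff.

```python
import re

# Labels printed by the debug patch, in the order the values are quoted
# in the thread (format/rate/channels/buffer bytes/period bytes/access).
LABELS = ["format", "rate", "channels", "buffer bytes", "period bytes", "access"]

# Matches e.g. "[  400.000002] format: 2"; the timestamp prefix is optional.
LINE_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?("
    + "|".join(re.escape(label) for label in LABELS)
    + r"): (-?\d+)\s*$"
)


def parse_hw_params(dmesg_text):
    """Return a dict holding the last value seen for each known label."""
    values = {}
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def summarize(values):
    """Join the values in LABELS order; '?' marks a label that was not seen."""
    return "/".join(str(values.get(label, "?")) for label in LABELS)


# Hypothetical sample: the timestamps are invented, the values are the ones
# Daniel reports from his own machine.
SAMPLE = """\
[  400.000001]  snd_usb_hw_params()
[  400.000002] format: 2
[  400.000003] rate: 44800
[  400.000004] channels: 1
[  400.000005] buffer bytes: 32768
[  400.000006] period bytes: 2048
[  400.000007] access: 0
"""

if __name__ == "__main__":
    print(summarize(parse_hw_params(SAMPLE)))
```

Feeding it the output of dmesg after Flash opens the device would give one line that is easy to paste into a reply; a "?" in any position means the corresponding printk never reached the log.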
Re: Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Daniel Mack wrote:
> [Cc: alsa-devel]
>
> On 21.10.2012 14:30, Artem S. Tashkinov wrote:
>> On Oct 21, 2012, Daniel Mack wrote:
>>> A hint at least. How did you enable the audio record exactly? Can you
>>> reproduce this with arecord?
>>>
>>> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
>>> dumps. As I said, I fail to reproduce that issue on any of my machines.
>>
>> All other applications can read from the USB audio without problems, it's
>> just something in the way Adobe Flash polls my audio input which causes a
>> crash.
>>
>> Just video capture (without audio) works just fine in Adobe Flash.
>
> Ok, so that pretty much rules out the host controller. I just wonder why I
> still don't see it here, and I haven't heard of any such problem from
> anyone else.
>
> Some more questions:
>
> - Which version of Flash are you running?

Google Chrome has its own version of Adobe Flash:

Name: Shockwave Flash
Description: Shockwave Flash 11.4 r31
Version: 11.4.31.110

> - Does this also happen with Firefox?

No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102),
it shows just two input devices instead of three which the newer Flash
player sees.

* HDA Intel PCH
* USB Device 0x46d:0x81d

> - Does flash access the device directly or via PulseAudio?

PA is not installed on my computer, so Flash accesses it directly via ALSA
calls.

> - Could you please apply the attached patch and see what it spits out to
>   dmesg once Flash opens the device? It returns -EINVAL in the hw_params
>   callback to prevent the actual streaming. On my machine with Flash
>   11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
>   Or does your machine still crash before anything is written to the logs?

I will try it a bit later.

>> Only and only when I choose to use
>>
>> USB Device 0x46d:0x81d my system crashes in Adobe Flash.
>>
>> See the screenshot:
>>
>> https://bugzilla.kernel.org/attachment.cgi?id=84151
>
> When exactly does the crash happen? Right after you selected that entry
> from the list? There's a little recording level meter in that dialog. Does
> that show any input from the microphone?

Yes, right after I select it and move the mouse cursor away from this
combobox so that this selection becomes active.

>> My hardware information can be fetched from here:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=49181
>>
>> On a second thought that can be even an ALSA crash or pretty much anything
>> else.
>
> We'll see. Thanks for your help to sort this out!

Thank you for your assistance!
Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 16:57, Artem S. Tashkinov wrote:
> On Oct 21, 2012, Daniel Mack wrote:
>> [Cc: alsa-devel]
>>
>> On 21.10.2012 14:30, Artem S. Tashkinov wrote:
>>> On Oct 21, 2012, Daniel Mack wrote:
>>>> A hint at least. How did you enable the audio record exactly? Can you
>>>> reproduce this with arecord?
>>>>
>>>> What chipset are you on? Please provide both "lspci -v" and "lsusb -v"
>>>> dumps. As I said, I fail to reproduce that issue on any of my machines.
>>>
>>> All other applications can read from the USB audio without problems,
>>> it's just something in the way Adobe Flash polls my audio input which
>>> causes a crash.
>>>
>>> Just video capture (without audio) works just fine in Adobe Flash.
>>
>> Ok, so that pretty much rules out the host controller. I just wonder why
>> I still don't see it here, and I haven't heard of any such problem from
>> anyone else.
>>
>> Some more questions:
>>
>> - Which version of Flash are you running?
>
> Google Chrome has its own version of Adobe Flash:
>
> Name: Shockwave Flash
> Description: Shockwave Flash 11.4 r31
> Version: 11.4.31.110

So that's the same that I'm using.

>> - Does this also happen with Firefox?
>
> No, Adobe Flash in Firefox is an older version (Shockwave Flash 11.1 r102),
> it shows just two input devices instead of three which the newer Flash
> player sees.
>
> * HDA Intel PCH
> * USB Device 0x46d:0x81d

And that works, I assume? Does the second choice in the newer Flash version
work maybe?

>> - Does flash access the device directly or via PulseAudio?
>
> PA is not installed on my computer, so Flash accesses it directly via ALSA
> calls.

Ok, Same here.

>> - Could you please apply the attached patch and see what it spits out to
>>   dmesg once Flash opens the device? It returns -EINVAL in the hw_params
>>   callback to prevent the actual streaming. On my machine with Flash
>>   11.4.31.110, I get values of 2/44800/1/32768/2048/0, which seems sane.
>>   Or does your machine still crash before anything is written to the
>>   logs?
>
> I will try it a bit later.

Yes, we need to trace the call chain and see at which point the trouble
starts. What could help is tracing the google-chrome binary with strace
maybe. At least we would see the ioctl command sequence, if the log file
survives the crash.

As the usb list is still in Cc: - Artem's lspci dump shows that his machine
features XHCI controllers. Can anyone think of a relation to this problem?

And Artem, is there any way you boot your system on an older machine that
only has EHCI ports? Thinking about it, I wonder whether the freeze in VBox
and the crashes on native hardware have the same root cause. In that case,
would it be possible to share that VBox image?

Daniel
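If an strace log of the browser does survive the crash, the ioctl sequence mentioned above could be pulled out with a small filter. This is only a sketch under stated assumptions: it assumes the log was written with something like `strace -f -o chrome.trace google-chrome`, the file name and the sample lines are made up for illustration, and calls that strace reports as unfinished are simply skipped.

```python
import re

# Matches completed ioctl lines such as
#   "1234 ioctl(27, SNDRV_PCM_IOCTL_HW_PARAMS, 0x7ffc1000) = -1 EINVAL (...)"
# The leading PID column is optional (it appears when following children
# into an output file).
IOCTL_RE = re.compile(r"^(?:(\d+)\s+)?ioctl\((\d+),\s*([^,)]+).*\)\s*=\s*(-?\d+)")


def parse_ioctls(trace_text):
    """Return (pid, fd, request, result) tuples for each completed ioctl."""
    calls = []
    for line in trace_text.splitlines():
        match = IOCTL_RE.match(line)
        if not match:
            continue  # other syscalls, or calls that never completed
        pid = int(match.group(1)) if match.group(1) else None
        calls.append(
            (pid, int(match.group(2)), match.group(3).strip(), int(match.group(4)))
        )
    return calls


# Hypothetical trace excerpt, invented for illustration only.
SAMPLE_TRACE = """\
1234 open("/dev/snd/pcmC1D0c", O_RDWR|O_NONBLOCK) = 27
1234 ioctl(27, SNDRV_PCM_IOCTL_HW_PARAMS, 0x7ffc1000) = -1 EINVAL (Invalid argument)
1235 ioctl(27, SNDRV_PCM_IOCTL_PREPARE) = 0
1235 ioctl(27, SNDRV_PCM_IOCTL_READI_FRAMES <unfinished ...>
"""

if __name__ == "__main__":
    for pid, fd, request, result in parse_ioctls(SAMPLE_TRACE):
        print(pid, fd, request, result)
```

The last tuples in the list show which requests completed before the log ends; an unfinished final call, which this filter drops, would have to be read from the raw file.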
Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, 21 Oct 2012, Artem S. Tashkinov wrote: What I've found out is that my system crashes *only* when I try to enable usb-audio (from the same webcam) - I still have no idea how to capture a panic message, but I ran while :; do dmesg -c; done in xterm, then I got like thousands of messages and I photographed my monitor: http://imageshack.us/a/img685/9452/panicz.jpg list_del corruption. prev->next should be ... but was ... I cannot show you more as I have no serial console to use :( and the kernel doesn't have enough time to push error messages to rsyslog and fsync /var/log/messages Is it possible to use netconsole? The screenshot above appears to be the end of a long series of error messages, which isn't too useful. The most important information is in the first error. Alan Stern
Re: was: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, 21 Oct 2012, Daniel Mack wrote: As the usb list is still in Cc: - Artem's lspci dump shows that his machine features XHCI controllers. Can anyone think of a relation to this problem? And Artem, is there any way you could boot your system on an older machine that only has EHCI ports? Thinking about it, I wonder whether the freeze in VBox and the crashes on native hardware have the same root cause. In that case, would it be possible to share that VBox image? Don't grasp at straws. All of the kernel logs Artem has posted show ehci-hcd; none of them show xhci-hcd. Therefore the xHCI controller is highly unlikely to be involved. Alan Stern
Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, Oct 21, 2012 at 11:59:36AM +, Artem S. Tashkinov wrote: http://imageshack.us/a/img685/9452/panicz.jpg list_del corruption. prev->next should be ... but was ... Btw, this is one of the debug options I told you to enable. I cannot show you more as I have no serial console to use :( and the kernel doesn't have enough time to push error messages to rsyslog and fsync /var/log/messages I already told you how to catch that oops: boot with pause_on_oops=600 on the kernel command line and photograph the screen when the first oops happens. This'll show us where the problem begins. -- Regards/Gruss, Boris.
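For readers following along, the kind of sanity check that produces a "list_del corruption" message can be sketched in user space. This is only a minimal sketch, assuming a circular doubly linked list modeled on the kernel's struct list_head; the names list_node, list_init, list_add and list_del_checked are hypothetical and are not the kernel's own code. It shows why such a check typically fires when an entry is removed twice or when a neighbour's pointers have been overwritten.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for a circular doubly linked list node. */
struct list_node {
    struct list_node *next, *prev;
};

/* An empty list is a head that points to itself in both directions. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert node right after head. */
static void list_add(struct list_node *node, struct list_node *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/*
 * Unlink entry only if both neighbours still point back at it.
 * Returns 0 on success and -1 if the list looks corrupted.
 * The entry's own pointers are deliberately left stale here so that a
 * second delete is caught by the check (an assumption of this sketch).
 */
static int list_del_checked(struct list_node *entry)
{
    assert(entry != NULL);

    if (entry->prev->next != entry) {
        fprintf(stderr,
                "list_del corruption. prev->next should be %p, but was %p\n",
                (void *)entry, (void *)entry->prev->next);
        return -1;
    }
    if (entry->next->prev != entry) {
        fprintf(stderr,
                "list_del corruption. next->prev should be %p, but was %p\n",
                (void *)entry, (void *)entry->next->prev);
        return -1;
    }
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    return 0;
}
```

Deleting the same node twice makes the second call report the corruption instead of silently damaging the list, which is why the first such message in a log is the interesting one.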
Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Borislav Petkov b...@alien8.de wrote: On Sun, Oct 21, 2012 at 11:59:36AM +, Artem S. Tashkinov wrote: http://imageshack.us/a/img685/9452/panicz.jpg list_del corruption. prev->next should be ... but was ... Btw, this is one of the debug options I told you to enable. I cannot show you more as I have no serial console to use :( and the kernel doesn't have enough time to push error messages to rsyslog and fsync /var/log/messages I already told you how to catch that oops: boot with pause_on_oops=600 on the kernel command line and photograph the screen when the first oops happens. This'll show us where the problem begins. This option didn't have any effect, or maybe it's because it's such a serious crash the kernel has no time to actually print an oops/panic message. dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221 I dumped them using this application:

$ cat scat.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define O_LARGEFILE 010
#define BUFFER 4096
#define __USE_FILE_OFFSET64 1
#define __USE_LARGEFILE64 1

int main(int argc, char *argv[])
{
    int fd_out;
    int64_t bytes_read;
    void *buffer;

    if (argc!=2) {
        printf("Usage is: scat destination\n");
        return 1;
    }

    buffer = malloc(BUFFER * sizeof(char));
    if (buffer == NULL) {
        printf("Error: can't allocate buffers\n");
        return 2;
    }
    memset(buffer, 0, BUFFER);

    printf("Dumping to \"%s\" ... ", argv[1]);
    fflush(NULL);

    if ((fd_out = open64(argv[1], O_WRONLY | O_LARGEFILE | O_SYNC | O_NOFOLLOW, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH)) == -1) {
        printf("Error: destination file can't be created\n");
        perror("open() ");
        return 2;
    }

    bytes_read = 1;
    while (bytes_read) {
        bytes_read = fread(buffer, sizeof(char), BUFFER, stdin);
        if (write(fd_out, (void *) buffer, bytes_read) != bytes_read) {
            printf("Error: can't write data to the destination file! Possibly a target disk is full\n");
            return 3;
        }
    }

    close(fd_out);
    printf(" OK\n");
    return 0;
}

I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes, straight to a hdd partition to eliminate a FS cache) Don't judge me harshly - I'm not a programmer.
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 21:49, Artem S. Tashkinov wrote: On Oct 21, 2012, Borislav Petkov b...@alien8.de wrote: On Sun, Oct 21, 2012 at 11:59:36AM +, Artem S. Tashkinov wrote: http://imageshack.us/a/img685/9452/panicz.jpg list_del corruption. prev->next should be ... but was ... Btw, this is one of the debug options I told you to enable. I cannot show you more as I have no serial console to use :( and the kernel doesn't have enough time to push error messages to rsyslog and fsync /var/log/messages I already told you how to catch that oops: boot with pause_on_oops=600 on the kernel command line and photograph the screen when the first oops happens. This'll show us where the problem begins. This option didn't have any effect, or maybe it's because it's such a serious crash the kernel has no time to actually print an oops/panic message. dmesg messages up to a crash can be seen here: https://bugzilla.kernel.org/attachment.cgi?id=84221 Nice. Could you do that again with the patch applied I sent you some hours ago? Thanks, Daniel
Re: Re: Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sun, Oct 21, 2012 at 07:49:01PM +, Artem S. Tashkinov wrote: I ran it this way: while :; do dmesg -c; done | scat /dev/sda11 (yes, straight to a hdd partition to eliminate a FS cache) Well, I'm no fs guy but this should still go through the buffer cache. I think the O_SYNC flag makes sure it all lands on the partition in time. Oh well, it doesn't matter. Don't judge me harshly - I'm not a programmer. If you wrote that and you're not a programmer, it certainly looks cool, good job! [ Btw, don't forget to free(buffer) at the end. ] Also, there was a patchset recently which added a blockconsole method to the kernel with which you can do something like that in a generic way. Back to the issue at hand: it looks like ehci_hcd is causing some list corruptions, maybe coming from the uvcvideo or whatever. I think the usb people will have a better idea. Btw, is there any particular reason you're running a 32-bit kernel? Thanks. -- Regards/Gruss, Boris.
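The two remarks above (rely on O_SYNC for durability, and free the buffer) can be illustrated with a small variant of the copy loop. This is a hedged sketch, not a replacement for scat.c: the helper names write_all and copy_synced are hypothetical, O_CREAT and the 0644 mode are added here only so the sketch also works on regular files, and it uses read(2) rather than stdio so that short writes and EINTR can be handled explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK 4096

/* Write exactly len bytes, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy everything readable from in_fd to path, opened with O_SYNC so each
 * write returns only after the data has been handed to the device.
 * Returns the number of bytes copied, or -1 on error. */
static long copy_synced(int in_fd, const char *path)
{
    char *buf;
    long total = 0;
    int out_fd, rc = 0;
    ssize_t n;

    assert(path != NULL);

    buf = malloc(CHUNK);
    if (buf == NULL)
        return -1;

    out_fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
    if (out_fd < 0) {
        free(buf);
        return -1;
    }

    while ((n = read(in_fd, buf, CHUNK)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        if (write_all(out_fd, buf, (size_t)n) < 0) {
            rc = -1;
            break;
        }
        total += n;
    }

    if (close(out_fd) < 0)
        rc = -1;
    free(buf); /* released on every path once the buffer exists */
    return rc < 0 ? -1 : total;
}
```

With in_fd set to STDIN_FILENO this could be fed the same way as before, e.g. from the dmesg loop, though whether it captures more of the log before a hard freeze is not something the sketch can promise.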
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
Nice. Could you do that again with the patch applied I sent you some hours ago? That patch was of no help - the system has crashed and I couldn't spot relevant messages. I've no idea what it means. Artem
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On 21.10.2012 22:43, Artem S. Tashkinov wrote: Nice. Could you do that again with the patch applied I sent you some hours ago? That patch was of no help - the system has crashed and I couldn't spot relevant messages. I've no idea what it means. The sequence of driver callbacks issued on a stream start is .open(), .hw_params(), .prepare(), .trigger(). If the ALSA part really causes this issue, the bad things happen either in one of the driver callback functions or in the core underneath. The patch I sent returns an error from the hw_params callback, and as you still see the problem, that means that the crash happens before any of the USB audio streaming really starts. Could you try and return -EINVAL from snd_usb_capture_open() please? If anyone has a better idea on how to debug this, please chime in. Daniel
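The bisecting idea in this message can be sketched as a toy model: run the four callbacks in order and force an early error at a chosen stage, then see how far the sequence got. This is only an illustration under stated assumptions; the enum, the stage table and start_stream() are hypothetical names and do not correspond to real ALSA or snd-usb-audio code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* The stages named in the message, in the order they are issued. */
enum stage {
    STAGE_OPEN,
    STAGE_HW_PARAMS,
    STAGE_PREPARE,
    STAGE_TRIGGER,
    STAGE_COUNT
};

static const char *const stage_names[STAGE_COUNT] = {
    ".open()", ".hw_params()", ".prepare()", ".trigger()"
};

/* Name of the stage that was made to fail, or NULL if none was. */
static const char *blocked_stage_name(int fail_at)
{
    if (fail_at < 0 || fail_at >= STAGE_COUNT)
        return NULL;
    return stage_names[fail_at];
}

/*
 * Walk the stages in order. The stage equal to fail_at returns -EINVAL
 * immediately, which stops the sequence. Returns how many stages completed;
 * *err receives 0 or the first error. Pass fail_at = -1 for a normal start.
 */
static int start_stream(int fail_at, int *err)
{
    int done = 0;

    assert(err != NULL);
    *err = 0;

    for (int s = 0; s < STAGE_COUNT; s++) {
        if (s == fail_at) {
            *err = -EINVAL;
            break;
        }
        done++;
    }
    return done;
}
```

In this model, failing at STAGE_HW_PARAMS leaves only .open() completed; if the crash still reproduces, moving the forced error up to STAGE_OPEN is the next step, which matches the request to return -EINVAL from snd_usb_capture_open().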
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, 20 Oct 2012, Artem S. Tashkinov wrote: > You don't get me - I have *no* VirtualBox (or any proprietary) modules running > - but I can reproduce this problem using *the same system running under* > VirtualBox > in Windows 7 64. > > It's almost definitely either a USB driver bug or video4linux driver bug: Does the same thing happen with earlier kernel versions? What about if you unload snd-usb-audio or ehci-hcd? Alan Stern
Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
> On Oct 21, 2012, Borislav Petkov wrote: > > On Sat, Oct 20, 2012 at 11:15:17PM +, Artem S. Tashkinov wrote: > > You don't get me - I have *no* VirtualBox (or any proprietary) modules > > running > > Ok, good. We got that out of the way - I wanted to make sure after you > replied with two other possibilities of the system freezing. > > > - but I can reproduce this problem using *the same system running > > under* VirtualBox in Windows 7 64. > > That's windoze as host and linux as a guest, correct? Exactly. > If so, that's virtualbox's problem, I'd say. I can reproduce it on my host *alone* as I said in the very first message - never before I tried to run my Linux in a virtual machine. Please, just forget about VirtualBox - it has nothing to do with this problem. > > It's almost definitely either a USB driver bug or video4linux driver > > bug: > > And you're assuming that because the freeze happens when using your usb > webcam, correct? And not otherwise? Yes, like I said earlier - only when I try to access its settings using Adobe Flash the system crashes/freezes. > Maybe you can describe in more detail what exactly you're doing so that > people could try to reproduce your issue. I don't think many people have the same webcam so it's going to be a problem. It can be reproduced easily - just open Flash "Settings" in Google Chrome 22. The crash will occur immediately. > > I'm CC'ing linux-media and linux-usb mailing lists, the problem is > > described here: > > https://lkml.org/lkml/2012/10/20/35 > > https://lkml.org/lkml/2012/10/20/148 > > Yes, good idea. Maybe the folks there have some more ideas how to debug > this. > > I'm leaving in the rest for reference. > > What should be pointed out, though, is that you don't have any more > random corruptions causing oopses now that virtualbox is gone. The > freeze below is a whole another issue. The freeze happens on my *host* Linux PC. 
For an experiment I decided to check if I could reproduce the freeze under a virtual machine - it turns out the Linux kernel running under it also freezes. Artem
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Oct 20, 2012 at 11:15:17PM +, Artem S. Tashkinov wrote: > You don't get me - I have *no* VirtualBox (or any proprietary) modules > running Ok, good. We got that out of the way - I wanted to make sure after you replied with two other possibilities of the system freezing. > - but I can reproduce this problem using *the same system running > under* VirtualBox in Windows 7 64. That's windoze as host and linux as a guest, correct? If so, that's virtualbox's problem, I'd say. > It's almost definitely either a USB driver bug or video4linux driver > bug: And you're assuming that because the freeze happens when using your usb webcam, correct? And not otherwise? Maybe you can describe in more detail what exactly you're doing so that people could try to reproduce your issue. > I'm CC'ing linux-media and linux-usb mailing lists, the problem is described > here: > https://lkml.org/lkml/2012/10/20/35 > https://lkml.org/lkml/2012/10/20/148 Yes, good idea. Maybe the folks there have some more ideas how to debug this. I'm leaving in the rest for reference. What should be pointed out, though, is that you don't have any more random corruptions causing oopses now that virtualbox is gone. The freeze below is a whole another issue. Thanks. 
> Here are the last lines from my dmesg (with usbmon loaded): > > [ 292.164833] hub 1-0:1.0: state 7 ports 8 chg evt 0002 > [ 292.168091] ehci_hcd :00:1f.5: GetStatus port:1 status 00100a 0 ACK > POWER sig=se0 PEC CSC > [ 292.172063] hub 1-0:1.0: port 1, status 0100, change 0003, 12 Mb/s > [ 292.174883] usb 1-1: USB disconnect, device number 2 > [ 292.178045] usb 1-1: unregistering device > [ 292.183539] usb 1-1: unregistering interface 1-1:1.0 > [ 292.197034] usb 1-1: unregistering interface 1-1:1.1 > [ 292.204317] usb 1-1: unregistering interface 1-1:1.2 > [ 292.234519] usb 1-1: unregistering interface 1-1:1.3 > [ 292.236175] usb 1-1: usb_disable_device nuking all URBs > [ 292.364429] hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status > 0x100 > [ 294.364279] hub 1-0:1.0: hub_suspend > [ 294.366045] usb usb1: bus auto-suspend, wakeup 1 > [ 294.367375] ehci_hcd :00:1f.5: suspend root hub > [ 296.501084] usb usb1: usb wakeup-resume > [ 296.508311] usb usb1: usb auto-resume > [ 296.509833] ehci_hcd :00:1f.5: resume root hub > [ 296.560149] hub 1-0:1.0: hub_resume > [ 296.562240] ehci_hcd :00:1f.5: GetStatus port:1 status 001003 0 ACK > POWER sig=se0 CSC CONNECT > [ 296.566141] hub 1-0:1.0: port 1: status 0501 change 0001 > [ 296.670413] hub 1-0:1.0: state 7 ports 8 chg 0002 evt > [ 296.673222] hub 1-0:1.0: port 1, status 0501, change , 480 Mb/s > [ 297.311720] usb 1-1: new high-speed USB device number 3 using ehci_hcd > [ 300.547237] usb 1-1: skipped 1 descriptor after configuration > [ 300.549443] usb 1-1: skipped 4 descriptors after interface > [ 300.552273] usb 1-1: skipped 2 descriptors after interface > [ 300.556499] usb 1-1: skipped 1 descriptor after endpoint > [ 300.559392] usb 1-1: skipped 2 descriptors after interface > [ 300.560960] usb 1-1: skipped 1 descriptor after endpoint > [ 300.562169] usb 1-1: skipped 2 descriptors after interface > [ 300.563440] usb 1-1: skipped 1 descriptor after endpoint > [ 300.564639] usb 1-1: skipped 2 descriptors 
after interface > [ 300.565828] usb 1-1: skipped 2 descriptors after endpoint > [ 300.567084] usb 1-1: skipped 9 descriptors after interface > [ 300.569205] usb 1-1: skipped 1 descriptor after endpoint > [ 300.570484] usb 1-1: skipped 53 descriptors after interface > [ 300.595843] usb 1-1: default language 0x0409 > [ 300.602503] usb 1-1: USB interface quirks for this device: 2 > [ 300.605700] usb 1-1: udev 3, busnum 1, minor = 2 > [ 300.606959] usb 1-1: New USB device found, idVendor=046d, idProduct=081d > [ 300.610298] usb 1-1: New USB device strings: Mfr=0, Product=0, > SerialNumber=1 > [ 300.613742] usb 1-1: SerialNumber: 48C5D2B0 > [ 300.617703] usb 1-1: usb_probe_device > [ 300.620594] usb 1-1: configuration #1 chosen from 1 choice > [ 300.639218] usb 1-1: adding 1-1:1.0 (config #1, interface 0) > [ 300.640736] snd-usb-audio 1-1:1.0: usb_probe_interface > [ 300.642307] snd-usb-audio 1-1:1.0: usb_probe_interface - got id > [ 301.050296] usb 1-1: adding 1-1:1.1 (config #1, interface 1) > [ 301.054897] usb 1-1: adding 1-1:1.2 (config #1, interface 2) > [ 301.056934] uvcvideo 1-1:1.2: usb_probe_interface > [ 301.058072] uvcvideo 1-1:1.2: usb_probe_interface - got id > [ 301.059395] uvcvideo: Found UVC 1.00 device (046d:081d) > [ 301.090173] input: UVC Camera (046d:081d) as > /devices/pci:00/:00:1f.5/usb1/1-1/1-1:1.2/input/input7 > [ 301.111289] usb 1-1: adding 1-1:1.3 (config #1, interface 3) > [ 301.131207] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] > [ 301.137066] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] > [ 301.156451] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule > [
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
You don't get me - I have *no* VirtualBox (or any proprietary) modules running - but I can reproduce this problem using *the same system running under* VirtualBox in Windows 7 64. It's almost definitely either a USB driver bug or video4linux driver bug: I'm CC'ing linux-media and linux-usb mailing lists, the problem is described here: https://lkml.org/lkml/2012/10/20/35 https://lkml.org/lkml/2012/10/20/148 Here are the last lines from my dmesg (with usbmon loaded): [ 292.164833] hub 1-0:1.0: state 7 ports 8 chg evt 0002 [ 292.168091] ehci_hcd :00:1f.5: GetStatus port:1 status 00100a 0 ACK POWER sig=se0 PEC CSC [ 292.172063] hub 1-0:1.0: port 1, status 0100, change 0003, 12 Mb/s [ 292.174883] usb 1-1: USB disconnect, device number 2 [ 292.178045] usb 1-1: unregistering device [ 292.183539] usb 1-1: unregistering interface 1-1:1.0 [ 292.197034] usb 1-1: unregistering interface 1-1:1.1 [ 292.204317] usb 1-1: unregistering interface 1-1:1.2 [ 292.234519] usb 1-1: unregistering interface 1-1:1.3 [ 292.236175] usb 1-1: usb_disable_device nuking all URBs [ 292.364429] hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x100 [ 294.364279] hub 1-0:1.0: hub_suspend [ 294.366045] usb usb1: bus auto-suspend, wakeup 1 [ 294.367375] ehci_hcd :00:1f.5: suspend root hub [ 296.501084] usb usb1: usb wakeup-resume [ 296.508311] usb usb1: usb auto-resume [ 296.509833] ehci_hcd :00:1f.5: resume root hub [ 296.560149] hub 1-0:1.0: hub_resume [ 296.562240] ehci_hcd :00:1f.5: GetStatus port:1 status 001003 0 ACK POWER sig=se0 CSC CONNECT [ 296.566141] hub 1-0:1.0: port 1: status 0501 change 0001 [ 296.670413] hub 1-0:1.0: state 7 ports 8 chg 0002 evt [ 296.673222] hub 1-0:1.0: port 1, status 0501, change , 480 Mb/s [ 297.311720] usb 1-1: new high-speed USB device number 3 using ehci_hcd [ 300.547237] usb 1-1: skipped 1 descriptor after configuration [ 300.549443] usb 1-1: skipped 4 descriptors after interface [ 300.552273] usb 1-1: skipped 2 descriptors after interface [ 
300.556499] usb 1-1: skipped 1 descriptor after endpoint [ 300.559392] usb 1-1: skipped 2 descriptors after interface [ 300.560960] usb 1-1: skipped 1 descriptor after endpoint [ 300.562169] usb 1-1: skipped 2 descriptors after interface [ 300.563440] usb 1-1: skipped 1 descriptor after endpoint [ 300.564639] usb 1-1: skipped 2 descriptors after interface [ 300.565828] usb 1-1: skipped 2 descriptors after endpoint [ 300.567084] usb 1-1: skipped 9 descriptors after interface [ 300.569205] usb 1-1: skipped 1 descriptor after endpoint [ 300.570484] usb 1-1: skipped 53 descriptors after interface [ 300.595843] usb 1-1: default language 0x0409 [ 300.602503] usb 1-1: USB interface quirks for this device: 2 [ 300.605700] usb 1-1: udev 3, busnum 1, minor = 2 [ 300.606959] usb 1-1: New USB device found, idVendor=046d, idProduct=081d [ 300.610298] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=1 [ 300.613742] usb 1-1: SerialNumber: 48C5D2B0 [ 300.617703] usb 1-1: usb_probe_device [ 300.620594] usb 1-1: configuration #1 chosen from 1 choice [ 300.639218] usb 1-1: adding 1-1:1.0 (config #1, interface 0) [ 300.640736] snd-usb-audio 1-1:1.0: usb_probe_interface [ 300.642307] snd-usb-audio 1-1:1.0: usb_probe_interface - got id [ 301.050296] usb 1-1: adding 1-1:1.1 (config #1, interface 1) [ 301.054897] usb 1-1: adding 1-1:1.2 (config #1, interface 2) [ 301.056934] uvcvideo 1-1:1.2: usb_probe_interface [ 301.058072] uvcvideo 1-1:1.2: usb_probe_interface - got id [ 301.059395] uvcvideo: Found UVC 1.00 device (046d:081d) [ 301.090173] input: UVC Camera (046d:081d) as /devices/pci:00/:00:1f.5/usb1/1-1/1-1:1.2/input/input7 [ 301.111289] usb 1-1: adding 1-1:1.3 (config #1, interface 3) [ 301.131207] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] [ 301.137066] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] [ 301.156451] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule [ 301.158310] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] [ 301.160238] usb 1-1: unlink 
qh16-0001/f48d64c0 start 2 [1/0 us] [ 301.196606] set resolution quirk: cval->res = 384 [ 371.309569] e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 390.729568] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule f5ade900 2296555[ 390.730023] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] 437 S Ii:1:003:7[ 390.736394] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] -115:128 16 < f5ade900 2296566256 C Ii:1:003:7 -2:128 0 [ 391.100896] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule [ 391.103188] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us] f5ade900 2296926929 S Ii:1:003:7[ 391.104889] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us] -115:128 16 < f5ade900 2296937889 C Ii:1:003:7 -2:128 0 f5272300 2310382508 S Co:1:003:0 s 01 0b 0004 0001 0 f5272300 2310407888 C Co:1:003:0 0 0 f5272300 2310408051 S Co:1:003:0 s 22 01 0100 0086
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Oct 20, 2012 at 10:32:28PM +0200, Pavel Machek wrote: > On Sat 2012-10-20 17:41:49, Artem S. Tashkinov wrote: > > On Oct 20, 2012, Borislav Petkov wrote: > > > > > Yeah, your kernel is tainted with a proprietary module (vbox*, etc). Can > > > you reproduce your corruptions (this is what it looks like) without that > > > module? > > > > Yes, I can reproduce this panic with zero proprietary/non-free modules > > loaded. > > > > The problem is the kernel doesn't even print a kernel panic - the > > system just freezes completely - cursor in a text console stops > > blinking. > > bugtraq? :-). > > If remote website can crash your Linux, that's quite significant news. > > (Cc-ed netdev@ and security@ ... this may be important). I don't think that's the problem - I rather suspect the fact that he's using virtualbox which is causing random corruptions by writing to arbitrary locations. Artem, please remove virtualbox completely from your system, rebuild the kernel and make sure the virtualbox kernel modules don't get loaded - simply delete them so that they are completely gone; *and* *then* retest again. Thanks. -- Regards/Gruss, Boris.
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
Hi, I can only reproduce this panic when my USB webcamera is plugged in - when I click settings in Adobe Flash it sends some commands to my USB webcam using, presumably, Video4Linux API calls which cause a kernel hard crash. Your kernel debug features haven't helped at all, even the virtual machine crashes the way I cannot get any information from it - under Windows 7 64 VirtualBox becomes an unkillable process. I've no idea what's crashing - it can be the kernel itself, or some of v4l or usb modules. Artem
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Borislav Petkov wrote: > Ok, here's what you can try: > > * You say this happens with google chrome. Does it happen if you use > another browser: firefox, etc? > > * Can you build a 64-bit kernel and try the same with it? The 32-bit > userspace should work in compat mode just fine. > > * Can you run memtest on your machine and check whether your DIMMs > aren't generating ECC errors? Are your DIMMs ECC, btw? > ... I can reproduce this problem in a virtual machine, which means I have found a real kernel or GCC bug. Alas, VirtualBox 4.2.2 hangs entirely when I run this virtual machine - I've never seen anything like that. Windows 7 64 bit which hosts this VirtualBox cannot even kill a VirtualBox instance. Unfortunately even though I run the kernel with "console=ttyS0,115200 console=tty0" parameters they don't help - I see no panic messages on a "virtual" serial port, which looks like we've got a very deep freeze.
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat 2012-10-20 17:41:49, Artem S. Tashkinov wrote: > On Oct 20, 2012, Borislav Petkov wrote: > > > Yeah, your kernel is tainted with a proprietary module (vbox*, etc). Can > > you reproduce your corruptions (this is what it looks like) without that > > module? > > Yes, I can reproduce this panic with zero proprietary/non-free modules loaded. > > The problem is the kernel doesn't even print a kernel panic - the > system just freezes completely - cursor in a text console stops > blinking. bugtraq? :-). If remote website can crash your Linux, that's quite significant news. (Cc-ed netdev@ and security@ ... this may be important). Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Oct 20, 2012 at 05:41:49PM +, Artem S. Tashkinov wrote: > On Oct 20, 2012, Borislav Petkov wrote: > > > Yeah, your kernel is tainted with a proprietary module (vbox*, etc). Can > > you reproduce your corruptions (this is what it looks like) without that > > module? > > Yes, I can reproduce this panic with zero proprietary/non-free modules loaded. > > The problem is the kernel doesn't even print a kernel panic - the > system just freezes completely - cursor in a text console stops blinking. > > I have no means to debug it using a serial console - what can I do? Ok, here's what you can try: * You say this happens with google chrome. Does it happen if you use another browser: firefox, etc? * Can you build a 64-bit kernel and try the same with it? The 32-bit userspace should work in compat mode just fine. * Can you run memtest on your machine and check whether your DIMMs aren't generating ECC errors? Are your DIMMs ECC, btw? * What about netconsole? You only need another machine on the same network: Documentation/networking/netconsole.txt. * boot with "pause_on_oops=600" on the kernel command line to stop the machine for 600 secs after the first oops happens. Then try to make a photo of the screen. Make sure to disable X or to be on a text console so that you can see the oops. * Try enabling a bunch of debugging options in "Kernel hacking". More specifically, CONFIG_DETECT_HUNG_TASK CONFIG_DEBUG_PREEMPT CONFIG_DEBUG_SPINLOCK CONFIG_DEBUG_MUTEXES CONFIG_DEBUG_LOCK_ALLOC CONFIG_PROVE_LOCKING CONFIG_PROVE_RCU CONFIG_DEBUG_ATOMIC_SLEEP CONFIG_DEBUG_BUGVERBOSE CONFIG_DEBUG_INFO CONFIG_DEBUG_VM CONFIG_DEBUG_VIRTUAL CONFIG_DEBUG_MEMORY_INIT CONFIG_DEBUG_LIST CONFIG_X86_VERBOSE_BOOTUP CONFIG_DEBUG_RODATA ... I hope those should scream in case something goes awry. HTH. -- Regards/Gruss, Boris. 
Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Oct 20, 2012 at 12:06:55PM +, Artem S. Tashkinov wrote: > Hello, > > I'm running vanilla Linux 3.6.2 x86 on top of CentOS 6.3 userspace. > > Every time when I enter the chat roulette website, right click anywhere and > choose "Settings", > my PC crashes (with or without NVIDIA drivers running, it happens even when > I'm running Vesa). > > Web browser: google-chrome-stable-22.0.1229.94-161065.i386.rpm > OS: Linux 3.6.2 vanilla x86 > CPU: Intel Core i5 2500 (non-overclocked) > GCC: 4.7.2 vanilla > > The latest crash: > > Oct 20 07:15:22 localhost kernel: [ 224.293756] Modules linked in: pppoe > pppox ppp_synctty ppp_async crc_ccitt ppp_generic slhc ipv6 nf_conntrack_ftp > nf_conntrack_netbios_ns nf_conntrack_broadcast xt_LOG xt_limit > nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp xt_pkttype > ipt_ULOG xt_owner xt_multiport iptable_filter ip_tables x_tables w83627ehf > adt7475 hwmon_vid vboxpci(O) > vboxnetadp(O) vboxnetflt(O) vboxdrv(O) binfmt_misc fuse hid_generic > snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi uvcvideo videobuf2_core > videodev > videobuf2_vmalloc videobuf2_memops usbhid hid sr_mod cdrom coretemp > aesni_intel ablk_helper cryptd aes_i586 aes_generic microcode agpgart pcspkr > snd_hda_codec_realtek > snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event snd_seq > snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd snd_page_alloc > i2c_i801 sg xhci_hcd fan ehci_hcd e1000e evdev [last unloaded: nvidia] > > Oct 20 07:15:22 localhost kernel: [ 224.293811] Pid: 2569, comm: > console-kit-dae Tainted: P O 3.6.2-ic #2 Yeah, your kernel is tainted with a proprietary module (vbox*, etc). Can you reproduce your corruptions (this is what it looks like) without that module? Thanks. -- Regards/Gruss, Boris. 
A reliable kernel panic (3.6.2) and system crash when visiting a particular website
Hello,

I'm running vanilla Linux 3.6.2 x86 on top of CentOS 6.3 userspace.

Every time when I enter the chat roulette website, right click anywhere and choose Settings, my PC crashes (with or without NVIDIA drivers running, it happens even when I'm running Vesa).

Web browser: google-chrome-stable-22.0.1229.94-161065.i386.rpm
OS: Linux 3.6.2 vanilla x86
CPU: Intel Core i5 2500 (non-overclocked)
GCC: 4.7.2 vanilla

The latest crash:

Oct 20 07:15:22 localhost kernel: [ 224.293756] Modules linked in: pppoe pppox ppp_synctty ppp_async crc_ccitt ppp_generic slhc ipv6 nf_conntrack_ftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp xt_pkttype ipt_ULOG xt_owner xt_multiport iptable_filter ip_tables x_tables w83627ehf adt7475 hwmon_vid vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) binfmt_misc fuse hid_generic snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi uvcvideo videobuf2_core videodev videobuf2_vmalloc videobuf2_memops usbhid hid sr_mod cdrom coretemp aesni_intel ablk_helper cryptd aes_i586 aes_generic microcode agpgart pcspkr snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd snd_page_alloc i2c_i801 sg xhci_hcd fan ehci_hcd e1000e evdev [last unloaded: nvidia]
Oct 20 07:15:22 localhost kernel: [ 224.293811] Pid: 2569, comm: console-kit-dae Tainted: P O 3.6.2-ic #2
Oct 20 07:15:22 localhost kernel: [ 224.293811] Call Trace:
Oct 20 07:15:22 localhost kernel: [ 224.293813] [c102e6bd] warn_slowpath_common+0x6d/0xa0
Oct 20 07:15:22 localhost kernel: [ 224.293817] [c10209eb] ? default_send_IPI_mask_logical+0x9b/0xd0
Oct 20 07:15:22 localhost kernel: [ 224.293819] [c10209eb] ? default_send_IPI_mask_logical+0x9b/0xd0
Oct 20 07:15:22 localhost kernel: [ 224.293822] [c102e76e] warn_slowpath_fmt+0x2e/0x30
Oct 20 07:15:22 localhost kernel: [ 224.293824] [c10209eb] default_send_IPI_mask_logical+0x9b/0xd0
Oct 20 07:15:22 localhost kernel: [ 224.293827] [c101eb90] native_send_call_func_ipi+0x40/0x60
Oct 20 07:15:22 localhost kernel: [ 224.293830] [c106aa6a] smp_call_function_many+0x16a/0x200
Oct 20 07:15:22 localhost kernel: [ 224.293834] [c102b116] native_flush_tlb_others+0x26/0x30
Oct 20 07:15:22 localhost kernel: [ 224.293836] [c102b422] flush_tlb_page+0x82/0xd0
Oct 20 07:15:22 localhost kernel: [ 224.293839] [c102a3a1] ptep_set_access_flags+0x51/0x60
Oct 20 07:15:22 localhost kernel: [ 224.293841] [c109e030] handle_pte_fault+0x380/0xc40
Oct 20 07:15:22 localhost kernel: [ 224.293846] [c109f6d4] handle_mm_fault+0x1c4/0x240
Oct 20 07:15:22 localhost kernel: [ 224.293848] [c1026870] ? vmalloc_sync_all+0x10/0x10
Oct 20 07:15:22 localhost kernel: [ 224.293852] [c1026967] do_page_fault+0xf7/0x3e0
Oct 20 07:15:22 localhost kernel: [ 224.293855] [c1051782] ? finish_task_switch+0x42/0xa0
Oct 20 07:15:22 localhost kernel: [ 224.293858] [c1052c1b] ? schedule_tail+0x1b/0x90
Oct 20 07:15:22 localhost kernel: [ 224.293861] [c1026870] ? vmalloc_sync_all+0x10/0x10
Oct 20 07:15:22 localhost kernel: [ 224.293863] [c12f950a] error_code+0x5a/0x60
Oct 20 07:15:22 localhost kernel: [ 224.293867] [c1026870] ? vmalloc_sync_all+0x10/0x10
Oct 20 07:15:22 localhost kernel: [ 224.293871] ---[ end trace c30478a5e27a7255 ]---

Another crash:

Oct 20 07:08:21 localhost kernel: [ 146.992435] ------------[ cut here ]------------
Oct 20 07:08:21 localhost kernel: [ 146.992444] WARNING: at arch/x86/kernel/apic/ipi.c:109 default_send_IPI_mask_logical+0x9b/0xd0()
Oct 20 07:08:21 localhost kernel: [ 146.992447] Hardware name: System Product Name
Oct 20 07:08:21 localhost kernel: [ 146.992448] empty IPI mask
Oct 20 07:08:21 localhost kernel: [ 146.992450] Modules linked in: pppoe pppox ppp_synctty ppp_async crc_ccitt ppp_generic slhc ipv6 nf_conntrack_ftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp xt_pkttype ipt_ULOG xt_owner xt_multiport iptable_filter ip_tables x_tables w83627ehf adt7475 hwmon_vid vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) binfmt_misc fuse hid_generic uvcvideo videobuf2_core videodev videobuf2_vmalloc videobuf2_memops snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi usbhid hid sg coretemp aesni_intel ablk_helper cryptd aes_i586 aes_generic microcode sr_mod cdrom pcspkr i2c_i801 xhci_hcd snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd snd_page_alloc ehci_hcd nvidia(PO) agpgart e1000e fan evdev
Oct 20 07:08:21 localhost kernel: [ 146.992503] Pid: 2451, comm: Chrome_ProcessL Tainted: P O 3.6.2-ic #2
Oct 20 07:08:21 localhost kernel: [ 146.992504] Call Trace:
Oct 20 07:08:21 localhost kernel: [ 146.992509] [c102e6bd]
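The call-trace entries in the report above follow a fixed pattern: a return address in brackets, an optional "?" (an address found on the stack that the unwinder could not confirm as part of the real call chain), then symbol+offset/size. As a rough sketch, assuming that pattern holds, the frames can be pulled out of a log with a regular expression. The names Frame and parse_frames are hypothetical and exist only for this example.

```python
import re
from collections import namedtuple

# One stack frame as printed in an x86 kernel call trace.
Frame = namedtuple("Frame", "address reliable symbol offset size")

# Matches "[c102e6bd] ? sym+0x6d/0xa0"; the "<" ">" around the address are
# optional because some archives strip them from "[<c102e6bd>]".
FRAME_RE = re.compile(
    r"\[<?([0-9a-f]{8,16})>?\]\s+(\?\s+)?([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(text):
    """Return every frame found in text, in order of appearance."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append(
            Frame(
                address=m.group(1),
                reliable=m.group(2) is None,  # a "?" marks an unconfirmed frame
                symbol=m.group(3),
                offset=int(m.group(4), 16),
                size=int(m.group(5), 16),
            )
        )
    return frames


if __name__ == "__main__":
    sample = (
        "[ 224.293813] [c102e6bd] warn_slowpath_common+0x6d/0xa0\n"
        "[ 224.293817] [c10209eb] ? default_send_IPI_mask_logical+0x9b/0xd0\n"
    )
    for frame in parse_frames(sample):
        print(frame)
```

Dropping the frames with reliable set to False leaves the most likely call chain, here page fault handling leading into a TLB flush IPI.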
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Oct 20, 2012 at 05:41:49PM +, Artem S. Tashkinov wrote:
> On Oct 20, 2012, Borislav Petkov wrote:
> >
> > Yeah, your kernel is tainted with a proprietary module (vbox*, etc). Can you reproduce your corruptions (this is what it looks like) without that module?
>
> Yes, I can reproduce this panic with zero proprietary/non-free modules loaded.
>
> The problem is the kernel doesn't even print a kernel panic - the system just freezes completely - cursor in a text console stops blinking.
>
> I have no means to debug it using a serial console - what can I do?

Ok, here's what you can try:

* You say this happens with google chrome. Does it happen if you use another browser: firefox, etc?

* Can you build a 64-bit kernel and try the same with it? The 32-bit userspace should work in compat mode just fine.

* Can you run memtest on your machine and check whether your DIMMs aren't generating ECC errors? Are your DIMMs ECC, btw?

* What about netconsole? You only need another machine on the same network: Documentation/networking/netconsole.txt.

* boot with pause_on_oops=600 on the kernel command line to stop the machine for 600 secs after the first oops happens. Then try to make a photo of the screen. Make sure to disable X or to be on a text console so that you can see the oops.

* Try enabling a bunch of debugging options in Kernel hacking. More specifically,

CONFIG_DETECT_HUNG_TASK
CONFIG_DEBUG_PREEMPT
CONFIG_DEBUG_SPINLOCK
CONFIG_DEBUG_MUTEXES
CONFIG_DEBUG_LOCK_ALLOC
CONFIG_PROVE_LOCKING
CONFIG_PROVE_RCU
CONFIG_DEBUG_ATOMIC_SLEEP
CONFIG_DEBUG_BUGVERBOSE
CONFIG_DEBUG_INFO
CONFIG_DEBUG_VM
CONFIG_DEBUG_VIRTUAL
CONFIG_DEBUG_MEMORY_INIT
CONFIG_DEBUG_LIST
CONFIG_X86_VERBOSE_BOOTUP
CONFIG_DEBUG_RODATA
...

I hope those should scream in case something goes awry.

HTH.

--
Regards/Gruss,
Boris.
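Checking by hand which of the debugging options listed above are already enabled gets tedious. The following is a small Python sketch that reports the ones still missing from a kernel config. It assumes the usual .config syntax ("CONFIG_FOO=y", "# CONFIG_FOO is not set"); the function names and the choice of config path are assumptions for illustration, not part of the kernel build system.

```python
# Debugging options named in the message above.
WANTED = [
    "CONFIG_DETECT_HUNG_TASK", "CONFIG_DEBUG_PREEMPT", "CONFIG_DEBUG_SPINLOCK",
    "CONFIG_DEBUG_MUTEXES", "CONFIG_DEBUG_LOCK_ALLOC", "CONFIG_PROVE_LOCKING",
    "CONFIG_PROVE_RCU", "CONFIG_DEBUG_ATOMIC_SLEEP", "CONFIG_DEBUG_BUGVERBOSE",
    "CONFIG_DEBUG_INFO", "CONFIG_DEBUG_VM", "CONFIG_DEBUG_VIRTUAL",
    "CONFIG_DEBUG_MEMORY_INIT", "CONFIG_DEBUG_LIST",
    "CONFIG_X86_VERBOSE_BOOTUP", "CONFIG_DEBUG_RODATA",
]


def parse_config(text):
    """Map option name to value for every 'CONFIG_X=value' line in text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and '# CONFIG_X is not set' comments
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_"):
            options[name] = value
    return options


def missing_options(text, wanted=WANTED):
    """Return the wanted options that are not set to 'y' or 'm'."""
    options = parse_config(text)
    return [opt for opt in wanted if options.get(opt) not in ("y", "m")]


if __name__ == "__main__":
    import sys

    # The path is an assumption: pass the .config of the kernel being built.
    path = sys.argv[1] if len(sys.argv) > 1 else ".config"
    with open(path) as fh:
        for opt in missing_options(fh.read()):
            print("not enabled:", opt)
```

Some of these options depend on others or on the architecture, so an option reported here may also simply be unavailable in a given configuration.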
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat 2012-10-20 17:41:49, Artem S. Tashkinov wrote:
> On Oct 20, 2012, Borislav Petkov wrote:
> >
> > Yeah, your kernel is tainted with a proprietary module (vbox*, etc). Can you reproduce your corruptions (this is what it looks like) without that module?
>
> Yes, I can reproduce this panic with zero proprietary/non-free modules loaded.
>
> The problem is the kernel doesn't even print a kernel panic - the system just freezes completely - cursor in a text console stops blinking.

bugtraq? :-).

If remote website can crash your Linux, that's quite significant news.

(Cc-ed netdev@ and security@ ... this may be important).

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Borislav Petkov wrote:
> Ok, here's what you can try:
>
> * You say this happens with google chrome. Does it happen if you use another browser: firefox, etc?
>
> * Can you build a 64-bit kernel and try the same with it? The 32-bit userspace should work in compat mode just fine.
>
> * Can you run memtest on your machine and check whether your DIMMs aren't generating ECC errors? Are your DIMMs ECC, btw?
>
> ...

I can reproduce this problem in a virtual machine, which means I have found a real kernel or GCC bug.

Alas, VirtualBox 4.2.2 hangs entirely when I run this virtual machine - I've never seen anything like that. Windows 7 64 bit which hosts this VirtualBox cannot even kill a VirtualBox instance.

Unfortunately even though I run the kernel with console=ttyS0,115200 console=tty0 parameters they don't help - I see no panic messages on a virtual serial port, which looks like we've got a very deep freeze.
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
Hi,

I can only reproduce this panic when my USB webcam is plugged in - when I click settings in Adobe Flash it sends some commands to my USB webcam using, presumably, Video4Linux API calls, which cause a hard kernel crash.

Your kernel debug features haven't helped at all. Even the virtual machine crashes in such a way that I cannot get any information from it - under Windows 7 64 VirtualBox becomes an unkillable process.

I've no idea what's crashing - it can be the kernel itself, or some of the v4l or usb modules.

Artem
Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Oct 20, 2012 at 10:32:28PM +0200, Pavel Machek wrote:
> On Sat 2012-10-20 17:41:49, Artem S. Tashkinov wrote:
> > On Oct 20, 2012, Borislav Petkov wrote:
> > >
> > > Yeah, your kernel is tainted with a proprietary module (vbox*, etc). Can you reproduce your corruptions (this is what it looks like) without that module?
> >
> > Yes, I can reproduce this panic with zero proprietary/non-free modules loaded.
> >
> > The problem is the kernel doesn't even print a kernel panic - the system just freezes completely - cursor in a text console stops blinking.
>
> bugtraq? :-).
>
> If remote website can crash your Linux, that's quite significant news.
>
> (Cc-ed netdev@ and security@ ... this may be important).

I don't think that's the problem - I rather suspect the fact that he's using virtualbox which is causing random corruptions by writing to arbitrary locations.

Artem, please remove virtualbox completely from your system, rebuild the kernel and make sure the virtualbox kernel modules don't get loaded - simply delete them so that they are completely gone; *and* *then* retest again.

Thanks.

--
Regards/Gruss,
Boris.
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
You don't get me - I have *no* VirtualBox (or any proprietary) modules running - but I can reproduce this problem using *the same system running under* VirtualBox in Windows 7 64.

It's almost definitely either a USB driver bug or video4linux driver bug:

I'm CC'ing linux-media and linux-usb mailing lists, the problem is described here:
https://lkml.org/lkml/2012/10/20/35
https://lkml.org/lkml/2012/10/20/148

Here are the last lines from my dmesg (with usbmon loaded):

[ 292.164833] hub 1-0:1.0: state 7 ports 8 chg evt 0002
[ 292.168091] ehci_hcd :00:1f.5: GetStatus port:1 status 00100a 0 ACK POWER sig=se0 PEC CSC
[ 292.172063] hub 1-0:1.0: port 1, status 0100, change 0003, 12 Mb/s
[ 292.174883] usb 1-1: USB disconnect, device number 2
[ 292.178045] usb 1-1: unregistering device
[ 292.183539] usb 1-1: unregistering interface 1-1:1.0
[ 292.197034] usb 1-1: unregistering interface 1-1:1.1
[ 292.204317] usb 1-1: unregistering interface 1-1:1.2
[ 292.234519] usb 1-1: unregistering interface 1-1:1.3
[ 292.236175] usb 1-1: usb_disable_device nuking all URBs
[ 292.364429] hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x100
[ 294.364279] hub 1-0:1.0: hub_suspend
[ 294.366045] usb usb1: bus auto-suspend, wakeup 1
[ 294.367375] ehci_hcd :00:1f.5: suspend root hub
[ 296.501084] usb usb1: usb wakeup-resume
[ 296.508311] usb usb1: usb auto-resume
[ 296.509833] ehci_hcd :00:1f.5: resume root hub
[ 296.560149] hub 1-0:1.0: hub_resume
[ 296.562240] ehci_hcd :00:1f.5: GetStatus port:1 status 001003 0 ACK POWER sig=se0 CSC CONNECT
[ 296.566141] hub 1-0:1.0: port 1: status 0501 change 0001
[ 296.670413] hub 1-0:1.0: state 7 ports 8 chg 0002 evt
[ 296.673222] hub 1-0:1.0: port 1, status 0501, change , 480 Mb/s
[ 297.311720] usb 1-1: new high-speed USB device number 3 using ehci_hcd
[ 300.547237] usb 1-1: skipped 1 descriptor after configuration
[ 300.549443] usb 1-1: skipped 4 descriptors after interface
[ 300.552273] usb 1-1: skipped 2 descriptors after interface
[ 300.556499] usb 1-1: skipped 1 descriptor after endpoint
[ 300.559392] usb 1-1: skipped 2 descriptors after interface
[ 300.560960] usb 1-1: skipped 1 descriptor after endpoint
[ 300.562169] usb 1-1: skipped 2 descriptors after interface
[ 300.563440] usb 1-1: skipped 1 descriptor after endpoint
[ 300.564639] usb 1-1: skipped 2 descriptors after interface
[ 300.565828] usb 1-1: skipped 2 descriptors after endpoint
[ 300.567084] usb 1-1: skipped 9 descriptors after interface
[ 300.569205] usb 1-1: skipped 1 descriptor after endpoint
[ 300.570484] usb 1-1: skipped 53 descriptors after interface
[ 300.595843] usb 1-1: default language 0x0409
[ 300.602503] usb 1-1: USB interface quirks for this device: 2
[ 300.605700] usb 1-1: udev 3, busnum 1, minor = 2
[ 300.606959] usb 1-1: New USB device found, idVendor=046d, idProduct=081d
[ 300.610298] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=1
[ 300.613742] usb 1-1: SerialNumber: 48C5D2B0
[ 300.617703] usb 1-1: usb_probe_device
[ 300.620594] usb 1-1: configuration #1 chosen from 1 choice
[ 300.639218] usb 1-1: adding 1-1:1.0 (config #1, interface 0)
[ 300.640736] snd-usb-audio 1-1:1.0: usb_probe_interface
[ 300.642307] snd-usb-audio 1-1:1.0: usb_probe_interface - got id
[ 301.050296] usb 1-1: adding 1-1:1.1 (config #1, interface 1)
[ 301.054897] usb 1-1: adding 1-1:1.2 (config #1, interface 2)
[ 301.056934] uvcvideo 1-1:1.2: usb_probe_interface
[ 301.058072] uvcvideo 1-1:1.2: usb_probe_interface - got id
[ 301.059395] uvcvideo: Found UVC 1.00 device unnamed (046d:081d)
[ 301.090173] input: UVC Camera (046d:081d) as /devices/pci:00/:00:1f.5/usb1/1-1/1-1:1.2/input/input7
[ 301.111289] usb 1-1: adding 1-1:1.3 (config #1, interface 3)
[ 301.131207] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us]
[ 301.137066] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us]
[ 301.156451] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule
[ 301.158310] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us]
[ 301.160238] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us]
[ 301.196606] set resolution quirk: cval-res = 384
[ 371.309569] e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 390.729568] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule
f5ade900 2296555[ 390.730023] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us]
437 S Ii:1:003:7[ 390.736394] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us]
 -115:128 16
f5ade900 2296566256 C Ii:1:003:7 -2:128 0
[ 391.100896] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule
[ 391.103188] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us]
f5ade900 2296926929 S Ii:1:003:7[ 391.104889] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us]
 -115:128 16
f5ade900 2296937889 C Ii:1:003:7 -2:128 0
f5272300 2310382508 S Co:1:003:0 s 01 0b 0004 0001 0
f5272300 2310407888 C Co:1:003:0 0 0
f5272300 2310408051 S Co:1:003:0 s 22 01 0100
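The dump above mixes ordinary dmesg lines with usbmon text events (the lines starting with a URB tag such as f5ade900). As a hedged sketch, the first four words of a usbmon line can be decoded as follows: URB tag, timestamp in microseconds, event type, and an address word of the form type+direction:bus:device:endpoint. The remaining words vary by transfer type and are left unparsed. UrbEvent and parse_usbmon_line are names invented for this example, and lines that were interleaved with printk output will simply be rejected.

```python
from collections import namedtuple

UrbEvent = namedtuple(
    "UrbEvent",
    "tag timestamp_us event xfer_type direction bus device endpoint rest",
)

EVENTS = {"S": "submit", "C": "callback", "E": "submit error"}
XFER_TYPES = {"C": "control", "Z": "isochronous", "I": "interrupt", "B": "bulk"}


def parse_usbmon_line(line):
    """Decode the leading fields of one usbmon text line, or return None."""
    fields = line.split()
    if len(fields) < 4:
        return None
    tag, stamp, event, address = fields[:4]
    parts = address.split(":")
    if event not in EVENTS or len(parts) != 4 or len(parts[0]) != 2:
        return None
    kind, direction = parts[0][0], parts[0][1]
    if kind not in XFER_TYPES or direction not in ("i", "o"):
        return None
    try:
        return UrbEvent(
            tag=tag,
            timestamp_us=int(stamp),
            event=EVENTS[event],
            xfer_type=XFER_TYPES[kind],
            direction="in" if direction == "i" else "out",
            bus=int(parts[1]),
            device=int(parts[2]),
            endpoint=int(parts[3]),
            rest=fields[4:],  # status/setup words, left undecoded here
        )
    except ValueError:
        return None  # non-numeric timestamp or address component


if __name__ == "__main__":
    print(parse_usbmon_line("f5ade900 2296566256 C Ii:1:003:7 -2:128 0"))
```

Read this way, the last events in the dump are control-out transfers to device 3 on bus 1, the device number the log assigned to the webcam shortly before.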
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, Oct 20, 2012 at 11:15:17PM +, Artem S. Tashkinov wrote:
> You don't get me - I have *no* VirtualBox (or any proprietary) modules running

Ok, good. We got that out of the way - I wanted to make sure after you replied with two other possibilities of the system freezing.

> - but I can reproduce this problem using *the same system running under* VirtualBox in Windows 7 64.

That's windoze as host and linux as a guest, correct?

If so, that's virtualbox's problem, I'd say.

> It's almost definitely either a USB driver bug or video4linux driver bug:

And you're assuming that because the freeze happens when using your usb webcam, correct? And not otherwise?

Maybe you can describe in more detail what exactly you're doing so that people could try to reproduce your issue.

> I'm CC'ing linux-media and linux-usb mailing lists, the problem is described here:
> https://lkml.org/lkml/2012/10/20/35
> https://lkml.org/lkml/2012/10/20/148

Yes, good idea. Maybe the folks there have some more ideas how to debug this.

I'm leaving in the rest for reference. What should be pointed out, though, is that you don't have any more random corruptions causing oopses now that virtualbox is gone. The freeze below is a whole another issue.

Thanks.

> Here are the last lines from my dmesg (with usbmon loaded):
>
> [ 292.164833] hub 1-0:1.0: state 7 ports 8 chg evt 0002
> [ 292.168091] ehci_hcd :00:1f.5: GetStatus port:1 status 00100a 0 ACK POWER sig=se0 PEC CSC
> [ 292.172063] hub 1-0:1.0: port 1, status 0100, change 0003, 12 Mb/s
> [ 292.174883] usb 1-1: USB disconnect, device number 2
> [ 292.178045] usb 1-1: unregistering device
> [ 292.183539] usb 1-1: unregistering interface 1-1:1.0
> [ 292.197034] usb 1-1: unregistering interface 1-1:1.1
> [ 292.204317] usb 1-1: unregistering interface 1-1:1.2
> [ 292.234519] usb 1-1: unregistering interface 1-1:1.3
> [ 292.236175] usb 1-1: usb_disable_device nuking all URBs
> [ 292.364429] hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x100
> [ 294.364279] hub 1-0:1.0: hub_suspend
> [ 294.366045] usb usb1: bus auto-suspend, wakeup 1
> [ 294.367375] ehci_hcd :00:1f.5: suspend root hub
> [ 296.501084] usb usb1: usb wakeup-resume
> [ 296.508311] usb usb1: usb auto-resume
> [ 296.509833] ehci_hcd :00:1f.5: resume root hub
> [ 296.560149] hub 1-0:1.0: hub_resume
> [ 296.562240] ehci_hcd :00:1f.5: GetStatus port:1 status 001003 0 ACK POWER sig=se0 CSC CONNECT
> [ 296.566141] hub 1-0:1.0: port 1: status 0501 change 0001
> [ 296.670413] hub 1-0:1.0: state 7 ports 8 chg 0002 evt
> [ 296.673222] hub 1-0:1.0: port 1, status 0501, change , 480 Mb/s
> [ 297.311720] usb 1-1: new high-speed USB device number 3 using ehci_hcd
> [ 300.547237] usb 1-1: skipped 1 descriptor after configuration
> [ 300.549443] usb 1-1: skipped 4 descriptors after interface
> [ 300.552273] usb 1-1: skipped 2 descriptors after interface
> [ 300.556499] usb 1-1: skipped 1 descriptor after endpoint
> [ 300.559392] usb 1-1: skipped 2 descriptors after interface
> [ 300.560960] usb 1-1: skipped 1 descriptor after endpoint
> [ 300.562169] usb 1-1: skipped 2 descriptors after interface
> [ 300.563440] usb 1-1: skipped 1 descriptor after endpoint
> [ 300.564639] usb 1-1: skipped 2 descriptors after interface
> [ 300.565828] usb 1-1: skipped 2 descriptors after endpoint
> [ 300.567084] usb 1-1: skipped 9 descriptors after interface
> [ 300.569205] usb 1-1: skipped 1 descriptor after endpoint
> [ 300.570484] usb 1-1: skipped 53 descriptors after interface
> [ 300.595843] usb 1-1: default language 0x0409
> [ 300.602503] usb 1-1: USB interface quirks for this device: 2
> [ 300.605700] usb 1-1: udev 3, busnum 1, minor = 2
> [ 300.606959] usb 1-1: New USB device found, idVendor=046d, idProduct=081d
> [ 300.610298] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=1
> [ 300.613742] usb 1-1: SerialNumber: 48C5D2B0
> [ 300.617703] usb 1-1: usb_probe_device
> [ 300.620594] usb 1-1: configuration #1 chosen from 1 choice
> [ 300.639218] usb 1-1: adding 1-1:1.0 (config #1, interface 0)
> [ 300.640736] snd-usb-audio 1-1:1.0: usb_probe_interface
> [ 300.642307] snd-usb-audio 1-1:1.0: usb_probe_interface - got id
> [ 301.050296] usb 1-1: adding 1-1:1.1 (config #1, interface 1)
> [ 301.054897] usb 1-1: adding 1-1:1.2 (config #1, interface 2)
> [ 301.056934] uvcvideo 1-1:1.2: usb_probe_interface
> [ 301.058072] uvcvideo 1-1:1.2: usb_probe_interface - got id
> [ 301.059395] uvcvideo: Found UVC 1.00 device unnamed (046d:081d)
> [ 301.090173] input: UVC Camera (046d:081d) as /devices/pci:00/:00:1f.5/usb1/1-1/1-1:1.2/input/input7
> [ 301.111289] usb 1-1: adding 1-1:1.3 (config #1, interface 3)
> [ 301.131207] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us]
> [ 301.137066] usb 1-1: unlink qh16-0001/f48d64c0 start 2 [1/0 us]
> [ 301.156451] ehci_hcd :00:1f.5: reused qh f48d64c0 schedule
> [ 301.158310] usb 1-1: link qh16-0001/f48d64c0 start 2 [1/0 us]
> [
Re: Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Oct 21, 2012, Borislav Petkov wrote:
> On Sat, Oct 20, 2012 at 11:15:17PM +, Artem S. Tashkinov wrote:
> > You don't get me - I have *no* VirtualBox (or any proprietary) modules running
>
> Ok, good. We got that out of the way - I wanted to make sure after you replied with two other possibilities of the system freezing.
>
> > - but I can reproduce this problem using *the same system running under* VirtualBox in Windows 7 64.
>
> That's windoze as host and linux as a guest, correct?

Exactly.

> If so, that's virtualbox's problem, I'd say.

I can reproduce it on my host *alone* as I said in the very first message - never before I tried to run my Linux in a virtual machine.

Please, just forget about VirtualBox - it has nothing to do with this problem.

> > It's almost definitely either a USB driver bug or video4linux driver bug:
>
> And you're assuming that because the freeze happens when using your usb webcam, correct? And not otherwise?

Yes, like I said earlier - only when I try to access its settings using Adobe Flash the system crashes/freezes.

> Maybe you can describe in more detail what exactly you're doing so that people could try to reproduce your issue.

I don't think many people have the same webcam so it's going to be a problem. It can be reproduced easily - just open Flash Settings in Google Chrome 22. The crash will occur immediately.

> > I'm CC'ing linux-media and linux-usb mailing lists, the problem is described here:
> > https://lkml.org/lkml/2012/10/20/35
> > https://lkml.org/lkml/2012/10/20/148
>
> Yes, good idea. Maybe the folks there have some more ideas how to debug this.
>
> I'm leaving in the rest for reference. What should be pointed out, though, is that you don't have any more random corruptions causing oopses now that virtualbox is gone. The freeze below is a whole another issue.

The freeze happens on my *host* Linux PC. For an experiment I decided to check if I could reproduce the freeze under a virtual machine - it turns out the Linux kernel running under it also freezes.

Artem
Re: Re: Re: A reliable kernel panic (3.6.2) and system crash when visiting a particular website
On Sat, 20 Oct 2012, Artem S. Tashkinov wrote:
> You don't get me - I have *no* VirtualBox (or any proprietary) modules running - but I can reproduce this problem using *the same system running under* VirtualBox in Windows 7 64.
>
> It's almost definitely either a USB driver bug or video4linux driver bug:

Does the same thing happen with earlier kernel versions?

What about if you unload snd-usb-audio or ehci-hcd?

Alan Stern