Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-25 Thread Justin T. Gibbs
On 6/25/11 12:39 AM, Andrey Chernov wrote: On Fri, Jun 24, 2011 at 09:09:08PM -0400, Justin T. Gibbs wrote: No problem. I just set kern.geom.debugflags=4 in loader.conf and here is a new photo (with a recent kernel, no patches): http://img803.imageshack.us/img803/4679/25062011006.jpg I skipped all

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-25 Thread Andrey Chernov
On Sat, Jun 25, 2011 at 07:27:20AM -0400, Justin T. Gibbs wrote: I use the splitting-by-half method to find the exact date that boots, then look at the next commit after that date. The pre-commit kernel goes to multiuser and the network is alive. I haven't tested that the CDs are working; I'll do that later and
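
The splitting-by-half search mentioned in this message can be sketched roughly as follows. This is only an illustration: `first_bad`, `boots`, and the candidate revision range are hypothetical names and values, and the threshold inside `boots` simply reuses the revision number named at the start of the thread so that the toy predicate has something to find. In practice `boots` would mean "build that revision, reboot, and check whether it reaches multiuser".

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[int], boots: Callable[[int], bool]) -> int:
    """Return the first revision that fails to boot.

    Assumes revisions is sorted oldest to newest, that revisions[0] boots,
    that revisions[-1] hangs, and that once a revision hangs every later
    revision hangs as well.
    """
    good, bad = 0, len(revisions) - 1
    # Invariant: revisions[good] boots, revisions[bad] hangs.
    while bad - good > 1:
        mid = (good + bad) // 2
        if boots(revisions[mid]):
            good = mid
        else:
            bad = mid
    return revisions[bad]


def boots(rev: int) -> bool:
    # Hypothetical stand-in for an actual build-and-reboot test.
    return rev < 223089


# Hypothetical candidate range bracketing the suspect commit.
candidates = list(range(223000, 223300))
culprit = first_bad(candidates, boots)
print(culprit)
```

Each iteration halves the remaining range, so a few hundred candidate revisions need fewer than ten test boots. The result is only as reliable as the assumption that the hang is deterministic for a given revision.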

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Andrey Chernov
On Thu, Jun 23, 2011 at 11:01:17PM -0400, Justin T. Gibbs wrote: To test this theory, apply the following patch. I do not know if this is safe for changer devices, but I will review the changer code if this patch fixes ache's problem. I don't have changers. One of the plain ATA DVDs is

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Andrey Chernov
I forgot to mention that the place is the same as the trace shows: g_event_procbody -> g_run_events -> g_new_provider_event -> g_dev_taste -> g_dev_attrchanged -> g_access -> g_disk_access -> cdopen -> cam_periph_hold, and it sleeps. On Fri, Jun 24, 2011 at 05:08:27PM +0400, Andrey Chernov wrote: On Thu, Jun 23, 2011 at

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Scott Long
On Jun 23, 2011, at 11:18 PM, Scott Long wrote: On Jun 23, 2011, at 9:01 PM, Justin T. Gibbs wrote: On 6/22/11 4:09 PM, Kenneth D. Merry wrote: On Wed, Jun 22, 2011 at 08:13:25 +0400, Andrey Chernov wrote: On Tue, Jun 21, 2011 at 09:54:04PM -0600, Kenneth D. Merry wrote: These two are

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Justin T. Gibbs
On 6/24/11 3:35 PM, Scott Long wrote: On Jun 23, 2011, at 11:18 PM, Scott Long wrote: On Jun 23, 2011, at 9:01 PM, Justin T. Gibbs wrote: On 6/22/11 4:09 PM, Kenneth D. Merry wrote: On Wed, Jun 22, 2011 at 08:13:25 +0400, Andrey Chernov wrote: On Tue, Jun 21, 2011 at 09:54:04PM -0600,

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Andrey Chernov
On Fri, Jun 24, 2011 at 04:20:24PM -0400, Justin T. Gibbs wrote: Instead, I believe that either one of the GEOM taste methods is leaking an access reference (so cdclose() is not called), or the CD driver is failing to release the hold semaphore during probing. Setting kern.geom.debugflags to

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Justin T. Gibbs
On 6/24/11 6:26 PM, Andrey Chernov wrote: On Fri, Jun 24, 2011 at 04:20:24PM -0400, Justin T. Gibbs wrote: Instead, I believe that either one of the GEOM taste methods is leaking an access reference (so cdclose() is not called), or the CD driver is failing to release the hold semaphore

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Andrey Chernov
On Fri, Jun 24, 2011 at 09:09:08PM -0400, Justin T. Gibbs wrote: No problem. I just set kern.geom.debugflags=4 in loader.conf and here is a new photo (with a recent kernel, no patches): http://img803.imageshack.us/img803/4679/25062011006.jpg I skipped all the noisy parts related to ada0 and ada1

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-24 Thread Andrey Chernov
On Sat, Jun 25, 2011 at 08:39:17AM +0400, Andrey Chernov wrote: Are you positive it is this specific SVN revision that prevents cd0 from probing properly and not one of my previous CAM commits? Just getting to multi-user doesn't mean we're ok here. My GEOM changes may make the system

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-23 Thread Andriy Gapon
on 22/06/2011 23:09 Kenneth D. Merry said the following: The GEOM event thread is stuck sleeping in the mtx_sleep() call above. So that tells me that one of several things is going on: - There is a path in the cd(4) driver where it can call cam_periph_hold() but not cam_periph_unhold().
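
The first hypothesis in this message, a code path that takes the hold but never releases it, can be illustrated with a toy model. This is a sketch under stated assumptions, not the cd(4) or CAM code: `ToyPeriph`, `open_balanced`, and `open_leaky` are hypothetical names, and the hold is reduced to a plain flag whose "would sleep" result stands in for the thread sleeping on a held flag. The thread does not establish that this is what actually happens in the driver.

```python
class ToyPeriph:
    """Toy model of a hold flag; not the CAM implementation."""

    def __init__(self) -> None:
        self.held = False

    def hold(self) -> bool:
        """Try to take the hold; False marks where a real caller would sleep."""
        if self.held:
            return False
        self.held = True
        return True

    def unhold(self) -> None:
        self.held = False


def open_balanced(p: ToyPeriph, probe_ok: bool) -> str:
    """Every exit path releases the hold."""
    if not p.hold():
        return "would sleep"
    try:
        return "opened" if probe_ok else "probe failed"
    finally:
        p.unhold()


def open_leaky(p: ToyPeriph, probe_ok: bool) -> str:
    """The error path returns without releasing the hold."""
    if not p.hold():
        return "would sleep"
    if not probe_ok:
        return "probe failed"  # bug: no unhold() on this path
    p.unhold()
    return "opened"


leaky = ToyPeriph()
first = open_leaky(leaky, probe_ok=False)   # leaves the flag set
second = open_leaky(leaky, probe_ok=True)   # next opener can never proceed

balanced = ToyPeriph()
open_balanced(balanced, probe_ok=False)
third = open_balanced(balanced, probe_ok=True)
print(first, "|", second, "|", third)
```

In the leaky variant a single failed probe leaves the flag set, so every later opener waits indefinitely. That is the shape of hang such a hypothesis would predict.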

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-23 Thread Andrey Chernov
Apparently there is another problem related to plain ATA CD/DVD drives. With r223443 the nature of the hang has changed: I no longer see waiting in the caplck state, just xpt_thrd waiting in the ccb_scan state forever and these repeated messages: run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-23 Thread Andrey Chernov
On Thu, Jun 23, 2011 at 03:51:36PM +0300, Andriy Gapon wrote: More than once I've seen under qemu that the kernel boot non-deterministically gets stuck in the cd driver. Other people have also bumped into this. E.g., here's one of the reports that I googled up, it's not exactly the same as

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-23 Thread Andrey Chernov
On Wed, Jun 22, 2011 at 02:09:19PM -0600, Kenneth D. Merry wrote: Well, after looking at the code a little more, it looks like the lock that is being held is the periph lock, which is really just a flag. So 'show lock' wouldn't show anything relevant. Here's cam_periph_hold(): With recent

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-23 Thread Justin T. Gibbs
On 6/22/11 4:09 PM, Kenneth D. Merry wrote: On Wed, Jun 22, 2011 at 08:13:25 +0400, Andrey Chernov wrote: On Tue, Jun 21, 2011 at 09:54:04PM -0600, Kenneth D. Merry wrote: These two are interesting: http://img825.imageshack.us/img825/1249/21062011014m.jpg

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-23 Thread Scott Long
On Jun 23, 2011, at 9:01 PM, Justin T. Gibbs wrote: On 6/22/11 4:09 PM, Kenneth D. Merry wrote: On Wed, Jun 22, 2011 at 08:13:25 +0400, Andrey Chernov wrote: On Tue, Jun 21, 2011 at 09:54:04PM -0600, Kenneth D. Merry wrote: These two are interesting:

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-22 Thread Garrett Cooper
On Tue, Jun 21, 2011 at 9:51 PM, Andrey Chernov a...@freebsd.org wrote: Remove cd/acd from your kernel config to see if that allows you to boot? I unplugged the DVDs physically and the kernel finally boots! BTW, both DVDs were empty during the hung boot and work normally under Win7. Put a DVD in each

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-22 Thread Andrey Chernov
On Tue, Jun 21, 2011 at 11:19:38PM -0700, Garrett Cooper wrote: On Tue, Jun 21, 2011 at 9:51 PM, Andrey Chernov a...@freebsd.org wrote: Remove cd/acd from your kernel config to see if that allows you to boot? I unplugged the DVDs physically and the kernel finally boots! BTW, both DVDs were empty

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-22 Thread Andrey Chernov
I forgot to add that having one disc inside each drive doesn't change this picture at all. On Wed, Jun 22, 2011 at 10:42:30AM +0400, Andrey Chernov wrote: On Tue, Jun 21, 2011 at 11:19:38PM -0700, Garrett Cooper wrote: On Tue, Jun 21, 2011 at 9:51 PM, Andrey Chernov a...@freebsd.org wrote:

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-22 Thread Kenneth D. Merry
On Wed, Jun 22, 2011 at 08:13:25 +0400, Andrey Chernov wrote: On Tue, Jun 21, 2011 at 09:54:04PM -0600, Kenneth D. Merry wrote: These two are interesting: http://img825.imageshack.us/img825/1249/21062011014m.jpg http://img839.imageshack.us/img839/3791/21062011015.jpg It looks like

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread Kenneth D. Merry
On Wed, Jun 22, 2011 at 00:49:34 +0400, Andrey Chernov wrote: On Tue, Jun 21, 2011 at 10:17:19AM -0600, Kenneth D. Merry wrote: ps, alltrace, show locks, show msgbuf. Hopefully that will give us something to start looking at... This would really work a lot better if there is any way

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread Andrey Chernov
On Tue, Jun 21, 2011 at 09:54:04PM -0600, Kenneth D. Merry wrote: These two are interesting: http://img825.imageshack.us/img825/1249/21062011014m.jpg http://img839.imageshack.us/img839/3791/21062011015.jpg It looks like the GEOM event thread is stuck inside the cd(4) driver. The cd(4)

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread Andrey Chernov
Remove cd/acd from your kernel config to see if that allows you to boot? I unplugged the DVDs physically and the kernel finally boots! BTW, both DVDs were empty during the hung boot and work normally under Win7. -- http://ache.vniz.net/

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread Kenneth D. Merry
On Mon, Jun 20, 2011 at 15:46:56 +0400, Andrey Chernov wrote: On Mon, Jun 20, 2011 at 11:01:46AM +0300, Kostik Belousov wrote: On Mon, Jun 20, 2011 at 11:02:22AM +0400, Andrey Chernov wrote: On Sun, Jun 19, 2011 at 08:15:43PM -0600, Justin T. Gibbs wrote: On 6/19/11 6:19 PM, Andrey

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread Will Andrews
Hi Andrey, On Mon, Jun 20, 2011 at 5:46 AM, Andrey Chernov a...@freebsd.org wrote: As the second message in the thread states, I first tried even 223296, with the same hang and the same xpt_action_default: CCB type 0xe not supported. I think DDB's 'ps' indicates that the kernel is waiting for something

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread George Kontostanos
After applying the patch the system does not boot anymore! It hangs after probing for scsi devices. On Tue, Jun 21, 2011 at 8:58 PM, Will Andrews w...@firepipe.net wrote: Hi Andrey, On Mon, Jun 20, 2011 at 5:46 AM, Andrey Chernov a...@freebsd.org wrote: As the second message in the thread

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread Andrey Chernov
On Tue, Jun 21, 2011 at 11:58:17AM -0600, Will Andrews wrote: Hi Andrey, On Mon, Jun 20, 2011 at 5:46 AM, Andrey Chernov a...@freebsd.org wrote: As the second message in the thread states, I first tried even 223296, with the same hang and the same xpt_action_default: CCB type 0xe not

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-21 Thread Andrey Chernov
On Tue, Jun 21, 2011 at 10:17:19AM -0600, Kenneth D. Merry wrote: ps, alltrace, show locks, show msgbuf. Hopefully that will give us something to start looking at... This would really work a lot better if there is any way to get a serial console on the machine. The above will produce a

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-20 Thread Andrey Chernov
On Sun, Jun 19, 2011 at 08:15:43PM -0600, Justin T. Gibbs wrote: On 6/19/11 6:19 PM, Andrey Chernov wrote: Exactly that commit is responsible for boot hang. Please fix. BTW, I have MBR on SATA disk (CAM emulated), ICH9. Since it works for me, you'll need to provide more information. Can

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-20 Thread Kostik Belousov
On Mon, Jun 20, 2011 at 11:02:22AM +0400, Andrey Chernov wrote: On Sun, Jun 19, 2011 at 08:15:43PM -0600, Justin T. Gibbs wrote: On 6/19/11 6:19 PM, Andrey Chernov wrote: Exactly that commit is responsible for boot hang. Please fix. BTW, I have MBR on SATA disk (CAM emulated), ICH9.

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-20 Thread Andrey Chernov
On Mon, Jun 20, 2011 at 11:01:46AM +0300, Kostik Belousov wrote: On Mon, Jun 20, 2011 at 11:02:22AM +0400, Andrey Chernov wrote: On Sun, Jun 19, 2011 at 08:15:43PM -0600, Justin T. Gibbs wrote: On 6/19/11 6:19 PM, Andrey Chernov wrote: Exactly that commit is responsible for boot hang.

Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-19 Thread Andrey Chernov
Exactly that commit is responsible for boot hang. Please fix. BTW, I have MBR on SATA disk (CAM emulated), ICH9. Revision 223089 - Directory Listing Modified Tue Jun 14 17:10:32 2011 UTC (5 days, 6 hours ago) by gibbs Plumb device physical path reporting from CAM devices, through GEOM and DEVFS,

Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

2011-06-19 Thread Justin T. Gibbs
On 6/19/11 6:19 PM, Andrey Chernov wrote: Exactly that commit is responsible for boot hang. Please fix. BTW, I have MBR on SATA disk (CAM emulated), ICH9. Since it works for me, you'll need to provide more information. Can you at least drop into kdb to determine the likely source of the hang