Re: About "panic: bufwrite: buffer is not busy???"
Not sure if the CC: line needs to be trimmed, leaving it as is for now.

On Sun, Feb 20, 2011 at 7:46 AM, Jeremy Chadwick wrote:
> On Sun, Feb 20, 2011 at 10:30:52AM -0500, Mike Tancsa wrote:
>> On 2/20/2011 9:33 AM, Andrey Smagin wrote:
>> > On week -current I have same problem, my box panicked every 2-15 min. I
>> > resolve problem by next steps - unplug network connectors from 2 intel em
>> > (82574L) cards. I think last time that mpd5 related panic, but mpd5 work
>> > with another re interface integrated on MB. I think it may be em related
>> > panic, or em+mpd5.
>>
>> The latest panic I saw didn't have anything to do with em. Are you sure
>> your crashes are because of the NIC driver ?
>
> Not to mention, the error string the OP provided (see Subject) is only
> contained in one file: sys/ufs/ffs/ffs_vfsops.c, function
> ffs_bufwrite(). So, that would be some kind of weird filesystem-related
> issue, not NIC-specific. I have no idea how to debug said problem.

I can semi-reliably reproduce this panic message on a 9-CURRENT box, with sources from March 7. On this box, it happens every other time I start hastd. hastd creates the 12 GEOM providers used to create a ZFS pool. A simple "service hastd onestart" will generate the panic.

Is there any extra info that is needed to help track this down?

--
Freddie Cash
fjwc...@gmail.com
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: About "panic: bufwrite: buffer is not busy???"
On 2/21/2011 4:10 PM, Kostik Belousov wrote:
> Is this reproducible ?

The box seems to have a number of bugs it has been triggering. g...@freebsd.org's netgraph patch seems to have fixed one of them. Max seems to have fixed two others. This one, I am not sure. I can re-enable memguard to randomly sample again, which is what seemed to have caught / triggered it.

> What system version is it ?

8.2-PRERELEASE
FreeBSD 8.2-PRERELEASE #11: Thu Feb 17
i386, 4G of RAM

>
> Could you, please, go to frame 12 and show the output of "p *p",
> "p *(p->p_ucred)" ?

(kgdb) frame 12
#12 0xc0654fd1 in crcopysafe (p=0xc90cc810, cr=0xce3ee800)
    at /usr/src/sys/kern/kern_prot.c:1950
1950            crcopy(cr, oldcred);
(kgdb) list
1945                    PROC_UNLOCK(p);
1946                    crextend(cr, groups);
1947                    PROC_LOCK(p);
1948                    oldcred = p->p_ucred;
1949            }
1950            crcopy(cr, oldcred);
1951
1952            return (oldcred);
1953    }
1954
(kgdb) p *(p->p_ucred)
$1 = {cr_ref = 3373030400, cr_uid = 3460374784, cr_ruid = 3231313392,
  cr_svuid = 7196, cr_ngroups = 0, cr_rgid = 503415038, cr_svgid = 0,
  cr_uidinfo = 0x0, cr_ruidinfo = 0x0, cr_prison = 0x0, cr_pspare = 0x,
  cr_flags = 4294967295, cr_pspare2 = {0x0, 0x0}, cr_label = 0x,
  cr_audit = {ai_auid = 0, ai_mask = {am_success = 0,
      am_failure = 1298034100}, ai_termid = {at_port = 3, at_type = 1,
      at_addr = {0, 64, 0, 0}}, ai_asid = 0, ai_flags = 0},
  cr_groups = 0xc9e37900, cr_agroups = 16}
(kgdb) p *p
$2 = {p_list = {le_next = 0xc93ed560, le_prev = 0xc9187ac0},
  p_threads = {tqh_first = 0xc9196b80, tqh_last = 0xc9196b88},
  p_slock = {lock_object = {lo_name = 0xc08efca2 "process slock",
      lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4},
  p_ucred = 0xce3ee600, p_fd = 0xc9559100, p_fdtol = 0x0,
  p_stats = 0xc90cd600, p_limit = 0xc912d600,
  p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0,
        tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0,
    c_lock = 0xc90cc898, c_flags = 0, c_cpu = 0},
  p_sigacts = 0xc911f000, p_flag = 268435713, p_state = PRS_NORMAL,
  p_pid = 565, p_hash = {le_next = 0x0, le_prev = 0xc8d148d4},
  p_pglist = {le_next = 0x0, le_prev = 0xc90c85c8}, p_pptr = 0xc8d2b000,
  p_sibling = {le_next = 0xc93ed560, le_prev = 0xc9187b3c},
  p_children = {lh_first = 0x0},
  p_mtx = {lock_object = {lo_name = 0xc08efc95 "process lock",
      lo_flags = 21168128, lo_data = 0, lo_witness = 0x0},
    mtx_lock = 3373886336}, p_ksi = 0xc908f9b0,
  p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}},
    sq_kill = {__bits = {0, 0, 0, 0}},
    sq_list = {tqh_first = 0x0, tqh_last = 0xc90cc8d0},
    sq_proc = 0xc90cc810, sq_flags = 1}, p_oppid = 0,
  p_vmspace = 0xc93f0e80, p_swtick = 6600,
  p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0},
    it_value = {tv_sec = 0, tv_usec = 0}},
  p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0},
    ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0,
    ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0,
    ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0,
    ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0},
  p_rux = {rux_runtime = 109046064880, rux_uticks = 1368,
    rux_sticks = 5393, rux_iticks = 0, rux_uu = 10366008,
    rux_su = 40860399, rux_tu = 51225136},
  p_crux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0,
    rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0},
  p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0,
  p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xc95bf96c, p_lock = 0,
  p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0,
  p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0',
  p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0,
  p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0,
  p_boundary_count = 0, p_pendingcnt = 0, p_itimers = 0x0,
  p_magic = 3203398350, p_osrel = 802500, p_comm = "zebra", '\0' ,
  p_pgrp = 0xc90c85c0, p_sysent = 0xc095c800, p_args = 0xc90c8440,
  p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0,
  p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0},
    kl_lock = 0xc062a990 , kl_unlock = 0xc062a940 ,
    kl_assert_locked = 0xc06275f0 , kl_assert_unlocked = 0xc0627600 ,
    kl_lockarg = 0xc90cc898}, p_numthreads = 1, p_md = {md_ldt = 0x0},
  p_itcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0,
        tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0,
    c_lock = 0x0, c_flags = 16, c_cpu = 0}, p_acflag = 1, p_peers = 0x0,
  p_leader = 0xc90cc810, p_emuldata = 0x0, p_label = 0x0,
  p_sched = 0xc90ccac0,
  p_ktr = {stqh_first = 0x0, stqh_last = 0xc90ccaa0},
  p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0,
  p_pwait = {cv_description = 0xc08f00ef "ppwait", cv_waiters = 0},
  p_dbgwait = {cv_description = 0xc08f00f6 "dbgwait", cv_waiters = 0}}
(kgdb)

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing In
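A note on reading the two dumps above: several of the decimal fields are easier to interpret in hex. The sketch below is only an illustration. The helper names fmt_hex32 and looks_like_kernel_pointer are made up here, and the 0xc0000000 threshold is an assumption taken from the addresses visible in this dump, not a statement about every i386 kernel configuration. Converted this way, p_mtx.mtx_lock = 3373886336 is 0xc9196b80, the same value as p_threads.tqh_first, and p_magic = 3203398350 is 0xbeefface. By contrast, cr_ref, cr_uid and cr_ruid in *(p->p_ucred) come out as 0xc90c5c00, 0xce412100 and 0xc099edf0, which look more like kernel addresses than a reference count and two user IDs. That might point at the credential memory having been reused, but nothing in the thread confirms it.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 32-bit value the way kgdb prints i386 pointers:
 * "0x" followed by 8 lowercase hex digits.
 */
static const char *
fmt_hex32(uint32_t v, char *buf, size_t len)
{
	assert(buf != NULL && len >= 11);	/* "0x" + 8 digits + NUL */
	snprintf(buf, len, "0x%08" PRIx32, v);
	assert(strlen(buf) == 10);
	return (buf);
}

/*
 * Hypothetical heuristic, for illustration only: every kernel pointer
 * shown in this particular dump is at or above 0xc0000000.
 */
static int
looks_like_kernel_pointer(uint32_t v)
{
	return (v >= UINT32_C(0xc0000000));
}
```

Calling fmt_hex32(UINT32_C(3373030400), buf, sizeof(buf)) with an 11-byte buffer fills it with the hex form of cr_ref; the heuristic returns 0 for small values such as cr_svuid = 7196.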
Re: About "panic: bufwrite: buffer is not busy???"
On Sun, Feb 20, 2011 at 10:30:52AM -0500, Mike Tancsa wrote:
> On 2/20/2011 9:33 AM, Andrey Smagin wrote:
> > On week -current I have same problem, my box panicked every 2-15 min. I
> > resolve problem by next steps - unplug network connectors from 2 intel em
> > (82574L) cards. I think last time that mpd5 related panic, but mpd5 work
> > with another re interface integrated on MB. I think it may be em related
> > panic, or em+mpd5.
>
> The latest panic I saw didn't have anything to do with em. Are you sure
> your crashes are because of the NIC driver ?
>
> The latest I saw was on Friday.
>
> # kgdb /usr/obj/usr/src/sys/router/kernel.debug vmcore.11
> (kgdb) bt
> #0 doadump () at pcpu.h:231
> #1 0xc04a51f9 in db_fncall (dummy1=1, dummy2=0, dummy3=-106856,
> dummy4=0xc6b9696c "") at /usr/src/sys/ddb/db_command.c:548
> #2 0xc04a55f1 in db_command (last_cmdp=0xc096f73c, cmd_table=0x0,
> dopager=1) at /usr/src/sys/ddb/db_command.c:445
> #3 0xc04a574a in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
> #4 0xc04a764d in db_trap (type=12, code=0) at
> /usr/src/sys/ddb/db_main.c:229
> #5 0xc068ba7e in kdb_trap (type=12, code=0, tf=0xc6b96b94) at
> /usr/src/sys/kern/subr_kdb.c:546
> #6 0xc088056f in trap_fatal (frame=0xc6b96b94, eva=52) at
> /usr/src/sys/i386/i386/trap.c:937
> #7 0xc0880830 in trap_pfault (frame=0xc6b96b94, usermode=0, eva=52) at
> /usr/src/sys/i386/i386/trap.c:859
> #8 0xc0880d4a in trap (frame=0xc6b96b94) at
> /usr/src/sys/i386/i386/trap.c:532
> #9 0xc086716c in calltrap () at /usr/src/sys/i386/i386/exception.s:166
> #10 0xc0657a16 in uihold (uip=0x0) at /usr/src/sys/kern/kern_resource.c:1248
> #11 0xc0654ec9 in crcopy (dest=0xce3ee800, src=0xce3ee600) at
> /usr/src/sys/kern/kern_prot.c:1873
> #12 0xc0654fd1 in crcopysafe (p=0xc90cc810, cr=0xce3ee800) at
> /usr/src/sys/kern/kern_prot.c:1950
> #13 0xc0656d7f in seteuid (td=0xc9196b80, uap=0xc6b96cec) at
> /usr/src/sys/kern/kern_prot.c:615
> #14 0xc06985ff in syscallenter (td=0xc9196b80, sa=0xc6b96ce4) at
> /usr/src/sys/kern/subr_trap.c:315
> #15 0xc0880884 in syscall (frame=0xc6b96d28) at
> /usr/src/sys/i386/i386/trap.c:1061
> #16 0xc08671d1 in Xint0x80_syscall () at
> /usr/src/sys/i386/i386/exception.s:264
> #17 0x0033 in ?? ()
>
> (kgdb) frame 10
> #10 0xc0657a16 in uihold (uip=0x0) at /usr/src/sys/kern/kern_resource.c:1248
> 1248    {
> (kgdb) list
> 1243     * Place another refcount on a uidinfo struct.
> 1244     */
> 1245    void
> 1246    uihold(uip)
> 1247            struct uidinfo *uip;
> 1248    {
> 1249
> 1250            refcount_acquire(&uip->ui_ref);
> 1251    }
> 1252
> (kgdb) p *uip
> Cannot access memory at address 0x0
> (kgdb) p uip
> $1 = (struct uidinfo *) 0x0
> (kgdb)

Is this reproducible ?
What system version is it ?

Could you, please, go to frame 12 and show the output of "p *p",
"p *(p->p_ucred)" ?
Re: About "panic: bufwrite: buffer is not busy???"
On Sun, Feb 20, 2011 at 10:30:52AM -0500, Mike Tancsa wrote:
> On 2/20/2011 9:33 AM, Andrey Smagin wrote:
> > On week -current I have same problem, my box panicked every 2-15 min. I
> > resolve problem by next steps - unplug network connectors from 2 intel em
> > (82574L) cards. I think last time that mpd5 related panic, but mpd5 work
> > with another re interface integrated on MB. I think it may be em related
> > panic, or em+mpd5.
>
> The latest panic I saw didn't have anything to do with em. Are you sure
> your crashes are because of the NIC driver ?

Not to mention, the error string the OP provided (see Subject) is only contained in one file: sys/ufs/ffs/ffs_vfsops.c, function ffs_bufwrite(). So, that would be some kind of weird filesystem-related issue, not NIC-specific. I have no idea how to debug said problem.

--
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
Re: About "panic: bufwrite: buffer is not busy???"
On Feb 20, 2011, at 10:46 AM, Jeremy Chadwick wrote:

> On Sun, Feb 20, 2011 at 10:30:52AM -0500, Mike Tancsa wrote:
>> On 2/20/2011 9:33 AM, Andrey Smagin wrote:
>>> On week -current I have same problem, my box panicked every 2-15 min. I
>>> resolve problem by next steps - unplug network connectors from 2 intel em
>>> (82574L) cards. I think last time that mpd5 related panic, but mpd5 work
>>> with another re interface integrated on MB. I think it may be em related
>>> panic, or em+mpd5.
>>
>> The latest panic I saw didn't have anything to do with em. Are you sure
>> your crashes are because of the NIC driver ?
>
> Not to mention, the error string the OP provided (see Subject) is only
> contained in one file: sys/ufs/ffs/ffs_vfsops.c, function
> ffs_bufwrite(). So, that would be some kind of weird filesystem-related
> issue, not NIC-specific. I have no idea how to debug said problem.
>

The issue is the file system activity occurring in parallel with the coredump, which is strange. It seems like everything else should be halted before the dump begins, but I couldn't find a place in the code that actually tries to stop the other CPUs.

My question isn't about the initial panic (I was using the sysctl to provoke one), but about the secondary panic. This is on 8-core systems.

-Andrew

--
Andrew Boyer	abo...@averesystems.com
Re: About "panic: bufwrite: buffer is not busy???"
On 2/20/2011 9:33 AM, Andrey Smagin wrote:
> On week -current I have same problem, my box panicked every 2-15 min. I
> resolve problem by next steps - unplug network connectors from 2 intel em
> (82574L) cards. I think last time that mpd5 related panic, but mpd5 work with
> another re interface integrated on MB. I think it may be em related panic, or
> em+mpd5.

The latest panic I saw didn't have anything to do with em. Are you sure your crashes are because of the NIC driver ?

The latest I saw was on Friday.

# kgdb /usr/obj/usr/src/sys/router/kernel.debug vmcore.11
(kgdb) bt
#0 doadump () at pcpu.h:231
#1 0xc04a51f9 in db_fncall (dummy1=1, dummy2=0, dummy3=-106856,
    dummy4=0xc6b9696c "") at /usr/src/sys/ddb/db_command.c:548
#2 0xc04a55f1 in db_command (last_cmdp=0xc096f73c, cmd_table=0x0,
    dopager=1) at /usr/src/sys/ddb/db_command.c:445
#3 0xc04a574a in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#4 0xc04a764d in db_trap (type=12, code=0) at
    /usr/src/sys/ddb/db_main.c:229
#5 0xc068ba7e in kdb_trap (type=12, code=0, tf=0xc6b96b94) at
    /usr/src/sys/kern/subr_kdb.c:546
#6 0xc088056f in trap_fatal (frame=0xc6b96b94, eva=52) at
    /usr/src/sys/i386/i386/trap.c:937
#7 0xc0880830 in trap_pfault (frame=0xc6b96b94, usermode=0, eva=52) at
    /usr/src/sys/i386/i386/trap.c:859
#8 0xc0880d4a in trap (frame=0xc6b96b94) at
    /usr/src/sys/i386/i386/trap.c:532
#9 0xc086716c in calltrap () at /usr/src/sys/i386/i386/exception.s:166
#10 0xc0657a16 in uihold (uip=0x0) at /usr/src/sys/kern/kern_resource.c:1248
#11 0xc0654ec9 in crcopy (dest=0xce3ee800, src=0xce3ee600) at
    /usr/src/sys/kern/kern_prot.c:1873
#12 0xc0654fd1 in crcopysafe (p=0xc90cc810, cr=0xce3ee800) at
    /usr/src/sys/kern/kern_prot.c:1950
#13 0xc0656d7f in seteuid (td=0xc9196b80, uap=0xc6b96cec) at
    /usr/src/sys/kern/kern_prot.c:615
#14 0xc06985ff in syscallenter (td=0xc9196b80, sa=0xc6b96ce4) at
    /usr/src/sys/kern/subr_trap.c:315
#15 0xc0880884 in syscall (frame=0xc6b96d28) at
    /usr/src/sys/i386/i386/trap.c:1061
#16 0xc08671d1 in Xint0x80_syscall () at
    /usr/src/sys/i386/i386/exception.s:264
#17 0x0033 in ?? ()

(kgdb) frame 10
#10 0xc0657a16 in uihold (uip=0x0) at /usr/src/sys/kern/kern_resource.c:1248
1248    {
(kgdb) list
1243     * Place another refcount on a uidinfo struct.
1244     */
1245    void
1246    uihold(uip)
1247            struct uidinfo *uip;
1248    {
1249
1250            refcount_acquire(&uip->ui_ref);
1251    }
1252
(kgdb) p *uip
Cannot access memory at address 0x0
(kgdb) p uip
$1 = (struct uidinfo *) 0x0
(kgdb)

>
>
> Wed, 16 Feb 2011 12:08:30 -0500 message from Andrew Boyer:
>
>> Moving this to -current and -stable and following up...
>>
>> Something is broken with coredumps on stable/8 amd64. I tried a vanilla
>> 8.2-RC3 and yesterday's csup of stable/8; neither can dump a core with
>> 'sysctl debug.kdb.panic=1'.
>>
>> For the 8.2-RC3 / amd64 / GENERIC install, I used the memstick image,
>> installed on ad7 (a 250GB SATA drive), used the default partition map, and
>> set dumpdev to AUTO.
>>
>> I added enough tracing to show that the second panic is due to the syncer
>> process flushing buffers to the other filesystems in parallel with the dump.
>> I've seen this panic and a similar one 'buffer not locked' coming from
>> ffs_write(). One time out of about 30 the core ran to completion, but slowly
>> (~1MB/sec). Other times the dump just locks up completely with no other
>> output.
>>
>> Does anyone know what might have changed to expose this problem?
>>
>> I don't ever see it under 7.1.
>>
>> Thanks,
>> Andrew
>>
>> On Feb 3, 2011, at 12:11 AM, Eugene Grosbein wrote:
>>
>>> On 02.02.2011 00:50, Gleb Smirnoff wrote:
>>>
>>>> E> Uptime: 8h3m51s
>>>> E> Dumping 4087 MB (3 chunks)
>>>> E>   chunk 0: 1MB (150 pages) ... ok
>>>> E>   chunk 1: 3575MB (915088 pages) 3559 3543panic: bufwrite: buffer is not busy???
>>>> E> cpuid = 3
>>>> E> Uptime: 8h3m52s
>>>> E> Automatic reboot in 15 seconds - press a key on the console to abort
>>>> Can you add KDB_TRACE option to kernel? Your boxes for some reason can't
>>>> dump core, but with this option we will have at least trace.
>>>
>>> I see Mike Tancsa's box has "bufwrite: buffer is not busy???" problem too.
>>> Has anyone a thought how to fix generation of crashdumps?
>>>
>>> Eugene Grosbein
>>>
>>>
>>> ___
>>> freebsd-...@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>>
>> --
>> Andrew Boyer abo...@averesystems.com
>
>

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing
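One detail in the backtrace above may be worth spelling out. Frames #6 and #7 report eva=52, frame #10 has uip=0x0, and line 1250 of the listing takes &uip->ui_ref. A fault at address 52 is what one would expect if ui_ref sat 52 bytes into struct uidinfo and uip were NULL. The sketch below only demonstrates that arithmetic. struct model_uidinfo and ui_ref_address are hypothetical stand-ins, and the 52 bytes of padding are an assumption picked to match eva; the real 8.2 i386 layout of struct uidinfo is not shown anywhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * HYPOTHETICAL stand-in for struct uidinfo. The padding size is an
 * assumption chosen so that ui_ref lands at offset 52, matching the
 * eva=52 seen in the backtrace; it is not the real kernel layout.
 */
struct model_uidinfo {
	unsigned char	pad[52];	/* whatever precedes ui_ref (assumed) */
	uint32_t	ui_ref;		/* reference count touched by uihold() */
};

/*
 * Address that &uip->ui_ref would evaluate to for a given base pointer
 * value, computed with integers so nothing is actually dereferenced.
 */
static uintptr_t
ui_ref_address(uintptr_t uip)
{
	return (uip + offsetof(struct model_uidinfo, ui_ref));
}
```

With a base of 0 the helper returns 52, so under this assumed layout a NULL uip would fault at exactly the address trap_pfault reported.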
Re: About "panic: bufwrite: buffer is not busy???"
Moving this to -current and -stable and following up...

Something is broken with coredumps on stable/8 amd64. I tried a vanilla 8.2-RC3 and yesterday's csup of stable/8; neither can dump a core with 'sysctl debug.kdb.panic=1'.

For the 8.2-RC3 / amd64 / GENERIC install, I used the memstick image, installed on ad7 (a 250GB SATA drive), used the default partition map, and set dumpdev to AUTO.

I added enough tracing to show that the second panic is due to the syncer process flushing buffers to the other filesystems in parallel with the dump. I've seen this panic and a similar one 'buffer not locked' coming from ffs_write(). One time out of about 30 the core ran to completion, but slowly (~1MB/sec). Other times the dump just locks up completely with no other output.

Does anyone know what might have changed to expose this problem?

I don't ever see it under 7.1.

Thanks,
Andrew

On Feb 3, 2011, at 12:11 AM, Eugene Grosbein wrote:

> On 02.02.2011 00:50, Gleb Smirnoff wrote:
>
>> E> Uptime: 8h3m51s
>> E> Dumping 4087 MB (3 chunks)
>> E>   chunk 0: 1MB (150 pages) ... ok
>> E>   chunk 1: 3575MB (915088 pages) 3559 3543panic: bufwrite: buffer is not
>> busy???
>> E> cpuid = 3
>> E> Uptime: 8h3m52s
>> E> Automatic reboot in 15 seconds - press a key on the console to abort
>> Can you add KDB_TRACE option to kernel? Your boxes for some reason can't
>> dump core, but with this option we will have at least trace.
>
> I see Mike Tancsa's box has "bufwrite: buffer is not busy???" problem too.
> Has anyone a thought how to fix generation of crashdumps?
>
> Eugene Grosbein

--
Andrew Boyer	abo...@averesystems.com