Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/18/2016 13:30, Andriy Gapon wrote:
On 14/11/2016 14:00, Henri Hennebert wrote:
On 11/14/2016 12:45, Andriy Gapon wrote:

Okay. Luckily for us, it seems that 'm' is available in frame 5. It also happens to be the first field of 'struct faultstate'. So, could you please go to frame and print '*m' and '*(struct faultstate *)m' ?

(kgdb) fr 4
#4 0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, wmesg=<value optimized out>) at /usr/src/sys/vm/vm_page.c:753
753 msleep(m, vm_page_lockptr(m), PVM | PDROP, wmesg, 0);
(kgdb) print *m
$1 = {plinks = {q = {tqe_next = 0xf800dc5d85b0, tqe_prev = 0xf800debf3bd0}, s = {ss = {sle_next = 0xf800dc5d85b0}, pv = 0xf800debf3bd0}, memguard = {p = 18446735281313646000, v = 18446735281353604048}}, listq = {tqe_next = 0x0, tqe_prev = 0xf800dc5d85c0}, object = 0xf800b62e9c60, pindex = 11, phys_addr = 3389358080, md = {pv_list = { tqh_first = 0x0, tqh_last = 0xf800df68cd78}, pv_gen = 426, pat_mode = 6}, wire_count = 0, busy_lock = 6, hold_count = 0, flags = 0, aflags = 2 '\002', oflags = 0 '\0', queue = 0 '\0', psind = 0 '\0', segind = 3 '\003', order = 13 '\r', pool = 0 '\0', act_count = 0 '\0', valid = 0 '\0', dirty = 0 '\0'}

If I interpret this correctly the page is in the 'exclusive busy' state. Unfortunately, I can't tell much beyond that. But I am confident that this is the root cause of the lock-up.

(kgdb) print *(struct faultstate *)m
$2 = {m = 0xf800dc5d85b0, object = 0xf800debf3bd0, pindex = 0, first_m = 0xf800dc5d85c0, first_object = 0xf800b62e9c60, first_pindex = 11, map = 0xca058000, entry = 0x0, lookup_still_valid = -546779784, vp = 0x601aa}
(kgdb)

I was wrong on this one as 'm' is actually a pointer, so the above is not correct. Maybe 'info reg' in frame 5 would give a clue about the value of 'fs'.
(kgdb) fr 5
#5 0x8089dd4d in vm_page_sleep_if_busy (m=0xf800df68cd40, msg=0x809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086
1086 vm_page_busy_sleep(m, msg);
(kgdb) info reg
rax     0x0               0
rbx     0xf800b62e9c78    -8793036514184
rcx     0x0               0
rdx     0x0               0
rsi     0x0               0
rdi     0x0               0
rbp     0xfe0101836810    0xfe0101836810
rsp     0xfe01018367e0    0xfe01018367e0
r8      0x0               0
r9      0x0               0
r10     0x0               0
r11     0x0               0
r12     0xf800b642aa00    -879303520
r13     0xf800df68cd40    -8792344834752
r14     0xf800b62e9c60    -8793036514208
r15     0x809c51bc        -2137239108
rip     0x8089dd4d        0x8089dd4d <vm_page_sleep_if_busy+285>
eflags  0x0               0
cs      0x0               0
ss      0x0               0
ds      0x0               0
es      0x0               0
fs      0x0               0
gs      0x0               0

I don't know what to do from here.

I am not sure how to proceed from here. The only thing I can think of is a lock order reversal between the vnode lock and the page busying quasi-lock. But examining the code I can not spot it. Another possibility is a leak of a busy page, but that's hard to debug.

How hard is it to reproduce the problem?

After 7 days everything seems normal, with only one copy of innd:

[root@avoriaz ~]# ps xa|grep inn
 1193  -   Is  0:01.40 /usr/local/news/bin/innd -r
13498  -   IN  0:00.01 /usr/local/news/bin/innfeed
 1194 v0-  IW  0:00.00 /bin/sh /usr/local/news/bin/innwatch -i 60

I will try to stop and restart innd.

Everything continues to look good:

[root@avoriaz ~]# ps xa|grep inn
31673  -   Ss  0:00.02 /usr/local/news/bin/innd
31694  -   SN  0:00.01 /usr/local/news/bin/innfeed
31674  0   S   0:00.01 /bin/sh /usr/local/news/bin/innwatch -i 60

I think the only way to reproduce it is to wait until it occurs by itself...

One thing here: the deadlock has occurred at least 5 times since 10.0R, and always in the directory /usr/local/news/bin.

Maybe Konstantin would have some ideas or suggestions.

Henri
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
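The reading of 'busy_lock = 6' as 'exclusive busy' in the message above can be sketched as a small decoder. This is only an illustration: the bit values are assumed from FreeBSD 11's sys/vm/vm_page.h and should be checked against the source tree matching the kernel, and describe_busy_lock() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Bit layout assumed from FreeBSD 11 sys/vm/vm_page.h; verify it against
 * the source tree that matches the kernel being debugged.
 */
#define VPB_BIT_SHARED    0x01u
#define VPB_BIT_EXCLUSIVE 0x02u
#define VPB_BIT_WAITERS   0x04u
#define VPB_SHARERS_SHIFT 3

/* Hypothetical helper (not a kernel function): describe a busy_lock word. */
static const char *
describe_busy_lock(unsigned int bl, char *buf, size_t len)
{
	const char *waiters = (bl & VPB_BIT_WAITERS) ? ", waiters present" : "";

	assert(buf != NULL && len > 0);
	if (bl & VPB_BIT_EXCLUSIVE)
		snprintf(buf, len, "exclusive busy%s", waiters);
	else if ((bl >> VPB_SHARERS_SHIFT) > 0)
		snprintf(buf, len, "shared busy by %u%s",
		    bl >> VPB_SHARERS_SHIFT, waiters);
	else
		snprintf(buf, len, "unbusied%s", waiters);
	return buf;
}
```

Under those assumed values, 6 is the exclusive bit plus the waiters bit, which fits a page held exclusive-busy with at least one sleeper (the innfeed thread in "vmpfw") waiting on it.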
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/14/2016 12:45, Andriy Gapon wrote: On 14/11/2016 11:35, Henri Hennebert wrote: On 11/14/2016 10:07, Andriy Gapon wrote: Hmm, I've just noticed another interesting thread: Thread 668 (Thread 101245): #0 sched_switch (td=0xf800b642aa00, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973 #1 0x80561ae2 in mi_switch (flags=, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455 #2 0x805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646 #3 0x805614b1 in _sleep (ident=, lock=, priority=, wmesg=0x809c51bc "vmpfw", sbt=0, pr=, flags=) at /usr/src/sys/kern/kern_synch.c:229 #4 0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, wmesg=) at /usr/src/sys/vm/vm_page.c:753 #5 0x8089dd4d in vm_page_sleep_if_busy (m=0xf800df68cd40, msg=0x809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086 #6 0x80886be9 in vm_fault_hold (map=, vaddr=, fault_type=4 '\004', fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:495 #7 0x80885448 in vm_fault (map=0xf80011d66000, vaddr=, fault_type=4 '\004', fault_flags=) at /usr/src/sys/vm/vm_fault.c:273 #8 0x808d3c49 in trap_pfault (frame=0xfe0101836c00, usermode=1) at /usr/src/sys/amd64/amd64/trap.c:741 #9 0x808d3386 in trap (frame=0xfe0101836c00) at /usr/src/sys/amd64/amd64/trap.c:333 #10 0x808b7af1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236 This tread is another program from the news system: 668 Thread 101245 (PID=49124: innfeed) sched_switch (td=0xf800b642aa00, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973 I strongly suspect that this is thread that we were looking for. I think that it has the vnode lock in the shared mode while trying to fault in a page. --clip-- Okay. Luckily for us, it seems that 'm' is available in frame 5. It also happens to be the first field of 'struct faultstate'. So, could you please go to frame and print '*m' and '*(struct faultstate *)m' ? 
(kgdb) fr 4 #4 0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, wmesg=) at /usr/src/sys/vm/vm_page.c:753 753 msleep(m, vm_page_lockptr(m), PVM | PDROP, wmesg, 0); (kgdb) print *m $1 = {plinks = {q = {tqe_next = 0xf800dc5d85b0, tqe_prev = 0xf800debf3bd0}, s = {ss = {sle_next = 0xf800dc5d85b0}, pv = 0xf800debf3bd0}, memguard = {p = 18446735281313646000, v = 18446735281353604048}}, listq = {tqe_next = 0x0, tqe_prev = 0xf800dc5d85c0}, object = 0xf800b62e9c60, pindex = 11, phys_addr = 3389358080, md = {pv_list = { tqh_first = 0x0, tqh_last = 0xf800df68cd78}, pv_gen = 426, pat_mode = 6}, wire_count = 0, busy_lock = 6, hold_count = 0, flags = 0, aflags = 2 '\002', oflags = 0 '\0', queue = 0 '\0', psind = 0 '\0', segind = 3 '\003', order = 13 '\r', pool = 0 '\0', act_count = 0 '\0', valid = 0 '\0', dirty = 0 '\0'} (kgdb) print *(struct faultstate *)m $2 = {m = 0xf800dc5d85b0, object = 0xf800debf3bd0, pindex = 0, first_m = 0xf800dc5d85c0, first_object = 0xf800b62e9c60, first_pindex = 11, map = 0xca058000, entry = 0x0, lookup_still_valid = -546779784, vp = 0x601aa} (kgdb) ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
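A note on the cast used above, since the follow-up in this thread points out that 'm' is a pointer. The first-member rule applies to the address of the field, not to the value stored in it, so '*(struct faultstate *)m' simply re-reads the page's own bytes. That matches the dump: in $2, 'm' equals the page's tqe_next, 'first_object' equals the page's 'object', and 'first_pindex' is the page's 'pindex' (11). The sketch below uses toy structures with hypothetical names, not the kernel definitions.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are not the kernel definitions. */
struct demo_page {
	void *tqe_next;
	void *tqe_prev;
};

struct demo_faultstate {
	struct demo_page *m;	/* first field, and it is a pointer */
	void *object;
};

/*
 * Correct use of the first-member rule: the address OF the field 'm'
 * is also the address of the enclosing structure.
 */
static struct demo_faultstate *
faultstate_from_field(struct demo_page **mp)
{
	return (struct demo_faultstate *)(void *)mp;
}

/*
 * What '*(struct faultstate *)m' effectively does: it reads the page's own
 * bytes as if they were a fault state. memcpy keeps the demo well defined.
 */
static struct demo_faultstate
page_bytes_as_faultstate(const struct demo_page *pg)
{
	struct demo_faultstate out;

	assert(sizeof(out) == sizeof(*pg));
	memcpy(&out, pg, sizeof(out));
	return out;
}
```

To recover 'fs' in kgdb one would need the address of the local variable itself (for example from the frame's registers or stack), which is why the later message turns to 'info reg'.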
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/14/2016 10:07, Andriy Gapon wrote: On 13/11/2016 15:28, Henri Hennebert wrote: On 11/13/2016 11:06, Andriy Gapon wrote: On 12/11/2016 14:40, Henri Hennebert wrote: [snip] Could you please show 'info local' in frame 14? I expected that 'nd' variable would be defined there and it may contain some useful information. No luck there: (kgdb) fr 14 #14 0x80636838 in kern_statat (td=0xf80009ba0500, flag=, fd=-100, path=0x0, pathseg=, sbp=, hook=0x800e2a388) at /usr/src/sys/kern/vfs_syscalls.c:2160 2160if ((error = namei()) != 0) (kgdb) info local rights = nd = error = sb = (kgdb) I also try to get information from the execve of the other treads: for tid 101250: (kgdb) fr 10 #10 0x80508ccc in sys_execve (td=0xf800b6429000, uap=0xfe010184fb80) at /usr/src/sys/kern/kern_exec.c:218 218error = kern_execve(td, , NULL); (kgdb) print *uap $4 = {fname_l_ = 0xfe010184fb80 "`\220\217\002\b", fname = 0x8028f9060 , fname_r_ = 0xfe010184fb88 "`¶ÿÿÿ\177", argv_l_ = 0xfe010184fb88 "`¶ÿÿÿ\177", argv = 0x7fffb660, argv_r_ = 0xfe010184fb90 "\bÜÿÿÿ\177", envv_l_ = 0xfe010184fb90 "\bÜÿÿÿ\177", envv = 0x7fffdc08, envv_r_ = 0xfe010184fb98 ""} (kgdb) for tid 101243: (kgdb) f 15 #15 0x80508ccc in sys_execve (td=0xf800b642b500, uap=0xfe010182cb80) at /usr/src/sys/kern/kern_exec.c:218 218error = kern_execve(td, , NULL); (kgdb) print *uap $5 = {fname_l_ = 0xfe010182cb80 "ÀÏ\205\002\b", fname = 0x80285cfc0 , fname_r_ = 0xfe010182cb88 "`¶ÿÿÿ\177", argv_l_ = 0xfe010182cb88 "`¶ÿÿÿ\177", argv = 0x7fffb660, argv_r_ = 0xfe010182cb90 "\bÜÿÿÿ\177", envv_l_ = 0xfe010182cb90 "\bÜÿÿÿ\177", envv = 0x7fffdc08, envv_r_ = 0xfe010182cb98 ""} (kgdb) I think that you see garbage in those structures because they contain pointers to userland data. 
Hmm, I've just noticed another interesting thread: Thread 668 (Thread 101245): #0 sched_switch (td=0xf800b642aa00, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973 #1 0x80561ae2 in mi_switch (flags=, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455 #2 0x805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646 #3 0x805614b1 in _sleep (ident=, lock=, priority=, wmesg=0x809c51bc "vmpfw", sbt=0, pr=, flags=) at /usr/src/sys/kern/kern_synch.c:229 #4 0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, wmesg=) at /usr/src/sys/vm/vm_page.c:753 #5 0x8089dd4d in vm_page_sleep_if_busy (m=0xf800df68cd40, msg=0x809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086 #6 0x80886be9 in vm_fault_hold (map=, vaddr=, fault_type=4 '\004', fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:495 #7 0x80885448 in vm_fault (map=0xf80011d66000, vaddr=, fault_type=4 '\004', fault_flags=) at /usr/src/sys/vm/vm_fault.c:273 #8 0x808d3c49 in trap_pfault (frame=0xfe0101836c00, usermode=1) at /usr/src/sys/amd64/amd64/trap.c:741 #9 0x808d3386 in trap (frame=0xfe0101836c00) at /usr/src/sys/amd64/amd64/trap.c:333 #10 0x808b7af1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236 This tread is another program from the news system: 668 Thread 101245 (PID=49124: innfeed) sched_switch (td=0xf800b642aa00, newtd=0xf8000285f000, flags=out>) at /usr/src/sys/kern/sched_ule.c:1973 I strongly suspect that this is thread that we were looking for. I think that it has the vnode lock in the shared mode while trying to fault in a page. Could you please check that by going to frame 6 and printing 'fs' and '*fs.vp'? It'd be interesting to understand why this thread is waiting here. So, please also print '*fs.m' and '*fs.object'. 
No luck :-(

(kgdb) fr 6
#6 0x80886be9 in vm_fault_hold (map=<value optimized out>, vaddr=<value optimized out>, fault_type=4 '\004', fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:495
495 vm_page_sleep_if_busy(fs.m, "vmpfw");
(kgdb) print fs
Cannot access memory at address 0x1fa0
(kgdb)

Henri
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/13/2016 14:28, Henri Hennebert wrote:

These 2 threads are innd processes. In core.txt.4:

8 14789 29165 0 24 4 40040 6612 zfs    DN- 0:00.00 [innd]
8 29165     1 0 20 0 42496 6888 select Ds- 0:01.33 [innd]
8 49778 29165 0 24 4 40040 6900 zfs    DN- 0:00.00 [innd]
8 82034 29165 0 24 4   132    0 zfs    DN- 0:00.00 [innd]

The corresponding 'info threads' entries are:

687 Thread 101243 (PID=49778: innd) sched_switch (td=0xf800b642b500, newtd=0xf8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
681 Thread 101147 (PID=14789: innd) sched_switch (td=0xf80065f4e500, newtd=0xf8000285f000, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
669 Thread 101250 (PID=82034: innd) sched_switch (td=0xf800b6429000, newtd=0xf8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
665 Thread 101262 (PID=29165: innd) sched_switch (td=0xf800b6b54a00, newtd=0xf8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973

In case it helps, I had a look at innd. This process uses 2 execv calls: one to execute /bin/sh and the other to re-execute itself:

    /*
    **  Re-exec ourselves.
    */
    static const char *
    CCxexec(char *av[])
    {
        char *innd;
        char *p;
        int i;

        if (CCargv == NULL)
            return "1 no argv!";

        innd = concatpath(innconf->pathbin, "innd");
        /* Get the pathname. */
        p = av[0];
        if (*p == '\0' || strcmp(p, "innd") == 0)
            CCargv[0] = innd;
        else
            return "1 Bad value";

    #ifdef DO_PERL
        PLmode(Mode, OMshutdown, av[0]);
    #endif
    #ifdef DO_PYTHON
        PYmode(Mode, OMshutdown, av[0]);
    #endif
        JustCleanup();
        syslog(L_NOTICE, "%s execv %s", LogName, CCargv[0]);

        /* Close all fds to protect possible fd leaking accross successive innds. */
        for (i=3; i<30; i++)
            close(i);

        execv(CCargv[0], CCargv);
        syslog(L_FATAL, "%s cant execv %s %m", LogName, CCargv[0]);
        _exit(1);
        /* NOTREACHED */
        return "1 Exit failed";
    }

The culprit may be /usr/local/news/bin/innd; remember that find is locked in /usr/local/news/bin.

Henri
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/13/2016 11:06, Andriy Gapon wrote: On 12/11/2016 14:40, Henri Hennebert wrote: I attatch it Thank you! So, these two threads are trying to get the lock in the exclusive mode: Thread 687 (Thread 101243): #0 sched_switch (td=0xf800b642b500, newtd=0xf8000285ea00, flags=) at /usr/src/sys/kern/sched_ule.c:1973 #1 0x80561ae2 in mi_switch (flags=, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455 #2 0x805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646 #3 0x8052f854 in sleeplk (lk=, flags=, ilk=, wmesg=0x813be535 "zfs", pri=, timo=51) at /usr/src/sys/kern/kern_lock.c:222 #4 0x8052f39d in __lockmgr_args (lk=, flags=, ilk=, wmesg=, pri=, timo=, file=, line=) at /usr/src/sys/kern/kern_lock.c:958 #5 0x80616a8c in vop_stdlock (ap=) at lockmgr.h:98 #6 0x8093784d in VOP_LOCK1_APV (vop=, a=) at vnode_if.c:2087 #7 0x8063c5b3 in _vn_lock (vp=, flags=548864, file=, line=) at vnode_if.h:859 #8 0x8062a5f7 in vget (vp=0xf80049c2c000, flags=548864, td=0xf800b642b500) at /usr/src/sys/kern/vfs_subr.c:2523 #9 0x806118b9 in cache_lookup (dvp=, vpp=, cnp=, tsp=, ticksp=) at /usr/src/sys/kern/vfs_cache.c:686 #10 0x806133dc in vfs_cache_lookup (ap=) at /usr/src/sys/kern/vfs_cache.c:1081 #11 0x80935777 in VOP_LOOKUP_APV (vop=, a=) at vnode_if.c:127 #12 0x8061cdf1 in lookup (ndp=) at vnode_if.h:54 #13 0x8061c492 in namei (ndp=) at /usr/src/sys/kern/vfs_lookup.c:306 #14 0x80509395 in kern_execve (td=, args=, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443 #15 0x80508ccc in sys_execve (td=0xf800b642b500, uap=0xfe010182cb80) at /usr/src/sys/kern/kern_exec.c:218 #16 0x808d449e in amd64_syscall (td=, traced=0) at subr_syscall.c:135 #17 0x808b7ddb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 Thread 681 (Thread 101147): #0 sched_switch (td=0xf80065f4e500, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973 #1 0x80561ae2 in mi_switch (flags=, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455 #2 0x805ae8da in sleepq_wait (wchan=0x0, 
pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646 #3 0x8052f854 in sleeplk (lk=, flags=, ilk=, wmesg=0x813be535 "zfs", pri=, timo=51) at /usr/src/sys/kern/kern_lock.c:222 #4 0x8052f39d in __lockmgr_args (lk=, flags=, ilk=, wmesg=, pri=, timo=, file=, line=) at /usr/src/sys/kern/kern_lock.c:958 #5 0x80616a8c in vop_stdlock (ap=) at lockmgr.h:98 #6 0x8093784d in VOP_LOCK1_APV (vop=, a=) at vnode_if.c:2087 #7 0x8063c5b3 in _vn_lock (vp=, flags=548864, file=, line=) at vnode_if.h:859 #8 0x8062a5f7 in vget (vp=0xf80049c2c000, flags=548864, td=0xf80065f4e500) at /usr/src/sys/kern/vfs_subr.c:2523 #9 0x806118b9 in cache_lookup (dvp=, vpp=, cnp=, tsp=, ticksp=) at /usr/src/sys/kern/vfs_cache.c:686 #10 0x806133dc in vfs_cache_lookup (ap=) at /usr/src/sys/kern/vfs_cache.c:1081 #11 0x80935777 in VOP_LOOKUP_APV (vop=, a=) at vnode_if.c:127 #12 0x8061cdf1 in lookup (ndp=) at vnode_if.h:54 #13 0x8061c492 in namei (ndp=) at /usr/src/sys/kern/vfs_lookup.c:306 #14 0x80509395 in kern_execve (td=, args=, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443 #15 0x80508ccc in sys_execve (td=0xf80065f4e500, uap=0xfe01016b8b80) at /usr/src/sys/kern/kern_exec.c:218 #16 0x808d449e in amd64_syscall (td=, traced=0) at subr_syscall.c:135 #17 0x808b7ddb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 This 2 threads are innd processes. 
In core.txt.4: 8 14789 29165 0 24 4 40040 6612 zfs DN- 0:00.00 [innd] 8 29165 1 0 20 0 42496 6888 select Ds- 0:01.33 [innd] 8 49778 29165 0 24 4 40040 6900 zfs DN- 0:00.00 [innd] 8 82034 29165 0 24 4 132 0 zfs DN- 0:00.00 [innd] the corresponding info treads are: 687 Thread 101243 (PID=49778: innd) sched_switch (td=0xf800b642b500, newtd=0xf8000285ea00, flags=out>) at /usr/src/sys/kern/sched_ule.c:1973 681 Thread 101147 (PID=14789: innd) sched_switch (td=0xf80065f4e500, newtd=0xf8000285f000, flags=out>) at /usr/src/sys/kern/sched_ule.c:1973 669 Thread 101250 (PID=82034: innd) sched_switch (td=0xf800b6429000, newtd=0xf8000285ea00, flags=out>) at /usr/src/sys/kern/sched_ule.c:1973 665 Thread 101262 (PID=29165: innd) sched_switch (td=0xf800b6b54a00, newtd=0xf8000285ea00, flags=out>) at /usr/src/sys/kern/sched_ule.c:1973 So your missing tread must be 101250: (kgdb) tid 101250 [Switching to thread 669 (Thread 101250)]#0 sched_switch (td=0xf800b6429000, newtd=0xf8000285ea00, flags=) at /usr/src/sys/kern/sched_ule.c:1973 1973cpuid = PCPU_GET(cpuid); Current language: a
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/11/2016 12:24, Andriy Gapon wrote:

At this stage I would try to get a system crash dump for post-mortem analysis. There are a few ways to do that. You can enter ddb and then run the 'dump' and 'reset' commands. Or you can just do `sysctl debug.kdb.panic=1`. In either case, please double-check that your system has a dump device configured.

It takes some time to upload the dump... You can find it at http://tignes.restart.be/Xfer/

Henri
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/10/2016 19:40, Andriy Gapon wrote:
On 10/11/2016 19:55, Henri Hennebert wrote:
On 11/10/2016 18:33, Andriy Gapon wrote:
On 10/11/2016 18:12, Henri Hennebert wrote:
On 11/10/2016 16:54, Andriy Gapon wrote:
On 10/11/2016 17:20, Henri Hennebert wrote:
On 11/10/2016 15:00, Andriy Gapon wrote:

Interesting. I can not spot any suspicious thread that would hold the vnode lock. Could you please run kgdb (just like that, no arguments), then execute the 'bt' command and then select the frame where _vn_lock is called with the 'fr N' command. Then please 'print *vp' and share the result.

I think I am missing something in your request:

Oh, sorry! The very first step should be 'tid 101112' to switch to the correct context.

(kgdb) fr 7
#7 0x8063c5b3 in _vn_lock (vp=<value optimized out>, flags=2121728,

"value optimized out" - not good

file=<value optimized out>, line=<value optimized out>) at vnode_if.h:859
859 vnode_if.h: No such file or directory.
in vnode_if.h
(kgdb) print *vp

I am not sure if this output is valid, because of the message above. Could you please try to navigate to nearby frames and see if vp itself has a valid value there. If you can find such a frame, please print *vp there.

Does this seem better?

Yes!
(kgdb) fr 8
#8 0x8062a5f7 in vget (vp=0xf80049c2c000, flags=2121728, td=0xf80009ba0500) at /usr/src/sys/kern/vfs_subr.c:2523
2523 if ((error = vn_lock(vp, flags)) != 0) {
(kgdb) print *vp
$1 = {v_tag = 0x813be535 "zfs", v_op = 0x813d0f70, v_data = 0xf80049c1f420, v_mount = 0xf800093aa660, v_nmntvnodes = {tqe_next = 0xf80049c2c938, tqe_prev = 0xf80049c2bb30}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_cache_src = {lh_first = 0x0}, v_cache_dst = { tqh_first = 0xf800bfc8e3f0, tqh_last = 0xf800bfc8e410}, v_cache_dd = 0x0, v_lock = {lock_object = { lo_name = 0x813be535 "zfs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, lk_lock = 23, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name = 0x8099e9e0 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xf80049c2c068, v_actfreelist = { tqe_next = 0xf80049c2c938, tqe_prev = 0xf80049ae9bd0}, v_bufobj = {bo_lock = {lock_object = { lo_name = 0x8099e9f0 "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops = 0x80c4bf70, bo_object = 0xf800b62e9c60, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xf80049c2c000, __bo_vnode = 0xf80049c2c000, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xf80049c2c120}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xf80049c2c140}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xf80049c2c188}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 9, v_usecount = 6, v_iflag = 512, v_vflag = 32, v_writecount = 0, v_hash = 4833984, v_type = VREG}
(kgdb)

flags=2121728 = 0x206000 = LK_SHARED | LK_VNHELD | LK_NODDLKTREAT
lk_lock = 23 = 0x17 = LK_ONE_SHARER | LK_EXCLUSIVE_WAITERS | LK_SHARED_WAITERS | LK_SHARE

So, here's what we have here: this thread tries to get a shared lock on the vnode, the vnode is already locked in shared mode, but there is an exclusive waiter (or, perhaps, multiple waiters). So, this thread can not get the lock because of the exclusive waiter. And I do not see an easy way to identify that waiter.

In the procstat output that you provided earlier there was no other thread in vn_lock. Hmm, I see this:

procstat: sysctl: kern.proc.kstack: 14789: Device busy
procstat: sysctl: kern.proc.kstack: 82034: Device busy

Could you please check what those two processes are (if they are still running)? Perhaps try procstat for each of the pids several times.

These 2 processes are 2 instances of the innd daemon (news server), which is consistent with the directory /usr/local/news/bin.

[root@avoriaz ~]# procstat 14789
  PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM
14789 29165 29165 29165     0   1 root     zfs       FreeBSD ELF64 innd
[root@avoriaz ~]# procstat 82034
  PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM
82034 29165 29165 29165     0   1 root     zfs       FreeBSD ELF64 innd
[root@avoriaz ~]# procstat -f 14789
procstat: kinfo_getfile(): Device busy
  PID COMM               FD T V FLAGS    REF  OFFSET PRO NAME
[root@avoriaz ~]# procstat -f 14789
procstat: kinfo_getfile(): Device busy
  PID COMM               FD T V FLAGS    REF  OFFSET PRO NAME
[root@avoriaz ~]# procstat -f 14789
procstat: kinfo_getfile():
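The manual decoding of 'flags' and 'lk_lock' in the message above can be reproduced with a short program. Treat it as a sketch under assumptions: the constants are taken from memory of FreeBSD 11's sys/sys/lockmgr.h and should be verified against the matching source tree, and both describe_* functions are hypothetical helpers, not kernel or libc APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Constants assumed from FreeBSD 11 sys/sys/lockmgr.h. Treat them as
 * assumptions and verify them against the matching source tree.
 */
#define LK_SHARE             0x01u
#define LK_SHARED_WAITERS    0x02u
#define LK_EXCLUSIVE_WAITERS 0x04u
#define LK_SHARERS_SHIFT     4

/* Request bits, as passed in the 'flags' argument seen in vget(). */
static const struct {
	unsigned int bit;
	const char *name;
} lk_request_bits[] = {
	{ 0x080000u, "LK_EXCLUSIVE" },
	{ 0x200000u, "LK_SHARED" },
	{ 0x002000u, "LK_NODDLKTREAT" },
	{ 0x004000u, "LK_VNHELD" },
};

/* Hypothetical helper: name the known bits set in a lock request. */
static const char *
describe_lk_request(unsigned int flags, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(lk_request_bits) / sizeof(lk_request_bits[0]); i++) {
		int n;

		if ((flags & lk_request_bits[i].bit) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
		    used > 0 ? " | " : "", lk_request_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;		/* Output truncated; stop appending. */
		used = strlen(buf);
	}
	return buf;
}

/* Hypothetical helper: describe the lk_lock state word of a lockmgr lock. */
static const char *
describe_lk_lock(uintptr_t lk, char *buf, size_t len)
{
	const char *sw = (lk & LK_SHARED_WAITERS) ? ", shared waiters" : "";
	const char *xw = (lk & LK_EXCLUSIVE_WAITERS) ? ", exclusive waiters" : "";

	assert(buf != NULL && len > 0);
	if (lk & LK_SHARE)
		snprintf(buf, len, "shared, %lu sharer(s)%s%s",
		    (unsigned long)(lk >> LK_SHARERS_SHIFT), sw, xw);
	else
		snprintf(buf, len, "exclusively held%s%s", sw, xw);
	return buf;
}
```

With these assumed values, 2121728 (0x206000) decodes as a shared request and 23 (0x17) as one sharer with both kinds of waiters queued. The value flags=548864 (0x86000) that appears in the innd execve backtraces elsewhere in this thread decodes as an exclusive request, which is consistent with those threads being the exclusive waiters.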
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/10/2016 18:33, Andriy Gapon wrote: On 10/11/2016 18:12, Henri Hennebert wrote: On 11/10/2016 16:54, Andriy Gapon wrote: On 10/11/2016 17:20, Henri Hennebert wrote: On 11/10/2016 15:00, Andriy Gapon wrote: Interesting. I can not spot any suspicious thread that would hold the vnode lock. Could you please run kgdb (just like that, no arguments), then execute 'bt' command and then select a frame when _vn_lock is called with 'fr N' command. Then please 'print *vp' and share the result. I Think I miss something in your request: Oh, sorry! The very first step should be 'tid 101112' to switch to the correct context. (kgdb) fr 7 #7 0x8063c5b3 in _vn_lock (vp=, flags=2121728, "value optimized out" - not good file=, line=) at vnode_if.h:859 859vnode_if.h: No such file or directory. in vnode_if.h (kgdb) print *vp I am not sure if this output is valid, because of the message above. Could you please try to navigate to nearby frames and see if vp itself has a valid value there. If you can find such a frame please do *vp there. Does this seems better? 
(kgdb) fr 8 #8 0x8062a5f7 in vget (vp=0xf80049c2c000, flags=2121728, td=0xf80009ba0500) at /usr/src/sys/kern/vfs_subr.c:2523 2523if ((error = vn_lock(vp, flags)) != 0) { (kgdb) print *vp $1 = {v_tag = 0x813be535 "zfs", v_op = 0x813d0f70, v_data = 0xf80049c1f420, v_mount = 0xf800093aa660, v_nmntvnodes = {tqe_next = 0xf80049c2c938, tqe_prev = 0xf80049c2bb30}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_cache_src = {lh_first = 0x0}, v_cache_dst = { tqh_first = 0xf800bfc8e3f0, tqh_last = 0xf800bfc8e410}, v_cache_dd = 0x0, v_lock = {lock_object = { lo_name = 0x813be535 "zfs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, lk_lock = 23, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name = 0x8099e9e0 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xf80049c2c068, v_actfreelist = { tqe_next = 0xf80049c2c938, tqe_prev = 0xf80049ae9bd0}, v_bufobj = {bo_lock = {lock_object = { lo_name = 0x8099e9f0 "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops = 0x80c4bf70, bo_object = 0xf800b62e9c60, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xf80049c2c000, __bo_vnode = 0xf80049c2c000, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xf80049c2c120}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xf80049c2c140}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xf80049c2c188}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 9, v_usecount = 6, v_iflag = 512, v_vflag = 32, v_writecount = 0, v_hash = 4833984, v_type = VREG} (kgdb) Henri ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, 
send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/10/2016 16:54, Andriy Gapon wrote: On 10/11/2016 17:20, Henri Hennebert wrote: On 11/10/2016 15:00, Andriy Gapon wrote: Interesting. I can not spot any suspicious thread that would hold the vnode lock. Could you please run kgdb (just like that, no arguments), then execute 'bt' command and then select a frame when _vn_lock is called with 'fr N' command. Then please 'print *vp' and share the result. I Think I miss something in your request: Oh, sorry! The very first step should be 'tid 101112' to switch to the correct context. (kgdb) fr 7 #7 0x8063c5b3 in _vn_lock (vp=, flags=2121728, file=, line=) at vnode_if.h:859 859 vnode_if.h: No such file or directory. in vnode_if.h (kgdb) print *vp $1 = {v_tag = 0x80faeb78 "â~\231\200", v_op = 0xf80009a41000, v_data = 0x0, v_mount = 0xf80009a41010, v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0x80edc088}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0xf80009466e90, le_prev = 0x0}, v_cache_src = {lh_first = 0xfe010186d768}, v_cache_dst = {tqh_first = 0x0, tqh_last = 0xfeb8a7c0}, v_cache_dd = 0xf8000284f000, v_lock = {lock_object = { lo_name = 0xf8002c00ee80 "", lo_flags = 0, lo_data = 0, lo_witness = 0xf800068bd480}, lk_lock = 1844673520268056, lk_exslpfail = 153715840, lk_timo = -2048, lk_pri = 0}, v_interlock = {lock_object = { lo_name = 0x18af8 Bad address>, lo_flags = 0, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, v_vnlock = 0x0, v_actfreelist = {tqe_next = 0x0, tqe_prev = 0xf80009ba05c0}, v_bufobj = {bo_lock = {lock_object = {lo_name = 0xf80009a41000 "", lo_flags = 1, lo_data = 0, lo_witness = 0x400ff}, rw_lock = 2}, bo_ops = 0x1, bo_object = 0xf80049c2c068, bo_synclist = {le_next = 0x813be535, le_prev = 0x1}, bo_private = 0x0, __bo_vnode = 0x0, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0x0}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0xf80088ac8d00, tqh_last = 0xf8003cc5b600}, bv_root = {pt_root = 2553161591}, bv_cnt = 
-1741805705}, bo_numoutput = 31, bo_flag = 0, bo_bsize = 0}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0xf88, tqh_last = 0x19cc}, rl_currdep = 0x3f8}, v_cstart = 16256, v_lasta = 679, v_lastw = 0, v_clen = 0, v_holdcnt = 0, v_usecount = 2369, v_iflag = 0, v_vflag = 0, v_writecount = 0, v_hash = 0, v_type = VNON} (kgdb) Thanks for your time Henri ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/10/2016 15:00, Andriy Gapon wrote: On 10/11/2016 12:30, Henri Hennebert wrote: On 11/10/2016 11:21, Andriy Gapon wrote: On 09/11/2016 15:58, Eric van Gyzen wrote: On 11/09/2016 07:48, Henri Hennebert wrote: I encounter a strange deadlock on FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 r308260: Fri Nov 4 02:51:33 CET 2016 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64 This system is exclusively running on zfs. After 3 or 4 days, `periodic daily` is locked in the directory /usr/local/news/bin [root@avoriaz ~]# ps xa|grep find 85656 - D0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune -o ( -name [#,]* -o -name .#* -o -name a.out -o -nam 462 1 S+ 0:00.00 grep find [root@avoriaz ~]# procstat -f 85656 PID COMMFD T V FLAGSREF OFFSET PRO NAME 85656 find text v r r--- - - - /usr/bin/find 85656 find cwd v d r--- - - - /usr/local/news/bin 85656 find root v d r--- - - - / 85656 find 0 v c r--- 3 0 - /dev/null 85656 find 1 p - rw-- 1 0 - - 85656 find 2 v r -w-- 7 17 - - 85656 find 3 v d r--- 1 0 - /home/root 85656 find 4 v d r--- 1 0 - /home/root 85656 find 5 v d rn-- 1 533545184 - /usr/local/news/bin [root@avoriaz ~]# If I try `ls /usr/local/news/bin` it is also locked. After `shutdown -r now` the system remain locked after the line '0 0 0 0 0 0' After a reset and reboot I can access /usr/local/news/bin. I delete this directory and reinstall the package `portupgrade -fu news/inn` 5 days later `periodic daily`is locked on the same directory :-o Any idea? I can't help with the deadlock, but someone who _can_ help will probably ask for the output of "procstat -kk PID" with the PID of the "find" process. In fact, it's procstat -kk -a. With just one thread we would see that a thread is blocked on something, but we won't see why that something can not be acquired. I attach the result, Interesting. I can not spot any suspicious thread that would hold the vnode lock. 
Could you please run kgdb (just like that, no arguments), then execute 'bt' command and then select a frame when _vn_lock is called with 'fr N' command. Then please 'print *vp' and share the result. I Think I miss something in your request: [root@avoriaz ~]# kgdb GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done. done. --- clip --- Loaded symbols for /boot/kernel/accf_data.ko Reading symbols from /boot/kernel/daemon_saver.ko...Reading symbols from /usr/lib/debug//boot/kernel/daemon_saver.ko.debug...done. done. 
Loaded symbols for /boot/kernel/daemon_saver.ko
#0 sched_switch (td=0xf8001131da00, newtd=0xf800762a8500, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
1973 cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0 sched_switch (td=0xf8001131da00, newtd=0xf800762a8500, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
#1 0x80566b15 in tc_fill_vdso_timehands32 (vdso_th32=0x0) at /usr/src/sys/kern/kern_tc.c:2121
#2 0x80555227 in timekeep_push_vdso () at /usr/src/sys/kern/kern_sharedpage.c:174
#3 0x80566226 in tc_windup () at /usr/src/sys/kern/kern_tc.c:1426
#4 0x804eaa41 in hardclock_cnt (cnt=1, usermode=<value optimized out>) at /usr/src/sys/kern/kern_clock.c:589
#5 0x808fac74 in handleevents (now=<value optimized out>, fake=0) at /usr/src/sys/kern/kern_clocksource.c:223
#6 0x808fb1d7 in timercb (et=0x8100cf20, arg=<value optimized out>) at /usr/src/sys/kern/kern_clocksource.c:352
#7 0xf800b6429a00 in ?? ()
#8 0x81051080 in vm_page_array ()
#9 0x81051098 in vm_page_queue_free_mtx ()
#10 0xfe0101818920 in ?? ()
#11 0x805399c0 in __mtx_lock_sleep (c=<value optimized out>, tid=Error accessing memory address 0xffac: Bad address.
) at /usr/src/sys/kern/kern_mutex.c:590
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb) q
[root@avoriaz ~]#

I can't find the requested frame.

Henri
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/10/2016 11:21, Andriy Gapon wrote:
> On 09/11/2016 15:58, Eric van Gyzen wrote:
>> On 11/09/2016 07:48, Henri Hennebert wrote:
>>> I encounter a strange deadlock on
>>>
>>> FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 r308260: Fri Nov 4 02:51:33 CET 2016 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64
>>>
>>> This system is exclusively running on zfs.
>>>
>>> After 3 or 4 days, `periodic daily` is locked in the directory /usr/local/news/bin
>>>
>>> [root@avoriaz ~]# ps xa|grep find
>>> 85656  -  D    0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune -o ( -name [#,]* -o -name .#* -o -name a.out -o -nam
>>>   462  1  S+   0:00.00 grep find
>>> [root@avoriaz ~]# procstat -f 85656
>>>   PID COMM    FD T V FLAGS REF    OFFSET PRO NAME
>>> 85656 find  text v r r---    -         - -   /usr/bin/find
>>> 85656 find   cwd v d r---    -         - -   /usr/local/news/bin
>>> 85656 find  root v d r---    -         - -   /
>>> 85656 find     0 v c r---    3         0 -   /dev/null
>>> 85656 find     1 p - rw--    1         0 -   -
>>> 85656 find     2 v r -w--    7        17 -   -
>>> 85656 find     3 v d r---    1         0 -   /home/root
>>> 85656 find     4 v d r---    1         0 -   /home/root
>>> 85656 find     5 v d rn--    1 533545184 -   /usr/local/news/bin
>>> [root@avoriaz ~]#
>>>
>>> If I try `ls /usr/local/news/bin` it is also locked.
>>>
>>> After `shutdown -r now` the system remains locked after the line '0 0 0 0 0 0'
>>>
>>> After a reset and reboot I can access /usr/local/news/bin.
>>>
>>> I delete this directory and reinstall the package `portupgrade -fu news/inn`
>>>
>>> 5 days later `periodic daily` is locked on the same directory :-o
>>>
>>> Any idea?
>>
>> I can't help with the deadlock, but someone who _can_ help will probably ask
>> for the output of "procstat -kk PID" with the PID of the "find" process.
>
> In fact, it's procstat -kk -a. With just one thread we would see that a thread
> is blocked on something, but we won't see why that something can not be
> acquired.
I attach the result, Henri [root@avoriaz ~]# procstat -kk -a PIDTID COMM TDNAME KSTACK 0 10 kernel swapper mi_switch+0xd2 sleepq_timedwait+0x3a _sleep+0x281 swapper+0x464 btext+0x2c 0 19 kernel kqueue_ctx taskq mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe 0 100012 kernel aiod_kick taskq mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe 0 100013 kernel thread taskq mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe 0 100018 kernel firmware taskq mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe 0 100022 kernel acpi_task_0 mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100023 kernel acpi_task_1 mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100024 kernel acpi_task_2 mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100025 kernel em0 que mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100026 kernel em0 txq mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100027 kernel em1 taskqmi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100060 kernel mca taskqmi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100061 kernel system_taskq_0 mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe 0 100062 kernel system_taskq_1 mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 
fork_trampoline+0xe 0 100063 kernel dbu_evictmi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe 0 100072 kernel CAM taskqmi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe 0 100086 kernel if_config_tqg_0 mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd gtaskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe 0 100087 kernel if_io_tqg_0 mi_switch+0
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/09/2016 19:23, Thierry Thomas wrote:
> On Wed, 9 Nov 16 at 15:03:49 +0100, Henri Hennebert <h...@restart.be> wrote:
>
>> [root@avoriaz ~]# procstat -kk 85656
>>   PID    TID COMM TDNAME KSTACK
>> 85656 101112 find -      mi_switch+0xd2 sleepq_wait+0x3a sleeplk+0x1b4 __lockmgr_args+0x356 vop_stdlock+0x3c VOP_LOCK1_APV+0x8d _vn_lock+0x43 vget+0x47 cache_lookup+0x679 vfs_cache_lookup+0xac VOP_LOOKUP_APV+0x87 lookup+0x591 namei+0x572 kern_statat+0xa8 sys_fstatat+0x2c amd64_syscall+0x4ce Xfast_syscall+0xfb
>
> It looks similar to the problem reported in PR 205163
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205163
>
> It may be caused by too small values for some vfs.zfs.arc*.
> Could you please list the sysctl values for vfs.zfs.arc_max and others?
>
> Regards,

[root@avoriaz ~]# sysctl vfs.zfs
vfs.zfs.trim.max_interval: 1 vfs.zfs.trim.timeout: 30 vfs.zfs.trim.txg_delay: 32 vfs.zfs.trim.enabled: 1 vfs.zfs.vol.unmap_enabled: 1 vfs.zfs.vol.recursive: 0 vfs.zfs.vol.mode: 1 vfs.zfs.version.zpl: 5 vfs.zfs.version.spa: 5000 vfs.zfs.version.acl: 1 vfs.zfs.version.ioctl: 6 vfs.zfs.debug: 0 vfs.zfs.super_owner: 0 vfs.zfs.sync_pass_rewrite: 2 vfs.zfs.sync_pass_dont_compress: 5 vfs.zfs.sync_pass_deferred_free: 2 vfs.zfs.zio.exclude_metadata: 0 vfs.zfs.zio.use_uma: 1 vfs.zfs.cache_flush_disable: 0 vfs.zfs.zil_replay_disable: 0 vfs.zfs.min_auto_ashift: 9 vfs.zfs.max_auto_ashift: 13 vfs.zfs.vdev.trim_max_pending: 1 vfs.zfs.vdev.bio_delete_disable: 0 vfs.zfs.vdev.bio_flush_disable: 0 vfs.zfs.vdev.write_gap_limit: 4096 vfs.zfs.vdev.read_gap_limit: 32768 vfs.zfs.vdev.aggregation_limit: 131072 vfs.zfs.vdev.trim_max_active: 64 vfs.zfs.vdev.trim_min_active: 1 vfs.zfs.vdev.scrub_max_active: 2 vfs.zfs.vdev.scrub_min_active: 1 vfs.zfs.vdev.async_write_max_active: 10 vfs.zfs.vdev.async_write_min_active: 1 vfs.zfs.vdev.async_read_max_active: 3 vfs.zfs.vdev.async_read_min_active: 1 vfs.zfs.vdev.sync_write_max_active: 10 vfs.zfs.vdev.sync_write_min_active: 10 vfs.zfs.vdev.sync_read_max_active: 10 vfs.zfs.vdev.sync_read_min_active: 10
vfs.zfs.vdev.max_active: 1000 vfs.zfs.vdev.async_write_active_max_dirty_percent: 60 vfs.zfs.vdev.async_write_active_min_dirty_percent: 30 vfs.zfs.vdev.mirror.non_rotating_seek_inc: 1 vfs.zfs.vdev.mirror.non_rotating_inc: 0 vfs.zfs.vdev.mirror.rotating_seek_offset: 1048576 vfs.zfs.vdev.mirror.rotating_seek_inc: 5 vfs.zfs.vdev.mirror.rotating_inc: 0 vfs.zfs.vdev.trim_on_init: 1 vfs.zfs.vdev.cache.bshift: 16 vfs.zfs.vdev.cache.size: 0 vfs.zfs.vdev.cache.max: 16384 vfs.zfs.vdev.metaslabs_per_vdev: 200 vfs.zfs.txg.timeout: 5 vfs.zfs.space_map_blksz: 4096 vfs.zfs.spa_slop_shift: 5 vfs.zfs.spa_asize_inflation: 24 vfs.zfs.deadman_enabled: 1 vfs.zfs.deadman_checktime_ms: 5000 vfs.zfs.deadman_synctime_ms: 100 vfs.zfs.debug_flags: 0 vfs.zfs.recover: 0 vfs.zfs.spa_load_verify_data: 1 vfs.zfs.spa_load_verify_metadata: 1 vfs.zfs.spa_load_verify_maxinflight: 1 vfs.zfs.ccw_retry_interval: 300 vfs.zfs.check_hostid: 1 vfs.zfs.mg_fragmentation_threshold: 85 vfs.zfs.mg_noalloc_threshold: 0 vfs.zfs.condense_pct: 200 vfs.zfs.metaslab.bias_enabled: 1 vfs.zfs.metaslab.lba_weighting_enabled: 1 vfs.zfs.metaslab.fragmentation_factor_enabled: 1 vfs.zfs.metaslab.preload_enabled: 1 vfs.zfs.metaslab.preload_limit: 3 vfs.zfs.metaslab.unload_delay: 8 vfs.zfs.metaslab.load_pct: 50 vfs.zfs.metaslab.min_alloc_size: 33554432 vfs.zfs.metaslab.df_free_pct: 4 vfs.zfs.metaslab.df_alloc_threshold: 131072 vfs.zfs.metaslab.debug_unload: 0 vfs.zfs.metaslab.debug_load: 0 vfs.zfs.metaslab.fragmentation_threshold: 70 vfs.zfs.metaslab.gang_bang: 16777217 vfs.zfs.free_bpobj_enabled: 1 vfs.zfs.free_max_blocks: 18446744073709551615 vfs.zfs.no_scrub_prefetch: 0 vfs.zfs.no_scrub_io: 0 vfs.zfs.resilver_min_time_ms: 3000 vfs.zfs.free_min_time_ms: 1000 vfs.zfs.scan_min_time_ms: 1000 vfs.zfs.scan_idle: 50 vfs.zfs.scrub_delay: 4 vfs.zfs.resilver_delay: 2 vfs.zfs.top_maxinflight: 32 vfs.zfs.zfetch.array_rd_sz: 1048576 vfs.zfs.zfetch.max_distance: 8388608 vfs.zfs.zfetch.min_sec_reap: 2 vfs.zfs.zfetch.max_streams: 8 
vfs.zfs.prefetch_disable: 1 vfs.zfs.delay_scale: 50 vfs.zfs.delay_min_dirty_percent: 60 vfs.zfs.dirty_data_sync: 67108864 vfs.zfs.dirty_data_max_percent: 10 vfs.zfs.dirty_data_max_max: 4294967296 vfs.zfs.dirty_data_max: 373664153 vfs.zfs.max_recordsize: 1048576 vfs.zfs.mdcomp_disable: 0 vfs.zfs.nopwrite_enabled: 1 vfs.zfs.dedup.prefetch: 1 vfs.zfs.l2c_only_size: 0 vfs.zfs.mfu_ghost_data_lsize: 24202240 vfs.zfs.mfu_ghost_metadata_lsize: 136404992 vfs.zfs.mfu_ghost_size: 160607232 vfs.zfs.mfu_data_lsize: 449569280 vfs.zfs.mfu_metadata_lsize: 102724608 vfs.zfs.mfu_size: 714202624 vfs.zfs.mru_ghost_data_lsize: 874834432 vfs.zfs.mru_ghost_metadata_lsize: 387692032 vfs.zfs.mru_ghost_size: 1262526464 vfs.zfs.mru_data_lsize: 151275008 vfs.zfs.mru_metadata_lsize: 13547008 vfs.zfs.mru_size: 322614272 vfs.zfs.anon_data_lsize: 0 vfs.zfs.anon_metadata_lsize: 0 vfs.zfs.anon_size: 2916352 vfs.zfs.l2arc_norw: 1 vfs.zfs.l2arc_feed
Re: Freebsd 11.0 RELEASE - ZFS deadlock
On 11/09/2016 14:58, Eric van Gyzen wrote:
> On 11/09/2016 07:48, Henri Hennebert wrote:
>> I encounter a strange deadlock on
>>
>> FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 r308260: Fri Nov 4 02:51:33 CET 2016 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64
>>
>> This system is exclusively running on zfs.
>>
>> After 3 or 4 days, `periodic daily` is locked in the directory /usr/local/news/bin
>>
>> [root@avoriaz ~]# ps xa|grep find
>> 85656  -  D    0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune -o ( -name [#,]* -o -name .#* -o -name a.out -o -nam
>>   462  1  S+   0:00.00 grep find
>> [root@avoriaz ~]# procstat -f 85656
>>   PID COMM    FD T V FLAGS REF    OFFSET PRO NAME
>> 85656 find  text v r r---    -         - -   /usr/bin/find
>> 85656 find   cwd v d r---    -         - -   /usr/local/news/bin
>> 85656 find  root v d r---    -         - -   /
>> 85656 find     0 v c r---    3         0 -   /dev/null
>> 85656 find     1 p - rw--    1         0 -   -
>> 85656 find     2 v r -w--    7        17 -   -
>> 85656 find     3 v d r---    1         0 -   /home/root
>> 85656 find     4 v d r---    1         0 -   /home/root
>> 85656 find     5 v d rn--    1 533545184 -   /usr/local/news/bin
>> [root@avoriaz ~]#
>>
>> If I try `ls /usr/local/news/bin` it is also locked.
>>
>> After `shutdown -r now` the system remains locked after the line '0 0 0 0 0 0'
>>
>> After a reset and reboot I can access /usr/local/news/bin.
>>
>> I delete this directory and reinstall the package `portupgrade -fu news/inn`
>>
>> 5 days later `periodic daily` is locked on the same directory :-o
>>
>> Any idea?
>
> I can't help with the deadlock, but someone who _can_ help will probably ask
> for the output of "procstat -kk PID" with the PID of the "find" process.
>
> Eric

[root@avoriaz ~]# procstat -kk 85656
  PID    TID COMM TDNAME KSTACK
85656 101112 find -      mi_switch+0xd2 sleepq_wait+0x3a sleeplk+0x1b4 __lockmgr_args+0x356 vop_stdlock+0x3c VOP_LOCK1_APV+0x8d _vn_lock+0x43 vget+0x47 cache_lookup+0x679 vfs_cache_lookup+0xac VOP_LOOKUP_APV+0x87 lookup+0x591 namei+0x572 kern_statat+0xa8 sys_fstatat+0x2c amd64_syscall+0x4ce Xfast_syscall+0xfb

Henri
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
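
For anyone reading the `procstat -kk` row above by eye, here is a minimal sketch (not part of the original thread) of how the KSTACK column could be split into function names to check whether a thread is sleeping in the vnode lock path. The helper names `kstack_frames` and `blocked_on_vnode_lock` are made up for illustration, and the parsing assumes every frame has the form `symbol+0xOFFSET`, as in the output shown.

```python
def kstack_frames(row):
    """Split one `procstat -kk` row into (pid, tid, [function names]).

    Assumption: frames are the whitespace-separated tokens that look like
    symbol+0xOFFSET; everything else (COMM, TDNAME) is skipped.
    """
    fields = row.split()
    pid, tid = int(fields[0]), int(fields[1])
    frames = [f.split("+", 1)[0] for f in fields[2:] if "+0x" in f]
    return pid, tid, frames


def blocked_on_vnode_lock(frames):
    """True if the stack shows a sleep inside lockmgr reached via _vn_lock."""
    return "_vn_lock" in frames and "sleeplk" in frames


# The row posted for the stuck find(1) process, with the columns separated.
ROW = ("85656 101112 find - mi_switch+0xd2 sleepq_wait+0x3a sleeplk+0x1b4 "
       "__lockmgr_args+0x356 vop_stdlock+0x3c VOP_LOCK1_APV+0x8d _vn_lock+0x43 "
       "vget+0x47 cache_lookup+0x679 vfs_cache_lookup+0xac VOP_LOOKUP_APV+0x87 "
       "lookup+0x591 namei+0x572 kern_statat+0xa8 sys_fstatat+0x2c "
       "amd64_syscall+0x4ce Xfast_syscall+0xfb")

if __name__ == "__main__":
    pid, tid, frames = kstack_frames(ROW)
    print(pid, tid, "waiting on vnode lock:", blocked_on_vnode_lock(frames))
```

This only tells which thread is waiting, not which thread holds the lock; that is why the full `procstat -kk -a` output is needed.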
Freebsd 11.0 RELEASE - ZFS deadlock
I encounter a strange deadlock on

FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 r308260: Fri Nov 4 02:51:33 CET 2016 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64

This system is exclusively running on zfs.

After 3 or 4 days, `periodic daily` is locked in the directory /usr/local/news/bin

[root@avoriaz ~]# ps xa|grep find
85656  -  D    0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune -o ( -name [#,]* -o -name .#* -o -name a.out -o -nam
  462  1  S+   0:00.00 grep find
[root@avoriaz ~]# procstat -f 85656
  PID COMM    FD T V FLAGS REF    OFFSET PRO NAME
85656 find  text v r r---    -         - -   /usr/bin/find
85656 find   cwd v d r---    -         - -   /usr/local/news/bin
85656 find  root v d r---    -         - -   /
85656 find     0 v c r---    3         0 -   /dev/null
85656 find     1 p - rw--    1         0 -   -
85656 find     2 v r -w--    7        17 -   -
85656 find     3 v d r---    1         0 -   /home/root
85656 find     4 v d r---    1         0 -   /home/root
85656 find     5 v d rn--    1 533545184 -   /usr/local/news/bin
[root@avoriaz ~]#

If I try `ls /usr/local/news/bin` it is also locked.

After `shutdown -r now` the system remains locked after the line '0 0 0 0 0 0'

After a reset and reboot I can access /usr/local/news/bin.

I deleted this directory and reinstalled the package with `portupgrade -fu news/inn`

5 days later `periodic daily` is locked on the same directory :-o

Any idea?

Henri
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 10-STABLE hangups frequently
On 02/18/2016 01:24, Marius Strobl wrote:
>
> Could those of you experiencing these hangs with ZFS please test
> whether instead of reverting all of r292895, a kernel built with
> just the merge of r291244 undone via the following patch gets rid
> of that problem - especially on amd64 - and report back?
> https://people.freebsd.org/~marius/r291244_reversal_10.diff
>
> Marius
>

On an i386 with 2GB and pure ZFS, without r291244 all is normal.

Henri
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 10-STABLE hangups frequently
On 02/03/2016 02:03, Hajimu UMEMOTO wrote:
> Hi,
>
>> On Wed, 3 Feb 2016 07:07:38 +1100
>> Peter Jeremy said:
>
> peter> As others have said, you need to provide lots more detail on your
> peter> configuration.
>
> CPU: AMD Athlon(tm) 64 Processor 3500+
> Memory: 4GB
> HDD: 3TB
>
> I'm using a ZFS-only setup.
>
> peter> There were no problems at r290231 but after I upgraded to r295005, I
> peter> started seeing "out of swap" errors and hangs during the periodic
> peter> daily runs. I'm not seeing this on 1GB instances - though they are
> peter> all running UFS.
>
> r292875 runs well:
>
> FreeBSD asuka.mahoroba.org 10.2-STABLE FreeBSD 10.2-STABLE #5 r292875: Tue Feb
> 2 07:08:29 JST 2016 r...@asuka.mahoroba.org:/usr/obj/usr/src/sys/ASUKA amd6
>
> r292895 hangs:
>
> FreeBSD asuka.mahoroba.org 10.2-STABLE FreeBSD 10.2-STABLE #6 r292895: Tue
> Feb 2 10:17:28 JST 2016 r...@asuka.mahoroba.org:/usr/obj/usr/src/sys/ASUKA
> amd64
>
> I tried the latest stable (r295137) with the sys/kern/vfs_subr.c part of
> r292895 reverted, and it seems to run well here:
>
> FreeBSD asuka.mahoroba.org 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #0
> r295137M: Tue Feb 2 20:39:11 JST 2016
> r...@asuka.mahoroba.org:/usr/obj/usr/src/sys/ASUKA amd64
>
> peter> Some experimentation suggested that just "find /" was enough to wedge
> peter> my system. I did some experimenting and found that the following
> peter> loader config was enough to prevent it hanging:
> peter> vfs.zfs.arc_max="128M"
> peter> vfs.zfs.arc_meta_limit="50M"
> peter> vfs.zfs.arc_min="25M"
> peter> (previously, I had no ZFS tuning at all).
>
> I had ZFS tuning before. However, after this problem occurred, I
> removed all of the ZFS tuning.
> The only FS-related setting now is kern.maxvnodes=40.
>
> Sincerely,
>
> --
> Hajimu UMEMOTO
> u...@mahoroba.org u...@freebsd.org
> http://www.mahoroba.org/~ume/

I encountered a hang 3 times after upgrading to 10.3-PRERELEASE r295247M in a
zfs configuration (i386 with 2GB memory) while trying to run security/tripwire
(which computes a checksum on all the files).

With /usr/src/sys/kern/vfs_subr.c at revision 291757 everything returns to normal.

Henri

PS thanks Hajimu
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Link error in usr.bin/dig if WITH_BIND_XML=yes
Hello,

Dig can't be linked if WITH_BIND_XML=yes is added to /etc/src.conf.

[root@morzine src]# svn info
Path: .
Working Copy Root Path: /usr/src
URL: http://svn.restart.bel/svn-FreeBSD-base/stable/9
Relative URL: ^/stable/9
Repository Root: http://svn.restart.bel/svn-FreeBSD-base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 255473
Node Kind: directory
Schedule: normal
Last Changed Author: des
Last Changed Rev: 255443
Last Changed Date: 2013-09-10 12:07:21 +0200 (Tue, 10 Sep 2013)

===> usr.bin/dig (all)
/usr/src/usr.bin/dig/../../contrib/bind9/bin/dig/dighost.c:4336:27: warning: passing 'const char *' to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
        isc_buffer_init(buffer, str, len);
                                ^~~
/usr/src/usr.bin/dig/../../contrib/bind9/lib/isc/include/isc/buffer.h:225:41: note: passing argument to parameter 'base' here
isc__buffer_init(isc_buffer_t *b, void *base, unsigned int length);
                                        ^
1 warning generated.
/usr/local/lib/libxml2.a(xzlib.o): In function `__libxml2_xzclose':
xzlib.c:(.text+0x69): undefined reference to `lzma_end'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_decomp':
xzlib.c:(.text+0x4a6): undefined reference to `lzma_code'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_make':
xzlib.c:(.text+0x8cd): undefined reference to `lzma_auto_decoder'
xzlib.c:(.text+0xa04): undefined reference to `lzma_properties_decode'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
*** [dig] Error code 1

Stop in /usr/src/usr.bin/dig.
*** [all] Error code 1

Henri
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Lost CAM Access to DVD Writer
On 09/02/2013 01:54, Thomas Laus wrote:

Hi,

:-( Unable to CAMGETPASSTHRU for /dev/cd0 Inappropriate ioctl for device.

I encountered the same problem and reinstalling dvd+rw-tools-7.1 solved it.

Henri

Could someone else try to make a 'dump to DVD' backup [...]

/sbin/dump -0u -L -C16 -B4589840 -P 'growisofs -Z /dev/cd0=/dev/fd/0' /u

A test with less disk load would be to write e.g. 100 MB of zeros to e.g. DVD+RW media (in order to reduce waste):

dd if=/dev/zero bs=1M count=100 | growisofs -Z /dev/cd0=/dev/fd/0

I got the same result and error message as I did when trying to dump the file system on all of the computers that use ATA for disk access. On the one PC that uses AHCI, it was able to write to the DVD.

Tom
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 9.2-BETA2 - Problem with newsyslog
On 07/29/2013 11:18, Henri Hennebert wrote:
> Hello,
>
> My entry for newsyslog in /etc/crontab is:
>
> 0 * * * * root newsyslog -t \%Y-\%m-\%d_\%H:\%M
>
> And I get:
>
> newsyslog: Could not convert time string to time value: No such file or directory
>
> I tried to use the newsyslog from head, to no avail.
>
> This solution was working a month ago (see Revision 248776)

Here I must have made some mistake... I retried with newsyslog.c from head and all is OK.

Henri

> My file system is zfs version 28.
>
> Henri
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
9.2-BETA2 - Problem with newsyslog
Hello,

My entry for newsyslog in /etc/crontab is:

0 * * * * root newsyslog -t \%Y-\%m-\%d_\%H:\%M

And I get:

newsyslog: Could not convert time string to time value: No such file or directory

I tried to use the newsyslog from head, to no avail.

This solution was working a month ago (see Revision 248776)

My file system is zfs version 28.

Henri
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
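
As a side note on the crontab entry above (not part of the original message): the backslashes are there because cron treats an unescaped `%` in the command field as the end of the command, with the rest passed as standard input. The sketch below is a simplified model of that rule plus a check of what the resulting strftime(3) format expands to. The function name `command_seen_by_shell` is hypothetical, the model ignores other backslash handling, and it says nothing about the cause of the newsyslog error itself.

```python
import time


def command_seen_by_shell(command_field):
    """Simplified model of crontab(5) percent handling.

    An unescaped % terminates the command (the remainder would become
    stdin); a backslash-escaped % is passed through as a literal %.
    """
    out = []
    i = 0
    while i < len(command_field):
        if command_field.startswith("\\%", i):
            out.append("%")      # escaped percent sign survives
            i += 2
        elif command_field[i] == "%":
            break                # unescaped percent sign ends the command
        else:
            out.append(command_field[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    entry = r"newsyslog -t \%Y-\%m-\%d_\%H:\%M"
    print(command_seen_by_shell(entry))
    # What the time format would produce for the Unix epoch in UTC.
    print(time.strftime("%Y-%m-%d_%H:%M", time.gmtime(0)))
```

With the escapes in place, newsyslog receives the format string `%Y-%m-%d_%H:%M`; without them it would receive only `-t` followed by nothing.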
9.2-BETA2 bind + WITH_BIND_XML=yes + libxml2-2.8.0
Hello,

When compiling world of 9.2-BETA2 and adding in /etc/src.conf

WITH_BIND_XML=yes

and with libxml2-2.8.0_2 (textproc/libxml2) installed in /usr/local, I get this link error:

===> usr.bin/dig (all)
/usr/local/lib/libxml2.a(xzlib.o): In function `__libxml2_xzclose':
xzlib.c:(.text+0x69): undefined reference to `lzma_end'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_decomp':
xzlib.c:(.text+0x4a6): undefined reference to `lzma_code'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_make':
xzlib.c:(.text+0x8cd): undefined reference to `lzma_auto_decoder'
xzlib.c:(.text+0xa04): undefined reference to `lzma_properties_decode'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
*** [dig] Error code 1

Stop in /usr/src/usr.bin/dig.
*** [all] Error code 1

Henri
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
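
To make the link failure above easier to summarize, here is a small sketch (an editorial illustration, not from the thread) that pulls the unresolved symbol names out of linker output of this shape. The function name `undefined_symbols` is made up. All four names it finds are liblzma functions, which hints, as an assumption not confirmed anywhere in the thread, that liblzma is missing from the link line when the static libxml2 is pulled in.

```python
import re

# Matches the GNU ld message: undefined reference to `symbol'
UNDEFINED_REF = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(linker_output):
    """Return the sorted, de-duplicated list of unresolved symbol names."""
    return sorted(set(UNDEFINED_REF.findall(linker_output)))


# The linker messages from the build log above.
LOG = """\
/usr/local/lib/libxml2.a(xzlib.o): In function `__libxml2_xzclose':
xzlib.c:(.text+0x69): undefined reference to `lzma_end'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_decomp':
xzlib.c:(.text+0x4a6): undefined reference to `lzma_code'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_make':
xzlib.c:(.text+0x8cd): undefined reference to `lzma_auto_decoder'
xzlib.c:(.text+0xa04): undefined reference to `lzma_properties_decode'
"""

if __name__ == "__main__":
    for name in undefined_symbols(LOG):
        print(name)
```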
Re: sysctl -a causes kernel trap 12
On 02/12/2013 12:22, Henri Hennebert wrote:
> On 01/19/2013 06:58, Brandon Gooch wrote:
>> On Fri, Jan 18, 2013 at 2:56 PM, Xin Li delp...@delphij.net wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA512
>>>
>>> On 01/18/13 12:50, Brandon Gooch wrote:
>>>> On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net wrote:
>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>> Hash: SHA256
>>>>>
>>>>> To all: this has become harder and harder to replicate lately. I've tried these options and the most important progress is that it's possible to get a crashdump when debug.debugger_on_panic=0, and I managed to get a backtrace which indicates the panic occurs when trying to do mtx_lock(Giant) - __mtx_lock_sleep - turnstile_wait - propagate_priority, but after I added some instrumentation to the surrounding code and enabled INVARIANTS and/or WITNESS, it mysteriously went away.
>>>>>
>>>>> Reverting my instrumentation code and updating to the latest svn made the issue disappear for one day. I hit it again today but unfortunately didn't get a successful dump, and after reboot I can't reproduce it again :(
>>>>>
>>>>> Still trying...
>>>>
>>>> Any updates Xin?
>>>
>>> No, it mysteriously disappeared for now. As far as I understand the recent svn commits, I didn't see anybody committing something that fixes it, but I can no longer panic my system, with or without debugging code :(
>>>
>>>> I was actually hitting what I believe to be exactly the same issue as you on one of my systems, and, as you've seen, adding any extra debugging or diagnostics seemed to eliminate the issue.
>>>>
>>>> I was able to generate quite a few vmcores and still have these sitting around in my filesystem (along with the kernels that helped produce them).
>>>>
>>>> I can recreate this crash on my system by compiling the NVIDIA driver with clang at -O1 and above. Although it's been noted that this issue has been seen in scenarios without an NVIDIA driver in the mix, whatever is happening in the kernel to cause the panic is somehow triggered by this, at least on my system.
>>>
>>> I'm not sure if this is the same problem. Could you please try using gcc to compile the NVIDIA driver and see if that fixes the problem?
>>>
>>> Cheers,
>>> - --
>>> Xin LI delp...@delphij.net https://www.delphij.net/
>>> FreeBSD - The Power to Serve! Live free or die
>>
>> Indeed, a gcc compiled NVIDIA module eliminates the issue, sorry if I hadn't mentioned this earlier.
>>
>> What was happening to me at first was that my system would just hang while booting. I was able to figure out that it was during /etc/rc.d/initrandom. I actually got to a point where I removed the call to sysctl -a from 'better_than_nothing()' in /etc/rc.d/initrandom to have a booting system.
>>
>> I finally had a situation where I could get a panic by adding SW_WATCHDOG to my kernel and running watchdogd(8).
>>
>> For me, this panic would come and go seemingly at random as well, and I couldn't fumble my way around in the debugger to learn much of anything when I first started seeing it.
>>
>> I just started a process of modularizing everything I could in my kernel config, then loading modules 1-by-1 and booting over-and-over until I finally found what appeared to be the problem, which was the NVIDIA module compiled with clang.
>>
>> Oh, another thing: at times it seemed as though it was the number of modules loaded, as I could get the hang with 41 modules loaded, but not 40 or 42?! I admit, when I was seeing that behavior, I hadn't eliminated the NVIDIA driver from my loaded modules. I need to revisit the panic situation to confirm this particular strangeness.
>>
>> Here's the last panic I had:
>>
>> Unread portion of the kernel message buffer:
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 1175 (sysctl)
>>
>> (kgdb) bt
>> #0  doadump (textdump=1694704112) at pcpu.h:229
>> #1  0x802fab82 in db_fncall (dummy1=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:578
>> #2  0x802fa85a in db_command (last_cmdp=<value optimized out>, cmd_table=<value optimized out>, dopager=1) at /usr/src/sys/ddb/db_command.c:449
>> #3  0x802fa612 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502
>> #4  0x802fcf60 in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:231
>> #5  0x804a7b93 in kdb_trap (type=12, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
>> #6  0x807157c5 in trap_fatal (frame=0xff8865032670, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:867
>> #7  0x80715adb in trap_pfault (frame=0x0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:698
>> #8  0x8071529b in trap (frame=0xff8865032670) at /usr/src/sys/amd64/amd64/trap.c:463
>> #9  0x806ff382 in calltrap () at exception.S:228
>> #10 0x8047bd50 in sysctl_sysctl_next_ls
Re: Can not build kernel with modular ata and ATA_CAM
On 06/25/2012 10:50, Mitya wrote:
> My kernel options:
>
> # Bus support.
> device acpi
> device pci
>
> # Modular ATA
> device atadisk        # ATA disk drives
> device atacore        # Core ATA functionality
> device atapci         # PCI bus support; only generic chipset support
> device ataintel       # Intel
> options ATA_CAM       # Handle legacy controllers with CAM
> options ATA_STATIC_ID # Static device numbering
>
> # ATA/SCSI peripherals
> device scbus          # SCSI bus (required for ATA/SCSI)
> device da             # Direct Access (disks)
> device pass           # Passthrough device (direct ATA/SCSI access)

From /usr/src/sys/conf/NOTES:

# ATA_CAM: Turn ata(4) subsystem controller drivers into cam(4)
#          interface modules. This deprecates all ata(4)
#          peripheral device drivers (atadisk, ataraid, atapicd,
#          atapifd, atapist, atapicam) and all user-level APIs.
#          cam(4) drivers and APIs will be connected instead.

So you must remove 'device atadisk'

Henri

> make's output:
>
> ata-disk.o: In function `ad_init':
> ata-disk.c:(.text+0x7d): undefined reference to `ata_setmode'
> ata-disk.c:(.text+0x95): undefined reference to `ata_wc'
> ata-disk.c:(.text+0xc9): undefined reference to `ata_controlcmd'
> ata-disk.c:(.text+0x11b): undefined reference to `ata_controlcmd'
> ata-disk.c:(.text+0x16d): undefined reference to `ata_controlcmd'
> ata-disk.c:(.text+0x1b6): undefined reference to `ata_controlcmd'
> ata-disk.o: In function `ad_shutdown':
> ata-disk.c:(.text+0x258): undefined reference to `ata_controlcmd'
> ata-disk.o: In function `ad_detach':
> ata-disk.c:(.text+0x479): undefined reference to `ata_fail_requests'
> ata-disk.o: In function `ad_dump':
> ata-disk.c:(.text+0x861): undefined reference to `ata_drop_requests'
> ata-disk.c:(.text+0x921): undefined reference to `ata_controlcmd'
> ata-disk.o: In function `ad_attach':
> ata-disk.c:(.text+0xa40): undefined reference to `ata_setmax'
> ata-disk.c:(.text+0xb62): undefined reference to `ata_satarev2str'
> ata-disk.c:(.text+0xba7): undefined reference to `ata_unit2str'
> ata-disk.c:(.text+0xfff): undefined reference to `ata_queue_request'
> ata-disk.c:(.text+0x131e): undefined reference to `ata_queue_request'
> ata-disk.c:(.text+0x1340): undefined reference to `ata_getparam'
> ata-disk.o: In function `ad_spindown':
> ata-disk.c:(.text+0x539): undefined reference to `ata_queue_request'
> ata-disk.o: In function `ad_ioctl':
> ata-disk.c:(.text+0x5a4): undefined reference to `ata_device_ioctl'
> ata-disk.o: In function `ad_strategy':
> ata-disk.c:(.text+0x6c7): undefined reference to `ata_queue_request'
> *** [kernel] Error code 1
>
> I found differences in ata-all.c and ata-all.h
>
> In ata-all.c:
>
> #ifndef ATA_CAM
> void
> ata_setmode(device_t dev)
> {
>
> But, in ata-all.h:
>
> void ata_setmode(device_t dev);
>
> without any #ifdef or #ifndef
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
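
The rule quoted from NOTES can be checked mechanically before building. The following is a rough sketch (not from the thread) that scans a kernel config for `options ATA_CAM` together with any of the peripheral drivers NOTES lists as deprecated by it. The function name `ata_cam_conflicts` and the sample config are illustrative, and the parser only understands plain `device NAME` and `options NAME` lines with `#` comments.

```python
# Peripheral drivers that NOTES says are deprecated once ATA_CAM is set.
DEPRECATED_BY_ATA_CAM = {
    "atadisk", "ataraid", "atapicd", "atapifd", "atapist", "atapicam",
}


def ata_cam_conflicts(config_text):
    """Return the deprecated ata(4) devices present alongside ATA_CAM."""
    options, devices = set(), set()
    for raw in config_text.splitlines():
        words = raw.split("#", 1)[0].split()   # drop comments, tokenize
        if len(words) >= 2 and words[0] == "options":
            options.add(words[1])
        elif len(words) >= 2 and words[0] == "device":
            devices.add(words[1])
    if "ATA_CAM" not in options:
        return []
    return sorted(devices & DEPRECATED_BY_ATA_CAM)


# The ATA-related part of the config posted in this thread.
SAMPLE_CONFIG = """\
device atadisk        # ATA disk drives
device atacore        # Core ATA functionality
device atapci         # PCI bus support; only generic chipset support
device ataintel       # Intel
options ATA_CAM       # Handle legacy controllers with CAM
options ATA_STATIC_ID # Static device numbering
device scbus          # SCSI bus (required for ATA/SCSI)
device da             # Direct Access (disks)
device pass           # Passthrough device (direct ATA/SCSI access)
"""

if __name__ == "__main__":
    print(ata_cam_conflicts(SAMPLE_CONFIG))
```

On the posted config this reports `atadisk`, which matches the advice to remove `device atadisk`.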
Re: cvsup{, d} woes after upgrading to RELENG_9 on amd64 this weekend
On 06/05/2012 10:17, Scot Hetzel wrote:
> On Mon, Jun 4, 2012 at 4:34 AM, Henri Hennebert h...@restart.be wrote:
>> On 06/04/2012 10:53, Trond Endrestøl wrote:
>>> Hi,
>>>
>>> After upgrading to RELENG_9 as of yesterday on my amd64 system, cvsup bombs out with Bus error: 10.
>>>
>>> Example:
>>>
>>> # /usr/local/bin/cvsup -g -L 2 /usr/src/stable-supfile
>>> Parsing supfile /usr/src/stable-supfile
>>> Connecting to localhost
>>> Connected to localhost
>>> Server software version: SNAP_16_1h
>>> Negotiating file attribute support
>>> Exchanging collection information
>>> Establishing multiplexed-mode data connection
>>> Running
>>> Updating collection src-all/cvs
>>> Bus error: 10
>>>
>>> The only recent change I can think of is switching to clang for building the kernel and base. Maybe I should rebuild world and kernel using gcc.
>>
>> This is the culprit, you must compile libc and libz with gcc.
>>
>> See http://www.freebsd.org/cgi/query-pr.cgi?pr=162588
>
> make.conf snippet from PR 162588:
>
> .if defined(WITH_CLANG)
> .if !defined(CC) || ${CC} == cc
> CC=clang
> .endif
> .if !defined(CXX) || ${CXX} == c++
> CXX=clang++
> .endif
> .if !defined(CPP) || ${CPP} == cpp
> CPP=clang -E
> .endif
> NO_WERROR=
> WERROR=
> .endif # WITH_CLANG
>
> According to http://wiki.freebsd.org/BuildingFreeBSDWithClang#Quickstart, you should be using:
>
> CPP=clang-cpp

I changed this a while ago and it doesn't change the problem at hand.

Henri

> If you change this, does it fix the issue?
>
> Scot
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: cvsup{, d} woes after upgrading to RELENG_9 on amd64 this weekend
On 06/04/2012 10:53, Trond Endrestøl wrote:
> Hi,
>
> After upgrading to RELENG_9 as of yesterday on my amd64 system, cvsup bombs out with Bus error: 10.
>
> Example:
>
> # /usr/local/bin/cvsup -g -L 2 /usr/src/stable-supfile
> Parsing supfile /usr/src/stable-supfile
> Connecting to localhost
> Connected to localhost
> Server software version: SNAP_16_1h
> Negotiating file attribute support
> Exchanging collection information
> Establishing multiplexed-mode data connection
> Running
> Updating collection src-all/cvs
> Bus error: 10
>
> The only recent change I can think of is switching to clang for building the kernel and base. Maybe I should rebuild world and kernel using gcc.

This is the culprit, you must compile libc and libz with gcc.

See http://www.freebsd.org/cgi/query-pr.cgi?pr=162588

Henri

> Today, I used portupgrade -fprv lang/ezm3 net/cvsup-without-gui, but cvsup gives me the same result as in the example above.
>
> This bug also affects cvsupd for those of us who are running a local FreeBSD CVSup mirror (http://motoyuki.bsdclub.org/BSD/cvsup.html) on amd64/RELENG_9.
>
> I know csup is generally preferred over cvsup, and in the meantime I'm able to use csup with another local FreeBSD CVSup mirror running on i386/RELENG_8.
>
> cvsup on the amd64 box crashes with Bus error even when accessing the CVSup mirror on the i386 box, thus indicating a problem local to the amd64 box.
>
> I welcome any clues to solve this problem.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: strange system corruption (freebsd 9.0)
On 04/20/2012 18:38, Ingrid Ditra wrote:
> Hi, folks!
>
> I am in the middle of a really creepy problem with my new FreeBSD box. I would appreciate any ideas about what the hell happened.
>
> I've installed FreeBSD 9.0 from CD on an IBM System x3550 server (with RAID-5 on 4 hard drives) and moved most of the configs from my old FreeBSD (8.2) box onto it, and everything seemed to work fine for some days.
>
> Long story short, today I realised that I can't log in through either ssh or the console, some third-party software doesn't work, and most of the utilities from the base system don't work either. My /usr/sbin and /usr/libdata are completely gone, /usr/libexec is empty, /usr/bin contains only the dtrace dir and librt.so.1, many files from /usr/bin are gone, /usr/src contains only the directory with my kernconf (all the sources used to be there) and /usr/ports contains only the ports I've installed.
>
> The access time of all deleted or semi-deleted dirs is almost the same,

Did you look carefully in /var/log/cron around that same time?

Another thought: give us your filesystem layout.

> but I didn't find any weird actions in the logs. At first I thought that portsnap (run by cron) had somehow corrupted my system, but it was executed about eight hours earlier. No one but me has access to this box, so it's unlikely to be a mean joke.

Is this system connected to the internet? Does it run an http server? If so, check the logs.

> So, please, please, help me. I really do not know what I am supposed to do now. I can't find out why this happened, so it would be useless to just reinstall the system -- I'll have this situation again. All this has happened twice

Same time, same day of the week?

> -- so it is not some kind of glitch (last time I cvsuped sources and ports and thought that was the reason for the crash).

Maybe I'm not helpful, but at least you feel less lonely ...
Re: libutempter
On 01/14/2012 09:47, Andre Goree wrote:
> On Tue, 10 Jan 2012 19:58:13 -0600, Andre Goree <an...@drenet.info> wrote:
>> I recently csup'd 9-STABLE and was able to get it working along with my custom kernel. I'm now in the process of rebuilding all my ports, and I've come across something when running 'portmaster -af' that I can't seem to find any information on.
>>
>> === Launching child to reinstall libutempter-1.1.5_1
>> === Port directory: /usr/ports/sysutils/libutempter
>> === This port is marked IGNORE
>> === is now contained in the base system
>> === If you are sure you can build it, remove the IGNORE line in the Makefile and try again.
>> === Update for libutempter-1.1.5_1 failed
>> === Aborting update
>> Terminated
>>
>> I figure, ok, I'll just delete the package and move on. However, there are many packages I have installed that depend on libutempter. I would still just proceed with the removal given that the functionality is provided in base now, however I don't want to break all these ports and have to deal with the mess when I portmaster -af again.
>>
>> What is the recommended action here? Should I just force exclude that port from the upgrade? That's probably the easiest way but I'd have to deal with this at some point.
>>
>> Thanks in advance for any advice
>>
>> --
>> Andre Goree
>> andre@drenetinfo
>
> So I've rebuilt everything that I could, but when I get to the ports that depend on libutempter, I get an error that they could not be reinstalled due to a failure with libutempter :/
>
> --- Skipping 'www/opera' (opera-11.60) because a requisite package 'libutempter-1.1.5_1' (sysutils/libutempter) failed (specify -k to force)
> --- Skipping 'www/opera-linuxplugins' (opera-linuxplugins-11.60) because a requisite package 'opera-11.60' (www/opera) failed (specify -k to force)
> --- Skipping 'deskutils/kdeplasma-addons' (kdeplasma-addons-4.7.3) because a requisite package 'kdepimlibs-4.7.3' (deskutils/kdepimlibs4) failed (specify -k to force)
> --- Skipping 'graphics/libkdcraw-kde4' (libkdcraw-4.7.3) because a requisite package 'libutempter-1.1.5_1' (sysutils/libutempter) failed (specify -k to force)
>
> I installed misc/compat8x, however it informed me that I'd need to add to the kernel conf. When I try to do that, I'm met with this error:
>
> /usr/src/sys/amd64/conf/DESKTOPKERN9: unknown option COMPAT_FREEBSD8
> *** Error code 1
>
> Stop in /usr/src.
>
> Which is weird, because:
>
> [root@desktop src]# uname -r
> 9.0-STABLE
>
> Meaning I'm certainly running 9.0-STABLE. So what gives re: that error above about unknown option? I even tried to csup source and buildworld again, but to no avail -- the error remains.

I upgrade my ports with portupgrade. After removing libutempter I just run `pkgdb -Fu' and then I can proceed with the update of the dependent ports.

I don't need compat8x.

Henri
Re: FreeBSD 9 recompile ports
On 01/14/2012 09:46, Matthew Seaman wrote:
> On 13/01/2012 22:57, Andriy Gapon wrote:
>> But if the appropriate misc/compatX port is installed, then those libraries do actually exist and the system should be fully usable...
>
> Modulo the compat libraries not working with the new kernel as Kostik has pointed out.
>
> As soon as you update or install an application after this point, you are likely to end up with an application that tries to dynamically link two different versions of the same shlib, and that is a recipe for tears-before-bedtime.

This /etc/libmap.conf helped me greatly when I reinstalled all my ports after 9.0-BETA2 and make delete-old-libs:

libsbuf.so.5       libsbuf.so.6
libz.so.5          libz.so.6
libutil.so.8       libutil.so.9
libcam.so.5        libcam.so.6
libpcap.so.7       libpcap.so.8
libufs.so.5        libufs.so.6
libbsnmp.so.5      libbsnmp.so.6
libdwarf.so.2      libdwarf.so.3
libopie.so.6       libopie.so.7
librtld_db.so.1    librtld_db.so.2
libtacplus.so.4    libtacplus.so.5

Henri

> Cheers,
>
> Matthew
Re: FreeBSD 9 recompile ports
On 01/14/2012 11:37, Jeremy Chadwick wrote:
> On Sat, Jan 14, 2012 at 11:29:00AM +0100, Henri Hennebert wrote:
>> On 01/14/2012 09:46, Matthew Seaman wrote:
>>> On 13/01/2012 22:57, Andriy Gapon wrote:
>>>> But if the appropriate misc/compatX port is installed, then those libraries do actually exist and the system should be fully usable...
>>>
>>> Modulo the compat libraries not working with the new kernel as Kostik has pointed out.
>>>
>>> As soon as you update or install an application after this point, you are likely to end up with an application that tries to dynamically link two different versions of the same shlib, and that is a recipe for tears-before-bedtime.
>>
>> This /etc/libmap.conf helped me greatly when I reinstalled all my ports
                                                 ---
>> after 9.0-BETA2 and make delete-old-libs:
>>
>> libsbuf.so.5       libsbuf.so.6
>> libz.so.5          libz.so.6
>> libutil.so.8       libutil.so.9
>> libcam.so.5        libcam.so.6
>> libpcap.so.7       libpcap.so.8
>> libufs.so.5        libufs.so.6
>> libbsnmp.so.5      libbsnmp.so.6
>> libdwarf.so.2      libdwarf.so.3
>> libopie.so.6       libopie.so.7
>> librtld_db.so.1    librtld_db.so.2
>> libtacplus.so.4    libtacplus.so.5
>
> This is very, VERY, ***VERY*** dangerous. Apparently nobody has explained why, so I will:
>
> When a linked library version number (N of libfoo.so.N) increases or changes, it indicates there are API/ABI changes to the library. There is absolutely ZERO guarantee that calling semantics are the same, that function arguments (thus stack order) are the same, or that structures used internally by the library are the same.
>
> The effects of this can be devastating -- if you're lucky it'll consist of just a missing symbol, but it can be a lot worse.
>
> The TL;DR version is: there is absolutely ZERO guarantee that the internal operations and calling semantics of the libraries are identical.
>
> Folks reading this thread, PLEASE do not follow the above advice and leave your system running in that kind of state. Instead of being lazy,

I don't want to argue too much, but you did not read me correctly. I only do this while I REINSTALL ALL PORTS, and then I delete /etc/libmap.conf. Of course, I'm not crazy!

> rebuild all your ports from scratch or pull down new binary copies (pkg_add -r ...) for the version of the OS you're running.
>
> Doug and I have the same opinion when it comes to this situation, and it's based purely on experience. Schedule downtime, spend an afternoon rebuilding things, whatever -- just do it the Right Way(tm) please. Otherwise you're creating a lot of support hassle when it comes to trying to diagnose why some program on your system behaves oddly -- weeks go by, oh, libmap.conf...
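The concern in this exchange is the N in libfoo.so.N. As a rough illustration only, the Python sketch below reads mapping pairs in the same two-column form as the libmap.conf quoted above and lists which major-version jump each line papers over. The helper name `major_version_jumps` and the three-line sample are hypothetical and not part of the thread; the sketch does not inspect any real library.

```python
import re

# Matches names such as "libz.so.5": everything up to ".so", then the major number.
SO_NAME = re.compile(r"^(?P<base>.+\.so)\.(?P<major>\d+)$")


def major_version_jumps(libmap_text):
    """Return (library, old_major, new_major) for each mapping that changes N."""
    jumps = []
    for line in libmap_text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop comments and whitespace
        if not line or line.startswith("["):    # skip blanks and [constraint] lines
            continue
        fields = line.split()
        if len(fields) != 2:
            continue
        old, new = (SO_NAME.match(f) for f in fields)
        if old and new and old["base"] == new["base"] and old["major"] != new["major"]:
            jumps.append((old["base"], int(old["major"]), int(new["major"])))
    return jumps


# Hypothetical sample using three of the pairs from the message above.
sample = """\
libz.so.5        libz.so.6
libutil.so.8     libutil.so.9
librtld_db.so.1  librtld_db.so.2
"""
jumps = major_version_jumps(sample)
```

Every entry returned is a case where a binary built against the old major version is being handed the new one, which is exactly the situation the warning is about.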
Tripwire segmentation fault - amd64 - 9.0-RC2 in _malloc_postfork
Hello,

On 2 systems running 9.0-RC2 amd64, tripwire segfaults.

The problem occurs during `tripwire --check` after +/- 20 minutes of execution.

Here is the bt:

[root@tignes tripwire]# gdb ./tripwire
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
(gdb) run --check
Starting program: /usr/ports/security/tripwire/work/tripwire-2.4.1.2-src/src/tripwire/tripwire --check
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...Parsing policy file: /usr/local/etc/tripwire/tw.pol
*** Processing Unix File System ***
Performing integrity check...
The object: /var/spool/httpd/tignes/htdocs/Xfer/Henri_2006_11_24 is on a different file system...ignoring.

Program received signal SIGSEGV, Segmentation fault.
0x0008014efb12 in _malloc_postfork () from /lib/libc.so.7
(gdb) bt
#0 0x0008014efb12 in _malloc_postfork () from /lib/libc.so.7
#1 0x0008014ef158 in realloc () from /lib/libc.so.7
#2 0x0008014ef385 in free () from /lib/libc.so.7
#3 0x004c6181 in cFileUtil::IsRegularFile ()
#4 0x00499dbb in WriteObject ()
#5 0x0049c429 in cTWUtil::WriteReport ()
#6 0x0042709a in cTWModeIC::Execute ()
#7 0x0041cf85 in main ()

Under 9.0-RC1 I encountered no problem at all.

Henri

PS - on another system under 9.0-RC2 i386, tripwire runs smoothly.
Re: ZFS boot inside on the second partition inside a slice
On 06/21/2011 23:27, John Baldwin wrote: On Tuesday, June 21, 2011 4:13:20 pm Henri Hennebert wrote: On 06/21/2011 21:25, John Baldwin wrote: and I get: Read error: 04 Hmm, that is the error for an invalid sector. Try this patch. It reshuffles a few more things and adds code to dump the low 32-bits of the LBA on an error: Index: zfsldr.S === --- zfsldr.S(revision 223365) +++ zfsldr.S(working copy) @@ -16,7 +16,6 @@ */ /* Memory Locations */ - .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area @@ -91,26 +90,18 @@ main: cld # String ops inc mov %cx,%ss # Set up mov $start,%sp # stack /* - * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets - * %cx == 0x100. - */ - mov %sp,%si # Source - mov $MEM_REL,%di# Destination - incb %ch# Word count - rep # Copy - movsw # code -/* * If we are on a hard drive, then load the MBR and look for the first * FreeBSD slice. We use the fake partition entry below that points to * the MBR when we call nread. The first pass looks for the first active * FreeBSD slice. The second pass looks for the first non-active FreeBSD * slice if the first one fails. */ - mov $part4,%si # Partition + mov $part4,%si # Dummy partition cmpb $0x80,%dl # Hard drive? jb main.4 # No - movb $0x1,%dh # Block count - callw nread # Read MBR + xor %eax,%eax # Read MBR + movw $MEM_BUF,%bx # from first + callw nread # sector mov $0x1,%cx# Two passes main.1: mov $MEM_BUF+PRT_OFF,%si# Partition table movb $0x1,%dh # Partition @@ -161,10 +152,16 @@ main.4: xor %dx,%dx # Partition:drive * area and target area do not overlap. 
*/ main.5: mov %dx,MEM_ARG # Save args - movb $NSECT,%dh # Sector count + mov $NSECT,%cx # Sector count movl $1024,%eax # Offset to boot2 - callw nread.1 # Read disk -main.6:mov $MEM_BUF,%si# BTX (before reloc) + mov $MEM_BUF,%bx# Destination buffer +main.6:pushal # Save params + callw nread # Read disk + popal # Restore + incl %eax # Update for + add $SIZ_SEC,%bx# next sector + loop main.6 # If not last, read another + mov $MEM_BUF,%si# BTX (before reloc) mov 0xa(%si),%bx# Get BTX length and set mov $NSECT*SIZ_SEC-1,%di# Size of load area (less one) mov %di,%si # End of load @@ -214,29 +211,35 @@ seta20.3: sti # Enable interrupts * packet on the stack and passes it to read. * * %eax - int - LBA to read in relative to partition start + * %es:%bx - ptr - destination address * %dl- byte- drive to read from - * %dh - byte- num sectors to read * %si- ptr - MBR partition entry */ -nread: xor %eax,%eax # Sector offset in partition -nread.1: xor %ecx,%ecx # Get +nread: xor %ecx,%ecx # Get addl 0x8(%si),%eax # LBA adc $0,%ecx pushl %ecx # Starting absolute block pushl %eax # block number push %es# Address of - push $MEM_BUF # transfer buffer - xor %ax,%ax # Number of - movb %dh,%al# blocks to - push %ax# transfer + push %bx# transfer buffer + push $0x1 # Read 1 sector push
Re: ZFS boot inside on the second partition inside a slice
On 06/22/2011 16:19, Henri Hennebert wrote: On 06/22/2011 15:57, John Baldwin wrote: On Wednesday, June 22, 2011 7:34:05 am Henri Hennebert wrote: I get LBA: 8200 Read error: 04 Odd. Oh, I fubar'd and read the wrong thing for the sector. Also, we should leave the EDD packet on the stack so it doesn't get trashed by calling hex8, etc. Please try this: Index: zfsldr.S === --- zfsldr.S (revision 223365) +++ zfsldr.S (working copy) @@ -16,7 +16,6 @@ */ /* Memory Locations */ - .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area @@ -91,26 +90,18 @@ main: cld # String ops inc mov %cx,%ss # Set up mov $start,%sp # stack /* - * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets - * %cx == 0x100. - */ - mov %sp,%si # Source - mov $MEM_REL,%di # Destination - incb %ch # Word count - rep # Copy - movsw # code -/* * If we are on a hard drive, then load the MBR and look for the first * FreeBSD slice. We use the fake partition entry below that points to * the MBR when we call nread. The first pass looks for the first active * FreeBSD slice. The second pass looks for the first non-active FreeBSD * slice if the first one fails. */ - mov $part4,%si # Partition + mov $part4,%si # Dummy partition cmpb $0x80,%dl # Hard drive? jb main.4 # No - movb $0x1,%dh # Block count - callw nread # Read MBR + xor %eax,%eax # Read MBR + movw $MEM_BUF,%bx # from first + callw nread # sector mov $0x1,%cx # Two passes main.1: mov $MEM_BUF+PRT_OFF,%si # Partition table movb $0x1,%dh # Partition @@ -161,10 +152,16 @@ main.4: xor %dx,%dx # Partition:drive * area and target area do not overlap. 
*/ main.5: mov %dx,MEM_ARG # Save args - movb $NSECT,%dh # Sector count + mov $NSECT,%cx # Sector count movl $1024,%eax # Offset to boot2 - callw nread.1 # Read disk -main.6: mov $MEM_BUF,%si # BTX (before reloc) + mov $MEM_BUF,%bx # Destination buffer +main.6: pushal # Save params + callw nread # Read disk + popal # Restore + incl %eax # Update for + add $SIZ_SEC,%bx # next sector + loop main.6 # If not last, read another + mov $MEM_BUF,%si # BTX (before reloc) mov 0xa(%si),%bx # Get BTX length and set mov $NSECT*SIZ_SEC-1,%di # Size of load area (less one) mov %di,%si # End of load @@ -214,29 +211,35 @@ seta20.3: sti # Enable interrupts * packet on the stack and passes it to read. * * %eax - int - LBA to read in relative to partition start + * %es:%bx - ptr - destination address * %dl - byte - drive to read from - * %dh - byte - num sectors to read * %si - ptr - MBR partition entry */ -nread: xor %eax,%eax # Sector offset in partition -nread.1: xor %ecx,%ecx # Get +nread: xor %ecx,%ecx # Get addl 0x8(%si),%eax # LBA adc $0,%ecx pushl %ecx # Starting absolute block pushl %eax # block number push %es # Address of - push $MEM_BUF # transfer buffer - xor %ax,%ax # Number of - movb %dh,%al # blocks to - push %ax # transfer + push %bx # transfer buffer + push $0x1 # Read 1 sector push $0x10 # Size of packet mov %sp,%bp # Packet pointer callw read # Read from disk + jc nread.1 # If error, fail lea 0x10(%bp),%sp # Clear stack - jnc return # If success, return - mov $msg_read,%si # Otherwise, set the error - # message and fall through to - # the error routine + ret # If success, return +nread.1: mov %ah,%al # Format + mov $read_err,%di # error + call hex8 # code + movl 0x8(%bp),%eax # Format + mov $lba,%di # LBA + call hex32 + mov $msg_lba,%si # Display + call putstr # LBA + mov $msg_read,%si # Set the error message and + # fall through to the error + # routine /* * Print out the error message pointed to by %ds:(%si) followed * by a prompt, wait for a keypress, and then 
reboot the machine. @@ -259,14 +262,6 @@ putstr: lodsb # Get char jne putstr.0 # No /* - * Overused return code. ereturn is used to return an error from the - * read function. Since we assume putstr succeeds, we (ab)use the - * same code when we return from putstr. - */ -ereturn: movb $0x1,%ah # Invalid - stc # argument -return: retw # To caller -/* * Reads sectors from the disk. If EDD is enabled, then check if it is * installed and use it if it is. If it is not installed or not enabled, then * fall back to using CHS. Since we use a LBA, if we are using CHS, we have to @@ -294,14 +289,38 @@ read: cmpb $0x80,%dl # Hard drive? retw # To caller read.1: mov $msg_chs,%si jmp error -msg_chs: .asciz CHS not supported +/* + * Convert EAX, AX, or AL to hex, saving the result to [EDI]. + */ +hex32: pushl %eax # Save + shrl $0x10,%eax # Do upper + call hex16 # 16 + popl %eax # Restore +hex16: call hex16.1 # Do upper 8 +hex16.1: xchgb %ah,%al # Save/restore +hex8: push %ax # Save + shrb $0x4,%al # Do upper + call hex8.1 # 4 + pop %ax # Restore +hex8.1: andb $0xf,%al # Get lower 4 + cmpb $0xa,%al # Convert + sbbb $0x69,%al # to hex + das # digit + orb $0x20,%al # To lower case + stosb # Save char + ret # (Recursive) + /* Messages
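The `nread` routine in the patch quoted above builds an EDD disk address packet on the stack (the sequence of pushes ending in `push $0x10`) before calling `read`. As a rough illustration only, the layout those pushes produce can be modelled in Python with `struct`. `build_edd_packet` is a hypothetical helper, the sample values are arbitrary, and the field descriptions follow the usual INT 13h extensions layout rather than anything stated in the thread.

```python
import struct


def build_edd_packet(lba, segment, offset, count=1):
    """Pack a 16-byte disk address packet in the order nread pushes it."""
    return struct.pack(
        "<BBHHHQ",
        0x10,       # push $0x10: size of packet (low byte of the pushed word)
        0,          # high byte of that word, reserved
        count,      # push $0x1: read 1 sector
        offset,     # push %bx: transfer buffer offset
        segment,    # push %es: transfer buffer segment
        lba,        # pushl %eax then pushl %ecx: 64-bit absolute block number
    )


# Arbitrary sample values, chosen only to show the byte offsets.
packet = build_edd_packet(lba=0x8200, segment=0x0000, offset=0x8000)

# The error path does `movl 0x8(%bp),%eax`: the low 32 bits of the LBA
# live at offset 8 of the packet.
low_lba = struct.unpack_from("<I", packet, 8)[0]
```

This also shows why the patch keeps the packet on the stack until after the error message is formatted: the LBA printed by `hex32` is read back out of the packet itself.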
Re: ZFS boot inside on the second partition inside a slice
On 06/22/2011 15:57, John Baldwin wrote: On Wednesday, June 22, 2011 7:34:05 am Henri Hennebert wrote: I get LBA: 8200 Read error: 04 Odd. Oh, I fubar'd and read the wrong thing for the sector. Also, we should leave the EDD packet on the stack so it doesn't get trashed by calling hex8, etc. Please try this: Index: zfsldr.S === --- zfsldr.S(revision 223365) +++ zfsldr.S(working copy) @@ -16,7 +16,6 @@ */ /* Memory Locations */ - .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area @@ -91,26 +90,18 @@ main: cld # String ops inc mov %cx,%ss # Set up mov $start,%sp # stack /* - * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets - * %cx == 0x100. - */ - mov %sp,%si # Source - mov $MEM_REL,%di# Destination - incb %ch# Word count - rep # Copy - movsw # code -/* * If we are on a hard drive, then load the MBR and look for the first * FreeBSD slice. We use the fake partition entry below that points to * the MBR when we call nread. The first pass looks for the first active * FreeBSD slice. The second pass looks for the first non-active FreeBSD * slice if the first one fails. */ - mov $part4,%si # Partition + mov $part4,%si # Dummy partition cmpb $0x80,%dl # Hard drive? jb main.4 # No - movb $0x1,%dh # Block count - callw nread # Read MBR + xor %eax,%eax # Read MBR + movw $MEM_BUF,%bx # from first + callw nread # sector mov $0x1,%cx# Two passes main.1: mov $MEM_BUF+PRT_OFF,%si# Partition table movb $0x1,%dh # Partition @@ -161,10 +152,16 @@ main.4: xor %dx,%dx # Partition:drive * area and target area do not overlap. 
*/ main.5: mov %dx,MEM_ARG # Save args - movb $NSECT,%dh # Sector count + mov $NSECT,%cx # Sector count movl $1024,%eax # Offset to boot2 - callw nread.1 # Read disk -main.6:mov $MEM_BUF,%si# BTX (before reloc) + mov $MEM_BUF,%bx# Destination buffer +main.6:pushal # Save params + callw nread # Read disk + popal # Restore + incl %eax # Update for + add $SIZ_SEC,%bx# next sector + loop main.6 # If not last, read another + mov $MEM_BUF,%si# BTX (before reloc) mov 0xa(%si),%bx# Get BTX length and set mov $NSECT*SIZ_SEC-1,%di# Size of load area (less one) mov %di,%si # End of load @@ -214,29 +211,35 @@ seta20.3: sti # Enable interrupts * packet on the stack and passes it to read. * * %eax - int - LBA to read in relative to partition start + * %es:%bx - ptr - destination address * %dl- byte- drive to read from - * %dh - byte- num sectors to read * %si- ptr - MBR partition entry */ -nread: xor %eax,%eax # Sector offset in partition -nread.1: xor %ecx,%ecx # Get +nread: xor %ecx,%ecx # Get addl 0x8(%si),%eax # LBA adc $0,%ecx pushl %ecx # Starting absolute block pushl %eax # block number push %es# Address of - push $MEM_BUF # transfer buffer - xor %ax,%ax # Number of - movb %dh,%al# blocks to - push %ax# transfer + push %bx# transfer buffer + push $0x1 # Read 1 sector push $0x10
Re: ZFS boot inside on the second partition inside a slice
On 06/22/2011 16:23, Henri Hennebert wrote: On 06/22/2011 16:19, Henri Hennebert wrote: On 06/22/2011 15:57, John Baldwin wrote: On Wednesday, June 22, 2011 7:34:05 am Henri Hennebert wrote: I get LBA: 8200 Read error: 04 Odd. Oh, I fubar'd and read the wrong thing for the sector. Also, we should leave the EDD packet on the stack so it doesn't get trashed by calling hex8, etc. Please try this: Index: zfsldr.S === --- zfsldr.S (revision 223365) +++ zfsldr.S (working copy) @@ -16,7 +16,6 @@ */ /* Memory Locations */ - .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area @@ -91,26 +90,18 @@ main: cld # String ops inc mov %cx,%ss # Set up mov $start,%sp # stack /* - * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets - * %cx == 0x100. - */ - mov %sp,%si # Source - mov $MEM_REL,%di # Destination - incb %ch # Word count - rep # Copy - movsw # code -/* * If we are on a hard drive, then load the MBR and look for the first * FreeBSD slice. We use the fake partition entry below that points to * the MBR when we call nread. The first pass looks for the first active * FreeBSD slice. The second pass looks for the first non-active FreeBSD * slice if the first one fails. */ - mov $part4,%si # Partition + mov $part4,%si # Dummy partition cmpb $0x80,%dl # Hard drive? jb main.4 # No - movb $0x1,%dh # Block count - callw nread # Read MBR + xor %eax,%eax # Read MBR + movw $MEM_BUF,%bx # from first + callw nread # sector mov $0x1,%cx # Two passes main.1: mov $MEM_BUF+PRT_OFF,%si # Partition table movb $0x1,%dh # Partition @@ -161,10 +152,16 @@ main.4: xor %dx,%dx # Partition:drive * area and target area do not overlap. 
*/ main.5: mov %dx,MEM_ARG # Save args - movb $NSECT,%dh # Sector count + mov $NSECT,%cx # Sector count movl $1024,%eax # Offset to boot2 - callw nread.1 # Read disk -main.6: mov $MEM_BUF,%si # BTX (before reloc) + mov $MEM_BUF,%bx # Destination buffer +main.6: pushal # Save params + callw nread # Read disk + popal # Restore + incl %eax # Update for + add $SIZ_SEC,%bx # next sector + loop main.6 # If not last, read another + mov $MEM_BUF,%si # BTX (before reloc) mov 0xa(%si),%bx # Get BTX length and set mov $NSECT*SIZ_SEC-1,%di # Size of load area (less one) mov %di,%si # End of load @@ -214,29 +211,35 @@ seta20.3: sti # Enable interrupts * packet on the stack and passes it to read. * * %eax - int - LBA to read in relative to partition start + * %es:%bx - ptr - destination address * %dl - byte - drive to read from - * %dh - byte - num sectors to read * %si - ptr - MBR partition entry */ -nread: xor %eax,%eax # Sector offset in partition -nread.1: xor %ecx,%ecx # Get +nread: xor %ecx,%ecx # Get addl 0x8(%si),%eax # LBA adc $0,%ecx pushl %ecx # Starting absolute block pushl %eax # block number push %es # Address of - push $MEM_BUF # transfer buffer - xor %ax,%ax # Number of - movb %dh,%al # blocks to - push %ax # transfer + push %bx # transfer buffer + push $0x1 # Read 1 sector push $0x10 # Size of packet mov %sp,%bp # Packet pointer callw read # Read from disk + jc nread.1 # If error, fail lea 0x10(%bp),%sp # Clear stack - jnc return # If success, return - mov $msg_read,%si # Otherwise, set the error - # message and fall through to - # the error routine + ret # If success, return +nread.1: mov %ah,%al # Format + mov $read_err,%di # error + call hex8 # code + movl 0x8(%bp),%eax # Format + mov $lba,%di # LBA + call hex32 + mov $msg_lba,%si # Display + call putstr # LBA + mov $msg_read,%si # Set the error message and + # fall through to the error + # routine /* * Print out the error message pointed to by %ds:(%si) followed * by a prompt, wait for a keypress, and then 
reboot the machine. @@ -259,14 +262,6 @@ putstr: lodsb # Get char jne putstr.0 # No /* - * Overused return code. ereturn is used to return an error from the - * read function. Since we assume putstr succeeds, we (ab)use the - * same code when we return from putstr. - */ -ereturn: movb $0x1,%ah # Invalid - stc # argument -return: retw # To caller -/* * Reads sectors from the disk. If EDD is enabled, then check if it is * installed and use it if it is. If it is not installed or not enabled, then * fall back to using CHS. Since we use a LBA, if we are using CHS, we have to @@ -294,14 +289,38 @@ read: cmpb $0x80,%dl # Hard drive? retw # To caller read.1: mov $msg_chs,%si jmp error -msg_chs: .asciz CHS not supported +/* + * Convert EAX, AX, or AL to hex, saving the result to [EDI]. + */ +hex32: pushl %eax # Save + shrl $0x10,%eax # Do upper + call hex16 # 16 + popl %eax # Restore +hex16: call hex16.1 # Do upper 8 +hex16.1: xchgb %ah,%al # Save/restore +hex8: push %ax # Save + shrb $0x4,%al # Do upper + call hex8.1 # 4 + pop %ax # Restore +hex8.1: andb $0xf,%al # Get lower 4 + cmpb $0xa,%al # Convert + sbbb $0x69,%al # to hex + das # digit + orb $0x20,%al # To lower case + stosb # Save char
Re: ZFS boot inside on the second partition inside a slice
On 06/22/2011 17:58, John Baldwin wrote: Index: zfsldr.S === --- zfsldr.S(revision 223365) +++ zfsldr.S(working copy) @@ -16,7 +16,6 @@ */ /* Memory Locations */ - .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area @@ -91,26 +90,18 @@ main: cld # String ops inc mov %cx,%ss # Set up mov $start,%sp # stack /* - * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets - * %cx == 0x100. - */ - mov %sp,%si # Source - mov $MEM_REL,%di# Destination - incb %ch# Word count - rep # Copy - movsw # code -/* * If we are on a hard drive, then load the MBR and look for the first * FreeBSD slice. We use the fake partition entry below that points to * the MBR when we call nread. The first pass looks for the first active * FreeBSD slice. The second pass looks for the first non-active FreeBSD * slice if the first one fails. */ - mov $part4,%si # Partition + mov $part4,%si # Dummy partition cmpb $0x80,%dl # Hard drive? jb main.4 # No - movb $0x1,%dh # Block count - callw nread # Read MBR + xor %eax,%eax # Read MBR + movw $MEM_BUF,%bx # from first + callw nread # sector mov $0x1,%cx# Two passes main.1: mov $MEM_BUF+PRT_OFF,%si# Partition table movb $0x1,%dh # Partition @@ -143,32 +134,35 @@ main.4: xor %dx,%dx # Partition:drive * (i.e. after the two vdev labels). We don't have do anything fancy * here to allow for an extra copy of boot1 and a partition table * (compare to this section of the UFS bootstrap) so we just load it - * all at 0x8000. The first part of boot2 is BTX, which wants to run + * all at 0x9000. The first part of boot2 is BTX, which wants to run * at 0x9000. The boot2.bin binary starts right after the end of BTX, * so we have to figure out where the start of it is and then move the - * binary to 0xc000. After we have moved the client, we relocate BTX - * itself to 0x9000 - doing it in this order means that none of the - * memcpy regions overlap which would corrupt the copy. 
Normally, BTX - * clients start at MEM_USR, or 0xa000, but when we use btxld to - * create zfsboot2, we use an entry point of 0x2000. That entry point is - * relative to MEM_USR; thus boot2.bin starts at 0xc000. + * binary to 0xc000. Normally, BTX clients start at MEM_USR, or 0xa000, + * but when we use btxld to create zfsboot2, we use an entry point of + * 0x2000. That entry point is relative to MEM_USR; thus boot2.bin + * starts at 0xc000. * * The load area and the target area for the client overlap so we have * to use a decrementing string move. We also play segment register * games with the destination address for the move so that the client * can be larger than 16k (which would overflow the zero segment since - * the client starts at 0xc000). Relocating BTX is easy since the load - * area and target area do not overlap. + * the client starts at 0xc000). */ main.5: mov %dx,MEM_ARG # Save args - movb $NSECT,%dh # Sector count + mov $NSECT,%cx # Sector count movl $1024,%eax # Offset to boot2 - callw nread.1 # Read disk -main.6:mov $MEM_BUF,%si# BTX (before reloc) + mov $MEM_BTX,%bx# Destination buffer +main.6:pushal # Save params + callw nread # Read disk + popal # Restore + incl %eax # Update for + add $SIZ_SEC,%bx# next sector + loop main.6 # If not last, read another + mov $MEM_BTX,%si# BTX mov 0xa(%si),%bx# Get BTX length and set mov $NSECT*SIZ_SEC-1,%di# Size of load area (less one) mov %di,%si # End of load - add $MEM_BUF,%si# area +
Re: ZFS boot inside on the second partition inside a slice
On 06/20/2011 15:51, John Baldwin wrote:
> On Saturday, June 18, 2011 5:04:07 am Henri Hennebert wrote:
>> On 06/17/2011 19:37, John Baldwin wrote:
>>> On Friday, June 17, 2011 1:06:22 pm Henri Hennebert wrote:
>>>> On 06/16/2011 19:35, John Baldwin wrote:
>>>>> On Thursday, June 16, 2011 8:45:41 am Zhihao Yuan wrote:
>>>>>> Exactly. The MFCed ZFSv28 is different from any patch maintained
>>>>>> by mm@. Maybe some untested changes involved.
>>>>>
>>>>> Can you try reverting this change:
>>>>>
>>>>> Author: jhb
>>>>> Date: Thu Apr 28 17:44:24 2011
>>>>> New Revision: 221177
>>>>> URL: http://svn.freebsd.org/changeset/base/221177
>>>>>
>>>>> Log:
>>>>>   Due to space constraints, the UFS boot2 and boot1 use an evil hack
>>>>>   where boot2 calls back into boot1 to perform disk reads. The ZFS MBR
>>>>>   boot blocks do not have the same space constraints, so remove this
>>>>>   hack for ZFS. While here, remove commented out code to support C/H/S
>>>>>   addressing from zfsldr. The ZFS and GPT bootstraps always just use
>>>>>   EDD LBA addressing.
>>>>>
>>>>>   MFC after: 2 weeks
>>>>>
>>>>> Modified:
>>>>>   head/sys/boot/i386/boot2/Makefile
>>>>>   head/sys/boot/i386/common/drv.c
>>>>>   head/sys/boot/i386/zfsboot/Makefile
>>>>>   head/sys/boot/i386/zfsboot/zfsldr.S
>>>>
>>>> I try with this revision (221177) reverted to no avail:
>>>> same error - 'read error'
>>>
>>> Hmm, ok. No other ideas off the top of my head.
>>
>> I make the same test under virtualbox and get:
>>
>> A critical error has occurred while running the virtual machine and the
>> machine execution has been stopped.
>>
>> I attach VBox.log.
>>
>> PS - the message 'ZFS: supported version 28' comes from my patch:
>>
>> Index: sys/boot/zfs/zfsimpl.c
>> ===================================================================
>> --- sys/boot/zfs/zfsimpl.c    (revision 212549)
>> +++ sys/boot/zfs/zfsimpl.c    (working copy)
>> @@ -61,6 +61,8 @@
>>      STAILQ_INIT(&zfs_vdevs);
>>      STAILQ_INIT(&zfs_pools);
>>
>> +    printf("ZFS: supported version %u\n", (unsigned) SPA_VERSION);
>> +
>>      zfs_temp_buf = malloc(TEMP_SIZE);
>>      zfs_temp_end = zfs_temp_buf + TEMP_SIZE;
>>      zfs_temp_ptr = zfs_temp_buf;
>
> Hmm, can you add printfs and narrow down where the hang happens (or which
> reads are failing)? The VBOX log seems to make no sense. It shows the CPU
> trying to call into the BIOS from within protected mode in the loader but
> that shouldn't ever happen (note a cs of 0x2b (which is the loader's %cs
> selector) but an eip that looks like a cs:ip of a BIOS routine).

I just tried adding printf calls, but I get only 'Read error' without any
of my printf output. Previously, even my printf in zfs_init did not show up
on the console of my netbook; under VBox it was printed. Maybe printf is not
allowed so early in zfsboot?

For the record, I wrote the boot code with these 2 commands after booting
with mfsbsd (from mm@) and fetching zfsboot into /tmp:

dd if=/tmp/zfsboot of=/dev/ad0s2a bs=512 count=1
dd if=/tmp/zfsboot of=/dev/ad0s2a bs=512 skip=1 seek=1024

My debugging patch in zfsboot.c:

[root@morzine zfsboot]# svn diff zfsboot.c
Index: zfsboot.c
===================================================================
--- zfsboot.c    (revision 223081)
+++ zfsboot.c    (working copy)
@@ -447,10 +447,16 @@
     off_t off;
     struct dsk *dsk;

+    printf("==trying to boot\n");
+
     dmadat = (void *)(roundup2(__base + (int32_t)&_end, 0x10000) - __base);

+    printf("==about to call bios_getmem()\n");
+
     bios_getmem();

+    printf("==bios_getmem() completed\n");
+
     if (high_heap_size > 0) {
         heap_end = PTOV(high_heap_base + high_heap_size);
         heap_next = PTOV(high_heap_base);
@@ -482,6 +488,8 @@
     autoboot = 1;

+    printf("==about to call zfs_init()\n");
+
     zfs_init();

     /*

Henri
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
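The two dd commands above split the zfsboot image into a first sector written at the start of the target and a remainder written 1024 sectors in, which matches the "Offset to boot2" of 1024 used by zfsldr later in this thread. A minimal sketch of that arithmetic, assuming 512-byte sectors; the function name and the 65536-byte image size in the usage note are illustrative, not taken from the thread:

```python
SECTOR = 512  # bytes per sector, matching bs=512 in the dd commands


def zfsboot_layout(image_size, seek_sectors=1024):
    """Return (source_offset, dest_offset, length) in bytes for each dd pass."""
    if image_size <= SECTOR:
        raise ValueError("image must be larger than one sector")
    # Pass 1: dd bs=512 count=1 -> first sector copied to offset 0.
    first = (0, 0, SECTOR)
    # Pass 2: dd bs=512 skip=1 seek=1024 -> skip one input sector,
    # start writing 1024 sectors into the output.
    rest = (SECTOR, seek_sectors * SECTOR, image_size - SECTOR)
    return [first, rest]
```

For a hypothetical 65536-byte image, zfsboot_layout(65536) gives one 512-byte write at offset 0 and one 65024-byte write at byte offset 524288.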
Re: ZFS boot inside on the second partition inside a slice
On 06/21/2011 15:01, John Baldwin wrote:
> Index: zfsldr.S
> ===================================================================
> --- zfsldr.S    (revision 223339)
> +++ zfsldr.S    (working copy)
> @@ -234,9 +234,12 @@ nread.1:    xor %ecx,%ecx           # Get
>                  callw read              # Read from disk
>                  lea 0x10(%bp),%sp       # Clear stack
>                  jnc return              # If success, return
> -                mov $msg_read,%si       # Otherwise, set the error
> -                                        #  message and fall through to
> -                                        #  the error routine
> +                mov %ah,%al             # Format
> +                mov $read_err,%di       #  error
> +                call hex8               #  code
> +                mov $msg_read,%si       # Set the error message and
> +                                        #  fall through to the error
> +                                        #  routine
>  /*
>   * Print out the error message pointed to by %ds:(%si) followed
>   * by a prompt, wait for a keypress, and then reboot the machine.
> @@ -296,12 +299,28 @@ read.1:    mov $msg_chs,%si
>                  jmp error
>  msg_chs:        .asciz "CHS not supported"
>
> +/*
> + * Convert AL to hex, saving the result to [EDI].
> + */
> +hex8:           push %ax                # Save
> +                shrb $0x4,%al           # Do upper
> +                call hex8.1             #  4
> +                pop %ax                 # Restore
> +hex8.1:         andb $0xf,%al           # Get lower 4
> +                cmpb $0xa,%al           # Convert
> +                sbbb $0x69,%al          #  to hex
> +                das                     #  digit
> +                orb $0x20,%al           # To lower case
> +                stosb                   # Save char
> +                ret                     # (Recursive)
> +
>  /* Messages */
>
> -msg_read:       .asciz "Read"
> -msg_part:       .asciz "Boot"
> +msg_read:       .ascii "Read error: "
> +read_err:       .asciz "XX"
> +msg_part:       .asciz "Boot error"
>
> -prompt:         .asciz " error\r\n"
> +prompt:         .asciz "\r\n"
>
>                  .org PRT_OFF,0x90

I get

Read error: 01

Henri
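For readers who do not read 16-bit x86 assembly, the hex8 helper in the patch above formats the status byte from %ah as two lowercase hex digits and stores them over the "XX" placeholder. A rough Python equivalent follows; hex8 mirrors the assembly label, while read_error_message is a name made up for this illustration:

```python
def hex8(value: int) -> str:
    """Return one byte as two lowercase hex digits, high nibble first."""
    if not 0 <= value <= 0xFF:
        raise ValueError("hex8 expects a single byte (0..255)")
    digits = "0123456789abcdef"
    # The assembly converts the upper nibble first, then falls through
    # into hex8.1 to convert the lower nibble.
    return digits[value >> 4] + digits[value & 0x0F]


def read_error_message(ah_status: int) -> str:
    """Build the message the patched boot block prints for a failed read."""
    return "Read error: " + hex8(ah_status)
```

With a status byte of 0x01 this yields the "Read error: 01" line reported above.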
Re: ZFS boot inside on the second partition inside a slice
On 06/21/2011 17:55, John Baldwin wrote:
> On Tuesday, June 21, 2011 10:50:14 am Henri Hennebert wrote:
>> On 06/21/2011 15:01, John Baldwin wrote:
>>> Index: zfsldr.S
>>> ===================================================================
>>> --- zfsldr.S    (revision 223339)
>>> +++ zfsldr.S    (working copy)
>>> @@ -234,9 +234,12 @@ nread.1:    xor %ecx,%ecx           # Get
>>>                  callw read              # Read from disk
>>>                  lea 0x10(%bp),%sp       # Clear stack
>>>                  jnc return              # If success, return
>>> -                mov $msg_read,%si       # Otherwise, set the error
>>> -                                        #  message and fall through to
>>> -                                        #  the error routine
>>> +                mov %ah,%al             # Format
>>> +                mov $read_err,%di       #  error
>>> +                call hex8               #  code
>>> +                mov $msg_read,%si       # Set the error message and
>>> +                                        #  fall through to the error
>>> +                                        #  routine
>>>  /*
>>>   * Print out the error message pointed to by %ds:(%si) followed
>>>   * by a prompt, wait for a keypress, and then reboot the machine.
>>> @@ -296,12 +299,28 @@ read.1:    mov $msg_chs,%si
>>>                  jmp error
>>>  msg_chs:        .asciz "CHS not supported"
>>>
>>> +/*
>>> + * Convert AL to hex, saving the result to [EDI].
>>> + */
>>> +hex8:           push %ax                # Save
>>> +                shrb $0x4,%al           # Do upper
>>> +                call hex8.1             #  4
>>> +                pop %ax                 # Restore
>>> +hex8.1:         andb $0xf,%al           # Get lower 4
>>> +                cmpb $0xa,%al           # Convert
>>> +                sbbb $0x69,%al          #  to hex
>>> +                das                     #  digit
>>> +                orb $0x20,%al           # To lower case
>>> +                stosb                   # Save char
>>> +                ret                     # (Recursive)
>>> +
>>>  /* Messages */
>>>
>>> -msg_read:       .asciz "Read"
>>> -msg_part:       .asciz "Boot"
>>> +msg_read:       .ascii "Read error: "
>>> +read_err:       .asciz "XX"
>>> +msg_part:       .asciz "Boot error"
>>>
>>> -prompt:         .asciz " error\r\n"
>>> +prompt:         .asciz "\r\n"
>>>
>>>                  .org PRT_OFF,0x90
>>
>> I get
>>
>> Read error: 01
>
> Hmm, that would be 'invalid parameter'. Can you add a 'foo: jmp foo'
> infinite loop and move it around to figure out which read call is failing?

main.5:         mov %dx,MEM_ARG         # Save args
                movb $NSECT,%dh         # Sector count
                movl $1024,%eax         # Offset to boot2
                callw nread.1           # Read disk
foo:            jmp foo

After this one I get 'Read error: 01'

Henri
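The '01' printed above is the BIOS INT 13h status returned in %ah, which John reads as 'invalid parameter'. A small sketch of decoding that line is given below. The table is deliberately partial and lists only a few commonly documented status values; the names INT13_STATUS and decode_read_error are illustrative and do not come from the boot code:

```python
# Partial table of BIOS INT 13h status codes (value returned in %ah).
# Not exhaustive; unlisted codes are reported as unknown.
INT13_STATUS = {
    0x00: "success",
    0x01: "invalid function or parameter",
    0x02: "address mark not found",
    0x04: "sector not found",
    0x80: "timeout (drive not ready)",
}


def decode_read_error(line: str) -> str:
    """Translate a 'Read error: XX' console line into a description."""
    prefix = "Read error: "
    if not line.startswith(prefix):
        raise ValueError("not a 'Read error: XX' line")
    code = int(line[len(prefix):].strip(), 16)
    return INT13_STATUS.get(code, "unknown status 0x%02x" % code)
```

Here decode_read_error("Read error: 01") maps to the invalid function or parameter entry, consistent with the reading given in the thread.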
Re: ZFS boot inside on the second partition inside a slice
On 06/21/2011 19:51, John Baldwin wrote: On Tuesday, June 21, 2011 12:15:58 pm Henri Hennebert wrote: On 06/21/2011 17:55, John Baldwin wrote: On Tuesday, June 21, 2011 10:50:14 am Henri Hennebert wrote: On 06/21/2011 15:01, John Baldwin wrote: Index: zfsldr.S === --- zfsldr.S(revision 223339) +++ zfsldr.S(working copy) @@ -234,9 +234,12 @@ nread.1: xor %ecx,%ecx # Get callw read # Read from disk lea 0x10(%bp),%sp # Clear stack jnc return # If success, return - mov $msg_read,%si # Otherwise, set the error - # message and fall through to - # the error routine + mov %ah,%al # Format + mov $read_err,%di # error + call hex8 # code + mov $msg_read,%si # Set the error message and + # fall through to the error + # routine /* * Print out the error message pointed to by %ds:(%si) followed * by a prompt, wait for a keypress, and then reboot the machine. @@ -296,12 +299,28 @@ read.1: mov $msg_chs,%si jmp error msg_chs:.asciz CHS not supported +/* + * Convert AL to hex, saving the result to [EDI]. + */ +hex8: push %ax# Save + shrb $0x4,%al # Do upper + call hex8.1 # 4 + pop %ax # Restore +hex8.1:andb $0xf,%al # Get lower 4 + cmpb $0xa,%al # Convert + sbbb $0x69,%al # to hex + das # digit + orb $0x20,%al # To lower case + stosb # Save char + ret # (Recursive) + /* Messages */ -msg_read: .asciz Read -msg_part: .asciz Boot +msg_read: .ascii Read error: +read_err: .asciz XX +msg_part: .asciz Boot error -prompt:.asciz error\r\n +prompt:.asciz \r\n .org PRT_OFF,0x90 I get Read error: 01 Hmm, that would be 'invalid parameter'. Can you add a 'foo: jmp foo' infinite loop and move it around to figure out which read call is failing? main.5: mov %dx,MEM_ARG # Save args movb $NSECT,%dh # Sector count movl $1024,%eax # Offset to boot2 callw nread.1 # Read disk foo:jmp foo After this one I get 'Read error: 01' Hmm, ok. NSECT changed in the MFC (it is now larger). Try this patch. 
It changes the code to read zfsboot in one sector at a time: I encounter 2 problems - see in you patch Henri Index: zfsldr.S === --- zfsldr.S(revision 223365) +++ zfsldr.S(working copy) @@ -16,7 +16,6 @@ */ /* Memory Locations */ - .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area @@ -91,26 +90,19 @@ main: cld # String ops inc mov %cx,%ss # Set up mov $start,%sp # stack /* - * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets - * %cx == 0x100. - */ - mov %sp,%si # Source - mov $MEM_REL,%di# Destination - incb %ch# Word count - rep # Copy - movsw # code -/* * If we are on a hard drive, then load the MBR and look for the first * FreeBSD slice. We use the fake partition entry below that points to * the MBR when we call nread. The first pass looks for the first active * FreeBSD slice. The second pass looks for the first non-active FreeBSD * slice if the first one fails. */ - mov $part4,%si # Partition + mov $part4,%si # Dummy partition cmpb $0x80,%dl # Hard drive? jb main.4 # No - movb $0x1,%dh # Block count - callw nread # Read MBR + xor %eax,%eax
Re: ZFS boot inside on the second partition inside a slice
On 06/21/2011 21:25, John Baldwin wrote: On Tuesday, June 21, 2011 3:02:28 pm Henri Hennebert wrote: On 06/21/2011 19:51, John Baldwin wrote: On Tuesday, June 21, 2011 12:15:58 pm Henri Hennebert wrote: On 06/21/2011 17:55, John Baldwin wrote: On Tuesday, June 21, 2011 10:50:14 am Henri Hennebert wrote: On 06/21/2011 15:01, John Baldwin wrote: Index: zfsldr.S === --- zfsldr.S(revision 223339) +++ zfsldr.S(working copy) @@ -234,9 +234,12 @@ nread.1: xor %ecx,%ecx # Get callw read # Read from disk lea 0x10(%bp),%sp # Clear stack jnc return # If success, return - mov $msg_read,%si # Otherwise, set the error - # message and fall through to - # the error routine + mov %ah,%al # Format + mov $read_err,%di # error + call hex8 # code + mov $msg_read,%si # Set the error message and + # fall through to the error + # routine /* * Print out the error message pointed to by %ds:(%si) followed * by a prompt, wait for a keypress, and then reboot the machine. @@ -296,12 +299,28 @@ read.1: mov $msg_chs,%si jmp error msg_chs: .asciz CHS not supported +/* + * Convert AL to hex, saving the result to [EDI]. + */ +hex8: push %ax# Save + shrb $0x4,%al # Do upper + call hex8.1 # 4 + pop %ax # Restore +hex8.1:andb $0xf,%al # Get lower 4 + cmpb $0xa,%al # Convert + sbbb $0x69,%al # to hex + das # digit + orb $0x20,%al # To lower case + stosb # Save char + ret # (Recursive) + /* Messages */ -msg_read: .asciz Read -msg_part: .asciz Boot +msg_read: .ascii Read error: +read_err: .asciz XX +msg_part: .asciz Boot error -prompt:.asciz error\r\n +prompt:.asciz \r\n .org PRT_OFF,0x90 I get Read error: 01 Hmm, that would be 'invalid parameter'. Can you add a 'foo: jmp foo' infinite loop and move it around to figure out which read call is failing? main.5: mov %dx,MEM_ARG # Save args movb $NSECT,%dh # Sector count movl $1024,%eax # Offset to boot2 callw nread.1 # Read disk foo:jmp foo After this one I get 'Read error: 01' Hmm, ok. NSECT changed in the MFC (it is now larger). Try this patch. 
It changes the code to read zfsboot in one sector at a time: I encounter 2 problems - see in you patch Henri Index: zfsldr.S === --- zfsldr.S(revision 223365) +++ zfsldr.S(working copy) @@ -16,7 +16,6 @@ */ /* Memory Locations */ - .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area @@ -91,26 +90,19 @@ main: cld # String ops inc mov %cx,%ss # Set up mov $start,%sp # stack /* - * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets - * %cx == 0x100. - */ - mov %sp,%si # Source - mov $MEM_REL,%di# Destination - incb %ch# Word count - rep # Copy - movsw # code -/* * If we are on a hard drive, then load the MBR and look for the first * FreeBSD slice. We use the fake partition entry below that points to * the MBR when we call nread. The first pass looks for the first active * FreeBSD slice. The second pass looks for the first non-active FreeBSD * slice if the first one fails. */ - mov $part4,%si # Partition + mov $part4,%si # Dummy partition cmpb $0x80,%dl # Hard drive? jb main.4 # No - movb $0x1,%dh
Re: ZFS boot inside on the second partition inside a slice
On 06/16/2011 19:35, John Baldwin wrote:
> On Thursday, June 16, 2011 8:45:41 am Zhihao Yuan wrote:
>> Exactly. The MFCed ZFSv28 is different from any patch maintained by mm@.
>> Maybe some untested changes involved.
>
> Can you try reverting this change:
>
> Author: jhb
> Date: Thu Apr 28 17:44:24 2011
> New Revision: 221177
> URL: http://svn.freebsd.org/changeset/base/221177
>
> Log:
>   Due to space constraints, the UFS boot2 and boot1 use an evil hack where
>   boot2 calls back into boot1 to perform disk reads. The ZFS MBR boot
>   blocks do not have the same space constraints, so remove this hack for
>   ZFS. While here, remove commented out code to support C/H/S addressing
>   from zfsldr. The ZFS and GPT bootstraps always just use EDD LBA
>   addressing.
>
>   MFC after: 2 weeks
>
> Modified:
>   head/sys/boot/i386/boot2/Makefile
>   head/sys/boot/i386/common/drv.c
>   head/sys/boot/i386/zfsboot/Makefile
>   head/sys/boot/i386/zfsboot/zfsldr.S

I tried with this revision (221177) reverted, to no avail:
same error - 'Read error'

Henri
Re: ZFS boot inside on the second partition inside a slice
On 06/16/2011 07:32, Zhihao Yuan wrote:
> I just redo everything, and changed the order of freebsd-zfs and
> freebsd-swap. The Read error still happens!

Just a me too.

Everything was working great with zfsboot from 8.2-RELEASE + a patch
(http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/153552).

When I updated to 8.2-STABLE after the v28 MFC, I had to write a new
zfsboot to be allowed to upgrade my pool. I get the Read error after that.

PS - same config, a netbook with Windows 7 on the first partition, so I
can't switch to GPT.

Henri

> On Wed, Jun 15, 2011 at 8:07 PM, Zhihao Yuan <lich...@gmail.com> wrote:
>> On Wed, Jun 15, 2011 at 7:58 PM, Xin LI <delp...@delphij.net> wrote:
>>> On 06/15/11 17:42, Zhihao Yuan wrote:
>>>> Hi,
>>>>
>>>> I configured my disk layout according to
>>>> http://wiki.freebsd.org/RootOnZFS/ZFSBootPartition
>>>> But I swapped the order of the freebsd-zfs and freebsd-swap. The 4.0G
>>>> freebsd-swap partition appears first inside the slice. After that, I
>>>> write zfsboot on both ada0s2 and ada0s2b, but the boot0 gives me a
>>>> Read error.
>>>
>>> Where did your second slice start? There can be a lot of reasons why it
>>> gives Read error.
>>
>> After an NTFS partition of 12GB. This should be the problem with zfsboot,
>> because if I use sysinstall to install a bootmgr, the boot gives me a
>> not UFS error, which means the boot0 is done (am I right?).
>>
>>> I personally recommend using GPT scheme instead of MBR, as you have a
>>> dedicated partition for gptzfsboot, which is much cleaner than this
>>> approach.
>>
>> Yeah, yeah, I agree. I should not plan to play Windows games.
>>
>>> Cheers,
>>> --
>>> Xin LI <delp...@delphij.net> http://www.delphij.net/
>>> FreeBSD - The Power to Serve! Live free or die
>>
>> --
>> Zhihao Yuan, nickname lichray
>> The best way to predict the future is to invent it.
>> ___
>> 4BSD -- http://4bsd.biz/
Re: ZFS boot inside on the second partition inside a slice
On 06/16/2011 19:35, John Baldwin wrote:
> On Thursday, June 16, 2011 8:45:41 am Zhihao Yuan wrote:
>> Exactly. The MFCed ZFSv28 is different from any patch maintained by mm@.
>> Maybe some untested changes involved.
>
> Can you try reverting this change:
>
> Author: jhb
> Date: Thu Apr 28 17:44:24 2011
> New Revision: 221177
> URL: http://svn.freebsd.org/changeset/base/221177
>
> Log:
>   Due to space constraints, the UFS boot2 and boot1 use an evil hack where
>   boot2 calls back into boot1 to perform disk reads. The ZFS MBR boot
>   blocks do not have the same space constraints, so remove this hack for
>   ZFS. While here, remove commented out code to support C/H/S addressing
>   from zfsldr. The ZFS and GPT bootstraps always just use EDD LBA
>   addressing.
>
>   MFC after: 2 weeks
>
> Modified:
>   head/sys/boot/i386/boot2/Makefile
>   head/sys/boot/i386/common/drv.c
>   head/sys/boot/i386/zfsboot/Makefile
>   head/sys/boot/i386/zfsboot/zfsldr.S

I will try this Saturday!

Thanks

Henri
zfsboot from 8.2RC1 freeze at boot time
Hello and merry Xmas to everybody,

I upgraded a remote server from 8.1-RELEASE to 8.2-RC1.

This server has one disk:

[r...@tignes ~]# gpart show
=>        63  488397105  ada0  MBR  (233G)
          63   12583809     1  freebsd  (6.0G)
    12583872  475813296     2  freebsd  [active]  (227G)

=>       0  12583809  ada0s1  BSD  (6.0G)
         0   8388608       1  freebsd-ufs  (4.0G)
   8388608   4195201       2  freebsd-swap  (2.0G)

=>        0  475813296  ada0s2  BSD  (227G)
          0  475813296       1  freebsd-zfs  (227G)

It boots with zfsboot from ada0s2, which contains a zfs pool.

After upgrading zfsboot, just to be able to upgrade the pool to v15, the
server no longer boots.

It is a remote server, so I reproduced this config under VirtualBox. The
boot freezes after zfsboot displays '-'.

I grabbed an old zfsboot from another server running 8.1-STABLE (r213582),
which boots fine. I put the zfsboot from r213582 (zpool v15 aware) on ada0s2
and bingo, the server boots normally.

Henri
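As a sanity check on the gpart output above, the sector counts can be converted to the sizes gpart prints, and the two MBR slices can be confirmed to sit back to back. This is only a sketch that assumes 512-byte sectors; the helper names are made up for the example:

```python
SECTOR_BYTES = 512  # assumed sector size


def sectors_to_gib(sectors: int) -> float:
    """Convert a sector count to GiB (the unit gpart abbreviates as G)."""
    return sectors * SECTOR_BYTES / 2**30


def slices_are_contiguous(parts) -> bool:
    """parts is a list of (start, size) pairs sorted by start sector."""
    return all(start + size == next_start
               for (start, size), (next_start, _) in zip(parts, parts[1:]))


# (start, size) of the two slices on ada0, copied from the listing above.
ADA0_SLICES = [(63, 12583809), (12583872, 475813296)]
```

The first slice works out to about 6.0 GiB and the second to about 227 GiB, and the second slice starts exactly where the first ends, so the layout itself shows nothing unusual.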
Re: MFC of ZFSv15
On 09/19/2010 18:33, Dan Mack wrote:
> But I should be able to boot my ZFSv14 root pool using the ZFSv15 build
> of FreeBSD, correct?

Yes

> But the problem scenario would be when I've upgraded my root pool to v15
> and I attempt to boot it with v14 boot loader. At least that is what I
> think ...

You are right

> I guess what I'm getting at is ... you should be able to buildworld,
> installkernel, reboot, installworld, reboot without worry.

That is the case

> But after you run 'zpool upgrade', you will need to re-write the bootcode
> using gpart on each of your root pool ZFS disks.

I prefer to install the bootcode BEFORE. Then reboot and check it with the
printf of my simple patch. Then you can zpool/zfs upgrade without problem.

> Am I understanding this correctly ?
>
> Thanks for all the work on ZFS BTW, it's great!
>
> Dan
>
> On Sep 16, 2010, at 2:03 PM, Henri Hennebert wrote:
>> On 09/16/2010 17:18, jhell wrote:
>>> On 09/16/2010 09:55, Mike Tancsa wrote:
>>>> Thanks again for all the ZFS fixes and enhancements! Are there any
>>>> caveats to upgrading ? Do I just do
>>>>
>>>> zpool upgrade -a
>>>> zfs upgrade -a
>>>>
>>>> or are there any extra steps ?
>>>
>>> Hi Mike,
>>>
>>> No-one knows your bootcode better than you. So if you are upgrading
>>> don't forget if you are on a ZFS root then your bootcode might need
>>> updating.
>>
>> I was bitten by this problem in a previous ZFS upgrade. To be sure, I
>> have added this patch to zfsimpl.c so, at boot I know if zpool/zfs
>> upgrade will be OK.
>>
>> Henri
>>
>>> Regards,
>>> UPDATING should have anything else.
>>
>> sys_boot_zfs.patch
>
> Dan
> --
> Dan Mack
> m...@macktronics.com
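The ordering argued for above (install the new bootcode, reboot, read the version it prints, and only then upgrade the pool) amounts to a simple comparison. A minimal sketch, assuming the boot-time line has the form printed by the patch mentioned in the thread ("ZFS: supported version N"); both function names are hypothetical:

```python
import re


def parse_supported_version(console_line: str) -> int:
    """Extract N from a 'ZFS: supported version N' boot message."""
    match = re.search(r"ZFS: supported version (\d+)", console_line)
    if match is None:
        raise ValueError("no supported-version message found")
    return int(match.group(1))


def safe_to_upgrade(bootcode_version: int, target_pool_version: int) -> bool:
    """The pool stays bootable only if the bootcode understands the target."""
    return bootcode_version >= target_pool_version
```

With v14 bootcode and a target of v15 the check fails, which is the scenario Dan describes; once the bootcode reports 15 the upgrade can go ahead.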
Re: MFC of ZFSv15
On 09/16/2010 17:18, jhell wrote:
> On 09/16/2010 09:55, Mike Tancsa wrote:
>> Thanks again for all the ZFS fixes and enhancements! Are there any
>> caveats to upgrading ? Do I just do
>>
>> zpool upgrade -a
>> zfs upgrade -a
>>
>> or are there any extra steps ?
>
> Hi Mike,
>
> No-one knows your bootcode better than you. So if you are upgrading don't
> forget if you are on a ZFS root then your bootcode might need updating.

I was bitten by this problem in a previous ZFS upgrade. To be sure, I have
added this patch to zfsimpl.c so, at boot I know if zpool/zfs upgrade will
be OK.

Henri

> Regards,
> UPDATING should have anything else.

Index: sys/boot/zfs/zfsimpl.c
===================================================================
--- sys/boot/zfs/zfsimpl.c    (revision 212549)
+++ sys/boot/zfs/zfsimpl.c    (working copy)
@@ -61,6 +61,8 @@
     STAILQ_INIT(&zfs_vdevs);
     STAILQ_INIT(&zfs_pools);

+    printf("ZFS: supported version %u\n", (unsigned) SPA_VERSION);
+
     zfs_temp_buf = malloc(TEMP_SIZE);
     zfs_temp_end = zfs_temp_buf + TEMP_SIZE;
     zfs_temp_ptr = zfs_temp_buf;
Re: zfs destroy snapshot doesn't free space
On 08/13/2010 20:02, Andreas Mayer wrote:
> $ uname -a
> FreeBSD wurd.dev001.net 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19
> 02:36:49 UTC 2010
> r...@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>
> 2010/8/13 Malcolm Waltz <mwa...@pacific.edu>:
>> Have you tried zfs list -t all ?
>
> I have, it produces this output:
>
> $ zfs list -t all
> NAME         USED  AVAIL  REFER  MOUNTPOINT
> rpool        637G  48,4G    18K  none
> rpool/root   245M  1,76G   209M  legacy
> .. rpool/root backup snapshots ...
> rpool/srv   5,31G  48,4G  4,94G  /srv
> .. rpool/srv backup snapshots ...
> rpool/tmp   90,2M  1,91G  90,2M  /tmp
> .. rpool/tmp backup snapshots ...
> rpool/usr   7,91G  48,4G  6,83G  /usr
> .. rpool/usr backup snapshots ...
> rpool/var    623G  48,4G   623G  /var

Just to be sure that a process is not still hogging space:

fstat |grep /var

Henri
Re: Pack of CAM improvements
On 01/19/2010 17:12, Alexander Motin wrote:
> Hi.
>
> I've made a patch that should solve a set of problems of CAM ATA and CAM
> generally. I would like to ask for testing and feedback.
>
> What the patch does:
> - It unifies bus reset/probe sequence. Whenever bus attached at boot or
>   later, CAM will automatically reset and scan it. It allows to remove
>   duplicate code from many drivers.
> - Any bus, attached before CAM completed its boot-time initialization,
>   will equally join to the process, delaying boot if needed.
> - New kern.cam.boot_delay loader tunable should help controllers that are
>   still unable to register their buses in time (such as slow USB/
>   PCCard/ CardBus devices).

With kern.cam.boot_delay=15000 (I suppose that it is in ms) I can now boot
from my sim card reader.

Thanks

Henri

> - To allow synchronization between different CAM levels, concept of
>   requests priorities was extended. Priorities now split between several
>   run levels. Device can be frozen at specified level, allowing higher
>   priority requests to pass. For example, no payload requests allowed,
>   until PMP driver enable port. ATA XPT negotiate transfer parameters,
>   periph driver configure caching and so on.
> - Frozen requests are no more counted by request allocation scheduler. It
>   fixes deadlocks, when frozen low priority payload requests occupying
>   slots, required by higher levels to manage their execution.
> - Two last changes were holding proper ATA reinitialization and error
>   recovery implementation. Now it is done: SATA controllers and Port
>   Multipliers now implement automatic hot-plug and should correctly
>   recover from timeouts and bus resets.
>
> Patch can be found here:
> http://people.freebsd.org/~mav/cam-ata.20100119.patch
>
> Feedback as always welcome.
Re: USB problems on 8.0-STABLE
On 01/09/2010 05:39, Warren Block wrote:
> On Fri, 8 Jan 2010, Frank wrote:
>> On Fri, 8 Jan 2010, Steven Friedrich wrote:
>>> Option "AllowEmptyInput" "off"
>>> EndSection
>>>
>>> Comment out the line containing AllowEmptyInput.
>>
>> OK, this took care of the nothing-works-unless-mouse-is-moved problem
>> but why do I get this? It's keeping apcupsd from starting.
>>
>> Ace /usr/ports # usbdevs -d -v
>> usbdevs: no USB controllers found
>
> I'd guess that usbdevs is obsolete, part of the old USB system.
>
>> Ace /usr/ports # usbconfig
>> ugen0.1: <OHCI root HUB nVidia> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON
>> ugen1.1: <EHCI root HUB nVidia> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON
>> ugen0.2: <Back-UPS XS 1200 FW:8.g1 .D USB FW:g1 American Power Conversion> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON
>> ugen0.3: <USB Optical Mouse vendor 0x0461> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON
>> ugen0.4: <Dell USB Keyboard Dell> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON
>
> Do you have DEVICE /dev/ugen0.2 in apcupsd.conf?
>
>> I don't understand why usbdevs can't find any controllers and apcupsd
>> can't find any device while the kernel and usbconfig can find it all.
>
> usbdevs: probably obsolete. As for apcupsd, I don't think it can
> auto-scan for USB devices, but haven't used it with USB.

I have:

FreeBSD avoriaz.restart.bel 8.0-RELEASE FreeBSD 8.0-RELEASE #0 r199628M:
Tue Nov 24 21:38:07 CET 2009
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64

usbconfig:

ugen0.2: <Back-UPS CS 650 FW:817.v4.I USB American Power Conversion> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON

apcupsd.conf:

UPSNAME Back-UPS-CS-650
UPSCABLE usb
UPSTYPE usb
DEVICE

apcupsd is working with this config.

Henri

> -Warren Block * Rapid City, South Dakota USA
Re: cvsweb: src/UPDATING on RELENG_7
On 1/01/2010 16:40, Ian Smith wrote:
> Hi,
>
> Thought I had a clue on using cvsweb, but seem to have mislaid it ..
>
> After updating 7.0-RELEASE to RELENG_7 sources on Dec 28, checking
> UPDATING before and during buildworld, went hunting on cvsweb for the
> very version of UPDATING I was reading, 1.507.2.34 of 2009/11/29.
>
> I can't find it, as such. 1.507 shows on MAIN, RELENG_7_BP, RELENG_7.
> Selecting only RELENG_7 just shows that single ver 1.507 of Oct'07.

It is a known bug of cvsweb:

http://www.freebsd.org/cgi/query-pr.cgi?prp=120185-1-txt&n=/patch.txt

Henri

> On a punt I manually entered 1.507.2.34 for a diff against 1.507 and that
> looks just right:
> http://www.freebsd.org/cgi/cvsweb.cgi/src/UPDATING.diff?r1=text&tr1=1.507&r2=text&tr2=1.507.2.34
>
> But where would I look to find the log and view for 1.507.2.34 itself?
>
> cheers, Ian
[SOLVED] 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections
Li, Qing wrote: Just another case where the route must be created: That's probably because I explicitly disabled such route installation for PPP link type. Please apply patch http://people.freebsd.org/~qingli/patch and let me know if that solves your problem. The problem is solved. Thanks a lot. Henri PS. the ipv4 ping was working fine before (and after) your patch, so I don't see why you have to patch in.c Thanks, -- Qing [r...@avoriaz ~]# ifconfig gif0 gif0: flags=8051UP,POINTOPOINT,RUNNING,MULTICAST metric 0 mtu 1280 tunnel inet 212.239.166.57 -- 94.23.44.41 inet6 fe80::21d:60ff:fead:2ace%gif0 prefixlen 64 scopeid 0x4 inet6 2001:41d0:2:2d29:1::: -- 2001:41d0:2:2d29:0::: prefixlen 128 options=1ACCEPT_REV_ETHIP_VER [r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1::: PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: -- 2001:41d0:2:2d29:1::: ^C --- 2001:41d0:2:2d29:1::: ping6 statistics --- 4 packets transmitted, 0 packets received, 100.0% packet loss [r...@avoriaz ~]# route add -inet6 2001:41d0:2:2d29:1::: -interface lo0 add host 2001:41d0:2:2d29:1:::: gateway lo0 [r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1::: PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: -- 2001:41d0:2:2d29:1::: 16 bytes from ::1, icmp_seq=0 hlim=64 time=0.531 ms 16 bytes from ::1, icmp_seq=1 hlim=64 time=0.884 ms 16 bytes from ::1, icmp_seq=2 hlim=64 time=0.748 ms ^C --- 2001:41d0:2:2d29:1::: ping6 statistics --- 3 packets transmitted, 3 packets received, 0.0% packet loss round-trip min/avg/max/std-dev = 0.531/0.721/0.884/0.145 ms Thanks Henri -Original Message- From: Henri Hennebert [mailto:h...@restart.be] Sent: Sat 7/11/2009 3:09 AM To: Li, Qing Cc: freebsd-stable@freebsd.org; freebsd-...@freebsd.org Subject: Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections Li, Qing wrote: Hi, Please try patch-7-10 in my home directory http://people.freebsd.org/~qingli/ and let me know how it works out for you. I thought I had committed the patch but turned out I didn't. 
I apply the patch, reset my pf.conf to its previous content and all is running smoothly. By the way, I discover after my post that my solution was not working for long (many bytes) connections and this is solved too. Many thank for your time Henri PS please commit as soon as possible On 8.0-BETA1 there is an assymetry: netstat -rn display 192.168.24.1 link#3 no entry for 2001:41d0:2:2d29:1:1:: This is by design as part of the new architecture in 8.0, which maintains the L2 ARP/ND6 and L3 routing tables separately. -- Qing -Original Message- From: owner-freebsd-sta...@freebsd.org on behalf of Henri Hennebert Sent: Fri 7/10/2009 5:32 AM To: freebsd-stable@freebsd.org; freebsd...@freebsd.org Subject: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections Hello, After upgrading from 7.2-STABLE to 8.0-BETA1 I encounter a problem when connecting with firefox to a local apache server using the global unicast IPv6 address of the local machine. pf.conf must be updated! My configuration: [r...@avoriaz ~]# ifconfig em0 em0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=19bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO 4 ether 00:1d:60:ad:2a:ce inet 192.168.24.1 netmask 0xff00 broadcast 192.168.24.255 inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1 inet6 2001:41d0:2:2d29:1:1:: prefixlen 80 media: Ethernet 100baseTX (100baseTX half-duplex) status: active [r...@avoriaz ~]# host www.restart.bel www.restart.bel is an alias for avoriaz.restart.bel. 
avoriaz.restart.bel has address 192.168.24.1 avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1:: pf.conf: int_if=em0 block in log all block out log all set skip on lo0 antispoof quick for $int_if inet # Allow trafic with physical internal network pass in quick on $int_if from ($int_if:network) to ($int_if) keep state pass out quick on $int_if from ($int_if) to ($int_if:network) keep state The problem: [r...@avoriaz ~]# telnet -4 www.restart.bel 80 Trying 192.168.24.1... Connected to avoriaz.restart.bel. Escape character is '^]'. ^] telnet quit Connection closed. [r...@avoriaz ~]# telnet -6 www.restart.bel 80 Trying 2001:41d0:2:2d29:1:1::... ---Never connect and get a timeout! tcpdump and logging in pf show me that For a IPv4 connection: the packet from telnet to apache pass 2 times on lo0 (out and in) the answer packet from apache to telnet pass 2 times on lo0 (out and in) So no problem, there is `set skip on lo0' For a IPv6 connection: The first packet from telnet to apache pass 2 times on lo0 (out and in) The answer packet from apache to telnet path on em0 and is rejected due to the default flags S/SA. So I have to change pf.conf and replace the last line: pass out quick
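The rejection described above follows from how pf evaluates a `flags <set>/<mask>` clause: among the flags named in the mask, exactly those in the set must be on. A stateful pass rule with the default `flags S/SA` therefore matches an initial SYN but not the SYN+ACK reply that showed up on em0 without existing state. A minimal sketch of that test, using the standard TCP header bit values; the function name is made up for the example:

```python
# TCP header flag bits (standard values).
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def pf_flags_match(tcp_flags: int, must_be_set: int, out_of: int) -> bool:
    """Model pf's 'flags <set>/<mask>' test.

    Only the bits in out_of are examined; of those, exactly the bits in
    must_be_set have to be on. 'flags any' corresponds to an empty mask.
    """
    return (tcp_flags & out_of) == must_be_set
```

With the default S/SA, pf_flags_match(SYN, SYN, SYN | ACK) holds while pf_flags_match(SYN | ACK, SYN, SYN | ACK) does not, so the reply falls through to `block out log all`. An empty mask, as with `flags any`, matches every packet.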
Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections
Li, Qing wrote: The patch has been committed, svn revision 195643. Thanks, -- Qing Just another case where the route must be created: [r...@avoriaz ~]# ifconfig gif0 gif0: flags=8051UP,POINTOPOINT,RUNNING,MULTICAST metric 0 mtu 1280 tunnel inet 212.239.166.57 -- 94.23.44.41 inet6 fe80::21d:60ff:fead:2ace%gif0 prefixlen 64 scopeid 0x4 inet6 2001:41d0:2:2d29:1::: -- 2001:41d0:2:2d29:0::: prefixlen 128 options=1ACCEPT_REV_ETHIP_VER [r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1::: PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: -- 2001:41d0:2:2d29:1::: ^C --- 2001:41d0:2:2d29:1::: ping6 statistics --- 4 packets transmitted, 0 packets received, 100.0% packet loss [r...@avoriaz ~]# route add -inet6 2001:41d0:2:2d29:1::: -interface lo0 add host 2001:41d0:2:2d29:1:::: gateway lo0 [r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1::: PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: -- 2001:41d0:2:2d29:1::: 16 bytes from ::1, icmp_seq=0 hlim=64 time=0.531 ms 16 bytes from ::1, icmp_seq=1 hlim=64 time=0.884 ms 16 bytes from ::1, icmp_seq=2 hlim=64 time=0.748 ms ^C --- 2001:41d0:2:2d29:1::: ping6 statistics --- 3 packets transmitted, 3 packets received, 0.0% packet loss round-trip min/avg/max/std-dev = 0.531/0.721/0.884/0.145 ms Thanks Henri -Original Message- From: Henri Hennebert [mailto:h...@restart.be] Sent: Sat 7/11/2009 3:09 AM To: Li, Qing Cc: freebsd-stable@freebsd.org; freebsd-...@freebsd.org Subject: Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections Li, Qing wrote: Hi, Please try patch-7-10 in my home directory http://people.freebsd.org/~qingli/ and let me know how it works out for you. I thought I had committed the patch but turned out I didn't. I apply the patch, reset my pf.conf to its previous content and all is running smoothly. By the way, I discover after my post that my solution was not working for long (many bytes) connections and this is solved too. 
Many thank for your time Henri PS please commit as soon as possible On 8.0-BETA1 there is an assymetry: netstat -rn display 192.168.24.1 link#3 no entry for 2001:41d0:2:2d29:1:1:: This is by design as part of the new architecture in 8.0, which maintains the L2 ARP/ND6 and L3 routing tables separately. -- Qing -Original Message- From: owner-freebsd-sta...@freebsd.org on behalf of Henri Hennebert Sent: Fri 7/10/2009 5:32 AM To: freebsd-stable@freebsd.org; freebsd...@freebsd.org Subject: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections Hello, After upgrading from 7.2-STABLE to 8.0-BETA1 I encounter a problem when connecting with firefox to a local apache server using the global unicast IPv6 address of the local machine. pf.conf must be updated! My configuration: [r...@avoriaz ~]# ifconfig em0 em0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=19bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4 ether 00:1d:60:ad:2a:ce inet 192.168.24.1 netmask 0xff00 broadcast 192.168.24.255 inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1 inet6 2001:41d0:2:2d29:1:1:: prefixlen 80 media: Ethernet 100baseTX (100baseTX half-duplex) status: active [r...@avoriaz ~]# host www.restart.bel www.restart.bel is an alias for avoriaz.restart.bel. avoriaz.restart.bel has address 192.168.24.1 avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1:: pf.conf: int_if=em0 block in log all block out log all set skip on lo0 antispoof quick for $int_if inet # Allow trafic with physical internal network pass in quick on $int_if from ($int_if:network) to ($int_if) keep state pass out quick on $int_if from ($int_if) to ($int_if:network) keep state The problem: [r...@avoriaz ~]# telnet -4 www.restart.bel 80 Trying 192.168.24.1... Connected to avoriaz.restart.bel. Escape character is '^]'. ^] telnet quit Connection closed. [r...@avoriaz ~]# telnet -6 www.restart.bel 80 Trying 2001:41d0:2:2d29:1:1::... 
---Never connect and get a timeout! tcpdump and logging in pf show me that For a IPv4 connection: the packet from telnet to apache pass 2 times on lo0 (out and in) the answer packet from apache to telnet pass 2 times on lo0 (out and in) So no problem, there is `set skip on lo0' For a IPv6 connection: The first packet from telnet to apache pass 2 times on lo0 (out and in) The answer packet from apache to telnet path on em0 and is rejected due to the default flags S/SA. So I have to change pf.conf and replace the last line: pass out quick on $int_if from ($int_if) to ($int_if:network) \ keep state flags any Then all is OK By the way, on 7.2 netstat -rn display 192.168.24.100:1d:60:ad:2a:ce 2001:41d0:2:2d29:1:1::00:1d:60:ad:2a:ce On 8.0-BETA1 there is an assymetry: netstat -rn display 192.168.24.1 link#3 no entry
Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections
Li, Qing wrote: Hi, Please try patch-7-10 in my home directory http://people.freebsd.org/~qingli/ and let me know how it works out for you. I thought I had committed the patch but turned out I didn't. I apply the patch, reset my pf.conf to its previous content and all is running smoothly. By the way, I discover after my post that my solution was not working for long (many bytes) connections and this is solved too. Many thank for your time Henri PS please commit as soon as possible On 8.0-BETA1 there is an assymetry: netstat -rn display 192.168.24.1 link#3 no entry for 2001:41d0:2:2d29:1:1:: This is by design as part of the new architecture in 8.0, which maintains the L2 ARP/ND6 and L3 routing tables separately. -- Qing -Original Message- From: owner-freebsd-sta...@freebsd.org on behalf of Henri Hennebert Sent: Fri 7/10/2009 5:32 AM To: freebsd-stable@freebsd.org; freebsd...@freebsd.org Subject: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections Hello, After upgrading from 7.2-STABLE to 8.0-BETA1 I encounter a problem when connecting with firefox to a local apache server using the global unicast IPv6 address of the local machine. pf.conf must be updated! My configuration: [r...@avoriaz ~]# ifconfig em0 em0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=19bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4 ether 00:1d:60:ad:2a:ce inet 192.168.24.1 netmask 0xff00 broadcast 192.168.24.255 inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1 inet6 2001:41d0:2:2d29:1:1:: prefixlen 80 media: Ethernet 100baseTX (100baseTX half-duplex) status: active [r...@avoriaz ~]# host www.restart.bel www.restart.bel is an alias for avoriaz.restart.bel. 
avoriaz.restart.bel has address 192.168.24.1 avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1:: pf.conf: int_if=em0 block in log all block out log all set skip on lo0 antispoof quick for $int_if inet # Allow trafic with physical internal network pass in quick on $int_if from ($int_if:network) to ($int_if) keep state pass out quick on $int_if from ($int_if) to ($int_if:network) keep state The problem: [r...@avoriaz ~]# telnet -4 www.restart.bel 80 Trying 192.168.24.1... Connected to avoriaz.restart.bel. Escape character is '^]'. ^] telnet quit Connection closed. [r...@avoriaz ~]# telnet -6 www.restart.bel 80 Trying 2001:41d0:2:2d29:1:1::... ---Never connect and get a timeout! tcpdump and logging in pf show me that For a IPv4 connection: the packet from telnet to apache pass 2 times on lo0 (out and in) the answer packet from apache to telnet pass 2 times on lo0 (out and in) So no problem, there is `set skip on lo0' For a IPv6 connection: The first packet from telnet to apache pass 2 times on lo0 (out and in) The answer packet from apache to telnet path on em0 and is rejected due to the default flags S/SA. So I have to change pf.conf and replace the last line: pass out quick on $int_if from ($int_if) to ($int_if:network) \ keep state flags any Then all is OK By the way, on 7.2 netstat -rn display 192.168.24.100:1d:60:ad:2a:ce 2001:41d0:2:2d29:1:1::00:1d:60:ad:2a:ce On 8.0-BETA1 there is an assymetry: netstat -rn display 192.168.24.1 link#3 no entry for 2001:41d0:2:2d29:1:1:: Hope it may help someone Henri ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections
Hello,

After upgrading from 7.2-STABLE to 8.0-BETA1 I encountered a problem when connecting with firefox to a local apache server using the global unicast IPv6 address of the local machine. pf.conf must be updated!

My configuration:

[r...@avoriaz ~]# ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1d:60:ad:2a:ce
        inet 192.168.24.1 netmask 0xffffff00 broadcast 192.168.24.255
        inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1
        inet6 2001:41d0:2:2d29:1:1:: prefixlen 80
        media: Ethernet 100baseTX (100baseTX half-duplex)
        status: active
[r...@avoriaz ~]# host www.restart.bel
www.restart.bel is an alias for avoriaz.restart.bel.
avoriaz.restart.bel has address 192.168.24.1
avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1::

pf.conf:

int_if=em0
block in log all
block out log all
set skip on lo0
antispoof quick for $int_if inet
# Allow traffic with physical internal network
pass in quick on $int_if from ($int_if:network) to ($int_if) keep state
pass out quick on $int_if from ($int_if) to ($int_if:network) keep state

The problem:

[r...@avoriaz ~]# telnet -4 www.restart.bel 80
Trying 192.168.24.1...
Connected to avoriaz.restart.bel.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
[r...@avoriaz ~]# telnet -6 www.restart.bel 80
Trying 2001:41d0:2:2d29:1:1::...
--- Never connects and ends in a timeout!

tcpdump and logging in pf show me that:

For an IPv4 connection:
the packet from telnet to apache passes 2 times on lo0 (out and in)
the answer packet from apache to telnet passes 2 times on lo0 (out and in)
So no problem, there is `set skip on lo0'

For an IPv6 connection:
the first packet from telnet to apache passes 2 times on lo0 (out and in)
the answer packet from apache to telnet passes on em0 and is rejected due to the default flags S/SA.

So I have to change pf.conf and replace the last line:

pass out quick on $int_if from ($int_if) to ($int_if:network) \
        keep state flags any

Then all is OK.

By the way, on 7.2 netstat -rn displays

192.168.24.1            00:1d:60:ad:2a:ce
2001:41d0:2:2d29:1:1::  00:1d:60:ad:2a:ce

On 8.0-BETA1 there is an asymmetry: netstat -rn displays

192.168.24.1            link#3

and no entry for 2001:41d0:2:2d29:1:1::

Hope it may help someone.

Henri
Re: Zfs on usb-disk checksum errors?
Ronald Klop wrote: Hi. I put zfs on my external usb-disk, so I can backup my harddisk with zfs send/receive. I now have corruption on this volume. [r...@sjakie ~]# zpool status -v pool: extern state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://www.sun.com/msg/ZFS-8000-8A scrub: scrub completed after 0h2m with 0 errors on Wed Jul 8 00:35:09 2009 config: NAMESTATE READ WRITE CKSUM extern ONLINE 1 0 0 da0 ONLINE 9 0 0 errors: Permanent errors have been detected in the following files: 0x3f:0xf5d6 I don't really understand which files have corruption. :-( In my syslog is this: (repeated quite often) Jul 8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0 Jul 8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error Jul 8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): SCSI Status: Check Condition Jul 8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): ILLEGAL REQUEST asc:20,0 Jul 8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): Invalid command operation code Jul 8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): Unretryable error I experience the same error with 'Kingston DataTraveler II 1.13'. I simply add in /usr/src/sys/dev/usb/usbdevs: product KINGSTON DATATRAVELER_2 0x1600 DAtaTraveler II (VENDOR was already in the file). and in sys/dev/usb/storage/umass.c: { USB_VENDOR_KINGSTON, USB_PRODUCT_KINGSTON_DATATRAVELER_2, RID_WILDCARD, UMASS_PROTO_SCSI | UMASS_PROTO_BBB, NO_SYNCHRONIZE_CACHE }, Note the flag NO_SYNCHRONIZE_CACHE and everything return to normal. PS - I encounter this problem on 7.2_STABLE with the MFC of ZFS v13. 
Henri and sometimes Jul 8 10:00:35 sjakie root: ZFS: vdev I/O failure, zpool=extern path=/dev/da0 offset=127558877184 size=3072 error=5 Jul 8 10:00:35 sjakie root: ZFS: vdev I/O failure, zpool=extern path=/dev/da0 offset=127558877184 size=3072 error=5 Jul 8 10:00:35 sjakie root: ZFS: zpool I/O failure, zpool=extern error=5 With varying offsets and sizes. What can I conclude from this? Is the disk failing? Is the 'Invalid command operation code' something to worry about? It didn't show up when the disk was UFS. I reinstalled the pool but the read-errors showed up again. Thanks for any advice, Ronald. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: FreeBSD 7-STABLE and chflags on ZFS now(?) failing
Ralf S. Engelschall wrote: One of my FreeBSD boxes is a 7-STABLE/amd64 one on ZFS, now in production for over a 1.5 years now and which receives regular upgrades. The last installation of FreeBSD 7-STABLE was just about 2 weeks ago. Today the upgrade failed the first time: cd /usr/src; /usr/bin/make -f Makefile.inc1 install === share/info (install) === lib (install) === lib/csu/amd64 (install) install -o root -g wheel -m 444 crt1.o crti.o crtn.o gcrt1.o /usr/lib === lib/libc (install) install -C -o root -g wheel -m 444 libc.a /usr/lib install -C -o root -g wheel -m 444 libc_p.a /usr/lib install -s -o root -g wheel -m 444 -fschg -S libc.so.7 /lib install: /lib/libc.so.7: chflags: Invalid argument *** Error code 71 Stop in /usr/src/lib/libc. *** Error code 1 Stop in /usr/src/lib. *** Error code 1 Stop in /usr/src. *** Error code 1 Stop in /usr/src. *** Error code 1 Stop in /usr/src. *** Error code 1 Stop in /usr/src. 3.30s real 0.35s user 0.75s sys /libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh *** Error code 1 Stop in /usr/adm. *** Error code 1 (ignored) /libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh *** Error code 1 (ignored) /libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh *** Error code 1 (ignored) /libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh *** Error code 1 (ignored) /libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh *** Error code 1 Stop in /usr/adm. *** Error code 1 (ignored) # sh /libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh # Fortunately, I was able to quickly recover via /rescue/cp by copying a libc.so.7 from a Jail to the host system (where the upgrade was performed). But why has this problem occurred now. Well, /lib is on ZFS and I can remember from the past that ZFS did not honor chflags. But remains two questions: 1. I thought chflags support for ZFS was added already in the past. 
Can it be that just a _few_ chflags flags are supported? It looks like uchg works while the above schg fails. I believe that for schg `zfs get version file_system_with /lib` must be 3. To upgrade this: `zfs upgrade file_system_with /lib` 2. Assuming that schg was never supported on ZFS by us, why did the upgrades in the past on this FreeBSD 7-STABLE box never failed until now? Why now the first time? I would have expected that it already failed from day zero with the above error. Just a try to this strange problem: `man install` say: By default, install preserves all file flags, with the exception of the ``nodump'' flag. With the previous version of zfs there was no flags and so no try to play with flags during update. Henri As workaround I've now put a NO_SCHG=yes into /etc/make.conf and performed the upgrade from scratch. Now it succeeded, of course. But I still do not know the answer to the above two questions and this makes me still feel a little bit unsure about the whole situation... PS: At a mergemaster run I now got a problems which looks related: mv: /var/db/mergemaster.mtree: set flags (was: ): Invalid argument Yes, /var is also on ZFS here. Same problem as it looks. But I'm sure also this error did not occur in the past... -- r...@freebsd.orgRalf S. Engelschall FreeBSD.org/~rse r...@engelschall.com FreeBSD committer www.engelschall.com ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
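The version check suggested in the reply above can be sketched as a small shell script. This is only an illustration under stated assumptions: the dataset name `tank/root` is hypothetical, the helper `fs_supports_schg` is invented for this example, and the threshold comes from the "I believe ... must be 3" remark in the thread (read here as "3 or newer"), not from verified documentation.

```shell
#!/bin/sh
# Sketch only. Assumption taken from the reply above: schg needs ZFS
# filesystem version 3 (interpreted here as "3 or newer").
fs_supports_schg() {
    # $1 is the numeric value of the dataset's "version" property.
    [ "$1" -ge 3 ]
}

# "tank/root" is a hypothetical name for the dataset that holds /lib.
dataset=${1:-tank/root}

if command -v zfs >/dev/null 2>&1; then
    # -H: no header line, -o value: print only the property value.
    ver=$(zfs get -H -o value version "$dataset" 2>/dev/null)
    case $ver in
        ''|*[!0-9]*)
            echo "$dataset: cannot read version" ;;
        *)
            if fs_supports_schg "$ver"; then
                echo "$dataset: version $ver, schg should work"
            else
                echo "$dataset: version $ver, try: zfs upgrade $dataset"
            fi ;;
    esac
else
    echo "zfs not found; nothing checked"
fi
```

If the upgrade is not an option, the workaround Ralf mentions (NO_SCHG=yes in /etc/make.conf) avoids the chflags call during installworld altogether.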
Re: ZFS pool from current
Nenhum_de_Nos wrote:
On Wed, June 17, 2009 11:16, Dimitry Andric wrote:
On 2009-06-17 16:09, Nenhum_de_Nos wrote:

And for virtualbox on amd64 purposes I want to run 7.2R or STABLE to use VT-x and amd64 vm's under vbox. Will I have to do anything, or will it just work? Kip Macy created a branch where there is the new zfs code, but I didn't get whether it is in the main sources or whether I need to fetch any special code.

Kip merged the ZFS v13 support to -STABLE just last month. It seems to work okay for most people, but be sure to read the UPDATING file, especially if you are upgrading existing pools.

thanks, I was just looking for this update on the web interface to cvs and there is nothing in UPDATING for RELENG_7 there. Is this really supposed to happen?

Sadly a known and ignored problem of cvsweb:
http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/120185

Henri

I'll get it from csup now ... thanks again,

matheus
Re: ZFS pool from current
Gavin Atkinson wrote:
On Wed, 2009-06-17 at 16:51 +0200, Henri Hennebert wrote:
Nenhum_de_Nos wrote:

thanks, I was just looking for this update on the web interface to cvs and there is nothing in UPDATING for RELENG_7 there. Is this really supposed to happen?

Sadly a known and ignored problem of cvsweb:
http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/120185

Henri

As far as I can tell, this isn't really a problem with cvsweb, but more of a problem with the repository itself. The issue comes when a commit is made and the log message includes the magic string that CVS uses internally to track different revisions. The patch proposed in that PR appears to be more of a hack than a fix.

OK with that, but there is no real fix if you base your algorithm on a wrong specification.

It's the same reason that (for example) http://www.freebsd.org/cgi/cvsweb.cgi/src/etc/rc.d/ntpd lists a revision 1.335 even though the most recent commit was version 1.18.

The hack works well in this case too. I prefer a hack to a confusing answer.

Henri

On the upside, it doesn't appear that these bogus commits have ended up replicated in the SVN repository.

Gavin
Stable from May 31 - zfs list locked
Hello,

I have hit this problem for the second time. The system is working perfectly well, but suddenly the command `zfs list' stops working and can't be killed. Here is a procstat of the culprit:

[r...@morzine ~]# procstat -k 91766
  PID    TID COMM             TDNAME           KSTACK
91766 100490 zfs              -                mi_switch sleepq_switch sleepq_wait _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir zap_lookup_norm zap_lookup dsl_prop_get_dd dsl_dataset_get_ref dsl_dataset_hold dmu_objset_open zfs_ioc_objset_stats zfsdev_ioctl devfs_ioctl_f kern_ioctl

The same thing happens if I try to run `zpool list' in another terminal.

Henri
Re: Stable from May 31 - zfs list locked
Henri Hennebert wrote:

Hello,

I have hit this problem for the second time. The system is working perfectly well, but suddenly the command `zfs list' stops working and can't be killed. Here is a procstat of the culprit:

[r...@morzine ~]# procstat -k 91766
  PID    TID COMM             TDNAME           KSTACK
91766 100490 zfs              -                mi_switch sleepq_switch sleepq_wait _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir zap_lookup_norm zap_lookup dsl_prop_get_dd dsl_dataset_get_ref dsl_dataset_hold dmu_objset_open zfs_ioc_objset_stats zfsdev_ioctl devfs_ioctl_f kern_ioctl

The same thing happens if I try to run `zpool list' in another terminal.

Strangely, zfs snapshot and zfs destroy seem to be working properly ... I will reboot to check this.

Henri
Re: ZFS booting without partitions
Kip Macy wrote: On Mon, Jun 1, 2009 at 10:21 AM, Adam McDougall mcdou...@egr.msu.edu wrote: I'm thinking that too. I spent some time taking stabs at figuring it out yesterday but didn't get anywhere useful. I did try compiling the -current src/sys/boot tree on 7.2 after a couple header tweaks to make it compile but the loader still didn't work. The working loader is the same file size as the broken loader unless it was compiled on i386 and then it is ~30k bigger for some reason (it shrinks to the same size as the rest if I force it to use the same 32bit compilation flags as used on amd64). Just mentioning this in case it saves someone else some time. I'm real pleased it works at all. If someone has the time to track down the differences I'll MFC them. I'm not using ZFS boot at the moment so I have no way of testing. At last I get this F.G diff!!! The problem was in libstand.a. By the way , the patch also take into account the update of Doug Rabson to answer my problem with too many devices / pools. Happy to help on this one. 
Cheers, Kip

--- lib/libstand/stand.h.orig 2007-01-09 02:02:04.0 +0100
+++ lib/libstand/stand.h 2009-06-03 17:24:42.627552341 +0200
@@ -167,7 +167,7 @@
 #define SOPEN_RASIZE 512
 };
 
-#define SOPEN_MAX 8
+#define SOPEN_MAX 64
 extern struct open_file files[];
 
 /* f_flags values */
--- lib/libstand/nfs.c.orig 2004-01-21 21:12:23.0 +0100
+++ lib/libstand/nfs.c 2009-06-05 20:36:26.001368421 +0200
@@ -29,7 +29,7 @@
  */
 
 #include <sys/cdefs.h>
-__FBSDID("$FreeBSD: src/lib/libstand/nfs.c,v 1.12 2004/01/21 20:12:23 jhb Exp $");
+__FBSDID("$FreeBSD: src/lib/libstand/nfs.c,v 1.14 2008/11/21 09:14:29 luigi Exp $");
 
 #include <sys/param.h>
 #include <sys/time.h>
@@ -405,16 +405,23 @@
 #ifdef NFS_DEBUG
 	if (debug)
-		printf("nfs_open: %s (rootpath=%s)\n", path, rootpath);
+		printf("nfs_open: %s (rootpath=%s)\n", upath, rootpath);
 #endif
 	if (!rootpath[0]) {
 		printf("no rootpath, no nfs\n");
 		return (ENXIO);
 	}
 
+	/*
+	 * This is silly - we should look at dv_type but that value is
+	 * arch dependant and we can't use it here.
+	 */
 #ifndef __i386__
 	if (strcmp(f->f_dev->dv_name, "net") != 0)
 		return(EINVAL);
+#else
+	if (strcmp(f->f_dev->dv_name, "pxe") != 0)
+		return(EINVAL);
 #endif
 
 	if (!(desc = socktodesc(*(int *)(f->f_devdata
Re: /boot/loader can't load kernel if too many pool/devices
Doug Rabson wrote:
On 1 Jun 2009, at 11:22, Henri Hennebert wrote:

Hello,

During my (successful) tests to boot directly from ZFS (with zfsboot and gptzfsboot) I encounter the error "can't boot 'kernel'" if too many devices/pools are connected to the machine.

In my case:
2 SAS disks with 2 pools
2 SATA disks with 2 pools
1 USB key with one pool

`heap` command:
Active Allocations: 171/173
536576 bytes reserved
527800 bytes allocated

`ls` command:
open '/' failed: too many open files

If I reboot without the USB key all is OK.
If I reboot from the USB key after disconnecting 2 disks all is OK.

By the way, the /boot/loader in 7.2-STABLE doesn't work; it complains about forth not found. The previous tests were made with 7.2-STABLE (May 31) with /boot/loader from 8.0-CURRENT.

I recently increased the number of file descriptors available for /boot/loader. Could you rebuild and try again please. Make sure you rebuild libstand.a as well as /boot/loader.

OK - I can boot with the USB key and 4 disks.

Thanks

Henri
Re: ZFS booting without partitions
Lorenzo Perone wrote: Hi, I tried hard... but without success ;( the result is, when choosing the disk with the zfs boot sectors in it (in my case F5, which goes to ad6), the kernel is not found. the console shows: forth not found definitions not found only not found (the above repeated several times) This is the file /boot/loader from 7.2-STABLE which is wrong. You can find a copy from 8.0-CURRENT and a script that I tested on a USB key) and is running for me: http://verbier.restart.be/xfer/boot-zfs/ Put this directory somewhere, eg /tmp/boot-zfs and run the script eg: `cd /tmp/boot-zfs sh -x make_usb_key.sh da6 kingston` good luck Henri can't load 'kernel' and I get thrown to the loader prompt. lsdev does not show any ZFS devices. Strange thing: if I boot from the other disk, F1, which is my ad4 containing the normal ufs system I used to make up the other one, and escape to the loader prompt, lsdev actually sees the zpool which is on the other disk, and shows: zfs0: tank I tried booting with boot zfs:tank or zfs:tank:/boot/kernel/kernel, but there I get the panic: free: guard1 fail message. (would boot zfs:tank:/boot/kernel/kernel be correct, anyways?) Sure I'm doing something wrong, but what...? Is it a problem that the pool is made out of the second disk only (ad6)? Here are my details (note: latest stable and biosdisk.c merged with changes shown in r185095. no problems in buildworld/kernel): snip Machine: p4 4GHz 4 GB RAM (i386) Note: the pool has actually a different name (heidi instead of tank, if this can be of any relevance...), just using tank here as it's one of the conventions... mount (just to show my starting situation) /dev/mirror/gm0s1a on / (ufs, local) devfs on /dev (devfs, local) /dev/mirror/gm0s1e on /tmp (ufs, local, soft-updates) /dev/mirror/gm0s1f on /usr (ufs, local, soft-updates) /dev/mirror/gm0s1d on /var (ufs, local, soft-updates) gmirror status NameStatus Components mirror/gm0 DEGRADED ad4 (ad6 used to be the second disk...) 
echo 'LOADER_ZFS_SUPPORT=yes' /etc/make.conf cd /usr/src make buildworld make buildkernel KERNCONF=HEIDI make installkernel KERNCONF=HEIDI mergemaster make installworld shutdown -r now dd if=/dev/zero of=/dev/ad6 bs=512 count=32 zpool create tank ad6 zfs create tank/usr zfs create tank/var zfs create -V 4gb tank/swap zfs set org.freebsd:swap=on tank/swap zpool set bootfs=tank tank rsync -avx / /tank rsync -avx /usr/ /tank/usr rsync -avx /var/ /tank/var cd /usr/src make installkernel KERNCONF=HEIDI DESTDIR=/tank zpool export tank dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1 dd if=/boot/zfsboot of=/dev/ad6 bs=512 skip=1 seek=1024 zpool import tank zfs set mountpoint=legacy tank zfs set mountpoint=/usr tank/usr zfs set mountpoint=/var tank/var shutdown -r now ... at the 'mbr prompt' I pressed F5 (the second disk, ad6) .. as written above, loader gets loaded (at this stage I suppose it's the stuff dd't after block 1024?), but kernel not found. /usr/src/sys/i386/conf/HEIDI: (among other things...): options KVA_PAGES=512 (/tank)/boot/loader.conf: vm.kmem_size=1024M vm.kmem_size_max=1024M vfs.zfs.arc_max=128M vfs.zfs.vdev.cache.size=8M vfs.root.mountfrom=zfs:tank (/tank)/etc/fstab: # DeviceMountpointFStypeOptionsDumpPass# tank/zfsrw00 /dev/acd0/cdromcd9660ro,noauto00 /snap any help is welcome... don't know where to go from here right now. BTW: I can't stop thanking the team for the incredible pace at which bugs are fixed these days! 
Regards, Lorenzo On 26.05.2009, at 18:42, George Hartzell wrote: Andriy Gapon writes: on 26/05/2009 19:21 George Hartzell said the following: Dmitry Morozovsky writes: On Tue, 26 May 2009, Mickael MAILLOT wrote: MM Hi, MM MM i prefere use zfsboot boot sector, an example is better than a long talk: MM MM $ zpool create tank mirror ad4 ad6 MM $ zpool export tank MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=1 MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1 MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 skeep=1 seek=1024 MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 skeep=1 seek=1024 s/skeep/skip/ ? ;-) What is the reason for copying zfsboot one bit at a time, as opposed to dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=2 seek=1024 for the second part? and no 'count=1' for it? :-) [Just guessing] Apparently the first block of zfsboot is some form of MBR and the rest is zfs-specific code that goes to magical sector 1024. Ok, I managed to read the argument to seek as one block, apparently my coffee hasn't hit yet. I'm still confused about the two parts of zfsboot and what's magical about seeking to 1024. g. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
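The skip/seek question raised above can be tried out safely on ordinary files. The sketch below only demonstrates the dd mechanics (skip counts blocks on the input, seek counts blocks on the output); it says nothing about what zfsboot itself contains. The file names are hypothetical stand-ins for /boot/zfsboot and the target disk, and conv=notrunc is added only because the output here is a regular file, which dd would otherwise truncate.

```shell
#!/bin/sh
# Sketch only: regular files stand in for /boot/zfsboot and /dev/ad6.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
boot="$work/zfsboot.img"   # hypothetical stand-in for /boot/zfsboot
disk="$work/disk.img"      # hypothetical stand-in for the target disk

# Fake boot file: block 0 is filled with 'A', blocks 1-2 with 'B'.
{
    head -c 512 /dev/zero | tr '\0' 'A'
    head -c 1024 /dev/zero | tr '\0' 'B'
} > "$boot"

# Fake disk: 2048 zero-filled sectors of 512 bytes.
dd if=/dev/zero of="$disk" bs=512 count=2048 2>/dev/null

# Part 1: count=1 copies only the first input block, to output sector 0.
dd if="$boot" of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null

# Part 2: skip=1 drops the first input block; seek=1024 moves the
# write position to output sector 1024 before copying the rest.
dd if="$boot" of="$disk" bs=512 skip=1 seek=1024 conv=notrunc 2>/dev/null

# Inspect the result: first byte of sector 0, first byte of sector 1024,
# and the number of non-zero bytes in the untouched sector 1.
first=$(dd if="$disk" bs=512 count=1 2>/dev/null | head -c 1)
second=$(dd if="$disk" bs=512 skip=1024 count=1 2>/dev/null | head -c 1)
gap=$(dd if="$disk" bs=512 skip=1 count=1 2>/dev/null | tr -d '\0' | wc -c | tr -d ' ')
echo "sector 0 starts with: $first"
echo "sector 1024 starts with: $second"
echo "non-zero bytes in sector 1: $gap"
```

Sector 0 ends up holding the first block of the input, sector 1024 onward holds the remainder, and everything in between is left alone, which matches the guess in the thread that the file is written in two separate pieces.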
/boot/loader can't load kernel if too many pool/devices
Hello,

During my (successful) tests to boot directly from ZFS (with zfsboot and gptzfsboot) I encounter the error "can't boot 'kernel'" if too many devices/pools are connected to the machine.

In my case:
2 SAS disks with 2 pools
2 SATA disks with 2 pools
1 USB key with one pool

`heap` command:
Active Allocations: 171/173
536576 bytes reserved
527800 bytes allocated

`ls` command:
open '/' failed: too many open files

If I reboot without the USB key all is OK.
If I reboot from the USB key after disconnecting 2 disks all is OK.

By the way, the /boot/loader in 7.2-STABLE doesn't work; it complains about forth not found. The previous tests were made with 7.2-STABLE (May 31) with /boot/loader from 8.0-CURRENT.

Henri
Re: ZFS booting without partitions
Henri Hennebert wrote:

Lorenzo Perone wrote:

Hi,

I tried hard... but without success ;(

The result is, when choosing the disk with the zfs boot sectors in it (in my case F5, which goes to ad6), the kernel is not found. The console shows:

forth not found
definitions not found
only not found
(the above repeated several times)

This is the file /boot/loader from 7.2-STABLE which is wrong. You can find a copy from 8.0-CURRENT, and a script (which I tested on a USB key and which works for me), here:

http://verbier.restart.be/xfer/boot-zfs/

Put this directory somewhere, e.g. /tmp/boot-zfs, and run the script, e.g.:

cd /tmp/boot-zfs
sh -x make_usb_key.sh da6 kingston

good luck

CAVEAT: The script puts tuning in '/boot/loader.conf' which implies options KVA_PAGES=384 in my i386 kernel.

Henri

Henri

can't load 'kernel'

and I get thrown to the loader prompt. lsdev does not show any ZFS devices.

Strange thing: if I boot from the other disk, F1, which is my ad4 containing the normal ufs system I used to make up the other one, and escape to the loader prompt, lsdev actually sees the zpool which is on the other disk, and shows:

zfs0: tank

I tried booting with boot zfs:tank or zfs:tank:/boot/kernel/kernel, but there I get the panic: free: guard1 fail message. (Would boot zfs:tank:/boot/kernel/kernel be correct, anyways?)

Sure I'm doing something wrong, but what...? Is it a problem that the pool is made out of the second disk only (ad6)?

Here are my details (note: latest stable and biosdisk.c merged with changes shown in r185095; no problems in buildworld/kernel):

snip

Machine: p4 4GHz 4 GB RAM (i386)

Note: the pool actually has a different name (heidi instead of tank, if this can be of any relevance...), just using tank here as it's one of the conventions...
mount (just to show my starting situation)

/dev/mirror/gm0s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/mirror/gm0s1e on /tmp (ufs, local, soft-updates)
/dev/mirror/gm0s1f on /usr (ufs, local, soft-updates)
/dev/mirror/gm0s1d on /var (ufs, local, soft-updates)

gmirror status

Name        Status    Components
mirror/gm0  DEGRADED  ad4

(ad6 used to be the second disk...)

echo 'LOADER_ZFS_SUPPORT=yes' >> /etc/make.conf
cd /usr/src
make buildworld
make buildkernel KERNCONF=HEIDI
make installkernel KERNCONF=HEIDI
mergemaster
make installworld
shutdown -r now

dd if=/dev/zero of=/dev/ad6 bs=512 count=32
zpool create tank ad6
zfs create tank/usr
zfs create tank/var
zfs create -V 4gb tank/swap
zfs set org.freebsd:swap=on tank/swap
zpool set bootfs=tank tank

rsync -avx / /tank
rsync -avx /usr/ /tank/usr
rsync -avx /var/ /tank/var
cd /usr/src
make installkernel KERNCONF=HEIDI DESTDIR=/tank

zpool export tank
dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
dd if=/boot/zfsboot of=/dev/ad6 bs=512 skip=1 seek=1024
zpool import tank
zfs set mountpoint=legacy tank
zfs set mountpoint=/usr tank/usr
zfs set mountpoint=/var tank/var
shutdown -r now

... at the 'mbr prompt' I pressed F5 (the second disk, ad6) ..

As written above, the loader gets loaded (at this stage I suppose it's the stuff dd't after block 1024?), but the kernel is not found.

/usr/src/sys/i386/conf/HEIDI (among other things...):

options KVA_PAGES=512

(/tank)/boot/loader.conf:

vm.kmem_size=1024M
vm.kmem_size_max=1024M
vfs.zfs.arc_max=128M
vfs.zfs.vdev.cache.size=8M
vfs.root.mountfrom=zfs:tank

(/tank)/etc/fstab:

# Device    Mountpoint  FStype  Options    Dump  Pass#
tank        /           zfs     rw         0     0
/dev/acd0   /cdrom      cd9660  ro,noauto  0     0

/snap

Any help is welcome... I don't know where to go from here right now.

BTW: I can't stop thanking the team for the incredible pace at which bugs are fixed these days!
Regards,

Lorenzo

On 26.05.2009, at 18:42, George Hartzell wrote:

Andriy Gapon writes:

on 26/05/2009 19:21 George Hartzell said the following:

Dmitry Morozovsky writes:

On Tue, 26 May 2009, Mickael MAILLOT wrote:

MM Hi,
MM
MM I prefer to use the zfsboot boot sector; an example is better than a long talk:
MM
MM $ zpool create tank mirror ad4 ad6
MM $ zpool export tank
MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=1
MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 skeep=1 seek=1024
MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 skeep=1 seek=1024

s/skeep/skip/ ? ;-)

What is the reason for copying zfsboot one bit at a time, as opposed to

dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=2 seek=1024

for the second part?

and no 'count=1' for it? :-)

[Just guessing] Apparently the first block of zfsboot is some form of MBR and the rest is zfs-specific code that goes to magical sector 1024.

Ok, I managed to read the argument to seek as one block; apparently my coffee hasn't hit yet.

I'm still confused about the two parts of zfsboot and what's magical about seeking to 1024.

g.
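For anyone else puzzled by the skip/seek pair discussed above, here is a minimal sketch that replays the two dd steps against scratch files instead of real disks. The file names, the fill bytes 'A' and 'B', and the three-block size of the stand-in image are assumptions made for illustration only (the real /boot/zfsboot is larger), and conv=notrunc is added only because the target here is a regular file rather than a device.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
boot="$work/zfsboot.img"   # hypothetical stand-in for /boot/zfsboot
disk="$work/disk.img"      # hypothetical stand-in for /dev/ad4

# Build a 3-block image: block 0 is all 'A', blocks 1-2 are all 'B'.
dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\0' 'A' >  "$boot"
dd if=/dev/zero bs=512 count=2 2>/dev/null | tr '\0' 'B' >> "$boot"

# Step 1: copy only the first 512-byte block to the start of the disk.
dd if="$boot" of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null

# Step 2: skip=1 drops the first input block; seek=1024 starts writing
# at output block 1024, so the rest of the image lands at byte 524288.
dd if="$boot" of="$disk" bs=512 skip=1 seek=1024 conv=notrunc 2>/dev/null

# Inspect the result: first byte of block 0, first byte of block 1024, total size.
first_byte=$(dd if="$disk" bs=1 count=1 2>/dev/null)
byte_at_1024=$(dd if="$disk" bs=1 skip=524288 count=1 2>/dev/null)
size=$(( $(wc -c < "$disk") ))

echo "block 0 starts with: $first_byte"
echo "block 1024 starts with: $byte_at_1024"
echo "image size in bytes: $size"

rm -rf "$work"
```

In this sketch skip counts blocks on the input side and seek counts blocks on the output side, which is why the first block is copied once to block 0 and everything after it is copied separately to block 1024. The scratch image ends up (1024 + 2) * 512 bytes long.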
Re: libzpool assert vs libc assert
Andriy Gapon wrote:

on 29/05/2009 15:35 Andriy Gapon said the following:

So anyone else feels that this is a bug?

on 28/05/2009 16:55 Andriy Gapon said the following:

on 28/05/2009 16:26 Henri Hennebert said the following:

(gdb) bt
#0 0x0008012a6f22 in strlen () from /lib/libc.so.7
#1 0x0008012a0feb in open () from /lib/libc.so.7
#2 0x00080129ea59 in open () from /lib/libc.so.7
#3 0x0008012a1f2e in vfprintf () from /lib/libc.so.7
#4 0x000801291158 in fprintf () from /lib/libc.so.7
#5 0x000801290fb0 in __assert () from /lib/libc.so.7

I find the above part interesting. Could this be because of the following discrepancy:

1) cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:

extern void __assert(const char *, const char *, int);

2) lib/libc/gen/assert.c:

void
__assert(func, file, line, failedexpr)
const char *func, *file;
int line;
const char *failedexpr;

#6 0x000800fef120 in zmutex_destroy () from /lib/libzpool.so.1
#7 0x00080102e1a0 in dsl_dataset_fast_stat () from /lib/libzpool.so.1
#8 0x000801045ffa in dbuf_find () from /lib/libzpool.so.1
#9 0x000801047bf3 in dmu_buf_rele () from /lib/libzpool.so.1
#10 0x000801027546 in dsl_pool_open () from /lib/libzpool.so.1
#11 0x00080101bcec in spa_create () from /lib/libzpool.so.1
#12 0x00080101c820 in spa_tryimport () from /lib/libzpool.so.1

I propose the following patch for this issue. It fixes the mismatch between the __assert extern declaration in the zfs code and the actual signature in the libc code. I also took the liberty of dropping the __STDC__ and __STDC_VERSION__ checks. I think that those checks are not needed with compilers that can be used to compile FreeBSD. Besides, both branches of the __STDC_VERSION__ check were exactly the same.

Henri, if you still experience that crash of the zpool command, could you please try the patch and see if you have a nicer assert message and stacktrace now?

Sorry that this is still not a fix for the real issue.
diff --git a/cddl/contrib/opensolaris/head/assert.h b/cddl/contrib/opensolaris/head/assert.h
index 394820a..c2a4936 100644
--- a/cddl/contrib/opensolaris/head/assert.h
+++ b/cddl/contrib/opensolaris/head/assert.h
@@ -37,15 +37,7 @@ extern "C" {
 #endif
-#if defined(__STDC__)
-#if __STDC_VERSION__ - 0 >= 199901L
-extern void __assert(const char *, const char *, int);
-#else
-extern void __assert(const char *, const char *, int);
-#endif /* __STDC_VERSION__ - 0 >= 199901L */
-#else
-extern void _assert();
-#endif
+extern void __assert(const char *, const char *, int, const char *);
 #ifdef __cplusplus
 }
@@ -68,14 +60,6 @@ extern void _assert();
 #else
-#if defined(__STDC__)
-#if __STDC_VERSION__ - 0 >= 199901L
-#define assert(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#else
-#define assert(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#endif /* __STDC_VERSION__ - 0 >= 199901L */
-#else
-#define assert(EX) (void)((EX) || (_assert("EX", __FILE__, __LINE__), 0))
-#endif /* __STDC__ */
+#define assert(EX) (void)((EX) || (__assert(__func__, __FILE__, __LINE__, #EX), 0))
 #endif /* NDEBUG */
diff --git a/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h b/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h
index 7ae7f9d..631e302 100644
--- a/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h
+++ b/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h
@@ -120,21 +120,12 @@ extern void vpanic(const char *, __va_list);
 #define fm_panic panic
 /* This definition is copied from assert.h. */
-#if defined(__STDC__)
-#if __STDC_VERSION__ - 0 >= 199901L
-#define verify(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#else
-#define verify(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#endif /* __STDC_VERSION__ - 0 >= 199901L */
-#else
-#define verify(EX) (void)((EX) || (_assert("EX", __FILE__, __LINE__), 0))
-#endif /* __STDC__ */
-
+#define verify(EX) (void)((EX) || (__assert(__func__, __FILE__, __LINE__, #EX), 0))
 #define VERIFY verify
 #define ASSERT assert
-extern void __assert(const char *, const char *, int);
+extern void __assert(const char *, const char *, int, const char *);
 #ifdef lint
 #define VERIFY3_IMPL(x, y, z, t) if (x == z) ((void)0)
@@ -148,7 +139,7 @@ extern void __assert(const char *, const char *, int);
 (void) snprintf(__buf, 256, "%s %s %s (0x%llx %s 0x%llx)", \
 #LEFT, #OP, #RIGHT, \
 (u_longlong_t)__left, #OP, (u_longlong_t)__right); \
- __assert(__buf, __FILE__, __LINE__); \
+ __assert(__func__, __FILE__, __LINE__, __buf); \
 } \
 _NOTE(CONSTCOND) } while (0)
 /* END CSTYLED */

Here is the new bt after the patch

[r...@avoriaz libzpool]# gdb zdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB
Re: ZFS MFC heads down
Kip Macy wrote:

Please try applying this change to your tree and let me know.

I patched and rebooted 2 times without problem. I will keep you posted if I encounter a new crash.

Thanks

Henri

Thanks,
Kip

http://svn.freebsd.org/viewvc/base?view=revision&revision=193110

On Sat, May 30, 2009 at 2:11 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

I get a Fatal trap 12: page fault while in kernel mode at shutdown.

The core.txt is http://verbier.restart.be/xfer/core.txt.61

Thanks for your work

Henri

Thanks,
Kip
Re: Problem with postfix and mail command
Ruben Lara wrote:

Hi all!

I just installed postfix, after building world without sendmail. If I try to send mail I get:

mail# mail aaa
Subject: a
a
.
EOT
mail# mail: /usr/sbin/sendmail: No such file or directory

Even with WITHOUT_SENDMAIL=yes in /etc/src.conf, make installworld must create this symbolic link:

# ls -l /usr/sbin/sendmail
lrwxr-xr-x 1 root wheel 21 May 21 13:54 /usr/sbin/sendmail -> /usr/sbin/mailwrapper

Henri

I edited:

mail# cat /etc/mail/mailer.conf
#
# Execute the Postfix sendmail program, named /usr/local/sbin/sendmail
#
sendmail    /usr/local/sbin/sendmail
send-mail   /usr/local/sbin/sendmail
mailq       /usr/local/sbin/sendmail
newaliases  /usr/local/sbin/sendmail
mail#

where I actually have my postfix executables.

Thanks for help in advance

Rubén Lara
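As a rough illustration of the point about the missing link, the sketch below recreates a sendmail -> mailwrapper symlink only when nothing exists at that path yet. The helper name ensure_symlink and the scratch directory are assumptions of mine, not something from the thread; the sketch deliberately works under a temporary directory so that nothing under the real /usr/sbin is touched.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: create link -> target unless something already exists at link.
ensure_symlink() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already present: $link"
    else
        ln -s "$target" "$link"
        echo "created: $link -> $target"
    fi
}

# Scratch directory standing in for the real root.
root=$(mktemp -d)
mkdir -p "$root/usr/sbin"

ensure_symlink /usr/sbin/mailwrapper "$root/usr/sbin/sendmail"

# Read back where the link points, as 'ls -l' would show it.
link_target=$(readlink "$root/usr/sbin/sendmail")
echo "sendmail points to: $link_target"

rm -rf "$root"
```

On a real system the second argument would be /usr/sbin/sendmail itself. Whether recreating the link by hand is preferable to re-running make installworld is a judgment call that the thread does not settle.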
Re: ZFS MFC heads down
Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

I get a Fatal trap 12: page fault while in kernel mode at shutdown.

The core.txt is http://verbier.restart.be/xfer/core.txt.61

Thanks for your work

Henri

Thanks,
Kip
Re: ZFS MFC heads down
Kip Macy wrote: On Wed, May 27, 2009 at 11:04 AM, Artem Belevich fbsdl...@src.cx wrote: I had the same problem on -current. Try attached patch. It may not apply cleanly on -stable, but should be easy enough to make equivalent changes on -stable. --Artem Adding to rw_init looks fine, but I'd rather find out why owner isn't NULL when the calling convention expects it. Getting a backtrace from where the assert is hit would be helpful. -Kip on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May 25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64 Is it useful ? [r...@avoriaz ~]# gdb zdb GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as amd64-marcel-freebsd...(no debugging symbols found)... 
(gdb) r rpool Starting program: /usr/sbin/zdb rpool (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...[New LWP 100343] (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...[New Thread 0x8018020b0 (LWP 100343)] [New Thread 0x801802240 (LWP 100346)] version=13 name='rpool' state=0 txg=3467 pool_guid=536117255064806899 hostid=1133576597 hostname='unset' vdev_tree type='root' id=0 guid=536117255064806899 children[0] type='mirror' id=0 guid=3124217685892976292 metaslab_array=23 metaslab_shift=30 ashift=9 asize=155741847552 is_log=0 children[0] type='disk' id=0 guid=11099413743436480159 path='/dev/ad4p2' whole_disk=0 children[1] type='disk' id=1 guid=12724983687805955432 path='/dev/ad6p2' whole_disk=0 [New Thread 0x8018023d0 (LWP 100347)] [New Thread 0x801802560 (LWP 100354)] [New Thread 0x8018026f0 (LWP 100355)] [New Thread 0x801802880 (LWP 100356)] [New Thread 0x801802a10 (LWP 100359)] [New Thread 0x801802ba0 (LWP 100360)] [New Thread 0x801802d30 (LWP 100368)] [New Thread 0x801802ec0 (LWP 100369)] [New Thread 0x801803050 (LWP 100370)] [New Thread 0x8018031e0 (LWP 100371)] [New Thread 0x801803370 (LWP 100372)] [New Thread 0x801803500 (LWP 100373)] [New Thread 0x801803690 (LWP 100374)] [New Thread 0x801803820 (LWP 100375)] [New Thread 0x8018039b0 (LWP 100376)] [New Thread 0x801803b40 (LWP 100377)] [New Thread 0x801803cd0 (LWP 100378)] [New Thread 0x801803e60 (LWP 100379)] [New Thread 0x801803ff0 (LWP 100380)] [New Thread 0x801804180 (LWP 100381)] [New Thread 0x801804310 (LWP 100382)] [New Thread 0x8018044a0 (LWP 100383)] [New Thread 0x801804630 (LWP 100384)] [New Thread 0x8018047c0 (LWP 100385)] [New Thread 
0x801804950 (LWP 100386)] [New Thread 0x801804ae0 (LWP 100387)] Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x8018020b0 (LWP 100343)] 0x0008012a6f22 in strlen () from /lib/libc.so.7 (gdb) bt #0 0x0008012a6f22 in strlen () from /lib/libc.so.7 #1 0x0008012a0feb in open () from /lib/libc.so.7 #2 0x00080129ea59 in open () from /lib/libc.so.7 #3 0x0008012a1f2e in vfprintf () from /lib/libc.so.7 #4 0x000801291158 in fprintf () from /lib/libc.so.7 #5 0x000801290fb0 in __assert () from /lib/libc.so.7 #6 0x000800fef120 in zmutex_destroy () from /lib/libzpool.so.1 #7 0x00080102e1a0 in dsl_dataset_fast_stat () from /lib/libzpool.so.1 #8 0x000801045ffa in dbuf_find () from /lib/libzpool.so.1 #9 0x000801047bf3 in dmu_buf_rele () from /lib/libzpool.so.1 #10 0x000801027546 in dsl_pool_open () from /lib/libzpool.so.1 #11 0x00080101bcec in spa_create () from /lib/libzpool.so.1 #12 0x00080101c820 in spa_tryimport () from /lib/libzpool.so.1 #13 0x00408b41 in ?? () #14 0x004036de in ?? () #15 0x000800534000 in ?? () #16 0x in ?? () #17 0x0002 in ?? () #18 0x7fffed70 in ?? () #19 0x7fffed7e in ?? () #20 0x in ?? () #21 0x7fffed84 in ?? () #22 0x7fffed9a in ?? () #23 0x7fffeda5 in ?? () #24 0x7fffedbf in ?? () #25 0x7fffedea in ?? () #26
Re: ZFS MFC heads down
Andriy Gapon wrote:

on 28/05/2009 16:26 Henri Hennebert said the following:

(gdb) bt
#0 0x0008012a6f22 in strlen () from /lib/libc.so.7
#1 0x0008012a0feb in open () from /lib/libc.so.7
#2 0x00080129ea59 in open () from /lib/libc.so.7
#3 0x0008012a1f2e in vfprintf () from /lib/libc.so.7
#4 0x000801291158 in fprintf () from /lib/libc.so.7
#5 0x000801290fb0 in __assert () from /lib/libc.so.7

I find the above part interesting. Could this be because of the following discrepancy:

1) cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:

extern void __assert(const char *, const char *, int);

2) lib/libc/gen/assert.c:

void
__assert(func, file, line, failedexpr)
const char *func, *file;
int line;
const char *failedexpr;

#6 0x000800fef120 in zmutex_destroy () from /lib/libzpool.so.1
#7 0x00080102e1a0 in dsl_dataset_fast_stat () from /lib/libzpool.so.1
#8 0x000801045ffa in dbuf_find () from /lib/libzpool.so.1
#9 0x000801047bf3 in dmu_buf_rele () from /lib/libzpool.so.1
#10 0x000801027546 in dsl_pool_open () from /lib/libzpool.so.1
#11 0x00080101bcec in spa_create () from /lib/libzpool.so.1
#12 0x00080101c820 in spa_tryimport () from /lib/libzpool.so.1

But back to the problem: without an additional printf we still can not tell what the value in m_owner was, only that it was not null. It's probably better to build with debugging symbols and examine with gdb.

First try:

[r...@avoriaz libzpool]# gdb zdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as amd64-marcel-freebsd...(no debugging symbols found)...
(gdb) r pool1 Starting program: /usr/sbin/zdb pool1 (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...[New LWP 100299] (no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...[New Thread 0x8018020b0 (LWP 100299)] [New Thread 0x801802240 (LWP 100354)] version=13 name='pool1' state=0 txg=4 pool_guid=9156958376606789 hostid=1133576597 hostname='unset' vdev_tree type='root' id=0 guid=9156958376606789 children[0] type='raidz' id=0 guid=8214939615613279020 nparity=1 metaslab_array=23 metaslab_shift=32 ashift=9 asize=500108886016 is_log=0 children[0] type='disk' id=0 guid=7001907692988243779 path='/dev/ad8p2' whole_disk=0 children[1] type='disk' id=1 guid=1909032920962573263 path='/dev/ad10p2' whole_disk=0 [New Thread 0x8018023d0 (LWP 100369)] [New Thread 0x801802560 (LWP 100370)] [New Thread 0x8018026f0 (LWP 100371)] [New Thread 0x801802880 (LWP 100372)] [New Thread 0x801802a10 (LWP 100376)] [New Thread 0x801802ba0 (LWP 100382)] [New Thread 0x801802d30 (LWP 100383)] [New Thread 0x801802ec0 (LWP 100384)] [New Thread 0x801803050 (LWP 100385)] [New Thread 0x8018031e0 (LWP 100386)] [New Thread 0x801803370 (LWP 100387)] [New Thread 0x801803500 (LWP 100388)] [New Thread 0x801803690 (LWP 100389)] [New Thread 0x801803820 (LWP 100390)] [New Thread 0x8018039b0 (LWP 100391)] [New Thread 0x801803b40 (LWP 100392)] [New Thread 0x801803cd0 (LWP 100393)] [New Thread 0x801803e60 (LWP 100394)] [New Thread 0x801803ff0 (LWP 100395)] [New Thread 0x801804180 (LWP 100396)] [New Thread 0x801804310 (LWP 100397)] [New Thread 0x8018044a0 (LWP 100398)] [New Thread 0x801804630 (LWP 100399)] [New Thread 0x8018047c0 (LWP 100400)] [New Thread 0x801804950 (LWP 100401)] [New Thread 0x801804ae0 (LWP 100402)] Program received signal SIGSEGV, Segmentation fault. 
[Switching to Thread 0x8018020b0 (LWP 100299)] 0x0008012a6f22 in strlen () from /lib/libc.so.7 (gdb) bt #0 0x0008012a6f22 in strlen () from /lib/libc.so.7 #1 0x0008012a0feb in open () from /lib/libc.so.7 #2 0x00080129ea59 in open () from /lib/libc.so.7 #3 0x0008012a1f2e in vfprintf () from /lib/libc.so.7 #4 0x000801291158 in fprintf () from /lib/libc.so.7 #5 0x000801290fb0 in __assert () from /lib/libc.so.7 #6 0x000800fef230 in zmutex_destroy (mp=0x8018b2cc0) at /usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c:112 #7 0x00080102e2b0
Re: ZFS MFC heads down
Henri Hennebert wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

--- clipped ---

By the way, to help prepare a boot/root pool, does a utility to display the content of zpool.cache exist?

I found the answer to this question and think it may be really useful to others:

zdb -C [ -U path to zpool.cache ]

Henri

Henri

Thanks,
Kip
Re: ZFS MFC heads down
Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

Not a real problem but maybe worth mentioning:

on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26 15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE i386

[r...@morzine ~]# zdb rpool
    version=13
    name='rpool'
    state=0
    txg=959
    pool_guid=17669857244588609348
    hostid=2315842372
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=17669857244588609348
        children[0]
            type='mirror'
            id=0
            guid=3225603179255348056
            metaslab_array=23
            metaslab_shift=28
            ashift=9
            asize=51534888960
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=17573085726489368265
                path='/dev/da0p2'
                whole_disk=0
            children[1]
                type='disk'
                id=1
                guid=2736169600077218893
                path='/dev/da1p2'
                whole_disk=0
Assertion failed: (?Àuè?ëÛ´), function mp->m_owner == NULL, file /usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c, line 112.
Abort trap: 6

and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May 25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64

[r...@avoriaz ~]# zdb rpool
    version=13
    name='rpool'
    state=0
    txg=3467
    pool_guid=536117255064806899
    hostid=1133576597
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=536117255064806899
        children[0]
            type='mirror'
            id=0
            guid=3124217685892976292
            metaslab_array=23
            metaslab_shift=30
            ashift=9
            asize=155741847552
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=11099413743436480159
                path='/dev/ad4p2'
                whole_disk=0
            children[1]
                type='disk'
                id=1
                guid=12724983687805955432
                path='/dev/ad6p2'
                whole_disk=0
Segmentation fault: 11

By the way, to help prepare a boot/root pool, does a utility to display the content of zpool.cache exist?

Henri

Thanks,
Kip
Re: ZFS MFC heads down
Artem Belevich wrote:

I had the same problem on -current. Try attached patch. It may not apply cleanly on -stable, but should be easy enough to make equivalent changes on -stable.

The patch is OK for stable. Now, for the pool with my root, I get:

[r...@morzine libzpool]# zdb rpool
    version=13
    name='rpool'
    state=0
    txg=959
    pool_guid=17669857244588609348
    hostid=2315842372
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=17669857244588609348
        children[0]
            type='mirror'
            id=0
            guid=3225603179255348056
            metaslab_array=23
            metaslab_shift=28
            ashift=9
            asize=51534888960
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=17573085726489368265
                path='/dev/da0p2'
                whole_disk=0
            children[1]
                type='disk'
                id=1
                guid=2736169600077218893
                path='/dev/da1p2'
                whole_disk=0
WARNING: pool 'rpool' could not be loaded as it was last accessed by another system (host: unset hostid: 0x8a08f344). See: http://www.sun.com/msg/ZFS-8000-EY
zdb: can't open rpool: No such file or directory

But rpool has been used for many boots now - strange ...

Thanks for your patch and time

Henri

--Artem

On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.
No a real problem but maybe worth mentioning: on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26 15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE i386 [r...@morzine ~]# zdb rpool version=13 name='rpool' state=0 txg=959 pool_guid=17669857244588609348 hostid=2315842372 hostname='unset' vdev_tree type='root' id=0 guid=17669857244588609348 children[0] type='mirror' id=0 guid=3225603179255348056 metaslab_array=23 metaslab_shift=28 ashift=9 asize=51534888960 is_log=0 children[0] type='disk' id=0 guid=17573085726489368265 path='/dev/da0p2' whole_disk=0 children[1] type='disk' id=1 guid=2736169600077218893 path='/dev/da1p2' whole_disk=0 Assertion failed: (?Ąuč? ėŪ¨´), function mp-m_owner == NULL, file /usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c, line 112. Abort trap: 6 and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May 25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64 [r...@avoriaz ~]# zdb rpool version=13 name='rpool' state=0 txg=3467 pool_guid=536117255064806899 hostid=1133576597 hostname='unset' vdev_tree type='root' id=0 guid=536117255064806899 children[0] type='mirror' id=0 guid=3124217685892976292 metaslab_array=23 metaslab_shift=30 ashift=9 asize=155741847552 is_log=0 children[0] type='disk' id=0 guid=11099413743436480159 path='/dev/ad4p2' whole_disk=0 children[1] type='disk' id=1 guid=12724983687805955432 path='/dev/ad6p2' whole_disk=0 Segmentation fault: 11 By the way, to help prepare a boot/root pool does a utility to display the content of zpool.cache exist ? Henri Thanks, Kip ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org ___ freebsd-stable@freebsd.org mailing list http
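One detail in the zdb output for morzine can be checked by hand: the WARNING prints the hostid in hex, while the pool label prints it in decimal. A small sketch of the conversion follows; the variable names are mine, and only the two numbers are taken from the output quoted in this thread.

```shell
#!/bin/sh
set -eu

label_hostid=2315842372      # hostid= value printed by zdb for rpool on morzine
warned_hostid=0x8a08f344     # hostid quoted in the WARNING line

# Convert the decimal label value to the hex notation used by the warning.
label_hex=$(printf '0x%x' "$label_hostid")
echo "label hostid in hex: $label_hex"

if [ "$label_hex" = "$warned_hostid" ]; then
    same_hostid=yes
else
    same_hostid=no
fi
echo "same as warned hostid: $same_hostid"
```

The two values are the same number, so the "other system" named in the warning carries the hostid that is already stored in the pool label. Which hostid zdb itself believed it was running under is not shown in the thread, so this comparison does not by itself explain the warning.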
Re: ZFS MFC heads down
Henri Hennebert wrote:

Artem Belevich wrote:

I had the same problem on -current. Try attached patch. It may not apply cleanly on -stable, but should be easy enough to make equivalent changes on -stable.

The patch is OK for stable. Now, for the pool with my root, I get:

[r...@morzine libzpool]# zdb rpool
    version=13
    name='rpool'
    state=0
    txg=959
    pool_guid=17669857244588609348
    hostid=2315842372
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=17669857244588609348
        children[0]
            type='mirror'
            id=0
            guid=3225603179255348056
            metaslab_array=23
            metaslab_shift=28
            ashift=9
            asize=51534888960
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=17573085726489368265
                path='/dev/da0p2'
                whole_disk=0
            children[1]
                type='disk'
                id=1
                guid=2736169600077218893
                path='/dev/da1p2'
                whole_disk=0
WARNING: pool 'rpool' could not be loaded as it was last accessed by another system (host: unset hostid: 0x8a08f344). See: http://www.sun.com/msg/ZFS-8000-EY
zdb: can't open rpool: No such file or directory

But rpool has been used for many boots now - strange ...

And dangerous: the second time I try:

[r...@morzine ~]# zdb rpool
zdb: can't open rpool: No such file or directory
[r...@morzine ~]#

And the real problem: rpool is no longer in /boot/zfs/zpool.cache !!!

The next boot will not work smoothly. Tomorrow, I will use the 3rd bootable disk to rebuild this.

Henri

Thanks for your patch and time

Henri

--Artem

On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.
No a real problem but maybe worth mentioning: on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26 15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE i386 [r...@morzine ~]# zdb rpool version=13 name='rpool' state=0 txg=959 pool_guid=17669857244588609348 hostid=2315842372 hostname='unset' vdev_tree type='root' id=0 guid=17669857244588609348 children[0] type='mirror' id=0 guid=3225603179255348056 metaslab_array=23 metaslab_shift=28 ashift=9 asize=51534888960 is_log=0 children[0] type='disk' id=0 guid=17573085726489368265 path='/dev/da0p2' whole_disk=0 children[1] type='disk' id=1 guid=2736169600077218893 path='/dev/da1p2' whole_disk=0 Assertion failed: (?Ąuč? ėŪ¨´), function mp-m_owner == NULL, file /usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c, line 112. Abort trap: 6 and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May 25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64 [r...@avoriaz ~]# zdb rpool version=13 name='rpool' state=0 txg=3467 pool_guid=536117255064806899 hostid=1133576597 hostname='unset' vdev_tree type='root' id=0 guid=536117255064806899 children[0] type='mirror' id=0 guid=3124217685892976292 metaslab_array=23 metaslab_shift=30 ashift=9 asize=155741847552 is_log=0 children[0] type='disk' id=0 guid=11099413743436480159 path='/dev/ad4p2' whole_disk=0 children[1] type='disk' id=1 guid=12724983687805955432 path='/dev/ad6p2' whole_disk=0 Segmentation fault: 11 By the way, to help prepare a boot/root pool does a utility to display the content of zpool.cache exist
Re: ZFS MFC heads down
Artem Belevich wrote:

Did you by any chance do that from single-user mode? ZFS seems to rely on hostid being set. Try running /etc/rc.d/hostid start and then re-try your zfs commands.

I was in multiuser with hostid set.

Henri

--Artem

On Wed, May 27, 2009 at 1:06 PM, Henri Hennebert h...@restart.be wrote:

Artem Belevich wrote:

I had the same problem on -current. Try attached patch. It may not apply cleanly on -stable, but should be easy enough to make equivalent changes on -stable.

The patch is ok for stable. Now I get this for the pool with my root:

[r...@morzine libzpool]# zdb rpool
    version=13
    name='rpool'
    state=0
    txg=959
    pool_guid=17669857244588609348
    hostid=2315842372
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=17669857244588609348
        children[0]
            type='mirror'
            id=0
            guid=3225603179255348056
            metaslab_array=23
            metaslab_shift=28
            ashift=9
            asize=51534888960
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=17573085726489368265
                path='/dev/da0p2'
                whole_disk=0
            children[1]
                type='disk'
                id=1
                guid=2736169600077218893
                path='/dev/da1p2'
                whole_disk=0
WARNING: pool 'rpool' could not be loaded as it was last accessed by another system (host: unset hostid: 0x8a08f344). See: http://www.sun.com/msg/ZFS-8000-EY
zdb: can't open rpool: No such file or directory

But rpool has been used for many boots now - strange ...

Thanks for your patch and time

Henri

--Artem

On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

Not a real problem but maybe worth mentioning:

on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26 15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE i386

[r...@morzine ~]# zdb rpool
    version=13
    name='rpool'
    state=0
    txg=959
    pool_guid=17669857244588609348
    hostid=2315842372
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=17669857244588609348
        children[0]
            type='mirror'
            id=0
            guid=3225603179255348056
            metaslab_array=23
            metaslab_shift=28
            ashift=9
            asize=51534888960
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=17573085726489368265
                path='/dev/da0p2'
                whole_disk=0
            children[1]
                type='disk'
                id=1
                guid=2736169600077218893
                path='/dev/da1p2'
                whole_disk=0
Assertion failed: (?Ąuč? ėŪ¨´), function mp->m_owner == NULL, file /usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c, line 112.
Abort trap: 6

and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May 25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ amd64

[r...@avoriaz ~]# zdb rpool
    version=13
    name='rpool'
    state=0
    txg=3467
    pool_guid=536117255064806899
    hostid=1133576597
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=536117255064806899
        children[0]
            type='mirror'
            id=0
            guid=3124217685892976292
            metaslab_array=23
            metaslab_shift=30
            ashift=9
            asize=155741847552
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=11099413743436480159
                path='/dev/ad4p2'
                whole_disk=0
            children[1]
                type='disk'
                id=1
                guid=12724983687805955432
                path='/dev/ad6p2'
                whole_disk=0
Segmentation fault: 11

By the way, to help prepare a boot/root pool, does a utility to display the content of zpool.cache exist?

Henri

Thanks,
Kip
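A small check that may help when reading the zdb warning quoted above: the label prints the hostid in decimal while the warning prints it in hex. The sketch below only converts the two values copied from the quoted output; the variable names are made up for illustration. That both turn out to be the same number suggests (an inference, not something stated in the thread) that the pool label carries this hostid and that the mismatch is with whatever hostid zdb itself saw at run time.

```shell
# Values copied from the zdb output quoted in this thread.
pool_hostid=2315842372      # "hostid=" field in the pool label (decimal)
warn_hostid=0x8a08f344      # hostid printed in the zdb WARNING (hex)

# Convert the decimal label value to hex for comparison.
pool_hostid_hex=$(printf '0x%x' "$pool_hostid")
echo "pool label hostid: $pool_hostid_hex"

if [ "$pool_hostid_hex" = "$warn_hostid" ]; then
    echo "the warning quotes the hostid stored in the pool label"
fi

# On the running FreeBSD system the current value can be compared with:
#   sysctl kern.hostid
```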
Re: ZFS MFC heads down
Kip Macy wrote:

I haven't looked at the panic yet, but adding a USB quirk (no SYNCHRONIZE_CACHE) would certainly reduce the noise in your logs.

Thanks for this hint. I patched usbdevs and umass.c. No more noise and, more interesting, I can now complete the install on my USB key without deadlock or crash.

Henri

-Kip

On Mon, May 25, 2009 at 4:16 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

I get a panic:

panic: solaris assert: 0 == dmu_read(os, lr->lr_foid, off, dlen, buf), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, line: 991

during `make -s DESTDIR=/kingston installworld`. kingston is a pool on a USB stick with GPT partitions.

More info at: http://verbier.restart.be/xfer/core.txt.60

Thanks for your work

Henri

Thanks,
Kip

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: ZFS MFC heads down
Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

I get a panic:

panic: solaris assert: 0 == dmu_read(os, lr->lr_foid, off, dlen, buf), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, line: 991

during `make -s DESTDIR=/kingston installworld`. kingston is a pool on a USB stick with GPT partitions.

More info at: http://verbier.restart.be/xfer/core.txt.60

Thanks for your work

Henri

Thanks,
Kip
Re: ZFS MFC heads down
Navdeep Parhar wrote:

On Wed, May 20, 2009 at 5:00 PM, Kip Macy km...@freebsd.org wrote:

Not really a problem but a question: Is the v13 on-disk format exactly the same as that used by Solaris/Opensolaris?

It is supposed to be. The sources are the same. However, I have not tested interoperability.

Does this make it possible to have a ZFS-only dual boot system running FreeBSD-stable and Solaris, with a shared home directory between the two environments?

It should be.

Has anyone tried anything like this?

Google anyone? :-)

My google-fu is weak today, and considering that this went into -stable a few minutes back, I didn't look that hard for v13/fbsd-stable/opensolaris adventures. :-)

I do it with 7.1 and opensolaris 2008.05 without problem. I keep the pool in V6 of course.

Henri

I'm feeling brave. I think I'll try it myself. Thanks for getting this into -stable!

Navdeep

-Kip
Re: ZFS MFC heads down
Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

I upgraded to stable r192523:

FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Thu May 21 13:18:53 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE i386

Some strange things, just after boot:

[r...@morzine ~]# zfs upgrade
This system is currently running ZFS filesystem version 3.

The following filesystems are out of date, and can be upgraded. After being upgraded, these filesystems (and any 'zfs send' streams generated from subsequent snapshots) will no longer be accessible by older software versions.

VER  FILESYSTEM
---  ------------
 1   pool1
 1   pool1/qemu
 1   pool1/squid
 1   pool2
 1   pool2/WorkBench
 1   pool2/backup
 1   pool2/download
 1   pool2/qemu
 1   pool2/sys
 1   rpool
 1   rpool/home
 1   rpool/root
 1   rpool/tmp
 1   rpool/usr
 1   rpool/var
 1   rpool/var/spool

[r...@morzine ~]# zfs upgrade -v
The following filesystem versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS filesystem version
 2   Enhanced directory entries
 3   Case insensitive and File system unique identifer (FUID)

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/zpl/N

Where 'N' is the version number.

And now, after a few minutes:

[r...@morzine ~]# zpool upgrade
This system is currently running ZFS pool version 13.

The following pools are out of date, and can be upgraded. After being upgraded, these pools will no longer be accessible by older software versions.

VER  POOL
---  ------------
 6   pool1
 6   pool2
 6   rpool

Use 'zpool upgrade -v' for a list of available versions and their associated features.

[r...@morzine ~]# zpool upgrade -v
This system is currently running ZFS pool version 13.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.

Strange isn't it o-)

By the way all seems ok!

Thanks to all for this update to zfs V13

Henri

Thanks,
Kip
Re: ZFS MFC heads down
Henri Hennebert wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both world and kernel will need to be re-built. Existing pools will continue to work without upgrade. If you choose to upgrade a pool to take advantage of new features you will no longer be able to use it with sources prior to today. 'zfs send/recv' is not expected to inter-operate between different pool versions.

The MFC went in r192498. Please let me know if you have any problems.

I upgraded to stable r192523:

FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Thu May 21 13:18:53 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE i386

Some strange things, just after boot:

[r...@morzine ~]# zfs upgrade
This system is currently running ZFS filesystem version 3.

The following filesystems are out of date, and can be upgraded. After being upgraded, these filesystems (and any 'zfs send' streams generated from subsequent snapshots) will no longer be accessible by older software versions.

VER  FILESYSTEM
---  ------------
 1   pool1
 1   pool1/qemu
 1   pool1/squid
 1   pool2
 1   pool2/WorkBench
 1   pool2/backup
 1   pool2/download
 1   pool2/qemu
 1   pool2/sys
 1   rpool
 1   rpool/home
 1   rpool/root
 1   rpool/tmp
 1   rpool/usr
 1   rpool/var
 1   rpool/var/spool

[r...@morzine ~]# zfs upgrade -v
The following filesystem versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS filesystem version
 2   Enhanced directory entries
 3   Case insensitive and File system unique identifer (FUID)

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/zpl/N

Where 'N' is the version number.

And now, after a few minutes:

[r...@morzine ~]# zpool upgrade
This system is currently running ZFS pool version 13.

The following pools are out of date, and can be upgraded. After being upgraded, these pools will no longer be accessible by older software versions.

VER  POOL
---  ------------
 6   pool1
 6   pool2
 6   rpool

Use 'zpool upgrade -v' for a list of available versions and their associated features.

[r...@morzine ~]# zpool upgrade -v
This system is currently running ZFS pool version 13.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.

Strange isn't it o-)

By the way all seems ok!

This happened after the first boot into stable (coming from 7.2-RELEASE). I rebooted and can't reproduce it!

Henri

Thanks to all for this update to zfs V13

Henri

Thanks,
Kip
7.2-RC1 - serial console / sio0 not working
Hello,

Experiencing some deadlocks, I am trying to re-enable my serial console on 7.2-RC1 (console=comconsole,vidconsole in /boot/loader.conf and -Dh or -Dh -S115200 in /boot.config).

/var/log/messages shows 'sio0: type 16550A, console', and on the VGA side the kernel console output is slow, as if it were echoed on a serial line, and the rc output is going somewhere else.

At the other end of the serial line, minicom shows nothing and is 'offline'. A break sent from minicom drops my 7.2-RC1 box into the debugger (ddb), but 'continue' has no effect.

The cable works fine (in serial console mode) with another box running 8.0-CURRENT. If I disable the serial console and run minicom on 7.2-RC1, the status is offline, but any key is received at the other end and any key typed at the other end is displayed fine.

Has anyone encountered such a problem?

Thanks in advance

henri
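For readers who want to reproduce the setup described above, here is a minimal sketch of the two files involved. It only writes the settings quoted in the message; the function name and the optional root prefix argument are hypothetical helpers added so the sketch can be tried against a scratch directory rather than the live /boot.

```shell
# Write the serial console settings quoted in the message.
# $1 is an optional root prefix (hypothetical, for trying this in a scratch
# directory); when empty, the live /boot/loader.conf and /boot.config are used.
write_serial_console_config() {
    root=${1:-}
    mkdir -p "$root/boot" || return 1
    # Loader: use both the serial and the video console.
    printf '%s\n' 'console="comconsole,vidconsole"' >> "$root/boot/loader.conf"
    # Boot blocks: dual console, serial preferred, 115200 baud.
    printf '%s\n' '-Dh -S115200' > "$root/boot.config"
}

# Example against a scratch directory:
#   write_serial_console_config /tmp/console-test
```

This does not address the problem reported in the thread; it only records the configuration under which it was observed.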
7.2-RC1 - serial console / sio0 not working
Sorry for the previous wrong followup :-(

Hello,

Experiencing some deadlocks, I am trying to re-enable my serial console on 7.2-RC1 (console=comconsole,vidconsole in /boot/loader.conf and -Dh or -Dh -S115200 in /boot.config).

/var/log/messages shows 'sio0: type 16550A, console', and on the VGA side the kernel console output is slow, as if it were echoed on a serial line, and the rc output is going somewhere else.

At the other end of the serial line, minicom shows nothing and is 'offline'. A break sent from minicom drops my 7.2-RC1 box into the debugger (ddb), but 'continue' has no effect.

The cable works fine (in serial console mode) with another box running 8.0-CURRENT. If I disable the serial console and run minicom on 7.2-RC1, the status is offline, but any key is received at the other end and any key typed at the other end is displayed fine.

Has anyone encountered such a problem?

Thanks in advance

henri
Re: 7.2-RC1 - serial console / sio0 not working
Marten Vijn wrote:

On Mon, 2009-04-20 at 16:49 +0200, Henri Hennebert wrote:

Hello,

Experiencing some deadlocks, I am trying to re-enable my serial console on 7.2-RC1 (console=comconsole,vidconsole in /boot/loader.conf and -Dh or -Dh -S115200 in /boot.config).

/var/log/messages shows 'sio0: type 16550A, console', and on the VGA side the kernel console output is slow, as if it were echoed on a serial line, and the rc output is going somewhere else.

At the other end of the serial line, minicom shows nothing and is 'offline'. A break sent from minicom drops my 7.2-RC1 box into the debugger (ddb), but 'continue' has no effect.

The cable works fine (in serial console mode) with another box running 8.0-CURRENT. If I disable the serial console and run minicom on 7.2-RC1, the status is offline, but any key is received at the other end and any key typed at the other end is displayed fine.

Has anyone encountered such a problem?

maybe diff /etc/ttys between 8.0 and 7.2

I don't use the serial line for login, so I believe it is not important in my case.

Thank you for your time

Henri

I had problems upgrading a machine (over serial console) lately (7.1 to Current).

Marten

Thanks in advance

henri
Re: 6.4-STABLE and PHP5 pcre and phpsysinfo
xer wrote:

Hello

My 6.4-STABLE box has a strange problem today with phpsysinfo, which I use. Ports are updated, but phpsysinfo (in the browser) now shows errors about pcre:

---
Notice: Undefined offset: 3 in /usr/local/www/data-dist/phpsysinfo/includes/os/class.FreeBSD.inc.php on line 59
^ lots of these

Warning: preg_match() [function.preg-match]: Internal pcre_fullinfo() error -3 in /usr/local/www/data-dist/phpsysinfo/includes/os/class.BSD.common.inc.php on line 126
^ lots of these

Warning: asort() expects parameter 1 to be array, boolean given in /usr/local/www/data-dist/phpsysinfo/includes/os/class.BSD.common.inc.php on line 174
^ lots of these

Warning: preg_match() [function.preg-match]: Internal pcre_fullinfo() error -3 in /usr/local/www/data-dist/phpsysinfo/includes/os/class.BSD.common.inc.php on line 187
^ lots of these

XPath error in XPath.class.php:3492 Expression failed to parse as PrimaryExpr because: Expression is not a PrimaryExpr

XPath error in XPath.class.php:5903 The supplied xPath '/phpsysinfo/Vitals/Distro' does not *uniquely* describe a node in the xml document. Not unique xpath-query, matched 0-times.

and more...

It seems that the FreeBSD patch does not work so well. Does anyone use phpsysinfo? I deinstalled php5 and the 1.3 extension and reinstalled them as expected, but that did not resolve it.

Contrary to /usr/ports/UPDATING - entry 20081211, base php5 (5.2.9) doesn't contain pcre. You simply have to add /usr/ports/devel/php5-pcre. All will be OK.

Henri

Any help please? Thanks in advance.
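Henri's suggestion above can be checked before and after installing the port. The sketch below is only illustrative: `has_module` is a hypothetical helper that reads module-list output (such as that of `php -m`) on standard input, and the commented commands assume a standard ports tree under /usr/ports as named in the message.

```shell
# Succeeds if the named module appears as a whole line (case-insensitive)
# in the module list read from stdin, e.g. the output of `php -m`.
has_module() {
    grep -qix "$1"
}

# Typical use on the affected machine (not run here):
#   php -m | has_module pcre || {
#       cd /usr/ports/devel/php5-pcre && make install clean
#   }
```

After installing the extension, the web server usually has to be restarted before phpsysinfo picks it up.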
hald and GEOM_PART_BSD + GEOM_PART_MBR
Just for the record: I added options GEOM_PART_BSD and GEOM_PART_MBR to my kernel config (I wanted to see what gpart was saying about my disks).

At boot time I get messages such as:

GEOM: ad4s2: geometry does not match label.
GEOM: ad4s2: media size does not match label.

and, more important, hald eats up CPU time and can't answer lshal.

Removing those options resolves the problem.

Henri

PS - I'm ready to test some patches
Re: ZFS performance issues (solved?!)
John Birrell wrote:

For those people experiencing a performance degradation since the DTrace import, please update your copy of src/sys/cddl/compat/opensolaris/kern/opensolaris_kmem.c by either cvsup or direct edit to remove #define KMEM_DEBUG.

You only need to rebuild the opensolaris kernel module after this change. The code is shared between ZFS and DTrace via the opensolaris kernel module.

This is also the reason why you found it necessary to add KDB, DDB and STACK to your kernel. After removing KMEM_DEBUG, you won't need those.

Please confirm that this solves the problem you have been seeing.

Great, now everything is back to normal.

Thanks

Henri

--
John Birrell
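The "direct edit" John describes can be sketched as follows. The helper name `disable_kmem_debug` is made up for illustration, and the module directory in the commented rebuild step (sys/modules/opensolaris) is an assumption about the source tree layout rather than something stated in the message.

```shell
# Remove a "#define KMEM_DEBUG" line from the given source file.
# Uses a temporary file instead of "sed -i", whose syntax differs
# between BSD and GNU sed.
disable_kmem_debug() {
    src=$1
    tmp=$(mktemp) || return 1
    if sed '/^[[:space:]]*#[[:space:]]*define[[:space:]]\{1,\}KMEM_DEBUG[[:space:]]*$/d' \
        "$src" > "$tmp"; then
        cat "$tmp" > "$src"
    fi
    rm -f "$tmp"
}

# Intended use (not run here; the module path is an assumption):
#   disable_kmem_debug /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_kmem.c
#   cd /usr/src/sys/modules/opensolaris && make && make install
```

Updating the file through cvsup, as John also suggests, achieves the same result without a local edit.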
Re: Possible ZFS patch, please test!
Jeremy Chadwick wrote:

On Sat, Aug 30, 2008 at 09:28:36PM +0200, Henri Hennebert wrote:

John Baldwin wrote:

This patch merges a few changes from HEAD back to 7.x. I think the endian changes specifically might solve the issue people saw with zpools created with non-dtrace kernels not being readable by dtrace kernels and vice versa.

http://www.FreeBSD.org/~jhb/patches/zfs_7.patch

Just a follow-up. I cvsup at Sat Aug 30 12:55 without zfs_7.patch and

make buildworld
make buildkernel
make installkernel
reboot (-s)
--root on zfs is ok --
make installworld
reboot

System is still sluggish, even during the make installworld in single user.

Sorry if I've missed this, but what tuning have you done for ZFS? Some of us (most of us?) have seen fairly sluggish performance when prefetch is enabled (the default), while the system is generally more responsive when prefetch is disabled.

prefetch is enabled:

vfs.zfs.arc_min: 33554432
vfs.zfs.arc_max: 268435456
vfs.zfs.mdcomp_disable: 0
vfs.zfs.prefetch_disable: 0
vfs.zfs.zio.taskq_threads: 0
vfs.zfs.recover: 0
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.debug: 0

This may have nothing to do with the problem you've stated, but I thought I'd throw it out there.

I think that the problem is somewhere else, because 80% of system cpu on a dual core seems awfully bad - eg more than 20 seconds to open this response after clicking on the response button.

Henri
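To try Jeremy's suggestion of running with prefetch disabled, the tunable shown in the sysctl listing above would normally be set at boot time. The sketch below is one way to do that; `set_tunable` is a hypothetical helper, and placing the setting in /boot/loader.conf is the usual convention for such tunables rather than something spelled out in the thread.

```shell
# Set NAME="VALUE" in a loader.conf-style file, replacing any earlier
# setting of the same name. The dots in NAME are treated as regex
# wildcards by grep, which is harmless for names like these.
set_tunable() {
    conf=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        grep -v "^${name}=" "$conf" > "$tmp" || true
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Intended use (takes effect at the next boot):
#   set_tunable /boot/loader.conf vfs.zfs.prefetch_disable 1
```

After rebooting, `sysctl vfs.zfs.prefetch_disable` should report 1 if the setting was picked up.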
Re: Possible ZFS patch, please test!
John Baldwin wrote:

On Friday 29 August 2008 03:57:46 am Henri Hennebert wrote:

Henri Hennebert wrote:

John Baldwin wrote:

This patch merges a few changes from HEAD back to 7.x. I think the endian changes specifically might solve the issue people saw with zpools created with non-dtrace kernels not being readable by dtrace kernels and vice versa.

http://www.FreeBSD.org/~jhb/patches/zfs_7.patch

It works for me with the root on zfs

While rebuilding the ports index with `portsdb -Uu` the system becomes really sluggish, with cpu running more than 60% in system... Something really strange here.

Can you try removing the 'KDTRACE_*' options from your kernel config file? It appears that they haven't been enabled in 8.x by default yet.

I tried, but with the world of 7.1-PRERELEASE I got:

cc: Internal error: Segmentation fault: 11 (program ld)
Please submit a full bug report.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
*** Error code 1

Stop in /usr/obj/usr/src/sys/MORZINE.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

So I took /usr/bin/cc, /usr/bin/ld and /usr/libexec/cc* from a previous 7.0-STABLE and tried again... with a lot of cpu-system... 80% on a dual core.

Anyway, I rebooted with the new kernel without KDTRACE_HOOKS and DDB_CTF.

After reboot, system cpu is always very high and the system is not responsive.

Henri
Re: Possible ZFS patch, please test!
John Birrell wrote:

On Sat, Aug 30, 2008 at 09:07:15AM +0200, Henri Hennebert wrote:

I tried, but with the world of 7.1-PRERELEASE I got:

cc: Internal error: Segmentation fault: 11 (program ld)
Please submit a full bug report.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
*** Error code 1

Henri, please delete the entire contents of your obj directory to remove the bad tools that have been built there. When you run 'make buildkernel' it will use the tools from the last buildworld rather than the installed ones.

For anyone experiencing this problem, you can do a 'make installworld' with STRIP= as long as you can boot to single user and mount your file systems.

The problem is occurring when static binaries are installed with the default option to strip the binaries. It seems that the strip program doesn't like the presence of the CTF ELF section.

OK - I better understand what's happening.

I believe that the buildworld that you have is OK, even when built with the CTF data; it's the installworld when things go bad.

Do you need me to send you any files to recover from this problem?

No problem, I have access to a previous 7.0-STABLE.

Thanks

Henri

--
John Birrell
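One possible reading of John's recovery advice, written as a dry-run sketch: clear the stale obj tree, rebuild, and install the world without stripping binaries. The `run` and `recover_world` helpers are hypothetical, the obj path is an assumption taken from the /usr/obj/usr/src path in the error output, and the exact order of steps is not spelled out in the thread. By default nothing is executed; the commands are only printed.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Runs in a subshell so the directory change does not leak to the caller.
recover_world() (
    [ "$DRY_RUN" = 1 ] || cd /usr/src || exit 1
    run rm -rf /usr/obj/usr/src      # drop the bad tools built earlier (assumed path)
    run make buildworld
    run make buildkernel
    run make installworld STRIP=     # install without stripping binaries
)

recover_world
```

John's message says the installworld step should be done from single-user mode with the file systems mounted.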
Re: Possible ZFS patch, please test!
John Birrell wrote:

On Sat, Aug 30, 2008 at 10:40:12AM +0200, Henri Hennebert wrote:

I believe that the buildworld that you have is OK, even when built with the CTF data; it's the installworld when things go bad.

Do you need me to send you any files to recover from this problem?

No problem, I have access to a previous 7.0-STABLE.

I am concerned about the high CPU problem. All the hooks that are built in with KDTRACE_HOOKS are inactive until the DTrace modules are loaded. So there should be no CPU implications there.

Are you using i386 or amd64?

It is i386:

CPU: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz (1995.01-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x6f6  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4e33d<SSE3,RSVD2,MON,DS_CPL,VMX,TM2,SSSE3,CX16,xTPR,PDCM,DCA>
  AMD Features=0x2010<NX,LM>
  AMD Features2=0x1<LAHF>
  Cores per package: 2
real memory = 2146369536 (2046 MB)
avail memory = 2084503552 (1987 MB)

Henri

--
John Birrell
Re: Possible ZFS patch, please test!
John Baldwin wrote:

This patch merges a few changes from HEAD back to 7.x. I think the endian changes specifically might solve the issue people saw with zpools created with non-dtrace kernels not being readable by dtrace kernels and vice versa.

http://www.FreeBSD.org/~jhb/patches/zfs_7.patch

Just a follow-up. I cvsup at Sat Aug 30 12:55 without zfs_7.patch and

make buildworld
make buildkernel
make installkernel
reboot (-s)
--root on zfs is ok --
make installworld
reboot

System is still sluggish, even during the make installworld in single user.

Henri
Re: Possible ZFS patch, please test!
John Baldwin wrote:

This patch merges a few changes from HEAD back to 7.x. I think the endian changes specifically might solve the issue people saw with zpools created with non-dtrace kernels not being readable by dtrace kernels and vice versa.

http://www.FreeBSD.org/~jhb/patches/zfs_7.patch

It works for me with the root on zfs

Thanks

Henri
Re: Possible ZFS patch, please test!
Henri Hennebert wrote:

John Baldwin wrote:

This patch merges a few changes from HEAD back to 7.x. I think the endian changes specifically might solve the issue people saw with zpools created with non-dtrace kernels not being readable by dtrace kernels and vice versa.

http://www.FreeBSD.org/~jhb/patches/zfs_7.patch

It works for me with the root on zfs

While rebuilding the ports index with `portsdb -Uu` the system becomes really sluggish, with cpu running more than 60% in system... Something really strange here.

Henri

Thanks

Henri
Re: recent regression with REL7 and Zfs
Thierry Herbelot wrote:

Hello,

I am using a recent 7.0-Stable (x86) and a Zfs pool for my data. After the dtrace import, I have updated my sources (and make buildworld, make buildkernel) and I no longer have access to my Zfs pool (just to be sure, I have since updated twice more to work around the announced issues).

Probably the same problem here: with the new kernel (7.1-PRERELEASE) and a root file system under zfs, the root can't be mounted and the system stops with a message saying it can't mount zfs:pool0

Manual root filesystem specification:
  <fstype>:<device>  Mount <device> using filesystem <fstype>
                       eg. ufs:da0s1a
  ?                  List valid disk boot devices
  <empty line>       Abort manual input

mountroot>

I reverted to the previous kernel and all is OK now.

Henri

With the new kernel, the Zpool is listed as failed (bad checksum ?). Reverting to the old kernel is sufficient to recover the Zfs pool (which was scrubbed last week), and it is declared healthy.

cheers and thanks for the good work

TfH
Re: RELENG_7 /src/UPDATING out of date?
Abdullah Ibn Hamad Al-Marri wrote:

Hey,

http://www.freebsd.org/cgi/cvsweb.cgi/src/UPDATING?rev=1.507;only_with_tag=RELENG_7

NOTE TO PEOPLE WHO THINK THAT FreeBSD 7.x IS SLOW: FreeBSD 7.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS-related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance.

Could someone please nuke this?

It is a problem with cvsweb - see http://www.freebsd.org/cgi/query-pr.cgi?pr=120185

Henri

Regards,
-Abdullah Ibn Hamad Al-Marri
Arab Portal
http://www.WeArab.Net/
Re: finstall alpha3
Julian H. Stacey wrote:

Ivan Voras wrote:

As some of you may already know, I'm working on a graphical installer for FreeBSD 7, which was started as a Google SoC 2007 project but still continues.

10+ years back, when Jordan did the first pre-X, 24x80 graphical installer, soon after he'd finished a blind chap posted ~So how do I install?~ The answer then: ~Get a friend to do it for you, or abandon FreeBSD and use NetBSD~. NetBSD still has an ASCII installer, so it is more attractive to some.

Idea for another SoC project: an automated tool that could descramble all the glitz of [arbitrary?] graphics tools back to something sensible / ASCII, a bit like what OCR does for printed paper.

No doubt a new graphical installer might be nice (if X is reliable, which it often is not; don't rely on VESA either on old hardware), but just so we don't forget the blind too: amazingly, they use computers (with expensive interfaces). The visually impaired do too, the latter simply with plain text xterms with monster fonts, rather than graphics, I presume.

Just my opinion: Maybe you are great BUT YOU ARE TOO DEROGATORY about a work which may be useful to some new users.

Henri
Re: finstall alpha3
Julian Stacey wrote:

Henri Hennebert wrote 2 emails with the same common text:

To: Julian H. Stacey [EMAIL PROTECTED]
Date: Wed, 06 Feb 2008 15:20:38 +0100
Message-ID: [EMAIL PROTECTED]

The first private shouting got answered. Then came:

To: freebsd-stable@freebsd.org
Date: Wed, 06 Feb 2008 15:25:57 +0100
Message-id: [EMAIL PROTECTED]

Assume Henri is too young to remember the first graphical installer.

Thank you! I'm 60 this year :-) and have been using FreeBSD since 2.1, and Xenix since 88 IIRC. My point is that a graphical installer may be useful, that's all.

Though a new graphical installer may be very nice as an option, let it not ever be the only way: remember blind installers, non-VESA-supported consoles, chips not recognised by X, serial-line-controlled installs, and non-Intel/AMD platforms with broken graphics terminal support. (Sparc maybe? more later?)
Re: 7.0-RC1 - ZFS + UFS + io activity show a deadlock
Pawel Jakub Dawidek wrote:

On Sun, Jan 27, 2008 at 02:47:02PM +0100, Henri Hennebert wrote:

Hello,

I encounter a deadlock while running:
1) cpio -p from a ZFS filesystem to a UFS filesystem
2) rsync from ZFS to ZFS

I was running with this patch: http://people.freebsd.org/~pjd/patches/zgd_done.patch

This patch is wrong, why do you use it in the first place?

You advised it to me ... I will remove it.

Henri