Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-18 Thread Henri Hennebert



On 11/18/2016 13:30, Andriy Gapon wrote:

On 14/11/2016 14:00, Henri Hennebert wrote:

On 11/14/2016 12:45, Andriy Gapon wrote:

Okay.  Luckily for us, it seems that 'm' is available in frame 5.  It also
happens to be the first field of 'struct faultstate'.  So, could you please go
to that frame and print '*m' and '*(struct faultstate *)m'?


(kgdb) fr 4
#4  0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, wmesg=) at /usr/src/sys/vm/vm_page.c:753
753             msleep(m, vm_page_lockptr(m), PVM | PDROP, wmesg, 0);
(kgdb) print *m
$1 = {plinks = {q = {tqe_next = 0xf800dc5d85b0, tqe_prev =
0xf800debf3bd0}, s = {ss = {sle_next = 0xf800dc5d85b0},
  pv = 0xf800debf3bd0}, memguard = {p = 18446735281313646000, v =
18446735281353604048}}, listq = {tqe_next = 0x0,
tqe_prev = 0xf800dc5d85c0}, object = 0xf800b62e9c60, pindex = 11,
phys_addr = 3389358080, md = {pv_list = {
  tqh_first = 0x0, tqh_last = 0xf800df68cd78}, pv_gen = 426, pat_mode =
6}, wire_count = 0, busy_lock = 6, hold_count = 0,
  flags = 0, aflags = 2 '\002', oflags = 0 '\0', queue = 0 '\0', psind = 0 '\0',
segind = 3 '\003', order = 13 '\r',
  pool = 0 '\0', act_count = 0 '\0', valid = 0 '\0', dirty = 0 '\0'}


If I interpret this correctly the page is in the 'exclusive busy' state.
Unfortunately, I can't tell much beyond that.
But I am confident that this is the root cause of the lock-up.
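
For readers following along, the 'exclusive busy' reading of busy_lock = 6 can be reproduced with a small decoder. This is only a sketch: the bit values are assumed from sys/vm/vm_page.h as of FreeBSD 11 and should be checked against the source tree in use, and describe_busy_lock() is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>
#include <string.h>

/* Assumed values from sys/vm/vm_page.h (FreeBSD 11); verify locally. */
#define VPB_BIT_SHARED     0x01u
#define VPB_BIT_EXCLUSIVE  0x02u
#define VPB_BIT_WAITERS    0x04u
#define VPB_SHARERS_SHIFT  3

/*
 * Hypothetical helper: write a human-readable description of a
 * vm_page busy_lock word into buf and return buf.
 */
static const char *
describe_busy_lock(unsigned int busy_lock, char *buf, size_t len)
{
    if (busy_lock & VPB_BIT_SHARED) {
        /* In shared mode the upper bits count the sharers. */
        unsigned int sharers = busy_lock >> VPB_SHARERS_SHIFT;

        if (sharers == 0)
            snprintf(buf, len, "unbusied");
        else
            snprintf(buf, len, "shared busy, %u sharer(s)", sharers);
    } else if (busy_lock & VPB_BIT_EXCLUSIVE) {
        snprintf(buf, len, "exclusive busy");
    } else {
        snprintf(buf, len, "unknown state");
    }

    if (busy_lock & VPB_BIT_WAITERS) {
        /* Append to whatever was written above. */
        size_t used = strlen(buf);

        snprintf(buf + used, len - used, ", waiters present");
    }
    return buf;
}
```

Called with the value from the dump, describe_busy_lock(6, buf, sizeof(buf)) yields an exclusively busied page with waiters present, which is consistent with thread 101245 sleeping on "vmpfw".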


(kgdb) print *(struct faultstate *)m
$2 = {m = 0xf800dc5d85b0, object = 0xf800debf3bd0, pindex = 0, first_m =
0xf800dc5d85c0,
  first_object = 0xf800b62e9c60, first_pindex = 11, map = 0xca058000, entry
= 0x0, lookup_still_valid = -546779784,
  vp = 0x601aa}
(kgdb)


I was wrong on this one as 'm' is actually a pointer, so the above is not
correct.  Maybe 'info reg' in frame 5 would give a clue about the value of 'fs'.


(kgdb) fr 5
#5  0x8089dd4d in vm_page_sleep_if_busy (m=0xf800df68cd40,
msg=0x809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086
1086            vm_page_busy_sleep(m, msg);
(kgdb) info reg
rax            0x0      0
rbx            0xf800b62e9c78   -8793036514184
rcx            0x0      0
rdx            0x0      0
rsi            0x0      0
rdi            0x0      0
rbp            0xfe0101836810   0xfe0101836810
rsp            0xfe01018367e0   0xfe01018367e0
r8             0x0      0
r9             0x0      0
r10            0x0      0
r11            0x0      0
r12            0xf800b642aa00   -879303520
r13            0xf800df68cd40   -8792344834752
r14            0xf800b62e9c60   -8793036514208
r15            0x809c51bc       -2137239108
rip            0x8089dd4d       0x8089dd4d <vm_page_sleep_if_busy+285>
eflags         0x0      0
cs             0x0      0
ss             0x0      0
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

I don't know what to do from here.


I am not sure how to proceed from here.
The only thing I can think of is a lock order reversal between the vnode lock
and the page busying quasi-lock.  But examining the code I can not spot it.
Another possibility is a leak of a busy page, but that's hard to debug.

How hard is it to reproduce the problem?


After 7 days all seems normal; there is only one copy of innd:

[root@avoriaz ~]# ps xa|grep inn
 1193  -  Is   0:01.40 /usr/local/news/bin/innd -r
13498  -  IN   0:00.01 /usr/local/news/bin/innfeed
 1194 v0- IW   0:00.00 /bin/sh /usr/local/news/bin/innwatch -i 60

I will try to stop and restart innd.

Everything continues to look good:

[root@avoriaz ~]# ps xa|grep inn
31673  -  Ss   0:00.02 /usr/local/news/bin/innd
31694  -  SN   0:00.01 /usr/local/news/bin/innfeed
31674  0  S    0:00.01 /bin/sh /usr/local/news/bin/innwatch -i 60


I think the only way to reproduce it is to wait for it to occur by itself...

One more thing: the deadlock has occurred at least 5 times since 10.0R, and
always with the directory /usr/local/news/bin.




Maybe Konstantin would have some ideas or suggestions.



Henri
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-14 Thread Henri Hennebert



On 11/14/2016 12:45, Andriy Gapon wrote:

On 14/11/2016 11:35, Henri Hennebert wrote:



On 11/14/2016 10:07, Andriy Gapon wrote:

Hmm, I've just noticed another interesting thread:
Thread 668 (Thread 101245):
#0  sched_switch (td=0xf800b642aa00, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973
#1  0x80561ae2 in mi_switch (flags=, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:455
#2  0x805ae8da in sleepq_wait (wchan=0x0, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:646
#3  0x805614b1 in _sleep (ident=, lock=, priority=, wmesg=0x809c51bc
"vmpfw", sbt=0, pr=, flags=) at
/usr/src/sys/kern/kern_synch.c:229
#4  0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, wmesg=) at /usr/src/sys/vm/vm_page.c:753
#5  0x8089dd4d in vm_page_sleep_if_busy (m=0xf800df68cd40,
msg=0x809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086
#6  0x80886be9 in vm_fault_hold (map=, vaddr=, fault_type=4 '\004', fault_flags=0, m_hold=0x0) at
/usr/src/sys/vm/vm_fault.c:495
#7  0x80885448 in vm_fault (map=0xf80011d66000, vaddr=, fault_type=4 '\004', fault_flags=) at
/usr/src/sys/vm/vm_fault.c:273
#8  0x808d3c49 in trap_pfault (frame=0xfe0101836c00, usermode=1) at
/usr/src/sys/amd64/amd64/trap.c:741
#9  0x808d3386 in trap (frame=0xfe0101836c00) at
/usr/src/sys/amd64/amd64/trap.c:333
#10 0x808b7af1 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:236


This thread belongs to another program from the news system:
668 Thread 101245 (PID=49124: innfeed)  sched_switch (td=0xf800b642aa00,
newtd=0xf8000285f000, flags=) at
/usr/src/sys/kern/sched_ule.c:1973



I strongly suspect that this is the thread that we were looking for.
I think that it has the vnode lock in the shared mode while trying to fault in a
page.



--clip--



Okay.  Luckily for us, it seems that 'm' is available in frame 5.  It also
happens to be the first field of 'struct faultstate'.  So, could you please go
to that frame and print '*m' and '*(struct faultstate *)m'?


(kgdb) fr 4
#4  0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, 
wmesg=) at /usr/src/sys/vm/vm_page.c:753

753 msleep(m, vm_page_lockptr(m), PVM | PDROP, wmesg, 0);
(kgdb) print *m
$1 = {plinks = {q = {tqe_next = 0xf800dc5d85b0, tqe_prev = 
0xf800debf3bd0}, s = {ss = {sle_next = 0xf800dc5d85b0},
  pv = 0xf800debf3bd0}, memguard = {p = 18446735281313646000, v 
= 18446735281353604048}}, listq = {tqe_next = 0x0,
tqe_prev = 0xf800dc5d85c0}, object = 0xf800b62e9c60, pindex 
= 11, phys_addr = 3389358080, md = {pv_list = {
  tqh_first = 0x0, tqh_last = 0xf800df68cd78}, pv_gen = 426, 
pat_mode = 6}, wire_count = 0, busy_lock = 6, hold_count = 0,
  flags = 0, aflags = 2 '\002', oflags = 0 '\0', queue = 0 '\0', psind 
= 0 '\0', segind = 3 '\003', order = 13 '\r',

  pool = 0 '\0', act_count = 0 '\0', valid = 0 '\0', dirty = 0 '\0'}
(kgdb) print *(struct faultstate *)m
$2 = {m = 0xf800dc5d85b0, object = 0xf800debf3bd0, pindex = 0, 
first_m = 0xf800dc5d85c0,
  first_object = 0xf800b62e9c60, first_pindex = 11, map = 
0xca058000, entry = 0x0, lookup_still_valid = -546779784,

  vp = 0x601aa}
(kgdb)


Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-14 Thread Henri Hennebert



On 11/14/2016 10:07, Andriy Gapon wrote:

On 13/11/2016 15:28, Henri Hennebert wrote:

On 11/13/2016 11:06, Andriy Gapon wrote:

On 12/11/2016 14:40, Henri Hennebert wrote:



[snip]

Could you please show 'info local' in frame 14?
I expected that 'nd' variable would be defined there and it may contain some
useful information.


No luck there:

(kgdb) fr 14
#14 0x80636838 in kern_statat (td=0xf80009ba0500, 
flag=, fd=-100, path=0x0,
pathseg=, sbp=, 
hook=0x800e2a388) at /usr/src/sys/kern/vfs_syscalls.c:2160

2160            if ((error = namei(&nd)) != 0)
(kgdb) info local
rights = 
nd = 
error = 
sb = 
(kgdb)



I also tried to get information from the execve frames of the other threads:

for tid 101250:
(kgdb) fr 10
#10 0x80508ccc in sys_execve (td=0xf800b6429000,
uap=0xfe010184fb80) at /usr/src/sys/kern/kern_exec.c:218
218             error = kern_execve(td, &args, NULL);
(kgdb) print *uap
$4 = {fname_l_ = 0xfe010184fb80 "`\220\217\002\b", fname = 0x8028f9060
,
  fname_r_ = 0xfe010184fb88 "`¶ÿÿÿ\177", argv_l_ = 0xfe010184fb88
"`¶ÿÿÿ\177", argv = 0x7fffb660,
  argv_r_ = 0xfe010184fb90 "\bÜÿÿÿ\177", envv_l_ = 0xfe010184fb90
"\bÜÿÿÿ\177", envv = 0x7fffdc08,
  envv_r_ = 0xfe010184fb98 ""}
(kgdb)

for tid 101243:

(kgdb) f 15
#15 0x80508ccc in sys_execve (td=0xf800b642b500,
uap=0xfe010182cb80) at /usr/src/sys/kern/kern_exec.c:218
218             error = kern_execve(td, &args, NULL);
(kgdb) print *uap
$5 = {fname_l_ = 0xfe010182cb80 "ÀÏ\205\002\b", fname = 0x80285cfc0 ,
  fname_r_ = 0xfe010182cb88 "`¶ÿÿÿ\177", argv_l_ = 0xfe010182cb88
"`¶ÿÿÿ\177", argv = 0x7fffb660,
  argv_r_ = 0xfe010182cb90 "\bÜÿÿÿ\177", envv_l_ = 0xfe010182cb90
"\bÜÿÿÿ\177", envv = 0x7fffdc08,
  envv_r_ = 0xfe010182cb98 ""}
(kgdb)


I think that you see garbage in those structures because they contain pointers
to userland data.

Hmm, I've just noticed another interesting thread:
Thread 668 (Thread 101245):
#0  sched_switch (td=0xf800b642aa00, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973
#1  0x80561ae2 in mi_switch (flags=, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:455
#2  0x805ae8da in sleepq_wait (wchan=0x0, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:646
#3  0x805614b1 in _sleep (ident=, lock=, priority=, wmesg=0x809c51bc
"vmpfw", sbt=0, pr=, flags=) at
/usr/src/sys/kern/kern_synch.c:229
#4  0x8089d1c1 in vm_page_busy_sleep (m=0xf800df68cd40, wmesg=) at /usr/src/sys/vm/vm_page.c:753
#5  0x8089dd4d in vm_page_sleep_if_busy (m=0xf800df68cd40,
msg=0x809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086
#6  0x80886be9 in vm_fault_hold (map=, vaddr=, fault_type=4 '\004', fault_flags=0, m_hold=0x0) at
/usr/src/sys/vm/vm_fault.c:495
#7  0x80885448 in vm_fault (map=0xf80011d66000, vaddr=, fault_type=4 '\004', fault_flags=) at
/usr/src/sys/vm/vm_fault.c:273
#8  0x808d3c49 in trap_pfault (frame=0xfe0101836c00, usermode=1) at
/usr/src/sys/amd64/amd64/trap.c:741
#9  0x808d3386 in trap (frame=0xfe0101836c00) at
/usr/src/sys/amd64/amd64/trap.c:333
#10 0x808b7af1 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:236


This thread belongs to another program from the news system:
668 Thread 101245 (PID=49124: innfeed)  sched_switch
(td=0xf800b642aa00, newtd=0xf8000285f000, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973




I strongly suspect that this is the thread that we were looking for.
I think that it has the vnode lock in the shared mode while trying to fault in a
page.

Could you please check that by going to frame 6 and printing 'fs' and '*fs.vp'?
It'd be interesting to understand why this thread is waiting here.
So, please also print '*fs.m' and '*fs.object'.


No luck :-(
(kgdb) fr 6
#6  0x80886be9 in vm_fault_hold (map=, 
vaddr=, fault_type=4 '\004',

fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:495
495                     vm_page_sleep_if_busy(fs.m, "vmpfw");
(kgdb) print fs
Cannot access memory at address 0x1fa0
(kgdb)

Henri

Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-13 Thread Henri Hennebert

On 11/13/2016 14:28, Henri Hennebert wrote:

These 2 threads are innd processes. In core.txt.4:

   8 14789 29165   0   24  4   40040   6612 zfs  DN- 0:00.00 [innd]
   8 29165 1   0   20  0   42496   6888 select   Ds- 0:01.33 [innd]
   8 49778 29165   0   24  4   40040   6900 zfs  DN- 0:00.00 [innd]
   8 82034 29165   0   24  4 132  0 zfs  DN- 0:00.00 [innd]

The corresponding 'info threads' entries are:

  687 Thread 101243 (PID=49778: innd)  sched_switch
(td=0xf800b642b500, newtd=0xf8000285ea00, flags=) at /usr/src/sys/kern/sched_ule.c:1973
  681 Thread 101147 (PID=14789: innd)  sched_switch
(td=0xf80065f4e500, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973
  669 Thread 101250 (PID=82034: innd)  sched_switch
(td=0xf800b6429000, newtd=0xf8000285ea00, flags=) at /usr/src/sys/kern/sched_ule.c:1973
  665 Thread 101262 (PID=29165: innd)  sched_switch
(td=0xf800b6b54a00, newtd=0xf8000285ea00, flags=) at /usr/src/sys/kern/sched_ule.c:1973


In case it helps, I had a look at innd. This process uses 2 execv calls:
one to execute /bin/sh and the other to re-execute itself:

/*
**  Re-exec ourselves.
*/
static const char *
CCxexec(char *av[])
{
    char        *innd;
    char        *p;
    int         i;

    if (CCargv == NULL)
        return "1 no argv!";

    innd = concatpath(innconf->pathbin, "innd");
    /* Get the pathname. */
    p = av[0];
    if (*p == '\0' || strcmp(p, "innd") == 0)
        CCargv[0] = innd;
    else
        return "1 Bad value";

#ifdef DO_PERL
    PLmode(Mode, OMshutdown, av[0]);
#endif
#ifdef DO_PYTHON
    PYmode(Mode, OMshutdown, av[0]);
#endif
    JustCleanup();
    syslog(L_NOTICE, "%s execv %s", LogName, CCargv[0]);

    /* Close all fds to protect possible fd leaking across successive innds. */
    for (i=3; i<30; i++)
        close(i);

    execv(CCargv[0], CCargv);
    syslog(L_FATAL, "%s cant execv %s %m", LogName, CCargv[0]);
    _exit(1);
    /* NOTREACHED */
    return "1 Exit failed";
}

The culprit may be /usr/local/news/bin/innd;
remember that find is locked in /usr/local/news/bin.

Henri


Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-13 Thread Henri Hennebert

On 11/13/2016 11:06, Andriy Gapon wrote:

On 12/11/2016 14:40, Henri Hennebert wrote:

I attach it.


Thank you!
So, these two threads are trying to get the lock in the exclusive mode:
Thread 687 (Thread 101243):
#0  sched_switch (td=0xf800b642b500, newtd=0xf8000285ea00, flags=) at /usr/src/sys/kern/sched_ule.c:1973
#1  0x80561ae2 in mi_switch (flags=, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:455
#2  0x805ae8da in sleepq_wait (wchan=0x0, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:646
#3  0x8052f854 in sleeplk (lk=, flags=, ilk=, wmesg=0x813be535 "zfs",
pri=, timo=51) at /usr/src/sys/kern/kern_lock.c:222
#4  0x8052f39d in __lockmgr_args (lk=, flags=, ilk=, wmesg=,
pri=, timo=, file=, line=) at /usr/src/sys/kern/kern_lock.c:958
#5  0x80616a8c in vop_stdlock (ap=) at lockmgr.h:98
#6  0x8093784d in VOP_LOCK1_APV (vop=, a=) at vnode_if.c:2087
#7  0x8063c5b3 in _vn_lock (vp=, flags=548864,
file=, line=) at vnode_if.h:859
#8  0x8062a5f7 in vget (vp=0xf80049c2c000, flags=548864,
td=0xf800b642b500) at /usr/src/sys/kern/vfs_subr.c:2523
#9  0x806118b9 in cache_lookup (dvp=, vpp=, cnp=, tsp=,
ticksp=) at /usr/src/sys/kern/vfs_cache.c:686
#10 0x806133dc in vfs_cache_lookup (ap=) at
/usr/src/sys/kern/vfs_cache.c:1081
#11 0x80935777 in VOP_LOOKUP_APV (vop=, a=) at vnode_if.c:127
#12 0x8061cdf1 in lookup (ndp=) at vnode_if.h:54
#13 0x8061c492 in namei (ndp=) at
/usr/src/sys/kern/vfs_lookup.c:306
#14 0x80509395 in kern_execve (td=, args=, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443
#15 0x80508ccc in sys_execve (td=0xf800b642b500,
uap=0xfe010182cb80) at /usr/src/sys/kern/kern_exec.c:218
#16 0x808d449e in amd64_syscall (td=, traced=0) at
subr_syscall.c:135
#17 0x808b7ddb in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:396

Thread 681 (Thread 101147):
#0  sched_switch (td=0xf80065f4e500, newtd=0xf8000285f000, flags=) at /usr/src/sys/kern/sched_ule.c:1973
#1  0x80561ae2 in mi_switch (flags=, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:455
#2  0x805ae8da in sleepq_wait (wchan=0x0, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:646
#3  0x8052f854 in sleeplk (lk=, flags=, ilk=, wmesg=0x813be535 "zfs",
pri=, timo=51) at /usr/src/sys/kern/kern_lock.c:222
#4  0x8052f39d in __lockmgr_args (lk=, flags=, ilk=, wmesg=,
pri=, timo=, file=, line=) at /usr/src/sys/kern/kern_lock.c:958
#5  0x80616a8c in vop_stdlock (ap=) at lockmgr.h:98
#6  0x8093784d in VOP_LOCK1_APV (vop=, a=) at vnode_if.c:2087
#7  0x8063c5b3 in _vn_lock (vp=, flags=548864,
file=, line=) at vnode_if.h:859
#8  0x8062a5f7 in vget (vp=0xf80049c2c000, flags=548864,
td=0xf80065f4e500) at /usr/src/sys/kern/vfs_subr.c:2523
#9  0x806118b9 in cache_lookup (dvp=, vpp=, cnp=, tsp=,
ticksp=) at /usr/src/sys/kern/vfs_cache.c:686
#10 0x806133dc in vfs_cache_lookup (ap=) at
/usr/src/sys/kern/vfs_cache.c:1081
#11 0x80935777 in VOP_LOOKUP_APV (vop=, a=) at vnode_if.c:127
#12 0x8061cdf1 in lookup (ndp=) at vnode_if.h:54
#13 0x8061c492 in namei (ndp=) at
/usr/src/sys/kern/vfs_lookup.c:306
#14 0x80509395 in kern_execve (td=, args=, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443
#15 0x80508ccc in sys_execve (td=0xf80065f4e500,
uap=0xfe01016b8b80) at /usr/src/sys/kern/kern_exec.c:218
#16 0x808d449e in amd64_syscall (td=, traced=0) at
subr_syscall.c:135
#17 0x808b7ddb in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:396
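
The statement that both threads want the lock in exclusive mode can be checked from the flags=548864 argument visible in the _vn_lock and vget frames. The following is a minimal sketch of such a check; the flag values are assumed from sys/sys/lockmgr.h as of FreeBSD 11 and should be verified against the source tree in use, only the four flags relevant to this thread are listed, and decode_lk_flags() is a hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

struct flag_name {
    unsigned int bit;
    const char *name;
};

/* Assumed values from sys/sys/lockmgr.h (FreeBSD 11); verify locally. */
static const struct flag_name lk_flags[] = {
    { 0x200000, "LK_SHARED" },
    { 0x080000, "LK_EXCLUSIVE" },
    { 0x004000, "LK_VNHELD" },
    { 0x002000, "LK_NODDLKTREAT" },
};

/*
 * Hypothetical helper: render a lockmgr request flag word as
 * "NAME | NAME | ...". Bits not in the table are appended in hex.
 */
static const char *
decode_lk_flags(unsigned int flags, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (i = 0; i < sizeof(lk_flags) / sizeof(lk_flags[0]); i++) {
        int n;

        if ((flags & lk_flags[i].bit) == 0)
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
            used ? " | " : "", lk_flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;              /* output buffer too small */
        used += (size_t)n;
        flags &= ~lk_flags[i].bit;
    }

    /* Whatever is left was not recognised by the table above. */
    if (flags != 0)
        snprintf(buf + used, len - used, "%s0x%x",
            used ? " | " : "", flags);
    return buf;
}
```

Under those assumptions, 548864 (0x86000) decodes to LK_EXCLUSIVE | LK_VNHELD | LK_NODDLKTREAT, i.e. an exclusive request, as stated above.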


These 2 threads are innd processes. In core.txt.4:

   8 14789 29165   0   24  4   40040   6612 zfs  DN- 0:00.00 [innd]
   8 29165 1   0   20  0   42496   6888 select   Ds- 0:01.33 [innd]
   8 49778 29165   0   24  4   40040   6900 zfs  DN- 0:00.00 [innd]
   8 82034 29165   0   24  4 132  0 zfs  DN- 0:00.00 [innd]


The corresponding 'info threads' entries are:

  687 Thread 101243 (PID=49778: innd)  sched_switch
(td=0xf800b642b500, newtd=0xf8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
  681 Thread 101147 (PID=14789: innd)  sched_switch
(td=0xf80065f4e500, newtd=0xf8000285f000, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
  669 Thread 101250 (PID=82034: innd)  sched_switch
(td=0xf800b6429000, newtd=0xf8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
  665 Thread 101262 (PID=29165: innd)  sched_switch
(td=0xf800b6b54a00, newtd=0xf8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973


So your missing thread must be 101250:

(kgdb) tid 101250
[Switching to thread 669 (Thread 101250)]#0  sched_switch 
(td=0xf800b6429000, newtd=0xf8000285ea00,

flags=) at /usr/src/sys/kern/sched_ule.c:1973
1973            cpuid = PCPU_GET(cpuid);
Current language:  a

Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-11 Thread Henri Hennebert



On 11/11/2016 12:24, Andriy Gapon wrote:


At this stage I would try to get a system crash dump for post-mortem analysis.
There are a few ways to do that.  You can enter ddb and then run 'dump' and
'reset' commands.  Or you can just do `sysctl debug.kdb.panic=1`.
In either case, please double-check that your system has a dump device 
configured.


It took some time to upload the dump...

You can find it at

http://tignes.restart.be/Xfer/

Henri


Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-10 Thread Henri Hennebert

On 11/10/2016 19:40, Andriy Gapon wrote:

On 10/11/2016 19:55, Henri Hennebert wrote:



On 11/10/2016 18:33, Andriy Gapon wrote:

On 10/11/2016 18:12, Henri Hennebert wrote:

On 11/10/2016 16:54, Andriy Gapon wrote:

On 10/11/2016 17:20, Henri Hennebert wrote:

On 11/10/2016 15:00, Andriy Gapon wrote:

Interesting.  I can not spot any suspicious thread that would hold the vnode
lock.  Could you please run kgdb (just like that, no arguments), then execute
'bt' command and then select a frame when _vn_lock is called with 'fr N'
command.  Then please 'print *vp' and share the result.


I think I missed something in your request:


Oh, sorry!  The very first step should be 'tid 101112' to switch to the correct
context.



(kgdb) fr 7
#7  0x8063c5b3 in _vn_lock (vp=, flags=2121728,


"value optimized out" - not good


file=,
line=) at vnode_if.h:859
859     vnode_if.h: No such file or directory.
        in vnode_if.h
(kgdb) print *vp


I am not sure if this output is valid, because of the message above.
Could you please try to navigate to nearby frames and see if vp itself has a
valid value there.  If you can find such a frame please do *vp  there.



Does this seem better?


Yes!


(kgdb) fr 8
#8  0x8062a5f7 in vget (vp=0xf80049c2c000, flags=2121728,
td=0xf80009ba0500) at /usr/src/sys/kern/vfs_subr.c:2523
2523            if ((error = vn_lock(vp, flags)) != 0) {
(kgdb) print *vp
$1 = {v_tag = 0x813be535 "zfs", v_op = 0x813d0f70, v_data =
0xf80049c1f420, v_mount = 0xf800093aa660,
  v_nmntvnodes = {tqe_next = 0xf80049c2c938, tqe_prev = 0xf80049c2bb30},
v_un = {vu_mount = 0x0, vu_socket = 0x0,
vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, le_prev =
0x0}, v_cache_src = {lh_first = 0x0}, v_cache_dst = {
tqh_first = 0xf800bfc8e3f0, tqh_last = 0xf800bfc8e410}, v_cache_dd =
0x0, v_lock = {lock_object = {
  lo_name = 0x813be535 "zfs", lo_flags = 117112832, lo_data = 0,
lo_witness = 0x0}, lk_lock = 23, lk_exslpfail = 0,
lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name =
0x8099e9e0 "vnode interlock", lo_flags = 16973824,
  lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock =
0xf80049c2c068, v_actfreelist = {
tqe_next = 0xf80049c2c938, tqe_prev = 0xf80049ae9bd0}, v_bufobj =
{bo_lock = {lock_object = {
lo_name = 0x8099e9f0 "bufobj interlock", lo_flags = 86179840,
lo_data = 0, lo_witness = 0x0}, rw_lock = 1},
bo_ops = 0x80c4bf70, bo_object = 0xf800b62e9c60, bo_synclist =
{le_next = 0x0, le_prev = 0x0},
bo_private = 0xf80049c2c000, __bo_vnode = 0xf80049c2c000, bo_clean =
{bv_hd = {tqh_first = 0x0,
tqh_last = 0xf80049c2c120}, bv_root = {pt_root = 0}, bv_cnt = 0},
bo_dirty = {bv_hd = {tqh_first = 0x0,
tqh_last = 0xf80049c2c140}, bv_root = {pt_root = 0}, bv_cnt = 0},
bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072},
  v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters =
{tqh_first = 0x0, tqh_last = 0xf80049c2c188},
rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0,
v_holdcnt = 9, v_usecount = 6, v_iflag = 512,
  v_vflag = 32, v_writecount = 0, v_hash = 4833984, v_type = VREG}
(kgdb)


flags=2121728 = 0x206000 = LK_SHARED | LK_VNHELD | LK_NODDLKTREAT
lk_lock = 23 = 0x17 = LK_ONE_SHARER | LK_EXCLUSIVE_WAITERS | LK_SHARED_WAITERS |
LK_SHARE

So, here's what we have here: this thread tries to get a shared lock on the
vnode, the vnode is already locked in shared mode, but there is an exclusive
waiter (or, perhaps, multiple waiters).  So, this thread can not get the lock
because of the exclusive waiter.  And I do not see an easy way to identify that
waiter.
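
The decoding of lk_lock = 23 and the reason a new shared request has to wait can be sketched as follows. The bit layout is assumed from sys/sys/lockmgr.h as of FreeBSD 11 and should be verified locally. The function names are hypothetical, and lk_can_share_simplified() deliberately omits the extra exceptions that the real kernel check applies (for example for threads that already hold shared locks), so it is an approximation rather than the kernel's logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the lockmgr state word (FreeBSD 11); verify locally. */
#define LK_SHARE               0x01u
#define LK_SHARED_WAITERS      0x02u
#define LK_EXCLUSIVE_WAITERS   0x04u
#define LK_EXCLUSIVE_SPINNERS  0x08u
#define LK_SHARERS_SHIFT       4

/* Number of shared holders, or 0 if the lock is not in shared mode. */
static uintptr_t
lk_sharers(uintptr_t x)
{
    return ((x & LK_SHARE) != 0 ? x >> LK_SHARERS_SHIFT : 0);
}

/* True if at least one thread is queued for an exclusive lock. */
static bool
lk_has_exclusive_waiters(uintptr_t x)
{
    return ((x & LK_EXCLUSIVE_WAITERS) != 0);
}

/*
 * Simplified admission check for a new shared request: the lock must
 * be in shared mode and nobody may be waiting or spinning for
 * exclusive access. Queued exclusive requests block new sharers so
 * that writers are not starved.
 */
static bool
lk_can_share_simplified(uintptr_t x)
{
    return ((x & LK_SHARE) != 0 &&
        (x & (LK_EXCLUSIVE_WAITERS | LK_EXCLUSIVE_SPINNERS)) == 0);
}
```

For x = 23 (0x17) this gives one sharer, exclusive waiters present, and a refused shared request, which matches the description above: the find thread queues behind the exclusive waiters even though the vnode is only share-locked.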

In the procstat output that you provided earlier there was no other thread in
vn_lock.  Hmm, I see this:
procstat: sysctl: kern.proc.kstack: 14789: Device busy
procstat: sysctl: kern.proc.kstack: 82034: Device busy

Could you please check what those two processes are (if they are still running)?
Perhaps try procstat for each of the pids several times.



These 2 processes are the 2 instances of the innd daemon (news server),
which seems consistent with the directory /usr/local/news/bin.


[root@avoriaz ~]# procstat 14789
  PID  PPID  PGID   SID  TSID THR LOGIN WCHAN EMUL          COMM
14789 29165 29165 29165     0   1 root  zfs   FreeBSD ELF64 innd
[root@avoriaz ~]# procstat 82034
  PID  PPID  PGID   SID  TSID THR LOGIN WCHAN EMUL          COMM
82034 29165 29165 29165     0   1 root  zfs   FreeBSD ELF64 innd
[root@avoriaz ~]# procstat -f 14789
procstat: kinfo_getfile(): Device busy
  PID COMM    FD T V FLAGS REF  OFFSET PRO NAME
[root@avoriaz ~]# procstat -f 14789
procstat: kinfo_getfile(): Device busy
  PID COMM    FD T V FLAGS REF  OFFSET PRO NAME
[root@avoriaz ~]# procstat -f 14789
procstat: kinfo_getfile(): 

Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-10 Thread Henri Hennebert



On 11/10/2016 18:33, Andriy Gapon wrote:

On 10/11/2016 18:12, Henri Hennebert wrote:

On 11/10/2016 16:54, Andriy Gapon wrote:

On 10/11/2016 17:20, Henri Hennebert wrote:

On 11/10/2016 15:00, Andriy Gapon wrote:

Interesting.  I can not spot any suspicious thread that would hold the vnode
lock.  Could you please run kgdb (just like that, no arguments), then execute
'bt' command and then select a frame when _vn_lock is called with 'fr N'
command.  Then please 'print *vp' and share the result.


I think I missed something in your request:


Oh, sorry!  The very first step should be 'tid 101112' to switch to the correct
context.



(kgdb) fr 7
#7  0x8063c5b3 in _vn_lock (vp=, flags=2121728,


"value optimized out" - not good


file=,
line=) at vnode_if.h:859
859     vnode_if.h: No such file or directory.
        in vnode_if.h
(kgdb) print *vp


I am not sure if this output is valid, because of the message above.
Could you please try to navigate to nearby frames and see if vp itself has a
valid value there.  If you can find such a frame please do *vp  there.



Does this seem better?

(kgdb) fr 8
#8  0x8062a5f7 in vget (vp=0xf80049c2c000, flags=2121728, 
td=0xf80009ba0500) at /usr/src/sys/kern/vfs_subr.c:2523

2523            if ((error = vn_lock(vp, flags)) != 0) {
(kgdb) print *vp
$1 = {v_tag = 0x813be535 "zfs", v_op = 0x813d0f70, 
v_data = 0xf80049c1f420, v_mount = 0xf800093aa660,
  v_nmntvnodes = {tqe_next = 0xf80049c2c938, tqe_prev = 
0xf80049c2bb30}, v_un = {vu_mount = 0x0, vu_socket = 0x0,
vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, 
le_prev = 0x0}, v_cache_src = {lh_first = 0x0}, v_cache_dst = {
tqh_first = 0xf800bfc8e3f0, tqh_last = 0xf800bfc8e410}, 
v_cache_dd = 0x0, v_lock = {lock_object = {
  lo_name = 0x813be535 "zfs", lo_flags = 117112832, lo_data 
= 0, lo_witness = 0x0}, lk_lock = 23, lk_exslpfail = 0,
lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name = 
0x8099e9e0 "vnode interlock", lo_flags = 16973824,
  lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 
0xf80049c2c068, v_actfreelist = {
tqe_next = 0xf80049c2c938, tqe_prev = 0xf80049ae9bd0}, 
v_bufobj = {bo_lock = {lock_object = {
lo_name = 0x8099e9f0 "bufobj interlock", lo_flags = 
86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1},
bo_ops = 0x80c4bf70, bo_object = 0xf800b62e9c60, 
bo_synclist = {le_next = 0x0, le_prev = 0x0},
bo_private = 0xf80049c2c000, __bo_vnode = 0xf80049c2c000, 
bo_clean = {bv_hd = {tqh_first = 0x0,
tqh_last = 0xf80049c2c120}, bv_root = {pt_root = 0}, bv_cnt 
= 0}, bo_dirty = {bv_hd = {tqh_first = 0x0,
tqh_last = 0xf80049c2c140}, bv_root = {pt_root = 0}, bv_cnt 
= 0}, bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072},
  v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = 
{tqh_first = 0x0, tqh_last = 0xf80049c2c188},
rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 
0, v_holdcnt = 9, v_usecount = 6, v_iflag = 512,

  v_vflag = 32, v_writecount = 0, v_hash = 4833984, v_type = VREG}
(kgdb)

Henri


Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-10 Thread Henri Hennebert

On 11/10/2016 16:54, Andriy Gapon wrote:

On 10/11/2016 17:20, Henri Hennebert wrote:

On 11/10/2016 15:00, Andriy Gapon wrote:

Interesting.  I can not spot any suspicious thread that would hold the vnode
lock.  Could you please run kgdb (just like that, no arguments), then execute
'bt' command and then select a frame when _vn_lock is called with 'fr N'
command.  Then please 'print *vp' and share the result.


I think I missed something in your request:


Oh, sorry!  The very first step should be 'tid 101112' to switch to the correct
context.



(kgdb) fr 7
#7  0x8063c5b3 in _vn_lock (vp=, 
flags=2121728, file=,

line=) at vnode_if.h:859
859 vnode_if.h: No such file or directory.
in vnode_if.h
(kgdb) print *vp
$1 = {v_tag = 0x80faeb78 "â~\231\200", v_op = 
0xf80009a41000, v_data = 0x0, v_mount = 0xf80009a41010,
  v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0x80edc088}, v_un 
= {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0,
vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0xf80009466e90, 
le_prev = 0x0}, v_cache_src = {lh_first = 0xfe010186d768},
  v_cache_dst = {tqh_first = 0x0, tqh_last = 0xfeb8a7c0}, 
v_cache_dd = 0xf8000284f000, v_lock = {lock_object = {
  lo_name = 0xf8002c00ee80 "", lo_flags = 0, lo_data = 0, 
lo_witness = 0xf800068bd480},
lk_lock = 1844673520268056, lk_exslpfail = 153715840, lk_timo = 
-2048, lk_pri = 0}, v_interlock = {lock_object = {
  lo_name = 0x18af8 <Error reading address 0x18af8: Bad address>, lo_flags = 0, lo_data = 0,
  lo_witness = 0x0}, mtx_lock = 0}, v_vnlock = 0x0, v_actfreelist = 
{tqe_next = 0x0, tqe_prev = 0xf80009ba05c0},
  v_bufobj = {bo_lock = {lock_object = {lo_name = 0xf80009a41000 
"", lo_flags = 1, lo_data = 0, lo_witness = 0x400ff},
  rw_lock = 2}, bo_ops = 0x1, bo_object = 
0xf80049c2c068, bo_synclist = {le_next = 0x813be535,
  le_prev = 0x1}, bo_private = 0x0, __bo_vnode = 0x0, 
bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0x0},
  bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = 
{tqh_first = 0xf80088ac8d00, tqh_last = 0xf8003cc5b600},
  bv_root = {pt_root = 2553161591}, bv_cnt = -1741805705}, 
bo_numoutput = 31, bo_flag = 0, bo_bsize = 0}, v_pollinfo = 0x0,
  v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 
0xf88, tqh_last = 0x19cc}, rl_currdep = 0x3f8},
  v_cstart = 16256, v_lasta = 679, v_lastw = 0, v_clen = 0, v_holdcnt = 
0, v_usecount = 2369, v_iflag = 0, v_vflag = 0,

  v_writecount = 0, v_hash = 0, v_type = VNON}
(kgdb)

Thanks for your time

Henri


Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-10 Thread Henri Hennebert

On 11/10/2016 15:00, Andriy Gapon wrote:

On 10/11/2016 12:30, Henri Hennebert wrote:

On 11/10/2016 11:21, Andriy Gapon wrote:

On 09/11/2016 15:58, Eric van Gyzen wrote:

On 11/09/2016 07:48, Henri Hennebert wrote:

I encounter a strange deadlock on

FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 r308260:
Fri Nov  4 02:51:33 CET 2016
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ  amd64

This system is exclusively running on zfs.

After 3 or 4 days, `periodic daily` is locked in the directory
/usr/local/news/bin

[root@avoriaz ~]# ps xa|grep find
85656  -  D    0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune
-o ( -name [#,]* -o -name .#* -o -name a.out -o -nam
  462  1  S+   0:00.00 grep find
[root@avoriaz ~]# procstat -f 85656
  PID COMM    FD T V FLAGS REF  OFFSET PRO NAME
85656 find  text v r r---   -   - - /usr/bin/find
85656 find   cwd v d r---   -   - - /usr/local/news/bin
85656 find  root v d r---   -   - - /
85656 find 0 v c r---   3   0 - /dev/null
85656 find 1 p - rw--   1   0 - -
85656 find 2 v r -w--   7  17 - -
85656 find 3 v d r---   1   0 - /home/root
85656 find 4 v d r---   1   0 - /home/root
85656 find 5 v d rn--   1 533545184 - /usr/local/news/bin
[root@avoriaz ~]#

If I try `ls /usr/local/news/bin` it is also locked.

After `shutdown -r now` the system remains locked after the line '0 0 0 0 0 0'.

After a reset and reboot I can access /usr/local/news/bin.

I deleted this directory and reinstalled the package with `portupgrade -fu news/inn`.

5 days later `periodic daily` is locked on the same directory :-o

Any idea?


I can't help with the deadlock, but someone who _can_ help will probably ask for
the output of "procstat -kk PID" with the PID of the "find" process.


In fact, it's procstat -kk -a.  With just one thread we would see that a thread
is blocked on something, but we won't see why that something can not be 
acquired.



I attach the result,


Interesting.  I can not spot any suspicious thread that would hold the vnode
lock.  Could you please run kgdb (just like that, no arguments), then execute
'bt' command and then select a frame when _vn_lock is called with 'fr N'
command.  Then please 'print *vp' and share the result.


I Think I miss something in your request:

[root@avoriaz ~]# kgdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Reading symbols from /boot/kernel/zfs.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/zfs.ko.debug...done.

done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.

done.

--- clip ---

Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/daemon_saver.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/daemon_saver.ko.debug...done.

done.
Loaded symbols for /boot/kernel/daemon_saver.ko
#0  sched_switch (td=0xf8001131da00, newtd=0xf800762a8500, flags=<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1973
1973            cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xf8001131da00, newtd=0xf800762a8500, flags=<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1973
#1  0x80566b15 in tc_fill_vdso_timehands32 (vdso_th32=0x0) at /usr/src/sys/kern/kern_tc.c:2121
#2  0x80555227 in timekeep_push_vdso () at /usr/src/sys/kern/kern_sharedpage.c:174
#3  0x80566226 in tc_windup () at /usr/src/sys/kern/kern_tc.c:1426
#4  0x804eaa41 in hardclock_cnt (cnt=1, usermode=<value optimized out>) at /usr/src/sys/kern/kern_clock.c:589
#5  0x808fac74 in handleevents (now=<value optimized out>, fake=0) at /usr/src/sys/kern/kern_clocksource.c:223
#6  0x808fb1d7 in timercb (et=0x8100cf20, arg=<value optimized out>) at /usr/src/sys/kern/kern_clocksource.c:352
#7  0xf800b6429a00 in ?? ()
#8  0x81051080 in vm_page_array ()
#9  0x81051098 in vm_page_queue_free_mtx ()
#10 0xfe0101818920 in ?? ()
#11 0x805399c0 in __mtx_lock_sleep (c=<value optimized out>, tid=Error accessing memory address 0xffac: Bad address.
) at /usr/src/sys/kern/kern_mutex.c:590
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) q
[root@avoriaz ~]#

I can't find the requested frame.

Henri

Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-10 Thread Henri Hennebert

On 11/10/2016 11:21, Andriy Gapon wrote:

On 09/11/2016 15:58, Eric van Gyzen wrote:

On 11/09/2016 07:48, Henri Hennebert wrote:

I encounter a strange deadlock on

FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 r308260:
Fri Nov  4 02:51:33 CET 2016
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ  amd64

This system is exclusively running on zfs.

After 3 or 4 days, `periodic daily` is locked in the directory 
/usr/local/news/bin

[root@avoriaz ~]# ps xa|grep find
85656  -  D      0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune -o ( -name [#,]* -o -name .#* -o -name a.out -o -nam
  462  1  S+     0:00.00 grep find
[root@avoriaz ~]# procstat -f 85656
  PID COMM    FD T V FLAGS  REF     OFFSET PRO NAME
85656 find  text v r r---     -          - -   /usr/bin/find
85656 find   cwd v d r---     -          - -   /usr/local/news/bin
85656 find  root v d r---     -          - -   /
85656 find     0 v c r---     3          0 -   /dev/null
85656 find     1 p - rw--     1          0 -   -
85656 find     2 v r -w--     7         17 -   -
85656 find     3 v d r---     1          0 -   /home/root
85656 find     4 v d r---     1          0 -   /home/root
85656 find     5 v d rn--     1  533545184 -   /usr/local/news/bin
[root@avoriaz ~]#

If I try `ls /usr/local/news/bin` it is also locked.

After `shutdown -r now` the system remains locked after the line '0 0 0 0 0 0'.

After a reset and reboot, I can access /usr/local/news/bin.

I deleted this directory and reinstalled the package with `portupgrade -fu news/inn`.

5 days later `periodic daily` is locked on the same directory :-o

Any idea?


I can't help with the deadlock, but someone who _can_ help will probably ask for
the output of "procstat -kk PID" with the PID of the "find" process.


In fact, it's procstat -kk -a.  With just one thread we would see that a thread
is blocked on something, but we won't see why that something can not be 
acquired.



I attach the result,

Henri
[root@avoriaz ~]# procstat -kk -a
  PID    TID COMM     TDNAME            KSTACK
    0     10 kernel   swapper           mi_switch+0xd2 sleepq_timedwait+0x3a _sleep+0x281 swapper+0x464 btext+0x2c
    0     19 kernel   kqueue_ctx taskq  mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100012 kernel   aiod_kick taskq   mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100013 kernel   thread taskq      mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100018 kernel   firmware taskq    mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100022 kernel   acpi_task_0       mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100023 kernel   acpi_task_1       mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100024 kernel   acpi_task_2       mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100025 kernel   em0 que           mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100026 kernel   em0 txq           mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100027 kernel   em1 taskq         mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100060 kernel   mca taskq         mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd taskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100061 kernel   system_taskq_0    mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100062 kernel   system_taskq_1    mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100063 kernel   dbu_evict         mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100072 kernel   CAM taskq         mi_switch+0xd2 sleepq_wait+0x3a _sleep+0x2a1 taskqueue_thread_loop+0x141 fork_exit+0x85 fork_trampoline+0xe
    0 100086 kernel   if_config_tqg_0   mi_switch+0xd2 sleepq_wait+0x3a msleep_spin_sbt+0x1bd gtaskqueue_thread_loop+0x113 fork_exit+0x85 fork_trampoline+0xe
    0 100087 kernel   if_io_tqg_0       mi_switch+0

Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-10 Thread Henri Hennebert

On 11/09/2016 19:23, Thierry Thomas wrote:

On Wed, 9 Nov 2016 at 15:03:49 +0100, Henri Hennebert <h...@restart.be>
 wrote:


[root@avoriaz ~]# procstat -kk 85656
  PID    TID COMM TDNAME KSTACK
85656 101112 find -      mi_switch+0xd2
sleepq_wait+0x3a sleeplk+0x1b4 __lockmgr_args+0x356 vop_stdlock+0x3c
VOP_LOCK1_APV+0x8d _vn_lock+0x43 vget+0x47 cache_lookup+0x679
vfs_cache_lookup+0xac VOP_LOOKUP_APV+0x87 lookup+0x591 namei+0x572
kern_statat+0xa8 sys_fstatat+0x2c amd64_syscall+0x4ce Xfast_syscall+0xfb


It looks similar to the problem reported in PR 205163
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205163

It may be caused by too small values for some vfs.zfs.arc*.
Could you please list sysctl for vfs.zfs.arc_max and others?
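
As an illustration of the request above, a small sketch (not from the thread) that pulls only the ARC tunables out of saved `sysctl vfs.zfs` output. It assumes the usual `name: value` line format; the function names are hypothetical, and no particular values are implied for this machine.

```python
def parse_sysctl(text):
    """Parse `sysctl` output lines of the form "name: value" into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # skip lines that are not "name: value"
            values[name.strip()] = value.strip()
    return values


def arc_settings(values):
    """Keep only the vfs.zfs.arc* entries (arc_max, arc_min, ...)."""
    return {k: v for k, v in values.items() if k.startswith("vfs.zfs.arc")}
```

On a live system, `sysctl vfs.zfs | grep arc` gives the same view directly; the parser is only useful when working from a saved listing.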

Regards,


[root@avoriaz ~]# sysctl vfs.zfs
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vol.unmap_enabled: 1
vfs.zfs.vol.recursive: 0
vfs.zfs.vol.mode: 1
vfs.zfs.version.zpl: 5
vfs.zfs.version.spa: 5000
vfs.zfs.version.acl: 1
vfs.zfs.version.ioctl: 6
vfs.zfs.debug: 0
vfs.zfs.super_owner: 0
vfs.zfs.sync_pass_rewrite: 2
vfs.zfs.sync_pass_dont_compress: 5
vfs.zfs.sync_pass_deferred_free: 2
vfs.zfs.zio.exclude_metadata: 0
vfs.zfs.zio.use_uma: 1
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_replay_disable: 0
vfs.zfs.min_auto_ashift: 9
vfs.zfs.max_auto_ashift: 13
vfs.zfs.vdev.trim_max_pending: 1
vfs.zfs.vdev.bio_delete_disable: 0
vfs.zfs.vdev.bio_flush_disable: 0
vfs.zfs.vdev.write_gap_limit: 4096
vfs.zfs.vdev.read_gap_limit: 32768
vfs.zfs.vdev.aggregation_limit: 131072
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.scrub_max_active: 2
vfs.zfs.vdev.scrub_min_active: 1
vfs.zfs.vdev.async_write_max_active: 10
vfs.zfs.vdev.async_write_min_active: 1
vfs.zfs.vdev.async_read_max_active: 3
vfs.zfs.vdev.async_read_min_active: 1
vfs.zfs.vdev.sync_write_max_active: 10
vfs.zfs.vdev.sync_write_min_active: 10
vfs.zfs.vdev.sync_read_max_active: 10
vfs.zfs.vdev.sync_read_min_active: 10
vfs.zfs.vdev.max_active: 1000
vfs.zfs.vdev.async_write_active_max_dirty_percent: 60
vfs.zfs.vdev.async_write_active_min_dirty_percent: 30
vfs.zfs.vdev.mirror.non_rotating_seek_inc: 1
vfs.zfs.vdev.mirror.non_rotating_inc: 0
vfs.zfs.vdev.mirror.rotating_seek_offset: 1048576
vfs.zfs.vdev.mirror.rotating_seek_inc: 5
vfs.zfs.vdev.mirror.rotating_inc: 0
vfs.zfs.vdev.trim_on_init: 1
vfs.zfs.vdev.cache.bshift: 16
vfs.zfs.vdev.cache.size: 0
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.vdev.metaslabs_per_vdev: 200
vfs.zfs.txg.timeout: 5
vfs.zfs.space_map_blksz: 4096
vfs.zfs.spa_slop_shift: 5
vfs.zfs.spa_asize_inflation: 24
vfs.zfs.deadman_enabled: 1
vfs.zfs.deadman_checktime_ms: 5000
vfs.zfs.deadman_synctime_ms: 100
vfs.zfs.debug_flags: 0
vfs.zfs.recover: 0
vfs.zfs.spa_load_verify_data: 1
vfs.zfs.spa_load_verify_metadata: 1
vfs.zfs.spa_load_verify_maxinflight: 1
vfs.zfs.ccw_retry_interval: 300
vfs.zfs.check_hostid: 1
vfs.zfs.mg_fragmentation_threshold: 85
vfs.zfs.mg_noalloc_threshold: 0
vfs.zfs.condense_pct: 200
vfs.zfs.metaslab.bias_enabled: 1
vfs.zfs.metaslab.lba_weighting_enabled: 1
vfs.zfs.metaslab.fragmentation_factor_enabled: 1
vfs.zfs.metaslab.preload_enabled: 1
vfs.zfs.metaslab.preload_limit: 3
vfs.zfs.metaslab.unload_delay: 8
vfs.zfs.metaslab.load_pct: 50
vfs.zfs.metaslab.min_alloc_size: 33554432
vfs.zfs.metaslab.df_free_pct: 4
vfs.zfs.metaslab.df_alloc_threshold: 131072
vfs.zfs.metaslab.debug_unload: 0
vfs.zfs.metaslab.debug_load: 0
vfs.zfs.metaslab.fragmentation_threshold: 70
vfs.zfs.metaslab.gang_bang: 16777217
vfs.zfs.free_bpobj_enabled: 1
vfs.zfs.free_max_blocks: 18446744073709551615
vfs.zfs.no_scrub_prefetch: 0
vfs.zfs.no_scrub_io: 0
vfs.zfs.resilver_min_time_ms: 3000
vfs.zfs.free_min_time_ms: 1000
vfs.zfs.scan_min_time_ms: 1000
vfs.zfs.scan_idle: 50
vfs.zfs.scrub_delay: 4
vfs.zfs.resilver_delay: 2
vfs.zfs.top_maxinflight: 32
vfs.zfs.zfetch.array_rd_sz: 1048576
vfs.zfs.zfetch.max_distance: 8388608
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8
vfs.zfs.prefetch_disable: 1
vfs.zfs.delay_scale: 50
vfs.zfs.delay_min_dirty_percent: 60
vfs.zfs.dirty_data_sync: 67108864
vfs.zfs.dirty_data_max_percent: 10
vfs.zfs.dirty_data_max_max: 4294967296
vfs.zfs.dirty_data_max: 373664153
vfs.zfs.max_recordsize: 1048576
vfs.zfs.mdcomp_disable: 0
vfs.zfs.nopwrite_enabled: 1
vfs.zfs.dedup.prefetch: 1
vfs.zfs.l2c_only_size: 0
vfs.zfs.mfu_ghost_data_lsize: 24202240
vfs.zfs.mfu_ghost_metadata_lsize: 136404992
vfs.zfs.mfu_ghost_size: 160607232
vfs.zfs.mfu_data_lsize: 449569280
vfs.zfs.mfu_metadata_lsize: 102724608
vfs.zfs.mfu_size: 714202624
vfs.zfs.mru_ghost_data_lsize: 874834432
vfs.zfs.mru_ghost_metadata_lsize: 387692032
vfs.zfs.mru_ghost_size: 1262526464
vfs.zfs.mru_data_lsize: 151275008
vfs.zfs.mru_metadata_lsize: 13547008
vfs.zfs.mru_size: 322614272
vfs.zfs.anon_data_lsize: 0
vfs.zfs.anon_metadata_lsize: 0
vfs.zfs.anon_size: 2916352
vfs.zfs.l2arc_norw: 1
vfs.zfs.l2arc_feed

Re: Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-09 Thread Henri Hennebert

On 11/09/2016 14:58, Eric van Gyzen wrote:

On 11/09/2016 07:48, Henri Hennebert wrote:

I encounter a strange deadlock on

FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 r308260:
Fri Nov  4 02:51:33 CET 2016
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ  amd64

This system is exclusively running on zfs.

After 3 or 4 days, `periodic daily` is locked in the directory 
/usr/local/news/bin

[root@avoriaz ~]# ps xa|grep find
85656  -  D      0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune -o ( -name [#,]* -o -name .#* -o -name a.out -o -nam
  462  1  S+     0:00.00 grep find
[root@avoriaz ~]# procstat -f 85656
  PID COMM    FD T V FLAGS  REF     OFFSET PRO NAME
85656 find  text v r r---     -          - -   /usr/bin/find
85656 find   cwd v d r---     -          - -   /usr/local/news/bin
85656 find  root v d r---     -          - -   /
85656 find     0 v c r---     3          0 -   /dev/null
85656 find     1 p - rw--     1          0 -   -
85656 find     2 v r -w--     7         17 -   -
85656 find     3 v d r---     1          0 -   /home/root
85656 find     4 v d r---     1          0 -   /home/root
85656 find     5 v d rn--     1  533545184 -   /usr/local/news/bin
[root@avoriaz ~]#

If I try `ls /usr/local/news/bin` it is also locked.

After `shutdown -r now` the system remains locked after the line '0 0 0 0 0 0'.

After a reset and reboot, I can access /usr/local/news/bin.

I deleted this directory and reinstalled the package with `portupgrade -fu news/inn`.

5 days later `periodic daily` is locked on the same directory :-o

Any idea?

I can't help with the deadlock, but someone who _can_ help will probably ask for
the output of "procstat -kk PID" with the PID of the "find" process.

Eric

[root@avoriaz ~]# procstat -kk 85656
  PID    TID COMM TDNAME KSTACK
85656 101112 find -      mi_switch+0xd2
sleepq_wait+0x3a sleeplk+0x1b4 __lockmgr_args+0x356 vop_stdlock+0x3c 
VOP_LOCK1_APV+0x8d _vn_lock+0x43 vget+0x47 cache_lookup+0x679 
vfs_cache_lookup+0xac VOP_LOOKUP_APV+0x87 lookup+0x591 namei+0x572 
kern_statat+0xa8 sys_fstatat+0x2c amd64_syscall+0x4ce Xfast_syscall+0xfb




Henri
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Freebsd 11.0 RELEASE - ZFS deadlock

2016-11-09 Thread Henri Hennebert

I encounter a strange deadlock on

FreeBSD avoriaz.restart.bel 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #0 
r308260: Fri Nov  4 02:51:33 CET 2016 
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ  amd64


This system is exclusively running on zfs.

After 3 or 4 days, `periodic daily` is locked in the directory 
/usr/local/news/bin


[root@avoriaz ~]# ps xa|grep find
85656  -  D      0:01.13 find / ( ! -fstype local -o -fstype rdonly ) -prune -o ( -name [#,]* -o -name .#* -o -name a.out -o -nam
  462  1  S+     0:00.00 grep find
[root@avoriaz ~]# procstat -f 85656
  PID COMM    FD T V FLAGS  REF     OFFSET PRO NAME
85656 find  text v r r---     -          - -   /usr/bin/find
85656 find   cwd v d r---     -          - -   /usr/local/news/bin
85656 find  root v d r---     -          - -   /
85656 find     0 v c r---     3          0 -   /dev/null
85656 find     1 p - rw--     1          0 -   -
85656 find     2 v r -w--     7         17 -   -
85656 find     3 v d r---     1          0 -   /home/root
85656 find     4 v d r---     1          0 -   /home/root
85656 find     5 v d rn--     1  533545184 -   /usr/local/news/bin

[root@avoriaz ~]#

If I try `ls /usr/local/news/bin` it is also locked.

After `shutdown -r now` the system remains locked after the line '0 0 0 0 0 0'.

After a reset and reboot, I can access /usr/local/news/bin.

I deleted this directory and reinstalled the package with `portupgrade -fu news/inn`.

5 days later `periodic daily` is locked on the same directory :-o

Any idea?

Henri


Re: 10-STABLE hangups frequently

2016-02-18 Thread Henri Hennebert
On 02/18/2016 01:24, Marius Strobl wrote:
> 
> Could those of you experiencing these hangs with ZFS please test 
> whether instead of reverting all of r292895, a kernel built with 
> just the merge of r291244 undone via the following patch gets rid 
> of that problem - especially on amd64 - and report back? 
> https://people.freebsd.org/~marius/r291244_reversal_10.diff
> 
> Marius
> 
On an i386 with 2GB and pure ZFS, all is normal without r291244.

Henri


Re: 10-STABLE hangups frequently

2016-02-07 Thread Henri Hennebert
On 02/03/2016 02:03, Hajimu UMEMOTO wrote:
> Hi,
> 
>> On Wed, 3 Feb 2016 07:07:38 +1100
>> Peter Jeremy  said:
> 
> peter> As others have said, you need to provide lots more detail on your
> peter> configuration.
> 
> CPU: AMD Athlon(tm) 64 Processor 3500+
> Memory: 4GB
> HDD: 3TB
> 
> I'm using ZFS only setup.
> 
> peter> There were no problems at r290231 but after I upgraded to r295005, I
> peter> started seeing "out of swap" errors and hangs during the periodic
> peter> daily runs.  I'm not seeing this on 1GB instances - though they are
> peter> all running UFS.
> 
> r292875 runs well:
> 
> FreeBSD asuka.mahoroba.org 10.2-STABLE FreeBSD 10.2-STABLE #5 r292875: Tue Feb
>  2 07:08:29 JST 2016 r...@asuka.mahoroba.org:/usr/obj/usr/src/sys/ASUKA  amd6
> 
> r292895 hangs:
> 
> FreeBSD asuka.mahoroba.org 10.2-STABLE FreeBSD 10.2-STABLE #6 r292895: Tue 
> Feb  2 10:17:28 JST 2016 r...@asuka.mahoroba.org:/usr/obj/usr/src/sys/ASUKA  
> amd64
> 
> I tried latest stable (r295137) with the sys/kern/vfs_subr.c part of
> r292895 reverted, and it seems running well, here:
> 
> FreeBSD asuka.mahoroba.org 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #0 
> r295137M: Tue Feb  2 20:39:11 JST 2016 
> r...@asuka.mahoroba.org:/usr/obj/usr/src/sys/ASUKA  amd64
> 
> peter> Some experimentation suggested that just "find /" was enough to wedge
> peter> my system.  I did some experimenting and found that the following
> peter> loader config was enough to prevent it hanging:
> peter> vfs.zfs.arc_max="128M"
> peter> vfs.zfs.arc_meta_limit="50M"
> peter> vfs.zfs.arc_min="25M"
> peter> (previously, I had no ZFS tuning at all).
> 
> I had ZFS tuning before.  However, after this problem occurred, I
> removed all of the ZFS tuning.
> The FS related setting is only kern.maxvnodes=40, now.
> 
> Sincerely,
> 
> --
> Hajimu UMEMOTO
> u...@mahoroba.org  u...@freebsd.org
> http://www.mahoroba.org/~ume/

I encountered a hang three times after upgrading to
10.3-PRERELEASE r295247M in a ZFS configuration (i386 with 2GB of memory)
while trying to run security/tripwire (which computes checksums of all the files).

With /usr/src/sys/kern/vfs_subr.c at revision 291757, everything returns to normal.

Henri

PS thanks Hajimu





Link error in usr.bin/dig if WITH_BIND_XML=yes

2013-09-11 Thread Henri Hennebert
Hello,

Dig can't be linked if WITH_BIND_XML=yes is added to /etc/src.conf.

[root@morzine src]# svn info
Path: .
Working Copy Root Path: /usr/src
URL: http://svn.restart.bel/svn-FreeBSD-base/stable/9
Relative URL: ^/stable/9
Repository Root: http://svn.restart.bel/svn-FreeBSD-base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 255473
Node Kind: directory
Schedule: normal
Last Changed Author: des
Last Changed Rev: 255443
Last Changed Date: 2013-09-10 12:07:21 +0200 (Tue, 10 Sep 2013)


=== usr.bin/dig (all)
/usr/src/usr.bin/dig/../../contrib/bind9/bin/dig/dighost.c:4336:27:
warning: passing 'const char *' to parameter of type 'void *' discards
qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
isc_buffer_init(buffer, str, len);
 ^~~
/usr/src/usr.bin/dig/../../contrib/bind9/lib/isc/include/isc/buffer.h:225:41:
note: passing argument to parameter 'base' here
isc__buffer_init(isc_buffer_t *b, void *base, unsigned int length);
^
1 warning generated.
/usr/local/lib/libxml2.a(xzlib.o): In function `__libxml2_xzclose':
xzlib.c:(.text+0x69): undefined reference to `lzma_end'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_decomp':
xzlib.c:(.text+0x4a6): undefined reference to `lzma_code'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_make':
xzlib.c:(.text+0x8cd): undefined reference to `lzma_auto_decoder'
xzlib.c:(.text+0xa04): undefined reference to `lzma_properties_decode'
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
*** [dig] Error code 1

Stop in /usr/src/usr.bin/dig.
*** [all] Error code 1
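
A quick way to summarise a failure like this is to collect the distinct undefined symbols from the linker output. The sketch below is illustrative only (the function name is made up). Here every missing symbol carries the `lzma_` prefix used by liblzma, which suggests, without proving it, that the static libxml2.a was built with LZMA support while liblzma is missing from the dig link line.

```python
import re

# GNU ld reports missing symbols as: undefined reference to `symbol'
UNDEFINED = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(log_text):
    """Return the distinct undefined symbols in a link log, in first-seen order."""
    seen = []
    for symbol in UNDEFINED.findall(log_text):
        if symbol not in seen:
            seen.append(symbol)
    return seen
```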


Henri


Re: Lost CAM Access to DVD Writer

2013-09-05 Thread Henri Hennebert
On 09/02/2013 01:54, Thomas Laus wrote:
 Hi,
 
 :-( Unable to CAMGETPASSTHRU for /dev/cd0 Inappropriate ioctl for 
 device.

I encountered the same problem, and reinstalling dvd+rw-tools-7.1 solved it.

Henri


 Could someone else try to make a 'dump to DVD' backup [...]
 /sbin/dump -0u  -L -C16 -B4589840 -P 'growisofs -Z /dev/cd0=/dev/fd/0' /u
 
 A test with less disk load would be to write e.g. 100 MB of zeros
 to e.g. DVD+RW media (in order to reduce waste):
 
   dd if=/dev/zero bs=1M count=100 | growisofs -Z /dev/cd0=/dev/fd/0
 
 I got the same result and error message as I did when trying to dump the file 
 system on all of the computers that use ATA for disk access.  On the one PC 
 that uses AHCI, it was able to write to the DVD.
 
 Tom
 



Re: 9.2-BETA2 - Problem with newsyslog

2013-07-31 Thread Henri Hennebert
On 07/29/2013 11:18, Henri Hennebert wrote:
 Hello,
 
 My entry for newsyslog in /etc/crontab is:
 
 0  *  *  *  *  root    newsyslog -t \%Y-\%m-\%d_\%H:\%M
 
 And I get:
 
 newsyslog: Could not convert time string to time value: No such file or
 directory
 
 I tried to use the newsyslog from head, to no avail. This solution was
 working a month ago (see Revision 248776)

Here I must have made some mistake... I retried with newsyslog.c from head
and all is OK

Henri


 
 My file system is zfs version 28.
 
 Henri
 



9.2-BETA2 - Problem with newsyslog

2013-07-29 Thread Henri Hennebert
Hello,

My entry for newsyslog in /etc/crontab is:

0  *  *  *  *  root    newsyslog -t \%Y-\%m-\%d_\%H:\%M

And I get:

newsyslog: Could not convert time string to time value: No such file or
directory

I tried to use the newsyslog from head, to no avail. This solution was
working a month ago (see Revision 248776)

My file system is zfs version 28.

Henri


9.2-BETA2 bind + WITH_BIND_XML=yes + libxml2-2.8.0

2013-07-29 Thread Henri Hennebert
Hello,

When compiling world of 9.2-BETA2 and adding in /etc/src.conf

WITH_BIND_XML=yes

and with libxml2-2.8.0_2 (textproc/libxml2) installed in /usr/local

I get this link error:

=== usr.bin/dig (all)
/usr/local/lib/libxml2.a(xzlib.o): In function `__libxml2_xzclose':
xzlib.c:(.text+0x69): undefined reference to `lzma_end'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_decomp':
xzlib.c:(.text+0x4a6): undefined reference to `lzma_code'
/usr/local/lib/libxml2.a(xzlib.o): In function `xz_make':
xzlib.c:(.text+0x8cd): undefined reference to `lzma_auto_decoder'
xzlib.c:(.text+0xa04): undefined reference to `lzma_properties_decode'
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
*** [dig] Error code 1

Stop in /usr/src/usr.bin/dig.
*** [all] Error code 1


Henri


Re: sysctl -a causes kernel trap 12

2013-02-13 Thread Henri Hennebert
On 02/12/2013 12:22, Henri Hennebert wrote:
 On 01/19/2013 06:58, Brandon Gooch wrote:
 On Fri, Jan 18, 2013 at 2:56 PM, Xin Li delp...@delphij.net wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA512

 On 01/18/13 12:50, Brandon Gooch wrote:
 On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net
 mailto:delp...@delphij.net wrote:

 -BEGIN PGP SIGNED MESSAGE- Hash: SHA256

 To all: this became more and more hard to replicate lately.  I've
 tried these options and the most important progress is that it's
 possible to get a crashdump when debug.debugger_on_panic=0 and I
 managed to get a backtrace which indicates the panic occur when
 trying to do mtx_lock(Giant) - __mtx_lock_sleep - turnstile_wait
 - propagate_priority, but after I've added some instruments to
 the surrounding code and enabled INVARIANT and/or WITNESS, it
 mysteriously went away.

 Reverting my instruments code and update to latest svn makes the
 issue disappear for one day.  I've hit it again today but
 unfortunately didn't get a successful dump and after reboot I can't
 reproduce it again :(

 Still trying...


 Any updates Xin?

 No, it mysteriously disappeared for now.  According to my
 understanding to recent svn commits, I didn't see anybody committing
 something that fixes it but I can no longer panic my system, with or
 without debugging code :(

 I was actually hitting what I believe to be exactly the same issue
 as you on one of my systems, and, as you've seen, adding any extra
 debugging or diagnostics seemed to eliminate the issue.

 I was able to generate quite a few vmcores and still have these
 sitting around in my filesystem (along with the kernels that helped
 produce them).

 I can recreate this crash on my system by compiling the NVIDIA
 driver with clang at -O1 and above. Although it's been noted that
 this issue has been seen in scenarios without an NVIDIA driver in
 the mix, whatever is happening in the kernel to cause the panic is
 somehow triggered by this, at least on my system.

 I'm not sure if this is the same problem.  Could you please try using
 gcc to compile the NVIDIA driver and see if that fixes the problem?

 Cheers,
 - --
 Xin LI delp...@delphij.net    https://www.delphij.net/
 FreeBSD - The Power to Serve!   Live free or die


 Indeed, a gcc compiled NVIDIA module eliminates the issue, sorry if I
 hadn't mentioned this earlier.

 What was happening to me at first was that my system would just hang while
 booting. I was able to figure out that it was during /etc/rc.d/initrandom.
 I actually got to a point where I removed the call to sysctl -a from
 'better_than_nothing()' in /etc/rc.d/initrandom to have a booting system. I
 finally had a situation where I could get a panic by adding SW_WATCHDOG to
 my kernel and running watchdogd(8).

 For me, this panic would come and go seemingly at random as well, and I
 couldn't fumble my way around in the debugger to learn much of anything
 when I first started seeing it. I just started a process of modularizing
 everything I could in my kernel config, then loading modules 1-by-1 and
 booting over-and-over until I finally found what appeared to be the
 problem, which was the NVIDIA module compiled with clang.

 Oh, another thing: at times it seemed as though it was the number of
 modules loaded, as I could get the hang with 41 modules loaded, but not 40
 or 42?! I admit, when I was seeing that behavior, I hadn't eliminated the
 NVIDIA driver from my loaded modules. I need to revisit the panic situation
 to confirm this particular strangeness.

 Here's the last panic I had:

 Unread portion of the kernel message buffer:
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 1175 (sysctl)

 (kgdb) bt
 #0  doadump (textdump=1694704112) at pcpu.h:229
 #1  0x802fab82 in db_fncall (dummy1=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:578
 #2  0x802fa85a in db_command (last_cmdp=<value optimized out>, cmd_table=<value optimized out>, dopager=1) at /usr/src/sys/ddb/db_command.c:449
 #3  0x802fa612 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502
 #4  0x802fcf60 in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:231
 #5  0x804a7b93 in kdb_trap (type=12, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
 #6  0x807157c5 in trap_fatal (frame=0xff8865032670, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:867
 #7  0x80715adb in trap_pfault (frame=0x0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:698
 #8  0x8071529b in trap (frame=0xff8865032670) at /usr/src/sys/amd64/amd64/trap.c:463
 #9  0x806ff382 in calltrap () at exception.S:228
 #10 0x8047bd50 in sysctl_sysctl_next_ls

Re: Can not build kernel with modular ata and ATA_CAM

2012-06-25 Thread Henri Hennebert

On 06/25/2012 10:50, Mitya wrote:

My kernel options:

# Bus support.
device  acpi
device  pci

# Modular ATA
device  atadisk # ATA disk drives
device  atacore # Core ATA functionality
device  atapci  # PCI bus support; only generic chipset support
device  ataintel# Intel

options ATA_CAM # Handle legacy controllers with CAM
options ATA_STATIC_ID   # Static device numbering

# ATA/SCSI peripherals
device  scbus   # SCSI bus (required for ATA/SCSI)
device  da  # Direct Access (disks)
device  pass    # Passthrough device (direct ATA/SCSI access)


From /usr/src/sys/conf/NOTES:

# ATA_CAM:  Turn ata(4) subsystem controller drivers into cam(4)
#   interface modules. This deprecates all ata(4)
#   peripheral device drivers (atadisk, ataraid, atapicd,
#   atapifd, atapist, atapicam) and all user-level APIs.
#   cam(4) drivers and APIs will be connected instead.


So you must remove 'device  atadisk'
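
The rule quoted from NOTES can be checked mechanically before building. The following is only a sketch: the function name is invented, and the driver list is taken from the NOTES excerpt above rather than from any authoritative table.

```python
# ata(4) peripheral drivers that the NOTES excerpt lists as deprecated by ATA_CAM.
DEPRECATED_BY_ATA_CAM = {
    "atadisk", "ataraid", "atapicd", "atapifd", "atapist", "atapicam",
}


def ata_cam_conflicts(config_text):
    """Return devices in a kernel config that should go when ATA_CAM is set."""
    devices, options = set(), set()
    for line in config_text.splitlines():
        words = line.split("#", 1)[0].split()  # drop trailing comments
        if len(words) < 2:
            continue
        if words[0] == "device":
            devices.add(words[1])
        elif words[0] == "options":
            options.add(words[1].split("=", 1)[0])
    if "ATA_CAM" not in options:
        return []
    return sorted(devices & DEPRECATED_BY_ATA_CAM)
```

Run against the configuration posted above, it would report `atadisk`, matching the advice to remove that line.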

Henri




make's output:

ata-disk.o: In function `ad_init':
ata-disk.c:(.text+0x7d): undefined reference to `ata_setmode'
ata-disk.c:(.text+0x95): undefined reference to `ata_wc'
ata-disk.c:(.text+0xc9): undefined reference to `ata_controlcmd'
ata-disk.c:(.text+0x11b): undefined reference to `ata_controlcmd'
ata-disk.c:(.text+0x16d): undefined reference to `ata_controlcmd'
ata-disk.c:(.text+0x1b6): undefined reference to `ata_controlcmd'
ata-disk.o: In function `ad_shutdown':
ata-disk.c:(.text+0x258): undefined reference to `ata_controlcmd'
ata-disk.o: In function `ad_detach':
ata-disk.c:(.text+0x479): undefined reference to `ata_fail_requests'
ata-disk.o: In function `ad_dump':
ata-disk.c:(.text+0x861): undefined reference to `ata_drop_requests'
ata-disk.c:(.text+0x921): undefined reference to `ata_controlcmd'
ata-disk.o: In function `ad_attach':
ata-disk.c:(.text+0xa40): undefined reference to `ata_setmax'
ata-disk.c:(.text+0xb62): undefined reference to `ata_satarev2str'
ata-disk.c:(.text+0xba7): undefined reference to `ata_unit2str'
ata-disk.c:(.text+0xfff): undefined reference to `ata_queue_request'
ata-disk.c:(.text+0x131e): undefined reference to `ata_queue_request'
ata-disk.c:(.text+0x1340): undefined reference to `ata_getparam'
ata-disk.o: In function `ad_spindown':
ata-disk.c:(.text+0x539): undefined reference to `ata_queue_request'
ata-disk.o: In function `ad_ioctl':
ata-disk.c:(.text+0x5a4): undefined reference to `ata_device_ioctl'
ata-disk.o: In function `ad_strategy':
ata-disk.c:(.text+0x6c7): undefined reference to `ata_queue_request'
*** [kernel] Error code 1

I found differences in ata-all.c and ata-all.h

In ata-all.c:
#ifndef ATA_CAM
void
ata_setmode(device_t dev)
{

But, in ata-all.h:
void ata_setmode(device_t dev);

without any #ifdef or #ifndef









Re: cvsup{, d} woes after upgrading to RELENG_9 on amd64 this weekend

2012-06-05 Thread Henri Hennebert

On 06/05/2012 10:17, Scot Hetzel wrote:

On Mon, Jun 4, 2012 at 4:34 AM, Henri Hennebert h...@restart.be wrote:

On 06/04/2012 10:53, Trond Endrestøl wrote:


Hi,

After upgrading to RELENG_9 as of yesterday on my amd64 system, cvsup
bombs out with Bus error: 10.

Example:

# /usr/local/bin/cvsup -g -L 2 /usr/src/stable-supfile
Parsing supfile /usr/src/stable-supfile
Connecting to localhost
Connected to localhost
Server software version: SNAP_16_1h
Negotiating file attribute support
Exchanging collection information
Establishing multiplexed-mode data connection
Running
Updating collection src-all/cvs
Bus error: 10

The only recent change I can think of is switching to clang for
building the kernel and base. Maybe I should rebuild world and kernel
using gcc.


This is the culprit, you must compile libc and libz with gcc.

See http://www.freebsd.org/cgi/query-pr.cgi?pr=162588



make.conf snippet from PR 162588:

.if defined(WITH_CLANG)
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
.if !defined(CPP) || ${CPP} == cpp
CPP=clang -E
.endif
NO_WERROR=
WERROR=
.endif # WITH_CLANG

According to http://wiki.freebsd.org/BuildingFreeBSDWithClang#Quickstart,
you should be using:

CPP=clang-cpp



I changed this a while ago and it did not change the problem at hand.

Henri


If you change this, does it fix the issue?

Scot





Re: cvsup{, d} woes after upgrading to RELENG_9 on amd64 this weekend

2012-06-04 Thread Henri Hennebert

On 06/04/2012 10:53, Trond Endrestøl wrote:

Hi,

After upgrading to RELENG_9 as of yesterday on my amd64 system, cvsup
bombs out with Bus error: 10.

Example:

# /usr/local/bin/cvsup -g -L 2 /usr/src/stable-supfile
Parsing supfile /usr/src/stable-supfile
Connecting to localhost
Connected to localhost
Server software version: SNAP_16_1h
Negotiating file attribute support
Exchanging collection information
Establishing multiplexed-mode data connection
Running
Updating collection src-all/cvs
Bus error: 10

The only recent change I can think of is switching to clang for
building the kernel and base. Maybe I should rebuild world and kernel
using gcc.


This is the culprit, you must compile libc and libz with gcc.

See http://www.freebsd.org/cgi/query-pr.cgi?pr=162588

Henri


Today, I used portupgrade -fprv lang/ezm3 net/cvsup-without-gui, but
cvsup gives me the same result as in the example above.

This bug also affects cvsupd for those of us who are running a local
FreeBSD CVSup mirror (http://motoyuki.bsdclub.org/BSD/cvsup.html) on
amd64/RELENG_9.

I know csup is generally preferred over cvsup, and in the meantime I'm
able to use csup with another local FreeBSD CVSup mirror running on
i386/RELENG_8.

cvsup on the amd64 box crashes with Bus error even when accessing the
CVSup mirror on the i386 box, thus indicating a problem local to the
amd64 box.

I welcome any clues to solve this problem.






Re: strange system corruption (freebsd 9.0)

2012-04-21 Thread Henri Hennebert

On 04/20/2012 18:38, Ingrid Ditra wrote:

Hi, folks!
I am in the middle of a really creepy problem with my new FreeBSD box. I would
appreciate any ideas about what the hell happened.

I installed FreeBSD 9.0 from CD on an IBM System x3550 server (with RAID-5 on 4
hard drives) and moved most of the configs from my old FreeBSD (8.2) box onto
it, and everything seemed to work fine for some days. Long story short, today
I realised that I can't log in through either ssh or the console, some
third-party software doesn't work, and most of the utilities from the base
system don't work either.
My /usr/sbin and /usr/libdata are completely gone, /usr/libexec is empty,
/usr/bin contains only the dtrace dir and librt.so.1, many files from /usr/bin
are gone, /usr/src contains only the directory with my kernconf (all the
sources used to be there) and /usr/ports contains only the ports I've installed.
The access times of all deleted or semi-deleted dirs are almost the same,

Did you look carefully in /var/log/cron around that same time?

Another thought: please post your filesystem layout.

 but I didn't find any weird actions in the logs. At first I thought that
portsnap (run by cron) had somehow corrupted my system, but it was
executed about eight hours earlier.

No one but me has access to this box, so it's unlikely to be a mean joke.

Is this system connected to the internet? Does it run an http server? If so,
check the logs.




So, please, please, help me. I really do not know what I am supposed to do now. I
can't find out why this happened, so it would be useless to just reinstall the
system -- I'll have this situation again. All this stuff has happened twice


At the same time of day, or day of the week?

-- so it is not some kind of glitch (last time I cvsuped sources and ports
and thought that was the reason for the crash).

Maybe I'm not helpful, but at least you feel less lonely ...



Re: libutempter

2012-01-14 Thread Henri Hennebert
On 01/14/2012 09:47, Andre Goree wrote:
 On Tue, 10 Jan 2012 19:58:13 -0600, Andre Goree an...@drenet.info wrote:
 
 I recently csup'd 9-STABLE and was able to get it working along with my
 custom kernel.  I'm now in the process of rebuilding all my ports, and
 I've
 come across something when running 'portmaster -af' that I can't seem to
 find any information on.

 === Launching child to reinstall libutempter-1.1.5_1

 === Port directory: /usr/ports/sysutils/libutempter

 === This port is marked IGNORE
 === is now contained in the base system


 === If you are sure you can build it, remove the
IGNORE line in the Makefile and try again.

 === Update for libutempter-1.1.5_1 failed
 === Aborting update

 Terminated


 I figure, ok I'll just delete the package and move on.  However, there
 are many packages I have installed that depend on libutempter.  I would
 still just proceed with the removal given that the functionality is
 provided in base now, however I don't want to break all these ports and
 have to deal with the mess when I portmaster -af again.

 What is the recommended action here?  Should I just force exclude that
 port
 from the upgrade?  That's probably the easiest way but I'd have to deal
 with this at some point.

 Thanks in advance for any advice

 -- 
 Andre Goree
 andre@drenetinfo
 
 So I've rebuilt everything that I could, but when I get to the ports
 that depend on libutempter, I get an error that they could not be
 reinstalled due to a failure with libutempter  :/
 
 ---  Skipping 'www/opera' (opera-11.60) because a requisite package
 'libutempter-1.1.5_1' (sysutils/libutempter) failed (specify -k to force)
 ---  Skipping 'www/opera-linuxplugins' (opera-linuxplugins-11.60)
 because a requisite package 'opera-11.60' (www/opera) failed (specify -k
 to force)
 ---  Skipping 'deskutils/kdeplasma-addons' (kdeplasma-addons-4.7.3)
 because a requisite package 'kdepimlibs-4.7.3' (deskutils/kdepimlibs4)
 failed (specify -k to force)
 ---  Skipping 'graphics/libkdcraw-kde4' (libkdcraw-4.7.3) because a
 requisite package 'libutempter-1.1.5_1' (sysutils/libutempter) failed
 (specify -k to force)
 
 
 I installed misc/compat8x, however it informed me that I'd need to add
 'options COMPAT_FREEBSD8' to the kernel conf.  When I try to do that, I'm met with this error:
 
 /usr/src/sys/amd64/conf/DESKTOPKERN9: unknown option COMPAT_FREEBSD8
 *** Error code 1
 Stop in /usr/src.
 
 
 Which is weird, because:
 
 [root@desktop src]# uname -r
 9.0-STABLE
 
 
 Meaning I'm certainly running 9.0-STABLE.  So what gives re: that error
 above about unknown option?  I even tried to csup source and
 buildworld again, but to no avail -- the error remains.
 
 
I upgrade my ports with portupgrade.
After removing libutempter I just run `pkgdb -Fu' and then
I can proceed with the update of depending ports.
I don't need compat8x.

Henri


Re: FreeBSD 9 recompile ports

2012-01-14 Thread Henri Hennebert
On 01/14/2012 09:46, Matthew Seaman wrote:
 On 13/01/2012 22:57, Andriy Gapon wrote:
 But if the appropriate misc/compatX port is installed, then
 those libraries do actually exist and the system should be fully 
 usable... Modulo the compat libraries not working with the new 
 kernel as Kostik has pointed out.
 
 As soon as you update or install an application after this point, 
 you are likely to end up with an application that tries to 
 dynamically link two different versions of the same shlib, and
 that is a recipe for tears-before-bedtime.

This /etc/libmap.conf helped me greatly when I reinstalled all my ports
after 9.0-BETA2 and make delete-old-libs:

libsbuf.so.5    libsbuf.so.6
libz.so.5       libz.so.6
libutil.so.8    libutil.so.9
libcam.so.5     libcam.so.6
libpcap.so.7    libpcap.so.8
libufs.so.5     libufs.so.6
libbsnmp.so.5   libbsnmp.so.6
libdwarf.so.2   libdwarf.so.3
libopie.so.6    libopie.so.7
librtld_db.so.1 librtld_db.so.2
libtacplus.so.4 libtacplus.so.5
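
For readers who have not used libmap.conf(5): each line pairs the library
name a binary asks for with the name the runtime linker should load instead.
The sketch below only illustrates that lookup in Python; it is not how rtld
implements it. parse_libmap and resolve are hypothetical helper names, only
three of the entries are reproduced, and the bracketed per-program
constraints that the real file format supports are not modelled.

```python
LIBMAP_TEXT = """\
libsbuf.so.5    libsbuf.so.6
libz.so.5       libz.so.6
libutil.so.8    libutil.so.9
"""

def parse_libmap(text):
    """Return {requested: substitute} from libmap.conf-style text.

    Simplification: only plain two-column lines are kept, so any
    [constraint] headers are ignored rather than honoured.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        fields = line.split()
        if len(fields) == 2:
            mapping[fields[0]] = fields[1]
    return mapping

def resolve(name, mapping):
    """Return the substitute for a requested library, or the name unchanged."""
    return mapping.get(name, name)

mapping = parse_libmap(LIBMAP_TEXT)
print(resolve("libz.so.5", mapping))   # mapped to the newer name
print(resolve("libc.so.7", mapping))   # no entry, so left as requested
```

Note that the lookup is purely by name: nothing checks that the substitute
library is compatible with what the binary was built against.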

Henri
 
 Cheers,
 
 Matthew
 



Re: FreeBSD 9 recompile ports

2012-01-14 Thread Henri Hennebert
On 01/14/2012 11:37, Jeremy Chadwick wrote:
 On Sat, Jan 14, 2012 at 11:29:00AM +0100, Henri Hennebert wrote:
 On 01/14/2012 09:46, Matthew Seaman wrote:
 On 13/01/2012 22:57, Andriy Gapon wrote:
 But if the appropriate misc/compatX port is installed, then
 those libraries do actually exist and the system should be fully 
 usable... Modulo the compat libraries not working with the new 
 kernel as Kostik has pointed out.

 As soon as you update or install an application after this point, 
 you are likely to end up with an application that tries to 
 dynamically link two different versions of the same shlib, and
 that is a recipe for tears-before-bedtime.

 This /etc/libmap.conf helped me greatly when I reinstalled all my ports
 after 9.0-BETA2 and make delete-old-libs:

 libsbuf.so.5    libsbuf.so.6
 libz.so.5       libz.so.6
 libutil.so.8    libutil.so.9
 libcam.so.5     libcam.so.6
 libpcap.so.7    libpcap.so.8
 libufs.so.5     libufs.so.6
 libbsnmp.so.5   libbsnmp.so.6
 libdwarf.so.2   libdwarf.so.3
 libopie.so.6    libopie.so.7
 librtld_db.so.1 librtld_db.so.2
 libtacplus.so.4 libtacplus.so.5
 
 This is very, VERY, ***VERY*** dangerous.  Apparently nobody has
 explained why, so I will:
 
 When a linked library version number (N of libfoo.so.N) increases or
 changes, it indicates there are API/ABI changes to the library.  There
 is absolutely ZERO guarantee that calling semantics are the same, that
 function arguments (thus stack order) are the same, or that structures
 used internally by the library are the same.  The effects of this can be
 devastating -- if you're lucky it'll consist of just missing symbol,
 but it can be a lot worse.  The TL;DR version is: there is absolutely
 ZERO guarantee that the internal operations and calling semantics of the
 libraries are identical.
 
 Folks reading this thread, PLEASE do not follow the above advice and
 leave your system running in that kind of state.  Instead of being lazy,

I don't want to argue too much, but you did not read me correctly.
I only do this while I REINSTALL ALL PORTS, and then I delete
/etc/libmap.conf. Of course, I'm not crazy!

 rebuild all your ports from scratch or pull down new binary copies
 (pkg_add -r ...) for the version of the OS you're running.  Doug and I
 have the same opinion when it comes to this situation, and it's based
 purely on experience.  Schedule downtime, spend an afternoon rebuilding
 things, whatever -- just do it the Right Way(tm) please.  Otherwise
 you're creating a lot of support hassle when it comes to trying to
 diagnose why some program on your system behaves oddly -- weeks go by,
 oh, libmap.conf...
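
To make the ABI warning in this message concrete, here is a small
hypothetical illustration in Python, with ctypes standing in for C struct
layout. BufV5 and BufV6 are invented for the example and do not correspond
to any real FreeBSD library; they only show how a field added between two
library versions moves the offsets that already-compiled callers rely on.

```python
import ctypes

class BufV5(ctypes.Structure):
    """Hypothetical structure as laid out by libfoo.so.5."""
    _fields_ = [("length", ctypes.c_int32),
                ("flags", ctypes.c_int32)]

class BufV6(ctypes.Structure):
    """Hypothetical libfoo.so.6 layout: a field was inserted before flags."""
    _fields_ = [("length", ctypes.c_int32),
                ("capacity", ctypes.c_int32),
                ("flags", ctypes.c_int32)]

# The new library fills in its own layout...
new_obj = BufV6(length=10, capacity=64, flags=1)

# ...but a caller built against version 5 reads the same memory with the
# old layout, so it finds "capacity" where it expects "flags".
old_view = BufV5.from_buffer_copy(bytes(new_obj)[:ctypes.sizeof(BufV5)])

print("flags offset in v5:", BufV5.flags.offset)
print("flags offset in v6:", BufV6.flags.offset)
print("flags as seen by a v5 caller:", old_view.flags)
```

No symbol goes missing in this scenario, so nothing fails at load time; the
caller simply reads the wrong value.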
 



Tripwire segmentation fault - amd64 - 9.0-RC2 in _malloc_postfork

2011-11-10 Thread Henri Hennebert

Hello,

On two systems running 9.0-RC2 amd64, tripwire segfaults.

The problem occurs during `tripwire --check` after +/- 20 minutes of 
execution:


Here is the bt

[root@tignes tripwire]# gdb ./tripwire
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...(no debugging 
symbols found)...

(gdb) run --check
Starting program: 
/usr/ports/security/tripwire/work/tripwire-2.4.1.2-src/src/tripwire/tripwire 
--check
(no debugging symbols found)...(no debugging symbols found)...(no 
debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...(no debugging symbols found)...Parsing policy file: 
/usr/local/etc/tripwire/tw.pol

*** Processing Unix File System ***
Performing integrity check...
The object: /var/spool/httpd/tignes/htdocs/Xfer/Henri_2006_11_24 is on 
a different file system...ignoring.


Program received signal SIGSEGV, Segmentation fault.
0x0008014efb12 in _malloc_postfork () from /lib/libc.so.7
(gdb) bt
#0  0x0008014efb12 in _malloc_postfork () from /lib/libc.so.7
#1  0x0008014ef158 in realloc () from /lib/libc.so.7
#2  0x0008014ef385 in free () from /lib/libc.so.7
#3  0x004c6181 in cFileUtil::IsRegularFile ()
#4  0x00499dbb in WriteObject ()
#5  0x0049c429 in cTWUtil::WriteReport ()
#6  0x0042709a in cTWModeIC::Execute ()
#7  0x0041cf85 in main ()

Under 9.0-RC1 I encountered no problem at all.

Henri

PS - on another system under 9.0-RC2 i386, tripwire runs smoothly


Re: ZFS boot inside on the second partition inside a slice

2011-06-22 Thread Henri Hennebert

On 06/21/2011 23:27, John Baldwin wrote:

On Tuesday, June 21, 2011 4:13:20 pm Henri Hennebert wrote:

On 06/21/2011 21:25, John Baldwin wrote:
and I get:

Read error: 04


Hmm, that is the error for an invalid sector.  Try this patch.  It reshuffles
a few more things and adds code to dump the low 32-bits of the LBA on an
error:

Index: zfsldr.S
===
--- zfsldr.S(revision 223365)
+++ zfsldr.S(working copy)
@@ -16,7 +16,6 @@
   */

  /* Memory Locations */
-   .set MEM_REL,0x700  # Relocation address
.set MEM_ARG,0x900  # Arguments
.set MEM_ORG,0x7c00 # Origin
.set MEM_BUF,0x8000 # Load area
@@ -91,26 +90,18 @@ main:   cld # 
String ops inc
mov %cx,%ss # Set up
mov $start,%sp  #  stack
  /*
- * Relocate ourself to MEM_REL.  Since %cx == 0, the inc %ch sets
- * %cx == 0x100.
- */
-   mov %sp,%si # Source
-   mov $MEM_REL,%di# Destination
-   incb %ch# Word count
-   rep # Copy
-   movsw   #  code
-/*
   * If we are on a hard drive, then load the MBR and look for the first
   * FreeBSD slice.  We use the fake partition entry below that points to
   * the MBR when we call nread.  The first pass looks for the first active
   * FreeBSD slice.  The second pass looks for the first non-active FreeBSD
   * slice if the first one fails.
   */
-   mov $part4,%si  # Partition
+   mov $part4,%si  # Dummy partition
cmpb $0x80,%dl  # Hard drive?
jb main.4   # No
-   movb $0x1,%dh   # Block count
-   callw nread # Read MBR
+   xor %eax,%eax   # Read MBR
+   movw $MEM_BUF,%bx   #  from first
+   callw nread #  sector
mov $0x1,%cx# Two passes
  main.1:   mov $MEM_BUF+PRT_OFF,%si# Partition table
movb $0x1,%dh   # Partition
@@ -161,10 +152,16 @@ main.4:   xor %dx,%dx # 
Partition:drive
   * area and target area do not overlap.
   */
  main.5:   mov %dx,MEM_ARG # Save args
-   movb $NSECT,%dh # Sector count
+   mov $NSECT,%cx  # Sector count
movl $1024,%eax # Offset to boot2
-   callw nread.1   # Read disk
-main.6:mov $MEM_BUF,%si# BTX (before reloc)
+   mov $MEM_BUF,%bx# Destination buffer
+main.6:pushal  # Save params
+   callw nread # Read disk
+   popal   # Restore
+   incl %eax   # Update for
+   add $SIZ_SEC,%bx#  next sector
+   loop main.6 # If not last, read another
+   mov $MEM_BUF,%si# BTX (before reloc)
mov 0xa(%si),%bx# Get BTX length and set
mov $NSECT*SIZ_SEC-1,%di# Size of load area (less one)
mov %di,%si # End of load
@@ -214,29 +211,35 @@ seta20.3: sti # Enable 
interrupts
   * packet on the stack and passes it to read.
   *
   * %eax   - int - LBA to read in relative to partition start
+ * %es:%bx - ptr - destination address
   * %dl- byte- drive to read from
- * %dh - byte- num sectors to read
   * %si- ptr - MBR partition entry
   */
-nread: xor %eax,%eax   # Sector offset in partition
-nread.1:   xor %ecx,%ecx   # Get
+nread: xor %ecx,%ecx   # Get
addl 0x8(%si),%eax  #  LBA
adc $0,%ecx
pushl %ecx  # Starting absolute block
pushl %eax  #  block number
push %es# Address of
-   push $MEM_BUF   #  transfer buffer
-   xor %ax,%ax # Number of
-   movb %dh,%al#  blocks to
-   push %ax#  transfer
+   push %bx#  transfer buffer
+   push $0x1   # Read 1 sector
push
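
As a reading aid for the new main.6 loop in the patch above: instead of
asking nread for NSECT sectors in one call, the loop issues one
single-sector read per iteration, bumping the LBA in %eax by one and the
destination in %bx by SIZ_SEC each time. A rough Python model of that
schedule follows. read_plan is a hypothetical name, and the 512-byte sector
size and the count of 4 sectors are assumptions for illustration, since the
quoted hunks show neither SIZ_SEC nor NSECT. The 16-bit mask reflects that
%bx is a 16-bit register.

```python
SIZ_SEC = 512      # assumed sector size; the patch only refers to it by name
MEM_BUF = 0x8000   # load area, as set in the patch

def read_plan(start_lba, nsect, buf=MEM_BUF):
    """List the (relative LBA, destination offset) pairs the loop would use."""
    plan = []
    lba, dest = start_lba, buf
    for _ in range(nsect):                  # loop main.6 counts %cx down
        plan.append((lba, dest))            # callw nread: one sector
        lba += 1                            # incl %eax
        dest = (dest + SIZ_SEC) & 0xFFFF    # add $SIZ_SEC,%bx (16-bit)
    return plan

# 1024 is the "Offset to boot2" loaded into %eax in the patch.
for lba, dest in read_plan(1024, 4):
    print(f"read LBA {lba} -> 0x{dest:04x}")
```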

Re: ZFS boot inside on the second partition inside a slice

2011-06-22 Thread Henri Hennebert

On 06/22/2011 16:19, Henri Hennebert wrote:

On 06/22/2011 15:57, John Baldwin wrote:

On Wednesday, June 22, 2011 7:34:05 am Henri Hennebert wrote:

I get

LBA: 8200
Read error: 04


Odd. Oh, I fubar'd and read the wrong thing for the sector. Also, we
should leave the EDD packet on the stack so it doesn't get trashed by
calling hex8, etc. Please try this:

Index: zfsldr.S
===
--- zfsldr.S (revision 223365)
+++ zfsldr.S (working copy)
@@ -16,7 +16,6 @@
*/

/* Memory Locations */
- .set MEM_REL,0x700 # Relocation address
.set MEM_ARG,0x900 # Arguments
.set MEM_ORG,0x7c00 # Origin
.set MEM_BUF,0x8000 # Load area
@@ -91,26 +90,18 @@ main: cld # String ops inc
mov %cx,%ss # Set up
mov $start,%sp # stack
/*
- * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets
- * %cx == 0x100.
- */
- mov %sp,%si # Source
- mov $MEM_REL,%di # Destination
- incb %ch # Word count
- rep # Copy
- movsw # code
-/*
* If we are on a hard drive, then load the MBR and look for the first
* FreeBSD slice. We use the fake partition entry below that points to
* the MBR when we call nread. The first pass looks for the first active
* FreeBSD slice. The second pass looks for the first non-active FreeBSD
* slice if the first one fails.
*/
- mov $part4,%si # Partition
+ mov $part4,%si # Dummy partition
cmpb $0x80,%dl # Hard drive?
jb main.4 # No
- movb $0x1,%dh # Block count
- callw nread # Read MBR
+ xor %eax,%eax # Read MBR
+ movw $MEM_BUF,%bx # from first
+ callw nread # sector
mov $0x1,%cx # Two passes
main.1: mov $MEM_BUF+PRT_OFF,%si # Partition table
movb $0x1,%dh # Partition
@@ -161,10 +152,16 @@ main.4: xor %dx,%dx # Partition:drive
* area and target area do not overlap.
*/
main.5: mov %dx,MEM_ARG # Save args
- movb $NSECT,%dh # Sector count
+ mov $NSECT,%cx # Sector count
movl $1024,%eax # Offset to boot2
- callw nread.1 # Read disk
-main.6: mov $MEM_BUF,%si # BTX (before reloc)
+ mov $MEM_BUF,%bx # Destination buffer
+main.6: pushal # Save params
+ callw nread # Read disk
+ popal # Restore
+ incl %eax # Update for
+ add $SIZ_SEC,%bx # next sector
+ loop main.6 # If not last, read another
+ mov $MEM_BUF,%si # BTX (before reloc)
mov 0xa(%si),%bx # Get BTX length and set
mov $NSECT*SIZ_SEC-1,%di # Size of load area (less one)
mov %di,%si # End of load
@@ -214,29 +211,35 @@ seta20.3: sti # Enable interrupts
* packet on the stack and passes it to read.
*
* %eax - int - LBA to read in relative to partition start
+ * %es:%bx - ptr - destination address
* %dl - byte - drive to read from
- * %dh - byte - num sectors to read
* %si - ptr - MBR partition entry
*/
-nread: xor %eax,%eax # Sector offset in partition
-nread.1: xor %ecx,%ecx # Get
+nread: xor %ecx,%ecx # Get
addl 0x8(%si),%eax # LBA
adc $0,%ecx
pushl %ecx # Starting absolute block
pushl %eax # block number
push %es # Address of
- push $MEM_BUF # transfer buffer
- xor %ax,%ax # Number of
- movb %dh,%al # blocks to
- push %ax # transfer
+ push %bx # transfer buffer
+ push $0x1 # Read 1 sector
push $0x10 # Size of packet
mov %sp,%bp # Packet pointer
callw read # Read from disk
+ jc nread.1 # If error, fail
lea 0x10(%bp),%sp # Clear stack
- jnc return # If success, return
- mov $msg_read,%si # Otherwise, set the error
- # message and fall through to
- # the error routine
+ ret # If success, return
+nread.1: mov %ah,%al # Format
+ mov $read_err,%di # error
+ call hex8 # code
+ movl 0x8(%bp),%eax # Format
+ mov $lba,%di # LBA
+ call hex32
+ mov $msg_lba,%si # Display
+ call putstr # LBA
+ mov $msg_read,%si # Set the error message and
+ # fall through to the error
+ # routine
/*
* Print out the error message pointed to by %ds:(%si) followed
* by a prompt, wait for a keypress, and then reboot the machine.
@@ -259,14 +262,6 @@ putstr: lodsb # Get char
jne putstr.0 # No

/*
- * Overused return code. ereturn is used to return an error from the
- * read function. Since we assume putstr succeeds, we (ab)use the
- * same code when we return from putstr.
- */
-ereturn: movb $0x1,%ah # Invalid
- stc # argument
-return: retw # To caller
-/*
* Reads sectors from the disk. If EDD is enabled, then check if it is
* installed and use it if it is. If it is not installed or not
enabled, then
* fall back to using CHS. Since we use a LBA, if we are using CHS, we
have to
@@ -294,14 +289,38 @@ read: cmpb $0x80,%dl # Hard drive?
retw # To caller
read.1: mov $msg_chs,%si
jmp error
-msg_chs: .asciz CHS not supported

+/*
+ * Convert EAX, AX, or AL to hex, saving the result to [EDI].
+ */
+hex32: pushl %eax # Save
+ shrl $0x10,%eax # Do upper
+ call hex16 # 16
+ popl %eax # Restore
+hex16: call hex16.1 # Do upper 8
+hex16.1: xchgb %ah,%al # Save/restore
+hex8: push %ax # Save
+ shrb $0x4,%al # Do upper
+ call hex8.1 # 4
+ pop %ax # Restore
+hex8.1: andb $0xf,%al # Get lower 4
+ cmpb $0xa,%al # Convert
+ sbbb $0x69,%al # to hex
+ das # digit
+ orb $0x20,%al # To lower case
+ stosb # Save char
+ ret # (Recursive)
+
/* Messages
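
The cmpb/sbbb/das sequence in hex8.1 above is a compact way to turn a
nibble into an ASCII hex digit without a branch or a lookup table. The
Python below is only a model written to check that reading of the code. The
flag handling follows my understanding of the x86 SBB and DAS definitions
and models only what this sequence needs; the function names simply mirror
the labels in the patch.

```python
def sbb8(al, src, cf):
    """8-bit subtract with borrow: returns (result, carry flag, aux flag)."""
    result = (al - src - cf) & 0xFF
    carry = int(al < src + cf)                  # borrow out of bit 7
    aux = int((al & 0xF) < (src & 0xF) + cf)    # borrow out of bit 3
    return result, carry, aux

def das(al, cf, af):
    """Decimal adjust AL after subtraction; only the AL result is modelled."""
    old_al = al
    if (al & 0xF) > 9 or af:
        al = (al - 6) & 0xFF
    if old_al > 0x99 or cf:
        al = (al - 0x60) & 0xFF
    return al

def hex_digit(nibble):
    al = nibble & 0xF                  # andb $0xf,%al
    cf = int(al < 0xA)                 # cmpb $0xa,%al sets CF when al < 10
    al, cf, af = sbb8(al, 0x69, cf)    # sbbb $0x69,%al
    al = das(al, cf, af)               # das
    return chr(al | 0x20)              # orb $0x20,%al (lower case)

def hex8(byte):
    """Upper nibble first, then lower, as hex8 falls through into hex8.1."""
    return hex_digit(byte >> 4) + hex_digit(byte)

def hex32(value):
    """Most significant byte first, matching the hex32 -> hex16 -> hex8 chain."""
    return "".join(hex8((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print("".join(hex_digit(n) for n in range(16)))
print(hex8(0x04), hex32(0x8200))
```

For 0 through 9 the compare sets the carry, so the subtraction removes 0x6a
and both DAS corrections apply. For 10 through 15 only the 0x60 correction
applies, which is what separates the digit and letter ranges.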

Re: ZFS boot inside on the second partition inside a slice

2011-06-22 Thread Henri Hennebert

On 06/22/2011 15:57, John Baldwin wrote:

On Wednesday, June 22, 2011 7:34:05 am Henri Hennebert wrote:

I get

LBA: 8200
Read error: 04


Odd.  Oh, I fubar'd and read the wrong thing for the sector.  Also, we
should leave the EDD packet on the stack so it doesn't get trashed by
calling hex8, etc.  Please try this:

Index: zfsldr.S
===
--- zfsldr.S(revision 223365)
+++ zfsldr.S(working copy)
@@ -16,7 +16,6 @@
   */

  /* Memory Locations */
-   .set MEM_REL,0x700  # Relocation address
.set MEM_ARG,0x900  # Arguments
.set MEM_ORG,0x7c00 # Origin
.set MEM_BUF,0x8000 # Load area
@@ -91,26 +90,18 @@ main:   cld # 
String ops inc
mov %cx,%ss # Set up
mov $start,%sp  #  stack
  /*
- * Relocate ourself to MEM_REL.  Since %cx == 0, the inc %ch sets
- * %cx == 0x100.
- */
-   mov %sp,%si # Source
-   mov $MEM_REL,%di# Destination
-   incb %ch# Word count
-   rep # Copy
-   movsw   #  code
-/*
   * If we are on a hard drive, then load the MBR and look for the first
   * FreeBSD slice.  We use the fake partition entry below that points to
   * the MBR when we call nread.  The first pass looks for the first active
   * FreeBSD slice.  The second pass looks for the first non-active FreeBSD
   * slice if the first one fails.
   */
-   mov $part4,%si  # Partition
+   mov $part4,%si  # Dummy partition
cmpb $0x80,%dl  # Hard drive?
jb main.4   # No
-   movb $0x1,%dh   # Block count
-   callw nread # Read MBR
+   xor %eax,%eax   # Read MBR
+   movw $MEM_BUF,%bx   #  from first
+   callw nread #  sector
mov $0x1,%cx# Two passes
  main.1:   mov $MEM_BUF+PRT_OFF,%si# Partition table
movb $0x1,%dh   # Partition
@@ -161,10 +152,16 @@ main.4:   xor %dx,%dx # 
Partition:drive
   * area and target area do not overlap.
   */
  main.5:   mov %dx,MEM_ARG # Save args
-   movb $NSECT,%dh # Sector count
+   mov $NSECT,%cx  # Sector count
movl $1024,%eax # Offset to boot2
-   callw nread.1   # Read disk
-main.6:mov $MEM_BUF,%si# BTX (before reloc)
+   mov $MEM_BUF,%bx# Destination buffer
+main.6:pushal  # Save params
+   callw nread # Read disk
+   popal   # Restore
+   incl %eax   # Update for
+   add $SIZ_SEC,%bx#  next sector
+   loop main.6 # If not last, read another
+   mov $MEM_BUF,%si# BTX (before reloc)
mov 0xa(%si),%bx# Get BTX length and set
mov $NSECT*SIZ_SEC-1,%di# Size of load area (less one)
mov %di,%si # End of load
@@ -214,29 +211,35 @@ seta20.3: sti # Enable 
interrupts
   * packet on the stack and passes it to read.
   *
   * %eax   - int - LBA to read in relative to partition start
+ * %es:%bx - ptr - destination address
   * %dl- byte- drive to read from
- * %dh - byte- num sectors to read
   * %si- ptr - MBR partition entry
   */
-nread: xor %eax,%eax   # Sector offset in partition
-nread.1:   xor %ecx,%ecx   # Get
+nread: xor %ecx,%ecx   # Get
addl 0x8(%si),%eax  #  LBA
adc $0,%ecx
pushl %ecx  # Starting absolute block
pushl %eax  #  block number
push %es# Address of
-   push $MEM_BUF   #  transfer buffer
-   xor %ax,%ax # Number of
-   movb %dh,%al#  blocks to
-   push %ax#  transfer
+   push %bx#  transfer buffer
+   push $0x1   # Read 1 sector
push $0x10
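
The pushes at the end of nread build, on the stack, the 16-byte disk
address packet that the BIOS extended read call consumes (as far as I can
tell this is INT 13h with AH=42h; the patch itself only calls it the "EDD
packet"). The Python below is a sketch of that layout, not the loader's
code. build_edd_packet, partition_start and the sample partition entry are
hypothetical, and the field order is taken from the order of the pushes,
last pushed being the lowest address.

```python
import struct

def partition_start(entry):
    """Starting LBA of a 16-byte MBR partition entry (32-bit LE at offset 8)."""
    return struct.unpack_from("<I", entry, 8)[0]

def build_edd_packet(entry, rel_lba, segment, offset):
    """Pack the packet for a one-sector read at rel_lba within the partition."""
    # addl 0x8(%si),%eax ; adc $0,%ecx : 64-bit absolute LBA
    lba = partition_start(entry) + rel_lba
    return struct.pack(
        "<BBHHHQ",
        0x10,      # push $0x10 : packet size (low byte)
        0,         #              reserved (high byte of the same word)
        1,         # push $0x1  : number of sectors
        offset,    # push %bx   : buffer offset
        segment,   # push %es   : buffer segment
        lba,       # pushl %eax / pushl %ecx : LBA low and high halves
    )

# Hypothetical active FreeBSD (type 0xa5) slice starting at LBA 63.
entry = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0xA5, b"\0\0\0", 63, 1000000)
packet = build_edd_packet(entry, 1024, 0x0000, 0x8000)
print(len(packet), packet.hex())
```

Because the sector count is now fixed at 1, the caller has to advance the
LBA and the buffer itself between calls, which is what the main.6 loop in
the same patch does.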

Re: ZFS boot inside on the second partition inside a slice

2011-06-22 Thread Henri Hennebert

On 06/22/2011 16:23, Henri Hennebert wrote:

On 06/22/2011 16:19, Henri Hennebert wrote:

On 06/22/2011 15:57, John Baldwin wrote:

On Wednesday, June 22, 2011 7:34:05 am Henri Hennebert wrote:

I get

LBA: 8200
Read error: 04


Odd. Oh, I fubar'd and read the wrong thing for the sector. Also, we
should leave the EDD packet on the stack so it doesn't get trashed by
calling hex8, etc. Please try this:

Index: zfsldr.S
===
--- zfsldr.S (revision 223365)
+++ zfsldr.S (working copy)
@@ -16,7 +16,6 @@
*/

/* Memory Locations */
- .set MEM_REL,0x700 # Relocation address
.set MEM_ARG,0x900 # Arguments
.set MEM_ORG,0x7c00 # Origin
.set MEM_BUF,0x8000 # Load area
@@ -91,26 +90,18 @@ main: cld # String ops inc
mov %cx,%ss # Set up
mov $start,%sp # stack
/*
- * Relocate ourself to MEM_REL. Since %cx == 0, the inc %ch sets
- * %cx == 0x100.
- */
- mov %sp,%si # Source
- mov $MEM_REL,%di # Destination
- incb %ch # Word count
- rep # Copy
- movsw # code
-/*
* If we are on a hard drive, then load the MBR and look for the first
* FreeBSD slice. We use the fake partition entry below that points to
* the MBR when we call nread. The first pass looks for the first active
* FreeBSD slice. The second pass looks for the first non-active FreeBSD
* slice if the first one fails.
*/
- mov $part4,%si # Partition
+ mov $part4,%si # Dummy partition
cmpb $0x80,%dl # Hard drive?
jb main.4 # No
- movb $0x1,%dh # Block count
- callw nread # Read MBR
+ xor %eax,%eax # Read MBR
+ movw $MEM_BUF,%bx # from first
+ callw nread # sector
mov $0x1,%cx # Two passes
main.1: mov $MEM_BUF+PRT_OFF,%si # Partition table
movb $0x1,%dh # Partition
@@ -161,10 +152,16 @@ main.4: xor %dx,%dx # Partition:drive
* area and target area do not overlap.
*/
main.5: mov %dx,MEM_ARG # Save args
- movb $NSECT,%dh # Sector count
+ mov $NSECT,%cx # Sector count
movl $1024,%eax # Offset to boot2
- callw nread.1 # Read disk
-main.6: mov $MEM_BUF,%si # BTX (before reloc)
+ mov $MEM_BUF,%bx # Destination buffer
+main.6: pushal # Save params
+ callw nread # Read disk
+ popal # Restore
+ incl %eax # Update for
+ add $SIZ_SEC,%bx # next sector
+ loop main.6 # If not last, read another
+ mov $MEM_BUF,%si # BTX (before reloc)
mov 0xa(%si),%bx # Get BTX length and set
mov $NSECT*SIZ_SEC-1,%di # Size of load area (less one)
mov %di,%si # End of load
@@ -214,29 +211,35 @@ seta20.3: sti # Enable interrupts
* packet on the stack and passes it to read.
*
* %eax - int - LBA to read in relative to partition start
+ * %es:%bx - ptr - destination address
* %dl - byte - drive to read from
- * %dh - byte - num sectors to read
* %si - ptr - MBR partition entry
*/
-nread: xor %eax,%eax # Sector offset in partition
-nread.1: xor %ecx,%ecx # Get
+nread: xor %ecx,%ecx # Get
addl 0x8(%si),%eax # LBA
adc $0,%ecx
pushl %ecx # Starting absolute block
pushl %eax # block number
push %es # Address of
- push $MEM_BUF # transfer buffer
- xor %ax,%ax # Number of
- movb %dh,%al # blocks to
- push %ax # transfer
+ push %bx # transfer buffer
+ push $0x1 # Read 1 sector
push $0x10 # Size of packet
mov %sp,%bp # Packet pointer
callw read # Read from disk
+ jc nread.1 # If error, fail
lea 0x10(%bp),%sp # Clear stack
- jnc return # If success, return
- mov $msg_read,%si # Otherwise, set the error
- # message and fall through to
- # the error routine
+ ret # If success, return
+nread.1: mov %ah,%al # Format
+ mov $read_err,%di # error
+ call hex8 # code
+ movl 0x8(%bp),%eax # Format
+ mov $lba,%di # LBA
+ call hex32
+ mov $msg_lba,%si # Display
+ call putstr # LBA
+ mov $msg_read,%si # Set the error message and
+ # fall through to the error
+ # routine
/*
* Print out the error message pointed to by %ds:(%si) followed
* by a prompt, wait for a keypress, and then reboot the machine.
@@ -259,14 +262,6 @@ putstr: lodsb # Get char
jne putstr.0 # No

/*
- * Overused return code. ereturn is used to return an error from the
- * read function. Since we assume putstr succeeds, we (ab)use the
- * same code when we return from putstr.
- */
-ereturn: movb $0x1,%ah # Invalid
- stc # argument
-return: retw # To caller
-/*
* Reads sectors from the disk. If EDD is enabled, then check if it is
* installed and use it if it is. If it is not installed or not
enabled, then
* fall back to using CHS. Since we use a LBA, if we are using CHS, we
have to
@@ -294,14 +289,38 @@ read: cmpb $0x80,%dl # Hard drive?
retw # To caller
read.1: mov $msg_chs,%si
jmp error
-msg_chs: .asciz CHS not supported

+/*
+ * Convert EAX, AX, or AL to hex, saving the result to [EDI].
+ */
+hex32: pushl %eax # Save
+ shrl $0x10,%eax # Do upper
+ call hex16 # 16
+ popl %eax # Restore
+hex16: call hex16.1 # Do upper 8
+hex16.1: xchgb %ah,%al # Save/restore
+hex8: push %ax # Save
+ shrb $0x4,%al # Do upper
+ call hex8.1 # 4
+ pop %ax # Restore
+hex8.1: andb $0xf,%al # Get lower 4
+ cmpb $0xa,%al # Convert
+ sbbb $0x69,%al # to hex
+ das # digit
+ orb $0x20,%al # To lower case
+ stosb # Save char

Re: ZFS boot inside on the second partition inside a slice

2011-06-22 Thread Henri Hennebert

On 06/22/2011 17:58, John Baldwin wrote:

Index: zfsldr.S
===
--- zfsldr.S(revision 223365)
+++ zfsldr.S(working copy)
@@ -16,7 +16,6 @@
   */

  /* Memory Locations */
-   .set MEM_REL,0x700  # Relocation address
.set MEM_ARG,0x900  # Arguments
.set MEM_ORG,0x7c00 # Origin
.set MEM_BUF,0x8000 # Load area
@@ -91,26 +90,18 @@ main:   cld # 
String ops inc
mov %cx,%ss # Set up
mov $start,%sp  #  stack
  /*
- * Relocate ourself to MEM_REL.  Since %cx == 0, the inc %ch sets
- * %cx == 0x100.
- */
-   mov %sp,%si # Source
-   mov $MEM_REL,%di# Destination
-   incb %ch# Word count
-   rep # Copy
-   movsw   #  code
-/*
   * If we are on a hard drive, then load the MBR and look for the first
   * FreeBSD slice.  We use the fake partition entry below that points to
   * the MBR when we call nread.  The first pass looks for the first active
   * FreeBSD slice.  The second pass looks for the first non-active FreeBSD
   * slice if the first one fails.
   */
-   mov $part4,%si  # Partition
+   mov $part4,%si  # Dummy partition
cmpb $0x80,%dl  # Hard drive?
jb main.4   # No
-   movb $0x1,%dh   # Block count
-   callw nread # Read MBR
+   xor %eax,%eax   # Read MBR
+   movw $MEM_BUF,%bx   #  from first
+   callw nread #  sector
mov $0x1,%cx    # Two passes
  main.1:   mov $MEM_BUF+PRT_OFF,%si    # Partition table
movb $0x1,%dh   # Partition
@@ -143,32 +134,35 @@ main.4:   xor %dx,%dx # Partition:drive
   * (i.e. after the two vdev labels).  We don't have do anything fancy
   * here to allow for an extra copy of boot1 and a partition table
   * (compare to this section of the UFS bootstrap) so we just load it
- * all at 0x8000. The first part of boot2 is BTX, which wants to run
+ * all at 0x9000. The first part of boot2 is BTX, which wants to run
   * at 0x9000. The boot2.bin binary starts right after the end of BTX,
   * so we have to figure out where the start of it is and then move the
- * binary to 0xc000. After we have moved the client, we relocate BTX
- * itself to 0x9000 - doing it in this order means that none of the
- * memcpy regions overlap which would corrupt the copy.  Normally, BTX
- * clients start at MEM_USR, or 0xa000, but when we use btxld to
- * create zfsboot2, we use an entry point of 0x2000.  That entry point is
- * relative to MEM_USR; thus boot2.bin starts at 0xc000.
+ * binary to 0xc000.  Normally, BTX clients start at MEM_USR, or 0xa000,
+ * but when we use btxld to create zfsboot2, we use an entry point of
+ * 0x2000.  That entry point is relative to MEM_USR; thus boot2.bin
+ * starts at 0xc000.
   *
   * The load area and the target area for the client overlap so we have
   * to use a decrementing string move. We also play segment register
   * games with the destination address for the move so that the client
   * can be larger than 16k (which would overflow the zero segment since
- * the client starts at 0xc000). Relocating BTX is easy since the load
- * area and target area do not overlap.
+ * the client starts at 0xc000).
   */
  main.5:   mov %dx,MEM_ARG # Save args
-   movb $NSECT,%dh # Sector count
+   mov $NSECT,%cx  # Sector count
movl $1024,%eax # Offset to boot2
-   callw nread.1   # Read disk
-main.6:    mov $MEM_BUF,%si    # BTX (before reloc)
+   mov $MEM_BTX,%bx    # Destination buffer
+main.6:    pushal  # Save params
+   callw nread # Read disk
+   popal   # Restore
+   incl %eax   # Update for
+   add $SIZ_SEC,%bx    #  next sector
+   loop main.6 # If not last, read another
+   mov $MEM_BTX,%si    # BTX
mov 0xa(%si),%bx    # Get BTX length and set
mov $NSECT*SIZ_SEC-1,%di    # Size of load area (less one)
mov %di,%si # End of load
-   add $MEM_BUF,%si    #  area
+   
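
The comment in the patch above says the load area and the client's target area overlap, so a decrementing string move is used. A minimal C sketch of why the direction matters, assuming a destination that lies above the source; the function name copy_backward is hypothetical and this is not the zfsldr code itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy len bytes from src to dst, starting with the LAST byte and
 * working down.  When dst is above src and the two regions overlap,
 * this order never overwrites a source byte before it has been read.
 * (Hypothetical helper for illustration only.)
 */
static void copy_backward(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    while (len > 0) {
        len--;
        dst[len] = src[len];
    }
}
```

A forward copy over the same regions would clobber the tail of the source before reaching it, which is the corruption the comment is guarding against.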

Re: ZFS boot inside on the second partition inside a slice

2011-06-21 Thread Henri Hennebert

On 06/20/2011 15:51, John Baldwin wrote:

On Saturday, June 18, 2011 5:04:07 am Henri Hennebert wrote:

On 06/17/2011 19:37, John Baldwin wrote:

On Friday, June 17, 2011 1:06:22 pm Henri Hennebert wrote:

On 06/16/2011 19:35, John Baldwin wrote:

On Thursday, June 16, 2011 8:45:41 am Zhihao Yuan wrote:

Exactly. The MFCed ZFSv28 is different from any patch maintained by
mm@. Maybe some untested changes involved.


Can you try reverting this change:

Author: jhb
Date: Thu Apr 28 17:44:24 2011
New Revision: 221177
URL: http://svn.freebsd.org/changeset/base/221177

Log:
Due to space constraints, the UFS boot2 and boot1 use an evil hack where
boot2 calls back into boot1 to perform disk reads.  The ZFS MBR boot blocks
do not have the same space constraints, so remove this hack for ZFS.
While here, remove commented out code to support C/H/S addressing from
zfsldr.  The ZFS and GPT bootstraps always just use EDD LBA addressing.

MFC after:2 weeks

Modified:
head/sys/boot/i386/boot2/Makefile
head/sys/boot/i386/common/drv.c
head/sys/boot/i386/zfsboot/Makefile
head/sys/boot/i386/zfsboot/zfsldr.S


I tried with this revision (221177) reverted, to no avail:
same error - 'read error'


Hmm, ok.  No other ideas off the top of my head.


I ran the same test under VirtualBox and got:

A critical error has occurred while running the virtual machine and the
machine execution has been stopped.

I attach VBox.log.

PS - the message 'ZFS: supported version 28' comes from my patch:

Index: sys/boot/zfs/zfsimpl.c
===
--- sys/boot/zfs/zfsimpl.c  (revision 212549)
+++ sys/boot/zfs/zfsimpl.c  (working copy)
@@ -61,6 +61,8 @@
STAILQ_INIT(zfs_vdevs);
STAILQ_INIT(zfs_pools);

+   printf("ZFS: supported version %u\n", (unsigned) SPA_VERSION);
+
zfs_temp_buf = malloc(TEMP_SIZE);
zfs_temp_end = zfs_temp_buf + TEMP_SIZE;
zfs_temp_ptr = zfs_temp_buf;


Hmm, can you add printfs and narrow down where the hang happens (or which
reads are failing)?  The VBOX log seems to make no sense.  It shows the
CPU trying to call into the BIOS from within protected mode in the loader
but that shouldn't ever happen (note a cs of 0x2b (which is the loader's
%cs selector) but an eip that looks like a cs:ip of a BIOS routine).

I just tried adding printfs, but I get only 'Read error' without any of my
printf output.


Previously, even my printf in zfs_init didn't show up on the console of
my netbook. Under VBox it was printed.


Maybe printf is not allowed so early in zfsboot?

For the record, I wrote the bootcode with these 2 commands after booting
with mfsbsd (from mm@) and fetching zfsboot into /tmp:


dd if=/tmp/zfsboot of=/dev/ad0s2a bs=512 count=1
dd if=/tmp/zfsboot of=/dev/ad0s2a bs=512 skip=1 seek=1024
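
The seek=1024 in the second dd lines up with the "movl $1024,%eax # Offset to boot2" in zfsldr.S, which the source comment describes as "after the two vdev labels". A small sketch of that arithmetic, assuming the usual 256 KiB ZFS vdev label size and 512-byte sectors; the macro and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE      512u            /* bs=512 in the dd commands */
#define VDEV_LABEL_SIZE  (256u * 1024u)  /* assumed size of one vdev label */
#define FRONT_LABELS     2u              /* labels kept at the start of the vdev */

/*
 * First sector after the front vdev labels, i.e. where the rest of
 * zfsboot is written (seek=) and later read back (offset to boot2).
 */
static uint32_t boot2_start_sector(uint32_t sector_size)
{
    assert(sector_size != 0 &&
           (FRONT_LABELS * VDEV_LABEL_SIZE) % sector_size == 0);
    return (FRONT_LABELS * VDEV_LABEL_SIZE) / sector_size;
}
```

With 512-byte sectors this gives sector 1024, so the dd offset and the loader's read offset have to stay in step.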


My debugging patch in zfsboot.c:

[root@morzine zfsboot]# svn diff zfsboot.c
Index: zfsboot.c
===
--- zfsboot.c   (revision 223081)
+++ zfsboot.c   (working copy)
@@ -447,10 +447,16 @@
 off_t off;
 struct dsk *dsk;

+   printf("==trying to boot\n");
+
 dmadat = (void *)(roundup2(__base + (int32_t)_end, 0x1) - __base);

+   printf("==about to call bios_getmem()\n");
+
 bios_getmem();

+   printf("==bios_getmem() completed\n");
+
 if (high_heap_size > 0) {
heap_end = PTOV(high_heap_base + high_heap_size);
heap_next = PTOV(high_heap_base);
@@ -482,6 +488,8 @@

 autoboot = 1;

+   printf("==about to call zfs_init()\n");
+   
 zfs_init();

 /*


Henri
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: ZFS boot inside on the second partition inside a slice

2011-06-21 Thread Henri Hennebert

On 06/21/2011 15:01, John Baldwin wrote:

Index: zfsldr.S
===
--- zfsldr.S(revision 223339)
+++ zfsldr.S(working copy)
@@ -234,9 +234,12 @@ nread.1:   xor %ecx,%ecx   # Get
callw read  # Read from disk
lea 0x10(%bp),%sp   # Clear stack
jnc return  # If success, return
-   mov $msg_read,%si   # Otherwise, set the error
-   #  message and fall through to
-   #  the error routine
+   mov %ah,%al # Format
+   mov $read_err,%di   #  error
+   call hex8   #  code
+   mov $msg_read,%si   # Set the error message and
+   #  fall through to the error
+   #  routine
  /*
   * Print out the error message pointed to by %ds:(%si) followed
   * by a prompt, wait for a keypress, and then reboot the machine.
@@ -296,12 +299,28 @@ read.1:   mov $msg_chs,%si
jmp error
  msg_chs:  .asciz "CHS not supported"

+/*
+ * Convert AL to hex, saving the result to [EDI].
+ */
+hex8:  push %ax    # Save
+   shrb $0x4,%al   # Do upper
+   call hex8.1 #  4
+   pop %ax # Restore
+hex8.1:    andb $0xf,%al   # Get lower 4
+   cmpb $0xa,%al   # Convert
+   sbbb $0x69,%al  #  to hex
+   das #  digit
+   orb $0x20,%al   # To lower case
+   stosb   # Save char
+   ret # (Recursive)
+
  /* Messages */

-msg_read:  .asciz "Read"
-msg_part:  .asciz "Boot"
+msg_read:  .ascii "Read error: "
+read_err:  .asciz "XX"
+msg_part:  .asciz "Boot error"

-prompt:    .asciz " error\r\n"
+prompt:    .asciz "\r\n"

.org PRT_OFF,0x90


I get

Read error: 01
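
For readers less used to 16-bit assembly, the effect of the hex8 routine in the patch can be sketched in C. This is only an illustration of the result (two lowercase hex digits written over the "XX" placeholder), not a translation of the cmp/sbb/das trick; the helper name nibble_to_hex is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Map the low 4 bits of v to a lowercase hex digit ('0'-'9', 'a'-'f'). */
static char nibble_to_hex(uint8_t v)
{
    v &= 0x0f;
    return (char)(v < 10 ? '0' + v : 'a' + (v - 10));
}

/*
 * Write the two hex digits of code at out[0] and out[1], upper nibble
 * first, which is the order hex8 stores them with stosb.  Returns a
 * pointer just past the digits, like %di after the call.  No NUL is
 * written here; in the patch the "XX" placeholder already ends in one.
 */
static char *hex8(uint8_t code, char *out)
{
    assert(out != 0);
    *out++ = nibble_to_hex((uint8_t)(code >> 4));
    *out++ = nibble_to_hex(code);
    return out;
}
```

Called with the status byte 0x01 and a buffer holding "XX", this leaves "01" in the buffer, matching the message reported above.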

Henri



Re: ZFS boot inside on the second partition inside a slice

2011-06-21 Thread Henri Hennebert

On 06/21/2011 17:55, John Baldwin wrote:

On Tuesday, June 21, 2011 10:50:14 am Henri Hennebert wrote:

On 06/21/2011 15:01, John Baldwin wrote:

Index: zfsldr.S
===
--- zfsldr.S(revision 223339)
+++ zfsldr.S(working copy)
@@ -234,9 +234,12 @@ nread.1:   xor %ecx,%ecx   # Get
callw read  # Read from disk
lea 0x10(%bp),%sp   # Clear stack
jnc return  # If success, return
-   mov $msg_read,%si   # Otherwise, set the error
-   #  message and fall through to
-   #  the error routine
+   mov %ah,%al # Format
+   mov $read_err,%di   #  error
+   call hex8   #  code
+   mov $msg_read,%si   # Set the error message and
+   #  fall through to the error
+   #  routine
   /*
* Print out the error message pointed to by %ds:(%si) followed
* by a prompt, wait for a keypress, and then reboot the machine.
@@ -296,12 +299,28 @@ read.1:   mov $msg_chs,%si
jmp error
   msg_chs: .asciz "CHS not supported"

+/*
+ * Convert AL to hex, saving the result to [EDI].
+ */
+hex8:  push %ax    # Save
+   shrb $0x4,%al   # Do upper
+   call hex8.1 #  4
+   pop %ax # Restore
+hex8.1:    andb $0xf,%al   # Get lower 4
+   cmpb $0xa,%al   # Convert
+   sbbb $0x69,%al  #  to hex
+   das #  digit
+   orb $0x20,%al   # To lower case
+   stosb   # Save char
+   ret # (Recursive)
+
   /* Messages */

-msg_read:  .asciz "Read"
-msg_part:  .asciz "Boot"
+msg_read:  .ascii "Read error: "
+read_err:  .asciz "XX"
+msg_part:  .asciz "Boot error"

-prompt:    .asciz " error\r\n"
+prompt:    .asciz "\r\n"

.org PRT_OFF,0x90


I get

Read error: 01


Hmm, that would be 'invalid parameter'.

Can you add a 'foo: jmp foo' infinite loop and move it around to figure out
which read call is failing?
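
For reference, the two digits printed after "Read error:" are the BIOS INT 13h status byte returned in %ah. A minimal sketch of decoding it, covering only a handful of commonly documented status codes; the function names are hypothetical and the table is not exhaustive:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* A few INT 13h status codes as returned in %ah (not a complete list). */
static const char *int13_status_str(uint8_t ah)
{
    switch (ah) {
    case 0x00: return "success";
    case 0x01: return "invalid function or parameter";
    case 0x02: return "address mark not found";
    case 0x04: return "sector not found";
    case 0x10: return "uncorrectable CRC/ECC error";
    case 0x20: return "controller failure";
    case 0x40: return "seek failed";
    case 0x80: return "timeout (drive not ready)";
    default:   return "unknown";
    }
}

/* Build a line such as "Read error: 01 (invalid function or parameter)". */
static int format_read_error(uint8_t ah, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "Read error: %02x (%s)", ah,
                    int13_status_str(ah));
}
```

Status 01 says the BIOS rejected the request itself rather than failing on the media, which is why the next step in the thread is to find out which read call triggers it.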


main.5: mov %dx,MEM_ARG # Save args
movb $NSECT,%dh # Sector count
movl $1024,%eax # Offset to boot2
callw nread.1   # Read disk

foo:jmp foo

After this one I get

'Read error: 01'

Henri


Re: ZFS boot inside on the second partition inside a slice

2011-06-21 Thread Henri Hennebert

On 06/21/2011 19:51, John Baldwin wrote:

On Tuesday, June 21, 2011 12:15:58 pm Henri Hennebert wrote:

On 06/21/2011 17:55, John Baldwin wrote:

On Tuesday, June 21, 2011 10:50:14 am Henri Hennebert wrote:

On 06/21/2011 15:01, John Baldwin wrote:

Index: zfsldr.S
===
--- zfsldr.S(revision 223339)
+++ zfsldr.S(working copy)
@@ -234,9 +234,12 @@ nread.1:   xor %ecx,%ecx   # Get
callw read  # Read from disk
lea 0x10(%bp),%sp   # Clear stack
jnc return  # If success, return
-   mov $msg_read,%si   # Otherwise, set the error
-   #  message and fall through to
-   #  the error routine
+   mov %ah,%al # Format
+   mov $read_err,%di   #  error
+   call hex8   #  code
+   mov $msg_read,%si   # Set the error message and
+   #  fall through to the error
+   #  routine
/*
 * Print out the error message pointed to by %ds:(%si) followed
 * by a prompt, wait for a keypress, and then reboot the machine.
@@ -296,12 +299,28 @@ read.1:   mov $msg_chs,%si
jmp error
msg_chs:    .asciz "CHS not supported"

+/*
+ * Convert AL to hex, saving the result to [EDI].
+ */
+hex8:  push %ax    # Save
+   shrb $0x4,%al   # Do upper
+   call hex8.1 #  4
+   pop %ax # Restore
+hex8.1:    andb $0xf,%al   # Get lower 4
+   cmpb $0xa,%al   # Convert
+   sbbb $0x69,%al  #  to hex
+   das #  digit
+   orb $0x20,%al   # To lower case
+   stosb   # Save char
+   ret # (Recursive)
+
/* Messages */

-msg_read:  .asciz "Read"
-msg_part:  .asciz "Boot"
+msg_read:  .ascii "Read error: "
+read_err:  .asciz "XX"
+msg_part:  .asciz "Boot error"

-prompt:    .asciz " error\r\n"
+prompt:    .asciz "\r\n"

.org PRT_OFF,0x90


I get

Read error: 01


Hmm, that would be 'invalid parameter'.

Can you add a 'foo: jmp foo' infinite loop and move it around to figure out
which read call is failing?


main.5: mov %dx,MEM_ARG # Save args
  movb $NSECT,%dh # Sector count
  movl $1024,%eax # Offset to boot2
  callw nread.1   # Read disk

foo:jmp foo

After this one I get

'Read error: 01'


Hmm, ok.  NSECT changed in the MFC (it is now larger).  Try this patch.  It
changes the code to read zfsboot in one sector at a time:
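
The idea of reading one sector at a time can be sketched in C as follows. This is an illustration of the loop shape only (read, advance the LBA, advance the buffer by one sector), not the zfsldr.S code; read_sector_fn, read_sectors_one_by_one and fake_disk_read are hypothetical names, and the 512-byte sector size is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIZ_SEC 512u    /* assumed sector size */

/* Callback that reads exactly one sector; returns 0 on success. */
typedef int (*read_sector_fn)(uint32_t lba, uint8_t *dst, void *ctx);

/*
 * Read nsect sectors starting at lba, issuing one single-sector request
 * per iteration instead of one large multi-sector request.
 */
static int read_sectors_one_by_one(uint32_t lba, uint16_t nsect, uint8_t *dst,
                                   read_sector_fn read_one, void *ctx)
{
    assert(dst != NULL && read_one != NULL);
    while (nsect-- > 0) {
        int err = read_one(lba, dst, ctx);
        if (err != 0)
            return err;     /* stop at the first failing sector */
        lba++;              /* next sector on disk */
        dst += SIZ_SEC;     /* next sector-sized slot in the buffer */
    }
    return 0;
}

/* Stand-in "disk" for demonstration: fills a sector with its LBA's low byte. */
static int fake_disk_read(uint32_t lba, uint8_t *dst, void *ctx)
{
    (void)ctx;
    memset(dst, (int)(lba & 0xff), SIZ_SEC);
    return 0;
}
```

Each request then asks the BIOS for a single sector, so the size of NSECT no longer affects any individual read call.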



I encountered 2 problems - see inline in your patch

Henri



Index: zfsldr.S
===
--- zfsldr.S(revision 223365)
+++ zfsldr.S(working copy)
@@ -16,7 +16,6 @@
   */

  /* Memory Locations */
-   .set MEM_REL,0x700  # Relocation address
.set MEM_ARG,0x900  # Arguments
.set MEM_ORG,0x7c00 # Origin
.set MEM_BUF,0x8000 # Load area
@@ -91,26 +90,19 @@ main:   cld # String ops inc
mov %cx,%ss # Set up
mov $start,%sp  #  stack
  /*
- * Relocate ourself to MEM_REL.  Since %cx == 0, the inc %ch sets
- * %cx == 0x100.
- */
-   mov %sp,%si # Source
-   mov $MEM_REL,%di    # Destination
-   incb %ch    # Word count
-   rep # Copy
-   movsw   #  code
-/*
   * If we are on a hard drive, then load the MBR and look for the first
   * FreeBSD slice.  We use the fake partition entry below that points to
   * the MBR when we call nread.  The first pass looks for the first active
   * FreeBSD slice.  The second pass looks for the first non-active FreeBSD
   * slice if the first one fails.
   */
-   mov $part4,%si  # Partition
+   mov $part4,%si  # Dummy partition
cmpb $0x80,%dl  # Hard drive?
jb main.4   # No
-   movb $0x1,%dh   # Block count
-   callw nread # Read MBR
+   xor %eax,%eax

Re: ZFS boot inside on the second partition inside a slice

2011-06-21 Thread Henri Hennebert

On 06/21/2011 21:25, John Baldwin wrote:

On Tuesday, June 21, 2011 3:02:28 pm Henri Hennebert wrote:

On 06/21/2011 19:51, John Baldwin wrote:

On Tuesday, June 21, 2011 12:15:58 pm Henri Hennebert wrote:

On 06/21/2011 17:55, John Baldwin wrote:

On Tuesday, June 21, 2011 10:50:14 am Henri Hennebert wrote:

On 06/21/2011 15:01, John Baldwin wrote:

Index: zfsldr.S
===
--- zfsldr.S(revision 223339)
+++ zfsldr.S(working copy)
@@ -234,9 +234,12 @@ nread.1:   xor %ecx,%ecx   # Get
callw read  # Read from disk
lea 0x10(%bp),%sp   # Clear stack
jnc return  # If success, return
-   mov $msg_read,%si   # Otherwise, set the error
-   #  message and fall through to
-   #  the error routine
+   mov %ah,%al # Format
+   mov $read_err,%di   #  error
+   call hex8   #  code
+   mov $msg_read,%si   # Set the error message and
+   #  fall through to the error
+   #  routine
 /*
  * Print out the error message pointed to by %ds:(%si) followed
  * by a prompt, wait for a keypress, and then reboot the machine.
@@ -296,12 +299,28 @@ read.1:   mov $msg_chs,%si
jmp error
 msg_chs:   .asciz "CHS not supported"

+/*
+ * Convert AL to hex, saving the result to [EDI].
+ */
+hex8:  push %ax    # Save
+   shrb $0x4,%al   # Do upper
+   call hex8.1 #  4
+   pop %ax # Restore
+hex8.1:    andb $0xf,%al   # Get lower 4
+   cmpb $0xa,%al   # Convert
+   sbbb $0x69,%al  #  to hex
+   das #  digit
+   orb $0x20,%al   # To lower case
+   stosb   # Save char
+   ret # (Recursive)
+
 /* Messages */

-msg_read:  .asciz "Read"
-msg_part:  .asciz "Boot"
+msg_read:  .ascii "Read error: "
+read_err:  .asciz "XX"
+msg_part:  .asciz "Boot error"

-prompt:    .asciz " error\r\n"
+prompt:    .asciz "\r\n"

.org PRT_OFF,0x90


I get

Read error: 01


Hmm, that would be 'invalid parameter'.

Can you add a 'foo: jmp foo' infinite loop and move it around to figure out
which read call is failing?


main.5: mov %dx,MEM_ARG # Save args
   movb $NSECT,%dh # Sector count
   movl $1024,%eax # Offset to boot2
   callw nread.1   # Read disk

foo:jmp foo

After this one I get

'Read error: 01'


Hmm, ok.  NSECT changed in the MFC (it is now larger).  Try this patch.  It
changes the code to read zfsboot in one sector at a time:



I encountered 2 problems - see inline in your patch

Henri



Index: zfsldr.S
===
--- zfsldr.S(revision 223365)
+++ zfsldr.S(working copy)
@@ -16,7 +16,6 @@
*/

   /* Memory Locations */
-   .set MEM_REL,0x700  # Relocation address
.set MEM_ARG,0x900  # Arguments
.set MEM_ORG,0x7c00 # Origin
.set MEM_BUF,0x8000 # Load area
@@ -91,26 +90,19 @@ main:   cld # String ops inc
mov %cx,%ss # Set up
mov $start,%sp  #  stack
   /*
- * Relocate ourself to MEM_REL.  Since %cx == 0, the inc %ch sets
- * %cx == 0x100.
- */
-   mov %sp,%si # Source
-   mov $MEM_REL,%di    # Destination
-   incb %ch    # Word count
-   rep # Copy
-   movsw   #  code
-/*
* If we are on a hard drive, then load the MBR and look for the first
* FreeBSD slice.  We use the fake partition entry below that points to
* the MBR when we call nread.  The first pass looks for the first active
* FreeBSD slice.  The second pass looks for the first non-active FreeBSD
* slice if the first one fails.
*/
-   mov $part4,%si  # Partition
+   mov $part4,%si  # Dummy partition
cmpb $0x80,%dl  # Hard drive?
jb main.4   # No
-   movb $0x1,%dh

Re: ZFS boot inside on the second partition inside a slice

2011-06-17 Thread Henri Hennebert
On 06/16/2011 19:35, John Baldwin wrote:
 On Thursday, June 16, 2011 8:45:41 am Zhihao Yuan wrote:
 Exactly. The MFCed ZFSv28 is different from any patch maintained by
 mm@. Maybe some untested changes involved.
 
 Can you try reverting this change:
 
 Author: jhb
 Date: Thu Apr 28 17:44:24 2011
 New Revision: 221177
 URL: http://svn.freebsd.org/changeset/base/221177
 
 Log:
   Due to space constraints, the UFS boot2 and boot1 use an evil hack where
   boot2 calls back into boot1 to perform disk reads.  The ZFS MBR boot blocks
   do not have the same space constraints, so remove this hack for ZFS.
   While here, remove commented out code to support C/H/S addressing from
   zfsldr.  The ZFS and GPT bootstraps always just use EDD LBA addressing.
   
   MFC after:2 weeks
 
 Modified:
   head/sys/boot/i386/boot2/Makefile
   head/sys/boot/i386/common/drv.c
   head/sys/boot/i386/zfsboot/Makefile
   head/sys/boot/i386/zfsboot/zfsldr.S
 
I tried with this revision (221177) reverted, to no avail:
same error - 'read error'

Henri



Re: ZFS boot inside on the second partition inside a slice

2011-06-16 Thread Henri Hennebert

On 06/16/2011 07:32, Zhihao Yuan wrote:

I just redo everything, and changed the order of freebsd-zfs and
freebsd-swap. The Read error still happens!


Just a me too.

Everything was working great with zfsboot from 8.2-RELEASE + a patch
(http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/153552).

When I updated to 8.2-STABLE after the v28 MFC, I had to write a new zfsboot
to be allowed to upgrade my pool. I got the Read error after that.


PS - same config, a netbook with Windows 7 on the first partition - so I
can't switch to GPT.


Henri


On Wed, Jun 15, 2011 at 8:07 PM, Zhihao Yuan <lich...@gmail.com> wrote:

On Wed, Jun 15, 2011 at 7:58 PM, Xin LI <delp...@delphij.net> wrote:


On 06/15/11 17:42, Zhihao Yuan wrote:

Hi,

I configured my disk layout according to
http://wiki.freebsd.org/RootOnZFS/ZFSBootPartition

But I swapped the order of the freebsd-zfs and freebsd-swap. The 4.0G
freebsd-swap partition appears first inside the slice.
After that, I write zfsboot on both ada0s2 and ada0s2b, but the boot0
gives me a Read error.


Where did your second slice start?  There can be a lot of reasons why it
gives Read error.


After an NTFS partition of 12GB.
This should be the problem with zfsboot, because if I use sysinstall
to install a bootmgr, the boot gives me a not UFS error, which means
the boot0 is done (am I right?).



I personally recommend using GPT scheme instead of MBR, as you have a
dedicated partition for gptzfsboot, which is much cleaner than this
approach.



Yeah, yeah, I agree. I should not plan to play Windows games.


Cheers,
- --
Xin LIdelp...@delphij.net  http://www.delphij.net/
FreeBSD - The Power to Serve!  Live free or die





--
Zhihao Yuan, nickname lichray
The best way to predict the future is to invent it.
___
4BSD -- http://4bsd.biz/









Re: ZFS boot inside on the second partition inside a slice

2011-06-16 Thread Henri Hennebert

On 06/16/2011 19:35, John Baldwin wrote:

On Thursday, June 16, 2011 8:45:41 am Zhihao Yuan wrote:

Exactly. The MFCed ZFSv28 is different from any patch maintained by
mm@. Maybe some untested changes involved.


Can you try reverting this change:

Author: jhb
Date: Thu Apr 28 17:44:24 2011
New Revision: 221177
URL: http://svn.freebsd.org/changeset/base/221177

Log:
   Due to space constraints, the UFS boot2 and boot1 use an evil hack where
   boot2 calls back into boot1 to perform disk reads.  The ZFS MBR boot blocks
   do not have the same space constraints, so remove this hack for ZFS.
   While here, remove commented out code to support C/H/S addressing from
   zfsldr.  The ZFS and GPT bootstraps always just use EDD LBA addressing.

   MFC after:2 weeks

Modified:
   head/sys/boot/i386/boot2/Makefile
   head/sys/boot/i386/common/drv.c
   head/sys/boot/i386/zfsboot/Makefile
   head/sys/boot/i386/zfsboot/zfsldr.S


I will try this on Saturday!

Thanks

Henri


zfsboot from 8.2RC1 freeze at boot time

2010-12-28 Thread Henri Hennebert

Hello and merry Xmas to everybody,

I upgraded a remote server from 8.1-RELEASE to 8.2-RC1.

This server has one disk:

[r...@tignes ~]# gpart show
=>       63  488397105  ada0  MBR  (233G)
         63   12583809     1  freebsd  (6.0G)
   12583872  475813296     2  freebsd  [active]  (227G)

=>       0  12583809  ada0s1  BSD  (6.0G)
         0   8388608       1  freebsd-ufs  (4.0G)
   8388608   4195201       2  freebsd-swap  (2.0G)

=>        0  475813296  ada0s2  BSD  (227G)
          0  475813296       1  freebsd-zfs  (227G)


It boots with zfsboot from ada0s2, which contains a ZFS pool.

After upgrading zfsboot, just to be able to upgrade the pool to v15,
the server doesn't boot anymore.


It is a remote server, so I reproduced this config under VirtualBox. The
boot freezes after zfsboot displays '-'.


I grabbed an old zfsboot from another server running 8.1-STABLE (r213582),
which boots fine.


I put the zfsboot from r213582 (zpool v15 aware) on ada0s2 and bingo,
the server boots normally.


Henri


Re: MFC of ZFSv15

2010-09-19 Thread Henri Hennebert
On 09/19/2010 18:33, Dan Mack wrote:
 But I should be able to boot my ZFSv14 root pool using the ZFSv15 build of 
 FreeBSD, correct? 

Yes
 But the problem scenario would be when I've upgraded my root pool to v15 and
 I attempt to boot it with v14 boot loader.  At least that is what I think ...
You are right
 
 I guess what I'm getting at is ... you should be able to buildworld, 
 installkernel, reboot, installworld, reboot without worry. 
It is the case

 But after you run 'zpool upgrade', you will need to re-write the
 bootcode using gpart on each of your root pool ZFS disks.

I prefer to install bootcode BEFORE. Then reboot and check it with the
printf of my simple patch. Then you can zpool/zfs upgrade without problem.
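
The rule being discussed here (the boot code must support at least the pool's on-disk version) can be written down as a one-line check. A minimal sketch, assuming plain integer pool versions such as 14 and 15; the function name is hypothetical and this check is not part of zfsimpl.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when boot code built for boot_spa_version can read a pool at
 * pool_version.  A newer loader reads older pools; an older loader
 * cannot read a pool that has been upgraded past it.
 */
static bool bootcode_can_read_pool(uint64_t boot_spa_version,
                                   uint64_t pool_version)
{
    assert(boot_spa_version > 0 && pool_version > 0);
    return pool_version <= boot_spa_version;
}
```

So a v15 loader with a v14 pool is fine, while a v14 loader with a v15 pool is the failure case, which is the reason for installing and verifying the new bootcode before running zpool upgrade.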
 
 Am I understanding this correctly ?
 
 Thanks for all the work on ZFS BTW, it's great!
 
 Dan
 
 On Sep 16, 2010, at 2:03 PM, Henri Hennebert wrote:
 
 On 09/16/2010 17:18, jhell wrote:
 On 09/16/2010 09:55, Mike Tancsa wrote:

 Thanks again for all the ZFS fixes and enhancements!   Are there any
 caveats to upgrading ?

 Do I just do

 zpool upgrade -a
 zfs upgrade -a

 or are there any extra steps ?


 Hi Mike,

 No-one knows your bootcode better than you. So if you are upgrading
 don't forget if you are on a ZFS root then your bootcode might need
 updating.

 I was bitten by this problem in a previous ZFS upgrade.

 To be sure, I have added this patch to zfsimpl.c so, at boot I know if 
 zpool/zfs upgrade will be OK.

 Henri

 Regards, UPDATING should have anything else.


 sys_boot_zfs.patch
 
 Dan
 --
 Dan Mack
 m...@macktronics.com
 
 
 
 
 
 



Re: MFC of ZFSv15

2010-09-16 Thread Henri Hennebert

On 09/16/2010 17:18, jhell wrote:

On 09/16/2010 09:55, Mike Tancsa wrote:


Thanks again for all the ZFS fixes and enhancements!   Are there any
caveats to upgrading ?

Do I just do

zpool upgrade -a
zfs upgrade -a

or are there any extra steps ?



Hi Mike,

No-one knows your bootcode better than you. So if you are upgrading
don't forget if you are on a ZFS root then your bootcode might need
updating.


I was bitten by this problem in a previous ZFS upgrade.

To be sure, I have added this patch to zfsimpl.c so, at boot I know if 
zpool/zfs upgrade will be OK.


Henri


Regards, UPDATING should have anything else.



Index: sys/boot/zfs/zfsimpl.c
===
--- sys/boot/zfs/zfsimpl.c  (revision 212549)
+++ sys/boot/zfs/zfsimpl.c  (working copy)
@@ -61,6 +61,8 @@
STAILQ_INIT(zfs_vdevs);
STAILQ_INIT(zfs_pools);
 
+   printf("ZFS: supported version %u\n", (unsigned) SPA_VERSION);
+
zfs_temp_buf = malloc(TEMP_SIZE);
zfs_temp_end = zfs_temp_buf + TEMP_SIZE;
zfs_temp_ptr = zfs_temp_buf;

Re: zfs destroy snapshot doesn't free space

2010-08-13 Thread Henri Hennebert
On 08/13/2010 20:02, Andreas Mayer wrote:
 $ uname -a
 FreeBSD wurd.dev001.net 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19
 02:36:49 UTC 2010
 r...@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
 
 2010/8/13 Malcolm Waltz mwa...@pacific.edu:
 Have you tried zfs list -t all ?
 
 I have, it produces this output:
 $ zfs list -t all
 NAME   USED  AVAIL  REFER  MOUNTPOINT
 rpool  637G  48,4G18K  none
 rpool/root 245M  1,76G   209M  legacy
 .. rpool/root backup snapshots ...
 rpool/srv 5,31G  48,4G  4,94G  /srv
 .. rpool/srv backup snapshots ...
 rpool/tmp 90,2M  1,91G  90,2M  /tmp
 .. rpool/tmp backup snapshots ...
 rpool/usr 7,91G  48,4G  6,83G  /usr
 .. rpool/usr backup snapshots ...
 rpool/var  623G  48,4G   623G  /var
 
Just to be sure that a process is not still hogging space:

fstat |grep /var

Henri


Re: Pack of CAM improvements

2010-01-20 Thread Henri Hennebert

On 01/19/2010 17:12, Alexander Motin wrote:

Hi.

I've made a patch, that should solve set of problems of CAM ATA and CAM
generally. I would like to ask for testing and feedback.

What patch does:
- It unifies bus reset/probe sequence. Whenever bus attached at boot or
later, CAM will automatically reset and scan it. It allows to remove
duplicate code from many drivers.
- Any bus, attached before CAM completed its boot-time initialization,
will equally join to the process, delaying boot if needed.
- New kern.cam.boot_delay loader tunable should help controllers that
are still unable to register their buses in time (such as slow USB/
PCCard/ CardBus devices).


With kern.cam.boot_delay=15000 (I suppose that it is in ms) I can now
boot from my sim card reader.

Thanks

Henri


- To allow synchronization between different CAM levels, concept of
requests priorities was extended. Priorities now split between several
run levels. A device can be frozen at a specified level, allowing higher
priority requests to pass. For example, no payload requests allowed,
until PMP driver enable port. ATA XPT negotiate transfer parameters,
periph driver configure caching and so on.
- Frozen requests are no more counted by request allocation scheduler.
It fixes deadlocks, when frozen low priority payload requests occupying
slots, required by higher levels to manage their execution.
- Two last changes were holding proper ATA reinitialization and error
recovery implementation. Now it is done: SATA controllers and Port
Multipliers now implement automatic hot-plug and should correctly
recover from timeouts and bus resets.

Patch can be found here:
http://people.freebsd.org/~mav/cam-ata.20100119.patch

Feedback as always welcome.





Re: USB problems on 8.0-STABLE

2010-01-09 Thread Henri Hennebert

On 01/09/2010 05:39, Warren Block wrote:

On Fri, 8 Jan 2010, Frank wrote:


On Fri, 8 Jan 2010, Steven Friedrich wrote:


Option "AllowEmptyInput" "off"
EndSection



Comment out the line containing AllowEmptyInput.


OK, this took care of the nothing-works-unless-mouse-is-moved problem
but why do I get this? It's keeping apcupsd from starting.

Ace /usr/ports # usbdevs -d -v
usbdevs: no USB controllers found


I'd guess that usbdevs is obsolete, part of the old USB system.


Ace /usr/ports # usbconfig
ugen0.1: <OHCI root HUB nVidia> at usbus0, cfg=0 md=HOST spd=FULL
(12Mbps) pwr=ON
ugen1.1: <EHCI root HUB nVidia> at usbus1, cfg=0 md=HOST spd=HIGH
(480Mbps) pwr=ON
ugen0.2: <Back-UPS XS 1200 FW:8.g1 .D USB FW:g1 American Power
Conversion> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON
ugen0.3: <USB Optical Mouse vendor 0x0461> at usbus0, cfg=0 md=HOST
spd=LOW (1.5Mbps) pwr=ON
ugen0.4: <Dell USB Keyboard Dell> at usbus0, cfg=0 md=HOST spd=LOW
(1.5Mbps) pwr=ON


Do you have DEVICE /dev/ugen0.2 in apcupsd.conf?


I don't understand why usbdevs can't find any controllers and apcupsd
can't find any device while the kernel and usbconfig can find it all.


usbdevs: probably obsolete. As for apcupsd, I don't think it can
auto-scan for USB devices, but haven't used it with USB.


I have:

FreeBSD avoriaz.restart.bel 8.0-RELEASE FreeBSD 8.0-RELEASE #0 r199628M: 
Tue Nov 24 21:38:07 CET 2009 
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ  amd64


usbconfig:
ugen0.2: <Back-UPS CS 650 FW:817.v4.I USB American Power Conversion> at
usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON


apcupsd.conf:
UPSNAME Back-UPS-CS-650
UPSCABLE usb
UPSTYPE usb
DEVICE

apcupsd is working with this config.

Henri




-Warren Block * Rapid City, South Dakota USA




Re: cvsweb: src/UPDATING on RELENG_7

2010-01-02 Thread Henri Hennebert

Le 1/01/2010 16:40, Ian Smith a écrit :

Hi,

Thought I had a clue on using cvsweb, but seem to have mislaid it ..

After updating 7.0-RELEASE to RELENG_7 sources on Dec 28, checking
UPDATING before and during buildworld, went hunting on cvsweb for the
very version of UPDATING I was reading, 1.507.2.34 of 2009/11/29.

I can't find it, as such.  1.507 shows on MAIN, RELENG_7_BP, RELENG_7.
Selecting only RELENG_7 just shows that single ver 1.507 of Oct'07.


It is a known bug of cvsweb:

http://www.freebsd.org/cgi/query-pr.cgi?prp=120185-1-txt&n=/patch.txt

Henri


On a punt I manually entered 1.507.2.34 for a diff against 1.507 and
that looks just right:

http://www.freebsd.org/cgi/cvsweb.cgi/src/UPDATING.diff?r1=text&tr1=1.507&r2=text&tr2=1.507.2.34

But where would I look to find the log and view for 1.507.2.34 itself?

cheers, Ian


[SOLVED] 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections

2009-07-23 Thread Henri Hennebert

Li, Qing wrote:

Just another case where the route must be created:



That's probably because I explicitly disabled such
route installation for PPP link type.

Please apply patch http://people.freebsd.org/~qingli/patch and
let me know if that solves your problem.


The problem is solved.

Thanks a lot.

Henri

PS. the ipv4 ping was working fine before (and after) your patch, so
I don't see why you have to patch in.c


Thanks,

-- Qing




[r...@avoriaz ~]# ifconfig gif0
gif0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1280
tunnel inet 212.239.166.57 --> 94.23.44.41
inet6 fe80::21d:60ff:fead:2ace%gif0 prefixlen 64 scopeid 0x4
inet6 2001:41d0:2:2d29:1::: --> 2001:41d0:2:2d29:0::: prefixlen 128
options=1<ACCEPT_REV_ETHIP_VER>

[r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1:::
PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: --> 2001:41d0:2:2d29:1:::
^C
--- 2001:41d0:2:2d29:1::: ping6 statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss

[r...@avoriaz ~]# route add -inet6 2001:41d0:2:2d29:1::: -interface lo0
add host 2001:41d0:2:2d29:1:::: gateway lo0

[r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1:::
PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: --> 2001:41d0:2:2d29:1:::
16 bytes from ::1, icmp_seq=0 hlim=64 time=0.531 ms
16 bytes from ::1, icmp_seq=1 hlim=64 time=0.884 ms
16 bytes from ::1, icmp_seq=2 hlim=64 time=0.748 ms
^C
--- 2001:41d0:2:2d29:1::: ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.531/0.721/0.884/0.145 ms

Thanks

Henri

-Original Message-
From: Henri Hennebert [mailto:h...@restart.be]
Sent: Sat 7/11/2009 3:09 AM
To: Li, Qing
Cc: freebsd-stable@freebsd.org; freebsd-...@freebsd.org
Subject: Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and
IPv6 for 'local' connections

Li, Qing wrote:

Hi,

Please try patch-7-10 in my home directory http://people.freebsd.org/~qingli/
and let me know how it works out for you. I thought I had committed the patch
but turned out I didn't.

I apply the patch, reset my pf.conf to its previous content and all is
running smoothly. By the way, I discover after my post that my
solution was not working for long (many bytes) connections and this is
solved too.

Many thank for your time

Henri

PS please commit as soon as possible


On 8.0-BETA1 there is an assymetry:

netstat -rn display

192.168.24.1   link#3

no entry for 2001:41d0:2:2d29:1:1::


This is by design as part of the new architecture in 8.0, which maintains
the L2 ARP/ND6 and L3 routing tables separately.

-- Qing



-Original Message-
From: owner-freebsd-sta...@freebsd.org on behalf of Henri Hennebert
Sent: Fri 7/10/2009 5:32 AM
To: freebsd-stable@freebsd.org; freebsd...@freebsd.org
Subject: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6
for 'local' connections

Hello,

After upgrading from 7.2-STABLE to 8.0-BETA1 I encounter a problem when
connecting with firefox to a local apache server using the global
unicast IPv6 address of the local machine. pf.conf must be updated!

My configuration:

[r...@avoriaz ~]# ifconfig em0

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1d:60:ad:2a:ce
inet 192.168.24.1 netmask 0xff00 broadcast 192.168.24.255
inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1
inet6 2001:41d0:2:2d29:1:1:: prefixlen 80
media: Ethernet 100baseTX (100baseTX half-duplex)
status: active

[r...@avoriaz ~]# host www.restart.bel
www.restart.bel is an alias for avoriaz.restart.bel.
avoriaz.restart.bel has address 192.168.24.1
avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1::

pf.conf:

int_if=em0
block in  log all
block out log all
set skip on lo0
antispoof quick for $int_if inet
# Allow trafic with physical internal network
pass in quick on $int_if from ($int_if:network) to ($int_if) keep state
pass out quick on $int_if from ($int_if) to ($int_if:network) keep state

The problem:

[r...@avoriaz ~]# telnet -4 www.restart.bel 80
Trying 192.168.24.1...
Connected to avoriaz.restart.bel.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
[r...@avoriaz ~]# telnet -6 www.restart.bel 80
Trying 2001:41d0:2:2d29:1:1::...
---Never connect and get a timeout!

tcpdump and logging in pf show me that

For a IPv4 connection:
the packet from telnet to apache pass 2 times on lo0 (out and in)
the answer packet from apache to telnet pass 2 times on lo0 (out and in)

So no problem, there is `set skip on lo0'

For a IPv6 connection:
The first packet from telnet to apache pass 2 times on lo0 (out and in)
The answer packet from apache to telnet path on em0  and is rejected
due to the default flags S/SA.

So I have to change pf.conf and replace the last line:
pass out quick

Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections

2009-07-20 Thread Henri Hennebert

Li, Qing wrote:

The patch has been committed, svn revision 195643.

Thanks,

-- Qing


Just another case where the route must be created:

[r...@avoriaz ~]# ifconfig gif0
gif0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1280
tunnel inet 212.239.166.57 --> 94.23.44.41
inet6 fe80::21d:60ff:fead:2ace%gif0 prefixlen 64 scopeid 0x4
inet6 2001:41d0:2:2d29:1::: --> 2001:41d0:2:2d29:0::: prefixlen 128
options=1<ACCEPT_REV_ETHIP_VER>

[r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1:::
PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: --> 2001:41d0:2:2d29:1:::
^C
--- 2001:41d0:2:2d29:1::: ping6 statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss

[r...@avoriaz ~]# route add -inet6 2001:41d0:2:2d29:1::: -interface lo0
add host 2001:41d0:2:2d29:1:::: gateway lo0

[r...@avoriaz ~]# ping6 2001:41d0:2:2d29:1:::
PING6(56=40+8+8 bytes) 2001:41d0:2:2d29:1::: --> 2001:41d0:2:2d29:1:::

16 bytes from ::1, icmp_seq=0 hlim=64 time=0.531 ms
16 bytes from ::1, icmp_seq=1 hlim=64 time=0.884 ms
16 bytes from ::1, icmp_seq=2 hlim=64 time=0.748 ms
^C
--- 2001:41d0:2:2d29:1::: ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.531/0.721/0.884/0.145 ms

Thanks

Henri


-Original Message-
From: Henri Hennebert [mailto:h...@restart.be]
Sent: Sat 7/11/2009 3:09 AM
To: Li, Qing
Cc: freebsd-stable@freebsd.org; freebsd-...@freebsd.org
Subject: Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and 
IPv6 for 'local' connections
 
Li, Qing wrote:

Hi,

Please try patch-7-10 in my home directory http://people.freebsd.org/~qingli/
and let me know how it works out for you. I thought I had committed the patch 
but turned out I didn't.


I apply the patch, reset my pf.conf to its previous content and all is 
running smoothly. By the way, I discover after my post that my 
solution was not working for long (many bytes) connections and this is 
solved too.


Many thank for your time

Henri

PS please commit as soon as possible


On 8.0-BETA1 there is an assymetry:

netstat -rn display

192.168.24.1   link#3

no entry for 2001:41d0:2:2d29:1:1::

This is by design as part of the new architecture in 8.0, which maintains 
the L2 ARP/ND6 and L3 routing tables separately.


-- Qing



-Original Message-
From: owner-freebsd-sta...@freebsd.org on behalf of Henri Hennebert
Sent: Fri 7/10/2009 5:32 AM
To: freebsd-stable@freebsd.org; freebsd...@freebsd.org
Subject: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 
for 'local' connections
 
Hello,


After upgrading from 7.2-STABLE to 8.0-BETA1 I encounter a problem when 
connecting with firefox to a local apache server using the global 
unicast IPv6 address of the local machine. pf.conf must be updated!


My configuration:

[r...@avoriaz ~]# ifconfig em0

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1d:60:ad:2a:ce
inet 192.168.24.1 netmask 0xff00 broadcast 192.168.24.255
inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1
inet6 2001:41d0:2:2d29:1:1:: prefixlen 80
media: Ethernet 100baseTX (100baseTX half-duplex)
status: active

[r...@avoriaz ~]# host www.restart.bel
www.restart.bel is an alias for avoriaz.restart.bel.
avoriaz.restart.bel has address 192.168.24.1
avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1::

pf.conf:

int_if=em0
block in  log all
block out log all
set skip on lo0
antispoof quick for $int_if inet
# Allow trafic with physical internal network
pass in quick on $int_if from ($int_if:network) to ($int_if) keep state
pass out quick on $int_if from ($int_if) to ($int_if:network) keep state

The problem:

[r...@avoriaz ~]# telnet -4 www.restart.bel 80
Trying 192.168.24.1...
Connected to avoriaz.restart.bel.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
[r...@avoriaz ~]# telnet -6 www.restart.bel 80
Trying 2001:41d0:2:2d29:1:1::...
---Never connect and get a timeout!

tcpdump and logging in pf show me that

For a IPv4 connection:
the packet from telnet to apache pass 2 times on lo0 (out and in)
the answer packet from apache to telnet pass 2 times on lo0 (out and in)

So no problem, there is `set skip on lo0'

For a IPv6 connection:
The first packet from telnet to apache pass 2 times on lo0 (out and in)
The answer packet from apache to telnet path on em0  and is rejected
due to the default flags S/SA.

So I have to change pf.conf and replace the last line:
pass out quick on $int_if from ($int_if) to ($int_if:network) \
keep state flags any

Then all is OK

By the way, on 7.2

netstat -rn display

192.168.24.1             00:1d:60:ad:2a:ce

2001:41d0:2:2d29:1:1::   00:1d:60:ad:2a:ce


On 8.0-BETA1 there is an assymetry:

netstat -rn display

192.168.24.1   link#3

no entry

Re: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections

2009-07-11 Thread Henri Hennebert

Li, Qing wrote:

Hi,

Please try patch-7-10 in my home directory http://people.freebsd.org/~qingli/
and let me know how it works out for you. I thought I had committed the patch 
but turned out I didn't.


I apply the patch, reset my pf.conf to its previous content and all is 
running smoothly. By the way, I discover after my post that my 
solution was not working for long (many bytes) connections and this is 
solved too.


Many thank for your time

Henri

PS please commit as soon as possible




On 8.0-BETA1 there is an assymetry:

netstat -rn display

192.168.24.1   link#3

no entry for 2001:41d0:2:2d29:1:1::



This is by design as part of the new architecture in 8.0, which maintains 
the L2 ARP/ND6 and L3 routing tables separately.


-- Qing



-Original Message-
From: owner-freebsd-sta...@freebsd.org on behalf of Henri Hennebert
Sent: Fri 7/10/2009 5:32 AM
To: freebsd-stable@freebsd.org; freebsd...@freebsd.org
Subject: 8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 
for 'local' connections
 
Hello,


After upgrading from 7.2-STABLE to 8.0-BETA1 I encounter a problem when 
connecting with firefox to a local apache server using the global 
unicast IPv6 address of the local machine. pf.conf must be updated!


My configuration:

[r...@avoriaz ~]# ifconfig em0

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1d:60:ad:2a:ce
inet 192.168.24.1 netmask 0xff00 broadcast 192.168.24.255
inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1
inet6 2001:41d0:2:2d29:1:1:: prefixlen 80
media: Ethernet 100baseTX (100baseTX half-duplex)
status: active

[r...@avoriaz ~]# host www.restart.bel
www.restart.bel is an alias for avoriaz.restart.bel.
avoriaz.restart.bel has address 192.168.24.1
avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1::

pf.conf:

int_if=em0
block in  log all
block out log all
set skip on lo0
antispoof quick for $int_if inet
# Allow trafic with physical internal network
pass in quick on $int_if from ($int_if:network) to ($int_if) keep state
pass out quick on $int_if from ($int_if) to ($int_if:network) keep state

The problem:

[r...@avoriaz ~]# telnet -4 www.restart.bel 80
Trying 192.168.24.1...
Connected to avoriaz.restart.bel.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
[r...@avoriaz ~]# telnet -6 www.restart.bel 80
Trying 2001:41d0:2:2d29:1:1::...
---Never connect and get a timeout!

tcpdump and logging in pf show me that

For a IPv4 connection:
the packet from telnet to apache pass 2 times on lo0 (out and in)
the answer packet from apache to telnet pass 2 times on lo0 (out and in)

So no problem, there is `set skip on lo0'

For a IPv6 connection:
The first packet from telnet to apache pass 2 times on lo0 (out and in)
The answer packet from apache to telnet path on em0  and is rejected
due to the default flags S/SA.

So I have to change pf.conf and replace the last line:
pass out quick on $int_if from ($int_if) to ($int_if:network) \
keep state flags any

Then all is OK

By the way, on 7.2

netstat -rn display

192.168.24.1             00:1d:60:ad:2a:ce

2001:41d0:2:2d29:1:1::   00:1d:60:ad:2a:ce


On 8.0-BETA1 there is an assymetry:

netstat -rn display

192.168.24.1   link#3

no entry for 2001:41d0:2:2d29:1:1::

Hope it may help someone

Henri



8.0-BETA1 - for the record - different paths followed by IPv4 and IPv6 for 'local' connections

2009-07-10 Thread Henri Hennebert

Hello,

After upgrading from 7.2-STABLE to 8.0-BETA1 I encountered a problem when
connecting with firefox to a local apache server using the global
unicast IPv6 address of the local machine. pf.conf must be updated!


My configuration:

[r...@avoriaz ~]# ifconfig em0

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
ether 00:1d:60:ad:2a:ce
inet 192.168.24.1 netmask 0xff00 broadcast 192.168.24.255
inet6 fe80::21d:60ff:fead:2ace%em0 prefixlen 64 scopeid 0x1
inet6 2001:41d0:2:2d29:1:1:: prefixlen 80
media: Ethernet 100baseTX (100baseTX half-duplex)
status: active

[r...@avoriaz ~]# host www.restart.bel
www.restart.bel is an alias for avoriaz.restart.bel.
avoriaz.restart.bel has address 192.168.24.1
avoriaz.restart.bel has IPv6 address 2001:41d0:2:2d29:1:1::

pf.conf:

int_if=em0
block in  log all
block out log all
set skip on lo0
antispoof quick for $int_if inet
# Allow trafic with physical internal network
pass in quick on $int_if from ($int_if:network) to ($int_if) keep state
pass out quick on $int_if from ($int_if) to ($int_if:network) keep state

The problem:

[r...@avoriaz ~]# telnet -4 www.restart.bel 80
Trying 192.168.24.1...
Connected to avoriaz.restart.bel.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
[r...@avoriaz ~]# telnet -6 www.restart.bel 80
Trying 2001:41d0:2:2d29:1:1::...
--- Never connects; the attempt times out!

tcpdump and pf logging show me that:

For an IPv4 connection:
the packet from telnet to apache passes twice on lo0 (out and in)
the answer packet from apache to telnet passes twice on lo0 (out and in)

So no problem, there is `set skip on lo0'

For an IPv6 connection:
The first packet from telnet to apache passes twice on lo0 (out and in)
The answer packet from apache to telnet passes on em0 and is rejected
due to the default flags S/SA.

So I had to change pf.conf and replace the last line with:
pass out quick on $int_if from ($int_if) to ($int_if:network) \
keep state flags any

Then all is OK
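
To illustrate why the reply is rejected, here is a toy model in Python. It is only a sketch under stated assumptions, not how pf is implemented: the function names, the `states` set and the port numbers are all made up for the example. It models a pass rule that creates state only when the packet matches `flags S/SA` (SYN set, ACK clear). Because the SYN travelled over lo0, which is skipped, no state exists when the SYN+ACK shows up outbound on em0.

```python
from dataclasses import dataclass

SYN, ACK = 0x02, 0x10  # TCP flag bits


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    sport: int
    dport: int
    flags: int


def matches_s_sa(flags: int) -> bool:
    """'flags S/SA': looking only at SYN and ACK, just SYN is set."""
    return flags & (SYN | ACK) == SYN


def filter_out(pkt: Packet, states: set, flags_any: bool) -> bool:
    """Return True if an outbound packet passes (hypothetical toy filter)."""
    key = (pkt.src, pkt.sport, pkt.dst, pkt.dport)
    rev = (pkt.dst, pkt.dport, pkt.src, pkt.sport)
    if key in states or rev in states:
        return True  # packet belongs to an existing state
    if flags_any or matches_s_sa(pkt.flags):
        states.add(key)  # the pass rule matches and creates state
        return True
    return False  # falls through to 'block out log all'


addr = "2001:41d0:2:2d29:1:1::"
# apache's SYN+ACK; the earlier SYN went over lo0, so no state was created
syn_ack = Packet(addr, addr, 80, 50000, SYN | ACK)

default_result = filter_out(syn_ack, set(), flags_any=False)
any_result = filter_out(syn_ack, set(), flags_any=True)
print(default_result, any_result)
```

In this model the default rule drops the SYN+ACK, while the `flags any` variant lets it through and creates state from it.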

By the way, on 7.2

netstat -rn display

192.168.24.1             00:1d:60:ad:2a:ce

2001:41d0:2:2d29:1:1::   00:1d:60:ad:2a:ce


On 8.0-BETA1 there is an asymmetry:

netstat -rn display

192.168.24.1   link#3

no entry for 2001:41d0:2:2d29:1:1::

Hope it may help someone

Henri





Re: Zfs on usb-disk checksum errors?

2009-07-08 Thread Henri Hennebert

Ronald Klop wrote:

Hi.

I put zfs on my external usb-disk, so I can backup my harddisk with zfs 
send/receive.

I now have corruption on this volume.

[r...@sjakie ~]# zpool status -v
  pool: extern
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 0h2m with 0 errors on Wed Jul  8 00:35:09 2009

config:

        NAME        STATE     READ WRITE CKSUM
        extern      ONLINE       1     0     0
          da0       ONLINE       9     0     0

errors: Permanent errors have been detected in the following files:

0x3f:0xf5d6

I don't really understand which files have corruption. :-(
In my syslog is this: (repeated quite often)
Jul  8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
Jul  8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
Jul  8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): SCSI Status: Check Condition
Jul  8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): ILLEGAL REQUEST asc:20,0
Jul  8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): Invalid command operation code
Jul  8 10:00:37 sjakie kernel: (da0:umass-sim0:0:0:0): Unretryable error

I experienced the same error with a 'Kingston DataTraveler II 1.13'. I
simply added to /usr/src/sys/dev/usb/usbdevs:

product KINGSTON DATATRAVELER_2 0x1600 DAtaTraveler II

(the VENDOR was already in the file),

and in sys/dev/usb/storage/umass.c:

   { USB_VENDOR_KINGSTON, USB_PRODUCT_KINGSTON_DATATRAVELER_2, RID_WILDCARD,
     UMASS_PROTO_SCSI | UMASS_PROTO_BBB,
     NO_SYNCHRONIZE_CACHE
   },

Note the NO_SYNCHRONIZE_CACHE flag; with it, everything returned to normal.
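
For readers wondering how the kernel messages relate to that quirk, the short Python sketch below decodes the values from the log. The lookup tables are deliberately tiny and cover only what appears in this thread (opcode 0x35 and sense code 20,0); the function names are invented for the example.

```python
# Minimal lookup tables, limited to the values seen in the log above.
OPCODES = {0x35: "SYNCHRONIZE CACHE(10)"}
ASC_ASCQ = {(0x20, 0x00): "Invalid command operation code"}


def parse_cdb(text: str) -> bytes:
    """Parse the hex bytes printed after 'CDB:' in a CAM log line."""
    return bytes(int(tok, 16) for tok in text.split())


def describe(cdb: bytes, asc: int, ascq: int) -> str:
    """Name the command and the sense code the device answered with."""
    op = OPCODES.get(cdb[0], f"unknown opcode 0x{cdb[0]:02x}")
    sense = ASC_ASCQ.get((asc, ascq), f"asc/ascq 0x{asc:02x}/0x{ascq:02x}")
    return f"{op}: {sense}"


cdb = parse_cdb("35 0 0 0 0 0 0 0 0 0")
summary = describe(cdb, 0x20, 0x00)
print(summary)
```

The first CDB byte is the opcode, so the device is rejecting the cache flush command itself, which is the command the NO_SYNCHRONIZE_CACHE quirk tells the driver not to send.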

PS - I encountered this problem on 7.2-STABLE with the MFC of ZFS v13.

Henri


and sometimes
Jul  8 10:00:35 sjakie root: ZFS: vdev I/O failure, zpool=extern path=/dev/da0 offset=127558877184 size=3072 error=5
Jul  8 10:00:35 sjakie root: ZFS: vdev I/O failure, zpool=extern path=/dev/da0 offset=127558877184 size=3072 error=5
Jul  8 10:00:35 sjakie root: ZFS: zpool I/O failure, zpool=extern error=5
With varying offsets and sizes.

What can I conclude from this? Is the disk failing? Is the 'Invalid 
command operation code' something to worry about? It didn't show up when 
the disk was UFS.


I reinstalled the pool but the read-errors showed up again.

Thanks for any advice,

Ronald.


Re: FreeBSD 7-STABLE and chflags on ZFS now(?) failing

2009-06-29 Thread Henri Hennebert

Ralf S. Engelschall wrote:

One of my FreeBSD boxes is a 7-STABLE/amd64 one on ZFS, in production
for over 1.5 years now, and it receives regular upgrades.
The last installation of FreeBSD 7-STABLE was just about 2 weeks ago.
Today the upgrade failed the first time:


cd /usr/src; /usr/bin/make -f Makefile.inc1 install
=== share/info (install)
=== lib (install)
=== lib/csu/amd64 (install)
install -o root -g wheel -m 444  crt1.o crti.o crtn.o gcrt1.o /usr/lib
=== lib/libc (install)
install -C -o root -g wheel -m 444   libc.a /usr/lib
install -C -o root -g wheel -m 444   libc_p.a /usr/lib
install -s -o root -g wheel -m 444   -fschg -S  libc.so.7 /lib
install: /lib/libc.so.7: chflags: Invalid argument
*** Error code 71

Stop in /usr/src/lib/libc.
*** Error code 1

Stop in /usr/src/lib.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
3.30s real  0.35s user  0.75s sys
/libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh
*** Error code 1

Stop in /usr/adm.
*** Error code 1 (ignored)
/libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh
*** Error code 1 (ignored)
/libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh
*** Error code 1 (ignored)
/libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh
*** Error code 1 (ignored)
/libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh
*** Error code 1

Stop in /usr/adm.
*** Error code 1 (ignored)
# sh
/libexec/ld-elf.so.1: Shared object libc.so.7 not found, required by sh
#


Fortunately, I was able to quickly recover via /rescue/cp by copying
a libc.so.7 from a Jail to the host system (where the upgrade was
performed). But why has this problem occurred now?

Well, /lib is on ZFS and I can remember from the past that ZFS did not
honor chflags. But two questions remain:

1. I thought chflags support for ZFS was added already in the past.
   Can it be that just a _few_ chflags flags are supported? It looks
   like uchg works while the above schg fails.

I believe that for schg to work, `zfs get version file_system_with /lib`
must report 3.
To upgrade it: `zfs upgrade file_system_with /lib`



2. Assuming that schg was never supported on ZFS by us, why did the
   upgrades in the past on this FreeBSD 7-STABLE box never failed until
   now? Why now the first time? I would have expected that it already
   failed from day zero with the above error.


Just a guess about this strange problem:

`man install` says:

By default, install preserves all file flags, with the exception of the
``nodump'' flag.

With the previous version of zfs there were no flags, and so no attempt
to set flags during the update.


Henri


As workaround I've now put a NO_SCHG=yes into /etc/make.conf and
performed the upgrade from scratch. Now it succeeded, of course. But I
still do not know the answer to the above two questions and this makes
me still feel a little bit unsure about the whole situation...


PS: During a mergemaster run I now got a problem which looks related:
mv: /var/db/mergemaster.mtree: set flags (was: ): Invalid argument
Yes, /var is also on ZFS here. It looks like the same problem. But I'm
sure this error did not occur in the past either...

--
r...@freebsd.org    Ralf S. Engelschall
FreeBSD.org/~rse   r...@engelschall.com
FreeBSD committer  www.engelschall.com



Re: ZFS pool from current

2009-06-17 Thread Henri Hennebert

Nenhum_de_Nos wrote:

On Wed, June 17, 2009 11:16, Dimitry Andric wrote:

On 2009-06-17 16:09, Nenhum_de_Nos wrote:

And for virtualbox on amd64 purposes I want to run 7.2R or STABLE to use
VT-x and amd64 vm's under vbox. Will I have to do anything, or will it
just work?

Kip Macy created a branch where there is the new zfs code, but I didn't
get whether it is in the main sources or whether I need to fetch any
special code.

Kip merged the ZFS v13 support to -STABLE just last month.  It seems to
work okay for most people, but be sure to read the UPDATING file,
especially if you are upgrading existing pools.


thanks, I was just looking for this update on web interface to cvs and
there is nothing in UPDATING for RELENG_7 there. is this really supposed
to happen ?


Sadly a known and ignored problem of cvsweb

http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/120185

Henri


I'll get from csup now ...

thanks again,

matheus





Re: ZFS pool from current

2009-06-17 Thread Henri Hennebert

Gavin Atkinson wrote:

On Wed, 2009-06-17 at 16:51 +0200, Henri Hennebert wrote:

Nenhum_de_Nos wrote:

thanks, I was just looking for this update on web interface to cvs and
there is nothing in UPDATING for RELENG_7 there. is this really supposed
to happen ?

Sadly a known and ignored problem of cvsweb

http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/120185

Henri


As far as I can tell, this isn't really a problem with cvsweb, but more
of a problem with the repository itself.  The issue comes when a commit
is made and the log message includes the magic string that CVS uses
internally to track different revisions.  The patch proposed in that PR
appears to be more of a hack than a fix.


Ok with that but there is no fix if you base your algorithm on a wrong 
specification.




It's the same reason that (for example)
http://www.freebsd.org/cgi/cvsweb.cgi/src/etc/rc.d/ntpd lists a revision
1.335 even though the most recent commit was version 1.18.


The hack works well in this case too. I prefer a hack to a
confusing answer.


Henri


On the upside, it doesn't appear that these bogus commits have ended up
replicated in the SVN repository.

Gavin


Stable from May 31 - zfs list locked

2009-06-06 Thread Henri Hennebert

Hello,

I have encountered this problem for the second time. The system is working
perfectly well but suddenly the command `zfs list' stops working and can't
be killed.


Here is a procstat of the culprit:

[r...@morzine ~]# procstat -k 91766
  PID    TID COMM             TDNAME           KSTACK
91766 100490 zfs              -                mi_switch sleepq_switch
sleepq_wait _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir
zap_lookup_norm zap_lookup dsl_prop_get_dd dsl_dataset_get_ref
dsl_dataset_hold dmu_objset_open zfs_ioc_objset_stats zfsdev_ioctl
devfs_ioctl_f kern_ioctl
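
The kernel stack above lists the innermost frame first. As a reading aid, the Python sketch below reverses it into caller order and picks out the frame where the thread is waiting on I/O. The helper names are invented for this example, and the stack string is simply copied from the procstat output.

```python
KSTACK = (
    "mi_switch sleepq_switch sleepq_wait _cv_wait zio_wait dbuf_read "
    "dmu_buf_hold zap_lockdir zap_lookup_norm zap_lookup dsl_prop_get_dd "
    "dsl_dataset_get_ref dsl_dataset_hold dmu_objset_open "
    "zfs_ioc_objset_stats zfsdev_ioctl devfs_ioctl_f kern_ioctl"
)


def call_chain(kstack: str) -> list:
    """Reverse the innermost-first frame list into caller order."""
    return list(reversed(kstack.split()))


def first_frame_with_prefix(kstack: str, prefix: str):
    """Innermost frame whose name starts with prefix, or None."""
    return next((f for f in kstack.split() if f.startswith(prefix)), None)


chain = call_chain(KSTACK)
blocked_in = first_frame_with_prefix(KSTACK, "zio_")
print(" -> ".join(chain))
print("waiting in:", blocked_in)
```

Read this way, the ioctl issued by `zfs list' descends through the dataset and ZAP lookup code and sleeps in zio_wait, i.e. it is waiting for a read that has not completed.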


The same thing happens if I try to run `zpool list' in another terminal.

Henri



Re: Stable from May 31 - zfs list locked

2009-06-06 Thread Henri Hennebert

Henri Hennebert wrote:

Hello,

I encounter this problem for the second time. The system is working 
perfectly well but suddenly the command `zfs list' don't work and can't 
be killed.


Here is a procstat of the culprit:

[r...@morzine ~]# procstat -k 91766
  PID    TID COMM             TDNAME           KSTACK
91766 100490 zfs              -                mi_switch sleepq_switch
sleepq_wait _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir
zap_lookup_norm zap_lookup dsl_prop_get_dd dsl_dataset_get_ref
dsl_dataset_hold dmu_objset_open zfs_ioc_objset_stats zfsdev_ioctl
devfs_ioctl_f kern_ioctl


The same thing happens if I try to run `zpool list' in another terminal.


Strangely, zfs snapshot and zfs destroy seem to be working properly ...

I will reboot to check this.


Henri



Re: ZFS booting without partitions

2009-06-05 Thread Henri Hennebert

Kip Macy wrote:

On Mon, Jun 1, 2009 at 10:21 AM, Adam McDougall mcdou...@egr.msu.edu wrote:

I'm thinking that too.  I spent some time taking stabs at figuring it out
yesterday but didn't get anywhere useful.  I did try compiling the -current
src/sys/boot tree on 7.2 after a couple header tweaks to make it compile but
the loader still didn't work.  The working loader is the same file size as
the broken loader unless it was compiled on i386 and then it is ~30k bigger
for some reason (it shrinks to the same size as the rest if I force it to
use the same 32bit compilation flags as used on amd64).  Just mentioning
this in case it saves someone else some time.  I'm real pleased it works at
all.


If someone has the time to track down the differences I'll MFC them.
I'm not using ZFS boot at the moment so I have no way of testing.


At last I got this F.G diff!!!

The problem was in libstand.a. By the way, the patch also takes into 
account Doug Rabson's update, which addressed my problem with too many 
devices / pools.


Happy to help on this one.



Cheers,
Kip


--- lib/libstand/stand.h.orig   2007-01-09 02:02:04.0 +0100
+++ lib/libstand/stand.h        2009-06-03 17:24:42.627552341 +0200
@@ -167,7 +167,7 @@
 #define SOPEN_RASIZE   512
 };
 
-#define SOPEN_MAX   8
+#define SOPEN_MAX   64
 extern struct open_file files[];
 
 /* f_flags values */
--- lib/libstand/nfs.c.orig     2004-01-21 21:12:23.0 +0100
+++ lib/libstand/nfs.c  2009-06-05 20:36:26.001368421 +0200
@@ -29,7 +29,7 @@
  */
 
 #include <sys/cdefs.h>
-__FBSDID("$FreeBSD: src/lib/libstand/nfs.c,v 1.12 2004/01/21 20:12:23 jhb Exp $");
+__FBSDID("$FreeBSD: src/lib/libstand/nfs.c,v 1.14 2008/11/21 09:14:29 luigi Exp $");
 
 #include <sys/param.h>
 #include <sys/time.h>
@@ -405,16 +405,23 @@
 
 #ifdef NFS_DEBUG
        if (debug)
-               printf("nfs_open: %s (rootpath=%s)\n", path, rootpath);
+               printf("nfs_open: %s (rootpath=%s)\n", upath, rootpath);
 #endif
        if (!rootpath[0]) {
                printf("no rootpath, no nfs\n");
                return (ENXIO);
        }
 
+       /*
+        * This is silly - we should look at dv_type but that value is
+        * arch dependant and we can't use it here.
+        */
 #ifndef __i386__
        if (strcmp(f->f_dev->dv_name, "net") != 0)
                return(EINVAL);
+#else
+       if (strcmp(f->f_dev->dv_name, "pxe") != 0)
+               return(EINVAL);
 #endif
 
        if (!(desc = socktodesc(*(int *)(f->f_devdata

Re: /boot/loader can't load kernel if too many pool/devices

2009-06-02 Thread Henri Hennebert

Doug Rabson wrote:


On 1 Jun 2009, at 11:22, Henri Hennebert wrote:


Hello,

During my (successful) tests of booting directly from ZFS (with zfsboot 
and gptzfsboot) I get the error "can't boot 'kernel'" if too 
many devices/pools are connected to the machine. In my case:


2 SAS disks with 2 pools
2 SATA disks with 2 pools
1 USB key with one pool

`heap` command:

Active Allocations: 171/173
536576 bytes reserved 527800 bytes allocated

`ls` command:

open '/' failed: too many open files

If I reboot without the USB key all is OK.

If I reboot from the USB key after disconnecting 2 disks all is OK.

By the way, the /boot/loader in 7.2-STABLE doesn't work; it complains about 
"forth not found".


The previous tests were made with 7.2-STABLE (May 31) with 
/boot/loader from 8.0-CURRENT.


I recently increased the number of file descriptors available for 
/boot/loader. Could you rebuild and try again please. Make sure you 
rebuild libstand.a as well as /boot/loader.
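
For readers following along: the "too many open files" message from `ls` is what a fixed-size descriptor table produces once every slot is taken. The sketch below is not the libstand code. `TABLE_MAX`, `slot_open` and `slot_close` are hypothetical names, and the size of 8 only mirrors the old SOPEN_MAX value that appears elsewhere in this archive. It also assumes, without confirmation from this thread, that each probed device or pool keeps one slot occupied.

```c
#include <errno.h>

/* Hypothetical table size; it mirrors the old limit of 8, not the real code. */
#define TABLE_MAX 8

struct slot {
    int in_use;   /* non-zero while the descriptor is open */
};

static struct slot table[TABLE_MAX];

/*
 * Claim the first free slot. Returns the descriptor index, or -1 with
 * errno set to EMFILE ("too many open files") when the table is full.
 */
int
slot_open(void)
{
    for (int fd = 0; fd < TABLE_MAX; fd++) {
        if (!table[fd].in_use) {
            table[fd].in_use = 1;
            return fd;
        }
    }
    errno = EMFILE;
    return -1;
}

/* Release a slot. Returns 0 on success, or -1 with errno set to EBADF. */
int
slot_close(int fd)
{
    if (fd < 0 || fd >= TABLE_MAX || !table[fd].in_use) {
        errno = EBADF;
        return -1;
    }
    table[fd].in_use = 0;
    return 0;
}
```

With eight slots, the ninth `slot_open()` fails until something is closed. Raising the limit only moves the point at which the table fills up, which fits the report that unplugging one device was enough to boot.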



OK - I can boot with the USB key and 4 disks

Thanks

Henri


Re: ZFS booting without partitions

2009-06-01 Thread Henri Hennebert

Lorenzo Perone wrote:

Hi,

I tried hard... but without success ;(

the result is, when choosing the disk with the zfs boot
sectors in it (in my case F5, which goes to ad6), the kernel
is not found. the console shows:

forth not found
definitions not found
only not found
(the above repeated several times)


This is the file /boot/loader from 7.2-STABLE which is wrong.

You can find a copy from 8.0-CURRENT, and a script that I tested on a USB 
key and that works for me, here:


http://verbier.restart.be/xfer/boot-zfs/

Put this directory somewhere, e.g. /tmp/boot-zfs

and run the script, e.g.:
`cd /tmp/boot-zfs && sh -x make_usb_key.sh da6 kingston`

good luck

Henri


can't load 'kernel'

and I get thrown to the loader prompt.
lsdev does not show any ZFS devices.

Strange thing: if I boot from the other disk, F1, which is my
ad4 containing the normal ufs system I used to make up the other
one, and escape to the loader prompt, lsdev actually sees the
zpool which is on the other disk, and shows:
zfs0: tank

I tried booting with boot zfs:tank or zfs:tank:/boot/kernel/kernel,
but there I get the panic: free: guard1 fail message.
(would boot zfs:tank:/boot/kernel/kernel be correct, anyways?)

Sure I'm doing something wrong, but what...? Is it a problem that
the pool is made out of the second disk only (ad6)?

Here are my details (note: latest stable and biosdisk.c merged
with changes shown in r185095. no problems in buildworld/kernel):

snip

Machine: p4 4GHz 4 GB RAM (i386)

Note: the pool has actually a different name (heidi
instead of tank, if this can be of any relevance...),
just using tank here as it's one of the conventions...

mount (just to show my starting situation)

/dev/mirror/gm0s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/mirror/gm0s1e on /tmp (ufs, local, soft-updates)
/dev/mirror/gm0s1f on /usr (ufs, local, soft-updates)
/dev/mirror/gm0s1d on /var (ufs, local, soft-updates)

gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad4
(ad6 used to be the second disk...)

echo 'LOADER_ZFS_SUPPORT=yes' >> /etc/make.conf

cd /usr/src
make buildworld && make buildkernel KERNCONF=HEIDI
make installkernel KERNCONF=HEIDI
mergemaster
make installworld
shutdown -r now

dd if=/dev/zero of=/dev/ad6 bs=512 count=32

zpool create tank ad6
zfs create tank/usr
zfs create tank/var
zfs create -V 4gb tank/swap
zfs set org.freebsd:swap=on tank/swap
zpool set bootfs=tank tank

rsync -avx / /tank
rsync -avx /usr/ /tank/usr
rsync -avx /var/ /tank/var
cd /usr/src
make installkernel KERNCONF=HEIDI DESTDIR=/tank

zpool export tank

dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
dd if=/boot/zfsboot of=/dev/ad6 bs=512 skip=1 seek=1024

zpool import tank

zfs set mountpoint=legacy tank
zfs set mountpoint=/usr tank/usr
zfs set mountpoint=/var tank/var

shutdown -r now ...

at the 'mbr prompt' I pressed F5 (the second disk, ad6)
.. as written above, loader gets loaded (at this stage
I suppose it's the stuff dd't after block 1024?),
but kernel not found.

/usr/src/sys/i386/conf/HEIDI:
(among other things...):
options KVA_PAGES=512

(/tank)/boot/loader.conf:
vm.kmem_size=1024M
vm.kmem_size_max=1024M
vfs.zfs.arc_max=128M
vfs.zfs.vdev.cache.size=8M
vfs.root.mountfrom=zfs:tank

(/tank)/etc/fstab:
# Device      Mountpoint  FStype  Options    Dump  Pass#
tank          /           zfs     rw         0     0
/dev/acd0     /cdrom      cd9660  ro,noauto  0     0

/snap

any help is welcome... don't know where to go from here right now.

BTW: I can't stop thanking the team for the incredible
pace at which bugs are fixed these days!


Regards,

Lorenzo



On 26.05.2009, at 18:42, George Hartzell wrote:


Andriy Gapon writes:

on 26/05/2009 19:21 George Hartzell said the following:

Dmitry Morozovsky writes:

On Tue, 26 May 2009, Mickael MAILLOT wrote:

MM Hi,
MM
MM i prefere use zfsboot boot sector, an example is better than a 
long talk:

MM
MM $ zpool create tank mirror ad4 ad6
MM $ zpool export tank
MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=1
MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 skeep=1  seek=1024
MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 skeep=1  seek=1024

s/skeep/skip/ ? ;-)


What is the reason for copying zfsboot one bit at a time, as opposed
to

 dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=2


seek=1024 for the second part? and no 'count=1' for it? :-)

[Just guessing] Apparently the first block of zfsboot is some form of 
MBR and the

rest is zfs-specific code that goes to magical sector 1024.


Ok, I managed to read the argument to seek as one block, apparently
my coffee hasn't hit yet.

I'm still confused about the two parts of zfsboot and what's magical
about seeking to 1024.

g.
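
On the arithmetic behind the question above: dd multiplies both `skip` (input side) and `seek` (output side) by `bs`, so the second command reads zfsboot from byte 512 onward and writes it at byte 524288 of the disk. The sketch below only does that multiplication; `struct dd_copy` and the two helpers are hypothetical names. As a hedged aside not confirmed in this thread: 512 KiB is also the combined size of the two 256 KiB vdev labels that the ZFS on-disk format places at the front of a device, after which a boot block area is reserved, which would explain the choice of sector 1024.

```c
#include <stdint.h>

/* The three dd operands that matter here (hypothetical helper type). */
struct dd_copy {
    uint64_t bs;     /* block size in bytes */
    uint64_t skip;   /* input blocks skipped before reading */
    uint64_t seek;   /* output blocks skipped before writing */
};

/* Byte offset in the input file where reading starts. */
uint64_t
dd_input_offset(const struct dd_copy *c)
{
    return c->skip * c->bs;
}

/* Byte offset in the output device where writing starts. */
uint64_t
dd_output_offset(const struct dd_copy *c)
{
    return c->seek * c->bs;
}
```

For `bs=512 skip=1 seek=1024` this gives an input offset of 512 bytes and an output offset of 524288 bytes (512 KiB). Because that command has no `count`, dd keeps copying until the end of zfsboot.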




/boot/loader can't load kernel if too many pool/devices

2009-06-01 Thread Henri Hennebert

Hello,

During my (successful) tests of booting directly from ZFS (with zfsboot and 
gptzfsboot) I get the error "can't boot 'kernel'" if too many 
devices/pools are connected to the machine. In my case:


2 SAS disks with 2 pools
2 SATA disks with 2 pools
1 USB key with one pool

`heap` command:

Active Allocations: 171/173
536576 bytes reserved 527800 bytes allocated

`ls` command:

open '/' failed: too many open files

If I reboot without the USB key all is OK.

If I reboot from the USB key after disconnecting 2 disks all is OK.

By the way, the /boot/loader in 7.2-STABLE doesn't work; it complains about 
"forth not found".


The previous tests were made with 7.2-STABLE (May 31) with /boot/loader 
from 8.0-CURRENT.


Henri


Re: ZFS booting without partitions

2009-06-01 Thread Henri Hennebert

Henri Hennebert wrote:

Lorenzo Perone wrote:

Hi,

I tried hard... but without success ;(

the result is, when choosing the disk with the zfs boot
sectors in it (in my case F5, which goes to ad6), the kernel
is not found. the console shows:

forth not found
definitions not found
only not found
(the above repeated several times)


This is the file /boot/loader from 7.2-STABLE which is wrong.

You can find a copy from 8.0-CURRENT, and a script that I tested on a USB 
key and that works for me, here:


http://verbier.restart.be/xfer/boot-zfs/

Put this directory somewhere, e.g. /tmp/boot-zfs

and run the script, e.g.:
`cd /tmp/boot-zfs && sh -x make_usb_key.sh da6 kingston`

good luck


CAVEAT:

The script puts tuning in '/boot/loader.conf' which implies options 
 KVA_PAGES=384 in my i386 kernel.
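
A rough way to see why the kernel option matters, under the assumption that this is a non-PAE i386 kernel where one KVA_PAGES unit covers 4 MB of kernel virtual address space (so the default of 256 gives 1 GB). The helper names below are hypothetical and the check is deliberately crude: it only asks whether a requested kmem size is strictly smaller than the whole kernel address space.

```c
#include <stdint.h>

/* Assumption: non-PAE i386, one KVA_PAGES unit = 4 MB. */
#define KVA_UNIT_MB 4u

/* Kernel virtual address space, in MB, for a given KVA_PAGES value. */
uint32_t
kva_size_mb(uint32_t kva_pages)
{
    return kva_pages * KVA_UNIT_MB;
}

/* 1 if a kmem size (in MB) is strictly smaller than the KVA, else 0. */
int
kmem_fits(uint32_t kva_pages, uint32_t kmem_size_mb)
{
    return kmem_size_mb < kva_size_mb(kva_pages);
}
```

Under that assumption KVA_PAGES=384 yields 1536 MB and KVA_PAGES=512 yields 2048 MB, while a 1024 MB kmem setting would not leave any room with the default of 256.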


Henri



Henri


can't load 'kernel'

and I get thrown to the loader prompt.
lsdev does not show any ZFS devices.

Strange thing: if I boot from the other disk, F1, which is my
ad4 containing the normal ufs system I used to make up the other
one, and escape to the loader prompt, lsdev actually sees the
zpool which is on the other disk, and shows:
zfs0: tank

I tried booting with boot zfs:tank or zfs:tank:/boot/kernel/kernel,
but there I get the panic: free: guard1 fail message.
(would boot zfs:tank:/boot/kernel/kernel be correct, anyways?)

Sure I'm doing something wrong, but what...? Is it a problem that
the pool is made out of the second disk only (ad6)?

Here are my details (note: latest stable and biosdisk.c merged
with changes shown in r185095. no problems in buildworld/kernel):

snip

Machine: p4 4GHz 4 GB RAM (i386)

Note: the pool has actually a different name (heidi
instead of tank, if this can be of any relevance...),
just using tank here as it's one of the conventions...

mount (just to show my starting situation)

/dev/mirror/gm0s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/mirror/gm0s1e on /tmp (ufs, local, soft-updates)
/dev/mirror/gm0s1f on /usr (ufs, local, soft-updates)
/dev/mirror/gm0s1d on /var (ufs, local, soft-updates)

gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad4
(ad6 used to be the second disk...)

echo 'LOADER_ZFS_SUPPORT=yes' >> /etc/make.conf

cd /usr/src
make buildworld && make buildkernel KERNCONF=HEIDI
make installkernel KERNCONF=HEIDI
mergemaster
make installworld
shutdown -r now

dd if=/dev/zero of=/dev/ad6 bs=512 count=32

zpool create tank ad6
zfs create tank/usr
zfs create tank/var
zfs create -V 4gb tank/swap
zfs set org.freebsd:swap=on tank/swap
zpool set bootfs=tank tank

rsync -avx / /tank
rsync -avx /usr/ /tank/usr
rsync -avx /var/ /tank/var
cd /usr/src
make installkernel KERNCONF=HEIDI DESTDIR=/tank

zpool export tank

dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
dd if=/boot/zfsboot of=/dev/ad6 bs=512 skip=1 seek=1024

zpool import tank

zfs set mountpoint=legacy tank
zfs set mountpoint=/usr tank/usr
zfs set mountpoint=/var tank/var

shutdown -r now ...

at the 'mbr prompt' I pressed F5 (the second disk, ad6)
.. as written above, loader gets loaded (at this stage
I suppose it's the stuff dd't after block 1024?),
but kernel not found.

/usr/src/sys/i386/conf/HEIDI:
(among other things...):
options KVA_PAGES=512

(/tank)/boot/loader.conf:
vm.kmem_size=1024M
vm.kmem_size_max=1024M
vfs.zfs.arc_max=128M
vfs.zfs.vdev.cache.size=8M
vfs.root.mountfrom=zfs:tank

(/tank)/etc/fstab:
# Device      Mountpoint  FStype  Options    Dump  Pass#
tank          /           zfs     rw         0     0
/dev/acd0     /cdrom      cd9660  ro,noauto  0     0

/snap

any help is welcome... don't know where to go from here right now.

BTW: I can't stop thanking the team for the incredible
pace at which bugs are fixed these days!


Regards,

Lorenzo



On 26.05.2009, at 18:42, George Hartzell wrote:


Andriy Gapon writes:

on 26/05/2009 19:21 George Hartzell said the following:

Dmitry Morozovsky writes:

On Tue, 26 May 2009, Mickael MAILLOT wrote:

MM Hi,
MM
MM i prefere use zfsboot boot sector, an example is better than a 
long talk:

MM
MM $ zpool create tank mirror ad4 ad6
MM $ zpool export tank
MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=1
MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 skeep=1  seek=1024
MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 skeep=1  seek=1024

s/skeep/skip/ ? ;-)


What is the reason for copying zfsboot one bit at a time, as opposed
to

 dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=2


seek=1024 for the second part? and no 'count=1' for it? :-)

[Just guessing] Apparently the first block of zfsboot is some form 
of MBR and the

rest is zfs-specific code that goes to magical sector 1024.


Ok, I managed to read the argument to seek as one block, apparently
my coffee hasn't hit yet.

I'm still confused about the two parts of zfsboot and what's magical
about seeking to 1024.

g.


Re: libzpool assert vs libc assert

2009-06-01 Thread Henri Hennebert

Andriy Gapon wrote:

on 29/05/2009 15:35 Andriy Gapon said the following:

So anyone else feels that this is a bug?

on 28/05/2009 16:55 Andriy Gapon said the following:

on 28/05/2009 16:26 Henri Hennebert said the following:

(gdb) bt
#0  0x0008012a6f22 in strlen () from /lib/libc.so.7
#1  0x0008012a0feb in open () from /lib/libc.so.7
#2  0x00080129ea59 in open () from /lib/libc.so.7
#3  0x0008012a1f2e in vfprintf () from /lib/libc.so.7
#4  0x000801291158 in fprintf () from /lib/libc.so.7
#5  0x000801290fb0 in __assert () from /lib/libc.so.7

I find the above part interesting.
Could this be because of the following discrepancy:

1)
cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:
extern void __assert(const char *, const char *, int);
2)
lib/libc/gen/assert.c:
void
__assert(func, file, line, failedexpr)
const char *func, *file;
int line;
const char *failedexpr;


#6  0x000800fef120 in zmutex_destroy () from /lib/libzpool.so.1
#7  0x00080102e1a0 in dsl_dataset_fast_stat () from /lib/libzpool.so.1
#8  0x000801045ffa in dbuf_find () from /lib/libzpool.so.1
#9  0x000801047bf3 in dmu_buf_rele () from /lib/libzpool.so.1
#10 0x000801027546 in dsl_pool_open () from /lib/libzpool.so.1
#11 0x00080101bcec in spa_create () from /lib/libzpool.so.1
#12 0x00080101c820 in spa_tryimport () from /lib/libzpool.so.1


I propose the following patch for this issue.
It fixes mismatch between __assert extern declaration in zfs code and actual
signature in libc code.
I also took the liberty of dropping the __STDC__ and __STDC_VERSION__ checks. I think
that those checks are not needed with compilers that can be used to compile FreeBSD.
Besides, both branches of the __STDC_VERSION__ check were exactly the same.

Henri,

if you still experience that crash of zpool command, could you please try the
patch and see if you have a nicer assert message and stacktrace now?
Sorry, that this is still not a fix for the real issue.
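
To illustrate the mismatch being fixed: a caller that passes (expression, file, line) to a handler expecting (function, file, line, expression) leaves the fourth argument undefined, which is consistent with the strlen() crash inside __assert in the backtrace. The sketch below shows the matching four-argument shape only. `format_assert` and `CHECK_MSG` are hypothetical names, the handler writes into a buffer instead of aborting so it can be exercised safely, and the message layout is modeled on the libc one from memory rather than copied from it.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a four-argument assert handler, using the
 * argument order shown for lib/libc/gen/assert.c in this thread:
 * function, file, line, failed expression. Returns snprintf's result.
 */
int
format_assert(char *buf, size_t len, const char *func, const char *file,
    int line, const char *failedexpr)
{
    if (func == NULL)
        return snprintf(buf, len,
            "Assertion failed: (%s), file %s, line %d.",
            failedexpr, file, line);
    return snprintf(buf, len,
        "Assertion failed: (%s), function %s, file %s, line %d.",
        failedexpr, func, file, line);
}

/*
 * Caller-side macro that passes its arguments in the handler's order.
 * Evaluates to 0 when EX holds, otherwise to the formatted length.
 */
#define CHECK_MSG(buf, len, EX) \
    ((EX) ? 0 : format_assert((buf), (len), __func__, __FILE__, __LINE__, #EX))
```

The point of the design is that the macro and the handler agree on both the number and the order of arguments, which is what the patch below restores for libzpool.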

diff --git a/cddl/contrib/opensolaris/head/assert.h b/cddl/contrib/opensolaris/head/assert.h
index 394820a..c2a4936 100644
--- a/cddl/contrib/opensolaris/head/assert.h
+++ b/cddl/contrib/opensolaris/head/assert.h
@@ -37,15 +37,7 @@
 extern "C" {
 #endif
 
-#if defined(__STDC__)
-#if __STDC_VERSION__ - 0 >= 199901L
-extern void __assert(const char *, const char *, int);
-#else
-extern void __assert(const char *, const char *, int);
-#endif /* __STDC_VERSION__ - 0 >= 199901L */
-#else
-extern void _assert();
-#endif
+extern void __assert(const char *, const char *, int, const char *);
 
 #ifdef __cplusplus
 }
@@ -68,14 +60,6 @@ extern void _assert();
 
 #else
 
-#if defined(__STDC__)
-#if __STDC_VERSION__ - 0 >= 199901L
-#define assert(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#else
-#define assert(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#endif /* __STDC_VERSION__ - 0 >= 199901L */
-#else
-#define assert(EX) (void)((EX) || (_assert("EX", __FILE__, __LINE__), 0))
-#endif /* __STDC__ */
+#define assert(EX) (void)((EX) || (__assert(__func__, __FILE__, __LINE__, #EX), 0))
 
 #endif /* NDEBUG */
diff --git a/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h b/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h
index 7ae7f9d..631e302 100644
--- a/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h
+++ b/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h
@@ -120,21 +120,12 @@ extern void vpanic(const char *, __va_list);
 #define fm_panic        panic
 
 /* This definition is copied from assert.h. */
-#if defined(__STDC__)
-#if __STDC_VERSION__ - 0 >= 199901L
-#define verify(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#else
-#define verify(EX) (void)((EX) || (__assert(#EX, __FILE__, __LINE__), 0))
-#endif /* __STDC_VERSION__ - 0 >= 199901L */
-#else
-#define verify(EX) (void)((EX) || (_assert("EX", __FILE__, __LINE__), 0))
-#endif /* __STDC__ */
-
+#define verify(EX) (void)((EX) || (__assert(__func__, __FILE__, __LINE__, #EX), 0))
 
 #define VERIFY  verify
 #define ASSERT  assert
 
-extern void __assert(const char *, const char *, int);
+extern void __assert(const char *, const char *, int, const char *);
 
 #ifdef lint
 #define VERIFY3_IMPL(x, y, z, t)        if (x == z) ((void)0)
@@ -148,7 +139,7 @@ extern void __assert(const char *, const char *, int);
        (void) snprintf(__buf, 256, "%s %s %s (0x%llx %s 0x%llx)", \
                #LEFT, #OP, #RIGHT, \
                (u_longlong_t)__left, #OP, (u_longlong_t)__right); \
-       __assert(__buf, __FILE__, __LINE__); \
+       __assert(__func__, __FILE__, __LINE__, __buf); \
 } \
 _NOTE(CONSTCOND) } while (0)
 /* END CSTYLED */



Here is the new bt after the patch

[r...@avoriaz libzpool]# gdb zdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB

Re: ZFS MFC heads down

2009-05-31 Thread Henri Hennebert

Kip Macy wrote:

Please try applying this change to your tree and let me know.


I patched and rebooted twice without a problem. I will keep you posted if
I encounter a new crash.

Thanks

Henri


Thanks,
Kip

http://svn.freebsd.org/viewvc/base?view=revision&revision=193110


On Sat, May 30, 2009 at 2:11 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.


The MFC went in r192498. Please let me know if you have any problems.


I get a "Fatal trap 12: page fault while in kernel mode"
at shutdown. The core.txt is at http://verbier.restart.be/xfer/core.txt.61

Thanks for your work

Henri



Thanks,
Kip


Re: Problem with postfix and mail command

2009-05-30 Thread Henri Hennebert

Ruben Lara wrote:

Hi all!

I just installed postfix, after build world without sendmail

If i try to send mail i get:

mail# mail aaa
Subject: a
a
.
EOT
mail# mail: /usr/sbin/sendmail: No such file or directory


Even with WITHOUT_SENDMAIL=yes in /etc/src.conf, make installworld must 
create this symbolic link:


# ls -l /usr/sbin/sendmail
lrwxr-xr-x  1 root  wheel  21 May 21 13:54 /usr/sbin/sendmail -> /usr/sbin/mailwrapper
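
The reason the link matters: mailwrapper looks up the name it was invoked as in /etc/mail/mailer.conf and runs the program listed there, so without /usr/sbin/sendmail pointing at it nothing ever reads the Postfix mapping. As a minimal sketch of that lookup only (the function name `mailer_lookup` and the buffer sizes are hypothetical, and real mailer.conf lines may carry extra arguments that this ignores):

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one mailer.conf-style line of the form "name<whitespace>path".
 * Returns 1 and copies the path into `path` when the line maps `name`,
 * otherwise returns 0. Lines starting with '#' are treated as comments.
 */
int
mailer_lookup(const char *line, const char *name, char *path, size_t len)
{
    char key[64], value[256];

    if (line[0] == '#')
        return 0;
    if (sscanf(line, "%63s %255s", key, value) != 2)
        return 0;
    if (strcmp(key, name) != 0)
        return 0;
    snprintf(path, len, "%s", value);
    return 1;
}
```

Called with the line that maps `sendmail` and the name "sendmail", it hands back /usr/local/sbin/sendmail, which is the binary a wrapper would then execute.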


Henri



I edited:
mail# cat /etc/mail/mailer.conf
#
# Execute the Postfix sendmail program, named /usr/local/sbin/sendmail
#
sendmail        /usr/local/sbin/sendmail
send-mail       /usr/local/sbin/sendmail
mailq           /usr/local/sbin/sendmail
newaliases      /usr/local/sbin/sendmail
mail# 


which is where my postfix executables actually are

Thanks for help in advance
Rubén Lara



Re: ZFS MFC heads down

2009-05-30 Thread Henri Hennebert

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.



The MFC went in r192498. Please let me know if you have any problems.


I get a "Fatal trap 12: page fault while in kernel mode"
at shutdown. The core.txt is at http://verbier.restart.be/xfer/core.txt.61

Thanks for your work

Henri



Thanks,
Kip


Re: ZFS MFC heads down

2009-05-28 Thread Henri Hennebert

Kip Macy wrote:

On Wed, May 27, 2009 at 11:04 AM, Artem Belevich fbsdl...@src.cx wrote:

I had the same problem on -current. Try attached patch. It may not
apply cleanly on -stable, but should be easy enough to make equivalent
changes on -stable.

--Artem



Adding to rw_init looks fine, but I'd rather find out why owner isn't
NULL when the calling convention expects it. Getting a backtrace from
where the assert is hit would be helpful.


-Kip



on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May 
25 12:06:07 CEST 2009 
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ  amd64


Is it useful ?

[r...@avoriaz ~]# gdb zdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...(no debugging 
symbols found)...

(gdb) r rpool
Starting program: /usr/sbin/zdb rpool
(no debugging symbols found)...(no debugging symbols found)...(no 
debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...[New LWP 100343]
(no debugging symbols found)...(no debugging symbols found)...(no 
debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...(no debugging symbols found)...(no debugging symbols 
found)...(no debugging symbols found)...(no debugging symbols 
found)...(no debugging symbols found)...[New Thread 0x8018020b0 (LWP 
100343)]

[New Thread 0x801802240 (LWP 100346)]
version=13
name='rpool'
state=0
txg=3467
pool_guid=536117255064806899
hostid=1133576597
hostname='unset'
vdev_tree
type='root'
id=0
guid=536117255064806899
children[0]
type='mirror'
id=0
guid=3124217685892976292
metaslab_array=23
metaslab_shift=30
ashift=9
asize=155741847552
is_log=0
children[0]
type='disk'
id=0
guid=11099413743436480159
path='/dev/ad4p2'
whole_disk=0
children[1]
type='disk'
id=1
guid=12724983687805955432
path='/dev/ad6p2'
whole_disk=0
[New Thread 0x8018023d0 (LWP 100347)]
[New Thread 0x801802560 (LWP 100354)]
[New Thread 0x8018026f0 (LWP 100355)]
[New Thread 0x801802880 (LWP 100356)]
[New Thread 0x801802a10 (LWP 100359)]
[New Thread 0x801802ba0 (LWP 100360)]
[New Thread 0x801802d30 (LWP 100368)]
[New Thread 0x801802ec0 (LWP 100369)]
[New Thread 0x801803050 (LWP 100370)]
[New Thread 0x8018031e0 (LWP 100371)]
[New Thread 0x801803370 (LWP 100372)]
[New Thread 0x801803500 (LWP 100373)]
[New Thread 0x801803690 (LWP 100374)]
[New Thread 0x801803820 (LWP 100375)]
[New Thread 0x8018039b0 (LWP 100376)]
[New Thread 0x801803b40 (LWP 100377)]
[New Thread 0x801803cd0 (LWP 100378)]
[New Thread 0x801803e60 (LWP 100379)]
[New Thread 0x801803ff0 (LWP 100380)]
[New Thread 0x801804180 (LWP 100381)]
[New Thread 0x801804310 (LWP 100382)]
[New Thread 0x8018044a0 (LWP 100383)]
[New Thread 0x801804630 (LWP 100384)]
[New Thread 0x8018047c0 (LWP 100385)]
[New Thread 0x801804950 (LWP 100386)]
[New Thread 0x801804ae0 (LWP 100387)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x8018020b0 (LWP 100343)]
0x0008012a6f22 in strlen () from /lib/libc.so.7
(gdb) bt
#0  0x0008012a6f22 in strlen () from /lib/libc.so.7
#1  0x0008012a0feb in open () from /lib/libc.so.7
#2  0x00080129ea59 in open () from /lib/libc.so.7
#3  0x0008012a1f2e in vfprintf () from /lib/libc.so.7
#4  0x000801291158 in fprintf () from /lib/libc.so.7
#5  0x000801290fb0 in __assert () from /lib/libc.so.7
#6  0x000800fef120 in zmutex_destroy () from /lib/libzpool.so.1
#7  0x00080102e1a0 in dsl_dataset_fast_stat () from /lib/libzpool.so.1
#8  0x000801045ffa in dbuf_find () from /lib/libzpool.so.1
#9  0x000801047bf3 in dmu_buf_rele () from /lib/libzpool.so.1
#10 0x000801027546 in dsl_pool_open () from /lib/libzpool.so.1
#11 0x00080101bcec in spa_create () from /lib/libzpool.so.1
#12 0x00080101c820 in spa_tryimport () from /lib/libzpool.so.1
#13 0x00408b41 in ?? ()
#14 0x004036de in ?? ()
#15 0x000800534000 in ?? ()
#16 0x in ?? ()
#17 0x0002 in ?? ()
#18 0x7fffed70 in ?? ()
#19 0x7fffed7e in ?? ()
#20 0x in ?? ()
#21 0x7fffed84 in ?? ()
#22 0x7fffed9a in ?? ()
#23 0x7fffeda5 in ?? ()
#24 0x7fffedbf in ?? ()
#25 0x7fffedea in ?? ()
#26 

Re: ZFS MFC heads down

2009-05-28 Thread Henri Hennebert



Andriy Gapon wrote:

on 28/05/2009 16:26 Henri Hennebert said the following:

(gdb) bt
#0  0x0008012a6f22 in strlen () from /lib/libc.so.7
#1  0x0008012a0feb in open () from /lib/libc.so.7
#2  0x00080129ea59 in open () from /lib/libc.so.7
#3  0x0008012a1f2e in vfprintf () from /lib/libc.so.7
#4  0x000801291158 in fprintf () from /lib/libc.so.7
#5  0x000801290fb0 in __assert () from /lib/libc.so.7


I find the above part interesting.
Could this be because of the following discrepancy:

1)
cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h:
extern void __assert(const char *, const char *, int);
2)
lib/libc/gen/assert.c:
void
__assert(func, file, line, failedexpr)
const char *func, *file;
int line;
const char *failedexpr;


#6  0x000800fef120 in zmutex_destroy () from /lib/libzpool.so.1
#7  0x00080102e1a0 in dsl_dataset_fast_stat () from /lib/libzpool.so.1
#8  0x000801045ffa in dbuf_find () from /lib/libzpool.so.1
#9  0x000801047bf3 in dmu_buf_rele () from /lib/libzpool.so.1
#10 0x000801027546 in dsl_pool_open () from /lib/libzpool.so.1
#11 0x00080101bcec in spa_create () from /lib/libzpool.so.1
#12 0x00080101c820 in spa_tryimport () from /lib/libzpool.so.1


But back to the problem - without an additional printf we still cannot tell what
the value of m_owner was, only that it was not null.
Probably it's better to build with debugging symbols and examine with gdb.
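
For context, the kind of check that appears to be firing can be modeled with a lock that records its owner and refuses to be destroyed while that field is non-NULL. This is a simplified, hypothetical model (`struct owned_lock` and its helpers are invented here and do no real locking); it is not the libzpool implementation. It does show why a lock that was never initialized, or never released, would trip such an assertion.

```c
#include <stddef.h>

/* Hypothetical lock that only tracks who holds it. */
struct owned_lock {
    const void *owner;   /* NULL while nobody holds the lock */
};

/* Initialization is what guarantees owner starts out as NULL. */
void
owned_lock_init(struct owned_lock *l)
{
    l->owner = NULL;
}

/* Take the lock for `who`. Returns 0 on success, -1 if held or who is NULL. */
int
owned_lock_enter(struct owned_lock *l, const void *who)
{
    if (who == NULL || l->owner != NULL)
        return -1;
    l->owner = who;
    return 0;
}

/* Release the lock. Returns 0 on success, -1 if `who` is not the owner. */
int
owned_lock_exit(struct owned_lock *l, const void *who)
{
    if (who == NULL || l->owner != who)
        return -1;
    l->owner = NULL;
    return 0;
}

/* 1 when destroying is safe (no owner recorded), 0 otherwise. */
int
owned_lock_can_destroy(const struct owned_lock *l)
{
    return l->owner == NULL;
}
```

If the init step is skipped, `owner` holds whatever was in memory, so the destroy-time check can fail even though no thread ever took the lock.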


First try:
[r...@avoriaz libzpool]# gdb zdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...(no debugging 
symbols found)...

(gdb) r pool1
Starting program: /usr/sbin/zdb pool1
(no debugging symbols found)...(no debugging symbols found)...(no 
debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...[New LWP 100299]
(no debugging symbols found)...(no debugging symbols found)...(no 
debugging symbols found)...(no debugging symbols found)...[New Thread 
0x8018020b0 (LWP 100299)]

[New Thread 0x801802240 (LWP 100354)]
version=13
name='pool1'
state=0
txg=4
pool_guid=9156958376606789
hostid=1133576597
hostname='unset'
vdev_tree
type='root'
id=0
guid=9156958376606789
children[0]
type='raidz'
id=0
guid=8214939615613279020
nparity=1
metaslab_array=23
metaslab_shift=32
ashift=9
asize=500108886016
is_log=0
children[0]
type='disk'
id=0
guid=7001907692988243779
path='/dev/ad8p2'
whole_disk=0
children[1]
type='disk'
id=1
guid=1909032920962573263
path='/dev/ad10p2'
whole_disk=0
[New Thread 0x8018023d0 (LWP 100369)]
[New Thread 0x801802560 (LWP 100370)]
[New Thread 0x8018026f0 (LWP 100371)]
[New Thread 0x801802880 (LWP 100372)]
[New Thread 0x801802a10 (LWP 100376)]
[New Thread 0x801802ba0 (LWP 100382)]
[New Thread 0x801802d30 (LWP 100383)]
[New Thread 0x801802ec0 (LWP 100384)]
[New Thread 0x801803050 (LWP 100385)]
[New Thread 0x8018031e0 (LWP 100386)]
[New Thread 0x801803370 (LWP 100387)]
[New Thread 0x801803500 (LWP 100388)]
[New Thread 0x801803690 (LWP 100389)]
[New Thread 0x801803820 (LWP 100390)]
[New Thread 0x8018039b0 (LWP 100391)]
[New Thread 0x801803b40 (LWP 100392)]
[New Thread 0x801803cd0 (LWP 100393)]
[New Thread 0x801803e60 (LWP 100394)]
[New Thread 0x801803ff0 (LWP 100395)]
[New Thread 0x801804180 (LWP 100396)]
[New Thread 0x801804310 (LWP 100397)]
[New Thread 0x8018044a0 (LWP 100398)]
[New Thread 0x801804630 (LWP 100399)]
[New Thread 0x8018047c0 (LWP 100400)]
[New Thread 0x801804950 (LWP 100401)]
[New Thread 0x801804ae0 (LWP 100402)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x8018020b0 (LWP 100299)]
0x0008012a6f22 in strlen () from /lib/libc.so.7
(gdb) bt
#0  0x0008012a6f22 in strlen () from /lib/libc.so.7
#1  0x0008012a0feb in open () from /lib/libc.so.7
#2  0x00080129ea59 in open () from /lib/libc.so.7
#3  0x0008012a1f2e in vfprintf () from /lib/libc.so.7
#4  0x000801291158 in fprintf () from /lib/libc.so.7
#5  0x000801290fb0 in __assert () from /lib/libc.so.7
#6  0x000800fef230 in zmutex_destroy (mp=0x8018b2cc0)
at 
/usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c:112
#7  0x00080102e2b0

Re: ZFS MFC heads down

2009-05-28 Thread Henri Hennebert

Henri Hennebert wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.



The MFC went in r192498. Please let me know if you have any problems.


--- clipped ---


By the way, to help prepare a boot/root pool, is there a utility to display
the contents of zpool.cache?


I found the answer to this question and think it may be really useful to
others:


zdb -C [ -U path to zpool.cache ]

Henri




Henri


Thanks,
Kip
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org




Re: ZFS MFC heads down

2009-05-27 Thread Henri Hennebert

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.



The MFC went in r192498. Please let me know if you have any problems.


Not a real problem, but maybe worth mentioning:

on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 
26 15:37:48 CEST 2009 
r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE  i386


[r...@morzine ~]# zdb rpool
version=13
name='rpool'
state=0
txg=959
pool_guid=17669857244588609348
hostid=2315842372
hostname='unset'
vdev_tree
type='root'
id=0
guid=17669857244588609348
children[0]
type='mirror'
id=0
guid=3225603179255348056
metaslab_array=23
metaslab_shift=28
ashift=9
asize=51534888960
is_log=0
children[0]
type='disk'
id=0
guid=17573085726489368265
path='/dev/da0p2'
whole_disk=0
children[1]
type='disk'
id=1
guid=2736169600077218893
path='/dev/da1p2'
whole_disk=0
Assertion failed: (?Àuè?ëۍ´), function mp->m_owner == NULL, file
/usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c,
line 112.

Abort trap: 6


and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon 
May 25 12:06:07 CEST 2009 
r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ  amd64


[r...@avoriaz ~]# zdb rpool
version=13
name='rpool'
state=0
txg=3467
pool_guid=536117255064806899
hostid=1133576597
hostname='unset'
vdev_tree
type='root'
id=0
guid=536117255064806899
children[0]
type='mirror'
id=0
guid=3124217685892976292
metaslab_array=23
metaslab_shift=30
ashift=9
asize=155741847552
is_log=0
children[0]
type='disk'
id=0
guid=11099413743436480159
path='/dev/ad4p2'
whole_disk=0
children[1]
type='disk'
id=1
guid=12724983687805955432
path='/dev/ad6p2'
whole_disk=0
Segmentation fault: 11
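
For readers unfamiliar with the zdb fields above, here is a small sketch of how the power-of-two values can be read. It assumes that ashift and metaslab_shift are base-2 logarithms of the sector size and the metaslab size, and that the metaslab count is simply asize divided by the metaslab size, rounded down. The function name vdev_geometry is hypothetical and exists only for this illustration.

```python
def vdev_geometry(ashift, metaslab_shift, asize):
    """Derive sizes from the power-of-two shifts printed by zdb.

    Assumption: the metaslab count is asize >> metaslab_shift (rounded down).
    """
    return {
        "sector_bytes": 1 << ashift,
        "metaslab_bytes": 1 << metaslab_shift,
        "metaslab_count": asize >> metaslab_shift,
    }


# Values copied from the two zdb dumps above.
morzine = vdev_geometry(ashift=9, metaslab_shift=28, asize=51534888960)
avoriaz = vdev_geometry(ashift=9, metaslab_shift=30, asize=155741847552)

print("morzine:", morzine)
print("avoriaz:", avoriaz)
```

Under those assumptions both mirrors use 512-byte sectors; morzine has 256 MiB metaslabs (191 of them) and avoriaz has 1 GiB metaslabs (145 of them).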

By the way, to help prepare a boot/root pool, is there a utility to display
the contents of zpool.cache?



Henri


Thanks,
Kip


Re: ZFS MFC heads down

2009-05-27 Thread Henri Hennebert

Artem Belevich wrote:

I had the same problem on -current. Try attached patch. It may not
apply cleanly on -stable, but should be easy enough to make equivalent
changes on -stable.


The patch is ok for stable.

Now I get this for the pool with my root:

[r...@morzine libzpool]# zdb rpool
version=13
name='rpool'
state=0
txg=959
pool_guid=17669857244588609348
hostid=2315842372
hostname='unset'
vdev_tree
type='root'
id=0
guid=17669857244588609348
children[0]
type='mirror'
id=0
guid=3225603179255348056
metaslab_array=23
metaslab_shift=28
ashift=9
asize=51534888960
is_log=0
children[0]
type='disk'
id=0
guid=17573085726489368265
path='/dev/da0p2'
whole_disk=0
children[1]
type='disk'
id=1
guid=2736169600077218893
path='/dev/da1p2'
whole_disk=0
WARNING: pool 'rpool' could not be loaded as it was last accessed by 
another system (host: unset hostid: 0x8a08f344). See: 
http://www.sun.com/msg/ZFS-8000-EY

zdb: can't open rpool: No such file or directory

But rpool has been used for many boots now - strange ...
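
A quick cross-check of the numbers, as a sketch only: the hostid line in the zdb output is printed in decimal, while the warning prints a hostid in hexadecimal. Converting one notation to the other shows that both refer to the same value, so the "other system" named in the warning is the host recorded in the pool itself. What hostid zdb believed it was running under is not shown in the output, so that part stays an open question.

```python
# Values copied from the zdb output and the warning above.
label_hostid = 2315842372        # "hostid=" line, printed in decimal
warning_hostid = "0x8a08f344"    # "hostid:" in the warning, printed in hex

# Convert the decimal value to the hex notation used by the warning.
label_hostid_hex = hex(label_hostid)

# Compare numerically so that letter case and padding do not matter.
same_host = int(warning_hostid, 16) == label_hostid

print(label_hostid_hex, same_host)
```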

Thanks for your patch and time

Henri




--Artem



On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.


The MFC went in r192498. Please let me know if you have any problems.

Not a real problem, but maybe worth mentioning:

on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26
15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE
 i386

[r...@morzine ~]# zdb rpool
   version=13
   name='rpool'
   state=0
   txg=959
   pool_guid=17669857244588609348
   hostid=2315842372
   hostname='unset'
   vdev_tree
   type='root'
   id=0
   guid=17669857244588609348
   children[0]
   type='mirror'
   id=0
   guid=3225603179255348056
   metaslab_array=23
   metaslab_shift=28
   ashift=9
   asize=51534888960
   is_log=0
   children[0]
   type='disk'
   id=0
   guid=17573085726489368265
   path='/dev/da0p2'
   whole_disk=0
   children[1]
   type='disk'
   id=1
   guid=2736169600077218893
   path='/dev/da1p2'
   whole_disk=0
Assertion failed: (?Ąuč? ėŪ¨´), function mp->m_owner == NULL, file
/usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c,
line 112.
Abort trap: 6


and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May
25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ
 amd64

[r...@avoriaz ~]# zdb rpool
   version=13
   name='rpool'
   state=0
   txg=3467
   pool_guid=536117255064806899
   hostid=1133576597
   hostname='unset'
   vdev_tree
   type='root'
   id=0
   guid=536117255064806899
   children[0]
   type='mirror'
   id=0
   guid=3124217685892976292
   metaslab_array=23
   metaslab_shift=30
   ashift=9
   asize=155741847552
   is_log=0
   children[0]
   type='disk'
   id=0
   guid=11099413743436480159
   path='/dev/ad4p2'
   whole_disk=0
   children[1]
   type='disk'
   id=1
   guid=12724983687805955432
   path='/dev/ad6p2'
   whole_disk=0
Segmentation fault: 11

By the way, to help prepare a boot/root pool does a utility to display the
content of zpool.cache exist ?


Henri

Thanks,
Kip

Re: ZFS MFC heads down

2009-05-27 Thread Henri Hennebert

Henri Hennebert wrote:

Artem Belevich wrote:

I had the same problem on -current. Try attached patch. It may not
apply cleanly on -stable, but should be easy enough to make equivalent
changes on -stable.


The patch is ok for stable.

now I get for the pool with my root:

[r...@morzine libzpool]# zdb rpool
version=13
name='rpool'
state=0
txg=959
pool_guid=17669857244588609348
hostid=2315842372
hostname='unset'
vdev_tree
type='root'
id=0
guid=17669857244588609348
children[0]
type='mirror'
id=0
guid=3225603179255348056
metaslab_array=23
metaslab_shift=28
ashift=9
asize=51534888960
is_log=0
children[0]
type='disk'
id=0
guid=17573085726489368265
path='/dev/da0p2'
whole_disk=0
children[1]
type='disk'
id=1
guid=2736169600077218893
path='/dev/da1p2'
whole_disk=0
WARNING: pool 'rpool' could not be loaded as it was last accessed by 
another system (host: unset hostid: 0x8a08f344). See: 
http://www.sun.com/msg/ZFS-8000-EY

zdb: can't open rpool: No such file or directory

But rpool has been used for many boots now - strange ...


And dangerous:

The second time I tried:

[r...@morzine ~]# zdb rpool
zdb: can't open rpool: No such file or directory
[r...@morzine ~]#

And the real problem: rpool is no longer in /boot/zfs/zpool.cache !!!

Next boot will not work smoothly.

Tomorrow, I will use the 3rd bootable disk to rebuild this.

Henri


Thanks for your patch and time

Henri




--Artem



On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.


The MFC went in r192498. Please let me know if you have any problems.

Not a real problem, but maybe worth mentioning:

on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26
15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE
 i386

[r...@morzine ~]# zdb rpool
   version=13
   name='rpool'
   state=0
   txg=959
   pool_guid=17669857244588609348
   hostid=2315842372
   hostname='unset'
   vdev_tree
   type='root'
   id=0
   guid=17669857244588609348
   children[0]
   type='mirror'
   id=0
   guid=3225603179255348056
   metaslab_array=23
   metaslab_shift=28
   ashift=9
   asize=51534888960
   is_log=0
   children[0]
   type='disk'
   id=0
   guid=17573085726489368265
   path='/dev/da0p2'
   whole_disk=0
   children[1]
   type='disk'
   id=1
   guid=2736169600077218893
   path='/dev/da1p2'
   whole_disk=0
Assertion failed: (?Ąuč? ėŪ¨´), function mp->m_owner == NULL, file
/usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c,
line 112.
Abort trap: 6


and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May
25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ
 amd64

[r...@avoriaz ~]# zdb rpool
   version=13
   name='rpool'
   state=0
   txg=3467
   pool_guid=536117255064806899
   hostid=1133576597
   hostname='unset'
   vdev_tree
   type='root'
   id=0
   guid=536117255064806899
   children[0]
   type='mirror'
   id=0
   guid=3124217685892976292
   metaslab_array=23
   metaslab_shift=30
   ashift=9
   asize=155741847552
   is_log=0
   children[0]
   type='disk'
   id=0
   guid=11099413743436480159
   path='/dev/ad4p2'
   whole_disk=0
   children[1]
   type='disk'
   id=1
   guid=12724983687805955432
   path='/dev/ad6p2'
   whole_disk=0
Segmentation fault: 11

By the way, to help prepare a boot/root pool does a utility to display the
content of zpool.cache exist

Re: ZFS MFC heads down

2009-05-27 Thread Henri Hennebert

Artem Belevich wrote:

Did you by any chance do that from single-user mode? ZFS seems to rely
on hostid being set.
Try running /etc/rc.d/hostid start and then re-try your zfs commands.


I was in multi-user mode with hostid set.

Henri


--Artem



On Wed, May 27, 2009 at 1:06 PM, Henri Hennebert h...@restart.be wrote:

Artem Belevich wrote:

I had the same problem on -current. Try attached patch. It may not
apply cleanly on -stable, but should be easy enough to make equivalent
changes on -stable.

The patch is ok for stable.

now I get for the pool with my root:

[r...@morzine libzpool]# zdb rpool
   version=13
   name='rpool'
   state=0
   txg=959
   pool_guid=17669857244588609348
   hostid=2315842372
   hostname='unset'
   vdev_tree
   type='root'
   id=0
   guid=17669857244588609348
   children[0]
   type='mirror'
   id=0
   guid=3225603179255348056
   metaslab_array=23
   metaslab_shift=28
   ashift=9
   asize=51534888960
   is_log=0
   children[0]
   type='disk'
   id=0
   guid=17573085726489368265
   path='/dev/da0p2'
   whole_disk=0
   children[1]
   type='disk'
   id=1
   guid=2736169600077218893
   path='/dev/da1p2'
   whole_disk=0
WARNING: pool 'rpool' could not be loaded as it was last accessed by another
system (host: unset hostid: 0x8a08f344). See:
http://www.sun.com/msg/ZFS-8000-EY
zdb: can't open rpool: No such file or directory

But rpool has been used for many boots now - strange ...

Thanks for your patch and time

Henri



--Artem



On Wed, May 27, 2009 at 3:00 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.

The MFC went in r192498. Please let me know if you have any problems.

Not a real problem, but maybe worth mentioning:

on FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Tue May 26
15:37:48 CEST 2009 r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE
 i386

[r...@morzine ~]# zdb rpool
  version=13
  name='rpool'
  state=0
  txg=959
  pool_guid=17669857244588609348
  hostid=2315842372
  hostname='unset'
  vdev_tree
  type='root'
  id=0
  guid=17669857244588609348
  children[0]
  type='mirror'
  id=0
  guid=3225603179255348056
  metaslab_array=23
  metaslab_shift=28
  ashift=9
  asize=51534888960
  is_log=0
  children[0]
  type='disk'
  id=0
  guid=17573085726489368265
  path='/dev/da0p2'
  whole_disk=0
  children[1]
  type='disk'
  id=1
  guid=2736169600077218893
  path='/dev/da1p2'
  whole_disk=0
Assertion failed: (?Ąuč? ėŪ¨´), function mp->m_owner == NULL, file
/usr/src/cddl/lib/libzpool/../../../cddl/contrib/opensolaris/lib/libzpool/common/kernel.c,
line 112.
Abort trap: 6


and on FreeBSD avoriaz.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon May
25 12:06:07 CEST 2009 r...@avoriaz.restart.bel:/usr/obj/usr/src/sys/AVORIAZ
 amd64

[r...@avoriaz ~]# zdb rpool
  version=13
  name='rpool'
  state=0
  txg=3467
  pool_guid=536117255064806899
  hostid=1133576597
  hostname='unset'
  vdev_tree
  type='root'
  id=0
  guid=536117255064806899
  children[0]
  type='mirror'
  id=0
  guid=3124217685892976292
  metaslab_array=23
  metaslab_shift=30
  ashift=9
  asize=155741847552
  is_log=0
  children[0]
  type='disk'
  id=0
  guid=11099413743436480159
  path='/dev/ad4p2'
  whole_disk=0
  children[1]
  type='disk'
  id=1
  guid=12724983687805955432
  path='/dev/ad6p2'
  whole_disk=0
Segmentation fault: 11

By the way, to help prepare a boot/root pool does a utility to display the
content of zpool.cache exist ?


Henri

Thanks,
Kip

Re: ZFS MFC heads down

2009-05-26 Thread Henri Hennebert

Kip Macy wrote:

I haven't looked at the panic yet, but adding a USB quirk (no
SYNCHRONIZE_CACHE) would certainly reduce the noise in your logs.


Thanks for this hint.

I patched usbdevs and umass.c. No more noise and, more interestingly, I can
now complete the install on my USB key without deadlock or crash.

Henri


-Kip

On Mon, May 25, 2009 at 4:16 AM, Henri Hennebert h...@restart.be wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.


The MFC went in r192498. Please let me know if you have any problems.


I get a panic:

panic: solaris assert: 0 == dmu_read(os, lr->lr_foid, off, dlen, buf), file:
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c,
line: 991

during `make -s DESTDIR=/kingston installworld`

kingston is a pool on a USB stick with GPT partitions

more info at : http://verbier.restart.be/xfer/core.txt.60

Thanks for your work

Henri


Thanks,
Kip


Re: ZFS MFC heads down

2009-05-25 Thread Henri Hennebert

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.



The MFC went in r192498. Please let me know if you have any problems.


I get a panic:

panic: solaris assert: 0 == dmu_read(os, lr->lr_foid, off, dlen, buf), file:
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c,
line: 991


during `make -s DESTDIR=/kingston installworld`

kingston is a pool on a USB stick with GPT partitions

more info at : http://verbier.restart.be/xfer/core.txt.60

Thanks for your work

Henri


Thanks,
Kip


Re: ZFS MFC heads down

2009-05-21 Thread Henri Hennebert

Navdeep Parhar wrote:

On Wed, May 20, 2009 at 5:00 PM, Kip Macy km...@freebsd.org wrote:

Not really a problem but a question:  Is the v13 on-disk format
exactly the same as that used by Solaris/Opensolaris?

It is supposed to be. The sources are the same. However, I have not
tested interoperability.



Does this make
it possible to have a ZFS-only dual boot system running FreeBSD-stable
and Solaris, with a shared home directory between the two
environments?

It should be.


Has anyone tried anything like this?


Google anyone? :-)


My google-fu is weak today, and considering that this went into
-stable a few minutes back, I didn't look that hard for
v13/fbsd-stable/opensolaris adventures. :-)


I do this with 7.1 and OpenSolaris 2008.05 without problems. I keep the
pool at version 6, of course.


Henri


I'm feeling brave.  I think I'll try it myself.  Thanks for getting
this into -stable!

Navdeep


-Kip




Re: ZFS MFC heads down

2009-05-21 Thread Henri Hennebert

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.



The MFC went in r192498. Please let me know if you have any problems.


I upgraded to stable r192523:

FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Thu May 21 
13:18:53 CEST 2009 
r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE  i386


some strange things:

just after boot:

[r...@morzine ~]# zfs upgrade
This system is currently running ZFS filesystem version 3.

The following filesystems are out of date, and can be upgraded.  After being
upgraded, these filesystems (and any 'zfs send' streams generated from
subsequent snapshots) will no longer be accessible by older software versions.



VER  FILESYSTEM
---  
 1   pool1
 1   pool1/qemu
 1   pool1/squid
 1   pool2
 1   pool2/WorkBench
 1   pool2/backup
 1   pool2/download
 1   pool2/qemu
 1   pool2/sys
 1   rpool
 1   rpool/home
 1   rpool/root
 1   rpool/tmp
 1   rpool/usr
 1   rpool/var
 1   rpool/var/spool
[r...@morzine ~]# zfs upgrade -v
The following filesystem versions are supported:

VER  DESCRIPTION
---  
 1   Initial ZFS filesystem version
 2   Enhanced directory entries
 3   Case insensitive and File system unique identifer (FUID)

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/zpl/N

Where 'N' is the version number.


And now, after a few minutes:

[r...@morzine ~]# zpool upgrade
This system is currently running ZFS pool version 13.

The following pools are out of date, and can be upgraded.  After being
upgraded, these pools will no longer be accessible by older software versions.


VER  POOL
---  
 6   pool1
 6   pool2
 6   rpool

Use 'zpool upgrade -v' for a list of available versions and their associated
features.
[r...@morzine ~]# zpool upgrade -v
This system is currently running ZFS pool version 13.

The following versions are supported:

VER  DESCRIPTION
---  
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property
For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.

Strange, isn't it? o-)

By the way all seems ok!
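
To make the gap between the pools (version 6) and the running system (version 13) concrete, here is a minimal sketch that lists what an upgrade would add. The table is copied from the 'zpool upgrade -v' output above; the helper name features_gained is hypothetical and is not part of any ZFS tool.

```python
# Pool version table, as listed by 'zpool upgrade -v' above.
POOL_VERSIONS = {
    1: "Initial ZFS version",
    2: "Ditto blocks (replicated metadata)",
    3: "Hot spares and double parity RAID-Z",
    4: "zpool history",
    5: "Compression using the gzip algorithm",
    6: "bootfs pool property",
    7: "Separate intent log devices",
    8: "Delegated administration",
    9: "refquota and refreservation properties",
    10: "Cache devices",
    11: "Improved scrub performance",
    12: "Snapshot properties",
    13: "snapused property",
}


def features_gained(current, target):
    """Return the descriptions of every version after `current` up to `target`."""
    return [POOL_VERSIONS[v] for v in range(current + 1, target + 1)]


# pool1, pool2 and rpool are at version 6; the system supports version 13.
gained = features_gained(6, 13)
for description in gained:
    print(description)
```

As the quoted announcement says, a pool that has been upgraded can no longer be used with older sources, so leaving the pools at version 6 keeps them usable there.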

Thanks to all for this update to zfs V13

Henri


Thanks,
Kip


Re: ZFS MFC heads down

2009-05-21 Thread Henri Hennebert

Henri Hennebert wrote:

Kip Macy wrote:

On Wed, May 20, 2009 at 2:59 PM, Kip Macy km...@freebsd.org wrote:

I will be MFC'ing the newer ZFS support some time this afternoon. Both
world and kernel will need to be re-built. Existing pools will
continue to work without upgrade.


If you choose to upgrade a pool to take advantage of new features you
will no longer be able to use it with sources prior to today. 'zfs
send/recv' is not expected to inter-operate between different pool
versions.



The MFC went in r192498. Please let me know if you have any problems.


I upgraded to stable r192523:

FreeBSD morzine.restart.bel 7.2-STABLE FreeBSD 7.2-STABLE #0: Thu May 21 
13:18:53 CEST 2009 
r...@morzine.restart.bel:/usr/obj/usr/src/sys/MORZINE  i386


some strange things:

just after boot:

[r...@morzine ~]# zfs upgrade
This system is currently running ZFS filesystem version 3.

The following filesystems are out of date, and can be upgraded.  After being
upgraded, these filesystems (and any 'zfs send' streams generated from
subsequent snapshots) will no longer be accessible by older software versions.



VER  FILESYSTEM
---  
 1   pool1
 1   pool1/qemu
 1   pool1/squid
 1   pool2
 1   pool2/WorkBench
 1   pool2/backup
 1   pool2/download
 1   pool2/qemu
 1   pool2/sys
 1   rpool
 1   rpool/home
 1   rpool/root
 1   rpool/tmp
 1   rpool/usr
 1   rpool/var
 1   rpool/var/spool
[r...@morzine ~]# zfs upgrade -v
The following filesystem versions are supported:

VER  DESCRIPTION
---  
 1   Initial ZFS filesystem version
 2   Enhanced directory entries
 3   Case insensitive and File system unique identifer (FUID)

For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/zpl/N

Where 'N' is the version number.


And now, after a few minutes:

[r...@morzine ~]# zpool upgrade
This system is currently running ZFS pool version 13.

The following pools are out of date, and can be upgraded.  After being
upgraded, these pools will no longer be accessible by older software versions.


VER  POOL
---  
 6   pool1
 6   pool2
 6   rpool

Use 'zpool upgrade -v' for a list of available versions and their associated
features.
[r...@morzine ~]# zpool upgrade -v
This system is currently running ZFS pool version 13.

The following versions are supported:

VER  DESCRIPTION
---  
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property
For more information on a particular version, including supported releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.

Strange isn't it o-)

By the way all seems ok!


This happened after the first boot into -stable (coming from 7.2-RELEASE).

I rebooted and can't reproduce it!

Henri


Thanks to all for this update to zfs V13

Henri


Thanks,
Kip


7.2-RC1 - serial console / sio0 not working

2009-04-20 Thread Henri Hennebert

Hello,

Experiencing some deadlocks, I tried to re-enable my serial console on
7.2-RC1 (console=comconsole,vidconsole in /boot/loader.conf and
-Dh or -Dh -S115200 in /boot.config).

/var/log/messages shows 'sio0: type 16550A, console' and, from the VGA
point of view, console output from the kernel is slow, as if echoed on a
serial line, and rc output is going somewhere.

At the other end of the serial line, minicom shows nothing and is 'offline'.
A break sent from minicom drops my 7.2-RC1 box into the debugger (ddb), but
'continue' has no effect.

The cable works fine (serial console mode) with another box running
8.0-CURRENT.

If I disable the serial console and try minicom on 7.2-RC1, the status is
offline, but any key is received at the other end and any key typed at the
other end is displayed fine.

Has anyone encountered such a problem?

Thanks in advance

henri


7.2-RC1 - serial console / sio0 not working

2009-04-20 Thread Henri Hennebert

Sorry for the previous wrong followup :-(

Hello,

Experiencing some deadlocks, I tried to re-enable my serial console on
7.2-RC1 (console=comconsole,vidconsole in /boot/loader.conf and
-Dh or -Dh -S115200 in /boot.config).

/var/log/messages shows 'sio0: type 16550A, console' and, from the VGA
point of view, console output from the kernel is slow, as if echoed on a
serial line, and rc output is going somewhere.

At the other end of the serial line, minicom shows nothing and is 'offline'.
A break sent from minicom drops my 7.2-RC1 box into the debugger (ddb), but
'continue' has no effect.

The cable works fine (serial console mode) with another box running
8.0-CURRENT.

If I disable the serial console and try minicom on 7.2-RC1, the status is
offline, but any key is received at the other end and any key typed at the
other end is displayed fine.

Has anyone encountered such a problem?

Thanks in advance

henri


Re: 7.2-RC1 - serial console / sio0 not working

2009-04-20 Thread Henri Hennebert

Marten Vijn wrote:

On Mon, 2009-04-20 at 16:49 +0200, Henri Hennebert wrote:

Hello,

Experiencing some deadlocks, I tried to re-enable my serial console on
7.2-RC1 (console=comconsole,vidconsole in /boot/loader.conf and
-Dh or -Dh -S115200 in /boot.config).

/var/log/messages shows 'sio0: type 16550A, console' and, from the VGA
point of view, console output from the kernel is slow, as if echoed on a
serial line, and rc output is going somewhere.

At the other end of the serial line, minicom shows nothing and is 'offline'.
A break sent from minicom drops my 7.2-RC1 box into the debugger (ddb), but
'continue' has no effect.

The cable works fine (serial console mode) with another box running
8.0-CURRENT.

If I disable the serial console and try minicom on 7.2-RC1, the status is
offline, but any key is received at the other end and any key typed at the
other end is displayed fine.

Has anyone encountered such a problem?


Maybe diff /etc/ttys between 8.0 and 7.2?


I don't use the serial line for login, so I believe it is not relevant in my
case.


Thank you for your time

Henri


I had problems upgrading a machine (over serial console)
lately (7.1 to -CURRENT).


Marten


Thanks in advance

henri


Re: 6.4-STABLE and PHP5 pcre and phpsysinfo

2009-04-19 Thread Henri Hennebert

xer wrote:

Hello
My 6.4-STABLE box has a strange problem today with phpsysinfo, which I use.

Ports are updated, but phpsysinfo (in the browser) now shows errors about pcre:


---
Notice:  Undefined offset:  3 in 
/usr/local/www/data-dist/phpsysinfo/includes/os/class.FreeBSD.inc.php on line 59


^ a lots

Warning:  preg_match() [function.preg-match]: Internal pcre_fullinfo() error -3 
in /usr/local/www/data-dist/phpsysinfo/includes/os/class.BSD.common.inc.php on 
line 126


^ a lots

Warning:  asort() expects parameter 1 to be array, boolean given in 
/usr/local/www/data-dist/phpsysinfo/includes/os/class.BSD.common.inc.php on 
line 174


^ a lots


Warning:  preg_match() [function.preg-match]: Internal pcre_fullinfo() error -3 
in /usr/local/www/data-dist/phpsysinfo/includes/os/class.BSD.common.inc.php on 
line 187


^ lots of these

XPath error in XPath.class.php:3492 Expression failed to parse as PrimaryExpr 
because: Expression is not a PrimaryExpr
XPath error in XPath.class.php:5903 The supplied xPath
'/phpsysinfo/Vitals/Distro' does not *uniquely* describe a node in the
xml document.Not unique xpath-query, matched 0-times.


and more...

It seems that the FreeBSD patch does not work so well. Does anyone use phpsysinfo?

I deinstalled php5 and the 1.3 extension and reinstalled them as expected, but 
that did not resolve it.


Contrary to /usr/ports/UPDATING entry 20081211, base php5 (5.2.9) 
does not contain pcre. You simply have to add /usr/ports/devel/php5-pcre. 
All will be OK.
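
Whether the pcre extension is actually loaded can be checked from the command line before reloading phpsysinfo. A small sketch: `php -m` prints the loaded modules one per line, while the helper name has_module is made up for illustration.

```shell
#!/bin/sh
# Succeed if the extension named in $1 appears in a module list on stdin
# (one module name per line, as printed by `php -m`). Case-insensitive.
has_module() {
    grep -qixF -- "$1"
}

# Typical use on the affected machine (not run here):
#   php -m | has_module pcre || echo "pcre missing: install devel/php5-pcre"
```

If the check still fails after installing the port, the web server probably needs a restart to pick up the new extension.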


Henri


Any help please?
Thanx in advance.





hald and GEOM_PART_BSD + GEOM_PART_MBR

2009-01-21 Thread Henri Hennebert

Just for the record,

I added options GEOM_PART_BSD and GEOM_PART_MBR to my kernel config (I 
wanted to see what gpart was saying about my disks).


At boot time I get messages such as:

GEOM: ad4s2: geometry does not match label.
GEOM: ad4s2: media size does not match label.

and, more importantly, hald eats up CPU time and can't answer lshal.

Removing those options resolves the problem.

Henri

PS - I'm ready to test some patches


Re: ZFS performance issues (solved?!)

2008-09-01 Thread Henri Hennebert

John Birrell wrote:

For those people experiencing a performance degradation since the DTrace import,
please update your copy of 
src/sys/cddl/compat/opensolaris/kern/opensolaris_kmem.c
by either cvsup or direct edit to remove #define KMEM_DEBUG.

You only need to rebuild the opensolaris kernel module after this change. The
code is shared between ZFS and DTrace via the opensolaris kernel module.
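
For the direct-edit route, the change amounts to dropping one preprocessor line. A hedged sketch, assuming the macro sits on a line of its own in opensolaris_kmem.c (the exact form of that line is not shown in this thread); the function name is illustrative:

```shell
#!/bin/sh
# Print the file named in $1 with any line that only defines KMEM_DEBUG removed.
# Assumption: the definition reads "#define KMEM_DEBUG" with nothing after it.
strip_kmem_debug() {
    sed '/^#define[[:space:]][[:space:]]*KMEM_DEBUG[[:space:]]*$/d' "$1"
}

# Example (writes a patched copy next to the original for review):
#   f=src/sys/cddl/compat/opensolaris/kern/opensolaris_kmem.c
#   strip_kmem_debug "$f" > "$f.new"; diff -u "$f" "$f.new"
```

Writing to a separate copy first keeps the original intact until the diff has been checked.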

This is also the reason why you found it necessary to add KDB, DDB and STACK to
your kernel. After removing KMEM_DEBUG, you won't need those.

Please confirm that this solves the problem you have been seeing.


Great, now everything is back to normal

Thanks

Henri


--
John Birrell


Re: Possible ZFS patch, please test!

2008-08-31 Thread Henri Hennebert

Jeremy Chadwick wrote:

On Sat, Aug 30, 2008 at 09:28:36PM +0200, Henri Hennebert wrote:

John Baldwin wrote:
This patch merges a few changes from HEAD back to 7.x.  I think the 
endian changes specifically might solve the issue people saw with 
zpools created with non-dtrace kernels not being readable by dtrace 
kernels and vice versa.


http://www.FreeBSD.org/~jhb/patches/zfs_7.patch


Just a follow-up

I cvsup at Sat Aug 30 12:55 without zfs_7.patch and

make buildworld && make buildkernel && make installkernel

reboot (-s) --root on zfs is ok -- make installworld

reboot

System is still sluggish even during the make installworld in single user.


Sorry if I've missed this, but what tuning have you done for ZFS?  Some
of us (most of us?) have seen fairly sluggish performance when
prefetch is enabled (the default), while the system is generally more
responsive when prefetch is disabled.


prefetch is enabled:

vfs.zfs.arc_min: 33554432
vfs.zfs.arc_max: 268435456
vfs.zfs.mdcomp_disable: 0
vfs.zfs.prefetch_disable: 0
vfs.zfs.zio.taskq_threads: 0
vfs.zfs.recover: 0
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.debug: 0
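
The byte counts in a listing like this are easier to read in MiB. A minimal sketch that summarises `name: value` lines of the kind `sysctl vfs.zfs` prints; the function name zfs_summary is invented for illustration, and only the three tunables shown are handled:

```shell
#!/bin/sh
# Summarise "name: value" sysctl lines read from stdin.
zfs_summary() {
    awk -F': ' '
        $1 == "vfs.zfs.arc_min" || $1 == "vfs.zfs.arc_max" {
            printf "%s = %d MiB\n", $1, $2 / 1048576
        }
        $1 == "vfs.zfs.prefetch_disable" {
            printf "prefetch %s\n", ($2 == 0 ? "enabled" : "disabled")
        }'
}

# Demo with the values quoted in this message.
zfs_summary <<'EOF'
vfs.zfs.arc_min: 33554432
vfs.zfs.arc_max: 268435456
vfs.zfs.prefetch_disable: 0
EOF
```

For the values above this reports an ARC between 32 MiB and 256 MiB with prefetch enabled. To test with prefetch off, setting vfs.zfs.prefetch_disable="1" in /boot/loader.conf and rebooting is the usual route.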



This may have nothing to do with the problem you've stated, but I
thought I'd throw it out there.


I think that the problem is somewhere else, because 80% system CPU on 
a dual core seems awfully bad - e.g. more than 20 seconds to open this 
response after clicking on the response button.


Henri






Re: Possible ZFS patch, please test!

2008-08-30 Thread Henri Hennebert

John Baldwin wrote:

On Friday 29 August 2008 03:57:46 am Henri Hennebert wrote:

Henri Hennebert wrote:

John Baldwin wrote:
This patch merges a few changes from HEAD back to 7.x.  I think the 
endian changes specifically might solve the issue people saw with 
zpools created with non-dtrace kernels not being readable by dtrace 
kernels and vice versa.


http://www.FreeBSD.org/~jhb/patches/zfs_7.patch


It works for me with the root on zfs

While rebuilding the ports index with `portsdb -Uu` the system becomes 
really sluggish, with cpu running more than 60% in system...


Something really strange here.


Can you try removing the 'KDTRACE_*' options from your kernel config file?  It 
appears that they haven't been enabled in 8.x by default yet.



I tried, but with the world of 7.1-PRERELEASE I got

cc: Internal error: Segmentation fault: 11 (program ld)
Please submit a full bug report.
See URL:http://gcc.gnu.org/bugs.html for instructions.
*** Error code 1

Stop in /usr/obj/usr/src/sys/MORZINE.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

So I took /usr/bin/cc, /usr/bin/ld and /usr/libexec/cc* from a previous
7.0-STABLE and tried again... with a lot of system CPU: 80% on a
dual core.

Anyway, I rebooted with the new kernel without KDTRACE_HOOKS and DDB_CTF.

After the reboot, system CPU is still very high and the system is not responsive.
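
To confirm that neither option is still active in the kernel config, the file can be searched directly. A small sketch; the function name is illustrative, and the config path in the example is an assumption based on the MORZINE build directory shown above:

```shell
#!/bin/sh
# List the DTrace-related option lines still active in a kernel config file.
# Lines starting with '#' are comments in kernel configs and are not matched.
dtrace_options() {
    grep -E '^[[:space:]]*options[[:space:]]+(KDTRACE_HOOKS|DDB_CTF)([[:space:]]|$)' "$1"
}

# Example (assumed path, i386 architecture):
#   dtrace_options /usr/src/sys/i386/conf/MORZINE || echo "no DTrace options set"
```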

Henri






Re: Possible ZFS patch, please test!

2008-08-30 Thread Henri Hennebert

John Birrell wrote:

On Sat, Aug 30, 2008 at 09:07:15AM +0200, Henri Hennebert wrote:

I tried, but with the world of 7.1-PRERELEASE I got

cc: Internal error: Segmentation fault: 11 (program ld)
Please submit a full bug report.
See URL:http://gcc.gnu.org/bugs.html for instructions.
*** Error code 1


Henri, please delete the entire contents of your obj directory to
remove the bad tools that have been built there.

When you run 'make buildkernel' it will use the tools from the last
buildworld rather than the installed ones.

For anyone experiencing this problem, you can do a 'make installworld'
with STRIP= as long as you can boot to single user and mount your file
systems.

The problem is occurring when static binaries are installed with the default
option to strip the binaries. It seems that the strip program doesn't like
the presence of the CTF ELF section.


OK - I understand better what's happening.



I believe that the buildworld that you have is OK, even when built with the
CTF data; it's at installworld that things go bad.

Do you need me to send you any files to recover from this problem?


No problem, I have access to a previous 7.0-STABLE.

Thanks

Henri


--
John Birrell


Re: Possible ZFS patch, please test!

2008-08-30 Thread Henri Hennebert

John Birrell wrote:

On Sat, Aug 30, 2008 at 10:40:12AM +0200, Henri Hennebert wrote:

I believe that the buildworld that you have is OK, even when built with the
CTF data; it's at installworld that things go bad.

Do you need me to send you any files to recover from this problem?

No problem, I have access to a previous 7.0-STABLE.


I am concerned about the high CPU problem. All the hooks that are built in
with KDTRACE_HOOKS are inactive until the DTrace modules are loaded. So there
should be no CPU implications there.

Are you using i386 or amd64?


It is i386:

CPU: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz (1995.01-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x6f6  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4e33d<SSE3,RSVD2,MON,DS_CPL,VMX,TM2,SSSE3,CX16,xTPR,PDCM,DCA>
  AMD Features=0x2010<NX,LM>
  AMD Features2=0x1<LAHF>
  Cores per package: 2
real memory  = 2146369536 (2046 MB)
avail memory = 2084503552 (1987 MB)


Henri


--
John Birrell





Re: Possible ZFS patch, please test!

2008-08-30 Thread Henri Hennebert

John Baldwin wrote:
This patch merges a few changes from HEAD back to 7.x.  I think the endian 
changes specifically might solve the issue people saw with zpools created 
with non-dtrace kernels not being readable by dtrace kernels and vice versa.


http://www.FreeBSD.org/~jhb/patches/zfs_7.patch


Just a follow-up

I cvsup at Sat Aug 30 12:55 without zfs_7.patch and

make buildworld && make buildkernel && make installkernel

reboot (-s) --root on zfs is ok -- make installworld

reboot

System is still sluggish even during the make installworld in single user.

Henri


Re: Possible ZFS patch, please test!

2008-08-29 Thread Henri Hennebert

John Baldwin wrote:
This patch merges a few changes from HEAD back to 7.x.  I think the endian 
changes specifically might solve the issue people saw with zpools created 
with non-dtrace kernels not being readable by dtrace kernels and vice versa.


http://www.FreeBSD.org/~jhb/patches/zfs_7.patch


It works for me with the root on zfs

Thanks

Henri


Re: Possible ZFS patch, please test!

2008-08-29 Thread Henri Hennebert

Henri Hennebert wrote:

John Baldwin wrote:
This patch merges a few changes from HEAD back to 7.x.  I think the 
endian changes specifically might solve the issue people saw with 
zpools created with non-dtrace kernels not being readable by dtrace 
kernels and vice versa.


http://www.FreeBSD.org/~jhb/patches/zfs_7.patch


It works for me with the root on zfs

While rebuilding the ports index with `portsdb -Uu` the system becomes 
really sluggish, with cpu running more than 60% in system...


Something really strange here.

Henri


Thanks

Henri


Re: recent regression with REL7 and Zfs

2008-08-28 Thread Henri Hennebert

Thierry Herbelot wrote:

Hello,

I am using a recent 7.0-Stable (x86) and a Zfs pool for my data.

After the dtrace import, I have updated my sources (and make buildworld, make 
buildkernel) and I no longer have access to my Zfs pool (just to be sure, I 
have since updated twice more to work around the announced issues).


Probably same problem here:

With the new kernel (7.1-PRERELEASE) and a root file system on zfs, the 
root can't be mounted and the system stops with a message saying it can't
mount zfs:pool0

Manual root filesystem specification:
  <fstype>:<device>  Mount <device> using filesystem <fstype>
                       eg. ufs:da0s1a
  ?                  List valid disk boot devices
  <empty line>       Abort manual input

mountroot>

I reverted to the previous kernel and all is OK now.

Henri


With the new kernel, the Zpool is listed as failed (bad checksum ?).

Reverting to the old kernel is sufficient to recover the Zfs pool (which was 
scrubbed last week), and it is declared healthy.


cheers and thanks for the good work

TfH


Re: RELENG_7 /src/UPDATING out od date?

2008-03-16 Thread Henri Hennebert

Abdullah Ibn Hamad Al-Marri wrote:

Hey,

http://www.freebsd.org/cgi/cvsweb.cgi/src/UPDATING?rev=1.507;only_with_tag=RELENG_7

NOTE TO PEOPLE WHO THINK THAT FreeBSD 7.x IS SLOW:
FreeBSD 7.x has many debugging features turned on, in
both the kernel and userland.  These features attempt to detect
incorrect use of system primitives, and encourage loud failure
through extra sanity checking and fail stop semantics.  They
also substantially impact system performance.  If you want to
do performance measurement, benchmarking, and optimization,
you'll want to turn them off.  This includes various WITNESS-
related kernel options, INVARIANTS, malloc debugging flags
in userland, and various verbose features in the kernel.  Many
developers choose to disable these features on build machines
to maximize performance.
Could someone please nuke this?


It is a problem with cvsweb - see 
http://www.freebsd.org/cgi/query-pr.cgi?pr=120185


Henri
 
Regards,


-Abdullah Ibn Hamad Al-Marri
Arab Portal
http://www.WeArab.Net/





  



Re: finstall alpha3

2008-02-06 Thread Henri Hennebert

Julian H. Stacey wrote:

Ivan Voras wrote:

As some of you may already know, I'm working on a graphical installer
for FreeBSD 7, which was started as a Google SoC 2007 project but still
continues. 


10+ years back, when Jordan did the first pre-X, 24x80 graphical installer,
soon after he'd finished a blind chap posted ~So how do I install?~
Answer then: ~Get a friend to do it for you, or abandon FreeBSD and use NetBSD~

NetBSD still has an ASCII installer, so it is more attractive to some.  Idea
for another SoC project: an automated tool that could descramble
all the glitz of [arbitrary?] graphics tools back to something
sensible / ASCII, a bit like what OCR does for printed paper.

No doubt a new graphical installer might be nice (if X is reliable,
which it often is not, and don't rely on VESA either on old hardware),
but just so we don't forget the blind too: amazingly, they
use computers (with expensive interfaces).

The visually impaired do too, the latter simply with plain text
xterms with monster fonts rather than graphics, I presume.


Just my opinion:

Maybe you are great BUT YOU ARE TOO DEROGATORY about work which may
be useful to some new users

Henri



Re: finstall alpha3

2008-02-06 Thread Henri Hennebert

Julian Stacey wrote:

Henri Hennebert wrote 2 emails with same common text
To: Julian H. Stacey [EMAIL PROTECTED]
Date: Wed, 06 Feb 2008 15:20:38 +0100
Message-ID: [EMAIL PROTECTED]
The first private shouting got answered.  Then came
To: freebsd-stable@freebsd.org
Date: Wed, 06 Feb 2008 15:25:57 +0100
Message-id: [EMAIL PROTECTED]
Assume Henri is too young  to remember first graphical installer.


Thank you! I'm 60 this year :-) and have been using FreeBSD since 2.1, and 
Xenix since '88 IIRC. My point is that a graphical installer may be useful, 
that's all.




Though a new graphical installer may be very nice as an option,
let it not ever be the only way: Remember blind installers,
non VESA supported consoles, non X recognised chips, serial line
controlled installs,  non intel/AMD platforms with broken 
graphics terminal support. (Sparc maybe ? more later ?)






Re: 7.0-RC1 - ZFS + UFS + io activity show a deadlock

2008-02-01 Thread Henri Hennebert

Pawel Jakub Dawidek wrote:

On Sun, Jan 27, 2008 at 02:47:02PM +0100, Henri Hennebert wrote:

Hello,

I encounter a deadlock while

1) cpio -p from a ZFS filesystem to a UFS filesystem

2) rsync from ZFS to ZFS

I was running with this patch:
http://people.freebsd.org/~pjd/patches/zgd_done.patch


This patch is wrong, why do you use it in the first place?


You advised me to use it ...

I will remove it.

Henri
