Re: core.txt.N not created anymore on panic

2013-07-18 Thread Mikolaj Golub
On Wed, Jul 17, 2013 at 09:21:51AM +0200, Jeremie Le Hen wrote:
 On Wed, Jul 17, 2013 at 09:16:55AM +0200, Jeremie Le Hen wrote:
  Hi,
  
  Early May I set ddb_enable=YES (crashinfo_enable=YES by default).
  Upon panic, it created the following kind of files:
  
  -rw-------  1 root  wheel         549 Jun 26 22:09 info.0
  -rw-------  1 root  wheel  1518501888 Jun 26 22:09 vmcore.0
  -rw-------  1 root  wheel      196981 Jun 26 22:09 core.txt.0
  -rw-------  1 root  wheel         546 Jun 26 23:15 info.1
  -rw-------  1 root  wheel   472608768 Jun 26 23:15 vmcore.1
  -rw-------  1 root  wheel      207034 Jun 26 23:15 core.txt.1
  -rw-------  1 root  wheel         546 Jun 27 00:47 info.2
  -rw-------  1 root  wheel   667717632 Jun 27 00:47 vmcore.2
  -rw-------  1 root  wheel      208745 Jun 27 00:48 core.txt.2
  -rw-------  1 root  wheel         549 Jul  3 14:40 info.3
  -rw-------  1 root  wheel  1455198208 Jul  3 14:40 vmcore.3
  -rw-------  1 root  wheel      208173 Jul  3 14:41 core.txt.3
  
  The core.txt.N files contained crashinfo(8) information along with
  the ddb textdump, because crashinfo(8) outputs dmesg.
  
  
  Yesterday, I upgraded from a June 9th -CURRENT to the latest -CURRENT.  While
  stress-testing overnight, I got a couple of panics, but core.txt.N files are
  no longer created.
  
  -rw-------  1 root  wheel     530 Jul 17 01:10 info.5
  -rw-------  1 root  wheel   75776 Jul 17 01:10 textdump.tar.5
  -rw-------  1 root  wheel     529 Jul 17 02:01 info.6
  -rw-------  1 root  wheel   74240 Jul 17 02:01 textdump.tar.6
  -rw-------  1 root  wheel     530 Jul 17 04:20 info.7
  -rw-------  1 root  wheel   74752 Jul 17 04:20 textdump.tar.7
  -rw-------  1 root  wheel     530 Jul 17 07:50 info.8
  -rw-------  1 root  wheel   92672 Jul 17 07:50 textdump.tar.8
  -rw-------  1 root  wheel     531 Jul 17 08:44 info.9
  -rw-------  1 root  wheel  110592 Jul 17 08:44 textdump.tar.9
  
  Each textdump.tar.N contains:
  
  tar tvf /var/crash/textdump.tar.9 
  -rw-------  0 root   wheel   49152 Jul 17 08:30 ddb.txt
  -rw-------  0 root   wheel    3179 Jul 17 08:30 config.txt
  -rw-------  0 root   wheel   54137 Jul 17 08:30 msgbuf.txt
  -rw-------  0 root   wheel      88 Jul 17 08:30 panic.txt
  -rw-------  0 root   wheel     120 Jul 17 08:30 version.txt
  
  Any idea what changed in between?  I checked svn log in etc/ but I found
  nothing relevant.

Previously your system was configured to generate vmcore dumps. Now it is
configured to generate textdumps. crashinfo(8) works with vmcore dumps, not
textdumps.

 
 For the record:
 
 debug.ddb.capture.bufoff: 0
 debug.ddb.capture.maxbufsize: 5242880
 debug.ddb.capture.inprogress: 0
 debug.ddb.capture.bufsize: 49152
 debug.ddb.capture.data: 
 debug.ddb.scripting.scripts: lockinfo=show locks; show alllocks; show
 lockedvnods
 kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt;

'textdump set' tells ddb to store dumps in textdump format. Remove
this from /etc/ddb.conf (and run /etc/rc.d/ddb) if you want
crashinfo(8) data.

-- 
Mikolaj Golub
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic: Lock filedesc structure not share locked

2013-06-30 Thread Mikolaj Golub
On Sun, Jun 30, 2013 at 10:27:57AM +0200, Mateusz Guzik wrote:
 On Sun, Jun 30, 2013 at 09:41:50AM +0200, Alexander Leidinger wrote:
  Hi,
  
  with head as of r252381 on amd64, I got the following panic after
  starting tmux and creating a 2nd terminal window inside tmux
  (ctrl-tmux_command_character + c):
  ---snip---
  panic: Lock filedesc structure not share locked @ 
  /space/system/usr_src/sys/kern/kern_descrip.c:3448
  
  cpuid = 2
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
  0xff839ee566d0
  kdb_backtrace() at kdb_backtrace+0x39/frame 0xff839ee56780
  vpanic() at vpanic+0x126/frame 0xff839ee567c0
  panic() at panic+0x43/frame 0xff839ee56820
  _sx_assert() at _sx_assert+0x134/frame 0xff839ee56830
  _sx_sunlock() at _sx_sunlock+0x46/frame 0xff839ee56860
  kern_proc_filedesc_out() at kern_proc_filedesc_out+0x420/frame 
  0xff839ee568e0
  sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x66/frame 
  0xff839ee56950
  sysctl_root() at sysctl_root+0x1bd/frame 0xff839ee569a0
  userland_sysctl() at userland_sysctl+0x192/frame 0xff839ee56a40
  sys___sysctl() at sys___sysctl+0x74/frame 0xff839ee56af0
  amd64_syscall() at amd64_syscall+0x23c/frame 0xff839ee56bf0
  Xfast_syscall() at Xfast_syscall+0xfb/frame 0xff839ee56bf0
  ---snip---
  
 
 Can you try this (only compile-tested):
 diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
 index e760fe5..7aa17cd 100644
 --- a/sys/kern/kern_descrip.c
 +++ b/sys/kern/kern_descrip.c
 @@ -3272,6 +3272,8 @@ export_fd_to_sb(void *data, int type, int fd, int fflags, int refcnt,
  		if (efbuf->remainder < kif->kf_structsize) {
  			/* Terminate export. */
  			efbuf->remainder = 0;
 +			if (!locked && efbuf->fdp != NULL)
 +				FILEDESC_SLOCK(efbuf->fdp);
  			return (0);
  		}
  		efbuf->remainder -= kif->kf_structsize;
 

Mateusz, thank you for spotting this lock leak. Regardless of whether this
is the root cause of the reported panic (it looks like it is), this
fix should definitely be committed. Will you do this?
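
For readers following the thread, the shape of the leak can be modelled in
userland C. This is only a sketch, not the kernel code: the lock is a plain
counter, and struct export_buf, buf_slock(), buf_sunlock() and export_one()
are hypothetical names. The point it illustrates is that a callback entered
with the lock held shared, which may drop the lock along the way, has to take
it back on every return path, including the early "terminate export" one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model only: "held" stands in for a shared lock on the fd table. */
struct export_buf {
	int held;		/* 1 while the shared lock is held */
	size_t remainder;	/* space left in the output buffer */
};

static void
buf_slock(struct export_buf *b)
{
	assert(b->held == 0);
	b->held = 1;
}

static void
buf_sunlock(struct export_buf *b)
{
	assert(b->held == 1);	/* same kind of check that fired in the panic */
	b->held = 0;
}

/* Contract: entered with the lock held, must return with it held. */
int
export_one(struct export_buf *b, size_t recsize, bool may_sleep)
{
	bool locked = true;

	if (may_sleep) {
		/* Drop the lock around work that may sleep. */
		buf_sunlock(b);
		locked = false;
	}
	if (b->remainder < recsize) {
		/* Terminate export; restore the lock before returning. */
		b->remainder = 0;
		if (!locked)
			buf_slock(b);
		return (0);
	}
	b->remainder -= recsize;
	if (!locked)
		buf_slock(b);
	return (0);
}
```

Without the re-lock in the early return, the caller's later unlock would trip
the assertion in buf_sunlock(), which is the model's analogue of the reported
panic.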

-- 
Mikolaj Golub


Re: panic: Lock filedesc structure not share locked

2013-06-30 Thread Mikolaj Golub
On Sun, Jun 30, 2013 at 11:29:59PM +0200, Mateusz Guzik wrote:

 I think it will be better if you do this and then MFC all commits.

Committed as r252436. Thanks.

-- 
Mikolaj Golub


Re: zfs kernel panic, known incompatibilities with clang CPUTYPE/COPTFLAGS?

2013-06-15 Thread Mikolaj Golub
On Fri, Jun 14, 2013 at 11:07:02PM +0200, Alexander Leidinger wrote:

 db> bt
 Tracing pid 2356
 uart_sab82532_class() at 0
 devfs_ioctl_f() at devfs_ioctl_f+0xf0
 kern_ioctl() at kern_ioctl+0x1d7
 sys_ioctl() at sys_ioctl+0x142
 ---snip---
 
 Anyone with a pointer to an explanation how to convert those pointers
 into source locations?

kgdb
l *devfs_ioctl_f+0xf0

-- 
Mikolaj Golub


Re: Proposal for change to kernel linker for fixing a VNET and DPCPU problem.

2012-09-27 Thread Mikolaj Golub
On Tue, Sep 25, 2012 at 11:07:02AM -0400, John Baldwin wrote:
 On Friday, September 21, 2012 12:56:56 pm Julian Elischer wrote:
  On 9/21/12 2:22 AM, Mikolaj Golub wrote:

   http://people.freebsd.org/~trociny/link_elf.c.pcpu_vnet.patch
  
   The fix is to make the linker, on module load, recognize external
   VNET/DPCPU variables defined in previously loaded modules and
   relocate them accordingly. For this, set_pcpu_list and set_vnet_list
   are used, in which the addresses of the modules' 'set_pcpu' and
   'set_vnet' linker sets are stored.
  
  It makes sense to me, but I really am not a linker person.
  I think it would be good to get Doug Rabson to weigh in on it, and
  maybe John Baldwin.
  
  moving to -current as it's not a net issue really..
 
 I think the proposed patch is ok.

Thanks! Committed as r240997.

-- 
Mikolaj Golub


Re: zpool can't bring online disk2 ----I screwed up

2012-09-26 Thread Mikolaj Golub
On Sun, Sep 23, 2012 at 10:50:28PM -0700, Jose A. Lombera wrote:

 This is the error I got when I run the failover script.
 
  
 
 Sep 24 06:43:39 san1 hastd[3404]: [disk3] (primary) Provider /dev/mfid3 is 
 not part of resource disk3.
 
 Sep 24 06:43:39 san1 hastd[3343]: [disk3] (primary) Worker process exited 
 ungracefully (pid=3404, exitcode=66).
 
 Sep 24 06:43:39 san1 hastd[3413]: [disk6] (primary) Provider /dev/mfid6 is 
 not part of resource disk6.
 
 Sep 24 06:43:39 san1 hastd[3343]: [disk6] (primary) Worker process exited 
 ungracefully (pid=3413, exitcode=66).
 
 Sep 24 06:43:39 san1 hastd[3425]: [disk10] (primary) Unable to open 
 /dev/mfid10: No such file or directory.
 
 Sep 24 06:43:39 san1 hastd[3407]: [disk4] (primary) Provider /dev/mfid4 is 
 not part of resource disk4.

It looks like your disk numbering has changed, and your other email
confirms this. You should then change it accordingly in hast.conf.

 Sep 24 06:43:40 san1 hastd[3351]: [disk2] (primary) Resource unique ID 
 mismatch (primary=2635341666474957411, secondary=5944493181984227803).
 
 Sep 24 06:43:45 san1 hastd[3348]: [disk1] (primary) Split-brain condition!
 
 Sep 24 06:43:50 san1 hastd[3351]: [disk2] (primary) Resource unique ID 
 mismatch (primary=2635341666474957411, secondary=5944493181984227803).
 
 Sep 24 06:43:55 san1 hastd[3348]: [disk1] (primary) Split-brain condition!

Split-brain can only be fixed manually: decide which host holds the
up-to-date data, then recreate the HAST resources (disk1 and disk2 in this
case) on the other host.

The simplest way to recover from your situation looks like the following:

Suppose that host A is the host where the disk was changed and things
got messed up, and host B is the good host.

1) Disable automatic failover, if you have it configured.
2) On host A set all HAST resources to init.
3) On host B set all HAST resources to primary.
4) On host B import the pool and check that it works there and that
   your data is intact.
5) On host A recreate the HAST resources (hastctl create disk1...)
6) On host A change the role to secondary for all HAST
   resources. A synchronization process should start.
7) Wait until the synchronization is complete, checking hastctl status on
   host B (the primary).

After this you can switch the pool back to host A if you want and
re-enable automatic failover.

Actually, you can switch to host A without waiting until the
synchronization is complete. It will work, but read requests will go
to the remote host B until the synchronization is complete, so I would
not do this unless there is a good reason.

It might be possible to recover faster, without recreating/resyncing
all devices, depending on how things got messed up: fix the disk
numbering in hast.conf and recreate/resync only the resources in
split-brain state. But it would require more manual work, careful
investigation of the logs and a good understanding of what you are doing.

-- 
Mikolaj Golub


Re: [head tinderbox] failure on arm/arm

2011-11-27 Thread Mikolaj Golub

On Sun, 27 Nov 2011 20:08:02 GMT FreeBSD Tinderbox wrote:

 FT /src/sys/kern/kern_proc.c:2589: error: 'KERN_PROC_PS_STRINGS' undeclared 
here (not in a function)
 FT *** Error code 1

I forgot to commit the changes to sys/sysctl.h. Sorry about this. It should be
fixed in r228046.

-- 
Mikolaj Golub


Re: [head tinderbox] failure on mips/mips

2011-11-27 Thread Mikolaj Golub

On Mon, 28 Nov 2011 03:35:32 GMT FreeBSD Tinderbox wrote:

 FT TB --- 2011-11-28 02:40:42 - tinderbox 2.8 running on 
freebsd-current.sentex.ca
 FT TB --- 2011-11-28 02:40:42 - starting HEAD tinderbox run for mips/mips
 FT TB --- 2011-11-28 02:40:43 - cleaning the object tree
 FT TB --- 2011-11-28 02:40:58 - cvsupping the source tree
 FT TB --- 2011-11-28 02:40:58 - /usr/bin/csup -z -r 3 -g -L 1 -h 
cvsup.sentex.ca /tinderbox/HEAD/mips/mips/supfile
 FT TB --- 2011-11-28 02:41:16 - building world
 FT TB --- 2011-11-28 02:41:16 - CROSS_BUILD_TESTING=YES
 FT TB --- 2011-11-28 02:41:16 - MAKEOBJDIRPREFIX=/obj
 FT TB --- 2011-11-28 02:41:16 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
 FT TB --- 2011-11-28 02:41:16 - SRCCONF=/dev/null
 FT TB --- 2011-11-28 02:41:16 - TARGET=mips
 FT TB --- 2011-11-28 02:41:16 - TARGET_ARCH=mips
 FT TB --- 2011-11-28 02:41:16 - TZ=UTC
 FT TB --- 2011-11-28 02:41:16 - __MAKE_CONF=/dev/null
 FT TB --- 2011-11-28 02:41:16 - cd /src
 FT TB --- 2011-11-28 02:41:16 - /usr/bin/make -B buildworld
  World build started on Mon Nov 28 02:41:17 UTC 2011
  Rebuilding the temporary build tree
  stage 1.1: legacy release compatibility shims
  stage 1.2: bootstrap tools
  stage 2.1: cleaning up the object tree
  stage 2.2: rebuilding the object tree
  stage 2.3: build tools
  stage 3: cross tools
  stage 4.1: building includes
  stage 4.2: building libraries
  stage 4.3: make dependencies
  stage 4.4: building everything
 FT [...]
 FT /src/usr.bin/procstat/procstat_auxv.c:123: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT /src/usr.bin/procstat/procstat_auxv.c:128: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT /src/usr.bin/procstat/procstat_auxv.c:133: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT /src/usr.bin/procstat/procstat_auxv.c:143: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT /src/usr.bin/procstat/procstat_auxv.c:146: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT /src/usr.bin/procstat/procstat_auxv.c:149: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT /src/usr.bin/procstat/procstat_auxv.c:155: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT /src/usr.bin/procstat/procstat_auxv.c:164: warning: format '%ld' expects 
type 'long int', but argument 4 has type 'int'
 FT *** Error code 1

Sorry, should be fixed in r228049.
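
The change in r228049 itself is not quoted in this message. As a general
illustration of this class of warning only, the usual remedy when a value's
type differs between architectures is an explicit cast to the type the
format string asks for. The function name format_auxv_value() below is
hypothetical.

```c
#include <stdio.h>

/* Format a value whose C type is int on some platforms, long on others. */
int
format_auxv_value(char *buf, size_t len, int value)
{
	/* "%ld" requires a long argument, so convert explicitly. */
	return (snprintf(buf, len, "%ld", (long)value));
}
```

With the cast the same source line builds without the format warning on both
32-bit and 64-bit targets.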

-- 
Mikolaj Golub


Re: 9.0-BETA3 lock order reversal in mount_smbfs

2011-10-09 Thread Mikolaj Golub

On Wed, 5 Oct 2011 19:39:56 -0700 Bob Finch wrote:

 BF Attempting to mount a remote SMB share with mount_smbfs fails:

 BF freebsd9b3# uname -a
 BF FreeBSD freebsd9b3 9.0-BETA3 FreeBSD 9.0-BETA3 #0: Sat Sep 24 20:46:57 UTC 
2011 r...@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
 BF freebsd9b3# mount_smbfs -I smbhost -U xxx -W domain //smbhost/xxx /mnt
 BF Password:
 BF mount_smbfs: unable to open connection: syserr = No such file or directory

 BF and displays the following kernel messages:

 BF smb_co_lock: recursive lock for object 1
 BF lock order reversal:
 BF 1st 0xc2ef4608 smb_vc (smb_vc) @ 
/usr/src/sys/modules/smbfs/../../netsmb/smb_conn.c:325
 BF 2nd 0xc2ffbc28 smbsm (smbsm) @ 
/usr/src/sys/modules/smbfs/../../netsmb/smb_conn.c:348
 BF KDB: stack backtrace:
 BF db_trace_self_wrapper(c0eff6ac,626d732f,2e2f7366,2e2e2f2e,74656e2f,...) at 
db_trace_self_wrapper+0x26
 BF kdb_backtrace(c0a42bdb,c0f0300f,c29697b0,c29696e0,c76d298c,...) at 
kdb_backtrace+0x2a
 BF _witness_debugger(c0f0300f,c2ffbc28,c2ff93df,c29696e0,c2ff9320,...) at 
_witness_debugger+0x25
 BF witness_checkorder(c2ffbc28,9,c2ff9320,15c,c2ffbc48,...) at 
witness_checkorder+0x839
 BF __lockmgr_args(c2ffbc28,8,c2ffbc48,0,0,...) at __lockmgr_args+0x824
 BF smb_co_lock(c2ffbc20,8,2,2,c76d2b30,...) at smb_co_lock+0x73
 BF smb_co_gone(c2ef4600,c76d2b88,c76d2b88,c76d2aac,c2ad5b00,...) at 
smb_co_gone+0x34
 BF smb_sm_lookup(c76d2ad8,c76d2b14,c76d2b88,c76d2b30,c29f041c,...) at 
smb_sm_lookup+0xf0
 BF smb_usr_lookup(c29f0400,c76d2b88,c76d2b94,c76d2b90,c76d2b7c,...) at 
smb_usr_lookup+0x98
 BF nsmb_dev_ioctl(c2f76700,82fc6e6a,c29f0400,3,c2fce8a0,...) at 
nsmb_dev_ioctl+0x1d9
 BF giant_ioctl(c2f76700,82fc6e6a,c29f0400,3,c2fce8a0,...) at giant_ioctl+0x75
 BF devfs_ioctl_f(c2d417a8,82fc6e6a,c29f0400,c2c05e00,c2fce8a0,...) at 
devfs_ioctl_f+0x10b
 BF kern_ioctl(c2fce8a0,3,82fc6e6a,c29f0400,6d2cec,...) at kern_ioctl+0x21d
 BF sys_ioctl(c2fce8a0,c76d2cec,c0f493b6,c0eebb0e,246,...) at sys_ioctl+0x134
 BF syscall(c76d2d28) at syscall+0x284
 BF Xint0x80_syscall() at Xint0x80_syscall+0x21
 BF --- syscall (54, FreeBSD ELF32, sys_ioctl), eip = 0x28193283, esp = 
0xbfbfe35c, ebp = 0xbfbfe688 ---

The LOR appears after the problem (the connection going away) has happened, in
the error handling path. So although it indicates that something is wrong with
our smb locking, it does not look like the cause of your issue.

 BF Anything further I can do to help debug this problem?

Is this specific to 9.0-BETA3? Have you tried mounting the share with the
same parameters on other systems?

I think it could be useful to tcpdump the session and look at it in wireshark,
which understands the SMB protocol.

Also you might want to rebuild the kernel with these options in /etc/src.conf:

DEBUG_FLAGS=-g -DSMB_SOCKET_DEBUG -DSMB_IOD_DEBUG -DNB_DEBUG -DSMB_VNODE_DEBUG

which will add many debugging messages. This might be helpful for
troubleshooting.

-- 
Mikolaj Golub


Re: truss

2011-09-19 Thread Mikolaj Golub

On Mon, 19 Sep 2011 12:13:56 + (UTC) Anton Yuzhaninov wrote to Mikolaj 
Golub:

 AY On Sun, 18 Sep 2011 16:46:01 +0300, Mikolaj Golub wrote:
 MG Could you please run ktrace with the -i option? It behaves as if
 MG ptrace(PT_TRACE_ME) failed in the child for some reason. Unfortunately,
 MG truss does not check this.

 AY ktrace -i for truss sleep 5
 AY http://dl.dropbox.com/u/8798217/tmp/truss_ktrace2.txt

Although ptrace(PT_TRACE_ME,0,0,0) returned 0, the process did not stop after
execve(), and wait4() in the parent (which was actually waiting for this stop)
returned only after the child exited. No idea why so far :-).
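
For reference, the sequence that is expected here can be reproduced with a
small stand-alone program. This is a minimal sketch and not truss's code: the
function name child_stops_after_exec() is hypothetical, and /bin/sh is only
assumed to exist as a convenient program to exec. Unlike truss, the child
checks the return value of ptrace().

```c
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>

/*
 * Returns 1 if the traced child stopped after exec (the expected case),
 * 0 if it exited without stopping (the behaviour seen in the ktrace),
 * and -1 if fork() or waitpid() failed.
 */
int
child_stops_after_exec(void)
{
	int status;
	pid_t pid;

	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0) {
		/* Child: ask to be traced, and check the result. */
		if (ptrace(PT_TRACE_ME, 0, 0, 0) == -1)
			_exit(126);
		execl("/bin/sh", "sh", "-c", "exit 0", (char *)NULL);
		_exit(127);
	}
	/* Parent: the first event should be a stop, not an exit. */
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	if (WIFSTOPPED(status)) {
		/* Expected: the child is stopped; clean it up. */
		kill(pid, SIGKILL);
		waitpid(pid, &status, 0);
		return (1);
	}
	return (0);
}
```

A return value of 0 on the affected system would match the ktrace output
above: the parent's wait returns only once the child has exited.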

-- 
Mikolaj Golub


Re: truss

2011-09-18 Thread Mikolaj Golub

On Wed, 14 Sep 2011 06:17:45 + (UTC) Anton Yuzhaninov wrote to Xin LI:

 AY On Fri, 09 Sep 2011 15:56:41 -0700, Xin LI wrote:
 XL On 08/31/11 07:35, Anton Yuzhaninov wrote:
  It seems to be truss(1) is broken on current
  
  :~ truss /bin/echo x x truss: can not get etype: No such process
  
  FreeBSD 9.0-BETA1 #0 r224884M i386
  
  from ktrace of truss
  
  3162 truss    CALL  __sysctl(0xbfbfea00,0x4,0xbfbfe9e0,0xbfbfea10,0,0)
  3162 truss    SCTL  kern.proc.sv_name.3163
  3162 truss    RET   __sysctl -1 errno 3 No such process
 XL 
 XL I can't seem to reproduce this here; did I miss anything?  (Note that
 XL you may need a full world/kernel build.)
 XL 

 AY Problem still here after svn up and rebuild world/kernel

 AY :~ ktrace -t+ truss /usr/bin/true
 AY truss: can not get etype: No such process

Could you please run ktrace with the -i option? It behaves as if
ptrace(PT_TRACE_ME) failed in the child for some reason. Unfortunately, truss
does not check this.

 AY Full ktrace:
 AY http://dl.dropbox.com/u/8798217/tmp/truss_ktrace.txt

 AY FreeBSD 9.0-BETA2 #1 r225504M
 AY i386

 AY Kernel config is not GENERIC - main difference - DTrace added:
 AY http://dl.dropbox.com/u/8798217/tmp/kernconf.txt

 AY -- 
 AY  Anton Yuzhaninov


-- 
Mikolaj Golub


Re: Weird issue with hastd(8)

2011-06-25 Thread Mikolaj Golub
On Fri, Jun 3, 2011 at 11:26 AM, Maxim Sobolev sobo...@freebsd.org wrote:

 I would also like to get your input on my two other patches - randomization
 of the synchronization pattern and ad-hoc asynchronous mode. Hastd appears
 extremely useful to synchronize large virtual disks over slow links without
 taking live virtual machine offline.

To me, the idea of sending updates to the secondary only via the
synchronization thread, started periodically, looks
interesting. Of course it should not be a replacement for real
async mode, but having something like this in HAST alongside the other
synchronization modes might be useful.

Compared with the real async mode described in the manual, it has
the following advantages:

1) It is much easier to implement.

2) If the same blocks are updated frequently, real async
will send all the updates, while with the sync thread approach we will skip
many intermediate updates.

Even if we don't run the sync thread very frequently and HAST
fails over, it may still sync the dirty buffers from the previous
master.

It might be useful for backing up volumes over a WAN, instead of
rsync or zfs send.

There is a disadvantage: instead of sending only one dirty
block we synchronize the whole extent (see below how this may be
improved, though).
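
The trade-off can be illustrated with a toy model. This is a sketch under
stated assumptions, not hastd code: struct dirtymap and the dirtymap_*()
functions are hypothetical names, the 2 MB extent size is assumed, and
"sending" is reduced to counting bytes. Repeated writes to one extent cost a
single extent on the next sync pass, but a tiny write still costs a whole
extent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXTENT_SIZE	(2u * 1024 * 1024)	/* assumed extent size */
#define NEXTENTS	64			/* toy device: 128 MB */

struct dirtymap {
	uint8_t dirty[NEXTENTS];
};

void
dirtymap_init(struct dirtymap *dm)
{
	memset(dm, 0, sizeof(*dm));
}

/* Record a local write; no network traffic happens here. */
void
dirtymap_write(struct dirtymap *dm, uint64_t offset, uint64_t length)
{
	uint64_t first, last, i;

	assert(length > 0);
	first = offset / EXTENT_SIZE;
	last = (offset + length - 1) / EXTENT_SIZE;
	assert(last < NEXTENTS);
	for (i = first; i <= last; i++)
		dm->dirty[i] = 1;
}

/* Periodic pass: "send" each dirty extent once; returns bytes sent. */
uint64_t
dirtymap_sync(struct dirtymap *dm)
{
	uint64_t sent = 0;
	int i;

	for (i = 0; i < NEXTENTS; i++) {
		if (dm->dirty[i]) {
			sent += EXTENT_SIZE;
			dm->dirty[i] = 0;
		}
	}
	return (sent);
}
```

In this model a thousand 4 KB writes to the same place result in one extent
being sent, whereas a real async mode would send each of the thousand writes.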

But let me describe the problems with your patch:

http://sobomax.sippysoft.com/primary.c.diff

In your approach you still put the requests on the send thread's queue
but mark them there as failed, so they are not actually sent and
the extent is marked as needing sync.

You don't start the sync thread. In your case it starts after
reconnecting to the secondary. You get frequent reconnects for
the following reason. Because there are requests queued for the send
thread, it does not send keepalive requests (it sends them only when it
is idle), but the requests are not actually sent, so the secondary,
receiving no data from the primary, exits by timeout. Frequent
reconnects are certainly bad.

Also, the problem you described in the randomization thread looks
to be possible only with your patch. As the request fails in the
send thread, the extent is marked as needing sync; if the
sync thread is running at that time, you may observe the same
frequently updated extent being resent frequently. Without your
patch an extent may be marked as needing sync only when the connection
to the secondary is lost, so synchronization is not running at that
moment.

I think the right approach could be:

1) Don't put the request to the send thread at all.

2) When the request is returned to the kernel, it still remains
dirty in memmap.

3) Periodically, the extents that are dirty in memmap are marked as
needing sync and the sync thread is woken up.

Here is the patch that implements it:

http://people.freebsd.org/~trociny/hast.async.patch

The patch cannot be considered complete because:

1) I think this mode should not be called async, because people
would expect from it the behavior described in the manual (and
how it works in DRBD, I suppose). Also, real async might be implemented
in the future too. Some other name should be thought up.

2) The synchronization thread is woken up in the guard thread every
HAST_KEEPALIVE seconds. I think it should be less frequent and
configurable.

It can be improved, but I would like to know Pawel's opinion
first. He might know why this is completely wrong :-)

Now about sending the whole extent when only a small part of it is
updated. This might be improved with checksum-based
synchronization. I have a patch that implements it: when
synchronizing an extent, before sending a chunk of MAXPHYS
size, its checksum is sent, and if it matches, the chunk is not
sent. It is intended to be useful when one needs to resync disks,
e.g. after split-brain, when most of the blocks on the nodes match.
But apparently it should improve things in this case too.
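
The idea can be sketched as follows. This is not the patch mentioned above,
only a single-process illustration: both copies live in local buffers, FNV-1a
is used purely as a stand-in checksum, and chunk_sum() and sync_by_checksum()
are hypothetical names. In a real implementation only the checksum would
cross the network before deciding whether to send the chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a, used only as a stand-in checksum for this sketch. */
static uint32_t
chunk_sum(const uint8_t *p, size_t n)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < n; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return (h);
}

/*
 * Bring "remote" in line with "local", chunk by chunk. A chunk is copied
 * ("sent") only if its checksum differs. Returns the number of chunks sent.
 */
size_t
sync_by_checksum(const uint8_t *local, uint8_t *remote, size_t len,
    size_t chunk)
{
	size_t off, n, sent = 0;

	assert(chunk > 0);
	for (off = 0; off < len; off += chunk) {
		n = (len - off < chunk) ? len - off : chunk;
		if (chunk_sum(local + off, n) == chunk_sum(remote + off, n))
			continue;	/* checksums match: skip the data */
		memcpy(remote + off, local + off, n);
		sent++;
	}
	return (sent);
}
```

When most chunks already match, as after a split-brain, almost all of the
data transfer is skipped at the price of reading and checksumming both sides.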

--
Mikolaj Golub


Re: Weird issue with hastd(8)

2011-05-29 Thread Mikolaj Golub

On Wed, 25 May 2011 11:21:04 -0700 Maxim Sobolev wrote:

 MS Hi Pawel,

 MS I am observing strange errors while synchronizing the data between
 MS primary and secondary. I keep getting the following error messages:

 MS May 25 11:09:19 eights hastd[10113]: [test] (secondary) Unable to
 MS receive request header: Socket is not connected.
 MS May 25 11:09:24 eights hastd[37571]: [test] (secondary) Worker process
 MS exited ungracefully (pid=10113, exitcode=75).
 MS May 25 11:10:17 eights hastd[12109]: [test] (secondary) Unable to
 MS receive request header: Socket is not connected.
 MS May 25 11:10:18 eights hastd[37571]: [test] (secondary) Worker process
 MS exited ungracefully (pid=12109, exitcode=75).
 MS May 25 11:10:39 eights hastd[14685]: [test] (secondary) Unable to
 MS receive request header: Socket is not connected.
 MS May 25 11:10:44 eights hastd[37571]: [test] (secondary) Worker process
 MS exited ungracefully (pid=14685, exitcode=75).

 MS The synchronization still proceeds, but it's slow due to the need to
 MS re-negotiate and re-spawn the secondary worker. I have tried to ktrace
 MS both server and client at the same time. For some reason the primary
 MS keeps sending data, while client gets 0-read from the recvfrom at some
 MS point, while the primary keeps sending more data. This is 8-STABLE
 MS code on both ends.

 MS Any ideas of what could be wrong here are appreciated.

This might be the MSG_WAITALL issue I described on net@ (look for the thread
"recv() with MSG_WAITALL might stuck when receiving more than rcvbuf", and
also kern/154504).

Could you please try the patch?

http://people.freebsd.org/~trociny/uipc_socket.c.patch
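
Separately from the kernel patch, a userland receive loop that does not
depend on MSG_WAITALL looks roughly like the sketch below. It is not taken
from hastd or from the patch above; recv_full() is a hypothetical helper, and
reporting an early close as ENOTCONN is a choice made here so that the error
resembles the "Socket is not connected" message in the logs.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stddef.h>

/*
 * Receive exactly len bytes. Returns 0 on success, -1 on error (errno
 * set), or -1 with errno = ENOTCONN if the peer closed the connection
 * before len bytes arrived.
 */
int
recv_full(int fd, void *buf, size_t len)
{
	unsigned char *p = buf;
	ssize_t n;

	while (len > 0) {
		n = recv(fd, p, len, 0);	/* note: no MSG_WAITALL */
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0) {
			errno = ENOTCONN;
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}
	return (0);
}
```

Each recv() call here may return fewer bytes than requested, so a request
larger than the socket receive buffer is simply collected over several calls.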

-- 
Mikolaj Golub


Re: Any success stories for HAST + ZFS?

2011-04-11 Thread Mikolaj Golub

On Mon, 11 Apr 2011 11:26:15 -0700 Freddie Cash wrote:

 FC On Sun, Apr 10, 2011 at 12:36 PM, Mikolaj Golub troc...@freebsd.org 
wrote:
  On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:
   FC Once the deadlock patches above are MFC'd to -STABLE, I can do an
   FC upgrade cycle and test them.
 
  Committed to STABLE.

 FC Updated src tree to r220537.  Recompiled world, kernel, etc.
 FC Installed world, kernel, etc.  ZFSv28 patch was not affected.

 FC Everything is detected correctly, everything comes up correctly.  See
 FC a new option (reload) in the RC script for hast.

 FC Can create/change role for 24 hast devices simultaneously.

 FC Can switch between master/slave modes.

 FC Have 5 rsyncs running in parallel without any issues, transferring
 FC 80-120 Mbps over the network (just under 100 Mbps seems to be the
 FC average right now).

 FC Switching roles while the rsyncs are running succeeds without
 FC deadlocking (obviously, rsync complains a whole bunch while the switch
 FC happens as the pool disappears out from underneath it, but it picks up
 FC again when the pool is back in place).

 FC Hitting the reset switch on the box while the rsyncs are running
 FC doesn't affect the hast devices or the pool, beyond losing the last 5
 FC seconds of writes.

 FC It's only been a couple of hours of testing and hammering, but so far
 FC things are much more stable/performant than before.

Cool! Thanks for reporting!

 FC Anything else I should test?

Nothing in particular, but any tests and reports are appreciated. For example,
checksum and compression are among the features Pawel has added recently. You
could try different options and compare :-)

-- 
Mikolaj Golub


Re: Any success stories for HAST + ZFS?

2011-04-10 Thread Mikolaj Golub

On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:

 FC Once the deadlock patches above are MFC'd to -STABLE, I can do an
 FC upgrade cycle and test them.

Committed to STABLE.

-- 
Mikolaj Golub


Re: Any success stories for HAST + ZFS?

2011-04-05 Thread Mikolaj Golub

On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:

 FC On Sat, Apr 2, 2011 at 1:44 AM, Pawel Jakub Dawidek p...@freebsd.org 
wrote:
 
  I just committed a fix for a problem that might look like a deadlock.
  With trociny@ patch and my last fix (to GEOM GATE and hastd) do you
  still have any issues?

 FC Just to confirm, this is commit r220264, 220265, 220266 to -CURRENT?

Yes, r220264 and r220266. As stated in the commit log, an MFC is planned
after 1 week.

-- 
Mikolaj Golub


Re: Any success stories for HAST + ZFS?

2011-04-01 Thread Mikolaj Golub

On Fri, 01 Apr 2011 11:40:11 +0100 Pete French wrote:

  Yes, you may hit it only on hast devices creation. The workaround is to
  avoid using 'hastctl role primary all', start providers one by one instead.

 PF Interesting to note that I just hit a lockup in hast (the discs froze
 PF up - could not run hastctl or zpool import, and could not kill
 PF them). I have two hast devices instead of one, but I am starting them
 PF individually instead of using 'all'. The code includes all the latest
 PF patches which have gone into STABLE over the last few days, none of which
 PF look particularly controversial!

 PF I haven't tried your patch yet, nor been able to reproduce the lockup, but
 PF thought you might be interested to know that I also had problems with
 PF multiple providers.

This looks like a different problem. If it happens again, please provide the
output of 'procstat -kka'.

-- 
Mikolaj Golub


Re: Any success stories for HAST + ZFS?

2011-03-28 Thread Mikolaj Golub

On Mon, 28 Mar 2011 10:47:22 +0100 Pete French wrote:

  It is not a hastd crash, but a kernel crash triggered by hastd process.
 
  I am not sure I got the same crash as you but apparently the race is
  possible in g_gate on device creation.
 
  I got the following crash starting many hast providers simultaneously:

 PF This is very interesting to me - my successful ZFS+HAST only had
 PF a single drive, but in my new setup I am intending to use two
 PF HAST processes and then mirror across them under ZFS, so I am
 PF likely to hit this bug. Are the processes stable once launched ?

Yes, you can hit it only when HAST devices are created. The workaround is to
avoid using 'hastctl role primary all' and start the providers one by one
instead.

-- 
Mikolaj Golub
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Any success stories for HAST + ZFS?

2011-03-27 Thread Mikolaj Golub

On Sat, 26 Mar 2011 10:52:08 -0700 Freddie Cash wrote:

 FC hastd backtrace is here:
 FC http://www.sd73.bc.ca/downloads/crash/hast-backtrace.png

It is not a hastd crash, but a kernel crash triggered by the hastd process.

I am not sure I got the same crash as you, but apparently a race is possible
in g_gate on device creation.

I got the following crash when starting many hast providers simultaneously:

fault virtual address   = 0x0

#8  0xc0c11adc in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#9  0xc086ac6b in g_gate_ioctl (dev=0xc6a24300, cmd=3374345472,
    addr=0xc9fec000 "\002", flags=3, td=0xc7ff0b80)
    at /usr/src/sys/geom/gate/g_gate.c:410
#10 0xc0853c5b in devfs_ioctl_f (fp=0xc9b9e310, com=3374345472,
    data=0xc9fec000, cred=0xc8c9c200, td=0xc7ff0b80)
    at /usr/src/sys/fs/devfs/devfs_vnops.c:678
#11 0xc09210cd in kern_ioctl (td=0xc7ff0b80, fd=3, com=3374345472,
    data=0xc9fec000 "\002") at file.h:262
#12 0xc0921254 in ioctl (td=0xc7ff0b80, uap=0xf5edbcec)
    at /usr/src/sys/kern/sys_generic.c:679
#13 0xc0916616 in syscallenter (td=0xc7ff0b80, sa=0xf5edbce4)
    at /usr/src/sys/kern/subr_trap.c:315
#14 0xc0c2b9ff in syscall (frame=0xf5edbd28)
    at /usr/src/sys/i386/i386/trap.c:1086
#15 0xc0c11b71 in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:266

Or just creating many ggate devices simultaneously:

for i in `jot 100`; do
    ./ggiocreate $i
done

ggiocreate.c is attached.

In my case the kernel crashes in g_gate_create() when checking for name
collisions in strcmp():

	/* Check for name collision. */
	for (unit = 0; unit < g_gate_maxunits; unit++) {
		if (g_gate_units[unit] == NULL)
			continue;
		if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0)
			continue;
		mtx_unlock(&g_gate_units_lock);
		mtx_destroy(&sc->sc_queue_mtx);
		free(sc, M_GATE);
		return (EEXIST);
	}

I think the issue is the following. When preparing sc we take
g_gate_units_lock, check for name collisions, fill in the sc fields except
sc->sc_provider, and register sc in g_gate_units[unit]. sc_provider is filled
in later, after g_gate_units_lock has been released. So the following scenario
is possible:

1) Thread A registers sc in g_gate_units[unit] with
g_gate_units[unit]->sc_provider still NULL and releases g_gate_units_lock.

2) Thread B traverses g_gate_units[] when checking for name collisions and
crashes accessing g_gate_units[unit]->sc_provider->name.

The attached patch fixes the issue in my case.
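
The idea behind the fix can be illustrated outside the kernel. What follows is
a minimal userspace sketch, not the committed patch: it assumes POSIX mutexes
in place of the kernel mutex API, and struct provider, struct softc, units[],
units_lock and gate_create() are hypothetical stand-ins for the g_gate
structures. Unit selection is simplified to "first free slot". The point it
shows is that the name used for collision checks is set before the softc is
published, so a concurrent scanner never has to dereference sc_provider.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define	MAXUNITS	128
#define	NAMELEN		32

/* Hypothetical stand-ins for struct g_provider and struct g_gate_softc. */
struct provider {
	char	name[NAMELEN];
};

struct softc {
	const char	*sc_name;	/* never NULL once the softc is published */
	int		 sc_unit;
	struct provider	*sc_provider;	/* NULL until stage 2 completes */
};

static struct softc *units[MAXUNITS];
static pthread_mutex_t units_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 on success, or EEXIST, ENOSPC or ENOMEM on failure. */
static int
gate_create(const char *reqname)
{
	char name[NAMELEN];
	struct softc *sc;
	struct provider *pp;
	int unit, freeunit = -1;

	snprintf(name, sizeof(name), "%s", reqname);
	sc = calloc(1, sizeof(*sc));
	if (sc == NULL)
		return (ENOMEM);

	/* Stage 1: check for a collision and publish, all under the lock. */
	pthread_mutex_lock(&units_lock);
	for (unit = 0; unit < MAXUNITS; unit++) {
		if (units[unit] == NULL) {
			if (freeunit == -1)
				freeunit = unit;
			continue;
		}
		/* Compare sc_name; sc_provider may still be NULL here. */
		if (strcmp(name, units[unit]->sc_name) != 0)
			continue;
		pthread_mutex_unlock(&units_lock);
		free(sc);
		return (EEXIST);
	}
	if (freeunit == -1) {
		pthread_mutex_unlock(&units_lock);
		free(sc);
		return (ENOSPC);
	}
	sc->sc_name = name;		/* temporary: points at our stack buffer */
	sc->sc_unit = freeunit;
	units[freeunit] = sc;
	pthread_mutex_unlock(&units_lock);

	/* Stage 2: create the provider without holding the lock. */
	pp = calloc(1, sizeof(*pp));
	if (pp == NULL) {
		pthread_mutex_lock(&units_lock);
		units[sc->sc_unit] = NULL;
		pthread_mutex_unlock(&units_lock);
		free(sc);
		return (ENOMEM);
	}
	snprintf(pp->name, sizeof(pp->name), "%s", name);
	sc->sc_provider = pp;

	/* Repoint sc_name at storage that outlives this function. */
	pthread_mutex_lock(&units_lock);
	assert(units[sc->sc_unit] == sc);
	sc->sc_name = sc->sc_provider->name;
	pthread_mutex_unlock(&units_lock);
	return (0);
}
```

Calling gate_create("ggate0") twice should return 0 and then EEXIST. sc_name is
repointed under the lock because, between the two stages, it refers to a
buffer on the creating thread's stack, and other threads read it only while
holding units_lock.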

-- 
Mikolaj Golub



ggiocreate.c
Description: Binary data
Index: sys/geom/gate/g_gate.c
===================================================================
--- sys/geom/gate/g_gate.c	(revision 220050)
+++ sys/geom/gate/g_gate.c	(working copy)
@@ -407,13 +407,14 @@ g_gate_create(struct g_gate_ctl_create *ggio)
 	for (unit = 0; unit < g_gate_maxunits; unit++) {
 		if (g_gate_units[unit] == NULL)
 			continue;
-		if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0)
+		if (strcmp(name, g_gate_units[unit]->sc_name) != 0)
 			continue;
 		mtx_unlock(&g_gate_units_lock);
 		mtx_destroy(&sc->sc_queue_mtx);
 		free(sc, M_GATE);
 		return (EEXIST);
 	}
+	sc->sc_name = name;
 	g_gate_units[sc->sc_unit] = sc;
 	g_gate_nunits++;
 	mtx_unlock(&g_gate_units_lock);
@@ -432,6 +433,9 @@ g_gate_create(struct g_gate_ctl_create *ggio)
 	sc->sc_provider = pp;
 	g_error_provider(pp, 0);
 	g_topology_unlock();
+	mtx_lock(&g_gate_units_lock);
+	sc->sc_name = sc->sc_provider->name;
+	mtx_unlock(&g_gate_units_lock);
 
 	if (sc->sc_timeout > 0) {
 		callout_reset(&sc->sc_callout, sc->sc_timeout * hz,
Index: sys/geom/gate/g_gate.h
===================================================================
--- sys/geom/gate/g_gate.h	(revision 220050)
+++ sys/geom/gate/g_gate.h	(working copy)
@@ -76,6 +76,7 @@
  * 'P:' means 'Protected by'.
  */
 struct g_gate_softc {
+	char			*sc_name;		/* P: (read-only) */
 	int			 sc_unit;		/* P: (read-only) */
 	int			 sc_ref;		/* P: g_gate_list_mtx */
 	struct g_provider	*sc_provider;		/* P: (read-only) */
@@ -96,7 +97,6 @@ struct g_gate_softc {
 	LIST_ENTRY(g_gate_softc) sc_next;		/* P: g_gate_list_mtx */
 	char			 sc_info[G_GATE_INFOSIZE]; /* P: (read-only) */
 };
-#define	sc_name	sc_provider->geom->name
 
 #define	G_GATE_DEBUG(lvl, ...)	do {	\
 	if (g_gate_debug >= (lvl)) {	\

Re: Any success stories for HAST + ZFS?

2011-03-27 Thread Mikolaj Golub

On Sun, 27 Mar 2011 15:16:15 +0300 Mikolaj Golub wrote to Freddie Cash:

 MG The attached patch fixes the issue in my case.

The patch is committed to current.

-- 
Mikolaj Golub


Re: net.inet.tcp.timer_race: does anyone have a non-zero value?

2010-03-07 Thread Mikolaj Golub
On Sun, 7 Mar 2010 11:59:35 +0000 (GMT) Robert Watson wrote:

 Please check the results of the following command:

   % sysctl net.inet.tcp.timer_race
   net.inet.tcp.timer_race: 0

Are results from FreeBSD 7 interesting to you? Currently we have mostly
FreeBSD 7.1 hosts in production, and I observe nonzero values on 8 of them
(about 15%). I can send you more details privately if you are interested.

-- 
Mikolaj Golub