panic in 7.2 (ffs_alloc.c?)

2009-11-21 Thread Charles Sprickman

Howdy,

I'm no expert at getting info out of a dump, but I'll do my best to 
provide some information.


This is a Dell PE2970 w/PERC6/i RAID running FreeBSD 7.2/amd64.  Brand new 
box, has been doing very light work for about two weeks.  Last night I 
started a very long mstone run on a jailed mail server and found that 
quite a way into this burn-in, the box panicked.  I was going to put it in 
service Monday (after punishing it all weekend).  Looking for some input 
on what the root cause is and whether going to a -stable snapshot might be 
worthwhile.


I can tell you there was a good deal of disk activity at the time in the 
jail - mstone was simulating 100 POP and SMTP clients hitting the machine 
at once.  This is qmail+courier.  So messages are coming in, hitting the 
queue, hitting a user's maildir, getting read and deleted via the POP 
"client" over and over again.  I do see lots of "ffs_*" stuff in the 
backtrace, which is a little scary.


Here's my stab at a kgdb session (also @ pastie for easier reading: 
http://pastie.org/709671):


[r...@bigmail /usr/obj/usr/src/sys/BWAY7-64]# kgdb kernel.debug 
/var/crash/vmcore.0

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.

This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x12d4b9f5c
fault code  = supervisor read data, page not present
instruction pointer = 0x8:0x8050382e
stack pointer   = 0x10:0x281a75b0
frame pointer   = 0x10:0xff000455f800
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 6324 (vdelivermail)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 12d0h32m3s
Physical memory: 6130 MB
Dumping 725 MB: 710 694 678 662 646 630 614 598 582 566 550 534 518 502 
486 470 454 438 422 406 390 374 358 342 326 310 294 278 262 246 230 214 
198 182 166 150 134 118 102 86 70 54 38 22 6


Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/nullfs.ko
Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/fdescfs.ko
#0  doadump () at pcpu.h:195
195 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
#3  0x8034cba2 in panic (fmt=0x104) at /usr/src/sys/kern/kern_shutdown.c:574
#4  0x80574823 in trap_fatal (frame=0xff00046c8000, eva=Variable "eva" is not available.)
    at /usr/src/sys/amd64/amd64/trap.c:757
#5  0x80574bf5 in trap_pfault (frame=0x281a7500, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:673
#6  0x80575534 in trap (frame=0x281a7500) at /usr/src/sys/amd64/amd64/trap.c:444
#7  0x8055969e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:209
#8  0x8050382e in ffs_realloccg (ip=0xff00267f75c0, lbprev=0,
    bprev=6288224785898156086, bpref=593305256, osize=0, nsize=2048,
    flags=33619968, cred=0xff00927fe800, bpp=0x281a7800)
    at /usr/src/sys/ufs/ffs/ffs_alloc.c:1349
#9  0x80506e8e in ffs_balloc_ufs2 (vp=0xff0027a64dc8, startoffset=Variable "startoffset" is not available.)
    at /usr/src/sys/ufs/ffs/ffs_balloc.c:692
#10 0x805223e5 in ffs_write (ap=0x281a7a10) at /usr/src/sys/ufs/ffs/ffs_vnops.c:724
#11 0x805a0645 in VOP_WRITE_APV (vop=0x80793d20, a=0x281a7a10) at vnode_if.c:691
#12 0x803dd731 in vn_write (fp=0xff001027cd00, uio=0x281a7b00,
    active_cred=Variable "active_cred" is not available.) at vnode_if.h:373
#13 0x80388768 in dofilewrite (td=0xff00046c8000, fd=5, fp=0xff001027cd00,
    auio=dwarf2_read_address: Corrupted DWARF expression.) at file.h:257
#14 0x80388a6e in kern_writev (td=0xff00046c8000, fd=5, auio=0x281a7b00)
    at /usr/src/sys/kern/sys_generic.c:402
#15 0x80388aec in write (td=0x800, uap=0x12d4b9f50)
    at /usr/src/sys/kern/sys_generic.c:318
#16 0x80596a66 in ia32_syscall (frame=0x281a7c80)
    at /usr/src/sys/amd64/ia32/ia32_syscall.c:182
#17 0x80559ad0 in Xint0x80_syscall () at ia32_exception.S:65
#18 0x28167928 in ?? ()
Previous frame inner to this frame (corrupt stack?)

Full dmesg, verbose boot and kernel config at pastie as well.  Actually no 
verbose boot...  I rebooted the box afte

Re: 7.2 dies in zfs

2009-11-21 Thread Adam McDougall
On Sat, Nov 21, 2009 at 11:36:43AM -0800, Jeremy Chadwick wrote:

  
  On Sat, Nov 21, 2009 at 08:07:40PM +0100, Johan Hendriks wrote:
  > Randy Bush  wrote:
  > > imiho, zfs can not be called production ready if it crashes if you
  > > do not stand on your left leg, put your right hand in the air, and
  > > burn some eye of newt.
  > 
  > This is not a rant, but where do you read that on FreeBSD 7.2 ZFS has
  > been marked as production ready.
  > As far as i know, on FreeBSD 8.0 ZFS is called production ready.
  > 
  > If you boot your system it probably tell you it is still experimental.
  > 
  > Try running FreeBSD 7-Stable to get the latest ZFS version which on
  > FreeBSD is 13
  > On 7.2 it is still at 6 (if I remember it right).
  
  RELENG_7 uses ZFS v13, RELENG_8 uses ZFS v18.
  
  RELENG_7 and RELENG_8 both, more or less, behave the same way with
  regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
  question as far as what's needed to stabilise ZFS on either 7.x or 8.x.

I have a stable public ftp/http/rsync/cvsupd mirror that runs ZFS v13.
It has been stable since mid-May.  I have not had a kmem panic on any
of my ZFS systems for a long time; it's a matter of making sure there is
enough kmem at boot (not depending on kmem_size_max) and that it is big enough
that fragmentation does not cause a premature allocation failure due to the
lack of a large-enough contiguous chunk.  This requires the platform to support
a kmem size that is "big enough"... i386 can barely muster 1.6G and sometimes
that might not be enough.  I'm pretty sure all of my currently existing ZFS
systems are amd64, where the kmem can now be huge.  On the busy fileserver with
20 gigs of RAM running FreeBSD 8.0-RC2 #21: Tue Oct 27 21:45:41 EDT 2009,
I currently have:
vfs.zfs.arc_max=16384M
vfs.zfs.arc_min=4096M
vm.kmem_size=18G
The arc settings here are to try to encourage it to favor the arc cache
instead of whatever else Inactive memory in 'top' contains.

On other systems that are hit less hard, I simply set:
vm.kmem_size="20G"
I even do this on systems with much less RAM (one is an amd64 with only 8G);
it doesn't seem to matter, and it works.  Most of my ZFS systems
are 7.2-stable, some are 8.0-something.  Anything with v13 is much better
than v6, but 8.0 has additional fixes that have not been backported to 7 yet.
I don't consider the additional fixes in 8 required for my uses yet, although
I'm planning on moving forward eventually.  I would consider 2G kmem a realistic
minimum on a system that will see some serious disk IO (regardless of how much
RAM the system actually contains, as long as the kmem size can be set that big
without the system blowing chunks).  Hope this personal experience helps.
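
Collected into /boot/loader.conf form, the two configurations described above
would look roughly like this (the values are the ones quoted in this message;
treat them as one person's working setup, not a general recommendation):

```
# Busy fileserver: amd64, 20G RAM, 8.0-RC2
vfs.zfs.arc_max="16384M"
vfs.zfs.arc_min="4096M"
vm.kmem_size="18G"

# Lighter amd64 systems (reportedly fine even with only 8G RAM)
vm.kmem_size="20G"
```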

  
  The people who need to answer the question are those who are familiar
  with the code.  Specifically: Kip Macy, Pawel Jakub Dawidek, and anyone
  else who knows the internals.  Everyone else in the user community is
  simply guessing + going crazy trying to figure out a solution.
  
  As much as I appreciate all the work that has been done to bring ZFS to
  FreeBSD -- and I do mean that! -- we need answers at this point.
  
  -- 
  | Jeremy Chadwick   j...@parodius.com |
  | Parodius Networking   http://www.parodius.com/ |
  | UNIX Systems Administrator  Mountain View, CA, USA |
  | Making life hard for others since 1977.  PGP: 4BD6C0CB |
  ___
  freebsd-stable@freebsd.org mailing list
  http://lists.freebsd.org/mailman/listinfo/freebsd-stable
  To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
  


Re: 8.0-RC USB/FS problem

2009-11-21 Thread Guojun Jin
Tried on the USB hard drive:

Deleted slice 3 and recreated slice 3 with two partitions, s3d and s3e.
I was happy because dump/restore succeeded on s3d, and I thought it was just a
partition-format issue; but the system crashed during dump/restore on s3e, and
the partition lost its file system type.

wolf# mount /dev/da0s3e /mnt
WARNING: /mnt was not properly dismounted
/mnt: mount pending error: blocks 35968 files 0
wolf# fsck da0s3e
fsck: Could not determine filesystem type
wolf# bsdlabel da0s3
# /dev/da0s3:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  c: 175735035        0    unused        0     0         # "raw" part, don't edit
  d:  18874368        0    4.2BSD        0     0     0
  e: 156860667 18874368    4.2BSD        0     0     0

Therefore, I tried running fsck_ufs directly on both the USB hard drive and the
USB stick to get the file systems cleaned up.  All data is back now.

The machine has run FreeBSD 6.1 all the way through 7.2 without such problems.
How can we determine what went wrong in 8.0: the FS or USB?

By the way, IDE-to-IDE dump/restore does not seem to have this problem at this
point, although one of the IDE drives experienced a partition-recognition
problem, which went away after deleting and recreating the slices.



8.0-RC USB problem -- how to recover a damaged USB stick

2009-11-21 Thread Guojun Jin
It seems this is a more serious problem in 8.0, and I hope it can be resolved
before a formal release.  I can help diagnose this if people need more
information (this is destructive).

I picked a USB stick (DataTraveler 2GB) that has two partitions: s0 for
DOS and s1 for FreeBSD.
Both the USB hard drive and the USB stick have worked under FreeBSD 6.2, 6.3,
6.4 and 7.2 for a few years without any problem.

I plugged the USB stick into 8.0-RC and mounted it on /mnt; then I untarred a
file (tarred one day ago from a FreeBSD 6.3 machine onto the stick) to an IDE
drive, then tarred it back to the USB stick.
During the tar write from IDE to the USB stick, I did "ls /mnt"; "tar" paused
and "ls" hung.
A couple of minutes later, ls came back, but tar still paused.

I hit ^C on the tar process around 14:30; it took another few minutes to stop
the process.  I tried tar again, and the system refused to write to the USB
stick.  "ls" showed all the files still there (probably cached inodes).

I went out for a few hours and came back to find /var/log/messages flooded
with the following:
-rw-r--r--  1 root  wheel  167181 Nov 21 19:02 messages
-rw-r--r--  1 root  wheel7390 Nov 21 18:00 messages.0.bz2
-rw-r--r--  1 root  wheel7509 Nov 21 17:00 messages.1.bz2
-rw-r--r--  1 root  wheel9365 Nov 21 16:00 messages.2.bz2
-rw-r--r--  1 root  wheel   20598 Nov 21 15:00 messages.3.bz2

Nov 21 18:00:00 wolf newsyslog[2635]: logfile turned over due to size>384K
Nov 21 18:00:27 wolf kernel: g_vfs_done():da0s2[WRITE(offset=625688576, length=131072)]error = 5
Nov 21 18:00:27 wolf kernel: g_vfs_done():da0s2[WRITE(offset=625819648, length=131072)]error = 5
.
Nov 21 18:19:03 wolf kernel: g_vfs_done():da0s2[WRITE(offset=524451840, length=16384)]error = 5
Nov 21 18:19:33 wolf kernel: g_vfs_done():da0s2[WRITE(offset=5586944, length=2048)]error = 5
Nov 21 18:19:33 wolf kernel: g_vfs_done():da0s2[WRITE(offset=65536, length=2048)]error = 5
Nov 21 18:19:33 wolf kernel: g_vfs_done():da0s2[WRITE(offset=114688, length=16384)]error = 5
Nov 21 18:20:05 wolf kernel: g_vfs_done():da0s2[WRITE(offset=349700096, length=1

I had to reboot the system, and the reboot was not able to umount everything
(boot-up messages):

Nov 21 18:24:03 wolf kernel: da0:  Removable Direct Access SCSI-2 device
Nov 21 18:24:03 wolf kernel: da0: 40.000MB/s transfers
Nov 21 18:24:03 wolf kernel: da0: 1947MB (3987456 512 byte sectors: 255H 63S/T 248C)
Nov 21 18:24:03 wolf kernel: WARNING: / was not properly dismounted
Nov 21 18:24:03 wolf kernel: WARNING: /data was not properly dismounted
Nov 21 18:24:03 wolf kernel: WARNING: /home was not properly dismounted
Nov 21 18:24:03 wolf kernel: WARNING: /tmp was not properly dismounted
Nov 21 18:24:03 wolf kernel: WARNING: /usr was not properly dismounted
...

# mount /dev/da0s2 /mnt
mount: /dev/da0s2 : Operation not permitted

The USB stick cannot be mounted under any FreeBSD OS now, and everything on
the drive has been lost.

Does anyone know if it is possible to recover such a damaged USB stick?

-Original Message-
From: Hans Petter Selasky [mailto:hsela...@c2i.net]
Sent: Wed 11/18/2009 3:13 AM
To: freebsd-...@freebsd.org
Cc: Guojun Jin; freebsd-stable@freebsd.org; questi...@freebsd.org
Subject: Re: 8.0-RC3 USB lock up on mounting two partitions from one USB drive
 
Hi,

I'm not sure if this is an USB issue or not. If you get READ/WRITE errors and 
the drive simply dies then it might be the case. Else it is a system issue.

There are quirks for mass storage which you can add to 
sys/dev/usb/storage/umass.c .

--HPS

On Wednesday 18 November 2009 08:33:07 Guojun Jin wrote:
> Did newfs on those partitions and it made things worse -- restore completely
> fails.  (I had experienced another similar problem on an IDE drive, which
> works well for 6.4 and 7.2, but not 8.0.)  This drive works fine under FreeBSD 6.4.
>
> Is something new in 8.0 making disk partition schema changed?
>
> g_vfs_done():da0s3d[READ(offset=98304, length=16384)]error = 6
> g_vfs_done():da0s3d[WRITE(offset=192806912, length=16384)]error = 6
> fopen: Device not configured
> cannot create save file ./restoresymtable for symbol table
> abort? [yn] (da0:umass-sim0:0:0:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
> (da0:umass-sim0:0:0:0): removing device entry
> ugen1.2:  at usbus1
> umass0:  on usbus1
> umass0:  SCSI over Bulk-Only; quirks = 0x
> umass0:0:0:-1: Attached to scbus0
> da0 at umass-sim0 bus 0 target 0 lun 0
> da0:  Fixed Direct Access SCSI-0 device
> da0: 40.000MB/s transfers
> da0: 114473MB (234441648 512 byte sectors: 255H 63S/T 14593C)
> Device da0s3d went missing before all of the data could be written to it;
> expect data loss.
>
> 99  23:19   sysinstall
>100  23:20   newfs /dev/da0s3d
>101  23:20   newfs /dev/da0s3e
>102  23:21   mount /dev/da0s3d /mnt
>103  23:21   cd /mnt
>104  23:21   dump -0f - /home | restore -rf -
>105  23:27   history 15
>
>
>
> -Original Message-
> From: Guojun Jin
> Sent: Tue 11/17/2009 11:0

Pulse Meter with Cellular communication

2009-11-21 Thread Exemys



MFC of r198284 to 7-STABLE

2009-11-21 Thread Oliver Pinter
commit 4a6ea694eaad85c9ff99668ba7427c00cea3e990
Author: kib 
Date:   Tue Oct 20 13:34:41 2009 +

MFC r197934:
Map PIE binaries at non-zero base address.

MFC r198202:
Honour non-zero mapbase for PIE binaries. Inform interpreter-less PIE
binary about its relocbase.

Approved by:re (kensmith)


git-svn-id: svn://svn.freebsd.org/base/stable/8...@198284 ccf9f872-aa2e-dd11-9fc8-001c23d0


Re: 7.2 dies in zfs

2009-11-21 Thread Jeremy Chadwick
On Sat, Nov 21, 2009 at 04:29:26PM -0800, Jeremy Chadwick wrote:
> On Sat, Nov 21, 2009 at 01:59:11PM -0600, Scot Hetzel wrote:
> > > RELENG_7 and RELENG_8 both, more or less, behave the same way with
> > > regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
> > > question as far as what's needed to stabilise ZFS on either 7.x or 8.x.
> > >
> > Under RELENG_8/i386, you still need to tune ZFS as mentioned in the
> > ZFS Tuning Guide:
> > 
> > http://wiki.freebsd.org/ZFSTuningGuide
> > 
> > With RELENG_8/amd64 no tuning is necessary, if the system has at least 2G 
> > RAM.
> 
> Nope.
> 
> http://lists.freebsd.org/pipermail/freebsd-stable/2009-October/052256.html

I'll expand briefly on this because my post mentioned RELENG_7, and the
"state" of ZFS in RELENG_7 vs. RELENG_8 vs. HEAD is hard to follow
because some of the commits to (what once was) HEAD are actually in
RELENG_8 given when HEAD was tagged as RELENG_8.

There's a particular situation (with patch for RELENG_8) that has been
"making the rounds":

http://lists.freebsd.org/pipermail/freebsd-fs/2009-October/006907.html
http://lists.freebsd.org/pipermail/freebsd-fs/2009-October/006969.html

The discussion is with regards to slow performance as a result of ARC
degrading, except numerous posters (including the OP) mention that their
box also can "just hang".

But this patch seems different than the one which got committed to HEAD
(what is CURRENT today); revision 1.25 --

Commit message:
Prevent paging pressure from draining arc too much
- always drain arc if above arc_c_max - never drain arc if arc is
  below arc_c_max

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c.diff?r1=1.24;r2=1.25;f=h

This commit is not in RELENG_8 nor RELENG_7 (I've confirmed by looking
at sources), and of course the patch is "?!?" given the nature of the
thread.  I've looked at SVN commits to HEAD and Kip has been very, very
busy (even today).  :-)

...but then there's this commit, which happened ~5 months ago, and made
it into HEAD at the time (thus is in RELENG_8; also verified by looking
at source):

Commit message:
Manually export rev 192360 from kmacy

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c.diff?r1=1.19;r2=1.20;f=h

...which I don't understand technically, but it appears to have a direct
effect on ARC limiting.  So, this is getting very hard to track/follow.

Circling back to kmem exhaustion: has there been any official statement
on what's actually causing it?  Is it ARC overuse (and if so how's that
even possible)?  Is it ZIL?  Is it a combination of things?  Is it bugs
in the ZFS port (e.g. Solaris VM vs. FreeBSD VM)?  Is it all of these
things?  And ultimately -- how do we work around it?

With regards to loader.conf tuning, because this comes up often too:

There still has been no official or even semi-official (e.g. Wiki)
explanation as far as what should be tuned, and HOW things should be
tuned.  What are the proper variables to tune this?  Tuning on RELENG_7
vs. RELENG_8 also probably differs at this point in time -- or does it?
The following loader.conf variables are under scrutiny:

vm.kmem_size
vm.kmem_size_max
vfs.zfs.arc_min
vfs.zfs.arc_max
vfs.zfs.prefetch_disable
vfs.zfs.zil_disable
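
For concreteness, all of these knobs live in /boot/loader.conf.  A purely
illustrative block follows; the values are hypothetical placeholders, not a
recommendation (the lack of any authoritative values is exactly the problem
being described):

```
vm.kmem_size="4G"            # kernel memory map size
vm.kmem_size_max="4G"        # pjd@ reportedly says not to touch this one
vfs.zfs.arc_min="512M"       # lower bound on the ARC
vfs.zfs.arc_max="2G"         # upper bound on the ARC
vfs.zfs.prefetch_disable=1   # 0/1: file-level prefetch off
vfs.zfs.zil_disable=1        # 0/1: ZFS intent log off
```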

The number of conflicting details on the mailing lists (freebsd-stable,
freebsd-current, and freebsd-fs) makes it very hard to discern at this
point how one is supposed to tune loader.conf to gain stability.  For
example, I've seen pjd@ mention that one should NOT be touching
vm.kmem_size_max, but rather vm.kmem_size -- which I don't understand
(and I mean that as in "help me understand", not "I'm questioning the
logic"), especially since src/UPDATING states "you probably don't need
to adjust either of these".  This is why we need people who are familiar
with both the ZFS code and the VM to help provide details so that
documentation can be updated (I'm referring to the Wiki).

If we could get something official from people who are "in the know",
that would be awesome.

Or maybe this is the wrong list to be discussing it at all, and
freebsd-fs is?  I don't know any more...

It's almost like we need some kind of "ZFS on FreeBSD" newsletter that's
sent out weekly documenting all of what's getting changed and what it
solves and how it impacts users.  Things are totally chaotic right now.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: 7.2 dies in zfs

2009-11-21 Thread Marco van Tol
On Sat, Nov 21, 2009 at 06:16:04PM +0900, Randy Bush wrote:
> >> imiho, zfs can not be called production ready if it crashes if you
> >> do not stand on your left leg, put your right hand in the air, and
> >> burn some eye of newt.
> > ROFL!
> > As with any open-source project, I suppose it will be ready
> > when it is ready.  At least it hasn't been made the default.
> 
> yep.  i demand a full refund!  :)
> 
> my concern is the innocent admin putting something critical on zfs in
> 8.0 when it is called stable and production.  this isn't linux.

Well, to be honest, running something critical on a release candidate is
interesting enough as it is, right? :-)

Just my $0.02

Marco van Tol

-- 
The first step to better times is to imagine them.
- www.chinese-fortune-cookie.com


Re: 7.2 dies in zfs

2009-11-21 Thread Randy Bush
> Everyone's workloads are different, but the panic is the same every
> time: kmem exhaustion.  i386 with KVA_PAGES or amd64 -- happens on both.
> It's highly dependent upon workload and what the filesystem consists of
> (many files vs. fewer files but larger in size, etc.)

these are measurable.  at least until it crashes.  the suggested memsz
script could, instead of giving what to a naive admin are some cute but
unhelpful numbers, suggest values for loader.conf.local.

randy


Re: 7.2 dies in zfs

2009-11-21 Thread Jeremy Chadwick
On Sat, Nov 21, 2009 at 01:59:11PM -0600, Scot Hetzel wrote:
> On Sat, Nov 21, 2009 at 1:36 PM, Jeremy Chadwick
>  wrote:
> >
> > On Sat, Nov 21, 2009 at 08:07:40PM +0100, Johan Hendriks wrote:
> >> Randy Bush  wrote:
> >> > imiho, zfs can not be called production ready if it crashes if you
> >> > do not stand on your left leg, put your right hand in the air, and
> >> > burn some eye of newt.
> >>
> >> This is not a rant, but where do you read that on FreeBSD 7.2 ZFS has
> >> been marked as production ready.
> >> As far as i know, on FreeBSD 8.0 ZFS is called production ready.
> >>
> >> If you boot your system it probably tell you it is still experimental.
> >>
> >> Try running FreeBSD 7-Stable to get the latest ZFS version which on
> >> FreeBSD is 13
> >> On 7.2 it is still at 6 (if I remember it right).
> >
> > RELENG_7 uses ZFS v13, RELENG_8 uses ZFS v18.
> >
> 
> RELENG_8 is still using ZFS v13.

I meant to type ZFS v13 for RELENG_8.  Fingers focused on 8 for some
reason... Heh.  :-)

I'm not going to go on a rant talking about the recurring scenario that
keeps happening on the mailing lists -- you know, where Person X says
"well, use these loader.conf variables and it's stable", yet Person Y
comes back with evidence that it's NOT stable.

Everyone's workloads are different, but the panic is the same every
time: kmem exhaustion.  i386 with KVA_PAGES or amd64 -- happens on both.
It's highly dependent upon workload and what the filesystem consists of
(many files vs. fewer files but larger in size, etc.)

> > RELENG_7 and RELENG_8 both, more or less, behave the same way with
> > regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
> > question as far as what's needed to stabilise ZFS on either 7.x or 8.x.
> >
> Under RELENG_8/i386, you still need to tune ZFS as mentioned in the
> ZFS Tuning Guide:
> 
> http://wiki.freebsd.org/ZFSTuningGuide
> 
> With RELENG_8/amd64 no tuning is necessary, if the system has at least 2G RAM.

Nope.

http://lists.freebsd.org/pipermail/freebsd-stable/2009-October/052256.html

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: 7.2 dies in zfs

2009-11-21 Thread Randy Bush
> My understanding is that the problem is more that the FreeBSD VM
> system doesn't gracefully handle running low or out of memory.

to me, that's just life in the big city.  the problem i think can be
solved before this is let loose on the unsuspecting public is that there
are really no good tools for tuning.  and the wiki page does not cut it.

there is not even a table of ram size vs load and good starting parms
for the intersection.  or "just don't risk real data with less than 2g
of ram."

randy


Re: 7.2 dies in zfs

2009-11-21 Thread pluknet
2009/11/21 Peter Jeremy :
> On 2009-Nov-21 09:47:56 +0900, Randy Bush  wrote:
>>imiho, zfs can not be called production ready if it crashes if you do
>>not stand on your left leg, put your right hand in the air, and burn
>>some eye of newt.
>
> FWIW, it's still very brittle on Solaris 10 and the Sun Support
> response to most issues is "restore from backup".  IMHO, the
> biggest issue with ZFS itself is the lack of recovery tools prior to PSARC
> 2009/479 (in ZFS v21).
>
> On 2009-Nov-21 11:36:43 -0800, Jeremy Chadwick  
> wrote:
>>RELENG_7 uses ZFS v13, RELENG_8 uses ZFS v18.
>
> Not in my repository.  I still have v13 in
> sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h in last night's
> RELENG_7, RELENG_8 and -current.

The good side of things is that there's ongoing work on v13 -> v22
in perforce.

>
>>RELENG_7 and RELENG_8 both, more or less, behave the same way with
>>regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
>>question as far as what's needed to stabilise ZFS on either 7.x or 8.x.
>
> My understanding is that the problem is more that the FreeBSD VM
> system doesn't gracefully handle running low or out of memory.

AFAIU kmacy is working on ZFS integration into the FreeBSD'ish buf/vm.
It'd be nice to read something on that..

-- 
wbr,
pluknet


Re: 7.2 dies in zfs

2009-11-21 Thread Randy Bush
>> imiho, zfs can not be called production ready if it crashes if you
>> do not stand on your left leg, put your right hand in the air, and
>> burn some eye of newt.
> This is not a rant, but where do you read that on FreeBSD 7.2 ZFS has
> been marked as production ready.
> As far as i know, on FreeBSD 8.0 ZFS is called production ready.

whoops!  you are correct.  my apologies.

> Try running FreeBSD 7-Stable to get the latest ZFS version which on
> FreeBSD is 13

that is what i am running.  RELENG_7

randy


Re: 7.2 dies in zfs

2009-11-21 Thread Randy Bush
>> imiho, zfs can not be called production ready if it crashes if you
>> do not stand on your left leg, put your right hand in the air, and
>> burn some eye of newt.
> ROFL!
> As with any open-source project, I suppose it will be ready
> when it is ready.  At least it hasn't been made the default.

yep.  i demand a full refund!  :)

my concern is the innocent admin putting something critical on zfs in
8.0 when it is called stable and production.  this isn't linux.

randy


Re: 7.2 dies in zfs

2009-11-21 Thread Peter Jeremy
On 2009-Nov-21 09:47:56 +0900, Randy Bush  wrote:
>imiho, zfs can not be called production ready if it crashes if you do
>not stand on your left leg, put your right hand in the air, and burn
>some eye of newt.

FWIW, it's still very brittle on Solaris 10, and the Sun Support
response to most issues is "restore from backup".  IMHO, the
biggest issue with ZFS itself is the lack of recovery tools prior to
PSARC 2009/479 (in ZFS v21).

On 2009-Nov-21 11:36:43 -0800, Jeremy Chadwick  wrote:
>RELENG_7 uses ZFS v13, RELENG_8 uses ZFS v18.

Not in my repository.  I still have v13 in
sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h in last night's
RELENG_7, RELENG_8 and -current.

>RELENG_7 and RELENG_8 both, more or less, behave the same way with
>regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
>question as far as what's needed to stabilise ZFS on either 7.x or 8.x.

My understanding is that the problem is more that the FreeBSD VM
system doesn't gracefully handle running low or out of memory.

-- 
Peter Jeremy




Re: what's best practice for ZFS on a whole disc these days?

2009-11-21 Thread Daniel O'Connor
On Sat, 21 Nov 2009, Marius Nünnerich wrote:
> >> Maybe. Could you paste kern.geom.confdot, kern.geom.confxml and
> >> mount output to pastie.org or the like.
> >
> > I've attached it..
>
> Hmm, I do not see what's wrong here. ZFS has already opened the
> devices, and I see no /dev/gpt/* entries. Maybe there is some bug in
> handling the long gptid names. You could try detaching ZFS from the
> devices, giving everything a short GPT label, and using that.

I'd really prefer to use the UUID, which _does_ exist; ZFS just doesn't
like it.

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C




Re: 7.2 dies in zfs

2009-11-21 Thread Scot Hetzel
On Sat, Nov 21, 2009 at 1:36 PM, Jeremy Chadwick
 wrote:
>
> On Sat, Nov 21, 2009 at 08:07:40PM +0100, Johan Hendriks wrote:
>> Randy Bush  wrote:
>> > imiho, zfs can not be called production ready if it crashes if you
>> > do not stand on your left leg, put your right hand in the air, and
>> > burn some eye of newt.
>>
>> This is not a rant, but where do you read that on FreeBSD 7.2 ZFS has
>> been marked as production ready?
>> As far as I know, on FreeBSD 8.0 ZFS is called production ready.
>>
>> If you boot your system, it probably tells you it is still experimental.
>>
>> Try running FreeBSD 7-Stable to get the latest ZFS version, which on
>> FreeBSD is 13.
>> On 7.2 it is still at v6 (if I remember right).
>
> RELENG_7 uses ZFS v13, RELENG_8 uses ZFS v18.
>

RELENG_8 is still using ZFS v13.

> RELENG_7 and RELENG_8 both, more or less, behave the same way with
> regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
> question as far as what's needed to stabilise ZFS on either 7.x or 8.x.
>
Under RELENG_8/i386, you still need to tune ZFS as mentioned in the
ZFS Tuning Guide:

http://wiki.freebsd.org/ZFSTuningGuide

With RELENG_8/amd64, no tuning is necessary if the system has at least 2 GB of RAM.
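For the archives, the kind of i386 tuning the wiki describes goes in /boot/loader.conf. The values below are illustrative examples only, not recommendations; consult the Tuning Guide for numbers appropriate to your RAM and workload:

```
# /boot/loader.conf -- illustrative i386 ZFS tuning (example values only)
vm.kmem_size="512M"            # enlarge the kernel memory arena
vm.kmem_size_max="512M"
vfs.zfs.arc_max="128M"         # cap the ARC so it cannot exhaust kmem
vfs.zfs.vdev.cache.size="5M"
```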

Scot


Re: 7.2 dies in zfs

2009-11-21 Thread Jeremy Chadwick

On Sat, Nov 21, 2009 at 08:07:40PM +0100, Johan Hendriks wrote:
> Randy Bush  wrote:
> > imiho, zfs can not be called production ready if it crashes if you
> > do not stand on your left leg, put your right hand in the air, and
> > burn some eye of newt.
> 
> This is not a rant, but where do you read that on FreeBSD 7.2 ZFS has
> been marked as production ready?
> As far as I know, on FreeBSD 8.0 ZFS is called production ready.
> 
> If you boot your system, it probably tells you it is still experimental.
> 
> Try running FreeBSD 7-Stable to get the latest ZFS version, which on
> FreeBSD is 13.
> On 7.2 it is still at v6 (if I remember right).

RELENG_7 uses ZFS v13, RELENG_8 uses ZFS v18.

RELENG_7 and RELENG_8 both, more or less, behave the same way with
regards to ZFS.  Both panic on kmem exhaustion.  No one has answered my
question as far as what's needed to stabilise ZFS on either 7.x or 8.x.

The people who need to answer the question are those who are familiar
with the code.  Specifically: Kip Macy, Pawel Jakub Dawidek, and anyone
else who knows the internals.  Everyone else in the user community is
simply guessing + going crazy trying to figure out a solution.

As much as I appreciate all the work that has been done to bring ZFS to
FreeBSD -- and I do mean that! -- we need answers at this point.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


RE: 7.2 dies in zfs

2009-11-21 Thread Johan Hendriks
Randy Bush  wrote:
> imiho, zfs can not be called production ready if it crashes if you
> do not stand on your left leg, put your right hand in the air, and
> burn some eye of newt.

This is not a rant, but where do you read that on FreeBSD 7.2 ZFS has
been marked as production ready?
As far as I know, on FreeBSD 8.0 ZFS is called production ready.

If you boot your system, it probably tells you it is still experimental.

Try running FreeBSD 7-Stable to get the latest ZFS version, which on
FreeBSD is 13.
On 7.2 it is still at v6 (if I remember right).

Regards,
Johan Hendriks
 


Re: what's best practice for ZFS on a whole disc these days?

2009-11-21 Thread Marius Nünnerich
On Fri, Nov 20, 2009 at 23:20, Daniel O'Connor  wrote:
> On Sat, 21 Nov 2009, Marius Nünnerich wrote:
>> On Fri, Nov 20, 2009 at 14:27, Daniel O'Connor 
> wrote:
>> > On Fri, 20 Nov 2009, Marius Nünnerich wrote:
>> >> > Actually that is an interesting point, the swap partitions don't
>> >> > have an entry in /dev/gptid, although perhaps that is because
>> >> > glabel has grabbed that node.
>> >>
>> >> If I remember correctly nodes vanish when another name for the
>> >> same device is opened. Maybe this happens for the other gpt labels
>> >> too?
>> >
>> > Hmm, but I have gptid ones corresponding to my ZFS partitions.. It
>> > seems like a bug.
>>
>> Maybe. Could you paste kern.geom.confdot, kern.geom.confxml and mount
>> output to pastie.org or the like.
>
> I've attached it..

Hmm, I do not see what's wrong here. ZFS has already opened the devices,
and I see no /dev/gpt/* entries. Maybe there is some bug in handling the
long gptid names. You could try detaching ZFS from the devices, giving
everything a short GPT label, and using that.
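For anyone following along, the suggestion above might look roughly like this; the pool name (tank), device (ada0), and partition index are hypothetical, so adapt them to your own layout before trying anything:

```
# export the pool so ZFS releases the providers
zpool export tank

# give the partition a short GPT label (device and index are examples)
gpart modify -i 2 -l disk0 ada0

# re-import the pool using the stable /dev/gpt/* name
zpool import -d /dev/gpt tank
```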


Re: 7.2 dies in zfs

2009-11-21 Thread perryh
Randy Bush  wrote:
> imiho, zfs can not be called production ready if it crashes if you
> do not stand on your left leg, put your right hand in the air, and
> burn some eye of newt.

ROFL!

As with any open-source project, I suppose it will be ready
when it is ready.  At least it hasn't been made the default.