Re: How to investigate ENOBUFS?

2011-04-18 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

The filesystem will never recover. As I understand, this happens because
some socket has pending data that is never read. Do I have a way to
discover which socket is holding buffer space? And is there a way to see how
much buffer space is allocated and how much remains, by the way?

netstat -m will show you how many buffers are allocated.
vmstat -m will show you the parameters of the mbpool_cache (mbpl).

A kernel with option MBUFTRACE lets netstat -mssv report finer
details on mbuf usage. Unfortunately not on a per-socket level.

Each socket will allocate net.inet.{udp,tcp}.{sendspace,recvspace} unless
the application asks for a specific size. In -current we have buffer
autosizing specified by net.inet.{udp,tcp}.{sendbuf,recvbuf}_inc and
net.inet.{udp,tcp}.{sendbuf,recvbuf}_max. But I don't know a way to
query the buffer size from the outside short of gdb.

ENOBUFS is also reported when the send buffer of a (SOCK_DGRAM only?)
socket is full. netstat shows Send-Q and Recv-Q: not the buffer sizes,
but the corresponding buffer usage, which gives a hint as to which socket
is sending.

-- 
Michael van Elst
Internet: mlel...@serpens.de
A potential Snark may lurk in every tree.


Re: ffs snapshots patch

2011-04-18 Thread Juergen Hannken-Illjes
On Sat, Apr 16, 2011 at 09:29:26PM +0200, Manuel Bouyer wrote:
 Hello,
 attached is a work in progress on ffs snapshot (as it's work in progress,
 some debug and instrumentation code is still present in the
 patch, no need to comment on this part :).
 The start of this work is that when working on quota, I noticed that
 taking a snapshot on a 500GB filesystem needs several minutes, and is
 O(n) with the number of persistent snapshots.
 Here are some timings on an otherwise idle 500GB filesystem (it's some brand of
 SATA2 3.5 drive attached to an AHCI controller, so it's a reasonable test
 bed for today):
 java# /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
   260.53 real 0.00 user 1.15 sys
 /home: suspended 77.873 sec, redo 1184 of 2556
 java# /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
   377.87 real 0.00 user 2.53 sys
 /home: suspended 206.078 sec, redo 1184 of 2556
 java# /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
   508.23 real 0.00 user 4.28 sys
 /home: suspended 338.534 sec, redo 1184 of 2556
 java# /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
   621.40 real 0.00 user 5.50 sys
 /home: suspended 431.154 sec, redo 1183 of 2556
 
 Suspending a filesystem for more than 7 minutes to take a snapshot makes
 persistent snapshots quite useless to me. I wonder how it would behave
 on a multi-terabyte filesystem.
 
 I looked at where the time is spent and found 2 major issues:
 1 cgaccount() works in 2 passes: first it copies each cg before suspending
   the filesystem; then it is called again to copy only the cgs that have been
   modified between the copy and the filesystem suspension.
   The problem is that to copy a cg we need to allocate blocks for the snapshot
   file, which may be in a cg we just copied. This is the cause of the high
   number of cg copies (almost half of them) with the filesystem suspended.
 
 2 while the filesystem is suspended, we want to expunge the snapshot files
   from the snapshot view (make them appear as a 0-length file).
   With ~500GB sparse files this is a lot of work.
 
 I fixed 1) by preallocating the needed blocks in snapshot_setup().

Good catch.  Committed.

 Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
 with the fs suspended, the snapshot appears with its real length and
 blocks at the time of creation, but is marked invalid (only the
 inode block needs to be copied, and this can be done before suspending
 the fs). Now BLK_SNAP should never be seen as a block number, and we skip
 ffs_copyonwrite() if the write is to a snapshot inode.

I strongly object here.  There are good reasons to expunge old snapshots.

Even if it were done right, without deadlocks and locking-against-self,
the resulting snapshot loses at least two properties:

- A snapshot is considered stable.  Whenever you read a block you get
  the same contents.  Allowing old snapshots to exist but not running
  copy-on-write means these blocks will change their contents.

- A snapshot will fsck clean.  It is impossible to change fsck_ffs
  to check a snapshot, as the indirect blocks of these old snapshots
  will now contain garbage.

You cannot copy blocks before suspension without rewriting them once
the file system is suspended.

The check in ffs_copyonwrite() will only work as long as the old
snapshot exists.  As soon as it gets removed we will run COW
on the blocks used by the old snapshot.

 With these changes the times are much more reasonable:
 /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
   299.68 real 0.00 user 1.10 sys
 /home: suspended 0.310 sec, redo 0 of 2556
 /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
   188.10 real 0.00 user 0.86 sys
 /home: suspended 0.270 sec, redo 0 of 2556
 /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
   169.78 real 0.00 user 0.95 sys
 /home: suspended 0.450 sec, redo 0 of 2556
 /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
   172.39 real 0.00 user 0.99 sys
 /home: suspended 0.300 sec, redo 0 of 2556
 
 This seems to work; one issue with this patch is that the block
 count for the snapshot inode, and the block summary information (the
 second being probably a consequence of the first) appear wrong when
 running fsck against a snapshot.  I believe this is fixable, but
 I've not yet found where the information mismatch is coming from.
 
 comments ?
 
 PS: I'm away from computers for one week, so don't expect replies to
 your comments before next sunday.
 
 -- 
 Manuel Bouyer bou...@antioche.eu.org
  NetBSD: 26 ans d'experience feront toujours la difference
 --

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread tsugutomo . enami
Martin Husemann mar...@duskware.de writes:

 What do you think? Better naming suggestion also welcome.

IMHO, root autoconfiguration should be limited to take effect only when
the booted device is included in its components.  Since the current behavior
is surprising and inconvenient (I sometimes boot the system from pxeboot), my
locally running system is modified to behave that way.

enami.


re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread matthew green

 Martin Husemann mar...@duskware.de writes:
 
  What do you think? Better naming suggestion also welcome.
 
 IMHO, root autoconfiguration should be limited to take effect only when
 the booted device is included in its components.  Since the current behavior
 is surprising and inconvenient (I sometimes boot the system from pxeboot), my
 locally running system is modified to behave that way.

this doesn't work for setups where the root and boot devices are not
the same, which happens to be 2 of my systems.  the root device is
simply not visible at boot time.

what would be great is if the normal selection was told when it has
booted from a raid set, and it could find the right one and use that
device.  this could then work with raidctl -A yes.  that would
allow the majority of installations to skip the forced root magic.

do you think this would be easy based upon your current changes?

since there are valid ways to run that require the forced magic
without special kernel builds (or unwritten code) removing the code
is not really an option.


i think martin should commit his change.  it would allow me to boot
GENERIC on my dev machine that i otherwise have a forced root on nfs
configuration for, as well as solve his install problem.


.mrg.


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Klaus . Heinz
Martin Husemann wrote:

 as described in PR 44774 (see
 http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44774), it is
 currently not possible to use a standard NetBSD install CD on a system
 which normally boots from raid (at least on i386, amd64 or sparc64, where
 a stock GENERIC kernel is used).

It looks like this is a similar problem to the one I raised here

  http://mail-index.NetBSD.org/tech-kern/2009/10/31/msg006410.html

Instead of providing RB_NO_ROOT_OVERRIDE I would prefer something that
actually _lets_ me override everything else from boot.cfg.

Quoting Robert Elz from the mentioned thread:

 FWIW, I think the code to allow the user to override that at boot
 time would be a useful addition - while it is possible to boot,
 raidctl -A yes or raidctl -A no, and then reboot, or perhaps
 boot -a on systems that support that, but that's a painful
 sequence of operations for what should be a simple task.
 
 It should always be
   1. what the user explicitly asks for
   2. what the kernel has built in
   3. hacks like raidctl -A root (or perhaps similar things for cgd etc)
   4. where I think I came from


ciao
 Klaus


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Martin Husemann
On Mon, Apr 18, 2011 at 01:06:23PM +0200, Klaus . Heinz wrote:
 Instead of providing RB_NO_ROOT_OVERRIDE I would prefer something that
 actually _lets_ me override everything else from boot.cfg.

Yes, I understand this wish (and it is not that hard to implement).
However, I think both are quite orthogonal and should be discussed separately.

Note that especially for the install CD setup, a rootstring passed from
boot.cfg is *not* possible, as we do not know which CD drive is used
for booting.

Martin


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Izumi Tsutsui
 Note that especially for the install CD setup, a rootstring passed from
 boot.cfg is *not* possible, as we do not know which CD drive is used
 for booting.

FYI, see also PR port-i386/39998 and x86_autoconf.c:

  /*
   * XXX
   * There is no proper way to detect which unit is
   * recognized as a bootable CD-ROM drive by the BIOS.
   * Assume the first unit is the one.
   */

I.e. currently only root on cd0a works on x86 GENERIC anyway.

---
Izumi Tsutsui


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Martin Husemann
On Mon, Apr 18, 2011 at 11:59:32PM +0900, Izumi Tsutsui wrote:
 I thought passing root on cd0a from boot.cfg just worked on x86..

Maybe, but I'm not talking about x86 only.

Martin


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Greg Oster
On Mon, 18 Apr 2011 13:06:23 +0200
Klaus . Heinz k.he...@aprelf.kh-22.de wrote:

 Martin Husemann wrote:
 
  as described in PR 44774 (see
  http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44774),
  it is currently not possible to use a standard NetBSD install CD on
  a system which normally boots from raid (at least on i386, amd64 or
  sparc64, where a stock GENERIC kernel is used).
 
 It looks like this is a similar problem to the one I raised here
 
   http://mail-index.NetBSD.org/tech-kern/2009/10/31/msg006410.html
 
 Instead of providing RB_NO_ROOT_OVERRIDE I would prefer something that
 actually _lets_ me override everything else from boot.cfg.

So how about adding another flag like RB_ROOT_EXPLICIT or
RB_EXPLICIT_ROOT with the idea being that the user has to explicitly
specify what the root is going to be?  I think that addresses 1) below
and would hopefully handle the pxeboot and NFS root situations.  Maybe
something like RB_SET_ROOT with a required option of NFS or raid0
or wd0a or pxe or whatever would be the way to go?

I've never liked the 'yank root away from the boot device that the
system thinks it may have booted from' hack in RAIDframe, so anything we
can do to make it better is fine by me.

We may need both RB_NO_ROOT_OVERRIDE and this new flag in order to get
everything covered, and I'm fine with that too...

 Quoting Robert Elz from the mentioned thread:
 
  FWIW, I think the code to allow the user to override that at boot
  time would be a useful addition - while it is possible to boot,
  raidctl -A yes or raidctl -A no, and then reboot, or perhaps
  boot -a on systems that support that, but that's a painful
  sequence of operations for what should be a simple task.
  
  It should always be
    1. what the user explicitly asks for
    2. what the kernel has built in
    3. hacks like raidctl -A root (or perhaps similar things for cgd etc)
    4. where I think I came from
 
 
 ciao
  Klaus


Later...

Greg Oster


Re: New boothowto flag to prevent raid auto-root-configuration

2011-04-18 Thread Brian Buhrow
Hello Martin.  Doesn't boot -a  already do this by allowing you to
select the root filesystem and the init path?  I'm certain I've booted
systems running with raid roots off of cdroms for repair purposes.
-Brian

On Apr 18,  7:41am, Martin Husemann wrote:
} Subject: New boothowto flag to prevent raid auto-root-configuration
} 
} --6c2NcOVqGQ03X4Wi
} Content-Type: text/plain; charset=us-ascii
} Content-Disposition: inline
} 
} Hi folks,
} 
} as described in PR 44774 (see
} http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44774), it is
} currently not possible to use a standard NetBSD install CD on a system
} which normally boots from raid (at least on i386, amd64 or sparc64, where
} a stock GENERIC kernel is used).
} 
} To fix this, I'd like to introduce a new boothowto flag, that turns off
} all magic to override the root device (for now: turns off the root part
} of raidframe autoconfiguration, but could do similar things in the future
} with LVM or whatever).
} 
} The patch attached does this. It will be accompanied by bootloader changes to
} set this flag if a new keyword is present in /boot.cfg.
} 
} What do you think? Better naming suggestion also welcome.
} 
} Martin
} 
} 
} --6c2NcOVqGQ03X4Wi
} Content-Type: text/plain; charset=us-ascii
} Content-Disposition: attachment; filename=rf.patch
} 
} Index: sys/reboot.h
} ===
} RCS file: /cvsroot/src/sys/sys/reboot.h,v
} retrieving revision 1.25
} diff -c -u -r1.25 reboot.h
} --- sys/reboot.h  25 Dec 2007 18:33:48 -  1.25
} +++ sys/reboot.h  18 Apr 2011 05:34:01 -
} @@ -53,6 +53,9 @@
}  #define  RB_STRING   0x0400  /* use provided bootstr */
}  #define  RB_POWERDOWN (RB_HALT|0x800) /* turn power off (or at least halt) */
}  #define RB_USERCONF  0x1000  /* change configured devices */
} +#define  RB_NO_ROOT_OVERRIDE 0x2000  /* no automatic override of the booted
} +  * device, like raidframes auto
} +  * root configuration */
}  
}  /*
}   * Extra autoboot flags (passed by boot prog to kernel). See also
} Index: dev/raidframe/rf_netbsdkintf.c
} ===
} RCS file: /cvsroot/src/sys/dev/raidframe/rf_netbsdkintf.c,v
} retrieving revision 1.284
} diff -c -u -r1.284 rf_netbsdkintf.c
} --- dev/raidframe/rf_netbsdkintf.c  18 Mar 2011 23:53:26 -  1.284
} +++ dev/raidframe/rf_netbsdkintf.c  18 Apr 2011 05:34:01 -
} @@ -465,7 +465,7 @@
}   /* if the user has specified what the root device should be
}  then we don't touch booted_device or boothowto... */
}  
} - if (rootspec != NULL)
} + if ((rootspec != NULL) || (boothowto & RB_NO_ROOT_OVERRIDE))
}   return;
}  
}   /* we found something bootable... */
} 
} --6c2NcOVqGQ03X4Wi--
-- End of excerpt from Martin Husemann