make world broken for RELENG_6

2007-03-31 Thread Rick C. Petty
Is anyone else experiencing build problems with the latest RELENG_6 (sup'd
as of 2007-Mar-31 0800 UTC)?

A buildworld failed at:

===> sbin/ipfw (all)
cc -O2 -fno-strict-aliasing -pipe   -c /usr/src/sbin/ipfw/ipfw2.c
/usr/src/sbin/ipfw/ipfw2.c: In function `add':
/usr/src/sbin/ipfw/ipfw2.c:3976: error: `ipfw_insn_pipe' undeclared (first use 
in this function)
/usr/src/sbin/ipfw/ipfw2.c:3976: error: (Each undeclared identifier is reported 
only once
/usr/src/sbin/ipfw/ipfw2.c:3976: error: for each function it appears in.)
*** Error code 1

Stop in /usr/src/sbin/ipfw.
*** Error code 1

Stop in /usr/src/sbin.
*** Error code 1

Stop in /usr/src.
*** Error code 1


I keep most of my machines csup'd every day, and this is the first time
since 2.2.5 that I've ever seen the build broken for -STABLE.  Have I
just been lucky in the past, or are people breaking things now?

Please Cc: me, thanks,

-- Rick C. Petty


Re: make world broken for RELENG_6

2007-03-31 Thread Rick C. Petty
On Sat, Mar 31, 2007 at 09:22:58AM -0700, Julian Elischer wrote:
 David Wolfskill wrote:
 
 Right; I encountered the same thing.
 
 Locally reverting src/sys/netinet/ip_fw.h rev. 1.100.2.6 appears to have
 fixed it for me:  after doing that, I was able to successfully build,
 install, and boot.  And yes, I use IPFW.  :-}
 
 The issue appears to be that src/sbin/ipfw/ipfw2.c references
 ipfw_insn_pipe, which 1.100.2.6 dyked out of ip_fw.h.
 
 I don't know that reverting 1.100.2.6 was the correct thing to do; it
 may be better to change ipfw2.c to not try to refer to it.
 
 I've Cc:ed Julian, since he committed the changes.
 
 Peace,
 david
 
 
 try just deleting the offending lines in ipfw2.c



It's just so rare that -stable breaks on buildworld (even -current isn't
broken often, in terms of build breakage), something I can't say about
other operating systems.  A csup this morning made the problem go away.
Thanks, guys!

-- Rick C. Petty


Re: cd ripping to flac

2009-03-10 Thread Rick C. Petty
On Sat, Mar 07, 2009 at 08:12:28AM +0100, Zoran Kolic wrote:
 Howdy!
 I'd like to rip my cd-s to flac files using some
 command line app, like cdda2wav or cdparanoia.
 Using pipe to flac utility would be nice and the
 way I'd take. What program acts in that matter?

What does this have to do with FreeBSD-stable?  This question is better
asked on freebsd-questions.

 Since cdda2wav is in the base, I suppose people
 use it regularly. Something like:

What do you mean by base?  It is a port and not in the base system.

-- Rick C. Petty


Re: TCP differences in 7.2 vs 7.1

2009-05-12 Thread Rick C. Petty
On Tue, May 12, 2009 at 05:31:01PM -0400, David Samms wrote:
 
 Setting sysctl net.inet.tcp.tso=0 resolved the issue completely.   What 
 does sysctl net.inet.tcp.tso=0 do?

# sysctl -d net.inet.tcp.tso
net.inet.tcp.tso: Enable TCP Segmentation Offload

I had a similar problem with a different NIC.  This option controls whether
we offload TCP segmentation to the NIC.  My NIC seemed to be limited by the
number of interrupts which could be delivered.  You can also control this on
a per-interface basis with "ifconfig <interface> -tso".
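
For example, to turn TSO off for just one interface (assuming the interface
is re0; the sysctl disables it system-wide instead):

# ifconfig re0 -tso
# sysctl net.inet.tcp.tso=0

To make it persistent, add -tso to the ifconfig_re0 line in /etc/rc.conf,
or put the sysctl in /etc/sysctl.conf.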

-- Rick C. Petty


Re: make buildkernel KERNCONF=GENERIC fails

2009-05-27 Thread Rick C. Petty
On Tue, May 26, 2009 at 08:42:52PM +0200, Christian Walther wrote:
 
 Well, for some strange reason the same happened again: I did
 
 # mv /usr/src /usr/src.old
 # csup /root/stable-supfile
 # cd /usr/src
 # make buildkernel KERNCONF=GENERIC

You should always do a buildworld before doing a buildkernel, because the
toolchain used to build the kernel might have changed.  Also, KERNCONF=GENERIC
is implied (it is the default).
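
For reference, the usual full upgrade sequence (a sketch of the standard
procedure; adjust KERNCONF if you use a custom config) is roughly:

# cd /usr/src
# make buildworld
# make buildkernel
# make installkernel
(reboot, ideally into single-user mode)
# mergemaster -p
# make installworld
# mergemaster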

-- Rick C. Petty


Re: Use n instead of Fn for choosing the OS when booting?

2009-06-22 Thread Rick C. Petty
On Tue, Jun 23, 2009 at 07:38:00AM +0800, Wu, Yue wrote:
 
 Another question: if there are more than 12 OSes on the disk, how do you
 select the one whose number is larger than 12? :)

You can only have 4 primary partitions (slices) in an MBR.  boot0 gives
choice #5 for the next disk, so there are only 5 choices, maximum.
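
For illustration, a typical boot0 menu looks something like this (the labels
depend on what is in each slice, and F5 only appears when another disk is
present):

F1 FreeBSD
F2 Linux
F5 Drive 1

Default: F1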

-- Rick C. Petty


Re: What is /boot/kernel/*.symbols?

2009-07-06 Thread Rick C. Petty
On Mon, Jul 06, 2009 at 11:39:04AM +0200, Ruben de Groot wrote:
 On Mon, Jul 06, 2009 at 10:46:50AM +0200, Dimitry Andric typed:
  
  Right, so it's a lot bigger on amd64.  I guess those 64-bit pointers
  aren't entirely free. :)
 
 I'm not sure where the size difference comes from. I have some sparc64
 systems running -current with symbols and the size of /boot/kernel is
 more comparable to i386, even with the 8-byte pointer size:

Um, probably there are a lot of devices on amd64 that aren't available for
sparc64?

-- Rick C. Petty


Re: What is /boot/kernel/*.symbols?

2009-07-07 Thread Rick C. Petty
On Tue, Jul 07, 2009 at 11:24:51AM +0200, Ruben de Groot wrote:
 On Mon, Jul 06, 2009 at 04:20:45PM -0500, Rick C. Petty typed:
  On Mon, Jul 06, 2009 at 11:39:04AM +0200, Ruben de Groot wrote:
   On Mon, Jul 06, 2009 at 10:46:50AM +0200, Dimitry Andric typed:

Right, so it's a lot bigger on amd64.  I guess those 64-bit pointers
aren't entirely free. :)
   
   I'm not sure where the size difference comes from. I have some sparc64
   systems running -current with symbols and the size of /boot/kernel is
   more comparable to i386, even with the 8-byte pointer size:
  
  Um, probably there are a lot of devices on amd64 that aren't available for
  sparc64?
 
 Yes, that's probably it.

It was just a theory; I don't have sparc64.  What's your output of
ls -1 /boot/kernel | wc?

-- Rick C. Petty


Re: ataraid's revenge! (Was: Re: A nasty ataraid experience.)

2009-07-23 Thread Rick C. Petty
This question should have been directed to freebsd-geom.

On Thu, Jul 23, 2009 at 03:42:47PM -0500, Sean C. Farley wrote:
 
 Anyone know if I can boot off of a gvinum partition and/or how it works 
 (or does not) with various label schemes?

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/vinum-root.html

There are some downsides.  First of all, when your root partition is mounted
from vinum, you can't unload the geom_vinum module (for obvious reasons).
If you decide to relocate volumes in the future, you won't be able to
remove the disks from vinum.  The reason is that you're not allowed
to change the bsdlabel if there's a vinum slice while vinum is loaded, and
you cannot remove a device from vinum even if it has no subdisks.  I know
lulf@ was looking into the second part.

I used to do this on all my machines.  Sometime soon after 7.1-RELEASE, I
noticed a significant slowdown in disk performance, perhaps due to disk
scheduling.  Basically, while any moderate I/O hit one gvinum volume, all
other I/O on that disk was essentially suspended until the heavier I/O
completed.  What this meant was that if I did an rsync to or from a volume
whose subdisk resided on the same physical disk as my root and /usr
partitions, I could not run any commands until the rsync finished or was
killed (with control-C, since I couldn't even get a killall to execute).
After that I switched all my machines to use gmirror, which helped slightly.
During that switch I discovered that you need to be pretty clever to move
away from a vinum root.

I don't mean to say not to use gvinum.  I put a lot of time into it,
submitted some patches, and did a lot of testing with Ulf's deltas.  There
were still a couple of bugs when I stopped using it, but that doesn't mean
they won't be fixed by 8.0.  I just couldn't deal with the I/O performance
issues which I was never able to track down.  I need my workstations to be
responsive while I'm rsyncing stuff around, and with 4 cores and 8 GB of
RAM, I expected it to behave better.  YMMV.

FYI, the trick I used when migrating mirrors in vinum to gmirror:
1). Pick one disk to convert to gmirror first.  I'll call this disk0.
2). Remove all subdisks which are attached to the device using disk0.
3). Reboot into single-user with vinum unloaded, specifying the raw device
root (from disk1) instead of /dev/gvinum/root-volume.
4). Wipe sector 8 of the vinum slice on disk0.
5). Create a gmirror with disk0, re-bsdlabel, create new filesystems, etc.
6). Load the geom_vinum kernel module.  This may panic the kernel.  If so,
reboot normally using the vinum root volume.
7). Mount the new filesystems.
8). Mount the volumes you want moved to the mirror and rsync them over to
the filesystems mounted on disk0.
9). Unmount all vinum volumes and try to unload the geom_vinum module.  If
this panics the kernel, or if there was a panic in step 6, you'll need to
reboot into single-user again with geom_vinum unloaded.
10). Wipe sector 8 of the vinum slice on disk1.
11). Add disk1 to the gmirror, which should start a sync.
12). Load the geom_vinum module again, if you have other volumes not part
of this mirror.  If this panics the kernel, or if the kernel panicked after
step 11, a normal restart might be successful.
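
A minimal command sketch for steps 4, 5, 10, and 11 (the device names ad0,
ad1, the slice names ad0s1/ad1s1, and the mirror name gm0 are placeholders
for this example; substitute your real devices):

# dd if=/dev/zero of=/dev/ad0s1 bs=512 oseek=8 count=1   (step 4)
# gmirror label -v gm0 ad0    (step 5; then fdisk/bsdlabel/newfs on /dev/mirror/gm0)
# dd if=/dev/zero of=/dev/ad1s1 bs=512 oseek=8 count=1   (step 10)
# gmirror insert gm0 ad1      (step 11)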

-- Rick C. Petty


Re: newfs(8) parameters from dumpfs -m have bad -s value?

2009-01-05 Thread Rick C. Petty
On Mon, Jan 05, 2009 at 08:23:53PM +0100, Oliver Fromme wrote:
 
 This seems to be a bug in dumpfs(8).  It simply prints
 the value of the fs_size field of the superblock, which
 is wrong.
 
 The -s option of newfs(8) expects the available size in
 sectors (i.e. 512 bytes), but the fs_size field contains
 the size of the file system in 2KB units.  This seems to
 be the fragment size, but I'm not sure if this is just

This *is* the fragment size.  UFS/FFS uses the plain term "block" to mean
the fragment size.  All blocks are indexed with this number, unlike the
"block size", which is almost always 8 fragments.  Confusing.

 So, dumpfs(8) needs to be fixed to perform the proper
 calculations when printing the value for the -s option.
 Unfortunately I'm not sufficiently much of a UFS guru
 to offer a fix.  My best guess would be to multiply the
 fs_size value by the fragment size (measured in 512 byte
 units), i.e. multiply by 4 in the most common case.
 But I'm afraid the real solution is not that simple.

The sector size and filesystem size parameters in newfs are remnants.
Everything is converted to a number of media sectors (the sector size as
reported by the device).  So dumpfs could just assume 512, since it's rarely
different, multiply fs_size by fs_fsize, divide by 512, and then also
output -S 512.
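
As a worked example (the numbers are made up): with fs_fsize = 2048 and
fs_size = 1000000 fragments, the filesystem spans

  1000000 * 2048 = 2048000000 bytes
  2048000000 / 512 = 4000000 sectors

so dumpfs -m should print -s 4000000 together with -S 512, rather than
-s 1000000.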

Better yet would be to add a parameter (-z perhaps) to newfs(8) to accept
a size in bytes instead of a multiple of the sector size.

I would be willing to write up patches for dumpfs and newfs to add both the
raw byte size and the 512-byte sector handling to correct said mistake,
unless someone else would rather do it.

-- Rick C. Petty


Re: cvsup freebsd 6_2 to freebsd 7_1 not upgrading?

2009-01-05 Thread Rick C. Petty
On Mon, Jan 05, 2009 at 11:41:40AM -0700, Brian Duke wrote:
 
 #cp /usr/share/examples/cvsup/standard-supfile /root/stand_sup
 #vi /root/stand_sup
 host=CHANGE_ME.freebsd.org
 host=cvsup15.us.FreeBSD.org
 tag=RELENG_6_2
 tag=RELENG_7_1
 
 #cd /usr/src
 #cvsup -g -L2 /root/stand_sup
 ...
 #make -j4 buildworld; make -j4 buildkernel; make installkernel
 
 ... (come back a hour or so later)
 #make installworld; reboot

You should always reboot into the new kernel before running the installworld,
especially when updating across a major version.  I always boot into
single-user mode to do my installworld, although a new kernel should work with
an old userland.  That isn't your problem here, though.  You should see the
kernel's version at boot time (and through uname(1)), so somehow you're not
installing the kernel.  I'd probably change your line to:

make -j4 buildworld buildkernel && make installkernel

or just:

make -j4 buildworld kernel && echo success

-- Rick C. Petty


Re: newfs(8) parameters from dumpfs -m have bad -s value?

2009-01-05 Thread Rick C. Petty
On Tue, Jan 06, 2009 at 08:51:18AM +0200, Danny Braniss wrote:
  
  Everything is converted to number of media sectors (sector size as
  specified by the device).  So one could assume for dumpfs to always use
  512, since it's rarely different, and multiply fs_size by fs_fsize and
  divide by 512, and then output -S 512.
 
 don't assume 512, in the iscsi world I have seen all kinds of sector sizes,
 making it a PITA to get things right.

It was a suggestion, one assumed by FreeBSD in many places.  In this case,
it makes no difference since the number of bytes is computed by newfs and
then divided by the actual sector size when calling bwrite(3).  I still
would prefer:

  Better yet would be to add a parameter (-z perhaps) to newfs(8) to accept
  number of bytes instead of multiples of sectorsize.

-- Rick C. Petty


Re: SMART

2009-11-12 Thread Rick C. Petty
On Thu, Nov 12, 2009 at 09:44:28AM -0800, Jeremy Chadwick wrote:
 On Thu, Nov 12, 2009 at 01:25:12PM +0100, Ivan Voras wrote:
  Jeremy Chadwick wrote:
  
  I can teach you how to decode/read SMART statistics correctly.
  
  Actually, it would be good if you taught more than him :)
  
  I've always wondered how important are each of the dozen or so
  statistics and what indicates what...
 
 I'll work on writing an actual HTML document to put up on my web site
 and will respond with the URL once I finish it.

Isn't this sufficient?
http://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

If not, could you make the changes on Wikipedia?  This isn't a
FreeBSD-specific topic, and the larger community would benefit from such
documentation.

-- Rick C. Petty


Why is NFSv4 so slow?

2010-06-27 Thread Rick C. Petty
First off, many thanks to Rick Macklem for making NFSv4 possible in
FreeBSD!

I recently updated my NFS server and clients to v4, but have since noticed
significant performance penalties.  For instance, when I try ls a b c (if
a, b, and c are empty directories) on the client, it takes up to 1.87
seconds (wall time) whereas before it always finished in under 0.1 seconds.
If I repeat the test, it takes the same amount of time in v4 (in v3, wall
time was always under 0.01 seconds for subsequent requests, as if the
directory listing was cached).

If I try to play an h264 video file on the filesystem using mplayer, it
often jitters, and seeking around introduces a pause of up to a second or
so.  With NFSv3 it behaved more like the file was on local disk (no
noticeable pauses or jitters).

Has anyone seen this behavior upon switching to v4 or does anyone have any
suggestions for tuning?

Both client and server are running the same GENERIC kernel, 8.1-PRERELEASE
as of 2010-May-29.  They are connected via gigabit Ethernet.  Both v3 and v4
tests were performed on the exact same hardware under the same I/O, CPU, and
network loads.  All I did was toggle nfsv4_server_enable (and nfsuserd/nfscbd
of course).

It seems like a server-side issue, because if I try an NFSv3 client mount
to the NFSv4 server and run the same tests, I see only a slight improvement
in performance.  In both cases, my mount options were
rdirplus,bg,intr,soft (with nfsv4 added in the one case, obviously).
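
For concreteness, the NFSv4 mount looks roughly like this in /etc/fstab
(the server name and path here are placeholders):

server:/vol   /vol   nfs   rw,nfsv4,rdirplus,bg,intr,soft   0   0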

On the server, I have these tunables explicitly set:

kern.ipc.maxsockbuf=524288
vfs.newnfs.issue_delegations=1

On the client, I just have the maxsockbuf setting (this is twice the
default value).  I'm open to trying other tunables or patches.  TIA,

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-27 Thread Rick C. Petty
On Sun, Jun 27, 2010 at 08:04:28PM -0400, Rick Macklem wrote:
 
 Weird, I don't see that here. The only thing I can think of is that the
 experimental client/server will try to do I/O at the size of MAXBSIZE
 by default, which might be causing a burst of traffic your net interface
 can't keep up with. (This can be turned down to 32K via the
 rsize=32768,wsize=32768 mount options. I found this necessary to avoid
 abysmal performance on some Macs for the Mac OS X port.)

Hmm.  When I mounted the same filesystem with nfs3 from a different client,
everything started working at almost normal speed (still a little slower
though).

Now on that same host I saw a file get corrupted.  On the server, I see
the following:

% hd testfile | tail -4
00677fd0  2a 24 cc 43 03 90 ad e2  9a 4a 01 d9 c4 6a f7 14  |*$.C.....J...j..|
00677fe0  3f ba 01 77 28 4f 0f 58  1a 21 67 c5 73 1e 4f 54  |?..w(O.X.!g.s.OT|
00677ff0  bf 75 59 05 52 54 07 6f  db 62 d6 4a 78 e8 3e 2b  |.uY.RT.o.b.Jx.>+|
00678000

But on the client I see this:

% hd testfile | tail -4
00011ff0  1e af dc 8e d6 73 67 a2  cd 93 fe cb 7e a4 dd 83  |.....sg.....~...|
00012000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00678000

The only thing I could do to fix it was to copy the file on the server,
delete the original file on the client, and move the copied file back.

Not only is it affecting random file reads, but it also started breaking src
and ports builds in random places.  In one situation, portmaster failed
because of a checksum error.  It then tried to refetch the distfile and failed
with the same checksum problem.  I manually deleted the file, tried again,
and it built just fine.  The ports tree and distfiles are NFSv4-mounted.

 The other thing that can really slow it down is if the uid-to-login-name
 (and/or gid-to-group-name) mapping is messed up, but this would normally only
 show up for things like ls -l. (Beware having multiple password database
 entries for the same uid, such as root and toor.)

I use the same UIDs/GIDs on all my boxes, so that can't be it.  But thanks
for the idea.

 I don't recommend the use of intr or soft for NFSv4 mounts, but they
 wouldn't affect performance for trivial tests. You might want to try:
 nfsv4,rsize=32768,wsize=32768 and see how that works.

I'm trying that right now (with rdirplus also) on one host.  If I start to
see the delays again, I'll compare between hosts.
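
For the record, the mount command I'm trying is essentially this (server and
path are placeholders):

# mount -t nfs -o nfsv4,rdirplus,rsize=32768,wsize=32768 server:/vol /vol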

 When you did the nfs3 mount did you specify newnfs or nfs for the
 file system type? (I'm wondering if you still saw the problem with the
 regular nfs client against the server? Others have had good luck using
 the server for NFSv3 mounts.)

I used "nfs" for the FStype.  So I should be using "newnfs"?  This wasn't very
clear in the man pages.  In fact, "newnfs" wasn't mentioned in
man mount_newnfs.

 When I see abissmal NFS perf. it is usually an issue with the underlying
 transport. Looking at things like netstat -i or netstat -s might
 give you a hint?

I suspected it might be transport-related.  I didn't see anything out of
the ordinary from netstat, but then again I don't know what's ordinary
with NFS.  =)

~~

One other thing I noticed, though I'm not sure if it's a bug or expected
behavior (it's unrelated to the delays or corruption): I have the following
filesystems on the server:

/vol/a
/vol/a/b
/vol/a/c

I export all three volumes and set my NFSv4 root to /.  On the client,
I'll mount ... server:vol /vol and the b and c directories show up,
but when I try ls /vol/a/b /vol/a/c, they show up empty.  In dmesg I see:

kernel: nfsv4 client/server protocol prob err=10020

After unmounting /vol, I discovered that my client already had /vol/a/b and
/vol/a/c directories (because pre-NFSv4, I had to mount each filesystem
separately).  Once I removed those empty dirs and remounted, the problem
went away.  But it did drive me crazy for a few hours.
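
A minimal /etc/exports sketch for this layout might look like the following
(the network and mask values are placeholders, and the real file may carry
other options):

/vol/a -network 172.16.0.0 -mask 255.255.255.0
/vol/a/b -network 172.16.0.0 -mask 255.255.255.0
/vol/a/c -network 172.16.0.0 -mask 255.255.255.0
V4: /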

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-27 Thread Rick C. Petty
On Sun, Jun 27, 2010 at 08:04:28PM -0400, Rick Macklem wrote:
 
 Weird, I don't see that here. The only thing I can think of is that the
 experimental client/server will try to do I/O at the size of MAXBSIZE
 by default, which might be causing a burst of traffic your net interface
 can't keep up with. (This can be turned down to 32K via the
 rsize=32768,wsize=32768 mount options. I found this necessary to avoid
 abysmal performance on some Macs for the Mac OS X port.)

I just ran into the speed problem again after remounting.  This time
I tried to do a make buildworld and make got stuck on [newnfsreq] for
ten minutes, with no other filesystem activity on either client or server.

The file system corruption is still pretty bad.  I can no longer build any
ports on one machine, because after the port is extracted, the config.sub
files are being filled with all zeros.  It took me a while to track this
down while trying to build devel/libtool22:

+ ac_build_alias=amd64-portbld-freebsd8.1
+ test xamd64-portbld-freebsd8.1 = x
+ test xamd64-portbld-freebsd8.1 = x
+ /bin/sh libltdl/config/config.sub amd64-portbld-freebsd8.1
+ ac_cv_build=''
+ printf '%s\n' 'configure:4596: result: '
+ printf '%s\n' ''

+ as_fn_error 'invalid value of canonical build' 4600 5
+ as_status=0
+ test 0 -eq 0
+ as_status=1
+ test 5

And although my work dir is on local disk,

% hd work/libtool-2.2.6b/libltdl/config/config.sub:

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00007660

Again, my ports tree is mounted as FSType nfs with option nfsv4.
FreeBSD/amd64 8.1-PRERELEASE r208408M GENERIC kernel.

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 12:30:30AM -0400, Rick Macklem wrote:
 
 I can't explain the corruption, beyond the fact that soft,intr can
 cause all sorts of grief. If mounts without soft,intr still show
 corruption problems, try disabling delegations (either kill off the
 nfscbd daemons on the client or set vfs.newnfs.issue_delegations=0
 on the server). It is disabled by default because it is the greenest
 part of the subsystem.

I tried without soft,intr and make buildworld failed with what looks like
file corruption again.  I'm trying without delegations now.

 Make sure you don't have multiple entries for the same uid, such as root
 and toor both for uid 0 in your /etc/passwd. (ie. get rid of one of 
 them, if you have both)

Hmm, that's a strange requirement, since FreeBSD by default comes with
both.  That should probably be documented in the nfsv4 man page.

 When you specify nfs for an NFSv3 mount, you get the regular client.
 When you specify newnfs for an NFSv3 mount, you get the experimental
 client. When you specify nfsv4 you always get the experimental NFS
 client, and it doesn't matter which FStype you've specified.

Ok.  So my comparison was with the regular and experimental clients.

 If you are using UFS/FFS on the server, this should work and I don't know
 why the empty directories under /vol on the client confused it. If your
 server is using ZFS, everything from / including /vol need to be exported.

Nope, UFS2 only (on both clients and server).

  kernel: nfsv4 client/server protocol prob err=10020
 
 This error indicates that there wasn't a valid FH for the server. I
 suspect that the mount failed. (It does a loop of Lookups from / in
 the kernel during the mount and it somehow got confused part way through.)

If the mount failed, why would it allow me to ls /vol/a and see both the b
and c directories, as well as other files/directories, on /vol?

 I don't know why these empty dirs would confuse it. I'll try a test
 here, but I suspect the real problem was that the mount failed and
 then happened to succeed after you deleted the empty dirs.

It doesn't seem likely.  I spent an hour mounting and unmounting, and each
mount looked successful in that there were files and directories besides
the two I was trying to descend into.

 It still smells like some sort of transport/net interface/... issue
 is at the bottom of this. (see response to your next post)

It's possible.  I just had another NFSv4 client (with the same server) lock
up:

load: 0.00  cmd: ls 17410 [nfsv4lck] 641.87r 0.00u 0.00s 0% 1512k

and:

load: 0.00  cmd: make 87546 [wait] 37095.09r 0.01u 0.01s 0% 844k

That make has been hung for hours, and the ls(1) was executed during that
lockup.  I wish there was a way I could unhang these processes and unmount
the NFS mount without panicking the kernel, but alas even this fails:

# umount -f /sw
load: 0.00  cmd: umount 17479 [nfsclumnt] 1.27r 0.00u 0.04s 0% 788k

A shutdown -p now resulted in a panic with the speaker beeping
constantly and no console output.

It's possible the NICs are all suspect, but all of this worked fine a
couple of days ago when I was only using NFSv3.

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
 at 0xfee0

client:

r...@pci0:1:0:0:class=0x02 card=0x84321043 chip=0x816810ec rev=0x06 
hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
class  = network
subclass   = ethernet
cap 01[40] = powerspec 3  supports D0 D1 D2 D3  current D0
cap 05[50] = MSI supports 1 message, 64 bit enabled with 1 message
cap 10[70] = PCI-Express 2 endpoint IRQ 2 max data 128(256) link x1(x1)
cap 11[b0] = MSI-X supports 4 messages in map 0x20
cap 03[d0] = VPD

 5) sysctl dev.XXX.N  (ex. for em0, XXX=em, N=0)

server:

dev.nfe.0.%desc: NVIDIA nForce MCP77 Networking Adapter
dev.nfe.0.%driver: nfe
dev.nfe.0.%location: slot=10 function=0 handle=\_SB_.PCI0.NMAC
dev.nfe.0.%pnpinfo: vendor=0x10de device=0x0760 subvendor=0x1043 
subdevice=0x82f2 class=0x02
dev.nfe.0.%parent: pci0
dev.nfe.0.process_limit: 192
dev.nfe.0.stats.rx.frame_errors: 0
dev.nfe.0.stats.rx.extra_bytes: 0
dev.nfe.0.stats.rx.late_cols: 0
dev.nfe.0.stats.rx.runts: 0
dev.nfe.0.stats.rx.jumbos: 0
dev.nfe.0.stats.rx.fifo_overuns: 0
dev.nfe.0.stats.rx.crc_errors: 0
dev.nfe.0.stats.rx.fae: 0
dev.nfe.0.stats.rx.len_errors: 0
dev.nfe.0.stats.rx.unicast: 1762645090
dev.nfe.0.stats.rx.multicast: 1
dev.nfe.0.stats.rx.broadcast: 7608
dev.nfe.0.stats.tx.octets: 2036479975330
dev.nfe.0.stats.tx.zero_rexmits: 2090186021
dev.nfe.0.stats.tx.one_rexmits: 0
dev.nfe.0.stats.tx.multi_rexmits: 0
dev.nfe.0.stats.tx.late_cols: 0
dev.nfe.0.stats.tx.fifo_underuns: 0
dev.nfe.0.stats.tx.carrier_losts: 0
dev.nfe.0.stats.tx.excess_deferrals: 0
dev.nfe.0.stats.tx.retry_errors: 0
dev.nfe.0.stats.tx.unicast: 0
dev.nfe.0.stats.tx.multicast: 0
dev.nfe.0.stats.tx.broadcast: 0
dev.nfe.0.wake: 0

client:

dev.re.0.%desc: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet
dev.re.0.%driver: re
dev.re.0.%location: slot=0 function=0
dev.re.0.%pnpinfo: vendor=0x10ec device=0x8168 subvendor=0x1043 
subdevice=0x8432 class=0x02
dev.re.0.%parent: pci1

 check dmesg to see if there's any messages the kernel has
 been spitting out which look relevant?  Thanks.

server, immediately after restarting all of nfs scripts (rpcbind nfsclient 
nfsuserd nfsserver mountd nfsd statd lockd nfscbd):

Jun 27 18:04:44 rpcbind: cannot get information for udp6
Jun 27 18:04:44 rpcbind: cannot get information for tcp6
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
Jun 27 18:05:12 amanda kernel: NLM: failed to contact remote rpcbind, stat = 5, 
port = 28416

client, when noticing the mounting-over-directories problem:

NLM: failed to contact remote rpcbind, stat = 5, port = 28416
nfsv4 client/server protocol prob err=10020
nfsv4 client/server protocol prob err=10020
...

No other related messages were found in /var/log/messages either.



-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 07:56:00AM -0700, Jeremy Chadwick wrote:
 
 Three other things to provide output from if you could (you can X out IPs
 and MACs too), from both client and server:
 
 6) netstat -idn

server:

Name    Mtu Network       Address                 Ipkts Ierrs Idrop      Opkts Oerrs  Coll Drop
nfe0   1500 Link#1        00:22:15:b4:2d:XX  1767890778     0     0  872169302     0     0    0
nfe0   1500 172.XX.XX.0/2 172.XX.XX.4        1767882158     -     - 1964274616     -     -    -
lo0   16384 Link#2                                 3728     0     0       3728     0     0    0
lo0   16384 (28)00:00:00:00:00:00:fe:80:00:02:00:00:00:00:00:00:00:00:00:00:00:01 3728 0 0 3728 0 0 0
lo0   16384 (28)00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:01 3728 0 0 3728 0 0 0
lo0   16384 127.0.0.0/8   127.0.0.1                3648     -     -       3664     -     -    -

client:

Name    Mtu Network       Address                 Ipkts Ierrs Idrop      Opkts Oerrs  Coll Drop
re0    1500 Link#1        e0:cb:4e:cd:d3:XX   955288523     0     0  696819089     0     0    0
re0    1500 172.XX.XX.0/2 172.XX.XX.2         955279721     -     -  696814499     -     -    -
lo0   16384 Link#2                                 3148     0     0       3148     0     0    0
lo0   16384 (28)00:00:00:00:00:00:fe:80:00:02:00:00:00:00:00:00:00:00:00:00:00:01 3148 0 0 3148 0 0 0
lo0   16384 (28)00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:01 3148 0 0 3148 0 0 0
lo0   16384 127.0.0.0/8   127.0.0.1                3112     -     -       3112     -     -    -

 7) sysctl hw.pci | grep msi

both server and client:

hw.pci.honor_msi_blacklist: 1
hw.pci.enable_msix: 1
hw.pci.enable_msi: 1

 8) Contents of /etc/sysctl.conf

server and client:

# 4 virtual channels
dev.pcm.0.play.vchans=4
# Read modules from /usr/local/modules
kern.module_path=/boot/kernel;/boot/modules;/usr/local/modules
# Remove those annoying ARP moved messages:
net.link.ether.inet.log_arp_movements=0
# 32MB write cache on disk controllers system-wide
vfs.hirunningspace=33554432
# Allow users to mount file systems
vfs.usermount=1
# misc
net.link.tap.user_open=1
net.inet.ip.forwarding=1
compat.linux.osrelease=2.6.16
debug.ddb.textdump.pending=1
# for NFSv4
kern.ipc.maxsockbuf=524288

  server, immediately after restarting all of nfs scripts (rpcbind
  nfsclient nfsuserd nfsserver mountd nfsd statd lockd nfscbd):
 
  Jun 27 18:04:44 rpcbind: cannot get information for udp6
  Jun 27 18:04:44 rpcbind: cannot get information for tcp6
 
 These two usually indicate you removed IPv6 support from the kernel,
 except your ifconfig output (I've removed it) on the server shows you do
 have IPv6 support.  I've been trying to get these warnings removed for
 quite some time (PR kern/96242).  They're harmless, but the
 inconsistency here is a little weird -- are you explicitly disabling
 IPv6 on nfe0?

I have WITHOUT_IPV6= in my make.conf on all my machines (otherwise I have
problems with jdk1.6) and WITHOUT_INET6= in my src.conf.  I'm not sure
why the rpcbind/ifconfig binaries have a different concept than the
kernel, since I always make buildworld kernel and keep things in sync
with mergemaster when I reboot.  I'm building new worlds/kernels now
to see if that makes any difference.

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 12:35:14AM -0400, Rick Macklem wrote:
 
 Being stuck in newnfsreq means that it is trying to establish a TCP
 connection with the server (again smells like some networking issue).
 snip
 Disabling delegations is the next step. (They aren't
 required for correct behaviour and are disabled by default because
 they are the greenest part of the implementation.)

After disabling delegations, I was able to build world and kernel on two
different clients, and my port build problems went away as well.

I'm still left with a performance problem, although not quite as bad as I
originally reported.  Directory listings are snappy once again, but playing
h264 video is choppy, particularly when seeking around: there's almost a
full second delay before it kicks in, no matter where I seek.  With NFSv3
the delay on seeks was less than 0.1 seconds and the playback was never
jittery.

I can try it again with a v3 client and the v4 server, if you think that's
worth pursuing.  If it makes any difference, the server's four CPUs are
pegged at 100% (running nice +4 CPU-bound jobs).  But that was the case
before I enabled the v4 server, too.

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 10:09:21PM -0400, Rick Macklem wrote:
 
 
 On Mon, 28 Jun 2010, Rick C. Petty wrote:
 
  If it makes any difference, the server's four CPUs are
 pegged at 100% (running nice +4 cpu-bound jobs).  But that was the case
 before I enabled v4 server too.

 If it is practical, it would be interesting to see what effect killing
 off the cpu bound jobs has w.r.t. performance.

I sent SIGTSTP to all those processes and brought the CPUs to idle.  The
jittering/stuttering is still present when watching h264 video.  So that
rules out scheduling issues.  I'll be investigating Jeremy's TCP tuning
suggestions next.

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 09:29:11AM -0700, Jeremy Chadwick wrote:
 
 # Increase send/receive buffer maximums from 256KB to 16MB.
 # FreeBSD 7.x and later will auto-tune the size, but only up to the max.
 net.inet.tcp.sendbuf_max=16777216
 net.inet.tcp.recvbuf_max=16777216
 
 # Double send/receive TCP datagram memory allocation.  This defines the
 # amount of memory taken up by default *per socket*.
 net.inet.tcp.sendspace=65536
 net.inet.tcp.recvspace=131072

I tried these settings, on both the client and the server.
I still see the same jittery/stuttery video behavior.  Thanks for your
suggestions though; these are probably good settings to have around anyway,
since I have 12 GB of RAM on the client and 8 GB of RAM on the server.

 make.conf WITHOUT_IPV6 would affect ports, src.conf WITHOUT_INET6 would
 affect the base system (thus rpcbind).  The src.conf entry is what's
 causing rpcbind to spit out the above cannot get information messages,
 even though IPv6 is available in your kernel (see below).
 
 However: your kernel configuration file must contain options INET6 or
 else you wouldn't have IPv6 addresses on lo0.  So even though your
 kernel and world are synchronised, IPv6 capability-wise they probably
 aren't.  This may be your intended desire though, and if so, no biggie.

Oh, I forgot about that.  I'll have to add the nooptions line, since I like to
build as close to GENERIC as possible.  Mostly the WITHOUT_* stuff in
/etc/src.conf is there to reduce my overall build times, since I don't need
some of those tools.
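
A minimal custom config that stays close to GENERIC could look like this
(MYKERNEL is just a placeholder name):

include GENERIC
ident   MYKERNEL
nooptions INET6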

I'm okay with the messages though; I'll probably comment out WITHOUT_INET6.

Thanks again for your suggestions,

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 07:48:59PM -0400, Rick Macklem wrote:
 
 Ok, it sounds like you found some kind of race condition in the delegation
 handling. (I'll see if I can reproduce it here. It could be fun to find:-)

Good luck with that!  =)

 I can try it again with v3 client and v4 server, if you think that's
 worthy of pursuit.  If it makes any difference, the server's four CPUs are
 pegged at 100% (running nice +4 cpu-bound jobs).  But that was the case
 before I enabled v4 server too.
 
 It would be interesting to see if the performance problem exists for
 NFSv3 mounts against the experimental (nfsv4) server.

Hmm, I couldn't reproduce the problem.  Once I unmounted the NFSv4 mount
and tried v3, the jittering stopped.  Then I unmounted v3 and tried v4
again: no jitters.  I played with a couple of combinations back and forth
(toggling the presence of nfsv4 in the options), and sometimes I saw
jittering, but only with v4, and nothing like what I was seeing before.
Perhaps this is a result of Jeremy's TCP tuning tweaks.

This is also a difficult thing to test, because the server and client have
so much memory that they cache the data blocks.  So if I try my stutter test
on the same video a second time, I only notice stutters if I skip to parts
I haven't skipped to before.  I can say that it seemed like more of a
latency issue than a throughput issue to me, but the disks aren't ever
under a high load.  Then again, it's hard to determine accurate load when
the disks are seeking.  Oh, I'm using the AHCI controller mode/driver on
those disks instead of ATA, if that matters.

One time when I mounted the v4 again, it broke subdirectories like I was
talking about before.  Essentially it would give me a readout of all the
top-level directories but wouldn't descend into subdirectories which
reflect different mountpoints on the server.  An unmount and a remount
(without changes to /etc/fstab) fixed the problem.  I'm wondering if there
isn't some race condition that seems to affect crossing mountpoints on the
server.  When the situation happens, it affects all mountpoints equally
and persists for the duration of that mount.  And of course, I can't
reproduce the problem when I try.

I saw the broken mountpoint crossing on another client (without any TCP
tuning), but each time it happened I saw this in the logs:

nfscl: consider increasing kern.ipc.maxsockbuf

Once I doubled that value, the problem went away, at least with this
particular v4 server mountpoint.

At the moment, things are behaving as expected.  The v4 file system seems
just as fast as v3 did, and I don't need a dozen mountpoints specified
on each client thanks to v4.  Once again, I thank you, Rick, for all your
hard work!

-- Rick C. Petty


Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Rick C. Petty
On Tue, Jun 29, 2010 at 10:20:57AM -0500, Adam Vande More wrote:
 On Tue, Jun 29, 2010 at 9:58 AM, Rick Macklem rmack...@uoguelph.ca wrote:
 
  I suppose if the FreeBSD world feels that root and toor must both
  exist in the password database, then nfsuserd could be hacked to handle
  the case of translating uid 0 to root without calling getpwuid(). It
  seems ugly, but if deleting toor from the password database upsets
  people, I can do that.
 
 I agree with Ian on this.  I don't use toor either, but have seen people use
 it, and sometimes it will get recommended here for various reasons e.g.
 running a root account with a different default shell.  It wouldn't bother
 me having to do this provided it was documented, but having to do so would
 be a POLA violation to many users I think.

To be fair, I'm not sure this is even a problem.  Rick M. only suggested it
as a possibility.  I would think that getpwuid() would return the first
match, which has always been root.  At least that's what it does when
scanning the passwd file; I'm not sure about NIS.  If someone can prove
that this will cause a problem with NFSv4, we could consider hacking it.
Otherwise I don't think we should change this behavior yet.

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-08-28 Thread Rick C. Petty
Hi.  I'm still having problems with NFSv4 being very laggy on one client.
When the NFSv4 server is at 50% idle CPU and the disks are < 1% busy, I am
getting horrible throughput on an idle client.  Using dd(1) with a 1 MB block
size, when I try to read a > 100 MB file from the client, I'm getting
around 300-500 KiB/s.  On another client, I see upwards of 20 MiB/s with
the same test (on a different file).  On the broken client:

# uname -mv
FreeBSD 8.1-STABLE #5 r211534M: Sat Aug 28 15:53:10 CDT 2010 
u...@example.com:/usr/obj/usr/src/sys/GENERIC  i386

# ifconfig re0
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=389b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
        ether 00:e0:4c:xx:yy:zz
        inet xx.yy.zz.3 netmask 0xff00 broadcast xx.yy.zz.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

# netstat -m
267/768/1035 mbufs in use (current/cache/total)
263/389/652/25600 mbuf clusters in use (current/cache/total/max)
263/377 mbuf+clusters out of packet secondary zone in use (current/cache)
0/20/20/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
592K/1050K/1642K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/5/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

# netstat -idn
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll Drop
re0    1500 Link#1        00:e0:4c:xx:yy:zz   232135     0     0    68984     0     0    0
re0    1500 xx.yy.zz.0/2  xx.yy.zz.3          232127     -     -    68979     -     -    -
nfe0*  1500 Link#2        00:22:15:xx:yy:zz        0     0     0        0     0     0    0
plip0  1500 Link#3                                 0     0     0        0     0     0    0
lo0   16384 Link#4                                42     0     0       42     0     0    0
lo0   16384 fe80:4::1/64  fe80:4::1                0     -     -        0     -     -    -
lo0   16384 ::1/128       ::1                      0     -     -        0     -     -    -
lo0   16384 127.0.0.0/8   127.0.0.1               42     -     -       42     -     -    -

# sysctl kern.ipc.maxsockbuf
kern.ipc.maxsockbuf: 1048576
# sysctl net.inet.tcp.sendbuf_max
net.inet.tcp.sendbuf_max: 16777216
# sysctl net.inet.tcp.recvbuf_max
net.inet.tcp.recvbuf_max: 16777216
# sysctl net.inet.tcp.sendspace
net.inet.tcp.sendspace: 65536
# sysctl net.inet.tcp.recvspace
net.inet.tcp.recvspace: 131072

# sysctl hw.pci | grep msi
hw.pci.honor_msi_blacklist: 1
hw.pci.enable_msix: 1
hw.pci.enable_msi: 1

# vmstat -i
interrupt                          total       rate
irq14: ata0                            47          0
irq16: re0                         219278        191
irq21: ohci0+                        5939          5
irq22: vgapci0+                     77990         67
cpu0: timer                       2294451       1998
irq256: hdac0                       44069         38
cpu1: timer                       2293983       1998
Total                             4935757       4299

Any ideas?

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-08-30 Thread Rick C. Petty
On Sun, Aug 29, 2010 at 11:44:06AM -0400, Rick Macklem wrote:
  Hi. I'm still having problems with NFSv4 being very laggy on one
  client.
  When the NFSv4 server is at 50% idle CPU and the disks are  1% busy,
  I am
  getting horrible throughput on an idle client. Using dd(1) with 1 MB
  block
  size, when I try to read a  100 MB file from the client, I'm getting
  around 300-500 KiB/s. On another client, I see upwards of 20 MiB/s
  with
  the same test (on a different file). On the broken client:
 
 Since other client(s) are working well, that seems to suggest that it
 is a network related problem and not a bug in the NFS code.

Well, I wouldn't say "well".  Every client I've set up has had this issue,
and somehow, through tweaking various settings and restarting nfs a bunch of
times, I've been able to make it tolerable for most clients.  Only one
client is behaving well, and that happens to be the only machine I haven't
rebooted since I enabled NFSv4.  Other clients are seeing 2-3 MiB/s on my
dd(1) test.
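
The dd(1) test I keep referring to is essentially this, run against a large
file on the NFS mount (the path is just an example):

# dd if=/mnt/somefile of=/dev/null bs=1m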

I should point out that caching is an issue.  The second time I run a dd on
the same input file, I get upwards of 20-35 MiB/s on the bad client.  But
I can invalidate the cache by unmounting and remounting the file system,
so it looks like client-side caching.

I'm not sure how you can say it's network-related and not NFS.  Things
worked just fine with NFSv3 (in fact an NFSv3 client using the same NFSv4
server doesn't have this problem).  Using rsync over ssh I get around 15-20
MiB/s throughput, and dd piped through ssh gets almost 40 MiB/s (neither
one is using compression)!

 First off, the obvious question: How does this client differ from the
 one that performs much better?

Different hardware (CPU, board, memory).  I'm also hoping it was some
sysctl tweak I did, but I can't seem to determine what it was.

 Do they both use the same re network interface for the NFS traffic?
 (If the answer is no, I'd be suspicious that the re hardware or
 device driver is the culprit.)

That's the same thing you and others said about the *other* NFSv4 clients
I set up.  How is v4 that much different than v3 in terms of network
traffic?  The other clients are all using re0 and exactly the same
ifconfig options and flags, including the client that's behaving fine.

 Things that I might try in an effort to isolate the problem:
 - switch the NFS traffic to use the nfe0 net interface.

I'll consider it.  I'm not convinced it's a NIC problem yet.

 - put a net interface identical to the one on the client that
   works well in the machine and use that for the NFS traffic.

It's already close enough.  Bad client:

r...@pci0:1:7:0: class=0x02 card=0x816910ec chip=0x816910ec rev=0x10 
hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'Single Gigabit LOM Ethernet Controller (RTL8110)'
class  = network
subclass   = ethernet

Good client:

r...@pci0:1:0:0: class=0x02 card=0x84321043 chip=0x816810ec rev=0x06 
hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
class  = network
subclass   = ethernet

Mediocre client:

r...@pci0:1:0:0: class=0x02 card=0x84321043 chip=0x816810ec rev=0x06 
hdr=0x00
vendor = 'Realtek Semiconductor'
device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
class  = network
subclass   = ethernet

The mediocre and good clients have exactly identical hardware, and often
I'll witness the slow-client behavior on the mediocre client and rarely
on the good client, although in previous emails to you it was the good
client that was behaving the worst of all.

Other differences:
good client = 8.1 GENERIC r210227M amd64 12GB RAM Athlon II X2 255
med. client = 8.1 GENERIC r209555M i386 4GB RAM Athlon II X2 255
bad client = 8.1 GENERIC r211534M i386 2GB RAM Athlon 64 X2 5200+

 - turn off TXCSUM and RXCSUM on re0

Tried that, didn't help although it seemed to slow things down a little.

 - reduce the read/write data size, using rsize=N,wsize=N on the
   mount. (It will default to MAXBSIZE and some net interfaces don't
   handle large bursts of received data well. If you drop it to
  rsize=8192,wsize=8192 and things improve, then increase N until it
   screws up.)

8k didn't improve things at all.

 - check the port configuration on the switch end, to make sure it
   is also 1000bps-full duplex.

It is, and has been.

 - move the client to a different net port on the switch or even a
   different switch (and change the cable, while you're at it).

I've tried that too.  The switches are great and my cables are fine.
Like I said, NFSv3 on the same mount point works just fine (dd does
around 30-35 MiB/s).

 - Look at netstat -s and see if there are a lot of retransmits
   going on in TCP.

2 of 40k TCP packets retransmitted, 7k of 40k duplicate acks received.
I don't see anything else in netstat -s with numbers larger than 10.

 If none of the above seems to help, you 

Re: Why is NFSv4 so slow?

2010-09-03 Thread Rick C. Petty
On Mon, Aug 30, 2010 at 09:59:38PM -0400, Rick Macklem wrote:
 
 I don't tune anything with sysctl, I just use what I get from an
 install from CD onto i386 hardware. (I don't even bother to increase
 kern.ipc.maxsockbuf although I suggest that in the mount message.)

Sure.  But maybe you don't have server mount points with 34k+ files in
them?  I notice when I increase maxsockbuf, the problem of disappearing
files goes away, mostly.  Often a find /mnt fixes the problem
temporarily, until I unmount and mount again.

 The only thing I can suggest is trying:
 # mount -t newnfs -o nfsv3 server:/path /mnt
 and seeing if that performs like the regular NFSv3 or has
 the perf. issue you see for NFSv4?

Yes, that has the same exact problem.  However, if I use:
mount -t nfs server:/path /mnt
The problem does indeed go away!  But it means I have to mount all the
subdirectories independently, which I'm trying to avoid and is the
reason I went to NFSv4.

 If this does have the perf. issue, then the exp. client
 is most likely the cause and may get better in a few months
 when I bring it up-to-date.

Then that settles it: the newnfs client seems to be the problem.  Just
to recap...  These two are *terribly* slow (e.g. a VBR mp3 avg 192kbps
cannot be played without skips):
mount -t newnfs -o nfsv4 server:/path /mnt
mount -t newnfs -o nfsv3 server:/path /mnt
But this one works just fine (H.264 1080p video does not skip):
mount -t nfs server:/path /mnt

I guess I will have to wait for you to bring the v4 client up to date.
Thanks again for all of your contributions and for porting NFSv4 to
FreeBSD!

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-09-03 Thread Rick C. Petty
On Wed, Sep 01, 2010 at 11:46:30AM -0400, Rick Macklem wrote:
  
  I am experiencing similar issues with newnfs:
  
  1) I have two clients that each get around 0.5MiB/s to 2.6MiB/s
  reading
  from the NFS4-share on Gbit-Lan
  
  2) Mounting with -t newnfs -o nfsv3 results in no performance gain
  whatsoever.
  
  3) Mounting with -t nfs results in 58MiB/s ! (Netcat has similar
  performance) ??? not a hardware/driver issue from my pov
 
 Ok, so it does sound like an issue in the experimental client and
 not NFSv4. For the most part, the read code is the same as
 the regular client, but it hasn't been brought up-to-date
 with recent changes.

Do you (or will you soon) have some patches I/we could test?  I'm
willing to try anything to avoid mounting ten or so subdirectories in
each of my mount points.

 One thing you could try is building a kernel without SMP enabled
 and see if that helps? (I only have single core hardware, so I won't
 see any SMP races.) If that helps, I can compare the regular vs
 experimental client for smp locking in the read stuff.

I can try disabling SMP too.  Should that really matter, if you're not
even pegging one CPU?  The locks shouldn't have *that* much overhead...

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-09-13 Thread Rick C. Petty
On Mon, Sep 13, 2010 at 11:15:34AM -0400, Rick Macklem wrote:
  
  instead of from the local cache. I also made sure that
  the file was in the cache on the server, so the server's
  disk speed is irrelevant.
  

snip

  So, nfs is roughly twice as fast as newnfs, indeed.

Hmm, I have the same network switch as Oliver, and I wasn't caching the
file on the server before.  When I cache the file on the server, I get
about 1 MiB/s faster throughput, so that doesn't seem to make the
difference to me (but with higher throughputs, I would imagine it would).

 Thanks for doing the test. I think I can find out what causes the
 factor of 2 someday. What is really weird is that some people see
 several orders of magnitude slower (a few Mbytes/sec).
 
 Your case was also useful, because you are using the same net
 interface/driver as the original report of a few Mbytes/sec, so it
 doesn't appear to be an re problem.

I believe I said something to that effect.  :-P

The problem I have is that the magnitude of the throughput varies randomly.
Sometimes I can repeat the test and see 3-4 MB/s.  Then my server's
motherboard failed last week, so I swapped things around, and now I get 9-10
MB/s on the same client (but using a 100Mbit interface instead of gigabit, so
those speeds make sense).

One thing I noticed is the lag seems to have disappeared after the reboots.
Another thing I had to change was that I was using an NFSv3 mount for /home
(with the v3 client, not the experimental v3/v4 client) and now I'm using
NFSv4 mounts exclusively.  Too much hardware changed because of that board
failing (AHCI was randomly dropping disks, and it got to the point that it
wouldn't pick up drives after a cold start and then the board failed to
POST 11 of 12 times), so I haven't been able to reliably reproduce any
problems.  I also had to reboot the bad client because of the broken
NFSv3 mountpoints, and the server was auto-upgraded to a newer 8.1-stable
(I run make buildworld kernel regularly, so any reboots will
automatically have a newer kernel).

There's definite evidence that the newnfs mounts are slower than plain nfs,
and sometimes orders of magnitude slower (as others have shown).  But the
old nfs is so broken in other ways that I'd prefer slower yet more stable.
Thanks again for all your help, Rick!

-- Rick C. Petty