from:"Gerrit Kühn"

Re: limit process memory usage

2020-01-30 Thread Gerrit Kühn

On Thu, 30 Jan 2020 22:18:12 -0800
Julian Elischer wrote:


> start with the man page  man 1 limits and follow the "see also" links.  

Ah, great, that helps, thanks a lot!
I wonder why I didn't find this before. I found rctl that is linked in the
"see also" section of limits. However, there is no such link in the other
direction.


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

limit process memory usage

2020-01-30 Thread Gerrit Kühn

Hello all,

I have an application that sometimes develops some kind of memory leak
or similar and eats up all RAM within a few minutes, until the system
runs out of memory and swap. The kernel then starts killing other
processes at random, and finally the system crashes.
Is there a way to limit the memory available to an (or any) application so
that something like this doesn't tear down the whole system every time it
happens but just kills the culprit? I found the rctl tool, but I couldn't
make out how to use it for this purpose so far.
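
For reference, a minimal sketch of how such a limit might be expressed
with rctl(8), using its rule format subject:subject-id:resource:action=amount.
The user name "appuser" and the 4g cap are hypothetical, and make_rule is
only a helper invented for this example. Resource accounting has to be
enabled at boot (kern.racct.enable=1 in /boot/loader.conf) before rules
take effect.

```shell
#!/bin/sh
# Compose an rctl(8) rule: subject:subject-id:resource:action=amount
# (make_rule is a hypothetical helper, not part of rctl itself).
make_rule() {
    printf '%s:%s:%s:%s=%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Deny further allocations once user "appuser" (hypothetical) exceeds
# 4 GB of virtual memory.
rule=$(make_rule user appuser vmemoryuse deny 4g)
echo "$rule"

# On a FreeBSD host with RACCT/RCTL enabled, the rule could then be
# added and the current usage inspected with:
#   rctl -a "$rule"
#   rctl -u user:appuser
```

For a single command rather than a whole user, limits(1) with -v (for
example "limits -v 4g command") is a possible alternative.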


cu
  Gerrit

Re: high cpu irq load and slow boot after update from 10.4 to 11.2

2018-11-29 Thread Gerrit Kühn

On Thu, 29 Nov 2018 15:12:34 +0700 Eugene Grosbein 
wrote about Re: high cpu irq load and slow boot after update from 10.4 to
11.2:

> Just report all that you have already tested, and results (lack of).

I didn't pipe all the debug info into files so far, so I'll have to wait a
few days for the issue to pop up again.
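
A small sketch of how such output might be captured to files next time,
so that it can be attached to a PR. The snapshot helper and the
/var/tmp/irqdebug directory are hypothetical; vmstat -i is the command
mentioned in this thread.

```shell
#!/bin/sh
# snapshot DIR NAME CMD [ARGS...]
# Run CMD and store its output in DIR/NAME.<timestamp>.txt, then print
# the name of the file. (Hypothetical helper for illustration.)
snapshot() {
    dir=$1
    name=$2
    shift 2
    mkdir -p "$dir" || return 1
    out="$dir/$name.$(date +%Y%m%dT%H%M%S).txt"
    "$@" > "$out" 2>&1
    printf '%s\n' "$out"
}

# Example use while the problem is visible (FreeBSD commands, shown as
# comments so the helper itself stays portable):
#   snapshot /var/tmp/irqdebug vmstat-i vmstat -i
#   snapshot /var/tmp/irqdebug eventtimer sysctl kern.eventtimer
#   snapshot /var/tmp/irqdebug timecounter sysctl kern.timecounter
```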



cu
  Gerrit

Re: high cpu irq load and slow boot after update from 10.4 to 11.2

2018-11-29 Thread Gerrit Kühn

On Thu, 29 Nov 2018 14:48:29 +0700 Eugene Grosbein 
wrote about Re: high cpu irq load and slow boot after update from 10.4 to
11.2:

> Fill a PR and include exact output showing IRQs and their count numbers.

Which output would that include? Is vmstat -i sufficient? Should I revert
to the default timer settings for the report?


cu
  Gerrit

Re: high cpu irq load and slow boot after update from 10.4 to 11.2

2018-11-28 Thread Gerrit Kühn

On Thu, 29 Nov 2018 07:48:23 +0100 Gerrit Kühn 
wrote about Re: high cpu irq load and slow boot after update from 10.4 to
11.2:

> The issue is back this morning: sys and irq load incredibly high, system
> hardly usable anymore.
> I guess it's time to try kern.eventtimer.periodic=1 now... should I do
> that in addition to the timecounter setting to hpet, or revert that
> one first?

Nothing helps, rebooting now to get the system back online.
Any further ideas on how to debug this?


cu
  Gerrit

Re: high cpu irq load and slow boot after update from 10.4 to 11.2

2018-11-28 Thread Gerrit Kühn

On Mon, 26 Nov 2018 15:37:02 +0100 Gerrit Kühn 
wrote about Re: high cpu irq load and slow boot after update from 10.4 to
11.2:

> > Try switching to "HPET" in both cases and do not wait for next disaster
> > but do it right now with sysctl command, reboot is not needed.
> > You can restore default settings any moment same way.
> > 
> > And see if the problem goes away.

> Ok, I did that, let's see what happens next (I guess I cannot declare
> victory until it's running stable for more than 4 weeks from now on).
> Thanks again!

The issue is back this morning: sys and irq load incredibly high, system
hardly usable anymore.
I guess it's time to try kern.eventtimer.periodic=1 now... should I do
that in addition to the timecounter setting to hpet, or revert that
one first?


cu
  Gerrit

Re: high cpu irq load and slow boot after update from 10.4 to 11.2

2018-11-26 Thread Gerrit Kühn

On Mon, 26 Nov 2018 21:09:18 +0700 Eugene Grosbein 
wrote about Re: high cpu irq load and slow boot after update from 10.4 to
11.2:

> Try switching to "HPET" in both cases and do not wait for next disaster
> but do it right now with sysctl command, reboot is not needed.
> You can restore default settings any moment same way.
> 
> And see if the problem goes away.

Ok, I did that, let's see what happens next (I guess I cannot declare
victory until it's running stable for more than 4 weeks from now on).
Thanks again!


cu
  Gerrit

Re: high cpu irq load and slow boot after update from 10.4 to 11.2

2018-11-26 Thread Gerrit Kühn

On Mon, 26 Nov 2018 19:34:43 +0700 Eugene Grosbein 
wrote about Re: high cpu irq load and slow boot after update from 10.4 to
11.2:


> > Any ideas?

> Maybe this box has some clocking problems incompatible with tickless
> kernel. 

Is there anything I could look out for in dmesg or similar to spot the
root cause for this behaviour? The CPUs are Xeon E5606 on a Supermicro
X8DTU mainboard.

> Try get back to old periodic ticking with sysctl
> kern.eventtimer.periodic=1 instead of now default 0.

I'll try that as soon as I spot the issue again.

> Or, if you are curious, run ntpd if it is not already running, wait
> about an hour then look to its /var/db/ntpd.drift file to see if system
> clock is good or not.

ntpd is always running. Right now it looks ok to me (but the issue is not
there, either).

root@storage:~ # cat /var/db/ntpd.drift
-1.366

> Perhaps, you can get better behaviour changing default value
> of kern.timecounter.hardware to another one from kern.timecounter.choice;
> same with kern.eventtimer.timer and kern.eventtimer.choice

Would that work while I see the issue (i.e., should it make the issue go
away then), or should this be set on (re)boot?

Which settings would be recommended to try? This is what I have now:

---
root@storage:~ # sysctl kern.timecounter.hardware
kern.timecounter.hardware: TSC

root@storage:~ # sysctl kern.timecounter.choice
kern.timecounter.choice: ACPI-safe(850) HPET(950) i8254(0) TSC(1000) dummy(-100)

root@storage:~ # sysctl kern.eventtimer.timer
kern.eventtimer.timer: LAPIC

root@storage:~ # sysctl kern.eventtimer.choice
kern.eventtimer.choice: LAPIC(600) HPET(350) HPET1(340) HPET2(340) HPET3(340) i8254(100) RTC(0)
---
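
A sketch of how the choice lists above could be ranked by their quality
value (the number in parentheses) before picking an alternative. The
rank_choices function is a hypothetical helper; the sample string is the
kern.timecounter.choice output shown above.

```shell
#!/bin/sh
# rank_choices "NAME(quality) NAME(quality) ..."
# Print one "quality name" pair per line, best quality first.
rank_choices() {
    # $1 is left unquoted on purpose so the shell splits it into words.
    printf '%s\n' $1 |
        sed -n 's/^\(.*\)(\(-\{0,1\}[0-9][0-9]*\))$/\2 \1/p' |
        sort -rn
}

rank_choices 'ACPI-safe(850) HPET(950) i8254(0) TSC(1000) dummy(-100)'

# Switching at runtime on FreeBSD would then look like this
# (no reboot needed, and reversible the same way):
#   sysctl kern.timecounter.hardware=HPET
#   sysctl kern.eventtimer.timer=HPET
#   sysctl kern.eventtimer.periodic=1
```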


Thanks for your input.


cu
  Gerrit

high cpu irq load and slow boot after update from 10.4 to 11.2

2018-11-26 Thread Gerrit Kühn

Hi all,

A couple of weeks ago, I updated an older storage server (2 CPUs, 4 cores
each, 48GB RAM, 36x4GB HDDs, 3 LSI-based mps controllers) from 10.4 to
11.2. The first thing I noticed was that booting takes much longer now. The
system probes each HDD (there are 36 of them, attached to mps controllers)
very slowly multiple times (I can see the light of each disk blinking,
it takes seconds to go on to the next disk), the whole process takes
several minutes (was much faster before).

A more nasty issue appears after a couple of weeks of operation (so far,
roughly between 15 and 30 days):
Suddenly there is a very high irq load on one of the CPU cores
(cpu:timer), causing high system load and high cpu load (top easily
shows average load over 10, whereas it was always below 1 before). I cannot
find any process or device as a culprit. First I thought this problem can
only be made to go away by rebooting, but now I managed to get rid of it
(at least for some time, don't know if or when it will be back) while
checking out the latest source in background (I actually intended to fiddle
with some kernel settings, but suddenly the issue was gone after
persisting permanently over the weekend).

Looking around, I found a couple of vaguely similar reports (like
https://lists.freebsd.org/pipermail/freebsd-current/2017-January/064419.html),
but these all appear to be fixed by now.
I have a couple of other storage machines (mostly mps-based, but always
slightly different hardware) that show no such issue after updating to
11.2.

Any ideas?


cu
  Gerrit

Re: devd out of swap space ? (zfs arc related ?)

2018-04-09 Thread Gerrit Kühn

On Mon, 9 Apr 2018 15:27:45 -0400 Mike Tancsa  wrote
about devd out of swap space ? (zfs arc related ?):

> Anyone else seen anything like this on a recent RELENG11 STABLE ?

I think I saw something similar last week with a -stable from
sometime in March. Lots of processes crashed overnight due to "out of
swap space" although there appeared to be plenty of both swap and RAM.
Somehow it looked ARC-related to me, but I haven't been able to reproduce
it so far (however, I did not try too hard, either ;-).
This is what top shows on the machine right now:

CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 2776K Active, 7231M Inact, 31M Laundry, 24G Wired, 934M Buf, 288M Free
ARC: 20G Total, 6678M MFU, 13G MRU, 1060K Anon, 182M Header, 146M Other
 19G Compressed, 82G Uncompressed, 4.24:1 Ratio



cu
  Gerrit

Re: ZFS - poor performance with "large" directories

2015-11-25 Thread Gerrit Kühn

On Wed, 25 Nov 2015 12:06:38 + krad  wrote about Re:
ZFS - poor performance with "large" directories:

K> consumer SSDs are cheap enough now not to bother with usb drives I would
K> imagine.

Sure. I was just suggesting a USB drive as a quick way to check if this
might help at all. Most people have USB drives lying around, and they can
simply be plugged into any computer. An SSD (cheap or not) more likely
needs to be bought first, and the server box might need to be opened to
have it installed.


cu
  Gerrit

Re: ZFS - poor performance with "large" directories

2015-11-24 Thread Gerrit Kühn

On Tue, 24 Nov 2015 17:11:54 +0100 Albert Cervin 
wrote about Re: ZFS - poor performance with "large" directories:

AC> Will try a bit with the meta limit.

You can also put metadata on a flash device to speed things up. To
check if this is really the bottleneck in your case, something simple like
a USB stick might suffice to try out:
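
As an illustration of one possible mechanism (an assumption; the post
does not name it), a flash device can be attached as an L2ARC cache
device and the L2ARC restricted to metadata. The sketch only prints the
commands; the pool name "tank" and the device "da8" are placeholders.

```shell
#!/bin/sh
# plan_metadata_cache POOL DEVICE
# Print (not run) the commands that would add DEVICE to POOL as an
# L2ARC cache device and limit the L2ARC to metadata.
plan_metadata_cache() {
    printf 'zpool add %s cache %s\n' "$1" "$2"
    printf 'zfs set secondarycache=metadata %s\n' "$1"
}

# "tank" and "da8" are hypothetical names.
plan_metadata_cache tank da8

# A cache device can be taken out again with:
#   zpool remove tank da8
```

Since a cache device can be removed again without harming the pool, this
kind of test is easy to undo.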



cu
  Gerrit

Re: dev.ix.0.queueX.interrupt_rate

2015-08-26 Thread Gerrit Kühn

On Tue, 25 Aug 2015 19:45:08 +0300 Slawa Olhovchenkov s...@zxy.spb.ru
wrote about Re: dev.ix.0.queueX.interrupt_rate:

SO> For discover poor network performance you need:

[...]

I am already through this, see
https://lists.freebsd.org/pipermail/freebsd-net/2015-June/042536.html

I just wanted to let /you/ know that my interrupt rates look similar (you
asked for that).
I think there are a couple of people on the mailing lists reporting
performance issues with ix interfaces and (especially?) NFS lately. Would
be great if we could find and fix the root cause(s)...


cu
  Gerrit

Re: dev.ix.0.queueX.interrupt_rate

2015-08-25 Thread Gerrit Kühn

On Tue, 25 Aug 2015 07:55:49 -0400 (EDT) Rick Macklem
rmack...@uoguelph.ca wrote about Re: dev.ix.0.queueX.interrupt_rate:


RM> If you have tso enabled, you could try this patch:
RM>   https://reviews.freebsd.org/D3477
RM>
RM> If TSO is disabled, then we don't have an explanation for poor NFS
RM> performance yet.

I tried both with and without TSO, and it does not appear to make any
difference for me. I get about 50MB/s net write speed and about 200MB/s
read. Even my 1GbE interface performs better in terms of writing.

RM> If you haven't seen it, you might want to keep an eye
RM> on this thread: http://docs.FreeBSD.org/cgi/mid.cgi?55DC1B5A.8010109

Yes, I am watching this. Thanks for the pointer.


cu
  Gerrit

Re: dev.ix.0.queueX.interrupt_rate

2015-08-25 Thread Gerrit Kühn

On Mon, 24 Aug 2015 22:29:26 +0300 Slawa Olhovchenkov s...@zxy.spb.ru
wrote about dev.ix.0.queueX.interrupt_rate:

SO> Last -stable, no tuning. Is this normal?

From 10.2-rel (and still having severe performance issues with NFS as
reported before):

dev.ix.0.queue7.interrupt_rate: 31250
dev.ix.0.queue6.interrupt_rate: 10
dev.ix.0.queue5.interrupt_rate: 8
dev.ix.0.queue4.interrupt_rate: 10
dev.ix.0.queue3.interrupt_rate: 50
dev.ix.0.queue2.interrupt_rate: 31250
dev.ix.0.queue1.interrupt_rate: 31250
dev.ix.0.queue0.interrupt_rate: 50



cu
  Gerrit

Fw: zfs snapshot: Bad file descriptor

2011-08-25 Thread Gerrit Kühn

Sorry for crossposting, but I got no answer at all from freebsd-fs. Anyone
in here having any ideas/suggestions on this?


cu
  Gerrit



Begin forwarded message:

Date: Tue, 23 Aug 2011 17:02:55 +0200
From: Gerrit Kühn ger...@pmp.uni-hannover.de
To: freebsd...@freebsd.org
Subject: zfs snapshot: Bad file descriptor


Hi all,

since upgrading some of my storage machines to recent 8.2-stable and
zfs-v28 I see the following on some filesystems after some time of
operation:

---
mclane# ll /tank/home/pt/.zfs
ls: snapshot: Bad file descriptor
total 0
---


I make quite heavy use of snapshots on all my machines and use rsync to
backup snapshots to other machines.
Googling around I found several people reporting similar problems, but no
real solution (apart from rebooting, which is not really a thing you want
to do every time you run into this).
Are there any ideas on the list here on how to improve
this situation? Am I just one of the few unlucky people who see this, or is
there an actual reason for this happening that could be fixed or
circumvented?


cu
  Gerrit
___
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to freebsd-fs-unsubscr...@freebsd.org


diskless booting with 8.2 regression?

2011-07-18 Thread Gerrit Kühn

Hi all,

I just updated my nfs/tftp server for diskless booting from 8.0-rel to
8.2-stable. I have a bunch of Linux clients that used to work with the
8.0-setup, but fail to boot now.

On the server side I see

Jul 18 11:18:24 mclane tftpd[72434]: Got ERROR packet: TFTP Aborted

in the log/messages, but the Linux kernel appears to be transferred over
the net just fine (so this is probably not the real issue). It starts to
boot and fails at some later point (with no apparent error message on
screen) causing an endless reboot loop.
I already googled for quite some time on this now, but nothing useful
came up. The error message above seems to be harmless, at least the
machines of people reporting them work nevertheless.

Are there any known issues/regressions with tftp/nfs diskless booting? I
read in some posts that people were vaguely having problems with it when
updating to 8.2-something, but could not find any details. Are there any
further hints what I could do to narrow down the problem?


cu
  Gerrit

Re: diskless booting with 8.2 regression?

2011-07-18 Thread Gerrit Kühn

On Mon, 18 Jul 2011 12:38:22 +0200 Gerrit Kühn
ger...@pmp.uni-hannover.de wrote about diskless booting with 8.2
regression?:

I guess I found the root of all evil now: device nodes on zfs!

This is how a Linux /dev/console looked on the 8.0-FreeBSD server on
zfsv15:

crw---  1 root  wheel    5,   1 Oct 28  2010 /tank/diskless/gco-fe2/dev/console


Now, after updating to FreeBSD-8.2 and zfsv28 it looks like this on the server:

crw-r--r--  1 root  wheel  255, 0x00ff Jul 18 16:33 /tank/diskless/pt-fe2/dev/console


Strangely enough, the Linux client still displays the correct values when
using ls -la, but it refuses to work properly.
I tried creating new device nodes from the client side with mknod and I tried
getting correct ones from a backup, but they always end up being broken. Even
moving the directories over to a ufs volume leaves them unusable:

crw-r--r--  1 root  wheel    0,   0 Jul 18 16:33 /tmp/console


Luckily, I am back in business now with my machines, because moving the stuff
from zfs to ufs and dropping in a correct version of /dev on the ufs side works
just fine.

However, it would be great if this could be fixed, because I do not have many 
ufs partitions left these days...


cu
  Gerrit




Re: drives >2TB on mpt device

2011-04-14 Thread Gerrit Kühn

On Mon, 4 Apr 2011 07:37:15 -0700 Artem Belevich a...@freebsd.org wrote
about Re: drives >2TB on mpt device:

AB> You're probably out of luck as far as 2Tb+ support for 1068-based HBAs:
AB> http://kb.lsi.com/KnowledgebaseArticle16399.aspx
AB>
AB> Newer controllers based on LSI2008 (mps driver?) should not have that
AB> limit.

For the record:
My latest info from Supermicro is that the chip would do above 2TB with
SAS drives, but doesn't do it with SATA...

However, I changed to a 3ware 9650se controller now. I had to flash a beta
firmware to get 2.8TB recognized on the drives, but now it seems to work.
Only the twa device/controller responds with a reset when trying to do zdb
-C... strange. But apart from that the drives seem to work fine, even
with zfs.

To sum up what I experienced during the last days: all cheap controllers
I tried (nvidia mcp55 onboard, SiI 3124) work fine out-of-the-box. All
expensive, scsi-like stuff (3ware 9650, lsi) needs at least firmware
updates or does not work (meaning shows either 2TB, 800GB or does not work
at all). For my old 3ware 9550 controllers there is not even a beta
firmware available to fix the problem. :-(
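
The 2TB and 800GB figures would be consistent with a 32-bit sector count,
which is an assumption here rather than something confirmed in this
thread: 2^32 sectors of 512 bytes is about 2.2 TB, and a nominal 3 TB
drive wraps around to roughly 0.8 TB. A quick check in shell arithmetic
(the function names are made up for this example):

```shell
#!/bin/sh
# Largest size in bytes addressable with a 32-bit sector count.
lba32_limit() {
    # $1: sector size in bytes
    echo $(( 4294967296 * $1 ))
}

# Size a drive appears to have if its sector count wraps at 2^32.
wrapped_size() {
    # $1: real drive size in bytes, $2: sector size in bytes
    echo $(( $1 % $(lba32_limit "$2") ))
}

lba32_limit 512                  # bytes addressable with 32-bit LBA
wrapped_size 3000000000000 512   # nominal 3 TB drive (assumed size)
```

This prints 2199023255552 (about 2.2 TB) and 800976744448 (about 0.8 TB).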


cu
  Gerrit

drives >2TB on mpt device

2011-04-04 Thread Gerrit Kühn

Hi all,

I have a freshly installed 8.2-REL with a SuperMicro AOC-USASLP-L8i
controller (LSI/MPT 1068E chipset). I have several of these controllers
working nicely in other systems.
However, this time I tried drives >2TB for the first time (Hitachi
Deskstar 3TB). It appears that the mpt device reports only 2TB in this
case. I have already flashed the controller's firmware to the latest
available version (from 2009), but that did not change anything. The drive
is working fine on the standard SATA connectors on the mainboard
(Supermicro H8DME-2) and reports 2.8TB there.
Are there any hints how to access the full drive? Am I seeing a limitation
of the controller/firmware or rather of the driver (mpt)?


cu
  Gerrit

Re: drives >2TB on mpt device

2011-04-04 Thread Gerrit Kühn

On Mon, 4 Apr 2011 14:36:25 +0100 Bruce Cran br...@cran.org.uk wrote
about Re: drives >2TB on mpt device:

Hi Bruce,

BC> It looks like a known issue:
BC> http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/147572

Hm, I don't know if this is exactly what I'm seeing here (although the
cause may be the same):
I do not use mptutil. The controller is dumb (without actual raid
processor), and I intend to use it with zfs. However, I cannot even get
gpart to create a partition larger than 2TB, because mpt comes back with
only 2TB after probing the drive. As this is a problem that already exists
with 1 drive, I cannot use gstripe or zfs to get around this.
But the PR above states that this limitation is already built into mpt, so
my only chance is probably to try a different controller/driver (any
suggestions for a cheap 8port controller to use with zfs?), or to wait
until mpt is updated to support larger drives. Does anyone know if there
is already ongoing effort to do this?


cu
  Gerrit

Re: NFS client over udp

2011-02-21 Thread Gerrit Kühn

On Sat, 19 Feb 2011 08:56:45 -0800 (PST) Kirill Yelizarov
ykir...@yahoo.com wrote about Re: NFS client over udp:

KY> > > Is ZFS in use on the system which sees rising wired
KY> > > memory?

KY> > No, ufs only.
KY> I found an old post stating there is a leak with nfs udp client over
KY> zfs:
KY> http://lists.freebsd.org/pipermail/freebsd-fs/2010-February/007876.html

Later on in that thread we found out that the leak has nothing to do with
zfs and is triggered just by using nfs over udp. I cannot remember if
there was a fix for that (Rick probably knows); at some point I just
turned off udp on the server side completely and switched all clients to
tcp to get my systems stable again.


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: mbuf leakage with nfs

2010-03-01 Thread Gerrit Kühn

On Mon, 01 Mar 2010 12:52:32 +0100 Willem Jan Withagen w...@digiware.nl
wrote about Re: mbuf leakage with nfs:

WJW> > In my case it is the Linux client causing the problems (cannot tell
WJW> > yet if it is only with udp, but I would think so). If I understand
WJW> > Daniel correctly his latest tests were performed with FreeBSD
WJW> > client and udp. So it may very well be a general issue with udp?!
WJW> > Would this help narrowing down the problem?
WJW>
WJW> I'm off 'till thursday.
WJW> At which time I'm willing to run more tests. Got plenty of boxes here.
WJW> Both FreeBSD and Linux. And otherwise will boot more in VirtualBox.

I finally took an axe and restarted nfsd without -u. Now my mbuf usage is
flat as it should be. I guess some people using computers with udp
mounts will complain, but this can be fixed easily by converting their
connections to tcp.
However, I am still interested in having the issue fixed, so I will be
following the thread and contribute if possible.
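
For clients that still mount over udp, the change amounts to swapping
one mount option. A sketch, assuming a Linux-style comma-separated option
string like the one quoted elsewhere in this thread; udp_to_tcp is a
hypothetical helper:

```shell
#!/bin/sh
# udp_to_tcp OPTIONS
# Replace a standalone "udp" entry in a comma-separated mount option
# list with "tcp", leaving all other options untouched.
udp_to_tcp() {
    printf '%s\n' "$1" |
        sed -e 's/^udp$/tcp/' -e 's/^udp,/tcp,/' \
            -e 's/,udp,/,tcp,/' -e 's/,udp$/,tcp/'
}

udp_to_tcp 'rw,udp,nolock,rsize=32768,wsize=32768,intr'

# On the FreeBSD server, dropping -u from the nfsd flags (for example
# nfs_server_flags="-t -n 4" in /etc/rc.conf) disables udp service.
```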


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: mbuf leakage with nfs

2010-02-28 Thread Gerrit Kühn

On Sun, 28 Feb 2010 12:21:28 + Robert N. M. Watson
rwat...@freebsd.org wrote about Re: mbuf leakage with nfs/zfs? :

RNMW> It's almost certainly one or a small number of very specific RPCs
RNMW> that are triggering it -- maybe OpenBSD does an extra lookup, or
RNMW> stat, or something, on a name that may not exist anymore, or does it
RNMW> sooner than the other clients. Hard to say, other than to wave hands
RNMW> at the possibilities.
RNMW>
RNMW> And it may well be we're looking at two bugs: Danny may see one bug,
RNMW> perhaps triggered by a race condition, but it may be different from
RNMW> the OpenBSD client-triggered bug (to be clear: it's definitely a
RNMW> FreeBSD bug, although we might only see it when an OpenBSD client is
RNMW> used because perhaps OpenBSD also has a bug or feature).

In my case it is the Linux client causing the problems (cannot tell yet if
it is only with udp, but I would think so). If I understand Daniel
correctly, his latest tests were performed with a FreeBSD client and udp. So
it may very well be a general issue with udp?! Would this help narrow
down the problem?


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: mbuf leakage with nfs/udp (was: mbuf leakage with nfs/zfs)

2010-02-28 Thread Gerrit Kühn

On Sun, 28 Feb 2010 16:52:44 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/udp (was: mbuf leakage with nfs/zfs):

DB> well, I have further reduced the problem, it happens with NFS/UDP
DB> writes. i'll try the wireshark road, but i'm very rusty with RPC, the
DB> other road is to check the changes, my oldest is from late october
DB> (RC2) where it's happening, while
DB> Gerrit tried 8-pre from November and worked, so it will be fun
DB> trying to nail it down :-)

I already withdrew from this position yesterday, because the 8-PRE server
I have does not have udp clients, only tcp. So I cannot tell (yet) whether
it is affected by the leakage or not.


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 09:24:10 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/zfs? :

DB> I doubt it, but here is another shot:
DB> are we all running samba? I'm asking because the lock manager keeps
DB> dying and ...

Nope, no samba on my side. I am running lockd and statd on the server, but
stopping them does not change anything. All clients are using option
nolock anyway.

DB> PS: I dropped Jack from the CC, I think em is innocent :-)

Yes, good idea.


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 11:14:56 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/zfs? :

DB> anyways, I am running tests on an 'unused' server, only me using it to
DB> 'make world'
DB> and it's leaking.

Hm, I've got a server with 8-PRE from sometime in Nov09 that is serving
nfs from zfs fine and shows no leakage...


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 12:26:02 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/zfs? :


DB> > Hm, I've got a server with 8-PRE from sometime in Nov09 that is
DB> > serving nfs from zfs fine and shows no leakage...

DB> the binary search has started!
DB> sorry, have to go now :-) [really], but should be back in a couple of
DB> hours, let me know if you managed to pin it down, else I can continue.

Sorry, but I cannot do much over the weekend. Both the machine with leakage
and the one without are in production (and about 40km apart from each
other and away from my home :-).
I still wonder if there are more circumstances needed to provoke this
problem. I really doubt that this would have gone unnoticed for weeks or
even months if it only takes some nfs-server serving from zfs storage and
some client to see it.
What does the client in your test setup look like?


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 15:15:52 +0100 Willem Jan Withagen w...@digiware.nl
wrote about Re: mbuf leakage with nfs/zfs?:

WJW> > 81492/2613/84105 mbufs in use (current/cache/total)
WJW> > 80467/2235/82702/128000 mbuf clusters in use
WJW> > (current/cache/total/max) 80458/822 mbuf+clusters out of packet
WJW> > secondary zone in use (current/cache)

WJW> Over the night I only had rsync and FreeBSD nfs traffic.
WJW>
WJW> 45337/2828/48165 mbufs in use (current/cache/total)
WJW> 44708/1902/46610/262144 mbuf clusters in use (current/cache/total/max)
WJW> 44040/888 mbuf+clusters out of packet secondary zone in use
WJW> (current/cache)

After about 24h I now have

128320/2630/130950 mbufs in use (current/cache/total)
127294/1200/128494/512000 mbuf clusters in use (current/cache/total/max)
127294/834 mbuf+clusters out of packet secondary zone in use
(current/cache)
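
To tell a leak from normal fluctuation it helps to log the current mbuf
count at intervals. A sketch that pulls the first number out of the
"mbufs in use" line of netstat -m output; mbufs_current is a hypothetical
helper, and the sample line is taken from the figures above:

```shell
#!/bin/sh
# mbufs_current: read netstat -m style output on stdin and print the
# "current" value from the "mbufs in use" line.
mbufs_current() {
    awk -F/ '/ mbufs in use/ { print $1; exit }'
}

printf '%s\n' '128320/2630/130950 mbufs in use (current/cache/total)' |
    mbufs_current

# On the server this could be sampled periodically, e.g.:
#   while :; do
#       echo "$(date +%s) $(netstat -m | mbufs_current)"
#       sleep 60
#   done
```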

WJW> I only have one Linux box running Kubuntu 8.10, mounted UDP:
WJW> (rw,udp,nolock,rsize=32768,wsize=32768,intr)

Hm, are you able to narrow this down? Does a single Linux client with tcp
mount cause the same trouble? Or a FreeBSD client with udp?
If it was only Linux clients with udp mounts or something like this, I
could understand why it took some time to pop up.

WJW> But running something like 'find openembedded | xargs cat > /dev/null'
WJW> Shows a steadily growing number of mbufs, and letting the system sit
WJW> for 5 min. doesn't decrease the used mbufs

I still have several udp and tcp mounts by Linux clients on my Server,
though most of them are probably stale now after the upgrade; and my
buffers keep draining...

WJW> Doing this on another FreeBSD 7.2 client runs the mbufs up (max inc
WJW> about 2000 mbuf), but within a few secs after the last file was
WJW> fetched, the mbuf tab runs down to around what it was before the
WJW> command.

FreeBSD client with udp mount? Then it is either Linux clients with udp or
all Linux clients triggering this leakage. I doubt that it is all Linux
clients; that would have caused more trouble earlier.

WJW Not sure where to go from here? I'm certainly not fluent enough in
WJW NFS to start interpreting a wireshark trace.

Neither am I.
I already wrote Rick Macklem an email on Friday, but so far only got back
an automated reply stating he is on permanent vacation. I guess we need
him or one of the other nfs guys to get this fixed.
Could you try a single Linux client with tcp mount in the meantime? This
would tell us if Linux clients as such are causing the issue, or if it is
only Linux with udp mount.


cu
  Gerrit


P.S.: I cc'ed freebsd-fs because my PR went there.

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 12:26:02 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/zfs? :


DB  Hm, I've got a server with 8-PRE from sometime in Nov09 that is
DB  serving nfs from zfs fine and shows no leakage...

DB the binary search has started!

After considering the last email from Willem: My 8-PRE server does not
have udp Linux clients, only Linux with tcp. If indeed Linux with udp is
causing the problem, it may very well even be in 8-PRE, and I just did not
see it so far.


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 21:32:39 +0100 Eirik Øverby ltn...@anduin.net wrote
about Re: mbuf leakage with nfs/zfs?:

E I've had a discussion with some folks on this for a while. I can easily
E reproduce this situation by mounting a FreeBSD ZFS filesystem via
E NFS-UDP from an OpenBSD machine. Telling the OpenBSD machine to use TCP
E instead of UDP makes the problem go away.

So we see this problem with udp clients from OpenBSD and Linux.

E Other FreeBSD systems mounting the same share, either using UDP or TCP,
E do not cause the problem to show up.

As Daniel reported he saw the problem with FBSD 8-stable: Which version
was the FBSD-client that worked for you with udp?

E A patch was suggested by Rick Macklem, but that did not solve the issue:
E http://lists.freebsd.org/pipermail/freebsd-current/2009-December/014181.html

Yeah, I also found and tried this on Friday - unfortunately without any
success, the leakage is still there.


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 21:36:47 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/zfs? :

DB I have been running 8-rel for the last few hours, and the only client
DB is another 8-stable; furthermore, no ZFS, just plain UFS, and the leak
DB is there!

Mounted via udp, not tcp, I guess...?!

DB I am now trying 8-rc2 but will check in the morning, it is after all
DB Saturday night :-)

Same here. :-)


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 11:38:19 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: mbuf leakage with nfs/zfs?:

JC I should point out that the NFS+ZFS-based filer doesn't actually do its
JC backups using NFS; it uses rsnapshot (rsync) over SSH.  There is
JC intense network I/O during backup time though, depending on how much
JC data there is to back up.  The NFS mounts (on the clients) are only
JC used to provide a way for people to get access to their nightly
JC backups in a convenient way; it isn't used very heavily.

That's rather similar to my situation, I would say. Most traffic goes via
rsync, nfs only gives access to home dirs, which are not intensively used.

JC I can do something NFS-intensive on any of the above clients if people
JC want me to do that kind of testing.  Possibly an rsync with a source of
JC the NFS mount and a destination of the local disk would be a good
JC test?  Let me know if anyone's interested in me testing that.

From the last emails I would say we get the most out of it by comparing tcp
and udp clients to make sure this happens only with udp (and it is still
not quite clear to me if it also happens with a FBSD client using udp).

OTOH it would be great if someone with the ability to actually fix
something in the nfs code could join this discussion and guide us through
the debugging needed to do so.


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-27 Thread Gerrit Kühn

On Sat, 27 Feb 2010 22:40:43 +0100 Eirik Øverby ltn...@anduin.net wrote
about Re: mbuf leakage with nfs/zfs?:

E  So we see this problem with udp clients from OpenBSD and Linux.

E I have not had the opportunity to test with Linux or anything else.

I guess all others who reported so far (including me) had Linux on the
client side.

E Could try from Windows, but not sure I want to get my hands THAT dirty.

:-)))

E  As Daniel reported he saw the problem with FBSD 8-stable: Which
E  version was the FBSD-client that worked for you with udp?

E 7.1, 7.2, 8.0-RCsomething and 8.0-RELEASE - no problems with either.

Daniel, are you sure you had the leakage with 8-stable? Eirik, do you have
the opportunity to try 8-stable with udp?


cu
  Gerrit

Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 10:34:41 +0100 Willem Jan Withagen w...@digiware.nl
wrote about Re: em0 freezes on ZFS server:

WJW Probably the reason why this happened yesterday is that I started
WJW doing major software builds (over ZFS/NFS/TCP/v3) against data stored
WJW on this box.

I saw a similar problem this morning and suppose it started when some
automatic backup jobs started last night. An unstable em device is a rather
bad thing, I hope increasing the buffer (mine is at 64000 now) prevents
this from happening again.


cu
  Gerrit

Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn

On Thu, 25 Feb 2010 14:59:28 -0800 Jack Vogel jfvo...@gmail.com wrote
about Re: em0 freezes on ZFS server:

JV The failure to set up receive structures means it did not have
JV sufficient mbufs to set up the RX ring and buffer structs.

I don't know if this is related, but I updated an amd64 zfs machine with
several em cards from 7.2 to 8-stable yesterday. First it worked fine after
booting, but this morning, at least three of the five em interfaces did
not do much anymore. You could revive them for some seconds with ifconfig
down/up, but they always ceased functioning soon after that (within
seconds).
During debugging (up/down, load/unload if_em etc.) I saw the same error
message as above at some point. I finally gave up and rebooted the
machine. For now, everything appears to be back to normal (but for how
long?).

JV Not sure why this results in a lockup, but try and increase
JV kern.ipc.nmbclusters.

I just did that, just to make sure.


cu
  Gerrit

Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn

On Thu, 25 Feb 2010 14:59:28 -0800 Jack Vogel jfvo...@gmail.com wrote
about Re: em0 freezes on ZFS server:

JV The failure to set up receive structures means it did not have
JV sufficient mbufs to set up the RX ring and buffer structs.

I have been monitoring mbufs since I rebooted my server. Right now (after
2.5 hours or so of operation) the number of total clusters has already
increased to 15k. Is this normal behaviour for a relatively idle server or
will it inevitably go through the roof in some more hours?


Every 1s: netstat -m                              Fri Feb 26 13:14:54 2010

15001/2279/17280 mbufs in use (current/cache/total)
13970/1212/15182/64000 mbuf clusters in use (current/cache/total/max)
13970/750 mbuf+clusters out of packet secondary zone in use (current/cache)
0/119/119/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
31690K/3469K/35160K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
3 requests for I/O initiated by sendfile
0 calls to protocol drain routines
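
For anyone who wants to log these numbers instead of watching them, the
cluster line can be picked out of the netstat -m text with a small parser.
This is a minimal sketch that assumes the line format shown above; the
function name is made up for this example, and the sample text is copied
from the output above.

```python
import re


def parse_cluster_line(netstat_output):
    """Return (current, cache, total, max) from the 'mbuf clusters in use' line."""
    match = re.search(
        r"^(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use",
        netstat_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no 'mbuf clusters in use' line found")
    return tuple(int(group) for group in match.groups())


sample = """\
15001/2279/17280 mbufs in use (current/cache/total)
13970/1212/15182/64000 mbuf clusters in use (current/cache/total/max)
13970/750 mbuf+clusters out of packet secondary zone in use (current/cache)
"""

current, cache, total, limit = parse_cluster_line(sample)
usage = total / limit
print(f"{total} of {limit} clusters allocated ({usage:.0%})")
```

Feeding it one snapshot per minute would give the kind of time series that
is plotted later in this thread.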



cu
  Gerrit

Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 04:03:39 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: em0 freezes on ZFS server:

JC Note how close the current value is to that of total.  I'm not too
JC surprised you're seeing what you are as a result of this.  What on
JC earth is this machine doing at all times?

Well, speaking for my machine: serving some nfs dirs from zfs, doing some
file transfers via rsync/scp, serving some web pages (gitweb, redmine).
Really nothing spectacular. I just updated from 7.2 to 8-stable yesterday
and did not have that problem before. From my last email to now (about 15
minutes) mbuf clusters have increased from 15k to 18k. All my other
machines (even another one with 8-stable, but without nfs-services and
without em nics) have only a few k of buffers in use.
Is there any way I could find out what is actually using these buffers?


cu
  Gerrit


Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 13:31:38 +0100 Gerrit Kühn
ger...@pmp.uni-hannover.de wrote about Re: em0 freezes on ZFS server:

GK JC Note how close the current value is to that of total.  I'm not
GK JC too surprised you're seeing what you are as a result of this.
GK JC What on earth is this machine doing at all times?

GK Is there any way I could find out what is actually using these buffers?

Sorry for replying to my own email:
At least in my case I found out what is eating the buffers: nfsd does!
The buffers stop increasing as soon as I stop nfsd. However, they start
increasing as soon as I start nfsd again.
Are there any ideas how to fix this? Downgrading back to 7-stable is not
really an easy task as far as I know, and I need the server to run without
having to reboot it once or twice a day...


cu
  Gerrit

Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 15:04:37 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: em0 freezes on ZFS server :

DB  At least in my case I found out what is eating the buffers: nfsd
DB  does! The buffers stop increasing as soon as I stop nfsd. However,
DB  they start increasing as soon as I start nfsd again.
DB  Are there any ideas how to fix this? Downgrading back to 7-stable is
DB  not really an easy task as far as I know, and I need the server to
DB  run without having to reboot it once or twice a day...

DB I want to add some spices to this stew: :-)

You're welcome. :-)

DB A few days later it hung, and it's now hanging every few days.
DB Most of the hangs are because there is no network, but the NIC is bce
DB not em! I doubled kern.ipc.nmbclusters and let's see what happens ...

Do you have nfsd running and serving clients? If so, we should maybe
change the topic to something like possible nfs mbuf leakage...

DB 23066/6634/29700 mbufs in use (current/cache/total)

My server is at 22k now, and the buffer number is still increasing every
few seconds...
Can you monitor your mbuf usage and report if it grows?


cu
  Gerrit

Re: em0 freezes on ZFS server

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 17:07:13 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: em0 freezes on ZFS server :

DB Its only purpose in life is to be an nfs server.

I thought so, but you did not mention it explicitly.

DB but I wouldn't exclude zfs from the equation yet.
DB I have other nfs servers, not doing zfs, and I don't see this.

My machine has zfs, too. I do not have 8-stable with nfs on ufs, so I
cannot crosscheck that.

DB  My server is at 22k now, and the buffer number is still increasing
DB  every few seconds...
DB  Can you monitor your mbuf usage and report if it grows?

DB I am, and in the last 2 hrs it grew by about 300. It does oscillate,
DB i.e. it grows some, then it goes down, but it seems that the low
DB always increases.

Mine is at 36k now:

36797/3403/40200 mbufs in use (current/cache/total)
35772/1202/36974/65000 mbuf clusters in use (current/cache/total/max)
35772/836 mbuf+clusters out of packet secondary zone in use (current/cache)
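
The "low always increases" pattern described above can be checked
mechanically: split the readings into windows and see whether each
window's minimum is higher than the last. The following is just a sketch
of that idea; the readings are made-up illustration values, not
measurements from either server, and the window size is arbitrary.

```python
def window_minima(samples, window):
    """Minimum of each consecutive, non-overlapping window of `window` samples."""
    return [
        min(samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]


def floor_is_rising(samples, window):
    """True if every window minimum is strictly higher than the previous one."""
    lows = window_minima(samples, window)
    return len(lows) >= 2 and all(a < b for a, b in zip(lows, lows[1:]))


# Hypothetical readings: both series oscillate, but only the first one
# has a floor that keeps climbing.
leaking = [100, 140, 110, 150, 125, 160, 135, 170, 150]
steady = [100, 140, 105, 150, 100, 160, 102, 170, 101]
print(floor_is_rising(leaking, 3), floor_is_rising(steady, 3))
```

A rising floor on an otherwise idle server is a hint of a leak, not proof;
a busier workload would raise the floor as well.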

DB When I have enough data I'll plot it.

I think I'll reboot my machine now and hope that it lives as long as
possible into the weekend. Although at the present rate it will not
survive 24h. :-(


cu
  Gerrit

mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 17:41:02 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: em0 freezes on ZFS server :

DB check:
DB ftp://ftp.cs.huji.ac.il/users/danny/freebsd/plot.ps
DB x is seconds, y is mbufs current.

That does not look as bad as mine. I had 37k when I rebooted the machine some
minutes ago (and it's basically idle, just serving a few nfs clients that
don't do much).
But from the values Jeremy has posted and from my own comparisons here I
would think that something like 5k of mbuf clusters would be normal for my
machine (and probably also for yours).

Some more info from my side:
In the meantime I also tried a different network interface. The
nfe-interface that is onboard causes the same problems, so it is probably
not an em-specific issue.
Furthermore I found this via Google:
http://lists.freebsd.org/pipermail/freebsd-current/2009-December/014062.html.
I patched and recompiled my kernel with this, just to try it out. Right
now I have

2264/1321/3585 mbufs in use (current/cache/total)
1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
1239/809 mbuf+clusters out of packet secondary zone in use (current/cache)

but the uptime is only 12min so far. In some hours I'll know for certain
if this patch has anything to do with the problem.


cu
  Gerrit

Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 22:09:32 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS
server) :

DB  Furthermore I found this via Google:
DB  http://lists.freebsd.org/pipermail/freebsd-current/2009-December/014062.html.

This did not help, I still see the same problem.

DB I'll have to do some packet snooping to check if it's TCP or UDP nfs
DB traffic, since some of the clients are Linux ...

I have Linux clients, too. Some use tcp, some udp.

DB  2264/1321/3585 mbufs in use (current/cache/total)
DB  1239/1017/2256/65000 mbuf clusters in use (current/cache/total/max)
DB  1239/809 mbuf+clusters out of packet secondary zone in use
DB  (current/cache)

DB  but the uptime is only 12min so far. In some hours I'll know for
DB  certain if this patch has anything to do with the problem.

It did not help. In the meantime the values read

20555/1465/22020 mbufs in use (current/cache/total)
19529/1029/20558/65000 mbuf clusters in use (current/cache/total/max)
19529/823 mbuf+clusters out of packet secondary zone in use (current/cache)


I created a little graph here:
http://www.pmp.uni-hannover.de/test/Mitarbeiter/g_kuehn/data/mbuf.pdf.

The y-axis shows the total mbuf clusters, the x-axis the time in minutes.
The flat part in the upper right corner is a 10min-interval when I had
stopped nfsd.

DB at the moment there is not much activity, but if you check the latest
DB plot.ps you will see that the bottom is slowly increasing, so my bet
DB is that there must be some leakage!

There certainly is. I wonder when this came in and why it has gone
unnoticed so far. Probably not all people serving nfs from zfs see this,
or this would have popped up earlier. Maybe the Linux clients are somehow
triggering the issue? Or did it start with the import of zpool version 14?
Unfortunately I have upgraded my pool, so I cannot easily go back to 8-REL
to test this (otoh, I need a stable server quite urgently).


cu
  Gerrit

Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS server)

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 22:09:32 +0200 Daniel Braniss da...@cs.huji.ac.il
wrote about Re: mbuf leakage with nfs/zfs? (was: em0 freezes on ZFS
server) :

DB at the moment there is not much activity, but if you check the latest
DB plot.ps you will see that the bottom is slowly increasing, so my bet
DB is that there must be some leakage!

BTW: I filed a PR for this:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/144330


cu
  Gerrit

Re: mbuf leakage with nfs/zfs?

2010-02-26 Thread Gerrit Kühn

On Fri, 26 Feb 2010 23:12:39 +0100 Willem Jan Withagen w...@digiware.nl
wrote about Re: mbuf leakage with nfs/zfs?:

WJW Mine are now:
WJW 41533/2402/43935 mbufs in use (current/cache/total)
WJW 41454/1572/43026/262144 mbuf clusters in use (current/cache/total/max)
WJW 39241/823 mbuf+clusters out of packet secondary zone in use
WJW (current/cache)

81492/2613/84105 mbufs in use (current/cache/total)
80467/2235/82702/128000 mbuf clusters in use (current/cache/total/max)
80458/822 mbuf+clusters out of packet secondary zone in use (current/cache)

If I keep increasing the clusters, maybe I can make it over the
weekend. :-)

WJW I also set the zpool version to 14 this morning, but I think
WJW that I ran into trouble already when still running version 13.

Ok, so this is possibly ruled out, too. Maybe the Linux clients do
something weird?


cu
  Gerrit

Re: nss_ldap and multiple group memberships

2010-02-25 Thread Gerrit Kühn

On Thu, 25 Feb 2010 11:17:32 +1100 Scott, Brian
brian.sco...@det.nsw.edu.au wrote about RE: nss_ldap and multiple group
memberships:

SB It depends on the type of group. There are at least two types of group
SB objects that you can use in LDAP but only one of them works. You need
SB to use posixGroup objects for unix groups. As I remember it, these
SB have memberUid attributes for the member ids. These are simple unix
SB identifiers. groupOfNames objects on the other hand have full
SB distinguished names with 'member' attributes and can't be used by
SB nss_ldap.

The server is running openldap under SLES and is not under my control.
ldapsearch gives group entries like

# lisa, group, aei.uni-hannover.de
dn: cn=lisa,ou=group,dc=aei,dc=uni-hannover,dc=de
cn: lisa
displayName: lisa
gidNumber: 1003
member: uid=gekueh,ou=people,dc=aei,dc=uni-hannover,dc=de


So this would be the first case, I guess.

SB The idea is that posixGroup and posixAccount mimic the unix files so
SB extraction of the data is fast. If the software used a groupOfNames
SB object then the returned member names would need to be queried as
SB additional transactions to find the uid's of those entries that had
SB posixAccount information. This is because the original authentication
SB was done by pam_ldap and that just returned a UID to the system. If it
SB returned the LDAP distinguished name to the system, and if that could
SB then be passed into nss_ldap it would be possible to do the LDAP query
SB in a single transaction. But then that all breaks down if you
SB authenticate with something else like GSSAPI. If that was the case you
SB would need to first search for the posixAccount object of the
SB authenticated user ((objectClass=posixAccount)(uid=1001)) and then
SB search for all the group of names containing that distinguished name (
SB (objectClass=groupOfNames)
SB (member=uid=bscott,ou=People,dc=netlab,dc=albury,dc=tafe)). That's two
SB transactions and seems unnecessarily wasteful. Mind you, if it was an
SB option I'd probably turn it on.

Thanks for this fine explanation. I do not use GSS. However, I found the
following configuration option in (nss) ldap.conf that helped me:

nss_map_attribute uniqueMember member

After uncommenting this line, everything seems to work fine:

penumbra# id gekueh
uid=1030(gekueh) gid=1012(aei) groups=1012(aei),1003(lisa)
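
To make the effect of the mapping line easier to see, here is a toy model
of the group lookup. It is not how nss_ldap is implemented; it only
assumes (as the mapping line suggests) that the lookup reads the
uniqueMember attribute unless told otherwise, while this server stores
members under member. The function names are invented for this example,
and the DN handling is naive (it ignores escaped commas).

```python
def uid_from_dn(dn):
    """Return the uid value of the first RDN if it is 'uid=...', else None."""
    first_rdn = dn.split(",", 1)[0].strip()
    key, _, value = first_rdn.partition("=")
    return value if key.lower() == "uid" and value else None


def groups_for_user(groups, uid, member_attr):
    """Names of groups whose `member_attr` values contain a DN for `uid`."""
    return [
        group["cn"]
        for group in groups
        if any(uid_from_dn(dn) == uid for dn in group.get(member_attr, []))
    ]


# The group entry from the ldapsearch output above, as a plain dict.
groups = [{
    "cn": "lisa",
    "gidNumber": 1003,
    "member": ["uid=gekueh,ou=people,dc=aei,dc=uni-hannover,dc=de"],
}]

print(groups_for_user(groups, "gekueh", "uniqueMember"))  # attribute absent
print(groups_for_user(groups, "gekueh", "member"))        # with the mapping
```

Reading the wrong attribute returns no supplementary groups at all, which
matches the symptom of id showing only the primary group.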

Maybe this could be mentioned somewhere in the documentation? I used
http://www.freebsd.org/doc/en/articles/ldap-auth/client.html to set up
the client, but the information I got from this article was rather
sparse and led me down the wrong path more than once.


cu
  Gerrit

Re: nss_ldap and multiple group memberships

2010-02-25 Thread Gerrit Kühn

On Thu, 25 Feb 2010 15:10:03 +1100 Scott, Brian
brian.sco...@det.nsw.edu.au wrote about RE: nss_ldap and multiple group
memberships:

SB It looks like you may need to uncomment the line '#nss_map_attribute
SB uniqueMember member' in your ldap.conf to then use the correct
SB attribute name.

Yes, that's exactly the solution here. I got this from reading the config
files of a working Linux client that uses the same nss libraries.

Thank you for your support!


cu
  Gerrit

nss_ldap and multiple group memberships

2010-02-24 Thread Gerrit Kühn

Hi all,

Is anyone here using nss_ldap and can successfully get it to work with
multiple group memberships? I would really like to get this to work here,
but I only get the primary group:

penumbra# id gekueh
uid=1030(gekueh) gid=1012(aei) groups=1012(aei)

getent group comes up with the complete group list. ldapsearch reports
three groups with member:-lines for my user. Somehow nss does not pick this
up. Any ideas?


cu
  Gerrit

Re: bugs in mpt(4) and mptutil(8)

2010-02-11 Thread Gerrit Kühn

On Wed, 10 Feb 2010 08:53:18 -0500 John Baldwin j...@freebsd.org wrote
about Re: bugs in mpt(4) and mptutil(8):

JB  This output is definitely wrong, because the drives are split up on
JB  mpt0 and mpt1 (and the USB stick is not connected to mpt at all :-)
JB  as can be seen with camcontrol:

JB Hmm, I asked the previous reporter to debug this by examining the
JB results that CAM returns from the bus scan using gdb, but I haven't
JB heard back. Unfortunately I do not have access to any hardware with
JB this sort of setup to debug this.

I will do some debugging work here, if you can tell me what to do.


cu
  Gerrit

Re: zpool vdev vs. glabel

2010-02-10 Thread Gerrit Kühn

On Tue, 9 Feb 2010 13:27:21 -0700 Elliot Finley efinley.li...@gmail.com
wrote about Re: zpool vdev vs. glabel:

EF I ran into this same problem.  you need to clean the beginning and end
EF of your disk off before glabeling and adding it to your pool.  clean
EF with dd if=/dev/zero...

Hm, I think I did that (at least for the beginning part).
Maybe I was not quite clear what I did below: I removed and re-attached
the *same* disk which was labelled with glabel and running fine before.
The label was there when I inserted it back, but zfs went for the da
device node anyway.
If I see this problem again, I will try to wipe the complete disk before
re-inserting it.


cu
  Gerrit

Re: zpool vdev vs. glabel

2010-02-10 Thread Gerrit Kühn

On Wed, 10 Feb 2010 10:18:49 +0100 Marius Nünnerich mar...@nuenneri.ch
wrote about Re: zpool vdev vs. glabel:

MN It seems there is some kind of race condition with zfs either picking
MN up the disk itself or the label device for the same disk. I guess it's
MN which ever it probes first.

This could explain it. However, it seems that zfs sticks to the da device
once it has changed its mind. Meanwhile I discovered one more system where
this has obviously happened (although I cannot say when):

luna# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        tank              ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            label/disk-1  ONLINE       0     0     0
            da0           ONLINE       0     0     0
            label/disk-2  ONLINE       0     0     0
            label/disk-3  ONLINE       0     0     0

errors: No known data errors
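
Spotting such a switch by eye gets tedious with several pools, so a small
check over the zpool status text may help. This is only a sketch: it
assumes the output layout shown above, treats the pool name and anything
starting with raidz or mirror as containers, and counts label/ and gpt/ as
labelled names. The function name is made up for this example.

```python
def unlabeled_vdevs(zpool_status, label_prefixes=("label/", "gpt/")):
    """Return leaf devices in the config section that lack a label prefix."""
    pool = None
    in_config = False
    found = []
    for line in zpool_status.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "pool:":
            pool = fields[1]
        elif fields[0] == "config:":
            in_config = True
        elif fields[0] == "errors:":
            in_config = False
        elif in_config and len(fields) >= 5 and fields[0] != "NAME":
            name = fields[0]
            is_container = name == pool or name.startswith(("raidz", "mirror"))
            if not is_container and not name.startswith(label_prefixes):
                found.append(name)
    return found


sample = """\
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        tank              ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            label/disk-1  ONLINE       0     0     0
            da0           ONLINE       0     0     0
            label/disk-2  ONLINE       0     0     0
            label/disk-3  ONLINE       0     0     0

errors: No known data errors
"""

print(unlabeled_vdevs(sample))
```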

MN I wrote the GPT part of glabel for using
MN it in situations like this, I had not a single report of this kind of
MN problem with the gpt labels. Maybe you can try them too?

Yeah, I just have to look into how gpt labels work. I did not use them at
all up to now.


cu
  Gerrit

bugs in mpt(4) and mptutil(8)

2010-02-10 Thread Gerrit Kühn

Hi,

I have two 8-port cards with LSI chips installed in one machine that are
driven by mpt(4). I see about the same problem (I think) when disconnecting
disks as described here:
http://forums.freebsd.org/showthread.php?t=9407

When I simply pull a disk (without offlining it first), zfs does not
notice it (the disk is still listed as online) and I get lots of

mpt1: mpt_cam_event: 0x16
mpt1: mpt_cam_event: 0x12
mpt1: mpt_cam_event: 0x16
mpt1: mpt_cam_event: 0x16
mpt1: mpt_cam_event: 0x16
mpt1: request 0xff80005e0bf0:2419 timed out for ccb 0xff0005802800 (req-ccb 0xff0005802800)
mpt1: attempting to abort req 0xff80005e0bf0:2419 function 0
mpt1: completing timedout/aborted req 0xff80005e0bf0:2419
mpt1: abort of req 0xff80005e0bf0:0 completed
mpt1: request 0xff80005dc000:2810 timed out for ccb 0xff000fa66800 (req-ccb 0xff000fa66800)
mpt1: request 0xff80005dc3f0:2811 timed out for ccb 0xff0005802800 (req-ccb 0xff0005802800)
mpt1: attempting to abort req 0xff80005dc000:2810 function 0
mpt1: completing timedout/aborted req 0xff80005dc3f0:2811
mpt1: completing timedout/aborted req 0xff80005dc000:2810
[...goes on for ages...]

I don't know if this would ever stop. It ceased when I put the disk back
in. In the thread above it is mentioned that there are some fixes for
mpt(4) in -current to try out. However, I do not want to run -current on
this machine. So, does anyone here know what the chances are that the
mentioned patches will be MFCed soon?


One more thing I noticed is that mptutil does not play well with my
controllers:

pigpen# mptutil show adapter
mpt0 Adapter:
       Board Name: USASLP-L8i
   Board Assembly: USASLP-L8i
        Chip Name: C1068E
    Chip Revision: B3
      RAID Levels: none
mptutil: Reading config page header failed: Invalid configuration page


I don't know if it terminates because it cannot read the config page or if
it is not able to see the second card. However:

pigpen# mptutil show drives
mpt0 Physical Drives:
 da0 (  466G) ONLINE WDC WD5002ABYS-0 3B02 SATA bus 0 id 0
 da1 (  466G) ONLINE WDC WD5002ABYS-0 3B02 SATA bus 0 id 1
 da6 (  466G) ONLINE WDC WD5002ABYS-0 3B02 SATA bus 0 id 2
da11 (  466G) ONLINE WDC WD5002ABYS-0 3B02 SATA bus 0 id 3
 da3 (  466G) ONLINE WDC WD5001ABYS-0 1D01 SATA bus 0 id 0
 da4 (  466G) ONLINE WDC WD5001ABYS-0 1D01 SATA bus 0 id 1
 da5 (  466G) ONLINE WDC WD5001ABYS-0 1D01 SATA bus 0 id 2
 da2 (   75G) ONLINE ST380815AS B SATA bus 0 id 3
 da7 (   75G) ONLINE ST380815AS B SATA bus 0 id 4
 da8 (  466G) ONLINE ST3500641NS C SATA bus 0 id 5
 da9 (  466G) ONLINE ST3500641NS C SATA bus 0 id 6
da10 (  466G) ONLINE ST3500641NS C SATA bus 0 id 7
da12 ( 3824M) ONLINE ST 4GB  SCSI-0 bus 0 id 0


This output is definitely wrong, because the drives are split up on mpt0
and mpt1 (and the USB stick is not connected to mpt at all :-) as can be
seen with camcontrol:

pigpen# camcontrol devlist
<ATA WDC WD5002ABYS-0 3B02>  at scbus0 target 0 lun 0 (da0,pass0)
<ATA WDC WD5002ABYS-0 3B02>  at scbus0 target 1 lun 0 (da1,pass1)
<ATA WDC WD5002ABYS-0 3B02>  at scbus0 target 2 lun 0 (pass6,da6)
<ATA WDC WD5002ABYS-0 3B02>  at scbus0 target 3 lun 0 (pass11,da11)
<ATA WDC WD5001ABYS-0 1D01>  at scbus1 target 0 lun 0 (da3,pass3)
<ATA WDC WD5001ABYS-0 1D01>  at scbus1 target 1 lun 0 (da4,pass4)
<ATA WDC WD5001ABYS-0 1D01>  at scbus1 target 2 lun 0 (da5,pass5)
<ATA ST380815AS B>           at scbus1 target 3 lun 0 (pass2,da2)
<ATA ST380815AS B>           at scbus1 target 4 lun 0 (da7,pass7)
<ATA ST3500641NS C>          at scbus1 target 5 lun 0 (da8,pass8)
<ATA ST3500641NS C>          at scbus1 target 6 lun 0 (da9,pass9)
<ATA ST3500641NS C>          at scbus1 target 7 lun 0 (da10,pass10)
<ST 4GB>                     at scbus2 target 0 lun 0 (pass12,da12)
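
To cross-check the mptutil listing against CAM, the camcontrol devlist
text can be grouped by bus. This is a rough sketch: the regular expression
assumes the line layout shown above, the sample is a shortened copy of
that output, and the idea that scbus0 and scbus1 belong to mpt0 and mpt1
is my assumption. The function name is made up for this example.

```python
import re
from collections import defaultdict

DEVLIST_RE = re.compile(r"at scbus(\d+) target (\d+) lun (\d+) \(([^)]*)\)")


def disks_by_bus(devlist_output):
    """Map each scbus number to the da devices attached to it."""
    buses = defaultdict(list)
    for line in devlist_output.splitlines():
        match = DEVLIST_RE.search(line)
        if match is None:
            continue
        bus = int(match.group(1))
        # The parenthesised pair may list the pass device first.
        for dev in match.group(4).split(","):
            if dev.startswith("da"):
                buses[bus].append(dev)
    return dict(buses)


sample = """\
<ATA WDC WD5002ABYS-0 3B02>  at scbus0 target 0 lun 0 (da0,pass0)
<ATA WDC WD5002ABYS-0 3B02>  at scbus0 target 2 lun 0 (pass6,da6)
<ATA WDC WD5001ABYS-0 1D01>  at scbus1 target 0 lun 0 (da3,pass3)
<ST 4GB>                     at scbus2 target 0 lun 0 (pass12,da12)
"""

print(disks_by_bus(sample))
```

With the full output this gives four disks on scbus0, eight on scbus1 and
the USB stick on scbus2, whereas mptutil lists all thirteen under mpt0.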



cu
  Gerrit

Re: hardware for home use large storage

2010-02-09 Thread Gerrit Kühn


CS pricey hardware raid cards for compatibility reasons.  There seem to
CS be no decent add-on SATA cards that play nice with FreeBSD other than
CS that weird supermicro card that has to be physically hacked about to
CS fit.

BTW: I recently built some more machines with this card. I can now confirm
that you can use it with standard brackets, if you have some to spare. The
distance between the two holders is the same as for e.g. 3ware 95/96
controllers, and I had some spares in standard height there because I use
the 3wares in low-profile setups. The brackets of Intel NICs seem to fit,
too. The only thing that is different with the card now is the side on
which the components are mounted. But this should not be a problem unless
you want to place the card next to a graphics card.


cu
  Gerrit

Re: hardware for home use large storage

2010-02-09 Thread Gerrit Kühn

On Tue, 09 Feb 2010 17:21:32 +1100 Andrew Snow and...@modulus.org wrote
about Re: hardware for home use large storage:

AS http://www.supermicro.com/products/motherboard/ATOM/ICH9/X7SPA.cfm?typ=H

The good thing about this board is that the Pineview Atoms seem to be
64-bit capable, which makes them attractive for zfs. I bought a board with
a VIA Nano processor for this reason last year, as I could not find any
decent hardware with a 64-bit capable Atom.


cu
  Gerrit

zpool vdev vs. glabel

2010-02-09 Thread Gerrit Kühn

Hi,

I have created a raidz2 with disks I labeled with glabel before. Right
after creation this pool looked fine, using devices label/tank[1-6].
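
For reference, a minimal sketch of how such a labelled pool could be set up. The device names passed in (da0 and so on) are placeholders made up for illustration and are not taken from my system, and the function only prints the commands instead of running them:

```sh
#!/bin/sh
# Print (not run) the commands to label disks and build a raidz2 pool
# from those labels. Disk names are hypothetical placeholders.
# Usage: label_plan POOL DISK...
label_plan() {
    pool=$1; shift
    n=0
    vdevs=""
    for disk in "$@"; do
        n=$((n + 1))
        # one glabel per disk, named POOL1, POOL2, ...
        echo "glabel label ${pool}${n} /dev/${disk}"
        vdevs="${vdevs} label/${pool}${n}"
    done
    echo "zpool create ${pool} raidz2${vdevs}"
}
```

Calling `label_plan tank da0 da1 da2 da3 da4 da5` prints six glabel lines followed by one zpool create line using label/tank1 to label/tank6, which can be reviewed before anything is actually run.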

I did some tests with replacing/swapping disks and so on. After doing a

zpool offline tank label/tank6
remove disk
camcontrol rescan all
insert disk
camcontrol rescan all
zpool online tank label/tank6

I got the disk back, though not under the requested label but under its da
device name:

  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Feb  9 14:56:37 2010
config:

        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          raidz2         ONLINE       0     0     0
            label/tank1  ONLINE       0     0     0  8.50K resilvered
            label/tank2  ONLINE       0     0     0  7.50K resilvered
            label/tank3  ONLINE       0     0     0  8.50K resilvered
            label/tank4  ONLINE       0     0     0  7.50K resilvered
            label/tank5  ONLINE       0     0     0  9K resilvered
            da6          ONLINE       0     0     0  13.5K resilvered

errors: No known data errors



Why does this happen? Is there any way to get zfs to use the label again?
After the device is in use, the label in /dev/label disappears. When
taking the device offline again, the label is there, but cannot be used:

pigpen# zpool offline tank da6
pigpen# zpool status
  pool: system
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Tue Feb  9 14:49:14 2010
config:

        NAME               STATE     READ WRITE CKSUM
        system             ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            label/system1  ONLINE       3   617     0  126K resilvered
            label/system2  ONLINE       0     0     0  41K resilvered

errors: No known data errors

  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Tue Feb  9 14:56:37 2010
config:

        NAME             STATE     READ WRITE CKSUM
        tank             DEGRADED     0     0     0
          raidz2         DEGRADED     0     0     0
            label/tank1  ONLINE       0     0     0  8.50K resilvered
            label/tank2  ONLINE       0     0     0  7.50K resilvered
            label/tank3  ONLINE       0     0     0  8.50K resilvered
            label/tank4  ONLINE       0     0     0  7.50K resilvered
            label/tank5  ONLINE       0     0     0  9K resilvered
            da6          OFFLINE      0    38     0  13.5K resilvered

errors: No known data errors
pigpen# ll /dev/label/
total 0
crw-r-----  1 root  operator    0, 104 Feb  9 14:04 lisacrypt1
crw-r-----  1 root  operator    0, 112 Feb  9 14:04 lisacrypt2
crw-r-----  1 root  operator    0, 113 Feb  9 14:04 lisacrypt3
crw-r-----  1 root  operator    0, 134 Feb  9 14:48 system1
crw-r-----  1 root  operator    0, 115 Feb  9 14:04 system2
crw-r-----  1 root  operator    0, 116 Feb  9 14:04 tank1
crw-r-----  1 root  operator    0, 117 Feb  9 14:04 tank2
crw-r-----  1 root  operator    0, 118 Feb  9 14:04 tank3
crw-r-----  1 root  operator    0, 101 Feb  9 14:04 tank4
crw-r-----  1 root  operator    0, 102 Feb  9 14:04 tank5
crw-r-----  1 root  operator    0, 103 Feb  9 15:02 tank6

pigpen# zpool online tank label/tank6
cannot online label/tank6: no such device in pool

In a different thread I found the hint to use zpool replace to get back to
using labels, but this does not seem possible either:

pigpen# zpool replace tank label/tank6
invalid vdev specification
use '-f' to override the following errors:
/dev/label/tank6 is part of active pool 'tank'

pigpen# zpool replace -f tank label/tank6
invalid vdev specification
the following errors must be manually repaired:
/dev/label/tank6 is part of active pool 'tank'

pigpen# zpool replace -f tank da6 label/tank6
invalid vdev specification
the following errors must be manually repaired:
/dev/label/tank6 is part of active pool 'tank'


I'm running out of ideas here...
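
To spot this kind of thing quickly, something along the following lines could be used. This is only a sketch: it assumes the zpool status layout shown above, reads that output on stdin, and treats everything not named label/... as "unlabelled" (the function name is my own invention):

```sh
#!/bin/sh
# List vdevs in "zpool status" output (stdin) that are not referenced
# through a glabel name. Assumes the status layout shown in this mail.
unlabelled_vdevs() {
    awk '
        $1 == "pool:"   { pool = $2; in_cfg = 0; next }
        $1 == "NAME"    { in_cfg = 1; next }   # header of the config table
        $1 == "errors:" { in_cfg = 0; next }   # end of the config table
        in_cfg && NF >= 2 {
            if ($1 == pool) next               # the pool line itself
            if ($1 ~ /^(raidz|mirror|replacing|spares|logs|cache)/) next
            if ($1 !~ /^label\//) print pool ": " $1
        }
    '
}
```

Used as `zpool status | unlabelled_vdevs`, this would have reported the da6 entry in the tank pool above and nothing for the system pool.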



cu
  Gerrit

Re: zpool vdev vs. glabel

2010-02-09 Thread Gerrit Kühn

On Tue, 9 Feb 2010 06:26:58 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: zpool vdev vs. glabel:

JC  I'm running out of ideas here...

JC Would zpool export and zpool import be necessary in this case?

I tried that several times, does not change anything.

JC Also, I'm a little confused as to the use of glabel in this case.  In
JC what condition do your disk indices (e.g. X of daX) change?  Are you
JC yanking multiple disks out of a system at the same time and then
JC shoving them back into different drive bays?  

I just did not want to hard-wire the da devices in the kernel. I have two
lsi controllers, and they do not even come up in the same order every time
I boot (mpt0/mpt1), let alone the disks picking up the same daX every
time. I thought labeling the disks would be a good idea to prevent all
these kinds of problems.

JC Are you switching
JC between storage subsystem drivers (ahci(4) vs. ataahci(4), for
JC example) regularly?

No (not yet at least :-).

JC I've yet to be convinced glabel is worth bothering with, unless the
JC system adheres to one of the above situations (which are worthy of
JC strangulation anyway ;-) ).

I would really like to know how this happened at all... meanwhile I used a
spare disk under a different name to replace everything round-robin back
to normal.

However, I just noticed one more thing:

pigpen# zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Tue Feb  9 15:50:01 2010
config:

        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          raidz2         ONLINE       0     0     0
            label/tank1  ONLINE       0     0     0  11K resilvered
            label/tank2  ONLINE       0     0     0  10K resilvered
            label/tank3  ONLINE       0     0     0  11K resilvered
            label/tank4  ONLINE       0     0     0  10.5K resilvered
            label/tank5  ONLINE       0     0     0  11K resilvered
            label/tank6  ONLINE       0     0     0  15K resilvered

errors: No known data errors
pigpen# zpool offline tank label/tank5
pigpen# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Tue Feb  9 15:50:01 2010
config:

        NAME             STATE     READ WRITE CKSUM
        tank             DEGRADED     0     0     0
          raidz2         DEGRADED     0     0     0
            label/tank1  ONLINE       0     0     0  11K resilvered
            label/tank2  ONLINE       0     0     0  10K resilvered
            label/tank3  ONLINE       0     0     0  11K resilvered
            label/tank4  ONLINE       0     0     0  10.5K resilvered
            label/tank5  ONLINE       0     0     0  11K resilvered
            label/tank6  OFFLINE      0    39     0  15K resilvered

errors: No known data errors

pigpen# zpool offline tank label/tank5
cannot offline label/tank5: no valid replicas



Why can't I offline a second disk? This is a raidz2 volume, after all?!


cu
  Gerrit

Re: one more load-cycle-count problem

2010-02-08 Thread Gerrit Kühn

On Mon, 8 Feb 2010 15:43:46 +0200 Dan Naumov dan.nau...@gmail.com wrote
about RE: one more load-cycle-count problem:

DN Any further ideas how to get rid of this feature?

DN 1) The most clean solution is probably using the WDIDLE3 utility on
DN your drives to disable automatic parking or, in cases where it's not
DN possible to completely disable it, you can adjust it to 5 minutes, which
DN essentially solves the problem. Note that going this route will
DN probably involve rebuilding your entire array from scratch, because
DN applying WDIDLE3 to the disk is likely to very slightly affect disk
DN geometry, but just enough for hardware raid or ZFS or whatever to bark
DN at you and refuse to continue using the drive in an existing pool (the
DN affected disk can become very slightly smaller in capacity). Backup
DN data, apply WDIDLE3 to all disks. Recreate the pool, restore backups.
DN This will also void your warranty if used on the new WD drives,
DN although it will still work just fine.

Thanks for the warning. How on earth can a tool to set the idle time
affect the disk geometry?!

DN 2) A less clean solution would be to setup a script that polls the
DN SMART data of all disks affected by the problem every 8-9 seconds and
DN have this script launch on boot. This will keep the affected drives
DN just busy enough to not park their heads.

That's what I have been doing since yesterday, when I first noticed the
problem on this particular system. Not a pretty solution either. I'm close
to buying Hitachi drives instead (HTE545050B9A300). Does anyone here know
these drives and can confirm that they do not have this kind of problem (I
would expect so because of the 24/7 certification)?


cu
  Gerrit

Re: one more load-cycle-count problem

2010-02-08 Thread Gerrit Kühn

On Mon, 8 Feb 2010 06:22:59 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: one more load-cycle-count
problem:


JC The DOS utilities submit custom ATA CMDs or data to all WD disks to
JC toggle or adjust these features.  If someone could figure out what the
JC command(s) were, the feature(s) could be implemented into
JC atacontrol(8). Of course, that would require reverse-engineering of
JC the EXEs,
JC which would probably induce DMCA-related lawsuits (in the US).  Sad
JC too, since documentation of said feature(s) would improve customer
JC satisfaction. But hey, I'm just an engineer, what do I know.

:-)))
I would really prefer to be able to set this stuff via camcontrol or
atacontrol. Just having to boot DOS on this machine (no floppy, no
cdrom) will be a real pain. And most probably the DOS tool will not be
able to see the disks sitting behind my lsi-driven controller anyway, so I
will have to plug them in elsewhere, too. Great job, WD. :-(


cu
  Gerrit

one more load-cycle-count problem

2010-02-07 Thread Gerrit Kühn

Hi all,

After being disturbed by the firmware issues of the WD drives causing
excessive load cycles (see the thread "immense delayed write to file system
(ZFS and UFS2), performance issues" from January), I have found some more
problematic drives in the following setup:

4 x 2.5" WDC WD4000BEVT-00ZAT0 in RAIDZ1 configuration attached to a
Supermicro SAS controller:

m...@pci0:2:0:0: class=0x01 card=0xa38015d9 chip=0x00581000 rev=0x08 hdr=0x00
    vendor     = 'LSI Logic (Was: Symbios Logic, NCR)'
    device     = 'SAS 3000 series, 8-port with 1068E -StorPort'
    class      = mass storage
    subclass   = SCSI

luna# camcontrol devlist
<ATA WDC WD4000BEVT-0 1A01>  at scbus0 target 0 lun 0 (pass0,da0)
<ATA WDC WD4000BEVT-0 1A01>  at scbus0 target 1 lun 0 (pass1,da1)
<ATA WDC WD4000BEVT-0 1A01>  at scbus0 target 2 lun 0 (pass2,da2)
<ATA WDC WD4000BEVT-0 1A01>  at scbus0 target 3 lun 0 (pass3,da3)


The disks appear to load/unload every 10s or so if I do not artificially
keep them busy. Does anyone here have a suggestion how to make this
interval longer or even turn off the unload feature completely?
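
To put a number on "every 10s or so", the load cycle counter can be read twice and the difference converted into a rate. The following is a rough sketch: it assumes the usual smartctl -A attribute table, where the raw value is the tenth column, and the function names are my own. Whether smartctl needs extra options for disks behind this controller is not covered here.

```sh
#!/bin/sh
# Extract the raw value of SMART attribute 193 (Load_Cycle_Count)
# from "smartctl -A" output read on stdin.
# Assumption: standard attribute table, raw value in column 10.
load_cycle_count() {
    awk '$1 == 193 { print $10; exit }'
}

# Load cycles per hour between two readings taken SECONDS apart.
# Usage: lcc_rate FIRST SECOND SECONDS
lcc_rate() {
    echo $(( ($2 - $1) * 3600 / $3 ))
}
```

For example, take one reading with `smartctl -A /dev/da0 | load_cycle_count`, wait five minutes, take a second one, and pass both values plus 300 to lcc_rate. A drive parking every 10 seconds would show a rate of about 360 per hour.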

I tried

luna# camcontrol idle da0 -t 600
(pass0:mpt0:0:0:0): CMD: IDLE: e3 00 00 00 00 40 00 00 00 00 78 00
(pass0:mpt0:0:0:0): CAM Status: CCB request was invalid

luna# camcontrol standby da0 -t 600
(pass0:mpt0:0:0:0): CMD: STANDBY: e2 00 00 00 00 40 00 00 00 00 78 00
(pass0:mpt0:0:0:0): CAM Status: CCB request was invalid


From /usr/share/misc/scsi_modes I gather that page 26 should contain power
control features, but to no avail:

luna# camcontrol modepage da0 -m 26
camcontrol: error sending mode sense command


Any further ideas how to get rid of this feature?


cu
  Gerrit

Re: immense delayed write to file system (ZFS and UFS2), performance issues

2010-01-26 Thread Gerrit Kühn

On Tue, 19 Jan 2010 03:24:49 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: immense delayed write to file
system (ZFS and UFS2), performance issues:

JC So which drive models above are experiencing a continual increase in
JC SMART attribute 193 (Load Cycle Count)?  My guess is that some of the
JC WD Caviar Green models, and possibly all of the RE2-GP and RE4-GP
JC models are experiencing this problem.

Just to add some more info:
I contacted WD support about the problem with RE4 drives and received a
firmware update by email today which is supposed to fix the problem. Did
not try it yet, though.


I am still busy replacing RE2-disks with updated drives. I came across a
very strange thing with zfs. Actually I had the following pool layout:

mclane# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad10    ONLINE       0     0     0
            ad12    ONLINE       0     0     0
        spares
          ad14      AVAIL

errors: No known data errors

All disks still have the firmware bug, so I want to replace them with
disks that I already fixed. I put in an updated drive as ad18 and
wanted to replace ad12 to get the drive with the broken firmware out:

mclane# zpool replace tank /dev/ad12 /dev/ad18 
mclane# zpool status
  pool: tank
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.01% done, 52h51m to go
config:

        NAME           STATE     READ WRITE CKSUM
        tank           ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            ad8        ONLINE       0     0     0  7.21M resilvered
            ad10       ONLINE       0     0     0  7.22M resilvered
            replacing  ONLINE       0     0     0
              ad12     ONLINE       0     0     0
              ad18     ONLINE       0     0     0  10.7M resilvered
        spares
          ad14         AVAIL

errors: No known data errors

However, something must have gone wrong during the resilvering process and
it now looks like this:

mclane# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 2h39m with 0 errors on Tue Jan 26 14:00:00 2010
config:

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          raidz1       DEGRADED     0     0     0
            ad8        ONLINE       0     0     0  975M resilvered
            ad10       ONLINE       0     0   142  974M resilvered
            replacing  DEGRADED     0 7.25M     0
              ad12     ONLINE       0     0     0
              ad18     REMOVED      0     1     0  79.4M resilvered
        spares
          ad14         AVAIL

errors: No known data errors


What is going on here? ad18 obviously detached during the
process. /var/log/messages just gives me

Jan 26 11:23:33 mclane kernel: ad18: FAILURE - device detached

Additionally ad10 obviously produced checksum errors. What do I do about the
degraded replacing process? Can I terminate it somehow and maybe replace
ad10 first? Any other hints?
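
To keep an eye on the counters while trying things out, a small filter like the following could be used. It is only a sketch that assumes the zpool status layout shown above (read on stdin); the function name is made up:

```sh
#!/bin/sh
# Print every entry of "zpool status" (stdin) whose READ, WRITE or
# CKSUM counter is not a plain 0. Assumes the layout shown above.
vdev_errors() {
    awk '
        $1 == "NAME"    { in_cfg = 1; next }   # start of the config table
        $1 == "errors:" { in_cfg = 0; next }   # end of the config table
        in_cfg && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            print $1, $2, "read=" $3, "write=" $4, "cksum=" $5
        }
    '
}
```

Run as `zpool status tank | vdev_errors`, this would list ad10, the replacing entry and ad18 for the output above. The counters are compared as text, so values such as 7.25M are reported as well.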


cu
  Gerrit

Re: ZFS zpool replace problems

2010-01-26 Thread Gerrit Kühn

On Tue, 26 Jan 2010 06:30:21 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: ZFS zpool replace problems:

JC I'm removing the In-Reply-To mail headers for this thread, as you've
JC now hijacked it for a different purpose.  Please don't do this; start
JC a new thread altogether.  :-)

Thanks. You're perfectly right, I should have done that.

JC I'm not sure how the above is supposed to work (I haven't personally
JC tried it), but:
JC 
JC 1) Why didn't you offline the ad10 disk first?
JC    zpool offline tank ad10

Well, probably because I thought that zfs would simply handle the
situation. I just wanted to replace drive A with drive B, so this was
quite straight-forward for me.

JC 2) How did you attach ad18?  Did you tell the system about it using
JC    atacontrol?  If so, what commands did you use?

Yes. The drive did not appear automatically (verified with atacontrol
list). I first tried reinit ata9, but that did not work out, so I did
a detach/attach for ata9; then the drive was there (it showed up in list
and the device node appeared).

JC 3) Can you please provide uname -a output, as well as relevant dmesg
JC    output to show what kind of SATA controller you have, what's
JC    attached to what, etc.?

Of course (dmesg is not there anymore, I use pciconf -vl and
atacontrol instead):

ATA channel 0:
Master:  no device present
Slave:  acd0 Optiarc DVD RW AD-7540A/1.01 ATA/ATAPI revision 0
ATA channel 1:
Master:  no device present
Slave:   no device present
ATA channel 2:
Master:  ad4 ST380815AS/3.AAC SATA revision 2.x
Slave:   no device present
ATA channel 3:
Master:  ad6 ST380815AS/3.AAC SATA revision 2.x
Slave:   no device present
ATA channel 4:
Master:  ad8 WDC WD1000FYPS-01ZKB0/02.01B01 SATA revision 2.x
Slave:   no device present
ATA channel 5:
Master: ad10 WDC WD1000FYPS-01ZKB0/02.01B01 SATA revision 2.x
Slave:   no device present
ATA channel 6:
Master: ad12 WDC WD1000FYPS-01ZKB0/02.01B01 SATA revision 2.x
Slave:   no device present
ATA channel 7:
Master: ad14 WDC WD1000FYPS-01ZKB0/02.01B01 SATA revision 2.x
Slave:   no device present
ATA channel 8:
Master:  no device present
Slave:   no device present
ATA channel 9:
Master:  no device present
Slave:   no device present


FreeBSD mclane.rt.aei.uni-hannover.de 7.2-STABLE FreeBSD 7.2-STABLE #0:
Mon Sep  7 11:01:56 CEST 2009
r...@mclane.rt.aei.uni-hannover.de:/usr/obj/usr/src/sys/MCLANE.72  amd64

The first six drives (up to ad14) are connected onboard (Supermicro dual
opteron board with mcp55):

atap...@pci0:0:5:0: class=0x010485 card=0x161115d9 chip=0x037f10de rev=0xa3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'MCP55 SATA/RAID Controller (MCP55S)'
    class      = mass storage
    subclass   = RAID
atap...@pci0:0:5:1: class=0x010485 card=0x161115d9 chip=0x037f10de rev=0xa3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'MCP55 SATA/RAID Controller (MCP55S)'
    class      = mass storage
    subclass   = RAID
atap...@pci0:0:5:2: class=0x010485 card=0x161115d9 chip=0x037f10de rev=0xa3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'MCP55 SATA/RAID Controller (MCP55S)'
    class      = mass storage
    subclass   = RAID

The other two (ad16 and ad18, the chassis has 8 slots and the last two
were only intended to be used in situations like the one I have now) are
connected to an extra pci card:

atap...@pci0:3:6:0: class=0x010401 card=0x02409005 chip=0x02401095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'SATA/Raid controller(2XSATA150) (SIL3112)'
    class      = mass storage
    subclass   = RAID

Meanwhile I took out the ad18 drive again and tried to use a different
drive. But that was listed as UNAVAIL with corrupted data by zfs.
Probably it already branded the disk for resilvering and is looking for
exactly this one now. I also put in the disk which caused the problem
above again. The resilvering process started again, but very soon the
drive got detached again resulting in the same situation I described above.

Any help is greatly appreciated.


cu
  Gerrit

Re: ZFS zpool replace problems

2010-01-26 Thread Gerrit Kühn

On Tue, 26 Jan 2010 08:15:27 -0800 Chuck Swiger cswi...@mac.com wrote
about Re: ZFS zpool replace problems:

CS  Meanwhile I took out the ad18 drive again and tried to use a
CS  different drive. But that was listed as UNAVAIL with corrupted
CS  data by zfs.

CS There's your problem-- the Silicon Image 3112/4 chips are remarkably
CS buggy and exhibit data corruption:

Hm, sure? I would expect the same behaviour (detaching) as with the first
drive if the controller was the reason in this case.

CS   http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2005-08/0208.html

I already thought about replacing the controller to get rid of the
detach-problem. However, I cannot do this online and I really would prefer
fixing the disk firmware problem first.
I could remove the hotspare drive ad14 and use this slot for putting in a
replacement disk. Is it possible to get ad18 out of zfs' replacing
process? Maybe by detaching the disk from the pool?


cu
  Gerrit

Re: ZFS zpool replace problems

2010-01-26 Thread Gerrit Kühn

On Tue, 26 Jan 2010 08:27:37 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: ZFS zpool replace problems:

JC Well, to be fair, we can't be 100% certain he got bit by that bug.
JC It's possible/likely, but we don't know for certain at this point.  We
JC also don't know what brand hard disks he had connected to ad16 and/or
JC ad18.

The same as on the others (WD RE2GP), just with the updated firmware
(02.01B02 that is) to get rid of the lcc problem.

JC Older Silicon Image controllers are known for... well, just read the
JC Wikipedia entry for details.
JC http://en.wikipedia.org/wiki/Silicon_Image_Inc.#Product_alerts

I knew the card was not top of the line, but I didn't know that it
is /that/ bad. When I set up the system 1 or 2 years ago, I just thought
it might be nice to be able to use the two extra slots in case any
drives had to be replaced or so, and the card was just lying around
(well, maybe I have an idea now why nobody else wanted to use it :-).

I guess I will try to offline the hotspare slot (connected to the mcp55 on
the motherboard) and plug the replacement disk in there. Maybe zfs
recognizes it and picks up the resilvering there. Otherwise I'll have to
look into how to get rid of the degraded resilvering process and restart it
with the drive in the other slot.

JC As others have stated already: Intel could make a fortune off of a
JC simple PCIe or PCI-X SATA controller card that's ICH9/ICH10-based.

Indeed. I use these 8-channel Supermicro controllers (I think I recommended
them here some time ago) with LSI chipset, which work really nicely. But
the bracket does not fit into standard slots and there is no PCI-X version.
I would certainly prefer a regular card by Intel.


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: ZFS zpool replace problems

2010-01-26 Thread Gerrit Kühn

On Tue, 26 Jan 2010 08:46:19 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: ZFS zpool replace problems:

JC - zpool offline pool disk
JC - atacontrol detach ataX (where X = channel associated with disk)
JC - Physically remove bad disk
JC - Physically insert new disk
JC - Wait 15 seconds for stuff to settle
JC - atacontrol attach ataX (where X = previous channel detached)
JC - zpool replace pool disk
JC - zpool online pool disk
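
Written down as a dry run, the list above looks like this. It is a sketch that only prints the commands from the quoted procedure; the pool, disk and channel are whatever gets passed in, and the function name is made up:

```sh
#!/bin/sh
# Print (not run) the hot-swap sequence quoted above.
# Usage: swap_plan POOL DISK CHANNEL   e.g. swap_plan tank ad12 ata6
swap_plan() {
    pool=$1 disk=$2 chan=$3
    cat <<EOF
zpool offline $pool $disk
atacontrol detach $chan
# physically swap the disk, then wait about 15 seconds
atacontrol attach $chan
zpool replace $pool $disk
zpool online $pool $disk
EOF
}
```

Going by the atacontrol listing from my earlier mail, ad12 sits on ATA channel 6, so `swap_plan tank ad12 ata6` would print the sequence for that disk. Printing first and pasting afterwards leaves room to double-check the channel number before anything gets detached.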

JC reinit shouldn't be needed at all -- in fact, I've seen reinit cause
JC some craziness (even on Intel controllers), including a system
JC deadlock, but this was back during the RELENG_6 and RELENG_7 days.
JC Great improvements have been made to ata(4) since then.

Thanks for pointing that out. I would have gone exactly this way if I had
not had the extra slots or if one of the drives had actually been faulty.
But in this case I just wanted to replace every drive one by one and (at
least I thought) I had extra slots, so I did not want to give up the
redundancy during the replacement (knowing very well that the drives to be
replaced are already beyond WD's specification due to the load-cycle bug).

JC If you need me to validate the above procedure (it's been a while since
JC I've had to hot-swap a disk), I can do so.  I do have a 4-disk
JC Supermicro SuperServer 5015B-MTB (ICH9-based) sitting on my workbench
JC which I can test with.

I'm quite sure this will work fine. I just don't know how to get rid of
the degraded replacement zfs sees.

JC It honestly sounds like hot-swapping is causing some chaos on your
JC system.  Are all of the controllers involved configured for AHCI?  

I think so. How could I verify this?


cu
  Gerrit

Re: ZFS zpool replace problems

2010-01-26 Thread Gerrit Kühn

On Tue, 26 Jan 2010 08:59:27 -0800 Chuck Swiger cswi...@mac.com wrote
about Re: ZFS zpool replace problems:

CS As a general matter of maintaining RAID systems, however, the approach
CS to upgrading drive firmware on members of a RAID array should be to
CS take down the entire container and offline the drives, update one
CS drive, test it (via SMART self-test and read-only checksum comparison
CS or similar), and then proceed to update all of the drives (preferably
CS doing the SMART self-test for each, if time allows) before returning
CS them to the RAID container and onlining them.

Well, I had several spare drives sitting on the shelf. So I updated the
firmware of these spare drives and now want to replace the drives with the
old firmware with the new ones, one by one. Taking the system offline for
longer than a few minutes is not really an option. I'd rather roll in a
new machine to take over the job in that case.

CS Pulling individual drives from a RAID set while live and updating the
CS firmware one at a time is not an approach I would take-- running with
CS mixed firmware versions doesn't thrill me, and I know of multiple
CS cases where someone made a mistake reconnecting a drive with the wrong
CS SCSI id or something like that, taking out a second drive while the
CS RAID was not redundant, resulting in massive data corruption or even
CS total loss of the RAID contents.

This scenario was exactly the reason why I plugged the new drive into an
extra slot and asked zfs to replace an old one with it. Well, I did not
know what kind of fiasco the controller for this extra slot would turn out
to be; otherwise I would have used the hot-spare slot for this in the
first place.


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: immense delayed write to file system (ZFS and UFS2), performance issues

2010-01-26 Thread Gerrit Kühn

On Wed, 27 Jan 2010 03:53:20 +0900 Tommi Lätti s...@iki.fi wrote about
Re: immense delayed write to file system (ZFS and UFS2), performance
issues:

TL Well AFAIK WD certifies that there's no extra risk involved unless you
TL go over 300.000 park cycles. On the other hand, my 9 month 1.5tb green
TL drive has over 200.000 cycles.

I think the RE2 drives I have here are certified for 600k cycles.
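
To put such numbers into perspective, here is a back-of-the-envelope sketch. It assumes one load/unload every 10 seconds around the clock, which is an assumed worst case for illustration and not a measured value for these drives:

```sh
#!/bin/sh
# Whole days until a rated load-cycle budget is used up, assuming one
# load/unload every INTERVAL seconds, 24 hours a day.
# Usage: days_to_limit CYCLES INTERVAL
days_to_limit() {
    echo $(( $1 * $2 / 86400 ))
}
```

Under that assumption, `days_to_limit 600000 10` prints 69 and `days_to_limit 300000 10` prints 34, so either rating would be used up within a few months of continuous parking.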

TL Maybe check if you can disable the idle timer using WDIDLE3... works
TL for my drives (although it did some strange things to one out of the 6
TL drives -- decreased reported sector count and the zfs invalidated the
TL pool :/ ).

I can only encourage everyone having this problem to report it to WD's
support. Today I received an update for the firmware of
RE4-drives (which I did not try out yet). IMHO, the more people complain
about these issues, the higher the chance that WD will do something
about it.


cu
  Gerrit

Re: immense delayed write to file system (ZFS and UFS2), performance issues

2010-01-26 Thread Gerrit Kühn

On Tue, 26 Jan 2010 19:12:01 -0500 Damian Gerow dge...@afflictions.org
wrote about Re: immense delayed write to file system (ZFS and UFS2),
performance issues:

DG Adrian Wontroba wrote:

DG Having a script kick off and write to a disk will help so long as that
DG disk is writable; if it's being used as a hot spare in a raidz array,
DG it's not going to help much.

For my RE2 and RE4 disks I wrote a script that calls smartctl -a on all
disks (one after another) every 5s or so. This also prevents the counter
from increasing in my setup, and you can do it for every disk, no matter
whether it is part of a raid or not. I think writing to the disks may also
fail to have the desired effect if the writes are spread across stripes
(raid 50 or similar zpool setups).
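
Roughly, such a script could look like the sketch below. This is not the exact script I run; the function names, the SMARTCTL and INTERVAL variables and their defaults are assumptions for illustration:

```sh
#!/bin/sh
# Sketch of a keep-awake poller. SMARTCTL and INTERVAL can be
# overridden from the environment; both defaults are assumptions.
SMARTCTL=${SMARTCTL:-smartctl}
INTERVAL=${INTERVAL:-5}

# Query every given disk once, one after another; output is discarded.
poll_once() {
    for disk in "$@"; do
        "$SMARTCTL" -a "/dev/$disk" >/dev/null 2>&1
    done
}

# Poll forever. Not started automatically; call it with the disk names.
keep_awake() {
    while :; do
        poll_once "$@"
        sleep "$INTERVAL"
    done
}
```

Starting it would then be something like `keep_awake da0 da1 da2 da3` from a boot-time script. Keeping the loop in its own function makes it easy to test a single pass with poll_once before letting it run unattended.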

Just my 2¢.


cu
  Gerrit
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: immense delayed write to file system (ZFS and UFS2), performance issues

2010-01-19 Thread Gerrit Kühn

On Mon, 18 Jan 2010 21:41:53 -0500 Garrett Moore garrettmo...@gmail.com
wrote about Re: immense delayed write to file system (ZFS and UFS2),
performance issues:

GM The drives being discussed in my related thread (regarding poor
GM performance) are all WD Green drives. I have used wdidle3 to set all
GM of my drive timeouts to 5 minutes. I'll see what sort of difference
GM this makes for performance.

GM Even if it makes no difference to performance, thank you for pointing
GM it out
GM -- my drives have less than 2,000 hours on them and were all over
GM 90,000 load cycles due to this moronic factory setting. Since changing
GM the timeout, they haven't parked (which is what I would expect).

Thanks for bringing up this topic here. I have drives showing up close to
80 load cycle counts here. Guess it's time for that fix... :-|


cu
  Gerrit

Re: immense delayed write to file system (ZFS and UFS2), performance issues

2010-01-19 Thread Gerrit Kühn

On Tue, 19 Jan 2010 01:57:36 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: immense delayed write to file
system (ZFS and UFS2), performance issues:

JC If you want a consumer-edition drive that's better tuned for server
JC work, you should really be looking at the WD Caviar Black series or
JC their RE/RE2 series.  

That's exactly what I did. I have WD-RE2 drives here that show exactly
this problem (RE2/GP)! The model number is WD1000FYPS-01ZKB0.


cu
  Gerrit

Re: immense delayed write to file system (ZFS and UFS2), performance issues

2010-01-19 Thread Gerrit Kühn

On Tue, 19 Jan 2010 03:24:49 -0800 Jeremy Chadwick
free...@jdc.parodius.com wrote about Re: immense delayed write to file
system (ZFS and UFS2), performance issues:

JC  JC If you want a consumer-edition drive that's better tuned for
JC  JC server work, you should really be looking at the WD Caviar Black
JC  JC series or their RE/RE2 series.  

JC  That's exactly what I did. I have WD-RE2 drives here that show
JC  exactly this problem (RE2/GP)! The model number is WD1000FYPS-01ZKB0.

JC I should have been more specific.  WD makes RE-series drives which
JC don't have GP applied to them; those are what I was referring to.

Well, when I bought these drives I was not aware of this issue. Buying a
drive intended for 24/7 use in RAID configurations is basically the right
idea, I think. From what was written about the GP feature back then I
could not anticipate such problems.
I would have liked to buy the 2TB drives without GP lately, but they have
lead times into April here. So I went for the GP model, which now shows
the same problem as the 1TB drive... :-(

JC WD1000FYPS - WD RE2-GP,   1TB, 16MB, variable rpm
JC WD2002FYPS - WD RE4-GP,   2TB, 64MB, variable rpm

JC So which drive models above are experiencing a continual increase in
JC SMART attribute 193 (Load Cycle Count)?  My guess is that some of the
JC WD Caviar Green models, and possibly all of the RE2-GP and RE4-GP
JC models are experiencing this problem.

I can confirm that the two models above show this problem.
Furthermore I can confirm that at least in my setup here this drive
type works fine:

WD5001ABYS

I have some of the RE3 drives sitting around here and will probably try
them later.
Can anyone here report anything about the fixed firmware from
http://support.wdc.com/product/download.asp?groupid=609&sid=113&lang=en ?
Does this remedy the problem for the 1TB RE2 drive?

JC I say some with regards to WD Caviar Green since I have some which do
JC not appear to exhibit the heads/actuator arm moved into the
JC landing/park zone.  I'm at work right now, but when I get home I can
JC verify what models I've used which didn't experience this problem, as
JC well as what the manufacturing date and F/W revisions are.  I should
JC note I don't have said Green drives in use (I use WD1001FALS drives
JC now).

Thanks for sharing this information.


cu
  Gerrit

zfs/nfs mkstemp() failure subsequent hangs

2009-11-20 Thread Gerrit Kühn

Hi all,

I have an 8.0-PRERELEASE zfs/nfs server here that complains about I/O
errors when using rsync on an NFS client:

rsync: mkstemp
/usr/portage/metadata/cache/app-mobilephone/.ksms-0.1.2.4.BynVFw failed:
Input/output error (5)


I found this to be quite similar to kern/135412. However, this one
is said to be fixed and only applicable to 7-stable anyway.
Furthermore, after this happened, I tried to access files on the server
from the zfs filesystem concerned and found that I cannot access the fs
anymore. ls hangs in state zfs, so do mountd and zfs unmount.

Questions:
Should I open a new PR for this?
Are there any ideas how to recover access to the fs apart from rebooting
the machine? Right now I still have it running, so I could get some more
debugging information out of it.


cu
  Gerrit

Re: Support for SAS/SATA non-RAID adapters

2009-11-19 Thread Gerrit Kühn

On Wed, 18 Nov 2009 11:37:03 -0600 Barry Pederson b...@barryp.org wrote
about Re: Support for SAS/SATA non-RAID adapters:

  I guess the version of the card I have here was actually intended to
  be used in some kind of special Supermicro-Extension Slot. However,
  it fits into a standard PCIe slot and works nicely there as far as I
  can tell. Do you have the opportunity of using a riser card that
  would give you one more slot?

BP Those Supermicro UIO cards look like backwards PCIe cards.  Do they
BP come with other brackets for fitting into a PCIe slot, or did you have
BP to go bracketless?

They only come with a bracket that does not exactly fit into a standard
slot. Maybe the other bracket is available, but I did not care much about
it and simply went for bracketless (not much of a problem with a low
profile card).

BP didn't mention anything about brackets or how it'd work in PCIe slots.

For me it simply works. Only the bracket does not fit.


cu
  Gerrit

Re: Support for SAS/SATA non-RAID adapters

2009-11-19 Thread Gerrit Kühn

On Wed, 18 Nov 2009 09:35:56 -0800 Freddie Cash fjwc...@gmail.com wrote
about Re: Support for SAS/SATA non-RAID adapters:

FC  Hm, I don't know the recent exchange rate, but are you sure this is
FC  the same card? I paid something like 80,-€ (excl. VAT).

FC Oops, you're right, was reading the model numbers wrong.  The
FC LSI1068-based one is only $129 CDN, the Intel IOP-based ones are
FC $200-300 CDN.

That makes sense then.

FC Last time I checked the Euro was in the $1.50-2.00 CDN range.

Seems to be something like 1.55 these days.

FC  I guess the version of the card I have here was actually intended to
FC  be used in some kind of special Supermicro-Extension Slot. However,
FC  it fits into a standard PCIe slot and works nicely there as far as I
FC  can tell. Do you have the opportunity of using a riser card that
FC  would give you one more slot?

FC Urgh, I have yet to find a riser card that will plug into a Tyan
FC motherboard and not cause issues.  Due to all the issues we've had
FC with riser cards in the past, we have sworn off all riser cards.  For
FC our 2U servers, we use low-profile cards to avoid risers.

I had some trouble with risers in the past, too. However, I have a Tyan
Transport here that seems to work nicely at least with the riser that came
with the system.

FC I'll keep looking for a PCI-X card.  These look like they'll cover our
FC PCIe needs.

Please let us know if you find one that is suitable. I spent quite some
time digging up the Supermicro card; cheap (non-RAID), FreeBSD-supported
cards with more than 4 channels are not that common.


cu
  Gerrit

Re: Support for SAS/SATA non-RAID adapters

2009-11-19 Thread Gerrit Kühn

On Wed, 18 Nov 2009 13:15:59 -0600 Barry Pederson b...@barryp.org wrote
about Re: Support for SAS/SATA non-RAID adapters:

BP What I was questioning was where the OP said: "it fits into a standard
BP PCIe slot and works nicely there as far as I can tell" - which to me
BP sounds like you could use this HBA in a *NON-Supermicro* motherboard.

BP I was just wondering if that was truly the case, given how in the
BP photos it looks to be arranged physically backwards from a regular
BP PCIe card, and given how you mention "The UIO slot itself is
BP proprietary".

I'm sorry if my comment "fits into a standard PCIe slot" was misleading
here. I wanted to state that, although Supermicro lists this one as a card
for UIO, I plugged it into a standard PCIe slot and it simply works there
for me. Only the mounting bracket it came with did not fit, but for a
low-profile card it is not that difficult to live without it.

BP But some more digging on Google has turned up a few mentions along the
BP lines of:
BP
BP    This card plugs into a normal PCIe 8x slot but the
BP    metal mounting bracket bolted to the card is made
BP    for a UIO slot (which is why it's so cheap).
BP
BP    All you have to do is remove the metal bracket and
BP    zip-tie the card to your case for mechanical support.
BP    Electrically it'll work fine in a PCIe x8 or x16 slot.

That's exactly my experience.

BP If someone wanted to make PCIe compatible brackets for this affordable 
BP card, they'd probably sell a fair number to small shops or home users.

Yeah, I would also buy some. :-)


cu
  Gerrit

Re: Support for SAS/SATA non-RAID adapters

2009-11-18 Thread Gerrit Kühn

On Tue, 17 Nov 2009 16:29:06 -0800 Freddie Cash fjwc...@gmail.com wrote
about Support for SAS/SATA non-RAID adapters:

FC Any recommendations on other SAS/SATA controllers to look at (just not
FC anything with MegaRAID in the name)?

I installed a Supermicro AOC-USASLP-L8i card here some days ago. It should
be even cheaper than the ones you mentioned and comes with an LSI chip
supported by the mpt driver:

m...@pci0:6:0:0: class=0x01 card=0xa68015d9 chip=0x00581000 rev=0x08 hdr=0x00
    vendor     = 'LSI Logic (Was: Symbios Logic, NCR)'
    device     = 'SAS 3000 series, 8-port with 1068E -StorPort'
    class      = mass storage
    subclass   = SCSI



I only installed it last week and cannot comment much on performance and
stability up to now.


cu
  Gerrit

Re: Support for SAS/SATA non-RAID adapters

2009-11-18 Thread Gerrit Kühn

On Wed, 18 Nov 2009 08:56:14 -0800 Freddie Cash fjwc...@gmail.com wrote
about Re: Support for SAS/SATA non-RAID adapters:

FC  I installed a Supermicro AOC-USASLP-L8i card here some days ago.
FC  Should be even cheaper than the ones you mentioned and comes with a
FC  LSI chip supported by mpt driver:

FC  m...@pci0:6:0:0:        class=0x01 card=0xa68015d9
FC  chip=0x00581000 rev=0x08 hdr=0x00 vendor     = 'LSI Logic (Was:
FC  Symbios Logic, NCR)' device     = 'SAS 3000 series, 8-port with
FC  1068E -StorPort' class      = mass storage
FC     subclass   = SCSI

FC  I only installed it last week and cannot comment much on performance
FC  and stability up to now.

FC These look nice, and are in the $200-300 CDN range.  Have the same
FC mini-SAS connectors as the 3Ware cards we use, so wouldn't have to
FC re-cable the chassis.

Hm, I don't know the recent exchange rate, but are you sure this is the
same card? I paid something like 80,-€ (excl. VAT).

FC Are you using these as standard disk controllers, or are you using the
FC RAID features (seems it supports RAID0 and RAID1 in hardware, RAID5 in
FC software)?  Reading through the manual right now, and it doesn't cover
FC using the card in non-RAID modes.  Wondering if the drives would show
FC up as normal da0 da1 da2 etc.

I think my card does not have the raid features included, maybe that's why
it was so cheap. The devices appear as normal scsi disks:

dmesg:
da0 at mpt0 bus 0 target 0 lun 0
da0: <ATA WDC WD5001ABYS-0 1D01> Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C)
[...]

cliff# camcontrol devlist
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 0 lun 0 (da0,pass0)
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 1 lun 0 (da1,pass1)
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 2 lun 0 (da2,pass2)
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 3 lun 0 (da3,pass3)
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 4 lun 0 (da4,pass4)
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 5 lun 0 (da5,pass5)
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 6 lun 0 (da6,pass6)
<ATA WDC WD5001ABYS-0 1D01>   at scbus0 target 7 lun 0 (da7,pass7)

FC All of these (there's a couple variations on the card) appear to be
FC PCIe, though, no PCI-X.  We have 24 drive bays, and only 2 PCIe slots.
FC Have 3 PCI-X slots, though, so would need at least 1 PCI-X
FC controller.

I guess the version of the card I have here was actually intended to be
used in some kind of special Supermicro-Extension Slot. However, it fits
into a standard PCIe slot and works nicely there as far as I can tell.
Do you have the opportunity of using a riser card that would give you one
more slot?


cu
  Gerrit

zfs panic mounting fs after crash with RC2

2009-11-04 Thread Gerrit Kühn

Hi,

Yesterday I had the opportunity to play around with my yet-to-become new
fileserver a bit more. Originally I had installed 7.2-R, which I upgraded
to 8.0-RC2 yesterday. After that I upgraded my zpool, consisting of 4 disks
in a raidz1 configuration, to v13.
Some time later I tried to use powerd, which was obviously a bad idea: it
crashed the machine immediately. I will give a separate report on that
later, as it is probably related to the hardware, which is a bit exotic (VIA
VB8001 board with 64bit Via Nano processor).
However, the worst thing for me is that after rebooting from that crash,
one of my zfs filesystems cannot be mounted anymore. As soon as I try to
mount it I get a kernel panic. I can still access the properties (I made use
of canmount=noauto for the first time :-), but I cannot take a snapshot of
the fs (funnily enough, zfs complains that the fs is busy, while in reality
it is not even mounted, so how could it be busy?).

I took a picture of the kernel panic and put it here (don't know if there
is any useful information in it):
http://www.pmp.uni-hannover.de/test/Mitarbeiter/g_kuehn/data/zfs-panic.jpg

The pool as such seems to be fine, all other fs in it can be mounted and
used, only trying to mount tank/sys/var triggers this panic.
Are there any suggestions what I could do to get my fs back? Please let me
know if (and how) I can provide more debugging information.


cu
  Gerrit

zfs panic

2009-11-02 Thread Gerrit Kühn

Hi,

I got the following panic when rebooting after a crash on 7.2-REL:

panic: solaris assert: dmu_read(os, smo->smo_object, offset, size,
entry_map) == 0 (0x5 == 0x0), file:
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/space_map.c,
line: 341

This seems to be the same panic as mentioned here:
http://lists.freebsd.org/pipermail/freebsd-stable/2008-July/043763.html.

However, I did not see warnings about the ZIL. The crash leading to this
situation was probably caused by me pushing the controller card a bit too
hard (mechanically) during operation (well, so much about hot-plugging of
cards :-).
Since my pool was almost empty anyway and I needed the machine, I opted to
recreate the pool instead of trying the patches supplied by pjd@ in the
thread above.

But nevertheless I would like to be prepared if this happens again (and
the pool is not empty :-).
Right now I am updating the system to 8.0-RC2. Will this issue go away
with zpool v13/FreeBSD 8.0 (as suggested above)? I could not find out from
the thread above whether the suggested patches helped or whether any of this
has been committed at all. Pawel or Daniel, do you remember what the final
result was?


cu
  Gerrit

Linux/KDE and NFS locking on 7-stable

2009-09-14 Thread Gerrit Kühn

Hi all,

I upgraded a FreeBSD fileserver last week from 7.0-stable to 7.2-stable
and am now experiencing some weird problems with Linux NFS clients.
The Linux clients mount their home directories via NFS. I usually use
nolock on the client side, because file locking was always troublesome
in the past. On the clients the users run KDE 3.5 or 4.2.
After the update of the server, KDE 3.5 stops starting up (after logging
in with kdm) at the splash screen and comes up with some kind of I/O error
when writing to the home dir. At the same time the server complains about:

kernel: NLM: failed to contact remote rpcbind, stat = 5, port = 28416

Any other window manager (xfce, icewm, mwm, twm) seems to work fine.
Playing around with locking, udp/tcp, and rebooting a few times then somehow
magically made it work with KDE 3.5 again (although in the end I am using
the same mount options as I used before):

mclane:/tank/home/gco  /tank/home/ghf nfs nfsvers=3,rw,nolock,nordirplus 0 0

Turning locking on definitely does not work (tried it three times).
rpc.lockd and rpc.statd are running on the server side, no further tuning
done there.

KDE 4.2 seems to behave better: I have been able to get it working with
locking turned on (but it refused to work without locking, with the same
errors as described above for KDE 3.5).

I find the whole situation rather unsatisfactory. Can anybody here give me a
hint which combination of mount options should work for a FreeBSD server
running 7.2-stable and Linux clients running 2.6.29 and KDE 3/4? I do not
care that much about performance here; I want a stable, working solution.
And why, after all, is KDE so picky about locking and NFS homedirs anyway?
All other environments do not appear to show these problems.


cu
  Gerrit

Re: zfs kernel panic

2009-09-09 Thread Gerrit Kühn

On Tue, 8 Sep 2009 18:44:13 +0200 Pawel Jakub Dawidek p...@freebsd.org
wrote about Re: zfs kernel panic:

PJD If this is amd64, add vm.kmem_size=4G to your loader.conf back.

Yes, it is amd64 (sorry I did not mention that). I will add the option
back (the one I used before was set to a somewhat lower value, something
like 2G afaicr).


cu
  Gerrit

Re: zfs kernel panic

2009-09-09 Thread Gerrit Kühn

On Tue, 8 Sep 2009 18:44:13 +0200 Pawel Jakub Dawidek p...@freebsd.org
wrote about Re: zfs kernel panic:

PJD If this is amd64, add vm.kmem_size=4G to your loader.conf back.

What about vm.kmem_size_max? Does that also need tuning?


cu
  Gerrit

zfs kernel panic

2009-09-08 Thread Gerrit Kühn

Hi folks,

I just upgraded a zfs server from 7.0-something to 7.2-stable and hoped to
get rid of some minor instabilities I experienced every 6 months or so.
Unfortunately, the new system crashed for the first time after only a few
hours when copying some files via scp onto it.
I got a kernel panic which looked quite similar to the one reported here
(kmem_map too small):
http://www.archivum.info/freebsd...@freebsd.org/2009-04/00071/FreeBSD_7.2-RC1_-_ZFS_related_kernel_panic_quot_kmem_map_too_small_quot

I have a dual-CPU, dual-core Opteron system with 4GB of RAM and a 3-disk
raidz1. I took out the memory settings from loader.conf as suggested in
UPDATING. I have not yet upgraded the zpool or zfs version (would that help?).
Are there any known issues or any further hints as to what might cause the
crash? I copied the files again, but this time everything went fine.


cu
  Gerrit

Re: FreeBSD 7.1 (and 7.2) Breaks re and rl Network Interface Drivers

2009-08-17 Thread Gerrit Kühn

On Mon, 9 Mar 2009 17:33:06 +0900 Pyun YongHyeon pyu...@gmail.com wrote
about Re: FreeBSD 7.1 Breaks re and rl Network Interface Drivers:

PY  I cannot say if the actual issue I had with 7.1-stable has gone
PY  away, too, because this only occurred after a longer time of
PY  operation. However, up to now everything looks nice.

PY Ok, if you find any re(4) instability feel free to contact me.

Ok, took some time, but here I am. :-)

It seems I have two different versions of the Jetway mainboard here (one
with 25W total power consumption and one with 12W, I guess). Anyway, my
version of 7.1 with your patches is running fine on this board:


---
Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-STABLE #0: Tue Mar 24 12:46:03 CET 2009
r...@xenon:/usr/tmp/usr/obj/usr/work/current/src/sys/FIREFLY
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: VIA C7-D Processor 1500MHz (1500.02-MHz 686-class CPU)
  Origin = CentaurHauls  Id = 0x6d0  Stepping = 0
  
Features=0xa7c9baffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,APIC,SEP,MTRR,PGE,CMOV,PAT,CLFLUSH,ACPI,MMX,FXSR,SSE,SSE2,TM,PBE
  Features2=0x4001SSE3,xTPR
  VIA Padlock Features=0xffccRNG,AES,AES-CTR,SHA1,SHA256,RSA
real memory  = 1055784960 (1006 MB)
avail memory = 1023750144 (976 MB)
kbd1 at kbdmux0
cryptosoft0: software crypto on motherboard
padlock0: AES-CBC,SHA1,SHA256 on motherboard
acpi0: CN700 AWRDACPI on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, 3ede (3) failed
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x408-0x40b on acpi0
acpi_button0: Power Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci_link2: BIOS IRQ 5 for 0.9.INTA is invalid
pci_link2: BIOS IRQ 5 for 0.16.INTC is invalid
pci_link2: BIOS IRQ 5 for 0.17.INTC is invalid
pci0: ACPI PCI bus on pcib0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
vgapci0: VGA-compatible display mem 0xf400-0xf7ff,0xfb00-0xfbff irq 11 at device 0.0 on pci1
re0: RealTek 8169SC/8110SC Single-chip Gigabit Ethernet port 0xf000-0xf0ff mem 0xfdfff000-0xfdfff0ff irq 10 at device 9.0 on pci0
re0: Chip rev. 0x1800
re0: MAC rev. 0x
miibus0: MII bus on re0
rgephy0: RTL8169S/8110S/8211B media interface PHY 1 on miibus0
rgephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
re0: Ethernet address: 00:30:18:a7:8a:1c
re0: [FILTER]
re1: RealTek 8169SC/8110SC Single-chip Gigabit Ethernet port 0xf200-0xf2ff mem 0xfdffe000-0xfdffe0ff irq 10 at device 11.0 on pci0
re1: Chip rev. 0x1800
re1: MAC rev. 0x
miibus1: MII bus on re1
rgephy1: RTL8169S/8110S/8211B media interface PHY 1 on miibus1
rgephy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
re1: Ethernet address: 00:30:18:a7:8a:1d
re1: [FILTER]
atapci0: VIA 6420 SATA150 controller port 0xff00-0xff07,0xfe00-0xfe03,0xfd00-0xfd07,0xfc00-0xfc03,0xfb00-0xfb0f,0xf400-0xf4ff irq 11 at device 15.0 on pci0
atapci0: [ITHREAD]
ata2: ATA channel 0 on atapci0
ata2: [ITHREAD]
ata3: ATA channel 1 on atapci0
ata3: [ITHREAD]
atapci1: VIA 8237 UDMA133 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfa00-0xfa0f at device 15.1 on pci0
ata0: ATA channel 0 on atapci1
ata0: [ITHREAD]
ata1: ATA channel 1 on atapci1
ata1: [ITHREAD]
uhci0: VIA 83C572 USB controller port 0xf900-0xf91f irq 11 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: VIA 83C572 USB controller on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: VIA 83C572 USB controller port 0xf800-0xf81f irq 11 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: VIA 83C572 USB controller on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: VIA 83C572 USB controller port 0xf700-0xf71f irq 11 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: VIA 83C572 USB controller on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 on usb2
uhub2: 2 ports with 2 removable, self powered
uhci3: VIA 83C572 USB controller port 0xf600-0xf61f irq 11 at device 16.3 on pci0
uhci3: [GIANT-LOCKED]
uhci3: [ITHREAD]
usb3: VIA 83C572 USB controller on uhci3
usb3: USB revision 1.0
uhub3: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: VIA VT6202 USB 2.0 controller mem 0xfdffd000-0xfdffd0ff irq 10 at device 16.4 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1

Re: ZFS NAS configuration question

2009-06-02 Thread Gerrit Kühn

On Sat, 30 May 2009 21:41:36 +0300 Dan Naumov dan.nau...@gmail.com wrote
about ZFS NAS configuration question:

DN So, this leaves me with 1 SATA port used for a FreeBSD disk and 4 SATA
DN ports available for tinkering with ZFS.

Do you have a USB port available to boot from? A conventional USB stick (I
use 4 GB or 8GB these days, but smaller ones would certainly also do) is
enough to hold the base system on UFS, and you can give the whole of your
disks to ZFS without having to bother with booting from them.


cu
  Gerrit

zfs: dataset is busy

2009-05-11 Thread Gerrit Kühn

Hi all,

I have several machines here that do automatic snapshotting via RSEs
snapshot-tool (sysutils/freebsd-snapshot). I use a homemade script to
incrementally transfer daily snapshots to a backup server. This runs fine
with machines running 7.0-STABLE of Jun 08.
However, I have one box with 7.1-PRERELEASE from Sep 08 showing the
following error after some time (when trying to rotate the snapshots via
the freebsd-snapshot tool):

cannot destroy 'tank/w...@daily.6': dataset is busy


In fact I cannot do anything to the snapshot, neither destroy nor export
(not even with -f); it is always busy. The only way out of this that I
have found is to reboot the machine. After a reboot it is fine for
some weeks, but then the same problem comes back.

Has anyone seen this before? Is there a fix or workaround available? Are
there any improvements I could benefit from by upgrading to 7.2-stable?
Right now the machine is in this state, so I could get some more
information from it (if anyone tells me what to do :-).
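
For reference, an incremental transfer of the kind mentioned above boils down
to `zfs send -i` piped into `zfs receive` on the backup host. The Python
sketch below only builds that command line and is not the homemade script
itself; the dataset, snapshot, host and target names are hypothetical.

```python
import shlex


def incremental_send_command(dataset, previous, current, backup_host, target):
    """Build a shell pipeline that sends the delta between two snapshots.

    All arguments are placeholders chosen for illustration.
    """
    send = ["zfs", "send", "-i", f"{dataset}@{previous}", f"{dataset}@{current}"]
    receive = ["ssh", backup_host, "zfs", "receive", target]
    # Quote each argument so the result can be pasted into a shell safely.
    return " | ".join(
        " ".join(shlex.quote(arg) for arg in cmd) for cmd in (send, receive)
    )


print(incremental_send_command(
    "tank/home", "daily.1", "daily.0", "backuphost", "backup/home"
))
```

The backup side must already hold the older of the two snapshots, otherwise
the incremental stream cannot be applied.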


cu
  Gerrit

Re: FreeBSD 7.1 Breaks re and rl Network Interface Drivers

2009-03-09 Thread Gerrit Kühn

On Sun, 8 Mar 2009 11:36:42 +0900 Pyun YongHyeon pyu...@gmail.com wrote
about Re: FreeBSD 7.1 Breaks re and rl Network Interface Drivers:

PY Have you tried re(4) in HEAD?
PY I had one report that re(4) in HEAD still does not fix the issue so
PY I posted a possible workaround for that. Unfortunately he didn't
PY report back so I don't know whether it was right workaround or not. 
PY If re(4) in HEAD does not fix the issue, would you try attached
PY patch and let me know how it goes?

Are you talking about me? ;-)
I put the first system with the patched patch to work last week. It
definitely fixes the problems I had with the first patch (HEAD, I
suppose), as the interface attaches fine now (even a lot faster than
before).
I cannot say if the actual issue I had with 7.1-stable has gone away, too,
because this only occurred after a longer time of operation. However, up to
now everything looks nice.


cu
  Gerrit

Re: FreeBSD 7.1 Breaks re and rl Network Interface Drivers

2009-03-09 Thread Gerrit Kühn

On Mon, 9 Mar 2009 14:51:41 +1000 Gavin Stone-Tolcher
g.stone-tolc...@its.uq.edu.au wrote about RE: FreeBSD 7.1 Breaks re and
rl Network Interface Drivers:

GST Hi, Just some more feedback on your patch.
GST I have a Jetway J7F4K1G2E board with dual embedded 
GST RealteK RTL8110SC. I tried using the 19 January 2009 jkim 
GST patches:

And I just want to add that I am using the same mainboard series here (A,
D and E versions, I think).

GST I have been using your patch above originally proffered on 
GST Feb 13 since Feb 18 and the system has been working fine 
GST since then. 

If that's the patch pyunyh posted here in answer to my issue (should have
been around the same time, my kernel is from Feb 13), that's the same I am
using here now (and it also works for me up to now).


cu
  Gerrit

Re: zfs crashes with nfs and snapshots

2009-02-17 Thread Gerrit Kühn

On Mon, 16 Feb 2009 19:43:00 +0200 Jaakko Heinonen j...@saunalahti.fi
wrote about Re: zfs crashes with nfs and snapshots:

JH  Ok, I will upgrade to 7.1-stable asap. The client was Linux 2.6.25,
JH  I cannot say if it uses readdirplus and if I could disable that (the
JH  manpage says nothing about it at all, but I will look into that
JH  further).

JH -o nordirplus mount option should disable it on Linux.

Thanks. I missed that when first looking into the manpage (probably because
it's written in UPPERCASE :-).


cu
  Gerrit

Re: fun with if_re

2009-02-13 Thread Gerrit Kühn

On Thu, 5 Feb 2009 17:28:04 +0900 Pyun YongHyeon pyu...@gmail.com wrote
about Re: fun with if_re:

PY  I did build new nanobsd images with these patches meanwhile and will
PY  start using them today. However, as it has worked without problems
PY  for weeks with the buggy version before, I will not be able to say
PY  if it is really working until next month or so. Or do you know any
PY  method to reliably
PY 
PY That's fine.


I had to reboot some of the machines meanwhile and could do some further
testing. One strange thing I noticed is that the re-interfaces often do
not come up in a working state after rebooting. Strangely, I see
network traffic floating around via tcpdump, but not even ping works.
This state often goes away when playing around with the interface
(sometimes ifconfig down/up helps, sometimes disabling some of the
additional features like txc/rxc), but I cannot make out a reproducible
behaviour so far. When the interface leaves this strange state it seems to
work fine afterwards. Any clues?


cu
  Gerrit

Re: zfs crashes with nfs and snapshots

2009-02-13 Thread Gerrit Kühn

On Wed, 11 Feb 2009 19:55:11 +0200 Jaakko Heinonen j...@saunalahti.fi
wrote about Re: zfs crashes with nfs and snapshots:

JH This is likely the issue described in this message:
JH http://lists.freebsd.org/pipermail/freebsd-fs/2008-October/005217.html

Yes, this looks very much like it.

JH The nfs fix has been committed to head and stable/7 (7.1-RELEASE has
JH the fix). The fix prevents system from panicking but you still can't
JH access the snapshot directory with readdirplus enabled nfs clients. As
JH a workaround you can disable readdirplus support if your nfs client
JH allows it.

Ok, I will upgrade to 7.1-stable asap. The client was Linux 2.6.25, I
cannot say if it uses readdirplus and if I could disable that (the manpage
says nothing about it at all, but I will look into that further).
Thanks for the hint.


cu
  Gerrit

Re: fun with if_re

2009-02-13 Thread Gerrit Kühn

On Fri, 13 Feb 2009 19:24:00 +0900 Pyun YongHyeon pyu...@gmail.com wrote
about Re: fun with if_re:

PY  I had to reboot some of the machines meanwhile and could do some
PY  further testing. One strange thing I noticed is that the
PY  re-interfaces often do not come up in a working state after
PY  rebooting. Strangely, I see network traffic floating around via
PY  tcpdump, but not even ping works. This state often goes away when
PY  playing around with the interface (sometimes ifconfig down/up helps,
PY  sometimes disabling some of the additional features like txc/rxc),
PY  but I cannot make out a reproducible behaviour so far. When the
PY  interface leaves this strange state it seems to work fine
PY  afterwards. Any clues?

PY Does this happen on latest if_re.c/if_rlreg.h? I guess jkim fixed
PY this type of problem in r187483. If that has no effect please let
PY me know.

It happens on both versions: the old one from 11th Dec 08 I still had, and
the new one I built with the patches you recommended about a week ago.
if_re is 1.151 2009/01/20 20:22:28 jkim, if_rlreg is 1.94 2009/01/20
20:22:28 jkim for the latter.


cu
  Gerrit

Re: fun with if_re

2009-02-13 Thread Gerrit Kühn

On Fri, 13 Feb 2009 20:39:55 +0900 Pyun YongHyeon pyu...@gmail.com wrote
about Re: fun with if_re:


PY Ok, try attached patch.

Thanks, building new images right now. I'll be back later (next week).


cu
  Gerrit

zfs crashes with nfs and snapshots

2009-02-11 Thread Gerrit Kühn

Hi folks,

I just saw one of my FreeBSD servers (7.0-stable of June 2008) crash while
trying to access the .zfs snapshot directory via a nfs client machine.
The server got a page fault caused by the nfsd process. It wasn't even
able to dump the kernel image anymore.
After a reset the machine first appeared to come back fine, but shortly
before the login prompt nfsd crashed it hard again, the same way as
before. Then I booted single user, fscked the ufs partitions by hand and
had to re-import the zpool with -f. After that I did another reboot,
whereupon everything was fine again.
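
In commands, the recovery from single user mode was roughly the
following. This is a sketch from memory: the pool name "tank" and the
use of "fsck -p" are assumptions, and the function name is made up:

```sh
# Sketch of the single-user recovery steps described above.
recover_after_crash() {
    pool=${1:-tank}             # pool name is an assumption
    # Check the UFS filesystems listed in /etc/fstab (preen mode),
    # then force the import of the pool that was not cleanly exported.
    fsck -p && zpool import -f "$pool"
}
```

The -f is needed because the pool still looks in use after the crash;
it should only be forced when no other host has the pool imported.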

As I need that machine I'm a bit unwilling to try accessing the snapshot
directory again via nfs right now. :-)
So here are some questions before I do anything else:
- Did anyone else already see this behaviour?
- Is there something wrong with accessing the snapshot directory via nfs?
- Does zfs stability profit from an update to a recent -stable?

Any answers or further thoughts/hints on this are very welcome.


cu
  Gerrit

Re: fun with if_re

2009-02-05 Thread Gerrit Kühn

On Thu, 5 Feb 2009 17:28:04 +0900 Pyun YongHyeon pyu...@gmail.com wrote
about Re: fun with if_re:

PY  I did build new nanobsd images with these patches meanwhile and will
PY  start using them today. However, as it has worked without problems
PY  for weeks with the buggy version before, I will not be able to say
PY  if it is really working until next month or so. Or do you know any
PY  method to reliably

PY That's fine.

Sorry to be back so soon again, but I just noticed that I did in fact not
produce new images yesterday. :-)
Kernel build stopped with

---
mkdep -f .depend -a   -nostdinc -D_KERNEL -DKLD_MODULE -DHAVE_KERNEL_OPTION_HEADERS -I. -I@ -I@/contrib/altq -I/usr/tmp/usr/obj/usr/work/current/src/sys/FIREFLY
  /usr/work/current/src/sys/modules/mii/../../dev/mii/acphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/amphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/atphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/bmtphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/brgphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/ciphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/e1000phy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/exphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/gentbi.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/icsphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/inphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/ip1000phy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/jmphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/lxtphy.c
  miibus_if.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/mii.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/mii_physubr.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/mlphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/nsgphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/nsphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/nsphyter.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/pnaphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/qsphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/rgephy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/rlphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/ruephy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/tdkphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/tlphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/truephy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/ukphy.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/ukphy_subr.c
  /usr/work/current/src/sys/modules/mii/../../dev/mii/xmphy.c
In file included from /usr/work/current/src/sys/modules/mii/../../dev/mii/rgephy.c:60:
@/pci/if_rlreg.h:1509:28: error: token ; is not valid in preprocessor expressions
@/pci/if_rlreg.h:1917:6: error: unterminated comment
@/pci/if_rlreg.h:1509:1: error: unterminated #if
In file included from /usr/work/current/src/sys/modules/mii/../../dev/mii/rlphy.c:56:
@/pci/if_rlreg.h:1509:28: error: token ; is not valid in preprocessor expressions
@/pci/if_rlreg.h:1917:6: error: unterminated comment
@/pci/if_rlreg.h:1509:1: error: unterminated #if
mkdep: compile failed
*** Error code 1
1 error
*** Error code 2
1 error
*** Error code 2
2 errors
*** Error code 2
1 error
*** Error code 2
1 error
---



Any hints?



cu
  Gerrit

Re: fun with if_re

2009-02-05 Thread Gerrit Kühn

On Thu, 5 Feb 2009 12:05:46 +0100 Gerrit Kühn ger...@pmp.uni-hannover.de
wrote about Re: fun with if_re:

GK Sorry to be back so soon again, but I just noticed that I did in fact
GK not produce new images yesterday. :-)
GK Kernel build stopped with

[...]

Ignore me, my bad (downloaded the webpage instead of the code via
webcvs :-). On my way now.
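
A quick check like the following would have caught this before starting
the build. It is only a heuristic sketch (the function name is made up)
that looks for HTML markers in the first lines of a downloaded file:

```sh
# Heuristic: does the file look like an HTML page rather than C source?
# Only the first five lines are inspected.
looks_like_html() {
    head -n 5 "$1" | grep -qiE '<!doctype html|<html'
}
```

For example, "looks_like_html if_rlreg.h && echo 'got the web page, not
the source'" before copying the file into the tree.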


cu
  Gerrit

fun with if_re

2009-02-04 Thread Gerrit Kühn

Hi folks,

I have several routers here which are based on Jetway J7F4 ITX boards that
come with two onboard re-interfaces. I run 7-stable on them via nanobsd
and update them about once every three or four months.

After the last update (11th December 2008) I have noticed the following
strange behaviour on at least two machines (identical hard- and software):
After weeks of flawless operation, the network connection on both
interfaces suddenly starts to mangle packets. Even a simple ping can show
up to 50% or so packet loss. The machine is mostly unreachable via net.
ifconfig up/down did not cure this, turning off checksum-offloading
and stuff did not help. Even simply rebooting the machine did not make the
problem go away! I had to power-cycle them by unplugging all cables to get
back to normal operation.

I have seen this behaviour on two different machines, so I can most
probably rule out a hardware issue. It does not appear to happen often,
though. I did not see this with an earlier image of 7-stable from June
2008, and probably even an image from early September was working fine
(although I did not use that one for such a long time).

Visiting the webcvs I noticed that there are a lot of patches for if_re in
December 2008 and January 2009. The revision I'm having problems with is
tagged 1.95.2.37 2008/12/09 11:01:17. Does anyone have an idea what
broke if_re for me, and how I can get back to stable operation? Is it
possible to use if_re from head as drop-in replacement to test the patches
available after 12/09? I would prefer not to move the machines completely
from -stable to -current.

Here is some further information about the NICs:

---pciconf---
r...@pci0:0:9:0: class=0x02 card=0x10ec16f3 chip=0x816710ec rev=0x10 hdr=0x00
    vendor = 'Realtek Semiconductor'
    device = 'RTL8169/8110 Family Gigabit Ethernet NIC'
    class  = network
    subclass   = ethernet
r...@pci0:0:11:0: class=0x02 card=0x10ec16f3 chip=0x816710ec rev=0x10 hdr=0x00
    vendor = 'Realtek Semiconductor'
    device = 'RTL8169/8110 Family Gigabit Ethernet NIC'
    class  = network
    subclass   = ethernet
---


---dmesg---
re0: RealTek 8169SC/8110SC Single-chip Gigabit Ethernet port 0xf000-0xf0ff mem 0xfdfff000-0xfdfff0ff irq 10 at device 9.0 on pci0
re0: Chip rev. 0x1800
re0: MAC rev. 0x
miibus0: MII bus on re0
rgephy0: RTL8169S/8110S/8211B media interface PHY 1 on miibus0
rgephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
re0: Ethernet address: 00:30:18:ab:d0:19
re0: [FILTER]
re1: RealTek 8169SC/8110SC Single-chip Gigabit Ethernet port 0xf200-0xf2ff mem 0xfdffe000-0xfdffe0ff irq 10 at device 11.0 on pci0
re1: Chip rev. 0x1800
re1: MAC rev. 0x
miibus1: MII bus on re1
rgephy1: RTL8169S/8110S/8211B media interface PHY 1 on miibus1
rgephy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
re1: Ethernet address: 00:30:18:ab:d0:1a
re1: [FILTER]
---



cu
  Gerrit

Re: fun with if_re

2009-02-04 Thread Gerrit Kühn

On Wed, 4 Feb 2009 19:46:55 +0900 Pyun YongHyeon pyu...@gmail.com wrote
about Re: fun with if_re:


PY Since you're using RTL8169SC it could be related to my commit
PY r180519 (cvs rev 1.95.2.22). It seems that RTL8169SC does not like
PY memory mapped register access and I think jkim@ committed patch
PY for the issue. Would you try re(4) in HEAD?
PY (Just copying if_re.c, if_rlreg.h and if_rl.c from HEAD to
PY stable would be enough to build re(4) on stable).

Thanks for the advice.
I did build new nanobsd images with these patches meanwhile and will start
using them today. However, as it has worked without problems for weeks
with the buggy version before, I will not be able to say if it is really
working until next month or so. Or do you know any method to reliably
trigger such errors?
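
For reference, pulling the three files over can be sketched like this.
The two tree locations are whatever you pass in, and the layout below
(if_re.c under dev/re, if_rl.c and if_rlreg.h under pci) is an
assumption about how both sys/ trees are organized:

```sh
# Sketch: copy the re(4) sources from a HEAD sys/ tree into a stable one.
# Both arguments are paths to the respective sys/ directories.
copy_re_from_head() {
    head_sys=$1      # e.g. /usr/work/current/src/sys (hypothetical)
    stable_sys=$2    # e.g. /usr/src/sys (hypothetical)
    cp "$head_sys/dev/re/if_re.c"  "$stable_sys/dev/re/if_re.c" &&
    cp "$head_sys/pci/if_rlreg.h"  "$stable_sys/pci/if_rlreg.h" &&
    cp "$head_sys/pci/if_rl.c"     "$stable_sys/pci/if_rl.c"
}
```

After the copy the kernel (or at least the re and mii modules) has to be
rebuilt; keeping a copy of the old files makes it easy to go back.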


cu
  Gerrit

Re: Curious failure of ZFS snapshots

2008-12-01 Thread Gerrit Kühn

On Sat, 29 Nov 2008 11:46:40 +0100 Pawel Jakub Dawidek [EMAIL PROTECTED]
wrote about Re: Curious failure of ZFS snapshots:

PJD   GK mclane# ll /tank/home/pt/.zfs/
PJD   GK ls: snapshot: Bad file descriptor
PJD   GK total 0

PJD Is there a way for me to reproduce that?

None that I could tell you right now.
This was on a machine which uses zfs send/receive to backup its zfs
filesystem to a backup server. Only one out of 6 or 7 zfs filesystems
showed this problem. After rebooting it went away and did not appear again
since then.

cu
  Gerrit

Re: Curious failure of ZFS snapshots

2008-12-01 Thread Gerrit Kühn

On Sun, 30 Nov 2008 01:05:48 + Pete French
[EMAIL PROTECTED] wrote about Re: Curious failure of ZFS
snapshots:

PF Here is what I am doing - this script is run with an argument '7am' or
PF '7pm' once per day. The mysql database is a slave replication from a
PF master, so there is a continuous trickle of data into it. The symbolic
PF links are there so you can connect to the mysql server and access
PF 'xxx-7am' or 'xxx-7pm' to get a previous version of database 'xxx'.
PF In case it's not obvious, the filesystem 'tank/zfs' is mounted on the
PF directory '/var/db/mysql'. If you run this for a few cycles it should
PF presumably break for you too.

If you think it will be useful I can also post my scripts. However, as I
did not see the problem again so far, it might be the case that I messed
something up manually while developing the scripts one or two weeks ago.
As mentioned, even the inaccessible zfs snapshots did send/receive fine,
so internally zfs seems to be happy (only unmounting them was a bad
idea :-).


cu
  Gerrit

Re: Curious failure of ZFS snapshots

2008-11-24 Thread Gerrit Kühn

On Fri, 21 Nov 2008 08:16:35 -0800 Freddie Cash [EMAIL PROTECTED] wrote
about Re: Curious failure of ZFS snapshots:

FC  GK mclane# ll /tank/home/pt/.zfs/
FC  GK ls: snapshot: Bad file descriptor
FC  GK total 0

FC Which shell are you using?  I've seen quite a few 
FC different non-existent/invalid directory errors when using tcsh
FC to navigate through the .zfs/ hierarchy.  Can do cd .., ls ., or
FC tab completion when in anything under .zfs/

Standard root login, so it's /bin/csh.
I cannot remember if I tried to cd into the dir, and after rebooting
everything's fine up to now. I will try this if I see the problem again.
However, it would be rather strange if this was shell-dependent, as all
other snapshots were happily accessible with csh (and the panic after
trying to unmount the fs is definitely not an expected behaviour
either :-). 

FC Using sh or zsh, these errors don't occur.
FC Just curious if this is the same kind of thing.

I will try it when I see the problem next time.


cu
  Gerrit
