MFC misc/124164 (Add SHA-256/512 hash algorithm to crypt(3)) to stable/8?

2012-02-08 Thread Tim Bishop
Are there any committers willing to merge PR misc/124164 to stable/8
before the 8.3 release freeze? It's already in HEAD and stable/9 so it's
had some testing.

misc/124164 adds support for SHA256/512 to crypt(3). This is something
we make use of on Linux and FreeBSD 9, and it'd be great to have the
same support on FreeBSD 8.
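
For readers less familiar with crypt(3): the algorithm is selected by a prefix in the stored hash string. The sketch below assumes the usual modular-crypt identifiers (`$1$` for MD5, `$2` for Blowfish, `$5$` for SHA-256, `$6$` for SHA-512). `crypt_scheme` is a hypothetical helper name, not something shipped with FreeBSD.

```sh
# crypt_scheme HASH: print the scheme that a crypt(3) hash string appears to use.
# Hypothetical helper; it inspects only the "$id$" prefix of the string.
crypt_scheme() {
    case "$1" in
        '$1$'*) echo md5 ;;
        '$2'*)  echo blowfish ;;
        '$5$'*) echo sha256 ;;
        '$6$'*) echo sha512 ;;
        *)      echo des-or-unknown ;;
    esac
}
```

Running it over the hash field of a password database copied from a Linux or FreeBSD 9 host would show whether any accounts depend on the `$5$` or `$6$` schemes that the merge request is about.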

http://www.freebsd.org/cgi/query-pr.cgi?pr=124164

SVN Revs: 220496 220497

I've tried markm@ already and had no response.

Thanks,

Tim.

-- 
Tim Bishop
http://www.bishnet.net/tim/
PGP Key: 0x5AE7D984
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: kernel debugging and ULE

2012-02-08 Thread Julian Elischer

On 2/7/12 1:50 AM, Andriy Gapon wrote:

on 06/02/2012 07:52 Julian Elischer said the following:

so if I'm sitting still in the debugger for too long, a hardclock
event happens that goes into ULE, which then hits the following KASSERT.


KASSERT(pri >= PRI_MIN_BATCH && pri <= PRI_MAX_BATCH,
    ("sched_priority: invalid priority %d: nice %d, "
    "ticks %d ftick %d ltick %d tick pri %d",
    pri, td->td_proc->p_nice, td->td_sched->ts_ticks,
    td->td_sched->ts_ftick, td->td_sched->ts_ltick,
    SCHED_PRI_TICKS(td->td_sched)));


The reason seems to be that I've been sitting still for too long and things have
become pear shaped.


how is it that being in the debugger doesn't stop hardclock events?
is there something I can do to make them not happen?
It means I have to get my debugging done in less than about 60 seconds.

suggestions welcome.

Does this really happen when you just sit in the debugger?
Or does it happen when you let the kernel run?  Like stepping through the code,
etc


good point.. I was doing some single stepping..



Re: zfs arc and amount of wired memory

2012-02-08 Thread Eugene M. Zheganin

Hi.

On 08.02.2012 02:17, Andriy Gapon wrote:

[output snipped]

Thank you.  I don't see anything suspicious/unusual there.
Just in case, do you have ZFS dedup enabled by any chance?

I think that examination of vmstat -m and vmstat -z outputs may provide some
clues as to what got all that memory wired.


Nope, I don't have deduplication feature enabled.

By the way, today, after eating another 100M of wired memory, this server
hung with an endless stream of these messages:


swap_pager: indefinite wait buffer

Since it's swapping on a zvol, it looks to me like it could be the resource
starvation issue mentioned in another thread here ("Swap on zvol -
recommendable?"); maybe it happens faster when the ARC isn't limited.


So I want to ask: how should I report it, and what should I include in such a PR?

Thanks.
Eugene.


Re: MFC misc/124164 (Add SHA-256/512 hash algorithm to crypt(3)) to stable/8?

2012-02-08 Thread Mark Murray
Tim Bishop writes:
 Are there any committers willing to merge PR misc/124164 to stable/8
 before the 8.3 release freeze? It's already in HEAD and stable/9 so it's
 had some testing.
 
 misc/124164 adds support for SHA256/512 to crypt(3). This is something
 we make use of on Linux and FreeBSD 9, and it'd be great to have the
 same support on FreeBSD 8.
 
 http://www.freebsd.org/cgi/query-pr.cgi?pr=124164
 
 SVN Revs: 220496 220497
 
 I've tried markm@ already and had no response.

Apologies - I'll get to it ASAP.

M
--
Mark R V Murray
Cert APS(Open) Dip Phys(Open) BSc Open(Open) BSc(Hons)(Open)
Pi: 132511160



Re: zfs arc and amount of wired memory

2012-02-08 Thread Alexander Leidinger
On Wed, 08 Feb 2012 16:31:44 +0600 Eugene M. Zheganin
e...@norma.perm.ru wrote:

 swap_pager: indefinite wait buffer
 
 Since it's swapping on zvol, it looks to me like it could be the 
 mentioned in another thread here (Swap on zvol - recommendable?) 
 resource starvation issue; may be it happens faster when the ARC
 isn't limited.
 
 So I want to ask - how to report it and what should I include in such
 pr ?

I can't remember having seen any mention of SWAP on ZFS being safe
now. So unless somebody can point to a reference saying that
the problems with SWAP on ZFS are fixed:
 1. do not use SWAP on ZFS
 2. see 1.
 3. check if you see the same problem without SWAP on ZFS (btw. see 1.)
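
For step 3, it helps to confirm first which swap devices are actually backed by ZFS. A minimal sketch, assuming ZFS volumes appear under `/dev/zvol/` and that `swapinfo` lists the device path in its first column. `zvol_swap_devices` is a hypothetical helper name.

```sh
# zvol_swap_devices: read `swapinfo` output on stdin and print every swap
# device that lives under /dev/zvol/ (where FreeBSD exposes ZFS volumes).
zvol_swap_devices() {
    awk '$1 ~ /^\/dev\/zvol\// { print $1 }'
}

# Typical use on the affected host (not run here):
#   swapinfo | zvol_swap_devices
```

An empty result would suggest that no active swap device is a zvol, which is the configuration to retest with.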

Bye,
Alexander.


-- 
http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org   netchild @ FreeBSD.org  : PGP ID = 72077137


i18n not working during startup

2012-02-08 Thread Victor Balada Diaz
Hello,

I tried freebsd-i18n but no one answered, so I will try my luck here. Sorry
to the people who are subscribed to both lists.

- Forwarded message from Victor Balada Diaz vic...@bsdes.net -

Date: Thu, 2 Feb 2012 19:17:21 +0100
From: Victor Balada Diaz vic...@bsdes.net
To: freebsd-i...@freebsd.org
Subject: i18n not working during startup
User-Agent: Mutt/1.5.21 (2010-09-15)

Hello,

I've set up login classes as the handbook recommends, but it seems that daemons
started by rc at system boot don't use them. What I'm actually trying to do is
configure tomcat to use UTF-8 by default. I've configured its user class in
/etc/login.conf by adding:

:setenv=LC_ALL=en_US.UTF-8:\
:lang=en_US.UTF-8:\

I rebuilt the login.conf db and tried rebooting. The daemons don't seem to have
LANG or LC_ALL set in their environment. As a workaround I thought about adding
export lines at the start of /etc/rc.conf, but that's an ugly hack.

Is there any other way of setting up lang settings for system startup daemons?

FreeBSD version: 7.4
Arch: amd64

Thanks a lot.
Regards
-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.

- End forwarded message -

-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: i18n not working during startup

2012-02-08 Thread Gala IT
Hi Victor,

Try setting tomcat7_java_opts="-Dfile.encoding=UTF-8" in /etc/rc.conf.

It works for us under 8.2.
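
The setting above is specific to the JVM. For daemons with no comparable option, one generic possibility is to wrap the start command in env(1) so that only that process sees the locale. This is a minimal sketch: `run_with_utf8` is a hypothetical helper name, and whether a given rc.d script offers a place to hook in such a wrapper is an assumption to check per port.

```sh
# run_with_utf8 CMD [ARGS...]: run CMD with a UTF-8 locale in its environment.
# Hypothetical helper; en_US.UTF-8 is the locale named in the original post.
run_with_utf8() {
    env LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 "$@"
}
```

Unlike export lines at the top of /etc/rc.conf, this changes the environment of one command only and leaves every other rc script untouched.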

Kind regards,
David.



Re: FreeBSD 8.2-stable: devd fails to restart

2012-02-08 Thread Torfinn Ingolfsen
On Tue, 07 Feb 2012 12:16:15 -0700 (MST)
Warren Block wbl...@wonkity.com wrote:

 
 It's devd, IMO.  Hey, come to think of it, I did enter a PR, the one 
 above.  If this is still a problem in 9 (which I can test in a bit), 
 posting to -current might get some needed attention on it.

PR updated.
-- 
Torfinn



Re: zfs arc and amount of wired memory

2012-02-08 Thread Eugene M. Zheganin

Hi.

On 08.02.2012 18:15, Alexander Leidinger wrote:

I can't remember to have seen any mention of SWAP on ZFS being safe
now. So if nobody can provide a reference to a place which tells that
the problems with SWAP on ZFS are fixed:
  1. do not use SWAP on ZFS
  2. see 1.
  3. check if you see the same problem without SWAP on ZFS (btw. see 1.)

So, if swap has to be used, and it has to be backed by something like
gmirror so that it won't go down with one of the disks, there's no need
to use ZFS for the system.


That makes ZFS useful only when you need to store a couple of terabytes
or more, while still keeping the OS on UFS. Occam's razor and so on.


Thanks for the explanation.
Eugene.


Re: zfs arc and amount of wired memory

2012-02-08 Thread Freddie Cash
On Wed, Feb 8, 2012 at 10:25 AM, Eugene M. Zheganin e...@norma.perm.ru wrote:
 On 08.02.2012 18:15, Alexander Leidinger wrote:
 I can't remember to have seen any mention of SWAP on ZFS being safe
 now. So if nobody can provide a reference to a place which tells that
 the problems with SWAP on ZFS are fixed:
  1. do not use SWAP on ZFS
  2. see 1.
  3. check if you see the same problem without SWAP on ZFS (btw. see 1.)

 So, if a swap have to be used, and, it has to be backed up with something
 like gmirror so it won't come down with one of the disks, there's no need to
 use zfs for system.

 This makes zfs only useful in cases where you need to store something on a
 couple+ of terabytes, still having OS on ufs. Occam's razor and so on.

Or, you plug a USB stick into the back (or even inside the case as a
lot of mobos have internal USB connectors now) and use that for swap.

-- 
Freddie Cash
fjwc...@gmail.com


Re: zfs arc and amount of wired memory

2012-02-08 Thread Freddie Cash
On Wed, Feb 8, 2012 at 10:40 AM, Freddie Cash fjwc...@gmail.com wrote:
 On Wed, Feb 8, 2012 at 10:25 AM, Eugene M. Zheganin e...@norma.perm.ru 
 wrote:
 On 08.02.2012 18:15, Alexander Leidinger wrote:
 I can't remember to have seen any mention of SWAP on ZFS being safe
 now. So if nobody can provide a reference to a place which tells that
 the problems with SWAP on ZFS are fixed:
  1. do not use SWAP on ZFS
  2. see 1.
  3. check if you see the same problem without SWAP on ZFS (btw. see 1.)

 So, if a swap have to be used, and, it has to be backed up with something
 like gmirror so it won't come down with one of the disks, there's no need to
 use zfs for system.

 This makes zfs only useful in cases where you need to store something on a
 couple+ of terabytes, still having OS on ufs. Occam's razor and so on.

 Or, you plug a USB stick into the back (or even inside the case as a
 lot of mobos have internal USB connectors now) and use that for swap.

That also works well for adding L2ARC (cache) to the ZFS pool.

-- 
Freddie Cash
fjwc...@gmail.com


Re: zfs arc and amount of wired memory

2012-02-08 Thread Andriy Gapon
on 08/02/2012 12:31 Eugene M. Zheganin said the following:
 Hi.
 
 On 08.02.2012 02:17, Andriy Gapon wrote:
 [output snipped]

 Thank you.  I don't see anything suspicious/unusual there.
 Just in case, do you have ZFS dedup enabled by any chance?

 I think that examination of vmstat -m and vmstat -z outputs may provide some
 clues as to what got all that memory wired.

 Nope, I don't have deduplication feature enabled.

OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?

 By the way, today, after eating another 100M of wired memory this server 
 hanged
 out with multiple non-stopping messages
 
 swap_pager: indefinite wait buffer
 
 Since it's swapping on zvol, it looks to me like it could be the mentioned in
 another thread here (Swap on zvol - recommendable?) resource starvation 
 issue;
 may be it happens faster when the ARC isn't limited.

It may well be that swap on zvol doesn't work well when the
kernel itself is starved of memory.

 So I want to ask - how to report it and what should I include in such pr ?

I am leaving the swap-on-zvol issue aside.  Your original problem doesn't seem to
be ZFS-related.  I suspect that you might be running into some kernel memory leak.
If you manage to reproduce the high wired value again, then vmstat -m and
vmstat -z may provide some useful information.

In this vein, do you use any out-of-tree kernel modules?
Also, can you try to monitor your system to see when wired count grows?
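
The monitoring suggested here could be done with a small sampling loop. A minimal sketch, assuming the wired page count is read from the `vm.stats.vm.v_wire_count` sysctl and the page size from `hw.pagesize`. `wired_mib`, the 60-second interval and the log path are illustrative choices, not anything specified in the thread.

```sh
# wired_mib PAGES PAGESIZE: convert a wired page count to whole MiB.
wired_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Possible sampling loop on the affected host (not run here; the interval
# and the log path are arbitrary choices):
#   while :; do
#       pages=$(sysctl -n vm.stats.vm.v_wire_count)
#       size=$(sysctl -n hw.pagesize)
#       echo "$(date '+%F %T') wired=$(wired_mib "$pages" "$size")MiB"
#       sleep 60
#   done >> /var/tmp/wired.log
```

Comparing the timestamps in such a log with cron jobs or backup windows may help narrow down what is running when the wired count grows.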

-- 
Andriy Gapon


Re: zfs arc and amount of wired memory

2012-02-08 Thread Jeremy Chadwick
On Wed, Feb 08, 2012 at 10:29:36PM +0200, Andriy Gapon wrote:
 on 08/02/2012 12:31 Eugene M. Zheganin said the following:
  Hi.
  
  On 08.02.2012 02:17, Andriy Gapon wrote:
  [output snipped]
 
  Thank you.  I don't see anything suspicious/unusual there.
  Just in case, do you have ZFS dedup enabled by any chance?
 
  I think that examination of vmstat -m and vmstat -z outputs may provide 
  some
  clues as to what got all that memory wired.
 
  Nope, I don't have deduplication feature enabled.
 
 OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?

Andriy,

Politely -- recommending this to a user is a good choice of action, but
the problem is that no user, even an experienced one, is going to know
what all of the "Types" (vmstat -m) or "ITEMs" (vmstat -z) correlate
with on the system.

For example, for vmstat -m, the Type name is "solaris".  For vmstat -z,
the ITEMs are named zio_*, but I have a feeling there are more than just
those which pertain to ZFS.  I'm having to make *assumptions*.

The FreeBSD VM is highly complex and is not easy to understand even
remotely.  It becomes more complex when you consider that we use terms
like "wired", "active", "inactive", "cache", and "free" -- and none of
them, in plain English terms, actually means what the word chosen
suggests.

Furthermore, the only explanation I've been able to find over the years
of how any of these work, what they do/mean, etc. is here:

http://www.freebsd.org/doc/en/books/arch-handbook/vm.html

And this piece of documentation is only useful for people who understand
VMs (note: it was written by Matt Dillon, for example).  It is not
useful for end-users trying to track down what within the kernel is
actually eating up memory.  vmstat -m is about as good as it's going to
get, and like I said, with the Type names being borderline ambiguous
(depending on what you're looking for -- with VFS and so on it's spread
all over the place), this becomes a very tedious task, where the user or
admin has to continually ask developers on the mailing lists what it is
they're looking at.

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |



siisch1: Error while READ LOG EXT

2012-02-08 Thread Mike Tancsa
I have a 4 port eSata PCIe card with 3 external port multipliers attached on an 
AMD64 box (8G of RAM), RELENG8 from Feb1st.

siis0@pci0:5:0:0:   class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'PCI-X to Serial ATA Controller (SiI 3124)'
    class      = mass storage
    subclass   = RAID
    bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
    bar   [18] = type Memory, range 64, base 0xb440, size 32768, enabled
    bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
    cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split transactions
    cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message

siis0: <SiI3124 SATA controller> port 0x3000-0x300f mem 0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5
siis0: [ITHREAD]
siisch0: <SIIS channel> at channel 0 on siis0
siisch0: [ITHREAD]
siisch1: <SIIS channel> at channel 1 on siis0
siisch1: [ITHREAD]
siisch2: <SIIS channel> at channel 2 on siis0
siisch2: [ITHREAD]
siisch3: <SIIS channel> at channel 3 on siis0
siisch3: [ITHREAD]

# camcontrol devlist
<WDC WD2001FASS-00U0B0 01.00101>   at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD2001FASS-00U0B0 01.00101>   at scbus0 target 1 lun 0 (pass1,ada1)
<WDC WD2001FASS-00U0B0 01.00101>   at scbus0 target 2 lun 0 (pass2,ada2)
<WDC WD2001FASS-00U0B0 01.00101>   at scbus0 target 3 lun 0 (pass3,ada3)
<Port Multiplier 47261095 1f06>    at scbus0 target 15 lun 0 (pass4,pmp1)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 0 lun 0 (pass5,ada4)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 1 lun 0 (pass6,ada5)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 2 lun 0 (pass7,ada6)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 3 lun 0 (pass8,ada7)
<WDC WD2002FAEX-007BA0 05.01D05>   at scbus1 target 4 lun 0 (pass9,ada8)
<Port Multiplier 37261095 1706>    at scbus1 target 15 lun 0 (pass10,pmp0)
<Areca usrvar R001>                at scbus4 target 0 lun 0 (pass11,da0)
<Areca backup1 R001>               at scbus4 target 0 lun 1 (pass12,da1)
<Areca RAID controller R001>       at scbus4 target 16 lun 0 (pass13)
<AMCC 9650SE-2LP DISK 4.10>        at scbus5 target 0 lun 0 (pass14,da2)
<ST31000333AS SD35>                at scbus6 target 0 lun 0 (pass15,ada9)
<ST31000528AS CC35>                at scbus7 target 0 lun 0 (pass16,ada10)
<ST31000340AS SD1A>                at scbus8 target 0 lun 0 (pass17,ada11)
<WDC WD1002FAEX-00Z3A0 05.01D05>   at scbus11 target 0 lun 0 (pass18,ada12)


Ever since I added a new PM, I have been seeing a new error (READ LOG EXT)
along with the odd slot timeout error.


Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:33:52 backup3 last message repeated 2 times
Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:50:31 backup3 last message repeated 2 times
Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 04:04:05 backup3 kernel: siisch1: Error while READ LOG EXT


smartctl doesn't show any issues on the drives other than one that has some
historical errors from a while ago.  What are these errors, and do I need to
worry about them?  The READ LOG EXT ones are new.


This is the only drive with anything 

Re: zfs arc and amount of wired memory

2012-02-08 Thread Andriy Gapon
on 08/02/2012 22:50 Jeremy Chadwick said the following:
 Politely -- recommending this to a user is a good choice of action, but
 the problem is that no user, even an experienced user, is going to know
 what all of the Types (vmstat -m) or ITEMs (vmstat -z) correlate
 with on the system.

I see no problem with users sharing the output and asking for help interpreting
it.  I do not know of any easier way to analyze problems like this one.
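
Before sharing the output, it can help to rank it so the largest consumers stand out. A minimal sketch, assuming `vmstat -m` prints the columns Type, InUse, MemUse, HighUse, Requests, Size(s) with MemUse written as a number followed by `K`. `top_malloc_types` is a hypothetical helper name.

```sh
# top_malloc_types [N]: read `vmstat -m` output on stdin and print the N
# largest malloc types (default 10) as "KiB<TAB>Type", biggest first.
# Type names may contain spaces, so the MemUse column is located by its
# trailing "K" rather than by a fixed field number.
top_malloc_types() {
    awk '
        {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^[0-9]+K$/) {
                    name = $1
                    for (j = 2; j <= i - 2; j++) name = name " " $j
                    kb = $i
                    sub(/K$/, "", kb)
                    print kb "\t" name
                    break
                }
            }
        }
    ' | sort -rn | head -n "${1:-10}"
}

# Typical use on the affected host (not run here):
#   vmstat -m | top_malloc_types 15
```

Saving this ranking at intervals and comparing the snapshots would show which Type is growing, which is a more specific question to put to the list than the raw dump.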

-- 
Andriy Gapon


Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Jeremy Chadwick
On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:
 I have a 4 port eSata PCIe card with 3 external port multipliers attached on 
 an AMD64 box (8G of RAM), RELENG8 from Feb1st.
 
 Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
 along with a the odd slot timeout error.
 
 
 Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
 Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
 Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
 Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
 Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
 Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
 Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
 Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
 Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
 Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
 Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
 Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068

This indicates the controller on channel 1 (siisch1) is stalled
waiting for underlying communication with the device attached to it.

 Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 01:33:52 backup3 last message repeated 2 times
 Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 01:50:31 backup3 last message repeated 2 times
 Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
 Feb  8 04:04:05 backup3 

Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Jeremy Chadwick
On Wed, Feb 08, 2012 at 01:27:23PM -0800, Jeremy Chadwick wrote:
 On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:
  Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
  along with a the odd slot timeout error.

BTW, something I forgot to cover in my reply: the slot number shown in
the output (e.g. "Timeout on slot NN") has nothing to do with port
number, connector, or anything like that.  It's an internal
controller feature; AHCI offers the same thing.  I did a rudimentary
analysis of this back in April 2011 by reviewing the code, and posted a
small (semi-technical) write-up on it:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-April/011197.html

Taken from my post at that time, which is what I want to relay
here: "Timeout on slot N" != SATA port N.  Two unrelated things.
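
To make the slot idea concrete, the "waiting for slots" value can be read as a bitmask of outstanding command slots. This is a sketch under two loudly stated assumptions that the thread itself does not confirm: that the driver prints the mask as eight hex digits, and that the archive shortened values such as 47000000 to 4700. `decode_slots` is a hypothetical helper name.

```sh
# decode_slots HEXMASK: print the bit positions set in a 32-bit hex mask.
# ASSUMPTION: each set bit stands for one outstanding command slot.
decode_slots() {
    mask=$(( 0x$1 ))
    out=""
    i=0
    while [ "$i" -lt 32 ]; do
        if [ $(( ($mask >> $i) & 1 )) -eq 1 ]; then
            out="${out:+$out }$i"
        fi
        i=$(( i + 1 ))
    done
    echo "$out"
}
```

Under those assumptions, `decode_slots 47000000` yields slots 24, 25, 26 and 30, which would be consistent with the four timeouts reported in the log (slots 26, 30, 25 and 24) and says nothing about which SATA port or drive was involved.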

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |



Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Alexander Motin

On 08.02.2012 23:27, Jeremy Chadwick wrote:

On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:

I have a 4 port eSata PCIe card with 3 external port multipliers attached on an 
AMD64 box (8G of RAM), RELENG8 from Feb1st.


Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) 
along with a the odd slot timeout error.


Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068


This indicates the controller on channel 1 (siisch1) is stalled
waiting for underlying communication with the device attached to it.


Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:33:52 backup3 last message repeated 2 times
Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:50:31 backup3 last message repeated 2 times
Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:16:28 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:36:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 04:04:05 backup3 kernel: 

Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Jeremy Chadwick
On Thu, Feb 09, 2012 at 12:22:40AM +0200, Alexander Motin wrote:
 On 08.02.2012 23:27, Jeremy Chadwick wrote:
 On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:
 ...

Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Alexander Motin

On 09.02.2012 00:38, Jeremy Chadwick wrote:

On Thu, Feb 09, 2012 at 12:22:40AM +0200, Alexander Motin wrote:

On 08.02.2012 23:27, Jeremy Chadwick wrote:

On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:

...

Re: zfs arc and amount of wired memory

2012-02-08 Thread Miroslav Lachman

Andriy Gapon wrote:

on 08/02/2012 12:31 Eugene M. Zheganin said the following:

Hi.

On 08.02.2012 02:17, Andriy Gapon wrote:

[output snipped]

Thank you.  I don't see anything suspicious/unusual there.
Just in case, do you have ZFS dedup enabled by any chance?

I think that examination of vmstat -m and vmstat -z outputs may provide some
clues as to what got all that memory wired.


Nope, I don't have deduplication feature enabled.


OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?


By the way, today, after eating another 100M of wired memory, this server hung
with a non-stop stream of messages

swap_pager: indefinite wait buffer

Since it's swapping on a zvol, it looks to me like it could be the resource
starvation issue mentioned in another thread here (Swap on zvol - recommendable?);
maybe it happens faster when the ARC isn't limited.


It could very well be that swap on zvol doesn't work well when the
kernel itself is starved of memory.


So I want to ask: how do I report it, and what should I include in such a PR?


I am leaving the swap-on-zvol issue aside.  Your original problem doesn't seem to be
ZFS-related.  I suspect that you might be running into a kernel memory leak.
If you manage to reproduce the high wired value again, then vmstat -m and
vmstat -z may provide some useful information.

In this vein, do you use any out-of-tree kernel modules?
Also, can you try to monitor your system to see when wired count grows?
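
One way to watch for that growth is to sample the wired page count at intervals and log it. The following is only a minimal sketch in Python, assuming the FreeBSD sysctls vm.stats.vm.v_wire_count and hw.pagesize; the function names and the 60-second interval are illustrative and not something suggested in this thread.

```python
import subprocess
import time


def sysctl_int(name):
    """Return an integer sysctl value by running `sysctl -n <name>`."""
    out = subprocess.run(["sysctl", "-n", name], capture_output=True,
                         text=True, check=True).stdout
    return int(out.strip())


def wired_mib(wire_pages, page_size):
    """Convert a wired page count into MiB."""
    return wire_pages * page_size / (1024 * 1024)


def watch_wired(interval=60, samples=5, read=sysctl_int, sleep=time.sleep):
    """Yield (unix timestamp, wired MiB) pairs, one sample per interval.

    `read` and `sleep` are injectable so the loop can be exercised
    without a FreeBSD host.
    """
    page_size = read("hw.pagesize")
    for i in range(samples):
        if i:
            sleep(interval)
        pages = read("vm.stats.vm.v_wire_count")
        yield time.time(), wired_mib(pages, page_size)
```

On the affected machine, something like `for ts, mib in watch_wired(): print(int(ts), mib)` could run from a terminal or cron job, and a jump in the logged value would show when to capture vmstat -m and vmstat -z for comparison.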


I am seeing something similar on one of our machines. It is an old 7.3 
system with ZFS v13, which is why I did not report it.


The machine is used as storage for backups made by rsync. Everything runs 
fine for about 107 days. Then the backups get slower and slower because of 
some strange memory situation.


Mem: 15M Active, 17M Inact, 3620M Wired, 420K Cache, 48M Buf, 1166M Free

ARC Size:
         Current Size:             1769 MB (arcsize)
         Target Size (Adaptive):   512 MB (c)
         Min Size (Hard Limit):    512 MB (zfs_arc_min)
         Max Size (Hard Limit):    3584 MB (zfs_arc_max)

The target size goes down to the min size, and after a few more days 
the system is so slow that I must reboot the machine. Then it runs 
fine for about 107 days, and then it all repeats again.


You can see more on the MRTG graphs
http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
You can find links to other useful information at the top of the page 
(arc_summary, top, dmesg, fs usage, loader.conf)


There you can see the nightly backups (higher CPU load starting at 01:13); 
otherwise the machine is idle.


This corresponds with the ARC target size dropping over the last 5 days
http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html

and with the ARC metadata cache exceeding its limit over the last 5 days
http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html

I don't know what's going on, and I don't know whether it is something known / 
fixed in newer releases. We are running a few more ZFS systems on 8.2 
without this issue, but those systems are in different roles.


Miroslav Lachman


Re: zfs arc and amount of wired memory

2012-02-08 Thread Jeremy Chadwick
On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
 ...

This sounds like the... damn, what is it called... some kind of internal
counter or ticks thing within the ZFS code that was discovered to
only begin happening after a certain period of time (which correlated to
some number of days, possibly 107).  I'm sorry that I can't be more
specific, but it's been discussed heavily on the lists in the past, and
fixes for all of that were committed to RELENG_8.  I wish I could
remember the name of the function or macro or variable name it pertained
to, something like LTHAW or TLOCK or something like that.  I would say
I don't know why I can't remember, but I do know why I can't remember:
because I gave up trying to track all of these problems.

Does someone else remember this issue?  CC'ing Martin who might remember
for certain.

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |



Re: zfs arc and amount of wired memory

2012-02-08 Thread Artem Belevich
On Wed, Feb 8, 2012 at 4:28 PM, Jeremy Chadwick
free...@jdc.parodius.com wrote:
 On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
...

 This sounds like the... damn, what is it called... some kind of internal
 counter or ticks thing within the ZFS code that was discovered to
 only begin happening after a certain period of time (which correlated to
 some number of days, possibly 107).  I'm sorry that I can't be more
 specific, but it's been discussed heavily on the lists in the past, and
 fixes for all of that were committed to RELENG_8.  I wish I could
 remember the name of the function or macro or variable name it pertained
 to, something like LTHAW or TLOCK or something like that.  I would say
 I don't know why I can't remember, but I do know why I can't remember:
 because I gave up trying to track all of these problems.

 Does someone else remember this issue?  CC'ing Martin who might remember
 for certain.

It's LBOLT. :-)

And there was more than one related integer overflow. One of them
manifested itself as the L2ARC feeding thread hogging CPU time after about
a month of uptime. Another one caused an issue with ARC reclaim after 107
days. See more details in this thread:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html
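
Both time spans are consistent with simple overflow arithmetic. The sketch below is only an illustration under stated assumptions: hz = 1000, a tick value computed as (uptime in nanoseconds * hz) / 10^9 in signed 64-bit arithmetic for the 107-day case, and a signed 32-bit tick counter for the one-month case. None of these definitions are quoted in this thread, so treat them as hypothetical and check the linked discussion for the real ones.

```python
NANOSEC = 1_000_000_000      # nanoseconds per second
SECONDS_PER_DAY = 86_400
INT64_MAX = 2**63 - 1
INT32_MAX = 2**31 - 1


def days_until_int64_product_overflow(hz):
    """Uptime in days at which (nanoseconds * hz) no longer fits in int64."""
    max_nanoseconds = INT64_MAX // hz
    return max_nanoseconds / NANOSEC / SECONDS_PER_DAY


def days_until_int32_tick_overflow(hz):
    """Uptime in days at which a tick count at `hz` ticks/s exceeds int32."""
    return INT32_MAX / hz / SECONDS_PER_DAY


print(round(days_until_int64_product_overflow(1000), 2))  # 106.75
print(round(days_until_int32_tick_overflow(1000), 2))     # 24.86
```

With a lower hz both limits move out proportionally, which would explain why not every system hits the problem at the same uptime.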

--Artem


Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Mike Tancsa
On 2/8/2012 4:27 PM, Jeremy Chadwick wrote:
 
 This indicates the controller on channel 1 (siisch1) is stalled
 waiting for underlying communication with the device attached to it.

Hi,
But which device? The PM itself, or the disks behind it? And which 
disk?
 
 
 This is almost certainly a lower level problem with the disk that cannot
 be addressed/solved via normal means.  Thus, my recommendation is to
 replace the disk.

I would gladly replace it if I knew which one :)
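
One way to narrow down the candidates is to list everything camcontrol reports on the bus in question. This is a rough sketch in Python that parses `camcontrol devlist` text; it assumes that siisch1 corresponds to scbus1, which the listing suggests but does not prove, and the function name and sample excerpt are purely illustrative.

```python
import re

# Matches the tail of a `camcontrol devlist` line, for example
# "... at scbus1 target 0 lun 0 (pass5,ada4)".
DEVLIST_RE = re.compile(r"at scbus(\d+) target (\d+) lun (\d+) \(([^)]*)\)")


def devices_on_bus(devlist_text, bus):
    """Return (target, [device names]) for every entry on the given scbus."""
    found = []
    for line in devlist_text.splitlines():
        match = DEVLIST_RE.search(line)
        if match and int(match.group(1)) == bus:
            found.append((int(match.group(2)), match.group(4).split(",")))
    return found


# Excerpt of the devlist output posted earlier in the thread.
SAMPLE = """\
WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 0 lun 0 (pass5,ada4)
WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 1 lun 0 (pass6,ada5)
Port Multiplier 37261095 1706 at scbus1 target 15 lun 0 (pass10,pmp0)
ST31000333AS SD35 at scbus6 target 0 lun 0 (pass15,ada9)
"""

for target, names in devices_on_bus(SAMPLE, 1):
    print(target, names)
```

Under that assumption, the disks to look at with smartctl -x would be the adaN devices the function returns for bus 1, rather than disks on other buses.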

 
 Regarding the repeated errors at semi-regular (but not entirely)
 intervals: are you using smartd?  Do you have a cronjob that issues
 smartctl -a or smartctl -x commands at intervals?  I imagine any of
 these could be tickling something lower level.

I don't have smartd running. The box takes a lot of backups as well as a constant 
stream of netflow data, so there are a lot of writes to it.
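
To see how regular the READ LOG EXT errors really are, the gaps between them can be computed from the syslog timestamps. A small sketch in Python follows; the sample lines are copied from the log excerpt posted earlier, the year 2012 is assumed because syslog does not record it, and "last message repeated" lines are simply ignored.

```python
from datetime import datetime


def read_log_ext_times(lines, year=2012):
    """Return the timestamps of all 'Error while READ LOG EXT' lines."""
    times = []
    for line in lines:
        if "Error while READ LOG EXT" in line:
            # The first 15 characters hold the stamp, e.g. "Feb  7 23:57:59".
            stamp = f"{year} {line[:15]}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_seconds(times):
    """Return the number of seconds between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


SAMPLE = [
    "Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT",
    "Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT",
    "Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT",
    "Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT",
]

print(gaps_in_seconds(read_log_ext_times(SAMPLE)))  # [937, 497, 23]
```

Gaps that vary this much would fit errors driven by the write load rather than by a periodic job.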

 
 Also, please upgrade your smartmontools to 5.42.  It does provide some
 further enhancements that are useful.
 

Done.


# smartctl -x /dev/ada9
smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-STABLE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31000333AS
Serial Number:    9TE14SRV
LU WWN Device Id: 5 000c50 010a39664
Firmware Version: SD35
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Feb  8 20:00:47 2012 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207931
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207951
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=207957

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  617) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off
                                        support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 203) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   112   099   006    -    44490692
  3 Spin_Up_Time            PO----   093   092   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    68
  5 Reallocated_Sector_Ct   PO--CK   100   100   036    -    2
  7 Seek_Error_Rate         POSR--   088   060   030    -    791764702
  9 Power_On_Hours          -O--CK   075   075   000    -    22759
 10 Spin_Retry_Count        PO--C-   100   100   097    -    2
 12 Power_Cycle_Count       -O--CK   100   100   020    -    68
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   095   095   000    -    5
188 Command_Timeout         -O--CK   100   100   000    -    0

Re: zfs arc and amount of wired memory

2012-02-08 Thread Miroslav Lachman

Artem Belevich wrote:

On Wed, Feb 8, 2012 at 4:28 PM, Jeremy Chadwick
free...@jdc.parodius.com  wrote:

On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:

...


This sounds like the... damn, what is it called... some kind of internal
counter or ticks thing within the ZFS code that was discovered to
only begin happening after a certain period of time (which correlated to
some number of days, possibly 107).  I'm sorry that I can't be more
specific, but it's been discussed heavily on the lists in the past, and
fixes for all of that were committed to RELENG_8.


Thank you for your quick response. I am glad that it is fixed in 8.x, so 
I will upgrade this last old machine in a few weeks. :)



 I wish I could
remember the name of the function or macro or variable name it pertained
to, something like LTHAW or TLOCK or something like that.  I would say
I don't know why I can't remember, but I do know why I can't remember:
because I gave up trying to track all of these problems.

Does someone else remember this issue?  CC'ing Martin who might remember
for certain.


It's LBOLT. :-)

And there was more than one related integer overflow. One of them
manifested itself as the L2ARC feeding thread hogging CPU time after about
a month of uptime. Another one caused an issue with ARC reclaim after 107
days. See more details in this thread:

http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html


Yes, it is exactly this problem. Thank you for the link to this thread. 
I am subscribed to freebsd-fs@ and I am reading it almost daily, but I 
missed this one!


Thanks to both of you!

Miroslav Lachman


Re: zfs arc and amount of wired memory

2012-02-08 Thread Charles Sprickman

On Feb 8, 2012, at 7:11 PM, Miroslav Lachman wrote:

 ...
 
 It corresponds with the ARC target size dropping over the last 5 days
 http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
 
 And with the ARC metadata cache exceeding its limit over the last 5 days
 http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html

I'm not having luck finding it, but there's some known issue that exists even 
in 8.2 where some 32-bit counter overflows or something. I don't truly remember 
the logic in it, but when you hit it, it's around 110 days or so.  Before it 
gets really bad (to the point where you either reboot or get some memory 
exhaustion panic), you can see zfs evict skips incrementing rapidly.  Looking 
at that graph, that would be my guess as to what's happening to you.  It's easy 
to check - run one of the arc stats scripts, look for evict_skips, note the 
number and then run it a few minutes later.  If it increases by more than a few 
hundred, you've hit the bug.  You'll find at that point the kernel is no longer 
evicting ARC from the kernel and it will just continue to grow until bad 
things happen.
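
The check described above (sample the counter, wait a few minutes, sample again) could be scripted roughly as follows. This is only a sketch: the sysctl name kstat.zfs.misc.arcstats.evict_skip and the threshold of 300 are assumptions made for illustration, not values taken from this thread, so adjust both to what your system actually reports.

```python
import re
import subprocess

# Assumption: the counter is exposed under this sysctl name on a FreeBSD
# system with ZFS loaded. Change it if your system uses a different name.
EVICT_SKIP_OID = "kstat.zfs.misc.arcstats.evict_skip"


def parse_counter(sysctl_output: str) -> int:
    """Extract the integer from a 'name: value' line as printed by sysctl."""
    match = re.search(r":\s*(\d+)\s*$", sysctl_output.strip())
    if match is None:
        raise ValueError(f"unexpected sysctl output: {sysctl_output!r}")
    return int(match.group(1))


def read_evict_skip() -> int:
    """Read the current counter value by running sysctl (FreeBSD only)."""
    result = subprocess.run(
        ["sysctl", EVICT_SKIP_OID],
        capture_output=True, text=True, check=True,
    )
    return parse_counter(result.stdout)


def looks_stuck(before: int, after: int, threshold: int = 300) -> bool:
    """True if the counter grew by more than `threshold` between two samples.

    The default threshold is a hypothetical stand-in for "more than a few
    hundred"; it is not a documented limit.
    """
    return (after - before) > threshold
```

On the affected machine one would call read_evict_skip(), sleep for a few minutes, call it again, and pass both values to looks_stuck().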

Charles

 
 I don't know what's going on and I don't know if it is something known / fixed
 in newer releases. We are running a few more ZFS systems on 8.2 without this
 issue. But those systems are in different roles.
 
 Miroslav Lachman


Re: zfs arc and amount of wired memory

2012-02-08 Thread Charles Sprickman

On Feb 8, 2012, at 7:43 PM, Artem Belevich wrote:

 On Wed, Feb 8, 2012 at 4:28 PM, Jeremy Chadwick
 free...@jdc.parodius.com wrote:
 On Thu, Feb 09, 2012 at 01:11:36AM +0100, Miroslav Lachman wrote:
 ...
 ARC Size:
  Current Size: 1769 MB (arcsize)
  Target Size (Adaptive):   512 MB (c)
  Min Size (Hard Limit):512 MB (zfs_arc_min)
  Max Size (Hard Limit):3584 MB (zfs_arc_max)
 
 The target size goes down to the min size, and after a few more
 days the system is so slow that I must reboot the machine. Then it
 runs fine for about 107 days and it all repeats again.
 
 You can see more on MRTG graphs
 http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/
 You can see links to other useful information at the top of the page
 (arc_summary, top, dmesg, fs usage, loader.conf)
 
 There you can see nightly backups (higher CPU load started at
 01:13), otherwise the machine is idle.
 
 It corresponds with the ARC target size dropping over the last 5 days
 http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_arcstats_size.html
 
 And with the ARC metadata cache exceeding its limit over the last 5 days
 http://freebsd.quip.cz/ext/2012/2012-02-08-kiwi-mrtg-12-15/local_zfs_vfs_meta.html
 
 I don't know what's going on and I don't know if it is something
 known / fixed in newer releases. We are running a few more ZFS
 systems on 8.2 without this issue. But those systems are in
 different roles.
 
 This sounds like the... damn, what is it called... some kind of internal
 counter or ticks thing within the ZFS code that was discovered to
 only begin happening after a certain period of time (which correlated to
 some number of days, possibly 107).  I'm sorry that I can't be more
 specific, but it's been discussed heavily on the lists in the past, and
 fixes for all of that were committed to RELENG_8.  I wish I could
 remember the name of the function or macro or variable name it pertained
 to, something like LTHAW or TLOCK or something like that.  I would say
 I don't know why I can't remember, but I do know why I can't remember:
 because I gave up trying to track all of these problems.
 
 Does someone else remember this issue?  CC'ing Martin who might remember
 for certain.
 
 It's LBOLT. :-)
 
 And there was more than one related integer overflow. One of them
 manifested itself as the L2ARC feeding thread hogging CPU time after about
 a month of uptime. Another one caused an issue with ARC reclaim after 107
 days. See more details in this thread:
 
 http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011584.html

This would be an excellent piece of information to have on one of the ZFS
wiki pages.  The 107 day issue exists post-8.2, correct?  Anyone on this 
cc: list have permissions to edit those pages?
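
The two intervals mentioned above (about a month, and 107 days) are consistent with simple integer-overflow arithmetic. The sketch below is only an illustration under assumptions that are not stated in this thread: a clock rate of hz = 1000, a signed 64-bit intermediate holding a nanoseconds-times-hz product, and a signed 32-bit tick counter. The linked freebsd-fs thread is the place to check what the code actually did.

```python
INT64_MAX = 2**63 - 1
INT32_MAX = 2**31 - 1
NANOSEC = 1_000_000_000
SECONDS_PER_DAY = 86_400


def days_until_ns_times_hz_overflows(hz: int) -> float:
    """Uptime in days at which (uptime_ns * hz) no longer fits in a signed 64-bit integer."""
    uptime_ns = INT64_MAX // hz
    return uptime_ns / NANOSEC / SECONDS_PER_DAY


def days_until_32bit_tick_overflows(hz: int) -> float:
    """Uptime in days at which a tick count at `hz` no longer fits in a signed 32-bit integer."""
    return INT32_MAX / hz / SECONDS_PER_DAY


# hz = 1000 is an assumption; other clock rates shift both intervals.
print(round(days_until_ns_times_hz_overflows(1000), 2))
print(round(days_until_32bit_tick_overflows(1000), 2))
```

Under those assumptions the first value comes out at roughly 106.75 days and the second at roughly 24.86 days.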

Thanks,

Charles

 
 --Artem


Re: zfs arc and amount of wired memory

2012-02-08 Thread Gary Palmer
On Wed, Feb 08, 2012 at 11:18:02PM +0200, Andriy Gapon wrote:
 on 08/02/2012 22:50 Jeremy Chadwick said the following:
  Politely -- recommending this to a user is a good choice of action, but
  the problem is that no user, even an experienced user, is going to know
  what all of the Types (vmstat -m) or ITEMs (vmstat -z) correlate
  with on the system.
 
 I see no problem with users sharing the output and asking for help
 interpreting it.  I do not know of any easier way to analyze problems like
 this one.

Also, since we are looking for gigs of memory it should be relatively easy
to look down the 'Size' or 'MemUse' columns and identify likely candidates
for eating gobs of memory.  The user doesn't need to know what the rest
of the data means, and can ask what that line means and how to fix it.
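
Scanning the MemUse column can also be done mechanically. The following is a rough sketch that assumes well-formed `vmstat -m` text with the column order Type, InUse, MemUse (ending in K), HighUse, Requests, Size(s); the function name and the sample rows are made up for illustration, and `vmstat -z` uses a different layout that this does not handle.

```python
import re

# One data row: a type name (which may contain spaces), then InUse,
# then MemUse as an integer followed by "K".
ROW = re.compile(r"^\s*(?P<type>.+?)\s+(?P<inuse>\d+)\s+(?P<mem>\d+)K(?:\s|$)")


def top_memory_users(vmstat_m_output, limit=5):
    """Return (type, mem_kib) pairs from `vmstat -m` text, largest first."""
    rows = []
    for line in vmstat_m_output.splitlines():
        match = ROW.match(line)
        if match:  # the header line and blank lines do not match
            rows.append((match.group("type"), int(match.group("mem"))))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


SAMPLE = """\
         Type InUse MemUse HighUse Requests  Size(s)
        hhook     2     1K       -        2  128
       devbuf 28285 56235K       -    29225  16,32,64,128,256,512,1024,2048,4096
      subproc   831  1312K       -    56233  512,4096
      callout     3  1536K       -        3
"""

for name, kib in top_memory_users(SAMPLE, limit=3):
    print(f"{name:>13} {kib:>8}K")
```

Anything that accounts for gigabytes of wired memory should float to the top of such a list, and that single line is what is worth asking about.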

Gary


Re: zfs arc and amount of wired memory

2012-02-08 Thread Eugene M. Zheganin

Hi.

On 09.02.2012 02:29, Andriy Gapon wrote:

on 08/02/2012 12:31 Eugene M. Zheganin said the following:

Hi.

On 08.02.2012 02:17, Andriy Gapon wrote:

[output snipped]

Thank you.  I don't see anything suspicious/unusual there.
Just in case, do you have ZFS dedup enabled by any chance?

I think that examination of vmstat -m and vmstat -z outputs may provide some
clues as to what got all that memory wired.


Nope, I don't have deduplication feature enabled.

OK.  So, did you have a chance to inspect vmstat -m and vmstat -z?


I did. I didn't understand it, but kinda 'felt the atmosphere'. It was
pretty much similar to the output I supplied below. Most of the memory
was used by 'solaris' and numerous 'zio' caches.




It may well be that swap on zvol doesn't work well when the
kernel itself is starved of memory.


So I want to ask: how do I report it, and what should I include in such a PR?

I am leaving the swap-on-zvol issue aside.  Your original problem doesn't seem to be
ZFS-related.  I suspect that you might be running into some kernel memory leak.
  If you manage to reproduce the high wired value again, then vmstat -m and
vmstat -z may provide some useful information.

In this vein, do you use any out-of-tree kernel modules?
Also, can you try to monitor your system to see when wired count grows?


Nope, I don't have any 3rd party kernel modules.
Yes, I can monitor it, but I have no idea what exactly I should monitor.
This system is running squid with dozens of authentication helpers,
freeradius + postgresql, sendmail and a perl squid log parser, which
uses postgresql too. net/isc-dhcp, quagga, net/mpd5, a bunch of sendmail
milters, net/samba35, bind. So it's some kind of a corporate production
zoo. While I was writing this letter, the wired amount of memory increased
by 70 megs. Excuse me, 80 megs now.


The output I promised (if it's more acceptable in the form of a link to
a paste site, just say so):


[emz@taiga:etc/snmp]# vmstat -m
         Type InUse MemUse HighUse Requests  Size(s)
        hhook     2     1K       -        2  128
      ithread    85    14K       -       85  32,128,256
       KTRACE   100    13K       -      100  128
       linker   280   226K       -      384  16,32,64,128,256,512,1024,2048,4096
        lockf    94    10K       - 20264872  64,128
   loginclass     3     1K       -      367  64
     pci_link    13     2K       -       13  16,128
       ip6ndp    55     5K       -       78  64,128
       ip6opt    23     6K       -   142134  32,256
         temp   146    20K       -   114199  16,32,64,128,256,512,1024,2048,4096
       devbuf 28285 56235K       -    29225  16,32,64,128,256,512,1024,2048,4096
       module   291    37K       -      291  128
       USBdev    39    10K       -       39  64,128,512,1024
     mtx_pool     2    16K       -        2
          USB    55   166K       -       58  16,32,64,128,256,512,2048,4096
          osd    22     1K       -    10870  16,64
  ddb_capture     1    48K       -        1
      subproc   831  1312K       -    56233  512,4096
         proc     2    16K       -        2
      session    66     9K       -    16431  128
         pgrp    73    10K       -    16581  128
         cred   650   102K       -   818736  64,256
      uidinfo    15     4K       -     5420  128,2048
       plimit    25     7K       -     4948  256
       kbdmux     8    18K       -        8  16,512,1024,2048
    sysctltmp     0     0K       -  9741241  16,32,64,128,4096
    sysctloid  4837   243K       -     4950  16,32,64,128
       sysctl     0     0K       -    50230  16,32,64
      tidhash     1    16K       -        1
      callout     3  1536K       -        3
         umtx  2712   339K       -     2766  128
     p1003.1b     1     1K       -        1  16
         SWAP     2  1097K       -        2  64
       bus-sc    84   686K       -     2193  16,32,64,128,256,512,1024,2048,4096
          bus   861    78K       -     4641  16,32,64,128,256,512,1024
      devstat     4     9K       -        4  32,4096
 eventhandler    83     7K       -       83  64,128
         kobj   194   776K       -      231  4096
      Per-cpu     1     1K       -        1  32
       aacbuf   241    72K       -      273  64,128,512
         rman   219    23K       -      449  16,32,128
     acpiintr     1     1K       -        1  64
         sbuf     1     1K       -      967  16,32,64,128,256,512,1024,2048,4096
       acpica  1641   174K       -    50289  16,32,64,128,256,512,1024,2048,4096
       DEVFS1   106    53K       -      111  512
       DEVFS3   261    66K       -      269  256
        stack     0     0K       -        2  256
    taskqueue    85     8K       -      121  16,32,64,128,1024
       Unitno    21     1K       -   208557  32,64
       DEVFS2   106     2K       -      108  16
   DEVFS_RULE    54    26K       -       54  64,512
        DEVFS    39     1K       -       40  16,128
      Witness     1   128K       -        1
  iov 0