Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Bernhard Schmidt
On Wednesday, June 29, 2011 03:50:08 Adrian Chadd wrote:
 This is kinda strange; that symbol doesn't exist in the net80211 or ath 
 source.
 
 What the heck?
 
 
 
 adrian
 
 
 
 On 28 June 2011 17:28, Stefan Esser st_es...@t-online.de wrote:
  Hi,
 
  is this a known issue?
 
  My -CURRENT system (r223560M, amd64, 8GB, Atheros WLAN) panics after
  minutes to hours of uptime with the following message:
 
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 0
  fault virtual address   = 0xff807f502000
  fault code  = supervisor data read, page not present
  ...
  processor eflags= interrupt enabled, resume, IOPL = 0
  current process = 11 (swi4: clock)
  [ thread pid 11 tid 112 ]
  Stopped at  ieee80211_tx_mgmt_timeout+0x1:  movq (%rdi),%rdi
 
  db bt
  Tracing pid 11 tid 100012 td 0xfe00032e
  ieee80211_tx_mgmt_timeout() at ieee80211_tx_mgmt_timeout+0x1
  intr_event_execute_handlers() at intr_event_execute_handlers+0x66
  ithread_loop() at ithread_loop+0x96
  fork_exit() at fork_exit+0x11d
  fork_trampoline() at fork_trampoline+0xe
  --- trap 0, rip = 0, rsp = 0xff8000288d00, rbp = 0 ---
 
  This panic message is manually transcribed, since the GPT-only
  partitioning prevents dumping of a kernel core. (Why, BTW?)
  I could add a swap partition on an MBR disk, if a core dump seems
  necessary to diagnose the problem. I'm also willing to wait for that
  panic to occur again and to gather more debug output.
 
 
  Other information: The Atheros WLAN in this system is unused (not
  associated) but both ath0 and wlan0 were UP at the time of the panic.
 
  Initial testing shows the system to be stable with both wlan0 and ath0
  set to down after boot. But still, the timeout should not panic the
  kernel, if WLAN is active but not fully configured (e.g. no SSID).
 
  Any ideas?

Its name is ieee80211_tx_mgt_timeout; it is used to track AUTH/ASSOC
requests. AFAIK there is even a similar PR about that.

Adrian, have you got an AP set up to drop either an AUTH or ASSOC
response frame?

-- 
Bernhard
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: NFS/BOOTP problem

2011-06-29 Thread Grzegorz Bernacki

On 06/28/11 15:38, Rick Macklem wrote:

Grzegorz Bernacki wrote:

Hi,

After rebasing to new -current I experienced a problem with mounting root
via NFS. I was getting the error "Mounting from nfs: failed with error 2:
unknown file system". I use BOOTP and NFSv3 (option NFSCLIENT). It seems
that bootp sets the fs type to 'nfs' and recently NFSv3 was renamed to
'oldnfs'. The patch below fixes the problem. Do you think it is a proper
solution? Could it be fixed in some other, better way?


If you wish to use the old NFS client as the root fs, you can add a line
like:
vfs.root.mountfrom=oldnfs:

in the boot/loader.conf file on the root fs in the NFS server to have the
same effect as your patch. (This is mentioned in the message in UPDATING.)

However, it would be nice if you used the new NFS client instead, by building
the kernel with option NFSCL so that it uses the new NFS client.
(The new NFS client does NFSv3 in what I believe to be a completely compatible
  way as the old one.)

If you tried the new NFS client and it didn't work for you, I would like
to hear what the problem was, so it can be resolved. (The intent was to leave
the old NFS client in the system so that it could be used as a fallback when
the new one didn't work correctly for some situation.)

If others feel that having a system that will boot via the old NFS client
without needing to add the line to loader.conf is important, then doing
your patch would be appropriate. I didn't do what your patch does, since I
was hoping folks would use the new client instead and only force use of the
old one when it was really necessary.

rick
[patch snipped for brevity]


Hi Rick,

The problem is that in embedded development the loader is sometimes not
used. In that case we would like to have the possibility to use the old NFS
client without patching the code every time. So if you don't mind, I would
like to commit the patch.

I will also try the new NFS client and let you know if I find any problems.
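
For readers without the snipped patch, here is a rough userland sketch of the
idea discussed above: pick the root fs type name according to which NFS client
is compiled in. It is not the actual patch. The function name and its two flag
parameters are hypothetical; in a kernel the choice would presumably be made at
compile time from options NFSCL (new client) and NFSCLIENT (old client).

```c
#include <stddef.h>

/*
 * Hypothetical helper, NOT the snipped patch: return the fs type name
 * that a BOOTP root mount could request.
 *   have_new_client - nonzero if the new NFS client (option NFSCL) is present
 *   have_old_client - nonzero if the old NFS client (option NFSCLIENT) is present
 */
const char *
bootp_rootfs_type(int have_new_client, int have_old_client)
{
        if (have_new_client)
                return "nfs";           /* the new client owns the name "nfs" */
        if (have_old_client)
                return "oldnfs";        /* the old client was renamed */
        return NULL;                    /* no NFS client compiled in */
}
```

With only the old client present this yields "oldnfs", which is the same effect
as setting vfs.root.mountfrom=oldnfs: by hand in loader.conf.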

grzesiek


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread KOT MATPOCKuH
2011/6/28 Marius Strobl mar...@alchemy.franken.de:

 I'm got a problem with named on FreeBSD-CURRENT/sparc64.
 Up to 5 times a day it crashes with these messages:
 27-Jun-2011 03:42:14.384 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)

 I found a some similar problems on alpha and IA64, which was related
 to problems with isc_atomic_xadd() function in include/isc/atomic.h.
 But I don't understand that there may be incorrect for sparc64 and
 this function was not changed for a minimum 4 years...
 Uhm, we once fixed a problem in the MD atomic implementation which
 still seems to present in the ISC copy. Could you please test whether
 the following patch makes a difference?
 http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
Oh, Marius, you are my savior...
I am now running named with your patch and watching it.

I think this should be sufficient:
cd /usr/src/lib/bind/dns
make clean
make
cd /usr/src/usr.sbin/named
make clean
make
make install
(and named's restart)

-- 
MATPOCKuH


daily snapshot build

2011-06-29 Thread Hiroki Sato
Hi,

 Just wanted to let you know that daily snapshot builds from the HEAD
 source tree are available again at
 http://pub.allbsd.org/FreeBSD-snapshots/.  Currently the i386 and amd64
 builds have been recovered.  The service was down for a while due to
 hardware failure, but it is recovering now, including the other archs.  The
 snapshot builds can also be found via ftp://ftp.allbsd.org/pub or
 rsync://rsync.allbsd.org/freebsd-snapshots/.  I hope these help for
 testing purposes.

 Please note that it may be unstable this week because a temporary network
 access outage (for a short time) is planned for Friday.

-- Hiroki





Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Adrian Chadd
On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:

 It's name is ieee80211_tx_mgt_timeout used to track AUTH/ASSOC
 requests. Afaik there is even a similar PR about that.

 Adrian, you've got a AP set up to drop either a AUTH or ASSOC
 response frame?

Tell me how and I'll set it up.

A panic at that point in the function indicates maybe ni is NULL?
or ni->vap is now NULL, maybe?



Adrian


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Bernhard Schmidt
On Wednesday, June 29, 2011 10:03:02 Adrian Chadd wrote:
 On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:
 
  It's name is ieee80211_tx_mgt_timeout used to track AUTH/ASSOC
  requests. Afaik there is even a similar PR about that.
 
  Adrian, you've got a AP set up to drop either a AUTH or ASSOC
  response frame?
 
 Tell me how and I'll set it up.
 
 A panic at that point in the function indicates maybe ni is NULL?
 or ni-vap is now NULL, maybe?

vap should never be NULL, so, I'd guess it's ni.

Hmm, I'd guess there is some kind of race. When the driver tells us that
it was able to send the AUTH request frame, net80211 sets up the timeout
callback. What happens if the AUTH response and the callback hit at the
same time? It should be locked appropriately, but is it?
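
To make the question concrete, here is a generic userland sketch of one way to
let exactly one of two competing events (response or timeout) handle a pending
request. It only illustrates the race and is not how net80211 does it; the
struct and function names are hypothetical, and it uses C11 atomics instead of
the kernel's locking primitives.

```c
#include <stdatomic.h>

/* Hypothetical pending-request record; not a net80211 structure. */
struct mgt_request {
        atomic_int pending;     /* 1 while a response or timeout is expected */
        int result;             /* 0 = none, 1 = response handled, 2 = timed out */
};

/* Called once the request frame has been handed to the driver. */
void
request_start(struct mgt_request *r)
{
        r->result = 0;
        atomic_store(&r->pending, 1);
}

/* Test-and-clear: returns 1 only for the first caller after request_start(). */
int
request_claim(struct mgt_request *r)
{
        return atomic_exchange(&r->pending, 0) == 1;
}

/* Response path: acts only if the timeout has not already claimed the request. */
void
on_response(struct mgt_request *r)
{
        if (request_claim(r))
                r->result = 1;
}

/* Timeout path: acts only if the response has not already claimed the request. */
void
on_timeout(struct mgt_request *r)
{
        if (request_claim(r))
                r->result = 2;
}
```

Whichever path runs second finds pending already cleared and does nothing, so
the two handlers cannot both act on the same request.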

This will drop the AUTH response:

Index: sys/net80211/ieee80211_hostap.c
===================================================================
--- sys/net80211/ieee80211_hostap.c (revision 223661)
+++ sys/net80211/ieee80211_hostap.c (working copy)
@@ -978,7 +978,7 @@ hostap_auth_open(struct ieee80211_node *ni, struct
                    "%s", "station authentication defered (radius acl)");
                ieee80211_notify_node_auth(ni);
        } else {
-               IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
+               //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
                IEEE80211_NOTE_MAC(vap,
                    IEEE80211_MSG_DEBUG | IEEE80211_MSG_AUTH, ni->ni_macaddr,
                    "%s", "station authenticated (open)");
@@ -1158,7 +1158,7 @@ hostap_auth_shared(struct ieee80211_node *ni, stru
                estatus = IEEE80211_STATUS_SEQUENCE;
                goto bad;
        }
-       IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
+       //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
        return;
 bad:
        /*


-- 
Bernhard


Re: CAM/SATA: attach/detach problems with harddrive

2011-06-29 Thread Andriy Gapon
on 29/06/2011 03:25 Hartmann, O. said the following:
 But any kind of access to the new device, like gpart or simply a zpool import
 to show the pool to be imported, gets locked up forever (waited two hours).
 The state could only be resolved by resetting the box, and then the
 filesystem is unclean and needs to be fsck'ed.
 
 Will proceed with a verbose kernel.

I think that procstat -kk output for at least one stuck process could also turn
out to be useful.

-- 
Andriy Gapon


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Stefan Esser
Am 29.06.2011 10:03, schrieb Adrian Chadd:
 On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:
 It's name is ieee80211_tx_mgt_timeout used to track AUTH/ASSOC
 requests. Afaik there is even a similar PR about that.

Sorry, I manually entered the panic message, since dumps were not
working on my system at the time of that panic.

 Adrian, you've got a AP set up to drop either a AUTH or ASSOC
 response frame?

I've got a number of "AUTH -> SCAN transition lost" messages for wlan0,
seconds to minutes apart:

Jun 28 21:16:17 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:34:46 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:36:33 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:45:14 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:45:44 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost

The setup is easy to reproduce, my rc.conf contained:

wlans_ath0=wlan0
ifconfig_ath0=down
ifconfig_wlan0=down
wpa_supplicant_enable=YES

This system used to be connected via ath0, but recently was moved to a
place where Ethernet is available. The panics started only after WLAN
was not used anymore. I might disable wpa_supplicant, since it is not
required in the current situation, but did not try whether that helps
prevent the panic.

 Tell me how and I'll set it up.
 
 A panic at that point in the function indicates maybe ni is NULL?
 or ni-vap is now NULL, maybe?

I recreated the panic, this time with kernel dumps correctly configured
(thanks for the hint, Scott). The panic message is:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xff809c7a1000
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x805e1851
stack pointer   = 0x28:0xff8000288ab0
frame pointer   = 0x28:0xff8000288b60
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 11 (swi4: clock)

Traceback:

#10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
at ../../../net80211/ieee80211_output.c:2487

This indicates that an invalid argument is passed and assigned to ni,
which causes the page fault when dereferencing ni to obtain vap.

I'm afraid that the assumption in the comment (about the timeout being
safe to use) does not really hold:

static void
ieee80211_tx_mgt_timeout(void *arg)
{
        struct ieee80211_node *ni = arg;
        struct ieee80211vap *vap = ni->ni_vap;

        if (vap->iv_state != IEEE80211_S_INIT &&
            (vap->iv_ic->ic_flags & IEEE80211_F_SCAN) == 0) {
                /*
                 * NB: it's safe to specify a timeout as the reason here;
                 * it'll only be used in the right state.
                 */
                ieee80211_new_state(vap, IEEE80211_S_SCAN,
                    IEEE80211_SCAN_FAIL_TIMEOUT);
        }
}

If vap is valid during one invocation of that function, I'd expect it
to at least be a pointer to valid kernel memory after the timeout.
I.e., the value found by dereferencing it may be stale, but the pointer
itself should at least not cause a page fault. (???)


The compressed core.txt is 27KB, the compressed vmcore is 800MB. I might
be able to find a place to upload the vmcore file to, but since I'm
currently on a DSL with only 672KBit/s upstream, it would take me some 3
hours to upload to a better connected server (and I'd like to avoid
doing that, if not essential for debugging).

The core.txt is small enough to send by mail. Let me know if you think
it helps you understand the problem.


I'm willing to support debugging, e.g. by placement of printfs in my
kernel for the timeout handler and the creation and destruction of *vap
structures.


After removal of wlans_ath0=wlan0 the system will most probably be
stable, I did not specifically test this case (i.e. ath0 configured, but
no wlan0 created). I do know, that an ifconfig down of ath0 and wlan0
suffices; probably an ifconfig wlan0 down alone would be enough.

So, I know how to avoid the panic, but I think it is still important to
find the cause.

Thank you for looking into this!


Best regards, STefan


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Stefan Esser
On 29.06.2011 10:27, Bernhard Schmidt wrote:
 On Wednesday, June 29, 2011 10:03:02 Adrian Chadd wrote:
 On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:

 It's name is ieee80211_tx_mgt_timeout used to track AUTH/ASSOC
 requests. Afaik there is even a similar PR about that.

 Adrian, you've got a AP set up to drop either a AUTH or ASSOC
 response frame?

 Tell me how and I'll set it up.

 A panic at that point in the function indicates maybe ni is NULL?
 or ni-vap is now NULL, maybe?
 
 vap should never be NULL, so, I'd guess it's ni.

No, neither vap nor vap->ni appears to cause a NULL dereference.

The panic message indicates a fault address of 0xff809c7a1000, which
is the value of arg passed to ieee80211_tx_mgt_timeout().

The fault occurs on the first instruction within that function and I
take this to mean, that it points outside kernel VM space. (I have got
to admit, that I do not know the exact memory layout for amd64, though.)

 Hmm.. I'd guess there is some kind of racy behavior, if the driver is
 telling us that it was able to send the AUTH req frame, net80211 sets
 up the timeout callback. What happens if the AUTH resp as well as the
 callback hit at the same time? It should be locked appropriately, but
 is it?
 
 This will drop the AUTH response:

I have received a number of messages that might indicate a lost race:

ieee80211_new_state_locked: pending AUTH -> SCAN transition lost

The message repeats at intervals ranging from a few seconds to 20 minutes.

 Index: sys/net80211/ieee80211_hostap.c
 ===
 --- sys/net80211/ieee80211_hostap.c   (revision 223661)
 +++ sys/net80211/ieee80211_hostap.c   (working copy)
 @@ -978,7 +978,7 @@ hostap_auth_open(struct ieee80211_node *ni, struct
   %s, station authentication defered (radius acl));
   ieee80211_notify_node_auth(ni);
   } else {
 - IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
 + //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
   IEEE80211_NOTE_MAC(vap,
   IEEE80211_MSG_DEBUG | IEEE80211_MSG_AUTH, ni-ni_macaddr,
   %s, station authenticated (open));
 @@ -1158,7 +1158,7 @@ hostap_auth_shared(struct ieee80211_node *ni, stru
   estatus = IEEE80211_STATUS_SEQUENCE;
   goto bad;
   }
 - IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
 + //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
   return;
  bad:
   /*
 
 

I could try that patch for a few hours ...

Regards, STefan


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Adrian Chadd
The question here is - what context is the callback being called in?

The lack of net80211 locking has me confused and sad. :/


Adrian

On 29 June 2011 16:27, Bernhard Schmidt bschm...@freebsd.org wrote:
 On Wednesday, June 29, 2011 10:03:02 Adrian Chadd wrote:
 On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:

  It's name is ieee80211_tx_mgt_timeout used to track AUTH/ASSOC
  requests. Afaik there is even a similar PR about that.
 
  Adrian, you've got a AP set up to drop either a AUTH or ASSOC
  response frame?

 Tell me how and I'll set it up.

 A panic at that point in the function indicates maybe ni is NULL?
 or ni-vap is now NULL, maybe?

 vap should never be NULL, so, I'd guess it's ni.

 Hmm.. I'd guess there is some kind of racy behavior, if the driver is
 telling us that it was able to send the AUTH req frame, net80211 sets
 up the timeout callback. What happens if the AUTH resp as well as the
 callback hit at the same time? It should be locked appropriately, but
 is it?

 This will drop the AUTH response:

 Index: sys/net80211/ieee80211_hostap.c
 ===
 --- sys/net80211/ieee80211_hostap.c     (revision 223661)
 +++ sys/net80211/ieee80211_hostap.c     (working copy)
 @@ -978,7 +978,7 @@ hostap_auth_open(struct ieee80211_node *ni, struct
                    %s, station authentication defered (radius acl));
                ieee80211_notify_node_auth(ni);
        } else {
 -               IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
 +               //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 
 1);
                IEEE80211_NOTE_MAC(vap,
                    IEEE80211_MSG_DEBUG | IEEE80211_MSG_AUTH, ni-ni_macaddr,
                    %s, station authenticated (open));
 @@ -1158,7 +1158,7 @@ hostap_auth_shared(struct ieee80211_node *ni, stru
                estatus = IEEE80211_STATUS_SEQUENCE;
                goto bad;
        }
 -       IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
 +       //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
        return;
  bad:
        /*


 --
 Bernhard



Re: [RFC] winbond watchdog driver for FreeBSD/i386 and FreeBSD/amd64

2011-06-29 Thread Andriy Gapon
on 29/06/2011 01:32 Xin LI said the following:
 Hi,
 
 I'd like to request for comments on the attached driver, which supports
 watchdogs on several Winbond super I/O chip models and have been tested
 on a few of recent Supermicro motherboards.

Some comments.

 From 343b2e7b6ed19e4b6ca2bf76c0ca6b8544dd4320 Mon Sep 17 00:00:00 2001
 From: Xin LI d...@delphij.net
 Date: Mon, 27 Jun 2011 21:54:13 -0700
 Subject: [PATCH] Driver for Winbond watchdog.
 
 ---
  share/man/man4/Makefile        |    3 +
  share/man/man4/winbondwd.4     |   88 ++
  sys/amd64/conf/NOTES           |    2 +
  sys/conf/files.amd64           |    1 +
  sys/conf/files.i386            |    1 +
  sys/dev/winbondwd/winbondwd.c  |  368
  sys/dev/winbondwd/winbondwd.h  |   47 +
  sys/i386/conf/NOTES            |    2 +
  sys/modules/Makefile           |    3 +
  sys/modules/winbondwd/Makefile |    8 +
  10 files changed, 523 insertions(+), 0 deletions(-)
  create mode 100644 share/man/man4/winbondwd.4
  create mode 100644 sys/dev/winbondwd/winbondwd.c
  create mode 100644 sys/dev/winbondwd/winbondwd.h
  create mode 100644 sys/modules/winbondwd/Makefile
 
 diff --git a/share/man/man4/Makefile b/share/man/man4/Makefile
 index 7fb..777e2fd 100644
 --- a/share/man/man4/Makefile
 +++ b/share/man/man4/Makefile
 @@ -447,6 +447,7 @@ MAN=  aac.4 \
   tun.4 \
   twa.4 \
   twe.4 \
 + tws.4 \

Looks like this has sneaked in accidentally.

   tx.4 \
   txp.4 \
   u3g.4 \
 @@ -503,6 +504,7 @@ MAN=  aac.4 \
   watchdog.4 \
   wb.4 \
   wi.4 \
 + ${_winbondwd.4} \
   witness.4 \
   wlan.4 \
   wlan_acl.4 \
 @@ -706,6 +708,7 @@ _speaker.4=   speaker.4
  _spkr.4= spkr.4
  _tpm.4=  tpm.4
  _urtw.4= urtw.4
 +_winbondwd.4=winbondwd.4
  _wpi.4=  wpi.4
  _xen.4=  xen.4
  
 diff --git a/share/man/man4/winbondwd.4 b/share/man/man4/winbondwd.4
 new file mode 100644
 index 000..6fd2719
 --- /dev/null
 +++ b/share/man/man4/winbondwd.4
 @@ -0,0 +1,88 @@
 +.\"-
 +.\" Copyright (c) 2011 Xin LI delp...@freebsd.org
 +.\" All rights reserved.
 +.\"
 +.\" Redistribution and use in source and binary forms, with or without
 +.\" modification, are permitted provided that the following conditions
 +.\" are met:
 +.\" 1. Redistributions of source code must retain the above copyright
 +.\"    notice, this list of conditions and the following disclaimer.
 +.\" 2. Redistributions in binary form must reproduce the above copyright
 +.\"    notice, this list of conditions and the following disclaimer in the
 +.\"    documentation and/or other materials provided with the distribution.
 +.\"
 +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 +.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 +.\" SUCH DAMAGE.
 +.\"
 +.\" $FreeBSD$
 +.\"
 +.Dd July 1, 2011
 +.Dt WINBONDWD 4
 +.Os
 +.Sh NAME
 +.Nm winbondwd
 +.Nd device driver for the Winbond Super I/O watchdog timer
 +.Sh SYNOPSIS
 +To compile this driver into the kernel,
 +place the following line in your
 +kernel configuration file:
 +.Bd -ragged -offset indent
 +.Cd device winbondwd
 +.Ed
 +.Pp
 +Alternatively, to load the driver as a
 +module at boot time, place the following line in
 +.Xr loader.conf 5 :
 +.Bd -literal -offset indent
 +winbondwd_load=YES
 +.Ed
 +.Sh DESCRIPTION
 +The
 +.Nm
 +driver provides
 +.Xr watchdog 4
 +support for the watchdog interrupt timer present on
 +all Winbond super I/O controllers.
 +.Pp
 +The Winbond super I/O controller have a built-in watchdog timer,
 +which can be enabled and disabled by user's program and set between
 +1 to 255 seconds or 1 to 255 minutes.
 +Supported watchdog intervals range from 1 to 255 seconds.
 +.Pp
 +On some systems the watchdog timer is enabled and set to 5 minutes
 +by BIOS on boot.
 +The
 +.Nm
 +driver will detect and print out the existing setting, however, 
 +it will not make any changes unless told by the userland through
 +the
 +.Xr watchdog 4
 +interface,
 +for instance by using the
 +.Xr watchdogd 8
 +daemon.
 +.Sh SEE ALSO
 +.Xr watchdog 4 ,
 +.Xr watchdog 8 ,
 +.Xr watchdogd 8 ,
 +.Xr watchdog 9
 +.Sh HISTORY
 +The
 +.Nm
 +driver first appeared in
 +.Fx 9.0 .
 +.Sh AUTHORS
 +.An -nosplit
 +The
 +.Nm
 +driver and this manual page were 

Re: [RFT] Automatic load of USB kernel modules

2011-06-29 Thread Hans Petter Selasky
On Wednesday 29 June 2011 12:10:36 Robert Millan wrote:
 2011/6/24 Hans Petter Selasky hsela...@c2i.net:
  I would like to request testing of the attached
  patch before I commit it. The patch is about only having ukbd, ums and
  umass per default in the kernel GENERIC config file(s).
 
 What about urio?

It is loaded automatically now.

--HPS


Re: [RFT] Automatic load of USB kernel modules

2011-06-29 Thread Hans Petter Selasky
On Wednesday 29 June 2011 12:10:36 Robert Millan wrote:
 2011/6/24 Hans Petter Selasky hsela...@c2i.net:
  I would like to request testing of the attached
  patch before I commit it. The patch is about only having ukbd, ums and
  umass per default in the kernel GENERIC config file(s).
 
 What about urio?

Hi,

urio is not used for booting the kernel, so it is not important that it is in 
the kernel.

--HPS


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread KOT MATPOCKuH
2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
 I'm got a problem with named on FreeBSD-CURRENT/sparc64.
 Up to 5 times a day it crashes with these messages:
 27-Jun-2011 03:42:14.384 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)

 I found a some similar problems on alpha and IA64, which was related
 to problems with isc_atomic_xadd() function in include/isc/atomic.h.
 But I don't understand that there may be incorrect for sparc64 and
 this function was not changed for a minimum 4 years...
 Uhm, we once fixed a problem in the MD atomic implementation which
 still seems to present in the ISC copy. Could you please test whether
 the following patch makes a difference?
 http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff

 I ran named with your patch and and watching him.
Omg.
Either I rebuilt named incorrectly, or the problem is not solved.
I got a crash about 2 hours after named was restarted:
29-Jun-2011 13:51:28.855 general:
/usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
REQUIRE(prev > 0) failed
29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)
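
For context on why a broken atomic primitive would show up as this assertion,
here is a minimal userland sketch. It assumes the failing REQUIRE checks the
value returned by an exchange-and-add on a reference counter. The names xadd()
and ref_decrement() are hypothetical stand-ins built on C11 atomics; this is
neither the ISC code nor the sparc64 implementation.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical analogue of an exchange-and-add primitive:
 * atomically add delta to *p and return the value *p held before.
 */
int
xadd(atomic_int *p, int delta)
{
        return atomic_fetch_add(p, delta);
}

/* Drop one reference and return the number of references left. */
int
ref_decrement(atomic_int *refs)
{
        int prev = xadd(refs, -1);

        /*
         * Same shape as the failing check: the counter must have been
         * positive before the decrement.  If xadd() ever returned a stale
         * or wrong previous value, this would trip even though the callers
         * are balanced.
         */
        assert(prev > 0);
        return prev - 1;
}
```

If the primitive is not truly atomic on some architecture, two CPUs can
observe the same previous value, and the counter drifts until a check like
this one fails.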

-- 
MATPOCKuH


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Bernhard Schmidt
On Wednesday, June 29, 2011 10:53:41 Stefan Esser wrote:
 Am 29.06.2011 10:03, schrieb Adrian Chadd:
  On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:
  It's name is ieee80211_tx_mgt_timeout used to track AUTH/ASSOC
  requests. Afaik there is even a similar PR about that.
 
 Sorry, I manually entered the panic message, since dumps were not
 working on my system at the time of that panic.
 
  Adrian, you've got a AP set up to drop either a AUTH or ASSOC
  response frame?
 
 I've got a number of AUTH - SCAN transition lost messages for wlan0,
 seconds to minutes apart:
 
 Jun 28 21:16:17 kernel: wlan0: ieee80211_new_state_locked: pending AUTH
 - SCAN transition lost
 Jun 28 21:34:46 kernel: wlan0: ieee80211_new_state_locked: pending AUTH
 - SCAN transition lost
 Jun 28 21:36:33 kernel: wlan0: ieee80211_new_state_locked: pending AUTH
 - SCAN transition lost
 Jun 28 21:45:14 kernel: wlan0: ieee80211_new_state_locked: pending AUTH
 - SCAN transition lost
 Jun 28 21:45:44 kernel: wlan0: ieee80211_new_state_locked: pending AUTH
 - SCAN transition lost
 
 The setup is easy to reproduce, my rc.conf contained:
 
 wlans_ath0=wlan0
 ifconfig_ath0=down
 ifconfig_wlan0=down
 wpa_supplicant_enable=YES

Strip the last 3 lines, don't ever fiddle around with ath0 directly.
This configuration always starts wpa_supplicant.

 This system used to be connected via ath0, but recently was moved to a
 place where Ethernet is available. The panics started only after WLAN
 was not used anymore. I might disable wpa_supplicant, since it is not
 required in the current situation, but did not try whether that helps
 prevent the panic.
 
  Tell me how and I'll set it up.
  
  A panic at that point in the function indicates maybe ni is NULL?
  or ni-vap is now NULL, maybe?
 
 I recreated the panic, this time with kernel dumps correctly configured
 (thanks for the hint, Scott). The panic message is:
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0xff809c7a1000
 fault code  = supervisor read data, page not present
 instruction pointer = 0x20:0x805e1851
 stack pointer   = 0x28:0xff8000288ab0
 frame pointer   = 0x28:0xff8000288b60
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 11 (swi4: clock)
 
 Traceback:
 
 #10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
 at ../../../net80211/ieee80211_output.c:2487
 
 This indicates, that an invalid argument is passed and assigned to
 *ni, which causes the page fault when dereferencing ni to obtain *va.

The problem here seems to be wpa_supplicant. It can try to associate
at any given point in time, which results in the BSS node (ni) being
destroyed even though it might still be referenced somewhere (in this case
by the timeout code, or rather ath's TX queue). Failing to clear that
reference (or to stop whatever is using it) is the bug here. Now, how do we
figure out who the caller is? Have you got the complete backtrace?
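
The lifetime problem described above can be illustrated in user space. This is
a minimal single-threaded sketch that assumes a pending timer holds its own
reference on the node, so the node cannot be freed underneath the handler. All
names (struct node, struct timer, timer_arm and so on) are hypothetical and are
not net80211 or callout(9) interfaces; real code would also need locking.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a node; not the real struct ieee80211_node. */
struct node {
        int refcnt;     /* number of outstanding references */
        int state;      /* data the timeout handler wants to read */
};

/* Hypothetical stand-in for a pending timeout. */
struct timer {
        void (*fn)(void *);
        void *arg;
        int armed;
};

int last_seen_state = -1;       /* records what the handler observed */

struct node *
node_alloc(int state)
{
        struct node *n = malloc(sizeof(*n));

        if (n != NULL) {
                n->refcnt = 1;  /* the creator's reference */
                n->state = state;
        }
        return n;
}

struct node *
node_ref(struct node *n)
{
        n->refcnt++;
        return n;
}

/* Drop a reference; returns 1 if this call freed the node. */
int
node_unref(struct node *n)
{
        if (--n->refcnt == 0) {
                free(n);
                return 1;
        }
        return 0;
}

/* Timeout handler: the node is still valid because the timer owns a reference. */
void
timeout_handler(void *arg)
{
        struct node *n = arg;

        last_seen_state = n->state;
        node_unref(n);          /* release the timer's reference */
}

void
timer_arm(struct timer *t, struct node *n)
{
        t->fn = timeout_handler;
        t->arg = node_ref(n);   /* the timer takes its own reference */
        t->armed = 1;
}

/* Cancel before teardown so no stale pointer is left behind. */
void
timer_cancel(struct timer *t)
{
        if (t->armed) {
                t->armed = 0;
                node_unref(t->arg);
                t->arg = NULL;
        }
}

void
timer_fire(struct timer *t)
{
        if (t->armed) {
                t->armed = 0;
                t->fn(t->arg);
                t->arg = NULL;
        }
}
```

Either the timer is cancelled before the node goes away, or the timer's own
reference keeps the node alive until the handler has run. The panic in this
thread looks like the case where neither happened.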

-- 
Bernhard


HEADSUP: Call for FreeBSD Status Reports - 2Q/2011

2011-06-29 Thread Daniel Gerzo

Dear all,

I would like to remind you that the next round of status reports
covering the second quarter of 2011 is due on July 15th, 2011. As this
initiative is very popular among our users, I would like to
ask you to submit your status reports soon, so that we can compile the
report on time.

Do not hesitate to write us a few lines; a short description of
what you are working on, what your plans and goals are, or any other
information that you consider interesting is always welcome. This way
we can inform our community about your great work!
Check out the reports from the past to get some inspiration of what
your submission should look like.

If you know about a project that should be included in the status
report, please let us know as well, so we can poke the responsible
people to provide us with something useful. Updates to submissions from
the last report are welcome too.

Note that the submissions are accepted from anyone involved within the
FreeBSD community, you do not have to be a FreeBSD committer. Anything
related to FreeBSD can be covered.

Please email us the filled-in XML template which can be found at
http://www.freebsd.org/news/status/report-sample.xml to
mont...@freebsd.org, or alternatively use our web based form located at
http://www.freebsd.org/cgi/monthly.cgi.

For more information, please visit http://www.freebsd.org/news/status/.

We are looking forward to seeing your submissions!

--
Kind regards
  Daniel Gerzo


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Stefan Esser
On 29.06.2011 12:41, Bernhard Schmidt wrote:
 On Wednesday, June 29, 2011 10:53:41 Stefan Esser wrote:
 I recreated the panic, this time with kernel dumps correctly configured
 (thanks for the hint, Scott). The panic message is:

 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0xff809c7a1000
 fault code  = supervisor read data, page not present
 instruction pointer = 0x20:0x805e1851
 stack pointer   = 0x28:0xff8000288ab0
 frame pointer   = 0x28:0xff8000288b60
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 11 (swi4: clock)

 Traceback:

 #10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
 at ../../../net80211/ieee80211_output.c:2487

 This indicates that an invalid argument is passed and assigned to
 *ni, which causes the page fault when dereferencing ni to obtain *va.
 
 The problem here seems to be wpa_supplicant. It can try to associate
 at any given point in time which results in the BSS ni being destroyed,
 though it might still be referenced somewhere (In this case the timeout
 stuff, or better said ath's TX queue). Not clearing the reference (or
 stopping whatever is using it) is the fault here. Now how to figure out
 who the caller is? Got the complete backtrace?

Not sure that I understand your question correctly ...

#10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
    at ../../../net80211/ieee80211_output.c:2487
#11 0x8050f45c in softclock (arg=Variable "arg" is not available.)
    at ../../../kern/kern_timeout.c:564
#12 0x804d9876 in intr_event_execute_handlers (p=Variable "p" is not available.)
    at ../../../kern/kern_intr.c:1257
#13 0x804da4d6 in ithread_loop (arg=0xfe00032dcc60)
    at ../../../kern/kern_intr.c:1270
#14 0x804d718d in fork_exit (callout=0x804da440 <ithread_loop>,
    arg=0xfe00032dcc60, frame=0xff8000288c50)
    at ../../../kern/kern_fork.c:920
#15 0x807258ce in fork_trampoline ()
    at ../../../amd64/amd64/exception.S:603

Bernhard, I'm sending you the compressed core.txt in private mail.

Regards, STefan


Re: [RFT] Automatic load of USB kernel modules

2011-06-29 Thread Robert Millan
2011/6/24 Hans Petter Selasky hsela...@c2i.net:
 I would like to request testing of the attached
 patch before I commit it. The patch is about only having ukbd, ums and umass
 per default in the kernel GENERIC config file(s).

What about urio?

-- 
Robert Millan


Re: [RFT] Automatic load of USB kernel modules

2011-06-29 Thread Robert Millan
2011/6/29 Hans Petter Selasky hsela...@c2i.net:
 What about urio?

 Hi,

 urio is not used for booting the kernel, so it is not important that it is in
 the kernel.

Ah, ok.  I thought /dev/urio0 was a block device.

-- 
Robert Millan


Re: [RFC] winbond watchdog driver for FreeBSD/i386 and FreeBSD/amd64

2011-06-29 Thread John Baldwin
On Wednesday, June 29, 2011 5:22:26 am Andriy Gapon wrote:
 on 29/06/2011 01:32 Xin LI said the following:
  +/*
  + * Look for Winbond device.
  + */
  +static void
  +winbondwd_identify(driver_t *driver, device_t parent)
  +{
  +	unsigned int baseport;
  +	device_t dev;
  +
  +	if ((dev = device_find_child(parent, driver->name, 0)) == NULL) {
  +		if (resource_int_value("winbondwd", 0, "baseport", &baseport) != 0) {
  +			baseport = winbondwd_baseport_probe();
  +			if (baseport == (unsigned int)(-1)) {
  +				printf("winbondwd0: Compatible Winbond Super I/O not found.\n");
  +				return;
  +			}
  +		}
  +
  +		dev = BUS_ADD_CHILD(parent, 0, driver->name, 0);
  +
  +		bus_set_resource(dev, SYS_RES_IOPORT, 0, baseport, 2);
  +	}
  +
  +	if (dev == NULL)
  +		return;
 
 These last two lines are redundant?
 
 Also, maybe I am confused, but I think that in the ISA identify method you
 don't actually need to parse any hints/tunables.  That is, you can use the
 standard hints approach, e.g.:
 hint.winbondwd.0.at="isa"
 hint.winbondwd.0.port="0x3F0"
 and ISA will automatically add a winbondwd child with an I/O port resource at
 0x3F0.  The identify method should only add a child for the
 no-hints/auto-probing case.
 E.g. see boiler-plate code for the ISA case in
 share/examples/drivers/make_device_driver.sh, especially the comments.
 
 I am not saying that your approach won't work (apparently it does) or that it
 is inherently bad.  It just seems to be different from how other ISA drivers
 do their identify+probe dance.

I agree, it should probably look something like this:

{
	if (device_find_child(parent, driver->name, 0) != NULL)
		return;

	if (resource_int_value(driver->name, 0, "port", &baseport) == 0)
		return;

	baseport = winbondwd_baseport_probe();
	if (baseport == -1)
		/* No reason to warn on every boot here. */
		return;

	dev = BUS_ADD_CHILD(parent, 0, driver->name, 0);
	if (dev != NULL)
		bus_set_resource(dev, SYS_RES_IOPORT, 0, baseport, 2);
}

  +   sc->wb_bst = rman_get_bustag(sc->wb_res);
  +   sc->wb_bsh = rman_get_bushandle(sc->wb_res);

Please don't use these.  bus_read_X(sc->wb_res) is easier on the eyes.

One other comment, several places in the code are using various magic
numbers for register offsets, bit field values, etc.  For example:

+   winbondwd_write_reg(sc, /* LDN bit 7:1 */ 0x7, /* GPIO Port 2 */ 0x8);
+   winbondwd_write_reg(sc, /* CR30 */ 0x30, /* Activate */ 0x1);

Could you add a winbondwdreg.h header and define constants for the registers
and their bitfields instead?
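
As a rough sketch of what such a header could contain, the block below names the four literals from the two calls quoted above. The macro names and the helper `winbondwd_enable_gpio2()` are hypothetical, and the softc is mocked with a plain array so the fragment compiles outside the kernel; only the numeric values come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the values are the literals used in the patch. */
#define	WB_LDN_REG		0x07	/* logical device number select */
#define	WB_LDN_GPIO2		0x08	/* GPIO Port 2 logical device */
#define	WB_CR30			0x30	/* CR30: device activation register */
#define	WB_CR30_ACTIVATE	0x01	/* activate the selected device */

/* Mock softc: an array stands in for the Super I/O register file. */
struct winbondwd_softc {
	uint8_t regs[256];
};

/* Mock register write; the real driver would go through bus space. */
static void
winbondwd_write_reg(struct winbondwd_softc *sc, uint8_t reg, uint8_t val)
{
	assert(sc != NULL);
	sc->regs[reg] = val;
}

/* The two calls from the patch, with the magic numbers replaced by names. */
static void
winbondwd_enable_gpio2(struct winbondwd_softc *sc)
{
	winbondwd_write_reg(sc, WB_LDN_REG, WB_LDN_GPIO2);
	winbondwd_write_reg(sc, WB_CR30, WB_CR30_ACTIVATE);
}
```

With names in place, the inline comments at each call site become unnecessary.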

-- 
John Baldwin


a question about ifa_del_loopback_route: deletion failed message

2011-06-29 Thread Svatopluk Kraus
Hi,

   I've got an "ifa_del_loopback_route: deletion failed" message from
ifa_del_loopback_route() called from rip_ctlinput(). Is the IFA_RTSELF
flag kept consistent with the ifa_add_loopback_route() and
ifa_del_loopback_route() calls?

   I think that rip_ctlinput() in sys/netinet/raw_ip.c should be
patched to check that the IFA_RTSELF flag is set before
ifa_del_loopback_route() is called. The proposed check is already done in
in_scrubprefix() in sys/netinet/in.c.

   Or is there some reason why this is not a good idea? I get the
message because nobody calls ifa_add_loopback_route() before
ifa_del_loopback_route().
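
A minimal userland sketch of the proposed guard follows. Everything in it is a mock made up for illustration: `struct ifaddr_mock`, both functions, and the numeric value given to `IFA_RTSELF` are assumptions, not the kernel definitions. It only shows the idea of skipping the delete when the flag says no loopback route was installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in value for illustration: "loopback route to self installed". */
#define	IFA_RTSELF	0x4

/* Mock interface address; del_calls counts how often delete was attempted. */
struct ifaddr_mock {
	int ifa_flags;
	int del_calls;
};

/* Mock of the delete routine: fails when no loopback route is installed. */
static int
del_loopback_route_mock(struct ifaddr_mock *ifa)
{
	ifa->del_calls++;
	if ((ifa->ifa_flags & IFA_RTSELF) == 0)
		return (-1);	/* the "deletion failed" case */
	ifa->ifa_flags &= ~IFA_RTSELF;
	return (0);
}

/* Proposed guard: only attempt the delete when the flag is set. */
static bool
del_loopback_route_checked(struct ifaddr_mock *ifa)
{
	assert(ifa != NULL);
	if ((ifa->ifa_flags & IFA_RTSELF) == 0)
		return (false);
	return (del_loopback_route_mock(ifa) == 0);
}
```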

 Svata


Re: NFS/BOOTP problem

2011-06-29 Thread Rick Macklem
 
 Hi Rick,
 
 The problem is that in embedded development the loader is sometimes not
 used. In that case we would like to have the possibility to use the old NFS
 client without patching code every time. So if you don't mind I would
 like to commit the patch.
 I will also try the new NFS client and let you know if I find any
 problems.

Ok, sure. (I actually had something similar in the code when I was testing
the diskless stuff, but took it out once I saw that the environment variable
worked. However I didn't realize that some situations don't use the loader.)

rick


kern/143370: splash_txt ASCII splash screen module

2011-06-29 Thread Antony Mawer
Hi all,

Not sure if this is the right place to post it -- about 6 years ago I
put together a module which displays an ASCII splash screen on boot
(rather than the graphical splash_pcx and splash_bmp modules). We have
been running it in production since that time without issue.

With the code slush for 9.0 on the horizon, I thought it might
again be worth trying to see if someone is prepared to get this into
the tree so others can benefit from it. I have a PR open, kern/143370,
which includes the patch for the module; it is against 7.0 but has
been used largely unmodified since 4.x days. It still builds fine on
8.x, as we are running it in production on 8.x for $WORK.

Summary of instructions from my previous post about this from ~18mths ago:

In case the list eats the patch, you can grab a copy of it here:

http://www.mawer.org/freebsd/splash_txt.patch

To give you an idea of what it looks like, here is a screenshot of a
quick generic FreeBSD splash screen I put together:

http://www.mawer.org/freebsd/splash_txt_1.png
http://www.mawer.org/freebsd/splash_txt_2.png

If you'd like to try it for yourself then the process to build it
should be something like this:

1. Download the attached patch
2. Create the required folders before applying the patch -- cd
/usr/src && mkdir sys/modules/splash/txt
3. Apply the patch -- patch < splash_txt.patch
4. Build the module -- cd sys/modules/splash/txt && make && make install

Once that's completed, you can configure it by adding the following to
loader.conf:

splash_txt_load="YES"
bitmap_load="YES"
bitmap_name="/boot/freebsd.bin"

I have uploaded two sample boot splash screens at
http://www.mawer.org/freebsd/freebsd1.bin and
http://www.mawer.org/freebsd/freebsd2.bin . The files can be produced
using TheDraw and saving in its Binary file format, which consists of
a sequence of 2 byte pairs. The first byte in a pair is the character
to draw on the screen, and the second is the colour/display attributes
to draw the character with.
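
To make the file layout concrete, here is a small C sketch that walks such a buffer as (character, attribute) pairs. The `struct cell` type and the function names are made up for this example and are not taken from the patch. Splitting the attribute into a low-nibble foreground and a 3-bit background follows the usual VGA text-mode convention, which is an assumption here rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One screen cell: the character byte followed by its attribute byte. */
struct cell {
	uint8_t ch;
	uint8_t attr;
};

/*
 * Decode up to max_cells (character, attribute) pairs from buf.
 * Returns the number of cells written; a trailing odd byte is ignored.
 */
static size_t
decode_cells(const uint8_t *buf, size_t len, struct cell *out, size_t max_cells)
{
	size_t n = len / 2;
	size_t i;

	assert(buf != NULL && out != NULL);
	if (n > max_cells)
		n = max_cells;
	for (i = 0; i < n; i++) {
		out[i].ch = buf[2 * i];
		out[i].attr = buf[2 * i + 1];
	}
	return (n);
}

/* Assumed VGA text-mode layout: bits 0-3 foreground, bits 4-6 background. */
static uint8_t
attr_fg(uint8_t attr)
{
	return (attr & 0x0f);
}

static uint8_t
attr_bg(uint8_t attr)
{
	return ((attr >> 4) & 0x07);
}
```

An 80x25 screen in this layout would occupy 80 * 25 * 2 = 4000 bytes.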

If anyone else would like to try this out and has any feedback, or if
someone thinks it may be of interest to integrate into the tree please
let me know ...

Otherwise, if anyone would like to help push this into the tree in time
for 9.0, that would be great. It should be safe to MFC to 8.x as well -- as
I said we've been running it ever since 4.x days. I am sure others out
there would gain at least some (cosmetic) benefits from this!

-- Antony


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread Marius Strobl
On Wed, Jun 29, 2011 at 02:33:06PM +0400, KOT MATPOCKuH wrote:
 2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
  I've got a problem with named on FreeBSD-CURRENT/sparc64.
  Up to 5 times a day it crashes with these messages:
  27-Jun-2011 03:42:14.384 general:
  /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
  REQUIRE(prev > 0) failed
  27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
 
  I found some similar problems on alpha and IA64, which were related
  to problems with the isc_atomic_xadd() function in include/isc/atomic.h.
  But I don't understand what could be incorrect for sparc64, and
  this function has not been changed for at least 4 years...
  Uhm, we once fixed a problem in the MD atomic implementation which
  still seems to be present in the ISC copy. Could you please test whether
  the following patch makes a difference?
  http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
 
  I ran named with your patch and am watching it.
 Omg.
 Either I rebuilt named incorrectly, or the problem is not solved.
 I got a crash about 2 hours after named was restarted:
 29-Jun-2011 13:51:28.855 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)
 

The remainder of the isc atomic.h looks fine though, so this likely
is a general bug in BIND, especially if it didn't happen before
BIND 9.6-ESV-R4-P1. Doug should be able to help you.
Doug, could you please nevertheless take care of getting the above
patch into BIND? It's a merge of r148453.
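
For readers unfamiliar with why a broken atomic add would trip this kind of assertion, here is a small sketch using C11 atomics. It is not the ISC implementation: `xadd()` and `ref_decrement()` are hypothetical names, and reading the logged check as "the previous count was positive" is an assumption, since the operator is missing from the log lines as archived.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for an xadd primitive: add delta, return the value held before. */
static int
xadd(atomic_int *p, int delta)
{
	return (atomic_fetch_add(p, delta));
}

/* Drop one reference and return the remaining count. */
static int
ref_decrement(atomic_int *refs)
{
	int prev = xadd(refs, -1);

	/* Invariant: a reference must have existed before it was dropped. */
	assert(prev > 0);
	return (prev - 1);
}
```

If the primitive returned the wrong previous value, or lost an update under contention, the invariant could fail even though the callers are balanced.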

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread Doug Barton

On 06/29/2011 06:41, Marius Strobl wrote:

The remainder of the isc atomic.h looks fine though, so this likely
is a general bug in BIND, especially if it didn't happen before
BIND 9.6-ESV-R4-P1. Doug should be able to help you.
Doug, could you please nevertheless take care of getting the above
patch into BIND? It's a merge of r148453.


Hmm, I thought I had already pushed that rock up the appropriate hill, 
but maybe not. I've been following this thread, but it's incredibly 
unlikely that I'll be able to do anything useful with it until Friday.



hth,

Doug

--

Nothin' ever doesn't change, but nothin' changes much.
-- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: Thoughts on TMPFS no longer being considered highly experimental

2011-06-29 Thread Volodymyr Kostyrko

23.06.2011 19:31, David O'Brien wrote:

Does anyone object to this patch?

David Wolfskill and I have run TMPFS on a number of machines for two
years with no problems.

I may have missed something, but I'm not aware of any serious PRs on
TMPFS either.



Maybe I'm missing something, but creating/removing a large number of files
in one directory on tmpfs was very slow for me. That was long ago and
ZFS was in use, so I'll try to retest...


--
Sphinx of black quartz judge my vow.


Re: Time keeping Issues with the low-resolution TSC timecounter

2011-06-29 Thread Matt

On 06/23/11 08:25, Jung-uk Kim wrote:


 On Thursday 23 June 2011 04:21 am, Ian FREISLICH wrote:

 Jung-uk Kim wrote:

 On Tuesday 21 June 2011 04:53 pm, Jung-uk Kim wrote:

 Can you please try the attached patch?  It should disable
 TSC/TSC-low timecounter for your CPU models, I think.

 Sorry, I attached a wrong patch.  Please ignore the previous one
 and try this, instead.

 TSC-low is not presented as an option any more:

 CPU: Intel(R) Atom(TM) CPU N270   @ 1.60GHz (1596.03-MHz 686-class CPU)
   Origin = "GenuineIntel"  Id = 0x106c2  Family = 6  Model = 1c  Stepping = 2
   Features=0xbfe9fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
   Features2=0x40c39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,xTPR,PDCM,MOVBE>
   AMD Features2=0x1<LAHF>
   TSC: P-state invariant, performance statistics

 Event timer "LAPIC" quality 400
 Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
 atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
 atrtc0: Warning: Couldn't map I/O.
 Event timer "RTC" frequency 32768 Hz quality 0
 hpet0: <High Precision Event Timer> iomem 0xfed0-0xfed003ff irq 0,8 on acpi0
 Timecounter "HPET" frequency 14318180 Hz quality 950
 Event timer "HPET" frequency 14318180 Hz quality 450
 Event timer "HPET1" frequency 14318180 Hz quality 440
 Event timer "HPET2" frequency 14318180 Hz quality 440
 attimer0: <AT timer> port 0x40-0x43,0x50-0x53 on acpi0
 Timecounter "i8254" frequency 1193182 Hz quality 0
 Event timer "i8254" frequency 1193182 Hz quality 100

 kern.timecounter.choice: TSC(-1000) i8254(0) HPET(950) ACPI-fast(900) dummy(-100)

 It's already committed (r223426).

 Thanks!

 Jung-uk Kim




I had been holding off on csup on this machine for a moment:
Machine: Thinkpad SL410 Core2Duo T6570
I rm -rf /usr/src & csup'd sources yesterday.

Issues still exist with TSC-low on Intel laptop hardware. Quality was
set to 1000, but time was inaccurate. Felt like 300 baud serial console
over a very long link!

I have C2 & powerd:

/etc/sysctl.conf:
...
# Save electricity & thermal
hw.pci.do_power_nodriver=3
hw.acpi.cpu.cx_lowest=C2
dev.cpu.1.cx_lowest=C2
dev.cpu.0.cx_lowest=C2
...

/etc/rc.conf:
...
#power
powerd_enable="YES"
powerd_flags="-b adaptive -a maximum"
...

I'm getting as far as
/usr/src/sys/x86/x86:
...
if (smp_cpus > 1) {
	tsc_timecounter.tc_quality = test_smp_tsc();
	max_freq >>= 8;
...

test_smp_tsc() returns 1000, allowing TSC-low to win.
Forcing this to 50 fixed all time/speed issues, allowing HPET to win.
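
Why changing a single quality value flips the choice can be sketched as a simple selection by highest quality. This is a simplified model, not the kernel code: `struct tc` and `tc_pick()` are hypothetical names, and the only rules modelled are "highest quality wins" and "negative quality is never picked automatically".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified timecounter record: just a name and a quality score. */
struct tc {
	const char *name;
	int quality;
};

/*
 * Pick the candidate with the highest non-negative quality.
 * Returns NULL if no candidate is eligible.
 */
static const struct tc *
tc_pick(const struct tc *list, size_t n)
{
	const struct tc *best = NULL;
	size_t i;

	assert(list != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (list[i].quality < 0)
			continue;	/* never selected automatically */
		if (best == NULL || list[i].quality > best->quality)
			best = &list[i];
	}
	return (best);
}
```

With TSC-low at 1000 it beats HPET at 950; forced down to 50, HPET becomes the highest-quality candidate.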

I think the test for C3 above that may need to include additional
machines under its protection! I have no C3 support, but it's clear that
issues are occurring with TSC clocks even in C2 on Intel platforms.

Matt



Re: Time keeping Issues with the low-resolution TSC timecounter

2011-06-29 Thread Jung-uk Kim
On Wednesday 29 June 2011 05:50 pm, Matt wrote:
 I had been holding off on csup on this machine for a moment:
 Machine: Thinkpad SL410 Core2Duo T6570
 I rm -rf /usr/src & csup'd sources yesterday.

 Issues still exist with TSC-low on Intel laptop hardware. Quality
 was set to 1000, but time was inaccurate. Felt like 300 baud serial
 console over a very long link!

 I have C2 & powerd:

 /etc/sysctl.conf:
 ...
 # Save electricity & thermal
 hw.pci.do_power_nodriver=3

This had better be set from /boot/loader.conf.

 hw.acpi.cpu.cx_lowest=C2
 dev.cpu.1.cx_lowest=C2
 dev.cpu.0.cx_lowest=C2

There is no reason to do this here.  Just add a line in /etc/rc.conf:

economy_cx_lowest="C2"

 ...

 /etc/rc.conf:
 ...
 #power
 powerd_enable="YES"
 powerd_flags="-b adaptive -a maximum"
 ...

 I'm getting as far as
 /usr/src/sys/x86/x86:
 ...
 if (smp_cpus > 1) {
  tsc_timecounter.tc_quality = test_smp_tsc();
  max_freq >>= 8;
 ...

 test_smp_tsc() returns 1000, allowing TSC-low to win.
 Forcing this to 50 fixed all time/speed issues, allowing HPET to
 win.

 I think the test for C3 above that may need to include additional
 machines under its protection! I have no C3 support, but it's clear
 that issues are occurring with TSC clocks even in C2 on Intel
 platforms.

Hmm...  That's strange.  Can you show me verbose dmesg output?  Also, 
I'd like to see 'acpidump -dt' output.

Sorry about the trouble.

Jung-uk Kim