Re: carp0 interface goes down on 6.2-PRERELEASE

2006-10-12 Thread Tom Judge

Ari Suutari wrote:

Hi,

I started experimenting with carp, in order to replace
freevrrpd stuff we are currently using.

I'm running a fairly recent version of RELENG_6 (compiled
this week).

I was able to configure carp ok, but for some odd reason the
interface goes down by itself shortly after it has been configured.

Here is output from test script:

# sh -x test.sh
+ ifconfig carp0 destroy
+ ifconfig carp0 create
+ ifconfig carp0 up
+ ifconfig carp0 inet 192.168.5.59/24 vhid 55 pass xxx123
+ ifconfig carp0
carp0: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
   inet 192.168.5.59 netmask 0xffffff00
   carp: BACKUP vhid 55 advbase 1 advskew 0
+ sleep 5
+ ifconfig carp0
carp0: flags=8<LOOPBACK> mtu 1500
   carp: INIT vhid 55 advbase 1 advskew 0

See, here the interface is up, but after 5 seconds it has gone
down. Could anybody give a hint why this happens ? There are
messages on console about promiscuous mode being enabled/disabled,
but nothing else.



I have seen similar problems when the carp multicast (224.0.0.18) 
traffic was not allowed to be transmitted to the network due to a 
firewall configuration problem.
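
When chasing this kind of problem it can help to log the CARP state over time while checking the firewall. The following is a minimal sketch, not part of the original report: the helper name carp_state is made up for illustration, and it only parses the "carp:" line that ifconfig prints, as in the output quoted above.

```sh
#!/bin/sh
# carp_state: read `ifconfig carpN` output on stdin and print the CARP state
# (for example MASTER, BACKUP or INIT) taken from the "carp:" line.
# The function name is hypothetical; it is not a FreeBSD utility.
carp_state() {
    awk '$1 == "carp:" { print $2; exit }'
}

# Possible usage on a live system (interface names are assumptions):
#   while :; do ifconfig carp0 | carp_state; sleep 1; done
# In another terminal, check whether advertisements reach the wire:
#   tcpdump -n -i fxp0 host 224.0.0.18
```

If the state drops back to INIT while no traffic to 224.0.0.18 shows up on the physical interface, the firewall rules are a reasonable first suspect.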


Tom
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: bce issues still outstanding

2006-10-12 Thread Tom Judge

Scott Long wrote:

Bill Moran wrote:

I've copied many of the people who I've been working with directly on
this issue.

Can anyone provide a status update on these problems?  Discussion seems
to have stopped since Oct 5.

Any new patches to test?



I'm actively working on fixing the driver right now.

Scott



If there are any patches you want testing I have 3 very expensive 
paperweights sat around the office at the moment in the form of Dell 
PE2950's with twin adapters in them:


bce0: <Broadcom NetXtreme II BCM5708 1000Base-T (B1), v0.9.6> mem 0xf4000000-0xf5ffffff irq 16 at device 0.0 on pci9
bce0: ASIC ID 0x57081010; Revision (B1); PCI-X 64-bit 133MHz
miibus0: <MII bus> on bce0
brgphy0: <BCM5708C 10/100/1000baseTX PHY> on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bce0: Ethernet address: 00:13:72:60:35:8a

bce1: <Broadcom NetXtreme II BCM5708 1000Base-T (B1), v0.9.6> mem 0xf8000000-0xf9ffffff irq 16 at device 0.0 on pci5
bce1: ASIC ID 0x57081010; Revision (B1); PCI-X 64-bit 133MHz
miibus1: <MII bus> on bce1
brgphy1: <BCM5708C 10/100/1000baseTX PHY> on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bce1: Ethernet address: 00:13:72:60:35:88

The easiest way to cause the watchdog timeout on these systems is to
write a file larger than 50MB to an NFS file system that is mounted over
UDP (strangely, TCP doesn't cause the timeout).  From the testing I have
done, the timeout is triggered when around 49MB has been copied to the
NFS server. Perhaps this suggests a bug in the UDP checksum offload
code?
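
The reproduction step can be scripted roughly as follows. This is a sketch under assumptions: the share is already mounted over UDP at a hypothetical /mnt/nfs, and the helper name and test file name are made up for illustration.

```sh
#!/bin/sh
# write_test_file DIR SIZE_MB: write SIZE_MB mebibytes of zeros to
# DIR/bce-test.bin and print the resulting file size in bytes.
# DIR is expected to be the NFS mount point under test.
write_test_file() {
    dir=$1
    size_mb=${2:-64}
    out="$dir/bce-test.bin"
    # bs is given in bytes so the same command works with BSD and GNU dd.
    dd if=/dev/zero of="$out" bs=1048576 count="$size_mb" 2>/dev/null || return 1
    wc -c < "$out" | tr -d ' '
}

# Possible usage (mount point is an assumption):
#   write_test_file /mnt/nfs 64
```

A 64MB file is comfortably past the roughly 49MB point at which the timeout was observed.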


I have 2 systems available to test on right now,  one running code from 
RELENG_6 just after beta2 was announced, and one cvsup'd from the uk 
mirror today.


The kernels are compiled with:

option INVARIANTS
option INVARIANT_SUPPORT

However this doesn't cause a kernel panic.

Let me know if you need any more information.

Tom


Re: carp0 interface goes down on 6.2-PRERELEASE

2006-10-13 Thread Tom Judge

Ari Suutari wrote:

Ari Suutari wrote:

I have now tested with real hardware (ethernet is fxp0) and
under VmWare (ethernet is lnc0). Same problem on both.


I'll have to correct this. Carp works with fxp0. Problem is
only under vmware, which makes me more and more suspect
that it is because lnc0 does not support link state reporting
(it seems to be present on only a few drivers).

I do remember seeing this problem when developing some systems in VMware:
the carp interfaces were always in INIT when the system booted. I
added a small rc script to force the interfaces up using 'ifconfig carp0
up', which seemed to make the interfaces come up. However, if one system
is unplugged from the network it will automatically put itself into the
master state until it can talk to the other servers in the carp group.
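
A minimal sketch of such a script, assuming the interfaces are called carp0 and carp1. The function name and the IFCONFIG override are illustrative only and are not taken from the original rc script.

```sh
#!/bin/sh
# force_carp_up IFACE...: bring each named carp interface up.
# IFCONFIG can be overridden (for example with "echo") for a dry run.
force_carp_up() {
    for ifn in "$@"; do
        "${IFCONFIG:-ifconfig}" "$ifn" up
    done
}

# Possible usage from a local startup script (interface names are assumptions):
#   force_carp_up carp0 carp1
```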


Tom J


Re: When will the new BCE driver in HEAD be incorporated into RELENG_6?

2006-10-18 Thread Tom Judge

Kris Kennaway wrote:

On Wed, Oct 18, 2006 at 06:15:14PM +0100, Jason Thomson wrote:
  

Using the driver from HEAD* in the latest RELENG_6 didn't fix our
problems.

We could still trigger the Watchdog timeout when copying a local file to
an NFS mounted filesystem (UDP mount, GigE speeds).

It was also possible to trigger this bug with multiple simultaneous TCP
streams,  but that took a little longer.

Copying a local file to an NFS/UDP filesystem would trigger the bug in a
few seconds.

If there's anything we can do to help debug this,  please let us know.



Per my previous mails, the (known) bce watchdogs are symptoms of
driver bugs which can be usefully converted into panics by enabling
INVARIANTS and INVARIANT_SUPPORT.  Please do so, then report what
happens.

Kris
  

Hi Kris,

I am a colleague of Jason's. When we were testing this patch, the kernel
used had both options set:


option INVARIANTS
option INVARIANT_SUPPORT

However, on several boxes we have failed to cause a kernel panic, only
watchdog timeouts.  The last crash that we reproduced did trigger
several

bce: need to defrag

messages on the console before the watchdog timeout occurred.


Tom



Re: Gmirror question

2006-10-25 Thread Tom Judge

Guido van Rooij wrote:

Is it possible to use gmirror to mirror a single BSD partition?
If not: is it possible with other tools?

-Guido
  


Yes it is possible, I found this site had very good examples of setting 
it up:


http://people.freebsd.org/~rse/mirror/

Tom


Re: Gmirror question

2006-10-25 Thread Tom Judge

Guido van Rooij wrote:

On Wed, Oct 25, 2006 at 11:07:14AM +0100, Tom Judge wrote:
  

Guido van Rooij wrote:


Is it possible to use gmirror to mirror a single BSD partition?
If not: is it possible with other tools?
  
Yes it is possible, I found this site had very good examples of setting 
it up:


http://people.freebsd.org/~rse/mirror/




This only documents how to set it up on an entire disk or on a slice.

-Guido

The instructions for mirroring a slice can be modified slightly (by
using the correct devices) to mirror a single partition.


Tom


Re: Gmirror question

2006-10-25 Thread Tom Judge

Guido van Rooij wrote:

On Wed, Oct 25, 2006 at 11:34:25AM +0100, Tom Judge wrote:
  

Guido van Rooij wrote:


On Wed, Oct 25, 2006 at 11:07:14AM +0100, Tom Judge wrote:
 
  

Guido van Rooij wrote:
   


Is it possible to use gmirror to mirror a single BSD partition?
If not: is it possible with other tools?
 
  
Yes it is possible, I found this site had very good examples of setting 
it up:


http://people.freebsd.org/~rse/mirror/

   


This only documents how to set it up on an entire disk or on a slice.

-Guido
  
The instructions for mirroring a slice can be modified slightly (by
using the correct devices) to mirror a single partition.



Please tell me then how these modified instructions look.
I still do not know how to reserve the last sector of the partition. 
I can create 1-sector holes using bsdlabel, but I'm not sure this

would be the way to go...

-Guido
  


Please read Patrick's reply to your earlier post regarding reserving blocks.

--SNIP--

You don't need to.

If your bsdlabel partition is N sectors in size, the gmirror
object will have size N - 1. Newfs will not be able to write
to that last sector. You newfs the finished mirror device,
not the individual partitions.

--SNIP--

Tom
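
The N - 1 arithmetic can be sketched as below. The helper names and the 512-byte default sector size are assumptions for illustration, as are the device and mirror names in the usage comments.

```sh
#!/bin/sh
# mirror_sectors N: sectors available in a gmirror object built on an
# N-sector partition (the last sector is reserved for gmirror metadata).
mirror_sectors() {
    echo $(( $1 - 1 ))
}

# mirror_bytes N [SECTOR_SIZE]: the same figure in bytes; the sector size
# defaults to 512, which is an assumption about the disk.
mirror_bytes() {
    echo $(( ($1 - 1) * ${2:-512} ))
}

# Possible usage on a real system (device and mirror names are hypothetical):
#   gmirror label -v gm0a /dev/ad0s1a
#   newfs /dev/mirror/gm0a
```

For a 1GB partition of 2097152 sectors this gives 2097151 usable sectors, and because newfs is run on the mirror device it never sees the reserved sector.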





Re: SAS Raid - mfi driver

2006-10-31 Thread Tom Judge

Fredrik Widlund wrote:

Ivan Voras wrote:
  

Several:

- are there cache differences between the controllers (amount of
memory, cache policy)?


Default settings on both.
  

- how does writing directly to the device (bypassing file system)
compare?


Drives are four seagate 7200.10 400GB in a Raid-5 configuration.

[/mnt/test (/dev/mfid0p1 mounted)]
read: 200MB/s
write: 15MB/s

[/dev/mfid0p1]
read: 200MB/s
write: 8MB/s

[/dev/mfid0]
read: 200MB/s
write: 10MB/s

Kind regards,
Fredrik Widlund


  
Have you tried setting the write cache policy to write-back rather than
write-through?


Tom


FreeBSD 6.1 IPsec Path MTU Discovery

2006-11-07 Thread Tom Judge

Hi,

I am seeing some problems with IPsec encrypted gif tunnels and path MTU
discovery.

It seems that the router with the IPsec tunnel sends an ICMP "need to
frag" packet with the next-hop MTU set to 0. This causes ssh to
retransmit the same packet without reducing the size of the data payload.


Is this a known problem? If so, are there any known workarounds?

Tom

Network Layout:

Box 1 --(lan)-- Router 1 --(lan)-- Router 2 --(Ipsec tunnel)-- Router 3 
--(lan) --- Box 2


Box 1: FreeBSD 5.4
Router [123]: FreeBSD 6.1
Box 2: Linux 2.6



PING Test from box 1 to box 2 with do not fragment set and a packet 
larger than the path MTU:


box1# ping -s 1280 -D box2
PING box2 (10.0.0.79): 1280 data bytes
36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
4  5  00 051c b454   0   40  01 c9fc 172.17.1.48  10.0.0.79

36 bytes from router2 (172.17.3.6): frag needed and DF set (MTU 0)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
4  5  00 1c05 b454   0   3f  01 cafc 172.17.1.48  10.0.0.79

36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
4  5  00 051c b45f   0   40  01 c9f1 172.17.1.48  10.0.0.79

36 bytes from router2 (172.17.3.6): frag needed and DF set (MTU 0)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
4  5  00 1c05 b45f   0   3f  01 caf1 172.17.1.48  10.0.0.79

^C
--- box2 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

PING Test from box 1 to box 2 with do not fragment set and a packet 
smaller than the path MTU:


box1# ping -s 1200 -D box2
PING box2 (10.0.0.79): 1200 data bytes
36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
4  5  00 04cc b472   0   40  01 ca2e 172.17.1.48  10.0.0.79

1208 bytes from 10.0.0.79: icmp_seq=0 ttl=61 time=111.017 ms
36 bytes from router1 (172.17.3.5): Redirect Host(New addr: 172.17.3.6)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
4  5  00 04cc b479   0   40  01 ca27 172.17.1.48  10.0.0.79

1208 bytes from 10.0.0.79: icmp_seq=1 ttl=61 time=110.419 ms
^C
--- box2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 110.419/110.718/111.017/0.299 ms
box1# 




Relevant interface configuration on box1 (from ifconfig):

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
   options=b<RXCSUM,TXCSUM,VLAN_MTU>
   inet 172.17.1.48 netmask 0xffff0000 broadcast 172.17.255.255
   ether 00:0f:1f:fa:d1:b5
   media: Ethernet autoselect (1000baseTX full-duplex)
   status: active



Relevant interface configuration on router2 (from ifconfig):

em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
   options=b<RXCSUM,TXCSUM,VLAN_MTU>
   inet 172.17.3.6 netmask 0xffff0000 broadcast 172.17.255.255
   ether 00:c0:9f:12:13:1b
   media: Ethernet autoselect (1000baseTX full-duplex)
   status: active
gif0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
   tunnel inet 63.174.175.252 --> 82.195.173.206
   inet 192.168.174.10 --> 192.168.174.9 netmask 0xfffffffc
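
The two ping sizes were chosen to land on either side of the gif0 MTU. The arithmetic is sketched below; the helper names are made up for illustration, and the 20-byte IPv4 header assumes no IP options.

```sh
#!/bin/sh
# icmp_packet_len PAYLOAD: total IPv4 packet length produced by
# `ping -s PAYLOAD` (payload + 8-byte ICMP header + 20-byte IPv4 header).
icmp_packet_len() {
    echo $(( $1 + 8 + 20 ))
}

# fits_mtu PAYLOAD MTU: succeed if that packet fits in MTU without
# fragmentation.
fits_mtu() {
    [ "$(icmp_packet_len "$1")" -le "$2" ]
}

# Possible usage:
#   icmp_packet_len 1280    # 1308 bytes, 0x051c as in the Len column above
#   icmp_packet_len 1200    # 1228 bytes, 0x04cc
#   fits_mtu 1280 1280 || echo "needs fragmentation on gif0"
```

With -s 1280 the packet is 1308 bytes and exceeds the 1280-byte gif0 MTU, so router2 has to answer with "frag needed"; with -s 1200 the 1228-byte packet passes.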




Re: FreeBSD 6.1 IPsec Path MTU Discovery

2006-11-09 Thread Tom Judge

Johann Hugo wrote:

On Tuesday 07 November 2006 19:39, Tom Judge wrote:

I'm seeing the same problem on my gif tunnel.

For an interim workaround you can reduce the MTU size between Box1 and Box2,
e.g. route change Box2 -mtu 1200. After it starts working you can change it
back to 1500 and it keeps on working.

Don't ask me why it works, I'm still trying to figure out what the problem is.


Johann


I have a patch for the problem.  It is related to a broken piece of code
that is supposed to calculate the MTU using the size of the IP header
and the size of the IPsec header.  However, when the IPsec security
policy is fetched some required sections are null and the code block
fails completely.  The following patch fixes the problem for me, as it
allows the code to fall through to the standard MTU calculation using
either the destination interface MTU or by calculating the next smallest
RFC-defined MTU.


It would be interesting to see if this patch works for you. I have
submitted it on the open PR but have not had a response yet.



Tom J

PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/91412

Patch:

Index: sys/netinet/ip_input.c
===================================================================
--- sys/netinet/ip_input.c	(revision 24)
+++ sys/netinet/ip_input.c	(working copy)
@@ -1990,8 +1990,8 @@
 #else /* FAST_IPSEC */
 				KEY_FREESP(&sp);
 #endif
-				ipstat.ips_cantfrag++;
-				break;
+//				ipstat.ips_cantfrag++;
+//				break;
 			}
 		}
 #endif /*IPSEC || FAST_IPSEC*/



Re: adaptec utilities on amd64?

2006-11-17 Thread Tom Judge

Scott Long wrote:

Vivek Khera wrote:
Some time long ago, someone posted a very short C program that probes 
the LSI controller and spits out this kind of output:


[EMAIL PROTECTED] amrstat
Drive 0:34.18 GB, RAID1 writeback,no-read-ahead,no-adaptative-io 
optimal
Drive 1:   102.54 GB, RAID1 writeback,no-read-ahead,no-adaptative-io 
optimal


This is the kind of output I'd love to get from my adaptec 
controllers, too.  This can be trivially scripted and hooked into a 
monitoring system like nagios.


The aaccli tool is a curses based app (despite the cli in the name) 
and scripting it is damn near impossible.  It doesn't even read 
commands from stdin!




Yes, scripting it is possible, and it does have a non-interactive mode.
Try the following:

printf 'open aac0\ncontroller details\nexit\n' | aaccli



I have been trying to find a similar solution for aac controllers that
will display the status of the volumes that the controller has
configured (e.g. Optimal, Degraded, Failed, etc.).  So far the only way
I have found to do it is to store the output of:


printf 'open aac0\ncontainer list\nexit\n' | aaccli

for a known good status and then periodically compare the output of the
above command against it.
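
The baseline comparison could be wrapped up along these lines. This is only a sketch: the function names and the baseline path are hypothetical, the return codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN), and it assumes the aaccli output contains nothing that changes between runs, such as timestamps.

```sh
#!/bin/sh
# check_against_baseline BASELINE CMD...: run CMD and compare its output with
# the known-good output stored in the file BASELINE.
# Returns 0 if identical, 2 if different, 3 if CMD itself failed.
check_against_baseline() {
    baseline=$1
    shift
    current=$("$@") || return 3
    if [ "$current" = "$(cat "$baseline")" ]; then
        echo "OK: output matches baseline"
        return 0
    fi
    echo "CRITICAL: output differs from baseline"
    return 2
}

# aac_container_list: the command from the message above, wrapped in a
# function so it can be passed to check_against_baseline.
aac_container_list() {
    printf 'open aac0\ncontainer list\nexit\n' | aaccli
}

# Possible usage (the baseline path is an assumption):
#   aac_container_list > /var/db/aac0.baseline      # once, while healthy
#   check_against_baseline /var/db/aac0.baseline aac_container_list
```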


Is there a better way to do this?

Tom







Removing unused core components. (Disabled in make.conf)

2007-01-17 Thread Tom Judge

Hi,

I have the following options in /etc/make.conf:

NO_PROFILE=true
NO_SENDMAIL=true
NO_GAMES=true
NO_I4B=true
NO_ATM=true
NO_INET6=true
NO_BLUETOOTH=true
NO_IPFILTER=true
NO_RCMDS=true
NO_KERBEROS=true


However, after a make buildworld installworld the utilities and libraries
associated with these components are still installed. Is there an easy
way to remove them from the system?


Thanks

Tom


Re: Boot prompt for Intel AMT

2007-03-06 Thread Tom Judge

Artem Kuchin wrote:

I hope some people will understand what I am talking about, because
the technology, I think, is not very popular, but can come in VERY handy.

Intel AMT Serial over LAN (SOL; why is it called 'over LAN' if it is
really 'OVER IP'?) allows booting into the BIOS of a remote machine
and even, as seen in their demo, can be used to control an MS-DOS prompt.


well because it isn't using IP, besides SOIP is uninspiring :)


Wait... how so? I was sure that the whole SOL (IPMI) protocol runs over
IP and I can REMOTELY (e.g. from another planet with an IP connection) access
the machine in the data center. If I can do such a thing, then it DOES run
over IP eventually, doesn't it?

Anyway, nobody said anything about getting the FreeBSD boot prompt over SOL.
My guess is that this is THE MOST useful application of SOL for remote upgrades.
I understand that this is not as simple as sending data to a UART. This
must be done explicitly in the boot loader, I think. But why not do it?



We have a number of Dell PowerEdge 2950s that we boot using the built-in
SOL, which does run over IP as we use it across a routed VPN backbone
(server in the data center, console in the office).  We have found that
the IPMI serial port is connected to the system as COM2, which we select
in the BIOS configuration.  We then set device.hints so that the FreeBSD
console uses the same port.  We use the open source ipmitool to
access the IPMI controller, and serial port, on the system.
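
The exact hints used on these servers are not given in the message, so the following is only a sketch of what moving the console flag to COM2 might look like with the sio driver. The function name is made up, and the flag value 0x10 is the console flag used in the stock /boot/device.hints; the loader side (console="comconsole" in /boot/loader.conf) is covered in the handbook's serial console chapter.

```sh
#!/bin/sh
# console_hints UNIT: print device.hints lines that move the console flag
# (0x10) from sio0 to sio UNIT. UNIT defaults to 1, i.e. COM2.
console_hints() {
    unit=${1:-1}
    if [ "$unit" != 0 ]; then
        printf 'hint.sio.0.flags="0x0"\n'
    fi
    printf 'hint.sio.%s.flags="0x10"\n' "$unit"
}

# Possible usage (review the output before merging it into /boot/device.hints):
#   console_hints 1
# Attaching to the SOL console from the office side (host name and user
# are assumptions):
#   ipmitool -I lanplus -H bmc.example.org -U root sol activate
```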


Tom


Re: Boot prompt for Intel AMT

2007-03-06 Thread Tom Judge

Artem Kuchin wrote:

Artem Kuchin wrote:

<snip/>
We have a number of Dell PowerEdge 2950s that we boot using the built-in
SOL, which does run over IP as we use it across a routed VPN
backbone (server in the data center, console in the office).  We have
found that the IPMI serial port is connected to the system as COM2,
which we select in the BIOS configuration.  We then set device.hints
so that the FreeBSD console uses the same port.  We use the
open source ipmitool to access the IPMI controller, and serial port,
on the system.


Aha! This is something already.

When our system boots it says:
Mar  5 23:59:04 aaa kernel: sio0: configured irq 4 not in bitmap of probed irqs 0
Mar  5 23:59:04 aaa kernel: sio0: port may not be enabled
Mar  5 23:59:04 aaa kernel: sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
Mar  5 23:59:04 aaa kernel: sio0: type 16550A
Mar  5 23:59:04 aaa kernel: sio1: configured irq 3 not in bitmap of probed irqs 0
Mar  5 23:59:04 aaa kernel: sio1: port may not be enabled

My guess is that sio0 is the real port and sio1 is the IPMI port of
Intel AMT. But what does this message really say? What must I do to
enable the port?
The other question: do I need to include

device ipmi

in the kernel config? And how do I tell the boot loader to redirect its
output to the serial port? Sorry, I have been working with FreeBSD for
10 years now but have never touched this issue.



This is the dmesg from our servers (sio1 is the IPMI SOL port):
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 flags 0x10 on acpi0
sio1: type 16550A, console

As far as I know it is not required to have ipmi in the kernel (we don't
have it in our kernels) to use the SOL port.


This page should help you set up the serial console:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/serialconsole-setup.html

Tom


Re: Xen Dom0, are we making progress?

2007-03-13 Thread Tom Judge

Nikolas Britton wrote:

On 3/12/07, Andras Gót [EMAIL PROTECTED] wrote:

Nikolas Britton wrote:
 On 3/12/07, Ronald Klop [EMAIL PROTECTED] wrote:
 On Mon, 12 Mar 2007 20:16:32 +0100, Nikolas Britton
 [EMAIL PROTECTED] wrote:

  Is FreeBSD making any progress in Xen Dom0 / Intel VT support? I'd
  really like to consolidate some underutilized FreeBSD servers. Are
  there any alternative solutions that will enable me to do this kind of
  stuff with FreeBSD, or would it be better to go with Solaris Dom0 +
  FreeBSD DomU?

 http://docs.freebsd.org/44doc/papers/jail/jail.html
 google: jail freebsd


 Yes I'd like to know more about jails, is there a high level /
 executive summary type document that I can read somewhere? From what I
 remember jails are mostly designed to partition stuff... for security
 reasons.

 What I'd really love to do is split up each service (httpd, postgres,
 samba/nfs,  ldap/nis, asterisk, etc.) into discrete virtual machines.
 It's too much work trying to make them all play nice on one system,
 especially during upgrades. As it is right now I don't upgrade any
 services once a system is in production use.

Hi,

To start with, read man jail. :) Apache, bind, mysql and postfix run fine in
a jail. For postgres you have to turn on SysV IPC for jails
(security.jail.sysvipc_allowed).
This is basically not so bad, but it definitely reduces security. For
samba/nfs/ldap/nis and asterisk I don't have the experience, but if they
do not need IPC, they'll run fine out of the box. In jails I suggest that
you mount your ports tree with a nullfs mount. With this you'll save
some disk space. (The installed port list is in /var, not in
/usr/ports.) In jails you can't do resource control, so keep that in
mind.




Is there any way to transfer jails on the fly between systems? For
example, say I wanted to transfer the http service to a more powerful
box because load was too high, can you do stuff like this?



You could export the jail file system via NFS, or use some other form of
shared storage to share the file system.  I have seen systems that put
the jail IP address onto the loopback interface and then use OSPF to
advertise the service to your border routers.  If your storage subsystem
supports it (NFS will) you can have both jails up and running at the
same time and then just change the routing advertisements to move the
service.


Just an idea; I have never tried it, but I did see a failover
project that used the methods above.  The project advertised the fact
that not only can you move services between hosts, but you can also move
them between physical sites if your routers all run OSPF.


Tom



NFS Mount problems

2007-03-28 Thread Tom Judge

Hi,


I have an HA NFS server setup, but I am having some problems with
mounting the NFS shares.  I have had to patch mountd to allow it to be
configured with an IP to bind to. It's a bit of a quick hack (no docs,
IPv6, etc.) but solves the problem for us where mountd sends the
packets from the wrong IP. (See patch below.)



The NFS server IP is 172.31.0.200 and we have the following flags set in 
rc.conf:


nfs_server_flags="-u -t -n 4 -h 172.31.0.200"
rpcbind_flags="-h 172.31.0.200"
mountd_flags="-r -p 832 -h 172.31.0.200"

When I try to mount the share I get the following error from the command:

maverick# mount nfs-server:/usr/home /usr/home
[udp] nfs-server:/usr/home: RPCPROG_MNT: RPC: Timed out
[udp] nfs-server:/usr/home: RPCPROG_MNT: RPC: Timed out

And the following data from tcpdump on the server:

[EMAIL PROTECTED] /usr/home/mintel]# tcpdump -n 'ip host 172.31.0.2 and ip host 172.31.0.200'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 68 bytes
17:10:58.321858 IP 172.31.0.2.906 > 172.31.0.200.111: UDP, length 56
17:10:58.322018 IP 172.31.0.200.111 > 172.31.0.2.906: UDP, length 28
17:10:58.322231 IP 172.31.0.2.1175084341 > 172.31.0.200.2049: 40 null
17:10:58.322280 IP 172.31.0.200.2049 > 172.31.0.2.1175084341: reply ok 24 null
17:10:58.322481 IP 172.31.0.2.921 > 172.31.0.200.111: [|lwres]
17:10:58.322560 IP 172.31.0.200.111 > 172.31.0.2.921: [|lwres]
17:10:58.322731 IP 172.31.0.2.854 > 172.31.0.200.832: UDP, length 112
17:11:13.324547 IP 172.31.0.200.832 > 172.31.0.2.854: UDP, length 68
17:11:13.324652 IP 172.31.0.2 > 172.31.0.200: ICMP 172.31.0.2 udp port 854 unreachable, length 36



I can reproduce the problem on a number of 6.2-RELEASE systems (i386/amd64).

Has anyone seen this before,  or know of a fix?

Thanks

Tom



/usr/src/sys/usr.sbin/mountd/
Index: mountd.c
===================================================================
--- mountd.c	(revision 37)
+++ mountd.c	(working copy)
@@ -257,7 +257,7 @@
 	fd_set readfds;
 	struct sockaddr_in sin;
 	struct sockaddr_in6 sin6;
-	char *endptr;
+	char *endptr, *svcaddr;
 	SVCXPRT *udptransp, *tcptransp, *udp6transp, *tcp6transp;
 	struct netconfig *udpconf, *tcpconf, *udp6conf, *tcp6conf;
 	pid_t otherpid;
@@ -290,7 +290,7 @@
 			errx(1, "NFS server is not available or loadable");
 	}
 
-	while ((c = getopt(argc, argv, "2dlnp:r")) != -1)
+	while ((c = getopt(argc, argv, "2dlnp:rh:")) != -1)
 		switch (c) {
 		case '2':
 			force_v2 = 1;
@@ -307,6 +307,9 @@
 		case 'l':
 			dolog = 1;
 			break;
+		case 'h':
+			svcaddr = optarg;
+			break;
 		case 'p':
 			endptr = NULL;
 			svcport = (in_port_t)strtoul(optarg, &endptr, 10);
@@ -392,6 +395,7 @@
 		sin.sin_len = sizeof(struct sockaddr_in);
 		sin.sin_family = AF_INET;
 		sin.sin_port = htons(svcport);
+		sin.sin_addr.s_addr = inet_addr(svcaddr);
 
 		bzero(&sin6, sizeof(struct sockaddr_in6));
 		sin6.sin6_len = sizeof(struct sockaddr_in6);


Re: [SOLVED] NFS Mount problems

2007-03-28 Thread Tom Judge

Tom Judge wrote:

Hi,


I have an HA NFS server setup, but I am having some problems with
mounting the NFS shares.  I have had to patch mountd to allow it to be
configured with an IP to bind to. It's a bit of a quick hack (no docs,
IPv6, etc.) but solves the problem for us where mountd sends the
packets from the wrong IP. (See patch below.)



The NFS server IP is 172.31.0.200 and we have the following flags set in 
rc.conf:


nfs_server_flags="-u -t -n 4 -h 172.31.0.200"
rpcbind_flags="-h 172.31.0.200"
mountd_flags="-r -p 832 -h 172.31.0.200"

When I try to mount the share I get the following error from the command:

maverick# mount nfs-server:/usr/home /usr/home
[udp] nfs-server:/usr/home: RPCPROG_MNT: RPC: Timed out
[udp] nfs-server:/usr/home: RPCPROG_MNT: RPC: Timed out

And the following data from tcpdump on the server:

[EMAIL PROTECTED] /usr/home/mintel]# tcpdump -n 'ip host 172.31.0.2 and ip host 172.31.0.200'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 68 bytes
17:10:58.321858 IP 172.31.0.2.906 > 172.31.0.200.111: UDP, length 56
17:10:58.322018 IP 172.31.0.200.111 > 172.31.0.2.906: UDP, length 28
17:10:58.322231 IP 172.31.0.2.1175084341 > 172.31.0.200.2049: 40 null
17:10:58.322280 IP 172.31.0.200.2049 > 172.31.0.2.1175084341: reply ok 24 null
17:10:58.322481 IP 172.31.0.2.921 > 172.31.0.200.111: [|lwres]
17:10:58.322560 IP 172.31.0.200.111 > 172.31.0.2.921: [|lwres]
17:10:58.322731 IP 172.31.0.2.854 > 172.31.0.200.832: UDP, length 112
17:11:13.324547 IP 172.31.0.200.832 > 172.31.0.2.854: UDP, length 68
17:11:13.324652 IP 172.31.0.2 > 172.31.0.200: ICMP 172.31.0.2 udp port 854 unreachable, length 36



I can reproduce the problem on a number of 6.2-RELEASE systems (i386/amd64).

Has anyone seen this before,  or know of a fix?

After running ktrace on the mountd process during the mount process, I 
noticed that mountd was trying to do a reverse DNS lookup on the 
address.  The problem was that the DNS server was incorrectly configured 
and not responding to the request.


Adding the client to the server hosts file fixed the problem.
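
For anyone hitting the same timeout, the hosts file change can be made idempotent with something like the sketch below. The function name is made up, and pairing the name maverick with 172.31.0.2 is an assumption based on the mount prompt and the tcpdump output above.

```sh
#!/bin/sh
# ensure_hosts_entry FILE IP NAME: append "IP<TAB>NAME" to FILE unless a line
# whose first field is IP already exists.
ensure_hosts_entry() {
    file=$1
    ip=$2
    name=$3
    # awk exits 0 only if some line starts with exactly this address.
    if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Possible usage on the NFS server (name/address pairing is an assumption):
#   ensure_hosts_entry /etc/hosts 172.31.0.2 maverick
```

With the entry in place the reverse lookup is answered locally and no longer depends on the DNS server responding.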

Sorry for the noise.

Tom


Re: ggate + gmirror write performance woes

2007-04-05 Thread Tom Judge

Dmitriy Kirhlarov wrote:

On Thu, Apr 05, 2007 at 10:58:56AM -0400, Sven Willenberger wrote:

I am trying to set up a HA type system involving two identical boxes and
have gone through the following to set up the systems:

Slave server: 
ggated -R 196608 -S 196608

(exporting /dev/amrd1 )
net.inet.tcp.sendspace: 65536
net.inet.tcp.recvspace: 131072


Try
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535

Also, try increasing these sysctls with
net.inet.tcp.rfc1323=1

I use it on FreeBSD 5.x with:
net.inet.tcp.sendspace=131072
net.inet.tcp.recvspace=131072
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535

ggated -R 1048576 -S 1048576
ggatec -R 1048576 -S 1048576

WBR.
Dmitriy



I have seen sustained writes of 30Mb/s using the following configuration:

cat /boot/loader.conf
kern.ipc.nmbclusters=32768

cat /etc/sysctl.conf
net.inet.tcp.sendspace=1048576
net.inet.tcp.recvspace=1048576

Server:
/sbin/ggated -S 1310720 -R 1310720 -a 172.31.0.18 /etc/gg.exports

Client:
/sbin/ggatec create -q 2048 -t 5 -S 1310720 -R 1310720 172.31.0.18 
/dev/amrd0s2


The RAID array is a RAID 1 volume on a Dell PERC4 (Dell PE1850) with
adaptive read-ahead and write-back caching.


Tom


Repeatable crash with mkdir causing a divide by zero error

2007-04-06 Thread Tom Judge

Hi,

I have seen some problems with a new file system that I created
yesterday, in that I could repeatedly get the system to crash with a
mkdir.


Here is the disk information:
mfid1: <MFI Logical Disk> on mfi1
mfid1: 5716992MB (11708399616 sectors) RAID volume 'Images' is optimal

I created a new file system tuned for 64k blocks, an average file size
of 1MB, and 2500 files per directory.


newfs -b 65535 -g 1048576 -h 2500 /dev/mfid1p1
mount /dev/mfid1p1 /compere
mkdir /compere/images
mkdir /compere/images/1999

(Also tested with mkdir test; mkdir test/1998)

The system is an amd64 system running 6.2-RELEASE and the pmap.c patch.
I have 3 cores caused by 3 different apps (rsync, gmkdir, mkdir) and
can provide more information if required.  I have attached a
backtrace. Unfortunately I cannot do any more testing as the system is now
in testing (newfs -b 65535 -g 1048576 /dev/mfid1p1 was used and seems not
to cause the bug).



kgdb /usr/obj/usr/src/sys/PE2950/kernel.debug /var/crash/vmcore.2
[GDB will not be able to debug user-mode threads: 
/usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x8:0x80391347
stack pointer   = 0x10:0xa78736f0
frame pointer   = 0x10:0xff0001d7a600
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1206 (mkdir)
trap number = 18
panic: integer divide fault
cpuid = 0
Uptime: 4m29s
Dumping 1023 MB (2 chunks)
  chunk 0: 1MB (156 pages) ... ok
  chunk 1: 1023MB (261800 pages) 1007 991 975 959 943 927 911 895 879 
863 847 831 815 799 783 767 751 735 719 703 687 671 655 639 623 607 591 
575 559 543 527 511 495 479 463 447 431 415 399 383 367 351 335 319 303 
287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15


#0  doadump () at pcpu.h:172
172 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:172
#1  0x0004 in ?? ()
#2  0x8029a557 in boot (howto=260) at 
/usr/src/sys/kern/kern_shutdown.c:409
#3  0x8029abf1 in panic (fmt=0xff0029753000 X?/) at 
/usr/src/sys/kern/kern_shutdown.c:565
#4  0x803f62ff in trap_fatal (frame=0xff0029753000, 
eva=18446742974994109272) at /usr/src/sys/amd64/amd64/trap.c:660

#5  0x803f67a2 in trap (frame=
  {tf_rdi = 0, tf_rsi = 0, tf_rdx = 0, tf_rcx = 1951858688, tf_r8 = 
2500, tf_r9 = 2975, tf_rax = 1951858688, tf_rbx = -2050457600, tf_rbp = 
-1099480717824, tf_r10 = 246016, tf_r11 = 184512, tf_r12 = 
-1098707543808, tf_r13 = 246015, tf_r14 = -2050457600, tf_r15 = 255, 
tf_trapno = 18, tf_addr = 0, tf_flags = 2147483648012, tf_err = 0, 
tf_rip = -2143743161, tf_cs = 8, tf_rflags = 66182, tf_rsp = 
-1484310784, tf_ss = 16}) at /usr/src/sys/amd64/amd64/trap.c:469
#6  0x803e1a6b in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:168
#7  0x80391347 in ffs_valloc (pvp=0xff002f24d7c0, 
mode=16877, cred=0x0, vpp=0xa7873798) at libkern.h:56
#8  0x803b8a5e in ufs_mkdir (ap=0xa78739a0) at 
/usr/src/sys/ufs/ufs/ufs_vnops.c:1386
#9  0x8043b355 in VOP_MKDIR_APV (vop=0x7457, 
a=0xa78739a0) at vnode_if.c:1251
#10 0x80310e19 in kern_mkdir (td=0xff002f24d7c0, 
path=0xff003dabe400 , segflg=4, mode=511) at vnode_if.h:653

#11 0x803f7151 in syscall (frame=
  {tf_rdi = 140737488348678, tf_rsi = 511, tf_rdx = 4294967295, 
tf_rcx = 1, tf_r8 = 0, tf_r9 = 140737488347272, tf_rax = 136, tf_rbx = 
2, tf_rbp = 140737488348024, tf_r10 = 4294967295, tf_r11 = 582, tf_r12 = 
140737488348678, tf_r13 = 140737488348008, tf_r14 = 0, tf_r15 = 0, 
tf_trapno = 12, tf_addr = 34367037072, tf_flags = 0, tf_err = 2, tf_rip 
= 34367037084, tf_cs = 43, tf_rflags = 518, tf_rsp = 140737488347720, 
tf_ss = 35})

at /usr/src/sys/amd64/amd64/trap.c:792
#12 0x803e1c08 in Xfast_syscall () at 
/usr/src/sys/amd64/amd64/exception.S:270

#13 0x0008006f5e9c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) frame 7
#7  0x80391347 in ffs_valloc (pvp=0xff002f24d7c0, 
mode=16877, cred=0x0, vpp=0xa7873798) at libkern.h:56
56  static __inline u_int min(u_int a, u_int b) { return (a < b ? a : b); }

(kgdb) list
51  static __inline int imax(int a, int b) { return (a > b ? a : b); }
52  static 

Re: ggate + gmirror write performance woes

2007-04-06 Thread Tom Judge

Sven Willenberger wrote:

On Thu, 2007-04-05 at 17:38 +0100, Tom Judge wrote:

Dmitriy Kirhlarov wrote:

On Thu, Apr 05, 2007 at 10:58:56AM -0400, Sven Willenberger wrote:

I am trying to set up a HA type system involving two identical boxes and
have gone through the following to set up the systems:

Slave server: 
ggated -R 196608 -S 196608

(exporting /dev/amrd1 )
net.inet.tcp.sendspace: 65536
net.inet.tcp.recvspace: 131072

Try
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535

Also, try increasing these sysctls together with
net.inet.tcp.rfc1323=1

I use it on FreeBSD 5.x with:
net.inet.tcp.sendspace=131072
net.inet.tcp.recvspace=131072
net.local.stream.recvspace=65535
net.local.stream.sendspace=65535

ggated -R 1048576 -S 1048576
ggatec -R 1048576 -S 1048576

WBR.
Dmitriy


I have seen sustained writes of 30Mb/s using the following configuration:

cat /boot/loader.conf
kern.ipc.nmbclusters=32768

cat /etc/sysctl.conf
net.inet.tcp.sendspace=1048576
net.inet.tcp.recvspace=1048576

Server:
/sbin/ggated -S 1310720 -R 1310720 -a 172.31.0.18 /etc/gg.exports

Client:
/sbin/ggatec create -q 2048 -t 5 -S 1310720 -R 1310720 172.31.0.18 /dev/amrd0s2


The RAID array is a RAID 1 volume on a Dell PERC4 (Dell PE1850) with 
adaptive read-ahead and write-back caching.


Tom


I have tried both of the settings suggested above but I cannot even
get out of the gate with those. Setting net.inet.tcp.{send,recv}space to
anything higher than 131072 results in ggated bailing with the error:
# ggated -v -a 10.10.0.19
info: Reading exports file (/etc/gg.exports).
debug: Added 10.10.0.0/24 /dev/amrd1 RW to exports list.
debug: Added 10.10.0.0/24 /dev/amrd3 RW to exports list.
info: Exporting 2 object(s).
error: Cannot open stream socket: No buffer space available.
error: Exiting.

setting net.inet.tcp.{send,recv}space to 131072 allows me to start
ggated with the default R and S values of 131072; anything higher
results in no buffer space errors. At 131072 ggated starts but then I
cannot even open a new connection (like ssh) to the server as the ssh
client bails with no buffer space available.


Did you also set kern.ipc.nmbclusters=32768 in /boot/loader.conf and 
reboot?  It sounds like you did not as this is the exact same problem I 
came across before adjusting that value.
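
To see what the kernel does with a given buffer size independently of ggated, a minimal Python sketch along these lines can help. It assumes that the -S and -R options end up as SO_SNDBUF and SO_RCVBUF on a TCP socket; probe_buffers is a made-up helper, not part of ggate.

```python
import socket


def probe_buffers(size: int) -> dict:
    """Ask for `size`-byte send/receive buffers on a fresh TCP socket.

    Returns what the kernel reports back, or the error text if opening
    the socket or setting an option fails (for example ENOBUFS).
    """
    result = {}
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            for name, opt in (("send", socket.SO_SNDBUF),
                              ("recv", socket.SO_RCVBUF)):
                try:
                    sock.setsockopt(socket.SOL_SOCKET, opt, size)
                    result[name] = sock.getsockopt(socket.SOL_SOCKET, opt)
                except OSError as exc:
                    result[name] = "error: " + str(exc)
    except OSError as exc:
        # Opening the socket itself failed, as in the ggated log above.
        result["socket"] = "error: " + str(exc)
    return result


if __name__ == "__main__":
    print(probe_buffers(1310720))
```

Some systems silently clamp the requested size instead of returning an error, so compare the reported value with what was asked for.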


SNIP


This is on a FreeBSD 6.2-RELENG box i386 SMP using the amr driver (SATA
RAID using LSI MegaRAID).


Do you have the cache BBU (Battery Backup Unit) fitted and the array 
caching set to write back?  Also, have you tested writing to the array 
locally, without ggate, to check the write speed?




The odd thing is that even after I set the send and recvspace down to
values like 65536, I continue to get the no buffer error when trying to
connect to it remotely again.



I found that the easiest way to fix this was to reboot the system with 
good values for net.inet.tcp.{send,recv}space.






Re: Repeatable crash with mkdir causing a divide by zero error

2007-04-07 Thread Tom Judge

Kris Kennaway wrote:

On Fri, Apr 06, 2007 at 11:05:22AM +0100, Tom Judge wrote:

Hi,

I have seen some problems with a new file system that I created 
yesterday: I could repeatedly get the system to crash with a 
mkdir.


Here is the disk information
mfid1: <MFI Logical Disk> on mfi1
mfid1: 5716992MB (11708399616 sectors) RAID volume 'Images' is optimal

I created a new file system tuned for 64k blocks, an average file size 
of 1MB, and 2500 files per directory.


newfs -b 65535 -g 1048576 -h 2500 /dev/mfid1p1
mount /dev/mfid1p1 /compere
mkdir /compere/images
mkdir /compere/images/1999

(Also tested with mkdir test; mkdir test/1998)

The system is an amd64 system running 6.2-RELEASE and the pmap.c patch. 
I have 3 cores caused by 3 different apps (rsync, gmkdir, mkdir) and 
can provide more information if required.  I have attached a back 
trace; unfortunately I cannot do any further testing as the system is now in 
testing (newfs -b 65535 -g 1048576 /dev/mfid1p1 was used and seems not 
to cause the bug).


This might be simple to fix, but please file a PR if it does not get
picked up by someone on this list.

Kris


SNIP

For any one that is interested there is now a PR for this problem 
(kern/111352).


Tom



Possible mtu bug in vlan or bce

2007-04-16 Thread Tom Judge

Hi,

I have seen some strange behaviour today with VLAN interfaces on bce 
interfaces.  I am running 6.2-RELEASE on i386.


I have a bce interface set up on a gig-e network with an MTU of 8192.  I 
attach a vlan interface to it and change the vlan interface MTU to 1500, as 
it has 100Mbit devices on it.  This does not seem to affect the MTU of the 
bce interface: ifconfig reports the vlan MTU as changed but does not report 
the bce MTU as changed.  However, I then start to get error 
messages saying that the NFS server is not responding, as it is sending 
packets larger than 1500 bytes.



If I try to raise the MTU of the vlan interface (to 8192, which is the 
value that ifconfig reports for bce0) at this stage, ifconfig throws an 
error saying that the value is incorrect.  If I then 'raise' the MTU of 
bce0 to 8192 (it already appears to be set to 8192 according to ifconfig) 
and then set the vlan interface MTU to 8192, the NFS server starts to 
work again.
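
For anyone who wants to watch for this state, here is a minimal Python sketch that pulls the MTU out of ifconfig output and checks that the vlan MTU does not exceed the parent's. The vlan0 name and its sample line are hypothetical, and the "vlan MTU must not exceed parent MTU" rule is the assumed invariant, not something confirmed in this thread.

```python
import re

MTU_RE = re.compile(r"\bmtu (\d+)")


def parse_mtu(ifconfig_text):
    """Return the MTU reported in ifconfig output, or None if absent."""
    match = MTU_RE.search(ifconfig_text)
    return int(match.group(1)) if match else None


def vlan_mtu_ok(parent_text, vlan_text):
    """True when the vlan's reported MTU does not exceed the parent's."""
    parent = parse_mtu(parent_text)
    vlan = parse_mtu(vlan_text)
    if parent is None or vlan is None:
        raise ValueError("no mtu field found in ifconfig output")
    return vlan <= parent


# Sample header lines; vlan0 is a hypothetical interface name.
bce0 = "bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 8192"
vlan0 = "vlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500"

print(parse_mtu(bce0), parse_mtu(vlan0), vlan_mtu_ok(bce0, vlan0))
```

This only compares what ifconfig reports, which is exactly the value suspected here of differing from what the driver really uses, so a passing check does not rule the problem out.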



Is this a known bug/problem or should I raise a PR?

Thanks

Tom


Re: Dell SAS5 Performance Issue

2007-04-17 Thread Tom Judge

Richard Tector wrote:

Ivan Voras wrote:

Richard Tector wrote:
 

I'm suffering from very slow write performance on a Dell PowerEdge 860
with the SAS5/IR controller (mpt driver) running either 6.2-RELEASE or
6.2-STABLE with sources from yesterday. The controller hosts 2 Western
Digital 320GB SATA disks in a RAID1 configuration.
Reads approach 65MB/s however writes appear extremely slow, in the
region of 6-7MB/s with a dd and a blocksize of 1MB all the way down to
about 300KB/s while extracting a ports snapshot.

It was suggested to me that perhaps write caching has been disabled on
the controller however no options exist within the BIOS configuration to
view/adjust *any* caching options.



You looked in the controller's BIOS, not motherboard's, right?

Indeed I did.

There should be at least a write through vs write back switch...
Correct, there should be options, but there aren't. The controller BIOS 
has very few options at all in fact.



No, but can you post the relevant bits for the controller from dmesg?
  

Sure:

mpt0: <LSILogic SAS/SATA Adapter> port 0xec00-0xecff mem 
0xfe9fc000-0xfe9f,0xfe9e-0xfe9e irq 16 at device 8.0 on pci2

mpt0: [GIANT-LOCKED]
mpt0: MPI Version=1.5.12.0
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
[...snip...]
da0 at mpt0 bus 0 target 0 lun 0
da0: <Dell VIRTUAL DISK 1028> Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers, Tagged Queueing Enabled
da0: 305175MB (624998400 512 byte sectors: 255H 63S/T 38904C)


I have just stumbled across this problem on 4 PE860's and 2 PE840's.  I 
have been through the BIOS of the card and found no information about 
caching in any of the menus.  I then decided to take the card out and 
could not see any place to attach a cache battery backup unit; I could 
also not see any RAM chips on the card.
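
For anyone who wants to reproduce the sequential write figure quoted above without dd, a minimal Python sketch follows. The function name, the 64 MB total and the test file location are arbitrary choices for illustration; unlike a plain dd it calls fsync before stopping the clock, so the numbers are not directly comparable.

```python
import os
import tempfile
import time


def write_throughput(path, total_mb=64, block_mb=1):
    """Write total_mb of zeros in block_mb chunks and return MB/s."""
    block = b"\0" * (block_mb * 1024 * 1024)
    start = time.monotonic()
    with open(path, "wb") as handle:
        for _ in range(total_mb // block_mb):
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())   # make sure the data has left the cache
    elapsed = max(time.monotonic() - start, 1e-9)
    return total_mb / elapsed


if __name__ == "__main__":
    # Writes into a temporary directory under the current directory,
    # so run it from a file system that lives on the array under test.
    with tempfile.TemporaryDirectory(dir=".") as tmp:
        rate = write_throughput(os.path.join(tmp, "testfile"))
        print(f"{rate:.1f} MB/s")
```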


Is there any news on the performance of this card?

Tom


6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Tom Judge

Hi,

Recently I have noticed that one of our Dell PE1950's has been crashing 
a lot with the following reason: supervisor read, page not present.


The system runs 6.2-RELEASE under i386.

I have attached 2 back traces, and I still have both cores if any more 
information is required.  Any light that can be shed on this problem 
would be greatly appreciated.


Tom

===

uname -a
FreeBSD narthex.mintel.co.uk 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Mon 
Apr  2 20:13:11 BST 2007 
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/PE1950  i386



## Core 1

[EMAIL PROTECTED] '13:14:47' '/home/london/tj'
 $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.1
[GDB will not be able to debug user-mode threads: 
/usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x15c
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc05df61f
stack pointer   = 0x28:0xe4f63c30
frame pointer   = 0x28:0xe4f63c90
code segment= base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (swi1: net)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 1h25m33s
Dumping 2047 MB (2 chunks)
 chunk 0: 1MB (159 pages) ... ok
 chunk 1: 2047MB (523944 pages) 2031 2015 1999 1983 1967 1951 1935 1919 
1903 1887

<7>arp_rtrequest: bad gateway 172.31.1.1 (!AF_LINK)
<7>arp_rtrequest: bad gateway 172.31.0.1 (!AF_LINK)
1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 
1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 
1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215 
1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007 991 975 
959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 
671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 
383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 
95 79 63 47 31 15


#0  doadump () at pcpu.h:165
165 pcpu.h: No such file or directory.
   in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc05622ba in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc05625e1 in panic (fmt=0xc06e2578 "%s") at 
/usr/src/sys/kern/kern_shutdown.c:565
#3  0xc06b4580 in trap_fatal (frame=0xe4f63bf0, eva=16777308) at 
/usr/src/sys/i386/i386/trap.c:837
#4  0xc06b42bf in trap_pfault (frame=0xe4f63bf0, usermode=0, 
eva=16777308) at /usr/src/sys/i386/i386/trap.c:745

#5  0xc06b3f19 in trap (frame=
 {tf_fs = -1067581432, tf_es = -965803992, tf_ds = -964624344, 
tf_edi = -957112288, tf_esi = -965676032, tf_ebp = -453624688, tf_isp = 
-453624804, tf_ebx = 16777216, tf_edx = -968955648, tf_ecx = 4, tf_eax = 
0, tf_trapno = 12, tf_err = 0, tf_eip = -1067583969, tf_cs = 32, 
tf_eflags = 66118, tf_esp = 3, tf_ss = 0}) at 
/usr/src/sys/i386/i386/trap.c:435

#6  0xc06a095a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc05df61f in in_arpinput (m=0xc68ba200) at 
/usr/src/sys/netinet/if_ether.c:636
#8  0xc05df4ea in arpintr (m=0xc68ba200) at 
/usr/src/sys/netinet/if_ether.c:551
#9  0xc05d861b in netisr_processqueue (ni=0xc076b078) at 
/usr/src/sys/net/netisr.c:236

#10 0xc05d881a in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:349
#11 0xc054cc49 in ithread_execute_handlers (p=0xc63ed860, ie=0xc643bb80) 
at /usr/src/sys/kern/kern_intr.c:682
#12 0xc054cd59 in ithread_loop (arg=0xc63bb870) at 
/usr/src/sys/kern/kern_intr.c:765
#13 0xc054b9fd in fork_exit (callout=0xc054cd04 <ithread_loop>, 
arg=0xc63bb870, frame=0xe4f63d38) at /usr/src/sys/kern/kern_fork.c:821
#14 0xc06a09bc in fork_trampoline () at 
/usr/src/sys/i386/i386/exception.s:208

(kgdb) exit
Undefined command: "exit".  Try "help".
(kgdb) quit
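
kgdb prints the trap frame registers as signed decimals, which makes them hard to match against the hex values in the trap message. A minimal Python sketch for converting them (reg_to_hex is a made-up helper name):

```python
def reg_to_hex(value: int, bits: int = 32) -> str:
    """Render a signed register value from a kgdb trap frame as hex."""
    return hex(value & ((1 << bits) - 1))


# Values taken from the Core 1 trap frame above.
print(reg_to_hex(-1067583969))  # tf_eip -> 0xc05df61f
print(reg_to_hex(-453624688))   # tf_ebp -> 0xe4f63c90
```

Both results line up with the instruction pointer and frame pointer in the trap message, so the frame and the panic text describe the same fault.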


## Core 2
[EMAIL PROTECTED] '13:15:32' '/home/london/tj'
 $ kgdb /usr/obj/usr/src/sys/PE1950/kernel.debug /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: 
/usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion 

Re: Fatal trap 12: page fault while in kernel mode

2007-04-23 Thread Tom Judge

Kai wrote:

On Thu, Apr 19, 2007 at 04:14:23PM +0200, Christian Walther wrote:
  

On 19/04/07, Kai [EMAIL PROTECTED] wrote:


On Wed, Apr 11, 2007 at 12:53:32PM +0200, Kai wrote:
  

Hello all,

We're running into regular panics on our webserver after upgrading
from 4.x to 6.2-stable:


Hi Again,

The panics keep happening, so I'm trying alternate kernel setups. This is a
trace of a panic on a default SMP kernel with debugging symbols.

I'm at a loss as to how to progress at this point; perhaps someone can help
me, please?
  

[snip]


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x34
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc06bdefa
stack pointer   = 0x28:0xeb9cf938
frame pointer   = 0x28:0xeb9cf944
code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 13577 (perl5.8.8)
trap number = 12
panic: page fault
  

Is this perl derived from ports? And if so, did you rebuild it after you
upgraded to 6.2? Or is maybe FreeBSD 4.x binary compatibility missing from
your kernel?



Hi Chris,

Thanks for your reply. The upgrade I'm talking about is just a term
describing that we switched from FreeBSD 4.10 to 6.2. It's new hardware; it's
hardware on which FreeBSD 4.10 will not run.
So in effect it's not an upgrade, though the symptoms did not show on
apache-1.3.37 + NFS-mounted homepages under FreeBSD 4.10.

If perl were the problem, the OS shouldn't panic IMHO. Perl in this case
is writing a fairly large guestbook file (e.g. 2 MB), and does this through
Perl's own:
open(BOOK, "+<$file") or die;

This $file is located on an NFS mounted filesystem. It'll get read and
written.

The NFS filesystem is mounted with rw,nosuid,intr,bg,resvport,nfsv3. I
have tried mounting without intr, but panics keep happening. The NFS server
is a Netapp filer.

This is a production environment, so I can't go updating to the latest
current. 


Kai
  


Just a me too, however I seem to get these crashes from random applications.

See my last post for back traces.

Tom


Re: 6.2-STABLE (i386) Repeating crash (supervisor read, page not present)

2007-04-23 Thread Tom Judge

Michael Proto wrote:

Kris Kennaway wrote:

On Mon, Apr 23, 2007 at 01:24:52PM +0100, Tom Judge wrote:

Hi,

Recently I have noticed that one of our Dell PE1950's has been crashing 
a lot with the following reason: supervisor read, page not present.


The system runs 6.2-RELEASE under i386.

I have attached 2 back traces, and I still have both cores if any more 
information is required.  Any light that can be shed on this problem 
would be greatly appreciated.




SNIP


You might be hitting a bug in an obscure code path because of the
above errors.  I'm CC'ing someone who might be able to help.

Kris



Bear in mind that an urgent firmware update was released by Dell
last week for 1950, 1955, and 2950 systems that is supposed to fix some
data-corruption issues related to dual-core processors. I don't know if
this problem is a symptom of that, but it is strongly suggested to apply
the firmware update regardless.





I have just been to Dell's site and there are firmware updates for almost 
every component in the system, released about 2 weeks ago (10-11/4).  I 
have around 17 [12]950's waiting to go into pre-production testing at 
the moment, so I think that I will spend some time upgrading the firmware 
on them now rather than later.


Thanks for the heads up.

Tom



Re: Dell SAS5 Performance Issue

2007-04-25 Thread Tom Judge

Matthew Jacob wrote:

I've been trying to get one in my lab. I also am completely saturated
with two jobs and a new infant (which is why I'm responding to this at
0100) and was trying to get a box in my lab that evidenced the
behaviour. If I get stuck when I actually get time to chase this, I'll
ask- thanks.



SNIP

I have one sat on my desk; if you would like, I can ship it to you.


Tom


Re: EM and TSO

2007-05-16 Thread Tom Judge

Jack Vogel wrote:

I introduced a change yesterday that limited TSO to PCI Express
adapters, I did this more for avoidance rather than a bug fix, and
I'm not 100% sure it's the right thing, so I thought I would poll
everyone, do you have a PCI-X adapter and are using TSO without
problems and wish to keep the support in?

If no one is then I'll just leave it as is.

Jack


I have a number of systems with built-in em adapters (on the 
motherboard).  Is there any easy way to find out what type of bus these 
are connected to?


Tom


Re: EM and TSO

2007-05-16 Thread Tom Judge

Sorry, I sent this to the wrong list; it should have gone to [EMAIL PROTECTED]

Sorry for the spam.

Tom Judge wrote:

Jack Vogel wrote:

I introduced a change yesterday that limited TSO to PCI Express
adapters, I did this more for avoidance rather than a bug fix, and
I'm not 100% sure it's the right thing, so I thought I would poll
everyone, do you have a PCI-X adapter and are using TSO without
problems and wish to keep the support in?

If no one is then I'll just leave it as is.

Jack


I have a number of systems with built-in em adapters (on the 
motherboard).  Is there any easy way to find out what type of bus these 
are connected to?


Tom


Re: make mplayer failed

2007-05-19 Thread Tom Judge

Sorry about the top post but this has nothing to do with your problem.

Please stop hijacking threads.  All three of your last messages, which 
should have been new threads (Install SCSI over ATAPI fro DVD, xfce4 broke 
after pkgdb -Ff and this thread), have hijacked totally unrelated threads.


Can I suggest that when you want to start a new thread on the mailing 
list, you hit the new message button and type in the list email 
address, rather than hitting reply on another message and then replacing 
its contents with a new message.


Tom

KAYVEN RIESE wrote:


here is my error message

configure: error: Package requirements (glib-2.0 >= 2.12.0 atk >= 1.9.0 pango >= 1.12.0 cairo >= 1.2.0) were not met:

No package 'atk' found
Requested 'cairo >= 1.2.0' but version of cairo is 1.0.2

Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.

Alternatively, you may set the environment variables BASE_DEPENDENCIES_CFLAGS
and BASE_DEPENDENCIES_LIBS to avoid the need to call pkg-config.
See the pkg-config man page for more details.

===>  Script "configure" failed unexpectedly.
Please run the gnomelogalyzer, available from
"http://www.freebsd.org/gnome/gnomelogalyzer.sh", which will diagnose the
problem and suggest a solution. If - and only if - the gnomelogalyzer cannot
solve the problem, report the build failure to the FreeBSD GNOME team at
[EMAIL PROTECTED], and attach (a)
/usr/ports/x11-toolkits/gtk20/work/gtk+-2.10.11/config.log, (b) the output
of the failed make command, and (c) the gnomelogalyzer output. Also, it might
be a good idea to provide an overview of all packages installed on your system
(i.e. an `ls /var/db/pkg`). Put your attachment up on any website,
copy-and-paste into http://freebsd-gnome.pastebin.com, or use send-pr(1) with
the attachment. Try to avoid sending any attachments to the mailing list
([EMAIL PROTECTED]), because attachments sent to FreeBSD mailing lists are
usually discarded by the mailing list software.
*** Error code 1

Stop in /usr/ports/x11-toolkits/gtk20.
*** Error code 1

Stop in /usr/ports/multimedia/mplayer.
bsd@/root# find / -name gnomelogalyzer.sh


Re: mysql frequently crash on 6.2

2007-05-19 Thread Tom Judge

Albert Wong wrote:
SNIP
[Usually it only shows about three or four of these repeated errors.] 


I am using FreeBSD 6.2 and MySQL 4.1. I am trying to use the libthr
threading mechanism through the libmap.conf setting, as suggested earlier in
this thread, as a possible fix. [Though, I don't know if I have in fact been
successful in switching to libthr or not... because I'm not sure if I need
to recompile / reboot?  I don't know if mysql was installed from a port or
not.]

In any event, my libmap.conf settings are now [located in /etc/libmap.conf]:


[mysqld]
libc_r.so libthr.so
libc_r.so.6 libthr.so.2
libthr.so.2 libthr.so.2
libpthread.so libthr.so
libpthread.so.2 libthr.so.2

and I also added ...
WITH_LIBMAP= yes

to my make.conf file. 


Is there something else I need to do [e.g., recompile? / reboot?] in order
to activate libthr?  The problem remains even with these adjustments.


SNIP

You can validate the settings in libmap.conf with ldd. Here is the 
output you should see if libmap.conf is correct.


 $ ldd /usr/local/libexec/mysqld
/usr/local/libexec/mysqld:
    libz.so.3 => /lib/libz.so.3 (0x6849c000)
    libwrap.so.4 => /usr/lib/libwrap.so.4 (0x684ac000)
    libcrypt.so.3 => /lib/libcrypt.so.3 (0x684b3000)
    libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0x684cc000)
    libm.so.4 => /lib/libm.so.4 (0x6859b000)
    libpthread.so.2 => /usr/lib/libthr.so.2 (0x685b1000)
    libc.so.6 => /lib/libc.so.6 (0x685c4000)
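
If you want to script this check, a minimal Python sketch could look like the following. It assumes the usual "name => path (address)" layout of ldd output; parse_ldd and uses_libthr are made-up helper names, and the sample text is trimmed from the listing above.

```python
import re

LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def parse_ldd(output: str) -> dict:
    """Map each requested library name to the path ldd resolved it to."""
    mapping = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            mapping[match.group(1)] = match.group(2)
    return mapping


def uses_libthr(output: str) -> bool:
    """True when libpthread.so.2 has been remapped to a libthr library."""
    target = parse_ldd(output).get("libpthread.so.2", "")
    return "libthr" in target


sample = """/usr/local/libexec/mysqld:
    libm.so.4 => /lib/libm.so.4 (0x6859b000)
    libpthread.so.2 => /usr/lib/libthr.so.2 (0x685b1000)
    libc.so.6 => /lib/libc.so.6 (0x685c4000)
"""

print(uses_libthr(sample))  # True
```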


Tom


Re: minimizing downtime on upgrades? (for example: mysql 4.1 -> 5.0 or php)

2007-05-22 Thread Tom Judge

Chuck Swiger wrote:

On May 22, 2007, at 12:03 PM, Olivier Mueller wrote:

So I can only do that after the installation of mysql50-client, which
means all the services will have to be stopped during the compilation of
mysql50-server, which usually takes some time.

Isn't there a better way?  How do you handle such cases?


Pretty much as you suggest below:


Same questions for php upgrades: on php5 upgrade, all the other php5-*
packages have to be compiled too, and keeping the webserver running
during this time is probably not the best idea.

What I'm going to try is to prepare packages of the ports I have to
upgrade on a dev/test server, and then install them with pkg_add: is
that the right way ?


You have a build box on which you generate new tarballs of the packages you 
want to update (via make package, make package-recursive, 
portupgrade -p, etc.), which you can then test to make sure they 
behave sensibly, and then use to rapidly update your production 
machines with minimal downtime.




I have found that the ports-mgmt/tinderbox port is very useful for 
building and maintaining up-to-date packages with custom patches or 
non-default knobs set.  I have a pair of dedicated build servers that it 
runs on, but I can't see a reason why it could not run on any old system 
on your network.


You can then use pkg_add/pkg_delete to do the upgrade very quickly.


Tom


Re: installworld breaks

2007-05-25 Thread Tom Judge

[EMAIL PROTECTED] wrote:

cvsup and built around 20.30 BST
cd /usr/src; make -f Makefile.inc1 install
===> share/info install
===> include install
creating osreldate.h from newvers.sh
touch: not found
***error code 127
Stop in /usr/src/include
***error code 1
I tried cd usr.bin/touch && make && make install
but it does not seem to have any effect
any offers?
cheers



I have seen this problem when PATH was not set correctly.
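
A quick way to confirm that, sketched in Python: check whether the tools the build calls can be found on the PATH that the install actually runs with. The tool list is only an illustrative selection and missing_tools is a made-up helper name.

```python
import os
import shutil


def missing_tools(tools, path=None):
    """Return the tools that cannot be found on the given search path.

    With path=None the PATH of the current environment is used.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    print("PATH =", os.environ.get("PATH"))
    print("missing:", missing_tools(["touch", "sh", "make"]))
```

Run it from the same shell (and as the same user) that runs the install; an empty "missing" list with a sane-looking PATH would point away from this explanation.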

Tom


Problems with BCE network adapter (Dell PE2950)

2007-06-07 Thread Tom Judge

Hi,

I am seeing some problems with the on-board bce NICs of one of my Dell
PowerEdge 2950's (running RELENG_6_2).  The interface seems to crash with
the following errors, for which the fix seems to be an ifconfig bce0 down;
ifconfig bce0 up:

Jun  7 12:20:29 gonzo kernel: bce0: discard frame w/o leading ethernet
header (len 4294967292 pkt len 4294967292)
Jun  7 12:20:29 gonzo last message repeated 54 times
Jun  7 12:20:58 gonzo kernel: nfs server nfs-server:/usr/home: not
responding


Is this a known problem? If so, is there a solution?
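
One observation on the frame length in the log above, as a minimal Python sketch (as_signed32 is a made-up helper name): read as a signed 32-bit value, 4294967292 is a small negative number.

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


print(as_signed32(4294967292))  # -4
```

That would fit a length that went below zero, for instance 4 bytes being subtracted from a zero-length frame, but this is only a guess; nothing in this thread confirms where the value comes from.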

Tom

#uname -a
FreeBSD gonzo.mintel.co.uk 6.2-RELEASE FreeBSD 6.2-RELEASE #10: Thu Apr
 5 10:53:39 BST 2007
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/PE2950 amd64 amd64 Intel(R)
Xeon(R) CPU 5160  @ 3.00GHz FreeBSD


## dmesg.boot snippet
bce0: <Broadcom NetXtreme II BCM5708 1000Base-T (B2), v0.9.6> mem
0xf400-0xf5ff irq 16 at device 0.0 on pci9
bce0: ASIC ID 0x57081020; Revision (B2); PCI-X 64-bit 133MHz
miibus0: <MII bus> on bce0
brgphy0: <BCM5708C 10/100/1000baseTX PHY> on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
1000baseTX-FDX, auto
bce0: Ethernet address: 00:18:8b:88:d8:81

## pciconf -lv snippet
[EMAIL PROTECTED]:0:0:  class=0x02 card=0x01b21028 chip=0x164c14e4 rev=0x12
hdr=0x00
vendor   = 'Broadcom Corporation'
class= network
subclass = ethernet

#ifconfig bce0
bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 8192
        options=3b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
        inet 172.31.0.13 netmask 0xffffff00 broadcast 172.31.0.255
        inet 172.31.0.157 netmask 0xffffffff broadcast 172.31.0.157
        inet 172.31.0.161 netmask 0xffffffff broadcast 172.31.0.161
        ether 00:18:8b:88:d8:81
        media: Ethernet autoselect (1000baseTX full-duplex)
        status: active

#Switch port counters
console# show interfaces counters ethernet g10
  Port   InOctets    InUcastPkts  InMcastPkts  InBcastPkts
  g10    2615415297  875639870 13480

  Port   OutOctets   OutUcastPkts  OutMcastPkts  OutBcastPkts
  g10    2535595313  136391705     10740316      1420686

FCS Errors: 0
Single Collision Frames: 0
Late Collisions: 0
Excessive Collisions: 0
Internal MAC Tx Errors: 0
Oversize Packets: 0
Internal MAC Rx Errors: 0
Received Pause Frames: 0
Transmitted Pause Frames: 0



Re: Strange problem with skype on RELENG_6 KERNEL since the end of may.

2007-06-19 Thread Tom Judge

Tom Evans wrote:

On Tue, 2007-06-19 at 23:40 +1000, Norberto Meijome wrote:

On Thu, 14 Jun 2007 09:29:09 -0500
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

A RELENG_6 kernel from May 21 works fine with skype but boot a newer  
kernel and skype seems to be blocking port 80.   Apache logs show  
nothing.  I can find no logs errors anywhere but a telnet to port 80  
answers with what would seem to be binary chars.  I close skype and  
all is back to normal.  I had originally thought that it had to do  
with the new xorg installation but it seems to be something in the  
kernel.  The configurations were the same, basically GENERIC with all  
the pf stuff.

Hi,

Skype has an option to listen on tcp/80 and tcp/443 for incoming connections,
because it assumes somehow that firewalls will be configured to allow that
traffic in (some Windowze world assumption, i guess).

In the Tools menu, go to Options, Advanced, and untick the option that reads
"Use port (sic) 80 and 443 as alternatives for incoming connections".

Apply, exit skype, restart it.

confirm with 


sockstat -4 | grep skype | grep \*:80

that skype is NOT listening on port 80 (you shouldn't see any output back from
that cmd) ( similar for 443)



Doesn't this imply that the OP was running Skype as root?


Yes, if the sysctls under net.inet.ip.portrange all have their default 
values.
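
As a rough sketch of that reasoning (not part of the original reply): on FreeBSD, binding a port that falls between net.inet.ip.portrange.reservedlow and net.inet.ip.portrange.reservedhigh requires root, and the stock values are 0 and 1023. The helper name needs_root is made up for illustration, and it only models the range check, not any other policy that might be in force.

```shell
#!/bin/sh
# needs_root PORT [LOW] [HIGH]
# Prints "yes" if PORT falls inside the reserved range, otherwise "no".
# LOW and HIGH default to the stock FreeBSD values (0 and 1023). On a
# live system they could be read with:
#   sysctl -n net.inet.ip.portrange.reservedlow
#   sysctl -n net.inet.ip.portrange.reservedhigh
needs_root() {
    port=$1
    low=${2:-0}
    high=${3:-1023}
    if [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]; then
        echo yes
    else
        echo no
    fi
}
```

With the defaults, `needs_root 80` prints yes and `needs_root 8080` prints no, which is why a process listening on port 80 suggests it was started as root.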


Tom


Re: Dell PERC5/i SAS5/5IR - RAID monitoring

2007-07-17 Thread Tom Judge

Michael Worobcuk wrote:

Hi,
I am trying to set up my first webserver. I bought a Dell Poweredge 860, 
provided with a SAS5/IR RAID-Controller.
The problem is now, that I cannot find software, that monitors the state 
of my disks. I already tried megarc from the ports but all I get is a 
short answer that no adapters where found:


#
megarc -AllAdpInfo -nolog



**
  MEGARC MegaRAID Configuration 
Utility(FreeBSD)-1.04(03-02-2005)

  By LSI Logic Corp.,USA

**

  [Note: For SATA-2, 4 and 6 channel controllers, please specify
  Ch=0 Id=0..15 for specifying physical drive(Ch=channel, 
Id=Target)]


Type ? as command line arg for help

No Adapters Found

Error: No MegaRaid Found
#

I had emails with Dell and LSI. Dell does not support FreeBSD and LSI 
says I should go and ask Dell ...



The second thing is, the perfomance. 


SNIP


Final score for writes:16
Final score for reads :  2025
 #

(Just to remember: Pentium D; 2,8GHZ; 4 GB RAM; 2 x 500GB SATA RAID1)
That is pretty poor, isn't it ?


I am wondering now, if somebody has experience with the PERC5/I 
Controller. Would it be possible to monitor the disks, if I would buy 
that controller ?

Any hints are highly appreciated.

Thanks

Michael



I don't know about monitoring the SAS5/I; however, I read some posts on 
one of the lists about the Linux compatibility layer providing all of 
the correct interfaces for the Linux version of "megacli" to work on 
FreeBSD.  As the SAS5/I is mpt driver based, could it not be checked 
with camcontrol? (Just an idea, never tested.)


As for performance issues with the SAS5/i, there is a problem in the 
controller.  Scott Long created a workaround: a sysctl that can be set 
to make the controller turn on the on-drive write caches.  These changes 
were committed to RELENG_6 on 2007-06-05 21:32:57 UTC.


The PERC5/[ei] controllers do not suffer the performance problems of the 
SAS5/i controller; we have ~30 systems with these controllers and have 
never seen any performance problems with them, even when they have 20 
drives attached to them.


If you had searched the archives you would have found an almost identical 
response by me to an almost identical question regarding the SAS5/i 
performance problem.


Tom


Here is the original commit log:

scottl  2007-06-03 23:13:05 UTC

  FreeBSD src repository

  Modified files:
sys/dev/mpt  mpt.c mpt.h mpt_cam.c
  Log:
  mpt.c:
  mpt.h:
  Add support for reading extended configuration pages.
  mpt_cam.c:
  Do a top level topology scan on the SAS controller.  If any 
SATA device are discovered in this scan, send a passthrough FIS to set 
the write cache.  This is controllable through the following tunable at 
boot:


  hw.mpt.enable_sata_wc:
  -1 = Do not configure, use the controller default
   0 = Disable the write cache
   1 = Enable the write cache

  The default is -1.  This tunable is just a hack and may be
  deprecated in the future.

  Turning on the write cache alleviates the write performance problems 
with  SATA that many people have observed.  It is not recommend for 
those who value data reliability!  I cannot stress this strongly enough. 
 However, it is useful in certain circumstances, and it brings the 
performence in line  with what a generic SATA controller running under 
the FreeBSD ATA driver  provides (and the ATA driver has had the WC 
enabled by default for years).



Re: Dell PERC5/i SAS5/5IR - RAID monitoring

2007-07-18 Thread Tom Judge

Michael Worobcuk wrote:


Am 18.07.2007 um 01:27 schrieb Tom Judge:


Michael Worobcuk wrote:

Hi,
I am trying to set up my first webserver. I bought a Dell Poweredge 
860, provided with a SAS5/IR RAID-Controller.
The problem is now, that I cannot find software, that monitors the 
state of my disks. I already tried megarc from the ports but all I 
get is a short answer that no adapters where found:
# 


megarc -AllAdpInfo -nolog

**
  MEGARC MegaRAID Configuration 
Utility(FreeBSD)-1.04(03-02-2005)

  By LSI Logic Corp.,USA

**

  [Note: For SATA-2, 4 and 6 channel controllers, please specify
  Ch=0 Id=0..15 for specifying physical drive(Ch=channel, 
Id=Target)]

Type ? as command line arg for help
No Adapters Found
Error: No MegaRaid Found
# 

I had emails with Dell and LSI. Dell does not support FreeBSD and LSI 
says I should go and ask Dell ...

The second thing is, the perfomance.


SNIP


Final score for writes:16
Final score for reads :  2025
 # 


(Just to remember: Pentium D; 2,8GHZ; 4 GB RAM; 2 x 500GB SATA RAID1)
That is pretty poor, isn't it ?
I am wondering now, if somebody has experience with the PERC5/I 
Controller. Would it be possible to monitor the disks, if I would buy 
that controller ?

Any hints are highly appreciated.
Thanks
Michael



I don't know about monitoring the SAS5/I however I read some posts on 
one of the lists that was talking about the linux compatibility system 
providing all of the correct interface for the linux version of 
?megacli? to work on FreeBSD.  As the SAS5/I is mpt driver based could 
it not be checked with camcontrol? (Just an idea never tested).



SNIP


Hi Tom,
thank you for your response. What about monitoring the PERC5/ie ? Does 
it work with megarc or any program under FreeBSD ?




Just to clarify, the following was related to the PERC5/ie and 
the mfi driver:


 I read some posts on one of the lists that was talking about the linux 
compatibility system providing all of the correct interface for the 
linux version of megacli/megarc to work on FreeBSD


I think google can answer the rest for you.

Tom


Re: Dell PERC5/i SAS5/5IR - RAID monitoring

2007-07-20 Thread Tom Judge

Michael Worobcuk wrote:

Tom Judge wrote:

As for performance issues with the SAS5/i, there is a problem in the 
controller.  Scott Long created a workaround: a sysctl that can be set 
to make the controller turn on the on-drive write caches.  These changes 
were committed to RELENG_6 on 2007-06-05 21:32:57 UTC.



...



Here is the original commit log:

scottl  2007-06-03 23:13:05 UTC

  FreeBSD src repository

  Modified files:
sys/dev/mpt  mpt.c mpt.h mpt_cam.c
  Log:
  mpt.c:
  mpt.h:
  Add support for reading extended configuration pages.
  mpt_cam.c:
  Do a top level topology scan on the SAS controller.  If any 
SATA device are discovered in this scan, send a passthrough FIS to 
set the write cache.  This is controllable through the following 
tunable at boot:


  hw.mpt.enable_sata_wc:
  -1 = Do not configure, use the controller default
   0 = Disable the write cache
   1 = Enable the write cache

  The default is -1.  This tunable is just a hack and may be
  deprecated in the future.




I set mpt.enable_sata_wc to 1, as hw.mpt.enable_sata_wc is, AFAIK not 
tunable in mpt_cam.c. This did not take any effect to the performance. 
Is there anything else to change ?




Not that I know of, do you have SAS or SATA disks attached to the 
controller?


Tom


Re: Dell PERC5/i SAS5/5IR - RAID monitoring

2007-07-20 Thread Tom Judge

Michael Worobcuk wrote:


Am 21.07.2007 um 00:18 schrieb Tom Judge:


Michael Worobcuk wrote:

Tom Judge wrote:

As for performance issues with the SAS5/i, there is a problem in the 
controller.  Scott Long created a workaround: a sysctl that can be set 
to make the controller turn on the on-drive write caches.  These changes 
were committed to RELENG_6 on 2007-06-05 21:32:57 UTC.



...



Here is the original commit log:

scottl  2007-06-03 23:13:05 UTC

  FreeBSD src repository

  Modified files:
sys/dev/mpt  mpt.c mpt.h mpt_cam.c
  Log:
  mpt.c:
  mpt.h:
  Add support for reading extended configuration pages.
  mpt_cam.c:
  Do a top level topology scan on the SAS controller.  If 
any SATA device are discovered in this scan, send a passthrough FIS 
to set the write cache.  This is controllable through the following 
tunable at boot:


  hw.mpt.enable_sata_wc:
  -1 = Do not configure, use the controller default
   0 = Disable the write cache
   1 = Enable the write cache

  The default is -1.  This tunable is just a hack and may be
  deprecated in the future.


I set mpt.enable_sata_wc to 1, as hw.mpt.enable_sata_wc is, AFAIK not 
tunable in mpt_cam.c. This did not take any effect to the 
performance. Is there anything else to change ?


Not that I know of, do you have SAS or SATA disks attached to the 
controller?


yes, SAS.




In that case the above sysctl is not going to work for you; as the 
name suggests, it is only for SATA devices.


For someone with a bit more SCSI experience than me: could this be 
solved by setting the WCE (write cache enable) bit in mode page 8 
on SAS devices if it is not already set?  Could the driver make this 
change on SAS devices during the topology scan, in a similar way to 
what it does for SATA devices?
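
For anyone who wants to look at the WCE bit by hand before touching the driver, here is a minimal sketch. It assumes that camcontrol modepage prints the caching page one field per line in the form "WCE:  <value>"; the function name wce_state and the device name da0 are placeholders, and whether a volume behind the SAS5/i honours the setting is untested.

```shell
#!/bin/sh
# wce_state: read `camcontrol modepage <dev> -m 8` output on stdin and
# print the value of the WCE field (1 = write cache enabled, 0 = disabled).
# Returns non-zero if no WCE field is present in the input.
# Assumes one "NAME:  value" pair per line.
wce_state() {
    awk '$1 == "WCE:" { print $2; found = 1 } END { exit !found }'
}

# On a live system (da0 is a placeholder device name):
#   camcontrol modepage da0 -m 8 | wce_state
#   camcontrol modepage da0 -m 8 -e    # edit the page interactively
```

The parsing is kept separate from the camcontrol call so it can be tried against saved output first.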


Tom




Re: A little story of failed raid5 (3ware 8000 series)

2007-08-25 Thread Tom Judge

Tom Samplonius wrote:

- Artem Kuchin [EMAIL PROTECTED] wrote: ...

But i don't understand how and why it happened. ONly 6 hours ago (a
 night before) all those files were backed up fine w/o any read
error. And now, right after replacing the driver and starting
rebuild it said that there are bad sectors all over those file. How
come?


What happened to you was an extremely common occurrence.  You had a
disk develop a media failure sometime ago, but the controller never
detected it, because that particular bad area was not read.  Your
backups worked because they never touched this portion of the disk
(ex. empty space, meta data, etc).  And then another drive developed
a electronics failure, which is instantly detected, putting the array
into a degraded mode.  When you did a rebuild onto a replace drive,
the controller discovered that there was a second failed disk, and
this is unrecoverable.


3ware controllers can recover from this situation; all you need to do is 
tell the controller not to verify the source data.  This is a little 
dangerous, but it has saved me in the past when 1 drive died in a RAID 
10 array and 2 of the 3 remaining drives had surface defects.  The trick 
was to replace the drives one at a time and rebuild without data 
verification.  After 10 painful hours the array was rebuilt without any 
noticeable data corruption.





RAID, of any level, isn't magic.  It is important to understand how
it works, an realize that drives can passive fail.  BTW, if you were
using RAID1 or RAID10, you would likely have had the same problem
(well, RAID10 can survive _some_ double-disk failures).  RAID6 is the
only RAID level that can survive failure of any two disks.


This is not entirely true.  RAID 1 can survive multiple disk failures, as
it has the storage capacity of 1 spindle and can tolerate the failure of
N-1 spindles, where N is the number of spindles in the mirror set.  This
is also kind of true for RAID 10: the more spindles in your mirror sets,
the better your chance of surviving multiple failures in the array (say,
6 disks in two 3-disk mirror sets striped together).
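
The arithmetic behind that can be sketched as follows. This is only an illustration of the counting argument, with a made-up function name; it treats RAID 10 as a stripe over equally sized mirror sets and ignores rebuild windows and controller behaviour.

```shell
#!/bin/sh
# raid10_tolerance SETS DISKS_PER_SET
# Prints "<guaranteed> <best_case>":
#   guaranteed = failures survived no matter which disks die
#                (worst case: they all land in the same mirror set)
#   best_case  = failures survived if they are spread evenly, leaving
#                one working disk in every mirror set
# A plain RAID 1 mirror is the SETS=1 case.
raid10_tolerance() {
    sets=$1
    width=$2
    guaranteed=$((width - 1))
    best=$((sets * (width - 1)))
    echo "$guaranteed $best"
}
```

For the 6-disk example, `raid10_tolerance 2 3` prints "2 4": any two failures are survivable, and up to four are if each mirror set keeps one good disk. A conventional 4-disk RAID 10 (`raid10_tolerance 2 2`) gives "1 2".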




The real solution is RAID scrubbing:  a low level background process
that reads every sector of every disk.  All of the real RAID systems
do this (usually scheduled weekly, or every other week).  Most 3ware
RAID card don't have this feature.

So rather than not using RAID5 or RAID6 again, you should just not
use 3ware anymore.


If you use the 3dm2 management interface you can schedule verify and
rebuild tasks to run on a regular basis.  I think that 7500 series
controllers can do this, 9500 and 9550's definitely can.

We have 50+ systems using 3ware cards (7500-9550, 4 and 8 
channel models) with 200+ spindles in use (no hot spares, unfortunately), 
and drives in that pool fail on average around once a month. We have 
only ever had trouble recovering from failed drives on 7500 series 
controllers that have been in production for a reasonably long time.


I don't think that you are justified in your slagging off of 3ware 
controllers.


Tom


Re: wpa_supplicant features and compile options

2007-09-06 Thread Tom Judge

Mark Andrews wrote:

On 12/23/-58 20:59, Greg Rivers wrote:

I connect to certain wireless networks that
require the EAP_GTC and EAP_OTP features in wpa_supplicant.  These
features are not compiled into wpa_supplicant by default.

Using the patch below works great, but it's inconvenient having to
remember to apply it after every cvsup.  Is there a better way to
accomplish this?  If not, might a change such as this be committed to
enable GTC and OTP by default?


Greg,

I'm unable to comment on including your patch into cvs, but for my own
patches like yours, I'm using a script which does 1) csup and 2)
integrates a bunch of patches into /usr/src.

HTH

Volker



Or you can transfer the cvs repository and update your
source tree from that using cvs.  You just leave the local
changes uncommitted.

Alternatively you can cvs import the FreeBSD src periodically.
You can then commit local changes.  This can also make it
easy to roll back to your previous build state.  This should
take less disk space than the previous solution.

Mark

There is a better way: you can use cvsup to mirror the FreeBSD cvs 
repository and then use cvs to commit your changes to your own branch. 
There is a flag for cvs that causes the branches you make to have 
really high branch numbers, so that they are not likely to clash 
with the branches in the master repository.


Tom


Re: geli and SMP problems?

2007-10-25 Thread Tom Judge

Krassimir Slavchev wrote:
SNIP


Any ideas how to debug this?


SNIP

Have you looked at:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

Tom


Re: botched RELENG_6 buildworld (ncurses)

2007-10-25 Thread Tom Judge

Have you read the thread "HEADSUP: don't upgrade to RELENG_6 now (7 is 
fine)"?  I would suggest you do, as the fix was posted by Rong-en 
Fan 8 hours ago.


Tom


Robin P. Blanchard wrote:

Any suggestions how to get past this (now make itself is broken)?

Buildworld/installworld with this morning's sources:


=== lib/ncurses/ncurses (install)
install -C -o root -g wheel -m 444   libncurses.a /usr/lib
install -s -o root -g wheel -m 444 libncurses.so.6 /lib
ln -fs /lib/libncurses.so.6  /usr/lib/libncurses.so
install -o root -g wheel  -m 444
/usr/src/lib/ncurses/ncurses/../../../contrib/ncurses/doc/html/ncurses-intro.html
/usr/src/lib/ncurses/ncurses/../../../contrib/ncurses/doc/html/hackguide.html
/usr/share/doc/ncurses
install -o root -g wheel -m 444 curs_addch.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_addchstr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_addstr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_attr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_beep.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_bkgd.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_bkgrnd.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_border.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_border_set.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_clear.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_color.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_delch.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_deleteln.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_extend.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_getcchar.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_getch.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_getstr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_getyx.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_inch.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_inchstr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_initscr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_inopts.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_insch.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_insstr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_instr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_inwstr.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_kernel.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_mouse.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_move.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_outopts.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_overlay.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_pad.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_print.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_refresh.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_scr_dump.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_scroll.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_slk.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_termattrs.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_termcap.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_terminfo.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_touch.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_trace.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_util.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 curs_window.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 default_colors.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 define_key.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 key_defined.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 keybound.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 keyok.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 legacy_coding.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 ncurses.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 resizeterm.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 wresize.3.gz  /usr/share/man/man3
install -o root -g wheel -m 444 term.5.gz  /usr/share/man/man5
install -o root -g wheel -m 444 terminfo.5.gz  /usr/share/man/man5
install -o root -g wheel -m 444 term.7.gz  /usr/share/man/man7
/libexec/ld-elf.so.1: /lib/libncurses.so.6: Undefined symbol __mb_sb_limit
*** Error code 1

Stop in /usr/src/lib/ncurses/ncurses.
*** Error code 1

Stop in /usr/src/lib/ncurses.
*** Error code 1

Stop in /usr/src/lib.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

Re: SAS5IR performance issue with Dell 860

2007-11-25 Thread Tom Judge

Espen Tagestad wrote:

Hi,

We recently bought 3 new Dell 860 servers with the onboard SAS5/IR SATA 
RAID-controller. They seem to be quite well spec'ed servers with 
management and everything - but I am experiencing av major performance 
issue with the disc i/o. On write I get at max 7-8MB/sec, while read 
gives a bit more (11MB/sec). I tried first with 6.2-RELEASE, and then 
upgraded to 6.3-PRERELEASE without any better results.


I am aware of some discussion around this issue on these two maillists 
in the spring earlier on this year, but I have not been able to find any 
good resolution. My old firewall/router at home equipped with a 733Mhz 
Pentium 3 processor and a old 40GB IDE harddrive made around year 2001 
performe better. Is there anybody out there with the same problem who 
has solved this issue? Could it be that this is solved in 7.0?


Thanks in advance.


Hi,

If you have SATA disks on this controller, you will need to run a RELENG_6 
dated after 2007-06-05 21:32:57.


You will also need to set the hw.mpt.enable_sata_wc sysctl in loader.conf.

Here is Scott Long's commit log of the changes that 'fix' this issue.

Tom


Commit Message:

scottl  2007-06-03 23:13:05 UTC

   FreeBSD src repository

   Modified files:
 sys/dev/mpt  mpt.c mpt.h mpt_cam.c
   Log:
   mpt.c:
   mpt.h:
   Add support for reading extended configuration pages.
   mpt_cam.c:
   Do a top level topology scan on the SAS controller.  If any
SATA device are discovered in this scan, send a passthrough FIS to set
the write cache.  This is controllable through the following tunable at
boot:
   hw.mpt.enable_sata_wc:
   -1 = Do not configure, use the controller default
    0 = Disable the write cache
    1 = Enable the write cache

   The default is -1.  This tunable is just a hack and may be
   deprecated in the future.

Turning on the write cache alleviates the write performance problems
with SATA that many people have observed.  It is not recommend for those
who value data reliability!  I cannot stress this strongly enough.
However,  it is useful in certain circumstances, and it brings the
performence in line with what a generic SATA controller running under
the FreeBSD ATA driver provides (and the ATA driver has had the WC
enabled by default for years).




Re: SAS5IR performance issue with Dell 860

2007-11-25 Thread Tom Judge

Espen Tagestad wrote:

Tom Judge wrote:

Espen Tagestad wrote:
We recently bought 3 new Dell 860 servers with the onboard SAS5/IR 
SATA RAID-controller. They seem to be quite well spec'ed servers with 
management and everything - but I am experiencing av major 
performance issue with the disc i/o. On write I get at max 7-8MB/sec, 
while read gives a bit more (11MB/sec). I tried first with 
6.2-RELEASE, and then upgraded to 6.3-PRERELEASE without any better 
results.


You will also need to set the hw.mpt.enable_sata_wc sysctl in 
loader.conf.


There isn't any hw.mpt.enable_sata_wc sysctl available on my systems. It 
is 6.3-PRERELEASE, but as I recon there wasn't any *mpt* sysctls on the 
latest 6.2-RELEASE either. Is it deprecated? Is there any other options? 
I can see a hw.ata.wc but that one is set to 1 which I presume equals 
enabled.


Anyway - the write cache, is that something that is set on the raid 
controller, or is it a buffer in the FreeBSD kernel that takes care of 
the caching? As the commit note you sent me said - to ensure absolutely 
best data integrity the write cache should be left switched off. But 
write performance of 7-8MB/sec is just too low for that - is the 
controller /sata drives really that slow?





Please read 
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2007-07/msg00347.html


You will have to set the sysctl in /boot/loader.conf and reboot the 
system for the changes to take effect.
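
A minimal sketch of that step, assuming a kernel new enough to include the mpt change described in the commit log. The helper name set_loader_tunable is made up, and the file is passed as a parameter so the function can be tried on a scratch copy before touching the real /boot/loader.conf.

```shell
#!/bin/sh
# set_loader_tunable FILE NAME VALUE
# Append NAME="VALUE" to FILE unless a line setting NAME already exists.
# FILE would normally be /boot/loader.conf.
set_loader_tunable() {
    file=$1
    name=$2
    value=$3
    if grep -q "^${name}=" "$file" 2>/dev/null; then
        echo "$name already set in $file" >&2
        return 0
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$file"
}

# Example (as root, followed by a reboot):
#   set_loader_tunable /boot/loader.conf hw.mpt.enable_sata_wc 1
# After the reboot, the value passed to the kernel can be inspected with:
#   kenv hw.mpt.enable_sata_wc
```

The function deliberately leaves an existing entry alone rather than rewriting it, so a value someone set by hand is never silently changed.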


Tom



Re: bootless!

2009-10-17 Thread Tom Judge

Look at the footers of this email for how to remove yourself.

Swearing will not get you anywhere, and is likely to result in people 
not helping you.



Clifford, Ken wrote:

Take me off this fucking list!@

From: owner-freebsd-sta...@freebsd.org [owner-freebsd-sta...@freebsd.org] On 
Behalf Of Randy Bush [ra...@iij.ad.jp]
Sent: Friday, October 16, 2009 9:13 PM
To: FreeBSD-STABLE Mailing List
Subject: bootless!

i386 running 7.2 as of aug 29
twe, gmirrored boot partition, zfs universe
cvsupped 24 hours ago
new kernel world
will not boot.  get beastie but stops at first twirly

can boot old kernel -s, but not new kernel

can not use old kernel with new world, hangs if i try to /etc/rc.d/zfs
start

am desperate enough that i am flying from dc to seattle see if i can get
it revved back to aug 29 with a fixit

any clues appreciated.  please use From: address, as my normal email is
guess where.

randy
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
  




Re: Regression in dhclient?

2009-10-19 Thread Tom Judge

Kevin Oberman wrote:

I just noticed that my dhclient.conf file seems to be ignored in
8.0. Worked fine in 7.2.

interface ath0 {
  send host-name slan.XXX.YYY;
  prepend domain-name XXX.YYY ;
  append domain-name-servers 198.128.W.ZZ;
}
  


Your interface is wrong; it should be wlanX, not athX.

When I look at /etc/resolv.conf, neither the domain-name nor the
dns-server is added. No errors or anything else in the logs.

Anyone else see this or do I have a local problem?
  


Tom


Re: Nagios + 6.3-RELEASE == Hung Process

2008-01-02 Thread Tom Judge

Michael Butler wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Marc G. Fournier wrote:

G'day ...

  Yesterday, I setup nagios to do some system monitoring ... installed the
latest version from ports into a jail, so that I could easily move it around
between machines as I upgrade, without losing data ... after about 30 minutes
running, I get a second nagios process running (fork?) that takes up as much CPU
time as is available, and just hangs there until I kill -9 it ...


[ .. ]


After searching the 'Net a bit, came across this thread:

http://www.nagiosexchange.org/nagios-users.34.0.html?tx_maillisttofaq_pi1%5Bmode%5D=1tx_maillisttofaq_pi1%5BshowUid%5D=7694

That recommends modifying libmap.conf with:

[/usr/local/bin/nagios]
libpthread.so.2 libthr.so.2
libpthread.so libthr.so


Thanks for pointing this out. I've had similar problems with nagios but
hadn't found a solution until I saw your pointer. Sadly, my expertise
with both thread libraries is sufficiently lacking that I have no clue
where to start looking for the cause :-(



I have also seen this issue, but have always put it down to the way that
we manage our nagios deployments with cfengine.  I will try to deploy
this change and monitor for the problem to see if it persists.

On a side note, if you want to use broker modules with nagios from ports
you need to change the following in the port Makefile in order to make
them load properly:

From:
USE_AUTOTOOLS=  autoconf:259
To:
USE_AUTOTOOLS=  autoconf:259 libltdl:15


I sent an email to the maintainer but got no response and my email did
not seem to have affected the last commit to upgrade to 2.10.


Tom



Re: Nagios + 6.3-RELEASE == Hung Process

2008-01-02 Thread Tom Judge

Jarrod Sayers wrote:

On 03/01/2008, at 1:56 AM, Tom Judge wrote:

I have also seen this issue, but have always put it down to the way that
we manage our nagios deployments with cfengine.  I will try to deploy
this change and monitor for the problem to see if it persists.


I hope I can confirm your frustrations.  There is a threading issue with 
Nagios when its binaries are linked against the libpthread(3) threading 
library, the default on recent FreeBSD 5.x releases and all 6.x 
releases. The issue is random and extremely difficult to track down, with 
the symptoms being a second Nagios process sitting on the system hanging 
a CPU.  Rest assured that I have been working on it, and have seen it 
on one system of mine.




Not sure if this is related at all but out of the 3 nagios deployments 
we have here I have only ever seen it on one (It currently has 2 nagios 
threads spinning CPU time atm).


The differences on that server are:

* It is amd64 compared to i386
* It also runs ndo2db from ndoutils 1.4b7

All the systems run 6.2-RELEASE-p5 and nagios-2.9_1; they are also all 
patched with the GNU libltdl patch below.


Don't know if that info is of any use to you.

Changes have been submitted for net-mgmt/nagios-devel (aka Nagios 
3.0.r1) to force the build process to link against libthr(3) where 
available, removing the need to map libpthread(3) out with 
/etc/libmap.conf.  If this goes well, as stated in the PR, I'll 
back-port it to net-mgmt/nagios (aka Nagios 2.10) in the next few days.


If anyone out there is running net-mgmt/nagios-devel and feels like 
trying it for me, see ports/119246 and drop me an email with a before 
and after ldd /usr/local/bin/nagios.
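
For the before/after comparison, something like the following could pull out just the threading library lines. It is a sketch that assumes the usual "name => path (address)" layout of ldd output; the function name thread_libs is made up, and how a libmap.conf remapping shows up in ldd on a given release is not something this has been checked against.

```shell
#!/bin/sh
# thread_libs: read `ldd <binary>` output on stdin and print each
# threading library the binary asks for, followed by the path it
# resolved to. Lines for other libraries are ignored.
thread_libs() {
    awk '$1 ~ /^lib(pthread|thr)\.so/ && $2 == "=>" { print $1, $3 }'
}

# On a live system:
#   ldd /usr/local/bin/nagios | thread_libs
```

Running it once before and once after rebuilding the port gives two short lists that are easy to paste into a reply.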



On a side note if you want to use broker modules with nagios from port
you need to change the following in the port Makefile in order to make
them load properly:

From:
USE_AUTOTOOLS=  autoconf:259
To:
USE_AUTOTOOLS=  autoconf:259 libltdl:15

I sent an email to the maintainer but got no response and my email did
not seem to have affected the last commit to upgrade to 2.10


I did receive that email and the changes went in with the last commit of 
net-mgmt/nagios-devel to test.  No issues have arisen so I'll be 
back-porting it to net-mgmt/nagios soon for you.  There also has been a 
rather large ports freeze which delayed the upgrade to Nagios 2.10, that 
PR was submitted on the 1st of November and committed on the 13th of 
December.  Unfortunately your email fell somewhere in the middle, 
apologies for not letting you know.




Thanks for this,  I currently maintain the patch on our build servers.



Re: What current Dell Systems are supported/work

2008-01-08 Thread Tom Judge

Richard Bates wrote:

Sorry for the repost...
I don't think the first one posted..

posted to freebsd.stable, freebsd-current, Freebsd-hardware

I checked the hardware in the online documentation manual/hardware

It only lists the bits and pieces of the machine, say the hard drive 
controller and so forth, but doesn't give you a particular system to 
look at as a working machine with FreeBSD 6.2.


does anybody know if a Dell PowerEdge 1950
• Quad-Core Intel Xeon Processors 5400 series 3.16GHz
• 4GB Ram



We have ~20 PE [12]950 systems here, all running 6.2 with a backported 
bce driver from RELENG_6.


Tom

I am looking to attach 2 machines to a SAN to make a constantly up 
system. Is there a Dell SAN and SAN switch that will work with this 
version of BSD?


Thank you for your help

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]




Re: Dell PERC6?

2008-01-17 Thread Tom Judge

Ferdinand Goldmann wrote:

Hi!

I am in the process of buying new Dell hardware, mainly the 2950 III.
According to various postings I found, the PERC6/i Controller _should_ 
work with FreeBSD 6.3. Does anyone successfully use a 2950 III with 
PERC6/i controller and can confirm this?


Sorry if the question sounds stupid, but as I cannot find any references 
to the PERC6 in either documentation or source code I am a bit confused, 
and I wanted to make sure it works before shelling out my employer's 
money. :-)


Many thanks for any enlightenment on this subject,
kind regards,
Ferdinand

Hi,

I have just pulled a new one out of the box and done a boot and 
partition test on it using the 6.3-RC2 boot-only CD.  It seemed to work 
just fine.  I haven't got any PERC6/i's in production at this stage so I 
can't really say anything about their stability/performance, just that 
they appear to work fine in the 10 minute test I just did.


Tom


Re: Backup solution suggestions [ggated]

2008-01-18 Thread Tom Judge

Ulrich Spoerlein wrote:

On Jan 18, 2008 9:11 AM, Johan Ström [EMAIL PROTECTED] wrote:

Your "no, barely, bad, hell no" seems to fit pretty well. I did some
testing during the night with the above (non-production) setup.
What I did was some rsyncing over the night:

while true ; do
 echo `date` Clearing vmail >> logfile
 rm -rf vmail
 echo `date` Starting rsync >> logfile
 rsync -vr /usr/var/vmail . | tee -a logfile
 echo `date` Rsync finished >> logfile
done

I started this at ~02.0. The result? A freshly rebooted 6.2
(6.2-RELEASE-p6 FreeBSD 6.2-RELEASE-p6 #0: Fri Jul 27 15:47:50 UTC 2007)
box in the morning.
[...]
What I don't have is a coredump; judging from dmesg -a, savecore wasn't
even run. Running it now, 5 hours later, didn't find any cores.

The other end (7.0 server) wasn't affected at all.

I'm not really sure what it had been doing, because looking at my
bandwidth graphs from the switch, nothing was done at all. It didn't
even get through one iteration of rsync: ~7.5k files/directories
seem to have been transferred, then the log doesn't say more. And
according to the BW graph, after ~03.00 no traffic was sent at all.

Some known bug with 6.2?


There were some ggatec problems with TCP and/or sockets; I think they
have been mostly resolved post-6.2. If you want to pursue this further
(it *would* be a cool setup, no doubt) I'd suggest three things:
- Update to 6.3
- Leave GELI out of the loop for now (only do ggate, with random data perhaps)
- Build a kernel *without* options PREEMPTION



Hi,

We have 4 production High Availability NFS clusters running 
GMirror+GGate+LinuxHA (2 nodes per cluster) on RELENG_6_2.  This setup 
has proved very stable for us, though you do have to do some tuning:


/etc/sysctl.conf:
net.inet.tcp.sendspace=1048576
net.inet.tcp.recvspace=1048576
kern.ipc.maxsockbuf=2049152

/boot/loader.conf:
kern.ipc.nmbclusters=32768


Command line options to ggate[cd]:
ggate[dc]_buf_size=1310720
ggatec_timeout=5
ggatec_queue_size=2048
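
As a rough sketch of how those last three values might end up on a ggatec(8) command line: the variable names above look like they belong to a local rc script, so mapping buf_size to both -R and -S is an assumption, and the host name and device path are placeholders. The helper only prints the command; it does not run it.

```shell
#!/bin/sh
# Values from the tuning list above.
buf_size=1310720
timeout=5
queue_size=2048

# Hypothetical helper: print (not run) a ggatec(8) invocation.
# -R/-S set the receive/send buffer sizes, -q the queue size and
# -t the timeout in seconds. $1 is the ggated host, $2 the exported path.
ggatec_cmd() {
    host=$1
    path=$2
    echo "ggatec create -o rw -q $queue_size -R $buf_size -S $buf_size -t $timeout $host $path"
}

# Placeholder host and device, for illustration only.
ggatec_cmd nfs-peer.example.org /dev/da1
```

Note that the buffer size stays below kern.ipc.maxsockbuf (2049152), which is presumably why that sysctl is raised.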

Cluster node uptimes range from 40-160 days with the last reboots being 
caused by power problems not FreeBSD issues.


The problems may be in the tuning or with geli.  Personally I would leave 
geli out and try with the above configuration first, then try again with 
geli to see where the problem is.


Tom









Re: Dell Perc 6 disk geometry problem with RAID5 (both 6.3 final and 7.0 RC1)

2008-01-20 Thread Tom Judge

Erik Trulsson wrote:

On Sun, Jan 20, 2008 at 04:48:56PM +, David Wood wrote:

[ambrisko@ and scottl@ added to CCs]

Hi there,

In message [EMAIL PROTECTED], 
Aldas Nabazas [EMAIL PROTECTED] writes

We bought a new Dell PowerEdge 2950III with Perc 6/i and have the disk
geometry problem using 6.3 final or 7.0 RC1. It seems that we are not alone;
at least one person reported a similar problem earlier:
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/questions/2008-01/msg00506.html



This post is related to using bsdlabel and fdisk (MBR) partition tables 
with a disk over 2TB.  This is a known limitation of this type of 
partition table, where the maximum size is 2TB.  If you wish to make a 
larger partition you must use GPT, or the raw device with no partition 
table (not recommended).
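
For reference, the 2TB figure falls out of simple arithmetic. The sketch below assumes 512-byte sectors and 32-bit sector counts in the MBR slice entries; the function names are made up for illustration.

```shell
#!/bin/sh
# MBR slice entries store start and length as 32-bit sector counts.
# Assumption: 512-byte sectors.
SECTOR_BYTES=512
MAX_SECTORS=4294967296   # 2^32

# Largest size addressable by one MBR slice, in bytes.
mbr_limit_bytes() {
    echo $((MAX_SECTORS * SECTOR_BYTES))
}

# Succeeds (exit 0) if a volume of $1 bytes is too large for an MBR slice.
exceeds_mbr_limit() {
    [ "$1" -gt "$(mbr_limit_bytes)" ]
}

mbr_limit_bytes   # 2^41 bytes, i.e. 2 TiB
```

Any volume for which exceeds_mbr_limit succeeds needs GPT rather than fdisk/bsdlabel.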



I was reading the mailing list and found that some of the people are happily
using this hardware with the latest FreeBSD:
http://lists.freebsd.org/pipermail/freebsd-stable/2008-January/039675.html

I just wonder what the status of the mfi driver is. Maybe it's not fully
tested, or there will be some important fixes before 7.0 final? We are going
to try different RAID combinations, but it is definitely not working with
6x146GB as RAID5.


I do not know if the mfi(4) driver has any problems with large disks, but it
is well known that fdisk(8) and bsdlabel(8) (the tools normally used
to partition disks) have problems with volumes larger than 2TB.

If you want to use volumes larger than 2TB then gpt(8) is the recommended
way to partition the disks.  It is however doubtful whether the BIOS in your
system will allow you to boot from a gpt(8) partitioned volume, which is
best solved by having a separate, smaller boot volume where the OS
itself is installed.



I have 5 PERC6/i based systems awaiting deployment atm, and might be able to 
do some very limited testing on one of them if time permits.  However I 
have some very large arrays running on PERC5/e's (6TB RAID 50 - 5 disk 
spans - 15 * 500GB disks spread over 2 md1000 shelves) (the systems run 
RELENG_6_2) and have not seen any issues with them.  They are a single 
gpt partition with UFS2 on them.  As far as I have seen in the last 
~8 months there have been no issues with the mfi driver and large 
arrays.  However I cannot say at this stage if the same can be said for 
the PERC6/i.
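
As a quick sanity check on the 6TB figure: a RAID 50 array stripes across RAID 5 spans, and each span gives up one disk's worth of space to parity. A minimal sketch of that arithmetic follows (the function name is made up, and sizes are in decimal GB as sold):

```shell
#!/bin/sh
# Usable RAID 50 capacity: spans * (disks_per_span - 1) * disk_size.
# $1 = total disks, $2 = disks per span, $3 = size of one disk in GB.
raid50_usable_gb() {
    total_disks=$1
    span_size=$2
    disk_gb=$3
    spans=$((total_disks / span_size))
    echo $((spans * (span_size - 1) * disk_gb))
}

# 15 disks of 500GB in 5-disk spans: 3 spans, 4 data disks each.
raid50_usable_gb 15 5 500
```

That comes to 6000GB, well past the 2TB MBR limit, which is why these arrays carry a single gpt partition.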





This is extremely disappointing to read, as I was relatively close to 
buying a Poweredge 2950 III with PERC 6/i. However, it's no good to me if 
mfi(4) has issues with large virtual disks; the tentative disk 
configuration is 2x146GB as RAID 1 and 4x750GB (or 1TB) as RAID 5.



Looking at CVSweb:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/mfi/



There is one related PCI ID change, to do with Dell sub-vendor IDs, that I 
think is only on HEAD.  Any chance of getting this MFC'd?


Tom


the updates for the 1078 chip which powers the PERC 6 series were 
contributed by LSI, so you would have hoped things were right. There is a 
disclaimer on the code, but you would also hope that someone in the know is 
testing it, especially as my impression has always been that the FreeBSD 
community is favourably disposed towards LSI storage controllers and that 
LSI and their vendors try to help the FreeBSD developers.




Maybe someone could share the RAID combinations they are successfully
running?

I haven't been keeping a very close eye on the problem as I don't currently 
have any hardware - but is the issue simply one of virtual disk size - 
there's a cut-off size after which things don't work properly?


You could try pulling disks from your server (or removing them from the 
virtual disk) one by one until things start to work. To save a lot of pain 
you could create a virtual disk containing one disk as RAID 0 (a single 
disk) and install the OS on it, leaving the other five disks to play with.



I do hope that someone is in a position to investigate this quickly - even 
if it's too late to get the fix in 7.0-RELEASE now. There's nothing that I 
can see as relevant that's waiting for MFC, anyway.


Of course, it's worth checking whether you've got the latest firmware on 
the PERC 6/i - the latest version from Dell appears to be 6.0.1-0080.



If this doesn't get fixed soon, I'm either going to have to go to another 
hardware vendor that uses a different RAID controller (HP is a possibility 
- though we're an all Dell shop) or - sniff - leave FreeBSD in favour of a 
Linux distribution. I realise that FreeBSD is a volunteer project, and that 
users can't have any specific expectations - this is not a threat, as I 
like FreeBSD and want to remain in the community, but I would need a new 
server to work properly!



Looking at CVSweb, it seems that ambrisko@ and scottl@ are the two people 
most closely associated with the mfi code - I've added them into the CCs. 
My apologies if that was unwelcome.








Re: Latest Stable FreeBSD release and is it supported on dell 2950

2008-01-23 Thread Tom Judge

Vivek Khera wrote:


On Jan 23, 2008, at 12:30 AM, navneet Upadhyay wrote:

SNIP



6. Which platforms/machines does BSD support? Is the Dell 2950 OK?


I've only ever had one compatibility issue with a Dell, and that was 
easily fixed by teaching the bge ethernet driver the name of the chipset 
used on that particular box.  I have a PE1900 here we just got and it 
runs 7.0-RC1 quite nicely.


In general, unless you're running some obscure devices, FreeBSD works 
just fine with most systems.



PE2950's use the bce driver, which works fine with 6.3+

Tom J


Re: Well-supported SAS RAID card for 6.3?

2008-01-27 Thread Tom Judge

Josh Endries wrote:

Hi Scott,

Scott Long wrote:
LSI, Highpoint, Areca, 3ware, and Adaptec are all well supported in 
FreeBSD.


Are they? I don't see any reference to the LSI8708, LSI or LSI1068 
in the man pages I can find...does anyone use these? Some people have 
problems with the PERC 6/i (which I think is an LSI), which makes me 
wonder.

SNIP

The problems I have seen reported about this are due to the use of FDISK 
partitions on arrays larger than 2TB.  I have yet to see a report about 
a real fault with the mfi driver (on stable@ and current@ at least).


Tom


Re: Crashing repeatedly: 6.2-RELEASE-p5 and MySQL 5.0.41

2008-02-04 Thread Tom Judge

Primeroz lists wrote:

Hi all,

we are experiencing repeated crashes on a Dell PowerEdge 2950 (rev 1 or 2).

FBSD release is 6.2-RELEASE-p5 , AMD64. 2xXeon QuadCore and 8G of Ram.


SNIP




$ sudo  kgdb /usr/obj/usr/src/sys/PE2950/kernel.debug vmcore.2

SNIP


This backtrace will be useless: I rebuilt the kernels on the build 
servers the week before last, and not all systems got the updates.  You 
will need to 'make installkernel' and crash the box again to get a usable 
vmcore...



Pointy hat to me, sorry; I should have told you this before.

Tom


RELENG_7_0 buildworld failure on read only source tree

2008-03-07 Thread Tom Judge

Hi,

We have been building RELENG_6_x source trees from read only NFS file 
systems for well over a year now without any problems.  However I have 
just tried to do make buildworld on a RELENG_7_0 source tree from 
yesterday and it failed to build with the following error:



===> gnu/usr.bin/cvs/cvs (cleandir)
rm -f cvs add.o admin.o annotate.o buffer.o checkin.o checkout.o 
classify.o client.o commit.o create_adm.o cvsrc.o diff.o edit.o 
entries.o error.o expand_path.o fileattr.o filesubr.o find_names.o 
hardlink.o hash.o history.o ignore.o import.o lock.o log.o login.o 
logmsg.o main.o mkmodules.o modules.o myndbm.o no_diff.o parseinfo.o 
patch.o prepend_args.o rcs.o rcscmds.o recurse.o release.o remove.o 
repos.o root.o run.o scramble.o server.o stack.o status.o subr.o tag.o 
update.o vers_ts.o version.o watch.o wrapper.o zlib.o cvs.1.gz cvs.5.gz 
cvs.1.cat.gz cvs.5.cat.gz

rm -rf cvs-sanity
rm -f .depend GPATH GRTAGS GSYMS GTAGS
===> gnu/usr.bin/cvs/contrib (cleandir)
sed -e 's,@CSH@,/bin/csh,' -e 's,@PERL@,/usr/bin/perl,' 
/usr/src/gnu/usr.bin/cvs/contrib/../../../../contrib/cvs/contrib/Makefile.in 
> Makefile

cannot create Makefile: Read-only file system
*** Error code 2

Stop in /usr/src/gnu/usr.bin/cvs/contrib.
*** Error code 1

Stop in /usr/src/gnu/usr.bin/cvs.
*** Error code 1

Stop in /usr/src/gnu/usr.bin.
*** Error code 1

Stop in /usr/src/gnu.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.


Should it be possible to build RELENG_7_0 with a read only source tree?

Thanks

Tom


Re: RELENG_7_0 buildworld failure on read only source tree

2008-03-09 Thread Tom Judge

Xin LI wrote:

Tom Judge wrote:

Hi,

We have been building RELENG_6_x source trees from read only NFS file 
systems for well over a year now without any problems.  However I 
have just tried to do make buildworld on a RELENG_7_0 source tree 
from yesterday and it failed to build with the following error:



[error snipped]


Should it be possible to build RELENG_7_0 with a read only source tree?


This can be worked around; I think touching the Makefile would do the 
trick.  IIRC David (cc'ed) fixed the Makefile for this exact issue at 
some point this January while he was updating the base cvs(1). Maybe 
we should MFC the changeset to RELENG_7?


Cheers,



Thanks for the response.  It was indeed just a case of touching 
/usr/src/gnu/usr.bin/cvs/contrib/Makefile.  Do you have the information 
about the relevant change sets that fix this?  Then I can merge this fix 
to my local tree.
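
A small sketch of why the touch helps, on the assumption that make tried to regenerate Makefile from Makefile.in only because the .in file carried the newer timestamp. is_stale is a hypothetical helper that mimics the timestamp comparison; it is not how make itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if target $1 would be considered out of
# date relative to source $2 (target missing, or source newer).
is_stale() {
    target=$1
    source_file=$2
    [ ! -e "$target" ] && return 0
    [ -n "$(find "$source_file" -newer "$target" 2>/dev/null)" ]
}

# Demonstration in a scratch directory (dates are arbitrary).
demo_dir=$(mktemp -d)
touch -t 200801010000 "$demo_dir/Makefile"
touch -t 200803060000 "$demo_dir/Makefile.in"
is_stale "$demo_dir/Makefile" "$demo_dir/Makefile.in" && echo "before touch: stale"
touch "$demo_dir/Makefile"
is_stale "$demo_dir/Makefile" "$demo_dir/Makefile.in" || echo "after touch: up to date"
rm -rf "$demo_dir"
```

On a read-only mount the touch of course has to happen on the NFS server, or wherever the tree is writable.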


Thanks

Tom


Using FreeBSD Update to deploy system updates from custom builds

2009-01-13 Thread Tom Judge

Hi,

I was wondering if anyone was using freebsd-update to manage deployment 
of custom FreeBSD builds to their systems.


Here is the scenario, I have 2 binary build servers at the moment (one 
for i386 and one for amd64) and currently we stage the deployments of 
updates on NFS servers at each site.  We use make installworld/kernel to 
update the servers from read only src and obj NFS mounts.


I'm now looking to remove the src trees from the NFS servers and 
possibly the obj trees and use freebsd-update to deploy and maintain the 
custom build installation and updates.


So I have 2 questions:

   1) Does this seem sensible?  It seems within scope of what 
freebsd-update was designed to do.


   2) How does one go about building the binary distributions that 
freebsd-update expects to be on the update server?



Thanks

Tom

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Using FreeBSD Update to deploy system updates from custom builds

2009-01-15 Thread Tom Judge

Daniel Bond wrote:

Hi Tom,

I don't know how much documentation there is on this, but if you are 
investigating this issue, maybe you would like to contribute/update 
some documentation on it?


Royce gave me a link to the tools:
http://www.freebsd.org/cgi/cvsweb.cgi/projects/freebsd-update-server/

Reading through some of the scripts might give some clues.



Regards,

Daniel Bond.



Thanks for the info,  I will look into this over the next few weeks and 
see what I can come up with.


Regards

Tom Judge


On Jan 14, 2009, at 6:05 AM, Tom Judge wrote:


Hi,

I was wondering if anyone was using freebsd-update to manage 
deployment of custom FreeBSD builds to their systems.


Here is the scenario, I have 2 binary build servers at the moment 
(one for i386 and one for amd64) and currently we stage the 
deployments of updates on NFS servers at each site.  We use make 
installworld/kernel to update the servers from read only src and obj 
NFS mounts.


I'm now looking to remove the src trees from the NFS servers and 
possibly the obj trees and use freebsd-update to deploy and maintain 
the custom build installation and updates.


So I have 2 questions:

  1) Does this seem sensible?  It seems within scope of what 
freebsd-update was designed to do.


  2) How does one go about building the binary distributions that 
freebsd-update expects to be on the update server?



Thanks

Tom



Re: Increasing number of requests for jumbo clusters denied in netstat -m

2009-11-03 Thread Tom Judge

Tim Chen wrote:

My machine is an IBM HS21 blade server and the NIC is bce.

bce1: <Broadcom NetXtreme II BCM5708 1000Base-SX (B2)> mem 0xd800-0xd9ff irq 19 at device 0.0 on pci6
miibus1: <MII bus> on bce1
bce1: Ethernet address: 00:1a:64:34:f2:fa
bce1: [ITHREAD]
bce1: ASIC (0x57081021); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (0x03040405); Flags( MSI )

That machine is serving as a mail and web server. Recently I found that the
number of requests for jumbo clusters denied in netstat -m increases all
the time.

1031/3469/4500 mbufs in use (current/cache/total)
510/3326/3836/32768 mbuf clusters in use (current/cache/total/max)
510/2278 mbuf+clusters out of packet secondary zone in use (current/cache)
1/1453/1454/8704 4k (page size) jumbo clusters in use (current/cache/total/max)
510/1086/1596/4352 9k jumbo clusters in use (current/cache/total/max)
0/0/0/2176 16k jumbo clusters in use (current/cache/total/max)
5871K/23105K/28977K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/4337166/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

ifconfig shows:
bce1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
        ether 00:1a:64:34:f2:fa
        inet 192.168.152.152 netmask 0xff00 broadcast 192.168.152.255
        media: Ethernet autoselect (1000baseSX full-duplex)
        status: active

And in rc.conf:
ifconfig_bce1="inet 192.168.152.152  netmask 255.255.255.0 mtu 9000"

Does the increasing number of requests for jumbo clusters denied result
from my mtu 9000 jumbo frame setting?
Is it a problem with the bce NIC's driver or hardware?
Most of all, how will my system be affected by the denied requests for
jumbo clusters? Will they degrade performance, or are they harmless?



Second thread on this bug today :)

There is a bug in bce(4) that causes memory fragmentation, which results 
in denied mbuf requests as you are seeing.


You can correct this issue by applying the patch created by this command:

svn diff -r 198319:198320 http://svn.freebsd.org/base/head


Once you have applied the patch you need to add:

options BCE_JUMBO_HDRSPLIT

to your kernel config, then recompile and install the new kernel.


Hope this works for you.

Tom




UFS Panic on 7.1 - ffs_valloc: dup alloc

2009-11-30 Thread Tom Judge

Hi,

I had a panic today when someone created a symlink over NFS to a UFS 
file system.


There seem to be 2 open PRs on this already:

kern/122380
kern/133980

Any ideas on a fix?  I have not tried to repeat this crash but I have 
saved a snapshot of the file system so I can test if needed.  I also 
have the core file preserved.


# uname -a
FreeBSD mongo.XXX 7.1-RELEASE-p4 FreeBSD 7.1-RELEASE-p4 #0 @718:817M: 
Tue Nov 24 02:31:49 UTC 2009 
t...@dev-tj-7-1-amd64.xxx:/usr/obj/usr/src/sys/XXXv5  amd64





# kgdb /boot/kernel/kernel vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
mode = 0100600, inum = 2355296, fs = /usr/home
panic: ffs_valloc: dup alloc
cpuid = 0
Uptime: 5d13h10m53s
Physical memory: 6122 MB
Dumping 510 MB: 495 479 463 447 431 415 399 383 367 351 335 319 303 287 
271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15


#0  doadump () at pcpu.h:195
195 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0x0004 in ?? ()
#2  0x8048e079 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#3  0x8048e482 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:574
#4  0x80607752 in ffs_valloc (pvp=Variable "pvp" is not available.
) at /usr/src/sys/ufs/ffs/ffs_alloc.c:968
#5  0x8063104e in ufs_makeinode (mode=41453, dvp=0xff001d954dc8, vpp=0xb48e28a8, cnp=0xb48e28d0) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2254
#6  0x8063153f in ufs_symlink (ap=0xb48e29a0) at /usr/src/sys/ufs/ufs/ufs_vnops.c:1831
#7  0x80737fe3 in VOP_SYMLINK_APV (vop=Variable "vop" is not available.
) at vnode_if.c:1351
#8  0x805b8f38 in nfsrv_symlink (nfsd=0xff0065996100, slp=0xff00035e6e00, td=0xff00041886e0, mrq=0xb48e2b00) at vnode_if.h:712
#9  0x805b in nfssvc (td=Variable "td" is not available.
) at /usr/src/sys/nfsserver/nfs_syscalls.c:456
#10 0x806d7fa7 in syscall (frame=0xb48e2c80) at /usr/src/sys/amd64/amd64/trap.c:907
#11 0x806be06b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:330
#12 0x000800687bfc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) frame 4
#4  0x80607752 in ffs_valloc (pvp=Variable "pvp" is not available.
) at /usr/src/sys/ufs/ffs/ffs_alloc.c:968
968                     panic("ffs_valloc: dup alloc");
(kgdb) list
963             }
964             ip = VTOI(*vpp);
965             if (ip->i_mode) {
966                     printf("mode = 0%o, inum = %lu, fs = %s\n",
967                         ip->i_mode, (u_long)ip->i_number, fs->fs_fsmnt);
968                     panic("ffs_valloc: dup alloc");
969             }
970             if (DIP(ip, i_blocks) && (fs->fs_flags & FS_UNCLEAN) == 0) {  /* XXX */
971                     printf("free inode %s/%lu had %ld blocks\n",
972                         fs->fs_fsmnt, (u_long)ino, (long)DIP(ip, i_blocks));
(kgdb)




Re: UFS Panic on 7.1 - ffs_valloc: dup alloc

2009-12-01 Thread Tom Judge
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Jaakko Heinonen wrote:
 On 2009-11-30, Tom Judge wrote:
 kern/122380
 kern/133980

 Any ideas on a fix?

 panic: ffs_valloc: dup alloc
 
 You may be hitting UFS2 32-bit inode limit bug. See this analysis by
 Bruce Evans:
 
 http://docs.freebsd.org/cgi/mid.cgi?20090508120355.S1497
 

I briefly read the information that you linked to.  It seems this is for
file systems that are very large or have strange tuning?  This is a
standard FS created by newfs with soft updates enabled.


Is either of the mentioned PRs relevant to this bug?  If so I can
submit an update.

- -- FS INFO --

df -i /usr/home
Filesystem       1K-blocks  Used    Avail Capacity iused   ifree %iused  Mounted on
/dev/mirror/home  20308396 55764 18627962     0%   13419 2624403    1%   /usr/home



dumpfs /dev/mirror/home | head -n18
magic    19540119 (UFS2)  time     Tue Dec  1 19:09:22 2009
superblock location 65536 id       [ 4b07e10b 31429cb7 ]
ncg      112      size     10485759 blocks   10154198
bsize    16384    shift    14       mask     0xc000
fsize    2048     shift    11       mask     0xf800
frag     8        shift    3        fsbtodb  2
minfree  8%       optim    time     symlinklen 120
maxbsize 16384    maxbpg   2048     maxcontig 8       contigsumsize 8
nbfree   1265749  ndir     3721     nifree   2624403  nffree   324
bpg      11758    fpg      94064    ipg      23552    unrefs   0
nindir   2048     inopb    64       maxfilesize 140806241583103
sbsize   2048     cgsize   16384    csaddr   3000     cssize   2048
sblkno   40       cblkno   48       iblkno   56       dblkno   3000
cgrotor  0        fmod     0        ronly    0        clean    0
avgfpdir 64       avgfilesize 16384
flags    soft-updates
fsmnt    /usr/home
volname  home     swuid    0
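
For what it's worth, the numbers above can be turned into a quick size check. This is only a sketch: it takes ncg and ipg from the dumpfs output and the inode number from the earlier panic message, and it assumes that 2^31 (where a signed 32-bit inode number would overflow) is a reasonable threshold for the bug being discussed.

```shell
#!/bin/sh
# Values copied from the dumpfs output above.
ncg=112              # number of cylinder groups
ipg=23552            # inodes per cylinder group
panic_inum=2355296   # inode number reported in the earlier panic output

# Total inodes in the file system: ncg * ipg.
total_inodes() {
    echo $(( $1 * $2 ))
}

# Cylinder group that holds inode number $1, given $2 inodes per group.
cg_of_inode() {
    echo $(( $1 / $2 ))
}

total=$(total_inodes "$ncg" "$ipg")
echo "total inodes: $total"
echo "panic inode is in cylinder group $(cg_of_inode "$panic_inum" "$ipg")"

# Assumed threshold: 2^31.
if [ "$total" -lt 2147483648 ]; then
    echo "well below 2^31 inodes"
fi
```

With about 2.6 million inodes in total, this file system is nowhere near a 32-bit boundary, which fits the impression that it is an ordinary newfs layout.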


Tom

- --
TJU13-ARIN
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.13 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJLFWq6AAoJEMSwVS7lr0OdzqUIAJk4Qxq7H/Wuhefq5OSyx3UE
uZjCj59mYQm/nabr6qea9oDqXHmeqz8T0mbWxpsNElVEXVPHS5CJE6goUQYpYrGG
/XAhaT3Cq8wHJEXLHv7v7+z22VtUsVnwOfwcZUL0S0otx7xhErnjQseeWc5/i20K
ObkqaNJDhsNs7BISbBC0hKd8Ar+towcvVZlxDrX16vZucC/Vwi/08Af7bG05tgg/
03TDjrUf4w3wP31taeY4mTYaGtYibM1PMIIrXo8mjNY3LvlD290gvqi4OFQxslvU
K2IFHSQJAzsKqkJwn/wAfCTXa4wWlsjDva/I9jjsNCzN9KkypTkZaqpdT3VzLgA=
=saDl
-END PGP SIGNATURE-