Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)

2009-10-05 Thread Richard Perini
On Sat, Oct 03, 2009 at 10:27:39PM +, Bjoern A. Zeeb wrote:

...

 
 As we will try to keep the default in 8.x and 9.x to disallow user
 mappings at virtual address 0,  we are interested in further issues
 that were not yet mentioned in either this thread or the Errata Notice.

quagga 0.99.15 (built from ports) has the same issue as samba.

--
Richard Perini                   Internet:  r...@ci.com.au
Corinthian Engineering Pty Ltd   PHONE:   +61 2 9552 5500
Sydney, Australia                FAX:     +61 2 9552 5549
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: ral(4) on 8-RC1

2009-10-05 Thread Matt Dawson
On Monday 05 Oct 2009 00:28:20 you wrote:
 maxpower are expressed as dBm.

Thanks for that and the pointers to get the channels right. A temporary 
hack in net80211 was trivial and I can take my time with the ral end of 
things. I believe Kip Macy was the last to look at ral in-depth, around the 
time support was added for gen 2 chipset, although I can't seem to find the 
code that was in p4 at that time. It was a while ago...

 I had problems w/ the iwi firmware on 64-bit so set the build to i386
 only.  The problems I had were relocation errors and no one could help;
 if those are gone then building the fw image for amd64 should be fine.
 Whether the driver works is another matter...
 
The iwi driver seems to work for normal operation with the caveat that 
802.11s is never going to work, but that's noted on the wiki anyway. I 
haven't been able to get Kismet to work (it used to on i386 on the iwi) 
although that may be down to PEBKAC in not fully understanding the 802.11 
architectural changes in 8. I'll hammer the thing for a few more days, see 
if I can find any regression tests to apply to this setup and maybe move it 
to a different machine to ensure it works in multiple environments. This 
is, however, exactly the same hardware that failed to work with 7.1 amd64, 
so I'm pretty confident that it should be consistent.

I'm torn between decent support for most things on ral and dual band and a 
more sensitive radio, to the order of around 5dB in the same location, on 
iwi. I'm tempted to just get an Atheros a/b/g card and be done with it. 
Oxford Tech have some AR5414/AR5006XS cards for reasonable money.

Thanks for your help, Sam.

Best regards,
-- 
Matt Dawson
MTD15-RIPE


Re: [setup] no floppies FreeBSD 8 ?

2009-10-05 Thread Remko Lodder

I am not sure whether we dropped floppy support.

But I can imagine that the release candidates do not have floppies.

On Oct 4, 2009, at 1:54 PM, luca wrote:


hi,

I'm looking for the floppies images to install FreeBSD 8 on a PC  
which can't boot from CDROM ; but the images are not available (i.e.  
no floppies directory)


Did FreeBSD 8 drop floppy install?

Regards,
Luca






--
/\    Best regards,              | re...@freebsd.org
\ /   Remko Lodder               | re...@efnet
 X    http://www.evilcoder.org/  |
/ \   ASCII Ribbon Campaign      | Against HTML Mail and News



Re: em0 watchdog timeouts

2009-10-05 Thread Daniel Bond

Hi,

I've been struggling with watchdog timeouts in 7.1/7.2-RELEASE for the
past 6 months too. It looks related.

I've tried replacing the hardware 3 times (2 different IBM x3755
chassis, one IBM x3650 chassis).
I first tried the onboard PCI-X Broadcom NICs (bce-based), until I had
issues with watchdog timeouts.

I tried replacing them with a 4-port PCI-X Intel NIC, which gave me the
same problems. I was told that the 4-port Intel NICs had an onboard
bus-controller that could cause trouble, so I replaced this with a
2-port PCI-e Intel NIC, which I was told by a Sepherosa Ziehau was the
best performing gig-e NIC (rx/tx).

Still getting watchdog timeouts, I tried adjusting all sorts of sysctls
I found in mailing-list threads (disable msi/msix interrupts, adjust
rx/tx processing, etc.).
I tried upgrading the BIOS and the firmware on all kinds of stuff
(disks, BMC, etc.) to the newest versions. I also tried using a
different qlogic isp(4) FC-controller (PCI-e).

No matter what I tried, I could not diagnose this problem, or at least
fix it. It also happened rarely enough to make debugging difficult. I
would get a series of "watchdog timeout -- resetting" messages until
the NIC went completely offline, at which point I'd reboot it from the
console.

This happened about once every 1-10 days, usually around 11-13:00. This
machine has now been replaced with Linux, unfortunately, just to avoid
more customer complaints and downtime. The IBM x3755 with FreeBSD 7.2
which was replaced with Linux is still online, and can be put at the
disposal of any developers who would like to debug this further.

Like Stefan Krueger mentioned, this machine is also running as an NFS
server, with a mix of BSD and Linux clients, and it's getting hit
pretty hard by clients.



Hope we can iron this bug out, in the future.


Best regards,


Daniel Bond.



On Oct 2, 2009, at 10:36 PM, Rudy wrote:



Ah, I'll stop messing with them.


I just set them all to 0 to see if that will help and noticed the card
was leaving tx_int_delay=1.

# sysctl dev.em.4.debug=1
Oct  2 13:26:07 mango kernel: em4: tx_int_delay = 1, tx_abs_int_delay = 0
Oct  2 13:26:07 mango kernel: em4: rx_int_delay = 0, rx_abs_int_delay = 0


# sysctl dev.em.4
dev.em.4.%desc: Intel(R) PRO/1000 Network Connection 6.9.12
dev.em.4.rx_int_delay: 0
dev.em.4.tx_int_delay: 0
dev.em.4.rx_abs_int_delay: 0
dev.em.4.tx_abs_int_delay: 0

Splitting traffic to different ports has brought down the watchdog
events to once a day.  ... essentially, I have a quad 30Mbps (not quad
1Gbps) card.  heheh.
Would turning off net.inet.ip.fastforwarding or any other setting help?

Today, I set net.inet.ip.fw.enable=0 and I'll see if that helps.  I have
a feeling that isn't related to the NIC at all, but I'm not sure what
else to try.

Rudy



Jack Vogel wrote:
Watchdog resets the adapter. Messing with these values is of dubious
value anyway.

Jack


On Fri, Oct 2, 2009 at 11:36 AM, Rudy cra...@monkeybrains.net wrote:




I noticed something interesting.

I set the rx_int_delay to 0:
sysctl dev.em.5.rx_int_delay=0

Checking via sysctl dev.em.5.debug=1 shows rx_int_delay is indeed 0:
Oct  1 17:32:41 mango kernel: em5: rx_int_delay = 0, rx_abs_int_delay = 66

After a watchdog event, sysctl dev.em.5.debug=1 shows rx_int_delay is
now 32:
Oct  2 11:29:49 mango kernel: em5: rx_int_delay = 32, rx_abs_int_delay = 66

However, running sysctl dev.em.5 shows it as 0:
dev.em.5.rx_int_delay: 0
dev.em.5.tx_int_delay: 66

Seems like the adapter and the kernel don't agree on the rx_int_delay
value.

Rudy
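
The mismatch Rudy describes can be checked mechanically. What follows is a
minimal sketch, assuming the kernel debug line and the sysctl line look
exactly like the ones quoted in his message; the helper names hw_rx_delay,
sw_rx_delay and compare_rx_delay are made up for this illustration and are
not part of em(4) or sysctl(8).

```sh
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Read kernel log lines on stdin and print the last rx_int_delay value the
# adapter reported after "sysctl dev.em.N.debug=1".
hw_rx_delay() {
    sed -n 's/.*rx_int_delay = \([0-9][0-9]*\),.*/\1/p' | tail -n 1
}

# Read "sysctl dev.em.N" output on stdin and print the rx_int_delay value
# held in the sysctl tree.
sw_rx_delay() {
    sed -n 's/^dev\.em\.[0-9][0-9]*\.rx_int_delay: \([0-9][0-9]*\)$/\1/p' | tail -n 1
}

# Report whether the two views agree.
compare_rx_delay() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: adapter=$1 sysctl=$2"
    fi
}

# Sample input copied from the message above.
hw=$(printf '%s\n' 'Oct  2 11:29:49 mango kernel: em5: rx_int_delay = 32, rx_abs_int_delay = 66' | hw_rx_delay)
sw=$(printf '%s\n' 'dev.em.5.rx_int_delay: 0' | sw_rx_delay)
compare_rx_delay "$hw" "$sw"
```

With the sample lines this prints "mismatch: adapter=32 sysctl=0". On a live
system the two inputs would presumably come from the system log and from
"sysctl dev.em.5" respectively.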














Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)

2009-10-05 Thread Mike Tancsa

At 12:47 PM 10/4/2009, Andre Albsmeier wrote:

On Sat, 03-Oct-2009 at 22:27:39 +, Bjoern A. Zeeb wrote:
 On Sat, 3 Oct 2009, Andre Albsmeier wrote:

 Hi,

  On Sat, 03-Oct-2009 at 16:27:32 -0400, jhell wrote:
  On Sat, 3 Oct 2009 14:42 -, Andre.Albsmeier wrote:
 
  FYI,
 
  after setting security.bsd.map_at_zero to 0 on 7.2-STABLE all
  samba33 programmes did abort() immediately after start. The
  solution was to use
 
  CONFIGURE_ARGS+= --disable-pie
 
-Andre
 
 
  To add an additional note, samba33, even when not running (not
  enabled by an rcvar), also runs a tdbcleanup routine on shutdown
  and/or start that also does abort().
 
  Yes, every samba programme is linked with -pie per default (so
  all abort()).


 Thanks for reporting the issue.  People are aware of the problem now
 and we'll try to present a solution within the next days for better
 position-independent executable (PIE) handling.

 Meanwhile there are multiple solutions for people affected:

 (1) recompile the port; but as more than just samba might be affected
  and we generally do not want to flip the pie switch everywhere that's
 probably only a temporary, private solution.

I'll stick to this since I am happy about having the map_at_zero
option and want to continue to try it out on 7.2-STABLE. And I
see no reason why samba has to be linked with -pie (without -pie
it is also 4% smaller).


Hi,
What are the impacts (if any) of compiling all the ports that are
affected by setting security.bsd.map_at_zero=0 with PIE disabled?


---Mike





Mike Tancsa,                      tel +1 519 651 3400
Sentex Communications,            m...@sentex.net
Providing Internet since 1994     www.sentex.net
Cambridge, Ontario Canada         www.sentex.net/mike



Re: em0 watchdog timeouts

2009-10-05 Thread Robert Blayzor

On Oct 2, 2009, at 4:36 PM, Rudy wrote:
Today, I set net.inet.ip.fw.enable=0 and I'll see if that helps.  I have
a feeling that isn't related to the NIC at all, but I'm not sure what
else to try.



Just curious, have you tried (or are you using) device polling?

--
Robert Blayzor, BOFH
INOC, LLC
rblay...@inoc.net
http://www.inoc.net/~rblayzor/





Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)

2009-10-05 Thread Bjoern A. Zeeb

On Mon, 5 Oct 2009, Mike Tancsa wrote:

Hi Mike,


 Thanks for reporting the issue.  People are aware of the problem now
 and we'll try to present a solution within the next days for better
 position-independent executable (PIE) handling.

 Meanwhile there are multiple solutions for people affected:

 (1) recompile the port; but as more than just samba might be affected
  and we generally do not want to flip the pie switch everywhere that's
 probably only a temporary, private solution.

I'll stick to this since I am happy about having the map_at_zero
option and want to continue to try it out on 7.2-STABLE. And I
see no reason why samba has to be linked with -pie (without -pie
it is also 4% smaller).


Hi,
What are the impacts (if any) of compiling all the ports that are
affected by setting security.bsd.map_at_zero=0 with PIE disabled?


Compared to yesterday, there is basically no impact if you do it
privately: it will make little, if any, difference, because PIE
support in FreeBSD has so far been essentially non-existent and worked
more by luck than by design, according to my current understanding.

Actually there is a slight difference that I should mention.  With PIE,
valid user code is currently mapped at virtual address 0. That's why
it started to fail for people setting map_at_zero to 0.  So NULL
pointer dereferences in applications like samba will not lead to the
obvious error but will point at something, which in that case usually
will be garbage, and the application will either not work as intended
(well, that's already the case with a NULL dereference ;) or crash in
random code (as a later consequence of the NULL deref).

In the future though, it seems we will support PIE, and in that case
you'll get mappings at a different place, in the end ideally slightly
random so that it'll be hard to exploit the code itself, as people can
no longer easily pre-guess where things are in virtual memory.
Disabling PIE now means that this will not happen later and you'll
keep the fixed (entry point) address, unless you recompile the ports
again.


I said, no impact if you do it privately above; the problems for the
ports crew here are:
1) the entire set of ports affected by PIE is unidentified.
2) they build packages for 6/7 and 8, if not yet 9 as well soon for
   multiple architectures and cannot just rebuild everything.
3) they can especially not just rebuild the package set for the
   upcoming 8.0-RELEASE (don't ask, I don't know when it'll happen;).
4) basically they shouldn't need to care about which way the port
   (read: the package as released by its developers) chooses to be
   compiled by default.
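
On point 1, one way to get a rough idea of which installed programs were
linked as PIE is to look at the ELF header: a PIE executable carries the type
ET_DYN (3) rather than ET_EXEC (2) in the e_type field at byte offset 16. A
minimal sketch under stated assumptions follows: elf_kind is a hypothetical
helper written for this note, /usr/local/bin is assumed as the ports prefix,
and ET_DYN also covers shared objects, so the result is only meaningful for
files that are meant to be executed.

```sh
#!/bin/sh
# Sketch only: elf_kind is a hypothetical helper, not an existing tool.

# Print "exec" for ET_EXEC, "dyn" for ET_DYN (PIE or shared object) and
# "other" for any other ELF type; return non-zero if the file is not ELF.
elf_kind() {
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    # e_type is the 16-bit field at byte offset 16; accept either byte order.
    case $(od -An -tx1 -j16 -N2 "$1" 2>/dev/null | tr -d ' \n') in
        0200|0002) echo exec ;;
        0300|0003) echo dyn ;;
        *) echo other ;;
    esac
}

# List ET_DYN files among the installed programs.
# BIN_DIR is an illustrative override; /usr/local/bin is an assumption.
for f in "${BIN_DIR:-/usr/local/bin}"/*; do
    [ -f "$f" ] || continue
    if [ "$(elf_kind "$f")" = "dyn" ]; then
        echo "$f"
    fi
done
```

The script only reads headers, so it says nothing about whether a given
program actually fails with map_at_zero set to 0; it merely narrows down the
candidates worth testing.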

I do not have strong arguments as to whether PIE makes a lot of sense,
but it seems that for the long term we'll have to support it, as parts
of the other world support it as well, and once we support it even old
packages will then continue to work even if people set map_at_zero
to 0 or if the release does that by default.

/bz

--
Bjoern A. Zeeb It will not break if you know what you are doing.


Re: em0 watchdog timeouts

2009-10-05 Thread Jack Vogel
This posting just muddies the issue, first you talk about having a problem
that involves Broadcom, ok, so post about that on something other than em :)

Then you make some references to hardware that you might have bought
but didn't, I'm not about debugging 'possible worlds problems' though so
can't help you there either :)

Finally you never say what the actual hardware is, other than a person who
I do not know told you it was the best performer... so, what exactly is it?

You have a problem once every 10 days,  and at a specific time no less,
this almost always means something in your environment, a cron job run
amok, a piece of hardware that resets, I dunno, but the last thing I would
suspect given this description is the driver.
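
Whether the timeouts really cluster at a particular time of day can be
checked from the logs. A minimal sketch, assuming syslog-style lines whose
third field is HH:MM:SS and whose message contains "watchdog timeout" as
quoted earlier in the thread; the function name watchdog_by_hour is
hypothetical and the sample lines are invented purely to exercise it, they
are not data from anyone's machine.

```sh
#!/bin/sh
# Sketch only: watchdog_by_hour is a hypothetical helper.

# Read syslog lines on stdin and print "HH count" for every hour of the
# day in which at least one watchdog timeout was logged.
watchdog_by_hour() {
    awk '/watchdog timeout/ { split($3, t, ":"); count[t[1]]++ }
         END { for (h in count) print h, count[h] }' | sort
}

# Invented sample lines, for illustration only.
sample='Oct  1 11:42:10 host kernel: em0: watchdog timeout -- resetting
Oct  4 12:05:33 host kernel: em0: watchdog timeout -- resetting
Oct  4 12:06:01 host kernel: em0: link state changed to UP
Oct  7 03:17:45 host kernel: em0: watchdog timeout -- resetting
Oct  9 12:58:02 host kernel: em0: watchdog timeout -- resetting'

printf '%s\n' "$sample" | watchdog_by_hour
```

For the invented sample this prints "03 1", "11 1" and "12 2" on separate
lines. A strong peak in one hour would support the cron-job theory; an even
spread would argue against it.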

You need a good sysadmin for this debugging I would venture, not a driver
developer.

Jack




Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)

2009-10-05 Thread jhell




On Sun, 4 Oct 2009 12:07 -0700, dougb@ wrote:


Bjoern A. Zeeb wrote:

On Sat, 3 Oct 2009, Andre Albsmeier wrote:

Hi,


On Sat, 03-Oct-2009 at 16:27:32 -0400, jhell wrote:

On Sat, 3 Oct 2009 14:42 -, Andre.Albsmeier wrote:


FYI,

after setting security.bsd.map_at_zero to 0 on 7.2-STABLE all
samba33 programmes did abort() immediately after start. The
solution was to use

CONFIGURE_ARGS+= --disable-pie

-Andre



To add an additional note samba33 even when not running (not enabled
by a rcvar)
also runs a tdbcleanup routine on shutdown and/or start that also does
abort().


Yes, every samba programme is linked with -pie per default (so
all abort()).



Thanks for reporting the issue.  People are aware of the problem now
and we'll try to present a solution within the next days for better
position-independent executable (PIE) handling.

Meanwhile there are multiple solutions for people affected:

(1) recompile the port;


Just to be clear, you have to recompile the port with --disable-pie
added to the CONFIGURE_ARGS in the Makefile.

It would also be nice if there were a __FreeBSD_version bump for this
new feature.


Doug




Just to add on to this for those that may be wondering what they can do to 
solve this for just the ports infrastructure in the mean time.


You may add the following to /etc/make.conf

.if ${.CURDIR:M/usr/ports*}
CONFIGURE_ARGS+= --disable-pie
.endif

This is assuming that you have your ports installed in the standard place 
of /usr/ports. If not you may adjust the match accordingly.


This could also be extended to individual ports or substructures of your 
liking so that you are not adding those configure arguments to every port 
under the sun.


Keep in mind, this should be followed carefully and not expected to be a 
full workaround as a greater solution still lies in wait.


Best regards.

--

%{+
 | dataix.net!jhell 2048R/89D8547E 2009-09-30 |
 | BSD since FreeBSD 4.2    Linux since Slackware 2.1 |
 | 85EF E26B 07BB 3777 76BE  B12A 9057 8789 89D8 547E |
 +%}


Re: glabel+gmirror (8.0-RC1 problem)

2009-10-05 Thread Oliver Lehmann
Pawel Jakub Dawidek wrote:

 On Mon, Sep 28, 2009 at 08:37:56PM +0200, Oliver Lehmann wrote:
  Hi Pawel,
  
  Pawel Jakub Dawidek wrote:
  
   Did anything change between your upgrade from BETA3 and RC1? For example
   gmirror was compiled into the kernel before and now is loaded as module
   or something similar?
  
  Nope, it was a clean BETA3 installation with the default GENERIC kernel
  which has afaik geom_label in kernel, but not geom_mirror (nevertheless I
  loaded geom_label.ko at boottime as well as geom_mirror)
  The same with RC1 - clean and fresh installation with the default GENERIC
  kernel and geom_label in kernel (default), but still loaded as module at
  boottime as well as geom_mirror.
  
   Could you test this patch:
   
 http://people.freebsd.org/~pjd/patches/improved_taste.patch
  
  This makes gmirror+glabel work again on RC1
 
 Thanks for confirmation.

gjournal is also affected. I tried to use one partition of my gmirror
disk as the journal device for my 3ware raid-5 device, which works until I
reboot - the journal is then gone as well.
Is this patch likely to fix this as well? Will it be included in a future
RC? Until now I've stayed away from using glabel+gmirror, but I didn't know
that gjournal is affected as well, so I'm now left with a warning that the
journal provider is gone while booting - and more tragically I'm left
without journaling at all (which hurts on a 2.7TB partition when the
system was not cleanly shut down)


-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/


Still possible to panic RELENG_7 with ZFS (kmem exhaustion)?

2009-10-05 Thread Jeremy Chadwick
(Note: please keep me CC'd, as I am not subscribed to freebsd-stable)

Is it still possible with ZFS to panic a RELENG_7 amd64 box (kernel/world
from recent[1] source) with kmem map too small or similar conditions?

Why I ask:

Our production SQL/backup box kernel panic'd a couple days ago.  Sadly,
the box also acts as a serial console server, so I don't have the
exact message spit back from the kernel prior to being dumped to DDB,
but the backtrace looks very much like the historic problem of the ZFS
ARC exhausting all kernel memory, so I'm betting the message prior to
being dumped to DDB was kmem map too small:

db> bt
Tracing pid 40738 tid 100168 td 0xff001f078720
kdb_enter_why() at kdb_enter_why+0x3d
panic() at panic+0x176
kmem_malloc() at kmem_malloc+0x548
uma_large_malloc() at uma_large_malloc+0x3c
malloc() at malloc+0xc1
arc_get_data_buf() at arc_get_data_buf+0x1bb
arc_buf_alloc() at arc_buf_alloc+0xa1
arc_read_nolock() at arc_read_nolock+0xd1
arc_read() at arc_read+0x71
dbuf_prefetch() at dbuf_prefetch+0x135
dmu_zfetch_dofetch() at dmu_zfetch_dofetch+0xe3
dmu_zfetch() at dmu_zfetch+0xa58
dbuf_read() at dbuf_read+0x433
dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x119
dmu_buf_hold_array() at dmu_buf_hold_array+0x57
dmu_read_uio() at dmu_read_uio+0x3f
zfs_freebsd_read() at zfs_freebsd_read+0x55a
vn_read() at vn_read+0x1ef
dofileread() at dofileread+0x88
kern_readv() at kern_readv+0x43
read() at read+0x4d
syscall() at syscall+0x247
Xfast_syscall() at Xfast_syscall+0xab

The machine in question has absolutely no loader.conf tuning applied,
and kernel/world was built from RELENG_7 dated 2009/06/11.  The ZFS pool
consisted of a single (entire) disk; nothing special.  I do not have
sysctl counters from the box before it panic'd, but these are what are
active presently:

hw.physmem: 4286558208
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 1381478400

With regards to the above counters: ZFS is not in use.  I had to switch
back to UFS2 (zpool destroy + newfs -O2 -U...) because of stability
concerns relating to the question at hand.
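
To put the counters above in perspective, the two byte values can be
converted to MiB and compared. This is only a rough sketch of the
arithmetic; kmem_summary is a hypothetical helper, and the calculation says
nothing about how much of the kernel map the ARC may actually consume.

```sh
#!/bin/sh
# Sketch only: kmem_summary is a hypothetical helper.

# Arguments: hw.physmem and vm.kmem_size, both in bytes.
kmem_summary() {
    awk -v phys="$1" -v kmem="$2" 'BEGIN {
        mib = 1024 * 1024
        printf "physmem:   %d MiB\n", int(phys / mib)
        printf "kmem_size: %d MiB (%.1f%% of physmem)\n", int(kmem / mib), 100 * kmem / phys
    }'
}

# Values reported above; on a live system they could be read with
# "sysctl -n hw.physmem" and "sysctl -n vm.kmem_size".
kmem_summary 4286558208 1381478400
```

For the values quoted in this message the kernel map comes out at 1317 MiB,
about 32% of the 4087 MiB of physical memory.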

If someone familiar with the FreeBSD ZFS internals and/or the VM could
make a statement, I think it would be beneficial to those who are
considering using/migrating to ZFS on FreeBSD.

The only semi-official statements I've read as of late are here:

http://lists.freebsd.org/pipermail/freebsd-stable/2009-September/051810.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-September/051830.html

And what's in src/UPDATING:

20090207:
ZFS users on amd64 machines with 4GB or more of RAM should
reevaluate their need for setting vm.kmem_size_max and
vm.kmem_size manually.  In fact, after recent changes to the
kernel, the default value of vm.kmem_size is larger than the
suggested manual setting in most ZFS/FreeBSD tuning guides.

Thanks!

[1]: Recent means post-February 2009, specifically after Alan Cox's
commits listed here:

http://svn.freebsd.org/changeset/base/188291
http://svn.freebsd.org/changeset/base/187523
http://svn.freebsd.org/changeset/base/187522
http://svn.freebsd.org/changeset/base/187520
http://svn.freebsd.org/changeset/base/187485
http://svn.freebsd.org/changeset/base/187466
http://svn.freebsd.org/changeset/base/187465
http://svn.freebsd.org/changeset/base/187464
http://svn.freebsd.org/changeset/base/187458
http://svn.freebsd.org/changeset/base/187428
http://svn.freebsd.org/changeset/base/187425
http://svn.freebsd.org/changeset/base/187420
http://svn.freebsd.org/changeset/base/187419
http://svn.freebsd.org/changeset/base/187416
http://svn.freebsd.org/changeset/base/187414
http://svn.freebsd.org/changeset/base/187408
http://svn.freebsd.org/changeset/base/187407
http://svn.freebsd.org/changeset/base/187404
http://svn.freebsd.org/changeset/base/187400

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: em0 watchdog timeouts

2009-10-05 Thread Daniel Bond

Hi Jack,

I'll comment on your mail inline:


On Oct 5, 2009, at 6:57 PM, Jack Vogel wrote:

This posting just muddies the issue, first you talk about having a
problem that involves Broadcom, ok, so post about that on something
other than em :)


I only meant to indicate that the problem might exist outside the  
intel driver.
I'm also indicating that it happens with several drivers (bge, bce and  
em) on several different machines, on both pci-x and pci-e.


I'm sorry if this is confusing to you, but I still think it's relevant  
to mention.




Then you make some references to hardware that you might have bought
but didn't, I'm not about debugging 'possible worlds problems' though
so can't help you there either :)


No. I only made references to hardware I actually used, and had
real-world issues with.




Finally you never say what the actual hardware is, other than a
person who I do not know told you it was the best performer... so,
what exactly is it?


Sepherosa is a guy that writes drivers for BSD based operating  
systems. Including FreeBSD. He has a lot of knowledge in this area.

http://people.freebsd.org/~sephe/

The NIC you are referring to, the one sephe recommended to me, is an
82571EB. I didn't mention specific hardware, as I think it's more
important to note this is an issue I'm experiencing across different
sets of hardware and drivers.




You have a problem once every 10 days,  and at a specific time no less,
this almost always means something in your environment, a cron job run
amok, a piece of hardware that resets, I dunno, but the last thing I
would suspect given this description is the driver.


This is not what I wrote. I wrote I had a problem every 1-10 days, but  
it would usually happen once every 3-4 days. At worst, every day in  
periods.


It's not at any specific time. If you read my email correctly, I say
it *usually* happens around 11-13:00,
but it has happened at random times too.

This is my point exactly. I don't think it's the Intel-driver, I think  
the problem is elsewhere. I had a suspicion it had to do with the  
combination of nic + qlogic fc-controller, but I have no evidence of  
this.




You need a good sysadmin for this debugging I would venture, not a
driver developer.


What I need is useful advice/help. I never stated I needed a driver  
developer.


I'd like to be able to run my favorite OS on cool hardware, in the
future, for a high-performing NFS-server, without problems like I've
experienced the past 6 months, on a production system.
Please note that I'm managing a server-park almost completely based on  
FreeBSD, and I'm running many NFS servers on other hardware, for other  
services, without issues.


I've seen several other FreeBSD-users having problems with this too,
so I think it's of importance for the project. As I mentioned
originally, I'm happy to put the hardware at the disposal of any
FreeBSD developer that might want to look further into this. Debugging
it further is above my skill-set, I don't even know where to begin
looking, especially since I can't produce any panics.


I'm sorry to say, but your reply was 0% useful, Jack.



Jack



- Daniel




Re: em0 watchdog timeouts

2009-10-05 Thread Jack Vogel
Sorry, it's a Monday morning, I was being kinda facetious, guess it didn't
work very well :) I apologize.

I know it must be annoying for you, it's as much so for me when it's something
I can't just fix because it's not reproducible. So, I feel your pain.

Will try to restrain my Monday blues in the future.

Jack


On Mon, Oct 5, 2009 at 11:32 AM, Daniel Bond d...@danielbond.org wrote:

 Hi Jack,

 I'll comment your mail inline:


 On Oct 5, 2009, at 6:57 PM, Jack Vogel wrote:

  This posting just muddies the issue, first you talk about having a problem
 that
 involves Broadcom, ok, so post about that on something other than em :)


 I only meant to indicate that the problem might exist outside the intel
 driver.
 I'm also indicating that it happens with several drivers (bge, bce and em)
 on several different machines, on both pci-x and pci-e.

 I'm sorry if this is confusing to you, but I still think it's relevant to
 mention.


 Then you make some references to hardware that you might have bought
 but didn't, I'm not about debugging 'possible worlds problems' though so
 can't help you there either :)


 No. I only made references to hardware I actually used, and had real-world
 issues with.


 Finally you never say what the actual hardware is, other than a person who
 I do not know told you it was the best performer... so, what exactly is
 it?


 Sepherosa is a guy that writes drivers for BSD based operating systems.
 Including FreeBSD. He has a lot of knowledge in this area.
 http://people.freebsd.org/~sephe/ http://people.freebsd.org/%7Esephe/

 The NIC you are referring to, the one sephe recommended me, is a 82571EB. I
 didn't mention specific hardware, as I think it's more important
 to note this is an issue I'm experiencing across different sets of hardware
 and drivers.


 You have a problem once every 10 days,  and at a specific time no less,
 this almost always means something in your environment, a cron job run
 amok, a piece of hardware that resets, I dunno, but the last thing I would
 suspect given this description is the driver.


 This is not what I wrote. I wrote I had a problem every 1-10 days, but it
 would usually happen once every 3-4 days. At worst, every day in periods.

 It's not at any specific time. If you read my email correctly, I say it
 *usually* happens around 11:00-13:00, but it has happened at random times too.

 This is my point exactly. I don't think it's the Intel-driver, I think the
 problem is elsewhere. I had a suspicion it had to do with the combination of
 nic + qlogic fc-controller, but I have no evidence of this.


 You need a good sysadmin for this debugging I would venture, not a driver
 developer.


 What I need is useful advice/help. I never stated I needed a driver
 developer.

 I'd like to be able to run my favorite OS on cool hardware, in the future,
 for a high-performing NFS-server, without problems like I've experienced the
 past 6 months, on a production system.
 Please note that I'm managing a server-park almost completely based on
 FreeBSD, and I'm running many NFS servers on other hardware, for other
 services, without issues.

 I've seen several other FreeBSD-users having problems with this too, so I
 think it's of importance for the project. As I mentioned originally, I'm
 happy to make the hardware available to any FreeBSD developer
 that might want to look further into this. Debugging it further is above my
 skill-set, I don't even know where to begin looking, especially since I
 can't produce any panics.

 I'm sorry to say, but your reply was 0% useful, Jack.


 Jack


 - Daniel



Re: em0 watchdog timeouts

2009-10-05 Thread Greg Byshenk
On Mon, Oct 05, 2009 at 08:32:14PM +0200, Daniel Bond wrote:
 
 What I need is useful advice/help. I never stated I needed a driver  
 developer.
 
 I'd like to be able to run my favorite OS on cool hardware, in the  
 future, for a high-performing NFS-server, without problems like I've  
 experienced the past 6 months, on a production system.
 Please note that I'm managing a server-park almost completely based on  
 FreeBSD, and I'm running many NFS servers on other hardware, for other  
 services, without issues.
 
 I've seen several other FreeBSD-users having problems with this too,  
 so I think it's of importance for the project. As I mentioned  
 originally, I'm happy to make the hardware available to any FreeBSD
 developer that might want to look further into this. Debugging it further is
 above my skill-set, I don't even know where to begin looking,  
 especially since I can't produce any panics.

I can give one bit of advice that helped me in a similar situation:
check your motherboards.

I run about a dozen fileservers on FreeBSD, and have always been very
happy with their performance, but some months ago I began to experience
problems with one of them.  These problems were 'watchdog timeout'
errors.  Tried all manner of things, different NICs of different types,
changing settings, etc., but nothing helped over the long term.  At 
some point, when very heavy i/o was going on to our Beowulf cluster, the
'watchdog timeouts' would begin.  What was strange is that other 
(supposedly identical) machines handled _more_ i/o without a problem.

Finally, while doing some comparisons, I realized that the motherboard
having the problem was _not_ the same as the others; it was similar, but
not identical.  I changed the motherboard and all the problems went away,
never to reappear.

I don't know if it was a specific problem with that particular
motherboard, or something about that model, but for whatever reason, it
appears that the buses just couldn't handle a RAID card and three active
NICs.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: em0 watchdog timeouts

2009-10-05 Thread Rudy

Finally, while doing some comparisons, I realized that the motherboard
having the problem was _not_ the same as the others; it was similar, but
not identical.


This is a good piece of info.  I can try swapping out the MB and see 
what happens.


I do want to add: thank you Jack for all your help, and if it does turn out
to be the MB, then double thanks.  Viva Monday!   :)


What would be nice would be MORE info for a watchdog timeout... maybe a 
sysctl dev.watchdog.debug=1 or something where when a watchdog event 
happened --- for whatever driver --- a bunch of stats were dumped 
relating to the event.
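
To make the idea concrete, here is a rough user-space sketch of a countdown
watchdog with an optional stats dump. It is not taken from em(4) or any other
driver: the struct, the function names, the 3-tick timeout and the debug flag
(standing in for the proposed sysctl) are all made up for illustration.

```c
#include <stdio.h>

#define WD_TICKS 3              /* assumed timeout, in ticks */

/* Hypothetical per-interface state; none of these names come from em(4). */
struct wd_state {
    int timer;                  /* ticks left before firing, 0 = disarmed */
    int debug;                  /* stand-in for the proposed debug knob */
    unsigned long tx_queued;    /* packets handed to the hardware */
    unsigned long tx_completed; /* packets the hardware reported done */
    unsigned long timeouts;     /* number of watchdog events so far */
};

/* A packet was handed to the hardware: arm the watchdog if it is idle. */
void wd_tx_start(struct wd_state *s)
{
    s->tx_queued++;
    if (s->timer == 0)
        s->timer = WD_TICKS;
}

/* The hardware completed a packet: disarm if nothing is outstanding,
 * otherwise restart the countdown. */
void wd_tx_done(struct wd_state *s)
{
    s->tx_completed++;
    s->timer = (s->tx_completed == s->tx_queued) ? 0 : WD_TICKS;
}

/* Called once per tick. Returns 1 if the watchdog fired on this tick. */
int wd_tick(struct wd_state *s)
{
    if (s->timer == 0 || --s->timer > 0)
        return 0;

    s->timeouts++;
    if (s->debug)
        fprintf(stderr,
            "watchdog timeout: queued=%lu completed=%lu timeouts=%lu\n",
            s->tx_queued, s->tx_completed, s->timeouts);

    /* A real driver would reset the adapter here; the sketch only
     * resynchronizes its counters. */
    s->tx_completed = s->tx_queued;
    return 1;
}
```

With debug set, a packet that is started but never completed produces one
stats line on the third tick; with debug clear, the event is only counted.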


Rudy


Re: em0 watchdog timeouts

2009-10-05 Thread Jack Vogel
Hmmm, I did have one of the drivers print more info at watchdog time, but I
just looked and that's not em, time to add that I guess.

Since you're in the driver there isn't a huge amount of info that you can
print, it still may not be enough to help.

BTW, I've always been somewhat dissatisfied with the watchdog design and
think it's kinda flawed, I could try and make you an experimental version with
debug and some changes that you can try if you'd like.

Jack


On Mon, Oct 5, 2009 at 1:54 PM, Rudy cra...@monkeybrains.net wrote:

 Finally, while doing some comparisons, I realized that the motherboard
 having the problem was _not_ the same as the others; it was similar, but
 not identical.


 This is a good piece of info.  I can try swapping out the MB and see what
 happens.

 I do want to add: thank you Jack for all your help, and if it does turn out to
 be the MB, then double thanks.  Viva Monday!   :)

 What would be nice would be MORE info for a watchdog timeout... maybe a
 sysctl dev.watchdog.debug=1 or something where when a watchdog event
 happened --- for whatever driver --- a bunch of stats were dumped relating
 to the event.

 Rudy



Re: openssh concerns

2009-10-05 Thread Daniel Bond

Hi.

I explained my opinion quite well (imo) a bit further down in my
previous email. I'm not sure what to answer.

I don't necessarily think it's relevant for every computer running  
sshd. I see a tendency to change
sshd port to 2022 and other port numbers. I'm not sure everyone doing  
it is aware that using
unprivileged ports also has consequences, compared to (often) a few
harmless log entries.


I'd much rather use a privileged port, or mac_portacl(4), as
mentioned earlier.
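
The consequence alluded to here is presumably that, by default, only root may
bind ports below 1024, while any local user may bind a port such as 2022
whenever sshd is not holding it. A small sketch that makes this visible is
shown below. It is only an illustration: the helper name try_bind_tcp is made
up, and the result for low ports depends on who runs it and on local policy
(e.g. mac_portacl(4)).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to bind a TCP socket to the given port on all IPv4 addresses.
 * Returns 0 on success, otherwise the errno value from socket() or bind().
 * Port 0 asks the kernel to pick any free unprivileged port.
 */
int try_bind_tcp(unsigned short port)
{
    struct sockaddr_in sa;
    int err = 0;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return errno;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0)
        err = errno;

    /* The socket is only a probe; release the port again. */
    close(fd);
    return err;
}
```

Run as an ordinary user on a default configuration, try_bind_tcp(22) would be
expected to fail with a permission error, while try_bind_tcp(2022) succeeds as
long as nothing else is listening there.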



Best regards,


Daniel.

I've noticed quite a few suggestions to use 2022 and such.

On Oct 5, 2009, at 11:58 PM, Doug Barton wrote:


Daniel Bond wrote:

However, I'm concerned about the suggestion of using an
unprivileged port


Please explain your reasoning, and how it's relevant in a world where
the vast majority of Internet users have complete administrative
control over the systems they use.


Doug

--

  This .signature sanitized for your protection

___
freebsd-secur...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-security
To unsubscribe, send any mail to freebsd-security-unsubscr...@freebsd.org 







libthr and daemon()

2009-10-05 Thread Matthew Fleming
I have some code that tries to use pthread_cond_wait() and it's getting
back EPERM.  Upon further investigation, here's what I've found:

When the app starts, libthr's _libpthread_init calls init_main_thread()
to set the thread id in struct pthread's tid.

The app opens a log file then calls daemon().
daemon() calls fork()
fork() does not appear to be linked to _fork() in libthr; see below.
The app creates a thread to handle signals.
The app attempts to wait on a condition variable (pthread_cond_wait();
this gives EPERM).

Looking into libthr's cond_wait_common(), it does a THR_UMUTEX_LOCK on
the cv's c_lock using the struct pthread from _get_curthread().  Here,
curthread points to the pthread struct that got the tid from thr_self on
startup.  Because of fork() this is the same address in the daemonized
app as the original.  But curthread->tid is the tid of the original app,
not the daemonized version, hence my assumption that fork() didn't
resolve to libthr's _fork().

When cond_wait_common() calls into the kernel to actually do the
cv_wait, do_unlock_umutex/do_unlock_normal() returns EPERM since the tid
does not match.

AFAICT this has nothing to do with any code in the app itself.

The two things I don't know:

1) what utilities can I use to show me which version of fork will be
used at runtime?  ldd just shows me that the app is linked against libc
and libthr.

2) why would fork resolve to the one in libc (presumably, I'm not sure
how to prove this) instead of the one in libthr?
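
On question 1, one option is to ask the runtime linker from inside the
process. The following is a minimal sketch, not something from libthr or libc:
it assumes dlsym(3) and dladdr(3) are available (glibc needs _GNU_SOURCE for
them), and the helper name symbol_provider is made up. It reports which loaded
object the process-wide lookup of a name such as "fork" resolves to. A call
made from inside a library, such as the fork() inside daemon(), may be bound
differently, so treat the answer as a hint only.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* needed on glibc for RTLD_DEFAULT and dladdr() */
#endif
#include <dlfcn.h>
#include <stddef.h>

/*
 * Return the path of the loaded object that provides `name` according to
 * the default (process-wide) symbol search order, or NULL if the symbol
 * cannot be found or its owner cannot be determined.
 */
const char *symbol_provider(const char *name)
{
    Dl_info info;
    void *addr = dlsym(RTLD_DEFAULT, name);

    if (addr == NULL)
        return NULL;
    if (dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

Printing symbol_provider("fork") from the application's main() should name
either libc or libthr. Some systems need the program linked with -ldl.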

Thanks,
matthew


Re: libthr and daemon()

2009-10-05 Thread Daniel Eischen

On Mon, 5 Oct 2009, Matthew Fleming wrote:


I have some code that tries to use pthread_cond_wait() and it's getting
back EPERM.  Upon further investigation, here's what I've found:

When the app starts, libthr's _libpthread_init calls init_main_thread()
to set the thread id in struct pthread's tid.


Is the application threaded before calling daemon()?


The app opens a log file then calls daemon().
daemon() calls fork()
fork() does not appear to be linked to _fork() in libthr; see below.
The app creates a thread to handle signals.
The app attempts to wait on a condition variable (pthread_cond_wait();
this gives EPERM).


Was the condition variable created before daemon() was
called?

The picture is not clear to me.

POSIX states that only async-signal-safe function calls
can be made from a child fork()'d from a threaded
application.  The intent is that the child should soon
after call a function in the exec() family.  Certainly,
any more threaded calls in the child are invalid.
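
The ordering POSIX has in mind can be sketched as follows. This is an
illustration only, with made-up names (run_threads_after_fork, child_main,
worker): the process forks while it is still single-threaded, and the child
creates its mutex, condition variable and thread only after the fork. The
original report describes roughly this ordering already and still sees EPERM,
so the sketch shows the portable pattern, not a fix for that report.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock;
static pthread_cond_t ready_cv;
static int ready;

/* Worker thread: set the flag and wake the waiter. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Runs in the child only: every pthread object is created after fork(). */
static int child_main(void)
{
    pthread_t tid;
    int rc = 0;

    if (pthread_mutex_init(&lock, NULL) != 0 ||
        pthread_cond_init(&ready_cv, NULL) != 0)
        return 1;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return 1;

    pthread_mutex_lock(&lock);
    while (!ready && rc == 0)
        rc = pthread_cond_wait(&ready_cv, &lock);
    pthread_mutex_unlock(&lock);

    pthread_join(tid, NULL);
    return rc == 0 ? 0 : 1;
}

/*
 * Fork while still single-threaded and run the threaded work in the child.
 * Returns the child's exit status (0 means pthread_cond_wait() succeeded),
 * or -1 if fork() or waitpid() failed.
 */
int run_threads_after_fork(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_main());
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A real daemon would call daemon() or setsid() instead of having the parent
wait; the waitpid() here exists only so the caller can see the child's result.
Build with -pthread.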

--
DE


Re: em0 watchdog timeouts

2009-10-05 Thread Rudy (bulk)



BTW, I've always been somewhat dissatisfied with the watchdog design and
think it's kinda flawed, I could try and make you an experimental version with
debug and some changes that you can try if you'd like.
  


I'm game -- it would be nice if the machine still reset the watchdog in
3 seconds and didn't cause any more damage from the debug code (e.g. a
panic).  :)


My frequency of watchdog events is about 2 or 3 times per day.
I am running:   Intel(R) PRO/1000 Network Connection 6.9.12




Rudy
