Re: scp more perfectly fills the pipe than NFS/TCP

2009-12-21 Thread Dag-Erling Smørgrav
Zaphod Beeblebrox zbee...@gmail.com writes:
 While the link is slow, it is really directly connected with a latency
 of 10ms or so.

10 ms is pretty high.  A direct connection (same Ethernet segment)
should have a round-trip latency well below 1 ms.
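
For example, a quick check (assuming the server answers ICMP echo;
"host" below is just a placeholder for the actual machine):

  ping -c 10 host

The min/avg/max round-trip summary will show whether the 10 ms figure
holds even for small packets.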

DES
-- 
Dag-Erling Smørgrav - d...@des.no


Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-21 Thread John Baldwin
On Thursday 17 December 2009 12:27:17 pm Steven Hartland wrote:
 - Original Message - 
 From: John Baldwin j...@freebsd.org
  For the hang it seems you have a thread waiting in a blocking read(), a
  thread waiting in a blocking accept(), and lots of threads creating
  condition variables.  However, the pthread_cond_init() in libpthread
  (libthr on FreeBSD) doesn't call pthread_cleanup_push(), so your stack
  trace doesn't make sense to me.  However, that may be gdb getting
  confused.  The pthread_cleanup_push() frame may be cond_init().
  However, it doesn't call umtx_op() (the _thr_umutex_init() call it
  makes just initializes the structure, it doesn't make a _umtx_op()
  system call).  You might try posting on threads@ to try to get more
  info on this, but your pthread_cond_init() stack traces don't really
  make sense.  Can you rebuild libc and libthr with debug symbols?
  
  For example:
  
  # cd /usr/src/lib/libc
  # make clean 
  # make DEBUG_FLAGS=-g
  # make DEBUG_FLAGS=-g install
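
  and likewise for libthr:

  # cd /usr/src/lib/libthr
  # make clean
  # make DEBUG_FLAGS=-g
  # make DEBUG_FLAGS=-g install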
  
  However, if you are hanging in read(), that usually means you have a socket 
  that just doesn't have data.  That might be an application bug of some sort.
  
  The segv trace doesn't include the first part of the GDB messages, which
  show which thread actually had a seg fault.  It looks like it was the
  thread that was throwing an exception.  However, nanosleep() doesn't
  throw exceptions, so that stack trace doesn't really make sense either.
  Perhaps that stack is hosed by the exception handling code?
 
 I've uploaded two more traces for the oxt test failure / segv.
 http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1

 From looking at the test case, it's testing the capture of failures and
 its ability to create stack trace output, so that may give others some
 indication of where the issue may be.

 I will look to do the same for the hang issue, but that's on a live site,
 so I will need to schedule some downtime before I can get those rebuilt
 and then wait for it to hang again, which could be quite some time :(

Hmmm, the only seg fault I see is happening down inside libgcc in the stack
unwinding code and that is 3rd party code from gcc.

-- 
John Baldwin


Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-21 Thread Steven Hartland
- Original Message - 
From: John Baldwin 

I've uploaded two more traces for the oxt test failure / segv.
http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1

From looking at the test case, it's testing the capture of failures and
its ability to create stack trace output, so that may give others some
indication of where the issue may be.

I will look to do the same for the hang issue, but that's on a live site,
so I will need to schedule some downtime before I can get those rebuilt
and then wait for it to hang again, which could be quite some time :(


Hmmm, the only seg fault I see is happening down inside libgcc in the stack
unwinding code and that is 3rd party code from gcc.


Thanks for looking, John.  So you believe this may be an issue with the gcc code?

What would be the next step on this?  Raise it on a gcc mailing list or something?

   Regards
   Steve





Re: Passenger hangs on live and SEGV on tests possible threading / kernel bug?

2009-12-21 Thread John Baldwin
On Monday 21 December 2009 9:45:53 am Steven Hartland wrote:
 - Original Message - 
 From: John Baldwin 
  I've uploaded two more traces for the oxt test failure / segv.
  http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1

  From looking at the test case, it's testing the capture of failures
  and its ability to create stack trace output, so that may give others
  some indication of where the issue may be.

  I will look to do the same for the hang issue, but that's on a live
  site, so I will need to schedule some downtime before I can get those
  rebuilt and then wait for it to hang again, which could be quite some
  time :(

  Hmmm, the only seg fault I see is happening down inside libgcc in the
  stack unwinding code and that is 3rd party code from gcc.

 Thanks for looking, John.  So you believe this may be an issue with the
 gcc code?

 What would be the next step on this?  Raise it on a gcc mailing list or
 something?

I'm not sure. :)  That may be best.  You could also try examining the
registers and assembly to see if you can figure out more of what is going on
when it dies.
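
For example, from gdb attached to the process or examining the core
(register names assume i386; use $rip instead of $eip on amd64):

  (gdb) info registers
  (gdb) bt
  (gdb) x/16i $eip
  (gdb) disassemble

Comparing the faulting instruction against the register contents often
shows whether a pointer was garbage.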

-- 
John Baldwin


Atheros AR9281 still not supported?

2009-12-21 Thread Frank Mayhar
So I take it that the newer Atheros chipsets are still not supported in
FreeBSD 8?  I thought I saw a rumor a while ago that they would be but
now I can't track it down and of course the BugBusting page shows this
series as still unsupported.

Unfortunately I'm having a problem with my formerly trusty ar5212
interface since I upgraded (hangs, can't reset, error "ath_chan_set:
unable to reset channel 36 (5180 Mhz, flags 0x140), hal status 3"), so I
was hoping to use the built-in interface, but it appears that's out.
Please correct me if I'm wrong.  Thanks.
-- 
Frank Mayhar fr...@exit.com  http://www.exit.com/
http://www.exit.com/blog/frank/
http://www.zazzle.com/fmayhar*


Re: scp more perfectly fills the pipe than NFS/TCP

2009-12-21 Thread Matthew Dillon
Play with the read-ahead mount options for NFS, but it might require
more work with that kind of latency.  You need to be able to have
a lot of RPC's in-flight to maintain the pipeline with higher latencies.
At least 16 and possibly more.
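
Something along these lines (the server path and mount point are
placeholders; flag spelling from memory, check mount_nfs(8) on your
system):

  # mount_nfs -a 16 -r 32768 server:/export /mnt

where -a sets the read-ahead count and -r the read size per RPC.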

It might be easier to investigate why the latency is so high and fix
that first.  10ms is way too high for a LAN.

I remember there was some work in the FreeBSD tree to clean up the
client-side NFS RPC mechanics, but if they are still threaded (kernel
thread or user thread, doesn't matter) with one synchronous RPC per
thread then a large amount of read-ahead will cause the requests to be
issued out of order over the wire (for both TCP and UDP NFS mounts),
which really messes up the server-side heuristics.  Plus the
client-side threads wind up competing with each other for the
socket lock.  So there is a limit to how large a read-ahead you
can specify and still get good results.

If they are using a single kernel thread for socket reading and a
single kernel thread for socket writing (i.e. a 100% async RPC model,
which is what DFly uses), then you can boost the read-ahead to 50+.
At that point the socket buffer becomes the limiting factor in the
pipeline.

Make sure the NFS mount is TCP (it defaults to TCP in FreeBSD 8+).  UDP
mounts will not perform well with any read-ahead greater than 3 or 4
RPCs because occasional seek latencies on the server will cause
random UDP RPCs to time out and retry, which completely destroys
performance.  UDP mounts have no understanding of the RPC queue backlog
on the server and treat each RPC independently for timeout/retry
purposes.  So one minor stall can blow up every single pending RPC
backed up behind the one that stalled.
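
To force TCP explicitly when mounting (server path and mount point are
placeholders):

  # mount_nfs -T server:/export /mnt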

-Matt



Re: scp more perfectly fills the pipe than NFS/TCP

2009-12-21 Thread Zaphod Beeblebrox
I must say that I often deeply respect your position and your work,
but your recent willingness to jump into a conversation without
reading the whole of it... simply to point out some area where your
pet project is better than the subject of the list... is disappointing.
Case in point...

On Mon, Dec 21, 2009 at 3:42 PM, Matthew Dillon
dil...@apollo.backplane.com wrote:
    Play with the read-ahead mount options for NFS, but it might require
    more work with that kind of latency.  You need to be able to have
    a lot of RPC's in-flight to maintain the pipeline with higher latencies.
    At least 16 and possibly more.

I should almost label that ObContent.

    It might be easier to investigate why the latency is so high and fix
    that first.  10ms is way too high for a LAN.

Ref. my original post.  The connection is DSL, but completely
managed.  10ms is fairly good for DSL.

    Make sure the NFS mount is TCP (It defaults to TCP in FreeBSD 8+).  UDP
    mounts will not perform well with any read-ahead greater then 3 or 4
    RPCs because occassional seek latencies on the server will cause
    random UDP RPCs to timeout and retry, which completely destroys
    performance.  UDP mounts have no understanding of the RPC queue backlog
    on the server and treat each RPC independently for timeout/retry
    purposes.  So one minor stall can blow up every single pending RPC
    backed up behind the one that stalled.

Again, from the original post, not only was -T specified, but (as you
say) it is the default for FreeBSD 8.

For a 4 megabit pipe, very few transactions need to be in flight to
fill it.  Does NFS over TCP use techniques like selective ACK?  Is it
the same stack as the one that scp is using?


Re: scp more perfectly fills the pipe than NFS/TCP

2009-12-21 Thread Matthew Dillon
I'm just covering all the bases.  To be frank, half the time when
someone posts they are doing something a certain way it turns out that
they actually aren't.  I've learned that covering the bases tends to
lead to solutions more quickly than assuming a perfect rendition.

For example, is that 10ms latency with a ping?  What about a
ping -s 4000?  If you are talking about 16KB RPC transactions over
TCP then the real question is what is the latency for 16KB of data
coming back along the wire?

In your case we can calculate the read-ahead needed to keep the pipe
full.  500 KBytes/sec divided by 16KB is 31 transactions per second,
or an effective latency of 32ms, plus probably 5-10ms for the RPC to be
sent... so probably more around 40ms.  Not 10ms.  And if you are using
32KB transactions the latency is going to be more around 70ms.

500KB/sec x 40ms is about 20KB, so theoretically a read-ahead of
2 packets should do the trick.
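
(A back-of-envelope check of those numbers with bc(1):

  echo "scale=1; 500/16" | bc    # ~31.2 16KB RPCs/sec to move 500KB/sec
  echo "scale=1; 1000/31" | bc   # ~32ms of effective latency per RPC
  echo "500*0.040" | bc          # ~20KB in flight at a 40ms turnaround
)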

There's a catch, however.  Depending on the client-side implementation
the read-ahead requests may be transmitted out of order.  That is,
if the cp or dd program wants to read blocks 0, 1, 2, 3, 4, the
actual RPC's sent over the wire might be sent like this:  0, 2, 1, 4, 3,
or even 0, 4, 1, 2, 3.  Someone who knows what work was done on the
FreeBSD NFS stack can tell you whether that is still the case.  If
the nfsiod's (whether kernel threads or not) are separate synchronous
RPCs then the read-ahead can transmit the RPC requests out of order.
The server may also respond to them out of order... (typically there
being 4 server-side threads handling RPCs).  The combination is deadly.

If the read-aheads transmit out of order what happens is that
cp/dd/whatever on the client winds up stalling waiting for the
next linear block to come back, which might be BEHIND a later
read-ahead block coming back down the wire.  That is, with the stall
the RPC latency winds up being multiplied by N.  A 40ms turn can
turn into an 80 or 120ms turn before the cp/dd/whatever unstalls.

To deal with this you want to set the read-ahead higher... probably at
least 3 or 4 RPCs.

As I said, there are other issues as the amount of read-ahead
increases.  The only way to really figure out what is going on is
to tcpdump the link and determine why the pipeline is not being
maintained.  Look for out of order requests, out of order responses,
and stalls (actual lost packets).

Actual lost packets are not likely in your case, assuming you are
using something like fair-share scheduling and not RED (RED should
only be used by routers in the middle of a large network, it should
never be used at the end-points).

-Matt



Re: scp more perfectly fills the pipe than NFS/TCP

2009-12-21 Thread Matthew Dillon
Oh, one more thing... I'm assuming you haven't used tcpdump with
NFS much.  tcpdump has issues parsing the NFS RPC's out of a TCP
stream.  For the purposes of testing you may want to temporarily
use a UDP NFS mount.  tcpdump can parse the NFS RPCs out of the UDP
stream far more easily.  If you use a UDP mount use the dumbtimer
option and set it to something big, like 10 seconds, so you don't
get caught up in NFS/UDP's retry code (which will confuse your
parsing of the output).
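
For example (flag spelling from memory, see mount_nfs(8): -d is the
dumbtimer option and -t takes tenths of a second, so 100 means a 10
second initial timeout; the paths are placeholders):

  # mount_nfs -d -t 100 server:/export /mnt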

A typical tcpdump line would be something like this:

tcpdump -n -i nfe0 -l -s 4096 -vvv port 2049

Where the port is whatever port the NFS RPC's are running over
while you are running the test.  You'd want to display it on a
max-width xterm, or record a bunch of it to a file and then review
it via less.
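
For example, to capture to a file and review it afterwards:

  tcpdump -n -i nfe0 -s 4096 -w /tmp/nfs.pcap port 2049
  tcpdump -n -r /tmp/nfs.pcap -vvv | less

(Adjust the interface and port to whatever the mount is actually using;
/tmp/nfs.pcap is just a scratch file name.)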

The purpose of running the tcpdump is to validate all your assumptions
as well as determine whether basic features such as read-ahead are
actually running.  You can also determine if packet loss is occurring,
if requests are being sent or responded to out of order (the RPC
output tcpdump parses includes the request id's and the file offsets,
so it should be easy to figure that out).  You can also determine the
actual latency by looking at the timestamps for the request vs
the reply.

Once you've figured out as much as you can from that you can try
tcpdumping the TCP stream.  In this case you may not be able to
pick out RPCs but you should be able to determine whether the
requests are being pipelined and whether any packet loss is occurring
or not.  You can also determine whether the TCP link is working
properly... i.e. that the TCP packets are properly flagging the
'P'ushes and not delaying the responses, and that the link isn't
blowing out its socket buffer or TCP window (those are two separate
things).  The kernel should be scaling things properly but you never
know.

-Matt



Suggestion: rename killall to fkill, but wait five years to phase the new name in

2009-12-21 Thread Jason A. Spiro
Dear Craig, thanks for maintaining the killall command on Linux.
Dear hackers, thanks for maintaining it on FreeBSD.

Naming it the same as System V killall, which just kills all
processes, can wreak havoc.  When someone types a standard Linux
killall command line as root on a Solaris or HP-UX server, System V
killall runs and kills all processes.
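
To make the hazard concrete, the same keystrokes do two very different
things (illustrative only; do not try the second form on a live System V
box):

  killall httpd    (Linux/FreeBSD: kill the processes named httpd)
  killall          (System V: kill every process you are allowed to)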

It might be good if you'd rename it to something else.  Not "akill"
("All Kill"):  it looks like IRIX probably ships with something called
akill already, so this would be confusing.  Maybe "fkill" ("Friendly
Kill").

You could do this in phases:  for the first five years,
/usr/bin/killall could print a warning onscreen, then function as
usual.  After five years, it could cease to function unless you call
it as fkill.
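
A rough sketch of what the phase-one wrapper could look like (entirely
hypothetical; the wording and the fkill name are placeholders):

  #!/bin/sh
  # transitional /usr/bin/killall: warn, then behave as before
  echo "killall: warning: this command is being renamed to fkill" >&2
  exec fkill "$@"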

Craig, and hackers, are you both willing to do this?

-Jason

--
Jason Spiro: software/web developer, packager, trainer, IT consultant.
I support Linux, UNIX, Windows, and more. Contact me to discuss your needs.
+1 (416) 992-3445 / www.jspiro.com


Re: Suggestion: rename killall to fkill, but wait five years to phase the new name in

2009-12-21 Thread Daniel O'Connor
On Tue, 22 Dec 2009, Jason A. Spiro wrote:
 Naming it the same as System V killall, which just kills all
 processes, can wreak havoc.  When someone types a standard Linux
 killall command line as root on a Solaris or HP-UX server, System V
 killall runs and kills all processes.

 It might be good if you'd rename it to something else.  Not akill
 (All Kill):  it looks like IRIX probably ships with something called
 akill already, so this would be confusing.  Maybe fkill (Friendly
 Kill).

<snark>
Why not get Sun and HP to change killall to match Linux & *BSD
behaviour?
</snark>

Although seriously, why not? killall just killing everything is a fairly 
dangerous command with almost no use in the real world.
-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C




Re: Suggestion: rename killall to fkill, but wait five years to phase the new name in

2009-12-21 Thread Xin LI
On Mon, Dec 21, 2009 at 10:31 PM, Jason A. Spiro jasonspi...@gmail.com wrote:
 Craig, and hackers, are you both willing to do this?

No.

killall is not part of any standard, and just because System V chose to
implement it that way does not mean that FreeBSD has to.  Moreover,
users can always alias /usr/bin/killall to 'fkill' and 'kill -15 -1' to
'killall' if they really want the System V behavior.
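
For example, in sh syntax:

  alias fkill='/usr/bin/killall'
  alias killall='kill -15 -1'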

Cheers,
-- 
Xin LI delp...@delphij.net http://www.delphij.net