Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Andi Kleen
Krishna Kumar [EMAIL PROTECTED] writes: Doing some measurements, I found that for small packets like 128 bytes, the bandwidth is approximately 60% of the line speed. To possibly speed up performance of small packet xmits, a method of linking skbs was thought of - where two pointers

Re: [RFC] New driver API to speed up small packets xmits

2007-05-13 Thread Andi Kleen
On Friday 11 May 2007 13:16:44 Roland Dreier wrote: I wasn't talking about sending. But there actually is :- TSO/GSO. As I said before, getting multiple packets in one call to xmit would be nice for amortizing per-xmit overhead in IPoIB. So it would be nice if the cases where the

Re: select(0, ..) is valid ?

2007-05-18 Thread Andi Kleen
On Wednesday 16 May 2007 17:37, Anton Blanchard wrote: Hi Hugh, It's interesting that compat_core_sys_select() shows this kmalloc(0) failure but core_sys_select() does not. That's because core_sys_select() avoids kmalloc by using a buffer on the stack for small allocations (and 0 sure

Re: [RFC] netdevice ops

2007-05-19 Thread Andi Kleen
Stephen Hemminger [EMAIL PROTECTED] writes: I would think a non-conditional deref would be easily pipelined. If the net_device struct was more cache dense, it probably would even out. It might be a good idea to consider strategic prefetch points for it. e.g. TCP executes quite a lot of code

Re: [PATCH 2/4] forcedeth: fix power management support

2007-05-29 Thread Andi Kleen
Ayaz Abdulla [EMAIL PROTECTED] writes: This patch fixes the power management functions. It includes lowering the phy speed to conserve power. Shouldn't there be some way to disable this? AFAIK a few old switches have trouble with this. I assume a new ethtool option would be appropriate since we

Re: [PATCH 3/4] Make net watchdog timers 1 sec jiffy aligned

2007-05-31 Thread Andi Kleen
Venki Pallipadi [EMAIL PROTECTED] writes: If this does not work: Another option is to use 'deferrable timer' here which will be called at same as before time when CPU is busy and on idle CPU it will be delayed until CPU comes out of idle due to any other events. That would sound like a good

Re: [RFC] Failover-friendly TCP retransmission

2007-06-04 Thread Andi Kleen
[EMAIL PROTECTED] writes: Please note first that I want to address physical failures by the failover-capable network devices, which are increasingly becoming important as Xen-based VM systems are getting popular. Reducing a single-point-of-failure (physical device) is vital on such VM

Re: [RFC] Failover-friendly TCP retransmission

2007-06-05 Thread Andi Kleen
Your suggestion, to utilize NET_XMIT_* code returned from an underlying layer, is done in tcp_transmit_skb. But my problem is that tcp_transmit_skb is not called during a certain period of time. So I'm suggesting to cap RTO value so that tcp_transmit_skb gets called more frequently. The

Re: a maze of twisty stats, most different

2007-06-29 Thread Andi Kleen
David Stevens [EMAIL PROTECTED] writes: I think there's a more general problem that's a huge hassle. There are lots of new SNMP MIB's, but no conventions (that I'm aware of, at least) to allow for changes to the /proc entries that get them to applications. Actually /proc/net/snmp and

Re: a maze of twisty stats, most different

2007-06-29 Thread Andi Kleen
That works ok for some things, like new global counters, but some items really fit best in existing files and the concern there is about other uses of them beyond the standard tools. Examples: -addition of route age in /proc/net/route and /proc/net/ipv6_route Routing information

Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Andi Kleen
Brice Goglin [EMAIL PROTECTED] writes: I am trying to understand whether I can setup a skb destructor in my code (which is basically a protocol above dev_queue_xmit() and co). From what I see in many parts in the current kernel code, the protocol (I mean, the one who actually creates the skb)

Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Andi Kleen
On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote: I wonder if it's very unsound to think about a one way list of destructors. Of course, not owners could only clean their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private infos?

Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Andi Kleen
The destructor method is set and used for skbs originating from the RDMA driver sitting above cxgb3. If these skbs never reach the normal sockets based stack it might be ok. -Andi - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED]

Re: linux networking manpages

2005-07-26 Thread Andi Kleen
On Tue, Jul 26, 2005 at 04:24:02AM -0400, Harald Welte wrote: At the moment I'm mostly thinking about correcting those issues that I know off my head. Don't know whether I would find the time to systematically go through all of them.. Ok. Maybe someone else at netdev is interested. Also

Re: [PATCH] add new nfnetlink_log subsystem

2005-08-03 Thread Andi Kleen
no, it hasn't. I am travelling and don't have the space for a debian/i386 installation in addition to the debian/x86_64 on this box, sorry :( That sounds risky. I would ask for this stuff not being merged before it isn't tested. However, all nfnetlink-based protocols are supposed to be both

Re: [PATCH] add new nfnetlink_log subsystem

2005-08-04 Thread Andi Kleen
On Thu, Aug 04, 2005 at 06:49:46AM -0700, David S. Miller wrote: From: Harald Welte [EMAIL PROTECTED] Date: Thu, 4 Aug 2005 00:03:53 +0200 However, all nfnetlink-based protocols are supposed to be both endian and 32/64 as well as alignment (*) safe. the protocol definitions always use

Re: [PATCH] add new nfnetlink_log subsystem

2005-08-04 Thread Andi Kleen
Unless I'm overlooking something, to the best of my knowledge I don't think we could still run into any trouble here. Still risky. Here's a different idea. Define a new aligned u64 type and use that on i386 too. Like /* Must be #define because __attribute__ doesn't work on typedefs */
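The preview is cut off before the definition itself. As a rough sketch of the idea only (not the definition from the mail), assuming the GCC/Clang `aligned` attribute: the names `nl_aligned_u64` and `struct nl_msg_example` below are hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; the definition proposed in the mail is not shown
 * in the archive preview. A #define is used here, as the mail suggests. */
#define nl_aligned_u64 uint64_t __attribute__((aligned(8)))

/* Illustrative message layout: without the attribute, i386 would place
 * `counter` at offset 4 and x86_64 at offset 8, so a 32-bit userland and
 * a 64-bit kernel would disagree about the structure. */
struct nl_msg_example {
    uint32_t id;
    nl_aligned_u64 counter;   /* forced onto an 8-byte boundary */
};

/* Helpers so the layout can be inspected at run time. */
static size_t nl_counter_offset(void)
{
    return offsetof(struct nl_msg_example, counter);
}

static size_t nl_msg_size(void)
{
    return sizeof(struct nl_msg_example);
}
```

With the attribute applied, the field offset and the structure size should come out the same on both ABIs, which is the property a shared kernel/userspace message format needs.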

Re: lockups with netconsole on e1000 on media insertion

2005-08-05 Thread Andi Kleen
This is fixing the symptom and is not the cure. Unfortunately I don't have a e1000 card so I can't try a fix. But I did have a e100 card that would lock up the same way. The problem was that netpoll_poll calls the cards netpoll routine (in e1000_main.c e1000_netpoll). In the e100 case,

Re: lockups with netconsole on e1000 on media insertion

2005-08-05 Thread Andi Kleen
On Fri, Aug 05, 2005 at 10:10:13AM -0400, Steven Rostedt wrote: On Fri, 2005-08-05 at 15:55 +0200, Andi Kleen wrote: This is fixing the symptom and is not the cure. Unfortunately I don't have a e1000 card so I can't try a fix. But I did have a e100 card that would lock up the same way

Re: [PATCH] netpoll can lock up on low memory.

2005-08-05 Thread Andi Kleen
On Fri, Aug 05, 2005 at 01:01:57PM -0700, Matt Mackall wrote: The netpoll philosophy is to assume that its traffic is an absolute priority - it is better to potentially hang trying to deliver a panic message than to give up and crash silently. That would be ok if netpoll was only used to

Re: lockups with netconsole on e1000 on media insertion

2005-08-05 Thread Andi Kleen
I still don't like this fix. Yes, you're right, it should eventually give up. But here it gives up way too easily - 5 could easily translate to 5 microseconds. This is analogous to giving up on serial transmit if CTS is down for 5 loops. I'd be much happier if there were some udelay or the

Re: [PATCH] netpoll can lock up on low memory.

2005-08-05 Thread Andi Kleen
If that was the policy it would be a quite dumb one and make netpoll totally unsuitable for production use. I hope it is not. Suggest you rip __GFP_NOFAIL out of JBD before complaining about this. So you're suggesting we should become as bad at handling networking errors as we are at

Re: lockups with netconsole on e1000 on media insertion

2005-08-05 Thread Andi Kleen
But why are we in a hurry to dump the backlog on the floor? Why are we worrying about the performance of netpoll without the cable plugged in at all? We shouldn't be optimizing the data loss case. Because a system shouldn't stall for minutes (or forever like right now) at boot just because

Re: [PATCH] netpoll can lock up on low memory.

2005-08-06 Thread Andi Kleen
On Sat, Aug 06, 2005 at 09:45:03AM +0200, Ingo Molnar wrote: * Andi Kleen [EMAIL PROTECTED] wrote: On Fri, Aug 05, 2005 at 01:01:57PM -0700, Matt Mackall wrote: The netpoll philosophy is to assume that its traffic is an absolute priority - it is better to potentially hang trying

Re: [PATCH] add new iptables ipt_connbytes match

2005-08-12 Thread Andi Kleen
I don't think that we're ever going to fix that bug in the old {get,set}sockopt interface, but rather introduce a netlink interface when pkt_tables matures. All new interfaces should be emulation clean, so that if the old interface is replaced later it should eventually work. The best way to

Re: udp source port randomization?

2005-08-15 Thread Andi Kleen
It does help 16 bits :-) Better than nothing. 16bits is so poor that any secure algorithms using it would just give a false sense of security. -Andi

Re: IP_RECVTTL

2005-08-16 Thread Andi Kleen
On Tue, Aug 16, 2005 at 12:50:51PM +0300, Hasso Tepper wrote: According to man 7 ip: IP_RECVTTL When this flag is set pass a IP_RECVTTL control message with the time to live field of the received packet as a byte. Not supported for SOCK_STREAM sockets. This is not

Re: Bridge MTU

2005-08-17 Thread Andi Kleen
Some L3 switches do it by violating the layers and faking an ICMP fragmentation unreachable from the destination for DF=1 and otherwise fragmenting. But it's a big hack and probably nothing that should be put into Linux. -Andi

Re: [PATCH] make use of ->private_data in sockfd_lookup

2005-08-17 Thread Andi Kleen
No no, the private_data is actually far beyond, even for a L1_CACHE_LINE of 128 bytes Yuck. (because of the insane struct file_ra_state f_ra. I wish this structure were dynamically allocated only for files that really use it) How about you submit a patch for that instead? -Andi

Re: [PATCH] struct file cleanup : the very large file_ra_state is now allocated only on demand.

2005-08-17 Thread Andi Kleen
On Thu, Aug 18, 2005 at 02:40:46AM +0200, Eric Dumazet wrote: Andi Kleen a écrit : (because of the insane struct file_ra_state f_ra. I wish this structure were dynamically allocated only for files that really use it) How about you submit a patch for that instead? -Andi OK

Re: [PATCH] struct file cleanup : the very large file_ra_state is now allocated only on demand.

2005-08-17 Thread Andi Kleen
You don't want to always have bad performance though, so you could attempt to allocate if either the pointer is null _or_ it points to the global structure? Remember it's after a GFP_KERNEL OOM. If that fails most likely you have deadlocked somewhere else already because Linux's handling of

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-19 Thread Andi Kleen
Now I can take you even less seriously. In RFC2581, they are talking about unloading a burst of data into a connection where there has been significant idle time since the most recent data send. To be fair Linux would be using TSO in this case too and therefore cause bursts. But it also would

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-19 Thread Andi Kleen
I'm personally not a big fan of TSO or TOE. They both add a lot of complexity to the network stack, and have other downsides. The *best* way to solve these problems is to engineer technologies to use larger packet sizes. Even at 9k (or better yet 16k) the advantages of these offload

Re: [E1000-devel] Page Allocation Failure with e1000 using jumbo frame

2005-08-19 Thread Andi Kleen
I guess we need to approach the memory manager guys and ask them why the current kernels are having so much trouble getting contiguous memory. Because memory fragments. The only long term reliable way is to not allocate buffers > PAGE_SIZE. The stack supports paged skbs for that. -Andi

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-19 Thread Andi Kleen
Right. The other issue with jumbos frames (9000MTU) is that the allocation needed is just over 2 pages for 4K page size machines (common case). 3 page contig allocations tend to fail once a server is heavily loaded and memory gets fragmented. That's just a driver bug. The driver should be

Re: [PATCH] TCP Offload (TOE) - Chelsio

2005-08-19 Thread Andi Kleen
On the spec website, the current results have it off. That was because the old implementation violated the congestion window. With David's new superTSO the next generation of benchmarks will likely have it on again. -Andi

Re: [E1000-devel] Page Allocation Failure with e1000 using jumboframe

2005-08-19 Thread Andi Kleen
Ahh, okay. I'm pretty sure that SuSE did some changes (not sure what) to memory management. I don't think so. the formula for the size that the current e1000 looks for is something like a = MTU roundup to next power of 2 a += 2 (skb_reserve(NET_IP_ALIGN)) a += 16 (skb_reserve 16 by

Re: [patch 2/7] [IPV4]: Consistency and whitespace cleanup of ip_rcv()

2005-08-19 Thread Andi Kleen
On Sat, Aug 20, 2005 at 03:14:15AM +0200, Thomas Graf wrote: + len = ntohs(iph->tot_len); + if (skb->len < len || len < (iph->ihl*4)) + goto inhdr_error; If you rewrite it to something like u32 minlen = skb->len; if (minlen iph->ihl*4) minlen =
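For readers unfamiliar with the check under discussion, here is a userspace sketch of the same kind of validation, assuming the usual reading of it: reject a packet whose IPv4 total-length field exceeds the bytes actually received, or is smaller than the header length. The function name `ipv4_len_ok` is hypothetical, and this is neither the kernel code nor the rewrite proposed in the mail.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 when the IPv4 length fields are mutually consistent,
 * -1 otherwise. `pkt` points at the first byte of the IP header and
 * `caplen` is the number of bytes actually received. */
static int ipv4_len_ok(const uint8_t *pkt, size_t caplen)
{
    if (caplen < 20)                      /* minimum IPv4 header */
        return -1;

    unsigned ihl = (pkt[0] & 0x0fu) * 4u; /* header length in bytes */
    unsigned tot = ((unsigned)pkt[2] << 8) | pkt[3]; /* total length, host order */

    if (ihl < 20)                         /* header shorter than the minimum */
        return -1;
    if (caplen < tot || tot < ihl)        /* truncated packet or bogus total */
        return -1;
    return 0;
}
```

A packet passes only if header length <= total length <= received length, so later code may safely trim the buffer down to the total length.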

Re: [patch 4/7] [IPV4]: Move ip options parsing out of ip_rcv_finish()

2005-08-19 Thread Andi Kleen
How about uninlining this? Options are rare and options parsing is rather expensive anyway. You would need explicit noinline because even without inline gcc with unit-at-a-time would happily inline it. -Andi

Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-22 Thread Andi Kleen
Another approach would be: 1) Determine that we don't care about the callback (ie. it gets reset to NULL) when the skb->dev changes, as would occur for forwarding, and certain kinds of firewalling and classification actions. 2) As a result of #1 we can put the callback into the

Re: Receive Traffic Distribution (Was RE: [PATCH] TCP Offload (TOE) - Chelsio_

2005-08-22 Thread Andi Kleen
On Sun, 21 Aug 2005 16:19:26 -0700 (PDT) David S. Miller [EMAIL PROTECTED] wrote: From: Andi Kleen [EMAIL PROTECTED] Date: Mon, 22 Aug 2005 01:13:21 +0200 Basically, you'll have skb->free_callback(skb, ARG), and skb->free_callback_ARG. And when the SKB and its memory is about to get

Re: LRO Patent vs. patent free TOE

2005-08-22 Thread Andi Kleen
On Mon, 22 Aug 2005 08:57:45 -0700 (PDT) David S. Miller [EMAIL PROTECTED] wrote: From: Leonid Grossman [EMAIL PROTECTED] Date: Mon, 22 Aug 2005 11:50:54 -0400 Christoph - sorry, but I don't see a reason to continue this debate. Good luck fighting TOE patents - you are going to need it.

Re: LRO Patent vs. patent free TOE

2005-08-22 Thread Andi Kleen
On Mon, 22 Aug 2005 19:19:06 -0700 (PDT) David S. Miller [EMAIL PROTECTED] wrote: From: Andi Kleen [EMAIL PROTECTED] Date: Tue, 23 Aug 2005 01:44:34 +0200 To be fair the situation as seen from the Linux kernel software perspective is very similar for TOE and for LSO - both are patented

Re: LRO Patent vs. patent free TOE

2005-08-23 Thread Andi Kleen
On Tuesday 23 August 2005 17:21, David S. Miller wrote: From: Leonid Grossman [EMAIL PROTECTED] Date: Tue, 23 Aug 2005 02:25:07 -0400 On a more serious note, I'm all in for stateless offloads but I think that dropping stack support for adapters that don't implement TSO, etc (either in

Re: LRO Patent vs. patent free TOE

2005-08-23 Thread Andi Kleen
On Tuesday 23 August 2005 18:01, David S. Miller wrote: From: Andi Kleen [EMAIL PROTECTED] Date: Tue, 23 Aug 2005 17:53:58 +0200 However the drawback is that you would likely need to submit the packets as two pieces (payload and header) which would need more accesses to TX rings and could

Re: LRO Patent vs. patent free TOE

2005-08-23 Thread Andi Kleen
There are actually some non-trivial issues wrt. this. We would need to loop inside of the packet scheduler, and netfilter, to do correct traffic classification and firewalling. It could be introduced slowly, with some compat code that just falls back to packet at a time mode (like it has

Re: [Bug 5610] New: IP MTU Path Discovery now working properly

2005-11-15 Thread Andi Kleen
On Wednesday 16 November 2005 01:45, David S. Miller wrote: Alternatively, we're ignoring the PMTU message for one reason or another. Perhaps the quoted TCP packet in the ICMP pmtu message has an incorrect sequence number or is truncated for some reason. There are counters for all of this

Re: Linux UDP Implementation

2006-09-02 Thread Andi Kleen
It seems that the implementation (at code level) does not match with the actual behaviour. I would like to seek expertise on clarifying my understanding in UDP implementation so that this phenomenon can be explained. How about you just add some printks or use a tool like systemtap to

Re: Paper on lookup in Linux.

2006-09-04 Thread Andi Kleen
On Monday 04 September 2006 13:43, Robert Olsson wrote: Hello. People on this list might find this paper interesting: http://www.csc.kth.se/~snilsson/public/papers/trash/ Looks nice. Have you looked at using it for local TCP/UDP socket lookups too or would that be part of the unified flow

Re: Paper on lookup in Linux.

2006-09-04 Thread Andi Kleen
On Monday 04 September 2006 14:53, Robert Olsson wrote: No we haven't put struct socket in the leafs (result node) yet we just kept dst entries and some stateful flow variables that we used for active GC and flow logging so far. So 128 bit flow version of the dst hash. It would have

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-18 Thread Andi Kleen
Vladimir B. Savkin [EMAIL PROTECTED] writes: [you seem to send your emails in a strange way that doesn't keep me in cc. Please stop doing that.] On Mon, Sep 18, 2006 at 11:58:21AM +0200, Andi Kleen wrote: The x86-64 timer subsystems currently doesn't have clocksources at all

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-18 Thread Andi Kleen
People who run tcpdump want wire timestamps as close as possible. Yes, things get delayed with the IRQ path, DMA delays, IRQ mitigation and whatnot, but it's an order of magnitude worse if you delay to user read() since that introduces also the delay of the packet copies to userspace which

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-18 Thread Andi Kleen
On Monday 18 September 2006 17:19, Alan Cox wrote: Ar Llu, 2006-09-18 am 16:29 +0200, ysgrifennodd Andi Kleen: The only delay this would add would be the queueing time from the NIC to the softirq. Do you really think that is that bad? If you are trying to do things like network record

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-18 Thread Andi Kleen
On Monday 18 September 2006 17:38, Alexey Kuznetsov wrote: Hello! For netdev: I'm more and more thinking we should just avoid the problem completely and switch to true end2end timestamps. This means don't time stamp when a packet is received, but only when it is delivered to a socket.

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-18 Thread Andi Kleen
On Monday 18 September 2006 18:28, Alexey Kuznetsov wrote: Hello! Hmm, not sure how that could happen. Also is it a real problem even if it could? As I said, the problem is _occasionally_ theoretical. This would happen f.e. if packet socket handler was installed after IP handler.

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-18 Thread Andi Kleen
On Monday 18 September 2006 23:03, Alexey Kuznetsov wrote: And do you have some other prefered way to solve this? Even if the timer was fast it would be still good to avoid it in the fast path when DHCPD is running. No. The way, which you suggested, seems to be the best. Ok. I also

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-19 Thread Andi Kleen
On Monday 18 September 2006 23:22, David Miller wrote: Ok, ok, but don't we have queueing disciplines that need the timestamp even on ingress? I grepped and I can't find any. The only non SIOCGSTAMP users of the time stamp seem to be sunrpc and conntrack and I bet both can be converted over

Re: [PATCH] tcp: set congestion default through Kconfig

2006-09-19 Thread Andi Kleen
On Tuesday 19 September 2006 06:41, Stephen Hemminger wrote: Bert's attempt was noble It showed your desire for the truth A simple path exists I guess most people won't have a clue on what to configure here (especially with such sparse help text, but even with more it would be hard). And

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-19 Thread Andi Kleen
It seems only natural to me that the real problem is the slow clock source which needs to be resolved regardless of the outcome of this discussion. I believe that updating the stamp at socket enqueue time is the right thing to do but it shouldn't be considered as a solution to the

Re: 2.6.18-rc7-mm1

2006-09-21 Thread Andi Kleen
On Wednesday 20 September 2006 16:23, Mike Galbraith wrote: On Tue, 2006-09-19 at 13:36 -0700, Andrew Morton wrote: On Tue, 19 Sep 2006 22:25:21 +0200 Rafael J. Wysocki [EMAIL PROTECTED] wrote: - It took maybe ten hours solid work to get this dogpile vaguely compiling and limping

Re: Network performance degradation from 2.6.11.12 to 2.6.16.20

2006-09-22 Thread Andi Kleen
On Friday 22 September 2006 17:35, Alexey Kuznetsov wrote: Hello! I can't even find a reference to SIOCGSTAMP in the dhcp-2.0pl5 or dhcp3-3.0.3 sources shipped in Ubuntu. But I will note that tpacket_rcv() expects to always get valid timestamps in the SKB, it does a: It is equally

Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver

2006-09-25 Thread Andi Kleen
David Miller [EMAIL PROTECTED] writes: First, the only mentioned real use of EtherIP I've seen anywhere is to tunnel old LAN based games that used protocols other than IP :-) How would you convince those old LAN games to use a MTU < 1500 which is needed for the tunnel? I bet they have the

Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver

2006-09-25 Thread Andi Kleen
On Monday 25 September 2006 13:57, Joerg Roedel wrote: On Mon, Sep 25, 2006 at 12:22:41PM +0200, Andi Kleen wrote: How would you convince those old LAN games to use a MTU < 1500 which is needed for the tunnel? I bet they have the size hardcoded. The tunnel provides an MTU of 1500

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-04 Thread Andi Kleen
On Wednesday 04 October 2006 17:45, Andrew Morton wrote: On Wed, 04 Oct 2006 08:42:28 -0500 Steve Fox [EMAIL PROTECTED] wrote: On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote: On Thu, 28 Sep 2006 17:50:31 + (UTC) Steve Fox [EMAIL PROTECTED] wrote: On Thu, 28 Sep 2006

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-04 Thread Andi Kleen
I think most likely it would crash on 2.6.18. Keith Mannthey had reported a different crash on 2.6.18-rc4-mm2 when this patch was introduced first time. Following is the link to the thread. Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4? -Andi

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 19:57, Steve Fox wrote: On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote: Please don't snip the Code: line. It is fairly important. Sorry about that. The remote console I was using appears to overwrite some text after I force the reboot. Here's a clean one

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 20:51, Steve Fox wrote: On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote: I guess we need to track when it gets corrupted. Can you send the full boot log with this patch applied? Here she blows! Can you please try it again with this patch to narrow it down

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 20:52, Vivek Goyal wrote: On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote: On Thursday 05 October 2006 19:57, Steve Fox wrote: On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote: Please don't snip the Code: line. It is fairly important

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 22:42, Steve Fox wrote: On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote: Can you please try it again with this patch to narrow it down further? Unfortunately this is as far as it got before it hung. Boot with earlyprintk=serial,ttyS0,57600 (or change

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
hmm, rather than bugging you with patches now, I'll see what I can find with the x86_64 machines I have access to and see can I reproduce it. I started the bisect, should finish soon. -Andi

Re: 2.6.18-mm2 boot failure on x86-64 II

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 22:51, Andi Kleen wrote: hmm, rather than bugging you with patches now, I'll see what I can find with the x86_64 machines I have access to and see can I reproduce it. I started the bisect, should finish soon. It ended at diff-tree

Re: 2.6.18-mm2 boot failure on x86-64 II

2006-10-05 Thread Andi Kleen
As of yet I haven't been able to recreate the hang. I am running similar HW to Steve. That was on a 4 core Opteron with Tyan board (S2881) and AMD-8111 chipset. -Andi

2.6.19rc2 XFRM does too large direct mapping allocations for hashes

2006-10-18 Thread Andi Kleen
I got this while restarting ipsec on a 2.6.19rc2 system that was up for a few days. Order 8 is really a bit big to get from the direct mapping after boot. Should the hash allocation fall back to vmalloc? -Andi Initializing XFRM netlink socket events/0: page allocation failure. order:8,

Re: [PATCH] Bound TSO defer time (resend)

2006-10-18 Thread Andi Kleen
On Tuesday 17 October 2006 06:18, John Heffner wrote: Stephen Hemminger wrote: On Mon, 16 Oct 2006 20:53:20 -0400 (EDT) John Heffner [EMAIL PROTECTED] wrote: This patch limits the amount of time you will defer sending a TSO segment to less than two clock ticks, or the time between two

Re: [PATCH 0/59] Cleanup sysctl

2007-01-16 Thread Andi Kleen
On Wednesday 17 January 2007 03:33, Eric W. Biederman wrote: There has not been much maintenance on sysctl in years, and as a result there is a lot to do to allow future interesting work to happen, and being ambitious I'm trying to do it all at once :) The patches in this series fall into

Re: [PATCH] select: fix sys_select to not leak ERESTARTNOHAND to userspace

2007-01-22 Thread Andi Kleen
On Tuesday 23 January 2007 00:00, Neil Horman wrote: As it is currently written, sys_select checks its return code to convert ERESTARTNOHAND to EINTR. However, the check is within an if (tvp) clause, and so if select is called from userspace with a NULL timeval, then it is possible for the

Re: [linux-pm] [RFC] Runtime power management on ipw2100

2007-01-31 Thread Andi Kleen
Matthew Garrett [EMAIL PROTECTED] writes: PCI seems to require a delay of 10ms when sequencing from D3 to D0, which probably isn't acceptable latency for an up state. It might be if the interface has been idle for some time (and the delay is not busy looping of course) How idle should be

Re: [linux-pm] [RFC] Runtime power management on ipw2100

2007-01-31 Thread Andi Kleen
On Wednesday 31 January 2007 11:27, Matthew Garrett wrote: On Wed, Jan 31, 2007 at 12:13:04PM +0100, Andi Kleen wrote: Matthew Garrett [EMAIL PROTECTED] writes: PCI seems to require a delay of 10ms when sequencing from D3 to D0, which probably isn't acceptable latency for an up state

Re: [PATCH] HTB O(1) class lookup

2007-02-01 Thread Andi Kleen
Simon Lodal [EMAIL PROTECTED] writes: Memory is generally not an issue, but CPU is, and you can not beat the CPU efficiency of plain array lookup (always faster, and constant time). Actually that's not true when the array doesn't fit in cache. The cost of going out to memory over caches is

Re: meaningful spinlock contention when bound to non-intr CPU?

2007-02-02 Thread Andi Kleen
Rick Jones [EMAIL PROTECTED] writes: Still, does this look like something worth pursuing? In a past life/OS when one was able to eliminate one percentage point of spinlock contention, two percentage points of improvement ensued. The stack is really designed to go fast with per CPU local RX

Re: meaningful spinlock contention when bound to non-intr CPU?

2007-02-02 Thread Andi Kleen
The meta question behind all that would seem to be whether the scheduler should be telling us where to perform the network processing, or should the network processing be telling the scheduler what to do? (eg all my old blathering about IPS vs TOPS in HP-UX...) That's an unsolved

Re: meaningful spinlock contention when bound to non-intr CPU?

2007-02-02 Thread Andi Kleen
Perhaps a poor choice of words on my part - something along the lines of: hold_lock(); wake_up_someone(); release_lock(); where the someone being awoken can try to grab the lock before the path doing the waking manages to release it. Yes the wakeup happens deep inside the critical

Re: [PATCH] HTB O(1) class lookup

2007-02-05 Thread Andi Kleen
On Monday 05 February 2007 11:16, Jarek Poplawski wrote: Strange - it seems you gave only arguments against this analysis... For a naturally clustered key space (as is common in this case) the two level structure is likely more cache efficient than a generic hash function. That is because
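To make the "two level structure" concrete, here is a minimal sketch of a two-level array lookup for 16-bit class ids. It is an illustration only: the names (`class_table`, `class_insert`, `class_lookup`) and the 8/8 bit split are assumptions, not taken from the HTB patch under discussion.

```c
#include <stdint.h>
#include <stdlib.h>

#define L2_BITS 8
#define L2_SIZE (1u << L2_BITS)

/* First level: one pointer per high byte of the id. Second-level
 * leaves are allocated lazily, so a clustered key space touches only
 * a few leaves (and therefore only a few cache lines). */
struct class_table {
    void **leaf[L2_SIZE];
};

/* Store `cls` under `id`; returns 0 on success, -1 if out of memory. */
static int class_insert(struct class_table *t, uint16_t id, void *cls)
{
    unsigned hi = id >> L2_BITS;
    unsigned lo = id & (L2_SIZE - 1);

    if (!t->leaf[hi]) {
        t->leaf[hi] = calloc(L2_SIZE, sizeof(void *));
        if (!t->leaf[hi])
            return -1;
    }
    t->leaf[hi][lo] = cls;
    return 0;
}

/* Constant-time lookup: two dependent loads, no hashing. */
static void *class_lookup(const struct class_table *t, uint16_t id)
{
    void **leaf = t->leaf[id >> L2_BITS];

    return leaf ? leaf[id & (L2_SIZE - 1)] : NULL;
}
```

Ids that share a high byte land in the same leaf, which is where the cache-locality argument for clustered key spaces comes from; a generic hash would scatter the same ids across the whole table.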

Re: [PATCH 2.6.16-rc5] S2io: Receive packet classification and steering mechanisms

2006-04-18 Thread Andi Kleen
On Wednesday 19 April 2006 02:38, Ravinandan Arakali wrote: configuration: A mask(specified using loadable parameter rth_fn_and_mask) can be used to select a subset of TCP/UDP tuple for hash calculation. eg. To mask source port for TCP/IPv4 configuration, # insmod s2io.ko rx_steering_type=2

Re: I/OAT: Call for discussion

2006-04-19 Thread Andi Kleen
On Wednesday 19 April 2006 18:39, Grover, Andrew wrote: We have posted all the performance data we have gathered so far on the linux-net wiki: http://linux-net.osdl.org/index.php/I/OAT , and listed the overall concerns that have been expressed in private. I'm hoping you will look at the data,

Re: [PATCH 2.6.16-rc5] S2io: Receive packet classification and steering mechanisms

2006-04-19 Thread Andi Kleen
On Thursday 20 April 2006 00:45, Ravinandan Arakali wrote: Andi, We would like to explain that this patch is tier-1 of a two tiered approach. It implements all the steering functionality at driver-only level, and it is fairly Neterion-specific. That's fine for experiments, but probably not

Re: Congestion Avoidance Monitoring Tools

2006-04-21 Thread Andi Kleen
On Friday 21 April 2006 07:59, Tom Young wrote: On Thu, 2006-04-20 at 22:26 -0700, Piet Delaney wrote: I'm upgrading our 2.6.12 kernel to 2.6.13, which includes significant congestion avoidance code additions and changes. I was wondering if there are any tools folks can recommend for

Re: Fw: Bug: PPP dropouts in =2.6.16

2006-04-21 Thread Andi Kleen
On Friday 21 April 2006 19:15, Jesse Brandeburg wrote: On 4/21/06, Andrew Morton [EMAIL PROTECTED] wrote: We do seem to have had a few reports of ppp regressions around this timeframe. me too. I couldn't use 2.6.16 at home on my pppoe connected router because it was so slow. I didn't

Re: determine outgoing interface (eth0,eth1) for a packet according to the dest IP

2006-04-25 Thread Andi Kleen
On Tuesday 25 April 2006 09:31, John Que wrote: Hello, What is the right way to determine on which interface card (eth0 or eth1) will a packet be sent (according to the dest IP)? You can send a rtnetlink RTM_GETROUTE message to ask the kernel. Result is the interface index in RTA_OIF, which
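The RTM_GETROUTE lookup suggested above can be sketched in user space roughly as follows. This is a minimal illustration, not code from the thread: the function names (build_getroute_request, parse_oif, lookup_oif) are hypothetical, only IPv4 is handled, and the reply is assumed to arrive as a single RTM_NEWROUTE message. The numeric constants are the values from <linux/netlink.h> and <linux/rtnetlink.h>.

```python
import socket
import struct

# Values from <linux/netlink.h> and <linux/rtnetlink.h>.
NETLINK_ROUTE = 0
NLM_F_REQUEST = 0x01
RTM_NEWROUTE = 24
RTM_GETROUTE = 26
RTA_DST = 1
RTA_OIF = 4

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
RTMSG = struct.Struct("=BBBBBBBBI")   # family, dst_len, src_len, tos,
                                      # table, protocol, scope, type, flags
RTATTR = struct.Struct("=HH")         # rta_len, rta_type


def build_getroute_request(dst_ip, seq=1):
    """Pack an RTM_GETROUTE request for one IPv4 destination (hypothetical helper)."""
    dst = socket.inet_aton(dst_ip)
    attr = RTATTR.pack(RTATTR.size + len(dst), RTA_DST) + dst
    rtm = RTMSG.pack(socket.AF_INET, 32, 0, 0, 0, 0, 0, 0, 0)
    total = NLMSGHDR.size + len(rtm) + len(attr)
    return NLMSGHDR.pack(total, RTM_GETROUTE, NLM_F_REQUEST, seq, 0) + rtm + attr


def parse_oif(reply):
    """Return the interface index carried in RTA_OIF, or None if absent."""
    if len(reply) < NLMSGHDR.size:
        return None
    msg_len, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(reply, 0)
    if msg_type != RTM_NEWROUTE:      # e.g. NLMSG_ERROR when there is no route
        return None
    end = min(msg_len, len(reply))
    off = NLMSGHDR.size + RTMSG.size
    while off + RTATTR.size <= end:
        rta_len, rta_type = RTATTR.unpack_from(reply, off)
        if rta_len < RTATTR.size or off + rta_len > end:
            break                     # malformed attribute, stop parsing
        if rta_type == RTA_OIF and rta_len >= RTATTR.size + 4:
            return struct.unpack_from("=I", reply, off + RTATTR.size)[0]
        off += (rta_len + 3) & ~3     # attributes are padded to 4 bytes
    return None


def lookup_oif(dst_ip):
    """Ask the kernel which interface a packet to dst_ip would leave on (Linux only)."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_ROUTE) as s:
        s.bind((0, 0))
        s.send(build_getroute_request(dst_ip))
        index = parse_oif(s.recv(65535))
    return socket.if_indextoname(index) if index is not None else None
```

On a Linux host, lookup_oif("192.0.2.1") would be expected to return a name such as eth0 or eth1, depending on the local routing table. The packing and parsing helpers are kept separate from the socket call so they can be checked without a live netlink socket.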

Re: determine outgoing interface (eth0,eth1) for a packet according to the dest IP

2006-04-25 Thread Andi Kleen
On Tuesday 25 April 2006 16:44, John Que wrote: Thanks a lot! I tried sending an RTM_GETROUTE message using a NETLINK_ROUTE socket in a user-space program and it went OK. It gave a correct routing struct which I could parse. In fact it gave the routing

Re: Disabling TCP Treason uncloaked

2006-05-02 Thread Andi Kleen
On Tuesday 02 May 2006 18:19, Just Marc wrote: I thought that maybe it's time to either set TCP_DEBUG to 0 or alternatively allow an admin to toggle the printing of this message off/on? On a few busy web servers running usually latest versions of 2.6 I have this message displaying

Re: Van Jacobson's net channels and real-time

2006-05-02 Thread Andi Kleen
On Tuesday 02 May 2006 14:41, Vojtech Pavlik wrote: You seem to be missing the fact that most of today's interrupts are delivered through the APIC bus, which isn't fast at all. You mean slow, right? Modern x86s (anything newer than a P3) generally don't have a separate APIC bus anymore but

Re: forcedeth driver NAPI?

2006-05-02 Thread Andi Kleen
On Tuesday 02 May 2006 21:11, Stephen Hemminger wrote: Has anyone looked into making this driver use NAPI? This would fix the IRQ overload problem and a number of other DoS risks. Also, if done right the device lock could be reduced to just a transmit lock. Making it support netpoll would be

Re: pci_enable_msix throws up error

2006-05-05 Thread Andi Kleen
On Friday 05 May 2006 07:14, Ayaz Abdulla wrote: I noticed the same behaviour, i.e. can not use both MSI and MSIX without rebooting. I had sent a message to the maintainer of the MSI/MSIX source a few months ago and got a response that they were working on fixing it. Not sure what the

Re: [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb()

2006-05-05 Thread Andi Kleen
On Friday 05 May 2006 10:49, Eric Dumazet wrote: On a dual Opteron box, I noticed high oprofile numbers in net/core/dst.c, function dst_destroy(struct dst_entry * dst). It appears the smp_rmb() done at the beginning of dst_destroy() is the killer (this is an lfence machine instruction,

Re: [Xen-devel] [RFC PATCH 34/35] Add the Xen virtual network device driver.

2006-05-09 Thread Andi Kleen
On Tuesday 09 May 2006 15:01, Herbert Xu wrote: Christian Limpach [EMAIL PROTECTED] wrote: There are at least two reasons why having it in the driver is preferable: - synchronizing sending the fake ARP request with when the device is operational -- you really want to make this well

Re: [RFC PATCH 34/35] Add the Xen virtual network device driver.

2006-05-10 Thread Andi Kleen
On Tuesday 09 May 2006 22:46, Roland Dreier wrote: Keir> Where should we get our entropy from in a VM environment? Keir> Leaving the pool empty can cause processes to hang. You could have something like a virtual HW RNG driver (with a frontend and backend), which steals from the dom0

Re: [PATCH 2/6] myri10ge - Add missing PCI IDs

2006-05-10 Thread Andi Kleen
On Wednesday 10 May 2006 23:35, Brice Goglin wrote: [PATCH 2/6] myri10ge - Add missing PCI IDs Add nVidia nForce CK804 PCI-E bridge and ServerWorks HT2000 PCI-E bridge IDs. They will be used by the myri10ge driver. That's a bad sign. It means you have code in your driver that should be

Re: [RFC PATCH 34/35] Add the Xen virtual network device driver.

2006-05-11 Thread Andi Kleen
On Thursday 11 May 2006 09:49, Keir Fraser wrote: On 11 May 2006, at 01:33, Herbert Xu wrote: But if sampling virtual events for randomness is really unsafe (is it really?) then native guests in Xen would also get bad random numbers and this would need to be somehow addressed. Good

Re: [RFC PATCH 34/35] Add the Xen virtual network device driver.

2006-05-11 Thread Andi Kleen
On Thursday 11 May 2006 18:48, Rick Jones wrote: From the peanut gallery... Can remote TCP ISN's be considered a source of entropy these days? How about checksums? Indirectly - we measure how long it takes to compute them. -Andi
