Re: Silent corruption with r8169

2007-04-05 Thread Andi Kleen
> I'll try to get to testing this, but I'm wondering if people may have > misunderstood my original post. I don't get any corruption over > Ethernet; it's just corruption on the filesystem during certain load > patterns that involve the Realtek ethernet card. When disabling hardware checksums help

[PATCH] Uninline tcp_done

2007-04-05 Thread Andi Kleen
The function is quite big and has several call sites and nothing to collapse by compiler optimization on inlining. Besides, it's nicer to read in a .c file. Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Index: linux-2.6.21-rc3-net/includ

Re: TCP connection stops after high load.

2007-04-12 Thread Andi Kleen
Ben Greear <[EMAIL PROTECTED]> writes: > > I don't mind adding printks...and I've started reading through the code, > but there is a lot of it, and indiscriminate printks will likely just > hide the problem because it will slow down performance so much. You could add /proc/net/snmp counters for i

Re: PROBLEM: tg3 spitting out uninitialized memory

2007-04-12 Thread Andi Kleen
Jamie webb <[EMAIL PROTECTED]> writes: > Hi there > > I have a Dell PE860 with built-in BCM5721, which is reported as > working fine with the tg3 driver, however I have been getting sporadic > data corruption, mostly evident as SSH MAC errors. FWIW I also saw this (data corruption with tg3) occa

Re: intermittant petabyte usage reported with broadcom nic

2007-04-12 Thread Andi Kleen
Roland Dreier <[EMAIL PROTECTED]> writes: > [Adding Michael Chan, who seems to look after bnx2, to the cc list] > > > To clarify it's an Intel Dual Core Xeon (I just wound up as thinking of > > them all as amd64s). Network card driver in use is the one defined by > > CONFIG_BNX2. Kernel's mono

Re: [patch 3/4] net: Percpufy frequently used variables -- proto.sockets_allocated

2006-01-28 Thread Andi Kleen
[adding linux-arch] On Sunday 29 January 2006 01:55, Andrew Morton wrote: > Benjamin LaHaise <[EMAIL PROTECTED]> wrote: > > On Sat, Jan 28, 2006 at 01:28:20AM +0100, Eric Dumazet wrote: > > > We might use atomic_long_t only (and no spinlocks) > > > Something like this ? > > > > Erk, complex and sl

Re: [PATCH]Enhancements of ip_options_fragment()

2006-01-29 Thread Andi Kleen
[Please put new lines every 80 characters in your mails] On Monday 30 January 2006 22:44, Wei Yongjun wrote: > [1]Summary of the problem: > Kernel does not delete the space of the options which not allowed in > fragments. > > [2]Full description of the problem: > ip_options_fragment() just fill

Re: [RFC] TCP MTU probing

2006-01-30 Thread Andi Kleen
On Tuesday 31 January 2006 01:09, John Heffner wrote: > David S. Miller wrote: > > From: John Heffner <[EMAIL PROTECTED]> > > Date: Tue, 06 Dec 2005 14:42:53 -0500 > > > >>I'd like to get a few people at least to look this over, and maybe give > >>it a try. One remaining item to consider is how be

Re: [RFC] Poor Network Performance with e1000 on 2.6.14.3

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 17:52, Ben Greear wrote: > I haven't been able to get a TCP connection to saturate a 1Gbps link > in both directions simultaneously. I *have* been able to fully saturate > 2 pro/1000 NICs on the same machine using pktgen, so the NIC/driver can > support it if only TC

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 14:48, Leonid Grossman wrote: > David S. Miller wrote: > > > And with Van Jacobson net channels, none of this is going to > > matter and 512 is going to be your limit whether you like it > > or not. So this short term complexity gain is doubly not justified. > >

Re: [RFC] Poor Network Performance with e1000 on 2.6.14.3

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 19:44, Stephen Hemminger wrote: > > Also, you have to increase TCP max window size to saturate a 1Gbps link. > You need to increase tcp_rmem[2] on receiver and tcp_wmem[2] on sender. They did all that. Also it's all the same on 2.4 -Andi - To unsubscribe from this
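The window-size advice quoted above follows from the bandwidth-delay product: one TCP connection can only keep a link full if its window covers bandwidth times round-trip time, which is why the maximum values tcp_rmem[2] and tcp_wmem[2] come up. A minimal sketch of that arithmetic; the helper name bdp_bytes and the RTT values below are illustrative assumptions, not taken from the thread:

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): number of bytes that must
 * be in flight to keep a link of bits_per_sec busy at a round-trip
 * time of rtt_us microseconds, i.e. the bandwidth-delay product. */
static inline uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_us)
{
	uint64_t bytes_per_sec = bits_per_sec / 8;

	return bytes_per_sec * rtt_us / 1000000;
}
```

For a 1 Gbps link this gives 125000 bytes at a 1 ms RTT and 12500000 bytes at 100 ms, so the buffer maximum that is needed depends heavily on the path being tested.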

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 20:37, Jeff Garzik wrote: > To have a fully async, zero copy network receive, POSIX read(2) is > inadequate. Agreed, but POSIX aio is adequate. > One needs a ring buffer, similar in API to the mmap'd > packet socket, where you can queue a whole bunch of reads.

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 22:11, David S. Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Wed, 1 Feb 2006 19:28:46 +0100 > > > http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf > > I did a writeup in my blog about all of this, another good >

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 21:26, Jeff Garzik wrote: > Andi Kleen wrote: > > But I don't think Van's design is supposed to be exposed to user space. > > It is supposed to be exposed to userspace AFAICS. Then it's likely insecure and root only, unless he knows som

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 02:53, Greg Banks wrote: > On Thu, 2006-02-02 at 08:11, David S. Miller wrote: > > Van is not against NAPI, in fact he's taking NAPI to the next level. > > Softirq handling is overhead, and as this work shows, it is totally > > unnecessary overhead. > > I got the impres

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 04:19, Greg Banks wrote: > On Thu, 2006-02-02 at 14:13, David S. Miller wrote: > > From: Greg Banks <[EMAIL PROTECTED]> > > Date: Thu, 02 Feb 2006 14:06:06 +1100 > > > > > On Thu, 2006-02-02 at 13:46, David S. Miller wrote: > > > > I know SAMBA is using sendfile() (when

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 00:37, Mitchell Blank Jr wrote: > Jeff Garzik wrote: > > Once packets are classified to be delivered to a specific local host socket, > > what further operations require privs? What received packet data > > cannot be exposed to userspace? > > You just need to make sure

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 07:49, David S. Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Thu, 2 Feb 2006 07:45:26 +0100 > > > Don't think it was ever implemented though. In the end we just > > eat the slowdown in that particular load. > >

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 00:08, Jeff Garzik wrote: > Definitely not. POSIX AIO is far more complex than the operation > requires, Ah, I sense a strong NIH field. > and is particularly bad for implementations that find it wise > to queue a bunch of to-be-filled buffers. Why? lio_listio se

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 00:50, David S. Miller wrote: > > Why not concentrate your thinking on how it can be made to > _work_ instead of punching holes in the idea? Isn't that more > productive? What I think would be very practical to do would be to try to replace the socket rx que

Re: [Bug 5990] New: call to socket(AF_INET, SOCK_RAW, IPPROTO_IP);

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 01:15, Stephen Hemminger wrote: > Is this supposed to work? Or is it just something linux > doesn't implement? He wants to use a packet socket with ntohs(ETH_P_IP) instead. I vaguely remember some crazy SUS committee proposed to use 0 as a pseudo-portable way to impl

Re: [Bug 5990] New: call to socket(AF_INET, SOCK_RAW, IPPROTO_IP);

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 01:34, David S. Miller wrote: > There really is no sane way to help these guys out who have used > the wildcard protocol number value for a real protocol. I think they want to use it as a wildcard. -Andi

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 08:31, Greg Banks wrote: > The tg3 driver uses small hardcoded values for the RXCOL_TICKS > and RXMAX_FRAMES registers, and allows "ethtool -C" to change > them. SGI's solution is to ship a script that uses ethtool > at boot to tune rx-usecs, rx-frames, rx-usecs-ir

Re: Van Jacobson net channels and NIC channels

2006-02-02 Thread Andi Kleen
On Thursday 02 February 2006 17:27, Leonid Grossman wrote: > By now we have submitted UFO, MSI-X and LRO patches. The one item on > the TODO list that we did not submit a full driver patch for is the > "support for distributing receive processing across multiple CPUs (using > NIC hw queues)", mai

Re: Van Jacobson net channels

2006-02-03 Thread Andi Kleen
On Friday 03 February 2006 02:07, Greg Banks wrote: > > (Don't ask for code - it's not really in a usable state) > > Sure. I'm looking forward to it. I had actually shelved the idea because of TSO. But if you can get me some data from your NFS servers that shows TSO is not enough for them that

Re: [PATCH] af_unix: use shift instead of integer division

2006-02-07 Thread Andi Kleen
On Tuesday 07 February 2006 15:54, Benjamin LaHaise wrote: > + if (size > ((sk->sk_sndbuf >> 1) - 64)) > + size = (sk->sk_sndbuf >> 1) - 64; This is really surprising. Are you double plus sure gcc doesn't do this automatically? -Andi
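On the question of whether the compiler does this automatically: for an unsigned operand, or a signed one known to be non-negative, x / 2 and x >> 1 give the same result. For a signed int that might be negative, C division truncates toward zero, so the compiler has to emit a small fix-up around the shift, which may account for the difference the patch is after. A minimal sketch; the function names are hypothetical, and sndbuf stands in for sk->sk_sndbuf, assumed here to be a signed int:

```c
/* Division form: signed division in C99 truncates toward zero. */
static inline int limit_div(int sndbuf)
{
	return sndbuf / 2 - 64;
}

/* Shift form, as in the quoted patch. Gives the same value as
 * limit_div() for sndbuf >= 0; for negative values the result of >>
 * is implementation-defined in C, so the two forms may differ. */
static inline int limit_shift(int sndbuf)
{
	return (sndbuf >> 1) - 64;
}
```

Both forms return 32703 for sndbuf = 65535. If the field were unsigned, the compiler would be free to turn the division into a plain shift on its own.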

Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Wednesday 08 February 2006 01:44, David S. Miller wrote: > From: Ben Greear <[EMAIL PROTECTED]> > Date: Tue, 07 Feb 2006 16:39:52 -0800 > > > Rick Jones wrote: > > > In the realm of straw ideas, how often are netdevs added and removed, > > > and would leaving a tombstone behind consume too muc

Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Wednesday 08 February 2006 11:34, Eric Dumazet wrote: > 1) Instead of storing a 2-tuple {pointer,generation} (and using 12 or 16 bytes > on 64-bit platforms), we could just use a 32 bit quantity > [(ifindex<<8)+(gen_number)] That would add a 2^24 netdevice limit. Someone will sooner or late
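The 2^24 limit follows directly from the packing: with 8 bits spent on the generation number, only 24 bits of a 32-bit value remain for ifindex. A minimal sketch of the proposed encoding; all names here (devref_pack and friends) are hypothetical and are not kernel functions:

```c
#include <stdint.h>

#define DEVREF_GEN_BITS    8
#define DEVREF_GEN_MASK    ((1u << DEVREF_GEN_BITS) - 1)
/* Largest ifindex that still fits next to the generation number. */
#define DEVREF_MAX_IFINDEX ((1u << (32 - DEVREF_GEN_BITS)) - 1)

/* Pack as (ifindex << 8) + gen_number, as in the quoted proposal.
 * An ifindex above DEVREF_MAX_IFINDEX silently loses its high bits. */
static inline uint32_t devref_pack(uint32_t ifindex, uint8_t gen)
{
	return (ifindex << DEVREF_GEN_BITS) + gen;
}

static inline uint32_t devref_ifindex(uint32_t ref)
{
	return ref >> DEVREF_GEN_BITS;
}

static inline uint8_t devref_gen(uint32_t ref)
{
	return (uint8_t)(ref & DEVREF_GEN_MASK);
}
```

The generation counter also wraps after 256 reuses of the same ifindex, so a stale reference could in principle match again; how much that matters depends on how quickly indexes are recycled.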

Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Wednesday 08 February 2006 20:12, Stephen Hemminger wrote: > On Tue, 07 Feb 2006 16:26:01 -0800 (PST) > "David S. Miller" <[EMAIL PROTECTED]> wrote: > > > From: Stephen Hemminger <[EMAIL PROTECTED]> > > Date: Tue, 7 Feb 2006 16:19:42 -0800 > > > > > Also, isn't a lot of the problem reduced if

Re: [PATCH] NET : SMP optimization of netdevice refcount

2006-02-08 Thread Andi Kleen
On Thursday 09 February 2006 00:07, David Stevens wrote: > From: Stephen Hemminger <[EMAIL PROTECTED]> > > > IMHO converting skb->dev to skb->devindex and using ifindex sounds best. > > It gets rid of the need to refcount as much but keeps the safety from > > buggy protocols. Ipv6 could probably

Re: [BUG] recent commit breaks multi-descriptor receives with ip fragments

2006-02-09 Thread Andi Kleen
On Thursday 09 February 2006 02:33, Jesse Brandeburg wrote: > My experience with alloc_page/put_page in the "packet split" e1000 code is > that it is slower to call alloc_page/put_page (they show up in top > 10 oprofile) Did you resolve it down to specific lines in alloc_page/put_page? Perhaps

Re: [PATCH] NET : No need to update last_rx in loopback driver

2006-02-09 Thread Andi Kleen
On Thursday 09 February 2006 12:05, David S. Miller wrote: > From: Eric Dumazet <[EMAIL PROTECTED]> > Date: Thu, 09 Feb 2006 10:51:00 +0100 > > > I'm sorry but last_rx is not available to User Land. > > > > You need a kernel debugger to access it, or mess with /proc/kcore > > There are tons of p

Re: [PATCH] NET : No need to update last_rx in loopback driver

2006-02-09 Thread Andi Kleen
On Thursday 09 February 2006 14:53, John W. Linville wrote: > On Thu, Feb 09, 2006 at 12:33:11PM +0100, Andi Kleen wrote: > > > Same would probably apply to schedulers and classifiers. So I cannot > > imagine a good use of this field. What does bonding use it for anywa

Re: [SKY2] a outdated patch (seems to fix my problems)

2006-02-16 Thread Andi Kleen
e problems with broken MMCONFIG Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/i386/pci/Makefile |2 +- arch/i386/pci/direct.c | 15 +-- arch/i386/pci/init.c | 25 + arch/i386/pci/mmconfig.c | 11 +++ arch/i386/pci/pcbios.

Re: [PATCH] avoid atomic op on page free

2006-03-07 Thread Andi Kleen
On Tuesday 07 March 2006 02:52, Benjamin LaHaise wrote: > Those 1-2 cycles are free if you look at how things get scheduled with the > execution of the surrounding code. I bet $20 that you can't find a modern > CPU where the cost is measurable (meaning something like a P4, Athlon). > If this l

Re: [EXPERIMENTAL] HT aware loopback device (hack, x86-64 only atm)

2006-03-07 Thread Andi Kleen
On Tuesday 07 March 2006 22:19, Benjamin LaHaise wrote: > At this point I'd just like to stir up some discussion, so please comment > away with any ideas and concerns. I think the first step for better spreading of work would be better MSI-X support. In particular we need ways for drivers to eas

Re: [EXPERIMENTAL] HT aware loopback device (hack, x86-64 only atm)

2006-03-07 Thread Andi Kleen
On Tuesday 07 March 2006 23:51, David S. Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Tue, 7 Mar 2006 23:47:11 +0100 > > > In particular I liked the concept of using arrays of pointers as > > queues instead of double linked lists to be more cache friendl

Re: [EXPERIMENTAL] HT aware loopback device (hack, x86-64 only atm)

2006-03-07 Thread Andi Kleen
On Wednesday 08 March 2006 00:13, David S. Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Tue, 7 Mar 2006 16:39:58 +0100 > > > Also the locking requirements would need to be defined. The originals > > didn't have any locking at all. > > You d

Re: [PATCH] x86-64, use page->virtual to get 64 byte struct page

2006-03-07 Thread Andi Kleen
On Wednesday 08 March 2006 00:26, Benjamin LaHaise wrote: > Hi Andi, > > On x86-64 one inefficiency that shows up on profiles is the handling of > struct page conversion to/from idx and addresses. This is mostly due to > the fact that struct page is currently 56 bytes on x86-64, so gcc has to

Re: [PATCH] x86-64, use page->virtual to get 64 byte struct page

2006-03-07 Thread Andi Kleen
On Wednesday 08 March 2006 02:29, Benjamin LaHaise wrote: > On Tue, Mar 07, 2006 at 05:27:37PM +0100, Andi Kleen wrote: > > On Wednesday 08 March 2006 00:26, Benjamin LaHaise wrote: > > > Hi Andi, > > > > > > On x86-64 one inefficiency that shows up on profi

Re: [PATCH] x86-64, use page->virtual to get 64 byte struct page

2006-03-08 Thread Andi Kleen
On Wednesday 08 March 2006 03:38, Benjamin LaHaise wrote: > It's hardly that uncommon for pages to cross cachelines or for pages to > move around CPUs with networking. Data? > Please name some sort of benchmarks that show your concerns for decreased > performance. Anything that manipulates l

Re: [PATCH] x86-64, use page->virtual to get 64 byte struct page

2006-03-08 Thread Andi Kleen
On Wednesday 08 March 2006 10:09, Eric Dumazet wrote: > I suggest the following change (seems better than playing vmlinux.lds > games) > > include/asm-x86_64/mmzone.h > struct memnode { > int shift; > u8 map[NODEMAPSIZE]; > } cacheline_aligned; > extern struct memnode memnode; > #def

Re: [PATCH] x86-64, use page->virtual to get 64 byte struct page

2006-03-08 Thread Andi Kleen
On Wednesday 08 March 2006 17:07, Benjamin LaHaise wrote: > You haven't come up with any data to support your position, You're proposing to waste 0.1% of the memory of the system with a quite dubious optimization. Unless you're showing real gains on some macro benchmark I won't accept that chan

Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh

2006-03-08 Thread Andi Kleen
Andrew Morton <[EMAIL PROTECTED]> writes: > > x86_64 is signed 32-bit! I'll change it. You want signed 64bit? -Andi

Re: [patch 1/4] net: percpufy frequently used vars -- add percpu_counter_mod_bh

2006-03-09 Thread Andi Kleen
On Thursday 09 March 2006 09:06, Ravikiran G Thirumalai wrote: > On Wed, Mar 08, 2006 at 04:32:58PM -0800, Andrew Morton wrote: > > Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote: > > > > > > On Wed, Mar 08, 2006 at 03:43:21PM -0800, Andrew Morton wrote: > > > > Benjamin LaHaise <[EMAIL PROTECTED

Re: [PATCH] x86-64, use page->virtual to get 64 byte struct page

2006-03-09 Thread Andi Kleen
On Wednesday 08 March 2006 11:45, Eric Dumazet wrote: > Andi Kleen wrote: > > > > > Can you send tested patches with proper descriptions and signed off lines > > please? > > > > -Andi > > > > > > You are welcome Andi :) Applied. Pl

Re: Question about TCP behavior

2006-03-22 Thread Andi Kleen
On Tuesday 21 March 2006 18:33, Patrick Klos wrote: > > If the Linux machine has just recently been booted, the transfer takes around > 8 or 9 milliseconds. If the Linux machine has been up for a while (but still > primarily idle), the transfer starts to take anywhere from 32 to 70 milli- > seco

Re: [Comment] sizeof("struct tcp_sock") is above 1024 on x86 since linux-2.6.15

2006-03-22 Thread Andi Kleen
On Tuesday 21 March 2006 15:17, Eric Dumazet wrote: > This is a new point of failure for x86 machines that use a lot of tcp sockets, > I > learnt it the bad way and had to revert to 2.6.14 some servers that cannot > run > stock 2.6.15/2.6.16 for long because of this problem. x86-64/ppc64/other

Re: agitating for larger default max tcp buffer sizes

2006-03-23 Thread Andi Kleen
On Thursday 23 March 2006 09:31, David S. Miller wrote: > The key point is to keep the per-socket limits far enough away from > the global pool limits such that it is not easy for a single entity > to maliciously put the allocator into conservative mode and penalize > the legitimate users. It's p

Re: Output packet processing (was stretch ACKs, etc.)

2006-03-25 Thread Andi Kleen
On Saturday 25 March 2006 23:32, Mark Butler wrote: > A true firewall should never need to do anything but drop packets and > reset connections. Changes to the way packets are routed should be done > at the routing layer, using the flow information from the transport > layer. The real world

Re: [PATCH 1/1] ixp2000: fix gcc4 breakage

2006-03-26 Thread Andi Kleen
On Sunday 26 March 2006 21:18, Lennert Buytenhek wrote: > gcc4 doesn't allow declaring a static function inside another function, > so convert to extern. (The function whose prototype we're changing is > not defined anywhere and intended purely to cause a link error when some > internal calculatio

Re: Fix memory allocation in com90xx.c

2006-03-28 Thread Andi Kleen
On Monday 27 March 2006 17:41, Darren Jenkins\ wrote: > - shmems = kzalloc(((0x1-0xa) / 0x800) * sizeof(unsigned long), > + shmems = kcalloc(((0x1-0xa) / 0x800), sizeof(unsigned long), >GFP_KERNEL); If it's too big for kzalloc then it's too big for
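For context on the change being discussed: the practical difference between the two calls is that kcalloc(n, size, flags) checks the n * size multiplication for overflow before allocating, whereas a product computed by the caller and passed to kzalloc() is not checked. A user-space sketch of that check, built on malloc() and using a hypothetical function name:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for an overflow-checked, zeroing
 * array allocation. Returns NULL if n * size would overflow size_t
 * or if the underlying allocation fails. */
static inline void *zalloc_array(size_t n, size_t size)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would wrap around */

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

When both operands are compile-time constants, as they appear to be in the quoted hunk, the overflow check cannot trigger, so the two forms would behave the same there.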

Re: dcache leak in 2.6.16-git8 II

2006-03-29 Thread Andi Kleen
On Tuesday 28 March 2006 05:00, Andrew Morton wrote: > Andi Kleen <[EMAIL PROTECTED]> wrote: > > > > On Monday 27 March 2006 13:48, Bharata B Rao wrote: > > > On Mon, Mar 27, 2006 at 07:50:20AM +0200, Andi Kleen wrote: > > > > > > > > A 2GB x

Re: dcache leak in 2.6.16-git8 II

2006-03-29 Thread Andi Kleen
On Thursday 30 March 2006 00:50, Andrew Morton wrote: > It looks that way. Didn't someone else report a sock_inode_cache leak? Didn't see it. > > I still got a copy of the /proc in case anybody wants more information. > > We have this fancy new /proc/slab_allocators now, it might show somethi

Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver

2006-09-25 Thread Andi Kleen
David Miller <[EMAIL PROTECTED]> writes: > > First, the only mentioned real use of EtherIP I've seen anywhere is to > tunnel old LAN based games that used protocols other than IP :-) How would you convince those old LAN games to use an MTU < 1500 which is needed for the tunnel? I bet they have th

Re: [PATCH 00/03][RESUBMIT] net: EtherIP tunnel driver

2006-09-25 Thread Andi Kleen
On Monday 25 September 2006 13:57, Joerg Roedel wrote: > On Mon, Sep 25, 2006 at 12:22:41PM +0200, Andi Kleen wrote: > > > How would you convince those old LAN games to use a MTU < 1500 which > > is needed for the tunnel? I bet they have the size hardcoded. > > Th

Re: [PATCH] Introduce BROKEN_ON_64BIT facility

2006-10-02 Thread Andi Kleen
Jeff Garzik <[EMAIL PROTECTED]> writes: > Add a broken-on-64bit option, similar to the existing broken-on-smp > config option. This is just the first pass, marking the obvious > candidates. When I had this problem in the past I just used && !64BIT. How is this new option different? > config ISD

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-04 Thread Andi Kleen
On Wednesday 04 October 2006 17:45, Andrew Morton wrote: > On Wed, 04 Oct 2006 08:42:28 -0500 > Steve Fox <[EMAIL PROTECTED]> wrote: > > > On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote: > > > On Thu, 28 Sep 2006 17:50:31 + (UTC) > > > "Steve Fox" <[EMAIL PROTECTED]> wrote: > > > > >

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-04 Thread Andi Kleen
> I think most likely it would crash on 2.6.18. Keith mannthey had reported > a different crash on 2.6.18-rc4-mm2 when this patch was introduced first > time. Following is the link to the thread. Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4? -Andi

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 17:32, Steve Fox wrote: > On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote: > > > Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ? > > CONFIG_DEBUG_KERNEL should be on > > > Last time I couldn't match your instruction dump to any code s

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 19:57, Steve Fox wrote: > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote: > > > Please don't snip the Code: line. It is fairly important. > > Sorry about that. The remote console I was using appears to overwrite > some text after I

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 20:51, Steve Fox wrote: > On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote: > > > I guess we need to track when it gets corrupted. Can you send the full > > boot log with this patch applied? > > Here she blows! Can you please try it again w

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 20:52, Vivek Goyal wrote: > On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote: > > On Thursday 05 October 2006 19:57, Steve Fox wrote: > > > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote: > > > > > > > Please

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 22:42, Steve Fox wrote: > On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote: > > > Can you please try it again with this patch to narrow it down further? > > Unfortunately this is as far as it got before it hung. Boot with earlyprintk=serial,ttyS

Re: 2.6.18-mm2 boot failure on x86-64

2006-10-05 Thread Andi Kleen
> hmm, rather than bugging you with patches now, I'll see what I can find > with the x86_64 machines I have access to and see can I reproduce it. I started the bisect, should finish soon. -Andi

Re: 2.6.18-mm2 boot failure on x86-64 II

2006-10-05 Thread Andi Kleen
On Thursday 05 October 2006 22:51, Andi Kleen wrote: > > > hmm, rather than bugging you with patches now, I'll see what I can find > > with the x86_64 machines I have access to and see can I reproduce it. > > I started the bisect, should finish soon

Re: 2.6.18-mm2 boot failure on x86-64 II

2006-10-05 Thread Andi Kleen
> As of yet I haven't been able to recreate the hang. I am running > similar HW to Steve. That was on a 4 core Opteron with Tyan board (S2881) and AMD-8111 chipset. -Andi

2.6.19rc2 XFRM does too large direct mapping allocations for hashes

2006-10-18 Thread Andi Kleen
I got this while restarting ipsec on a 2.6.19rc2 system that was up for a few days. Order 8 is really a bit big to get from the direct mapping after boot. Should the hash allocation fall back to vmalloc? -Andi Initializing XFRM netlink socket events/0: page allocation failure. order:8, mode:0x

Re: [PATCH] Bound TSO defer time (resend)

2006-10-18 Thread Andi Kleen
On Tuesday 17 October 2006 06:18, John Heffner wrote: > Stephen Hemminger wrote: > > On Mon, 16 Oct 2006 20:53:20 -0400 (EDT) > > John Heffner <[EMAIL PROTECTED]> wrote: > > >> This patch limits the amount of time you will defer sending a TSO segment > >> to less than two clock ticks, or the time

Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Andi Kleen
> But, it also violates the assumptions of the network devices. > It calls NAPI poll back with IRQ's disabled and potentially doesn't > obey the semantics about only running on the same CPU as the > received packet. netpoll always played a little fast'n'loose with various locking rules. Also often

Re: [PATCH 2/3] netpoll: rework skb transmit queue

2006-10-20 Thread Andi Kleen
On Friday 20 October 2006 23:08, David Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Fri, 20 Oct 2006 23:01:29 +0200 > > > netpoll always played a little fast'n'loose with various locking rules. > > The current code is fine, it never reen

Re: [PATCH 2.6.15] s2io: UFO support

2005-11-14 Thread Andi Kleen
On Monday 14 November 2005 21:25, Ananda Raju wrote: > Hi, > This patch implements the UFO support in S2io driver. This patch uses the UFO > interface available in linux-2.6.15 kernel. Can you share some numbers on how much difference it makes vs non UFO? > +#ifdef __BIG_ENDIAN > +

Re: [Bug 5610] New: IP MTU Path Discovery now working properly

2005-11-15 Thread Andi Kleen
On Wednesday 16 November 2005 01:45, David S. Miller wrote: > > Alternatively, we're ignoring the PMTU message for one reason or > another. Perhaps the quoted TCP packet in the ICMP pmtu message > has an incorrect sequence number or is truncated for some reason. There are counters for all of th

Re: [RFC] [PATCH 0/3] ioat: DMA engine support

2005-11-23 Thread Andi Kleen
On Wed, Nov 23, 2005 at 05:06:42PM -0500, Jeff Garzik wrote: > IOAT is super-neat stuff. The main problem I see is that it'll likely only pay off when you can keep the queue of copies long (to amortize the cost of talking to an external chip). At least for the standard recvmsg skb->user space,

Re: [RFC] [PATCH 0/3] ioat: DMA engine support

2005-11-23 Thread Andi Kleen
On Wed, Nov 23, 2005 at 07:17:01PM -0500, Benjamin LaHaise wrote: > On Wed, Nov 23, 2005 at 11:30:08PM +0100, Andi Kleen wrote: > > The main problem I see is that it'll likely only pay off when you can keep > > the queue of copies long (to amortize the cost of > > t

Re: [RFC] [PATCH 0/3] ioat: DMA engine support

2005-11-24 Thread Andi Kleen
On Thu, Nov 24, 2005 at 05:24:34PM +0200, Avi Kivity wrote: > Andi Kleen wrote: > > >>Don't forget that there are benefits of not polluting the cache with the > >>traffic for the incoming skbs. > >> > >> > > > >Is that a general benef

Re: [RFC] [PATCH 0/3] ioat: DMA engine support

2005-11-24 Thread Andi Kleen
> >Just pointing out that it's not clear it will always be a big help. > > > > > > > Agree it should default to in-cache. This would mean no DMA engine by default. Clearly there needs to be some heuristic to decide by default. We'll see how effective it will be in the end. -Andi

Re: Resend [PATCH netdev-2.6 2/8] e1000: Performance Enhancements

2005-12-02 Thread Andi Kleen
On Fri, Dec 02, 2005 at 11:04:14AM -0700, Grant Grundler wrote: > At the time you did this, I read the Intel docs on P3 and P4 cache > behaviors. IIRC, the P4 HW prefetches very aggressively. ie the SW > prefetching just becomes noise or burns extra CPU cycles. My guess I don't think they can foll

Re: Resend [PATCH netdev-2.6 2/8] e1000: Performance Enhancements

2005-12-02 Thread Andi Kleen
On Fri, Dec 02, 2005 at 05:01:39PM -0800, John Ronciak wrote: > On 12/2/05, Grant Grundler <[EMAIL PROTECTED]> wrote: > > > Yup. We can tune for workload/load-latency of each architecture. > > I think tuning for all of them in one source code is the current problem. > > We have to come up with a w

Re: Resend [PATCH netdev-2.6 2/8] e1000: Performance Enhancements

2005-12-08 Thread Andi Kleen
On Thu, Dec 08, 2005 at 01:35:11AM -0800, David S. Miller wrote: > From: Robert Olsson <[EMAIL PROTECTED]> > Date: Thu, 8 Dec 2005 10:20:43 +0100 > > > Why not remove copybreak from the drivers and do eventual copybreak after > > we > > have looked up the packet. This way we can get copybreak f

Re: Resend [PATCH netdev-2.6 2/8] e1000: Performance Enhancements

2005-12-08 Thread Andi Kleen
> For example, if this is just a TCP ACK, we can do better than > copybreak and just let the driver use the SKB again upon > return from netif_receive_skb(). :-) That's a cool optimization. -Andi

Re: [PATCH RFC]: ipv6 addrconf async jobs

2005-12-13 Thread Andi Kleen
> Comments? Wouldn't it be cleaner to pass function pointers instead of all these switches? -Andi

Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism

2005-12-14 Thread Andi Kleen
> I would appreciate any feedback or comments on this approach. Maybe I'm missing something but wouldn't you need a separate critical pool (or at least a reservation) for each socket to be safe against deadlocks? Otherwise if a critical socket needs e.g. 2 pages to finish something and 2 critical sock

Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism

2005-12-14 Thread Andi Kleen
> Here we are assuming that the pre-allocated critical page pool is big enough > to satisfy the requirements of all the critical sockets. That seems like a lot of assumptions. Is it really better than the existing GFP_ATOMIC which works basically the same? It has a lot more users that compete tr

Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism

2005-12-14 Thread Andi Kleen
On Wed, Dec 14, 2005 at 08:30:23PM -0800, David S. Miller wrote: > From: Matt Mackall <[EMAIL PROTECTED]> > Date: Wed, 14 Dec 2005 19:39:37 -0800 > > > I think we need a global receive pool and per-socket send pools. > > Mind telling everyone how you plan to make use of the global receive > pool

Re: [RFC] Fine-grained memory priorities and PI

2005-12-15 Thread Andi Kleen
> When processes request memory through any subsystem, their memory > priority would be passed through the kernel layers to the allocator, > along with any associated information about how to free the memory in > a low-memory condition. As a result, I could configure my database > to have

Re: [RFC] Fine-grained memory priorities and PI

2005-12-15 Thread Andi Kleen
> Naturally this is all still in the vaporware stage, but I think that > if implemented the concept might at least improve the OOM/low-memory > situation considerably. Starting to fail allocations for the cluster > programs (including their kernel allocations) well before failing > them fo

2.6.15rc5-git4 Forcedeth unstable on Nforce4

2005-12-15 Thread Andi Kleen
Hallo, When I boot a kernel with iommu=force (this forces all pci_map_sgs through the K8 aperture) and slab debugging on a Nforce4 x86-64 system the network corrupts data very quickly. Even when just sshing somewhere ssh quickly aborts with MAC errors etc. Reverting only forcedeth.c to the one f

Re: 2.6.15rc5-git4 Forcedeth unstable on Nforce4

2005-12-15 Thread Andi Kleen
On Thu, Dec 15, 2005 at 11:40:08AM -0500, John W. Linville wrote: > On Thu, Dec 15, 2005 at 04:29:04PM +0100, Andi Kleen wrote: > > > > Hallo, > > > > When I boot a kernel with iommu=force (this forces all pci_map_sgs > > through the K8 aperture) and slab debuggi

Re: 2.6.15rc5-git4 Forcedeth unstable on Nforce4

2005-12-15 Thread Andi Kleen
On Thu, Dec 15, 2005 at 01:58:41PM -0800, David S. Miller wrote: > From: "John W. Linville" <[EMAIL PROTECTED]> > Date: Thu, 15 Dec 2005 15:21:37 -0500 > > > Interesting... FWIW the FC4.netdev.6 kernel seems to be working fine > > on (a yet-to-be-released box), which is an x86_64 (AMD) box w/ > >

Re: 2.6.12.6 to 2.6.14.3 Major 10-GigE TCP Network Performance Degradation

2005-12-15 Thread Andi Kleen
> It appears that it is getting CPU starved for some reason (note the > 43%/40% transmitter CPU usage versus the 99%/99% CPU usage for the > 2.6.12.6 case). What happens when you turn off tso in ethtool? -Andi

Re: 2.6.12.6 to 2.6.14.3 Major 10-GigE TCP Network Performance Degradation

2005-12-15 Thread Andi Kleen
On Thu, Dec 15, 2005 at 08:35:32PM -0500, Bill Fink wrote: > On Fri, 16 Dec 2005, Andi Kleen wrote: > > > > It appears that it is getting CPU starved for some reason (note the > > > 43%/40% transmitter CPU usage versus the 99%/99% CPU usage for the > > > 2.6.12.6

Re: 2.6.15rc5-git4 Forcedeth unstable on Nforce4 - update III

2005-12-15 Thread Andi Kleen
Some more testing shows that even the 2.6.14 driver eventually causes slab debugging BUGs() like --- [cut here ] - [please bite here ] - Kernel BUG at /home/andi/lsrc/linux-2.6.15rc5-git4/mm/slab.c:2307 invalid operand: [1] PREEMPT SMP CPU 3 Modules linked in: Pid:

Re: [PATCH] Small cleanup to socket initialization

2005-12-17 Thread Andi Kleen
On Fri, Dec 16, 2005 at 11:15:17PM -0800, David S. Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Sat, 17 Dec 2005 08:10:29 +0100 > > > sock_init can be done as a core_initcall instead of calling > > it directly in init/main.c > > > >

[PATCH] Fix SLAB_DEBUG failures with forcedeth

2005-12-23 Thread Andi Kleen
5k, 9k) to 2k or 16k. But at least the network driver is usable now again with slab debugging enabled. The magic value 86 has been found with binary search. Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- linux-2.6.15rc5-git4/drivers/net/forcedeth.c-ORIG 2005-12-16 01:00:33.0

Re: [PATCH] Fix SLAB_DEBUG failures with forcedeth

2005-12-23 Thread Andi Kleen
On Fri, Dec 23, 2005 at 01:42:52PM +0100, Manfred Spraul wrote: > Hi, > > Andi Kleen wrote: > > >It shouldn't make any difference on !SLAB_DEBUG kernels because kmalloc > >will pad typical mtus (1.5k, 9k) to 2k or 16k. But at least the > >network driver is u

Re: [PATCH] Fix SLAB_DEBUG failures with forcedeth

2005-12-23 Thread Andi Kleen
On Fri, Dec 23, 2005 at 03:15:24PM +0100, Manfred Spraul wrote: > Andi Kleen wrote: > > >It's more than 82 bytes but less than 86. I didn't run the binary > >search further. > > > > > > > My guess: with 86 byte additional padding, you en

Re: [PATCH] Fix SLAB_DEBUG failures with forcedeth

2005-12-24 Thread Andi Kleen
> For me, there are 64 bytes left. Could you send me your .config, then I > can check it. I tested your pci_map_single patch now and it indeed fixes the problem. Looks like your theory was right. I redid my math and I really was a bit off and the "crosses into 4k slab" theory makes a lot of sens

Re: [PATCH, RFC] RCU : OOM avoidance and lower latency

2006-01-06 Thread Andi Kleen
On Friday 06 January 2006 11:17, Eric Dumazet wrote: > > I assume that if a CPU queued 10.000 items in its RCU queue, then the > oldest entry cannot still be in use by another CPU. This might sounds as a > violation of RCU rules, (I'm not an RCU expert) but seems quite reasonable. I don't think i

Re: [PATCH, RFC] RCU : OOM avoidance and lower latency

2006-01-06 Thread Andi Kleen
On Friday 06 January 2006 20:26, Lee Revell wrote: > On Fri, 2006-01-06 at 13:58 +0100, Andi Kleen wrote: > > Another CPU might be stuck in a long > > running interrupt > > Shouldn't a long running interrupt be considered a bug? In normal operation yes, but there
