Re: [-mm patch] make csum_and_copy_from_user arch independent (was Re: 2.6.21-mm2)

2007-05-11 Thread Andrew Morton
On Fri, 11 May 2007 10:27:38 +0200 Frederik Deweerdt [EMAIL PROTECTED] wrote:

 On Wed, May 09, 2007 at 01:23:22AM -0700, Andrew Morton wrote:
  
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21/2.6.21-mm2/
  
 
 ERROR: csum_partial_copy_from_user [net/rxrpc/af-rxrpc.ko] undefined!
 
 Linking on ARM fails: although a generic csum_and_copy_from_user()
 function is provided when _HAVE_ARCH_COPY_AND_CSUM_FROM_USER is not
 defined, that generic function calls csum_partial_copy_from_user(),
 which is i386 only.
 
 The following patch uses copy_from_user() followed by csum_partial()
 to make the function platform independent.
 
 Regards,
 Frederik
 
 Signed-off-by: Frederik Deweerdt [EMAIL PROTECTED]
 
 diff --git a/include/net/checksum.h b/include/net/checksum.h
 index 1242461..2eebb95 100644
 --- a/include/net/checksum.h
 +++ b/include/net/checksum.h
 @@ -30,13 +30,16 @@ static inline
  __wsum csum_and_copy_from_user (const void __user *src, void *dst,
 int len, __wsum sum, int *err_ptr)
  {
 - if (access_ok(VERIFY_READ, src, len))
 - return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
 + if (access_ok(VERIFY_READ, src, len)) {
 + if (copy_from_user(dst, src, len))
 + return -EFAULT;
 + return csum_partial(dst, len, sum);
 + }
  
   if (len)
   *err_ptr = -EFAULT;
  
 - return sum;
 + return (__force __wsum)-1; /* invalid checksum */
  }
  #endif
  

hm.  Please cc netdev on net patches.
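
For readers following the approach in the patch above, here is a minimal
user-space sketch of "copy first, then checksum". It is not the kernel code
and makes several assumptions: memcpy() stands in for copy_from_user(), a
NULL-pointer check stands in for access_ok(), and sketch_csum_partial() is a
hypothetical 16-bit ones' complement sum, not the kernel's csum_partial().
In this sketch a failed copy is reported through err_ptr and the incoming
sum is returned unchanged.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for csum_partial(): adds the buffer to `sum` as
 * big-endian 16-bit words and folds the carries back into 16 bits. */
uint32_t sketch_csum_partial(const void *buf, size_t len, uint32_t sum)
{
    const uint8_t *p = buf;
    uint64_t acc = sum;

    while (len >= 2) {
        acc += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)                        /* odd trailing byte, padded with zero */
        acc += (uint32_t)p[0] << 8;
    while (acc >> 16)               /* end-around carry */
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint32_t)acc;
}

/* Copy, then checksum the destination buffer. The NULL check is only a
 * placeholder for access_ok(); -1 is a placeholder for -EFAULT. */
uint32_t sketch_csum_and_copy(const void *src, void *dst, size_t len,
                              uint32_t sum, int *err_ptr)
{
    if (src == NULL || dst == NULL) {
        if (len)
            *err_ptr = -1;
        return sum;
    }
    memcpy(dst, src, len);
    return sketch_csum_partial(dst, len, sum);
}
```

The point of the design is that the checksum is computed over the kernel-side
copy, so only one architecture-independent copy primitive is needed.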
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] AFS: Fix interminable loop in afs_write_back_from_locked_page()

2007-05-11 Thread Andrew Morton
On Fri, 11 May 2007 10:49:23 +0100 David Howells [EMAIL PROTECTED] wrote:

 Andrew Morton [EMAIL PROTECTED] wrote:
 
   Following bug was uncovered by compiling with '-W' flag:
  
  gcc -W finds a number of fairly scary bugs.
 
 Do you mean in my code specifically?  Or in the kernel in general?

In general.

  As far as
 I can tell -W only finds an eye-glazingly large quantity of 'unused parameter'
 warnings in AFS and AF_RXRPC.

Yes, it's a shame that there doesn't seem to be a fine-grained way of
turning on -W's useful bits.



Re: [patch 04/13] ppp_generic: fix lockdep warning

2007-05-11 Thread Andrew Morton
On Fri, 11 May 2007 14:03:09 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Jeff Garzik [EMAIL PROTECTED]
 Date: Fri, 11 May 2007 16:57:19 -0400
 
  applied
 
 I was under the impression that this patch didn't actually fix the
 problem yet?  I might be thinking about something else...

yeah, sorry, it seems that the discussion is ongoing.  Please drop the
patch.  I did.


Re: [patch 08/13] Use menuconfig objects II - netdev (general+100mbit)

2007-05-11 Thread Andrew Morton
On Fri, 11 May 2007 23:32:08 +0200 (MEST)
Jan Engelhardt [EMAIL PROTECTED] wrote:

 
 On May 11 2007 16:57, Jeff Garzik wrote:
  
  CONFIG_NETDEVICES, CONFIG_NET_ETHERNET:
  Change Kconfig objects from menu, config into menuconfig so
  that the user can disable the whole feature without having to
  enter the menu first.
  
  CONFIG_SMC9194:
  Move it so that it appears correctly in menuconfig.
  
  Signed-off-by: Jan Engelhardt [EMAIL PROTECTED]
  Cc: Jeff Garzik [EMAIL PROTECTED]
  Signed-off-by: Andrew Morton [EMAIL PROTECTED]
  ---
  
  drivers/net/Kconfig |  167 --
  drivers/net/arm/Kconfig |   12 +-
  drivers/net/fec_8xx/Kconfig |2
  drivers/net/fs_enet/Kconfig |2
  drivers/net/tulip/Kconfig   |   27 ++---
  5 files changed, 102 insertions(+), 108 deletions(-)
 
  ACK, but failed to apply
 
 Name the tree that I should rebase it on. (Did not seem like basing
 everything on -mm always works out.)
 

Is OK.  It applies OK here, but is obviously getting rejects due to all the
other maintainers who have their sticky paws on Jeff's stuff.

akpm:/usr/src/25> for i in drivers/net/Kconfig drivers/net/arm/Kconfig drivers/net/fec_8xx/Kconfig drivers/net/tulip/Kconfig
for> do
for> grep -l $i pc/git-*.pc
for> done | sort | uniq
pc/git-infiniband.pc
pc/git-qla3xxx.pc
pc/git-wireless.pc

I'll just continue to maintain the patch and one day the conflicting
changes will make it into mainline and the patch will then apply on Jeff's
tree.  Ho hum.



Fw: [Bugme-new] [Bug 8474] New: regression failure, can't even ping modem

2007-05-13 Thread Andrew Morton


Begin forwarded message:

Date: Sun, 13 May 2007 19:05:40 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 8474] New: regression failure, can't even ping modem


http://bugzilla.kernel.org/show_bug.cgi?id=8474

   Summary: regression failure, can't even ping modem
Kernel Version: 2.6.20.11
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did *NOT* occur:
2.6.19.3

Distribution:
Mandriva 2007.0 

Hardware Environment:
AMD Sempron
ASUS K8V-VM 
Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 7c)
onboard Ethernet Realtek RTL8201CL 10/100M LAN PHY

Software Environment:

Problem Description:

I suspect that this is a configuration problem rather than a code problem - 
but I am often wrong.  8-(  

Network failure, can't even ping ADSL modem/router.  All packets are lost.  

My suspicions revolve around CONFIG_IP_ROUTE_FWMARK=y, and several options 
that follow it, which are missing from the .config file of the broken kernel.  

Steps to reproduce:

Boot with 2.6.19.3, can surf the 'net fine at this stage.  
cd /usr/src/linux-2.6.20.11
make distclean
make oldconfig
accept defaults offered
nice make
as root...
nice make modules_install
make install
reboot with 2.6.20.11
can't even ping modem.  

Can't include both .config files, because of the 64k character limit for bug 
reports.  

Here is the broken one...  
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.20.11
# Sun May  6 15:55:36 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_SYSFS_DEPRECATED=y
CONFIG_RELAY=y
CONFIG_INITRAMFS_SOURCE=
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y

#
# Block layer
#
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_LSF=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED=anticipatory

#
# Processor type and features
#
# CONFIG_SMP is not set
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MCORE2 is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
CONFIG_MK8=y
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y

Re: kernel oops after unloading nf_conntrack_netbios_ns_module

2007-05-14 Thread Andrew Morton
On Mon, 14 May 2007 09:16:11 +0200
Gabor Burjan [EMAIL PROTECTED] wrote:

 On Sat, May 12, 2007 at 05:05:51PM +0200, Patrick McHardy wrote:
  Gabor Burjan wrote:
   EIP is at destroy_conntrack+0x52/0x127 [nf_conntrack]
  
  nmblookup existing_netbios_name
  cat /proc/net/ip_conntrack
  
  sleep 3
  
  rmmod nf_conntrack_netbios_ns
  
  Thanks for the report and good testcase, the crash can only happen with
  a sleep of >= 3s after the last nmblookup packet was sent.
  
  Can you try if this patch fixes it please?
 
 Yes, it fixes the problem.

Just speaking generally, rather than about this particular patch...

Gabor did quite an amount of valuable work here: tested a 2.6.20.x kernel,
developed a test case for reproducing the bug, reported it all quite
exhaustively, tested the resulting patch.

The least we can do in return is to put a big fat thanks in the changelog
when the fix gets merged.

 Thank you,

No - thank _you_.


Re: [patch 1/1] icom: Add new sub-device-id to support new adapter

2007-05-15 Thread Andrew Morton
On Tue, 15 May 2007 11:29:15 -0500 wendy xiong [EMAIL PROTECTED] wrote:

 I have tested this with new adapter on our systems. I didn't get
 comments since I sent out last Wednesday.
 
 Could you help me with this patch?

You sent it to the wrong mailing list: netdev doesn't handle serial drivers.
I don't normally troll netdev for missed patches.

Please send miscellaneous patches to linux-kernel.

Your email client is replacing tabs with spaces - I fixed that up.

The undersized irq number bug was already fixed.

Thanks for the patch.


Re: select(0, ..) is valid ?

2007-05-15 Thread Andrew Morton
On Tue, 15 May 2007 10:29:18 -0700
Badari Pulavarty [EMAIL PROTECTED] wrote:

 Hi,
 
 Is select(0, ..) a valid operation ?

Probably - it becomes an elaborate way of doing a sleep.  Whatever - we
used to permit it without error, so we should continue to do so.
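
As an illustration of the "elaborate sleep" usage, a small user-space sketch
follows. It assumes a POSIX system; sleep_via_select() is a hypothetical
helper name, not something from this thread. With nfds set to 0 and all three
descriptor sets NULL, select() has nothing to watch and simply waits for the
timeout.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep for roughly `usec` microseconds by calling select() with no
 * descriptors. Returns select()'s own result: 0 when the timeout expires,
 * -1 on error (for example when interrupted by a signal). */
int sleep_via_select(long usec)
{
    struct timeval tv;

    tv.tv_sec = usec / 1000000;
    tv.tv_usec = usec % 1000000;
    return select(0, NULL, NULL, NULL, &tv);
}
```

Callers that rely on this pattern are the reason a select(0, ..) call has to
keep succeeding.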

 I see that there is no check to prevent this or return
 success early, without doing any work. Do we need one ?
 
 slub code is complaining that we are doing kmalloc(0).
 
 [ cut here ]
 Badness at include/linux/slub_def.h:88
 Call Trace:
 [c001e4eb7640] [c000e650] .show_stack+0x68/0x1b0
 (unreliable)
 [c001e4eb76e0] [c029b854] .report_bug+0x94/0xe8
 [c001e4eb7770] [c00219f0] .program_check_exception
 +0x12c/0x568
 [c001e4eb77f0] [c0004a84] program_check_common+0x104/0x180
 --- Exception: 700 at .get_slab+0x4c/0x234
 LR = .__kmalloc+0x24/0xc4
 [c001e4eb7ae0] [c001e4eb7b80] 0xc001e4eb7b80 (unreliable)
 [c001e4eb7b80] [c00a7ff0] .__kmalloc+0x24/0xc4
 [c001e4eb7c10] [c00ea720] .compat_core_sys_select+0x90/0x240
 [c001e4eb7d00] [c00ec3a4] .compat_sys_select+0xb0/0x190
 [c001e4eb7dc0] [c0014944] .ppc32_select+0x14/0x28
 [c001e4eb7e30] [c000872c] syscall_exit+0x0/0x40


I _think_ we can just do

--- a/fs/compat.c~a
+++ a/fs/compat.c
@@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
 */
ret = -ENOMEM;
size = FDS_BYTES(n);
-   bits = kmalloc(6 * size, GFP_KERNEL);
-   if (!bits)
-   goto out_nofds;
+   if (likely(size)) {
+   bits = kmalloc(6 * size, GFP_KERNEL);
+   if (!bits)
+   goto out_nofds;
+   } else {
+   bits = NULL;
+   }
fds.in  = (unsigned long *)  bits;
fds.out = (unsigned long *) (bits +   size);
fds.ex  = (unsigned long *) (bits + 2*size);
_

I mean, if that oopses then I'd be very interested in finding out why.

But I'm starting to suspect that it would be better to permit kmalloc(0) in
slub.  It depends on how many more of these things need fixing.

otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
there's some value in forcing us to go look at all the callsites.



Re: select(0, ..) is valid ?

2007-05-15 Thread Andrew Morton
On Tue, 15 May 2007 11:10:22 -0700 (PDT)
Christoph Lameter [EMAIL PROTECTED] wrote:

 On Tue, 15 May 2007, Andrew Morton wrote:
 
  I _think_ we can just do
  
  --- a/fs/compat.c~a
  +++ a/fs/compat.c
  @@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
   */
  ret = -ENOMEM;
  size = FDS_BYTES(n);
  -   bits = kmalloc(6 * size, GFP_KERNEL);
  -   if (!bits)
  -   goto out_nofds;
  +   if (likely(size)) {
  +   bits = kmalloc(6 * size, GFP_KERNEL);
  +   if (!bits)
  +   goto out_nofds;
  +   } else {
  +   bits = NULL;
  +   }
  fds.in  = (unsigned long *)  bits;
  fds.out = (unsigned long *) (bits +   size);
  fds.ex  = (unsigned long *) (bits + 2*size);
  _
  
  I mean, if that oopses then I'd be very interested in finding out why.
  
  But I'm starting to suspect that it would be better to permit kmalloc(0) in
  slub.  It depends on how many more of these things need fixing.
  
  otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
  there's some value in forcing us to go look at all the callsites.
  
 Hmmm... We could have kmalloc(0) return a pointer to the zero page? That 
 would catch any writers?

Returning NULL would have the same effect..

But the problem is that we won't get 100% coverage of all codepaths
for ages, so any oopses we added won't get found.

otoh, any code which does dereference that pointer is buggy anyway.

The problem here is that code which does

kmalloc(some-expression-which-returns-0)

will go and assume that the kmalloc(0) got an ENOMEM and it'll take the
error path.

Oh well, let's persist with things as they now are.

Perhaps putting a size=0 detector into slab also would speed this
process up.
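
The false-ENOMEM trap described above can be sketched in user space. This is
an illustration under assumptions, not kernel code: alloc_null_on_zero() is a
hypothetical allocator that returns NULL for a zero-byte request, and the two
callers are made-up examples of code that does and does not cope with that.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator: a zero-size request yields NULL. */
void *alloc_null_on_zero(size_t n)
{
    return n ? malloc(n) : NULL;
}

/* Naive caller: any NULL is treated as out-of-memory, so a legitimate
 * zero-length request takes the error path. */
int naive_setup(size_t n)
{
    void *p = alloc_null_on_zero(n);

    if (!p)
        return -ENOMEM;
    free(p);
    return 0;
}

/* Careful caller: a zero-length request is handled before allocating. */
int careful_setup(size_t n)
{
    void *p;

    if (n == 0)
        return 0;
    p = alloc_null_on_zero(n);
    if (!p)
        return -ENOMEM;
    free(p);
    return 0;
}
```

Here naive_setup(0) reports -ENOMEM although memory is not short, while
careful_setup(0) succeeds.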


Re: [Bugme-new] [Bug 8491] New: OOPS triggered by ip(8) deconfiguring a network interface

2007-05-17 Thread Andrew Morton
On Thu, 17 May 2007 06:59:21 -0700 [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8491
 
Summary: OOPS triggered by ip(8) deconfiguring a network
 interface
 Kernel Version: 2.6.22-rc1
 Status: NEW
   Severity: high
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did *NOT* occur:
 Distribution: Ubuntu Feisty/Gutsy
 Hardware Environment: Pentium M, e1000, ipw2200
 Software Environment: network-manager
 Problem Description: Oops removing proc entry during ifdown
 
 Steps to reproduce: Suspend/resume with interfaces alive
 

[42968.728000] ACPI: AC Adapter [AC] (on-line)
[42968.824000] ACPI: Battery Slot [BAT0] (battery present)
[42976.264000] eth1: no IPv6 routers present
[43039.50] ADDRCONF(NETDEV_UP): eth1: link is not ready
[43084.092000] e1000: eth2: e1000_watchdog: NIC Link is Up 1000 Mbps Full 
Duplex, Flow Control: RX
[43084.096000] ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[43092.388000] BUG: unable to handle kernel NULL pointer dereference at virtual 
address 
[43092.388000]  printing eip:
[43092.388000] c01ade92
[43092.388000] *pde = 
[43092.388000] Oops:  [#1]
[43092.388000] SMP 
[43092.388000] Modules linked in: battery ac ibm_acpi thermal processor fan
button e1000 ipw2200 ieee80211 usbhid hid michael_mic arc4 ecb blkcipher
ieee80211_crypt_tkip af_packet binfmt_misc rfcomm l2cap bluetooth ipv6
nvram uinput radeon drm speedstep_centrino cpufreq_userspace cpufreq_stats
cpufreq_powersave cpufreq_ondemand freq_table cpufreq_conservative video
sbs i2c_ec i2c_core bay dock container asus_acpi lp joydev snd_intel8x0
snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss irtty_sir sir_dev snd_pcm
nsc_ircc snd_seq_dummy snd_seq_oss irda psmouse pcmcia crc_ccitt parport_pc
snd_seq_midi snd_rawmidi snd_seq_midi_event serio_raw parport snd_seq
snd_timer snd_seq_device pcspkr snd soundcore snd_page_alloc
ieee80211_crypt iTCO_wdt iTCO_vendor_support yenta_socket rsrc_nonstatic
pcmcia_core intel_agp agpgart shpchp pci_hotplug tsdev evdev ext3 jbd
mbcache sg sr_mod cdrom sd_mod generic piix ata_generic floppy ata_piix
libata scsi_mod ehci_hcd uhci_hcd usbcore capability commoncap
[43092.388000] CPU:0
[43092.388000] EIP:0060:[c01ade92]Not tainted VLI
[43092.388000] EFLAGS: 00210246   (2.6.22-1-generic #1)
[43092.388000] EIP is at remove_proc_entry+0x22/0x1b0
[43092.388000] eax:    ebx: e2e8a5c0   ecx:    edx: e2e8ae40
[43092.388000] esi: dfe0aa58   edi:    ebp: c826ca00   esp: e6b09c80
[43092.388000] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
[43092.388000] Process ip (pid: 11893, ti=e6b08000 task=dfea8a90 
task.ti=e6b08000)
[43092.388000] Stack: e30f433c cab6 c028608e  e2e8ae40  
e2e8a5c0 dfe0aa58 
[43092.388000]c826ca00 c826ca00 f10a72ff f1088989 00a33d71 80fe 
 0001 
[43092.388000]cab6 e2e8a5c0 dfe0aa58  c826ca00 f1089594 
c826ca54 0040 
[43092.388000] Call Trace:
[43092.388000]  [c028608e] pneigh_queue_purge+0x1e/0x30
[43092.388000]  [f10a72ff] snmp6_unregister_dev+0x2f/0x40 [ipv6]
[43092.388000]  [f1088989] addrconf_ifdown+0x2b9/0x2f0 [ipv6]
[43092.388000]  [f1089594] inet6_addr_del+0xb4/0xe0 [ipv6]
[43092.388000]  [f108b3d8] inet6_rtm_deladdr+0x68/0x70 [ipv6]
[43092.388000]  [f108b370] inet6_rtm_deladdr+0x0/0x70 [ipv6]
[43092.388000]  [c02896cd] rtnetlink_rcv_msg+0x16d/0x250
[43092.388000]  [c0289560] rtnetlink_rcv_msg+0x0/0x250
[43092.388000]  [c02984b2] netlink_run_queue+0x82/0x120
[43092.388000]  [c0289508] rtnetlink_rcv+0x28/0x50
[43092.388000]  [c0298962] netlink_data_ready+0x12/0x50
[43092.388000]  [c0297671] netlink_sendskb+0x21/0x40
[43092.388000]  [c0298873] netlink_sendmsg+0x223/0x300
[43092.388000]  [c0277012] sock_sendmsg+0x112/0x130
[43092.388000]  [c0129d71] current_fs_time+0x41/0x50
[43092.388000]  [c0138f60] autoremove_wake_function+0x0/0x50
[43092.388000]  [c01f3009] copy_to_user+0x29/0x50
[43092.388000]  [c0277b03] move_addr_to_user+0x63/0x70
[43092.388000]  [c0277c83] sys_recvmsg+0x173/0x220
[43092.388000]  [c01f2d97] copy_from_user+0x27/0x60
[43092.388000]  [c02773cd] sys_sendto+0x12d/0x190
[43092.388000]  [c0158e90] find_get_page+0x20/0x50
[43092.388000]  [c015b991] filemap_nopage+0x2f1/0x3a0
[43092.388000]  [c016602e] __handle_mm_fault+0x23e/0x960
[43092.388000]  [c0168b26] __vma_link+0x36/0x70
[43092.388000]  [c0278458] sys_socketcall+0x198/0x280
[43092.388000]  [c0104114] sysenter_past_esp+0x5d/0x89
[43092.388000]  [c02e] xfrm_timer_handler+0x220/0x250
[43092.388000]  ===
[43092.388000] Code: 00 00 8d bc 27 00 00 00 00 55 57 56 53 83 ec 18 85 d2 89 
54 24 10 89 44 24 14 0f 84 40 01 00 00 8b 7c 24 14 31 c0 b9 ff ff ff ff f2 ae 
f7 d1 49 b8 c0 20 3b c0 89 cd e8 8d 0e 14 00 8b 5c 24 10 
[43092.388000] EIP: [c01ade92] remove_proc_entry+0x22/0x1b0 SS:ESP 
0068:e6b09c80

Re: [PATCH] Fix incorrect prototype for ipxrtr_route_packet()

2007-05-17 Thread Andrew Morton
On Thu, 17 May 2007 18:48:12 +0800
David Woodhouse [EMAIL PROTECTED] wrote:

 The function ipxrtr_route_packet() takes a 'len' argument of type
 size_t. However, its prototype in af_ipx.c incorrectly suggests that the
 corresponding argument is of type 'int' instead.
 
 Discovered by building with --combine and letting the compiler see it
 all at once.
 
 Signed-off-by: David Woodhouse [EMAIL PROTECTED]
 
 --- a/net/ipx/af_ipx.c
 +++ b/net/ipx/af_ipx.c
 @@ -87,7 +87,7 @@ extern int ipxrtr_add_route(__be32 network, struct 
 ipx_interface *intrfc,
   unsigned char *node);
  extern void ipxrtr_del_routes(struct ipx_interface *intrfc);
  extern int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
 -struct iovec *iov, int len, int noblock);
 +struct iovec *iov, size_t len, int noblock);
  extern int ipxrtr_route_skb(struct sk_buff *skb);
  extern struct ipx_route *ipxrtr_lookup(__be32 net);
  extern int ipxrtr_ioctl(unsigned int cmd, void __user *arg);

Lovely.  So it was actually generating wrong code on all
sizeof(size_t)!=sizeof(int) architectures.

If only we could find some way in which all callers of a function as
well as its definition can see the same declaration?
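
The usual answer to that question is a shared header. A minimal single-file
sketch of the idea follows; the function name route_packet_len_ok() and the
notion of a header called route.h are hypothetical and only stand in for the
real ipx declarations. Because the declaration is visible where the function
is defined, a definition that took 'int len' instead would be rejected by the
compiler as a conflicting type rather than silently miscompiled.

```c
#include <limits.h>
#include <stddef.h>

/* --- what would live in the shared header (hypothetical route.h) --- */
int route_packet_len_ok(size_t len);

/* --- the definition, in a .c file that includes that header --- */

/* Returns nonzero when `len` could also be represented as an int, which is
 * the range an 'int len' prototype would have been able to carry. */
int route_packet_len_ok(size_t len)
{
    return len <= (size_t)INT_MAX;
}
```

Callers include the same header, so caller, declaration and definition cannot
drift apart unnoticed.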


Re: [git patches] net driver updates

2007-05-18 Thread Andrew Morton
On Fri, 18 May 2007 23:46:21 +0200
Mariusz Koz__owski [EMAIL PROTECTED] wrote:

 Hello, 
 
  diff --git a/drivers/net/smc91x.h b/drivers/net/smc91x.h
  index 7053026..111f23d 100644
  --- a/drivers/net/smc91x.h
  +++ b/drivers/net/smc91x.h
  @@ -279,6 +279,40 @@ SMC_outw(u16 val, void __iomem *ioaddr, int reg)
 
 ...
 
  +#define SMC_outb(v, a, r) __ __ 
  __outw(((inw((a)+((r)~1))*(0xff8*(r%2 | ((v)(8*(r2, (a) + 
  ((r)~1))
 
 This one has unbalanced parentheses.
 

I dunno how you can tell - I can't count that high.

Can this be programmed in C, rather than in cpp?


Re: STRANGE ERROR

2007-05-19 Thread Andrew Morton
On Sun, 20 May 2007 00:30:55 +0200 Sasa Ostrouska [EMAIL PROTECTED] wrote:

 Hi everybody,
 
 I tried today to upgrade the kernel to 2.6.21.1 and i got the same
 error during the boot time.
 Here is the dmesg of the 2.6.20.2, can somebody tell me what this is ?
 
 ...

 Marvell 88E1101: Registered new driver
 Fixed PHY: Registered new driver
 driver_bound: device [EMAIL PROTECTED]:1 already bound

I don't know what caused that one.

 Device '[EMAIL PROTECTED]:1' does not have a release() function, it is broken
 and must be fixed.
 BUG: at drivers/base/core.c:104 device_release()
 
 Call Trace:
  [802ec380] kobject_cleanup+0x53/0x7e
  [802ec3ab] kobject_release+0x0/0x9
  [802ecf3f] kref_put+0x74/0x81
  [8035493b] fixed_mdio_register_device+0x230/0x265
  [80564d31] fixed_init+0x1f/0x35
  [802071a4] init+0x147/0x2fb
  [80223b6e] schedule_tail+0x36/0x92
  [8020a678] child_rip+0xa/0x12
  [80311714] acpi_ds_init_one_object+0x0/0x83
  [8020705d] init+0x0/0x2fb
  [8020a66e] child_rip+0x0/0x12

This appears to have happened because fixed_mdio_register_device() (or
phy_device_create) didn't suitably initialise phy_device.dev.

But I don't immediately see why this doesn't affect all phy drivers. 
Presumably it's the fixed driver which is at fault.  Jeff, how is this
supposed to work?

Thanks.



Re: [Bugme-new] [Bug 8519] New: NAT prerouting over tun interface broken

2007-05-21 Thread Andrew Morton
On Mon, 21 May 2007 13:05:36 -0700
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8519
 
Summary: NAT prerouting over tun interface broken
 Kernel Version: 2.6.21.1
 Status: NEW
   Severity: normal
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did *NOT* occur: 2.6.20.7
 Distribution: Debian unstable
 Hardware Environment: EM64T (Pentium D) running amd64 kernel
 Software Environment: Debian unstable
 
 Problem Description:
 I have the hercules s/390 emulator running on an EM64T host, both running 
 Debian unstable. I use a tun interface, a second IP address on eth0 and 
 iptables/nat so the emulator has its own address on my local network.
 
 With 2.6.21.1 on the host, networking between the emulator and the host 
 system 
 is fine (I can ssh from the host into the emulator without problems), but 
 communication from the emulator with other boxes is broken. Other boxes also 
 don't see the emulator if I ping its external address.
 
 If I ping another box on my LAN from the emulator while running wireshark on 
 the host, I can see that:
 - the echo request gets sent OK
 - the other box replies OK
 - the host receives the echo reply
 - but the tun interface never gets it.
 
 If I boot the host with 2.6.20 everything works fine again.
 
 Here is how the setup looks:
 | host system |
|-- emulator --|
 eth0  tun  ctc0
  LAN  --- 10.19.66.21   
  LAN  --- 10.19.66.92 --- 10.19.92.2 --- 10.19.92.1
  nat  P2P
 
 The only active iptables rules are:
 iptables -t nat -A PREROUTING -d 10.19.66.92 \
  -j DNAT --to-destination 10.19.92.1
 iptables -t nat -A POSTROUTING -s 10.19.92.1 \
  -j SNAT --to-source 10.19.66.92


Re: [PATCH 1/2 v4] s2io: add PCI error recovery support

2007-05-21 Thread Andrew Morton
On Mon, 21 May 2007 13:58:53 -0500
[EMAIL PROTECTED] (Linas Vepstas) wrote:

 This patch adds PCI error recovery support to the 
 s2io 10-Gigabit ethernet device driver. Fourth revision,
 blocks MSI interrupts, and statistics gathering, as well.
 
 Tested, seems to work well.
 
 Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
 Acked-by: Ramkrishna Vepa [EMAIL PROTECTED]
 Cc: Sivakumar Subramani [EMAIL PROTECTED]
 Cc: Sreenivasa Honnur [EMAIL PROTECTED]
 Cc: Rastapur Santosh [EMAIL PROTECTED]
 Cc: Wen Xiong [EMAIL PROTECTED]
 
 
 Please apply. This has been submitted for 2.6.19, 2.6.20 and 2.6.21
 with no major criticisms made, although with minor polish and fixups.
 I think it's ready.

This is already in Jeff's development tree.  Your new patch neither
applies nor unapplies, so if you've changed it, Jeff is now sitting
on an old version.  I assume he'd like an incremental update patch.



Re: [PATCH 1/2 v4] s2io: add PCI error recovery support

2007-05-21 Thread Andrew Morton
On Mon, 21 May 2007 17:23:57 -0500
[EMAIL PROTECTED] (Linas Vepstas) wrote:

 On Mon, May 21, 2007 at 02:48:47PM -0700, Andrew Morton wrote:
  On Mon, 21 May 2007 13:58:53 -0500
  [EMAIL PROTECTED] (Linas Vepstas) wrote:
   This patch adds PCI error recovery support to the 
  
  This is already in Jeff's development tree.  Your new patch neither
  applies nor unapplies, so if you've changed it, Jeff is now sitting
  on an old version.  I assume he'd like an incremental update patch.
 
 Ahh ! 
 
 I assume I have to git-pull
   /pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
 or something like that. Will try that now.

git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git#ALL
should suit.

 The part that confuses me is that I'd gotten a message from Jeff
 back in March (well before 2.6.21 came out), saying it was in his
 development tree; yet the patch is not in 2.6.22-rc; Torvalds
 hasn't yet pulled from it?

Looks like it went into Jeff's tree on May 14.  2.6.22-rc1 was released on
May 13, so the patch missed the merge window.


Re: EIP is at netlink_insert+0x41/0x10c

2007-05-28 Thread Andrew Morton
On Sun, 27 May 2007 23:47:52 -0700 Miles Lane [EMAIL PROTECTED] wrote:


please cc netdev@vger.kernel.org on net-related matters.

 Linux version 2.6.22-rc2-mm1 ([EMAIL PROTECTED]) (gcc
 version 4.1.2 20070502 (Red Hat 4.1.2-12)) #1 PREEMPT Sun May 27
 18:30:28 PDT 2007

 ...

 serio: i8042 AUX port at 0x60,0x64 irq 12
 mice: PS/2 mouse device common for all mice
 i2c-adapter i2c-0: nForce2 SMBus adapter at 0x5000
 i2c-adapter i2c-1: nForce2 SMBus adapter at 0x5500
 usbcore: registered new interface driver usbhid
 drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
 oprofile: using NMI interrupt.
 TCP cubic registered
 NET: Registered protocol family 1
 NET: Registered protocol family 10
 IPv6 over IPv4 tunneling driver
 NET: Registered protocol family 17
 powernow-k8: Processor cpuid 6a0 not supported
 Using IPI Shortcut mode
 Freeing unused kernel memory: 260k freed
 Write protecting the kernel text: 3108k
 Write protecting the kernel read-only data: 1699k
 input: AT Translated Set 2 keyboard as /class/input/input2
 BUG: unable to handle kernel paging request at virtual address 7272746d
  printing eip:
 c038fb7f
 *pde = 
 Oops:  [#1]
 PREEMPT
 Modules linked in:
 CPU:0
 EIP:0060:[c038fb7f]Not tainted VLI
 EFLAGS: 00210002   (2.6.22-rc2-mm1 #1)
 EIP is at netlink_insert+0x41/0x10c
 eax: c201a100   ebx: c242e3c0   ecx: 59bdb910   edx: 7272746d
 esi: c2060348   edi: c2ae2c00   ebp: c2af3ebc   esp: c2af3ea0
 ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
 Process init (pid: 505, ti=c2af3000 task=c2186150 task.ti=c2af3000)
 Stack: 0020 01f9 c201a100  c242e3c0 c2ae2c00 c2af3ee0 c2af3ed8
c03910dd c242e3c0 c2af3ee0 c0431fe0 c242e3c0 c2af3ee0 c2af3f74 c0378d48
0010 01f9  c218c2f4 48305000   c2af5ac0
 Call Trace:
  [c03910dd] netlink_bind+0x86/0x12d
  [c0378d48] sys_bind+0x4e/0x6d
  [c037a008] sys_socketcall+0x72/0x222
  [c0103e16] sysenter_past_esp+0x5f/0x99
  [e410] 0xe410
  ===
 INFO: lockdep is turned off.
 Code: 35 ec aa 8a c0 e8 c4 fd ff ff 8b 55 e8 89 f0 e8 80 f6 ff ff 89
 45 ec 8b 10 c7 45 f0 00 00 00 00 eb 05 ff 45 f0 89 c2 85 d2 74 1b 8b
 02 0f 18 00 90 8b 4d e8 39 8a 70 02 00 00 75 e6 bb 9e ff ff
 EIP: [c038fb7f] netlink_insert+0x41/0x10c SS:ESP 0068:c2af3ea0
 note: init[505] exited with preempt_count 1

I wonder how /bin/init got to run netlink stuff.



Re: [PATCH 3/4] Make net watchdog timers 1 sec jiffy aligned

2007-05-30 Thread Andrew Morton
On Tue, 29 May 2007 11:01:13 -0700
Venki Pallipadi [EMAIL PROTECTED] wrote:

 round_jiffies for net dev watchdog timer.
 
 Signed-off-by: Venkatesh Pallipadi [EMAIL PROTECTED]
 
 Index: linux-2.6.22-rc-mm/net/sched/sch_generic.c
 ===
 --- linux-2.6.22-rc-mm.orig/net/sched/sch_generic.c   2007-05-24 
 11:16:03.0 -0700
 +++ linux-2.6.22-rc-mm/net/sched/sch_generic.c2007-05-25 
 15:10:02.0 -0700
 @@ -224,7 +224,8 @@
   if (dev->tx_timeout) {
   if (dev->watchdog_timeo <= 0)
   dev->watchdog_timeo = 5*HZ;
 - if (!mod_timer(&dev->watchdog_timer, jiffies + dev->watchdog_timeo))
 + if (!mod_timer(&dev->watchdog_timer,
 +round_jiffies(jiffies + dev->watchdog_timeo)))
   dev_hold(dev);
   }
  }

Please cc netdev on net patches.

Again, I worry that if people set the watchdog timeout to, say, 0.1 seconds
then they will get one second, which is grossly different.

And if they were to set it to 1.5 seconds, they'd get 2.0 which is pretty
significant, too.
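
A rough userspace model of the concern above, assuming the rounding always
goes up to the next whole second and that the timer is armed exactly on a
second boundary.  HZ_MODEL, round_up_to_second() and effective_timeout() are
made-up names for illustration; this is not the kernel's round_jiffies(),
whose exact rounding rules may differ:

```c
/* Hypothetical tick rate for the model: 1000 ticks per second. */
#define HZ_MODEL 1000UL

/*
 * Simplified model only: round an absolute expiry time (in ticks) up to
 * the next whole-second boundary.  Values already on a boundary are
 * returned unchanged.
 */
unsigned long round_up_to_second(unsigned long expires)
{
	unsigned long rem = expires % HZ_MODEL;

	return rem ? expires - rem + HZ_MODEL : expires;
}

/* Effective timeout in ticks when the timer is armed at time 'now'. */
unsigned long effective_timeout(unsigned long now, unsigned long timeo)
{
	return round_up_to_second(now + timeo) - now;
}
```

Under those assumptions a requested 100 ticks (0.1 seconds) becomes 1000 and
a requested 1500 ticks (1.5 seconds) becomes 2000, while the 5*HZ default is
left alone.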


e100 resume failure

2007-05-30 Thread Andrew Morton

I was doing some suspend-to-ram testing on the Vaio with the 2.6.22-rc3-mm1
lineup.  After 10 or 15 cycles a resume failed:

[  357.119436] Suspending device full
[  357.120450] Suspending device zero
[  358.084978] Suspending device port
[  358.085664] Suspending device null
[  358.086432] Suspending device kmem
[  358.087200] Suspending device mem
[  358.087975] Suspending device 00:09
[  358.088764] Suspending device 00:08
[  358.089546] Suspending device 00:07
[  358.090343] Suspending device 00:06
[  358.091125] Suspending device 00:05
[  358.091912] Suspending device 00:04
[  358.092702] Suspending device 00:03
[  358.093486] Suspending device 00:02
[  358.094272] Suspending device 00:01
[  358.095074] Suspending device 00:00
[  358.095863] Suspending device pnp0
[  358.096672] Suspending device :06
[  358.097482] Suspending device :07
[  358.098275] Suspending device :06:0b.0
[  358.101644] Suspending device :06:08.0
[18014750.543703] ATA: abnormal status 0x7F on port 0x000118af
[18014750.555105] e100: eth0: e100_exec_cb_wait: ucode load failed

and the machine hung.


Re: [PATCH] bugfix GFP_KERNEL -> GFP_ATOMIC in spin_locked region

2007-06-04 Thread Andrew Morton
On Mon, 04 Jun 2007 18:25:28 +0200 Yoann Padioleau [EMAIL PROTECTED] wrote:

 
 In a few files a function such as usb_submit_urb is taking GFP_KERNEL
 as an argument whereas this function call is inside a
 spin_lock_irqsave region of code. Documentation says that it must be
 GFP_ATOMIC instead.
 
 Me and my colleagues have made a tool targeting program
 transformations in device drivers. We have designed a scripting
 language for specifying easily program transformations and a
 transformation engine for performing them. In the spirit of Linux
 development practice, the language is based on the patch syntax. We
 call our scripts semantic patches because as opposed to traditional
 patches, our semantic patches do not work at the line level but on a
 high level representation of the C program.
 
 FYI here is our semantic patch detecting invalid use of GFP_KERNEL and
 fixing the problem:
 
 @@
 identifier fn;
 @@
 
  spin_lock_irqsave(...)
  ... when != spin_unlock_irqrestore(...)
  fn(...,
 - GFP_KERNEL
 + GFP_ATOMIC
,...
)

I think I read that paper.

 And now the real patch resulting from the automated transformation:
 
  net/wan/lmc/lmc_main.c|2 +-
  scsi/megaraid/megaraid_mm.c   |2 +-
  usb/serial/io_ti.c|2 +-
  usb/serial/ti_usb_3410_5052.c |2 +-
  usb/serial/whiteheat.c|6 +++---
  5 files changed, 7 insertions(+), 7 deletions(-)

This patch covers three areas of maintainer responsibility, so poor old me
has to split it up and redo the changelogs.  Oh well.

 
 diff --git a/drivers/net/wan/lmc/lmc_main.c b/drivers/net/wan/lmc/lmc_main.c
 index ae132c1..750b3ef 100644
 --- a/drivers/net/wan/lmc/lmc_main.c
 +++ b/drivers/net/wan/lmc/lmc_main.c
 @@ -483,7 +483,7 @@ #endif /* end ifdef _DBG_EVENTLOG */
  break;
  }
  
 -data = kmalloc(xc.len, GFP_KERNEL);
 +data = kmalloc(xc.len, GFP_ATOMIC);
  if(data == 0x0){
  printk(KERN_WARNING "%s: Failed to allocate memory for copy\n", dev->name);
  ret = -ENOMEM;

A few lines later we do:

if(copy_from_user(data, xc.data, xc.len))

which also is illegal under spinlock.


Frankly, I think that a better use of this tool would be to just report on
the errors, let humans fix them up.

Nobody maintains this ATM code afaik.

 index e075a52..edee220 100644
 --- a/drivers/scsi/megaraid/megaraid_mm.c
 +++ b/drivers/scsi/megaraid/megaraid_mm.c
 @@ -547,7 +547,7 @@ mraid_mm_attach_buf(mraid_mmadp_t *adp, 
  
   kioc->pool_index= right_pool;
   kioc->free_buf  = 1;
 - kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_KERNEL,
 + kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_ATOMIC,
   &kioc->buf_paddr);
   spin_unlock_irqrestore(&pool->lock, flags);

Again, a better fix would probably be to move the pci_pool_alloc() to
before the spin_lock_irqsave(), so we can continue to use the stronger
GFP_KERNEL.
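
A userspace analogy of that suggestion, with a pthread mutex standing in for
the spinlock and malloc() standing in for pci_pool_alloc().  The struct and
function names are invented for illustration and none of this is megaraid
code; it only shows the ordering (allocate while blocking is still allowed,
then take the lock just to publish the result):

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical pool descriptor, loosely modelled on the idea of a pool. */
struct demo_pool {
	pthread_mutex_t lock;
	int in_use;
	void *buf;
};

/*
 * Allocate first, outside the lock, then take the lock only to hand the
 * buffer to the pool.  Returns 0 on success, -1 if the allocation failed
 * or the pool was already in use.
 */
int demo_attach_buf(struct demo_pool *pool, size_t len)
{
	void *buf = malloc(len);	/* may block: done before locking */
	int ret = -1;

	if (!buf)
		return -1;

	pthread_mutex_lock(&pool->lock);
	if (!pool->in_use) {
		pool->buf = buf;
		pool->in_use = 1;
		buf = NULL;		/* ownership moved to the pool */
		ret = 0;
	}
	pthread_mutex_unlock(&pool->lock);

	free(buf);	/* no-op on success, releases the spare buffer otherwise */
	return ret;
}
```

The cost of this ordering is an allocation that is sometimes thrown away when
the pool turns out to be busy.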

But the locking in there looks basically nonsensical or wrong anyway.  It
appears that local variable `right_pool' cannot be validly used unless
we're holding pool->lock for the whole duration.

Somebody does maintain the megaraid driver, but I'm not sure who, and 
the MAINTAINERS file isn't very useful.  So I'll spray it around a bit.
We definitely have bugs in there.

 diff --git a/drivers/usb/serial/io_ti.c b/drivers/usb/serial/io_ti.c
 index 544098d..9ec38e3 100644
 --- a/drivers/usb/serial/io_ti.c
 +++ b/drivers/usb/serial/io_ti.c
 @@ -2351,7 +2351,7 @@ static int restart_read(struct edgeport_
   urb->complete = edge_bulk_in_callback;
   urb->context = edge_port;
   urb->dev = edge_port->port->serial->dev;
 - status = usb_submit_urb(urb, GFP_KERNEL);
 + status = usb_submit_urb(urb, GFP_ATOMIC);
   }
   edge_port->ep_read_urb_state = EDGE_READ_URB_RUNNING;
   edge_port->shadow_mcr |= MCR_RTS;
 diff --git a/drivers/usb/serial/ti_usb_3410_5052.c b/drivers/usb/serial/ti_usb_3410_5052.c
 index 4203e2b..10dc36f 100644
 --- a/drivers/usb/serial/ti_usb_3410_5052.c
 +++ b/drivers/usb/serial/ti_usb_3410_5052.c
 @@ -1559,7 +1559,7 @@ static int ti_restart_read(struct ti_por
   urb->complete = ti_bulk_in_callback;
   urb->context = tport;
   urb->dev = tport->tp_port->serial->dev;
 - status = usb_submit_urb(urb, GFP_KERNEL);
 + status = usb_submit_urb(urb, GFP_ATOMIC);
   }
   tport->tp_read_urb_state = TI_READ_URB_RUNNING;
  
 diff --git a/drivers/usb/serial/whiteheat.c b/drivers/usb/serial/whiteheat.c
 index 27c5f8f..1b01207 100644
 --- a/drivers/usb/serial/whiteheat.c
 +++ b/drivers/usb/serial/whiteheat.c
 @@ -1116,7 +1116,7 @@ static int firm_send_command (struct usb
   memcpy (&transfer_buffer[1], data, datasize);
   

Re: [PATCH] bugfix GFP_KERNEL -> GFP_ATOMIC in spin_locked region

2007-06-04 Thread Andrew Morton
On Mon, 4 Jun 2007 21:00:18 -0700 Andrew Morton [EMAIL PROTECTED] wrote:

  diff --git a/drivers/usb/serial/io_ti.c b/drivers/usb/serial/io_ti.c
  index 544098d..9ec38e3 100644
  --- a/drivers/usb/serial/io_ti.c
  +++ b/drivers/usb/serial/io_ti.c
  @@ -2351,7 +2351,7 @@ static int restart_read(struct edgeport_
  urb->complete = edge_bulk_in_callback;
  urb->context = edge_port;
  urb->dev = edge_port->port->serial->dev;
  -   status = usb_submit_urb(urb, GFP_KERNEL);
  +   status = usb_submit_urb(urb, GFP_ATOMIC);
  }
  edge_port->ep_read_urb_state = EDGE_READ_URB_RUNNING;
  edge_port->shadow_mcr |= MCR_RTS;
  diff --git a/drivers/usb/serial/ti_usb_3410_5052.c b/drivers/usb/serial/ti_usb_3410_5052.c
  index 4203e2b..10dc36f 100644
  --- a/drivers/usb/serial/ti_usb_3410_5052.c
  +++ b/drivers/usb/serial/ti_usb_3410_5052.c
  @@ -1559,7 +1559,7 @@ static int ti_restart_read(struct ti_por
  urb->complete = ti_bulk_in_callback;
  urb->context = tport;
  urb->dev = tport->tp_port->serial->dev;
  -   status = usb_submit_urb(urb, GFP_KERNEL);
  +   status = usb_submit_urb(urb, GFP_ATOMIC);
  }
  tport->tp_read_urb_state = TI_READ_URB_RUNNING;
   
  diff --git a/drivers/usb/serial/whiteheat.c b/drivers/usb/serial/whiteheat.c
  index 27c5f8f..1b01207 100644
  --- a/drivers/usb/serial/whiteheat.c
  +++ b/drivers/usb/serial/whiteheat.c
  @@ -1116,7 +1116,7 @@ static int firm_send_command (struct usb
  memcpy (&transfer_buffer[1], data, datasize);
  command_port->write_urb->transfer_buffer_length = datasize + 1;
  command_port->write_urb->dev = port->serial->dev;
  -   retval = usb_submit_urb (command_port->write_urb, GFP_KERNEL);
  +   retval = usb_submit_urb (command_port->write_urb, GFP_ATOMIC);
  if (retval) {
  dbg("%s - submit urb failed", __FUNCTION__);
  goto exit;
  @@ -1316,7 +1316,7 @@ static int start_command_port(struct usb
  usb_clear_halt(serial->dev, command_port->read_urb->pipe);
   
  command_port->read_urb->dev = serial->dev;
  -   retval = usb_submit_urb(command_port->read_urb, GFP_KERNEL);
  +   retval = usb_submit_urb(command_port->read_urb, GFP_ATOMIC);
  if (retval) {
  err("%s - failed submitting read urb, error %d", __FUNCTION__, retval);
  goto exit;
  @@ -1363,7 +1363,7 @@ static int start_port_read(struct usb_se
  wrap = list_entry(tmp, struct whiteheat_urb_wrap, list);
  urb = wrap->urb;
  urb->dev = port->serial->dev;
  -   retval = usb_submit_urb(urb, GFP_KERNEL);
  +   retval = usb_submit_urb(urb, GFP_ATOMIC);
  if (retval) {
  list_add(tmp, &info->rx_urbs_free);
  list_for_each_safe(tmp, tmp2, &info->rx_urbs_submitted) {
 
 This part might make sense so I'll queue it for the USB guys to look at.
 
 

Everything in USB appears to already be fixed, apart from the io_ti.c bug.




warnings in git-wireless

2007-06-05 Thread Andrew Morton

i386 allmodconfig isn't that hard, guys.

drivers/net/wireless/mac80211/zd1211rw/zd_mac.c:600: warning: 'fill_rt_header' defined but not used
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c: In function 'iwl_hw_tx_queue_free_tfd':
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:964: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c: In function 'iwl_hw_tx_queue_attach_buffer_to_tfd':
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2041: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2041: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2047: warning: left shift count >= width of type
drivers/net/wireless/mac80211/iwlwifi/iwl-4965.c:2050: warning: left shift count >= width of type

With some trepidation I looked in just that header.


 #define iwl_get_bits(src, pos, len)   \
 ({\
   u32 __tmp = le32_to_cpu(src); \
   __tmp >>= pos;\
   __tmp &= (1UL << len) - 1;\
   __tmp;\
 })

Can be an inlined C function.  Should be commented.

 #define iwl_set_bits(dst, pos, len, val) \
 ({   \
   u32 __tmp = le32_to_cpu(*dst);   \
 __tmp &= ~((1UL << (pos+len)) - (1 << pos)); \
   __tmp |= (val & ((1UL << len) - 1)) << pos;  \
 *dst = cpu_to_le32(__tmp);   \
 })

Ditto.  Whitespace broken.
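
One possible shape for the two helpers above as commented inline functions.
This is only a sketch: the demo_* names are invented, the byte-order
conversion (le32_to_cpu()/cpu_to_le32()) is left out so that it builds in
userspace, and callers are assumed to pass pos below 32.  The mask helper
also avoids shifting by the full width of the type, which C leaves undefined:

```c
#include <stdint.h>

/* Mask with the low 'len' bits set; len >= 32 yields all ones. */
static inline uint32_t demo_low_mask(unsigned int len)
{
	return len >= 32 ? UINT32_MAX : ((uint32_t)1 << len) - 1;
}

/* Extract 'len' bits starting at bit 'pos' from a host-order word. */
static inline uint32_t demo_get_bits(uint32_t src, unsigned int pos,
				     unsigned int len)
{
	return (src >> pos) & demo_low_mask(len);
}

/* Replace 'len' bits starting at bit 'pos' of *dst with 'val'. */
static inline void demo_set_bits(uint32_t *dst, unsigned int pos,
				 unsigned int len, uint32_t val)
{
	uint32_t mask = demo_low_mask(len) << pos;

	*dst = (*dst & ~mask) | ((val << pos) & mask);
}
```

Unlike the macros, these evaluate each argument once and give the compiler
real types to check.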

 #define _IWL_SET_BITS(s, d, o, l, v) \
 iwl_set_bits(s.d, o, l, v)
 
 #define IWL_SET_BITS(s, sym, v) \
 _IWL_SET_BITS((s), IWL_ ## sym ## _SYM, IWL_ ## sym ## _POS, IWL_ ## sym ## _LEN, (v))
 
 #define _IWL_GET_BITS(s, v, o, l) \
 iwl_get_bits(s.v, o, l)
 
 #define IWL_GET_BITS(s, sym) \
 _IWL_GET_BITS((s), IWL_ ## sym ## _SYM, IWL_ ## sym ## _POS, IWL_ ## sym ## _LEN)

Shudder.

 /*
  * make C=2 CF=-Wall will complain if you use ARRAY_SIZE on global data
  */
 #define GLOBAL_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

This is identical to ARRAY_SIZE.

And if there's some problem with ARRAY_SIZE then fix ARRAY_SIZE!  Don't go 
off and create some private thing and leave everyone else twisting in the
wind.

 /* Debug and printf string expansion helpers for printing bitfields */
 #define BIT_FMT8 "%c%c%c%c-%c%c%c%c"
 #define BIT_FMT16 BIT_FMT8 ":" BIT_FMT8
 #define BIT_FMT32 BIT_FMT16 " " BIT_FMT16
 
 #define BITC(x,y) (((x>>y)&1)?'1':'0')
 #define BIT_ARG8(x) \
 BITC(x,7),BITC(x,6),BITC(x,5),BITC(x,4),\
 BITC(x,3),BITC(x,2),BITC(x,1),BITC(x,0)
 
 #define BIT_ARG16(x) \
 BITC(x,15),BITC(x,14),BITC(x,13),BITC(x,12),\
 BITC(x,11),BITC(x,10),BITC(x,9),BITC(x,8),\
 BIT_ARG8(x)
 
 #define BIT_ARG32(x) \
 BITC(x,31),BITC(x,30),BITC(x,29),BITC(x,28),\
 BITC(x,27),BITC(x,26),BITC(x,25),BITC(x,24),\
 BITC(x,23),BITC(x,22),BITC(x,21),BITC(x,20),\
 BITC(x,19),BITC(x,18),BITC(x,17),BITC(x,16),\
 BIT_ARG16(x)

None of the above is appropriate to a driver-private header.

 #define KELVIN_TO_CELSIUS(x) ((x)-273)

Nor is that.

 #define IEEE80211_CHAN_W_RADAR_DETECT 0x0010
 
 static inline struct ieee80211_conf *ieee80211_get_hw_conf(struct ieee80211_hw
  *hw)
 {
   return &hw->conf;
 }
 
 static inline const struct ieee80211_hw_mode *iwl_get_hw_mode(struct iwl_priv
 *priv, int mode)
 {
   int i;
 
   for (i = 0; i < 3; i++)
   if (priv->modes[i].mode == mode)
   return &priv->modes[i];
 
   return NULL;
 }

Far too large to inline, has five callsites.

 #define WLAN_FC_GET_TYPE(fc)    (((fc) & IEEE80211_FCTL_FTYPE))
 #define WLAN_FC_GET_STYPE(fc)   (((fc) & IEEE80211_FCTL_STYPE))
 #define WLAN_GET_SEQ_FRAG(seq)  ((seq) & 0x000f)
 #define WLAN_GET_SEQ_SEQ(seq)   ((seq) >> 4)

These don't need to be macros

 #define QOS_CONTROL_LEN 2
 
 static inline u16 *ieee80211_get_qos_ctrl(struct ieee80211_hdr *hdr)
 {
   int hdr_len = ieee80211_get_hdrlen(hdr->frame_control);
   if (hdr->frame_control & IEEE80211_STYPE_QOS_DATA)
   return (u16 *) ((u8 *) hdr + (hdr_len) - QOS_CONTROL_LEN);
   return NULL;
 }

Two callsites, too large to inline.

 #define IEEE80211_STYPE_BACK_REQ  0x0080
 #define IEEE80211_STYPE_BACK  0x0090
 
 #define ieee80211_is_back_request(fc) \
   ((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_CTL) && \
   (WLAN_FC_GET_STYPE(fc) == IEEE80211_STYPE_BACK_REQ))
 
 #define ieee80211_is_probe_response(fc) \
((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_MGMT) && \
 ( WLAN_FC_GET_STYPE(fc) == IEEE80211_STYPE_PROBE_RESP ))
 
 #define ieee80211_is_probe_request(fc) \
((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_MGMT) && \
 ( WLAN_FC_GET_STYPE(fc) ==IEEE80211_STYPE_PROBE_REQ ))
 
 #define ieee80211_is_beacon(fc) \
((WLAN_FC_GET_TYPE(fc) == IEEE80211_FTYPE_MGMT) && \
 ( 

Re: warnings in git-wireless

2007-06-05 Thread Andrew Morton
On Tue, 05 Jun 2007 13:12:03 -0700
James Ketrenos [EMAIL PROTECTED] wrote:

 John W. Linville wrote:
  On Tue, Jun 05, 2007 at 02:06:14AM -0700, Andrew Morton wrote:
  
  Please, don't anybody dare think about thinking about letting this anywhere
  near mainline until it has had a thorough review.  Or at least, a little 
  bit
  of review.
  
  Don't worry -- I assure you that everyone is aware of the issues.
  
  John
 
 Yes, we certainly don't want a driver to be near mainline that does things 
 that the rest of the kernel and other drivers are doing.  We should force 
 them to stay out-of-tree until any and everything is resolved.  Heaven forbid 
 that the code should be merged, contributed, and improved upon as a community.

That isn't the only decision criterion.

Overall the c files look reasonable to me: a few little things like large
on-stack arrays built at runtime which I think could be assembled at
compile-time, various unneeded casts, a bit of space-vs-tab confusion, but
nothing serious leaps out.

So perhaps that header file was unrepresentative.  It is seriously
duplicative and bloaty though.



This:

akpm:/usr/src/25 perl scripts/checkpatch.pl patches/git-wireless.patch | wc -l
9941

should be an endless source of fun.



Re: 2.6.22-rc3-mm1 - pppd hanging in netdev_run_todo while holding mutex

2007-06-06 Thread Andrew Morton
On Mon, 04 Jun 2007 14:00:56 -0400 [EMAIL PROTECTED] wrote:

 On Wed, 30 May 2007 23:58:23 PDT, Andrew Morton said:
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc3/2.6.22-rc3-mm1/
 
 Under 22-rc2-mm1, if my VPN connection got reset, ppp0 just quietly went away.
 
 Under 22-rc3-mm1, it seems to end up wedged and waiting for references to
 go away:
 
 Jun  4 09:23:01 turing-police kernel: [90089.270707] unregister_netdevice: 
 waiting for ppp0 to become free. Usage count = 8
 Jun  4 09:23:11 turing-police kernel: [90099.396121] unregister_netdevice: 
 waiting for ppp0 to become free. Usage count = 8
 Jun  4 09:23:21 turing-police kernel: [90109.520574] unregister_netdevice: 
 waiting for ppp0 to become free. Usage count = 8
 Jun  4 09:23:32 turing-police kernel: [90119.653129] unregister_netdevice: 
 waiting for ppp0 to become free. Usage count = 8

Interesting refcount.

 'echo t > /proc/sysrq_trigger' shows pppd hung up here:
 
 Jun  4 10:52:57 turing-police kernel: [95478.047892] pppd  D 
 000105ad3830  4968  3815  1 (NOTLB)
 Jun  4 10:52:57 turing-police kernel: [95478.047902]  810008d5fd78 
 0086  81000349
 Jun  4 10:52:57 turing-police kernel: [95478.047911]  810008d5fd28 
 810008d4a040 810003461820 810008d4a2b0
 Jun  4 10:52:57 turing-police kernel: [95478.047920]  000105ad3733 
 0202 00ff 80239795
 Jun  4 10:52:57 turing-police kernel: [95478.047928] Call Trace:
 Jun  4 10:52:57 turing-police kernel: [95478.047936]  [805207a2] 
 schedule_timeout+0x8d/0xb4
 Jun  4 10:52:57 turing-police kernel: [95478.047945]  [805207e2] 
 schedule_timeout_uninterruptible+0x19/0x1b
 Jun  4 10:52:57 turing-police kernel: [95478.047954]  [802397bb] 
 msleep+0x14/0x1e
 Jun  4 10:52:57 turing-police kernel: [95478.047963]  [8048aa4e] 
 netdev_run_todo+0x12f/0x234 
 Jun  4 10:52:57 turing-police kernel: [95478.047972]  [8049166f] 
 rtnl_unlock+0x35/0x37
 Jun  4 10:52:57 turing-police kernel: [95478.047981]  [804894a9] 
 unregister_netdev+0x1e/0x23
 Jun  4 10:52:57 turing-police kernel: [95478.047994]  [88a5f2c2] 
 :ppp_generic:ppp_shutdown_interface+0x67/0xbb
 Jun  4 10:52:57 turing-police kernel: [95478.048018]  [88a5f5b8] 
 :ppp_generic:ppp_release+0x33/0x65
 Jun  4 10:52:57 turing-police kernel: [95478.048028]  [8028d54a] 
 __fput+0xac/0x176
 Jun  4 10:52:57 turing-police kernel: [95478.048036]  [8028d628] 
 fput+0x14/0x16
 Jun  4 10:52:57 turing-police kernel: [95478.048045]  [8028a9c6] 
 filp_close+0x66/0x71
 Jun  4 10:52:57 turing-police kernel: [95478.048054]  [8028bd54] 
 sys_close+0x98/0xd7
 Jun  4 10:52:57 turing-police kernel: [95478.048062]  [8020a03c] 
 tracesys+0xdc/0xe1
 Jun  4 10:52:57 turing-police kernel: [95478.048073]  [2b45cd2429a0]

I don't know what could have caused this, sorry.  If it's still there in next
-mm (which is still 10 compile fixes away) it'd be good if you could bisect it.
Suspects would be git-net.patch, git-netdev-all.patch and gregkh-driver-*.patch

Thanks.


Re: warnings in git-wireless

2007-06-06 Thread Andrew Morton
On Wed, 06 Jun 2007 13:51:41 -0700 James Ketrenos [EMAIL PROTECTED] wrote:

 
   * make C=2 CF=-Wall will complain if you use ARRAY_SIZE on global data
   */
  #define GLOBAL_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
  
  This is identical to ARRAY_SIZE.
 
  And if there's some problem with ARRAY_SIZE then fix ARRAY_SIZE!  Don't go 
  off and create some private thing and leave everyone else twisting in the
  wind.
 
 The code was resolving the sparse warnings.  GLOBAL_ARRAY_SIZE removes the 
 part of the ARRAY_SIZE definition that causes sparse to complain ('+ 
 __must_be_array(arr)').  I don't know enough about how sparse works to fix 
 sparse, or to know if its a problem with __must_be_array.  The code itself 
 was fine -- using an array with ARRAY_SIZE.

(These 340-column emails are rather hard to reply to)

Your GLOBAL_ARRAY_SIZE() is, afaict, identical to ARRAY_SIZE().

If ARRAY_SIZE() is spitting some sparse warning then please report it and
we'll take a look into it.

 
  /* Debug and printf string expansion helpers for printing bitfields */
  #define BIT_FMT8 "%c%c%c%c-%c%c%c%c"
  #define BIT_FMT16 BIT_FMT8 ":" BIT_FMT8
  #define BIT_FMT32 BIT_FMT16 " " BIT_FMT16
 ...
  #define BIT_ARG32(x) \
  BITC(x,31),BITC(x,30),BITC(x,29),BITC(x,28),\
  BITC(x,27),BITC(x,26),BITC(x,25),BITC(x,24),\
  BITC(x,23),BITC(x,22),BITC(x,21),BITC(x,20),\
  BITC(x,19),BITC(x,18),BITC(x,17),BITC(x,16),\
  BIT_ARG16(x)
  
  None of the above is appropriate to a driver-private header.
 
 Where would be the appropriate place?

Dunno.  include/linux/bitfield-helpers.h?

  We use it in with iwlwifi; I don't know if others need it or want to use it. 
  Do you know of other projects using something similar?

I haven't looked.

  #define KELVIN_TO_CELSIUS(x) ((x)-273)
  
  Nor is that.
 
 Where would the appropriate place be?

include/linux/temperature.h?  acpi could use it, and there are other things
we could put into temperature.h

  static inline const struct ieee80211_hw_mode *iwl_get_hw_mode(struct 
  iwl_priv
   *priv, int mode)
  {
 int i;
 
 for (i = 0; i < 3; i++)
 if (priv->modes[i].mode == mode)
 return &priv->modes[i];
 
 return NULL;
  }
  
  Far too large to inline, has five callsites.
 
 Currently CodingStyle states to use inline where you might have otherwise 
 used a macro, and then later if the function is not overly complex (citing 3 
 lines as a guideline).  Is this too long because it has a for loop in it?  
 Or a loop and a branch?

Anything more than 10-20 instructions turns out to be too large.

 Removing static inline from the functions declared in header files means they 
 need to be moved to .c files, declared as extern, and pollute the namespace.  
 In prior drivers we had been beaten up about doing that...

You were mis-beaten-up.  Choose an appropriate namespace (iwl_* sounds OK),
stick to it and you'll be fine.
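
A sketch of that arrangement, shown in one file for brevity: a prototype that
would sit in the driver-private header, and a single out-of-line definition
that would sit in one .c file.  The demo_* types and the iwl_demo_ prefix are
hypothetical stand-ins, not the driver's real structures:

```c
#include <stddef.h>

/* --- would live in the driver-private header (hypothetical layout) --- */
#define DEMO_NUM_MODES 3

struct demo_hw_mode {
	int mode;
};

struct demo_priv {
	struct demo_hw_mode modes[DEMO_NUM_MODES];
};

/* Prototype only: every caller shares one out-of-line copy of the code. */
const struct demo_hw_mode *iwl_demo_get_hw_mode(struct demo_priv *priv,
						 int mode);

/* --- would live in exactly one .c file --- */
const struct demo_hw_mode *iwl_demo_get_hw_mode(struct demo_priv *priv,
						 int mode)
{
	int i;

	/* Linear search; returns NULL when no entry matches 'mode'. */
	for (i = 0; i < DEMO_NUM_MODES; i++)
		if (priv->modes[i].mode == mode)
			return &priv->modes[i];

	return NULL;
}
```

The only symbol that becomes global is the prefixed function name, so the
namespace cost is one well-scoped identifier.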

  #define WLAN_FC_GET_TYPE(fc)    (((fc) & IEEE80211_FCTL_FTYPE))
  #define WLAN_FC_GET_STYPE(fc)   (((fc) & IEEE80211_FCTL_STYPE))
  #define WLAN_GET_SEQ_FRAG(seq)  ((seq) & 0x000f)
  #define WLAN_GET_SEQ_SEQ(seq)   ((seq) >> 4)
  
  These don't need to be macros
 
 What would you like these to be?

/*
 * comment goes here
 */
static inline suitable_return_type wlan_fc_get_type(whatever_type_fc_has fc)
{
return fc & IEEE80211_FCTL_FTYPE;
}

 These currently exist as macros in ieee80211.h and there are other examples 
 in the kernel of similar macros.  If a goal is to remove *all macros* then 
 that should be stated in CodingStyle, preferably in a way that helps 
 developers understand how they are supposed to write their code.

These things come up again and again in lkml code-review.  Could be that
CodingStyle doesn't cover everything.  Common sense and taste applies too.

  #define QOS_CONTROL_LEN 2
 
  static inline u16 *ieee80211_get_qos_ctrl(struct ieee80211_hdr *hdr)
  {
 int hdr_len = ieee80211_get_hdrlen(hdr->frame_control);
 if (hdr->frame_control & IEEE80211_STYPE_QOS_DATA)
 return (u16 *) ((u8 *) hdr + (hdr_len) - QOS_CONTROL_LEN);
 return NULL;
  }
  
  Two callsites, too large to inline.
 
 If something is used more than once then it is unsuitable for an inline?  
 Again maybe updating CodingStyle would be helpful?
 
 Change:
   Generally, inline functions are preferable to macros resembling 
 functions.
 
 To:
   Macros resembling functions and inline functions should *NOT* be used.
 
 IIRC, ieee80211_get_qos_ctrl used to be a macros, and was then changed to an 
 inline per CodingStyle.  Now we don't want inlines, and instead want pure 
 functions (and consequently polluted namespace--or do we want to add to 
 CodingStyle that all functions should be implemented in a single file and 
 marked static?)

inlines are better than macros because they are more readable, because they
have typechecking and because for some reason programmers are more likely
to 

Re: warnings in git-wireless

2007-06-06 Thread Andrew Morton
On Wed, 06 Jun 2007 15:33:46 -0700 James Ketrenos [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  On Wed, 06 Jun 2007 13:51:41 -0700 James Ketrenos [EMAIL PROTECTED] wrote:
  
   * make C=2 CF=-Wall will complain if you use ARRAY_SIZE on global data
   */
  #define GLOBAL_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
  This is identical to ARRAY_SIZE.
 
  And if there's some problem with ARRAY_SIZE then fix ARRAY_SIZE!  Don't 
  go 
  off and create some private thing and leave everyone else twisting in the
  wind.
  The code was resolving the sparse warnings.  
  
 ...
  Your GLOBAL_ARRAY_SIZE() is, afaict, identical to ARRAY_SIZE().
 
 From include/linux/kernel.h
 
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))

o crap, sorry, I was looking at one of the other definitions of ARRAY_SIZE :(


 From drivers/net/wireless/mac80211/iwlwifi/iwl-helpers.h
 
 #define GLOBAL_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 
 The '+ __must_be_array(arr)' part of ARRAY_SIZE was causing sparse to
 complain with:
 
 drivers/net/wireless/mac80211/iwlwifi/base.c:4646:22: error: cannot size expression
 ...

OK, that's a problem in sparse, I guess.

There _should_ be some #ifdeffable thing which is being passed to cpp when
we run sparse (but I'm not sure what it is).  If there is such a thing then
we could disable the __must_be_array() thing during sparse runs.

But longer-term, sparse should be taught about __must_be_array().
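
For what it's worth, sparse predefines __CHECKER__, and kernel headers
already test for it.  A minimal userspace sketch of the idea, assuming GCC or
a compatible compiler for __builtin_types_compatible_p() and __typeof__; the
DEMO_/demo_ names are invented here and are not the kernel's definitions:

```c
#include <stddef.h>

/* Evaluates to 0, or fails to compile when 'e' is non-zero. */
#define DEMO_BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)

#ifdef __CHECKER__
/* Under sparse: skip the type check so the expression can be sized. */
#define demo_must_be_array(a) 0
#else
/* An array and a pointer to its first element have different types. */
#define demo_must_be_array(a) \
	DEMO_BUILD_BUG_ON_ZERO(__builtin_types_compatible_p( \
		__typeof__(a), __typeof__(&(a)[0])))
#endif

#define DEMO_ARRAY_SIZE(arr) \
	(sizeof(arr) / sizeof((arr)[0]) + demo_must_be_array(arr))
```

With a real array the check contributes 0 and the macro yields the element
count; passing a pointer makes the compile fail instead of silently returning
a wrong size.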

 When I had run my builds, I had restricted the sparse checks to just iwlwifi 
 (vs. checking the rest of the kernel).
 
 I just ran it C=2 CF=-Wall against the rest of the kernel and do see other 
 code 
 with the same problem, eg:
 
 sound/core/memalloc.c:521:14: error: cannot size expression
 ...
 
 I had erroneously thought it was just a problem with iwlwifi...
 



Re: 2.6.22-rc4-mm2 -- ipw2200 -- SIOCSIFADDR: No buffer space available

2007-06-07 Thread Andrew Morton
On Thu, 7 Jun 2007 11:25:30 -0700
Miles Lane [EMAIL PROTECTED] wrote:

 Hi Andrew,
 
 This might be some problem with my kernel configuration.
 I added:
 CONFIG_BONDING=y
 
 # dhclient eth1
 There is already a pid file /var/run/dhclient.pid with pid 134993416
 Internet Systems Consortium DHCP Client V3.0.4
 Copyright 2004-2006 Internet Systems Consortium.
 All rights reserved.
 For info, please visit http://www.isc.org/sw/dhcp/
 SIOCSIFADDR: No buffer space available
 Listening on LPF/eth1/00:12:f0:5e:db:2f
 Sending on   LPF/eth1/00:12:f0:5e:db:2f
 Sending on   Socket/fallback
 DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 6
 DHCPOFFER from 192.168.1.1
 DHCPREQUEST on eth1 to 255.255.255.255 port 67
 DHCPACK from 192.168.1.1
 SIOCSIFADDR: No buffer space available
 SIOCSIFNETMASK: Cannot assign requested address
 SIOCSIFBRDADDR: Cannot assign requested address
 SIOCADDRT: Network is unreachable
 bound to 192.168.1.2 -- renewal in 2993 seconds.
 
 # ping www.yahoo.com
 ping: unknown host www.yahoo.com
 
 Any suggestions what to try now?  I'll go ahead and turn off the
 bonding option and see if that helps.
 

It won't be related to bonding.

It has a high probability of being very related to Herbert's changes
to inet_set_ifa().
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.22-rc4-mm2: Assigning IP address fails

2007-06-07 Thread Andrew Morton
On Thu, 7 Jun 2007 17:46:09 -0400
[EMAIL PROTECTED] (Joseph Fannin) wrote:

 On Wed, Jun 06, 2007 at 10:03:13PM -0700, Andrew Morton wrote:
 
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc4/2.6.22-rc4-mm2/
 
 I'm not able to bring an ethernet interface down and back up again
 with this if avahi-autoipd is installed on my Ubuntu boxes.  I've seen
 it on three different computers with different NIC hardware.
 
 I've worked out an easy way to reproduce it without
 avahi-autoipd.  Starting with eth0 up (address assigned by DHCP):
 
   # ifdown eth0
dhclient makes the normal noise about releasing the address 
   # ip addr add 169.254.255.67/16 brd 169.254.255.255 label eth0:avahi scope link dev eth0
   # ip addr del 169.154.255.67/16 brd 169.254.255.255 label eth0:avahi scope link dev eth0
   # ifup eth0
   SIOCSIFADDR: No buffer space available -- first sign of trouble HERE
dhclient copyright boilerplate 
   Listening on LPF/eth0/MAC addr
   Sending on   LPF/eth0/MAC addr
   Sending on   Socket/fallback
   DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 5
   DHCPOFFER from  DHCP server 
   DHCPREQUEST on eth0 to 255.255.255.255 port 67
   DHCPACK from   DHCP server 
   SIOCSIFADDR: No buffer space available
   SIOCSIFNETMASK: Cannot assign requested address
   SIOCSIFBRDADDR: Cannot assign requested address
   SIOCADDRTL No such process
   bound to  IP address  -- renewal in  seconds
   #
 
 At this point, the interface is up, but has no address assigned.
 Manually assigning one with ifconfig fails:
 
   # ifconfig eth0 netmask 255.255.255.0 172.16.0.1
   SIOCSIFNETMASK: Cannot assign requested address
   SIOCSIFADDR: No buffer space available
   #
 
 ... and a reboot is the only way I've been able to get the interface
 to work again.
 
 The last kernels I tried were 2.6.22-rc3 and *I think*
 2.6.22-rc1-mm1, neither of which had this problem.  I will test
 2.6.22-rc4 and 2.6.22-rc3-mm1 later, but I'm out of time today.
 
 I've attached my .config .

Yep, thanks - Miles has reported the same thing.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] make network DMA usable for non-tcp drivers

2007-06-08 Thread Andrew Morton
On Fri, 8 Jun 2007 10:30:53 -0400
Ed L. Cashin [EMAIL PROTECTED] wrote:

 Here is a patch against the netdev-2.6 git tree that makes the net DMA
 feature usable for drivers like the ATA over Ethernet block driver,
 which can use dma_skb_copy_datagram_iovec when receiving data from the
 network.
 
 The change was suggested on kernelnewbies.
 
   http://article.gmane.org/gmane.linux.kernel.kernelnewbies/21663
 
 Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
 ---
  drivers/dma/Kconfig |2 +-
  net/core/user_dma.c |2 ++
  2 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
 index 72be6c6..270d23e 100644
 --- a/drivers/dma/Kconfig
 +++ b/drivers/dma/Kconfig
 @@ -14,7 +14,7 @@ config DMA_ENGINE
  comment "DMA Clients"
  
  config NET_DMA
 - bool "Network: TCP receive copy offload"
 + bool "Network: receive copy offload"
   depends on DMA_ENGINE && NET
   default y
   ---help---
 diff --git a/net/core/user_dma.c b/net/core/user_dma.c
 index 0ad1cd5..69d0b15 100644
 --- a/net/core/user_dma.c
 +++ b/net/core/user_dma.c
 @@ -130,3 +130,5 @@ end:
  fault:
   return -EFAULT;
  }
 +
 +EXPORT_SYMBOL(dma_skb_copy_datagram_iovec);

We wouldn't want to merge this until code which actually uses the export is
also merged.



Re: [Bugme-new] [Bug 8635] New: EV6 version of csum_ipv6_magic causing unaligned access errors

2007-06-16 Thread Andrew Morton
On Fri, 15 Jun 2007 13:47:33 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8635
 
Summary: EV6 version of csum_ipv6_magic causing unaligned access
 errors
Product: Networking
Version: 2.5
  KernelVersion: 2.6.18  2.6.22-rc4-mm2
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: IPV6
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: Unknown (Sorry!)
 Distribution: Debian/testing (lenny)
 Hardware Environment: API Networks CS20 (2 x 833Mhz 21264B Alpha CPUs)
 Software Environment:
 Linux sky 2.6.22-rc4-mm2 #1 SMP Tue Jun 12 23:00:10 CDT 2007 alpha GNU/Linux
 
 Gnu C  4.2.1
 Gnu make   3.81
 binutils   Binutils
 util-linux 2.12r
 mount  2.12r
 module-init-tools  3.3-pre11
 e2fsprogs  1.40-WIP
 xfsprogs   2.8.18
 Linux C Library libc.2.5
 Dynamic linker (ldd)   2.5
 Procps 3.2.7
 Net-tools  1.60
 Console-tools  0.2.3
 Sh-utils   5.97
 udev   105
 Modules Loaded nfnetlink ip_tables x_tables adm9240 hwmon_vid loop
 i2c_ali1535 i2c_ali15x3 i2c_core ide_cd cdrom generic sd_mod alim15x3 
 sym53c8xx
 ide_core e100 scsi_transport_spi mii scsi_mod
 
 Problem Description:
 
 [EMAIL PROTECTED]:~$ grep 'kernel unaligned' /proc/cpuinfo 
 kernel unaligned acc: 34 (pc=fc4bbf58,va=fc000394811c)
 
 fc4bbdb0 T __copy_user
 fc4bbf50 T csum_ipv6_magic
 fc4bc020 T memchr
 
 Seems to be associated w/ arch/alpha/lib/ev6-csum_ipv6_magic.S
 
 Steps to reproduce:
 
 Use IPv6 on an Alpha EV6 or higher CPU.
 

I assume that networking passed the arch an unaligned pointer and the arch
didn't expect that.

I further assume that this is an alpha shortcoming, and that this behaviour
of networking is expected?
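
For illustration only, here is a minimal userspace sketch of an alignment-agnostic way to sum a 16-byte address, assuming plain C rather than the Alpha assembly in question. The names load_u64_unaligned() and sum_addr16() are hypothetical and not taken from the kernel. The point is just that fetching through memcpy() leaves it to the compiler to pick loads that are safe for any alignment, so a buffer that is only 4-byte aligned can be summed without an unaligned access.

```c
#include <stdint.h>
#include <string.h>

/* Load a 64-bit word from an address with arbitrary alignment. */
static uint64_t load_u64_unaligned(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Ones'-complement sum of a 16-byte address, folded down to 16 bits. */
static uint16_t sum_addr16(const void *addr)
{
	uint64_t lo = load_u64_unaligned(addr);
	uint64_t hi = load_u64_unaligned((const unsigned char *)addr + 8);
	uint64_t sum = lo + hi;

	sum += (sum < lo);			/* end-around carry */
	sum = (sum & 0xffffffff) + (sum >> 32);	/* fold 64 -> 32 bits */
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffff) + (sum >> 16);	/* fold 32 -> 16 bits */
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

The result does not depend on where the 16 bytes sit in memory, which is the property the caller in this report seems to rely on.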



Re: [Bugme-new] [Bug 8638] New: unregister_netdevice: waiting for ppp0 to become free. pppoe + multihome + htb qos?

2007-06-16 Thread Andrew Morton
On Sat, 16 Jun 2007 03:11:30 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8638
 
Summary: unregister_netdevice: waiting for ppp0 to become free.
 pppoe + multihome + htb qos?
Product: Networking
Version: 2.5
  KernelVersion: 2.6.20-1.2316.fc5
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: high
   Priority: P1
  Component: Netfilter/Iptables
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: has occurred since at least
 2.6.18-1.2200.fc5 (Sep 2005) but could have been in earlier versions as I
 wasn't then using the technology I believe triggers the bug
 Distribution: FC5
 Hardware Environment: x86 P4 UP 512MB
 Software Environment: lots of cutting-edge (but stock kernel) networking
 technology
 Problem Description:
 
 Every few months on 1 box I administer:
 kernel: unregister_netdevice: waiting for ppp0 to become free. Usage count = 1
 system gets very locked up (but often not completely, no panics) and won't
 reboot: requires onsite hard reset.  In fact, most reboot attempts will fail
 even before the bug hits as a reboot will trigger the bug.  I always reboot the
 box with reboot -f now when I'm remote.
 
 I have a dozen extremely similar boxes to this buggy one out there and they
 don't show this bug.  Unique to this box and I think relevant to the bug:
 
 1) 2 PPPoE DSL connections (multihomed, 2 IP addresses, traffic split by port,
 used to achieve higher aggregate upload bandwidth)
 2) multi-table ip route rules (ip rule add ... table 2) to achieve traffic
 splitting in #1.
 
 Other technologies combined on this box but not on any others (though others
 use them separately without the bug hitting):
 
 3) QoS, HTB qdiscs (used on non-PPPoE boxes without the bug)
 4) 2.6sec IPSEC VPN (used on many other PPPoE and non-PPPoE boxes without
 problems)
 5) PPPoE (used on many other boxes without this bug)
 
 I'm not even sure where to begin on what info to provide.  I can provide my
 config for any of the above technologies if it will help.  The box is an
 important production box and unless I can find a way to reliably make it barf
 while onsite it may be hard to test things, like turn off QoS, because all
 the technologies are essential for day to day operations.
 
 I'll attach a useful log excerpt from the last 4 times the bug hit if I can.
 
 If this is a bad bug entry, please tell me what I need to add.  It's my first
 entry on this bugzilla and I'm not sure what's required.  I'm sorry this bug
 report is on the FC5 stock kernels, but I'm not sure I can use a vanilla
 kernel instead of FC5 and not screw something up.  However, there are NO binary
 modules or any weird stuff on the box.  It's all stock FC5 rpms.
 
 This box is a production box and the only one I have with 2 PPPoE connections
 to test.  I'm nearly positive it's either a 2-PPPoE+advanced-routing problem or
 a 2-PPPoE+HTB problem.  Since I've seen no other hits on google or elsewhere
 that are exactly like this bug, I must assume it's something fairly unique to
 this box: but what combination?!
 
 I've had a Redhat bugzilla open on this since Sep 2005 with zero replies!  It
 shows more detail and my thought process over the years.
 https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169502
 
 Steps to reproduce:
 Haven't figured out a way to reliably hit this bug.  Any hints to allow easier
 testing (which must be done onsite) are welcome.
 

I have a vague feeling that we fixed this in a later kernel.  Does anyone
recall?

Thanks.


Re: [Bugme-new] [Bug 8638] New: unregister_netdevice: waiting for ppp0 to become free. pppoe + multihome + htb qos?

2007-06-18 Thread Andrew Morton
On Mon, 18 Jun 2007 10:56:06 -0400 Chuck Ebbert [EMAIL PROTECTED] wrote:

 
 Is there any way to print the addresses the notifier is calling
 to try and release net device references? I see:
 
 net/core/dev.c::netdev_wait_allrefs():
 
 while (atomic_read(&dev->refcnt) != 0) {
 if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
 rtnl_lock();
 
 /* Rebroadcast unregister notification */
 raw_notifier_call_chain(&netdev_chain,
 NETDEV_UNREGISTER, dev);
 
 but don't see any way to print the functions that get called.

Nope.  I guess we could add some print_notifier_call_chain() thing, but
then we'd need one flavour per locking scheme and it would get ridiculous.

I guess just an unlocked version would be OK - it's just a debug thing.
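
Roughly what I have in mind, as a userspace sketch rather than kernel code. The struct below is a simplified stand-in for the kernel's notifier_block, and print_notifier_call_chain() plus sample_notifier() are hypothetical names used only for this illustration. It walks the chain with no locking at all and prints each callback address, so it is only suitable as a debugging aid.

```c
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long event, void *data);
	struct notifier_block *next;
	int priority;
};

/* Placeholder callback so the example has something to register. */
static int sample_notifier(struct notifier_block *nb,
			   unsigned long event, void *data)
{
	(void)nb;
	(void)event;
	(void)data;
	return 0;
}

/*
 * Walk the chain without taking any lock and print every callback.
 * Returns the number of entries seen.
 */
static int print_notifier_call_chain(const char *name,
				     const struct notifier_block *head)
{
	const struct notifier_block *nb;
	int count = 0;

	for (nb = head; nb != NULL; nb = nb->next) {
		printf("%s[%d]: callback at %p, priority %d\n", name, count,
		       (void *)(uintptr_t)nb->notifier_call, nb->priority);
		count++;
	}
	return count;
}
```

An in-kernel version would presumably resolve the addresses to symbol names, and because it takes no lock it could race with chain updates.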


Re: [Bugme-new] [Bug 8654] New: possible connect() bug

2007-06-20 Thread Andrew Morton
 On Wed, 20 Jun 2007 03:56:28 -0700 (PDT) [EMAIL PROTECTED] wrote:
 http://bugzilla.kernel.org/show_bug.cgi?id=8654
 
Summary: possible connect() bug
Product: Networking
Version: 2.5
  KernelVersion: Linux version 2.6.21.1 ([EMAIL PROTECTED]) (gcc
 version 3.3.
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: low
   Priority: P1
  Component: Other
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 ...

 01:01.0 Ethernet controller: Intel Corp. 82547GI Gigabit Ethernet Controller
 Subsystem: Micro-Star International Co., Ltd.: Unknown device 1490
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
 Stepping- SERR- FastB2B-
 Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium TAbort-
 TAbort- MAbort- SERR- PERR-
 Latency: 0 (63750ns min), cache line size 08
 Interrupt: pin A routed to IRQ 11
 Region 0: Memory at fb10 (32-bit, non-prefetchable) [size=128K]
 Region 2: I/O ports at b000 [size=32]
 Capabilities: [dc] Power Management version 2
 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
 PME(D0+,D1-,D2-,D3hot+,D3cold+)
 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
 
 ...
 
 Minimal slackware installation, kernel downloaded from kernel.org
 
 Problem Description:
 
 The connect() syscall normally reports "no route to host" when it is called
 while the network cable is unplugged and the ethernet interface is up and
 configured. But it hangs forever and drives the processor to 100% utilization
 if one plugs the ethernet cable in while it is running. There are two cases:
 a) socket is blocking, connect() has been called and has not completed: the
 current syscall hangs
 b) socket is non-blocking, connect() returns EINPROGRESS as usual: the next
 syscall hangs
 
 Steps to reproduce:
 
 Just try to plug ethernet cable in while trying to connect()
 

That might be a device driver bug.  Please generate a kernel profile while
it is occurring (Documentation/basic_profiling.txt) and/or generate a few
sysrq-P traces.  Send them via emailed reply-to-all to this email.

I'll be travelling for the next few days, but hopefully one of the netdev
developers will be able to work with you on this, thanks.  



Re: [patch] alpha: fix alignment problem in csum_ipv6_magic()

2007-06-21 Thread Andrew Morton
 On Sun, 17 Jun 2007 01:20:20 +0400 Ivan Kokshaysky [EMAIL PROTECTED] wrote:
 Hopefully this fixes http://bugzilla.kernel.org/show_bug.cgi?id=8635
 
 The struct in6_addr passed to csum_ipv6_magic() is 4 byte aligned,
 so we can't use the regular 64-bit loads.
 Since the cost of handling of 4 byte and 1 byte aligned 64-bit data is
 roughly the same, this code can cope with any src/dst [mis]alignment.
 
 Signed-off-by: Ivan Kokshaysky [EMAIL PROTECTED]
 
 Ivan.
 
 --- 2.6.22-rc4/arch/alpha/lib/ev6-csum_ipv6_magic.S   Sun Feb  4 21:44:54 2007
 +++ linux/arch/alpha/lib/ev6-csum_ipv6_magic.SSun Jun 17 00:41:53 2007
 @@ -46,6 +46,10 @@
   * add the 3 low ushorts together, generating a uint
   * a final add of the 2 lower ushorts
   * truncating the result.
 + *
 + * Misalignment handling added by Ivan Kokshaysky [EMAIL PROTECTED]
 + * The cost is 16 instructions (~8 cycles), including two extra loads which
 + * may cause additional delay in rare cases (load-load replay traps).
   */
  
   .globl csum_ipv6_magic
 @@ -55,25 +59,45 @@
  csum_ipv6_magic:
   .prologue 0
  
 - ldq $0,0($16)   # L : Latency: 3
 + ldq_u   $0,0($16)   # L : Latency: 3
   inslh   $18,7,$4# U : 00AABBCC
 - ldq $1,8($16)   # L : Latency: 3
 + ldq_u   $1,8($16)   # L : Latency: 3
   sll $19,8,$7# U : U L U L : 0x 00aabb00
  
 + and $16,7,$6# E : src misalignment
 + ldq_u   $5,15($16)  # L : Latency: 3
   zapnot  $20,15,$20  # U : zero extend incoming csum
 - ldq $2,0($17)   # L : Latency: 3
 - sll $19,24,$19  # U : U L L U : 0x00aa bb00
 - inswl   $18,3,$18   # U : 00CCDD00
 + ldq_u   $2,0($17)   # L : U L U L : Latency: 3
 +
 + extql   $0,$6,$0# U :
 + extqh   $1,$6,$22   # U :
 + ldq_u   $3,8($17)   # L : Latency: 3
 + sll $19,24,$19  # U : U U L U : 0x00aa bb00
 +
 + cmoveq  $6,$31,$22  # E : src aligned?
 + ldq_u   $23,15($17) # L : Latency: 3
 + or  $18,$4,$18  # E : 00CCDDAABBCC
 + extql   $1,$6,$1# U : U L L U :
  
 - ldq $3,8($17)   # L : Latency: 3
 - bis $18,$4,$18  # E : 00CCDDAABBCC
 + or  $0,$22,$0   # E : 1st src word complete
 + extqh   $5,$6,$5# U :
   addl$19,$7,$19  # E : sign bitsbbaabb00
 - nop # E : U L U L
 + and $17,7,$6# E : L U L U : dst misalignment
  
 + inswl   $18,3,$18   # U : 00CCDD00
 + or  $1,$5,$1# E : 2nd src word complete
 + extql   $2,$6,$2# U :
 + extqh   $3,$6,$22   # U : U L U U :
 +
 + cmoveq  $6,$31,$22  # E : dst aligned?
 + extql   $3,$6,$3# U :
   addq$20,$0,$20  # E : begin summing the words
 + extqh   $23,$6,$23  # U : L U L U :
 +
   srl $18,16,$4   # U : 00CCDDAA
 + or  $2,$22,$2   # E : 1st dst word complete
   zap $19,0x3,$19 # U : sign bitsbbaa
 - nop # E : L U U L
 + or  $3,$23,$3   # E : U L U L : 2nd dst word complete
  
   cmpult  $20,$0,$0   # E :
   addq$20,$1,$20  # E :
 --- 2.6.22-rc4/arch/alpha/lib/csum_ipv6_magic.S   Sun Feb  4 21:44:54 2007
 +++ linux/arch/alpha/lib/csum_ipv6_magic.SSun Jun 17 00:29:28 2007
 @@ -7,6 +7,9 @@
   *__u32 len,
   *unsigned short proto,
   *unsigned int csum);
 + *
 + * Misalignment handling (which costs 16 instructions / 8 cycles) 
 + * added by Ivan Kokshaysky [EMAIL PROTECTED]
   */
  
   .globl csum_ipv6_magic
 @@ -16,37 +19,57 @@
  csum_ipv6_magic:
   .prologue 0
  
 - ldq $0,0($16)   # e0: load src  dst addr words
 + ldq_u   $0,0($16)   # e0: load src  dst addr words
   zapnot  $20,15,$20  # .. e1 : zero extend incoming csum
   extqh   $18,1,$4# e0: byte swap len  proto while we wait
 - ldq $1,8($16)   # .. e1 :
 + ldq_u   $21,7($16)  # .. e1 : handle misalignment
  
   extbl   $18,1,$5# e0:
 - ldq $2,0($17)   # .. e1 :
 + ldq_u   $1,8($16)   # .. e1 :
   extbl   $18,2,$6# e0:
 - ldq $3,8($17)   # .. e1 :
 + ldq_u   $22,15($16) # .. e1 :
  
   extbl   $18,3,$18   # e0:
 + ldq_u   $2,0($17)   # .. e1 :
   sra $4,32,$4# e0:
 + ldq_u   $23,7($17)  # .. e1 :
 +
 + extql   $0,$16,$0   # e0:
 + ldq_u   $3,8($17)   # .. e1 :
 + extqh   $21,$16,$21 # e0:
 + ldq_u   $24,15($17) # .. e1 :
 +
   sll $5,16,$5# e0:
 + or  $0,$21,$0   # .. e1 : 1st src word complete
 + extql   $1,$16,$1   # e0:
   addq
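
The ldq_u / extql / extqh / cmoveq sequence used in the patch above can be modelled in portable C. This is only a rough sketch under stated assumptions: load_u64_merged() is a hypothetical name, words[] stands for memory viewed as aligned little-endian quadwords, and byte_off plays the role of the low three address bits.

```c
#include <stdint.h>

/*
 * Build a misaligned 64-bit value from two aligned quadwords.
 * words[0] holds the low part, words[1] the high part;
 * byte_off (0..7) is the misalignment inside words[0].
 */
static uint64_t load_u64_merged(const uint64_t words[2], unsigned byte_off)
{
	unsigned shift = (byte_off & 7) * 8;
	uint64_t low = words[0] >> shift;	/* role of extql */
	uint64_t high;

	if (shift == 0)
		high = 0;			/* role of cmoveq: aligned case */
	else
		high = words[1] << (64 - shift);	/* role of extqh */

	return low | high;
}
```

The explicit zero for the aligned case matters in C as well, since shifting a 64-bit value by 64 is undefined.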

Re: [PATCH] Ethernet driver for EISA only SNI RM200/RM400 machines

2007-06-23 Thread Andrew Morton
 On Fri, 22 Jun 2007 21:53:58 +0200 [EMAIL PROTECTED] (Thomas Bogendoerfer) 
 wrote:
 Hi,
 
 This is a new ethernet driver, which uses the code taken out of lasi_82596
 (done by the other patch I just sent).
 
 Thomas.
 
 
 Ethernet driver for EISA only SNI RM200/RM400 machines
 
 ...

 +static char sni_82596_string[] = "snirm_82596";

const?

 +
 +#define DMA_ALLOC  dma_alloc_coherent
 +#define DMA_FREE   dma_free_coherent
 +#define DMA_WBACK(priv, addr, len) do { } while (0)
 +#define DMA_INV(priv, addr, len)   do { } while (0)
 +#define DMA_WBACK_INV(priv, addr, len) do { } while (0)
 +
 +#define SYSBUS  0x4400
 +
 +/* big endian CPU, 82596 little endian */
 +#define SWAP32(x)   cpu_to_le32((u32)(x))
 +#define SWAP16(x)   cpu_to_le16((u16)(x))
 +
 +#define OPT_MPU_16BIT0x01
 +
 +static inline void CA(struct net_device *dev);
 +static inline void MPU_PORT(struct net_device *dev, int c, dma_addr_t x);

These two functions' implementations could be moved to before the #include,
so we wouldn't need to forward-declare them?

 +#include "lib82596.c"

ugh.  Is this really unavoidable?

 +MODULE_AUTHOR("Thomas Bogendoerfer");
 +MODULE_DESCRIPTION("i82596 driver");
 +MODULE_LICENSE("GPL");
 +module_param(i596_debug, int, 0);
 +MODULE_PARM_DESC(i596_debug, "82596 debug mask");
 +
 +static inline void CA(struct net_device *dev)
 +{
 + struct i596_private *lp = netdev_priv(dev);
 + 
 + writel(0, lp->ca);
 +}
 +
 +
 +static inline void MPU_PORT(struct net_device *dev, int c, dma_addr_t x)
 +{
 + struct i596_private *lp = netdev_priv(dev);
 +
 + u32 v = (u32) (c) | (u32) (x);
 + 
 + if (lp-options  OPT_MPU_16BIT) {
 + writew(v  0x, lp-mpu_port);
 + wmb(); udelay(1); /* order writes to MPU port */

Nope, please put these on separate lines.  No exceptions..

 + writew(v  16, lp-mpu_port);
 + } else {
 + writel(v, lp-mpu_port);
 + wmb(); udelay(1); /* order writes to MPU port */
 + writel(v, lp-mpu_port);
 + }
 +}

Three callsites: This looks too large to inline.

I see no reason why this and CA() have upper-case names?
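
Something along these lines is what I mean for the 16-bit path, as a userspace sketch with stand-ins for the kernel primitives. fake_writew(), fake_wmb(), fake_udelay() and mpu_port_write16() are all hypothetical names for this illustration, and the 0xffff mask is an assumption about what the mangled constant in the patch was meant to be.

```c
#include <stdint.h>

/* Records the values written, in place of a real MMIO port. */
struct io_log {
	uint16_t val[4];
	int n;
};

/* Stand-in for writew(): append the value to the log. */
static void fake_writew(struct io_log *port, uint16_t v)
{
	if (port->n < 4)
		port->val[port->n++] = v;
}

/* Stand-ins for wmb() and udelay(); they do nothing in userspace. */
static void fake_wmb(void)
{
}

static void fake_udelay(unsigned int usecs)
{
	(void)usecs;
}

/* Lower-case name, out of line, one statement per line. */
static void mpu_port_write16(struct io_log *port, int cmd, uint32_t addr)
{
	uint32_t v = (uint32_t)cmd | addr;

	fake_writew(port, v & 0xffff);
	fake_wmb();		/* order the two writes to the MPU port */
	fake_udelay(1);
	fake_writew(port, v >> 16);
}
```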

 +
 +static int __devinit sni_82596_probe(struct platform_device *dev)
 +{
 + struct  net_device *netdevice;
 + struct i596_private *lp;
 + struct  resource *res, *ca, *idprom, *options;
 + int retval = -ENODEV;
 + static int init;
 + void __iomem *mpu_addr = NULL;
 + void __iomem *ca_addr = NULL;
 + u8 __iomem *eth_addr = NULL;
 + 
 + if (init == 0) {
 + printk(KERN_INFO SNI_82596_DRIVER_VERSION \n);
 + init++;
 + }

Might as well do this message in the module_init() function?  There's a
per-probed-device message later on anyway.

The patchset tries to add rather a lot of new trailing whitespace btw.

 + res = platform_get_resource(dev, IORESOURCE_MEM, 0);
 + if (!res)
 + goto probe_failed;
 + mpu_addr = ioremap_nocache(res-start, 4);
 + if (!mpu_addr) {
 + retval = -ENOMEM;
 + goto probe_failed;
 + }
 + ca = platform_get_resource(dev, IORESOURCE_MEM, 1);
 + if (!ca)
 + goto probe_failed;
 + ca_addr = ioremap_nocache(ca-start, 4);
 + if (!ca_addr) {
 + retval = -ENOMEM;
 + goto probe_failed;
 + }
 + idprom = platform_get_resource(dev, IORESOURCE_MEM, 2);
 + if (!idprom)
 + goto probe_failed;
 + eth_addr = ioremap_nocache(idprom-start, 0x10);
 + if (!eth_addr) {
 + retval = -ENOMEM;
 + goto probe_failed;
 + }
 + options = platform_get_resource(dev, 0, 0);
 + if (!options)
 + goto probe_failed;
 +
 + printk(KERN_INFO Found i82596 at 0x%x\n, res-start);
 +
 + netdevice = alloc_etherdev(sizeof(struct i596_private));
 + if (!netdevice) {
 + retval = -ENOMEM;
 + goto probe_failed;
 + }
 + SET_NETDEV_DEV(netdevice, dev-dev);
 + platform_set_drvdata (dev, netdevice);
 +
 + netdevice-base_addr = res-start;
 + netdevice-irq = platform_get_irq(dev, 0);
 + 
 + /* someone seams to like messed up stuff */
 + netdevice-dev_addr[0] = readb(eth_addr + 0x0b);
 + netdevice-dev_addr[1] = readb(eth_addr + 0x0a);
 + netdevice-dev_addr[2] = readb(eth_addr + 0x09);
 + netdevice-dev_addr[3] = readb(eth_addr + 0x08);
 + netdevice-dev_addr[4] = readb(eth_addr + 0x07);
 + netdevice-dev_addr[5] = readb(eth_addr + 0x06);
 + iounmap(eth_addr);
 + 
 + if (!netdevice-irq) {
 + printk(KERN_ERR %s: IRQ not found for i82596 at 0x%lx\n,
 + __FILE__, netdevice-base_addr);
 + goto probe_failed;
 + }
 + 
 + lp = netdev_priv(netdevice);
 + lp-options = options-flags  IORESOURCE_BITS;
 + lp-ca = ca_addr;
 + lp-mpu_port = mpu_addr;
 + 
 + retval = 

Re: [PATCH v2.6.22-rc5] cxgb2: handle possible NULL pointer dereferencing, take 2

2007-06-23 Thread Andrew Morton
 On Thu, 21 Jun 2007 18:48:30 +0530 pradeep singh [EMAIL PROTECTED] wrote:
 Hi,
 My mistake.
 Resending after reformatting the patch by hand.
 Looks like gmail messes the plain text patches.
 

That's still mangled so I typed it in again.

Please always include a full changelog with each version of a patch.

I do not know what this patch does - please provide a changelog.  In this
case it should tell us whether and how this null pointer deref is actually
occurring and if so, why.

As well as a full description of the problem which it solves, a changelog
should also describe _how_ it solved it, but that is sufficiently obvious
in this case.


Thanks.


Re: Scaling Max IP address limitation

2007-06-24 Thread Andrew Morton
On Sun, 24 Jun 2007 12:20:01 -0500 David Jones [EMAIL PROTECTED] wrote:

 Hi,
 I am trying to add multiple IP addresses ( v6 ) to my FC7 box on eth0. 
 But I am hitting a max limit of 4000 IP address . Seems like there is a 
 limiting variable in linux kernel (which one? ) that prevents from 
 adding more IP addresses than 4096. What do I need to change in Linux 
 kernel  ( and then recompile ) to be able to add more IP addresses than 
 4K addresses per system? ..

(cc netdev)


Re: [Bugme-new] [Bug 8668] New: HTB Deadlock

2007-06-24 Thread Andrew Morton
On Sun, 24 Jun 2007 21:57:19 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8668
 
Summary: HTB Deadlock
Product: Networking
Version: 2.5
  KernelVersion: 2.6.19.7
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: Other
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur:
 Distribution:
 Hardware Environment:
 Software Environment:
 Problem Description:
 Greetings,
 
 I've been experiencing problems with HTB where the whole machine locks
 up. This usually happens when the whole qdisc is being removed and
 occasionally when a leaf is being removed.
 
 Common is that it always happens when some sort of removal is in
 progress.
 
 Console output I have captured is at the end of this message. The same
 behavior exists from vanilla 2.6.19.7 and above. It is possible that the
 problem also exist in the earlier versions however I did not go further
 back.
 
 I also believe I have found where the actual problem is:
 
 qdisc_destroy() function is always called with dev->queue_lock locked.
 htb_destroy() function up the stack is using del_timer_sync() call to
 deactivate HTB qdisc timers. 

yep, I would agree with that analysis.  del_timer_sync() under a lock is
quite dangerous in this regard.

If the (misspelled) comment over htb_destroy() is true, current mainline
appears still to have this bug.


 From the comments in the source where del_timer_sync() is defined:
 
 ---copy/paste---
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated
  *
  * This function only differs from del_timer() on SMP: besides deactivating
  * the timer it also makes sure the handler has finished executing on other
  * CPUs.
  *
  * Synchronization rules: Callers must prevent restarting of the timer,
  * otherwise this function is meaningless. It must not be called from
  * interrupt contexts. The caller must not hold locks which would prevent
  * completion of the timer's handler. The timer's handler must not call
  * add_timer_on(). Upon exit the timer is not queued and the handler is
  * not running on any CPU.
  *
  * The function returns whether it has deactivated a pending timer or not.
  */
 ---copy/paste---
 
 Now, htb_rate_timer() does exactly what appears to be the source of the
 problem - it tries to obtain dev->queue_lock - and given the right moment
 (timer handler fired while qdisc_destroy was holding the lock) - system
 locks up - del_timer_sync is waiting for handler to finish while the
 handler is waiting for the dev->queue_lock.
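
The scenario described above can be modelled in a few lines of single-threaded C. This is a toy sketch, not kernel code and not a proposed fix: struct qdisc_model and every function name below are hypothetical, the lock is a plain flag, and "waiting" is a bounded polling loop so the example terminates instead of hanging.

```c
#include <stdbool.h>

struct qdisc_model {
	bool queue_lock_held;	/* stands in for dev->queue_lock */
	bool handler_done;	/* set once the timer handler has run */
};

/* One attempt by the timer handler, which needs the queue lock. */
static void timer_handler_attempt(struct qdisc_model *q)
{
	if (q->queue_lock_held)
		return;		/* a real handler would keep spinning here */
	q->handler_done = true;
}

/* Models waiting for the handler to finish, giving up after max_polls. */
static bool wait_for_handler(struct qdisc_model *q, int max_polls)
{
	int i;

	for (i = 0; i < max_polls && !q->handler_done; i++)
		timer_handler_attempt(q);
	return q->handler_done;
}

/* Waits while holding the lock: the handler can never complete. */
static bool destroy_while_locked(struct qdisc_model *q, int max_polls)
{
	bool finished;

	q->queue_lock_held = true;
	finished = wait_for_handler(q, max_polls);
	q->queue_lock_held = false;
	return finished;
}

/* Drops the lock before waiting: the handler gets to run. */
static bool destroy_after_unlock(struct qdisc_model *q, int max_polls)
{
	q->queue_lock_held = true;
	/* ... unlink the qdisc while the lock is held ... */
	q->queue_lock_held = false;
	return wait_for_handler(q, max_polls);
}
```

In this model destroy_while_locked() never sees the handler finish no matter how many polls it is given, which corresponds to the soft lockup in the trace below.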
 
 Of course I could also be completely wrong here and missing something
 not so obvious.
 
 I could also attempt to fix this but I haven't dealt with this code in
 the past so I was hoping someone with better insight might just have an
 elegant solution up his sleeve.
 
 Best regards,
 
 Ranko
 
 PS: If this is not the right place for this report - please let me
 know.
 
 ---CONSOLE (2.6.19.7)---
 BUG: soft lockup detected on CPU#3!
  [c013c890] softlockup_tick+0x93/0xc2
  [c0127585] update_process_times+0x26/0x5c
  [c0111cd5] smp_apic_timer_interrupt+0x97/0xb2
  [c0104373] apic_timer_interrupt+0x1f/0x24
  [c02e007b] klist_next+0x4/0x8a
  [c02e2570] _spin_unlock_irqrestore+0xa/0xc
  [c012729b] try_to_del_timer_sync+0x47/0x4f
  [c01272b1] del_timer_sync+0xe/0x14
  [f8b8a85b] htb_destroy+0x20/0x7b [sch_htb]
  [c028f196] qdisc_destroy+0x44/0x8d
  [f8b89645] htb_destroy_class+0xd0/0x12d [sch_htb]
  [f8b895c7] htb_destroy_class+0x52/0x12d [sch_htb]
  [f8b8a87a] htb_destroy+0x3f/0x7b [sch_htb]
  [c028f196] qdisc_destroy+0x44/0x8d
  [f8b89645] htb_destroy_class+0xd0/0x12d [sch_htb]
  [f8b895c7] htb_destroy_class+0x52/0x12d [sch_htb]
  [f8b8a87a] htb_destroy+0x3f/0x7b [sch_htb]
  [c028f196] qdisc_destroy+0x44/0x8d
  [c0290ba9] tc_get_qdisc+0x1a3/0x1ef
  [c0290a06] tc_get_qdisc+0x0/0x1ef
  [c028a366] rtnetlink_rcv_msg+0x158/0x215
  [c028a20e] rtnetlink_rcv_msg+0x0/0x215
  [c0294598] netlink_run_queue+0x88/0x11d
  [c028a1c0] rtnetlink_rcv+0x26/0x42
  [c0294b0c] netlink_data_ready+0x12/0x54
  [c0293843] netlink_sendskb+0x1c/0x33
  [c0294a11] netlink_sendmsg+0x1ee/0x2d7
  [c0278ff7] sock_sendmsg+0xe5/0x100
  [c01306b9] autoremove_wake_function+0x0/0x37
  [c01306b9] autoremove_wake_function+0x0/0x37
  [c0278ff7] sock_sendmsg+0xe5/0x100
  [c01cd8be] copy_from_user+0x33/0x69
  [c027913f] sys_sendmsg+0x12d/0x243
  [c02e2564] _read_unlock_irq+0x5/0x7
  [c013fb2b] find_get_page+0x37/0x42
  [c01423dd] filemap_nopage+0x30c/0x3a3
  [c014bb99] __handle_mm_fault+0x21c/0x943
  [c02e24c5] _spin_unlock_bh+0x5/0xd
  [c027b475] sock_setsockopt+0x63/0x59d
  [c0151801] anon_vma_prepare+0x1b/0xcb
  [c027a2ea] sys_socketcall+0x24f/0x271
  [c02e3ad0] 

Re: [PATCH v2.6.22-rc5] cxgb2: handle possible NULL pointer dereferencing, take 2

2007-06-25 Thread Andrew Morton
On Thu, 21 Jun 2007 18:48:30 +0530
pradeep singh [EMAIL PROTECTED] wrote:

 diff --git a/drivers/net/chelsio/cxgb2.c b/drivers/net/chelsio/cxgb2.c
 index 231ce43..006c634 100644
 --- a/drivers/net/chelsio/cxgb2.c
 +++ b/drivers/net/chelsio/cxgb2.c
 @@ -1022,6 +1022,11 @@ static int __devinit init_one(struct pci_dev *pdev,
mmio_start = pci_resource_start(pdev, 0);
mmio_len = pci_resource_len(pdev, 0);
       bi = t1_get_board_info(ent->driver_data);
 +
 +   if (!bi) {
 +CH_ERR("%s: Board info array index out of range\n", pci_name(pdev));
 +goto out_disable_pdev;
 +}
 
       for (i = 0; i < bi->port_number; ++i) {
struct net_device *netdev;

The chelsio driver is assuming that pci_device_id.driver_data has been
initialised to the board index, but I am unable to locate anywhere where
that initialisation actually happens.  Is this a bug?

(Who maintains this driver now?)


Re: [NET] au1000_eth: Fix warnings.

2007-06-25 Thread Andrew Morton
On Sun, 24 Jun 2007 15:59:54 +0200
Ralf Baechle [EMAIL PROTECTED] wrote:

 Fixed by including linux/dma-mapping.h:
 
   CC  drivers/net/au1000_eth.o
 drivers/net/au1000_eth.c: In function 'au1000_probe':
 drivers/net/au1000_eth.c:661: warning: implicit declaration of function 'dma_alloc_noncoherent'
 drivers/net/au1000_eth.c:802: warning: implicit declaration of function 'dma_free_noncoherent'
 
 Signed-off-by: Ralf Baechle [EMAIL PROTECTED]
 
 diff --git a/drivers/net/au1000_eth.c b/drivers/net/au1000_eth.c
 index c39ab80..c27cfce 100644
 --- a/drivers/net/au1000_eth.c
 +++ b/drivers/net/au1000_eth.c
 @@ -34,7 +34,7 @@
   *
   *
   */
 -
 +#include <linux/dma-mapping.h>
  #include <linux/module.h>
  #include <linux/kernel.h>
  #include <linux/string.h>

That's more than a warning fix.  On most platforms, dma_alloc_noncoherent()
is a #define so the driver just won't link there.

<looks>

But the driver is mips-only, and MIPS uses a regular C function for
dma_alloc_noncoherent(), so you got lucky.

Still, I'd say this is for-2.6.22.


Re: [PATCH v2.6.22-rc5] cxgb2: handle possible NULL pointer dereferencing, take 2

2007-06-25 Thread Andrew Morton
On Mon, 25 Jun 2007 19:14:05 -0400
Jeff Garzik [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  The chelsio driver is assuming that pci_device_id.driver_data has been
  initialised to the board index, but I am unable to locate anywhere where
  that initialisation actually happens.
 
 It's hidden inside the CH_DEVICE() initializer-helper macro.
 

oic.

Does this driver still have a maintainer, or is it now a community
driver (giggle) ?
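
For anyone else who went looking for the initialisation: the pattern is roughly the following, shown here as a standalone sketch. The struct layouts, the EXAMPLE_DEVICE() macro, the ID values and get_board_info() are all hypothetical stand-ins, not the driver's actual definitions. The macro quietly fills in driver_data, and the lookup returns NULL for an index outside the table, which is the case the patch checks for.

```c
#include <stddef.h>

struct board_info {
	const char *desc;
	int port_number;
};

struct device_id {
	unsigned int vendor;
	unsigned int device;
	unsigned int subdevice;
	unsigned long driver_data;	/* index into boards[] */
};

/* The last initializer is the board index; easy to miss when grepping. */
#define EXAMPLE_DEVICE(devid, ssid, idx) { 0x1234, (devid), (ssid), (idx) }

static const struct board_info boards[] = {
	{ "example board A", 1 },
	{ "example board B", 2 },
};

static const struct device_id id_table[] = {
	EXAMPLE_DEVICE(0x7, 0x1, 0),
	EXAMPLE_DEVICE(0x8, 0x1, 1),
};

/* Returns NULL when the index does not name a known board. */
static const struct board_info *get_board_info(unsigned long idx)
{
	if (idx >= sizeof(boards) / sizeof(boards[0]))
		return NULL;
	return &boards[idx];
}
```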


Re: 2.6.22: ERROR: __ucmpdi2 [drivers/net/s2io.ko] undefined!

2007-06-26 Thread Andrew Morton
On Thu, 21 Jun 2007 05:55:13 -0400 Sivakumar Subramani [EMAIL PROTECTED] 
wrote:

 -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  On Behalf Of Olaf Hering
  Sent: Wednesday, June 20, 2007 2:11 AM
  To: Stephen Hemminger
  Cc: [EMAIL PROTECTED]; netdev@vger.kernel.org
  Subject: Re: 2.6.22: ERROR: __ucmpdi2 [drivers/net/s2io.ko] undefined!
  
  On Tue, Jun 19, Stephen Hemminger wrote:
  
   On Tue, 19 Jun 2007 21:02:53 +0200
   Olaf Hering [EMAIL PROTECTED] wrote:
   

What happened to __ucmpdi2 from David Woodhouse?
google has a few hits about stuff like this on 32bit powerpc with
  gcc 4.1.2:

ERROR: __ucmpdi2 [drivers/net/s2io.ko] undefined!

using the drivers/net/s2io* files from 2.6.21 with 2.6.22-rc5 fixes 
the compile.

25805dcf9d83098cf5492117ad2669cd14cc9b24 adds two u64 >>= 48 
followed by a switch statement (line 2889 and 6816).
   
   Probably the switch(err) { needs a cast to a smaller type (like u8).
  
  This change removes the calls to __ucmpdi2.

(fixes quoting, fixes top-posting.  Please don't top-post).

 Hi,
 
 We will include this fix in next set of patch submission. Thanks for the
 fix.

 ---
   drivers/net/s2io.c |   16 +---
   1 file changed, 9 insertions(+), 7 deletions(-)
  
  --- a/drivers/net/s2io.c
  +++ b/drivers/net/s2io.c
  @@ -2868,6 +2868,7 @@ static void tx_intr_handler(struct fifo_
  struct tx_curr_get_info get_info, put_info;
  struct sk_buff *skb;
  struct TxD *txdlp;
  +   u8 err_mask;
   
  get_info = fifo_data-tx_curr_get_info;
  memcpy(put_info, fifo_data-tx_curr_put_info,
  sizeof(put_info)); @@ -2886,8 +2887,8 @@ static void
  tx_intr_handler(struct fifo_
  }
   
  /* update t_code statistics */
  -   err >>= 48;
  -   switch(err) {
  +   err_mask = err >> 48;
  +   switch(err_mask) {
  case 2:
   
  nic-mac_control.stats_info-sw_stat.
   
  tx_buf_abort_cnt++;
  @@ -6805,6 +6806,7 @@ static int rx_osm_handler(struct ring_in
  u16 l3_csum, l4_csum;
  unsigned long long err = rxdp-Control_1  RXD_T_CODE;
  struct lro *lro;
  +   u8 err_mask;
   
  skb-dev = dev;
   
  @@ -6813,8 +6815,8 @@ static int rx_osm_handler(struct ring_in
  if (err  0x1) {
   
  sp-mac_control.stats_info-sw_stat.parity_err_cnt++;
  }
  -   err >>= 48;
  -   switch(err) {
  +   err_mask = err >> 48;
  +   switch(err_mask) {
  case 1:
  sp-mac_control.stats_info-sw_stat.
  rx_parity_err_cnt++;
  @@ -6867,9 +6869,9 @@ static int rx_osm_handler(struct ring_in
  * Note that in this case, since checksum will be
  incorrect,
  * stack will validate the same.
  */
  -   if (err != 0x5) {
  -   DBG_PRINT(ERR_DBG, %s: Rx error Value:
  0x%llx\n,
  -   dev-name, err);
  +   if (err_mask != 0x5) {
  +   DBG_PRINT(ERR_DBG, %s: Rx error Value: 0x%x\n,
  +   dev-name, err_mask);
  sp-stats.rx_crc_errors++;
  sp-mac_control.stats_info-sw_stat.mem_freed 
  += skb-truesize;
 

This fix is still not present in anyone's tree and is required for
2.6.22.  Where are we up to with it?
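
For reference, the shape of the fix in isolation, as a userspace sketch. classify_tcode() and the enum are hypothetical, and the case values are illustrative rather than the hardware's real transfer codes. As discussed in this thread, switching on a 64-bit value is what pulled in the __ucmpdi2 helper on 32-bit powerpc; narrowing the shifted value to a u8 first lets the switch operate on a small integer.

```c
#include <stdint.h>

enum tcode_class {
	TCODE_OK,
	TCODE_PARITY,
	TCODE_ABORT,
	TCODE_OTHER,
};

/* Classify the transfer code held in the bits above bit 47. */
static enum tcode_class classify_tcode(uint64_t control)
{
	/* Narrow before the switch so the comparisons are on a u8. */
	uint8_t err_mask = (uint8_t)(control >> 48);

	switch (err_mask) {
	case 0:
		return TCODE_OK;
	case 1:
		return TCODE_PARITY;
	case 2:
		return TCODE_ABORT;
	default:
		return TCODE_OTHER;
	}
}
```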



Re: [PATCH] Re: [2.6.21.1] soft lockup when removing netconsole module

2007-06-26 Thread Andrew Morton
On Wed, 13 Jun 2007 11:25:37 +0200
Jarek Poplawski [EMAIL PROTECTED] wrote:

 On Tue, Jun 12, 2007 at 01:02:33PM +0200, Jarek Poplawski wrote:
 ...
  Of course such a problem should preferably be fixed by somebody who
  knows the code (alas I don't know netconsole), to be sure all needed
  cancels are still done after this change. I hope Jason's patch is
  right but I'm a little surprised I can't see netdev in cc (I'll try
  to fix this).
 
 So, I've had a look into netpoll and, unfortunately, I don't
 think this patch is right... 
 
   From: Jason Wessel [EMAIL PROTECTED]
   
   Do not call cancel_rearming_delayed_work() if there is no
   pending work.
   
   Signed-off-by: Jason Wessel [EMAIL PROTECTED]
   Signed-off-by: Andrew Morton [EMAIL PROTECTED]
   ---
   
net/core/netpoll.c |6 --
1 file changed, 4 insertions(+), 2 deletions(-)
   
   diff -puN net/core/netpoll.c~a net/core/netpoll.c
   --- a/net/core/netpoll.c~a
   +++ a/net/core/netpoll.c
   @@ -784,8 +784,10 @@ void netpoll_cleanup(struct netpoll *np)
 if (atomic_dec_and_test(&npinfo->refcnt)) {
 skb_queue_purge(&npinfo->arp_tx);
 skb_queue_purge(&npinfo->txq);
   - cancel_rearming_delayed_work(&npinfo->tx_work);
   - flush_scheduled_work();
   + if (delayed_work_pending(&npinfo->tx_work)) {
   + cancel_rearming_delayed_work(&npinfo->tx_work);
   + flush_scheduled_work();
   + }

 kfree(npinfo);
 }
   _
 
 There are such possibilities:
 
 1. After positive delayed_work_pending(&npinfo->tx_work) test
 some work is queued, but there is no guarantee that when running
 it'll rearm again, so cancel_rearming_delayed_work can loop again;
 
 2. After negative delayed_work_pending(&npinfo->tx_work) test
 a work is just running, eg. waiting on netif_tx_lock, while
 kfree(npinfo) is done here (oops?!).
 
 I've found an additional problem here with or without this patch:
 after deleting a timer in cancel_rearming_delayed_work() there could
 stay a last skb queued in npinfo->txq, and after kfree(npinfo)
 we have small memory leak. If I'm right here similar fix is needed
 in the current netpoll code: additional npinfo->txq purging only
 or maybe the whole cancel_rearming_ changed like this.
 
 I've tried to eliminate these problems in attached below patch
 proposal. I'm not sure it's all right: as I've written earlier I
 don't know netconsole enough, but it's probably a little better
 than above solution.
 
 I've some doubts yet (I didn't have time to check this all):
 
 1. I hope this other schedule_delayed_work() from netpoll_send_skb()
 is not possible when netpoll_cleanup() runs - if I'm wrong additional
 check of npinfo->refcnt should be done there;
 2. I also hope npinfo->refcnt before scheduling should be enough here
 - if not - another possibility is adding some locking eg.:
 netif_tx_lock before cancel for synchronization.
 
 Of course it would be very nice if somebody could test or verify
 this patch more.
 
 Regards,
 Jarek P.
 
 
 Signed-off-by: Jarek Poplawski [EMAIL PROTECTED]
 
 ---
 
 diff -Nurp 2.6.21-/net/core/netpoll.c 2.6.21/net/core/netpoll.c
 --- 2.6.21-/net/core/netpoll.c2007-04-26 15:08:32.0 +0200
 +++ 2.6.21/net/core/netpoll.c 2007-06-12 21:05:23.0 +0200
 @@ -73,7 +73,8 @@ static void queue_process(struct work_st
   netif_tx_unlock(dev);
   local_irq_restore(flags);
  
 - schedule_delayed_work(&npinfo->tx_work, HZ/10);
 + if (atomic_read(&npinfo->refcnt))
 + schedule_delayed_work(&npinfo->tx_work, HZ/10);
   return;
   }
   netif_tx_unlock(dev);
 @@ -780,9 +781,15 @@ void netpoll_cleanup(struct netpoll *np)
   if (atomic_dec_and_test(&npinfo->refcnt)) {
   skb_queue_purge(&npinfo->arp_tx);
   skb_queue_purge(&npinfo->txq);
 - cancel_rearming_delayed_work(&npinfo->tx_work);
 + cancel_delayed_work(&npinfo->tx_work);
   flush_scheduled_work();
  
 + /* clean after last, unfinished work */
 + if (!skb_queue_empty(&npinfo->txq)) {
 + struct sk_buff *skb;
 + skb = __skb_dequeue(&npinfo->txq);
 + kfree_skb(skb);
 + }
 + }
   kfree(npinfo);
   }
   }
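
The two ideas in the patch above (rearm the work only while a reference is
still held, and clean up whatever the last run left behind) can be modelled in
plain userspace C. This is only a single-threaded sketch under stated
assumptions: struct fake_npinfo, fake_queue_process() and fake_cleanup() are
hypothetical stand-ins, not netpoll code, and the model says nothing about the
races discussed in the mail. Unlike the patch, which frees at most one leftover
skb, the model simply empties its queue counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct netpoll_info. */
struct fake_npinfo {
    atomic_int refcnt;   /* models npinfo->refcnt */
    bool work_pending;   /* models a queued delayed work item */
    int queued;          /* models skbs still sitting on npinfo->txq */
};

/* Work handler: when the device is busy, rearm only if a reference remains. */
static void fake_queue_process(struct fake_npinfo *ni, bool device_busy)
{
    ni->work_pending = false;            /* this run consumes the pending work */
    if (device_busy) {
        if (atomic_load(&ni->refcnt))
            ni->work_pending = true;     /* schedule_delayed_work() in the patch */
        return;
    }
    ni->queued = 0;                      /* device idle: queue fully transmitted */
}

/* Cleanup: returns true only when the last reference was dropped. */
static bool fake_cleanup(struct fake_npinfo *ni)
{
    if (atomic_fetch_sub(&ni->refcnt, 1) != 1)
        return false;                    /* other users remain */
    ni->work_pending = false;            /* cancel_delayed_work() in the patch */
    ni->queued = 0;                      /* drop anything the last run left queued */
    return true;
}
```

Once the last reference is gone, a late run of the handler no longer rearms
itself, which is the property the refcnt check in queue_process() is meant to
provide.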

Everything went quiet?

If this patch has been tested and fixes the bug, can you please send a
version which is ready for merging?  (ie: add a suitable description of
what it does).

Re: [PATCH] Re: [2.6.21.1] soft lockup when removing netconsole module

2007-06-26 Thread Andrew Morton
On Tue, 26 Jun 2007 17:46:13 -0700 Wessel, Jason [EMAIL PROTECTED] wrote:

 }
 }
  
  Everything went quiet?
  
  If this patch has been tested and fixes the bug, can you 
  please send a version which is ready for merging?  (ie: add a 
  suitable description of what it does).
  
  
 
 I mailed Jarek separately.
 
 I had tested the patch with netconsole and kgdb and it does in fact fix
 the problem that was reported.

OK, thanks.  Please don't mail people separately!

I queued this up with a null changelog for now.


git-net, git-netdev-all and everything else on g5

2007-06-28 Thread Andrew Morton

With the full -mm lineup, my tg3-using powerpc g5 spits lots of these:

windfarm: Drive bay control loop started.
audit(1183017094.732:2): audit_pid=2117 old=0 by auid=4294967295
------------[ cut here ]------------
Badness at net/core/dev.c:1303
Call Trace:
[cb45ead0] [c00108c8] .show_stack+0x50/0x1cc (unreliable)
[cb45eb80] [c01b563c] .report_bug+0xa0/0x110
[cb45ec10] [c00250a0] .program_check_exception+0x1fc/0x738
[cb45ecd0] [c0004a84] program_check_common+0x104/0x180
--- Exception: 700 at .skb_gso_segment+0xd0/0x298
LR = .dev_hard_start_xmit+0x23c/0x33c
[cb45efc0] [0001] 0x1 (unreliable)
[cb45f060] [c037bc2c] .dev_hard_start_xmit+0x23c/0x33c
[cb45f100] [c0394934] .__qdisc_run+0x8c/0x410
[cb45f1b0] [c037c0f4] .dev_queue_xmit+0x3c8/0x408
[cb45f240] [c03a7180] .ip_output+0x1d8/0x3d4
[cb45f300] [c03a63f0] .ip_queue_xmit+0x374/0x51c
[cb45f450] [c03bdcc0] .tcp_transmit_skb+0x54c/0x9dc
[cb45f560] [c03bf8f8] .__tcp_push_pending_frames+0x2fc/0xb58
[cb45f6a0] [c03bbccc] .tcp_rcv_established+0x204/0x900
[cb45f750] [c03c6080] .tcp_v4_do_rcv+0x230/0x598
[cb45f830] [c03b1200] .tcp_prequeue_process+0xa0/0xf4
[cb45f8c0] [c03b19c8] .tcp_recvmsg+0x4f0/0x940
[cb45f9b0] [c0370fa0] .sock_common_recvmsg+0x68/0x90
[cb45fa40] [c036b2cc] .sock_aio_read+0x120/0x148
[cb45fb50] [c00d029c] .do_sync_read+0xd0/0x160
[cb45fcf0] [c00d04ec] .vfs_read+0x1c0/0x1d8
[cb45fd90] [c00d0888] .sys_read+0x4c/0x90
[cb45fe30] [c000872c] syscall_exit+0x0/0x40
------------[ cut here ]------------
Badness at net/core/dev.c:1303
Call Trace:
[cb45e9c0] [c00108c8] .show_stack+0x50/0x1cc (unreliable)
[cb45ea70] [c01b563c] .report_bug+0xa0/0x110
[cb45eb00] [c00250a0] .program_check_exception+0x1fc/0x738
[cb45ebc0] [c0004a84] program_check_common+0x104/0x180
--- Exception: 700 at .skb_gso_segment+0xd0/0x298
LR = .dev_hard_start_xmit+0x23c/0x33c
[cb45eeb0] [c0079a60] .__wake_up_bit+0x4c/0x60 (unreliable)
[cb45ef50] [c037bc2c] .dev_hard_start_xmit+0x23c/0x33c
[cb45eff0] [c0394934] .__qdisc_run+0x8c/0x410
[cb45f0a0] [c037c0f4] .dev_queue_xmit+0x3c8/0x408
[cb45f130] [c03a7180] .ip_output+0x1d8/0x3d4
[cb45f1f0] [c03a63f0] .ip_queue_xmit+0x374/0x51c
[cb45f340] [c03bdcc0] .tcp_transmit_skb+0x54c/0x9dc
[cb45f450] [c03bf8f8] .__tcp_push_pending_frames+0x2fc/0xb58
[cb45f590] [c03bbccc] .tcp_rcv_established+0x204/0x900
[cb45f640] [c03c6080] .tcp_v4_do_rcv+0x230/0x598
[cb45f720] [c036fb8c] .release_sock+0xa0/0x16c
[cb45f7c0] [c036fd08] .sk_wait_data+0xb0/0x148
[cb45f8c0] [c03b1a6c] .tcp_recvmsg+0x594/0x940
[cb45f9b0] [c0370fa0] .sock_common_recvmsg+0x68/0x90
[cb45fa40] [c036b2cc] .sock_aio_read+0x120/0x148
[cb45fb50] [c00d029c] .do_sync_read+0xd0/0x160
[cb45fcf0] [c00d04ec] .vfs_read+0x1c0/0x1d8
[cb45fd90] [c00d0888] .sys_read+0x4c/0x90
[cb45fe30] [c000872c] syscall_exit+0x0/0x40

That's here, in skb_gso_segment():

skb_reset_mac_header(skb);
skb->mac_len = skb->network_header - skb->mac_header;
__skb_pull(skb, skb->mac_len);

->      if (WARN_ON(skb->ip_summed != CHECKSUM_PARTIAL)) {


config: http://userweb.kernel.org/~akpm/config-g5.txt
dmesg: http://userweb.kernel.org/~akpm/dmesg-g5.txt

Note that that dmesg contains extra stuff at the end which looks like the
box is trying to oops.

Generally ugly.


Re: e1000: backport ich9 support from 7.5.5 ?

2007-06-29 Thread Andrew Morton
On Fri, 29 Jun 2007 14:39:20 -0700
Kok, Auke [EMAIL PROTECTED] wrote:

 
 That's why we want to introduce a second e1000 driver (named differently,
 pick any name) that contains the new code base, side-by-side into the
 kernel with the current e1000.

Sounds like a reasonable approach to me (it has plenty of precedent).  But
I forget what all the other issues were, so ignore me.

 This new e1000 codebase goes miles and miles beyond what I posted in
 april/march and what is in -mm.

There are no e1000 changes in -mm (from you), I don't believe.  git-e1000
has been in permadrop mode since 2.6.22-rc1-mm1 due to a huge number of git
rejects/conflicts.

Is git://lost.foo-projects.org/~ahkok/git/netdev-2.6#mm the correct URL?



Re: [Bugme-new] [Bug 8697] New: nfs-root doesn't work with jumbo frames

2007-07-01 Thread Andrew Morton
please submit the patch via email as per
http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt
to
Andrew Morton [EMAIL PROTECTED]
netdev@vger.kernel.org
[EMAIL PROTECTED]

thanks.


Re: [PATCH 07/12] use a dynamic pool of sk_buffs to keep up with fast targets

2007-07-02 Thread Andrew Morton
On Tue, 26 Jun 2007 14:50:11 -0400 Ed L. Cashin [EMAIL PROTECTED] wrote:

 Use a dynamic pool of sk_buffs to keep up with fast targets.

That's far too skimpy a description of what this patch is doing, what it is
for, what makes AOE need this functionality, etc.

My initial thought is that if there is a legitimate need for this new capability
then it should be made available to other parts of the kernel rather than being
private to the AOE driver.

But 12 words is not enough information for us to make that judgement.

I have one lower-level comment way down below:

 Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
 ---
  drivers/block/aoe/aoe.h    |    5 ++
  drivers/block/aoe/aoecmd.c |  129 +---
  drivers/block/aoe/aoedev.c |   51 +++---
  3 files changed, 134 insertions(+), 51 deletions(-)
 
 diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
 index 7fa86dd..55c2f08 100644
 --- a/drivers/block/aoe/aoe.h
 +++ b/drivers/block/aoe/aoe.h
 @@ -98,6 +98,7 @@ enum {
   MIN_BUFS = 16,
   NTARGETS = 8,
   NAOEIFS = 8,
 + NSKBPOOLMAX = 128,
  
   TIMERTICK = HZ / 10,
   MINTIMER = HZ >> 2,
 @@ -147,6 +148,7 @@ struct aoetgt {
   u16 useme;
   ulong lastwadj; /* last window adjustment */
  int wpkts, rpkts;
 +int dataref;
  };
  
  struct aoedev {
 @@ -168,6 +170,9 @@ struct aoedev {
   spinlock_t lock;
   struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
   struct sk_buff *sendq_tl;
 + struct sk_buff *skbpool_hd;
 + struct sk_buff *skbpool_tl;
 + int nskbpool;
   mempool_t *bufpool; /* for deadlock-free Buf allocation */
   struct list_head bufq;  /* queue of bios to work on */
   struct buf *inprocess;  /* the one we're currently working on */
 diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
 index 62ba58c..89df9de 100644
 --- a/drivers/block/aoe/aoecmd.c
 +++ b/drivers/block/aoe/aoecmd.c
 @@ -105,43 +105,102 @@ ifrotate(struct aoetgt *t)
   }
  }
  
 +static void
 +skb_pool_put(struct aoedev *d, struct sk_buff *skb)
 +{
 + if (!d->skbpool_hd)
 + d->skbpool_hd = skb;
 + else
 + d->skbpool_tl->next = skb;
 + d->skbpool_tl = skb;
 +}
 +
 +static struct sk_buff *
 +skb_pool_get(struct aoedev *d)
 +{
 + struct sk_buff *skb;
 +
 + skb = d->skbpool_hd;
 + if (skb)
 + if (atomic_read(&skb_shinfo(skb)->dataref) == 1) {
 + d->skbpool_hd = skb->next;
 + skb->next = NULL;
 + return skb;
 + }
 + if (d->nskbpool < NSKBPOOLMAX)
 + if ((skb = new_skb(ETH_ZLEN))) {
 + d->nskbpool++;
 + return skb;
 + }
 + return NULL;
 +}
 +
 +/* freeframe is where we do our load balancing so it's a little hairy. */
  static struct frame *
  freeframe(struct aoedev *d)
  {
 - struct frame *f, *e;
 + struct frame *f, *e, *rf;
   struct aoetgt **t;
 - ulong n;
 + struct sk_buff *skb;
  
   if (d->targets[0] == NULL) {/* shouldn't happen, but I'm paranoid */
   printk(KERN_ERR "aoe: NULL TARGETS!\n");
   return NULL;
   }
 - t = d->targets;
 - do {
 + t = d->tgt;
 + t++;
 + if (t >= &d->targets[NTARGETS] || !*t)
 + t = d->targets;
 + for (;;) {
 + if ((*t)->nout < (*t)->maxout)
   if (t != d->htgt)
 - if ((*t)->ifp->nd)
 - if ((*t)->nout < (*t)->maxout) {
 - n = (*t)->nframes;
 + if ((*t)->ifp->nd) {
 + rf = NULL;
   f = (*t)->frames;
 - e = f + n;
 + e = f + (*t)->nframes;
   for (; f<e; f++) {
   if (f->tag != FREETAG)
   continue;
 - if (atomic_read(&skb_shinfo(f->skb)->dataref) != 1) {
 - n--;
 + skb = f->skb;
 + if (!skb)
 + if (!(f->skb = skb = new_skb(ETH_ZLEN)))
 + continue;
 + if (atomic_read(&skb_shinfo(skb)->dataref) != 1) {
 + if (!rf)
 + rf = f;
   continue;
   }
 - skb_shinfo(f->skb)->nr_frags = f->skb->data_len = 0;
 - skb_trim(f->skb, 0);
 +gotone:  skb_shinfo(skb)->nr_frags = skb->data_len = 0;
 + skb_trim(skb, 0);
   d->tgt = t;
   ifrotate(*t);
   return f;
   }
 - if (n == 0) /* slow polling network card */
 + /* Work can be 
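
The head/tail pool that skb_pool_put() and skb_pool_get() build in the quoted
patch can be illustrated with a small userspace model. This is a sketch under
assumptions, not the driver code: struct buf, struct pool, pool_put() and
pool_get() are hypothetical names, the refs field stands in for
skb_shinfo(skb)->dataref, and calloc() stands in for new_skb(). One deliberate
difference is labelled in the code: the model clears next on put, whereas the
patch only clears it on get.

```c
#include <stddef.h>
#include <stdlib.h>

enum { POOL_MAX = 128 };     /* mirrors NSKBPOOLMAX in the patch */

struct buf {
    struct buf *next;
    int refs;                /* models skb_shinfo(skb)->dataref */
};

struct pool {
    struct buf *hd, *tl;     /* FIFO of buffers handed back to the pool */
    int nalloc;              /* buffers ever allocated on behalf of this pool */
};

/* Append a buffer at the tail of the pool. */
static void pool_put(struct pool *p, struct buf *b)
{
    b->next = NULL;          /* assumption: the patch does not do this here */
    if (!p->hd)
        p->hd = b;
    else
        p->tl->next = b;
    p->tl = b;
}

/* Reuse the head if nobody else still holds it, else grow up to POOL_MAX. */
static struct buf *pool_get(struct pool *p)
{
    struct buf *b = p->hd;

    if (b && b->refs == 1) {
        p->hd = b->next;
        b->next = NULL;
        return b;
    }
    if (p->nalloc < POOL_MAX) {
        b = calloc(1, sizeof(*b));
        if (b) {
            b->refs = 1;
            p->nalloc++;
            return b;
        }
    }
    return NULL;
}
```

Only the head is ever examined, so a head buffer that is still referenced
elsewhere blocks reuse of everything queued behind it until the pool has grown
to its cap.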

Re: [Bugme-new] [Bug 8724] New: Unaligned acess in udp_recvmsg() on EV56

2007-07-08 Thread Andrew Morton
On Sun,  8 Jul 2007 14:30:17 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8724
 
Summary: Unaligned acess in udp_recvmsg() on EV56
Product: Platform Specific/Hardware
Version: 2.5
  KernelVersion: 2.6.22-rc7-git7
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: Alpha
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: Occurs in all 2.6.2[12] at
 least
 Distribution: Debian testing/lenny
 Hardware Environment: Digital PWS 433au (EV56)
 Software Environment:
 Linux utopia 2.6.22-rc7-git7 #1 Thu Jul 8 10:34:17 CDT 2027 alpha GNU/Linux
 
 Gnu C  4.2.1
 Gnu make   3.81
 binutils   (GNU Binutils for Debian) 2.17.50.20070426
 util-linux 2.12r
 mount  2.12r
 module-init-tools  3.3-pre11
 e2fsprogs  1.40-WIP
 xfsprogs   2.8.18
 Linux C Library libc.2.5
 Dynamic linker (ldd)   2.5
 Procps 3.2.7
 Net-tools  1.60
 Kbd85:
 Sh-utils   5.97
 udev   105
 Modules Loaded ipt_TOS xt_multiport xt_tcpudp xt_state ip6table_mangle
 ip6table_filter ip6_tables ipv6 iptable_nat nf_nat nf_conntrack_ipv4
 nf_conntrack iptable_mangle iptable_filter ip_tables x_tables dm_mod serio_raw
 mxser_new ide_generic via_rhine mii generic ide_core tulip bitrev crc32 sg
 sr_mod cdrom raid1 md_mod loop
 
 Problem Description:
 kernel unaligned acc: 2248 (pc=fc583de4,va=fc00071c382a)
 
 fc583bc0 T udp_recvmsg
 fc583ed0 T udp_destroy_sock
 
 kernel unaligned acc: 1231 (pc=fc585190,va=fc000795e02e)
 
 fc585120 T __udp4_lib_rcv
 fc585bf0 T udp_rcv
 
 This problem does NOT seem to affect my Tsunami/Shark (EV68AL) box running the
 same kernel version on the same LAN.  Not sure if it's CPU generation related
 (EV5 vs EV6), or if it's the NIC (via_rhine on the EV5 vs e100 on the EV6).
 
 Steps to reproduce:
 According to tshark, the only UDP packets are DHCP packets:
 
 1815064200.131003   10.5.128.1 - 255.255.255.255 DHCP DHCP Offer-
 Transaction ID 0x27c6
 
 

That output isn't terribly illuminating.  We'd need to work out
which code corresponds with pc=fc583de4 and
pc=fc585190.

I don't think there's necessarily a bug here: that's just the kernel
telling us that there are unaligned accesses which got successfully
fixed up, so we're being perhaps a bit inefficient.  Yes?
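
For context on the fix-up cost: code that knows a field may be misaligned can
avoid the trap by copying bytes instead of dereferencing a wider pointer, which
is the idea behind the kernel's get_unaligned() helper. A minimal userspace
sketch follows; load_u32() is a hypothetical name, and nothing here claims to
be what udp_recvmsg() does or should do.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));   /* byte copy: no alignment requirement on p */
    return v;
}
```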


Re: [Bugme-new] [Bug 8726] New: MSG_TRUNC not regarded in unix_dgram_recvmsg()

2007-07-09 Thread Andrew Morton
On Mon,  9 Jul 2007 04:01:58 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8726
 
Summary: MSG_TRUNC not regarded in unix_dgram_recvmsg()
Product: Networking
Version: 2.5
  KernelVersion: 2.6.19
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: Other
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Problem Description:
 
 In unix_dgram_recvmsg() in af_unix.c the flag MSG_TRUNC is not regarded as
 described in the recv(2) man page.
 
 This did work in older kernels (I have a working 2.6.13, but am not sure if
 that is a plain kernel.org one, nor if it was the last working revision).
 I believe the bug was introduced when the variable copied was removed from
 the code.
 
 IMHO, the line 1650:
 
 err = size;
 
 should read
 
 err = (flags & MSG_TRUNC) ? skb->len : size;
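
The behaviour the reporter asks for can be written as a small pure function.
This is a sketch of the proposed return-value rule only, assuming a
hypothetical helper dgram_recv_result(); it is not code from af_unix.c.

```c
#include <stddef.h>
#include <sys/socket.h>   /* MSG_TRUNC */

/*
 * Return value of a datagram receive: with MSG_TRUNC the full datagram
 * length is reported, otherwise only the number of bytes that fit.
 */
static long dgram_recv_result(int flags, size_t skb_len, size_t buf_len)
{
    size_t copied = skb_len < buf_len ? skb_len : buf_len;

    return (long)((flags & MSG_TRUNC) ? skb_len : copied);
}
```

With a 100-byte datagram and a 10-byte buffer, the function returns 10 without
MSG_TRUNC and 100 with it, which lets a caller detect truncation.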
 


Fw: [Bugme-new] [Bug 4922] New: Bug in netfilter.c when drivers do hardware checksum generation.

2005-07-21 Thread Andrew Morton


Begin forwarded message:

Date: Thu, 21 Jul 2005 11:39:44 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 4922] New: Bug in netfilter.c when drivers do 
hardware checksum generation.


http://bugzilla.kernel.org/show_bug.cgi?id=4922

   Summary: Bug in netfilter.c when drivers do hardware checksum
generation.
Kernel Version: 2.6.12.2
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Distribution:

Has been reproduced in Fedora Core 2, Fedora Core 3, YDL 4.0.
with 2.6.8, 2.6.8.1 2.6.9, 2.6.10, 2.6.12.2

Hardware Environment:

x86 (tigon3 driver for BCM5705, tg3.c) and ppc systems (Mac mini) having ethernet
drivers that do hardware IP checksums.

Software Environment:

See kernel revs above.
Standard distributions. 
Problem Description:

There is a bug in the Linux kernel from 2.6.7 through 2.6.12.2.

The problem occurs when packets are being diverted to user space through
ipq/netlink sockets on systems that have ethernet drivers with hardware IP
checksum capability. It has been reproduced when user code is mangling the
packet headers. 

The hardware sets the ip_summed field in the skb, which then falsely indicates
that the checksum does not need to be re-generated after IP headers are mangled.

   This bug was originally introduced with a change to net/core/netfilter.c in
the 2.6.8 distribution.
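
Why a mangled header needs a fresh checksum can be shown with the ordinary
Internet checksum (one's-complement sum of 16-bit words). The sketch below is
a userspace illustration with a hypothetical helper name, inet_checksum(); it
is not the kernel's skb_checksum_help(), and it covers an IPv4 header rather
than the offloaded transport checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes, big-endian words. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                          /* odd trailing byte, zero padded */
        sum += (uint32_t)data[i] << 8;
    while (sum >> 16)                     /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

A header that carries a correct checksum sums to zero with this function.
Changing any header byte afterwards, for example the TTL, makes the result
non-zero until the checksum field is recomputed, which is the step the report
says gets skipped.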

Steps to reproduce:

   To reproduce the bug, divert packets through netlink ipq and change IP header
information.

The following patch fixes the problem on 2.6.12.2:

--- linux-2.6.12.2/net/core/netfilter.c.orig 2005-06-29 19:00:53.0 -0400
+++ linux-2.6.12.2/net/core/netfilter.c 2005-07-19 19:07:18.0 -0400
@@ -485,6 +485,14 @@
unsigned int verdict;
int ret = 0;
 
+if ((*pskb)->ip_summed == CHECKSUM_HW) {
+if (outdev == NULL) {
+(*pskb)->ip_summed = CHECKSUM_NONE;
+} else {
+skb_checksum_help(*pskb, 0);
+}
+}
+
/* We may already have this, but read-locks nest anyway */
rcu_read_lock();
 

The following patch fixes the problem on 2.6.6 - 2.6.11

--- linux-2.6.7/net/core/netfilter.c 2005-07-19 13:02:11.0 -0400
+++ linux-2.6.7-netfilter/net/core/netfilter.c  2005-07-19 15:56:51.0 -0400
@@ -504,6 +504,14 @@
unsigned int verdict;
int ret = 0;
 
+if (skb->ip_summed == CHECKSUM_HW) {
+if (outdev == NULL) {
+skb->ip_summed = CHECKSUM_NONE;
+} else {
+skb_checksum_help(skb, 0);
+}
+}
+
/* We may already have this, but read-locks nest anyway */
rcu_read_lock();
 

Thanks in advance for your consideration of this bug,

--Tom Herbert

[EMAIL PROTECTED]  [EMAIL PROTECTED]

--- You are receiving this mail because: ---
You are on the CC list for the bug, or are watching someone who is.


Fw: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918

2005-07-31 Thread Andrew Morton


Begin forwarded message:

Date: Sun, 31 Jul 2005 17:02:01 +0200
From: Guillaume Pelat [EMAIL PROTECTED]
To: linux-kernel@vger.kernel.org
Cc: [EMAIL PROTECTED]
Subject: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918


Hi,

I've been trying to upgrade kernel from 2.6.12.3 to 2.6.13-rc4 on a
rather loaded http server, but I'm currently having a kernel panic only a
few minutes after booting. The bug was reproducible (the crash
happened after every reboot, with the same backtrace).

Here is the error log:
------------[ cut here ]------------
kernel BUG at net/ipv4/tcp_output.c:918!
invalid operand:  [#1]
CPU:0
EIP:0060:[c027dd56]Not tainted VLI
EFLAGS: 00010293   (2.6.13-rc4-endy)
EIP is at tcp_tso_should_defer+0xd6/0xf0
eax: 0007   ebx: f1258080   ecx: 0007   edx: f297f800
esi: 0008   edi: 0004   ebp: c031fd80   esp: c031fd70
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c031e000 task=c02dbb80)
Stack: f5f547b8 f1258080 0008 f297f800 c031fdb8 c027de4b f297f800 
f297f800
f1258080 0009 f297f800 d039250c  0002 0002 
f297f800
f297f800 0100 c031fddc c027e192 f297f800 0218 0001 
f5fd4034
Call Trace:
  [c0102e5f] show_stack+0x7f/0xa0
  [c0103002] show_registers+0x152/0x1c0
  [c01031f8] die+0xc8/0x140
  [c0103325] do_trap+0xb5/0xc0
  [c010366c] do_invalid_op+0xbc/0xd0
  [c0102aa3] error_code+0x4f/0x54
  [c027de4b] tcp_write_xmit+0xdb/0x3f0
  [c027e192] __tcp_push_pending_frames+0x32/0xd0
  [c027c04e] tcp_rcv_state_process+0x2be/0x9c0
  [c0283ee9] tcp_v4_do_rcv+0x99/0x120
  [c02844e2] tcp_v4_rcv+0x572/0x750
  [c026a62b] ip_local_deliver+0xcb/0x1d0
  [c026aa52] ip_rcv+0x322/0x4a0
  [c0256a97] netif_receive_skb+0x137/0x1a0
  [c0256b8f] process_backlog+0x8f/0x110
  [c0256c82] net_rx_action+0x72/0x100
  [c01172dc] __do_softirq+0x8c/0xa0
  [c011731a] do_softirq+0x2a/0x30
  [c01173d5] irq_exit+0x35/0x40
  [c01044fc] do_IRQ+0x3c/0x70
  [c0102a46] common_interrupt+0x1a/0x20
  [c0100997] cpu_idle+0x57/0x60
  [c010024b] _stext+0x2b/0x30
  [c0320847] start_kernel+0x147/0x170
  [c0100199] 0xc0100199
Code: 89 f8 0f af c2 3b 45 f0 0f 47 45 f0 31 d2 89 45 f0 f7 f3 31 d2 39 
c1 73 ce ba 01 00 00 00 eb c7 6b c2 03 31 d2 39 c1 77 be eb ee 0f 0b 
96 03 ae 54 2d c0 e9 76 ff ff ff 8b ba 78 02 00 00 eb eb
  0Kernel panic - not syncing: Fatal exception in interrupt

Some infos about my system:

My network card is an e1000.

root # cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 15
model   : 3
model name  : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping: 3
cpu MHz : 2995.045
cache size  : 1024 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 5
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge 
mca cmov pat pse36 clflush dtsacpi mmx fxsr sse sse2 ss ht tm pbe pni 
monitor ds_cpl cid
bogomips: 5914.62

http00 root # uname -a
Linux http00 2.6.13-rc4 #1 Thu May 19 14:19:19 CEST 2005 i686 Intel(R) 
Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux

You can find dmesg, lspci and config at the following address:
http://82.196.5.50/20050731/config.txt
http://82.196.5.50/20050731/dmesg.txt
http://82.196.5.50/20050731/lspci.txt
http://82.196.5.50/20050731/sysctl.txt

Best regards,

Guillaume Pelat
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.13-rc4 - kernel panic - BUG at net/ipv4/tcp_output.c:918

2005-08-04 Thread Andrew Morton
Guillaume Pelat [EMAIL PROTECTED] wrote:

 Hi,
 
 Herbert Xu wrote:
  On Thu, Aug 04, 2005 at 01:33:29PM +1000, herbert wrote:
  
 So I suppose we should reset cwnd_quota after tcp_transmit_skb?
  
  Please try this patch to see if this is really the problem or not.
  
  Thanks,
 
 I just applied your patch, and it seems to work :)
 2 hours uptime, and no crash yet (without the patch, it was crashing a 
 few mins only after booting).
 So i think the bug is crushed :)
 

Thanks, Guillaume.  Herbert, David is travelling and not able to do a lot
of patchmonkeying.  Could you please prepare and submit a final patch?

Thanks.



Re: [TCP]: Fix TSO cwnd caching bug

2005-08-04 Thread Andrew Morton
Herbert Xu [EMAIL PROTECTED] wrote:

 On Thu, Aug 04, 2005 at 04:58:42PM -0700, Andrew Morton wrote:
   
   Thanks, Guillaume.  Herbert, David is travelling and not able to do a lot
   of patchmonkeying.  Could you please prepare and submit a final patch?
 
  OK, here is the final version.

Thanks.

  It depends on the patch that David
  posted earlier on in this thread.  Please let me know if you need a
  copy of that.

Yes please.


Fw: oops with 2.6.13-rc5 on webserver with raid

2005-08-05 Thread Andrew Morton

Did we fix this today?


Begin forwarded message:

Date: Fri, 05 Aug 2005 11:52:15 +0200
From: Martin Braun [EMAIL PROTECTED]
To: linux-kernel@vger.kernel.org
Subject: oops with 2.6.13-rc5 on webserver with raid


Hi,

I've been trying to upgrade kernel to 2.6.13-rc5. The server boots
normally w/o errors, but after a while (from 5 minutes up to 2 hours) the
kernel hangs (no keyboard input possible). As I am a newbie I cannot
figure out who will be concerned with this error.


Here is the ksymoops output (done while running 2.6.11.12 #1 SMP, hope
that's OK)

ksymoops -V -K -L -O -m /boot/System.map-2.6.13-rc5 < oops.txt
ksymoops 2.4.9 on i686 2.6.11.12.  Options used
 -V (specified)
 -K (specified)
 -L (specified)
 -O (specified)
 -m /boot/System.map-2.6.13-rc5 (specified)

kernel BUG at bad filename:27369!
invalid operand:  [#1]
CPU:0
EIP:0060:[c0324afd]Not tainted VLI
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010297   (2.6.13-rc5)
eax: 0005   ebx: f51c3880   ecx: 0007   edx: ebfa6c00
esi: f51c3880   edi:    ebp: 0006   esp: c03ebda4
ds: 007b   es: 007b   ss: 0068
Stack:  f51c3880 0006 ebfa6c00 9daf2f89 c0324de0 ebfa6c00
ebfa6c00
   f51c3880 000c ebfa6c00 0006 0002 ebfa6c00 ebfa6c00
0100
   f5bca034 c0324f45 ebfa6c00 05b4 0001 c02f4391 4740b79d
4740b79d
Call Trace:
 [c0324de0]
 [c0324f45]
 [c02f4391]
 [c0321eea]
 [c032b60a]
 [c032bea1]
 [c0310390]
 [c0306cce]
 [c030fc5b]
 [c0310390]
 [c03102b4]
 [c03105a0]
 [c02fa2a8]
 [c02fa389]
 [c02fa4d7]
 [c011fa12]
 [c011fac5]
 [c010542e]
 [c0103726]
 [c0100ce5]
 [c0100b19]
 [c03ec9d5]
 [c03ec3b0]
Code: 24 8b 5c 24 04 8b 74 24 08 8b 7c 24 0c 8b 6c 24 10 83 c4 14 c3 c7
04 24 0
Error (Oops_code_values): invalid value 0x0 in Code line, must be 2, 4,
8 or 16 digits, value ignored


EIP; c0324afd tcp_tso_should_defer+fd/110   =

ebx; f51c3880 pg0+34d56880/3fb91400
edx; ebfa6c00 pg0+2bb39c00/3fb91400
esi; f51c3880 pg0+34d56880/3fb91400
esp; c03ebda4 init_thread_union+1da4/2000

Trace; c0324de0 tcp_write_xmit+2d0/400
Trace; c0324f45 __tcp_push_pending_frames+35/d0
Trace; c02f4391 kfree_skbmem+21/30
Trace; c0321eea tcp_rcv_established+39a/920
Trace; c032b60a tcp_v4_do_rcv+12a/150
Trace; c032bea1 tcp_v4_rcv+871/940
Trace; c0310390 ip_local_deliver_finish+0/210
Trace; c0306cce nf_hook_slow+6e/130
Trace; c030fc5b ip_local_deliver+eb/270
Trace; c0310390 ip_local_deliver_finish+0/210
Trace; c03102b4 ip_rcv+4d4/5b0
Trace; c03105a0 ip_rcv_finish+0/320
Trace; c02fa2a8 netif_receive_skb+168/1b0
Trace; c02fa389 process_backlog+99/130
Trace; c02fa4d7 net_rx_action+b7/120
Trace; c011fa12 __do_softirq+82/100
Trace; c011fac5 do_softirq+35/40
Trace; c010542e do_IRQ+1e/30
Trace; c0103726 common_interrupt+1a/20
Trace; c0100ce5 mwait_idle+25/50
Trace; c0100b19 cpu_idle+69/80
Trace; c03ec9d5 start_kernel+175/1a0
Trace; c03ec3b0 unknown_bootoption+0/1e0

Code;  c0324afd tcp_tso_should_defer+fd/110
 _EIP:
Code;  c0324afd tcp_tso_should_defer+fd/110   =
   0:   24 8b and$0x8b,%al   =
Code;  c0324aff tcp_tso_should_defer+ff/110
   2:   5cpop%esp
Code;  c0324b00 tcp_tso_should_defer+100/110
   3:   24 04 and$0x4,%al
Code;  c0324b02 tcp_tso_should_defer+102/110
   5:   8b 74 24 08   mov0x8(%esp),%esi
Code;  c0324b06 tcp_tso_should_defer+106/110
   9:   8b 7c 24 0c   mov0xc(%esp),%edi
Code;  c0324b0a tcp_tso_should_defer+10a/110
   d:   8b 6c 24 10   mov0x10(%esp),%ebp
Code;  c0324b0e tcp_tso_should_defer+10e/110
  11:   83 c4 14  add$0x14,%esp
Code;  c0324b11 tcp_write_xmit+1/400
  14:   c3ret
Code;  c0324b12 tcp_write_xmit+2/400
  15:   c7 04 24 00 00 00 00  movl   $0x0,(%esp)

 0Kernel panic - not syncing: Fatal exception in interrupt

1 error issued.  Results may not be reliable.

=
lspci
=
:00:00.0 Host bridge: Intel Corp. E7320 Memory Controller Hub (rev 0c)
:00:00.1 Class ff00: Intel Corp. E7320 Error Reporting Registers
(rev 0c)
:00:02.0 PCI bridge: Intel Corp. E7525/E7520/E7320 PCI Express Port
A (rev 0c)
:00:03.0 PCI bridge: Intel Corp. E7525/E7520/E7320 PCI Express Port
A1 (rev 0c)
:00:1c.0 PCI bridge: Intel Corp. 6300ESB 64-bit PCI-X Bridge (rev 02)
:00:1d.0 USB Controller: Intel Corp. 6300ESB USB Universal Host
Controller (rev 02)
:00:1d.1 USB Controller: Intel Corp. 6300ESB USB Universal Host
Controller (rev 02)
:00:1d.4 System peripheral: Intel Corp. 6300ESB Watchdog Timer (rev 02)
:00:1d.5 PIC: Intel Corp. 6300ESB I/O Advanced Programmable
Interrupt Controller (rev 02)
:00:1d.7 USB Controller: Intel Corp. 6300ESB USB2 Enhanced Host
Controller (rev 02)
:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev 0a)
:00:1f.0 ISA bridge: Intel Corp. 6300ESB LPC Interface Controller
(rev 02)
:00:1f.1 IDE interface: Intel Corp. 

Fw: [Bugme-new] [Bug 5014] New: rp_filter proc interface generate oops when enable

2005-08-07 Thread Andrew Morton


Begin forwarded message:

Date: Sun, 7 Aug 2005 07:12:40 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 5014] New: rp_filter proc interface generate oops 
when enable


http://bugzilla.kernel.org/show_bug.cgi?id=5014

   Summary: rp_filter proc interface generate oops when enable
Kernel Version: 2.6.12
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur:
Distribution:
Debian unstabble
Hardware Environment:
compaq nx7010
Software Environment:
Debian networking startup script
Problem Description:
each one generates an oops
echo 1 > /proc/sys/net/ipv4/conf/default/rp_filter
echo 1 > /proc/sys/net/ipv4/conf/eth0/rp_filter
echo 1 > /proc/sys/net/ipv4/conf/lo/rp_filter

Steps to reproduce:

step by step

from debian /etc/init.d/networking
if [ -e /proc/sys/net/ipv4/conf/all/rp_filter ]; then
 for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
echo 1 > $f
 done
 return 0
else
 return 1
fi

echo 1 > /proc/sys/net/ipv4/conf/all/rp_filter

nothing happens ...

echo 1 > /proc/sys/net/ipv4/conf/default/rp_filter



 1Unable to handle kernel paging request at virtual address 40d9db94
 printing eip:
c012362f
*pde = 
Oops: 0002 [#5]
PREEMPT 
Modules linked in: md5 ipv6 af_packet ohci1394 snd_intel8x0m snd_intel8x0
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore
snd_page_alloc ehci_hcd hci_usb bluetooth tsdev joydev uhci_hcd usbcore psmouse
pcspkr parport_pc parport irtty_sir sir_dev irda crc_ccitt evdev nls_iso8859_1
nls_cp437 vfat fat ipw2200 ieee80211 ieee80211_crypt eth1394 8139cp sr_mod sbp2
scsi_mod ieee1394 genrtc unix
CPU:0
EIP:0060:[c012362f]Not tainted VLI
EFLAGS: 00010246   (2.6.12) 
EIP is at do_proc_dointvec_conv+0xf/0x40
eax: 0001   ebx: 0001   ecx: 40d9db94   edx: dee9ff18
esi: 080f3c09   edi: dee9feff   ebp: 0001   esp: dee9fed8
ds: 007b   es: 007b   ss: 0068
Process bash (pid: 4617, threadinfo=dee9e000 task=df6bb060)
Stack: c012393f 0001  42f61172 0c2b8c70 40d9db94 0001 0001 
    3116bff3 d121000a dee9ff60 d121fe24  8242 dee9ff00 
   0001  080f3c08 080f3c08 0001 c169a0e0 c01239ec 080f3c08 
Call Trace:
 [c012393f] do_proc_dointvec+0x2df/0x360
 [c01239ec] proc_dointvec+0x2c/0x40
 [c0123620] do_proc_dointvec_conv+0x0/0x40
 [c012330a] do_rw_proc+0xaa/0xc0
 [c0123370] proc_writesys+0x0/0x30
 [c012338f] proc_writesys+0x1f/0x30
 [c015c957] vfs_write+0xb7/0x130
 [c015ca81] sys_write+0x41/0x70
 [c0103111] syscall_call+0x7/0xb
Code: 8b 5c 24 0c 89 c8 8b 74 24 10 8b 7c 24 14 8b 6c 24 18 83 c4 1c c3 8d b6 00
00 00 00 83 7c 24 04 00 74 0d 8b 00 85 c0 75 18 8b 02 89 01 31 c0 c3 8b 09 85
c9 78 16 c7 00 00 00 00 00 31 c0 89 0a 


echo 1 > /proc/sys/net/ipv4/conf/eth0/rp_filter


1Unable to handle kernel paging request at virtual address 6048c278
 printing eip:
c012362f
*pde = 
Oops: 0002 [#6]
PREEMPT 
Modules linked in: md5 ipv6 af_packet ohci1394 snd_intel8x0m snd_intel8x0
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore
snd_page_alloc ehci_hcd hci_usb bluetooth tsdev joydev uhci_hcd usbcore psmouse
pcspkr parport_pc parport irtty_sir sir_dev irda crc_ccitt evdev nls_iso8859_1
nls_cp437 vfat fat ipw2200 ieee80211 ieee80211_crypt eth1394 8139cp sr_mod sbp2
scsi_mod ieee1394 genrtc unix
CPU:0
EIP:0060:[c012362f]Not tainted VLI
EFLAGS: 00010246   (2.6.12) 
EIP is at do_proc_dointvec_conv+0xf/0x40
eax: 0001   ebx: 0001   ecx: 6048c278   edx: dee9ff18
esi: 080f3c09   edi: dee9feff   ebp: 0001   esp: dee9fed8
ds: 007b   es: 007b   ss: 0068
Process bash (pid: 4658, threadinfo=dee9e000 task=df6bb060)
Stack: c012393f 0001  db15ebcc 0001 6048c278 0001 0001 
    3101 d111000a 0007 dee9ffbc df6bb060 b7f5a5d0 dee9ff00 
   0001  080f3c08 080f3c08 0001 de2470e0 c01239ec 080f3c08 
Call Trace:
 [c012393f] do_proc_dointvec+0x2df/0x360
 [c01239ec] proc_dointvec+0x2c/0x40
 [c0123620] do_proc_dointvec_conv+0x0/0x40
 [c012330a] do_rw_proc+0xaa/0xc0
 [c0123370] proc_writesys+0x0/0x30
 [c012338f] proc_writesys+0x1f/0x30
 [c015c957] vfs_write+0xb7/0x130
 [c015ca81] sys_write+0x41/0x70
 [c0103111] syscall_call+0x7/0xb
Code: 8b 5c 24 0c 89 c8 8b 74 24 10 8b 7c 24 14 8b 6c 24 18 83 c4 1c c3 8d b6 00
00 00 00 83 7c 24 04 00 74 0d 8b 00 85 c0 75 18 8b 02 89 01 31 c0 c3 8b 09 85
c9 78 16 c7 00 00 00 00 00 31 c0 89 0a 



echo 1 > /proc/sys/net/ipv4/conf/lo/rp_filter



 1Unable to handle kernel paging request at virtual address 41f76bf8
 printing eip:
c012362f
*pde = 
Oops: 0002 [#7]
PREEMPT 
Modules linked in: md5 ipv6 af_packet ohci1394 snd_intel8x0m snd_intel8x0
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore
snd_page_alloc ehci_hcd hci_usb bluetooth tsdev 

Fw: [Bugme-new] [Bug 5080] New: bonding related oops on boot

2005-08-17 Thread Andrew Morton


Begin forwarded message:

Date: Wed, 17 Aug 2005 07:20:36 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 5080] New: bonding related oops on boot


http://bugzilla.kernel.org/show_bug.cgi?id=5080

   Summary: bonding related oops on boot
Kernel Version: 2.6.13-rc6
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Distribution: debian pure64
Hardware Environment: 4 way x86_64
Software Environment: linux 2.6.12, 2.6.13-rc6 
Problem Description:

I have a bond with two slave interfaces, both connected.

On boot, when the bond gets initialized, I get the following oops:

[  136.773164] Ethernet Channel Bonding Driver: v2.6.3 (June 8, 2005)
[  136.773266] bonding: In ALB mode you might experience client disconnections
upon reconnection of a link if the bonding module updelay parameter (15000 msec)
is incompatible with the forwarding delay time of the switch
[  136.773427] bonding: MII link monitoring set to 100 ms
[  137.122235] bonding: bond0: enslaving eth0 as an active interface with a down
link.
[  137.353781] bonding: bond0: enslaving eth1 as an active interface with a down
link.
[  137.397579] e100: eth2: e100_watchdog: link up, 100Mbps, full-duplex
[  138.823615] NET: Registered protocol family 10
[  138.824319] IPv6 over IPv4 tunneling driver
[  142.995176] tg3: eth0: Link is up at 1000 Mbps, full duplex.
[  142.995238] tg3: eth0: Flow control is on for TX and on for RX.
[  142.995294] bonding: bond0: link status up for interface eth0, enabling it in
15000 ms.
[  144.226482] tg3: eth1: Link is up at 1000 Mbps, full duplex.
[  144.226543] tg3: eth1: Flow control is on for TX and on for RX.
[  144.293858] bonding: bond0: link status up for interface eth1, enabling it in
15000 ms.
[  149.051679] eth0: no IPv6 routers present
[  149.311570] bond0: no IPv6 routers present
[  149.661411] eth1: no IPv6 routers present
[  149.781362] eth2: no IPv6 routers present
[  157.987577] bonding: bond0: link status definitely up for interface eth0.

[  157.987642] bonding: bond0: making interface eth0 the new active one.
[  158.023763] RTNL: assertion failed at net/ipv4/devinet.c (962)
[  158.023819]
[  158.023820] Call Trace: IRQ 80273c03{rt_run_flush+48}
8029d6c8{inetdev_event+116}
[  158.023964]80273c4e{rt_run_flush+123}
8013f087{notifier_call_chain+31}
[  158.024094]802610f8{dev_set_mac_address+84}
88035740{:bonding:alb_set_slave_mac_addr+76}
[  158.024233]8803580f{:bonding:alb_swap_mac_addr+170}
8802f0b8{:bonding:bond_change_active_slave+546}
[  158.024373]8802f997{:bonding:bond_mii_monitor+1012}
8802f5a3{:bonding:bond_mii_monitor+0}
[  158.024507]8013a86a{run_timer_softirq+384}
80136d42{__do_softirq+110}
[  158.024631]8010ec07{call_softirq+31}
801106a1{do_softirq+54}
[  158.024752]8010e3b6{apic_timer_interrupt+98}  EOI
801f7cbc{acpi_walk_namespace+117}
[  158.024886]8010bf4b{mwait_idle+86}
80207694{acpi_processor_idle+298}
[  158.025011]8010bedb{cpu_idle+76}
803fc708{start_kernel+372}
[  158.025130]803fc216{_sinittext+534}
[  158.061376] RTNL: assertion failed at net/ipv4/devinet.c (962)
[  158.061431]
[  158.061432] Call Trace: IRQ 80273c32{rt_run_flush+95}
8029d6c8{inetdev_event+116}
[  158.061569]80273c4e{rt_run_flush+123}
8013f087{notifier_call_chain+31}
[  158.061693]802610f8{dev_set_mac_address+84}
88035740{:bonding:alb_set_slave_mac_addr+76}
[  158.061826]88035821{:bonding:alb_swap_mac_addr+188}
8802f0b8{:bonding:bond_change_active_slave+546}
[  158.061963]8802f997{:bonding:bond_mii_monitor+1012}
8802f5a3{:bonding:bond_mii_monitor+0}
[  158.062096]8013a86a{run_timer_softirq+384}
80136d42{__do_softirq+110}
[  158.062219]8010ec07{call_softirq+31}
801106a1{do_softirq+54}
[  158.062338]8010e3b6{apic_timer_interrupt+98}  EOI
801f7cbc{acpi_walk_namespace+117}
[  158.062470]8010bf4b{mwait_idle+86}
80207694{acpi_processor_idle+298}
[  158.062593]8010bedb{cpu_idle+76}
803fc708{start_kernel+372}
[  158.062711]803fc216{_sinittext+534}
[  159.366930] bonding: bond0: link status definitely up for interface eth1.

The bond has the following options:
 options bonding mode=6 miimon=100 updelay=15000 max_bonds=2

I know this doesn't appear with 2.6.11.12, and that it did in a 2.6.12, although
I don't know which precisely :(

I'll attach my .config

Steps to reproduce: just boot :p

--- You are receiving this mail because: ---
You are on the CC list for the bug, or are watching someone who is.
-
To unsubscribe 

Re: [patch 2.6.13-rc6] net/802/tr: use interrupt-safe locking

2005-08-21 Thread Andrew Morton
Jay Vosburgh [EMAIL PROTECTED] wrote:

  FWIW, this patch is currently being carried in the Fedora and RHEL
  kernels.  It certainly looks like it is necessary to me.  Can we get
  some movement on this?
 
   It's in the SuSE kernel as well.

For how long has this fix been in the vendor kernels?

Could someone please tell us why there are unmerged bugfixes in vendor
kernels?

Are there any more?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bugme-new] [Bug 5138] New: 64bit put_unaligned/get_unaligned does not work on 32bit kernel

2005-08-27 Thread Andrew Morton
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=5138
 
Summary: 64bit put_unaligned/get_unaligned does not work on 32bit
 kernel
 Kernel Version: 2.6.12
 Status: NEW
   Severity: normal
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur:2.6.11
 Distribution:any
 Hardware Environment:mips, possibly parisc, sh, sparc
 Software Environment:32bit kernel
 Problem Description:
 put_unaligned/get_unaligned in include/asm-generic/unaligned.h use 'unsigned
 long' to hold 64bit value.
 So if sizeof(long) was smaller than 8, higher 32bit will be lost. 

get_unaligned() looks OK to me.

But yes, there is a seemingly-unneeded typecast in put_unaligned() which
I think will indeed truncate 64-bit values.
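
The truncation discussed above can be reproduced in user space. The following is a minimal sketch, not the code from include/asm-generic/unaligned.h: store_via_32bit() is a hypothetical helper that funnels the value through uint32_t (standing in for 'unsigned long' on a 32-bit kernel), and store_unaligned64() is an equally hypothetical width-preserving variant that copies all eight bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a put_unaligned() that casts its value to
 * 'unsigned long' on a 32-bit kernel: uint32_t models the 32-bit long. */
static uint64_t store_via_32bit(uint64_t val)
{
	uint32_t narrowed = (uint32_t)val;	/* upper 32 bits are lost here */

	return (uint64_t)narrowed;
}

/* Width-preserving variant: copy all eight bytes to an address that is
 * deliberately misaligned, then read them back the same way. */
static uint64_t store_unaligned64(uint64_t val)
{
	unsigned char buf[sizeof(uint64_t) + 1];
	uint64_t out;

	memcpy(buf + 1, &val, sizeof(val));	/* buf + 1 is odd-aligned */
	memcpy(&out, buf + 1, sizeof(out));
	assert(out == val);			/* round trip keeps all 64 bits */
	return out;
}
```

Passing 0x1122334455667788 through store_via_32bit() leaves only 0x55667788, which is the "higher 32bit will be lost" symptom from the report; store_unaligned64() returns the value unchanged.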


Re: [Bugme-new] [Bug 5175] New: Kernel 2.6.13 breaks libpcap (at least on ppp)

2005-09-02 Thread Andrew Morton

(Full bug record.  All replies will go into bugzilla - please trim text)

[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=5175
 
Summary: Kernel 2.6.13 breaks libpcap (at least on ppp)
 Kernel Version: 2.6.13
 Status: NEW
   Severity: normal
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: 2.6.12
 Distribution: Fedora Core 2 (with some updates)
 Hardware Environment: Pentium III, dialup (serial port modem: ppp)
 Software Environment: Linux, gcc 3.3.3, tcpdump/libpcap built from the
   source RPM (Fedora development) for version 3.9.1)
 Problem Description: tcpdump/libpcap not able to filter packets during capture
 
 Steps to reproduce:
 
 I use a tcptraceroute programme (well, I did until I updated to kernel 
 2.6.13).
 It sends TCP SYN/ACK and captures the ICMP error messages returned.
 I use(d) traceproto and tcptraceroute.
 In kernel 2.6.13 they do not work (the standard, UDP, traceroute which comes
 with FC2 does work).
 Both use libnet and libpcap.
 
 libpcap can capture packets:
 
   tcpdump -w 1.cap
 
 works
 
   and I can extract the ICMP packets when I write the captured packets to a 
 file.
 
   tcpdump -f ip proto \icmp -r 1.cap
 
 works.
 
 
 However, it cannot filter the packets as it captures:
 
   tcpdump -f ip proto \icmp
 
 fails (as does tethereal, but in tethereal I can capture all the packets and 
 use
 a '-R', read filter, to capture all packets, which works, but only display the
 ones I want = so I had to change a script from capture filter to read 
 filter).
 
 I would guess that trying to filter out the ICMP (for the time exceeded error
 messages) is failing in traceproto and tcptraceroute (but why not in the
 standard UDP traceroute while traceproto fails both in TCP and UDP modes?).
 


Re: Kernel 2.6.13 breaks libpcap (and tcpdump).

2005-09-02 Thread Andrew Morton
John McGowan [EMAIL PROTECTED] wrote:

 Kernel 2.6.13. Breaks libpcap.
 
 Fedora Core 2, gcc 3.3.3, Pentium III (933MHz)
 
 I had written about my dismay that traceproto and tcptraceroute
 no longer worked and suspected that libnet was broken.
 
 It seems that it is libpcap that is broken by kernel 2.6.13 and
 tcpdump itself no longer works.
 Well, it works ... but not correctly.
 
  Capture data, then look for ICMP messages
  (e.g. Time Exceeded errors as in a traceroute)
  by filtering the file.
  
   tcpdump -w 1.cap
   tcpdump -f ip proto \icmp -r 1.cap
 
 That works.
 
 
  Filter incoming data, looking for ICMP messages:
  
   tcpdump -f ip proto \icmp
  
 Well, that catches nothing.
 
 
 I tried recompiling (source RPM, Fedora Core 2) tcpdump
 (libpcap, tcpdump, etc.) and reinstalling. That did not
 fix the problem with tcpdump.
 
 It also broke a tethereal script I was using (which I changed
 to capture all packets, which works as indicated above, and
 then used a '-R', read, filter to display the ones I want).
 

(cc netdev)


Fw: [Bugme-new] [Bug 5182] New: 2.6.13-git3 won't compile on firewall

2005-09-03 Thread Andrew Morton


Begin forwarded message:

Date: Sat, 3 Sep 2005 13:26:30 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 5182] New: 2.6.13-git3 won't compile on firewall


http://bugzilla.kernel.org/show_bug.cgi?id=5182

   Summary: 2.6.13-git3 won't compile on firewall
Kernel Version: 2.6.13-git3
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur:2.6.12-8-686 (standard debian)
Distribution:debian-sarge
Hardware Environment:P3-800Mhz 256Mb RAM dual ethernet e100 IDE disk
Software Environment: firewall (nat)
Problem Description:
kernel compile fails:
make[2]: `arch/i386/kernel/asm-offsets.s' is up to date.
  CHK include/linux/compile.h
  CHK usr/initramfs_list
  GEN .version
  CHK include/linux/compile.h
  UPD include/linux/compile.h
  CC  init/version.o
  LD  init/built-in.o
  LD  .tmp_vmlinux1
net/built-in.o: In function `ip_ct_port_tuple_to_nfattr':
: undefined reference to `__nfa_fill'
net/built-in.o: In function `ip_ct_port_tuple_to_nfattr':
: undefined reference to `__nfa_fill'
net/built-in.o: In function `tcp_to_nfattr':
ip_conntrack_proto_tcp.c:(.text+0x5abf1): undefined reference to `__nfa_fill'
net/built-in.o: In function `icmp_tuple_to_nfattr':
ip_conntrack_proto_icmp.c:(.text+0x5c87f): undefined reference to `__nfa_fill'
ip_conntrack_proto_icmp.c:(.text+0x5c8b0): undefined reference to `__nfa_fill'
net/built-in.o:ip_conntrack_proto_icmp.c:(.text+0x5c8e1): more undefined
references to `__nfa_fill' follow
make[1]: *** [.tmp_vmlinux1] Error 1
make[1]: Leaving directory `/usr/src/linux-2.6.13-git3'
make: *** [stamp-build] Error 2


Steps to reproduce:
use this dotconfig file:
http://www.dth.net/kernel/dotconfig-2.6.13-git3-firewall_wont_compile



Fw: [Bugme-new] [Bug 5194] New: IPSec related OOps in 2.6.13

2005-09-06 Thread Andrew Morton


Begin forwarded message:

Date: Tue, 6 Sep 2005 03:49:57 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 5194] New: IPSec related OOps in 2.6.13


http://bugzilla.kernel.org/show_bug.cgi?id=5194

   Summary: IPSec related OOps in 2.6.13
Kernel Version: 2.6.13
Status: NEW
  Severity: high
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur: 2.6.12
Distribution: Slackware

Software Environment:

Linux gate 2.6.13 #1 Sat Sep 3 11:32:13 CEST 2005 i686 unknown

Gnu C  3.3.5
Gnu make   3.80
binutils   2.15.92.0.2
util-linux 2.11z
mount  2.11z
module-init-tools  3.1
e2fsprogs  1.35
reiserfsprogs  line
reiser4progs   line
Linux C Library2.3.5
Dynamic linker (ldd)   2.3.5
Linux C++ Library  5.0.7
Procps 3.1.8
Net-tools  1.60
Kbd1.08
Sh-utils   2.0
Modules Loaded

Problem Description:

Oops:  [#1]
PREEMPT
Modules linked in:
CPU:0
EIP:0060:[c01f562c]Not tainted VLI
EFLAGS: 00010216   (2.6.13)
EIP is at sha1_update+0x7c/0x160
eax: dce92e6c   ebx: 0014   ecx: 0005   edx: 0104
esi: 907529d5   edi: dce92eb4   ebp: 907529d5   esp: c04c5c98
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c04c5000 task=c03eeb80)
Stack: dce92e74 dbe09db4 c04c5ca4     
          
          
Call Trace:
 [c01f39e0] update+0x80/0xb0
 [c01f4106] crypto_hmac_update+0x26/0x40
 [c036d370] skb_icv_walk+0xf0/0x200
 [c01f4071] crypto_hmac_init+0xd1/0x140
 [c0348a23] esp_hmac_digest+0x93/0xf0
 [c01f40e0] crypto_hmac_update+0x0/0x40
 [c01f3644] cbc_encrypt+0x54/0x60
 [c0347ecb] esp_output+0x38b/0x4a0
 [c0366e1a] xfrm4_output+0x7a/0x1a0
 [c031537b] ip_forward+0x17b/0x2e0
 [c03154e0] ip_forward_finish+0x0/0x60
 [c0313a96] ip_rcv+0x266/0x520
 [c0313f30] ip_rcv_finish+0x0/0x2d0
 [c02e5918] netif_receive_skb+0x198/0x240
 [c02e5a3f] process_backlog+0x7f/0x100
 [c02e5b4e] net_rx_action+0x8e/0x1c0
 [c011f7cd] __do_softirq+0x8d/0xa0
 [c0105493] do_softirq+0x63/0x70
 ===
 [c011f8a8] irq_exit+0x38/0x40
 [c0105359] do_IRQ+0x59/0x80
 [c01035fe] common_interrupt+0x1a/0x20
 [c0241d07] acpi_processor_idle+0x123/0x299
 [c01009d8] cpu_idle+0x48/0x60
 [c044b7b7] start_kernel+0x157/0x180
 [c044b390] unknown_bootoption+0x0/0x1b0
Code: 0f 86 f9 00 00 00 8b 84 24 60 01 00 00 bb 40 00 00 00 29 f3 81 fb ff 01 00
00 8d 7c 06 1c 0f 87 c4 00 00 00 89 d9 89 ee
c1 e9 02 f3 a5 89 d9 83 e1 03 74 02 f3 a4 8b 84 24 60 01 00 00 8b b4 24
 0Kernel panic - not syncing: Fatal exception in interrupt


Steps to reproduce:
Set up IPsec and wait. Sometimes 30m, sometimes 5h.



Re: [patch 2.6.13 2/2] 3c59x: add option for using memory-mapped PCI I/O resources

2005-09-06 Thread Andrew Morton
Christoph Hellwig [EMAIL PROTECTED] wrote:

 On Tue, Sep 06, 2005 at 04:44:00PM -0400, John W. Linville wrote:
  Add module option to enable 3c59x driver to use memory-mapped PCI I/O
  resources.  This may improve performance for those devices so equipped.
  
  Add use_mmio=1 to the 3c59x module options in order to enable this
  functionality.
 
 I'm not sure a module option makes sense for this setting, except maybe
 as a debugging aid.  You should rather have a flag in the PCI IDs private
 data that can be used to enable mmio for those cards that support it.

I guess it's OK for the initial testing.  Plus we should make the new
feature default to on during initial public testing.  I'll make that
change.



Re: [patch 2.6.13 2/2] 3c59x: add option for using memory-mapped PCI I/O resources

2005-09-06 Thread Andrew Morton
John W. Linville [EMAIL PROTECTED] wrote:

 I fully intend to have a flag in the private data set based on
  the PCI ID when I accumulate some data on which devices support this
  and which don't.  So far I've only got a short list...  Do you think
  such a flag should be based on which ones work, or which ones break?

The ones which are known to work.

Bear in mind that this is an old, messy and relatively stable driver which
handles a huge number of different NICs.   Caution is the rule here.


Re: [patch 2.6.13 2/2] 3c59x: add option for using memory-mapped PCI I/O resources

2005-09-06 Thread Andrew Morton
John W. Linville [EMAIL PROTECTED] wrote:

 On Tue, Sep 06, 2005 at 03:15:46PM -0700, Andrew Morton wrote:
  John W. Linville [EMAIL PROTECTED] wrote:
  
   I fully intend to have a flag in the private data set based on
the PCI ID when I accumulate some data on which devices support this
and which don't.  So far I've only got a short list...  Do you think
such a flag should be based on which ones work, or which ones break?
  
  The ones which are known to work.
  
  Bear in mind that this is an old, messy and relatively stable driver which
  handles a huge number of different NICs.   Caution is the rule here.
 
 I definitely agree.  That is another part of why I defaulted to use_mmio=0.
 
 I'll post PCI ID based patches as I determine supported cards.
 

What I'd suggest you do is to look at enabling the feature for, say,
IS_CYCLONE and IS_TORNADO NICs.  Do that as a separate -mm patch, make sure
that an explicit `use_mmio=0' will still turn it off.

So in the style of that driver, something like:

static int use_mmio[MAX_UNITS] = { [0 ... MAX_UNITS-1] = -1, };

Then:

	if (module parm given)
		use_mmio[unit] = 1 or 0

	...

	/* Determine the default if the user didn't override us */
	if (use_mmio[unit] == -1 && (IS_CYCLONE || IS_TORNADO))
		use_mmio[unit] = 1;

	priv->use_mmio = use_mmio[unit];	(maybe)



	if (priv->use_mmio == 1)
		do mmio stuff


There's a bit to be done here, so I'll drop your initial set of patches.

btw, Donald Becker's 3c59x.c has done mmio for ages.  Suggest you take a
look in there. http://www.scyld.com/vortex.html
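
The override logic outlined above can be written out as a small, compilable sketch. This is an illustration only, not 3c59x code: resolve_use_mmio() and its is_cyclone/is_tornado parameters are hypothetical stand-ins for the per-unit module parameter and the driver's IS_CYCLONE/IS_TORNADO tests.

```c
#include <assert.h>

/*
 * Tri-state module option, as suggested above:
 *   -1  user gave no option, so the driver chooses a default
 *    0  user explicitly disabled mmio
 *    1  user explicitly enabled mmio
 * is_cyclone and is_tornado are hypothetical flags for this sketch.
 * Returns 1 if mmio should be used, 0 otherwise.
 */
static int resolve_use_mmio(int user_setting, int is_cyclone, int is_tornado)
{
	assert(user_setting >= -1 && user_setting <= 1);

	/* Determine the default if the user didn't override us */
	if (user_setting == -1 && (is_cyclone || is_tornado))
		return 1;

	/* An explicit 1 enables mmio; 0 and an unset -1 on other chips do not */
	return user_setting == 1;
}
```

The point of the -1 sentinel is that an explicit use_mmio=0 can still be told apart from "option not given", so the chip-based default never overrides the user.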


Fw: masquerading failure for at least icmp and tcp+sack on amd64

2005-09-07 Thread Andrew Morton


Begin forwarded message:

Date: Tue, 6 Sep 2005 19:29:30 +0200
From: Marc Lehmann [EMAIL PROTECTED]
To: linux-kernel@vger.kernel.org
Subject: masquerading failure for at least icmp and tcp+sack on amd64


Hi!

I recently upgraded a 32 bit machine to a new amd64 board+cpu. I took the
same kernel (2.6.13-rc7) and just recompiled it for 64 bit, plus upgraded
userspace to 64 bit.

Firewall config stayed the same.

Problem: neither ping nor tcp was being masqueraded properly. I created
the following test-set-up:

   iptables -t mangle -F
   iptables -t filter -F
   iptables -t nat -F
   iptables -t nat -A POSTROUTING -p all -s 10.0.0.0/8 -d \! 10.0.0.0/8 -j 
MASQUERADE

i.e. the above masquerade rule should be the only firewall rule, and all
rules should have policy ACCEPT.

The effect was that tcp packets and icmp packets coming from 10.0.0.1 on
interface eth0 were properly masqueraded on the outgoing inet interface
(ppp0 renamed):

eth0:
   19:17:24.364351 IP 10.0.0.1.44320 > 129.13.162.95.80: S 
3745828676:3745828676(0) win 5840 <mss 1460,nop,nop,sackOK>

inet:
   19:17:24.364505 IP 84.56.237.68.44320 > 129.13.162.95.80: S 
3745828676:3745828676(0) win 5840 <mss 1452,nop,nop,sackOK>
   19:17:24.378029 IP 129.13.162.95.80 > 84.56.237.68.44320: S 
3777391404:3777391404(0) ack 3745828677 win 5840 <mss 1460,nop,nop,sackOK>
   19:17:24.378103 IP 84.56.237.68.44320 > 129.13.162.95.80: R 
3745828677:3745828677(0) win 0

However, the reverse packets were rejected. ip_conntrack showed this:

   tcp  6 52 SYN_SENT src=10.0.0.1 dst=129.13.162.95 sport=44320 dport=80 
[UNREPLIED] src=129.13.162.95 dst=84.56.237.68 sport=80 dport=44320 mark=0 use=1

ICMP echo requests were also masqueraded, but the reply was ignored.

Weird observation 1:

   ip route del default
   ip route add default via 10.0.0.17

Resulted in working masquerading, this time over device vpn0, which is
a tuntap-interface. Working means that outgoing packets were correctly
re-written with source 10.0.0.5 (local address of vpn0) and replies were
correctly un-translated.

Weird observation 2:

Some sites could be connected to with TCP. It turned out that those
sites did not support TCP SACK. Indeed, turning off SACK either on the
remote side of a connection or on the originator side resulted in working
masquerading:

eth0:
   19:23:29.928470 IP 10.0.0.1.45611 > 129.13.162.95.80: S 
4113365634:4113365634(0) win 5840 <mss 1460>
   19:23:29.942246 IP 129.13.162.95.80 > 10.0.0.1.45611: S 
4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
   19:23:29.942313 IP 10.0.0.1.45611 > 129.13.162.95.80: . ack 1 win 5840

inet:
   19:23:29.928249 IP 84.56.237.68.45611 > 129.13.162.95.80: S 
4113365634:4113365634(0) win 5840 <mss 1452>
   19:23:29.942199 IP 129.13.162.95.80 > 84.56.237.68.45611: S 
4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
   19:23:29.942332 IP 84.56.237.68.45611 > 129.13.162.95.80: . ack 1 win 5840

However, ICMP still is not masqueraded.

Kernels that worked:

   2.6.13-rc7, 2.6.12.5, 2.6.11 and lower, compiled for x86 with gcc-3.4

Kernels that don't work:

   2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.02)

Kernel configuration was exactly the same for the 2.6.13-rc7 kernels,
modulo the cpu and architecture selections.

I have a somewhat nontrivial source routing set-up on that machine that I
could document more if that could be a possible reason for that problem. I
am confident that this is not a configuration error, as the configuration
worked basically unchanged since the 2.4 days, and I am confident it's not
an iptables setup problem either, as I can reproduce it with empty rules
except for the masquerading rule.

I did not mention UDP because I didn't test it, but it's likely that UDP
masquerading also fails.

Any idea at what I could look at or try out to find out more about this
problem?

-- 
The choice of a
  -==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Fw: [Bugme-new] [Bug 5200] New: Wrong source IPv6 address selected for destination

2005-09-07 Thread Andrew Morton


Begin forwarded message:

Date: Wed, 7 Sep 2005 06:14:45 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 5200] New: Wrong source IPv6 address selected for 
destination


http://bugzilla.kernel.org/show_bug.cgi?id=5200

   Summary: Wrong source IPv6 address selected for destination
Kernel Version: 2.6.12
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Distribution: Fedora Core 4
Hardware Environment: x86_64

Problem Description:

This problem has been filed on FC4 and is reported here as upstream fix
(ie. in kernel) is needed.
Original report:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166636

There is a host with normal and 6to4 IPv6 connectivity and
two global IPv6 addresses configured on eth0:

2: eth0: BROADCAST,MULTICAST,UP mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0f:ea:61:98:cc brd ff:ff:ff:ff:ff:ff
inet 192.168.253.2/24 brd 192.168.253.255 scope global eth0
inet6 2002:5580:5ba8:1:20f:eaff:fe61:98cc/64 scope global dynamic
   valid_lft 86397sec preferred_lft 43197sec
inet6 2001:5c0:8a70:0:20f:eaff:fe61:98cc/64 scope global dynamic
   valid_lft 2591997sec preferred_lft 604797sec
inet6 fe80::20f:eaff:fe61:98cc/64 scope link
   valid_lft forever preferred_lft forever

When trying to connect to a site with a normal global (i.e. 2001:...) IP
address, the host chooses the 6to4 source address and the TCP connection is
not established - it is stuck in SYN_SENT state:

tcp0  1 2002:5580:5ba8:1:20f::43588 2001:5c0:8a70::1:25 
SYN_SENT
tcp0  1 2002:5580:5ba8:1:20f::32945 2001:200:0:8002:203:47ff:80 
SYN_SENT

Please note - this problem occurs on end node. Linux Box working as router to
IPv6 and 6to4 seems to work fine.

Adding route using ip route add ... src .. for 2001: and 2002: hasn't helped.

Version-Release number of selected component (if applicable):
kernel-2.6.12-1.1398_FC4
quick code review of vanilla 2.6.13 kernel shows that the problem also
is there

How reproducible:
always

Steps to Reproduce:
1. Setup linux box with normal and 6to4 connectivity
2. Try to connect to normal IPv6 site
3. If the above step works - try to connect to 6to4 site as
   kernel seems to pick up the first IPv6 address configured
   on interface
4. Step 2 or step 3 fails
  
Actual results:
Some connections from host with normal and 6to4 connectivity do
not work.

Expected results:
All connections from host with normal and 6to4 connectivity
work.

Additional info:
Kernel should choose normal global IPv6 source address for connections
to normal global IPv6 addresses. 
Kernel should choose 6to4 global IPv6 source address for connections
to 6to4 global IPv6 addresses.
Rules for IPv6 source address selection are described in RFC 3484 (RFC 3056 covers 6to4 itself). For now
it looks like kernel picks the first address with the same scope as
destination found on outgoing interface.
It looks like USAGI project has this implemented in this patch:
ftp://ftp.linux-ipv6.org/pub/usagi/stable/split/usagi-linux26-stable-20050714-2.6.10.diff.bz2
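
The behaviour asked for above (a 2001: source for 2001: destinations, a 2002: source for 2002: destinations) can be illustrated with a longest-matching-prefix comparison, which is one of the source address selection rules in RFC 3484. The following is a simplified user-space sketch under that assumption; it is neither the kernel's nor USAGI's implementation, it ignores the other selection rules (scope, labels, deprecated addresses), and pick_source() and common_prefix_len() are hypothetical names.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>

/* Number of leading bits shared by two IPv6 addresses (0..128). */
static int common_prefix_len(const struct in6_addr *a, const struct in6_addr *b)
{
	int bits = 0;

	for (size_t i = 0; i < 16; i++) {
		unsigned char diff = a->s6_addr[i] ^ b->s6_addr[i];

		if (diff == 0) {
			bits += 8;
			continue;
		}
		while (!(diff & 0x80)) {	/* count matching high bits */
			bits++;
			diff <<= 1;
		}
		break;
	}
	return bits;
}

/*
 * Return the index of the candidate source address that shares the longest
 * prefix with dst, or -1 if any address fails to parse. On a tie the
 * earlier candidate is kept.
 */
static int pick_source(const char *const *candidates, size_t n, const char *dst)
{
	struct in6_addr d, c;
	int best = -1, best_len = -1;

	assert(n > 0);
	if (inet_pton(AF_INET6, dst, &d) != 1)
		return -1;

	for (size_t i = 0; i < n; i++) {
		int len;

		if (inet_pton(AF_INET6, candidates[i], &c) != 1)
			return -1;
		len = common_prefix_len(&c, &d);
		if (len > best_len) {
			best_len = len;
			best = (int)i;
		}
	}
	return best;
}
```

With the two global addresses from the eth0 listing as candidates, a destination of 2001:5c0:8a70::1 selects the 2001:5c0:8a70:0:... address, while any 2002::/16 destination selects the 6to4 address.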



Fw: [Bugme-new] [Bug 5201] New: Badness in dst_release at include/net/dst.h:154

2005-09-07 Thread Andrew Morton


Begin forwarded message:

Date: Wed, 7 Sep 2005 06:16:22 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 5201] New: Badness in dst_release at 
include/net/dst.h:154


http://bugzilla.kernel.org/show_bug.cgi?id=5201

   Summary: Badness in dst_release at include/net/dst.h:154
Kernel Version: Linux version 2.6.13 ([EMAIL PROTECTED]) (gcc
version 3.4.
Status: NEW
  Severity: high
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur:
[unknown]

Distribution:
Sorcerer

Hardware Environment:
# cat /proc/ioports
-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial
0376-0376 : ide1
0378-037a : parport0
037b-037f : parport0
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial
0778-077a : parport0
0cf8-0cff : PCI conf1
a000-a03f : :00:0d.0
 a000-a03f : e100
a400-a43f : :00:0b.0
 a400-a43f : e100
a800-a83f : :00:0a.0
 a800-a83f : e100
b000-b03f : :00:09.0
 b000-b03f : e100
b400-b41f : :00:04.2
 b400-b41f : uhci_hcd
b800-b80f : :00:04.1
 b800-b807 : ide0
 b808-b80f : ide1
d000-dfff : PCI Bus #01
 d800-d8ff : :01:00.0
e400-e43f : :00:04.3
 e400-e43f : motherboard
   e400-e403 : PM1a_EVT_BLK
   e404-e405 : PM1a_CNT_BLK
   e408-e40b : PM_TMR
   e40c-e40f : GPE0_BLK
   e410-e415 : ACPI CPU throttle
e800-e81f : :00:04.3
 e800-e80f : motherboard
   e800-e80f : pnp 00:02
 e800-e807 : piix4-smbus

# cat /proc/iomem
-0009 : System RAM
000a-000b : Video RAM area
000c-000c7fff : Video ROM
000c8000-000c8fff : Adapter ROM
000cc000-000ccfff : Adapter ROM
000d-000d0fff : Adapter ROM
000d4000-000d57ff : Adapter ROM
000f-000f : System ROM
0010-1fffbfff : System RAM
 0010-004a9e9a : Kernel code
 004a9e9b-0060a4a7 : Kernel data
1fffc000-1fffefff : ACPI Tables
1000-1fff : ACPI Non-volatile Storage
2000-200f : :00:09.0
2010-201f : :00:0a.0
2020-202f : :00:0b.0
2030-203f : :00:0d.0
de00-de0f : :00:0d.0
 de00-de0f : e100
de80-de800fff : :00:0d.0
 de80-de800fff : e100
df00-df0f : :00:0b.0
 df00-df0f : e100
df80-df800fff : :00:0b.0
 df80-df800fff : e100
e000-e00f : :00:0a.0
 e000-e00f : e100
e080-e0800fff : :00:0a.0
 e080-e0800fff : e100
e100-e10f : :00:09.0
 e100-e10f : e100
e180-e1800fff : :00:09.0
 e180-e1800fff : e100
e200-e2af : PCI Bus #01
 e200-e2000fff : :01:00.0
e2f0-e3ff : PCI Bus #01
 e2f0-e2f1 : :01:00.0
 e300-e3ff : :01:00.0
e400-e7ff : :00:00.0
- : reserved

# lspci -vvv
00:00.0 Host bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 
03)
   Subsystem: Asustek Computer, Inc.: Unknown device 8024
   Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
   Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium TAbort- 
TAbort- 
MAbort+ SERR- PERR-
   Latency: 64
   Region 0: Memory at e400 (32-bit, prefetchable) [size=64M]
   Capabilities: [a0] AGP version 1.0
   Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 
64bit- FW- AGP3- Rate=x1,x2
   Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- 
Rate=none

00:01.0 PCI bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03) 
(prog-if 00 [Normal decode])
   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
   Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium TAbort- 
TAbort- 
MAbort- SERR- PERR-
   Latency: 64
   Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
   I/O behind bridge: d000-dfff
   Memory behind bridge: e200-e2af
   Prefetchable memory behind bridge: e2f0-e3ff
   BridgeCtl: Parity- SERR- NoISA- VGA+ MAbort- Reset- FastB2B+

00:04.0 ISA bridge: Intel Corp. 82371AB/EB/MB PIIX4 ISA (rev 02)
   Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
   Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- 
TAbort- 
MAbort- SERR- PERR-
   Latency: 0

00:04.1 IDE interface: Intel Corp. 82371AB/EB/MB PIIX4 IDE (rev 01) (prog-if 80 
[Master])
   Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
   Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- 
TAbort- 
MAbort- SERR- PERR-
   Latency: 32
   Region 4: I/O ports at b800 [size=16]

00:04.2 USB Controller: Intel Corp. 82371AB/EB/MB PIIX4 USB (rev 01) (prog-if 
00 
[UHCI])
 

Fw: PROBLEM: Badness in dst_release at include/net/dst.h:154

2005-09-07 Thread Andrew Morton

I think this got fixed?

If so, is the fix queued for 2.6.13.1?

Thanks.


Begin forwarded message:

Date: Wed, 07 Sep 2005 15:26:50 +0300
From: Ady Deac [EMAIL PROTECTED]
To: linux-kernel@vger.kernel.org
Subject: PROBLEM: Badness in dst_release at include/net/dst.h:154


[1.] One line summary of the problem:   
I am using linux and quagga (latest release) for a small network. It 
does load-balancing between 3 providers.

[2.] Full description of the problem/report:
If I only make a default route (the hard way - route add default gw
xxx.xxx.xxx.xxx) everything is more than OK, but AFAIK this is not the
way to handle routes in BGP. So I registered the three gateways in
zebra as default routes. After a couple of minutes, the kernel oopses
(see [5.]).
I googled the problem and only one article showed up:
https://www.redhat.com/archives/fedora-test-list/2005-May/msg00373.html

Maybe, if we find out the problem, we can let the poor guy know ;)

[3.] Keywords (i.e., modules, networking, kernel):
kernel, networking

[4.] Kernel version (from /proc/version):
Linux version 2.6.13 ([EMAIL PROTECTED]) (gcc version 3.4.4) #1 
Sun Sep 4 03:34:46 EEST 2005

[5.] Output of Oops.. message (if applicable) with symbolic information
 resolved (see Documentation/oops-tracing.txt)
...
Sep  7 12:57:40 router.mikesnet.ro kernel: Badness in dst_release at 
include/net/dst.h:154
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c03f2fcd] 
__kfree_skb+0x16d/0x180
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c0457e2d] 
arp_process+0x8d/0x570
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c0458404] arp_rcv+0xf4/0x180
Sep  7 12:57:41 router.mikesnet.ro kernel:   [c0457da0 arp_process.t ] 
arp_process+0x0/0x570
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c03f906b] 
netif_receive_skb+0x2db/0x360
Sep  7 12:57:41 router.mikesnet.ro kernel:  [e090a1f0] 
e100_poll+0x420/0x780 [e100]
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c03fe03a] 
neigh_periodic_timer+0xea/0x1c0
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c03f931f] 
net_rx_action+0x12f/0x1c0
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c0127621] 
__do_softirq+0x41/0xa0
Sep  7 12:57:41 router.mikesnet.ro kernel:  [c01276a6] 
do_softirq+0x26/0x30
Sep  7 12:57:42 router.mikesnet.ro kernel:  [c0127765] irq_exit+0x35/0x40
Sep  7 12:57:42 router.mikesnet.ro kernel:  [c010580e] do_IRQ+0x1e/0x30
Sep  7 12:57:42 router.mikesnet.ro kernel:  [c0103c1a] 
common_interrupt+0x1a/0x20
Sep  7 12:57:42 router.mikesnet.ro kernel:  [c030cc55] 
acpi_processor_idle+0xff/0x27f
Sep  7 12:57:42 router.mikesnet.ro kernel:  [c01010e2] cpu_idle+0x42/0x60
Sep  7 12:57:42 router.mikesnet.ro kernel:  [c060e87d] 
start_kernel+0x18d/0x1d0
Sep  7 12:57:42 router.mikesnet.ro kernel:   [c060e3b0 
unknown_bootoption.t ] unknown_bootoption+0x0/0x1f0
...
and this part repeats over and over again. What could be the problem?

[6.] A small shell script or example program which triggers the
 problem (if possible)
//-- zebra.conf [part]
ip route 0.0.0.0/0 85.186.56.129
ip route 0.0.0.0/0 194.105.21.65
ip route 0.0.0.0/0 212.146.86.161
//-- END

[7.] Environment
declare -x EDITOR=nano
declare -x HOME=/root
declare -x INPUTRC=/etc/inputrc
declare -x LOGNAME=root
declare -x 
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.avi=01;35:*.fli=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.ogg=01;35:*.mp3=01;35:*.wav=01;35:
declare -x OLDPWD=/root
declare -x 
PATH=/opt/gcc/current/bin:/sbin:/usr/sbin:/usr/local/sbin:/bin:/usr/bin:/usr/local/bin:/usr/games:/usr/local/games:/usr/bin/X11:/root/bin:.:/usr/busybox
declare -x 
PS1=\\[\\033[0m\\][\\[\\033[0;[EMAIL 
PROTECTED];32m\\]\\H\\[\\033[0m\\]][\\[\\033[0;33m\\]\\w\\[\\033[0m\\]]# 

declare -x PWD=/etc/quagga
declare -x SHELL=/bin/bash
declare -x SHLVL=1
declare -x TERM=xterm
declare -x USER=root

[7.1.] Software (add the output of the ver_linux script here)
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.
 
Linux router.mikesnet.ro 2.6.13 #1 Sun Sep 4 03:34:46 EEST 2005 i686 
unknown unknown GNU/Linux
 
Gnu C  3.4.4
Gnu make   3.80
binutils   2.15
util-linux 2.12q
mount  2.12q
module-init-tools  3.1
e2fsprogs  1.38
jfsutils   1.1.8
reiserfsprogs  3.6.19
reiser4progs   line
xfsprogs   2.6.36
nfs-utils  1.0.7
Linux C Library  2.3.5
Dynamic linker (ldd)   2.3.5
Procps 3.2.5
Net-tools  1.60
Kbd  

Fw: Oops in 2.6.13 (__tcp_push_pending_frames)

2005-09-07 Thread Andrew Morton

I have a feeling that I'm spamming netdev with already-fixed bugs.  But
please bear with me - it's better than letting unfixed bugs slip past ;)




Begin forwarded message:

Date: Wed, 7 Sep 2005 19:04:45 +0200
From: Peter Palfrader [EMAIL PROTECTED]
To: Linux Kernel list linux-kernel@vger.kernel.org
Subject: Oops in 2.6.13 (__tcp_push_pending_frames)


Hi,

I got the following Oops on a pristine 2.6.13:

[17179929.236000] Unable to handle kernel NULL pointer dereference at virtual 
address 0001
[17179929.236000]  printing eip:
[17179929.236000] 0001
[17179929.236000] *pde = 
[17179929.236000] Oops:  [#1]
[17179929.236000] SMP 
[17179929.236000] Modules linked in: lp autofs4 ipv6 ide_cd cdrom pcspkr analog 
parport_pc parport floppy tsdev evdev snd_via82xx usbhid gameport 
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc 
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ehci_hcd uhci_hcd 
usbcore dm_mod sg unix
[17179929.236000] CPU:0
[17179929.236000] EIP:0060:[0001]Not tainted VLI
[17179929.236000] EFLAGS: 00010292   (2.6.13-came32) 
[17179929.236000] EIP is at 0x1
[17179929.236000] eax:    ebx: 05a8   ecx: 0001   edx: 3aa5
[17179929.236000] esi: 7ed44989   edi: 04b0   ebp: 0028   esp: db031dac
[17179929.236000] ds: 007b   es: 007b   ss: 0068
[17179929.236000] Process rsync (pid: 5606, threadinfo=db03 task=c1764060)
[17179929.236000] Stack: dc1aa800 dc1aa800 1000 cad5ece0 c0423ec4 dc1aa800 
05a8  
[17179929.236000] dc1aa800 c0418593 dc1aa800 dc1aa800 05a8 
  
[17179929.236000]c041e37e dc1aa800  0004 cc723100 bff8c440 
 1000 
[17179929.236000] Call Trace:
[17179929.236000]  [c0423ec4] __tcp_push_pending_frames+0x24/0xa0
[17179929.236000]  [c0418593] tcp_sendmsg+0x323/0xb10
[17179929.236000]  [c041e37e] tcp_clean_rtx_queue+0x4ce/0x500
[17179929.236000]  [c041e850] tcp_ack+0x1d0/0x300
[17179929.236000]  [c0436fab] inet_sendmsg+0x3b/0x50
[17179929.236000]  [c03eb7e4] sock_aio_write+0xe4/0x110
[17179929.236000]  [c013e524] __alloc_pages+0x2a4/0x400
[17179929.236000]  [c0156524] do_sync_write+0xb4/0x100
[17179929.236000]  [c0168ada] poll_freewait+0x3a/0x50
[17179929.236000]  [c012f0c0] autoremove_wake_function+0x0/0x40
[17179929.236000]  [c01566ab] vfs_write+0x13b/0x150
[17179929.236000]  [c015676d] sys_write+0x3d/0x70
[17179929.236000]  [c0102c79] syscall_call+0x7/0xb
[17179929.236000] Code:  Bad EIP value.
[17179929.236000]  

Let me know if there is anything else that you need.


Cheers,
Peter
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Fw: 2.6.13-git7 strange system freeze

2005-09-08 Thread Andrew Morton


Begin forwarded message:

Date: Thu, 8 Sep 2005 14:14:39 +0200
From: Michal Piotrowski [EMAIL PROTECTED]
To: LKML linux-kernel@vger.kernel.org
Subject: 2.6.13-git7 strange system freeze


Hi,

After about 20 hours of uptime, my 2.6.13-git7 system froze. I found
this in my klog:

Sep  8 13:45:09 ng02 kernel: KERNEL: assertion ((int)tp->lost_out >=
0) failed at net/ipv4/tcp_input.c (2148)
Sep  8 13:45:09 ng02 kernel: Leak l=4294967295 4
Sep  8 13:45:20 ng02 kernel: KERNEL: assertion ((int)tp->sacked_out >=
0) failed at net/ipv4/tcp_input.c (2147)
Sep  8 13:45:20 ng02 kernel: Leak s=4294967295 4
Sep  8 13:46:21 ng02 kernel: retrans_out leaked.
Sep  8 13:48:37 ng02 kernel: KERNEL: assertion ((int)tp->sacked_out >=
0) failed at net/ipv4/tcp_input.c (2147)
Sep  8 13:49:08 ng02 last message repeated 2 times
Sep  8 13:49:41 ng02 last message repeated 3 times
Sep  8 13:49:41 ng02 kernel: Leak s=4294967295 3
Sep  8 13:49:46 ng02 kernel: KERNEL: assertion ((int)tp->sacked_out >=
0) failed at net/ipv4/tcp_input.c (2147)
Sep  8 13:49:46 ng02 kernel: Leak l=1 4
Sep  8 13:49:46 ng02 kernel: Leak s=4294967295 4
Sep  8 13:49:52 ng02 kernel: KERNEL: assertion ((int)tp->sacked_out >=
0) failed at net/ipv4/tcp_input.c (2147)
Sep  8 13:49:52 ng02 kernel: KERNEL: assertion ((int)tp->sacked_out >=
0) failed at net/ipv4/tcp_input.c (2147)
Sep  8 13:49:52 ng02 kernel: Leak l=1 4
Sep  8 13:49:52 ng02 kernel: Leak s=4294967295 4
Sep  8 13:49:52 ng02 kernel: Leak r=1 4
Sep  8 13:49:58 ng02 kernel: KERNEL: assertion ((int)tp->sacked_out >=
0) failed at net/ipv4/tcp_input.c (2147)

Regards,
Michal Piotrowski
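
A note on the numbers in the log: 4294967295 is 2^32 - 1, which is what an
unsigned 32-bit counter holds after being decremented once below zero. Read
as a signed int it is -1, which matches the failed ">= 0" assertions. The
sketch below only illustrates that reinterpretation; it assumes the "Leak"
values are printed as unsigned 32-bit numbers, and the helper name is made up
for this example.

```python
import struct

def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as signed, like a C (int) cast."""
    return struct.unpack("<i", struct.pack("<I", value))[0]

# An unsigned 32-bit counter at 0, decremented once, wraps to 2**32 - 1.
wrapped = (0 - 1) & 0xFFFFFFFF

leak_value = 4294967295          # the value printed after "Leak l=" and "Leak s="
signed_view = as_signed32(leak_value)
```

Here wrapped equals leak_value, and signed_view is negative, so a check of
the form (int)counter >= 0 would fail for it.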


Re: [PATCH] 3c59x: read current link status from phy

2005-09-08 Thread Andrew Morton
Tommy Christensen [EMAIL PROTECTED] wrote:

 In order to spare some I/O operations, be more intelligent about
  when to read from the PHY.

Seems sane.

Should we also decrease the polling interval?  Perhaps only when the cable
is unplugged?


Re: [PATCH] 3c59x: read current link status from phy

2005-09-09 Thread Andrew Morton
Tommy Christensen [EMAIL PROTECTED] wrote:

 John W. Linville wrote:
  Any chance you could re-diff this to apply on top of the patch posted
  earlier today by Neil Horman?
 
 Sure, but his patch didn't apply to -git8.
 
 If Neil would please resend, then I can diff against that.
 

Is OK, I'll sort it all out.


Re: [PATCH]dgrs - Fixes Warnings when CONFIG_ISA and CONFIG_PCI are not enabled

2005-11-05 Thread Andrew Morton
Richard Knutsson [EMAIL PROTECTED] wrote:

  BTW, can anyone ack or is that up to the maintainers?

It's useful info - it shows that someone else took the time to review the
code.

  BTW #2, why not remove #ifdef CONFIG_PCI on dgrs_cleanup_module() at the 
  same time? Or maybe that should be in a remove config_pci-patch...

yup.  There are lots of opportunities for that, I bet.


Fw: [Bugme-new] [Bug 5591] New: KERNEL: assertion (!sk->sk_forward_alloc) failed at net/core/stream.c (279)

2005-11-12 Thread Andrew Morton


Begin forwarded message:

Date: Fri, 11 Nov 2005 04:39:23 -0800
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 5591] New: KERNEL: assertion (!sk->sk_forward_alloc)
failed at net/core/stream.c (279)


http://bugzilla.kernel.org/show_bug.cgi?id=5591

   Summary: KERNEL: assertion (!sk->sk_forward_alloc) failed at
net/core/stream.c (279)
Kernel Version: 2.6.14
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur:
Distribution: Rhel 4
Hardware Environment: 2 Xeon 2.8 8Gb Ram, e1000 Network card, LSI Scsi
Software Environment: Apache, Sendmail, Squid
Problem Description: Kernel Assertion

KERNEL: assertion (!sk->sk_forward_alloc) failed at net/core/stream.c (279)
KERNEL: assertion (!sk->sk_forward_alloc) failed at net/ipv4/af_inet.c (148)
KERNEL: assertion (!sk->sk_forward_alloc) failed at net/core/stream.c (279)
KERNEL: assertion (!sk->sk_forward_alloc) failed at net/ipv4/af_inet.c (148)
swapper: page allocation failure. order:1, mode:0x20
 [c014b712] __alloc_pages+0x2c2/0x4d0
 [c014ec31] kmem_getpages+0x31/0xa0
 [c014fb5f] cache_grow+0xcf/0x190
 [c014fcfe] cache_alloc_refill+0xde/0x210
 [c0150104] __kmalloc+0x74/0x80
 [c033f0c3] __alloc_skb+0x53/0x140
 [c0370da5] tcp_collapse+0xf5/0x3a0
 [c037117c] tcp_prune_queue+0x9c/0x1f0
 [c03704fe] tcp_data_queue+0x3ee/0xba0
 [c0371a94] tcp_rcv_established+0x244/0x8d0
 [c035e4f0] ip_local_deliver_finish+0x0/0x190
 [c037a87a] tcp_v4_do_rcv+0x14a/0x150
 [c037af72] tcp_v4_rcv+0x6f2/0x980
 [c035ddb7] ip_local_deliver+0xd7/0x230
 [c035e4f0] ip_local_deliver_finish+0x0/0x190
 [c035e1b1] ip_rcv+0x2a1/0x5e0
 [c035e680] ip_rcv_finish+0x0/0x2f0
 [c03449a7] __net_timestamp+0x17/0x30
 [c03454b4] netif_receive_skb+0x164/0x200
 [f88ee900] e1000_clean_rx_irq+0x190/0x520 [e1000]
 [f88ee10d] e1000_clean+0x4d/0x100 [e1000]
 [c03456dc] net_rx_action+0x7c/0x120
 [c0124f39] __do_softirq+0xd9/0xf0
 [c0124f85] do_softirq+0x35/0x40
 [c0125065] irq_exit+0x45/0x50
 [c010546e] do_IRQ+0x1e/0x30
 [c0103b52] common_interrupt+0x1a/0x20
 [c0101052] mwait_idle+0x52/0x80
 [c0100d70] default_idle+0x0/0x30
 [c02501a8] acpi_processor_idle+0x1b4/0x2e1
 [c0100d70] default_idle+0x0/0x30
 [c0100e4f] cpu_idle+0x6f/0x80
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1 used:5
cpu 0 cold: low 0, high 2, batch 1 used:1
cpu 1 hot: low 2, high 6, batch 1 used:5
cpu 1 cold: low 0, high 2, batch 1 used:0
cpu 2 hot: low 2, high 6, batch 1 used:5
cpu 2 cold: low 0, high 2, batch 1 used:1
cpu 3 hot: low 2, high 6, batch 1 used:5
cpu 3 cold: low 0, high 2, batch 1 used:1
Normal per-cpu:
cpu 0 hot: low 62, high 186, batch 31 used:173
cpu 0 cold: low 0, high 62, batch 31 used:32
cpu 1 hot: low 62, high 186, batch 31 used:74
cpu 1 cold: low 0, high 62, batch 31 used:38
cpu 2 hot: low 62, high 186, batch 31 used:100
cpu 2 cold: low 0, high 62, batch 31 used:33
cpu 3 hot: low 62, high 186, batch 31 used:163
cpu 3 cold: low 0, high 62, batch 31 used:55
HighMem per-cpu:
cpu 0 hot: low 62, high 186, batch 31 used:94
cpu 0 cold: low 0, high 62, batch 31 used:4
cpu 1 hot: low 62, high 186, batch 31 used:158
cpu 1 cold: low 0, high 62, batch 31 used:11
cpu 2 hot: low 62, high 186, batch 31 used:64
cpu 2 cold: low 0, high 62, batch 31 used:10
cpu 3 hot: low 62, high 186, batch 31 used:142
cpu 3 cold: low 0, high 62, batch 31 used:0
Free pages: 5121616kB (5080776kB HighMem)
Active:895272 inactive:54135 dirty:2192 writeback:1 unstable:0 free:1280404
slab:99963 mapped:251522 pagetables:2705
DMA free:3548kB min:68kB low:84kB high:100kB active:0kB inactive:0kB
present:16384kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 880 9968
Normal free:37292kB min:3756kB low:4692kB high:5632kB active:239876kB
inactive:136544kB present:901120kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 72704
HighMem free:5080776kB min:512kB low:640kB high:768kB active:3341212kB
inactive:79996kB present:9306112kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 95*4kB 0*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB
0*4096kB = 3548kB
Normal: 9147*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB
0*2048kB 0*4096kB = 37292kB
HighMem: 1724*4kB 3621*8kB 1169*16kB 519*32kB 309*64kB 45*128kB 13*256kB 6*512kB
7*1024kB 1*2048kB 1213*4096kB = 5080776kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Free swap  = 1485972kB
Total swap = 1485972kB
Free swap:   1485972kB
2555904 pages of RAM
2326528 pages of HIGHMEM
217994 reserved pages
1303991 pages shared
0 pages swap cached
2192 pages dirty
1 pages writeback
251522 pages mapped
99963 pages slab
2705 pages pagetables
swapper: page allocation failure. order:1, mode:0x20
 [c014b712] __alloc_pages+0x2c2/0x4d0
 [c014ec31] kmem_getpages+0x31/0xa0
 [c014f987] alloc_slabmgmt+0x57/0x70
 [c014fb5f] cache_grow+0xcf/0x190
 [c014fcfe] cache_alloc_refill+0xde/0x210
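
For reading the Mem-info dump above: an order:1 request needs 2^1 contiguous
pages, and the per-zone "N*SkB" lines list how many free blocks of each size
exist. The sketch below parses the Normal line from this report and checks
that it adds up to the reported 37292kB. It assumes 4 kB pages (as the
smallest block size in the dump suggests), and the helper names are invented
for this example. It does not model watermarks or reserves, so it cannot say
by itself why the allocation failed.

```python
import re

def parse_buddy_line(line):
    """Return {block_size_kB: free_block_count} from a 'Zone: N*SkB ...' line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

def total_free_kb(blocks):
    """Sum the free memory described by the per-size block counts."""
    return sum(size * count for size, count in blocks.items())

def blocks_at_least(blocks, order, page_kb=4):
    """Count free blocks large enough for an order-N request (2**N pages)."""
    need_kb = page_kb << order
    return sum(count for size, count in blocks.items() if size >= need_kb)

# The Normal zone line from the report, joined onto one logical line.
normal = parse_buddy_line(
    "Normal: 9147*4kB 0*8kB 0*16kB 0*32kB 1*64kB 1*128kB 0*256kB "
    "1*512kB 0*1024kB 0*2048kB 0*4096kB = 37292kB")

total = total_free_kb(normal)
big_enough = blocks_at_least(normal, order=1)
```

Almost all of the free Normal memory is in single 4 kB pages (9147 of them);
only three free blocks are 8 kB or larger.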
 

Fw: [2.6.14.2] Debug: sleeping function called from invalid context at mm/slab.c:2459

2005-11-14 Thread Andrew Morton

I think this got fixed, didn't it?

If so, should we backport the fix into 2.6.14.x?


Begin forwarded message:

Date: Mon, 14 Nov 2005 20:30:21 +0100
From: Frank van Maarseveen [EMAIL PROTECTED]
To: linux-kernel@vger.kernel.org
Subject: [2.6.14.2] Debug: sleeping function called from invalid context at 
mm/slab.c:2459


2.6.14.2 on an AMD Athlon X2 3800

Nov 14 20:17:30 iapetus kernel: in_atomic():1, irqs_disabled():0
Nov 14 20:17:30 iapetus kernel:  [c010410e] dump_stack+0x1e/0x20
Nov 14 20:17:30 iapetus kernel:  [c01211c5] __might_sleep+0xa5/0xb0
Nov 14 20:17:30 iapetus kernel:  [c01512cd] __kmalloc+0xdd/0x100
Nov 14 20:17:30 iapetus kernel:  [c041d8bd] pskb_expand_head+0x4d/0x150
Nov 14 20:17:30 iapetus kernel:  [c043bbc7] netlink_broadcast+0x387/0x3c0
Nov 14 20:17:30 iapetus kernel:  [c04a4673] nfnetlink_send+0x63/0xa0
Nov 14 20:17:30 iapetus kernel:  [c0484d20] 
ctnetlink_conntrack_event+0x3a0/0xac0
Nov 14 20:17:30 iapetus kernel:  [c013239d] notifier_call_chain+0x2d/0x50
Nov 14 20:17:30 iapetus kernel:  [c047fadb] destroy_conntrack+0x12b/0x190
Nov 14 20:17:30 iapetus kernel:  [c0480b35] ip_conntrack_in+0x1a5/0x360
Nov 14 20:17:30 iapetus kernel:  [c047eda6] ip_conntrack_local+0x66/0x70
Nov 14 20:17:30 iapetus kernel:  [c04a2fd8] nf_iterate+0x68/0xb0
Nov 14 20:17:30 iapetus kernel:  [c04a308d] nf_hook_slow+0x6d/0x140
Nov 14 20:17:30 iapetus kernel:  [c04460ab] ip_queue_xmit+0x47b/0x5f0
Nov 14 20:17:30 iapetus kernel:  [c0457402] tcp_transmit_skb+0x442/0x6e0
Nov 14 20:17:30 iapetus kernel:  [c0459fb7] tcp_connect+0x2f7/0x380
Nov 14 20:17:30 iapetus kernel:  [c045c0c7] tcp_v4_connect+0x627/0xb70
Nov 14 20:17:30 iapetus kernel:  [c046bddd] inet_stream_connect+0x7d/0x1a0
Nov 14 20:17:30 iapetus kernel:  [c0419328] sys_connect+0x78/0xa0
Nov 14 20:17:30 iapetus kernel:  [c0419db3] sys_socketcall+0xa3/0x240
Nov 14 20:17:30 iapetus kernel:  [c01031fb] sysenter_past_esp+0x54/0x75


-- 
Frank


Fw: AIM7 fails with 2.6.18-rc5-mm1

2006-09-05 Thread Andrew Morton

We think this is a net bug.


Begin forwarded message:

Date: Mon, 4 Sep 2006 17:02:22 -0700 (PDT)
From: Christoph Lameter [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: linux-kernel@vger.kernel.org
Subject: AIM7 fails with 2.6.18-rc5-mm1


On an 8p Altix. 6 GB Ram

AIM Multiuser Benchmark - Suite VII Run Beginning

Tasks    jobs/min  jti  jobs/min/task   real    cpu
    1     2435.06  100      2435.0649   2.46   0.02   Mon Sep  4 10:17:44 2006
  100   178784.27   94      1787.8427   3.36   7.08   Mon Sep  4 10:17:58 2006
  200   280636.11   95      1403.1805   4.28  14.46   Mon Sep  4 10:18:15 2006
  300   340973.67   91      1136.5789   5.28  22.35   Mon Sep  4 10:18:37 2006
  400   382897.26   82       957.2431   6.27  30.44   Mon Sep  4 10:19:03 2006
  500   413793.10   86       827.5862   7.25  38.14   Mon Sep  4 10:19:33 2006
  600   434940.20   89       724.9003   8.28  46.43   Mon Sep  4 10:20:07 2006
  700
Fatal error 98 at line 284 of file pipe_test.c: bind on write -- Address already in use

Child #489: : Address already in use

Failed to execute
udp_test 100

Fatal error 98 at line 264 of file pipe_test.c: bind on write -- Address already in use

Child #286: : Address already in use

Failed to execute
udp_test 100

etc etc

Is this a known issue?
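
For reference, "Address already in use" is EADDRINUSE (errno 98 on Linux),
the error bind() returns when the requested local address is already taken.
The sketch below reproduces that error in the simplest possible way, by
binding two UDP sockets to the same loopback port. It is only an
illustration of the error code; it makes no claim about what AIM7's
udp_test actually does or why it ran into the error at 700 tasks. The
function name is made up for this example.

```python
import errno
import socket

def try_double_bind():
    """Bind two UDP sockets to one loopback port.

    Returns the errno raised by the second bind, or 0 if it succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))          # port 0: kernel picks a free port
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))  # same address, no SO_REUSEADDR
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        first.close()
        second.close()

result = try_double_bind()
```

On Linux, result should equal errno.EADDRINUSE, the same code as in the
"Fatal error 98" lines above.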



Re: [Bugme-new] [Bug 7137] New: modprobe eth modules random loading order

2006-09-09 Thread Andrew Morton
On Sat, 9 Sep 2006 21:37:07 -0700
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=7137
 
Summary: modprobe eth modules random loading order
 Kernel Version: 2.6.17.x
 Status: NEW
   Severity: high
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur:2.6.17.13
 Distribution: Crux
 
 Hardware Environment: All our P4 and PIII Servers
 
 Software Environment: Not software dependent
 Problem Description:
 When we upgraded to 2.6.17.11, our servers with multiple NICs changed the 
 order in which the NIC drivers were loaded.
 We build the NIC drivers as modules in the kernel config.
 modprobe.conf is set up as in this example.
 
 alias eth0 e100
 alias eth1 8139too
 
 After the upgrade to 2.6.17.11 the NIC which is loaded first is 8139too 
 (Realtek), which then gets eth0, 
 and second is the e100 (Intel), which gets eth1.
 
 Kernel 2.6.17.8 and earlier loaded them in the order of the modprobe.conf 
 settings.
 
 After the upgrade to 2.6.17.13 the load order of the eth modules has changed 
 again, and still doesn't follow the modprobe.conf settings.
 
 After the upgrade to 2.6.17.13 the NIC which is now loaded first is the e100 
 (Intel) as eth0, and second is 8139too (Realtek) as eth1.
 
 I have the same problem with 3com and Realtek or 3com and Intel NICs:
 the load order changes and doesn't follow the modprobe.conf settings.
 
 Steps to reproduce:
 Use two different NICs with 2.6.17.8 and configure the drivers as modules in 
 the kernel config. Set up modprobe.conf as I have described.
 Then change to 2.6.17.11.
 
 And then use 2.6.17.13 and see how the NIC drivers are loaded.
 
 It seems to work OK with 3com and tulip based (D-LINK 4-port) NICs in the 
 same machine. In this case the NICs are loaded in the same order for all 
 kernel versions.
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
 I assume that, when using modules, NIC drivers should be loaded as ethX 
 according to the settings in modprobe.conf.
 
 It is really important that eth0 is assigned as it is configured in 
 modprobe.conf and not as the kernel finds it in slot ID order.
 
 This strange behavior causes a lot of reconfiguring, as we use very complex 
 iptables rules. The iptables in/out interface options (-i, -o) will then no 
 longer be correct.
 
 And because:
 When using two different NICs, one 1 Gb/s and one 100 Mb/s, the 1 Gb/s NIC 
 should be assigned eth0, which is used for the primary heavy load, and the 
 100 Mb/s NIC is used for maintenance.
 
 --- You are receiving this mail because: ---
 You are on the CC list for the bug, or are watching someone who is.

argh.  Can anyone think what might have caused this?


Re: [Bugme-new] [Bug 7159] New: No networking on a machine with Ethernet Pro 100 and Realtek 8139

2006-09-14 Thread Andrew Morton

(Switching from bugzilla to email - please retain all Cc's)

On Thu, 14 Sep 2006 11:04:03 -0700
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=7159
 
Summary: No networking on a machine with Ethernet Pro 100 and
 Realtek 8139
 Kernel Version: 2.6.16, 2.6.17, 2.6.18-rc6
 Status: NEW
   Severity: normal
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: 2.6.8
 Distribution: Debian
 Hardware Environment: Dual-PIII, Ethernet Pro 100 and Realtek 8139 PCI 
 interfaces
 Software Environment: Debian Etch (Testing)
 Problem Description: The network is not reachable, though the kernel does seem
 to sense line presence on both interfaces.
 
 On boot, udev/discover loads e100, 8139cp and 8139too.  /etc/modules does not
 have any network modules (needs eepro100 for 2.6.8, but I removed it, no
 change).  The relevant lspci listings
 are:
 
 00:09.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] 
 (rev 05)
 00:0b.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
 RTL-8139/8139C/8139C+ (rev 10)
 
 Both interfaces work fine under 2.6.8 as long as eepro100 is loaded.
 
 More information (lspci -v, /proc/interrupts, /proc/ioports) can be found at 
 the
 Debian bug: http://bugs.debian.org/386972
 
 Steps to reproduce: Boot, try to use network.
 

This is all a bit peculiar.  I'd be assuming that you're not getting
any interrupts through for those NICs.

Could you please check /proc/interrupts, see if the interrupt counts
related to the NICs can be made to increase?

Also, the full `dmesg -s 100' output might help.

We might also get some interesting info if you can compile your own kernel,
build those net drivers into vmlinux, and capture the dmesg output.

If it _is_ an IRQ problem then you might find that fiddling with ACPI
helps: disable it in config or boot with `acpi=off', see if that helps.  Also
try booting with the `pci=routeirq' option.

There are various options described under acpi= and pci= in
Documentation/kernel-parameters.txt which it would be useful for you to
experiment with.
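
One way to do the /proc/interrupts check is to take two snapshots a few
seconds apart, while pinging through the NIC, and compare the counts. A
minimal Python sketch, assuming the usual layout (a header row of CPU
columns, then "label: counts... description" rows); the function names and
the "eth" filter are invented for this example.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())            # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (sum(counts), " ".join(fields[len(counts):]))
    return table

def stalled_irqs(before, after, wanted):
    """Labels whose description mentions a wanted name but whose count did not grow."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    stalled = []
    for label, (count, desc) in new.items():
        if any(name in desc for name in wanted):
            if count <= old.get(label, (0, ""))[0]:
                stalled.append(label)
    return stalled

def snapshot(path="/proc/interrupts"):
    """Read the current interrupt table as text."""
    with open(path) as handle:
        return handle.read()
```

Typical use would be: a = snapshot(), wait a few seconds with traffic
running, b = snapshot(), then stalled_irqs(a, b, ["eth"]). Any label it
returns belongs to an interface whose interrupt count did not move.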



Re: [PATCH] please include in 2.6.18: e100 disable device on PCI error

2006-09-18 Thread Andrew Morton
On Mon, 18 Sep 2006 15:01:22 -0500
[EMAIL PROTECTED] (Linas Vepstas) wrote:

 
 Hi,
 
 Please apply the following one-liner patch to  
 what will become the stable 2.6.18.  This patch is 
 low-risk because it affects only the PCI error 
 recovery code, which doesn't run on most platforms
 (in particular, isn't invoked on current x86/ia64).
 
 This patch was originally sent on 29 June 2006
 to fix a bug that showed up in an -mm build.
 The code from -mm made it into mainline, but 
 this patch did not, and so we're unhappy. :-(
 
 Here's the original patch description:
 
 A recent patch in -mm3 titled 
 gregkh-pci-pci-don-t-enable-device-if-already-enabled.patch
 causes pci_enable_device() to be a no-op if the kernel thinks
 that the device is already enabled.  This change breaks the
 PCI error recovery mechanism in the e100 device driver, since, 
 after PCI slot reset, the card is no longer enabled. This is 
 a trivial fix for this problem. Tested.
 
 Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 Signed-off-by: Auke Kok [EMAIL PROTECTED]
 
 
  drivers/net/e100.c |1 +
  1 file changed, 1 insertion(+)
 
 Index: linux-2.6.18-rc7-git1/drivers/net/e100.c
 ===
 --- linux-2.6.18-rc7-git1.orig/drivers/net/e100.c 2006-09-18 
 14:21:49.0 -0500
 +++ linux-2.6.18-rc7-git1/drivers/net/e100.c  2006-09-18 14:24:50.0 
 -0500
 @@ -2799,6 +2799,7 @@ static pci_ers_result_t e100_io_error_de
   /* Detach; put netif into state similar to hotplug unplug. */
   netif_poll_enable(netdev);
   netif_device_detach(netdev);
 + pci_disable_device(pdev);
  
   /* Request a slot reset. */
   return PCI_ERS_RESULT_NEED_RESET;

hm.  I don't have this patch queued, but I _do_ have an equivalent patch
for e1000 queued; what's up with that?  Nobody seems to have paid much
attention to the e1000 fix.

If we can gather the appropriate acks quickly then I expect we can get both
of these into 2.6.18.
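
The reasoning in the patch description can be followed with a toy model.
This is plain Python, not kernel code: the class, its fields and recover()
are invented purely to illustrate why an "enable is a no-op if already
enabled" rule interacts badly with a slot reset, and why disabling the
device when the error is detected avoids it.

```python
class ToyPciDevice:
    """Toy model: enabled_flag is the kernel's bookkeeping,
    hw_enabled is the real state of the card."""

    def __init__(self):
        self.enabled_flag = False
        self.hw_enabled = False

    def enable(self):
        if self.enabled_flag:        # the no-op described in the patch text
            return
        self.enabled_flag = True
        self.hw_enabled = True

    def disable(self):
        self.enabled_flag = False
        self.hw_enabled = False

    def slot_reset(self):
        self.hw_enabled = False      # the card loses state; bookkeeping does not


def recover(dev, disable_on_error):
    """Run probe, error, slot reset, re-enable; return whether the card ends up enabled."""
    dev.enable()                     # normal probe
    if disable_on_error:             # models the one-line fix in the error handler
        dev.disable()
    dev.slot_reset()
    dev.enable()                     # the driver's slot-reset handler re-enables
    return dev.hw_enabled
```

Without the disable step the final enable() is skipped and the card stays
disabled; with it, the re-enable takes effect.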


Re: 2.6.18-rc7-mm1

2006-09-19 Thread Andrew Morton
On Tue, 19 Sep 2006 22:25:21 +0200
Rafael J. Wysocki [EMAIL PROTECTED] wrote:

  - It took maybe ten hours solid work to get this dogpile vaguely
compiling and limping to a login prompt on x86, x86_64 and powerpc. 
I guess it's worth briefly testing if you're keen.
 
 It's not that bad, but unfortunately the networking doesn't work on my system
 (HPC nx6325 + SUSE 10.1 w/ updates, 64-bit).  Apparently, the interfaces don't
 get configured (both tg3 and bcm43xx are affected).

Is there anything interesting in the dmesg output?

Perhaps an `strace -f ifup' or whatever would tell us what's failing.


Fw: [Bugme-new] [Bug 7179] New: Compilation of .tmp_linux1 fails due to missing declaration in net/netfilter/xt_physdev.c

2006-09-21 Thread Andrew Morton

Methinks CONFIG_NETFILTER_XT_TARGET_CLASSIFY should depend upon
CONFIG_BRIDGE_NETFILTER.  Because brnf_deferred_hooks is defined in
net/bridge/br_netfilter.c and is referred to in net/netfilter/xt_physdev.c.

Or something else ;)



Begin forwarded message:

Date: Thu, 21 Sep 2006 14:41:13 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 7179] New: Compilation of .tmp_linux1 fails due to 
missing declaration in net/netfilter/xt_physdev.c


http://bugzilla.kernel.org/show_bug.cgi?id=7179

   Summary: Compilation of .tmp_linux1 fails due to missing
declaration in net/netfilter/xt_physdev.c
Kernel Version: 2.6.18
Status: NEW
  Severity: high
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur: 2.6.17.13
Distribution: CentOS
Hardware Environment: Dual Intel Xeon 5160
Software Environment: gcc 3.4.6, glibc 2.3.4, make 3.8
Problem Description:

Using the same config from 2.6.17.13, kernel 2.6.18 fails on make when 
attempting to make .tmp_vmlinux1

LD  .tmp_vmlinux1
net/built-in.o(.text.checkentry+0x1e1): In function `checkentry':
net/netfilter/xt_physdev.c:130: undefined reference to `brnf_deferred_hooks'
make: *** [.tmp_vmlinux1] Error 1

Line 130 is simply brnf_deferred_hooks = 1;  This variable is also used on 
line 118.

Adding 

int brnf_deferred_hooks = 0; 

in a line before line 104 (static int) will cause .tmp_vmlinux1 to be 
successfully created, and make will finish successfully.  However, it will 
generate a warning (seen below) on now line 119 which can be fixed by changing 
it to: 

if ((brnf_deferred_hooks == 0) && (info->bitmask & XT_PHYSDEV_OP_OUT) &&

The warning generated is:

net/netfilter/xt_physdev.c: In function `checkentry':
net/netfilter/xt_physdev.c:118: warning: ISO C90 forbids mixed declarations 
and code

The kernel produced after making these changes works fine

Steps to reproduce:

I am unsure of which kernel .config parameter is sparking this.  My .config 
can be found here: http://www.animeforum.com/jakiao/misato.config

--- You are receiving this mail because: ---
You are on the CC list for the bug, or are watching someone who is.


Re: 2.6.1[78] page allocation failure. order:3, mode:0x20

2006-09-22 Thread Andrew Morton
On Fri, 22 Sep 2006 07:27:18 + (GMT)
Holger Kiehl [EMAIL PROTECTED] wrote:

 I get some of the page allocation failure errors. My hardware is 4 CPU
 Opteron with one quad + one dual intel e1000 cards. Kernel is plain 2.6.18
 and for two cards MTU is set to 9000.
 
 Sep 21 21:03:15 athena kernel: vsftpd: page allocation failure. order:3, 
 mode:0x20
 Sep 21 21:03:15 athena kernel:
 Sep 21 21:03:15 athena kernel: Call Trace:
 Sep 21 21:03:15 athena kernel:  IRQ [8024e516] 
 __alloc_pages+0x282/0x29b
 Sep 21 21:03:15 athena kernel:  [8807aa93] 
 :ip_tables:ipt_do_table+0x1eb/0x318
 Sep 21 21:03:15 athena kernel:  [8026614b] 
 cache_grow+0x134/0x33d
 Sep 21 21:03:15 athena kernel:  [8026664c] 
 cache_alloc_refill+0x189/0x1d7
 Sep 21 21:03:15 athena kernel:  [80266724] __kmalloc+0x8a/0x94
 Sep 21 21:03:15 athena kernel:  [803b5438] 
 __alloc_skb+0x5c/0x123
 Sep 21 21:03:15 athena kernel:  [803b5f2e] 
 __netdev_alloc_skb+0x12/0x2d
 Sep 21 21:03:15 athena kernel:  [8033cb22] 
 e1000_alloc_rx_buffers+0x6f/0x2f3
 Sep 21 21:03:15 athena kernel:  [803d1234] 
 ip_local_deliver+0x173/0x23b
 Sep 21 21:03:15 athena kernel:  [8033d29a] 
 e1000_clean_rx_irq+0x4f4/0x514

That's OK: it's just a warning and it is expected - the kernel will recover.

I'm half-inclined to shut the warning up by sticking a __GFP_NOWARN in there.

But on the other hand, that warning is handy sometimes.  How come kmalloc
decided to request a 32k hunk of memory when the MTU size is only 9k?  Is
the driver doing something dumb?

	else if (max_frame <= E1000_RXBUFFER_8192)
		adapter->rx_buffer_len = E1000_RXBUFFER_8192;
	else if (max_frame <= E1000_RXBUFFER_16384)
		adapter->rx_buffer_len = E1000_RXBUFFER_16384;

It sure is.

This is going to cause a 9000-byte MTU to use a 16384-byte allocation.
e1000_alloc_rx_buffers() adds two bytes to that, so we do kmalloc(16386),
which causes the slab allocator to request 32768 bytes.  All for a 9kbyte skb.
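
The arithmetic above can be sketched in a few lines of C.  This is only a
model: it assumes that large kmalloc requests are served from power-of-two
size classes, and the helper names size_class_for() and wasted_bytes() are
made up for illustration, not kernel functions.

```c
#include <stddef.h>

/*
 * Simplified model of a power-of-two size-class allocator: return the
 * smallest power of two that can hold `request` bytes.  This is an
 * assumption about the size classes, not the slab allocator's real code,
 * and it is only meant for small requests (no overflow handling).
 */
static size_t size_class_for(size_t request)
{
	size_t size = 1;

	while (size < request)
		size <<= 1;
	return size;
}

/* Bytes wasted when `request` bytes are served from its size class. */
static size_t wasted_bytes(size_t request)
{
	return size_class_for(request) - request;
}
```

Under that assumption size_class_for(16384 + 2) lands in the 32768-byte
class, while size_class_for(16384) stays at 16384: the two extra bytes
double the size of the allocation.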



Re: [take19 0/4] kevent: Generic event handling mechanism.

2006-09-22 Thread Andrew Morton
On Wed, 20 Sep 2006 13:35:47 +0400
Evgeniy Polyakov [EMAIL PROTECTED] wrote:

 Generic event handling mechanism.
 
 Consider for inclusion.

Ulrich's objections sounded substantial, and afaik remain largely
unresolved.   How do we sort this out?


Re: 2.6.1[78] page allocation failure. order:3, mode:0x20

2006-09-22 Thread Andrew Morton
On Fri, 22 Sep 2006 10:10:36 -0700
Auke Kok [EMAIL PROTECTED] wrote:

 I wonder if we can't account for NET_IP_ALIGN when selecting bufsize, to get
 rid of at least 1 order size before we netdev_alloc_skb. This should make 9k
 frames only kmalloc(16384) and thus stay within the 16k boundary. I hope.
 
 Completely untested: don't commit :)
 

I did - I think we want this patch.

 
 e1000: account for NET_IP_ALIGN when calculating bufsiz
 
 Account for NET_IP_ALIGN when requesting buffer sizes from netdev_alloc_skb
 to reduce slab allocation by half.

Could we please do whatever is needed to get this blessed and merged?  This
is such a common problem on such a common driver that I would suggest that
we want this in 2.6.18.x as well.  At least, I'd expect distributors to
ship this fix (they're nuts if they don't) and so it makes sense to deliver
it from kernel.org.


Re: [PATCH] Restore the original TX FIFO overflow process.

2006-09-22 Thread Andrew Morton
On Fri, 22 Sep 2006 15:30:01 -0400
Jesse Huang [EMAIL PROTECTED] wrote:

  #define DRV_NAME sundance
 -#define DRV_VERSION  1.01+LK1.14
 -#define DRV_RELDATE  04-Aug-2006
 +#define DRV_VERSION  1.01+LK1.15
 +#define DRV_RELDATE  22-Sep-2006

Can we please delete this thing?  It's *forever* getting rejects and 
people only remember to update it a fraction of the time anyway.


Re: 2.6.1[78] page allocation failure. order:3, mode:0x20

2006-09-22 Thread Andrew Morton
On Fri, 22 Sep 2006 22:25:07 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Andrew Morton [EMAIL PROTECTED]
 Date: Fri, 22 Sep 2006 21:50:00 -0700
 
  On Fri, 22 Sep 2006 10:10:36 -0700
  Auke Kok [EMAIL PROTECTED] wrote:
  
   e1000: account for NET_IP_ALIGN when calculating bufsiz
   
   Account for NET_IP_ALIGN when requesting buffer sizes from
   netdev_alloc_skb to reduce slab allocation by half.
  
  Could we please do whatever is needed to get this blessed and merged?  This
  is such a common problem on such a common driver that I would suggest that
  we want this in 2.6.18.x as well.  At least, I'd expect distributors to
  ship this fix (they're nuts if they don't) and so it makes sense to deliver
  it from kernel.org.
 
 The NET_IP_ALIGN existed not just for fun :)  There are ramifications
 for removing it.

It's still there, isn't it?

For the 9k MTU case, for example, we end up allocating 16384-byte skbs
instead of 32768-byte ones.
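
As a rough sketch of why that is, here is the buffer-size selection modelled
in plain C, before and after the change.  The bucket table mirrors the
E1000_RXBUFFER_* sizes that appear in the patch; pick_bucket(),
alloc_size_old() and alloc_size_new() are hypothetical names, NET_IP_ALIGN is
taken to be 2, and the special case for standard-sized frames is left out.

```c
#include <stddef.h>

#define NET_IP_ALIGN 2	/* assumed value; matches the "two bytes" above */

/* Receive buffer sizes, mirroring the E1000_RXBUFFER_* ladder. */
static const unsigned int rx_buckets[] = {
	256, 512, 1024, 2048, 4096, 8192, 16384
};

/* Smallest bucket that holds `need` bytes; falls back to the largest. */
static unsigned int pick_bucket(unsigned int need)
{
	size_t n = sizeof(rx_buckets) / sizeof(rx_buckets[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (need <= rx_buckets[i])
			return rx_buckets[i];
	return rx_buckets[n - 1];
}

/* Old scheme: choose the bucket first, add the alignment at alloc time. */
static unsigned int alloc_size_old(unsigned int max_frame)
{
	return pick_bucket(max_frame) + NET_IP_ALIGN;
}

/* Patched scheme: count the alignment while choosing the bucket. */
static unsigned int alloc_size_new(unsigned int max_frame)
{
	return pick_bucket(max_frame + NET_IP_ALIGN);
}
```

For a 9000-byte MTU, max_frame is 9018 if the Ethernet header and FCS add
14 + 4 bytes (an assumption about ENET_HEADER_SIZE and ETHERNET_FCS_SIZE).
The old scheme then asks for 16386 bytes and the patched one for exactly
16384, so NET_IP_ALIGN is still honoured but no longer tips the request
over the 16k boundary.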


diff -puN drivers/net/e1000/e1000_main.c~e1000-account-for-net_ip_align-when-calculating-bufsiz drivers/net/e1000/e1000_main.c
--- a/drivers/net/e1000/e1000_main.c~e1000-account-for-net_ip_align-when-calculating-bufsiz
+++ a/drivers/net/e1000/e1000_main.c
@@ -1101,7 +1101,7 @@ e1000_sw_init(struct e1000_adapter *adap
 
 	pci_read_config_word(pdev, PCI_COMMAND, &hw->pci_cmd_word);
 
-	adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE;
+	adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE + NET_IP_ALIGN;
 	adapter->rx_ps_bsize0 = E1000_RXBUFFER_128;
 	hw->max_frame_size = netdev->mtu +
 			     ENET_HEADER_SIZE + ETHERNET_FCS_SIZE;
@@ -3163,26 +3163,27 @@ e1000_change_mtu(struct net_device *netd
 	 * larger slab size
 	 * i.e. RXBUFFER_2048 --> size-4096 slab */
 
-	if (max_frame <= E1000_RXBUFFER_256)
+	if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_256)
 		adapter->rx_buffer_len = E1000_RXBUFFER_256;
-	else if (max_frame <= E1000_RXBUFFER_512)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_512)
 		adapter->rx_buffer_len = E1000_RXBUFFER_512;
-	else if (max_frame <= E1000_RXBUFFER_1024)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_1024)
 		adapter->rx_buffer_len = E1000_RXBUFFER_1024;
-	else if (max_frame <= E1000_RXBUFFER_2048)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_2048)
 		adapter->rx_buffer_len = E1000_RXBUFFER_2048;
-	else if (max_frame <= E1000_RXBUFFER_4096)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_4096)
 		adapter->rx_buffer_len = E1000_RXBUFFER_4096;
-	else if (max_frame <= E1000_RXBUFFER_8192)
+	else if (max_frame + NET_IP_ALIGN <= E1000_RXBUFFER_8192)
 		adapter->rx_buffer_len = E1000_RXBUFFER_8192;
-	else if (max_frame <= E1000_RXBUFFER_16384)
+	else
 		adapter->rx_buffer_len = E1000_RXBUFFER_16384;
 
 	/* adjust allocation if LPE protects us, and we aren't using SBP */
 	if (!adapter->hw.tbi_compatibility_on &&
 	    ((max_frame == MAXIMUM_ETHERNET_FRAME_SIZE) ||
 	     (max_frame == MAXIMUM_ETHERNET_VLAN_SIZE)))
-		adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE;
+		adapter->rx_buffer_len = MAXIMUM_ETHERNET_VLAN_SIZE +
+					 NET_IP_ALIGN;
 
 	netdev->mtu = new_mtu;
 
@@ -4002,7 +4003,8 @@ e1000_alloc_rx_buffers(struct e1000_adap
 	struct e1000_buffer *buffer_info;
 	struct sk_buff *skb;
 	unsigned int i;
-	unsigned int bufsz = adapter->rx_buffer_len + NET_IP_ALIGN;
+	/* we have already accounted for NET_IP_ALIGN */
+	unsigned int bufsz = adapter->rx_buffer_len;
 
 	i = rx_ring->next_to_use;
 	buffer_info = &rx_ring->buffer_info[i];
_



Re: [PATCH] Restore the original TX FIFO overflow process.

2006-09-23 Thread Andrew Morton
On Fri, 22 Sep 2006 15:30:01 -0400
Jesse Huang [EMAIL PROTECTED] wrote:

 From: Jesse Huang [EMAIL PROTECTED]
 
 Change Logs:
- Restore the original TX FIFO overflow process.
 
 Signed-off-by: Jesse Huang [EMAIL PROTECTED]
 
 ...

 + txthreshold = ioread16 (ioaddr 
 + TxStartThresh);

Your patch ip100a-fix-tx-pause-bug-reset_tx-intr_handler.patch removed
TxStartThresh, so it won't compile.

I don't have a clue what's happening with this driver - I'll drop everything.

I suggest you send a complete new patch series against Jeff's latest tree. 
I'll send you a copy of that.




Re: 2.6.18 BUG: unable to handle kernel NULL pointer dereference at virtual address 000,0000a

2006-09-24 Thread Andrew Morton
On Sun, 24 Sep 2006 11:11:02 +0200
Christian Weiske [EMAIL PROTECTED] wrote:

 Andrew,
 

You keep on losing Cc:s.  Please preserve them all with care when replying.

 
  I have a reproducible BUG on my server that occurs whenever disk usage
  gets too high / too much swapping occurs (at least I think that is). The
  box has one reiserfs filesystem of about 187GB size, the disk is on an
  Epia 5000 board, between them is a Promise Ultra 100 PCI IDE controller
  card.
  Do you think this bug is due to the 2.6.18 upgrade?
 
 No. I already had it in 2.6.17.6.
 
  Have you run fsck across the filesystem(s)?
 fsck at boot turns up
  ReiserFS: hde3: checking transaction log (hde3)
  ReiserFS: hde3: replayed 22 transactions in 0 seconds
  ReiserFS: hde3: Using r5 hash to sort names
 nothing more
 
  Does the oops always look the same as this one?
 No, not exactly the same. I attach three log files. If you diff them,
 there will be about 30% of the lines different.
 
 One thing I have to note is that the second Oops appears about 10
 seconds after the first one.
 
  Please turn on the various CONFIG_DEBUG_* options, see if that turns up
  anything.
 That indeed turns up something. The debug messages indicate that java
 wants to lock something and gets stuck. Note that the messages until
 slab corruption are printed first, and the others about a minute or
 two later.
 
 And I still can ping and do everything until the slab corruption occurs.
 (Thus the other messages some minute later)
 
 
  It would be interesting to find out if enabling CONFIG_4KSTACKS makes this
  go away (although I'm not sure why).
 Didn't try this yet, but will.
 
 I put the logs in a tar.bz2 because I didn't want to flood the list with
 a 200k message.
 

OK, you have crashes in the scheduler and one crash when accessing a
reiserfs structure.

You have tcp_v6 lockdep warnings.  They're in
http://xml.cweiske.de/dojo%20kernelpanic%20+%20debug.tar.bz2 if anyone is
keen.  (I've largely lost interest in lockdep warnings - many of them are
false positives and require make-lockdep-shut-up patches).

You have what claims to be a netfilter-related memory corruption:

Slab corruption: start=c608a42c, len=172
Redzone: 0x6b6b6b6b/0xc0411958.
Last user: [170fc2a5](0x170fc2a5)
0a0: 6b 6b 6b 6b 6b 6b 6b a5 71 f0 2c 5a
Prev obj: start=c608a2c1, len=172
Redzone: 0xec0410f/0x1170fc2.
Last user: [3000](0x3000)
000: 00 00 00 10 a2 08 c6 a8 a5 08 c6 46 3a 00 00 10
010: 10 41 c0 bc a2 08 c6 20 d3 60 c0 00 00 00 00 00
slab error in cache_alloc_debugcheck_after(): cache `ip_conntrack': double freen
 [c01034b9] show_trace+0x19/0x20
 [c01035ba] dump_stack+0x1a/0x20
 [c0160c11] __slab_error+0x21/0x30
 [c0162ca1] cache_alloc_debugcheck_after+0x121/0x1a0
 [c0162ffb] kmem_cache_alloc+0x6b/0xc0
 [c041184c] ip_conntrack_alloc+0x3c/0x130
 [c041198a] init_conntrack+0x2a/0x110
 [c0411c4e] ip_conntrack_in+0x1de/0x230
BUG: unable to handle kernel NULL pointer dereference at virtual address 008
 printing eip:



And another in what appears to be core ipv4:

Slab corruption: start=c3aff608, len=240
Redzone: 0x6b6b6b6b/0x0.
Last user: [170fc2a5](0x170fc2a5)
0e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 71 f0 2c 5a
Prev obj: start=c3aff48f, len=240
Redzone: 0x6b6b6b6b/0x6b6b6b6b.
Last user: [6b6b6b6b](0x6b6b6b6b)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
slab error in cache_alloc_debugcheck_after(): cache `ip_dst_cache': double freen
 [c01034b9] show_trace+0x19/0x20
 [c01035ba] dump_stack+0x1a/0x20
 [c0160c11] __slab_error+0x21/0x30
 [c0162ca1] cache_alloc_debugcheck_after+0x121/0x1a0
 [c0162ffb] kmem_cache_alloc+0x6b/0xc0
 [c03ca3c4] dst_alloc+0x24/0x90
 [c03da865] ip_route_input_slow+0x295/0x8c0
 [c03daf92] ip_route_input+0x102/0x1d0
 [c03dd29a] ip_rcv+0x27a/0x440
 [c03c6d41] netif_receive_skb+0x1b1/0x1f0
 [c03c6e10] process_backlog+0x90/0x120
 [c03c6f0d] net_rx_action+0x6d/0x100
 [c011d4af] __do_softirq+0x6f/0x100
 [c011d59f] do_softirq+0x5f/0x70
 [c011d603] irq_exit+0x53/0x60
 [c0104c28] do_IRQ+0x38/0x70
 [c0103145] common_interrupt+0x25/0x30
 [c028e19b] memcpy+0x3b/0x50
 [c028e208] memmove+0x38/0x50
 [c01bf85d] leaf_paste_in_buffer+0x7d/0x320
 [c01a862c] balance_leaf+0x24c/0x27d0
 [c01aaee0] do_balance+0x60/0xf0
 [c01c56e4] reiserfs_paste_into_item+0x164/0x190
 [c01b3ab5] reiserfs_allocate_blocks_for_region+0x925/0x12e0
 [c01b5b2c] reiserfs_file_write+0x72c/0x7c0
 [c0166768] vfs_write+0x88/0x170
 [c01668fc] sys_write+0x3c/0x70
 [c0102e77] syscall_call+0x7/0xb


And another networking-related scribble:


Slab corruption: start=c64159ec, len=156
Redzone: 0x6b6b6b6b/0xc03c048a.
Last user: [170fc2a5](0x170fc2a5)
090: 6b 6b 6b 6b 6b 6b 6b a5 71 f0 2c 5a
Prev obj: start=c64158ec, len=156
Redzone: 0x6b6b6b6b/0x6b6b6b6b.
Last user: [6b6b6b6b](0x6b6b6b6b)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
slab error in cache_alloc_debugcheck_after(): cache 

Fw: [Bugme-new] [Bug 7198] New: balance-alb bonding oops when disconnecting primary slave interface

2006-09-24 Thread Andrew Morton


Begin forwarded message:

Date: Sun, 24 Sep 2006 19:58:03 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: [Bugme-new] [Bug 7198] New: balance-alb bonding oops when 
disconnecting primary slave interface


http://bugzilla.kernel.org/show_bug.cgi?id=7198

   Summary: balance-alb bonding oops when disconnecting primary
slave interface
Kernel Version: 2.6.18
Status: NEW
  Severity: normal
 Owner: [EMAIL PROTECTED]
 Submitter: [EMAIL PROTECTED]


Most recent kernel where this bug did not occur: 2.6.8 (Debian release kernel)
Distribution: Debian
Hardware Environment: IBM eServer xSeries 336, e1000 and/or broadcom BCM5721  
Software Environment: Debian Linux Kernel 
Problem Description: When bonding is enabled in balance-alb mode on several
gigabit interfaces, when the primary slave device is unplugged it always
displays this...

e1000: eth2: e1000_watchdog: NIC Link is Down
bonding: bond0: link status down for idle interface eth2, disabling it in 100 ms.
bonding: bond0: link status definitely down for interface eth2, disabling it
bonding: bond0: making interface eth3 the new active one.
device eth2 left promiscuous mode
RTNL: assertion failed at net/ipv4/devinet.c (984)
 [c03b0884] inetdev_event+0x273/0x2d5
 [c03885ca] rt_run_flush+0x68/0x94
 [c0127aae] notifier_call_chain+0x1d/0x2d
 [c036ead9] dev_set_mac_address+0x4a/0x4f
 [f9d6807e] alb_set_slave_mac_addr+0x64/0x88 [bonding]
 [f9d6967d] alb_swap_mac_addr+0x6c/0x15a [bonding]
 [f9d63a1c] bond_change_active_slave+0x2e8/0x348 [bonding]
 [f9d64617] bond_select_active_slave+0xa5/0x129 [bonding]
 [f9d64eeb] bond_mii_monitor+0x1a6/0x4c1 [bonding]
 [f9d64d45] bond_mii_monitor+0x0/0x4c1 [bonding]
 [c01236f1] run_timer_softirq+0xc6/0x19b
 [c01200f2] __do_softirq+0x75/0xe1
 [c0120192] do_softirq+0x34/0x36
 [c0120310] irq_exit+0x41/0x43
 [c0103733] apic_timer_interrupt+0x1f/0x24
 [c0101ae6] mwait_idle+0x2a/0x34
 [c0101a97] cpu_idle+0x63/0x88
RTNL: assertion failed at net/ipv4/devinet.c (984)
 [c03b0884] inetdev_event+0x273/0x2d5
 [c03885ca] rt_run_flush+0x68/0x94
 [c0127aae] notifier_call_chain+0x1d/0x2d
 [c036ead9] dev_set_mac_address+0x4a/0x4f
 [f9d6807e] alb_set_slave_mac_addr+0x64/0x88 [bonding]
 [f9d6968c] alb_swap_mac_addr+0x7b/0x15a [bonding]
 [f9d63a1c] bond_change_active_slave+0x2e8/0x348 [bonding]
 [f9d64617] bond_select_active_slave+0xa5/0x129 [bonding]
 [f9d64eeb] bond_mii_monitor+0x1a6/0x4c1 [bonding]
 [f9d64d45] bond_mii_monitor+0x0/0x4c1 [bonding]
 [c01236f1] run_timer_softirq+0xc6/0x19b
 [c01200f2] __do_softirq+0x75/0xe1
 [c0120192] do_softirq+0x34/0x36
 [c0120310] irq_exit+0x41/0x43
 [c0103733] apic_timer_interrupt+0x1f/0x24
 [c0101ae6] mwait_idle+0x2a/0x34
 [c0101a97] cpu_idle+0x63/0x88
device eth3 entered promiscuous mode
e1000: eth2: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
bonding: bond0: link status up for interface eth2, enabling it in 100 ms.
bonding: bond0: link status definitely up for interface eth2.
device eth3 left promiscuous mode

and intermittently causes a segmentation fault.

This does not happen in balance-tlb mode.

Steps to reproduce:
1. Use kernel 2.6.10-2.6.18 (tried all of these)
2. Using generic i386 or P4-Xeon compile target 
3. Setup balance-alb on 4 e1000 or BCM5721, with each interface in slave mode
without IP.  
4. When all ports are connected to a gigabit switch, in-turn disconnect and
re-connect each interface.  When the primary is unplugged this message is 
displayed.

Hope this info was enough.  I apologize if this has been fixed before.  

I would greatly appreciate a fix for this.

Thanks in advance 
Geoff

--- You are receiving this mail because: ---
You are on the CC list for the bug, or are watching someone who is.


[patch] neighbour.c, pneigh_get_next() skips published entry

2006-09-25 Thread Andrew Morton

I've been sitting on this patch because afaik the problem which it purports
to fix remains unfixed.

Should I drop it??

Thanks.



From: Jari Takkala [EMAIL PROTECTED]

Fix a problem where output from /proc/net/arp skips a record when the full
output does not fit into the user's read() buffer.

To reproduce: publish a large number of ARP entries (more than 10 required
on my system).  Run 'dd if=/proc/net/arp of=arp-1024.out bs=1024'.  View
the output, one entry will be missing.

Signed-off-by: Jari Takkala [EMAIL PROTECTED]

[akpm: submitted before, discussion ended inconclusively, iirc]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 net/core/neighbour.c |6 ++
 1 file changed, 6 insertions(+)

diff -puN net/core/neighbour.c~neighbourc-pneigh_get_next-skips-published-entry net/core/neighbour.c
--- a/net/core/neighbour.c~neighbourc-pneigh_get_next-skips-published-entry
+++ a/net/core/neighbour.c
@@ -2209,6 +2209,12 @@ static struct pneigh_entry *pneigh_get_n
 	struct neigh_seq_state *state = seq->private;
 	struct neigh_table *tbl = state->tbl;
 
+	if (pos != NULL && *pos == 1 &&
+	   (pn->next || tbl->phash_buckets[state->bucket])) {
+		--(*pos);
+		return pn;
+	}
+
 	pn = pn->next;
 	while (!pn) {
 		if (++state->bucket > PNEIGH_HASHMASK)
_



Re: [patch] neighbour.c, pneigh_get_next() skips published entry

2006-09-25 Thread Andrew Morton
On Mon, 25 Sep 2006 16:47:31 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Andrew Morton [EMAIL PROTECTED]
 Date: Mon, 25 Sep 2006 16:45:35 -0700
 
  I've been sitting on this patch because afaik the problem which it purports
  to fix remains unfixed.
  
  Should I drop it??
  
  Thanks.
 
 Please drop it, the patch submitted didn't give us the feedback
 and test results we asked for which is necessary to pinpoint the
 true issue here.

Well that's why I hang onto such patches: so I can bug people about it
every few months.  Consider it the world's dumbest bug-tracking system.

But I have a feeling I'll get shouted at if I try it again with this one
(looks at the 18-month-old tulip-fix-for-64-bit-mips.patch), so yeah, I'll drop
it.  


Re: 2.6.18-mm1 -- ieee80211: Info elem: parse failed: info_element->len + 2 > left : info_element->len+2=28 left=9, id=221.

2006-09-26 Thread Andrew Morton

[added netdev]

On Tue, 26 Sep 2006 12:04:40 -0700
Miles Lane [EMAIL PROTECTED] wrote:

 ieee80211: Info elem: parse failed: info_element->len + 2 > left :
 info_element->len+2=28 left=9, id=221.
 ieee80211: Info elem: parse failed: info_element->len + 2 > left :
 info_element->len+2=28 left=9, id=221.
 ieee80211: Info elem: parse failed: info_element->len + 2 > left :
 info_element->len+2=28 left=9, id=221.
 
 From dmesg output:
 ieee80211: 802.11 data/management/control stack, git-1.1.13
 ieee80211: Copyright (C) 2004-2005 Intel Corporation [EMAIL PROTECTED]
 ieee80211_crypt: registered algorithm 'NULL'
 ieee80211_crypt: registered algorithm 'WEP'
 ieee80211_crypt: registered algorithm 'CCMP'
 ieee80211_crypt: registered algorithm 'TKIP'

I suspect that whatever caused this is now in mainline.  Are you able to
test Linus's current git tree?

Thanks.


e100 changes in git-netdev-all break reboot with netconsole

2006-09-28 Thread Andrew Morton

Enable netconsole-over-e100, and `reboot -f' hangs.  Disabling netconsole
prevents that from happening.

I assume what's happening is that the driver gets shut down and then
something tries to do a printk through it, and things hang.

For some reason sysrq-B still reboots the machine.


Re: [Bugme-new] [Bug 7222] New: sky2 throws a lot of pci express error in 2.6.18-mm2 on amd64

2006-09-28 Thread Andrew Morton
On Thu, 28 Sep 2006 05:10:43 -0700
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=7222
 
Summary: sky2 throws a lot of pci express error in 2.6.18-mm2
 on amd64
 Kernel Version: 2.6.18-mm2
 Status: NEW
   Severity: normal
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur: 2.6.18-mm1
 Distribution: Gentoo Linux
 Hardware Environment: Asus P5W DH Deluxe
 Software Environment:
 Problem Description: kernel throws a lot of pci express errors
 
 Steps to reproduce:
 compile in sky2 driver in amd64 kernel (2.6.18-mm2),
 boot into system = errors appear:
 
 sky2 v1.9 addr 0xebdfc000 irq 19 Yukon-EC (0xb6) rev 2
 sky2 eth0: addr 00:17:31:e8:f3:53
 ACPI: PCI Interrupt :03:00.0[A] - GSI 16 (level, low) - IRQ 16
 PCI: Setting latency timer of device :03:00.0 to 64
 sky2 v1.9 addr 0xebcfc000 irq 16 Yukon-EC (0xb6) rev 2
 sky2 eth1: addr 00:17:31:ee:e4:18
 [...]
 sky2 :04:00.0: pci express error (0x100407)
 [...]
 sky2 :03:00.0: pci express error (0x100407)
 [...]
 sky2 :04:00.0: pci express error (0x500547)
 sky2 :03:00.0: pci express error (0x500547)
 sky2 :04:00.0: pci express error (0x500547)
 sky2 :03:00.0: pci express error (0x500547)
 sky2 :04:00.0: pci express error (0x500547)
 sky2 :03:00.0: pci express error (0x500547)
 Losing some ticks... checking if CPU frequency changed.
 sky2 :04:00.0: pci express error (0x500547)
 sky2 :03:00.0: pci express error (0x500547)
 
 this error occurs during the whole runtime of the kernel ...
 
 I don't know if this severe, so I didn't let that kernel run long ...
 
 network-functionality doesn't seem to be impaired:
 ping -c 3 www.google.com
 works & I can surf with links -g www.google.com
 

I would be suspecting that something went wrong with [PATCH] sky2: use
standard pci register capabilties for error register.


Re: 2.6.18-mm2

2006-09-28 Thread Andrew Morton

(please always do reply-to-all)

On Thu, 28 Sep 2006 17:50:31 + (UTC)
Steve Fox [EMAIL PROTECTED] wrote:

 On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
 
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
 
 Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
 
 TCP bic registered
 TCP westwood registered
 TCP htcp registered
 NET: Registered protocol family 1
 NET: Registered protocol family 17
 Unable to handle kernel paging request at  RIP: 
  [8047ef93] packet_notifier+0x163/0x1a0
 PGD 203027 PUD 2b031067 PMD 0 
 Oops:  [1] SMP 
 last sysfs file: 
 CPU 0 
 Modules linked in:
 Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
 RIP: 0010:[8047ef93]  [8047ef93] 
 packet_notifier+0x163/0x1a0
 RSP: :810bffcbde90  EFLAGS: 00010286
 RAX:  RBX: 810bff4a1000 RCX: 
 RDX: 810bff4a1000 RSI: 0005 RDI: 8055f5e0
 RBP:  R08: 7616 R09: 000e
 R10: 0006 R11: 803373f0 R12: 
 R13: 0005 R14: 810bff4a1000 R15: 
 FS:  () GS:805d8000() knlGS:
 CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
 CR2:  CR3: 00201000 CR4: 06e0
 Process swapper (pid: 1, threadinfo 810bffcbc000, task 810bffcbb510)
 Stack:  810bff4a1000 8055f4c0  810bffcbdef0
   8042736e  
   8061c68d 806260f0 80207182
 Call Trace:
  [8042736e] register_netdevice_notifier+0x3e/0x70
  [8061c68d] packet_init+0x2d/0x53
  [80207182] init+0x162/0x330
  [8020a9d8] child_rip+0xa/0x12
  [8033c2a2] acpi_ds_init_one_object+0x0/0x82
  [80207020] init+0x0/0x330
  [8020a9ce] child_rip+0x0/0x12
 
 
 Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff 
 RIP  [8047ef93] packet_notifier+0x163/0x1a0
  RSP 810bffcbde90
 CR2: 
  0Kernel panic - not syncing: Attempted to kill init!
 

I'm really struggling to work out what went wrong there.  Comparing your
miserable 20 bytes of code to my object code makes me think that this:

struct packet_sock *po = pkt_sk(sk);

returned -1, perhaps in %ebp.  But it's all very crude.

Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
addresses might change) then have a poke around with `gdb vmlinux' (or
maybe just addr2line) to work out where it's really oopsing?

I don't see much which has changed in that area recently.


Re: sky2 (was Re: 2.6.18-mm2)

2006-09-28 Thread Andrew Morton
On Thu, 28 Sep 2006 19:07:05 -0400
Jeff Garzik [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  Another customer..
  
  Begin forwarded message:
  
  Date: Fri, 29 Sep 2006 00:44:01 +0200
  From: Matthias Hentges [EMAIL PROTECTED]
  To: Andrew Morton [EMAIL PROTECTED]
  Cc: linux-kernel@vger.kernel.org
  Subject: Re: 2.6.18-mm2
  
  
  Hello all,
  
  I've just tested -mm2 on my C2D system and I'm getting a lot of these
  messages:
  
  [  139.143807] printk: 131 messages suppressed.
  [  139.148235] sky2 :03:00.0: pci express error (0x500547)
  
  Please note that the sky2 driver has always been the black sheep on
  that system due to regular full lock-ups of the driver, requiring a
  rmmod sky2 + modprobe sky2 cycle.
  
  This happens often enough to warrant writing a cronjob checking the
  network and auto-rmmod'ing the module.
  
  While the above is bloody annoying at times (heh), the driver never
  caused any messages like the ones I now get with -mm2 .
 
 sky2 just turned on PCI Express error reporting, so it makes sense that 
 messages would appear.  The better question is whether this is a driver 
 problem, or a hardware problem.  With your black sheep comment, I 
 wonder if it isn't a hardware problem that's been hidden.
 

See also http://bugzilla.kernel.org/show_bug.cgi?id=7222

That's two reports in 18 hours, from amongst the presumably-small population
of sky2-owning -mm testers.


Re: [PATCH] IPv6/DCCP: Fix memory leak in dccp_v6_do_rcv()

2006-09-29 Thread Andrew Morton
On Fri, 29 Sep 2006 02:45:33 +0200
Jesper Juhl [EMAIL PROTECTED] wrote:

 
 Coverity found what looks like a real leak in 
 net/dccp/ipv6.c::dccp_v6_do_rcv()
 
 We may leave via the return inside if (sk->sk_state == DCCP_OPEN) {
 but at that point we may have allocated opt_skb, but we never free it
 in that path before the return.
 
 
 Signed-off-by: Jesper Juhl [EMAIL PROTECTED]
 ---
 
  net/dccp/ipv6.c |2 ++
  1 file changed, 2 insertions(+)
 
 --- linux-2.6.18-git10-orig/net/dccp/ipv6.c   2006-09-28 22:40:07.0 +0200
 +++ linux-2.6.18-git10/net/dccp/ipv6.c        2006-09-29 02:35:15.0 +0200
 @@ -997,6 +997,8 @@ static int dccp_v6_do_rcv(struct sock *s
  	if (sk->sk_state == DCCP_OPEN) { /* Fast path */
  		if (dccp_rcv_established(sk, skb, dccp_hdr(skb), skb->len))
  			goto reset;
 +		if (opt_skb)
 +			__kfree_skb(opt_skb);
  		return 0;
  	}

Looks right to me.  But it'd be better coded as below, so we don't have
multiple deeply-nested return points (the cause of this bug) and duplicated
code.

otoh, it seems to me that opt_skb doesn't actually do anything and can be
removed?


diff -puN net/dccp/ipv6.c~ipv6-dccp-fix-memory-leak-in-dccp_v6_do_rcv net/dccp/ipv6.c
--- a/net/dccp/ipv6.c~ipv6-dccp-fix-memory-leak-in-dccp_v6_do_rcv
+++ a/net/dccp/ipv6.c
@@ -997,7 +997,7 @@ static int dccp_v6_do_rcv(struct sock *s
 	if (sk->sk_state == DCCP_OPEN) { /* Fast path */
 		if (dccp_rcv_established(sk, skb, dccp_hdr(skb), skb->len))
 			goto reset;
-		return 0;
+		goto out;
 	}
 
 	if (sk->sk_state == DCCP_LISTEN) {
@@ -1013,9 +1013,7 @@ static int dccp_v6_do_rcv(struct sock *s
 		if (nsk != sk) {
 			if (dccp_child_process(sk, nsk, skb))
 				goto reset;
-			if (opt_skb != NULL)
-				__kfree_skb(opt_skb);
-			return 0;
+			goto out;
 		}
 	}
 
@@ -1026,9 +1024,10 @@ static int dccp_v6_do_rcv(struct sock *s
 reset:
 	dccp_v6_ctl_send_reset(skb);
 discard:
+	kfree_skb(skb);
+out:
 	if (opt_skb != NULL)
 		__kfree_skb(opt_skb);
-	kfree_skb(skb);
 	return 0;
 }
 
_
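
The single-exit shape of the patch above can be shown in a standalone sketch.
Everything in it is a stand-in: struct buf plays the part of an sk_buff,
buf_alloc()/buf_free() replace the skb helpers, and handle_packet() is a
hypothetical, much reduced version of the receive function.  The live_bufs
counter exists only so that a leak would be visible.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an sk_buff; only its lifetime matters here. */
struct buf {
	int unused;
};

/* Number of buffers currently allocated, so that a leak is observable. */
static int live_bufs;

static struct buf *buf_alloc(void)
{
	struct buf *b = malloc(sizeof(*b));

	if (b)
		live_bufs++;
	return b;
}

static void buf_free(struct buf *b)
{
	if (b) {
		free(b);
		live_bufs--;
	}
}

/*
 * Takes ownership of `pkt`.  `fast_path` models the DCCP_OPEN branch, where
 * the packet is consumed by the receive routine; `clone_opts` models the
 * optional side buffer (opt_skb in the patch).
 */
static int handle_packet(struct buf *pkt, int fast_path, int clone_opts)
{
	struct buf *opt = clone_opts ? buf_alloc() : NULL;

	if (fast_path) {
		buf_free(pkt);	/* stands in for the fast path consuming pkt */
		goto out;
	}

	/* Slow path elided: fall through and discard the packet. */
	buf_free(pkt);
out:
	buf_free(opt);		/* the single place where opt is released */
	return 0;
}
```

Because every branch leaves through the out label, a new early exit only
needs a goto out and cannot forget the side buffer, which is the mistake
the original report was about.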



Re: 2.6.18-mm2 - oops in cache_alloc_refill()

2006-09-29 Thread Andrew Morton
On Fri, 29 Sep 2006 20:01:54 -0400
[EMAIL PROTECTED] wrote:

 On Fri, 29 Sep 2006 12:45:58 PDT, Andrew Morton said:
 
 (Adding a bunch of people to the cc: list now that I have a clue what is
 going on)
 
  I'd expect it's the same bug - slab data structures have gone bad.
 
 *bing*! We have a winner.  A quick check showed the kernel wasn't built with
 slab debugging enabled, so I turned on the more obvious options, and got
 rewarded with a traceback..

doh.  I'd assumed that CONFIG_DEBUG_SLAB was enabled :(

  Again: how come nobody else is hitting this?  Something's different.
 
 gkrellm and wireless (specifically, gkrellm-wifi-0.9.12-3.fc6 from Fedora
 Core extras-development).  Kernel is still a 2.6.18 with *only* the
 origin.patch from -mm2 applied. Note that the gkrellm plugin hasn't had
 a change in the code since 01/03/2004 - hopefully there's been no unintentional
 API change on the kernel side since then...
 
 Here's the traceback I got:
 
 slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
 [c0103ad2] dump_trace+0x64/0x1cd
 [c0103c4d] show_trace_log_lvl+0x12/0x25
 [c010415f] show_trace+0xd/0x10
 [c01041fc] dump_stack+0x19/0x1b
 [c014c796] __slab_error+0x17/0x1c
 [c014cdac] cache_free_debugcheck+0xaf/0x230
 [c014d43e] kfree+0x59/0x8c
 [c02dc04a] ioctl_standard_call+0x1da/0x218
 [c02dc275] wireless_process_ioctl+0x55/0x312
 [c02d3750] dev_ioctl+0x45f/0x49a
 [c02c92aa] sock_ioctl+0x1b3/0x1c6
 [c0160322] do_ioctl+0x22/0x67
 [c01605a5] vfs_ioctl+0x23e/0x251
 [c01605ff] sys_ioctl+0x47/0x64
 [c0102cd3] syscall_call+0x7/0xb
 DWARF2 unwinder stuck at syscall_call+0x7/0xb
 
 Leftover inexact backtrace:
 
 ===
 de57e16c: redzone 1:0x170fc2a5, redzone 2:0x170fc200.
 
 Repeated, over and over, just about once a second.
 
 A quick strace of gkrellm finds these likely ioctl's causing the problem:
 
 % grep ioctl /tmp/foo2 | sort -u | more
 ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
 ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
 ioctl(13, SIOCGIWRATE, 0xbfbcdbbc)  = 0

Yes.  The main thing which those WE-21 patches do is to shorten the size of
various buffers which are used in wireless ioctls.
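
In the log above, redzone 1 still holds 0x170fc2a5 while redzone 2 reads 0x170fc200. On a little-endian machine that would be consistent with a single zero byte written one position past the end of the object, for instance by a caller that still expects room for a trailing NUL. The following user-space sketch shows how such a guard-word check catches that kind of write. It is only an illustration, not the slab allocator's code: guarded_alloc(), guarded_free() and fill_with_nul() are hypothetical names, and the guard value is simply the one printed in the log.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Guard value as printed in the log above. */
#define RED_ACTIVE 0x170fc2a5u

/* Hypothetical helper: allocate size bytes with a guard word on each side. */
static unsigned char *guarded_alloc(size_t size)
{
    uint32_t red = RED_ACTIVE;
    unsigned char *raw = malloc(size + 2 * sizeof(red));

    if (!raw)
        return NULL;
    memcpy(raw, &red, sizeof(red));                      /* leading guard  */
    memcpy(raw + sizeof(red) + size, &red, sizeof(red)); /* trailing guard */
    return raw + sizeof(red);
}

/*
 * Hypothetical helper: check both guard words, then free the block.
 * Returns 1 if both guards are intact, 0 if either was overwritten.
 */
static int guarded_free(unsigned char *obj, size_t size)
{
    uint32_t before, after;
    unsigned char *raw = obj - sizeof(uint32_t);

    memcpy(&before, raw, sizeof(before));
    memcpy(&after, obj + size, sizeof(after));
    free(raw);
    return before == RED_ACTIVE && after == RED_ACTIVE;
}

/* A caller that assumes the buffer has one extra byte for a NUL. */
static void fill_with_nul(unsigned char *obj, size_t len)
{
    memset(obj, 'x', len);
    obj[len] = '\0';    /* one byte past the object when len == size */
}
```

Filling exactly 32 bytes of a 32-byte object leaves both guards intact; calling fill_with_nul() with the full length clobbers the first byte of the trailing guard, and guarded_free() reports it.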

 Since I'm using an orinoco-based card, these 2 look like the most likely
 candidates.  WE-21 was merged between -mm1 and -mm2, which is why -mm1 was
 stable for me.

The WE-21 patches weren't in Jeff's tree for -mm1 or for -mm2.  They
appeared there transiently then quickly went mainline.  They _might_ have
been in the wireless git tree, although I often drop that due to git woes. 
But that hasn't happened recently.

 I'll let somebody else argue over what path these took that
 I never tripped over them in an earlier -mm before they hit Linus's tree...
 
 commit baef186519c69b11cf7e48c26e75feb1e6173baa
 Author: John W. Linville [EMAIL PROTECTED]
 Date:   Fri Sep 8 16:04:05 2006 -0400
 
 [PATCH] WE-21 support (core API)
 
 This is version 21 of the Wireless Extensions. Changelog :
 o finishes migrating the ESSID API (remove the +1)
 o netdev->get_wireless_stats is no more
 o long/short retry
 
 This is a redacted version of a patch originally submitted by Jean
 Tourrilhes.  I removed most of the additions, in order to minimize
 future support requirements for nl80211 (or other WE successor).
 
 CC: Jean Tourrilhes [EMAIL PROTECTED]
 Signed-off-by: John W. Linville [EMAIL PROTECTED]
 
 commit eeec9f1a931262d69811135092c8447d6dccc3e6
 Author: Jean Tourrilhes [EMAIL PROTECTED]
 Date:   Tue Aug 29 18:02:31 2006 -0700
 
 [PATCH] WE-21 for orinoco
 
 Signed-off-by: Jean Tourrilhes [EMAIL PROTECTED]
 Signed-off-by: John W. Linville [EMAIL PROTECTED]
 

Try reverting those?

