Re: linux 2.2.18pre17 + VM-global -7 = `Negative d_count' oops

2000-10-27 Thread Andrea Arcangeli

On Fri, Oct 27, 2000 at 12:14:56PM +0100, Ian Jackson wrote:
> gcc version 2.95.2 2220 (Debian GNU/Linux)

Please give a try to 2.95.2 19991024 (release) or egcs 1.1.2 or gcc 2.7.2.3. I
don't see anything strange in your .config.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: linux 2.2.18pre17 + VM-global -7 = `Negative d_count' oops

2000-10-27 Thread Ian Jackson

Andrea Arcangeli writes ("Re: linux 2.2.18pre17 + VM-global -7 = `Negative d_count' 
oops"):
> On Thu, Oct 26, 2000 at 06:37:37PM +0100, Ian Jackson wrote:
> >  Negative d_count (-805538369) for [binary garbage]/<NULL>
> > 
> > followed by an oops.  Kernel logfile extract below, uuencoded.
> 
> Thanks for the feedback.
> 
> The oops is forced by the kernel after it sees the wrong negative d_count.
> 
> I'd say it's memory corruption, but it doesn't look like a memory bitflip.
> 
> I'm almost certain that it's not caused by the VM-global patch.
> 
> Which device driver and compiler are you using?

chiark:~> gcc -v
Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
gcc version 2.95.2 2220 (Debian GNU/Linux)
chiark:~> dpkg -l gcc
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err:uppercase=bad)
||/ Name           Version        Description
+++-==============-==============-==============================
ii  gcc            2.95.2-13      The GNU C compiler.
chiark:~>

I've enclosed a copy of `.config' from the 2.2.18pre17+VM-global.

I forgot to mention, and it might be relevant (given that the oops is
in `hung_up_tty_read'), that I'm using a VPN system of my own devising
which gets packets in and out of the kernel by using `slattach' on
pty's; it has no nonstandard kernel component, but probably has some
unusual pty handling behaviour.  If you really want to look at what it
does, the source is on ftp.chiark.greenend.org.uk in ipif/service.c
inside /users/ian/userv/userv-utils-0.2.0.tar.gz.

Ian.

#
# Automatically generated by make menuconfig: don't edit
#

#
# Code maturity level options
#
# CONFIG_EXPERIMENTAL is not set

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
CONFIG_M586TSC=y
# CONFIG_M686 is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_TSC=y
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
CONFIG_1GB=y
# CONFIG_2GB is not set
# CONFIG_MATH_EMULATION is not set
# CONFIG_MTRR is not set
# CONFIG_SMP is not set

#
# Loadable module support
#
# CONFIG_MODULES is not set

#
# General setup
#
CONFIG_NET=y
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_QUIRKS=y
CONFIG_PCI_OLD_PROC=y
# CONFIG_MCA is not set
# CONFIG_VISWS is not set
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_MISC is not set
# CONFIG_PARPORT is not set
# CONFIG_APM is not set
# CONFIG_TOSHIBA is not set

#
# Plug and Play support
#
# CONFIG_PNP is not set

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_IDE is not set
# CONFIG_BLK_DEV_HD_ONLY is not set
# CONFIG_BLK_DEV_LOOP is not set
# CONFIG_BLK_DEV_NBD is not set
CONFIG_BLK_DEV_MD=y
# CONFIG_MD_LINEAR is not set
CONFIG_MD_STRIPED=y
# CONFIG_MD_MIRRORING is not set
# CONFIG_MD_RAID5 is not set
# CONFIG_MD_BOOT is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_BLK_DEV_XD is not set
# CONFIG_BLK_DEV_DAC960 is not set
CONFIG_PARIDE_PARPORT=y
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_HD is not set

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_NETLINK=y
CONFIG_RTNETLINK=y
CONFIG_NETLINK_DEV=y
CONFIG_FIREWALL=y
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_RTNETLINK=y
CONFIG_NETLINK=y
# CONFIG_IP_MULTIPLE_TABLES is not set
# CONFIG_IP_ROUTE_MULTIPATH is not set
# CONFIG_IP_ROUTE_TOS is not set
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_ROUTE_LARGE_TABLES is not set
# CONFIG_IP_PNP is not set
CONFIG_IP_FIREWALL=y
CONFIG_IP_FIREWALL_NETLINK=y
CONFIG_NETLINK_DEV=y
# CONFIG_IP_TRANSPARENT_PROXY is not set
# CONFIG_IP_MASQUERADE is not set
# CONFIG_IP_ROUTER is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
CONFIG_IP_ALIAS=y
CONFIG_SYN_COOKIES=y
# CONFIG_INET_RARP is not set
CONFIG_SKB_LARGE=y
# CONFIG_IPX is not set
# CONFIG_ATALK is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set
# CONFIG_PHONE_IXJ is not set

#
# SCSI support
#
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y

#
# SCSI low-level drivers
#
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_MEGARAID

Re: linux 2.2.18pre17 + VM-global -7 = `Negative d_count' oops

2000-10-26 Thread Andrea Arcangeli

On Thu, Oct 26, 2000 at 06:37:37PM +0100, Ian Jackson wrote:
>  Negative d_count (-805538369) for [binary garbage]/<NULL>
> 
> followed by an oops.  Kernel logfile extract below, uuencoded.

Thanks for the feedback.

The oops is forced by the kernel after it sees the wrong negative d_count.

I'd say it's memory corruption, but it doesn't look like a memory bitflip.

I'm almost certain that it's not caused by the VM-global patch.

Which device driver and compiler are you using?

Andrea
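
The "doesn't look like a memory bitflip" remark above can be checked by hand. A minimal sketch, assuming a healthy d_count is a small non-negative number (the bound passed in is an arbitrary assumption, and the function name is hypothetical, not a kernel interface): it asks whether flipping any single bit of the observed value would give a plausible count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only (not kernel code).
 * Returns true if flipping exactly one bit of `observed` yields a value
 * in the range [0, max_plausible], i.e. if a single-bit memory error
 * could explain the observed reference count.
 */
bool sketch_single_bitflip_explains(int32_t observed, int32_t max_plausible)
{
    /* Converting to unsigned is well defined and exposes the raw bits. */
    uint32_t raw = (uint32_t)observed;
    int bit;

    assert(max_plausible >= 0);

    for (bit = 0; bit < 32; bit++) {
        uint32_t candidate = raw ^ (UINT32_C(1) << bit);

        /*
         * A value in [0, max_plausible] has its sign bit clear, so the
         * unsigned comparison alone rejects negative candidates.
         */
        if (candidate <= (uint32_t)max_plausible)
            return true;
    }
    return false;
}
```

For the reported value, sketch_single_bitflip_explains(-805538369, 65536) returns false: the raw pattern is 0xCFFC75BF, which has far too many bits set for one flipped bit to bring it back to a small count. That is consistent with the word having been overwritten rather than a single bit having flipped.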



Re: Linux 2.2.18pre17

2000-10-20 Thread Marcelo Tosatti


On Thu, 19 Oct 2000, Alan Cox wrote:

> - Get to the bottom of the VM mystery if possible

The RAID problem (which is caused by VM changes) is the same deadlock
found in drbd and nbd.

It was not a problem with kernels < 2.2.17 because there was no write
throttling in shrink_mmap.

I'm attaching "raid-0.90-2.2.17.patch" and "raid-2.2.17.patch". They
should fix, respectively, raid 0.90 and stock 2.2.17 raid. 

I haven't tested the patches. 





--- linux/drivers/block/raid1.c.orig	Thu Oct 19 18:18:25 2000
+++ linux/drivers/block/raid1.c	Thu Oct 19 18:48:46 2000
@@ -40,7 +40,7 @@
 	 * simply can not afford to fail an allocation because
 	 * there is no failure return path (eg. make_request())
 	 */
-	while (!(ptr = kmalloc (sizeof (raid1_conf_t), GFP_KERNEL)))
+	while (!(ptr = kmalloc (sizeof (raid1_conf_t), GFP_BUFFER)))
 		printk ("raid1: out of memory, retrying...\n");
 
 	memset(ptr, 0, size);


--- linux/drivers/block/raid1.c.orig	Thu Oct 19 19:03:16 2000
+++ linux/drivers/block/raid1.c	Thu Oct 19 19:03:03 2000
@@ -209,7 +209,7 @@
 	PRINTK(("raid1_make_request().\n"));
 
 	while (!( /* FIXME: now we are rather fault tolerant than nice */
-	r1_bh = kmalloc (sizeof (struct raid1_bh), GFP_KERNEL)
+	r1_bh = kmalloc (sizeof (struct raid1_bh), GFP_BUFFER)
 	) )
 	{
 		printk ("raid1_make_request(#1): out of memory\n");
@@ -301,7 +301,7 @@
 	 * of this function to grok the difference ;)
 	 */
 		while (!( /* FIXME: now we are rather fault tolerant than nice */
-		mirror_bh[i] = kmalloc (sizeof (struct buffer_head), GFP_KERNEL)
+		mirror_bh[i] = kmalloc (sizeof (struct buffer_head), GFP_BUFFER)
 		) )
 		{
 			printk ("raid1_make_request(#2): out of memory\n");
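
The reasoning behind the patches above can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: every SKETCH_ name is hypothetical, the flag values are not the kernel's, and it assumes that GFP_BUFFER differs from GFP_KERNEL by not letting the allocator start I/O to free memory. The real interaction with the write throttling in shrink_mmap is more involved than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch only; not the kernel's values. */
enum {
    SKETCH_MAY_WAIT     = 1 << 0, /* allocator may sleep */
    SKETCH_MAY_START_IO = 1 << 1  /* allocator may write out dirty pages */
};
#define SKETCH_GFP_KERNEL (SKETCH_MAY_WAIT | SKETCH_MAY_START_IO)
#define SKETCH_GFP_BUFFER (SKETCH_MAY_WAIT)

enum sketch_result { SKETCH_OK, SKETCH_FAIL, SKETCH_DEADLOCK };

struct sketch_vm {
    int free_pages;     /* immediately available */
    int clean_pages;    /* reclaimable without any I/O */
    int dirty_pages;    /* reclaimable only by writing them out first */
    bool in_write_path; /* true while the driver is servicing a write */
};

/* Try to obtain one page under the given (sketch) allocation flags. */
enum sketch_result sketch_alloc(struct sketch_vm *vm, int flags)
{
    assert(vm);

    if (vm->free_pages > 0) {
        vm->free_pages--;
        return SKETCH_OK;
    }
    if (vm->clean_pages > 0) {
        vm->clean_pages--; /* drop a clean page and reuse it */
        return SKETCH_OK;
    }
    if ((flags & SKETCH_MAY_START_IO) && vm->dirty_pages > 0) {
        /*
         * Freeing a dirty page needs a write. If we are already inside
         * the write path, that write waits on the very request that is
         * waiting for this allocation.
         */
        if (vm->in_write_path)
            return SKETCH_DEADLOCK;
        vm->dirty_pages--;
        return SKETCH_OK;
    }
    return SKETCH_FAIL; /* caller may retry, as the loops in the patch do */
}

/* Model of a driver's make_request allocating bookkeeping for a write. */
enum sketch_result sketch_make_request(struct sketch_vm *vm, int flags)
{
    enum sketch_result result;

    assert(vm);
    vm->in_write_path = true;
    result = sketch_alloc(vm, flags);
    vm->in_write_path = false;
    return result;
}
```

With no free or clean pages and some dirty ones, sketch_make_request() with SKETCH_GFP_KERNEL reports SKETCH_DEADLOCK, while SKETCH_GFP_BUFFER reports SKETCH_FAIL and leaves the caller to retry. That retry is what the existing while loops around kmalloc in raid1.c already do.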



Re: Linux 2.2.18pre17

2000-10-19 Thread Marcelo Tosatti


On Thu, 19 Oct 2000, Alan Cox wrote:

> This is just to give folks something to sync against. Test it by all means
> however.
> 
> Must fix stuff left to do for 2.2.18final
> - Merge the S/390 stuff and make S/390 build again
> - Fix the megaraid (revert if need be)
> - Fix the ps/2 misdetect bug that has appeared
> - NFSv3 hang report
> - Get to the bottom of the VM mystery if possible
> 
> 2.2.18pre17
> o Move a few escaped m68k headers into the right  (me)
>   directory
> o Backport 2.4 AF_UNIX garbage collect speedups   (Dave Miller)
> o TCP fixes for NFS                               (Saadia Khan)
> o Fix USB audio hangs                             (David Woodhouse)
> o Sparc64 dcache and exec fixes                   (Dave Miller)
> o Fix typing crap in divert.h                     (Jeff Garzik)
> o Use pkt_type in diverter, add maintainer info   (Dave Miller)
> o Fix obscure NAT problem in FIB code             (Dave Miller)
> o Fix sk->allocation in TCP sendmsg               (Marcelo Tossati)

This TCP fix was done by davem.

Any reason why the nbd fix was not included? 

Thanks



