Re: [Fwd: NTFS repair tools]

2000-12-07 Thread Andrzej Krzysztofowicz

"Peter Samuelson wrote:"
> [Michael Warfield]
> > This thing is not armed and dangerous due to an act of omission.
> > It's live and active only through three acts of commission.
> 
> We could make it *four* acts of commission. (: (: (:
> 
> diff -urk~ fs/Config.in
> --- fs/Config.in~ Mon Nov 13 01:43:42 2000
> +++ fs/Config.in  Thu Dec  7 23:00:34 2000
> @@ -37,7 +37,8 @@
>  tristate 'Minix fs support' CONFIG_MINIX_FS
>  
>  tristate 'NTFS file system support (read only)' CONFIG_NTFS_FS
> -dep_mbool '  NTFS write support (DANGEROUS)' CONFIG_NTFS_RW $CONFIG_NTFS_FS $CONFIG_EXPERIMENTAL
> +dep_mbool '  NTFS write support (DANGEROUS)' CONFIG_MORON $CONFIG_NTFS_FS $CONFIG_EXPERIMENTAL
> +dep_bool  'Are you sure?  I hope you dont care about your NTFS filesystems' CONFIG_NTFS_RW $CONFIG_MORON
>  
>  tristate 'OS/2 HPFS file system support' CONFIG_HPFS_FS

Of course, you know that it *WILL NOT* work as CONFIG_MORON is nowhere 
defined ... ?

Andrzej
--
===
  Andrzej M. Krzysztofowicz   [EMAIL PROTECTED]
  phone (48)(58) 347 14 61
Faculty of Applied Phys. & Math.,   Technical University of Gdansk
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] NTFS repair tools

2000-12-07 Thread Anton Altaparmakov

Hearing how many people trash their partitions, I would agree to comment out 
the NTFS write option altogether. I will make a patch for both 2.4.0-testX 
and 2.2.18latest and send them off to Linus/Alan over the weekend if no one 
beats me to it.

Considering that people are blatantly ignoring all our warnings, this might 
be the Right Thing(TM), as it is easy enough to activate the option if 
someone really wants/needs to use it. That should hopefully lower the 
number of incidents with people trashing their partitions[1][2].

Anton

[1] On the other hand it might not help much as people might just uncomment 
it and go ahead using it, but there is a limit to how far we can go without 
taking out the write part of the driver altogether! Which might actually 
not be a Bad Thing(TM) were it not for the fact that having the write 
support can actually help in fixing a trashed partition when people know 
what they are doing, i.e. when they know what they can do safely and what 
not. It's saved me from losing 10GB+ of non-backed-up data in the past!

[2] My NTFS repair utility is under development, albeit very slowly; it 
should help a little once I have a stable release. The initial release date 
is still TBA, as there are some very strange bugs in it at the moment, which 
might actually turn out to be bugs in the compiler/libc/kernel: the 
program sometimes runs fine and sometimes corrupts the partition slightly 
when operating on the _exact_ same partition with the _exact_ same data on it! 
Anyway, I am not releasing this to the public before I have figured out WTH 
is going on...

At 06:06 08/12/2000, Peter Samuelson wrote:
>[Jeff Merkey]
> > Please consider the attached patch to make it a little bit harder for
> > folks to enable NTFS Write Support under Linux until it can get fixed
> > properly.
>
>Hey!  It was a joke!  A better way would be just to comment out the
>CONFIG_NTFS_RW line entirely.  Actually, I think that *would* be a good
>idea.  Anyone who has any business messing with NTFS_RW is more than
>capable of editing Config.in.
>
>Peter

-- 
  "Education is what remains after one has forgotten everything he 
learned in school." - Albert Einstein
-- 
Anton Altaparmakov  Voice: +44-(0)1223-333541(lab) / +44-(0)7712-632205(mobile)
Christ's College    eMail: [EMAIL PROTECTED] / [EMAIL PROTECTED]
Cambridge CB2 3BU   ICQ: 8561279
United Kingdom      WWW: http://www-stu.christs.cam.ac.uk/~aia21/




Re: kernel BUG at buffer.c:827 in test12-pre6 and 7

2000-12-07 Thread Tom Leete

Linus Torvalds wrote:
> 
> It's not a new bug - it's an old bug that apparently is uncovered by a
> new stricter test.
> 
> Apparently loopback unlocks an already unlocked page - which has always
> been a serious offense, but has never been detected before.
> 
> test12-pre6+ detects it, and thus the BUG().
> 
> Your stack trace isn't symbolic (see Documentation/oops-tracing.txt), so
> it's impossible to debug, but it's already interesting information to
> see that it seems to be either loopback or vfat.
> 
> Can you test some more? In particular, I'd love to hear if this happens
> with vfat even without loopback, or with loopback even without vfat
> (make an ext2 filesystem or similar instead). That would narrow down the
> bug further.
> 
> Thanks,
> Linus
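
The point about a stricter test exposing an old bug can be illustrated with a small sketch. This is a toy model in Python, not the kernel's code: the `Page` class and both unlock methods are hypothetical names, and an exception stands in for BUG().

```python
class Page:
    """Toy stand-in for a page with a lock bit (not the kernel's struct page)."""

    def __init__(self):
        self.locked = False

    def lock(self):
        if self.locked:
            raise RuntimeError("page already locked")
        self.locked = True

    def unlock_lenient(self):
        # Lenient variant: clear the bit without checking it, so a double
        # unlock goes unnoticed.
        self.locked = False

    def unlock_strict(self):
        # Strict variant: unlocking a page that is not locked is reported,
        # here as an exception in place of the kernel's BUG().
        if not self.locked:
            raise RuntimeError("BUG: unlocking an already unlocked page")
        self.locked = False


def double_unlock_detected(strict):
    """Return True if a second unlock on the same page is caught."""
    page = Page()
    page.lock()
    unlock = page.unlock_strict if strict else page.unlock_lenient
    unlock()
    try:
        unlock()
    except RuntimeError:
        return True
    return False
```

With the lenient variant the second unlock passes silently; with the strict one the same caller bug surfaces at once, which is the situation described for test12-pre6 and later.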

Hi,

Here is a rather different datapoint. I hope it's different enough to help
nail this.

test12-pre5 + kdb + Serial Console. Sorry, I didn't get the contents of
pointer args.

It's probably worth saying that this kernel was compiled with gcc 2.95.2. I
have the blessed egcs also, will compile pre7 with that and see what
happens.

Cheers,
Tom

from nfs mounted ext2 (2.4.0-test12-pre5):
kernel BUG at buffer.c:827!

Entering kdb (current=0xc2014000, pid 466) Panic: invalid operand
due to panic @ 0xc0130c73
eax = 0x001c ebx = 0xc109ebf8 ecx = 0x edx = 0x0006 
esi = 0xc2739af0 edi = 0xc2739b38 esp = 0xc2015dd4 eip = 0xc0130c73 
ebp = 0xc2015df4 xss = 0x0018 xcs = 0xc2010010 eflags = 0x00010046 
xds = 0x0018 xes = 0x0018 origeax = 0x  = 0xc2015da0
kdb> bt
EBP   EIP Function(args)
0xc2015df4 0xc0130c73 end_buffer_io_async+0xd3 (0xc2739af0, 0x1)
   kernel .text 0xc010 0xc0130ba0 0xc0130cb0
0xc2015e10 0xc0164756 end_that_request_first+0x66 (0xc11c2c20, 0x1,
0xc031cf04)
   kernel .text 0xc010 0xc01646f0 0xc01647b0
0xc2015e30 0xc018a128 ide_end_request+0x28 (0x1, 0xc11c5060)
   kernel .text 0xc010 0xc018a100 0xc018a180
0xc2015e64 0xc018e614 read_intr+0x104 (0xc031ce20)
   kernel .text 0xc010 0xc018e510 0xc018e650
0xc2015e88 0xc018bad6 ide_intr+0x106 (0xe, 0xc11c5060, 0xc2015ed4, 0x1c0)
   kernel .text 0xc010 0xc018b9d0 0xc018bb30
0xc2015ea8 0xc010ab50 handle_IRQ_event+0x30 (0xe, 0xc2015ed4, 0xc11de2e0)
   kernel .text 0xc010 0xc010ab20 0xc010ab80
0xc2015ecc 0xc010ace2 do_IRQ+0x72 (0xc02f4520, 0xc02a92ac, 0xc201c000,
0xc201c000, 0xfc18)
   kernel .text 0xc010 0xc010ac70 0xc010ad30
   0xc01093f0 ret_from_intr
   kernel .text 0xc010 0xc01093f0 0xc0109410
Interrupt registers:
eax = 0x0019 ebx = 0xc02f4520 ecx = 0xc02a92ac edx = 0xc201c000 
esi = 0xc201c000 edi = 0xfc18 esp = 0xc2015f08 eip = 0xc0115816 
ebp = 0xc2015f4c xss = 0x0018 xcs = 0xc010 eflags = 0x0287 
xds = 0xc2070018 xes = 0xc2070018 origeax = 0xff0e  = 0xc2015ed4 
   0xc0115816 schedule+0x1b6
   kernel .text 0xc010 0xc0115660 0xc0115b00
0xc2015f70 0xc01155c7 schedule_timeout+0x17 (0xc2014000, 0x1785222)
   kernel .text 0xc010 0xc01155b0 0xc0115650
0xc2015fac 0xc4055753 [sunrpc]svc_recv+0x1a3 (0xc24b2470, 0xc207be00,
0x7fff)
   sunrpc .text 0xc404e060 0xc40555b0 0xc4055940
0xc2015fec 0xc40713f3 [nfsd]nfsd+0x253
   nfsd .text 0xc4071060 0xc40711a0 0xc40714a0
   0xc0107843 kernel_thread+0x23
   kernel .text 0xc010 0xc0107820 0xc0107860



2.4.0test10: 3c59x: Transmit timed out

2000-12-07 Thread Hans-Joachim Baader

Hi,

I got the following timeout on an SMP system:

3c59x.c:LK1.1.9  2 Sep 2000  Donald Becker and others. 
http://www.scyld.com/network/vortex.html $Revision: 1.102.2.38 $
See Documentation/networking/vortex.txt
eth0: 3Com PCI 3c900 Boomerang 10Mbps Combo at 0xa800,  00:60:97:b0:c2:25, IRQ 10
  8K word-wide RAM 3:5 Rx:Tx split, autoselect/10base2 interface.
Enabling bus-master transmits and whole-frame receives.


NETDEV WATCHDOG: eth0: transmit timed out
eth0: transmit timed out, tx_status 00 status e000.
Flags; bus-master 1, full 1; dirty 8031(15) current 8047(15).
Transmit list 05d802f0 vs. c5d802f0.
0: @c5d80200  length 802a status 002a
1: @c5d80210  length 802a status 002a
2: @c5d80220  length 802a status 002a
3: @c5d80230  length 802a status 002a
4: @c5d80240  length 802a status 002a
5: @c5d80250  length 802a status 002a
6: @c5d80260  length 802a status 002a
7: @c5d80270  length 802a status 002a
8: @c5d80280  length 802a status 002a
9: @c5d80290  length 802a status 002a
10: @c5d802a0  length 802a status 002a
11: @c5d802b0  length 802a status 002a
12: @c5d802c0  length 802a status 002a
13: @c5d802d0  length 802a status 802a
14: @c5d802e0  length 802a status 802a
15: @c5d802f0  length 802a status 002a
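
One possible reading of the "dirty 8031(15) current 8047(15)" line, sketched in Python. It assumes the parenthesised values are the two counters taken modulo the ring size, and that the ring has 16 entries because the dump lists descriptors 0 to 15; the function and key names are made up for the illustration and are not taken from the driver.

```python
TX_RING_SIZE = 16  # assumption: matches the 16 descriptors (0..15) in the dump


def ring_state(dirty, current, ring_size=TX_RING_SIZE):
    """Summarise a descriptor ring from its two free-running counters."""
    in_flight = current - dirty  # descriptors queued but not yet reclaimed
    return {
        "dirty_slot": dirty % ring_size,
        "current_slot": current % ring_size,
        "in_flight": in_flight,
        "full": in_flight >= ring_size,
    }


state = ring_state(dirty=8031, current=8047)
```

Under these assumptions both counters land on slot 15 and the ring holds 16 outstanding descriptors, which would agree with the "full 1" flag in the log.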


# cat /proc/interrupts (after reloading the driver)
           CPU0       CPU1
  0:    1713215    1742004    IO-APIC-edge  timer
  1:          0          2    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  3:          1          4    IO-APIC-edge  serial
  4:      78236      81425    IO-APIC-edge  serial
  5:          1          0    IO-APIC-edge  soundblaster
  8:          0          3    IO-APIC-edge  rtc
  9:     177420     177907   IO-APIC-level  sym53c8xx
 10:       3277       3323   IO-APIC-level  eth0
 11:          0          0    IO-APIC-edge  mcdx
 12:      50650      49080   IO-APIC-level  eth1
 13:          0          0          XT-PIC  fpu
NMI:    3455146    3455146
LOC:    3454715    3454709
ERR:          0

Regards,
hjb
-- 
http://www.pro-linux.de/ - Germany's largest volunteer Linux support site



Re: java (and possibly other threaded apps) hanging in rt_sigsuspend

2000-12-07 Thread Juergen Kreileder

> "Frank" == Frank de Lange <[EMAIL PROTECTED]> writes:

Frank> I saw your remarks on the kernel mailing list
Frank> wrt. 'threaded processes get stuck in
Frank> rt_sigsuspend/fillonedir/exit_notify' dd. 2911-12, and
Frank> thought you might be interested in the fact that something
Frank> quite like this also happens on 2.4.0-test11 with glibc-2.2
Frank> (release), BUT NOT ALWAYS...

Frank> I can reliably hang java (Blackdown port jdk1.3, FCS) using
Frank> the -Xmx parameter (which specifies a maximum heap size),
Frank> the weird thing is that it does NOT hang when this
Frank> parameter is either not specified OR specified but larger
Frank> than a certain value. When it hangs, it always is stuck in
Frank> a rt_sigsuspend call just after a clone() call. An example:

Frank>  [frank@behemoth frank]$ java
Frank> (java starts and spits out some info, then exits as
Frank> it should)

Frank>  [frank@behemoth frank]$ java -Xmx32m
Frank> (java ALWAYS gets stuck:

Frank>  pipe([6, 7])= 0
Frank>  clone() = 14732
Frank>  [pid 14679] write(7, "\0\0\0\0\5\0\0\0~\266\2@ $T@\0 T@\0 T@\300\265\2@\0\0\0"..., 148) = 148
Frank>  [pid 14679] rt_sigprocmask(SIG_SETMASK, NULL, [RT_0], 8) = 0
Frank>  [pid 14679] write(7, "`S\3@\0\0\0\0\20\321\377\277pD\37@\30&\5\10\0\0\0\200\0"..., 148) = 148
Frank>  [pid 14679] rt_sigprocmask(SIG_SETMASK, NULL, [RT_0], 8) = 0
Frank>  [pid 14679] rt_sigsuspend([]
Frank>  )

Can you reproduce this without strace?

I only see this problem when I run with 'strace -f' and java wants to
exit (apart from that java works correctly).  I don't see the dependency
on the heap size here.


Juergen

-- 
Juergen Kreileder, Blackdown Java-Linux Team
http://www.blackdown.org/java-linux.html
JVM'01: http://www.usenix.org/events/jvm01/



Re: [PATCH] NTFS repair tools

2000-12-07 Thread Peter Samuelson


[Jeff Merkey]
> Please consider the attached patch to make it a little bit harder for
> folks to enable NTFS Write Support under Linux until it can get fixed
> properly.

Hey!  It was a joke!  A better way would be just to comment out the
CONFIG_NTFS_RW line entirely.  Actually, I think that *would* be a good
idea.  Anyone who has any business messing with NTFS_RW is more than
capable of editing Config.in.

Peter



multiple emails

2000-12-07 Thread Mohammad A. Haque

My ISP is having mailserver problems causing messages to be sent out
multiple times. They should have it fixed soon.

-- 

=
Mohammad A. Haque  http://www.haque.net/ 
   [EMAIL PROTECTED]

  "Alcohol and calculus don't mix. Project Lead
   Don't drink and derive." --Unknown  http://wm.themes.org/
   [EMAIL PROTECTED]
=



[PATCH] NTFS repair tools

2000-12-07 Thread Jeff V. Merkey


Linus/Alan,

Please consider the attached patch to make it a little bit harder for
folks to enable NTFS Write Support under Linux until it can get fixed
properly.

Jeff


Peter Samuelson wrote:
> 
> [Michael Warfield]
> > This thing is not armed and dangerous due to an act of omission.
> > It's live and active only through three acts of commission.
> 
> We could make it *four* acts of commission. (: (: (:
> 
> diff -urk~ fs/Config.in
> --- fs/Config.in~   Mon Nov 13 01:43:42 2000
> +++ fs/Config.in    Thu Dec  7 23:00:34 2000
> @@ -37,7 +37,8 @@
>  tristate 'Minix fs support' CONFIG_MINIX_FS
> 
>  tristate 'NTFS file system support (read only)' CONFIG_NTFS_FS
> -dep_mbool '  NTFS write support (DANGEROUS)' CONFIG_NTFS_RW $CONFIG_NTFS_FS $CONFIG_EXPERIMENTAL
> +dep_mbool '  NTFS write support (DANGEROUS)' CONFIG_MORON $CONFIG_NTFS_FS $CONFIG_EXPERIMENTAL
> +dep_bool  'Are you sure?  I hope you dont care about your NTFS filesystems' CONFIG_NTFS_RW $CONFIG_MORON
> 
>  tristate 'OS/2 HPFS file system support' CONFIG_HPFS_FS
> 



cramfs filesystem patch

2000-12-07 Thread Daniel Quinlan

Here's a patch for the cramfs filesystem.  Lots of improvements and a
new cramfsck program, see below for the full list of changes.

It only modifies cramfs code (aside from adding cramfs to struct
super_block) and aims to be completely backward-compatible.  All old
cramfs images will still work and new cramfs images are mountable on
old kernels if you avoid using holes (requires 2.3.39 or later) and
you don't pad the beginning of the cramfs filesystem for a boot sector
(and presumably, you have a new kernel in that case).

Most of this code has been fairly well-tested, but a few features are
newer and less well-tested (like the directory sorting).

If you use cramfs, please try it out and let me know if you have any
feedback or comments (please Cc me if you respond here).

  Dan



superblock changes:
  - revised fsid field (still unique, contains useful fs info:
a CRC, edition number for implementation-specific versioning,
block and file counts for statfs)
  - size field is now used
  - new feature flags (fsid version 2, holes, shifted root for boot loader,
directory sorting)
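
The revised superblock layout can be read off the /usr/share/magic entry included in the patch below. A minimal Python sketch of a parser follows, assuming little-endian byte order (the patch text here does not state it) and using the magic entry's labels as field names; `parse_super` and the sample values are invented for the illustration and are not part of cramfs or mkcramfs.

```python
import struct

CRAMFS_MAGIC = 0x28CD3D45
# Layout taken from the magic entry: four 32-bit words, a 16-byte signature,
# four more 32-bit words (the fsid), and a 16-byte name.
# Little-endian ("<") is an assumption.
SUPER_FORMAT = "<4I16s4I16s"
FIELDS = ("magic", "size", "flags", "future", "signature",
          "crc", "edition", "blocks", "files", "name")


def parse_super(buf):
    """Decode the first 64 bytes of a cramfs image into a dict."""
    values = struct.unpack_from(SUPER_FORMAT, buf, 0)
    info = dict(zip(FIELDS, values))
    if info["magic"] != CRAMFS_MAGIC:
        raise ValueError("not a cramfs image (bad magic)")
    for key in ("signature", "name"):
        info[key] = info[key].rstrip(b"\0").decode("ascii", "replace")
    return info


# A made-up superblock, built only to exercise the parser.
sample = struct.pack(SUPER_FORMAT, CRAMFS_MAGIC, 4096, 0x3, 0,
                     b"Compressed ROMFS", 0xDEADBEEF, 1, 1, 5, b"demo")
parsed = parse_super(sample)
```

The four fsid words (crc, edition, blocks, files) sit at offsets 32 to 47, between the signature and the name, matching the offsets in the magic entry.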

inode.c
  - proper checking for device size (torvalds)
  - speed up IO a bit - read in the block asynchronously rather than
using bread() on them one by one (torvalds)
  - early directory lookup exit (since directories are now sorted)
  - add reporting of blocks and blksize for statfs
  - removed superfluous test for cramfs signature
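
Why sorting allows an early exit from a directory lookup can be sketched as follows. This is an illustration in Python, not the inode.c code; it assumes names are compared as plain strings, and `lookup_sorted` is a hypothetical helper.

```python
def lookup_sorted(entries, name):
    """Linear scan of a name-sorted directory.

    Returns (found, number of entries examined).
    """
    examined = 0
    for entry in entries:
        examined += 1
        if entry == name:
            return True, examined
        if entry > name:
            # Sorted order: every later entry is larger still, so stop here.
            return False, examined
    return False, examined


directory = sorted(["bin", "dev", "etc", "lib", "sbin", "usr", "var"])
hit = lookup_sorted(directory, "etc")
miss = lookup_sorted(directory, "home")
```

A failed lookup for "home" stops at "lib" after four entries instead of walking all seven, which is the saving an unsorted directory cannot offer.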

cramfsck.c
  - new program to fsck/extract cramfs images

mkcramfs.c
  - CRC added to filesystem image
  - allow holes to be created (see -z option)
  - support for a cramfs boot loader (see -p option)
  - support for inserting an kernel image into a cramfs image (see -i option)
  - directory sorting (more consistent images, allows faster lookups)
  - bug fix: allocate enough RAM for the fs image (small images)
  - bug fix: work around a bug in ramfs where it would
incorrectly report the number of blocks for file (a problem when
creating cramfs images from a ramfs filesystem)
  - options to set name/edition number in superblock (-n and -e, respectively)

cramfs.txt:
  - added entry for /etc/magic to cramfs.txt documentation

cleanup:
  - moved width definitions to cramfs.h
  - define special cramfs super block
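
The width definitions moved to cramfs.h make it straightforward to check whether a value fits its inode bitfield, which is the kind of warning mkcramfs is meant to produce. A Python sketch under stated assumptions: the widths are copied from the patch, the helper names are hypothetical, and placing the first-named field in the low bits of each 32-bit word is an assumption about the compiler's bitfield layout.

```python
CRAMFS_MODE_WIDTH = 16
CRAMFS_UID_WIDTH = 16
CRAMFS_SIZE_WIDTH = 24
CRAMFS_GID_WIDTH = 8
CRAMFS_NAMELEN_WIDTH = 6
CRAMFS_OFFSET_WIDTH = 26


def fits(value, width):
    """True if value can be stored in an unsigned bitfield of this width."""
    return 0 <= value < (1 << width)


def pack_pair(low, low_width, high, high_width):
    """Pack two fields into one 32-bit word, first field in the low bits."""
    if low_width + high_width != 32:
        raise ValueError("widths must add up to 32 bits")
    if not (fits(low, low_width) and fits(high, high_width)):
        raise ValueError("value does not fit its bitfield")
    return low | (high << low_width)


def unpack_pair(word, low_width):
    """Split a 32-bit word back into its (low, high) fields."""
    return word & ((1 << low_width) - 1), word >> low_width


# mode and uid share the first word of the inode.
word = pack_pair(0o100644, CRAMFS_MODE_WIDTH, 1000, CRAMFS_UID_WIDTH)
```

With these widths a gid above 255 or a file size of 16 MB or more cannot be represented, so a tool would have to warn or truncate.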

diff -ur linux-2.4.0-test10.orig/Documentation/filesystems/cramfs.txt linux-2.4.0-test10/Documentation/filesystems/cramfs.txt
--- linux-2.4.0-test10.orig/Documentation/filesystems/cramfs.txt  Sat May 20 11:30:31 2000
+++ linux-2.4.0-test10/Documentation/filesystems/cramfs.txt  Wed Dec  6 19:12:12 2000
@@ -47,6 +47,21 @@
 mind the filesystem becoming unreadable to future kernels.
 
 
+For /usr/share/magic
+--
+
+0       long    0x28cd3d45  Linux cramfs
+>4      long    x           size %d
+>8      long    x           flags 0x%x
+>12     long    x           future 0x%x
+>16     string  >\0         signature "%.16s"
+>32     long    x           fsid.crc 0x%x
+>36     long    x           fsid.edition %d
+>40     long    x           fsid.blocks %d
+>44     long    x           fsid.files %d
+>48     string  >\0         name "%.16s"
+
+
 Hacker Notes
 
 
diff -ur linux-2.4.0-test10.orig/fs/cramfs/cramfs.h linux-2.4.0-test10/fs/cramfs/cramfs.h
--- linux-2.4.0-test10.orig/fs/cramfs/cramfs.h  Tue Jan 11 10:24:58 2000
+++ linux-2.4.0-test10/fs/cramfs/cramfs.h   Wed Dec  6 19:12:12 2000
@@ -5,12 +5,23 @@
 #define CRAMFS_SIGNATURE   "Compressed ROMFS"
 
 /*
+ * Width of various bitfields in struct cramfs_inode.
+ * Primarily used to generate warnings in mkcramfs.
+ */
+#define CRAMFS_MODE_WIDTH 16
+#define CRAMFS_UID_WIDTH 16
+#define CRAMFS_SIZE_WIDTH 24
+#define CRAMFS_GID_WIDTH 8
+#define CRAMFS_NAMELEN_WIDTH 6
+#define CRAMFS_OFFSET_WIDTH 26
+
+/*
  * Reasonably terse representation of the inode data.
  */
 struct cramfs_inode {
-   u32 mode:16, uid:16;
+   u32 mode:CRAMFS_MODE_WIDTH, uid:CRAMFS_UID_WIDTH;
/* SIZE for device files is i_rdev */
-   u32 size:24, gid:8;
+   u32 size:CRAMFS_SIZE_WIDTH, gid:CRAMFS_GID_WIDTH;
/* NAMELEN is the length of the file name, divided by 4 and
rounded up.  (cramfs doesn't support hard links.) */
/* OFFSET: For symlinks and non-empty regular files, this
@@ -19,7 +30,14 @@
   see README).  For non-empty directories it is the offset
   (divided by 4) of the inode of the first file in that
   directory.  For anything else, offset is zero. */
-   u32 namelen:6, offset:26;
+   u32 namelen:CRAMFS_NAMELEN_WIDTH, offset:CRAMFS_OFFSET_WIDTH;
+};
+
+struct cramfs_info {
+   u32 crc;
+   u32 edition;
+   u32 blocks;
+   u32 files;
 };
 
 /*
@@ -27,22 +45,33 @@
  */
 struct cramfs_super {
u32 magic;  /* 0x28cd3d45 - random number */
-   u32 size;   /* Not used.  mkcramfs 

Re: [Fwd: NTFS repair tools]

2000-12-07 Thread Peter Samuelson


[Michael Warfield]
> This thing is not armed and dangerous due to an act of omission.
> It's live and active only through three acts of commission.

We could make it *four* acts of commission. (: (: (:

diff -urk~ fs/Config.in
--- fs/Config.in~   Mon Nov 13 01:43:42 2000
+++ fs/Config.in    Thu Dec  7 23:00:34 2000
@@ -37,7 +37,8 @@
 tristate 'Minix fs support' CONFIG_MINIX_FS
 
 tristate 'NTFS file system support (read only)' CONFIG_NTFS_FS
-dep_mbool '  NTFS write support (DANGEROUS)' CONFIG_NTFS_RW $CONFIG_NTFS_FS $CONFIG_EXPERIMENTAL
+dep_mbool '  NTFS write support (DANGEROUS)' CONFIG_MORON $CONFIG_NTFS_FS $CONFIG_EXPERIMENTAL
+dep_bool  'Are you sure?  I hope you dont care about your NTFS filesystems' CONFIG_NTFS_RW $CONFIG_MORON
 
 tristate 'OS/2 HPFS file system support' CONFIG_HPFS_FS
 



Re: [Fwd: NTFS repair tools]

2000-12-07 Thread Jeff V. Merkey



"Michael H. Warfield" wrote:

> > Agree.  We need to disable it, since folks do not read the docs
> > (obviously).  Of course, we could leave it on, and I could start
> > charging money for these tools -- there's little doubt it would be a
> > lucrative business.  Perhaps this is what I'll do if the number of
> > copies keeps growing.  When it hits > 100 per week, it's taking a lot of
> > our time to support, so I will have to start charging for it.
> 
> Huh?
> 
> How disabled do you want it?  It can't even be enabled unless
> you enabled experimental code options.  Then, it's disabled by default
> and you first have to enable the R/O NTFS.  Then you have to explicitly
> select the option to enable RW access that is clearly labeled DANGEROUS.
> This thing is not armed and dangerous due to an act of omission.  It's
> live and active only through three acts of commission.
> 
> About the only thing left, short of removing it from the kernel
> entirely, is to make the option a hidden control option, like some of the
> debugging options, that requires editing a header file or a Makefile to
> enable.  Is that what you are looking for?
> 

Linux Today monitors this list.  Some public education may be the best
route.  How do we write a security advisory that will actually get
posted?  I'm sure folks see the DANGEROUS comments, but they don't seem
to stick in their heads.  Then they get themselves into trouble, and
fortunately for them, I'm around.  I am just concerned about the black
eye for Linux NTFS, which will just keep getting bigger and bigger.

:-)

Jeff



Re: [Fwd: NTFS repair tools]

2000-12-07 Thread Michael H. Warfield

On Thu, Dec 07, 2000 at 09:43:24PM -0700, Jeff V. Merkey wrote:

> Peter Samuelson wrote:
> > 
> > [Jeff Merkey]
> > > Do folks not know this NTFS driver will trash hard drives?  We need
> > > to alert folks DO NOT USE WRITE NTFS MODE in those versions we know
> > > are busted.

> > Here's an idea: let's make r/w support a separate CONFIG option, and
> > label it "DANGEROUS".

> > Oh wait, we already do that.

> > Perhaps we should warn users to back up their NTFS partitions before
> > trying this option.  Put that warning in the help text for
> > CONFIG_NTFS_RW.

> > Oh wait, we already do that too.

> > How stupid does one have to be in order to enable an option labeled
> > "DANGEROUS" for a non-experimental system?

> Agree.  We need to disable it, since folks do not read the docs
> (obviously).  Of course, we could leave it on, and I could start
> charging money for these tools -- there's little doubt it would be a
> lucrative business.  Perhaps this is what I'll do if the number of
> copies keeps growing.  When it hits > 100 per week, it's taking a lot of
> our time to support, so I will have to start charging for it.

Huh?

How disabled do you want it?  It can't even be enabled unless
you enabled experimental code options.  Then, it's disabled by default
and you first have to enable the R/O NTFS.  Then you have to explicitly
select the option to enable RW access that is clearly labeled DANGEROUS.
This thing is not armed and dangerous due to an act of omission.  It's
live and active only through three acts of commission.

About the only thing left, short of removing it from the kernel
entirely, is to make the option a hidden control option, like some of the
debugging options, that requires editing a header file or a Makefile to
enable.  Is that what you are looking for?
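
The three explicit choices described above can be modelled in a few lines. This is a simplified Python sketch, not the Configure script's logic: `ntfs_rw_enabled` is a hypothetical helper, and treating a modular ('m') NTFS driver as sufficient for the write option is an assumption.

```python
def ntfs_rw_enabled(answers):
    """answers maps option names to 'y', 'm' or 'n' as a user might pick them."""
    experimental = answers.get("CONFIG_EXPERIMENTAL", "n") == "y"
    ntfs = answers.get("CONFIG_NTFS_FS", "n") in ("y", "m")
    wants_rw = answers.get("CONFIG_NTFS_RW", "n") == "y"
    # All three must be chosen explicitly; the defaults never enable writing.
    return experimental and ntfs and wants_rw


defaults = ntfs_rw_enabled({})
all_three = ntfs_rw_enabled({
    "CONFIG_EXPERIMENTAL": "y",
    "CONFIG_NTFS_FS": "m",
    "CONFIG_NTFS_RW": "y",
})
```

Leaving out any one of the three answers keeps write support off, which is the sense in which it takes acts of commission rather than omission.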

> Jeff

> Jeff   


> > 
> > Peter

Mike
-- 
 Michael H. Warfield|  (770) 985-6132   |  [EMAIL PROTECTED]
  (The Mad Wizard)  |  (678) 463-0932   |  http://www.wittsend.com/mhw/
  NIC whois:  MHW9  |  An optimist believes we live in the best of all
 PGP Key: 0xDF1DD471|  possible worlds.  A pessimist is sure of it!




Re: Kernel Development Documentation?

2000-12-07 Thread Jeff V. Merkey



Carl Perry wrote:
> 
> I was reading an article today on Slashdot about how poorly documented the
> Windows API was and had this fear that Linux could get the same way.  So, cut to
> the point:

Whoever posted this on Slashdot is smoking some killer dope or
something.  The Windows API is very well documented via online docs from
MSDN -- better than most software anywhere in the world.   Whoever said
this was high or something or ignorant or never had an MSDN
subscription.  I used to get 20-40 CDs a month from MS on a Universal
Subscription, with enormous documentation.

Jeff



Re: [Fwd: NTFS repair tools]

2000-12-07 Thread Jeff V. Merkey



Peter Samuelson wrote:
> 
> [Jeff Merkey]
> > Do folks not know this NTFS driver will trash hard drives?  We need
> > to alert folks DO NOT USE WRITE NTFS MODE in those versions we know
> > are busted.
> 
> Here's an idea: let's make r/w support a separate CONFIG option, and
> label it "DANGEROUS".
> 
> Oh wait, we already do that.
> 
> Perhaps we should warn users to back up their NTFS partitions before
> trying this option.  Put that warning in the help text for
> CONFIG_NTFS_RW.
> 
> Oh wait, we already do that too.
> 
> How stupid does one have to be in order to enable an option labeled
> "DANGEROUS" for a non-experimental system?

Agree.  We need to disable it, since folks do not read the docs
(obviously).  Of course, we could leave it on, and I could start
charging money for these tools -- there's little doubt it would be a
lucrative business.  Perhaps this is what I'll do if the number of
copies keeps growing.  When it hits > 100 per week, it's taking a lot of
our time to support, so I will have to start charging for it.

Jeff


> 
> Peter



Kernel Development Documentation?

2000-12-07 Thread Carl Perry

I was reading an article today on Slashdot about how poorly documented the
Windows API was and had this fear that Linux could get the same way.  So, cut to
the point:

Is there a project underway that documents things like the VM, the memory
manager, what a specific driver needs to do, what it needs to return, how it is
called, and what all those files in arch/whatever do?  Or are there only bits
and pieces scattered all over the net?  It would seem to me that someone (and I
will volunteer) should sit down and put together some sort of documentation
system for the kernel: something to the effect of a super help for users, with
all the technical details for developers.

Obviously it should be kept on the Internet, but in such a way that it can be
contributed to by many people (preferably the people who have influence with
the code).  It should also have a team of "editors" and be maintained such that
it is browsable, printable, and publishable.

I personally feel that now would be the time to do this, while the current
development tree is being wrapped up for production and there is no major work
on a 2.5 development tree, at least publicly.  Like I said, I volunteer to
coordinate this effort and find a place to host it.  What are the thoughts of
others?  Good idea, bad idea?  Input is more than welcome.
-- 
-Carl Perry
[EMAIL PROTECTED]

"Real programmers don't draw flowcharts.  Flowcharts are, after
all, the illiterate's form of documentation.  Cavemen drew
flowcharts; look how much good it did them."
-Fortune (The App, not the Magazine)



KernelWiki for December: Season of the Gift

2000-12-07 Thread Gary Lawrence Murphy

No, I hadn't forgotten: Time for another poke-the-kernel-list post.

December is, for many people, a time of community and family, and a
time for giving gifts to friends and strangers.  In Japan, I am told,
they have a custom of giving away to others the gifts they have
received.

The December 2000 KernelWiki Challenge is really simple:

"When I asked about ___, ___ told me ___"

1) Fill in the blanks or comb your back emails for some gift of kernel
   insight which you received from someone else.

2) Go to http://kernelbook.sourceforge.net/wiki/?KernelWiki and find
   the appropriate KernelWiki page.

3) Click the "Edit this Page" link

4) Plunk your December KernelWiki response into the text box.

5) Click "Save" and get back to your holiday festivities.

It's painless. All I want is 10 minutes of your time.  The best stuff
is already sitting there in your email files, all you have to do is
dig it out, dust it off, and share it.  Even if you are still trying
to make sense of it, if it seems useful to understanding Linux 2.4,
plunk it in. It's easy. 10 minutes work, 15 tops.

Hundreds of messages pass through this list in a day, and while most
are about the day to day business of _building_ the new kernel, some
small percent are general answers that illuminate the Kernel.  Those
flashes _deserve_ to be collected and shared. One month's worth of
these gems could illuminate whole sections of the kernel.

What do you win?  Do it right, and you might cause that "transmission
of light" which nets you assistance in your kernel hacking.  The word
"Community" comes from Latin roots meaning "Those with whom I share
gifts".  Your contribution to the KernelWiki makes that community just
a little larger.

  WARNING: I will persist in pestering for participation, but no more
  than once a month.  The subject line is stable enough to regexp for
  a kill file, but the simple fact is the KernelWiki _is_ working.

KernelWiki charges forth, breaking all records, charting new ground,
belying the naysayers.  KernelWiki has exceeded all expectations.
KernelWiki is a hit, the cover of Der Spiegel and Time's Kernel Doc of
the Year.  Be the first in your network segment to KernelWiki!

Should you have more than 15 minutes to spare and you are interested
in this KernelWiki thing, you are invited to fetch

http://kernelbook.sourceforge.net/wiki/?KernelWikiWhy
http://kernelbook.sourceforge.net/wiki/?KernelWikiPolicies
http://kernelbook.sourceforge.net/wiki/?HowToUseWiki

The dedicated gift-givers are invited to cruise KernelWiki for
question marks, and click the mark to describe the undefined term.
KernelWiki lives by your kind contributions.

See you in 2001


-- 
Gary Lawrence Murphy <[EMAIL PROTECTED]>: office voice/fax: 01 519 4222723
T(!c)Inc Business Innovation through Open Source http://www.teledyn.com
M:I-3 - Documenting the Linux kernel: http://kernelbook.sourceforge.net
"My humanity is bound up in yours; we can only be human together"(Tutu)



Re: [Fwd: NTFS repair tools]

2000-12-07 Thread Peter Samuelson


[Jeff Merkey]
> Do folks not know this NTFS driver will trash hard drives?  We need
> to alert folks DO NOT USE WRITE NTFS MODE in those versions we know
> are busted.

Here's an idea: let's make r/w support a separate CONFIG option, and
label it "DANGEROUS".

Oh wait, we already do that.

Perhaps we should warn users to back up their NTFS partitions before
trying this option.  Put that warning in the help text for
CONFIG_NTFS_RW.

Oh wait, we already do that too.

How stupid does one have to be in order to enable an option labeled
"DANGEROUS" for a non-experimental system?

Peter



Re: Signal 11

2000-12-07 Thread Jeff V. Merkey



"Jeff V. Merkey" wrote:
> 
> > So have you enabled core dumps and actually looked at the core dumps
> > of the programs using gdb to see where they crashed ?
> 
> Yes.  I can only get the SSH crash when I am running remotely from the
> house over the internet, and it only shows then when running a build in
> jobserver mode (parallel build).  The X problem seems related as well,
> since it's related to (usually) Netscape spawning off a forked process.
> I will attempt to recreate tonight, and post the core dump file.

BTW.  Were I to wager a guess, I would guess it's related to the paging
problems in 2.4 when a process gets cloned, since every time I have seen
it, it happens when a child process gets forked then accesses the cloned
data from the parent.  In the previous core dumps, it always puked right
after a call to fork() when the child process attempted to WRITE (not
read) data in the program.
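
For anyone who wants to poke at this pattern, here is a minimal userspace
sketch of the sequence described above: fork, then have the child WRITE to
data inherited from the parent, which is what triggers copy-on-write.  The
names shared_buf, fork_and_write_once() and count_bad_children() are made
up for illustration; this is not a reproducer taken from the core dumps,
and it may well run clean on an affected box.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical test data: lives in the parent's data segment and is
 * shared copy-on-write with every child after fork(). */
static char shared_buf[4096] = "parent data";

/* Forks once; the child overwrites data inherited from the parent.
 * Returns the child's raw wait status, or -1 if fork/waitpid failed. */
int fork_and_write_once(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the first store to an inherited page forces the kernel
         * to give this process its own private copy of that page. */
        memset(shared_buf, 'c', sizeof(shared_buf) - 1);
        _exit(shared_buf[0] == 'c' ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* Repeats the fork/write cycle and returns how many children failed to
 * exit cleanly (killed by a signal, non-zero exit, or fork failure). */
int count_bad_children(int rounds)
{
    int bad = 0;
    for (int i = 0; i < rounds; i++) {
        int status = fork_and_write_once();
        if (status < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            bad++;
    }
    return bad;
}
```

Calling count_bad_children(1000) from main() and printing the result gives
a crude stress loop.  On a healthy kernel it should report 0, and the
parent's copy of shared_buf should still read "parent data" afterwards.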

Jeff



[Fwd: NTFS repair tools]

2000-12-07 Thread Jeff V. Merkey


Linux/Linus/Anton/Alan,

I am still sending out the NTFS repair tools for Linux-trashed volumes,
and I've lost count of how many I've sent out, but it's somewhere in
the thousands.  Is NTFS write stable enough now in 2.4 to fix these
problems?  If so, can we DISABLE write, by REMOVING the write code in
the VFS tables, for those versions in other trees we know will trash
people's drives?  I am sending out over 100 copies a week now (I could
make a business out of fixing NTFS drives trashed by Linux), and the
number of people asking for these tools is getting higher instead of
lower, which would indicate more people's data is getting trashed.

Do folks not know this NTFS driver will trash hard drives?  We need to
alert folks DO NOT USE WRITE NTFS MODE in those versions we know are
busted.  I enjoy helping NT customers get their data back and helping
with this problem, but at some point, the NTFS driver either needs to
get sync'd or WRITE disabled.  What I'm doing here is like trying to put
a Band-Aid over the mouth of the Amazon River, and as Linux grows and
grows and grows, this problem will just get larger, to the point where
I don't have the bandwidth to support it properly.

I will keep providing this service, but I am only treating the symptoms
of the illness and not curing the patient.  Based upon the level of
contamination of TRG with Microsoft IP, I have been advised if I post an
NTFS replacement before the 18 month doctrine of inevitability "window"
is past, Microsoft will most certainly sue us, and win.

I strongly recommend stubbing out the file_write() calls in the NTFS VFS
until this gets fixed, until I can get working NTFS out there, or Anton
can get one out there (which will be another year and a half if it comes
from us based on the agreements we have with Microsoft).

:-)

Jeff


I have been using a PC which dual boots Linux/NT. Linux
seems to have trashed the ntfs partition when it ran out of
space while writing to it. The partition is data only but
was not backed up.
A search on the net indicated that this is a common problem
and that you may be able to point me at or provide tools
which could help repair the partition.
I would be grateful for any help,
Lynn

--


Lynn Evans
Department of Earth Sciences
P.O. Box 28E
Monash University
Melbourne, VIC 3800 Australia

Phone +61 (3) 9905 1527
Fax   +61 (3) 9905 4903
[EMAIL PROTECTED]






Re: Signal 11

2000-12-07 Thread davej

On Thu, 7 Dec 2000, Jeff V. Merkey wrote:

> I think there may be a case when a process forks, that the MMU or some
> other subsystem is either not setting the page bits correctly, or
> mapping in a bad page.  It's a LEVEL I bug in 2.4 if this is the case,
> BTW.  In core dumps (I've looked at 2 of them from SSH) it barfs right
> after executing fork() or one of the exec functions and at some places
> in the code where there's not any obvious coding bugs.  Looks like some
> type of mapping problem.  I reported it three months ago, but it was
> pretty much ignored.
> 
> Linus needs to add this one to the pre-12 list -- looks like some type
> of mapping bug.

Now that you mention it, every app that has bombed has been the type
that forks a lot. MpegTV, gtv, and make spring to mind. All apps drive
the CPU load up quite a lot, which was why I initially suspected
overheating. I don't see it on my other 2.4 boxes though which is
suspicious. But they don't get as much of a beating as this, which was
up until last week my main workstation.

regards,

Dave.

-- 
| Dave Jones <[EMAIL PROTECTED]>  http://www.suse.de/~davej
| SuSE Labs




Re: Signal 11

2000-12-07 Thread Jeff V. Merkey


Dave,

I think there may be a case when a process forks, that the MMU or some
other subsystem is either not setting the page bits correctly, or
mapping in a bad page.  It's a LEVEL I bug in 2.4 if this is the case,
BTW.  In core dumps (I've looked at 2 of them from SSH) it barfs right
after executing fork() or one of the exec functions and at some places
in the code where there's not any obvious coding bugs.  Looks like some
type of mapping problem.  I reported it three months ago, but it was
pretty much ignored.

Linus needs to add this one to the pre-12 list -- looks like some type
of mapping bug.

Jeff

[EMAIL PROTECTED] wrote:
> 
> On Thu, 7 Dec 2000, Jeff V. Merkey wrote:
> 
> > It's related to some change in 2.4 vs. 2.2.  There are other programs
> > affected other than X; SSH also gets spurious signal 11s now and again
> > with 2.4 and glibc <= 2.1 and it does not occur on 2.2.
> 
> 
> 
> I've begun to get a bit paranoid about my K6-2 500 box.
> 
> Various processes have been getting random signals after heavy CPU usage.
> Playing an MPEG movie, kernel compile, or even just some small apps
> compiling sometimes. Just for the record, this isn't an OOM situation,
> I've watched this box with half its memory free or in buffers left
> unattended, and suddenly a compile will just die.
> 
> I replaced the CPU with a brand new K6-2. Problem remained.
> Next suspect was faulty RAM. Despite having passed a memtest, I
> swapped out the DIMMs for some known good ones.
> Suspecting cooling problems, I added some case fans.
> Next came a bigger power supply. Still the problems.
> The latest last ditch attempt to make this box stable has been
> to attach the biggest fan I could find that would fit a socket 7 CPU.
> 
> And still the problems are there.
> The only remaining suspect would be a flaky motherboard.
> But then comes the real killer : This box is rock solid under 2.2
> 
> *boggle*
> 
> I'm not sure exactly when this started, but I think I first noticed
> it around test5 or so, but didn't suspect the kernel at the time.
> 
> I've tried kernels compiled with everything from 2.91.66 when this
> was a Redhat box, to gcc 2.95.2 (from Debian woody) when I installed
> debian on it.  If this is a compiler bug, it's one that no compiler
> I've tried seems to be immune from.
> 
> regards,
> 
> Davej.
> 
> --
> | Dave Jones <[EMAIL PROTECTED]>  http://www.suse.de/~davej
> | SuSE Labs



2.2.18pre21 oops reading /proc/apm

2000-12-07 Thread Neale Banks

Hi Stephen,

I presume this should be going to you, as the person named in
arch/i386/kernel/apm.c - if not please redirect/ignore as appropriate.

I compiled the Debian distribution of 2.2.18pre21 source on and for an
AcerNote-950, with APM enabled.

All is fine except that I can reliably "oops" it simply by trying to read
from /proc/apm (e.g. cat /proc/apm).

oops output and ksymoops-2.3.4 output is attached.

Is there anything else I can contribute?

Thanks,
Neale.


Unable to handle kernel paging request at virtual address 3eb8
current->tss.cr3 = 02577000, %cr3 = 02577000
*pde = 
Oops: 0002
CPU:0
EIP:0050:[<8185>]
EFLAGS: 00010006
eax: 63cb   ebx:    ecx: 012f   edx: 
esi: 00ff   edi: c25c0391   ebp: 3eac   esp: c25c3eae
ds: 0058   es:    ss: 0018
Process cat (pid: 557, process nr: 58, stackpage=c25c3000)
Stack: 3f30 005000ff 00013ec0  8328530a 0048  61da 
   0010c010  3f30 0018c25c 0018 51f1 00ffc01a  
    0292  c20b 63cec014 530ac010 0001  
Call Trace: Bad ESP value.
Code: <1>Unable to handle kernel paging request at virtual address 8185
current->tss.cr3 = 02577000, %cr3 = 02577000
*pde = 
Oops: 
CPU:0
EIP:0010:[]
EFLAGS: 00010046
eax: 8185   ebx:    ecx: c25c3e72   edx: c25c3e72
esi: c25c3eae   edi: c25c3f0e   ebp: c25c2000   esp: c25c3e1e
ds: 0018   es: 0018   ss: 0018
Process cat (pid: 557, process nr: 58, stackpage=c25c3000)
Stack: c26d4bf8 c25c2000 c01ebd4e c0108c44 c25c3e72 c01a5856 c01a71ce 0002 
    c010de3c c01a71ce c25c3e72 0002 c25c2000 00ff c25c0391 
   3eac c26d4bf8 c01088b5 c25c3e72 0002  012f  
Call Trace: Bad ESP value.
Code: 8a 04 03 25 ff 00 00 00 50 68 4e 58 1a c0 e8 4a 9c 00 00 83 


ksymoops 2.3.4 on i586 2.2.18pre21.  Options used
 -V (default)
 -k /proc/ksyms (default)
 -l /proc/modules (default)
 -o /lib/modules/2.2.18pre21/ (default)
 -m /boot/System.map-2.2.18pre21 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Unable to handle kernel paging request at virtual address 3eb8
current->tss.cr3 = 02577000, %cr3 = 02577000
*pde = 
Oops: 0002
CPU:0
EIP:0050:[<8185>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010006
eax: 63cb   ebx:    ecx: 012f   edx: 
esi: 00ff   edi: c25c0391   ebp: 3eac   esp: c25c3eae
ds: 0058   es:    ss: 0018
Process cat (pid: 557, process nr: 58, stackpage=c25c3000)
Stack: 3f30 005000ff 00013ec0  8328530a 0048  61da 
   0010c010  3f30 0018c25c 0018 51f1 00ffc01a  
    0292  c20b 63cec014 530ac010 0001  
Call Trace: Bad ESP value.
Code: <1>Unable to handle kernel paging request at virtual address 8185
Warning (Oops_code): trailing garbage ignored on Code: line
  Text: 'Code: <1>Unable to handle kernel paging request at virtual address 8185'
  Garbage: 'Unable to handle kernel paging request at virtual address 8185'
Warning (Oops_code_values): Code looks like message, not hex digits.  No disassembly attempted.

>>EIP; 8185 Before first symbol   <=

current->tss.cr3 = 02577000, %cr3 = 02577000
*pde = 
Oops: 
CPU:0
EIP:0010:[]
EFLAGS: 00010046
eax: 8185   ebx:    ecx: c25c3e72   edx: c25c3e72
esi: c25c3eae   edi: c25c3f0e   ebp: c25c2000   esp: c25c3e1e
ds: 0018   es: 0018   ss: 0018
Process cat (pid: 557, process nr: 58, stackpage=c25c3000)
Stack: c26d4bf8 c25c2000 c01ebd4e c0108c44 c25c3e72 c01a5856 c01a71ce 0002 
    c010de3c c01a71ce c25c3e72 0002 c25c2000 00ff c25c0391 
   3eac c26d4bf8 c01088b5 c25c3e72 0002  012f  
Call Trace: Bad ESP value.
Code: 8a 04 03 25 ff 00 00 00 50 68 4e 58 1a c0 e8 4a 9c 00 00 83 

>>EIP; c0108be3<=
Code;  c0108be3 
 <_EIP>:
Code;  c0108be3<=
   0:   8a 04 03  mov(%ebx,%eax,1),%al   <=
Code;  c0108be6 
   3:   25 ff 00 00 00and$0xff,%eax
Code;  c0108beb 
   8:   50push   %eax
Code;  c0108bec 
   9:   68 4e 58 1a c0push   $0xc01a584e
Code;  c0108bf1 
   e:   e8 4a 9c 00 00call   9c5d <_EIP+0x9c5d> c0112840 
Code;  c0108bf6 
  13:   83 00 00  addl   $0x0,(%eax)


3 warnings issued.  

Re: kernel BUG at buffer.c:827 in test12-pre6 and 7

2000-12-07 Thread Keith Owens

On Thu, 07 Dec 2000 17:23:51 -0800, 
Joseph Cheek <[EMAIL PROTECTED]> wrote:
>i'll check it out.  i'm compiling ksymoops now, is there a way to get it to
>work without a static libbfd?  all i've got is a libbfd.so, and i'm going to
>need to recompile binutils if i must have a libbfd.a.

ksymoops/Makefile, change
$(CC) $(OBJECTS) $(CFLAGS) -Wl,-Bstatic -lbfd -liberty -Wl,-Bdynamic -o $@
to
$(CC) $(OBJECTS) $(CFLAGS) -lbfd -liberty -o $@

But you should have libbfd.a.  Debian splits binutils into binutils and
binutils-dev, libbfd.a might be in the latter.




Re: [patch] Re: [patch-2.4.0-test12-pre6] truncate(2) permissions

2000-12-07 Thread Alexander Viro



On Fri, 8 Dec 2000 [EMAIL PROTECTED] wrote:

> > BTW, if you still have 1.7, 1.10, 1.13 and 1.14...
> 
> See ftp://ftp.cwi.nl/pub/aeb/manpages/ (will soon disappear again).

Got them, thanks.

> > BTW, could we finally lose mpx(2)?
> 
> Maybe we lost it - I find sys_mpx only in a comment in arch/arm/kernel/calls.S

Sure, but man2/mpx.2 is alive and well...




Re: kernel BUG at buffer.c:827! and scsi modules no load at boot w/ initrd - test12pre7

2000-12-07 Thread Mohammad A. Haque

Linus,

I booted without an initrd defined (still have support compiled in
though) and I didn't get an Oops when I booted. So I guess it's related
to initrd. Here's some more that I noticed..

--snip--
VFS: Mounted root (ext2 filesystem).
kernel BUG at buffer.c:827!
.Oops.
Code: 0f 0b 83 c4 0c 8d 73 28 8d 43 2c 39 43 2c 74 15 b9 01 00 00 
hub.c: port 1 connection change
hub.c: port 1, portstatus 101, change 1, 12 Mb/s
VFS: Mounted root (ext2 filesystem) readonly.
change_root: old root has d_count=4
Trying to unmount old root ... <3>error -16
Change root to /initrd: error -2
--snip--

I noticed that there have been changes made to the initrd documentation,
so I'll look there to see if maybe the mkinitrd script on my system isn't
following something correctly. Unless, of course, you know offhand of
something else to look at.

Linus Torvalds wrote:
> Do you have something special that triggers this? Can you test if it
> only happens with initrd, for example?

-- 

=
Mohammad A. Haque  http://www.haque.net/ 
   [EMAIL PROTECTED]

  "Alcohol and calculus don't mix. Project Lead
   Don't drink and derive." --Unknown  http://wm.themes.org/
   [EMAIL PROTECTED]
=



Re: related to pthread

2000-12-07 Thread Peter Samuelson


[Rajiv Majumdar]
> during an exec it gives the following error message : "Pthread
> internal error : message : __libc__reinit() failed" and creates a
> core dump.

This is libc failing -- please report this through libc channels (see
http://www.gnu.org/software/libc).  If it is in fact a kernel bug I'm
sure they will forward it.

Peter



Re: YMF PCI - thanks, glitches, patches (fwd)

2000-12-07 Thread Peter Samuelson


  [AC]
> > [EMAIL PROTECTED] is alive and well however.

[Pavel Roskin]
> You cannot imagine how frustrating it was to search for the archive.
> I couldn't find an up-to-date archive, and www.kernel.org keeps
> silence about mailing lists. I cannot afford subscribing to every
> list just to slightly polish support for my hardware.

Linus does take patches to the MAINTAINERS file from time to time. (:

In my opinion, the L: entries of MAINTAINERS ought to have URLs for
list archives, where available:

  L: [EMAIL PROTECTED], http://www.xxx>

Perhaps archive URLs should be on W: lines but I think the L: lines are
appropriate -- keep related information together.

Peter



Re: Signal 11

2000-12-07 Thread davej

On Thu, 7 Dec 2000, Jeff V. Merkey wrote:

> It's related to some change in 2.4 vs. 2.2.  There are other programs
> affected other than X; SSH also gets spurious signal 11s now and again
> with 2.4 and glibc <= 2.1 and it does not occur on 2.2.



I've begun to get a bit paranoid about my K6-2 500 box.

Various processes have been getting random signals after heavy CPU usage.
Playing an MPEG movie, kernel compile, or even just some small apps
compiling sometimes. Just for the record, this isn't an OOM situation,
I've watched this box with half its memory free or in buffers left
unattended, and suddenly a compile will just die.

I replaced the CPU with a brand new K6-2. Problem remained.
Next suspect was faulty RAM. Despite having passed a memtest, I
swapped out the DIMMs for some known good ones.
Suspecting cooling problems, I added some case fans.
Next came a bigger power supply. Still the problems.
The latest last ditch attempt to make this box stable has been
to attach the biggest fan I could find that would fit a socket 7 CPU.

And still the problems are there.
The only remaining suspect would be a flaky motherboard.
But then comes the real killer : This box is rock solid under 2.2

*boggle*

I'm not sure exactly when this started, but I think I first noticed
it around test5 or so, but didn't suspect the kernel at the time.

I've tried kernels compiled with everything from 2.91.66 when this
was a Redhat box, to gcc 2.95.2 (from Debian woody) when I installed
debian on it.  If this is a compiler bug, it's one that no compiler
I've tried seems to be immune from.

regards,

Davej.

-- 
| Dave Jones <[EMAIL PROTECTED]>  http://www.suse.de/~davej
| SuSE Labs




who is writing to disk

2000-12-07 Thread Zhiruo Cao


Hello,

I found a process constantly writing to disk when I run gnome as desktop 
and while the whole system is idle.  I don't find anything in the log
file, and I don't see anything updated in my home dir or in /tmp.  Does it
sound like bdflush is writing?  But I don't hear the disk access when I
am not running gnome.  

My question then is, is there a (monitoring) tool that can tell me who is
writing to disk?  Or how I configure the kernel to know that?

Thanks!

Joe




Re: kernel BUG at buffer.c:827 in test12-pre6 and 7

2000-12-07 Thread Joseph Cheek

hi,

comments below.

Linus Torvalds wrote:

> In article <[EMAIL PROTECTED]>,
> Joseph Cheek  <[EMAIL PROTECTED]> wrote:
> >copying files off a loopback-mounted vfat filesystem exposes this bug.
> >test11 worked fine.
>
> It's not a new bug - it's an old bug that apparently is uncovered by a
> new stricter test.
>
> Apparently loopback unlocks an already unlocked page - which has always
> been a serious offense, but has never been detected before.
>
> test12-pre6+ detects it, and thus the BUG().
>
> Your stack trace isn't symbolic (see Documentation/oops-tracing.txt), so
> it's impossible to debug, but it's already interesting information to
> see that it seems to be either loopback or vfat.
>
> Can you test some more? In particular, I'd love to hear if this happens
> with vfat even without loopback, or with loopback even without vfat
> (make an ext2 filesystem or similar instead). That would narrow down the
> bug further.

this happens on loopbacked ext2, and not on regular vfat.  so it appears that
the culprit is loopback.  i got ksymoops working, here are the traces from
the vfat-over-loopback [first] and the ext2-over-loopback [second].

again, these are copied by hand, so i give it a 1% chance of
mistranscription.

ksymoops 2.3.5 on i686 2.4.0.  Options used
 -V (default)
 -k /proc/ksyms (default)
 -l /proc/modules (default)
 -o /lib/modules/2.4.0/ (default)
 -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

kernel BUG at buffer.c:827!
invalid operand: 
CPU: 0
EIP: 0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010082
eax: 001c ebx: c1d8fc60 ecx:  edx: 0001
esi: c10658e4 edi: 0002 ebp: c1d8fca8 esp: c1793dc0
ds: 0018 es: 0018 ss: 0018
Process cp (pid 762, stackpage=c1793000)
Stack: c01fe484 c01fe95a 033b c1d8fc60 c1cef420 0001 0001
c01610e1
   c1d8fc60 0001 c1cef420  c1cef420 c02c8ed8 c88df91c
c1cef420
   0001 c88e0986 0007  0001 c02c8ed8 c02c8ee8
c4f18800
Call Trace: [] [] [] []
[] [] [] []
   [] [] [] [] []
[] [] [c0128720>]
   [] [] [c010b56b>]
Code: 0f 0b 83 c4 0c 8d 5e 28 8d 46 2c 39 46 2c 74 24 b9 01 00 00

>>EIP; c013660c<=
Trace; c01fe484 
Trace; c01fe95a 
Trace; c0130703 <__alloc_pages+df/2d4>
Trace; c8895de3 <[fat]fat_readpage+f/14>
Trace; c88df91c <[cdrom]cdrom_read_mech_status+c/4c>
Trace; c8894494 <[fat]fat_get_block+0/e4>
Trace; c0160d2f <__make_request+5cb/648>
Trace; c0160ead 
Trace; c0161011 
Trace; c0137a49 
Trace; c0130703 <__alloc_pages+df/2d4>
Trace; c8895de3 <[fat]fat_readpage+f/14>
Trace; c8894494 <[fat]fat_get_block+0/e4>
Trace; c01284d3 
Trace; c012887b 
Trace; c889448d <[fat]fat_file_read+2d/34>
Trace; c01349a7 
Code;  c013660c 
 <_EIP>:
Code;  c013660c<=
   0:   0f 0b ud2a  <=
Code;  c013660e 
   2:   83 c4 0c  add$0xc,%esp
Code;  c0136611 
   5:   8d 5e 28  lea0x28(%esi),%ebx
Code;  c0136614 
   8:   8d 46 2c  lea0x2c(%esi),%eax
Code;  c0136617 
   b:   39 46 2c  cmp%eax,0x2c(%esi)
Code;  c013661a 
   e:   74 24 je 34 <_EIP+0x34> c0136640

Code;  c013661c 
  10:   b9 01 00 00 00mov$0x1,%ecx


1 warning issued.  Results may not be reliable.

ksymoops 2.3.5 on i686 2.4.0.  Options used
 -V (default)
 -k /proc/ksyms (default)
 -l /proc/modules (default)
 -o /lib/modules/2.4.0/ (default)
 -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

kernel BUG at buffer.c:827!
invalid operand: 
CPU: 0
EIP: 0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010086
eax: 001c ebx: c1d212a0 ecx:  edx: 0001
esi: c11274bc edi: 0002 ebp: c1d212e8 esp: c4f1bddc
ds: 0018 es: 0018 ss: 0018
Process cp (pid 772, stackpage=c4f1b000)
Stack: c01fe484 c01fe95a 033b c1d21290 c1983420 0002 0001
c01610e1
   c1d212a0 0001 c1983420  c1983420 c02c8ed8 c88df91c
c1983420
   0001 c88e0986 0007  0002 c02c8ed8 c02c8ee8
c4cdf998
Call Trace: [] [] [] [] []
[] []
   [] [] [] [] []
[] [c0128720>]
   [] 

RE: Signal 11

2000-12-07 Thread Rainer Mager

Hi all,

Thanks for all the input so far. Regarding this...

> (I'm not sure exactly what cerberos does, do you have a link for it ?).

The official name is "Cerberus Test Control System" aka CTCS. I don't know
the official site but a search for this should reveal something. Anyway it
is a pretty comprehensive test that includes multiple kernel compiles,
memory tests, disk test, etc, etc. Like I said, I ran this for more than 15
hours with no problems.

Well, actually, I did notice that if I run CTCS from within X then it
freezes up after a few minutes. This appears to happen when/because of
extreme swapping.


Aside from the above I've also run repeated kernel compiles (more than 50
times) with 'make -j bzImage' and had no problems; all outputs were
identical.

So given these tests, I'm reasonably confident the core hardware is ok. I
suppose it is possible there's some iffy bits in the G400's VRAM (but
wouldn't that just result in screen artifacts?). I will admit that I haven't
yet tried swapping RAM or any other system components.


Any other ideas?




Re: Signal 11

2000-12-07 Thread Richard B. Johnson

On Fri, 8 Dec 2000, Rainer Mager wrote:

> Hi all,
> 
>   I've searched around for a answer to this with no real luck yet. If anyone
> has some ideas I'd be very grateful.

Signal 11 just means that you "seg-faulted". This is usually caused
by a coding error. However, if you have tools (like the C compiler)
that have been running fine but start to seg-fault, this points to
the very real possibility of a hardware error.

Modern RAM (with no error correction), running outside of its
timing specifications, is often the culprit. Even power supplies can
cause this problem. All you need is a single-bit error in a pointer's
value and -- signal 11.

Also, a bad opcode, fetched from RAM with an error, traps to
the same handler.

Do:

char main[]={0xff,0xff,0xff,0xff};


Compile and run this (it will compile!). You will see what
bad opcodes will do.
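
A tamer way to see a signal 11 arrive, sketched here under the assumption
of a Linux box with mmap(): the child stores to a page mapped with no
access rights, and the parent reads the fatal signal out of the wait
status.  signal_from_bad_store() is a name made up for this example.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs a deliberately faulting store in a child process.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if the mapping, fork, or wait could not be set up. */
int signal_from_bad_store(void)
{
    /* A page with no access rights: any load or store to it must fault. */
    volatile char *page = mmap(NULL, 4096, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap((void *)page, 4096);
        return -1;
    }

    if (pid == 0) {
        /* Child: make sure the default action (terminate) is in effect,
         * then touch the forbidden page. */
        signal(SIGSEGV, SIG_DFL);
        page[0] = 1;
        _exit(0);   /* only reached if the store did not fault */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap((void *)page, 4096);
    if (waited < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux this should return SIGSEGV (11).  Other systems may deliver
SIGBUS for the same access, so treat the exact number as
platform-dependent.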



Cheers,
Dick Johnson

Penguin : Linux version 2.4.0 on an i686 machine (799.54 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





Re: Signal 11

2000-12-07 Thread Jeff V. Merkey



Andi Kleen wrote:
> 
> On Thu, Dec 07, 2000 at 06:24:34PM -0700, Jeff V. Merkey wrote:
> >
> > Andi,
> >
> > It's related to some change in 2.4 vs. 2.2.  There are other programs
> > affected other than X; SSH also gets spurious signal 11s now and again
> > with 2.4 and glibc <= 2.1 and it does not occur on 2.2.
> 
> So have you enabled core dumps and actually looked at the core dumps
> of the programs using gdb to see where they crashed ?

Yes.  I can only get the SSH crash when I am running remotely from the
house over the internet, and it only shows then when running a build in
jobserver mode (parallel build).  The X problem seems related as well,
since it's related to (usually) Netscape spawning off a forked process.
I will attempt to recreate tonight, and post the core dump file.  

Jeff 





> 
> -Andi



Re: Signal 11

2000-12-07 Thread Andi Kleen

On Thu, Dec 07, 2000 at 06:24:34PM -0700, Jeff V. Merkey wrote:
> 
> Andi,
> 
> It's related to some change in 2.4 vs. 2.2.  There are other programs
> affected other than X; SSH also gets spurious signal 11s now and again
> with 2.4 and glibc <= 2.1 and it does not occur on 2.2.

So have you enabled core dumps and actually looked at the core dumps 
of the programs using gdb to see where they crashed ? 



-Andi




Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread Richard B. Johnson

On Thu, 7 Dec 2000, Brian Gerst wrote:

> "Richard B. Johnson" wrote:
> > 
> > On Thu, 7 Dec 2000 [EMAIL PROTECTED] wrote:
> > 
> > >
> > >
> > > Which surely we can on today's x86 systems. Even back in the days of OS/2
> > > 2.0 running on a 386 with 4Mb RAM we used a taskgate for both NMI and
> > > Double Fault. You need only a minimal stack - 1K, sufficient to save state
> > > and restore ESP to a known point before switching back to the main TSS to
> > > allow normal exception handling to occur.
> > >
> > > There is no architectural restriction that some folks have hinted at - as long
> > > as the DPL for the task gates is 3.
> > >
> > [SNIPPED...]
> > 
> > Please refer to page 6-16, Intel486 Microprocessor Family Programmer's
> > Reference Manual.
> > 
> > The specific text is: "The TSS does not have a stack pointer for a
> > privilege level 3 stack, because the procedure cannot be called by a less
> > privileged procedure. The stack for privilege level 3 is preserved by the
> > contents of SS and EIP registers which have been saved on the stack
> > of the privilege level called from level 3".
> > 
> > What this means is that a stack-fault in level 3 will kill you no
> > matter how cute you try to be. And, putting a task gate as call
> > procedure entry from a trap or fault is just trying to be cute.
> > It's extra code that will result in the same processor reset.
> 
> No, because the CPL of the task gate would be 0, which means the stack
> will be set to tss->esp0.  The DPL of 3 means that the descriptor can be
> accessed from CPL3.  The text you mention generally means that the only
> way to get back to CPL3 is with iret (via the saved %cs:%eip and
> %ss:%esp pushed on the CPL0/1/2 stack).
> 
> --
> 
It is yes, not no.

(1) User traps, CPL3, stack for trap is in CPL0.
(2) CPL0 has stack-fault (bad ring zero code, bad memory).
(3) CPL0 traps, using faulted stack, double fault.
(4) There is no stack-trick, including a call-gate  to another
"environment" (complete with its previously-reserved stack),
that will ever get you back to (2), much less to (1).

I am not denying the possibility of "warm-booting", i.e.,
relocate some code to where there is a 1:1 physical to virtual
translation, jump to the relocated code, disable paging, restart kernel
code, and possibly examine what happened. You just have to get
back to "flat-mode" with no paging to handle anything beyond a
double fault. You are just not going to be able to restart
from the stack-faulted code.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.0 on an i686 machine (799.54 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





Re: io_request_lock question (2.2)

2000-12-07 Thread Reto Baettig

> 
> > I looked at the implementation of the nbd which just calls 
> > 
> > spin_unlock_irq(&io_request_lock);
> > ... do network io ...
> > spin_lock_irq(&io_request_lock);
> > 
> > This seems to work but it looks very dangerous to me (and ugly, too). Isn't there
> > a better way to do this?
> 
> It is only dangerous if you unlock it in the wrong place, unlocking it as much
> as possible is good behaviour. You need it locked until you get the actual
> request off the queue, you need it locked when you complete the request. The
> rest of the time you can be polite
> 
> 

I'm sorry but I still have some doubts:

The add_request function calls
spin_lock_irqsave(&io_request_lock,flags);
and then calls our request_fn which does
spin_unlock_irq(&io_request_lock);

...do network I/O ...

spin_lock_irq(&io_request_lock);
we finish the request and return to the add_request function which calls
spin_unlock_irqrestore(&io_request_lock,flags);
and restores the flags.

Isn't it possible that the flags which we restore are out of date by now?

Is this idiom the right one to use for 2.2?
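
One way to reason about the flags question is with a toy userspace model.
This is an illustration only, not kernel code: the toy_* names and the
single-bit interrupt state are assumptions made for this sketch. It
suggests that the flags restored by the outer function are simply the
interrupt state its caller had on entry, while interrupts are enabled for
the whole I/O window regardless of that saved state:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU: an "interrupts enabled" bit and one lock. */
struct toy_cpu {
    bool irqs_enabled;
    bool lock_held;
};

/* Like spin_lock_irqsave(): remember the IRQ state, disable IRQs, lock. */
static unsigned long toy_lock_irqsave(struct toy_cpu *cpu)
{
    unsigned long flags = cpu->irqs_enabled ? 1UL : 0UL;

    assert(!cpu->lock_held);
    cpu->irqs_enabled = false;
    cpu->lock_held = true;
    return flags;
}

/* Like spin_unlock_irqrestore(): unlock, put the saved IRQ state back. */
static void toy_unlock_irqrestore(struct toy_cpu *cpu, unsigned long flags)
{
    assert(cpu->lock_held);
    cpu->lock_held = false;
    cpu->irqs_enabled = (flags != 0);
}

/* Like spin_unlock_irq(): unlock and enable IRQs unconditionally. */
static void toy_unlock_irq(struct toy_cpu *cpu)
{
    assert(cpu->lock_held);
    cpu->lock_held = false;
    cpu->irqs_enabled = true;
}

/* Like spin_lock_irq(): disable IRQs unconditionally and lock. */
static void toy_lock_irq(struct toy_cpu *cpu)
{
    assert(!cpu->lock_held);
    cpu->irqs_enabled = false;
    cpu->lock_held = true;
}

/* Stand-in for a request_fn; returns the IRQ state seen during "I/O". */
static bool toy_request_fn(struct toy_cpu *cpu)
{
    bool irqs_during_io;

    toy_unlock_irq(cpu);
    irqs_during_io = cpu->irqs_enabled;  /* network I/O would happen here */
    toy_lock_irq(cpu);
    return irqs_during_io;
}

/* Stand-in for add_request(); returns the IRQ state seen during "I/O". */
bool toy_add_request(struct toy_cpu *cpu)
{
    unsigned long flags = toy_lock_irqsave(cpu);
    bool irqs_during_io = toy_request_fn(cpu);

    toy_unlock_irqrestore(cpu, flags);
    return irqs_during_io;
}
```

In this model a caller that entered with interrupts disabled gets them
back disabled on return, but they were on while the lock was dropped,
which is the part a real driver would have to be safe against.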

Thanks,

Reto



Re: Signal 11

2000-12-07 Thread Jeff V. Merkey


Andi,

It's related to some change in 2.4 vs. 2.2.  Programs other than X are
affected: SSH also gets spurious signal 11s now and again with 2.4 and
glibc <= 2.1, and this does not occur on 2.2.

Jeff

Andi Kleen wrote:
> 
> On Fri, Dec 08, 2000 at 09:44:29AM +0900, Rainer Mager wrote:
> >   I recently upgraded to a new machine. It is running RedHat 6.2 Linux (with
> > a SMP 2.4.0test[8-11] kernel) and has a Matrox G400 in it. X is 4.0.1.
> > Anyway, about once every 2-3 days X will spontaneously die and the only info
> > I get back is that it was because of signal 11.
> >   I've heard that signal 11 can be related to bad hardware, most often
> > memory, but I've done a good bit of testing on this and the system seems ok.
> > What I did was to run the VA Linux Cerberos(sp?) test for 15 hours+ with no
> > errors. Actually this only worked when running from the console. When
> > running from X the machine locked up (although no signal 11).
> >   The only info I've gotten back from the XFree86 mailing lists so far is
> > that there are known and wide spread problems with SMP and these types of
> > problems. Can anyone comment on this? Are there known SMP problems? What is
> > the current resolution plan?
> 
> signal 11 just means that the program crashed with a segmentation fault.
> 
> Sounds like a X Server bug. You should probably contact XFree86, not
> linux-kernel
> 
> -Andi



Re: Linux 2.2.18pre25

2000-12-07 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Alan Cox  <[EMAIL PROTECTED]> wrote:
>
>Excellent. I've been trying to avoid VM fixes for 2.2.18 to stop stuff getting
>muddled together and hard to debug. Running with page aging convinces me that
>for 2.2.19 we need to sort some of the vm issues out badly, and make it faster than
>2.4test 8)

Ahh.. The challenge is out!

You and me. Mano a mano. 

Linus



Re: kernel BUG at buffer.c:827 in test12-pre6 and 7

2000-12-07 Thread Joseph Cheek

i'll check it out.  i'm compiling ksymoops now, is there a way to get it to
work without a static libbfd?  all i've got is a libbfd.so, and i'm going to
need to recompile binutils if i must have a libbfd.a.

Linus Torvalds wrote:

> Your stack trace isn't symbolic (see Documentation/oops-tracing.txt), so
> it's impossible to debug, but it's already interesting information to
> see that it seems to be either loopback or vfat.
>
> Can you test some more? In particular, I'd love to hear if this happens
> with vfat even without loopback, or with loopback even without vfat
> (make an ext2 filesystem or similar instead). That would narrow down the
> bug further.

thanks!

joe

--
Joseph Cheek, Sr Linux Consultant, Linuxcare | http://www.linuxcare.com/
Linuxcare.  Support for the Revolution.  | [EMAIL PROTECTED]
CTO / Acting PM, Redmond Linux Project   | [EMAIL PROTECTED]
425 990-1072 vox [1074 fax] 206 679-6838 pcs | [EMAIL PROTECTED]






Re: SCSI modules and kmod

2000-12-07 Thread Torben Mathiasen

On Thu, Dec 07 2000, [EMAIL PROTECTED] wrote:

[deleted]
 
> int regdevresult;
> 
> case MODULE_SCSI_DEV:
> #ifdef CONFIG_KMOD
> if (scsi_hosts == NULL)
>   {
> request_module("scsi_hostadapter");
> return scsi_register_device_module((struct
> Scsi_Device_Template *) ptr);
>   }
> #endif
> regdevresult = scsi_register_device_module((struct
> Scsi_Device_Template *) ptr);
> #ifdef CONFIG_KMOD
> if (regdevresult != 0) /* is this the right case? */
> request_module("scsi_hostadapter");
> regdevresult = scsi_register_device_module((struct
> Scsi_Device_Template *) ptr);
> #endif
> return regdevresult;
> 
This won't work. scsi_register_device_module returns 0 when the
driver loads ok, not when a device was actually found. Remember
it's possible to load the sd driver with no host adapter present.

How about just removing the check for scsi_hosts == NULL? If some
hostadapter was already loaded, so be it. It won't change anything,
besides maybe more devices being loaded, which shouldn't hurt anyone.

Small patch attached (against t12p7). Not tested, not even compiled.


-- 
Torben Mathiasen <[EMAIL PROTECTED]>
Linux ThunderLAN maintainer 
http://tlan.kernel.dk


--- /opt/kernel/kernels/linux/drivers/scsi/scsi.c   Wed Nov  1 15:04:02 2000
+++ linux/drivers/scsi/scsi.c   Fri Dec  8 02:13:47 2000
@@ -2322,11 +2322,9 @@
/* Load upper level device handler of some kind */
case MODULE_SCSI_DEV:
 #ifdef CONFIG_KMOD
-   if (scsi_hosts == NULL)
-   request_module("scsi_hostadapter");
+   request_module("scsi_hostadapter");
 #endif
-   return scsi_register_device_module((struct Scsi_Device_Template *) ptr);
-   /* The rest of these are not yet implemented */
+   return scsi_register_device_module((struct Scsi_Device_Template *) ptr);
 
/* Load constants.o */
case MODULE_SCSI_CONST:



Re: Disableing USB.

2000-12-07 Thread David Riley

Bryan Whitehead wrote:
> 
> Is there a way I can disable a part of the kernel that is compiled into the
> kernel? For example I'd like to pass this to lilo: "usb=disable" and then
> the usb code is not loaded even though USB has been built into the kernel.
> 
> Is such a feature stupid? Or has this already been implemented?
> 
> It would be nice if this was generic and I could also pass things like
> "procfs=disabled".
> 
> The reason I ask is a friend of mine got a new Sony Vaio Laptop that has
> the ethernet card and USB device stepping on each other. It would be nice
> to pass to the Redhat/Mandrake/whatever installation boot disk usb=disable
> so the ethernet card can work freely (he's doing a network install because
> he has no CD-ROM), as he doesn't use any USB devices anyway.

Er... Well, the traditional solution has been "don't build it into your
kernel if you don't want it", but in the case of stock kernels, that
isn't always an option, I suppose.  Theoretically, the two devices
shouldn't step on each other, but this is a computer.  Theory is so far
removed from practice that it's... Well, I can't think up a good
analogy.  It's far.

*looks at kernel config options*

Well, it looks like the usb cores (UHCI and OHCI) can be modular.  Why
aren't they in the stock kernel, other than possibly autodetection
problems?  If they are built as modules, using expert mode in RedHat or
whatever equivalent may be in other dists may let you avoid insmodding
the USB stuff...

Beyond that, having a command-line disable feature does seem pretty
neat.  Although why would you want to disable procfs?  Maybe I missed
something there, but it seems awful darn important to leave out. :-)

-- 
"Windows for Dummies?  Isn't that redundant?"



Re: Signal 11

2000-12-07 Thread Andi Kleen

On Fri, Dec 08, 2000 at 09:44:29AM +0900, Rainer Mager wrote:
>   I recently upgraded to a new machine. It is running RedHat 6.2 Linux (with
> a SMP 2.4.0test[8-11] kernel) and has a Matrox G400 in it. X is 4.0.1.
> Anyway, about once every 2-3 days X will spontaneously die and the only info
> I get back is that it was because of signal 11.
>   I've heard that signal 11 can be related to bad hardware, most often
> memory, but I've done a good bit of testing on this and the system seems ok.
> What I did was to run the VA Linux Cerberos(sp?) test for 15 hours+ with no
> errors. Actually this only worked when running from the console. When
> running from X the machine locked up (although no signal 11).
>   The only info I've gotten back from the XFree86 mailing lists so far is
> that there are known and wide spread problems with SMP and these types of
> problems. Can anyone comment on this? Are there known SMP problems? What is
> the current resolution plan?

signal 11 just means that the program crashed with a segmentation fault. 

Sounds like a X Server bug. You should probably contact XFree86, not
linux-kernel


-Andi



RE: Ramdisk root filesystem strangeness

2000-12-07 Thread Jeff Chua

>Is there support for using RAMDISK as the final root file system
>in 2.2.x versions, or is it there in the 2.4.x versions.

Works with 2.2.x and up to 2.4.0 test12-pre3.bz2

Make sure you specify the following if you're using loadlin

root=/dev/ram

With anything above test12-pre3.bz2, you'll run into the following problem

kernel BUG at buffer.c:827!
invalid operand: 
CPU:0
EIP:0010:[]
EFLAGS: 00010286


Jeff



Re: Signal 11

2000-12-07 Thread Michel LESPINASSE

On Fri, Dec 08, 2000 at 09:44:29AM +0900, Rainer Mager wrote:

>   I've heard that signal 11 can be related to bad hardware, most
> often memory, but I've done a good bit of testing on this and the
> system seems ok.  What I did was to run the VA Linux Cerberos(sp?)
> test for 15 hours+ with no errors. Actually this only worked when
> running from the console. When running from X the machine locked up
> (although no signal 11).

Don't be so quick to dismiss the "bad hardware" possibility. It is
really quite common these days. And, some cases of bad hardware are
not detected using simple tests like memtest86. (I'm not sure exactly
what cerberos does, do you have a link for it ?).

My recommendation would be to take a big source tree (say, a bit
bigger than the amount of RAM you have), and run repetitive
tar+detar+diff -ru runs on it for 48 hours or so. If your hardware
runs OK, diff should not report any inconsistencies. I found this test
to be quite reliable to detect hardware problems. If you have several
disk controllers, run one instance of the test on each of
them. Additionally you could run a background task to keep the CPU at
100% - a simple while 1 loop would do.

-- 
Michel "Walken" LESPINASSE
Of course I think I'm right. If I thought I was wrong, I'd change my mind.



Re: Signal 11

2000-12-07 Thread Jeff V. Merkey


I have previously reported this error (about three months ago) on 2.4
with XFree86 3.3.6.  If you are running RedHat 6.2, then you are running
this X Server.  It also shows up on Caldera's 2.4 eDesktop.  It appears
to be something with glibc versions <= 2.1 on 2.4.  I also see it with
secure shell 1.2.27 on 2.4.  I've also seen it on RH 7.0 on 2.4 kernels,
but only with SSH.

Jeff

Rainer Mager wrote:
> 
> Hi all,
> 
> I've searched around for an answer to this with no real luck yet. If anyone
> has some ideas I'd be very grateful.
> 
> I recently upgraded to a new machine. It is running RedHat 6.2 Linux (with
> a SMP 2.4.0test[8-11] kernel) and has a Matrox G400 in it. X is 4.0.1.
> Anyway, about once every 2-3 days X will spontaneously die and the only info
> I get back is that it was because of signal 11.
> I've heard that signal 11 can be related to bad hardware, most often
> memory, but I've done a good bit of testing on this and the system seems ok.
> What I did was to run the VA Linux Cerberos(sp?) test for 15 hours+ with no
> errors. Actually this only worked when running from the console. When
> running from X the machine locked up (although no signal 11).
> The only info I've gotten back from the XFree86 mailing lists so far is
> that there are known and wide spread problems with SMP and these types of
> problems. Can anyone comment on this? Are there known SMP problems? What is
> the current resolution plan?
> 
> Thanks,
> 
> --Rainer
> 



Re: cyrix III by via

2000-12-07 Thread davej

On Thu, 7 Dec 2000, Eric Estabrooks wrote:

> A test probably needs to be added in the centaur_model section to test
> for the cyrixIII in disguise.

2.2.18pre, and 2.4.0test have contained this test for some time now.
However, I've heard no reports of it working or not due to no-one having
the necessary hardware to test it.

Are you saying the latest versions still don't recognise it?
What kernel version did you try ?

regards,

Davej.

-- 
| Dave Jones <[EMAIL PROTECTED]>  http://www.suse.de/~davej
| SuSE Labs




Re: PCI irq routing..

2000-12-07 Thread Martin Diehl

On Thu, 7 Dec 2000, Linus Torvalds wrote:

> > btw, I'm thinking I could guess the routing from the VLSI config space,
> 
> Please do. You might leave them commented out right now, but this is
> actually how most of the pirq router entries have been created: by looking
> at various pirq tables and matching them up with other (non-pirq)
> documentation and testing. The pirq "link" value is basically completely
> NDA'd, and is per-chipset-specific. Nobody documents it except in their
> bios writers guide, if even there.

Ok - will do it. Unfortunately the BIOS of this notebook has no
customizable routing option which I could use to play with.
So testing here will hardly cover orthogonal cases.

> > The reason for this is in drivers/pci/pci.c where bridges are touched
> > twice: once as a device on a bus and once via ->self from the bus behind.
> 
> Not intended behaviour. The self case should be removed.

I was wondering whether there might be bridges which have to be awoken
from both sides because they have different config spaces there.
Is bus->children->self guaranteed to be identical to bus->device for
all kinds of bridge devices?
Sure, dividing bridges wouldn't make too much sense - at least I don't see
what half a bridge might be good for, but ...

Removing self cases is straightforward - pci_pm-2.4.0-t9p3-patch below.

> Ok, definitely needs some more work. Thanks for testing - I have no
> hardware where this is needed.

Could do some more testing if a day or two for feedback is ok.

Two more things I've noticed:

- when all pcmcia/yenta stuff is in modules and doing suspend/resume
immediately after fresh cold reboot there is nothing our cardbus stuff
might have set up which was lost in suspend. Nevertheless, what happens is
the pcmcia_core/yenta_socket/ds modules get loaded without problem but the
"Socket status" printk from yenta_open_bh() is completely garbage. This is
not the case when the modules are loaded before the suspend.
Despite the garbage, subsequent cardmgr startup does not give any error
message - but the cards in the slots are not recognized (no beeps, no
status to retrieve from cardctl). Reboot is the only solution.
My conclusion is, the reason must be in the init-path doing or forgetting
something prohibited/required after suspend - or the TI1131 is broken.

- when, after yenta sockets became unusable due to pm suspend, I try to
eject/insert the cards from a slot, the box freezes. This turned out
to be a loop in yenta_interrupt being called endlessly. Apparently the
yenta_bh() -> pcmcia-handler path somehow triggers the next IRQ.
But this might be a consequence of the former issue.

According to the forecasts, next weekend will be rainy, so...

Thank you for the time!

Regards
Martin

-
--- linux-2.4.0-t12p3/drivers/pci/pci.c.orig    Mon Dec  4 14:21:26 2000
+++ linux-2.4.0-t12p3/drivers/pci/pci.c Fri Dec  8 00:17:50 2000
@@ -1089,6 +1089,9 @@
return 0;
 }
 
+
+/* take care to suspend/resume bridges only once */
+
 static int pci_pm_suspend_bus(struct pci_bus *bus)
 {
struct list_head *list;
@@ -1100,9 +1103,6 @@
/* Walk the device children list */
list_for_each(list, &bus->devices)
pci_pm_suspend_device(pci_dev_b(list));
-
-   /* Suspend the bus controller.. */
-   pci_pm_suspend_device(bus->self);
return 0;
 }
 
@@ -1110,8 +1110,6 @@
 {
struct list_head *list;
 
-   pci_pm_resume_device(bus->self);
-
/* Walk the device children list */
list_for_each(list, &bus->devices)
pci_pm_resume_device(pci_dev_b(list));
@@ -1125,18 +1123,26 @@
 static int pci_pm_suspend(void)
 {
struct list_head *list;
+   struct pci_bus *bus;
 
-   list_for_each(list, &pci_root_buses)
-       pci_pm_suspend_bus(pci_bus_b(list));
+   list_for_each(list, &pci_root_buses) {
+   bus = pci_bus_b(list);
+   pci_pm_suspend_bus(bus);
+   pci_pm_suspend_device(bus->self);
+   }
return 0;
 }
 
 static int pci_pm_resume(void)
 {
struct list_head *list;
+   struct pci_bus *bus;
 
-   list_for_each(list, &pci_root_buses)
-       pci_pm_resume_bus(pci_bus_b(list));
+   list_for_each(list, &pci_root_buses) {
+   bus = pci_bus_b(list);
+   pci_pm_resume_device(bus->self);
+   pci_pm_resume_bus(bus);
+   }
return 0;
 }
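
The effect of dropping the ->self calls can be illustrated with a small
userspace model. This is only a sketch under assumptions: the toy_*
structures and functions are hypothetical stand-ins for the PCI bus and
device lists, and the old-style walk is assumed to recurse into child
buses before touching ->self. It counts how often each device would be
suspended under the two schemes:

```c
#include <stddef.h>

struct toy_dev {
    int suspend_calls;
};

struct toy_bus {
    struct toy_dev *self;       /* bridge that leads to this bus, or NULL */
    struct toy_dev **devices;   /* devices on this bus, bridges included */
    size_t ndevices;
    struct toy_bus **children;  /* buses behind those bridges */
    size_t nchildren;
};

static void toy_suspend_device(struct toy_dev *dev)
{
    if (dev)
        dev->suspend_calls++;
}

/* Old scheme: every bus also suspends its own bridge through ->self. */
void toy_suspend_bus_old(struct toy_bus *bus)
{
    size_t i;

    for (i = 0; i < bus->nchildren; i++)
        toy_suspend_bus_old(bus->children[i]);
    for (i = 0; i < bus->ndevices; i++)
        toy_suspend_device(bus->devices[i]);
    toy_suspend_device(bus->self);
}

/* New scheme: a bus only suspends the devices that sit on it. */
void toy_suspend_bus_new(struct toy_bus *bus)
{
    size_t i;

    for (i = 0; i < bus->nchildren; i++)
        toy_suspend_bus_new(bus->children[i]);
    for (i = 0; i < bus->ndevices; i++)
        toy_suspend_device(bus->devices[i]);
}

/* New scheme: ->self is handled once, and only for the root bus. */
void toy_suspend_root_new(struct toy_bus *root)
{
    toy_suspend_bus_new(root);
    toy_suspend_device(root->self);
}
```

With one bridge on the root bus and one device behind it, the old walk
reaches the bridge twice (as a device on the root bus and as ->self of
the child bus), while the new walk reaches every device exactly once.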






Signal 11

2000-12-07 Thread Rainer Mager

Hi all,

I've searched around for an answer to this with no real luck yet. If anyone
has some ideas I'd be very grateful.

I recently upgraded to a new machine. It is running RedHat 6.2 Linux (with
a SMP 2.4.0test[8-11] kernel) and has a Matrox G400 in it. X is 4.0.1.
Anyway, about once every 2-3 days X will spontaneously die and the only info
I get back is that it was because of signal 11.
I've heard that signal 11 can be related to bad hardware, most often
memory, but I've done a good bit of testing on this and the system seems ok.
What I did was to run the VA Linux Cerberos(sp?) test for 15 hours+ with no
errors. Actually this only worked when running from the console. When
running from X the machine locked up (although no signal 11).
The only info I've gotten back from the XFree86 mailing lists so far is
that there are known and wide spread problems with SMP and these types of
problems. Can anyone comment on this? Are there known SMP problems? What is
the current resolution plan?


Thanks,

--Rainer




Re: Linux 2.2.18pre25

2000-12-07 Thread Alan Cox

> (note: the above is outdated so it's not anymore suggested for inclusion of
> course)
> 
> I submitted most of the not-feature-oriented stuff at pre2 time and I plan to
> re-submit after 2.2.18 is released.

Excellent. I've been trying to avoid VM fixes for 2.2.18 to stop stuff getting
muddled together and hard to debug. Running with page aging convinces me that
for 2.2.19 we need to sort some of the vm issues out badly, and make it faster than
2.4test 8)




Re: Linux 2.2.18pre25

2000-12-07 Thread Andrea Arcangeli

On Fri, Dec 08, 2000 at 12:27:58AM +, Alan Cox wrote:
> The problem is it's hard to know which of your patches depend on what, and
> the complete set is large to say the least.

That's why I use a `proposed' directory that only contains patches that can be
applied to your tree, in this case it was:


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/proposed/v2.2/2.2.18pre2/VM-global-2.2.18pre2-6.bz2

(note: the above is outdated so it's not anymore suggested for inclusion of
course)

I submitted most of the not-feature-oriented stuff at pre2 time and I plan to
re-submit after 2.2.18 is released.

Andrea



Re: Linux 2.2.18pre25

2000-12-07 Thread Alan Cox

> Such a bug can't generate crashes. Did you ever reproduce crashes on your 8Mb
> 486 with 2.2.18pre24?

Yes. Every 20 minutes or so quite reliably. With that change it has yet to
crash (it's actually running that + page aging + another minor tweak so it
doesn't return success on page aging until we have a clump of free pages).

With just the page aging patch it performed way better but still hung.

> 
>ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.18pre24aa1/00_account-failed-buffer-tries-1
>

Oh well ;) 
 
> account-failed-buffer-tries-1 is included in VM-global-7 and it was
> described in the 2.2.18pre21aa2 email to l-k (CC'ed you) in date Fri, 17 Nov
> 2000 18:54:43 +0100:

The problem is it's hard to know which of your patches depend on what, and
the complete set is large to say the least.

Alan




Re: Microsecond accuracy

2000-12-07 Thread Kotsovinos Vangelis


On Thu, 7 Dec 2000, Christopher Friesen wrote:

> Kotsovinos Vangelis wrote:
> > 
> > Is there any way to measure (with microsecond accuracy) the time of a
> > program execution (without using Machine Specific Registers) ?
> > I've already tried getrusage(), times() and clock() but they all have
> > 10 millisecond accuracy, even though they claim to have microsecond
> > accuracy.
> > The only thing that seems to work is to use one of the tools that measure
> > performance through accessing the machine specific registers. They give you
> > the ability to measure the clock cycles used, but their accuracy is also
> > very low from what I have seen up to now.
> 
> Can you not just use something like gettimeofday()?  Do two consecutive calls to
> find the execution time of the instruction itself, and then do two calls on
> either side of the program execution.  Subtract the instruction execution time
> from the delta, and that should give a pretty good idea of execution time.

Well, it is a pretty complex program that I want to measure, with more
than one module running one after another... they sleep and use signals
to wake each other up, they use semaphores etc. What I want to measure is
the time the program is running (not waiting for other processes or
waiting for a signal etc).
Also, there are other processes running on the
system (for example, my program needs about 50 seconds of real time to
execute and I estimate the time it is "running" to be about 5000-1
microseconds)...
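
For reference, the gettimeofday() approach suggested in the quoted reply
could be sketched roughly as below. The helper names (tv_delta_us,
timer_overhead_us, time_call_us, busy_loop) are made up for this
illustration. Note that it measures wall-clock time, so any time the
program spends asleep or preempted is included, which is exactly the
limitation described above:

```c
#include <stddef.h>
#include <sys/time.h>

/* Microseconds elapsed from *start to *end. */
long long tv_delta_us(const struct timeval *start, const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}

/* Cost of the timing call itself: two back-to-back gettimeofday() calls. */
long long timer_overhead_us(void)
{
    struct timeval a, b;

    gettimeofday(&a, NULL);
    gettimeofday(&b, NULL);
    return tv_delta_us(&a, &b);
}

/* Wall-clock time of work(arg) minus the timer overhead, clamped at 0. */
long long time_call_us(void (*work)(void *), void *arg)
{
    struct timeval a, b;
    long long overhead = timer_overhead_us();
    long long elapsed;

    gettimeofday(&a, NULL);
    work(arg);
    gettimeofday(&b, NULL);
    elapsed = tv_delta_us(&a, &b) - overhead;
    return elapsed > 0 ? elapsed : 0;
}

/* Hypothetical workload: spin for *(unsigned long *)arg iterations. */
void busy_loop(void *arg)
{
    volatile unsigned long i;
    unsigned long n = *(unsigned long *)arg;

    for (i = 0; i < n; i++)
        ;
}
```

A call such as time_call_us(busy_loop, &n) then returns an estimate in
microseconds, as good as the resolution gettimeofday() offers on the
machine in question.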

Thanks anyway,

Vangelis




Re: kernel BUG at buffer.c:827 in test12-pre6 and 7

2000-12-07 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Joseph Cheek  <[EMAIL PROTECTED]> wrote:
>copying files off a loopback-mounted vfat filesystem exposes this bug.
>test11 worked fine.

It's not a new bug - it's an old bug that apparently is uncovered by a
new stricter test.

Apparently loopback unlocks an already unlocked page - which has always
been a serious offense, but has never been detected before.

test12-pre6+ detects it, and thus the BUG().

Your stack trace isn't symbolic (see Documentation/oops-tracing.txt), so
it's impossible to debug, but it's already interesting information to
see that it seems to be either loopback or vfat.

Can you test some more? In particular, I'd love to hear if this happens
with vfat even without loopback, or with loopback even without vfat
(make an ext2 filesystem or similar instead). That would narrow down the
bug further.

Thanks,
Linus



Re: io_request_lock question (2.2)

2000-12-07 Thread Alan Cox

> I looked at the implementation of the nbd which just calls 
> 
>   spin_unlock_irq(&io_request_lock);
>   ... do network io ...
>   spin_lock_irq(&io_request_lock);
> 
> This seems to work but it looks very dangerous to me (and ugly, too). Isn't there a 
>better way to do this?

It is only dangerous if you unlock it in the wrong place, unlocking it as much
as possible is good behaviour. You need it locked until you get the actual
request off the queue, you need it locked when you complete the request. The
rest of the time you can be polite




Re: Linux 2.2.18pre25

2000-12-07 Thread Andrea Arcangeli

On Thu, Dec 07, 2000 at 08:03:00PM +, Alan Cox wrote:
> 
> Ok we believe the VM crash looping printing error messages is now fixed.

Such a bug can't generate crashes. Did you ever reproduce crashes on your 8Mb
486 with 2.2.18pre24?

> Marcelo finally figured it out and my 8Mb 486 has been running 2.2.18pre
> with that fix and stably[1].

diff -urN 2.2.18pre24/mm/filemap.c 2.2.18pre25/mm/filemap.c
--- 2.2.18pre24/mm/filemap.c    Wed Nov 29 19:28:29 2000
+++ 2.2.18pre25/mm/filemap.c    Fri Dec  8 00:41:45 2000
@@ -220,8 +220,10 @@
 * throttling.
 */
 
-   if (!try_to_free_buffers(page, wait))
+   if (!try_to_free_buffers(page, wait)) { 
+   if(--count < 0) break;
goto refresh_clock;
+   }
return 1;
}
 
ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.18pre24aa1/00_account-failed-buffer-tries-1

--- 2.2.17pre19/mm/filemap.c    Tue Aug 22 14:54:13 2000
+++ /tmp/filemap.c  Thu Aug 24 01:05:50 2000
@@ -179,6 +179,8 @@
if ((gfp_mask & __GFP_DMA) && !PageDMA(page))
continue;
 
+   count--;
+
/*
 * Is it a page swap page? If so, we want to
 * drop it if it is no longer used, even if it
@@ -224,7 +226,7 @@
return 1;
}
 
-   } while (--count > 0);
+   } while (count > 0);
return 0;
 }
 
lftp> pwd
ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.17pre19
 ^^^
lftp> ls -l account-failed-buffer-tries-1 
-rw-r--r--   1 korg korg  407 Sep  5 22:43 account-failed-buffer-tries-1
  ^^
lftp> 

The only difference is that pre25 keeps decreasing `count' for locked, mapped
and out-of-zone pages, which means it will still fail to shrink the cache when
it looks at an unlucky part of physical memory, while
account-failed-buffer-tries-1 intentionally doesn't decrease `count' in those
cases, to avoid failing in exactly such unlucky situations.
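
The difference between the two accounting schemes can be shown with a
deliberately simplified model. This is a sketch only: the toy_page states
and both toy_scan_* functions are invented for illustration and do not
reproduce shrink_mmap itself. One scan charges the budget for every page
it looks at, the other only for pages that were real candidates:

```c
#include <stddef.h>

enum toy_page {
    TOY_UNUSABLE,   /* locked, mapped or out-of-zone: never a candidate */
    TOY_BUSY,       /* a candidate, but freeing its buffers fails */
    TOY_FREEABLE    /* a candidate that can be freed */
};

/* Every page looked at uses up budget, unusable ones included. */
int toy_scan_charge_all(const enum toy_page *pages, size_t n, int count)
{
    size_t i;

    for (i = 0; i < n && count > 0; i++) {
        count--;
        if (pages[i] == TOY_FREEABLE)
            return 1;
    }
    return 0;
}

/* Only real candidates, including failed buffer tries, use up budget. */
int toy_scan_charge_candidates(const enum toy_page *pages, size_t n, int count)
{
    size_t i;

    for (i = 0; i < n && count > 0; i++) {
        if (pages[i] == TOY_UNUSABLE)
            continue;
        count--;
        if (pages[i] == TOY_FREEABLE)
            return 1;
    }
    return 0;
}
```

With four unusable pages in front of a freeable one and a budget of 3,
the first scan gives up while the second still finds the freeable page.
Both give up after two failed buffer tries with a budget of 2.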

account-failed-buffer-tries-1 is included in VM-global-7 and it was
described in the 2.2.18pre21aa2 email to l-k (CC'ed you) in date Fri, 17 Nov
2000 18:54:43 +0100:

[..]
00_account-failed-buffer-tries-1

Account also the failed buffer tries during shrink_mmap. (me)  
(this is included in the VM-global that I maintain against vanilla
2.2.x btw)
[..]

Andrea



Re: Microsecond accuracy

2000-12-07 Thread Kotsovinos Vangelis


Ok, I'll check it out...

Thank you very much,

--) Vangelis


On Thu, 7 Dec 2000, Karim Yaghmour wrote:

> 
> You might want to try the Linux Trace Toolkit. It'll give you microsecond
> accuracy on program execution time measurement.
> 
> Check it out:
> http://www.opersys.com/LTT
> 
> Karim
> 
> Kotsovinos Vangelis wrote:
> > 
> > Is there any way to measure (with microsecond accuracy) the time of a
> > program execution (without using Machine Specific Registers) ?
> > I've already tried getrusage(), times() and clock() but they all have
> > 10 millisecond accuracy, even though they claim to have microsecond
> > > accuracy.
> > > The only thing that seems to work is to use one of the tools that measure
> > > performance through accessing the machine specific registers. They give you
> > the ability to measure the clock cycles used, but their accuracy is also
> > very low from what I have seen up to now.
> > 
> > Thank you very much in advance
> > 
> > --) Vangelis
> > 
> 
> -- 
> ===
>  Karim Yaghmour
>[EMAIL PROTECTED]
>   Operating System Consultant
>  (Linux kernel, real-time and distributed systems)
> ===
> 




Re: D-LINK DFE-530-TX

2000-12-07 Thread Peter Horton

On Wed, Dec 06, 2000 at 07:44:02PM -0500, Mike A. Harris wrote:
> Which ethernet module works with this card?  2.2.17 kernel
> 

If the PCI device ID is 3065 then it's via-rhine, but not supported by the
driver in the kernel. Get updated via-rhine from Donald Becker's site
http://www.scyld.com/network.

Even the DFE-530-TX driver for NT downloaded from D-Link's site doesn't know
about this chip yet ... though changing the device ID in the .INF file seemed
to make it work ... shrug.

HTH

P.

-- 
++
|Peter Horton|
++
|http://www.colonel-panic.com|
|   http://www.berserk.demon.co.uk   |
| Linux 2.4.0-test11 |
++



io_request_lock question (2.2)

2000-12-07 Thread Reto Baettig

Hi

I'm trying to write a block device driver which does some network stuff to satisfy the 
requests. The problem is, that the network stuff wants to grab the io_request_lock 
which does not work because this lock is already locked when I come into the 
request_fn of my device.

I looked at the implementation of the nbd which just calls 

spin_unlock_irq(&io_request_lock);
... do network io ...
spin_lock_irq(&io_request_lock);

This seems to work but it looks very dangerous to me (and ugly, too). Isn't there a 
better way to do this?

Thanks very much

Reto 




Re: cyrix III by via

2000-12-07 Thread Alan Cox

> The cyrixIII chips by via have the centaur vendor id which causes the
> identify_cpu call in arch/i386/kernel/setup.c to fail.  It is probably
> reasonable for it to have the centaur id as via owns centaur as well.  I
> just replaced the centaur_model call with the cyrix_model one, but I
> know that I am using a cyrix chip.
> 
> A test probably needs to be added in the centaur_model section to test
> for the cyrixIII in disguise.
> 
> The error is a general protection fault.
> 
> Sorry if this is old hat,

It's fairly new hat. The VIA Cyrix III is a next-generation IDT WinChip (VIA
bought both the WinChip stuff and the Cyrix stuff). 2.2.18 should handle the
WinChip properly.




cyrix III by via

2000-12-07 Thread Eric Estabrooks

I am not a subscriber to this list, but I thought this was important
information (which you might already have).

The cyrixIII chips by via have the centaur vendor id which causes the
identify_cpu call in arch/i386/kernel/setup.c to fail.  It is probably
reasonable for it to have the centaur id as via owns centaur as well.  I
just replaced the centaur_model call with the cyrix_model one, but I
know that I am using a cyrix chip.

A test probably needs to be added in the centaur_model section to test
for the cyrixIII in disguise.

The error is a general protection fault.

Sorry if this is old hat,

Eric Estabrooks
[EMAIL PROTECTED]




Re: Linux 2.2.18pre25

2000-12-07 Thread Alan Cox

> Megaraid still needs fixing. I sent you the patch twice, so have
> other people, but it still isn't fixed. The

I asked people to explain why it was needed. I am still waiting. It is a 
patch that does nothing. I will not put random deep magic into the kernel.

I have no reason to believe the current driver in 2.2.18pre24 does not work.
Have you tried that specific kernel?





Re: [PATCH] Re: fs corruption with invalidate_buffers()

2000-12-07 Thread Jan Niehusmann

On Thu, Dec 07, 2000 at 05:26:46PM -0500, Alexander Viro wrote:
> That invalidate_buffers() should leave the unhashed ones alone. If it can't
> be found via getblk() - just leave it as is.
> 
> IOW, let it skip bh if (bh->b_next == NULL && !destroy_dirty_buffers).
> No warnings needed - it's a normal situation.

Yes, if(bh->b_next == NULL && !destroy_dirty_buffers) seems to work, too.
I wonder why, because, if I analysed the problem correctly, it was caused
by the page mapping. Or is it a general rule that bh->b_next==NULL if
bh->b_page->mapping != NULL, ie. a buffer is never directly hashed and belongs
to a mapped page?

Is there a place one can look for documentation about these things?

Another question is what should happen with the mapped pages? Whoever calls
invalidate_buffers() probably does it because the underlying device disappeared
or changed, so the page mappings should be invalidated, too.
OTOH, pages are (normally) mapped through inodes, and if the inode stays valid
after the invalidate_buffers() (ie. if it's called by an lvm resize event),
the page mapping stays valid, too.

Jan




Re: Linux 2.2.18pre25

2000-12-07 Thread Miquel van Smoorenburg

In article <[EMAIL PROTECTED]>,
Alan Cox  <[EMAIL PROTECTED]> wrote:
>So I figure this is it for 2.2.18, subject to evidence to the contrary

Megaraid still needs fixing. I sent you the patch twice, so have
other people, but it still isn't fixed. The

megaBase &= PCI_BASE_ADDRESS_MEM_MASK;

...

megaBase &= PCI_BASE_ADDRESS_IO_MASK;

is removed by the 2.2.18 version (read the patch) and that breaks
older megaraid cards.

Existing megaraid system with 2.2.x kernels WILL break with 2.2.18
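
What the two masks do can be sketched in isolation. The mask and flag values below mirror the definitions in the kernel's <linux/pci.h>; the helper bar_to_base() and the sample register values are hypothetical, and this says nothing about which megaraid cards actually depend on the masking. The low bits of a PCI base address register carry flags (I/O versus memory, prefetchable and so on), so using the raw value as an address yields a base that is off by those bits.

```c
#include <stdint.h>

/* Values mirroring the kernel's <linux/pci.h>, narrowed to 32 bits. */
#define PCI_BASE_ADDRESS_SPACE     0x01u     /* bit 0: 1 = I/O, 0 = memory */
#define PCI_BASE_ADDRESS_MEM_MASK  (~0x0fu)  /* low 4 bits are memory flags */
#define PCI_BASE_ADDRESS_IO_MASK   (~0x03u)  /* low 2 bits are I/O flags */

/* Hypothetical helper: strip the flag bits from a raw 32-bit BAR value. */
static uint32_t bar_to_base(uint32_t bar)
{
    if (bar & PCI_BASE_ADDRESS_SPACE)
        return bar & PCI_BASE_ADDRESS_IO_MASK;
    return bar & PCI_BASE_ADDRESS_MEM_MASK;
}
```

With a raw I/O BAR of 0xe001, the unmasked value would point one byte past the real port base at 0xe000.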

Mike.



Re: bug in scsi.c

2000-12-07 Thread Andreas Klein

On Thu, 7 Dec 2000, Tigran Aivazian wrote:

> On Thu, 7 Dec 2000, Andreas Klein wrote:
> 
> > hello,
> > 
> > I have found a problem in scsi.c which is present in the 2.2 and 2.4
> > series. the scsi error handler thread is created with:
> > 
> > kernel_thread((int (*)(void *)) scsi_error_handler,
> > (void *) shpnt, 0);
> > 
> > This will lead to problems when you have to umount the filesystem on
> > which the SCSI host adapter module is located.
> > To solve the problem I would propose changing this to:
> > 
> > kernel_thread((int (*)(void *)) scsi_error_handler,
> >   (void *) shpnt, CLONE_FILES);
> 
> Hi Andreas,
> 
> Unfortunately, CLONE_FILES is not enough because the module may be loaded
> from the directory containing it, i.e. the thread's cwd may point to that
> filesystem and that would keep it busy. Or-ing CLONE_FS into flags
> wouldn't help either...

Yes, you are right with that.

> A proper way to release the references to resources is to call daemonize()
> function from within the kernel thread function, which calls
> exit_fs()/exit_files() internally.

Nearly correct, but the daemonize function does NOT call exit_files. This has
to be done manually. Looking at the 2.4.0-test10 source I saw that
someone has already fixed the problem by calling exit_files and daemonize.
In the 2.2 series someone tried cut-and-paste programming from the
daemonize function, but exit_files was forgotten. The following patch
should fix the problem for 2.2.16, while leaving scsi.c untouched.

--- linux/drivers/scsi/scsi_error.c.orig        Thu Dec  7 23:56:47 2000
+++ linux/drivers/scsi/scsi_error.c     Fri Dec  8 00:13:20 2000
@@ -1935,6 +1935,7 @@
         * user space pages.  We don't need them, and if we didn't close them
         * they would be locked into memory.
         */
+       exit_files(current);
        exit_mm(current);
 
        current->session = 1;

Bye,

-- Andreas Klein
   [EMAIL PROTECTED]
   root / webmaster @cip.physik.uni-wuerzburg.de
   root / webmaster @www.physik.uni-wuerzburg.de
_
|   | 
|   Long live our gracious AMIGA!   |
|___|





Re: oops in 2.4.0test12-pre5+reiserfs+crypto

2000-12-07 Thread Steven Cole


Florian Schmitt <[EMAIL PROTECTED]> wrote:
>I had the following oops while doing a "find -name" and playing mp3s on 

If you're running 2.4.0-test12-pre5 or later with reiserfs, you should be 
aware that this can be unsafe due to a problem with reiserfs_writepage.
Fortunately, Chris Mason just posted a patch to fix this on the reiserfs-list,
against reiserfs-3.6.22.  The recent reiserfs-list archives can be found at:

http://marc.theaimsgroup.com/?l=reiserfs=1=2=200012

This may have nothing to do with your oops, but if you're going to run
2.4.0-test12-pre5,6,7 then go get the writepage.diff patch, and apply it
after applying linux-2.4.0-test10-reiserfs-3.6.22-patch.

Of course, this is bleeding edge, so the usual caveats apply.

Good luck,
Steven



Re: kernel BUG at buffer.c:827 in test12-pre6 and 7

2000-12-07 Thread Keith Owens

On Thu, 07 Dec 2000 14:42:38 -0800, 
Joseph Cheek <[EMAIL PROTECTED]> wrote:
>loop.o built as module.  this hard crashes the machine, every time
>[PIII-450].  i don't know how to debug this, is there a FAQ?

linux/Documentation/oops-tracing.txt




Re: bug in scsi.c

2000-12-07 Thread Tigran Aivazian

On Thu, 7 Dec 2000, Alan Cox wrote:
> > > > A proper way to release the references to resources is to call daemonize()
> > > > function from within the kernel thread function, which calls
> > > > exit_fs()/exit_files() internally.
> > > 
> > > Nearly correct, the daemonize function does NOT call exit_files.
> > 
> > I do not post messages to linux-kernel without checking the facts
> > first. Read the daemonize() function and see for yourself that you are
> > wrong.
> 
> Andreas is looking at a slightly older kernel, and was right for that. Every
> caller to daemonize either then did the file stuff or needed to and forgot
> so I fixed daemonize

Yes, yes, Alan, I do know that. I just took it for granted that the
correctness of the statement when applied to the latest kernel should not
upset someone who is looking at the older version so me calling someone
"wrong" is not a strong offensive term, just a mild thing saying "guess
what -- things changed". Just trying to exercise the human mind to get
used to quick changes in the Linux world -- that is all :)

Regards,
Tigran




Re: bug in scsi.c

2000-12-07 Thread Tigran Aivazian

On Thu, 7 Dec 2000, Tigran Aivazian wrote:
> PS, Here it is, to save you time opening kernel/sched.c. The kernel is, of
> course, test12-pre7.
  ~~~

Before you tell me "it was not so in the earlier versions!" I am tempted
to quote a famous russian proverb 

"whosoever remembereth the past shall have his eye plucked out" 

which, paraphrased into Linux kernel development would sound like:

"whosoever keepeth Linux kernel trees under CVS, his disk shall rot"

;)

Tigran




Disableing USB.

2000-12-07 Thread Bryan Whitehead

Is there a way I can disable a part of the kernel that is compiled into the
kernel? For example, I'd like to pass this to lilo: "usb=disable" and then
the USB code is not loaded even though USB has been built into the kernel.

Is such a feature stupid? Or has this already been implemented?

It would be nice if this was generic and I could also pass things like
"procfs=disabled".
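
To make the request concrete, here is a user-space sketch of what recognising such an option could look like. Nothing here confirms that a "usb=disable" option exists; the option string, the usb_disabled flag and parse_cmdline() are all hypothetical. In the kernel itself, boot options are normally hooked up with the __setup() macro from <linux/init.h> rather than by scanning the command line by hand.

```c
#include <string.h>

static int usb_disabled;  /* hypothetical flag an init routine would check */

/* Scan a space-separated boot command line for the exact token "usb=disable". */
static int parse_cmdline(const char *cmdline)
{
    static const char opt[] = "usb=disable";
    const size_t len = sizeof(opt) - 1;
    const char *p = cmdline;

    usb_disabled = 0;
    while (*p) {
        size_t tok = strcspn(p, " ");      /* length of the current token */

        if (tok == len && strncmp(p, opt, len) == 0)
            usb_disabled = 1;
        p += tok;
        p += strspn(p, " ");               /* skip separating spaces */
    }
    return usb_disabled;
}
```

Matching the whole token rather than a prefix means that "usb=disabled" is not silently accepted as the same option.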

The reason I ask is that a friend of mine got a new Sony Vaio laptop that has
the ethernet card and USB device stepping on each other. It would be nice
to pass usb=disable to the Redhat/Mandrake/whatever installation boot disk
so the ethernet card can work freely (he's doing a network install because
he has no CD-ROM), as he doesn't use any USB devices anyway.

-- 
---
Bryan Whitehead
Email: [EMAIL PROTECTED]
WorkE: [EMAIL PROTECTED]




Re: bug in scsi.c

2000-12-07 Thread Alan Cox

> > > A proper way to release the references to resources is to call daemonize()
> > > function from within the kernel thread function, which calls
> > > exit_fs()/exit_files() internally.
> > 
> > Nearly correct, the daemonize function does NOT call exit_files.
> 
> I do not post messages to linux-kernel without checking the facts
> first. Read the daemonize() function and see for yourself that you are
> wrong.

Andreas is looking at a slightly older kernel, and was right for that. Every
caller to daemonize either then did the file stuff or needed to and forgot
so I fixed daemonize




Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread richardj_moore



Yes, indeed this is the point - we should at least be able to report the
problem even if we can't recover - and we should do that in the standard
kernel. It doesn't seem right to convert a bad problem into an unfathomable
disaster, which is what a trap gate for double-fault does. If you're going
to do that then why bother to set up a trap gate, just leave IDT vector 8
as an invalid descriptor. As it stands, the do_double_fault routine is
otiose.


Richard Moore -  RAS Project Lead - Linux Technology Centre (PISC).

http://oss.software.ibm.com/developerworks/opensource/linux
Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
IBM UK Ltd,  MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK


Keith Owens <[EMAIL PROTECTED]> on 07/12/2000 22:47:42

Please respond to Keith Owens <[EMAIL PROTECTED]>

To:   Richard J Moore/UK/IBM@IBMGB
cc:   Andi Kleen <[EMAIL PROTECTED]>, [EMAIL PROTECTED], "Maciej W. Rozycki"
  <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject:  Re: Why is double_fault serviced by a trap gate?




On Thu, 7 Dec 2000 21:09:47 +,
[EMAIL PROTECTED] wrote:
>In summary I'd say the lack of a task gate is at the very least an
>oversight, if not a bug.
>
>If no one else wants to do it I'll see if I can code up the task gates for
>the double-fault and NMI.

If you overflow the kernel stack then you have already scribbled on the
process state at the low end of the kernel stack pages.  The process is
definitely not recoverable but you might not even be able to recover
the machine.  Corrupt p_opptr and friends, thread_group or pidhash and
other processes can be affected when they follow the chains.  However
being able to report the error is a good start, even if you cannot
recover.

If you add task gates, assign enough stack space for debuggers.  kdb
does a lot of work when NMI detects a hung cpu and needs stack space to
do that work.  A good option is to dedicate a set of process entries
for per cpu task gates, say processes 2-NR_CPUS+1 are dedicated to task
gates.







Lock ordering, inquiring minds want to know.

2000-12-07 Thread george anzinger

In looking over sched.c I find:

spin_lock_irq(&runqueue_lock);
read_lock(&tasklist_lock);


This seems to me to be the wrong order of things.  The read lock can be
unavailable (someone holds a write lock) for relatively long periods of
time; for example, wait holds it in a while loop.  On the other hand the
runqueue_lock, being an "irq" lock, will always be held for short periods
of time.  It would seem better to wait for the runqueue lock while
holding the read_lock with interrupts on than to wait for the
read_lock with interrupts off.  As near as I can tell this is the only
place in the system where both of these locks are held (of course, in all
cases of two locks being held at the same time, both lockers must use the
same order).  So...
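
The "same order everywhere" rule can be expressed as a small debugging aid. The sketch below is not kernel code: it assumes the read lock in question is tasklist_lock, it encodes the order proposed above (read lock first, then runqueue_lock) as hypothetical ranks, and the tracker functions are invented for illustration. It only records intent; it does no real locking.

```c
#include <assert.h>

/* Hypothetical ranks: a lock may only be taken while holding lower ranks. */
enum lock_rank {
    RANK_NONE = 0,
    RANK_TASKLIST = 1,   /* proposed: taken first, interrupts still on */
    RANK_RUNQUEUE = 2    /* proposed: taken second, held only briefly */
};

struct lock_tracker {
    enum lock_rank stack[4];
    int depth;
};

/* Returns 1 and records the lock if taking `rank` respects the order, else 0. */
static int tracker_acquire(struct lock_tracker *t, enum lock_rank rank)
{
    assert(t->depth < 4);
    if (t->depth > 0 && t->stack[t->depth - 1] >= rank)
        return 0;                    /* would invert the agreed order */
    t->stack[t->depth++] = rank;
    return 1;
}

/* Drops the most recently recorded lock. */
static void tracker_release(struct lock_tracker *t)
{
    assert(t->depth > 0);
    t->depth--;
}
```

A path that takes the runqueue lock first and then asks for the read lock is rejected by tracker_acquire(), which is exactly the pairing that would risk deadlock if some other path used the opposite order.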


What am I missing here? 

George



Re: bug in scsi.c

2000-12-07 Thread Tigran Aivazian

On Thu, 7 Dec 2000, Andreas Klein wrote:
> > A proper way to release the references to resources is to call daemonize()
> > function from within the kernel thread function, which calls
> > exit_fs()/exit_files() internally.
> 
> Nearly correct, the daemonize function does NOT call exit_files.

I do not post messages to linux-kernel without checking the facts
first. Read the daemonize() function and see for yourself that you are
wrong.

Regards,
Tigran

PS, Here it is, to save you time opening kernel/sched.c. The kernel is, of
course, test12-pre7.

/*
 *  Put all the gunge required to become a kernel thread without
 *  attached user resources in one place where it belongs.
 */

void daemonize(void)
{
        struct fs_struct *fs;


        /*
         * If we were started as result of loading a module, close all of the
         * user space pages.  We don't need them, and if we didn't close them
         * they would be locked into memory.
         */
        exit_mm(current);

        current->session = 1;
        current->pgrp = 1;

        /* Become as one with the init task */

        exit_fs(current);       /* current->fs->count--; */
        fs = init_task.fs;
        current->fs = fs;
        atomic_inc(&fs->count);
        exit_files(current);
        current->files = init_task.files;
        atomic_inc(&current->files->count);
}





Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread richardj_moore



You seem to be misunderstanding the point of the argument: R3 stack fault -
no problem - handled by trap gate for idt vector 12 - recovery is possible
if one wants to handle it. R0 stack fault - big problem, exception 12 is
converted to a double-fault, which is converted to a triple-fault because
vector 8 is a trap gate and not a task gate.
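
The difference between the two gate kinds is visible in the descriptor itself. A trap gate delivers the exception on the current ring-0 stack, which is the broken one in this scenario, whereas a task gate makes the CPU switch to a separate TSS with its own stack pointer. The sketch below packs a 32-bit IDT entry following the documented x86 layout; make_gate() is a hypothetical helper, and any selectors or handler addresses passed to it are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Gate type field (low 4 bits of the access byte) on 32-bit x86. */
enum gate_type {
    GATE_TASK = 0x5,        /* selector names a TSS, offset is unused */
    GATE_INTERRUPT32 = 0xE,
    GATE_TRAP32 = 0xF
};

/*
 * Pack an 8-byte IDT entry:
 *   bits  0..15  handler offset 15:0
 *   bits 16..31  segment (or TSS) selector
 *   bits 40..47  access byte: present, DPL, type
 *   bits 48..63  handler offset 31:16
 */
static uint64_t make_gate(enum gate_type type, unsigned dpl,
                          uint16_t selector, uint32_t offset)
{
    uint64_t access;

    assert(dpl <= 3);
    if (type == GATE_TASK)
        offset = 0;                     /* a task gate carries no entry point */

    access = 0x80u | (dpl << 5) | (unsigned)type;   /* 0x80 = present bit */

    return (uint64_t)(offset & 0xffffu)
         | ((uint64_t)selector << 16)
         | (access << 40)
         | ((uint64_t)(offset >> 16) << 48);
}
```

For vector 8, the only fields that would change are the type (0x5 instead of 0xF) and the selector, which would have to refer to a dedicated TSS with a known-good stack.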


Richard Moore -  RAS Project Lead - Linux Technology Centre (PISC).

http://oss.software.ibm.com/developerworks/opensource/linux
Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
IBM UK Ltd,  MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK


"Richard B. Johnson" <[EMAIL PROTECTED]> on 07/12/2000 21:44:23

Please respond to [EMAIL PROTECTED]

To:   Richard J Moore/UK/IBM@IBMGB
cc:   Andi Kleen <[EMAIL PROTECTED]>, "Maciej W. Rozycki" <[EMAIL PROTECTED]>,
  [EMAIL PROTECTED]
Subject:  Re: Why is double_fault serviced by a trap gate?




On Thu, 7 Dec 2000 [EMAIL PROTECTED] wrote:

>
>
> Which surely we can on today's x86 systems. Even back in the days of OS/2
> 2.0 running on a 386 with 4Mb RAM we used a taskgate for both NMI and
> Double Fault. You need only a minimal stack - 1K, sufficient to save state
> and restore ESP to a known point before switching back to the main TSS to
> allow normal exception handling to occur.
>
> There no architectural restriction that some folks have hinted at - as long
> as the DPL for the task gates is 3.
>
[SNIPPED...]

Please refer to page 6-16, Intel486 Microprocessor Family Programmer's
Reference Manual.

The specific text is: "The TSS does not have a stack pointer for a
privilege level 3 stack, because the procedure cannot be called by a less
privileged procedure. The stack for privilege level 3 is preserved by the
contents of SS and EIP registers which have been saved on the stack
of the privilege level called from level 3".

What this means is that a stack-fault in level 3 will kill you no
matter how cute you try to be. And, putting a task gate as call
procedure entry from a trap or fault is just trying to be cute.
It's extra code that will result in the same processor reset.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.0 on an i686 machine (799.54 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





[PATCH] CONFIG_PCI cleanup in drivers/net/fc/iph5526.c (240t12p7)

2000-12-07 Thread Rasmus Andersen

Hi.

This patch is blessed by the maintainer and is based on the observation
that the Interphase 5526 card requires PCI to work. Therefore the PCI
dependency is moved into net/Config.in and removed from the .c-file.
The maintainers email address is also updated and a minor code shuffle
is done in order to eliminate an #ifdef MODULE.

Please apply (or at least comment :) )


diff -Naur linux-240-t12-pre7-clean/drivers/net/Config.in linux/drivers/net/Config.in
--- linux-240-t12-pre7-clean/drivers/net/Config.in  Wed Nov 22 22:41:40 2000
+++ linux/drivers/net/Config.in Fri Dec  8 00:50:34 2000
@@ -258,7 +258,7 @@
 
 bool 'Fibre Channel driver support' CONFIG_NET_FC
 if [ "$CONFIG_NET_FC" = "y" ]; then
-   dep_tristate '  Interphase 5526 Tachyon chipset based adapter support' 
CONFIG_IPHASE5526 $CONFIG_SCSI
+   dep_tristate '  Interphase 5526 Tachyon chipset based adapter support' 
+CONFIG_IPHASE5526 $CONFIG_SCSI $CONFIG_PCI
 fi
 
 if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
diff -Naur linux-240-t12-pre7-clean/drivers/net/fc/iph5526.c linux/drivers/net/fc/iph5526.c
--- linux-240-t12-pre7-clean/drivers/net/fc/iph5526.c   Wed Nov 22 22:41:40 2000
+++ linux/drivers/net/fc/iph5526.c  Fri Dec  8 00:50:34 2000
@@ -1,7 +1,7 @@
 /**
  * iph5526.c: IP/SCSI driver for the Interphase 5526 PCI Fibre Channel
  *   Card.
- * Copyright (C) 1999 Vineet M Abraham <[EMAIL PROTECTED]>
+ * Copyright (C) 1999 Vineet M Abraham <[EMAIL PROTECTED]>
  *
  * This program is free software; you can redistribute it and/or 
  * modify it under the terms of the GNU General Public License as 
@@ -33,7 +33,7 @@
 */ 
 
 static const char *version =
-"iph5526.c:v1.0 07.08.99 Vineet Abraham ([EMAIL PROTECTED])\n";
+"iph5526.c:v1.0 07.08.99 Vineet Abraham ([EMAIL PROTECTED])\n";
 
 #include 
 #include 
@@ -220,32 +220,23 @@
 
 static void iph5526_timeout(struct net_device *dev);
 
-#ifdef CONFIG_PCI
 static int iph5526_probe_pci(struct net_device *dev);
-#endif
-
 
 int __init iph5526_probe(struct net_device *dev)
 {
-#ifdef CONFIG_PCI
if (pci_present() && (iph5526_probe_pci(dev) == 0))
return 0;
-#endif
 return -ENODEV;
 }
 
-#ifdef CONFIG_PCI
 static int __init iph5526_probe_pci(struct net_device *dev)
 {
-#ifndef MODULE
-struct fc_info *fi;
-static int count = 0;
-#endif
 #ifdef MODULE
-struct fc_info *fi = (struct fc_info *)dev->priv;
-#endif
-
-#ifndef MODULE
+   struct fc_info *fi = (struct fc_info *)dev->priv;
+#else
+   struct fc_info *fi;
+   static int count = 0;
+ 
if(fc[count] != NULL) {
if (dev == NULL) {
dev = init_fcdev(NULL, 0);
@@ -277,7 +268,6 @@
display_cache(fi);
return 0;
 }
-#endif  /* CONFIG_PCI */
 
 static int __init fcdev_init(struct net_device *dev)
 {
diff -Naur linux-240-t12-pre7-clean/drivers/net/fc/tach_structs.h linux/drivers/net/fc/tach_structs.h
--- linux-240-t12-pre7-clean/drivers/net/fc/tach_structs.h  Mon Aug 23 19:12:38 1999
+++ linux/drivers/net/fc/tach_structs.h Fri Dec  8 00:50:34 2000
@@ -1,7 +1,7 @@
 /**
  * iph5526.c: Structures for the Interphase 5526 PCI Fibre Channel 
  *   IP/SCSI driver.
- * Copyright (C) 1999 Vineet M Abraham <[EMAIL PROTECTED]>
+ * Copyright (C) 1999 Vineet M Abraham <[EMAIL PROTECTED]>
  **/
 
 #ifndef _TACH_STRUCT_H

-- 
Regards,
Rasmus([EMAIL PROTECTED])

Without censorship, things can get terribly confused in the
public mind. -General William Westmoreland, during the war in Viet Nam



Re: 2.2.18 vs 2.4.0 proc_fs.c

2000-12-07 Thread Alan Cox

> Why is 2.2.18 proc_fs.c different than both 2.2.17 and 2.4.0? Cox, would
> you accept a patch that makes 2.2.18 define create_proc_info_entry and
> related functions the same way that 2.4.0 does?

Send me a diff and I'll be happy to



Re: D-LINK DFE-530-TX

2000-12-07 Thread Dr. Kelsey Hudson

It uses the via-rhine driver on my system

On Thu, 7 Dec 2000, James Bourne wrote:

> On Wed, 6 Dec 2000, Mike A. Harris wrote:
> 
> > Which ethernet module works with this card?  2.2.17 kernel
> 
> Should be the rtl8139 driver.
> 
> Regards,
> Jim
> 
> > --
> >   Mike A. Harris  -  Linux advocate  -  Open source advocate
> >   This message is copyright 2000, all rights reserved.
> >   Views expressed are my own, not necessarily shared by my employer.
> > --
> 
> 

-- 
 Kelsey Hudson   [EMAIL PROTECTED] 
 Software Engineer
 Compendium Technologies, Inc   (619) 725-0771
--- 




Re: kernel BUG at buffer.c:827! and scsi modules no load at boot w/ initrd - test12pre7

2000-12-07 Thread Mohammad A. Haque

Will do in a few hours. Working on stupid cs project right now.

Linus Torvalds wrote:
> Do you have something special that triggers this? Can you test if it
> only happens with initrd, for example?

-- 

=
Mohammad A. Haque  http://www.haque.net/ 
   [EMAIL PROTECTED]

  "Alcohol and calculus don't mix. Project Lead
   Don't drink and derive." --Unknown  http://wm.themes.org/
   [EMAIL PROTECTED]
=



Re: linux-2.4.0-test11 and sysklogd-1.3-31

2000-12-07 Thread Keith Owens

On Thu, 7 Dec 2000 12:36:01 -0500 (EST), 
"Georg Nikodym" <[EMAIL PROTECTED]> wrote:
>> "KO" == Keith Owens <[EMAIL PROTECTED]> writes:
>
> KO> I would prefer to see the oops decoding completely removed from
> KO> klogd.
>
>Since nobody else has weighed in on this issue, I quickly did the
>necessary to effect Keith's suggestion.  What follows is a patch to
>sysklogd-1.3-31 (which after applying, ksym_mod.c can be removed):

You only removed the module symbol handling.  The problem is that the
entire klogd oops handling is out of date and broken.  I recommend
removing all oops processing from klogd, which means that klogd does
not need any symbols nor System.map.




PCI bridge setup weirdness

2000-12-07 Thread Russell King

Hi,

Kernel 2.4.0-test12-pre7

There appears to be a possibility whereby the root resources (ioport_resource
and iomem_resource) can get modified by the PCI code:

Unknown bridge resource 0: assuming transparent
Unknown bridge resource 1: assuming transparent
Unknown bridge resource 2: assuming transparent
PCI: Fast back to back transfers enabled
ioport:  - 
ioport0: 1000 - 
ioport1: 1000 - 
PCI: Bus 1, bridge: Digital Equipment Corporation DECchip 21152
  IO window: -0fff
  MEM window: -000f
ioport2: 1000 - 1fff
ioport: 1000 - 1fff

So now the PCI IO resource contains 0x1000 to 0x1fff.  Unfortunately,
the PCI devices on the root bus are allocated to IO ports 0x4000 and
up.

This means that drivers are unable to request their resources:

Linux Tulip driver version 0.9.11 (November 3, 2000)
conflict: 1000-1fff: PCI IO (100)
tulip: eth0: I/O region (0x80@0x4400) unavailable, aborting
eepro100.c:v1.09j-t 9/29/99 Donald Becker 
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.35 $ 2000/11/17 Modified by Andrey V. Savochkin 
<[EMAIL PROTECTED]> and others
conflict: 1000-1fff: PCI IO (100)
eepro100: cannot reserve I/O ports

It appears to be caused by the pci_read_bridge_bases code copying the
pointer to the resources instead of making a copy of the resources
themselves.
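
The suspected failure mode can be reproduced with a toy model. This is not the pci_read_bridge_bases code: struct resource is reduced to two fields, and struct bus, bridge_alias(), bridge_copy() and set_window() are hypothetical names. It only shows why storing a pointer to the parent's resource lets a later window update rewrite the root, while storing a private copy does not.

```c
#include <stddef.h>

/* Simplified, hypothetical model of a resource range. */
struct resource {
    unsigned long start;
    unsigned long end;
};

static struct resource ioport_root = { 0x0000, 0xffff };

struct bus {
    struct resource *res;   /* what the bus uses as its I/O window */
    struct resource own;    /* private storage, used only when copying */
};

/* Aliasing: the child bus points straight at the parent's resource. */
static void bridge_alias(struct bus *child, struct resource *parent)
{
    child->res = parent;
}

/* Copying: the child bus gets its own resource initialised from the parent. */
static void bridge_copy(struct bus *child, const struct resource *parent)
{
    child->own = *parent;
    child->res = &child->own;
}

/* Program the bridge window; writes through whatever res points at. */
static void set_window(struct bus *b, unsigned long start, unsigned long end)
{
    b->res->start = start;
    b->res->end = end;
}
```

With the aliasing variant, setting the bridge window to 0x1000-0x1fff shrinks ioport_root to the same range, which matches the "ioport: 1000 - 1fff" line in the log above.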

I'm not sure what the correct fix is here (there have been some recent
changes in this area, but I'll hack around it for now).
   _
  |_| - ---+---+-
  |   | Russell King[EMAIL PROTECTED]  --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+ --- -+-
  /   |   THE developer of ARM Linux  |+| /|\
 /  | | | ---  |
+-+-+ -  /\\\  |



Re: Oops while assigning PCI resources

2000-12-07 Thread Russell King

Russell King writes:
> I'm seeing an oops while assigning PCI resources on an ARM board.  This
> board has a PCI to PCI bridge on board without any devices on the second
> bus.

I've solved this one, sorry for the noise.
   _
  |_| - ---+---+-
  |   | Russell King[EMAIL PROTECTED]  --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+ --- -+-
  /   |   THE developer of ARM Linux  |+| /|\
 /  | | | ---  |
+-+-+ -  /\\\  |



Re: oops in 2.4.0test12-pre5+reiserfs+crypto

2000-12-07 Thread Keith Owens

On Thu, 7 Dec 2000 23:36:23 +0100, 
Florian Schmitt <[EMAIL PROTECTED]> wrote:
>I had the following oops while doing a "find -name" and playing mp3s on 
>my SB live:
>0010:[ne2k-pci:__insmod_ne2k-pci_O/lib/modules/2.4.0-test12/kernel/drivers+-2386971/96]
>It seems strange that the oops occurred in ne2k-pci, since no network was 
>connected at that time.

No it did not.  This is the broken klogd oops converter making a mess
of the report.  Always run with "klogd -x" and run ksymoops manually.




Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread Keith Owens

On Thu, 7 Dec 2000 21:09:47 +, 
[EMAIL PROTECTED] wrote:
>In summary I'd say the lack of a task gate is at the very least an
>oversight, if not a bug.
>
>If no one else wants to do it I'll see if I can code up the task gates for
>the double-fault and NMI.

If you overflow the kernel stack then you have already scribbled on the
process state at the low end of the kernel stack pages.  The process is
definitely not recoverable but you might not even be able to recover
the machine.  Corrupt p_opptr and friends, thread_group or pidhash and
other processes can be affected when they follow the chains.  However
being able to report the error is a good start, even if you cannot
recover.

If you add task gates, assign enough stack space for debuggers.  kdb
does a lot of work when NMI detects a hung cpu and needs stack space to
do that work.  A good option is to dedicate a set of process entries
for per cpu task gates, say processes 2-NR_CPUS+1 are dedicated to task
gates.




kernel BUG at buffer.c:827 in test12-pre6 and 7

2000-12-07 Thread Joseph Cheek

copying files off a loopback-mounted vfat filesystem exposes this bug.
test11 worked fine.

loop.o built as module.  this hard crashes the machine, every time
[PIII-450].  i don't know how to debug this, is there a FAQ?

[transcribed by hand]:

# mount -o loop /tmp/cdboot.288 /mnt/cd
# cd /mnt/cd
# cp menu.lst /tmp
kernel BUG at buffer.c:827!
invalid operand: 
CPU: 0
EIP: 0010:[]
EFLAGS: 00010082
eax: 001c ebx: c1d8fc60 ecx:  edx: 0001
esi: c10658e4 edi: 0002 ebp: c1d8fca8 esp: c1793dc0
ds: 0018 es: 0018 ss: 0018
Process cp (pid 762, stackpage=c1793000)
Stack: c01fe484 c01fe95a 033b c1d8fc60 c1cef420 0001 0001
c01610e1
   c1d8fc60 0001 c1cef420  c1cef420 c02c8ed8 c88df91c
c1cef420
   0001 c88e0986 0007  0001 c02c8ed8 c02c8ee8
c4f18800
Call Trace: [] [] [] []
[] [] [] []
   [] [] [] [] []
[] [] [c0128720>]
   [] [] [c010b56b>]
Code: 0f 0b 83 c4 0c 8d 5e 28 8d 46 2c 39 46 2c 74 24 b9 01 00 00

as soon as i reboot i will look what's at buffer.c:827



--
thanks!

joe

--
Joseph Cheek, Sr Linux Consultant, Linuxcare | http://www.linuxcare.com/
Linuxcare.  Support for the Revolution.  | [EMAIL PROTECTED]
CTO / Acting PM, Redmond Linux Project   | [EMAIL PROTECTED]
425 990-1072 vox [1074 fax] 206 679-6838 pcs | [EMAIL PROTECTED]






oops in 2.4.0test12-pre5+reiserfs+crypto

2000-12-07 Thread Florian Schmitt

I had the following oops while doing a "find -name" and playing mp3s on 
my SB live:

Dec  7 14:16:50 phoenix kernel: Unable to handle kernel paging request at 
virtual address 00010f08
Dec  7 14:16:50 phoenix kernel:  printing eip:
Dec  7 14:16:50 phoenix kernel: d084a3e5
Dec  7 14:16:50 phoenix kernel: *pde = 
Dec  7 14:16:50 phoenix kernel: Oops: 
Dec  7 14:16:50 phoenix kernel: CPU:0
Dec  7 14:16:50 phoenix kernel: EIP:
0010:[ne2k-pci:__insmod_ne2k-pci_O/lib/modules/2.4.0-test12/kernel/drivers+-2386971/96]
Dec  7 14:16:50 phoenix kernel: EFLAGS: 00010207
Dec  7 14:16:50 phoenix kernel: eax: 0004   ebx: 00010f00   ecx: 
0a45   edx: d084a3e0
Dec  7 14:16:50 phoenix kernel: esi: c1176934   edi:    ebp: 
0308   esp: c147ff8c
Dec  7 14:16:50 phoenix kernel: ds: 0018   es: 0018   ss: 0018
Dec  7 14:16:50 phoenix kernel: Process kswapd (pid: 4, 
stackpage=c147f000)
Dec  7 14:16:50 phoenix kernel: Stack: c1176918 c0129ef4 c1176918 
00010f00 0004   0001
Dec  7 14:16:50 phoenix kernel:0004  004e 
 c012a854 0004  00010f00
Dec  7 14:16:50 phoenix kernel:c01e0377 c147e239 0008e000 
c012a92d 0004  00010f00 c1449fb8
Dec  7 14:16:50 phoenix kernel: Call Trace: [rw_swap_page+148/160] 
[__get_free_pages+36/48] [stext_lock+7687/12848] [nr_free_pages+61/64] 
[Dec  7 14:16:50 phoenix kernel: Code: 8b 43 08 8b 40 10 8b 80 8c 00 00 
00 50 e8 c9 1b 01 00 53 e8 

It seems strange that the oops occurred in ne2k-pci, since no network was 
connected at that time.




Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread Brian Gerst

"Richard B. Johnson" wrote:
> 
> On Thu, 7 Dec 2000 [EMAIL PROTECTED] wrote:
> 
> >
> >
> > Which surely we can on today's x86 systems. Even back in the days of OS/2
> > 2.0 running on a 386 with 4Mb RAM we used a taskgate for both NMI and
> > Double Fault. You need only a minimal stack - 1K, sufficient to save state
> > and restore ESP to a known point before switching back to the main TSS to
> > allow normal exception handling to occur.
> >
> > There is no architectural restriction that some folks have hinted at - as long
> > as the DPL for the task gates is 3.
> >
> [SNIPPED...]
> 
> Please refer to page 6-16, Intel486 Microprocessor Family Programmer's
> Reference Manual.
> 
> The specific text is: "The TSS does not have a stack pointer for a
> privilege level 3 stack, because the procedure cannot be called by a less
> privileged procedure. The stack for privilege level 3 is preserved by the
> contents of SS and EIP registers which have been saved on the stack
> of the privilege level called from level 3".
> 
> What this means is that a stack-fault in level 3 will kill you no
> matter how cute you try to be. And, putting a task gate as call
> procedure entry from a trap or fault is just trying to be cute.
> It's extra code that will result in the same processor reset.

No, because the CPL of the task gate would be 0, which means the stack
will be set to tss->esp0.  The DPL of 3 means that the descriptor can be
accessed from CPL3.  The text you mention generally means that the only
way to get back to CPL3 is with iret (via the saved %cs:%eip and
%ss:%esp pushed on the CPL0/1/2 stack).

--

Brian Gerst



Re: [PATCH] Re: fs corruption with invalidate_buffers()

2000-12-07 Thread Alexander Viro



On Thu, 7 Dec 2000, Udo A. Steinberg wrote:

> Jan Niehusmann wrote:
> > 
> > The following patch actually prevents the corruption I described.
> > 
> > I'd like to hear from the people having problems with hdparm, if it helps
> > them, too.
> 
> Yes, it prevents the issue.
> 
> > Please note that the patch circumvents the problem more than it fixes it.
> > The true fix would invalidate the mappings, but I don't know how to do it.
> 
> I don't know either. What does Alexander Viro say to all of this?

That invalidate_buffers() should leave the unhashed ones alone. If it can't
be found via getblk() - just leave it as is.

IOW, let it skip bh if (bh->b_next == NULL && !destroy_dirty_buffers).
No warnings needed - it's a normal situation.
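
The skip rule above could be sketched roughly as follows. This is only an
illustration of the condition as written in this mail, not the buffer.c code:
`struct bh_sketch` and `should_skip_buffer()` are hypothetical names, and only
the `b_next` field is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head; only b_next is modelled. */
struct bh_sketch {
    struct bh_sketch *b_next;   /* NULL is taken to mean "not hashed" here */
};

/*
 * Returns true when invalidate_buffers() should leave this buffer alone:
 * it is unhashed and the caller did not ask to destroy dirty buffers.
 */
static bool should_skip_buffer(const struct bh_sketch *bh,
                               int destroy_dirty_buffers)
{
    return bh->b_next == NULL && !destroy_dirty_buffers;
}
```

In the loop, such a check would sit before the removal from the queues, with
no printk, since this is described as a normal situation.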




Re: kernel BUG at buffer.c:827! and scsi modules no load at boot w/ initrd - test12pre7

2000-12-07 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Mohammad A. Haque <[EMAIL PROTECTED]> wrote:
>
>I'm getting a BUG at boot in buffer.c:827. Oops/ksymoops at the end of
>this message. I also noticed that the driver for my scsi card isn't
>loading at boot if compiled as a module using initrd. This is what I get
>during the boot process. 

This is a new BUG-check, where "UnlockPage()" actually verifies that the
page was locked before it unlocks it.

Trying to unlock a page that isn't locked is a nasty bug - if it happens
it probably also means that with some bad luck that unlock could have
unlocked the page that somebody _else_ had locked, and expected to stay
locked until it was unlocked properly.

(It may also be that the BUG() is due to exactly that - somebody else
who didn't have the lock unlocked the page from under you, and the
_proper_ unlocker will in that case be the one that oopses).
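
The kind of check described above can be sketched in user space like this.
It is a minimal illustration under assumptions, not the kernel code:
`struct page_sketch`, `PG_LOCKED_BIT`, `try_lock_page_sketch()` and
`unlock_page_checked()` are hypothetical names, and the BUG() is replaced by
a return value so the caller can see the failure.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define PG_LOCKED_BIT 0x1u   /* hypothetical flag bit, for illustration only */

struct page_sketch {
    atomic_uint flags;
};

/* Try to take the lock; true means this caller set the bit. */
static bool try_lock_page_sketch(struct page_sketch *p)
{
    unsigned old = atomic_fetch_or(&p->flags, PG_LOCKED_BIT);
    return (old & PG_LOCKED_BIT) == 0;
}

/*
 * Clear the lock bit and report whether it had actually been set.
 * A false return is the case the new BUG check is meant to catch:
 * somebody unlocked a page that was not locked.
 */
static bool unlock_page_checked(struct page_sketch *p)
{
    unsigned old = atomic_fetch_and(&p->flags, ~PG_LOCKED_BIT);
    return (old & PG_LOCKED_BIT) != 0;
}
```

Note that the check can only say the bit was already clear. It cannot say who
cleared it, which is why the oops may show up in the proper unlocker.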

Do you have something special that triggers this? Can you test if it
only happens with initrd, for example?

Linus



Re: [patch] Re: [patch-2.4.0-test12-pre6] truncate(2) permissions

2000-12-07 Thread Alexander Viro



On Thu, 7 Dec 2000, Andries Brouwer wrote:

> On Thu, Dec 07, 2000 at 10:24:31AM -0500, Alexander Viro wrote:
> 
> > Al, currently walking through the /usr/share/man/man2 and swearing silently...
> 
> Swearing? At the POSIX decisions or at the man page quality?

Mostly at the out-of-sync/not-all-errors-documented kind of places and amount
of fun involved in getting them in sync with the tree (and with each other,
for that matter). Oh, well...

> In the latter case, additions and corrections are very welcome.
> Make sure that you have 1.31 installed. 

Grabbed it. BTW, if you still have 1.7, 1.10, 1.13 and 1.14...  I've managed
to dig the rest out, but these seem to be gone (looks like a modified 1.10
is out there, but...)

I was thinking about putting the whole bunch under the CVS.  If you have
the missing ones somewhere and could send an URL...

BTW, could we finally lose mpx(2)? Very few programs used it under v7 and
that experiment had been abandoned early in 80s, so it's not like we needed
it for porting. phys(2) is already gone (not from unimplemented(2), though),
so we have a precedent of removing such stuff.




Re: [PATCH] Re: fs corruption with invalidate_buffers()

2000-12-07 Thread Udo A. Steinberg

Jan Niehusmann wrote:
> 
> The following patch actually prevents the corruption I described.
> 
> I'd like to hear from the people having problems with hdparm, if it helps
> them, too.

Yes, it prevents the issue.

> Please note that the patch circumvents the problem more than it fixes it.
> The true fix would invalidate the mappings, but I don't know how to do it.

I don't know either. What does Alexander Viro say to all of this?

-Udo.



Same debug patch adapted to test12-pre7 follows:
 
--- linux/fs/buffer.c   Thu Dec  7 22:55:54 2000
+++ /usr/src/linux/fs/buffer.c  Thu Dec  7 22:49:02 2000
@@ -627,7 +627,7 @@
then an invalidate_buffers call that doesn't trash dirty buffers. */
 void __invalidate_buffers(kdev_t dev, int destroy_dirty_buffers)
 {
-   int i, nlist, slept;
+   int i, nlist, slept, db_message = 0;
struct buffer_head * bh, * bh_next;
 
  retry:
@@ -653,9 +653,13 @@
write_lock(&hash_table_lock);
if (!atomic_read(&bh->b_count) &&
(destroy_dirty_buffers || !buffer_dirty(bh))) {
-   remove_inode_queue(bh);
-   __remove_from_queues(bh);
-   put_last_free(bh);
+   if (bh->b_page && bh->b_page->mapping)
+   db_message = 1;
+   else {
+   remove_inode_queue(bh);
+   __remove_from_queues(bh);
+   put_last_free(bh);
+   }
}
/* else complain loudly? */
 
@@ -668,6 +672,8 @@
spin_unlock(&lru_list_lock);
if (slept)
goto retry;
+   if (db_message)
+   printk("invalidate_buffers with mapped page!\n");
 }
 
 void set_blocksize(kdev_t dev, int size)



Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread Petr Vandrovec

On  7 Dec 00 at 16:44, Richard B. Johnson wrote:
> On Thu, 7 Dec 2000 [EMAIL PROTECTED] wrote:
> 
> > Which surely we can on today's x86 systems. Even back in the days of OS/2
> > 2.0 running on a 386 with 4Mb RAM we used a taskgate for both NMI and
> > Double Fault. You need only a minimal stack - 1K, sufficient to save state
> > and restore ESP to a known point before switching back to the main TSS to
> > allow normal exception handling to occur.
> > 
> > There is no architectural restriction that some folks have hinted at - as long
> > as the DPL for the task gates is 3.
> > 
> [SNIPPED...]
> 
> Please refer to page 6-16, Intel486 Microprocessor Family Programmer's
> Reference Manual.
> 
> The specific text is: "The TSS does not have a stack pointer for a
> privilege level 3 stack, because the procedure cannot be called by a less
> privileged procedure. The stack for privilege level 3 is preserved by the
> contents of SS and EIP registers which have been saved on the stack
> of the privilege level called from level 3".
> 
> What this means is that a stack-fault in level 3 will kill you no
> matter how cute you try to be. And, putting a task gate as call
> procedure entry from a trap or fault is just trying to be cute.
> It's extra code that will result in the same processor reset.

You misunderstand. There is no SS/ESP for level 3, because you cannot
switch to CPL 3 using CALL/JMP, you can switch to it only through IRET/RETF.
And both of them fetch new SS/ESP from stack...

If stack-fault happens on CPL3, CPU switches to CPL0 (as defined by
stack fault trap gate), executes appropriate code, and then returns
back to CPL3 through IRET.

Maybe you forgot when reading this, that CPL3 is the non-privileged level,
and CPL0 has the most privileges.

Problem with doublefault is that if you overflowed CPL0 stack, you just
cannot service this error on same stack, you must switch to another one.
And only way to switch out from CPL0 stack during fault service is
hardware switch to another TSS.

In either case, nothing is ever pushed into old stack, so doing

movl $0,%esp

does not matter. With userspace never, in kernel if you have task gate
for doublefault... In userspace it will not even crash until you send some
signal to that process, or until you'll execute some call/push/pop yourself.
Petr Vandrovec
[EMAIL PROTECTED]




RE: Ramdisk root filesystem strangeness

2000-12-07 Thread Vivek Dasgupta

Hi

I'm sorry if this question doesn't belong to this list. But I couldn't
access the linux-admin list.

Is there support for using RAMDISK as the final root file system
in 2.2.x versions, or is it there in the 2.4.x versions.

I am trying to bring up linux on a diskless server which initially mounts
root FS thru NFS. Then I want to load HDD image to a RAMDISK and use it as
the final root filesystem.

I am not sure whether it is already supported or whether any kernel change
will be required for this?

Any pointers would be helpful.

thanks

vivek

-Original Message-
From: Justin Carlson [mailto:[EMAIL PROTECTED]]
Sent: Thursday, December 07, 2000 11:52 AM
To: [EMAIL PROTECTED]
Subject: Ramdisk root filesystem strangeness


Am in the midst of bringing up the kernel on a new MIPS variant, and I'm
trying to mount a statically linked ramdisk as the root filesystem.

Note, this is NOT using initrd support, I really want to use a ramdisk as my
final filesystem, not as an intermediate step in booting the system.

In blkdev_get(), called from mount_root(), there's some code that grabs
an empty inode, sets up i_rdev, and calls open() for the root device
with the caveat that open() must not examine anything except i_rdev.

in rd_open, though, there's this code snippet:

	/*
	 * Immunize device against invalidate_buffers() and prune_icache().
	 */
	if (rd_inode[DEVICE_NR(inode->i_rdev)] == NULL) {
		if (!inode->i_bdev)
			return -ENXIO;

I'm hitting the -ENXIO return, which is based on an uninitialized field of
the inode structure.

Being relatively new to the code base, I'm not sure what this code is trying
to do, nor how to fix it.  Any suggestions?

The code involved is from the MIPS CVS repository at oss.sgi.com, which was
synced in the past couple days from 2.4.0test11

-Justin




Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread Richard B. Johnson

On Thu, 7 Dec 2000 [EMAIL PROTECTED] wrote:

> 
> 
> Which surely we can on today's x86 systems. Even back in the days of OS/2
> 2.0 running on a 386 with 4Mb RAM we used a taskgate for both NMI and
> Double Fault. You need only a minimal stack - 1K, sufficient to save state
> and restore ESP to a known point before switching back to the main TSS to
> allow normal exception handling to occur.
> 
> There is no architectural restriction that some folks have hinted at - as long
> as the DPL for the task gates is 3.
> 
[SNIPPED...]

Please refer to page 6-16, Intel486 Microprocessor Family Programmer's
Reference Manual.

The specific text is: "The TSS does not have a stack pointer for a
privilege level 3 stack, because the procedure cannot be called by a less
privileged procedure. The stack for privilege level 3 is preserved by the
contents of SS and EIP registers which have been saved on the stack
of the privilege level called from level 3".

What this means is that a stack-fault in level 3 will kill you no
matter how cute you try to be. And, putting a task gate as call
procedure entry from a trap or fault is just trying to be cute.
It's extra code that will result in the same processor reset.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.0 on an i686 machine (799.54 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





Re: 2.4.0-test12-pre7

2000-12-07 Thread Tom Rini

On Thu, Dec 07, 2000 at 08:01:13PM +0100, Kai Germaschewski wrote:

> Maybe I'm stating something which is obvious to everybody, but note
> that pci_assign_unassigned_resources is only called from

Possibly, but I don't know either. :)

> ./arch/alpha/kernel/pci.c:  pci_assign_unassigned_resources();
> ./arch/mips/ddb5074/pci.c:  pci_assign_unassigned_resources();
> ./arch/arm/kernel/bios32.c: pci_assign_unassigned_resources();
> 
> so it looks like most archs don't use it anyway. (And that's supposedly
> why pci_set_master helped people on x86)

You're assuming all arches are up to date.  Silly you. :)  I know there's a
patch to use this for some PReP (PPC) machines.  It's quite possible other
arches might be using this but aren't synced up to Linus.

-- 
Tom Rini (TR1265)
http://gate.crashing.org/~trini/



[PATCH] Re: fs corruption with invalidate_buffers()

2000-12-07 Thread Jan Niehusmann

The following patch actually prevents the corruption I described.

I'd like to hear from the people having problems with hdparm, if it helps
them, too.

Please note that the patch circumvents the problem more than it fixes it.
The true fix would invalidate the mappings, but I don't know how to do it.

Jan

--- linux-2.4.0-test11/fs/buffer.c  Mon Nov 20 08:55:05 2000
+++ test/fs/buffer.cThu Dec  7 22:28:24 2000
@@ -589,7 +589,7 @@
then an invalidate_buffers call that doesn't trash dirty buffers. */
 void __invalidate_buffers(kdev_t dev, int destroy_dirty_buffers)
 {
-   int i, nlist, slept;
+   int i, nlist, slept, db_message=0;
struct buffer_head * bh, * bh_next;
 
  retry:
@@ -615,8 +615,13 @@
write_lock(&hash_table_lock);
if (!atomic_read(&bh->b_count) &&
(destroy_dirty_buffers || !buffer_dirty(bh))) {
-   __remove_from_queues(bh);
-   put_last_free(bh);
+   if(bh->b_page 
+   && bh->b_page->mapping) { 
+   db_message=1;
+   } else { 
+   __remove_from_queues(bh);
+   put_last_free(bh);
+   }
}
/* else complain loudly? */
 
@@ -629,6 +634,8 @@
spin_unlock(&lru_list_lock);
if (slept)
goto retry;
+   if(db_message)
+   printk("invalidate_buffer with mapped page\n");
 }
 
 void set_blocksize(kdev_t dev, int size)



Re: Microsecond accuracy

2000-12-07 Thread Karim Yaghmour


You might want to try the Linux Trace Toolkit. It'll give you microsecond
accuracy on program execution time measurement.

Check it out:
http://www.opersys.com/LTT

Karim
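
For a quick measurement without any extra tooling, one common approach is to
sample gettimeofday() around the code of interest. The sketch below is only
an illustration: `elapsed_usec()` and `time_call_usec()` are hypothetical
helper names, the result is wall-clock time rather than CPU time, and how
fine the real resolution is depends on the kernel and the hardware.

```c
#include <stddef.h>
#include <sys/time.h>

/* Elapsed microseconds between two gettimeofday() samples. */
static long long elapsed_usec(const struct timeval *start,
                              const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (end->tv_usec - start->tv_usec);
}

/* Time one call of fn() in wall-clock microseconds; returns -1 on error. */
static long long time_call_usec(void (*fn)(void))
{
    struct timeval t0, t1;

    if (gettimeofday(&t0, NULL) != 0)
        return -1;
    fn();
    if (gettimeofday(&t1, NULL) != 0)
        return -1;
    return elapsed_usec(&t0, &t1);
}
```

Because this measures elapsed time, anything else the machine does during the
run is included, so repeated runs on an idle system give the most usable
numbers.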

Kotsovinos Vangelis wrote:
> 
> Is there any way to measure (with microsecond accuracy) the time of a
> program execution (without using Machine Specific Registers) ?
> I've already tried getrusage(), times() and clock() but they all have
> 10 millisecond accuracy, even though they claim to have microsecond
> accuracy.
> The only thing that seems to work is to use one of the tools that measure
> performance through accessing the machine specific registers. They give you
> the ability to measure the clock cycles used, but their accuracy is also
> very low from what I have seen up to now.
> 
> Thank you very much in advance
> 
> --) Vangelis
> 

-- 
===
 Karim Yaghmour
   [EMAIL PROTECTED]
  Operating System Consultant
 (Linux kernel, real-time and distributed systems)
===



Re: Why is double_fault serviced by a trap gate?

2000-12-07 Thread richardj_moore



Which surely we can on today's x86 systems. Even back in the days of OS/2
2.0 running on a 386 with 4Mb RAM we used a taskgate for both NMI and
Double Fault. You need only a minimal stack - 1K, sufficient to save state
and restore ESP to a known point before switching back to the main TSS to
allow normal exception handling to occur.

There is no architectural restriction that some folks have hinted at - as long
as the DPL for the task gates is 3.

There's no problem under MP since the double fault exception will be only
presented on the processor that instigated the problem.

As for NMIs I didn't think they were presented to all processors
simultaneously. If they are then the way to handle that is to map a page of
the GDT to a unique physical address per processor - i.e. processor
local storage. The virtual address will be the same on each. This is what
we did under OS/2 SMP.
We also aliased these pages to unique virtual addresses so that they could
be seen by the kernel from any processor context.

The only time you want the NMI handler to be fast is when it's being used
for hand-shaking, which some disk devices do. And perhaps for APIC NMI
class interprocessor interrupts. But I honestly don't think that's really a
good enough reason not to have a task gate for NMI.

The unpredictability of the abort (NMI or Double-fault) refers to the fact
that in general it is indeterminate as to whether it is a fault or trap.
And that's a matter of whether the EIP points at or after the instruction
related to the exception. The abort nature of these exceptions is not
really a problem for the exception handler.

In summary I'd say the lack of a task gate is at the very least an
oversight, if not a bug.

If no one else wants to do it I'll see if I can code up the task gates for
the double-fault and NMI.
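
As a rough illustration of what such an IDT entry holds, the sketch below
builds the 8-byte task-gate descriptor in the IA-32 layout: TSS selector in
bits 16-31, access byte (P, DPL, type 0101b) in bits 40-47, everything else
reserved. The helper name `make_task_gate()` is hypothetical, and any
selector value passed in is just an example; nothing here is taken from the
kernel sources.

```c
#include <stdint.h>

#define GATE_TYPE_TASK 0x5u    /* 0101b: task gate */
#define GATE_PRESENT   0x80u   /* P bit of the access byte */

/*
 * Build a task-gate descriptor as one 64-bit value.
 * tss_selector: GDT selector of the TSS to switch to (caller's choice).
 * dpl:          descriptor privilege level, 0..3.
 */
static uint64_t make_task_gate(uint16_t tss_selector, unsigned dpl)
{
    uint64_t access = GATE_PRESENT | ((dpl & 3u) << 5) | GATE_TYPE_TASK;

    return ((uint64_t)tss_selector << 16) | (access << 40);
}
```

With a hypothetical TSS selector of 0x28, DPL 0 gives 0x0000850000280000 and
DPL 3 gives 0x0000E50000280000. The gate carries no handler offset: the entry
point and the stack both come from the TSS the selector names.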

Richard


Richard Moore -  RAS Project Lead - Linux Technology Centre (PISC).

http://oss.software.ibm.com/developerworks/opensource/linux
Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
IBM UK Ltd,  MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK





Re: [PATCH] setup.c notsc Re: Microsecond accuracy

2000-12-07 Thread Maciej W. Rozycki

On Thu, 7 Dec 2000, Hugh Dickins wrote:

> The present situation is inconsistent: "notsc" removes cpuinfo's
> "tsc" flag in the UP case (when cpu_data[0] is boot_cpu_data), but
> not in the SMP case.  I don't believe HPA's recent mods affected that
> behaviour, but it is made consistent (cleared in SMP case too) by the
> patch I sent him a couple of days ago, below updated for test12-pre7...

 My original code was specifically tested on a SMP system -- having no
suitable system I wrote it mainly to make sure TSC-less SMP systems (i.e. 
486 ones) run fine.  If it doesn't work as expected anymore, then an error
slipped in somehow since then. 

> I didn't test userland access to the TSC, but my reading of the code
> was that prior to this patch, it would be disallowed on the boot cpu,
> but still allowed on auxiliaries - because disable_tsc set X86_CR4_TSD
> if cpu_has_tsc, but initing boot cpu forces cpu_has_tsc to !cpu_has_tsc.

 Note that identify_cpu() rereads feature flags, so everything should be
fine.

> I have removed the "FIX-HPA" comment line: of course, that's none of my
> business, but if you approve the patch I imagine you'd want that to go too
> (I agree it's a bit ugly there, but safest to disable cpu_has_tsc soonest).

 It might probably be done in identify_cpu() but do we want to fiddle with
cr4 there?

 Well, it appears an error slipped in, indeed.  The following change is
the key one.  Everything should be fine once it's changed. 

 Peter would you accept the patch (see below)? 

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+

diff -up --recursive --new-file linux-2.4.0-test11.macro/arch/i386/kernel/setup.c 
linux-2.4.0-test11/arch/i386/kernel/setup.c
--- linux-2.4.0-test11.macro/arch/i386/kernel/setup.c   Mon Nov 20 07:03:47 2000
+++ linux-2.4.0-test11/arch/i386/kernel/setup.c Thu Dec  7 20:43:24 2000
@@ -1959,7 +1959,7 @@ void __init identify_cpu(struct cpuinfo_
 */
 
/* TSC disabled? */
-#ifdef CONFIG_TSC
+#ifndef CONFIG_X86_TSC
if ( tsc_disable )
clear_bit(X86_FEATURE_TSC, &c->x86_capability);
 #endif




2.2.18 vs 2.4.0 proc_fs.c

2000-12-07 Thread Michael Rothwell

Why is 2.2.18 proc_fs.c different than both 2.2.17 and 2.4.0? Cox, would
you accept a patch that makes 2.2.18 define create_proc_info_entry and
related functions the same way that 2.4.0 does?

2.2.17:
does not define this

2.2.18:
#define create_proc_info_entry(n, m, b, g) \
	{ \
		struct proc_dir_entry *r = create_proc_entry(n, m, b); \
		if (r) r->get_info = g; \
	}

2.4.0:
extern inline struct proc_dir_entry *create_proc_info_entry(const char *name,
	mode_t mode, struct proc_dir_entry *base, get_info_t *get_info)
{
	struct proc_dir_entry *res=create_proc_entry(name,mode,base);
	if (res) res->get_info=get_info;
	return res;
}
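
The practical difference between the two forms can be shown with a reduced
stand-alone sketch. All names below (`struct pde_sketch`,
`create_entry_sketch()`, `create_info_entry_macro`,
`create_info_entry_inline()`, `dummy_get_info()`) are hypothetical stand-ins
for the kernel ones, cut down to what the comparison needs.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types. */
typedef int get_info_sketch_t(void);

struct pde_sketch {
    get_info_sketch_t *get_info;
};

static struct pde_sketch the_entry;

/* Stand-in for create_proc_entry(): one static entry, or NULL if ok == 0. */
static struct pde_sketch *create_entry_sketch(int ok)
{
    return ok ? &the_entry : NULL;
}

/* 2.2.18 style: a brace block, so it yields no value to the caller. */
#define create_info_entry_macro(ok, g) \
    { \
        struct pde_sketch *r = create_entry_sketch(ok); \
        if (r) r->get_info = (g); \
    }

/* 2.4.0 style: a function, so the caller can test or keep the result. */
static inline struct pde_sketch *create_info_entry_inline(int ok,
                                                          get_info_sketch_t *g)
{
    struct pde_sketch *res = create_entry_sketch(ok);

    if (res)
        res->get_info = g;
    return res;
}

static int dummy_get_info(void)
{
    return 42;
}
```

Code written against the function form, such as
`if (!create_proc_info_entry(...)) goto fail;`, does not compile against the
brace-block macro, and the macro followed by a semicolon also breaks an
unbraced if/else.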



kernel BUG at buffer.c:827! and scsi modules no load at boot w/ initrd - test12pre7

2000-12-07 Thread Mohammad A. Haque

I'm getting a BUG at boot in buffer.c:827. Oops/ksymoops at the end of
this message. I also noticed that the driver for my scsi card isn't
loading at boot if compiled as a module using initrd. This is what I get
during the boot process. 

SCSI subsystem driver Revision: 1.00
request_module[scsi_hostadapter]: Root fs not mounted
request_module[scsi_hostadapter]: Root fs not mounted


Oops:

Reading Oops report from the terminal
kernel BUG at buffer.c:827!
invalid operand: 
CPU:0
EIP:0010:[]
EFLAGS: 00010286
eax: 001c   ebx: c1541144   ecx: c02919c8   edx: 0001
esi: d3d41c20   edi: 0202   ebp: d3d41c68   esp: d3d0dbec
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 9, stackpage=d3d0d000)
Stack: c023fbf2 c023feda 033b d3eace40 d3ea4000 d3c7d800 d3d41c20
c0171570 
   d3d41c20 0001 0100 0202 d3d41c20 018c c0168b9c
c0303da8 
    d3d41c20 d3d41c20 0001  d3d0dc9c c0168d21
 
Call Trace: [] [] [] []
[] [] [] 
   [] [] [] [] []
[] [] [] 
   [] [] [] [] []
[] [] [] 
   [] [] [] [] []
[] [] 
Code: 0f 0b 83 c4 0c 8d 73 28 8d 43 2c 39 43 2c 74 15 b9 01 00 00 
invalid operand: 
CPU:0
EIP:0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 001c   ebx: c1541144   ecx: c02919c8   edx: 0001
esi: d3d41c20   edi: 0202   ebp: d3d41c68   esp: d3d0dbec
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 9, stackpage=d3d0d000)
Stack: c023fbf2 c023feda 033b d3eace40 d3ea4000 d3c7d800 d3d41c20
c0171570 
   d3d41c20 0001 0100 0202 d3d41c20 018c c0168b9c
c0303da8 
    d3d41c20 d3d41c20 0001  d3d0dc9c c0168d21
 
Call Trace: [] [] [] []
[] [] [] 
   [] [] [] [] []
[] [] [] 
   [] [] [] [] []
[] [] [] 
   [] [] [] [] []
[] [] 
Code: 0f 0b 83 c4 0c 8d 73 28 8d 43 2c 39 43 2c 74 15 b9 01 00 00 

>>EIP; c0135e13<=
Trace; c023fbf2 
Trace; c023feda 
Trace; c0171570 
Trace; c0168b9c 
Trace; c0168d21 
Trace; c0137125 
Trace; c0130d63 <__alloc_pages+e3/42c>
Trace; c015a2ab 
Trace; c0158ad4 
Trace; c01291d3 
Trace; c012b1c3 
Trace; c012b42c 
Trace; c013e0d6 
Trace; c014e76d 
Trace; c014e5e0 
Trace; c0130d63 <__alloc_pages+e3/42c>
Trace; c014a805 
Trace; c0129439 
Trace; c013e1e7 
Trace; c02308ad 
Trace; c013e4ce 
Trace; c013e4e5 
Trace; c0109fbb 
Trace; c010b4d3 
Trace; c02308ba 
Trace; c0100018 
Trace; c01070c5 
Trace; c02308ba 
Trace; c0109d80 
Trace; c02308ba 
Code;  c0135e13 
 <_EIP>:
Code;  c0135e13<=
   0:   0f 0b ud2a  <=
Code;  c0135e15 
   2:   83 c4 0c  add$0xc,%esp
Code;  c0135e18 
   5:   8d 73 28  lea0x28(%ebx),%esi
Code;  c0135e1b 
   8:   8d 43 2c  lea0x2c(%ebx),%eax
Code;  c0135e1e 
   b:   39 43 2c  cmp%eax,0x2c(%ebx)
Code;  c0135e21 
   e:   74 15 je 25 <_EIP+0x25> c0135e38

Code;  c0135e23 
  10:   b9 01 00 00 00mov$0x1,%ecx

-- 

=
Mohammad A. Haque  http://www.haque.net/ 
   [EMAIL PROTECTED]

  "Alcohol and calculus don't mix. Project Lead
   Don't drink and derive." --Unknown  http://wm.themes.org/
   [EMAIL PROTECTED]
=


