date:20000925

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Linus Torvalds




On Mon, 25 Sep 2000, Alexander Viro wrote:
 
 
 On Sun, 24 Sep 2000, Linus Torvalds wrote:
 
  The remaining part is the directory handling. THAT is very buffer-cache
  intensive, as the directory handling hasn't been moved over to the page
  cache at all for ext2. Doing a large "find" (or even just a "ls -l") will
  basically do purely buffer cache accesses, first for the directory data
  and then for the inode data. With no page cache activity to balance things
  out at all - leading to a potentially quite unbalanced VM that never
  really had a good chance to get rid of dentries etc.
 
 You forgot inode tables themselves.

I didn't. That's the "then for the inode data" part.

I'm not claiming that the buffer cache accesses would go away - I'm just
saying that the unbalanced "only buffer cache" case should go away,
because things like "find" and friends will still cause mostly page cache
activity.

(Considering the size of the inode on ext2, I don't know how true this is,
I have to admit. It might still be quite biased towards the buffer cache,
and as such the additional page cache pressure might not be enough to
really cause any major shift in balancing).

 I'll do it and post the result tomorrow. I bet that there will be issues
 I've overlooked (stuff that happens to work on UFS, but needs to be more
 general for ext2), so it's going as "very alpha", but hey, it's pretty
 straightforward, so there is a chance to debug it fast. Yes, famous last
 words and all such...

Sure.

 BTW, we _will_ need it on UFS side in 2.4 anyway. Rationale:

[ reasons removed ]

I have no problem with that. Especially as I suspect the people who use
UFS are more likely to be the technical kind of user who is more inclined
to be able to debug whatever potential problems crop up anyway. Your point
about not duplicating the fragment handling is certainly quite convincing
for the case of UFS.

   So some variant of directories in pagecache is needed for 2.4, the
 question being whether it's UFS-only or we use its port on ext2... BTW,
 minixfs/sysvfs can also use the thing, but that's another story.

Let's plan on UFS-only, for all the prudent reasons.

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Re: NAT dropping packets

2000-09-25 Thread Rusty Russell


In message [EMAIL PROTECTED] you write:
 Hi,
 
 I've just spotted a small problem with 2.4.0-test8 running netfilter:
 
 NAT: 3 dropping untracked packet c065d3a0 1 192.168.0.1 - 192.168.0.9

Yes.  The connection tracking code doesn't try to understand broadcast
packets, so when it sees the ping reply, it doesn't recognize it.  The
NAT code then drops the (untracked) packet.

The message has been very useful in highlighting connection tracking
problems in the past 8).

If you don't mind your box `leaking', you can simply comment out this
message and make NAT return NF_ACCEPT for this.

Rusty.
--
Hacking time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Alexander Viro





On Sun, 24 Sep 2000, Linus Torvalds wrote:

 I'm not claiming that the buffer cache accesses would go away - I'm just
 saying that the unbalanced "only buffer cache" case should go away,
 because things like "find" and friends will still cause mostly page cache
 activity.
 
 (Considering the size of the inode on ext2, I don't know how true this is,
 I have to admit. It might still be quite biased towards the buffer cache,
 and as such the additional page cache pressure might not be enough to
 really cause any major shift in balancing).

Hrrrmmm... You know, since we don't have to associate struct inode with every
address space and inode table _is_ a linear array, after all... We
might put it into pagecache too. Very few places access the on-disk
inode, so it's not too horrible. All we need is readpage() and that's
very easy, considering the fact that allocation is static. prepare_write()
and commit_write() may be NULL for all I care and writepage() will
be easy too - no holes, no allocation, no nothing. Looks like we need to deal
with ext2_update_inode(), ext2_read_inode() and that's it. Even less
intrusive than directory stuff...
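
To make the "linear array" point concrete, here is a minimal sketch of the index arithmetic a readpage() for the inode table would need. The struct and function names are hypothetical and the geometry values would come from the superblock; this is an illustration of the lookup, not code from ext2.

```c
#include <stdint.h>

/* Hypothetical geometry for illustration; real values come from the superblock. */
struct itable_geom {
    uint32_t inodes_per_group; /* inodes in each block group */
    uint32_t inode_size;       /* bytes per on-disk inode */
    uint32_t page_size;        /* bytes per page-cache page */
};

struct itable_pos {
    uint32_t group;   /* block group holding the inode */
    uint32_t page;    /* page index within that group's inode table */
    uint32_t offset;  /* byte offset of the inode inside that page */
};

/* Inode numbers start at 1, so subtract 1 before dividing. */
static struct itable_pos itable_locate(const struct itable_geom *g, uint32_t ino)
{
    uint32_t index = (ino - 1) % g->inodes_per_group;
    uint32_t byte  = index * g->inode_size;
    struct itable_pos pos = {
        .group  = (ino - 1) / g->inodes_per_group,
        .page   = byte / g->page_size,
        .offset = byte % g->page_size,
    };
    return pos;
}
```

Because the table is statically allocated, the mapping is pure arithmetic with no block lookup, which is why readpage() should be easy here.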

Comments?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Re: [DOC] Debugging early kernel hangs

2000-09-25 Thread Russell King


James Sutherland writes:
 On Sat, 23 Sep 2000, Russell King wrote:
  And I'll try to make the point a second time that everything does not have
  a character-based screen to write to.
 
 So what? For platforms which have a nice easy way to stick ASCII on
 screen, use this. For other platforms, find some other approach - if you
 have a nice easy serial port handy, try feeding the characters there.

So what?  It shouldn't be called "VIDEO_CHAR" then - calling it that
describes ONE implementation only, not what it is actually doing.
Something more like DEBUGCH(char) or whatever is a better choice because
it describes what the intention of it is, rather than the implementation.
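
A rough sketch of what an intention-named hook could look like, assuming a per-platform backend. Only the name DEBUGCH comes from this thread; debugch_backend and the buffer backend are hypothetical, and real early-boot code might prefer a compile-time macro per platform rather than a function pointer.

```c
#include <stddef.h>

/* Hypothetical backend hook: each platform points this at its own
 * output routine (screen, serial port, LED, ...). */
static void (*debugch_backend)(char c);

/* Emit one debug character through whatever backend is installed. */
#define DEBUGCH(c) do { if (debugch_backend) debugch_backend(c); } while (0)

/* Example backend for illustration: capture characters in a memory buffer. */
static char debugch_log[64];
static size_t debugch_len;

static void debugch_to_buffer(char c)
{
    if (debugch_len < sizeof(debugch_log) - 1)
        debugch_log[debugch_len++] = c;
}
```

Callers only ever write DEBUGCH('A'); whether that ends up on a text screen or a serial line is decided by the backend, not by the name.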
--
Russell King [EMAIL PROTECTED]
http://www.arm.linux.org.uk/personal/aboutme.html
THE developer of ARM Linux

Re: problem with 2.4.0-test9-pre6 seems to be SHM

2000-09-25 Thread Christoph Rohland


Hi David,

David Ford [EMAIL PROTECTED] writes:

 I think it's time to get Christoph on the line and see what he has
 to say.  The 4096 number is a limit to the system, you can have a
 max of 4096 shared memory segments systemwide.  Do you know offhand
 which programs are using(abusing) shm?

Here I am on the line again. But fortunately you found out yourself
that it's not the kernel to blame...

Greetings
Christoph

Re: 1023rd thread crashes 2.4.0-test8 from non-root user

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Mark Hahn wrote:

  The problem is large numbers of threads in 2.4.0-test8 can result in a
  hard crash of the entire kernel.  This can be done as a non-root user.
 
 this appears to be reproducible (128M duron, haven't tried intel UP/SMP):

i've done some experimentation, and to me it appears we overload the
queued signal limit of bash, or something like that? The Ctrl-C thing
definitely creates a lot of signals. And the default limit for queued
signals [kernel/signal.c:max_queued_signals] is 1024 ...

so i think this is threading-unrelated; to me it (tentatively) looks like
a signal handling bug.
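
a toy model of the suspected limit, purely for illustration: the names below are hypothetical and this is not the logic in kernel/signal.c, just a capped counter showing why ~2000 pending signals cannot fit under a cap of 1024.

```c
#include <errno.h>

/* Toy model, not kernel code: a global count of queued signals with a cap. */
static int max_queued = 1024;   /* mirrors the default limit quoted above */
static int nr_queued;

/* Returns 0 if a queue slot was taken, -EAGAIN once the cap is reached. */
static int queue_one_signal(void)
{
    if (nr_queued >= max_queued)
        return -EAGAIN;
    nr_queued++;
    return 0;
}

/* Called when a queued signal has been delivered. */
static void dequeue_one_signal(void)
{
    if (nr_queued > 0)
        nr_queued--;
}
```

in this model the 1025th undelivered signal is refused; what the real kernel does at that point is exactly the open question.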

Ingo


Re: Linux 2.2.17

2000-09-25 Thread Andrey Savochkin


On Mon, Sep 04, 2000 at 10:58:09PM +0200, Pedro M. Rodrigues wrote:
 
 The change to eepro100 done in pre16 isn't listed as being
 restored. Is it still in i/o mode?

The investigation hasn't succeeded yet.
It looks like a timing problem (however, I'm not so sure now).
I spent 3 full evenings last week working on this matter, no luck so far.

  2.2.17pre16
 [...]
  o   Switch eepro100 to I/O mode pending investigation
  (Andrey Savochkin)

Best regards
Andrey V. Savochkin

Re: 1023rd thread crashes 2.4.0-test8 from non-root user

2000-09-25 Thread Ingo Molnar



indeed, after changing max_queued_signals to 4096, i cannot crash the
kernel anymore with 2000 threads.

Ingo


Re: 1023rd thread crashes 2.4.0-test8 from non-root user

2000-09-25 Thread Ingo Molnar



btw., maybe it's init that gets those 2000 signals, not bash?

Ingo


[Demo program]: Poor elevator performance in 2.4.0-test9pre6

2000-09-25 Thread Robert Cohen


I've written a small program to demonstrate the performance problems I've
been seeing in recent Linux kernels.

The benchmark is a single process which writes and reads 8k blocks
round-robin from a number of files.
It is written as a single process so the ordering of the operations is
known and perfectly interleaved. I include the source at the end of this
message.
The files are initially created using sequential writes so that the
files are laid out nicely on disk.
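
For readers who just want the access pattern, a minimal sketch of the interleaved write phase follows. It is not the elv_test.c source; the function name and signature are made up for illustration, and timing and the read phase are left out.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 8192

/*
 * Write `blocks_per_file` blocks to each of `nfiles` already-open
 * descriptors, one block per file in turn, so the writes are interleaved.
 * Each descriptor is written sequentially from its current offset.
 * Returns the total number of bytes written, or -1 on error.
 */
static long write_round_robin(const int *fds, int nfiles, int blocks_per_file)
{
    char buf[BLOCK_SIZE];
    long total = 0;

    memset(buf, 'x', sizeof(buf));
    for (int b = 0; b < blocks_per_file; b++) {
        for (int f = 0; f < nfiles; f++) {
            if (write(fds[f], buf, sizeof(buf)) != (ssize_t)sizeof(buf))
                return -1;
            total += (long)sizeof(buf);
        }
    }
    return total;
}
```

Each individual file still sees purely sequential writes; only the order in which the requests reach the block layer is interleaved.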

With kernel version 2.4.0-test9pre6 the results are as follows.
The test machine has 128 Megs of memory. The test accesses 240 Megs of
files so that it can't fit in cache.

If I run it with 8 files of size 30 Megs:

[robert@test25 src]$ ./elv_test 8 30
files created, 240 megs written at 8.96 megs/sec
finished writing 240 megs written at 1.05 megs per sec  
finished reading, 240 megs read at 5.848833 megs/sec

If I do the same with a single file of size 240 Megs

[robert@test25 src]$ ./elv_test 1 240
files created, 240 megs written at 11.12 megs/sec
finished writing 240 megs written at 11.08 megs per sec
finished reading, 240 megs read at 12.580521 megs/sec

Comparing this to a similar tiotest run

[robert@test25 src]$ tiotest -f 30 -b 8192 -t 8 -r 0
Tiotest results for 8 concurrent io threads:

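,--------------------------------------------------------------.
| Item          | Time     | Rate         | Usr CPU  | Sys CPU |
+---------------+----------+--------------+----------+---------+
| Write 240 MBs |   25.5 s |   9.410 MB/s |   0.1 %  |  10.0 % |
| Read  240 MBs |   20.4 s |  11.755 MB/s |   0.0 %  |   8.8 % |
`--------------------------------------------------------------'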


As the tests demonstrate, we get terrible write performance when a
single process is writing round robin to a number of files.
There are two possible explanations for this: either the single threaded
nature of the program is slowing things down, or the fact that the files
are being written round robin is slowing us down.

Since I see exactly the same kind of behaviour with the netatalk
benchmark I have been using, and the netatalk benchmark isn't single
threaded, I believe that it's the round robin interleaving of writes that
is leading to the performance problems.

As a comparison, here are the results of the test program in kernel version
2.4.0-test1-ac22.

[robert@testmac25 src]$ ./elv_test 8 30 
files created, 240 megs written at 8.24 megs/sec
finished writing 240 megs written at 8.99 megs per sec
finished reading, 240 megs read at 5.849072 megs/sec


Here the write performance is fine. This definitely indicates that it's
not the single threaded benchmark that's slowing things down.

As I understand it, the elevator should be dealing with the interleaved
nature of the writes. This seems to be working ok for reads, but it
doesn't seem to be working properly for writes.

The source can be found at http://tltsu.anu.edu.au/~robert/elv_test.c

--
Robert Cohen
TLTSU, Unix support
Australian National University

Re: problem with 2.4.0-test9-pre6 seems to be SHM

2000-09-25 Thread Christoph Rohland


safemode [EMAIL PROTECTED] writes:

 The sum of the Bytes used in the 4096 entries ipcs shows is WAY off from the
 bytes used in df if that's what you wanted to know. df shows 109K in
 use... and that's easily beaten by the first entry in ipcs
 
 ------ Shared Memory Segments --------
 key        shmid      owner      perms      bytes      nattch     status
 0x         32769      root       600        503808     2          dest
 0x0002     131074     root       600        196608     2
 0x0003     163843     root       600        655360     2
 0x         3997700    root       777        5240       1          dest
 0x         4030469    root       777        5060       1          dest
 0x         4063238    root       777        4700       1          dest
 
 
 this is the first 6 entries ...  i'm not sure what you're getting at with
 this though..

Just to give you some debugging help for the future: you can get the
attachees to a shm segment with shmfs using fuser(1):

[root /root]# ipcs -m
 
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x         32769      nobody     600        46084      11         dest
 
[root /root]# fuser -v /dev/shm/.IPC_8001
 
 USER       PID ACCESS COMMAND
/dev/shm/.IPC_8001
 root       883 m      httpd
 root       886 m      httpd
 root       887 m      httpd
 root       888 m      httpd
 root       889 m      httpd
 root       890 m      httpd
 root       891 m      httpd
 root       892 m      httpd
 root       893 m      httpd
 root       894 m      httpd
 root       895 m      httpd

The number in .IPC_ is the shmid in hex.
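
As a small sketch of that mapping, assuming shmfs is mounted on /dev/shm as in the example: the helper name is made up, and the unpadded hex form simply matches the path shown above (whether the real name is zero-padded is not confirmed here).

```c
#include <stdio.h>

/*
 * Build the shmfs file name for a segment: ".IPC_" followed by the
 * shmid in hex. Returns the snprintf() result (length of the full path).
 */
static int shm_path(char *buf, size_t len, int shmid)
{
    return snprintf(buf, len, "/dev/shm/.IPC_%x", (unsigned int)shmid);
}
```

With shmid 32769 (0x8001) this yields the path passed to fuser in the example.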

So if you are in doubt which program is to blame, you should have a
way to find it now.

Greetings
Christoph

Re: kernel 2.4.0-test8 lockup

2000-09-25 Thread Martin Costabel


Donn Washburn wrote:
 
 I would request a "cc" message.
 
 It seems that recently I have had either a memory problem and/or a possible
 kernel problem with this system.  It is an ASUS P5A, AMD K6-II/350,
 128Meg/IDE system.

Don't use test8! It is known for cannibalism (particularly for eating
mailboxes). 

My personal advice: Apply the test9-pre1 patch, but nothing later if you
are not into serious kernel debugging.

--
Martin

Re: [DOC] Debugging early kernel hangs

2000-09-25 Thread James Sutherland


On Mon, 25 Sep 2000, Russell King wrote:

 James Sutherland writes:
  On Sat, 23 Sep 2000, Russell King wrote:
   And I'll try to make the point a second time that everything does not have
   a character-based screen to write to.
  
  So what? For platforms which have a nice easy way to stick ASCII on
  screen, use this. For other platforms, find some other approach - if you
  have a nice easy serial port handy, try feeding the characters there.
 
 So what?  It shouldn't be called "VIDEO_CHAR" then - calling it that
 describes ONE implementation only, not what it is actually doing.
 Something more like DEBUGCH(char) or whatever is a better choice because
 it describes what the intention of it is, rather than the implementation.

Yes, a better name could be found; I'd go for DUMP_CHAR() myself, I think.
The basic concept is great, it just needs a new name...


James.


Re: 82559 driver bug

2000-09-25 Thread Andrey Savochkin


Greg,

On Sun, Sep 24, 2000 at 11:42:11PM -0700, Greg Zhang wrote:
 I need to update the MAC address on a Intel 82559 ethernet card.
 Tried:
 
 # ifconfig eth0 down
 # ifconfig eth0 hw ether0 xx:xx:xx:xx:xx:xx
 # ifconfig eth0 up
 
 It seems to take effect. Ping works. I have not had time to verify
 whether the MAC address is changed on the wire.
 
 When the machine was rebooted, the new MAC address was lost.
 This seems to be a bug in the 82559 driver. 82559 spec specifies
 how to manipulate its control and status register to write to the
 EEPROM that stores the MAC address. Before I write a program
 to do this, can someone confirm that this is a bug and it currently
 has no fix?

It's not a bug and shouldn't be fixed.
The address set by `ifconfig hw' is part of the run-time system configuration,
and should stay that way.
If you want to change the EEPROM, do it.
Take an EEPROM update utility (e.g. from http://scyld.com/) and write what you
want.

Best regards
Andrey V. Savochkin

Re: (reiserfs) Re: An elevator algorithm

2000-09-25 Thread Hans Reiser


Ragnar Kjørstad wrote:
 
 On Fri, Sep 22, 2000 at 03:23:26PM -0700, Hans Reiser wrote:
  I think Xuan's algorithm is good, so I want to add to it.:-)
 
  Ragnar, I don't understand your objection to it.  It is always the
  case that if you specify real
  time constraints that are impossible then they aren't met.
 
 My objection was that in the case where it is impossible to serve
 requests within the maximum latency, it would stop ordering the
 requests.
 
 With a FIFO queue, the throughput will be lower, and that will also
 give longer latency.
 
 --
 Ragnar

Ok, reasonable objection. :)

Hans

Re: problem with 2.4.0-test9-pre6 seems to be SHM

2000-09-25 Thread David Ford


Very correct except for one thing: allocation fails and ipcs -u shows
4097 when the limit shows 4096.  safemode reports that eventually the
kernel crashes.  This may be due to the test9 'features' and a side
effect, or it may be something to keep in mind once we get things nailed
down a bit.

-d

Christoph Rohland wrote:

 Here I am on the line again. But fortunately you found out yourself
 that it's not the kernel to blame...

 Greetings
 Christoph

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President





Re: kernel compiled with frame pointer

2000-09-25 Thread Sushil



On Sun, 24 Sep 2000, Robert Redelmeier wrote:

  I am trying to get the call trace of a process by tracing the return 
  addresses on the stack. To get the correct location of the return 
  address  I need to know whether the kernel is being compiled with 
  frame pointer because this will affect the offset of return address 
  on the stack. 
 
 Of course.  But when your kernel was compiling, did you notice the
 `gcc` options as the files flew by?  `-fomit-frame-pointer` is standard
 on i386 and perhaps other arch's. 

I agree. Sitting in front of the desktop I can see if the source files are
getting compiled with or without -fomit-frame-pointer. But, while writing
a function in a kernel source file, I want to know whether the caller of
this function was compiled with or without -fomit-frame-pointer because
this will affect the location of return address to it on the stack.

So, I assume that if CONFIG_FRAME_POINTER is defined then the kernel (and
hopefully the caller function also) is being compiled without
-fomit-frame-pointer and then look for the return address appropriately.
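
As an illustration of that assumption, here is a minimal userland sketch using the GCC builtins. The function name is hypothetical, CONFIG_FRAME_POINTER is used only as a compile-time switch, and the "one word above the saved frame pointer" layout is assumed for i386-style frames; this is not the LKCD code.

```c
#include <stddef.h>

/* noinline so the function has its own frame and a real return address. */
__attribute__((noinline))
static void *return_address_of_caller(void)
{
#ifdef CONFIG_FRAME_POINTER
    /* With a frame pointer, the saved return address sits one word
     * above the saved frame pointer (i386-style frame layout assumed). */
    void **frame = __builtin_frame_address(0);
    return frame[1];
#else
    /* Without one, let the compiler work out where the address lives. */
    return __builtin_return_address(0);
#endif
}
```

The first branch is only valid if this function really was built with a frame pointer, which is exactly the caller/callee consistency problem described here.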

This assumption is not strictly correct (see Keith's mail in this thread),
but it works in the case I am looking at (the function __dump_save_panic_regs
in arch/i386/kernel/vmdump.c from the LKCD patch) because there the
caller and the callee are part of one piece of code and either both or
neither is compiled with a frame pointer.


 But when you say "process", that sounds like userland.  Then it 
 would depend on whether you compiled with `-fomit-frame-pointer`
 or not.  

I am looking at crash dump utilities for Linux and in that context if the
kernel crashes then I am only interested in the kernel functions which the
process was executing at the time of the crash and not worried about the
user land call trace before the process entered the kernel. Therefore,
whether the user level program (which the process is executing) is
compiled with or without -fomit-frame-pointer is irrelevant in this case.


Regards,
Sushil.




Re: Given an image, how can show its config?

2000-09-25 Thread Daniel Phillips


Keith Owens wrote:
 
 On Sat, 23 Sep 2000 14:15:44 +0100 (BST),
 James Sutherland [EMAIL PROTECTED] wrote:
 How about putting these files in the modules directory? That way, we have
 a nice consistent location for them.
 
 Why do you think modutils 2.3.14 added a prune list of files to ignore
 in /lib/modules/`uname -r`?  The current (2.3.17) list from modprobe -c is
 
 # Prune
 prune modules.dep
 prune modules.pcimap
 prune System.map
 prune .config
 prune build
 prune vmlinux
 prune vmlinuz
 prune bzImage
 prune zImage
 
 The 2.5 Makefiles rewrite will create /lib/modules/`uname -r`, even if
 you do not use modules (Hi, Rusty ;) and install the kernel specific
 output files in it.  There will also be enough information saved in
 /lib/modules to allow external compilation of third party software
 without having to guess what the kernel compile options were, this
 includes module symbol version information.  This is all covered in
 
 ftp://ftp.kernel.org/pub/linux/kernel/projects/kbuild/makefile-wishlist-2.5-4.bz2.
 
 The Makefile rewrite is definitely a 2.5 project, it is too big a
 change for 2.4.  Whether we rename /lib/modules to /lib/kernel has not
 been decided yet.  BTW, any discussion about this rewrite should be on
 the linux-kbuild list, not linux-kernel (yet).  See 2.4.0-test9-pre6
 MAINTAINERS.

I'm slowly drifting towards enlightenment on this issue.  Let me try to
state this in simple terms I can understand: the tree descending from
the revision name in the modules directory will contain everything
needed to:

  - Boot and run a given kernel + modules
  - Reconfigure the same kernel, given the original source tree
  - Support symbolic crash dumps and debugging

And to satisfy these needs the following will be saved in that tree:

  - Kernel image (one or more of vmlinux, vmlinuz, etc.)
  - Module tree
  - Kernel configuration (.config)
  - Module dependencies
  - Kernel symbols (System.map)

This makes sense to me.  This arrangement keeps track of my .config and
System.map and gives me a nice mindless 'make install' that does it
all.  Gosh, we could even put a README in it.  The next obvious thing to
do is move the whole tree to the /boot directory, leaving just a
symbolic link in /lib/modules.

I'll stop promoting the idea of tacking a portion of this tree onto the
bzImage.  This can wait, and if I want it, it would be better to tack on
the whole tree anyway, filtered for the parts I don't need.  This would
give a nice, linear file that I can just cat onto any boot device or
feed to lilo in the usual way.  It also suggests a way of loading
modules using a stub filesystem that knows only about the bzImage.  The
bottom line is I can stop panicking about this issue and panic about
something else instead :-)

--
Daniel

PATCH 2.2.18.9: Backport /proc/pci from 2.4.x to 2.2.x

2000-09-25 Thread Andrzej Krzysztofowicz


 The 2.4.x kernel series obtains its /proc/pci device name data from a
 data file pci.ids.  The file makes PCI device name generic enough that
 it may be used by multiple utilities -- the kernel, Martin Mares'
 pciutils, distro installers, etc.  The attached patch, against kernel
 2.2.18-pre9, backports the 2.4.x /proc/pci facilities and device name
 database.

BTW, what do you think of the idea of making the pci.ids database modular?
I mean replacing direct lookups in the pci.ids database with queued requests
(plus, eventually, request_module(pci_ids) to process the queue if possible).

The module should process the queue while loading.

I see two advantages of this solution:
- it makes it possible to use Vendor/Device info when booting from floppy
  (kernel size limitations)
- it is useful for hot-pluggable PCI devices...

What do you think of it?

-- 
===
  Andrzej M. Krzysztofowicz   [EMAIL PROTECTED]
  phone (48)(58) 347 14 61
Faculty of Applied Phys. & Math.,   Technical University of Gdansk

Re: 2.4 kernels do not boot on UX (Alpha)

2000-09-25 Thread Jeff Garzik


On Sun, 24 Sep 2000, Richard Henderson wrote:
 The PCI setup widgetry is known to be broken for pci-pci bridges.
 
 I've been intending to rewrite all this, but keep finding something
 more interesting to do -- like clean the cat box.  If it makes you
 feel any better, I have an AS4100 that can't boot 2.4 at the moment
 either for the same reason.

To give us a knowledge jump start... what is broken?  The latest test9
pre-patches include some bridge cleanup..

Jeff





Re: [patch] 2.4.0-test8: Alpha RTC clean-ups

2000-09-25 Thread Maciej W. Rozycki


On Fri, 22 Sep 2000, Jan-Benedict Glaw wrote:

 Instead of having hard-coded values, we should maybe do something
 more variable like:
  if (year >= (20 + YEARS_SINCE_2000) && year < (48 + YEARS_SINCE_2000))
   ...

 This looks reasonable.

 YEARS_SINCE_2000 could be define'd through (menu;x;...)config...

 I don't think it's really needed.  We have 20 years before epoch gets
misdetected.  Just keeping the macro up to date for Linux releases should
be enough (and if someone insists on running a given kernel for such a
long time, they may modify sources accordingly themselves).

 This applies to other platforms using different epoch values as
 well, of course...

 Alpha appears to be the only one.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 Not sure if this is the right moment for those changes though, I'm not
 worried about ext2 but about the other non-networked fses that nobody
 uses regularly.

it *is* the right moment to clean these issues up. These kinds of things
are what made the 2.2 VM a mess (everybody added his easy improvements,
without solving some of the conceptual problems), and frankly, instead of
yet another elevator algorithm we need a squeaky clean VM balancer above
all. Please help identify, fix, debug and test these VM
balancing issues. This is tough work and it needs to be done.

Ingo


Kernel Oops with bonding

2000-09-25 Thread pb


Hi all

I'm getting a kernel oops if I try networking with bonding. I'm working with
2.2.16-smp and the bonding.c etc. included with it. Everything starts up, but
as soon as a packet is sent (ping) I get the following error:

Unable to handle kernel NULL pointer dereference at virtual address
current-tss.cr3 = 1f949000, %cr3  = 1f949000
pde* = 
Oops: 
CPU:1
EIP:0010:[e0051343]
EFLAGS: 00010246
.
.
.
Segmentation fault


I read about others having this problem but couldn't find a patch or
something... any help appreciated.. thanx

phibo



VIA UDMA / Kernel 2.2.17

2000-09-25 Thread Jens Luedicke


Hi there ...

I have the 2.2.17 Kernel with the VIA
Chipset Support. My BIOS says that my HD
(Samsung) is in UDMA Mode 4. A friend of
mine told me that I can increase my
disk performance a little if I use DMA.

hdparm -d 1 /dev/hda

But I will get the following errors whenever
I run hdparm -tT /dev/hda :

Sep 25 10:06:15 cello kernel: hdc: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
Sep 25 10:06:15 cello kernel: hdc: drive_cmd: error=0x04
Sep 25 10:06:42 cello kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep 25 10:06:42 cello kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Sep 25 10:06:42 cello kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep 25 10:06:42 cello kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Sep 25 10:06:52 cello kernel: hda: lost interrupt
Sep 25 10:06:52 cello kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep 25 10:06:52 cello kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Sep 25 10:07:02 cello kernel: hda: lost interrupt
Sep 25 10:07:02 cello kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep 25 10:07:02 cello kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Sep 25 10:07:02 cello kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep 25 10:07:02 cello kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Sep 25 10:07:02 cello kernel: hda: DMA disabled

What's wrong there?



with friendly regards
jens luedicke [EMAIL PROTECTED]

Support the Theory of Evolution;
400 Billion Amphibians can't be wrong!

Q: What is the difference between Texas and yogurt?
A: Yogurt has culture.



Re: PATCH 2.2.18.9: Backport /proc/pci from 2.4.x to 2.2.x

2000-09-25 Thread Keith Owens


On Mon, 25 Sep 2000 11:07:58 +0200 (CEST), 
Andrzej Krzysztofowicz [EMAIL PROTECTED] wrote:
BTW, what do you think of idea making the pci.ids base modular ?
The module while loading should process the queue.

Does the modules.pcimap file created by recent modutils do what you
want?  It maps PCI vendor and device codes to the module that supports
them.


[SysRq PATCH]: no more reboots because console freeze

2000-09-25 Thread uaca



Hi all

I made a very small patch to the SysRq facility that signals
a program with SIGUSR1, the program is registered via sysctl 

The signal is launched with Alt+SysRq+X (X stands for eXecute program)

/proc/sys/kernel/sysrq_progid contains pid and start_time 
which totally identifies the program to signal

One of the uses of this is to restore the console when an X server
or an SVGAlib program crashes.

How many times have you had to reboot because of that, while your
system may have been doing something important? This hopefully solves the problem.

Patch is included below, sample programs/scripts are available at

http://pusa.uv.es/~ulisses/sysrq_X.tar.gz

Comments/suggestions wanted, please e-mail me directly to [EMAIL PROTECTED]
(I'm not subscribed to the linux-kernel mailing list)

Bye!

Ulisses

-
Debian GNU/Linux: a dream come true
-

diff -u -r linux-2.4.0-test8/drivers/char/sysrq.c linux-2.4.0-test8-modificat2/drivers/char/sysrq.c
--- linux-2.4.0-test8/drivers/char/sysrq.c  Tue Aug  1 04:36:10 2000
+++ linux-2.4.0-test8-modificat2/drivers/char/sysrq.c   Mon Sep 25 08:58:23 2000
@@ -6,6 +6,8 @@
  *
  * (c) 1997 Martin Mares [EMAIL PROTECTED]
  * based on ideas by Pavel Machek [EMAIL PROTECTED]
+ *
+ * (c) 2000 Ulisses Alonso Camaró eXecute_program extension
  */
 
 #include linux/config.h
@@ -22,6 +24,7 @@
 #include linux/quotaops.h
 #include linux/smp_lock.h
 #include linux/module.h
+#include linux/kernel.h
 
 #include asm/ptrace.h
 
@@ -30,9 +33,13 @@
 extern int console_loglevel;
 extern struct list_head super_blocks;
 
+void signal_program(void);
+
 /* Whether we react on sysrq keys or just ignore them */
 int sysrq_enabled = 1;
 
+unsigned long sysrq_progid[2]= {0, 0}; /* pid and start_time */
+
 /* Machine specific power off function */
 void (*sysrq_power_off)(void) = NULL;
 
@@ -53,6 +60,35 @@
 	}
 }
 
+void signal_program(void)
+{
+	struct task_struct *p;
+	pid_t pid= (pid_t)sysrq_progid[0];
+
+	read_lock(&tasklist_lock);
+
+	if ((p= find_task_by_pid(pid)) == NULL)
+		goto error;
+
+	if (!p->mm || p->pid == 1)
+		goto error;
+
+	if (p->start_time != sysrq_progid[1])
+		goto error;
+
+	send_sig(SIGUSR1, p, 0);
+
+	read_unlock(&tasklist_lock);
+
+	return;
+
+ error:
+	read_unlock(&tasklist_lock);	/* the lock is still held on the error path */
+	printk(KERN_ERR "SysRq: Could not send signal to pid %d with start_time %lu\n",
+		pid, sysrq_progid[1]);
+	return;
+}
+
 /*
  * This function is called by the keyboard handler when SysRq is pressed
  * and any other keycode arrives.
@@ -138,6 +174,10 @@
 		send_sig_all(SIGKILL, 1);
 		orig_log_level = 8;
 		break;
+	case 'x':
+		printk("eXecute program\n");
+		signal_program();
+		break;
 	default:	/* Unknown: help */
 		if (kbd)
 			printk("unRaw ");
@@ -148,7 +188,7 @@
 			printk("Boot ");
 		if (sysrq_power_off)
 			printk("Off ");
-		printk("Sync Unmount showPc showTasks showMem loglevel0-8 tErm kIll killalL\n");
+		printk("Sync Unmount showPc showTasks showMem loglevel0-8 tErm kIll killalL eXec_program\n");
 		/* Don't use 'A' as it's handled specially on the Sparc */
 	}
 
diff -u -r linux-2.4.0-test8/include/linux/sysctl.h linux-2.4.0-test8-modificat2/include/linux/sysctl.h
--- linux-2.4.0-test8/include/linux/sysctl.h	Thu Aug 10 22:01:26 2000
+++ linux-2.4.0-test8-modificat2/include/linux/sysctl.h	Sun Sep 24 12:39:30 2000
@@ -113,6 +113,7 @@
 	KERN_OVERFLOWGID=47,	/* int: overflow GID */
 	KERN_SHMPATH=48,	/* string: path to shm fs */
 	KERN_HOTPLUG=49,	/* string: path to hotplug policy agent */
+	KERN_SYSRQ_PROGID=50	/* string: pid and start_time of the program to signal */
 };
 
 
diff -u -r linux-2.4.0-test8/kernel/sysctl.c linux-2.4.0-test8-modificat2/kernel/sysctl.c
--- linux-2.4.0-test8/kernel/sysctl.c	Tue Aug  1 04:36:11 2000
+++ linux-2.4.0-test8-modificat2/kernel/sysctl.c	Mon Sep 25 08:59:24 2000
@@ -83,6 +83,10 @@
 extern int acct_parm[];
 #endif
 
+#ifdef CONFIG_MAGIC_SYSRQ
+extern unsigned long sysrq_progid[];
+#endif
+
 extern int pgt_cache_water[];
 
 static int parse_table(int *, int, void *, size_t *, void *, size_t,
@@ -220,6 +224,10 @@
 #ifdef CONFIG_MAGIC_SYSRQ
 	{KERN_SYSRQ, "sysrq", &sysrq_enabled, sizeof (int),
 	 0644, NULL, &proc_dointvec},
+	{KERN_SYSRQ_PROGID, "sysrq_progid", sysrq_progid, 2*sizeof(unsigned long),
+	0644, NULL, proc_doulongvec_minmax, NULL,
+	(void *)NULL, (void *)NULL},
+/*

the new VM

2000-09-25 Thread Ingo Molnar



i'd also like to share my experiences with recent kernels, compared to the
'old VM'. I frequently run high VM load multi-gigabyte systems with a lot
of IRQ-side allocations as well, and it's surprising how sensitive these
systems' performance is to VM balance, despite gobs of RAM.

- The biggest difference under high allocation load is that the CPU usage
of kswapd and the synchronous VM balancing code has decreased
significantly. Under previous kernels it was not uncommon to see sudden
spikes in kswapd usage, and to see significant CPU time spent in
shrink_mmap() & friends. I suspect that this is because the new VM does
much less 'guessing' and blind list-walking.

- i'm also happy that __alloc_pages() now 'guarantees' allocation. This i
believe could simplify unrelated kernel code significantly. Eg. no need to
check for NULL pointers on most allocations, a GFP_KERNEL allocation
always succeeds, end of story. This behavior also has the 'nice'
side-effect of showing memory imbalance rather forcefully: the system
locks up ;-) A GFP_ATOMIC allocation obviously still has the potential to
fail, and must be handled properly.

all in all, the new VM balancing code looks really promising, despite all
the growing pains.

Ingo


Re: more testing on 2.4.0-t9p[456] VM deadlocks

2000-09-25 Thread Martin Diehl




On Mon, 25 Sep 2000, Martin Diehl wrote:

 PS: vmfixes-2.4.0-test9-B2 not yet tested - will do later.

Hi - done now:

using 2.4.0-t9p6 + vmfixes-2.4.0-test9-B2 I ended up with the box
deadlocked again! Was "make bzImage" on UP booted with mem=8M.
After about 4 hours at load 2-3 and almost continuously paging the box
is apparently locked up. SysRq+t still shows several processes including
kswapd being scheduled "current" (one after the other of course).

Mem-Info (retyped from SysRq+m):
Active: 847 / inactive dirty: 67 / inactive clean: 0 / free: 64
2x16 + 1x32 + 1x64 + 1x128 = 256kB
Swap cache: add 3353996, delete 3353209, find 2300336/9605753
Free swap: 496144kB
2048 pages of RAM
0 pages of HIGHMEM
490 reserved pages
74 pages shared
787 pages cached
0 pages in page table cache
Buffer memory: 236kB

No change on this at all, despite the scheduling activity still observed.
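
The figures in the Mem-Info dump can be cross-checked with a little
arithmetic. This is a sketch only, assuming the usual 4 kB page size on
i386; the helper names are made up for illustration. The free list
"2x16 + 1x32 + 1x64 + 1x128" sums to 256 kB, which is the same as the
64 free pages reported, and 2048 pages of RAM is the 8 MB given with
mem=8M.

```c
#include <stddef.h>

#define PAGE_KB 4UL	/* assumed i386 page size in kB */

/* Convert a page count from the Mem-Info dump into kB. */
unsigned long pages_to_kb(unsigned long pages)
{
    return pages * PAGE_KB;
}

/* Sum a free-list report given as parallel "count x block size in kB" arrays. */
unsigned long free_list_kb(const unsigned long *count,
                           const unsigned long *block_kb, size_t n)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += count[i] * block_kb[i];
    return total;
}
```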

I've looked up several EIP-values (given by SysRq+p) vs. System.map to get
an idea what is still going on. The functions I've recorded (this # often):

page_launder(10)
try_to_free_buffer(5)
deactivate_page_nolock(4)
refill_inactive_scan(3)
nr_free_pages(1)
wakeup_kswapd(1)
__wake_up(1)
kmem_cache_reap(1)
sys_fstatfs(1)
sys_statfs(1)

The results of this very rudimentary "profiling the deadlock" are far from
statistical significance of course. The only ordering rule implied in this
list is the number of occurrences - i.e., I don't see any pattern or call
chain there.

Finally, SysRq+e solved the problem: hanging processes term'ed, VM
deadlock released, box seems to be as useable as after it was booted.

Comments?

Regards
Martin


Re: [patch] 2.4.0-test8: Alpha RTC clean-ups

2000-09-25 Thread Jan-Benedict Glaw


On Mon, Sep 25, 2000 at 11:35:35AM +0200, Maciej W. Rozycki wrote:
 On Fri, 22 Sep 2000, Jan-Benedict Glaw wrote:
 
  Instead of having hard-coded values, we should maybe do something
  more variable like:
   if (year >= (20 + YEARS_SINCE_2000) && year < (48 + YEARS_SINCE_2000))
  ...
 
  This looks reasonable.
 
  This applies to other platforms using different epoch values as
  well, of course...
 
  Alpha appears to be the only one.

./drivers/char/rtc.c:rtc_init()
#if defined(__alpha__) || defined(__mips__)
[...]

MIPS does that as well _in the wrong way_ compared to rtc.c:
./arch/mips/dev/time.c:time_init()
/*
 * The DECstation RTC is used as a TOY (Time Of Year).
 * The PROM will reset the year to either '70, '71 or '72.
 * This hack will only work until Dec 31 2001.
 */
year += 1928;
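
A small sketch of what the fixed offset does, assuming the RTC hands back
a two-digit year as the comment describes; the function names are
hypothetical. An RTC year of 72 becomes 2000, and since 1972 and 2000 are
both leap years the chip's own leap-year handling stays aligned with the
real calendar.

```c
#define DECSTATION_YEAR_OFFSET 1928	/* the constant from the excerpt above */

/* Map the two-digit year kept in the RTC to a calendar year. */
int decstation_year(int rtc_year)
{
    return rtc_year + DECSTATION_YEAR_OFFSET;
}

/* Standard Gregorian leap-year rule. */
int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}
```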

MfG, JBG



-- 
Fehler eingestehen, Größe zeigen: Nehmt die Rechtschreibreform zurück!!!
/* Jan-Benedict Glaw [EMAIL PROTECTED] -- +49-177-5601720 */
keyID=0x8399E1BB fingerprint=250D 3BCF 7127 0D8C A444 A961 1DBD 5E75 8399 E1BB
 "insmod vi.o and there we go..." (Alexander Viro on linux-kernel)


Re: PATCH 2.2.18.9: Backport /proc/pci from 2.4.x to 2.2.x

2000-09-25 Thread Jeff Garzik


On Mon, 25 Sep 2000, Andrzej Krzysztofowicz wrote:
 BTW, what do you think of idea making the pci.ids base modular ?
 I mean replacing data requests from pci.ids base by their queuing requests 
 (+ eventually request_module(pci_ids) to process the queue if possible )
 
 The module while loading should process the queue.
 
 I see two advantages of this solution:
 - make it possible to use Vendor/Device info when booting from floppy
   (kernel size limitations)
 - useful for hot-plugable PCI devices...

I'm not sure I understand what you are describing here...

pci.ids is just a vendor/device id -> device name map.  It shouldn't
affect functionality at all... just whether or not you know the names of
your devices.

Jeff





Re: how interesting are data-bss patches?

2000-09-25 Thread Keith Owens


On Sun, 24 Sep 2000 14:28:07 -0500 (CDT), 
Peter Samuelson [EMAIL PROTECTED] wrote:
>[Tigran Aivazian [EMAIL PROTECTED]]
>> The question you ask can be answered trivially - yes, it is
>> definitely a good idea, please make such a patch.
>
>My expression doesn't catch *all* offenders, by any means.  For
>example, things like
>
>  char *foo[MAX_BLURFL] = { NULL, };
>
>would be much harder to pick up.

Got bored, wrote some Perl.  On a 2.4.0-test8 system with a fairly
minimal config, most stuff in modules, it found 14,420 bytes of vmlinux
.data where the entire variable was initialized to 0.  Obviously this
depends on your config and will only catch data for options that have
been selected.  Anybody want to try this on a kernel with the lot?

Biggest offenders

0xc02ce7e0(64)    empty_iops.505
0xc02ce820(64)    empty_fops.506
0xc02c8c40(128)   last_irq_sums.664
0xc02c8cc0(128)   alert_counter.665
0xc02c9120(128)   vm86_irqs
0xc02ca340(128)   apic_timer_irqs
0xc02dc0c0(128)   inet_protos
0xc02dc240(128)   tcp_listening_hash
0xc02dff20(128)   inet6_protos
0xc02c9920(256)   command_line
0xc02ca080(256)   tsc_values
0xc02cee60(512)   proc_alloc_map
0xc02c9680(672)   e820
0xc02d0c60(1024)  floppy_blocksizes
0xc02d4f80(1024)  raw_device_bindings
0xc02d5380(1024)  raw_device_inuse
0xc02d5780(1024)  raw_device_sector_size
0xc02d5b80(1024)  raw_device_sector_bits
0xc02cd160(2048)  chrdevs
0xc02cdc40(2048)  blkdevs

--- cut here

#!/usr/bin/perl -w
# List symbols in .data section which are initialized to zero.
# Needs readelf command from recent binutils.

use strict;
die($0 . " takes exactly one argument, vmlinux or a module to be explored\n")
	if($#ARGV);

my ($data, $section, $start, $size, $addr, $len, $i, $total);
my @f;
my @symdata;
my @keys;
my %symbol;

$data = `readelf -S $ARGV[0] | fgrep ' .data '`;
chomp($data);
die("$0 could not find .data section in $ARGV[0]\n") if($data eq "");
print("Data section ", $data, "\n");
$data =~ s/[][]//g;
(@f) = split(' ', $data);
$section = $f[0];
$start = hex("0x" . $f[3]);
$size = hex("0x" . $f[5]);
$symbol{$start} = ".data__start";
$symbol{$start+$size} = ".data__end";

$data = "";
@symdata = `readelf -s -x $section $ARGV[0]`;
foreach (@symdata) {
	chomp();
	if (/^ *[0-9]+:/) {
		(@f) = split();
		$symbol{hex("0x" . $f[1])} = $f[7] if ($f[6] eq "$section");
	}
	elsif (/^ *0x/) {
		s/^ +//;
		$addr = hex(substr($_, 0, 10));
		($_ = substr($_, 11, 35)) =~ s/ //g;
		$data .= scalar reverse($_);	# assumes little endian
	}
}

@keys = sort(keys(%symbol));
$total = 0;
for ($i = 0; $i < $#keys-1; ++$i) {
	$addr = $keys[$i];
	$len = $keys[$i+1] - $addr;
	if (substr($data, ($addr-$start)*2, $len*2) =~ /^0+$/) {
		printf("0x%x(%d)\t%s\n", $addr, $len, $symbol{$addr});
		$total += $len;
	}
}

printf("Total %d (0x%x)\n", $total, $total);
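
The heart of the script is one test per symbol: is every byte of its
initial value zero? A minimal C sketch of that test follows, for readers
who would rather run it over a raw section dump than over readelf's hex
output. The function name is made up for illustration.

```c
#include <stddef.h>

/*
 * Return 1 if all len bytes at buf are zero, 0 otherwise.
 * An empty range (len == 0) is reported as all zero.
 */
int is_all_zero(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```

A symbol that passes this test over its whole length carries no
information in .data and is a candidate for dropping its explicit
initializer.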


Re: kernel compiled with frame pointer

2000-09-25 Thread Robert Redelmeier


Sushil wrote:
 
 I agree. Sitting in the front of desktop I can see if the source files are
 getting compiled with or without -fomit-frame-pointer. But, while writing
 a function in a kernel source file, I want to know whether the caller of
 this function was compiled with or without -fomit-frame-pointer because
 this will affect the location of return address to it on the stack.
 
 So, I assume that if CONFIG_FRAME_POINTER is defined then the kernel (and
 hopefully the caller function also) is being compiled without
 -fomit-frame-pointer and then look for the return address appropriately.

Ah -- I see, you are looking at some sort of kernel debugger.  Well,
then one way  would be to look at entry and exit points.  i386 Frame
pointers are set up with  `pushl %ebp / movl %esp, %ebp / subl $local, %esp`
or sometimes with `enter` [not by gcc AFAIK].  Exit points are similarly
`movl %ebp, %esp / popl %ebp / ret`.  Some versions of gcc do generate
`leave / ret`.  

You could look for these byte signatures.  Should be quite reliable with 
a good System.map.
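
As a rough sketch of the signature check, assuming the common i386
encodings (55 for `pushl %ebp`, 89 e5 for `movl %esp, %ebp`, c9 for
`leave`, 89 ec for `movl %ebp, %esp`, 5d for `popl %ebp`, c3 for `ret`).
The function names are hypothetical, and an assembler may legally pick
other encodings for the same instructions.

```c
#include <stddef.h>

/* Does the function start with "pushl %ebp ; movl %esp,%ebp" (55 89 e5)? */
int has_frame_prologue(const unsigned char *code, size_t len)
{
    return len >= 3 && code[0] == 0x55 && code[1] == 0x89 && code[2] == 0xe5;
}

/*
 * Does the function end with "leave ; ret" (c9 c3) or with
 * "movl %ebp,%esp ; popl %ebp ; ret" (89 ec 5d c3)?
 */
int has_frame_epilogue(const unsigned char *code, size_t len)
{
    if (len >= 2 && code[len - 2] == 0xc9 && code[len - 1] == 0xc3)
        return 1;
    return len >= 4 && code[len - 4] == 0x89 && code[len - 3] == 0xec &&
           code[len - 2] == 0x5d && code[len - 1] == 0xc3;
}
```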
 
-- Robert

Re: kernel compiled with frame pointer

2000-09-25 Thread Keith Owens


On Mon, 25 Sep 2000 06:21:48 -0500, 
Robert Redelmeier [EMAIL PROTECTED] wrote:
>Ah -- I see, you are looking at some sort of kernel debugger.  Well,
>then one way  would be to look at entry and exit points.  i386 Frame
>pointers are set up with  `pushl %ebp / movl %esp, %ebp / subl $local, %esp`
>or sometimes with `enter` [not by gcc AFAIK].  Exit points are similarly
>`movl %ebp, %esp / popl %ebp / ret`.  Some versions of gcc do generate
>`leave / ret`.  
>
>You could look for these byte signatures.  Should be quite reliable with 
>a good System.map.

Until you go to gcc 2.96 when the prologue code changes dramatically.
Interleaved instructions, plus "nice" constructs like

void foo(int bar)
{
	if (!bar)
		return;

	return;
}

Could generate the test before doing anything on stack.

foo:	cmpl  $0,4(%esp)
	je    1f
	pushl %ebp
	movl  %esp,%ebp
	...
	movl  %ebp,%esp
	popl  %ebp
1:	ret


Re: 82559 driver bug

2000-09-25 Thread Alan Cox


 When the machine was rebooted, the new MAC address was lost.
 This seems to be a bug in the 82559 driver. 82559 spec specifies

The kernel address overrides never do permanent changes



Re: PATCH 2.2.18.9: Backport /proc/pci from 2.4.x to 2.2.x

2000-09-25 Thread Andrzej Krzysztofowicz


 
 On Mon, 25 Sep 2000 11:07:58 +0200 (CEST), 
 Andrzej Krzysztofowicz [EMAIL PROTECTED] wrote:
 BTW, what do you think of idea making the pci.ids base modular ?
 The module while loading should process the queue.
 
 Does the modules.pcimap file created by recent modutils do what you
 want?  It maps PCI vendor and device codes to the module that supports
 them.

I'm afraid not. Which module should be responsible for the VGA, PCI-PCI bridge,
PCI-ISA bridge, IDE and all the other features compiled into the kernel?
I mean updating the data not at boot time (when it is not always necessary),
but later.

It would probably also be an alternative solution to modules.pcimap.

-- 
===
  Andrzej M. Krzysztofowicz   [EMAIL PROTECTED]
  phone (48)(58) 347 14 61
Faculty of Applied Phys. & Math.,   Technical University of Gdansk

Re: more testing on 2.4.0-t9p[456] VM deadlocks

2000-09-25 Thread Marcelo Tosatti



On Mon, 25 Sep 2000, Martin Diehl wrote:

 
 
 On Mon, 25 Sep 2000, Martin Diehl wrote:
 
  PS: vmfixes-2.4.0-test9-B2 not yet tested - will do later.
 
 Hi - done now:
 
 using 2.4.0-t9p6 + vmfixes-2.4.0-test9-B2 I ended up with the box
 deadlocked again! Was "make bzImage" on UP booted with mem=8M.

There is a known deadlock with Ingo's patch.

I'm attaching a patch which should fix it. (on top of
vmfixes-2.4.0-test9-B2) 


diff -Nur --exclude-from=exclude linux.orig/fs/dcache.c linux/fs/dcache.c
--- linux.orig/fs/dcache.c	Mon Sep 25 08:40:47 2000
+++ linux/fs/dcache.c	Mon Sep 25 08:40:53 2000
@@ -556,15 +556,11 @@
 	int count = 0;
 	if (priority)
 		count = dentry_stat.nr_unused / priority;
-	prune_dcache(count);
-	/* FIXME: kmem_cache_shrink here should tell us
-	   the number of pages freed, and it should
-	   work in a __GFP_DMA/__GFP_HIGHMEM behaviour
-	   to free only the interesting pages in
-	   function of the needs of the current allocation. */
-	kmem_cache_shrink(dentry_cache);
 
-	return 0;
+	if(gfp_mask & __GFP_IO)
+		prune_dcache(count);
+
+	return kmem_cache_shrink(dentry_cache);
 }
 
 #define NAME_ALLOC_LEN(len)	((len+16) & ~15)
diff -Nur --exclude-from=exclude linux.orig/fs/inode.c linux/fs/inode.c
--- linux.orig/fs/inode.c	Mon Sep 25 08:40:47 2000
+++ linux/fs/inode.c	Mon Sep 25 08:40:53 2000
@@ -460,15 +460,11 @@
 
 	if (priority)
 		count = inodes_stat.nr_unused / priority;
-	prune_icache(count);
-	/* FIXME: kmem_cache_shrink here should tell us
-	   the number of pages freed, and it should
-	   work in a __GFP_DMA/__GFP_HIGHMEM behaviour
-	   to free only the interesting pages in
-	   function of the needs of the current allocation. */
-	kmem_cache_shrink(inode_cachep);
 
-	return 0;
+	if(gfp_mask & __GFP_IO)
+		prune_icache(count);
+
+	return kmem_cache_shrink(inode_cachep);
 }
 
 /*
diff -Nur --exclude-from=exclude linux.orig/mm/slab.c linux/mm/slab.c
--- linux.orig/mm/slab.c	Mon Sep 25 08:40:38 2000
+++ linux/mm/slab.c	Mon Sep 25 08:40:53 2000
@@ -887,7 +887,7 @@
 static int __kmem_cache_shrink(kmem_cache_t *cachep)
 {
 	slab_t *slabp;
-	int ret;
+	int ret, freed = 0;
 
 	drain_cpu_caches(cachep);
 
@@ -912,8 +912,11 @@
 		spin_unlock_irq(&cachep->spinlock);
 		kmem_slab_destroy(cachep, slabp);
 		spin_lock_irq(&cachep->spinlock);
+
+		freed++;
 	}
-	ret = !list_empty(&cachep->slabs);
+
+	ret = ((1 << cachep->gfporder) * freed);
 	spin_unlock_irq(&cachep->spinlock);
 	return ret;
 }
@@ -923,7 +926,8 @@
  * @cachep: The cache to shrink.
  *
  * Releases as many slabs as possible for a cache.
- * To help debugging, a zero exit status indicates all slabs were released.
+ *
+ * Returns the amount of freed pages.
  */
 int kmem_cache_shrink(kmem_cache_t *cachep)
 {
@@ -962,7 +966,9 @@
 	list_del(&cachep->next);
 	up(&cache_chain_sem);
 
-	if (__kmem_cache_shrink(cachep)) {
+	__kmem_cache_shrink(cachep);
+
+	if (!list_empty(&cachep->slabs)) {
 		printk(KERN_ERR "kmem_cache_destroy: Can't free all objects %p\n",
 		       cachep);
 		down(&cache_chain_sem);
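
The new return value of __kmem_cache_shrink() is a page count: each slab
spans 2^gfporder pages, so the number of destroyed slabs is scaled by
that factor. A standalone sketch of the arithmetic follows; the function
name is hypothetical and not part of the patch.

```c
/*
 * Pages released when slabs_freed slabs of a cache are destroyed,
 * where every slab covers (1 << gfporder) contiguous pages.
 */
unsigned long slab_pages_freed(unsigned int gfporder, unsigned long slabs_freed)
{
    return (1UL << gfporder) * slabs_freed;
}
```

For an order-0 cache the result equals the number of slabs freed; an
order-2 cache releases four pages per slab.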

Re: PATCH 2.2.18.9: Backport /proc/pci from 2.4.x to 2.2.x

2000-09-25 Thread Jeff Garzik


On Mon, 25 Sep 2000, Andrzej Krzysztofowicz wrote:
 I mean moving the __init database compiled into kernel (based on pci.ids) to
 a separate module, which would be responsible for on-demand updating of text
 information (i.e. replacing VID:DID numbers with text).

In early 2.3.x, the fbdev subsystem added "modedb", a feature which
provides a standard video mode database for all framebuffer drivers.
This is also __init code, because after boot, video mode information can
be provided from userspace (via 'fbset', in fbdev's case).

I see your suggestion in the same way...  If we keep the PCI device name
data around after boot, then we have a lot of kernel memory locked up
on the off chance that a HotPlug PCI device might appear for which we
need a name.

I would much prefer a userspace solution for naming unnamed PCI devices
after boot...

Jeff





Re: PATCH 2.2.18.9: Backport /proc/pci from 2.4.x to 2.2.x

2000-09-25 Thread Dan Hollis


On Mon, 25 Sep 2000, Jeff Garzik wrote:
 I see your suggestion in the same way...  If we keep the PCI device name
 data around after boot, then we have a lot of kernel memory locked up
 on the off chance that a HotPlug PCI device might appear for which we
 need a name.
 I would much prefer a userspace solution for naming unnamed PCI devices
 after boot...

How about the kernel calling lspci?

-Dan


Re: [patch] 2.4.0-test8: Alpha RTC clean-ups

2000-09-25 Thread Maciej W. Rozycki


On Mon, 25 Sep 2000, Jan-Benedict Glaw wrote:

 ./drivers/char/rtc.c:rtc_init()
 #if defined(__alpha__) || defined(__mips__)
 [...]

 That is wrong.  I fixed this partially in the MIPS/Linux CVS tree a few
weeks ago.  The __mips__ conditional is to be completely removed. 

 MIPS does that as well _in the wrong way_ compared to rtc.c:
 ./arch/mips/dev/time.c:time_init()
 /*
  * The DECstation RTC is used as a TOY (Time Of Year).
  * The PROM will reset the year to either '70, '71 or '72.
  * This hack will only work until Dec 31 2001.
  */
 year += 1928;

 We already handle this differently for the DECstation -- no need to check
the year from the RTC apart from handling leap years.  The real year has
to be stored elsewhere.  This is platform specific and other MIPS systems
are unaffected so the only check needed is whether we are on a DECstation
or not.  As we don't have a unified MIPS kernel, this can be accomplished
at the compile time.

 These changes will get into the official 2.4 kernel once a merge is
performed.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+


some sound-related oops'es

2000-09-25 Thread Pierfrancesco Caci



I get some oops whenever I try to insmod sb

here are some of them, in the hope that someone can track down the problem




Unable to handle kernel paging request at virtual address ca8fc1a0
ca88a49d
*pde = 07f8a063
Oops: 
CPU:0
EIP:0010:[usbcore:__insmod_usbcore_S.bss_L96+166557/5043409]
EFLAGS: 00010286
eax: ca8fc1a0   ebx: c1f5ec60   ecx:    edx: 
esi: c704d920   edi:    ebp:    esp: c7025f00
ds: 0018   es: 0018   ss: 0018
Process aumix (pid: 939, stackpage=c7025000)
Stack:  c704d920 c36c4540 c128c640  c704d920 c36c4540 c128c640
   c7024000 0001 c74d2ac0 c6009005 c36c4540 c012b062 c36c4540 c704d920
   c704d920 c36c4540  c012a371 c36c4540 c704d920  c6009000
Call Trace: [chrdev_open+62/76] [dentry_open+189/328] [filp_open+82/92] 
[sys_open+56/180] [system_call+51/56]
Code: 8b 10 85 d2 74 1d 52 e8 bf d1 88 f5 83 c4 04 85 c0 74 14 8b
Using defaults from ksymoops -t elf32-i386 -a i386

Code;   Before first symbol
 _EIP:
Code;   Before first symbol
   0:   8b 10 mov(%eax),%edx
Code;  0002 Before first symbol
   2:   85 d2 test   %edx,%edx
Code;  0004 Before first symbol
   4:   74 1d je 23 _EIP+0x23 0023 Before first symbol
Code;  0006 Before first symbol
   6:   52push   %edx
Code;  0007 Before first symbol
   7:   e8 bf d1 88 f5call   f588d1cb _EIP+0xf588d1cb f588d1cb 
END_OF_CODE+2afb034c/
Code;  000c Before first symbol
   c:   83 c4 04  add$0x4,%esp
Code;  000f Before first symbol
   f:   85 c0 test   %eax,%eax
Code;  0011 Before first symbol
  11:   74 14 je 27 _EIP+0x27 0027 Before first symbol
Code;  0013 Before first symbol
  13:   8b 00 mov(%eax),%eax


---


Unable to handle kernel paging request at virtual address ca8fc1a0
ca88a49d
*pde = 07f8a063
Oops: 
CPU:0
EIP:0010:[usbcore:__insmod_usbcore_S.bss_L96+166557/5043409]
EFLAGS: 00010286
eax: ca8fc1a0   ebx: c0558440   ecx: 0003   edx: 0003
esi: c2abc3e0   edi: 0003   ebp: 0003   esp: c2ff9f00
ds: 0018   es: 0018   ss: 0018
Process esd (pid: 1781, stackpage=c2ff9000)
Stack:  c2abc3e0 c3d209e0 c128c640  c2abc3e0 c3d209e0 c128c640 
   72616863 6a616d2d 312d726f 0034 0287 c012b062 c3d209e0 c2abc3e0 
   c2abc3e0 c3d209e0  c012a371 c3d209e0 c2abc3e0  c45b2000 
Call Trace: [chrdev_open+62/76] [dentry_open+189/328] [filp_open+82/92] 
[sys_open+56/180] [system_call+51/56] 
Code: 8b 10 85 d2 74 1d 52 e8 bf d1 88 f5 83 c4 04 85 c0 74 14 8b 


Code;   Before first symbol
 _EIP:
Code;   Before first symbol
   0:   8b 10 mov(%eax),%edx
Code;  0002 Before first symbol
   2:   85 d2 test   %edx,%edx
Code;  0004 Before first symbol
   4:   74 1d je 23 _EIP+0x23 0023 Before first symbol
Code;  0006 Before first symbol
   6:   52push   %edx
Code;  0007 Before first symbol
   7:   e8 bf d1 88 f5call   f588d1cb _EIP+0xf588d1cb f588d1cb 
END_OF_CODE+2afb034c/
Code;  000c Before first symbol
   c:   83 c4 04  add$0x4,%esp
Code;  000f Before first symbol
   f:   85 c0 test   %eax,%eax
Code;  0011 Before first symbol
  11:   74 14 je 27 _EIP+0x27 0027 Before first symbol
Code;  0013 Before first symbol
  13:   8b 00 mov(%eax),%eax


-


This is what appeared in the logs right before the 2nd oops

Sep 25 14:08:35 penny kernel: Soundblaster audio driver Copyright (C) by Hannu Savolainen 1993-1996
Sep 25 14:08:35 penny kernel: sb: No ISAPnP cards found, trying standard ones...
Sep 25 14:08:35 penny kernel: SB 4.13 detected OK (220)
Sep 25 14:08:35 penny kernel: Sound Blaster 16 (4.13) at 0x220 irq 5 dma 1,5
Sep 25 14:08:35 penny kernel: Sound Blaster 16 at 0x330 irq 5 dma 0,0
Sep 25 14:08:35 penny kernel: sb: I/O region in use.
Sep 25 14:08:35 penny kernel: Sound: Hmm, DMA1 was left allocated - fixed
Sep 25 14:08:35 penny kernel: Sound: Hmm, DMA5 was left allocated - fixed
Sep 25 14:08:35 penny insmod: /lib/modules/2.4.0-test8/kernel/drivers/sound/sb.o: init_module: No such device
Sep 25 14:08:35 penny insmod: Hint: insmod errors can be caused by incorrect module parameters, including invalid IO or IRQ parameters
Sep 25 14:08:35 penny insmod: /lib/modules/2.4.0-test8/kernel/drivers/sound/sb.o: insmod char-major-14 failed
Sep 25 14:08:35 penny kernel: Unable to handle kernel paging request at virtual address ca8fc1a0

Re: test9pre6 usb-storage

2000-09-25 Thread John Levon


On Sun, 24 Sep 2000, Matthew Dharm wrote:

 I'm the usb-storage maintainer.  Yes, I realize that there is really no
 need to reset the state to TASK_RUNNING, but I felt better having those
 there.  Considering that code is from the reset routines which almost never
 get called, I figured it was fine.
 
 Matt
 

OK, that's reasonable, but my concern is that these things tend to
propagate. If people see this code, they will assume it is *necessary* to
do such a thing. Unfortunately this is the only real way much development
can get done, by UTSL ...

thanks
john 


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 12:13:08PM +0200, Ingo Molnar wrote:
 
 On Mon, 25 Sep 2000, Andrea Arcangeli wrote:
 
  Not sure if this is the right moment for those changes though, I'm not
  worried about ext2 but about the other non-networked fses that nobody
  uses regularly.
 
 it *is* the right moment to clean these issues up. These kinds of things

I'm talking about the removal of the superblock lock from the filesystems.

Note: I don't have a problem with the removal of the superblock lock even if
done at this stage. I'm not the one who chooses those things; it's Linus's
responsibility to take the final decision for the official tree. But don't ask
me to test patches that remove the superblock lock _at_this_stage_ before I
can run a stable and fast 2.4.x, because I won't do that. Period.

 yet another elevator algorithm we need a squeaky clean VM balancer above

FYI: My current tree (based on 2.4.0-test8-pre5) delivers 16mbyte/sec in the
tiobench write test compared to clean 2.4.0-test8-pre5 that delivers 8mbyte/sec
instead with only blkdev layer changes in between the two kernels (and no
that's not a matter of the elevator since there are no seeks in the test
and I've not changed the elevator sorting algorithm during the bench).

Also, I found the reason for your hang: it's the TASK_EXCLUSIVE in
wait_for_request. The high part of the queue is reserved for reads.
Now if a read completes and it wakes up a write, you'll hang.

If you think I should delay those fixes to do something else I don't agree
sorry. 

 all. Please help identifying, fixing, debugging and testing these VM
 balancing issues. This is tough work and it needs to be done.

I had an alternative VM, that I prefer from a design standpoint, I'll improve
it and I'll maintain it.

Andrea

Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 12:42:09PM +0200, Ingo Molnar wrote:
 believe could simplify unrelated kernel code significantly. Eg. no need to
 check for NULL pointers on most allocations, a GFP_KERNEL allocation
 always succeeds, end of story. This behavior also has the 'nice'

Sorry, I totally disagree. If GFP_KERNEL allocations are guaranteed to succeed,
that is a showstopper bug. We also have another showstopper bug in getblk that
will be hard to fix because people got used to relying on it and wrote
deadlock-prone code.

You should know that people who are not running benchmarks, and who use the
machine's power for simulations, run out of memory all the time. If you put this
kind of obvious deadlock into the main kernel allocator you'll screw up the hard
work that has been done so far to fix all the other deadlock problems during
OOM. Please fix raid1 instead of making things worse.

Andrea

Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 Sorry I totally disagree. If GFP_KERNEL are guaranteed to succeed
 that is a showstopper bug. [...]

why?

 machine power for simulations runs out of memory all the time. If you
 put this kind of obvious deadlock into the main kernel allocator

FYI, i havent put it there.

Ingo


Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 Please fix raid1 instead of making things worse.

huh, what do you mean?

Ingo


Re: [PATCH] 2.4.0 i386 watchpoint problems [NEW PATCH]

2000-09-25 Thread James Cownie


Here is a patch to arch/i386/traps.c and arch/i386/signal.c which does
what you are suggesting, I believe.

I have tested this and it works fine for me. (Though I do also need
the patch which stores dr6 back into current->thread.debugreg[6]. That
is not included here since I submitted it separately and assume it is
uncontentious). All user generated watchpoints are noted, as are ones
triggered from the kernel by system calls overwriting the watched
data.

A couple more points :-

1) I restore the debug control register at the point where a signal is
   about to be delivered, rather than at the top of the loop as you
   suggested. I think that's safe (and potentially less work if we go
   around the loop more than once).

2) Rather than zapping the eip in the watchpoint trap to -1 if the
   trap occurs in the kernel, I make it the user eip from the thread.
   That makes a debugger point at the right place, i.e. the system
   call inside which the watchpoint triggered.

-- Jim 

James Cownie[EMAIL PROTECTED]
Etnus, LLC. +44 117 9071438
http://www.etnus.com

jcownie@pc2: diff -u signal.c-test7 signal.c
--- signal.c-test7  Mon Sep 25 11:52:35 2000
+++ signal.c	Mon Sep 25 11:52:38 2000
@@ -600,6 +600,7 @@
 
 	for (;;) {
 		unsigned long signr;
+		unsigned long thread_dr7;
 
 		spin_lock_irq(&current->sigmask_lock);
 		signr = dequeue_signal(&current->blocked, &info);
@@ -689,6 +690,16 @@
 				/* NOTREACHED */
 			}
 		}
+
+		/* Reenable any watchpoints before delivering the
+		 * signal to user space. The processor register will
+		 * have been cleared if the watchpoint triggered
+		 * inside the kernel.
+		 */
+		thread_dr7 = current->thread.debugreg[7];
+		__asm__("movl %0,%%db7"
+			: /* no output */
+			: "r" (thread_dr7));
 
 		/* Whee!  Actually deliver the signal.  */
 		handle_signal(signr, ka, &info, oldset, regs);
jcownie@pc2: diff -c traps.c-2.4.0-test7 traps.c
*** traps.c-2.4.0-test7 Sat Aug  5 00:15:38 2000
--- traps.c Mon Sep 25 13:42:43 2000
***************
*** 491,507 ****
  }
  
  /*
!  * Careful - we must not do a lock-kernel until we have checked that the
!  * debug fault happened in user mode. Getting debug exceptions while
!  * in the kernel has to be handled without locking, to avoid deadlocks..
   *
   * Being careful here means that we don't have to be as careful in a
   * lot of more complicated places (task switching can be a bit lazy
   * about restoring all the debug state, and ptrace doesn't have to
   * find every occurrence of the TF bit that could be saved away even
!  * by user code - and we don't have to be careful about what values
!  * can be written to the debug registers because there are no really
!  * bad cases).
   */
  asmlinkage void do_debug(struct pt_regs * regs, long error_code)
  {
--- 491,516 ----
  }
  
  /*
!  * Our handling of the processor debug registers is non-trivial.
!  * We do not clear them on entry and exit from the kernel. Therefore
!  * it is possible to get a watchpoint trap here from inside the kernel.
!  * However, the code in ./ptrace.c has ensured that the user can
!  * only set watchpoints on userspace addresses. Therefore the in-kernel
!  * watchpoint trap can only occur in code which is reading/writing
!  * from user space. Such code must not hold kernel locks (since it
!  * can equally take a page fault), therefore it is safe to call
!  * force_sig_info even though that claims and releases locks.
!  * 
!  * Code in ./signal.c ensures that the debug control register
!  * is restored before we deliver any signal, and therefore that
!  * user code runs with the correct debug control register even though
!  * we clear it here.
   *
   * Being careful here means that we don't have to be as careful in a
   * lot of more complicated places (task switching can be a bit lazy
   * about restoring all the debug state, and ptrace doesn't have to
   * find every occurrence of the TF bit that could be saved away even
!  * by user code)
   */
  asmlinkage void do_debug(struct pt_regs * regs, long error_code)
  {
***************
*** 535,562 ****
  		goto clear_TF;
  	}
  
- 	/* If this is a kernel mode trap, we need to reset db7 to allow us to continue sanely */
- 	if ((regs->xcs & 3) == 0)
- 		goto clear_dr7;
- 
  	/* Ok, finally something we can handle */
  	tsk->thread.trap_no = 1;
  	tsk->thread.error_code = error_code;
  	info.si_signo = SIGTRAP;
  	info.si_errno = 0;
  	info.si_code = TRAP_BRKPT;
! 	info.si_addr = (void *)regs->eip;
  	force_sig_info(SIGTRAP, &info, tsk);
- 	return;
- 
- debug_vm86:
- 	handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code, 1);
-

Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 03:02:58PM +0200, Ingo Molnar wrote:
 
 On Mon, 25 Sep 2000, Andrea Arcangeli wrote:
 
  Sorry I totally disagree. If GFP_KERNEL allocations are guaranteed to succeed
  that is a showstopper bug. [...]
 
 why?

Because as you said the machine can lockup when you run out of memory.

 FYI, i havent put it there.

Ok.

Andrea

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

  yet another elevator algorithm we need a squeaky clean VM balancer above
 
 FYI: My current tree (based on 2.4.0-test8-pre5) delivers 16mbyte/sec
 in the tiobench write test compared to clean 2.4.0-test8-pre5 that
 delivers 8mbyte/sec

great! I'm happy we have a fine-tuned elevator again.

 Also I found the reason for your hang: it's the TASK_EXCLUSIVE in
 wait_for_request. The high part of the queue is reserved for reads.
 Now if a read completes and it wakes up a write you'll hang.

yep. But i dont understand why this makes any difference - the waitqueue
wakeup is FIFO, so any other request will eventually arrive. Could you
explain this bug a bit better?

 If you think I should delay those fixes to do something else I don't
 agree sorry.

no, i never meant it. I find it very good that those half-done changes are
cleaned up and the remaining bugs / performance problems are eliminated -
the first reports about bad write performance came right after the
original elevator patches went in, about 6 months ago.

Ingo


Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

   Sorry I totally disagree. If GFP_KERNEL allocations are guaranteed to succeed
   that is a showstopper bug. [...]
  
  why?
 
 Because as you said the machine can lockup when you run out of memory.

well, i think all kernel-space allocations have to be limited carefully,
refusing subsequent allocations is not a solution to over-allocation,
especially in a multi-user environment.

Ingo


Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 03:04:10PM +0200, Ingo Molnar wrote:
 
 On Mon, 25 Sep 2000, Andrea Arcangeli wrote:
 
  Please fix raid1 instead of making things worse.
 
 huh, what do you mean?

I mean this:

while (!( /* FIXME: now we are rather fault tolerant than nice */
mirror_bh[i] = kmalloc (sizeof (struct buffer_head), GFP_KERNEL)
) )

I've seen that in the 2.4.0-test9-pre6 raid1 code the above is gone (this looks
very promising :), it is at least proof that some care about the deadlock has
been taken) and you instead sleep on a waitqueue now. While it's not obvious at
all that sleeping on the waitqueue is not deadlock prone (for example getblk
sleeps on a waitqueue but it's deadlock prone too), at least it's not an
infinite loop anymore and that's still better.

Is it safe to sleep on the waitqueue in the kmalloc fail path in raid1?

Andrea

Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

  huh, what do you mean?
 
 I mean this:
 
   while (!( /* FIXME: now we are rather fault tolerant than nice */

this is fixed in 2.4. The 2.2 RAID code is frozen, and has known
limitations (ie. due to the above RAID1 cannot be used as a swap-device).

Ingo


Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 Is it safe to sleep on the waitqueue in the kmalloc fail path in
 raid1?

yes. every RAID1-bh has a bound lifetime. (bound by worst-case IO
latencies)

Ingo


Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 03:12:58PM +0200, Ingo Molnar wrote:
 well, i think all kernel-space allocations have to be limited carefully,

When a machine without a gigabit ethernet runs oom it's userspace that
allocated the memory via page faults not the kernel.

And if the careful limit avoids the deadlock in the layer above alloc_pages,
then it will also keep alloc_pages from returning NULL and you won't need an
infinite loop in the first place (unless the memory balancing is buggy).

GFP should return NULL only if the machine is out of memory. The kernel can be
written in a way that never deadlocks when the machine is out of memory just
by checking the GFP retval. I don't think any in-kernel resource limit is
necessary to have things reliable and fast. Most dynamic big caches and kernel
data can be shrunk dynamically during memory pressure (perhaps except skbs,
and I agree that for skbs on gigabit ethernet the thing is a little different).
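
The two error-handling styles being argued over in this thread can be sketched in userspace C. This is an illustration only: `fake_alloc`, `pages_left`, `build_checked` and `build_retrying` are hypothetical stand-ins invented for the sketch, not kernel code and not anyone's patch. The retry count cap exists only so the sketch terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

static int pages_left;   /* hypothetical pool of free pages */

/* Stand-in allocator: returns NULL once the pool is exhausted. */
static void *fake_alloc(void)
{
    static char page[1];

    if (pages_left <= 0)
        return NULL;
    pages_left--;
    return page;
}

/* Style 1: check the return value and propagate the failure upward. */
static int build_checked(int npages)
{
    assert(npages >= 0);
    for (int i = 0; i < npages; i++)
        if (fake_alloc() == NULL)
            return -ENOMEM;   /* the caller decides what to do next */
    return 0;
}

/* Style 2: retry until the allocation succeeds. If nothing ever frees
 * memory this would spin forever, so the sketch caps the retries and
 * returns how many allocations are still outstanding. */
static int build_retrying(int npages, int max_spins)
{
    int spins = 0;

    assert(npages >= 0 && max_spins >= 0);
    while (npages > 0 && spins < max_spins) {
        if (fake_alloc() != NULL)
            npages--;
        else
            spins++;
    }
    return npages;
}
```

With an empty pool, `build_checked` fails immediately with -ENOMEM, while `build_retrying` only stops because of the artificial cap.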

Andrea

Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 03:21:01PM +0200, Ingo Molnar wrote:
 yes. every RAID1-bh has a bound lifetime. (bound by worst-case IO
 latencies)

Very good! Many thanks Ingo.

Andrea

Interrupt sharing

2000-09-25 Thread Mahadev K Cholachagudda


Hello to all,

I have one question, as follows.


Suppose two drivers, driver1 and driver2, each install an ISR for a
particular interrupt, say UART0.
After some time the interrupt is generated. At this moment, which driver's
ISR is going to execute?

If driver1's ISR gets executed, will driver2's ISR execute as well?
If driver2's ISR does execute, will the data that generated the
interrupt also be passed on to driver2's ISR?

please help,

any help is welcome,

with regards,
Mahadev





Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 And if the careful limit avoids the deadlock in the layer above
 alloc_pages, then it will also keep alloc_pages from returning NULL and
 you won't need an infinite loop in the first place (unless the memory
 balancing is buggy).

yes i like this property very much because it unearths VM balancing bugs,
which plagued us for so long and are so hard to detect. But statistically
it's also possible that try_to_free_pages() frees a page and alloc_pages()
done on another CPU (or in IRQ context) 'steals' the page. This can
happen, because the VM right now guarantees no straight path from
deallocator to allocator. (and it's not necessary to guarantee it, given
the varying nature of allocation requests.)

 GFP should return NULL only if the machine is out of memory. The
 kernel can be written in a way that never deadlocks when the machine
 is out of memory just by checking the GFP retval. I don't think any
 in-kernel resource limit is necessary to have things reliable and
 fast. [...]

Andrea, if you really mean this then you should not be let near the VM
balancing code :-)

 Most dynamic big caches and kernel data can be shrunk dynamically
 during memory pressure (perhaps except skbs, and I agree that for skbs
 on gigabit ethernet the thing is a little different).

a big 'except'. You dont need gigabit for that; on the contrary, if the
network is slow it's easier to overallocate within the kernel. Ask Alan
about how many D.O.S. attacks are possible without implicit or
explicit bean counting.

Ingo


Re: Interrupt sharing

2000-09-25 Thread Jeff Garzik




On Mon, 25 Sep 2000, Mahadev K Cholachagudda wrote:

 Hello to all,
 
 I have one question, as follows.
 
 
 Suppose two drivers, driver1 and driver2, each install an ISR for a
 particular interrupt, say UART0.
 After some time the interrupt is generated. At this moment, which driver's
 ISR is going to execute?
 
 If driver1's ISR gets executed, will driver2's ISR execute as well?
 If driver2's ISR does execute, will the data that generated the
 interrupt also be passed on to driver2's ISR?

When an interrupt is delivered, the kernel calls ALL interrupt handlers
registered for that interrupt.  That means all drivers capable of
sharing interrupts should, ideally, have code in their interrupt handler
to exit ASAP if no work is necessary.

status = RTL_R16(IntrStatus);
/* exit ASAP if no interrupt conditions (0), or
 * if the hardware was unplugged (0xFFFF)
 */
if ((status == 0) || (status == 0xFFFF))
	return;
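
A minimal userspace sketch of the dispatch described above, not kernel code: every handler registered on a shared line is called, and each one checks its own device's status before doing any work. The names `fake_dev`, `fake_isr` and `dispatch_irq` are hypothetical, and the all-ones test assumes a 16-bit status register that reads back as all ones when the device is absent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: a 16-bit status register plus a work counter. */
struct fake_dev {
    uint16_t status;   /* 0 = nothing pending, 0xFFFF = device gone */
    int handled;       /* how many times the ISR did real work */
};

/* ISR in the style shown above: leave early if there is nothing to do. */
static void fake_isr(struct fake_dev *dev)
{
    assert(dev != NULL);
    if (dev->status == 0 || dev->status == 0xFFFF)
        return;
    dev->handled++;
    dev->status = 0;   /* acknowledge the device */
}

/* Stand-in for the kernel side: call ALL handlers sharing the line. */
static void dispatch_irq(struct fake_dev *devs[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        fake_isr(devs[i]);
}
```

Calling `dispatch_irq` on three devices where only one has a non-zero status runs all three handlers, but only that one device's `handled` counter changes.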


Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

  yes. every RAID1-bh has a bound lifetime. (bound by worst-case IO
  latencies)
 
 Very good! Many thanks Ingo.

this was actually coded/fixed by Neil Brown - so the kudos go to him!

Ingo


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 03:10:51PM +0200, Ingo Molnar wrote:
 yep. But i dont understand why this makes any difference - the waitqueue

It makes a difference because your sleeping reads won't get the wakeup
even though they could queue their reserved read request (they have
to wait for the FIFO to roll or for some write to complete).

 wakeup is FIFO, so any other request will eventually arrive. Could you
 explain this bug a bit better?

Well it may not explain an infinite hang because as you say the write that got
the spurious wakeup will unplug the queue and after some time the reads will be
woken up. So maybe that wasn't the reason for your hangs, because I remember your
problem looked more like an infinite hang that was only solved by kflushd
writing some more stuff and unplugging the queue as a side effect (however I'm
not sure since I never experienced those myself).

But I hope if it wasn't that one it's the below fix that will help:

Index: mm/filemap.c
===================================================================
RCS file: /home/andrea/cvs/linux/mm/filemap.c,v
retrieving revision 1.1.1.5.2.3
retrieving revision 1.1.1.5.2.4
diff -u -r1.1.1.5.2.3 -r1.1.1.5.2.4
--- mm/filemap.c	2000/09/21 03:11:53	1.1.1.5.2.3
+++ mm/filemap.c	2000/09/25 03:33:31	1.1.1.5.2.4
@@ -622,8 +622,8 @@
 
 	add_wait_queue(&page->wait, &wait);
 	do {
-		sync_page(page);
 		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+		sync_page(page);
 		if (!PageLocked(page))
 			break;
 		schedule();
Index: fs/buffer.c
===================================================================
RCS file: /home/andrea/cvs/linux/fs/buffer.c,v
retrieving revision 1.1.1.5.2.1
retrieving revision 1.1.1.5.2.2
diff -u -r1.1.1.5.2.1 -r1.1.1.5.2.2
--- fs/buffer.c	2000/09/06 19:57:51	1.1.1.5.2.1
+++ fs/buffer.c	2000/09/25 03:33:30	1.1.1.5.2.2
@@ -147,8 +147,8 @@
 	atomic_inc(&bh->b_count);
 	add_wait_queue(&bh->b_wait, &wait);
 	do {
-		run_task_queue(&tq_disk);
 		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+		run_task_queue(&tq_disk);
 		if (!buffer_locked(bh))
 			break;
 		schedule();


Think what happens if the buffer becomes locked again between
set_task_state(tsk, TASK_UNINTERRUPTIBLE) and if (!buffer_locked(bh)). The
window is very small but it looks like a genuine window for a deadlock. (and
this one could surely explain infinite hangs in read... even if it looks even
less realistic than the EXCLUSIVE task thing)

Andrea

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 - sync_page(page);
   set_task_state(tsk, TASK_UNINTERRUPTIBLE);
 + sync_page(page);

 - run_task_queue(&tq_disk);
   set_task_state(tsk, TASK_UNINTERRUPTIBLE);
 + run_task_queue(&tq_disk);

these look like genuine fixes, but i dont think they can explain the hangs
i had yesterday - those were simple VM deadlocks. I dont see any deadlocks
today - but i'm running the unsafe B2 variant of the vmfixes patch. (and i
have no swapping enabled which simplifies my VM setup.)

but one of these two fixes could explain the slowdown i saw on and off for
quite some time, seeing very bad read performance occasionally. (do you
remember my sched.c tq_disk hack?)

Ingo


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Stephen C. Tweedie


Hi,

On Mon, Sep 25, 2000 at 04:02:30AM +0200, Andrea Arcangeli wrote:
 On Sun, Sep 24, 2000 at 09:27:39PM -0400, Alexander Viro wrote:
  So help testing the patches to them. Arrgh...
 
 I think I'd better fix the bugs that I know about before testing patches that
 try to remove the superblock_lock at this stage.

Right.  If we're introducing new deadlock possibilities, then sure we
can fix the obvious cases in ext2, but it will be next to impossible
to do a thorough audit of all of the other filesystems.  Adding in the
new shrink_icache loop into the VFS just feels too dangerous right
now.

Of course, that doesn't mean we shouldn't remove the excessive
superblock locking from ext2 --- rather, it is simply more robust to
keep the two issues separate.

--Stephen

Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 Again: the bean counting and all the limit happens at the higher
 layer.  I shouldn't know anything about it when I play with the lower
 layer GFP memory balancing code.

exactly, and this is why if a higher level lets through a GFP_KERNEL, then
it *must* succeed. Otherwise either the higher level code is buggy, or the
VM balance is buggy, but we want to have clear signs of it.

Ingo


Re: Interrupt sharing

2000-09-25 Thread Richard B. Johnson


On Mon, 25 Sep 2000, Mahadev K Cholachagudda wrote:

 Hello to all,
 
 I have one question, as follows.
 
 
 Suppose two drivers, driver1 and driver2, each install an ISR for a
 particular interrupt, say UART0.
 After some time the interrupt is generated. At this moment, which driver's
 ISR is going to execute?
 
 If driver1's ISR gets executed, will driver2's ISR execute as well?
 If driver2's ISR does execute, will the data that generated the
 interrupt also be passed on to driver2's ISR?
 
 please help,
 
 any help is welcome,
 
 with regards,
 Mahadev

Interrupt sharing works only with level interrupts. Your choice of
a UART for an example is unfortunate because the IRQs that they use
(IRQ3 and IRQ4) are not normally configured for level triggering.

That said, if you have a device that shares interrupts, the driver
does not know and does not care that it is sharing an interrupt.
It does not care if, and only if, the driver's ISR is written properly.

A properly-written ISR does not muck with the interrupt controller. It
reads the status registers of the device(s) that it is supposed to
handle, does whatever is necessary to satisfy the device, then gets
to hell out as quickly as possible. Under Linux, getting to hell out
is a simple 'return' from the void ISR procedure.

When your driver returns to the kernel code that called it, the kernel
code calls the next ISR registered on the same interrupt level, whether
or not that ISR's device was the one that raised the interrupt.  This
means that every ISR that uses the same interrupt level (IRQ) may get
called when there is nothing to do.

This is why a properly written ISR will check its device status and
if there is nothing to do, it will not complain, it will just return.

As you can see shared interrupts have a little more overhead than
non-shared ones, however nothing is ever 'lost'. An interrupt that
occurs during the execution of an interrupt is 'remembered' by the
controller because the IRQ line will be set true and remain true
until the device requesting it is finally satisfied.

Cheers,
Dick Johnson

Penguin : Linux version 2.2.15 on an i686 machine (797.90 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.



Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Jens Axboe wrote:

 The changes made were never half-done. The recent bug fixes have
 mainly been to remove cruft from the earlier elevator and fixing a bug
 where the elevator insert would screw up a bit. So I'd call that fine
 tuning or adjusting, not fixing half-done stuff.

sorry i did not mean to offend you - unadjusted and unfixed stuff hanging
around in the kernel for months is 'half done' for me.

  the first reports about bad write performance came right after the
  original elevator patches went in, about 6 months ago.
 
 And a new elevator was introduced some months ago to solve this.

and these are still not solved in the vanilla kernel, as recent complaints
on l-k prove.

Ingo


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 03:57:31PM +0200, Ingo Molnar wrote:
 i had yesterday - those were simple VM deadlocks. I dont see any deadlocks

Definitely. They can't explain anything about the VM deadlocks. I was
_only_ talking about the blkdev hangs that caused you to unplug the
queue at each reschedule in tux and that Eric reported me for the SG
driver (and I very much hope that with EXCLUSIVE gone away and the
wait_on_* fixed those hangs will go away because I don't see anything else
wrong at this moment).

 but one of these two fixes could explain the slowdown i saw on and off for
 quite some time, seeing very bad read performance occasionally. (do you
 remember my sched.c tq_disk hack?)

Exactly, that's the only thing I was talking about in this subthread.

Andrea

Re: [Demo program]: Poor elevator performance in 2.4.0-test9pre6

2000-09-25 Thread Jens Axboe


On Mon, Sep 25 2000, Robert Cohen wrote:
 With kernel version 2.4.0-test9pre6 the results are as follows.
 The test machine has 128 Megs of memory. The tests accesses 240 Megs of
 files so that it can't fit in cache.
 
 If I run it with 8 files of size 30 Megs:
 
 [robert@test25 src]$ ./elv_test 8 30
 files created, 240 megs written at 8.96 megs/sec
 finished writing 240 megs written at 1.05 megs per sec
 finished reading, 240 megs read at 5.848833 megs/sec
 
 If I do the same with a single file of size 240 Megs
 
 [robert@test25 src]$ ./elv_test 1 240
 files created, 240 megs written at 11.12 megs/sec
 finished writing 240 megs written at 11.08 megs per sec
 finished reading, 240 megs read at 12.580521 megs/sec

axboe@burns:/opt/software/testing  ./elv_test 8 30
files created, 240 megs written at 21.64 megs/sec
finished writing 240 megs written at 21.12 megs per sec

This is my current tree on 2.4.0-test9-pre5. Thanks for the test program,
Andrea and I are working on getting a polished patch ready for inclusion
that (apparently) also fixes this problem.

-- 
* Jens Axboe [EMAIL PROTECTED]
* SuSE Labs

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 I was _only_ talking about the blkdev hangs [...]

i guess this was just miscommunication. It never 'hung', it just performed
reads with 20k/sec or so. (without any writes being done in the
background.) A 'hang' for me is a deadlock or lockup, not a slowdown.

 that caused you to unplug the queue at each reschedule in tux and that
 Eric reported me for the SG driver (and I very much hope that with
 EXCLUSIVE gone away and the wait_on_* fixed those hangs will go away
 because I don't see anything else wrong at this moment).

okay, i'll test this.

Ingo


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Jens Axboe


On Mon, Sep 25 2000, Ingo Molnar wrote:
  The changes made were never half-done. The recent bug fixes have
  mainly been to remove cruft from the earlier elevator and fixing a bug
  where the elevator insert would screw up a bit. So I'd call that fine
  tuning or adjusting, not fixing half-done stuff.
 
 sorry i did not mean to offend you - unadjusted and unfixed stuff hanging
 around in the kernel for months is 'half done' for me.

No offense taken, I just tried to explain my view. And in light of
the bad test2, I'd like the new changes to not have any "issues". So
this work has been going on for the last month or so, and I think we are
finally getting to agreement on what needs to be done now and how. WIP.

  And a new elevator was introduced some months ago to solve this.
 
 and these are still not solved in the vanilla kernel, as recent complaints
 on l-k prove.

Different problems, though :(. However, I believe they are solved in
Andrea's and my current tree. It just needs the final cleaning, more later.

-- 
* Jens Axboe [EMAIL PROTECTED]
* SuSE Labs

Re: refill_inactive()

2000-09-25 Thread Rik van Riel


On Sun, 24 Sep 2000, Ingo Molnar wrote:

 i'm wondering about the following piece of code in refill_inactive():
 
 	if (current->need_resched && (gfp_mask & __GFP_IO)) {
 		__set_current_state(TASK_RUNNING);
 		schedule();
 	}
 
 shouldnt this be __GFP_WAIT? It's true that __GFP_IO implies __GFP_WAIT
 (because IO cannot be done without potentially scheduling), so the code is
 not buggy, but the above 'yielding' of the CPU should be done in the
 GFP_BUFFER case as well. (which is __GFP_WAIT but not __GFP_IO)
 
 Objections?

1) if __GFP_WAIT isn't set, we cannot run try_to_free_pages at all

2) you are right, we /can/ schedule when __GFP_IO isn't set, this is
   a mistake ... now I'm getting confused about what __GFP_IO is all
   about, does anybody know the _exact_ meaning of __GFP_IO ?
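
The difference between the two tests in the quoted question can be checked with a small sketch. The bit values and the `DEMO_` names below are illustrative assumptions, not copied from any kernel header; only the relationships stated in the thread matter (the I/O flag implies the wait flag, and the buffer mask has the wait flag but not the I/O flag).

```c
#include <assert.h>

/* Illustrative bit values (assumed for this sketch only). */
#define DEMO_GFP_WAIT   0x01u
#define DEMO_GFP_IO     0x04u

/* Masks built to match the relationships stated in the thread. */
#define DEMO_GFP_ATOMIC 0x00u
#define DEMO_GFP_BUFFER (DEMO_GFP_WAIT)
#define DEMO_GFP_KERNEL (DEMO_GFP_WAIT | DEMO_GFP_IO)

/* Test as quoted: yield the CPU only when I/O is allowed. */
static int may_yield_io(unsigned int gfp_mask, int need_resched)
{
    assert(need_resched == 0 || need_resched == 1);
    return need_resched && (gfp_mask & DEMO_GFP_IO) != 0;
}

/* Proposed test: yield whenever the caller is allowed to sleep. */
static int may_yield_wait(unsigned int gfp_mask, int need_resched)
{
    assert(need_resched == 0 || need_resched == 1);
    return need_resched && (gfp_mask & DEMO_GFP_WAIT) != 0;
}
```

Under these assumed values the two tests agree for the kernel and atomic masks and differ only for the buffer mask, which is the case the question is about.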


regards,


Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 03:49:52PM +0200, Jens Axboe wrote:
 And a new elevator was introduced some months ago to solve this.

And now that I've done some benchmarks it seems the major optimization comes from
the implementation of the new _ordering_ algorithm in test2, not really from
the removal of the more fine-grained latency control (that said, I'm not going to
reintroduce the previous latency control; the current one doesn't provide great
latency but it's ok).

As soon as I patch my tree with Peter's perfect CSCAN ordering (that only changes
the ordering algorithm), tiotest performance drops significantly in the
2-thread-reading case. elvtune settings don't matter; it's purely a matter of
the ordering.
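
For readers unfamiliar with the term, textbook C-SCAN ordering can be sketched as below. This is not Peter's patch and not the kernel elevator; `cscan_order` and its arguments are hypothetical names chosen for illustration. Requests at or beyond the current head position are served in ascending order, then the sweep wraps around to the lowest sector.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for unsigned long sector numbers. */
static int cmp_ulong(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;

    return (x > y) - (x < y);
}

/* Reorder sectors[0..n) in place into C-SCAN service order for a head
 * currently positioned at sector `head`. */
static void cscan_order(unsigned long *sectors, size_t n, unsigned long head)
{
    assert(sectors != NULL || n == 0);
    if (n == 0)
        return;

    qsort(sectors, n, sizeof(*sectors), cmp_ulong);

    /* Index of the first request at or beyond the head position. */
    size_t split = 0;
    while (split < n && sectors[split] < head)
        split++;

    /* Rotate left by `split` so the upward sweep comes first. */
    unsigned long *tmp = malloc(n * sizeof(*tmp));
    if (tmp == NULL)
        return;   /* on allocation failure, leave plain ascending order */
    for (size_t i = 0; i < n; i++)
        tmp[i] = sectors[(split + i) % n];
    for (size_t i = 0; i < n; i++)
        sectors[i] = tmp[i];
    free(tmp);
}
```

With requests {90, 10, 50, 70, 30} and the head at sector 60, the resulting service order is 70, 90, 10, 30, 50.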

Andrea

Bonding again..

2000-09-25 Thread pb


thanx for the tip with 2.2.17, it really solved my problem. but now i'm
getting

SIOCSIFSLAVE: invalid argument.

errors when trying to ifenslave devices. i know that this may be the wrong
place for a discussion on bonding, but i can hardly find any help on this.
because it's quite urgent to me, any clue would help...  thanx

phibo



Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Jens Axboe


On Mon, Sep 25 2000, Andrea Arcangeli wrote:
  i had yesterday - those were simple VM deadlocks. I dont see any deadlocks
 
 Definitely. They can't explain anything about the VM deadlocks. I was
 _only_ talking about the blkdev hangs that caused you to unplug the
 queue at each reschedule in tux and that Eric reported me for the SG
 driver (and I very much hope that with EXCLUSIVE gone away and the
 wait_on_* fixed those hangs will go away because I don't see anything else
 wrong at this moment).

The sg problem was different. When sg queues a request, it invokes the
request_fn to handle it. But if the queue is currently plugged, the
scsi_request_fn will not do anything.
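
The behaviour described here can be modelled in a few lines of userspace C. The structure and function names (`demo_queue`, `demo_request_fn`, `demo_unplug`) are hypothetical and only mimic the plugged-queue idea; they are not the kernel's block layer API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request queue with a plug flag. */
struct demo_queue {
    int plugged;    /* non-zero while requests are being batched */
    int pending;    /* requests queued but not yet issued */
    int issued;     /* requests handed to the hardware */
};

/* Like a request function that refuses to run while the queue is plugged. */
static void demo_request_fn(struct demo_queue *q)
{
    assert(q != NULL);
    if (q->plugged)
        return;               /* the queued request just sits there */
    q->issued += q->pending;
    q->pending = 0;
}

/* Unplug first, then run the request function. */
static void demo_unplug(struct demo_queue *q)
{
    assert(q != NULL);
    q->plugged = 0;
    demo_request_fn(q);
}
```

Calling `demo_request_fn` directly on a plugged queue leaves the request pending, whereas `demo_unplug` issues it.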

-- 
* Jens Axboe [EMAIL PROTECTED]
* SuSE Labs

Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:04:14PM +0200, Ingo Molnar wrote:
 exactly, and this is why if a higher level lets through a GFP_KERNEL, then
 it *must* succeed. Otherwise either the higher level code is buggy, or the
 VM balance is buggy, but we want to have clear signs of it.

I'm not sure if we should restrict the limiting only to the cases that need
it. For example do_anonymous_page looks like a place that could rely on the
GFP retval.

Andrea

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Jens Axboe


On Mon, Sep 25 2000, Andrea Arcangeli wrote:
  And a new elevator was introduced some months ago to solve this.
 
 And now that I've done some benchmarks it seems the major optimization comes from
 the implementation of the new _ordering_ algorithm in test2, not really from
 the removal of the more fine-grained latency control (that said, I'm not going to
 reintroduce the previous latency control; the current one doesn't provide
 great latency but it's ok).

Yes, I found this the greatest improvement too.

 As soon as I patch my tree with Peter's perfect CSCAN ordering (that only changes
 the ordering algorithm), tiotest performance drops significantly in the
 2-thread-reading case. elvtune settings don't matter; it's purely a matter
 of the ordering.

Interesting. I haven't done any serious benching with the CSCAN introduction
in elevator_linus, I'll try that too.
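
For readers unfamiliar with the term, C-SCAN ordering services requests in
ascending sector order starting at the current head position, then wraps
around to the lowest outstanding sector. The following is only a minimal
userspace sketch of that ordering, not the kernel's elevator code; the names
cscan_order, cmp_sector and reverse_range are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: ascending sector numbers. */
static int cmp_sector(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* Reverse the half-open range a[lo..hi) in place. */
static void reverse_range(unsigned long *a, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        unsigned long tmp = a[lo];
        hi--;
        a[lo] = a[hi];
        a[hi] = tmp;
        lo++;
    }
}

/*
 * Reorder `sectors` in place into C-SCAN service order for a head at `head`:
 * first every request at or beyond the head (ascending), then the requests
 * below the head (also ascending).
 */
static void cscan_order(unsigned long *sectors, size_t n, unsigned long head)
{
    assert(sectors != NULL || n == 0);
    if (n == 0)
        return;

    qsort(sectors, n, sizeof(*sectors), cmp_sector);

    /* Index of the first request at or beyond the head position. */
    size_t split = 0;
    while (split < n && sectors[split] < head)
        split++;

    /* Rotate left by `split` using three reversals. */
    reverse_range(sectors, 0, split);
    reverse_range(sectors, split, n);
    reverse_range(sectors, 0, n);
}
```

With requests at sectors 50, 10, 90, 30, 70 and the head at 40, this yields
the order 50, 70, 90, 10, 30.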

-- 
* Jens Axboe [EMAIL PROTECTED]
* SuSE Labs

Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 I'm not sure we should restrict the limiting only to the cases that
 need it. For example, do_anonymous_page looks like a place that could
 rely on the GFP return value.

i think an application should not fail due to other applications
allocating too much RAM. OOM behavior should be a central thing and based
on allocation patterns, not pure luck or unluck. I always found it rude to
SIGBUS when some other application is abusing RAM but the oom detector has
not yet killed it off.

Ingo


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:08:38PM +0200, Jens Axboe wrote:
 The sg problem was different. When sg queues a request, it invokes the
 request_fn to handle it. But if the queue is currently plugged, the
 scsi_request_fn will not do anything.

That would explain it, yes. For correctness, shouldn't the calls below also be
converted from request_fn to generic_unplug_device? (That would also avoid
calling request_fn again spuriously, since the device is still in the
tq_disk queue even after the I/O generated by the request_fn below has completed.)

if (major >= COMPAQ_SMART2_MAJOR+0 && major <= COMPAQ_SMART2_MAJOR+7)
    (q->request_fn)(q);
if (major >= DAC960_MAJOR+0 && major <= DAC960_MAJOR+7)
    (q->request_fn)(q);
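
The difference between calling request_fn directly and going through an unplug
routine can be sketched with a toy model. This is only an illustration under
stated assumptions, not the block layer's code: struct toy_queue,
toy_request_fn and toy_unplug_device are hypothetical stand-ins, and locking
(io_request_lock) is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy request queue: only counts requests, no real I/O. */
struct toy_queue {
    bool plugged;     /* while set, the request function does nothing */
    int  pending;     /* requests queued but not yet dispatched */
    int  dispatched;  /* requests handed to the (imaginary) device */
};

/* Stand-in for a driver request function: a no-op on a plugged queue. */
static void toy_request_fn(struct toy_queue *q)
{
    assert(q != NULL);
    if (q->plugged)
        return;
    q->dispatched += q->pending;
    q->pending = 0;
}

/* Stand-in for an unplug routine: clear the plug, then run the queue. */
static void toy_unplug_device(struct toy_queue *q)
{
    assert(q != NULL);
    q->plugged = false;
    toy_request_fn(q);
}
```

Calling toy_request_fn on a plugged queue leaves the pending requests where
they are, which mirrors the sg symptom described above; toy_unplug_device
dispatches them and leaves the queue unplugged.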

Andrea

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

 driver (and I very much hope that with EXCLUSIVE gone away and the
 wait_on_* fixed those hangs will go away because I don't see anything else
 wrong at this moment).

the EXCLUSIVE thing only optimizes the wakeup, it's not semantic! How
is it better to let 100 processes race for one freed-up request slot?
There is no guarantee at all that the reader will win. If reads and writes
racing for request slots ever becomes a problem then we should introduce a
separate read and write waitqueue.

the EXCLUSIVE thing was noticed by Dimitris i think, and it makes tons of
(performance) sense.

Ingo


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Jens Axboe


On Mon, Sep 25 2000, Andrea Arcangeli wrote:
  The sg problem was different. When sg queues a request, it invokes the
  request_fn to handle it. But if the queue is currently plugged, the
  scsi_request_fn will not do anything.
 
 That will explain it, yes. In the same way for correctness also those should
 be converted from request_fn to generic_unplug_device, right? (this

Yes, that would be the right fix. However, then we also need some
way of inserting requests in the queue and let it plug when appropriate.
The scsi layer currently "manually" does a list_add on the queue itself,
which doesn't look too healthy.

 will also avoid to recall spurious request_fn because the device is still
 in the tq_disk queue even when the I/O generated by the below request_fn
 completed)
 
   if (major >= COMPAQ_SMART2_MAJOR+0 && major <= COMPAQ_SMART2_MAJOR+7)
       (q->request_fn)(q);
   if (major >= DAC960_MAJOR+0 && major <= DAC960_MAJOR+7)
       (q->request_fn)(q);

AFAIR, Eric tried to talk to the Compaq folks (and Leonard too, I dunno)
about why they want this. What came of it, I don't know.

-- 
* Jens Axboe [EMAIL PROTECTED]
* SuSE Labs

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:11:34PM +0200, Jens Axboe wrote:
 Interesting. I haven't done any serious benching with the CSCAN introduction
 in elevator_linus, I'll try that too.

Changing only that, the performance decreased reproducibly from 16 to 14
MB/s in the read test with 2 threads.

So far I'm testing only IDE, with LVM striping across two equally fast disks on
separate IDE channels.

Andrea

Re: the new VM

2000-09-25 Thread Marcelo Tosatti



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

snip

 I talked with Alexey about this and it seems the best way is to have a
 per-socket reservation of clean cache in function of the receive window.  So we
 don't need an huge atomic pool but we can have a special lru with an irq
 spinlock that is able to shrink cache from irq as well.

In the current 2.4 VM code, there is a kernel thread called
"kreclaimd".

This thread keeps freeing pages from the inactive clean list when needed
(when zone->free_pages < zone->pages_low), making them available for
atomic allocations.

Do you consider pages_low pages as a "huge atomic pool" ? 


Re: refill_inactive()

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Rik van Riel wrote:

 2) you are right, we /can/ schedule when __GFP_IO isn't set, this is
mistake ... now I'm getting confused about what __GFP_IO is all
about, does anybody know the _exact_ meaning of __GFP_IO ?

__GFP_IO set to 1 means that the page allocator can afford to do IO
implicitly. Most allocations don't care at all whether swap IO is
started as part of gfp() or not. But a prominent counter-example is
GFP_BUFFER, which is used by the buffer-cache/fs layer, and which cannot
do any IO implicitly (because it *is* the IO layer already, and it is
already trying to do IO). The other reason is legacy lowlevel-filesystem
locks like the ext2fs lock, which cannot be taken recursively.
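
The distinction can be sketched as a bitmask check. This is a toy model only:
the TOY_ names and bit values below are assumptions chosen for illustration
and are not copied from the kernel headers. It shows a mask that may sleep
but may not start IO (the GFP_BUFFER-like case) next to one that may do both.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the values are illustrative, not the kernel's. */
#define TOY_GFP_WAIT 0x01u  /* caller may sleep (schedule) */
#define TOY_GFP_HIGH 0x02u  /* caller may dip into reserves */
#define TOY_GFP_IO   0x04u  /* allocator may start IO on the caller's behalf */
#define TOY_GFP_ALL  (TOY_GFP_WAIT | TOY_GFP_HIGH | TOY_GFP_IO)

/* Hypothetical composite masks modelled on the description above. */
#define TOY_GFP_ATOMIC (TOY_GFP_HIGH)
#define TOY_GFP_BUFFER (TOY_GFP_HIGH | TOY_GFP_WAIT)
#define TOY_GFP_KERNEL (TOY_GFP_HIGH | TOY_GFP_WAIT | TOY_GFP_IO)

/* May the allocator start IO (e.g. swap-out) to satisfy this request? */
static bool toy_may_start_io(unsigned int gfp_mask)
{
    assert((gfp_mask & ~TOY_GFP_ALL) == 0);  /* reject undefined bits */
    return (gfp_mask & TOY_GFP_IO) != 0;
}

/* May the allocator sleep while satisfying this request? */
static bool toy_may_sleep(unsigned int gfp_mask)
{
    assert((gfp_mask & ~TOY_GFP_ALL) == 0);
    return (gfp_mask & TOY_GFP_WAIT) != 0;
}
```

In this model a TOY_GFP_BUFFER request may sleep but never triggers IO, which
is the case Rik was asking about: scheduling is allowed even when the IO bit
is clear.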

Ingo


Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:27:24PM +0200, Ingo Molnar wrote:
 i think an application should not fail due to other applications
 allocating too much RAM. OOM behavior should be a central thing and based

At least Linus's point is that doing perfect accounting (at least on the
userspace allocation side) may cause you to waste resources, failing even if
you could still run, and I tend to agree with him. We're lazy on that
side and that's a global win in most cases.

We are fine-grained at page granularity, not at mmap granularity.

The point is that not all the mmapped regions are going to be paged in. Think
of a program that only touches all the memory it requested with malloc after
an hour of calculations. Before the hour passes, the unused memory
can still be used for other things, and that's what the user also expects
when he runs `free`.

Andrea

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:29:42PM +0200, Ingo Molnar wrote:
 There is no guarantee at all that the reader will win. If reads and writes
 racing for request slots ever becomes a problem then we should introduce a
 separate read and write waitqueue.

I agree. However, here I also have a per-queue limit on in-flight locked data
(otherwise, with 512k requests on SCSI I could lock 128 MB of RAM in a few
seconds, and I don't want to decrease the size of the queue because it has
to be large for aggressive reordering when the requests are 4k each).
This per-queue in-flight limit is currently a non-exclusive wakeup, and it
triggers more often than the request shortage (because most of the time writes
are consecutive), so having two waitqueues, with the reads registering
themselves in both, shouldn't be a very significant improvement at the moment (I
should first care about a wake-one in-flight-limit-per-queue wakeup :).

 the EXCLUSIVE thing was noticed by Dimitris i think, and it makes tons of

Actually I'm the one who introduced the EXCLUSIVE thing there, and I audited
_all_ the device drivers to check they do 1 wakeup for each 1 request they
release before sending it off to Linus. But I never thought (until a few days
ago) about the fact that if a read completes a reserved request the write won't
be able to accept it.

So long term we'll do two wake-one queues with reads registered in both.

Andrea

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:18:54PM +0200, Jens Axboe wrote:
 The scsi layer currently "manually" does a list_add on the queue itself,
 which doesn't look too healthy.

It's grabbing the io_request_lock so it looks healthy for now :)

Andrea

Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 11:26:48AM -0300, Marcelo Tosatti wrote:
 This thread keeps freeing pages from the inactive clean list when needed
 (when zone->free_pages < zone->pages_low), making them available for
 atomic allocations.

This is flawed. It's the irq itself that has to shrink the memory. It
certainly can't reschedule to kreclaimd and wait for it to do the work.

Increasing the free_pages_min limit is the _only_ alternative to having
irqs that are able to shrink clean cache (and hopefully that "feature"
will be resurrected soon since it's the only way to go right now). 

Andrea

Re: the new VM

2000-09-25 Thread Rik van Riel


On Mon, 25 Sep 2000, Andrea Arcangeli wrote:
 On Mon, Sep 25, 2000 at 03:02:58PM +0200, Ingo Molnar wrote:
  On Mon, 25 Sep 2000, Andrea Arcangeli wrote:
  
   Sorry, I totally disagree. If GFP_KERNEL is guaranteed to succeed
   that is a showstopper bug. [...]
  
  why?
 
 Because as you said the machine can lockup when you run out of memory.

The fix for this is to kill a user process when you're OOM
(you need to do this anyway).

The last few allocations of the "condemned" process can come
from the reserved pages and the process we killed will exit just
fine.

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

  the EXCLUSIVE thing was noticed by Dimitris i think, and it makes tons of
 
 Actually I'm the one who introduced the EXCLUSIVE thing there and I audited

sorry - i said it was *noticed* by Dimitris. (and sent to l-k IIRC)

Ingo


Re: (Fwd) CD-ROM (SCSI and IDE) not mounting disk

2000-09-25 Thread tdanis


On Sat, Sep 23, 2000 at 09:01:04PM -0500, [EMAIL PROTECTED] wrote:
 Another interesting thing that I just noticed, I can still play music CD's in either drive.
 

I am currently seeing the same behaviour. My machine is up for
42 days now. Kernel 2.2.16-3 (RH 6.2). I am quite sure I could
play CDROM a few weeks ago. But now, when I launch cdplay
or xplaycd, no CD is detected :

/home/danis/DISCOGRAPHIE/JethroTull/Stormwatch/mp3  cdplay
/dev/cdrom: Mauvais type de medium (wrong medium type)

/home/danis/DISCOGRAPHIE/JethroTull/Stormwatch/mp3  dmesg
...
VFS: Disk change detected on device ide1(22,0)
cdrom: pid 15218 must open device O_NONBLOCK!
cdrom: open failed.
...

A+,
-- 
Thierry Danis
Poste : 57 96   [EMAIL PROTECTED]
# rm *;o
o : commande non trouvée

Re: the new VM

2000-09-25 Thread Alan Cox


  Because as you said the machine can lockup when you run out of memory.
 
 well, i think all kernel-space allocations have to be limited carefully,
 denying succeeding allocations is not a solution against over-allocation,
 especially in a multi-user environment.

GFP_KERNEL has to be able to fail for 2.4. Otherwise you can get everything
jammed in kernel space waiting on GFP_KERNEL and if the swapper cannot make
space you die.

The alternative approach where it cannot fail has to be at higher levels so
you can release other resources that might need freeing for deadlock avoidance
before you retry


Alan



Re: the new VM

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:43:44PM +0200, Ingo Molnar wrote:
 i talked about GFP_KERNEL, not GFP_USER. Even in the case of GFP_USER i

My bad, you're right I was talking about GFP_USER indeed.

But even for GFP_KERNEL allocations like the init of a module, or anything else
that is statically sized during production, just checking the retval looks OK.

 believe the right place to oom is via a signal, not in the gfp() case.

Signals can be trapped and ignored by a malicious task. We had that security
problem until 2.2.14 IIRC.

 (because oom situation in the gfp() case is a completely random and
 statistical event, which might have no connection at all to the behavior
 of that given process.)

I agree we should have more information about the behaviour of the system
and I think a per-task page fault rate should work in practice.

But my question isn't what you do when you're OOM, but is _how_ do you
notice that you're OOM?

In the GFP_USER case simply checking when GFP fails looks right to me.

Andrea

Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Mon, Sep 25, 2000 at 04:53:05PM +0200, Ingo Molnar wrote:
 sorry - i said it was *noticed* by Dimitris. (and sent to l-k IIRC)

I didn't know.

Andrea

Swap on RAID; was: Re: the new VM

2000-09-25 Thread parsley


Ingo Molnar wrote:

 this is fixed in 2.4. The 2.2 RAID code is frozen, and has known
 limitations (ie. due to the above RAID1 cannot be used as a swap-device).

Eh, just to be clear about this: does this apply to the RAID 0.90 code
as commonly patched in by RedHat?  Should I instead use a swap file for
a machine that should be fault-tolerant against a drive failure?

regards,
David
-- 
David L. Parsley
Network Administrator
Roanoke College

Re: Swap on RAID; was: Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000 [EMAIL PROTECTED] wrote:

  this is fixed in 2.4. The 2.2 RAID code is frozen, and has known
  limitations (ie. due to the above RAID1 cannot be used as a swap-device).

 as commonly patched in by RedHat?  Should I instead use a swap file
 for a machine that should be fault-tolerant against a drive failure?

the answer is yes. RAID5 will not deadlock due to VM problems, but RAID5
might have other problems if the device is being reconstructed *and* used
for swap.

Ingo


[patch] 2.4.0-test9-pre6: Alpha cross-compilation fixes

2000-09-25 Thread Maciej W. Rozycki


Hi,

 The following patch allows an Alpha kernel to be built with a
cross-compiling toolchain as $(NM) and $(STRIP) do incorporate the
$(CROSS_COMPILE) prefix.

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+

diff -u --recursive --new-file linux-2.4.0-test9-pre6.macro/arch/alpha/Makefile linux-2.4.0-test9-pre6/arch/alpha/Makefile
--- linux-2.4.0-test9-pre6.macro/arch/alpha/Makefile	Mon Sep 25 17:01:52 2000
+++ linux-2.4.0-test9-pre6/arch/alpha/Makefile	Mon Sep 25 17:07:49 2000
@@ -8,7 +8,7 @@
 # Copyright (C) 1994 by Linus Torvalds
 #
 
-NM := nm -B
+NM := $(NM) -B
 
 LINKFLAGS = -static -T arch/alpha/vmlinux.lds -N #-relax
 CFLAGS := $(CFLAGS) -pipe -mno-fp-regs -ffixed-8
diff -u --recursive --new-file linux-2.4.0-test9-pre6.macro/arch/alpha/boot/Makefile linux-2.4.0-test9-pre6/arch/alpha/boot/Makefile
--- linux-2.4.0-test9-pre6.macro/arch/alpha/boot/Makefile	Wed Jul 19 05:58:27 2000
+++ linux-2.4.0-test9-pre6/arch/alpha/boot/Makefile	Mon Sep 25 17:07:14 2000
@@ -68,7 +68,7 @@
 	$(OBJSTRIP) -v $(VMLINUX) vmlinux.nh
 
 vmlinux: $(TOPDIR)/vmlinux
-	strip -o vmlinux $(VMLINUX)
+	$(STRIP) -o vmlinux $(VMLINUX)
 
 tools/lxboot: $(OBJSTRIP) bootloader
 	$(OBJSTRIP) -p bootloader tools/lxboot


Re: the new VMt

2000-09-25 Thread Alan Cox


  GFP_KERNEL has to be able to fail for 2.4. Otherwise you can get
  everything jammed in kernel space waiting on GFP_KERNEL and if the
  swapper cannot make space you die.
 
 if one can get everything jammed waiting for GFP_KERNEL, and not being
 able to deallocate anything, thats a VM or resource-limit bug. This
 situation is just 1% RAM away from the 'root cannot log in', situation.

Unless I'm missing something here, think about this case:

2 active processes, no swap

#1              #2
kmalloc 32K     kmalloc 16K
OK              OK
kmalloc 16K     kmalloc 32K
block           block

so GFP_KERNEL has to be able to fail - it can wait for I/O in some cases with
care, but when we have no pages left something has to give
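
The scenario above can be replayed with a toy allocator in which a request
that cannot be satisfied returns failure instead of blocking. This is a
sketch under stated assumptions: the 48K pool size and the names toy_pool,
toy_alloc, toy_free and toy_replay are hypothetical, and there is no real
concurrency, only the order of events from the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy memory pool measured in kilobytes. */
struct toy_pool {
    unsigned int free_kb;
};

/* Try to take `kb` from the pool; fail instead of blocking when short. */
static bool toy_alloc(struct toy_pool *p, unsigned int kb)
{
    assert(p != NULL);
    if (kb > p->free_kb)
        return false;
    p->free_kb -= kb;
    return true;
}

/* Return `kb` to the pool. */
static void toy_free(struct toy_pool *p, unsigned int kb)
{
    assert(p != NULL);
    p->free_kb += kb;
}

/*
 * Replay the two-process case on an assumed 48K pool. Returns true if
 * process #2 can finish once process #1 sees the failure and backs out.
 */
static bool toy_replay(void)
{
    struct toy_pool pool = { .free_kb = 48 };

    bool p1_first  = toy_alloc(&pool, 32);  /* #1: OK */
    bool p2_first  = toy_alloc(&pool, 16);  /* #2: OK, pool now empty */
    bool p1_second = toy_alloc(&pool, 16);  /* #1: would block forever */
    bool p2_second = toy_alloc(&pool, 32);  /* #2: would block forever */

    if (!p1_first || !p2_first || p1_second || p2_second)
        return false;                       /* scenario did not reproduce */

    toy_free(&pool, 32);                    /* #1 handles the failure */
    return toy_alloc(&pool, 32);            /* #2's retry now succeeds */
}
```

If both second requests blocked instead of failing, neither process would
ever reach the point where it releases memory, which is the jam described
above.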



Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Andrea Arcangeli wrote:

  i think the GFP_USER case should do the oom logic within __alloc_pages(),
 
 What's the difference of implementing the logic outside alloc_pages?
 Putting the logic inside looks not clean design to me.

it gives consistency and simplicity. The allocators themselves do not have
to care about oom.

Ingo


Re: 2.4.0t8: hard reboot with ipchains/ipmasq

2000-09-25 Thread Les Schaffer


sorry: it was with iptables, not ipchains

=
modprobe iptable_nat
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
echo 1 > /proc/sys/net/ipv4/ip_forward
=


g. everything else as in previous post

les

2.4.0t8: hard reboot with iptables/ipmasq

2000-09-25 Thread Les Schaffer


[reposted under __corrected__ subject line]

My linux box was set up for ipmasq with:

===
modprobe iptable_nat
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
echo 1 > /proc/sys/net/ipv4/ip_forward
===

a windows box had been browsing the net through the linux box several
hours earlier (about 4 hours), and then left alone. when i went back
to the windows box and tried to browse again from the same IExplorer
window, _SNAP_ and the linux machine just plain up and rebooted
instantly

i am __guessing__  the problem had something to do with using an old
IExplorer session so long after it had last been used??? something
about NAT timeouts or something???

but a hard reboot???

apart from this crash, ipmasq had been working fine (just never tested
with that kind of delay time).

les schaffer


other tidbits:
--

a few hours prior to crash, i got these from net browsing on a
connected windows box:

Sep 24 22:14:34 localhost kernel: NAT: 0 dropping untracked packet c1afb180 1 207.88.240.105 -> 24.191.22.34
[snip]
Sep 25 00:03:12 localhost kernel: NAT: 0 dropping untracked packet c33da540 1 63.211.32.65 -> 24.191.22.


auth.log marks the last moment of consciousness:

Sep 25 01:37:01 localhost PAM_unix[19174]: (cron) session closed for user root

no other significant things written to log.

Re: the new VM

2000-09-25 Thread Ingo Molnar



On Mon, 25 Sep 2000, Alan Cox wrote:

 Unless Im missing something here think about this case
 
 2 active processes, no swap
 
 #1              #2
 kmalloc 32K     kmalloc 16K
 OK              OK
 kmalloc 16K     kmalloc 32K
 block           block
 
 so GFP_KERNEL has to be able to fail - it can wait for I/O in some
 cases with care, but when we have no pages left something has to give

you are right, i agree that synchronous OOM for higher-order allocations
must be preserved (just like ATOMIC allocations). But the overwhelming
majority of allocations is done at page granularity.

with multi-page allocations and the need for physically contiguous
buffers, the problem cannot be solved.
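
Why physically contiguous multi-page requests are different can be shown with
a toy free-page map: plenty of pages may be free while no run of adjacent free
pages exists. A minimal sketch, assuming a simple boolean map; the name
toy_find_contig is hypothetical and this is not how the kernel's allocator
tracks free pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the first run of `want` consecutive free pages in
 * is_free[0..n), or -1 if no such run exists.
 */
static long toy_find_contig(const bool *is_free, size_t n, size_t want)
{
    assert(is_free != NULL && want > 0);

    size_t run = 0;  /* length of the current run of free pages */
    for (size_t i = 0; i < n; i++) {
        run = is_free[i] ? run + 1 : 0;
        if (run == want)
            return (long)(i + 1 - want);
    }
    return -1;
}
```

With every other page free, half of memory is available, yet a request for
two adjacent pages fails while any single-page request succeeds.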

Ingo


Re: [patch] vmfixes-2.4.0-test9-B2

2000-09-25 Thread Andrea Arcangeli


On Sun, Sep 24, 2000 at 11:39:13PM -0300, Marcelo Tosatti wrote:
 - Change kmem_cache_shrink to return the number of freed pages. 

I did that too, extending a patch from Mark. I also removed the first_not_full
ugliness, providing LIFO behaviour for the completely freed slabs (so
kmem_cache_reap removes the oldest completely unused slabs from the queue, not
the most recently used ones with potentially live cache in the CPU). 

 There was a comment on the shrink functions about making
 kmem_cache_shrink() work on a GFP_DMA/GFP_HIGHMEM basis to free only the
 wanted pages by the current allocation. 

This is meaningless at the moment because it can't be addressed without
classzone logic in the allocator (classzone means that the allocator will pass
to the memory balancing code the information about _which_ classzone you have
to allocate memory from, so you won't waste time to synchronously balance
unrelated zones).

My patch is here (it isn't going to apply cleanly due the test9 changes
in do_try_to_free_pages but porting is trivial). It was tested and
it was working for me.


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.0-test7/slab-1

BTW, here is a fix for a longstanding SMP race (since swap_out and msync
don't run with the big lock) that can corrupt memory: 


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.0-test5/msync-smp-race-1

Here is the fix for another SMP race in establish_pte:


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.0-test5/tlb-flush-smp-race-1

The fix for this last bit is ugly but it's safe, because Manfred said s390 has
a flush_tlb_page that atomically flushes and makes the pte invalid (a cleaner
fix means moving part of establish_pte into the arch inlines).

Andrea
