Linux-Development-Sys Digest #657, Volume #8 Fri, 20 Apr 01 12:13:11 EDT
Contents:
Re: Is linux kernel preemptive?? (Greg Copeland)
Re: A Linux emulator for Linux, does this exist? (Philip Armstrong)
Re: kernel thread list (Philip Armstrong)
Re: Is linux kernel preemptive?? (Greg Copeland)
Re: A Linux emulator for Linux, does this exist? (Grant Edwards)
Re: IO system throughput (Greg Copeland)
Re: A Linux emulator for Linux, does this exist? (Malcolm Beattie)
Re: Memory caching (Greg Copeland)
Re: Is linux kernel preemptive?? (Kasper Dupont)
type of hard disk ("Alpesh K")
Re: kernel thread list (Massimiliano Caovilla)
Re: Is linux kernel preemptive?? (ChromeDome)
----------------------------------------------------------------------------
Subject: Re: Is linux kernel preemptive??
From: Greg Copeland <[EMAIL PROTECTED]>
Date: 20 Apr 2001 10:08:59 -0500
Just so we are all on the same page, let me spell it out like this:
Let's say that program A makes a system call immediately followed by
program B making the same system call. The system call is I/O related.
Let's assume that B did not preempt A before A entered the kernel
and locked the resource. Now, for as long as A holds the lock on the
I/O resource, B cannot preempt A; B simply sits in a wait queue. Many
of us remember the Mindcraft benchmarks against the old 2.2 kernel.
If you recall, they specifically picked a benchmark that used
multiple network cards. This is because Linux's network stack was
not re-entrant and could not be preempted. On NT, both cards could
send data asynchronously and concurrently. On Linux, only one card
could be used at a time. Because of the coarse-grained locks in the
network stack, Linux was limited to using a single card at a time
(think round-robin, but not really). Worse, for the duration of the
I/O (stack <-> card), nothing could preempt the system call, so other
programs sat in a wait queue. Because of other scheduling problems,
the scheduler then consumed huge amounts of CPU trying to reschedule
everything that was waiting, none of which could run until the I/O
had completed. As you can see, that's why Linux got spanked so hard
during those tests.
The other distinction is the scheduler preempting something. Remember,
it's possible for an application to be preempted when it is not in
a system call. This happens all the time. Thus, Linux is, and has
been for some time, a fully preemptive multitasking OS; it is the
kernel that was not "fully" preemptive/reentrant. In other words,
applications could still be preempted, as long as they weren't in
certain system calls. This is where the kernel being preemptive and
reentrant comes in.
Greg
Joe Pfeiffer <[EMAIL PROTECTED]> writes:
> Greg Copeland <[EMAIL PROTECTED]> writes:
>
> > Prior to 2.4, Linux had lots of system calls that were not preemptive (coarse
> > grain locks; networking was one such beast). With 2.4, Linux got lots of fine
> > grained locks which allow the kernel to be fully preemptive. If I recall,
> > there are still a couple of exceptions to the rule, however, most people now
> > consider Linux to be a "fully" preemptive kernel.
>
> This is why I was careful to put the definition I'm familiar with in
> my answer -- you appear to be using a definition of a preemptive OS
> that corresponds to the definition I'm familiar with for a preemptible
> OS (which, thinking back to the original post, may well be what he had
> in mind).
>
> > Joe Pfeiffer <[EMAIL PROTECTED]> writes:
> > [snip]
> > > > By the definition I'm familiar with -- that user programs get
> > > > preempted by the kernel, and don't need to relinquish control
> > > > either explicitly or by requesting services -- it is fully
> > > > preemptive.
> > >
> > > Being able to do realtime stuff, with guaranteed maximum latencies and
> > > the like, requires more than just being preemptive.
>
> --
> Joseph J. Pfeiffer, Jr., Ph.D. Phone -- (505) 646-1605
> Department of Computer Science FAX -- (505) 646-1002
> New Mexico State University http://www.cs.nmsu.edu/~pfeiffer
> SWNMRSEF: http://www.nmsu.edu/~scifair
--
Greg Copeland, Principal Consultant
Copeland Computer Consulting
==================================================
PGP/GPG Key at http://www.keyserver.net
DE5E 6F1D 0B51 6758 A5D7 7DFE D785 A386 BD11 4FCD
==================================================
------------------------------
From: [EMAIL PROTECTED] (Philip Armstrong)
Subject: Re: A Linux emulator for Linux, does this exist?
Date: 20 Apr 2001 09:10:10 +0100
In article <[EMAIL PROTECTED]>,
Jonathan Buzzard <[EMAIL PROTECTED]> wrote:
>In article <9bksqf$129$[EMAIL PROTECTED]>,
> [EMAIL PROTECTED] (Philip Armstrong) writes:
>> I don't doubt the staff, but I believe the newer ibm mainframes have
>> much lower power requirements than the old ones used to. They're much
>> smaller as well...so the serious power and aircon doesn't need to be
>> quite so serious any more.
>>
>> *surf* *surf* Apparently the spiffiest Z-series draws approx 13kW. So
>> there you go.
>You can't draw this from a 13amp socket though. I would suspect you
>need a three phase supply for this beasty. At the very least you are
>going to need specialist wiring as this is 56A at 230VAC.
I have a single circuit in this house which is rated at 40A (for the
electric shower, according to the circuit breaker). I don't think a 56A circuit
is going to be a problem for any business, and I really doubt the need
for 3-phase. You will need better than a 13A socket though as you say.
Besides I said lower than the old ones (which certainly did need three
phase power :) ), not low in absolute terms!
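As a quick check on the 56A figure above, here is the arithmetic as a small sketch. It assumes a purely resistive load (power factor 1), and the three-phase case assumes a balanced load on a nominal 400V line-to-line supply; both are assumptions for illustration, not IBM figures.

```python
import math


def single_phase_current(power_w, volts):
    """Current in amps drawn by a load of power_w watts on one phase."""
    return power_w / volts


def three_phase_line_current(power_w, line_to_line_volts, power_factor=1.0):
    """Per-line current in amps for a balanced three-phase load."""
    return power_w / (math.sqrt(3) * line_to_line_volts * power_factor)


if __name__ == "__main__":
    print(f"single phase at 230V: {single_phase_current(13_000, 230):.1f} A")
    print(f"three phase at 400V:  {three_phase_line_current(13_000, 400):.1f} A per line")
```

Under those assumptions the single-phase draw comes out a little over 56A, and spreading the same load over three phases brings each line to under 20A.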
cheers,
Phil
--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt
------------------------------
From: [EMAIL PROTECTED] (Philip Armstrong)
Subject: Re: kernel thread list
Date: 20 Apr 2001 09:12:52 +0100
In article <[EMAIL PROTECTED]>,
Massimiliano Caovilla <[EMAIL PROTECTED]> wrote:
> Hi to all
>I'm trying to debug a module I have ported to Linux from Solaris. I need
>a debugger or another way to see a detailed thread list, such as the one
>I can see on Solaris with Sun's kernel debugger adb. I need this to
>track down some lock problems I have which are very difficult to track
>down with printk! My module makes really wide use of kernel threads. Any
>ideas?
>Is there a way to get a kernel thread list detailed enough to see which
>thread is doing what?
There's a kernel debugging mode available via a patch originally
authored by SGI I believe. I think it might be in the Alan Cox "ac"
series of kernel patches, but Linus has made it very clear that those
patches will never go into the mainstream kernel sources.
Phil
--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt
------------------------------
Subject: Re: Is linux kernel preemptive??
From: Greg Copeland <[EMAIL PROTECTED]>
Date: 20 Apr 2001 10:18:45 -0500
I read the link that you sent. Remember, while a resource is spin locked,
nothing can get past it. He's trying to allow for preemptive activity
in the areas that still have coarse-grained spin locks. If you recall,
I said that there were supposed to be some exceptions. I guess that
this may actually address some of the reentrance issues that the Linux
kernel has. Don't hold me to this, but as I recall, the Linux kernel
was not very reentrant, meaning a system call could not interrupt
itself with the same system call. Because of the fine-grained locks
that are now in place, one system call may be able to interrupt
(preempt) another. I would like to remind you that the patch in
question is not SMP safe.
I will admit that I'm getting into the murky grey area of my
Linux kernel understanding, but I believe what I said to be accurate
at least in its broad meaning, if not to the letter.
Greg
Moritz Franosch <[EMAIL PROTECTED]> writes:
> Joe Pfeiffer <[EMAIL PROTECTED]> writes:
>
> > > Well, I'm not an expert on Linux kernel code, but...
> > > the original post didn't qualify to what degree 'preemptive' meant, and I can
> > > guarantee that the kernel is to some degree preemptive. It certainly isn't a
> > > non-preemptive kernel <g>.
> >
> > > By the definition I'm familiar with -- that user programs get
> > > preempted by the kernel, and don't need to relinquish control
> > > either explicitly or by requesting services -- it is fully
> > > preemptive.
>
> What you mean is that the scheduler is preemptive. I thought the
> kernel being preemptive means that the kernel gets preempted by user
> programs (or by itself).
>
> There is a patch for making the kernel "preemptible"
> http://kt.zork.net/kernel-traffic/kt20010330_113.html#5
> so I thought it is not preemptive (preemptible? is there a difference?)
> now.
>
> Moritz
------------------------------
From: [EMAIL PROTECTED] (Grant Edwards)
Subject: Re: A Linux emulator for Linux, does this exist?
Date: Fri, 20 Apr 2001 15:32:45 GMT
In article <[EMAIL PROTECTED]>, Jonathan Buzzard wrote:
>> *surf* *surf* Apparently the spiffiest Z-series draws approx 13kW. So
>> there you go.
>>
>
>You can't draw this from a 13amp socket though. I would suspect you
>need a three phase supply for this beasty. At the very least you are
>going to need specialist wiring as this is 56A at 230VAC.
You can't run a couple hundred Wintel machines from a single 13A
socket either.
--
Grant Edwards, grante at visi.com
Yow! I want to read my new poem about pork brains and outer space...
------------------------------
Subject: Re: IO system throughput
From: Greg Copeland <[EMAIL PROTECTED]>
Date: 20 Apr 2001 10:36:25 -0500
[EMAIL PROTECTED] writes:
> > Greg Copeland <[EMAIL PROTECTED]> writes:
[snip]
>
> > Thanks for the summary. You make valid points. In short, I don't
> > think I'll be trying a fiber implementation for a while. Let's face
> > it, until you start to see a small segment of Linux "desktop" users
> > with it, or even a small slice of the server pie, I don't think
> > you're going to see any real support with fiber and Linux.
>
> Desktop users are, in this context, _extremely_ irrelevant. NOBODY is
> going to be hooking an $80K SAN to their desktop. Desktop's
> irrelevant to the issue.
I think you misunderstood me. Notice I had "desktop" in quotes. Since
Linux isn't really a desktop OS (yet), I was really implying that until
fiber becomes a much more common commodity among at least a notable
segment of the Linux user base, I think development here is going to
suffer (unless a fiber card manufacturer steps up).
>
> > It's worth pointing out that several other articles show that NT has
> > a significant advantage when it comes to extremely heavy I/O because
> > it supports an asynchronous I/O model. Linux needs this very badly.
> > It was one thing for Linux to shrug its shoulders when async I/O
> > didn't make much sense for the low-end servers; however, Linux is
> > trying to get into the enterprise which pretty much demands
> > asynchronous I/O. Let's face it, databases and build-your-own SANs
> > and NASs pretty much require this type of heavy duty support. If
> > you read some of Oracle's papers, you'll find that they slam Linux
> > from time to time for not supporting it too.
>
> > As far as I can tell, with plenty of journaling file-systems coming
> > the way of Linux, SMP and scheduler issues mostly addressed, and >2G
> > files, the only things that Linux needs are proper fiber support and
> > asynchronous I/O support. As much as I hate NT (I've used it tons,
> > so I'm allowed to say that), MS did right by building in
> > asynchronous I/O in almost all facets of NT's kernel and API.
>
> Could you elaborate a little on what is meant by asynchronous I/O?
>
> I can think of a number of models for it; most seem to require a
> combination of kernel and user space support, and I can see the
> "crossing of boundary" being a reason for Linus to be reluctant to
> introduce it.
You are correct. Asynchronous I/O requires both kernel and user app
support. Right now, all I/O in Linux is synchronous. Meaning, when
you issue a write, the write does not return until it has been written.
In this case, written means copied to a buffer (write back) to be
physically written later, or physically written now (write through).
In either case, the application had to wait for the write to complete
(virtual or physical). In an asynchronous model, the write would
immediately return with a tagged id. The application doesn't need
to wait for the write to complete. Behind the scenes, the kernel
completes the write and notifies the application. The tagged id is
returned to the application via a callback and/or a signal and a
function call. This way the application can track which asynchronous
I/O's have completed and which have not.
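A minimal sketch of the submit-and-notify model described above. It only emulates the model in user space with a thread pool on a POSIX system; it is not kernel asynchronous I/O, and the names `AsyncWriter`, `submit_write` and `note_done` are made up for illustration.

```python
import itertools
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncWriter:
    """Hands writes to worker threads and reports completion by tag."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._tags = itertools.count(1)
        self._lock = threading.Lock()
        self.pending = set()              # tags submitted but not yet done

    def submit_write(self, fd, data, offset, on_done):
        """Queue a write and return its tag without waiting for the I/O."""
        tag = next(self._tags)
        with self._lock:
            self.pending.add(tag)

        def work():
            try:
                written, error = os.pwrite(fd, data, offset), None
            except OSError as exc:
                written, error = 0, exc
            with self._lock:
                self.pending.discard(tag)
            on_done(tag, written, error)  # completion notification

        self._pool.submit(work)
        return tag

    def close(self):
        """Wait for all queued writes to finish."""
        self._pool.shutdown(wait=True)


def demo():
    completed = {}

    def note_done(tag, written, error):
        completed[tag] = (written, error)

    fd, path = tempfile.mkstemp()
    try:
        writer = AsyncWriter()
        tags = [
            writer.submit_write(fd, b"hello ", 0, note_done),
            writer.submit_write(fd, b"world\n", 6, note_done),
        ]
        writer.close()
        with open(path, "rb") as f:
            content = f.read()
    finally:
        os.close(fd)
        os.unlink(path)
    return tags, completed, content


if __name__ == "__main__":
    print(demo())
```

The caller gets its tag back before the data has been written, and matches it against the tag passed to the callback later. Error handling moves into the completion path, which is the point made about databases.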
This is great for I/O intensive applications (e.g., databases) because
they can get on with writing and let the asynchronous handler
take care of error processing. Furthermore, it supposedly allows for
better gathering of I/O in the kernel, because the kernel can collect
and order I/O without an application having to wait on it to do so.
And of course, on a very loaded system, the application is not stuck
waiting for I/O to complete (this is the biggest boost); rather, it
just keeps on queueing data and is notified when the writes have
completed.
Very cool stuff. NT supports asynchronous file and network I/O. Linux
does not.
>
> Is there a POSIX document on it to consult?
I believe there is a POSIX asynchronous API; however, I don't have a
pointer for it.
------------------------------
From: [EMAIL PROTECTED] (Malcolm Beattie)
Subject: Re: A Linux emulator for Linux, does this exist?
Date: Fri, 20 Apr 2001 15:38:46 +0000 (UTC)
In article <vSkD6.164949$[EMAIL PROTECTED]>,
<[EMAIL PROTECTED]> wrote:
>[EMAIL PROTECTED] (Philip Armstrong) writes:
>> In article <[EMAIL PROTECTED]>,
>> Jonadab the Unsightly One <[EMAIL PROTECTED]> wrote:
>> >How much does a 390 cost?
>>
>> How long is a piece of string ?
>>
>> :)
>
>> I'm told IBM mainframe pricing is of the "turn them upside down and shake
>> them until all the spare money falls out" variety, but having never
>> been in a position to actually want or need one of the beasts, I
>> can't speak from personal experience!
>
>The expensive part is that you likely need a staff of a goodly dozen
>people to manage the OS and hardware, a room with serious power and
>air conditioning
No, the power requirements of an S/390 or z900 are less than the
equivalent rack-mount cheapo boxes in many environments: that's one
of their good points. Modern S/390 and z900 servers are supposed to
have two three-phase power connections for reliability reasons. In
normal operations, two of the three phases of one supply are used.
If one fails (or its power supply fails), then the other phase or the
other complete three-phase supply is used. As always for an S/390:
redundancy for reliability is everywhere.
>air conditioning, and that doesn't buy you the hordes of
>administrators needed to manage DB/2, CICS, and the applications.
Now you're confusing S/390 with OS/390 (or, similarly, z900 with
z/OS). The huge licensing costs are specific to OS/390 and z/OS.
If you're running Linux on your S/390 or z900 then the licensing
costs are much, much less (the highest cost is the licensing for VM
and the VM-capable engines but both of those are being addressed
with "Linux-only VM-Lite" out shortly). If you're running Linux on
the system then it's pretty much the same as Linux (and its
applications) on any other platform. You'll need someone VM-capable
unless you're using the cut-down VIF (which is much less flexible)
but it's nothing like OS/390 software licensing costs (whether IBM or
third-party).
--Malcolm
--
Malcolm Beattie <[EMAIL PROTECTED]>
Oxford University Computing Services
"I permitted that as a demonstration of futility" --Grey Roger
------------------------------
Subject: Re: Memory caching
From: Greg Copeland <[EMAIL PROTECTED]>
Date: 20 Apr 2001 10:39:27 -0500
And of course, as the memory is needed for applications, it will be
yielded by the cache. In short, it's doing what you want.
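To see this for yourself, here is a small sketch that reads /proc/meminfo and adds the buffer and page cache figures back onto the free figure. It assumes the `MemFree`, `Buffers` and `Cached` fields are present; the `SAMPLE` numbers are made up, and the sum is only a rough upper bound, since not every cached page can be dropped.

```python
def parse_meminfo(text):
    """Return a dict of field name -> size in kB from /proc/meminfo text."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_kb(fields):
    """Free memory plus memory the kernel can hand back from its caches."""
    return (fields.get("MemFree", 0)
            + fields.get("Buffers", 0)
            + fields.get("Cached", 0))


# Made-up figures, used for illustration and when /proc is unavailable.
SAMPLE = """\
MemTotal:  257000 kB
MemFree:    12000 kB
Buffers:     8000 kB
Cached:    150000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    info = parse_meminfo(text)
    print("free:", info.get("MemFree"), "kB")
    print("free + buffers + cache:", reclaimable_kb(info), "kB")
```

A small "free" number alongside a large "free + buffers + cache" number is the normal state of a machine that has been up for a while.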
Greg
[EMAIL PROTECTED] (Trevor Hemsley) writes:
> On Fri, 13 Apr 2001 14:50:27, "Ryan Storgaard"
> <[EMAIL PROTECTED]> wrote:
>
> > After booting my system, I periodically ran free &/or meminfo and you will
> > notice it starts off with lots free, and then declines... Is this normal?
>
> Yes.
>
> The memory is free until it is used. It's either used for loading a
> program into or for a program's use. Or it's used as a file cache.
> When you first boot you haven't read many files so not much is used as
> a file cache. Nor have you used much to load programs. The longer you
> run, the more files you read and the more they get cached in that
> memory that was marked as free. Now it's not free, it's used for file
> caching.
>
> --
> Trevor Hemsley, Brighton, UK.
> [EMAIL PROTECTED]
------------------------------
From: Kasper Dupont <[EMAIL PROTECTED]>
Subject: Re: Is linux kernel preemptive??
Date: Fri, 20 Apr 2001 15:41:03 +0000
Greg Copeland wrote:
>
[...]
The problems you describe are related to locking
granularity, not to whether the system is
preemptive. When a process is preempted, it is
not preempted by another process, but by an
interrupt. Making a system preemptive means you
have to ensure two things: first, that interrupts
are not disabled for long periods of time, and
second, that an interrupt can actually result in
a process switch.
Disabling interrupts when running kernel code is
not bad practice and is seldom a problem. When a
process spends a long time within a system call,
it is usually because the system call explicitly
sleeps. Thus we will have interrupts reenabled
in a short time anyway.
So what we have to look for is not preemption,
because we already have that, but rather
fine-grained locking. As I understand it, 2.4.x
is a good step in that direction.
Also keep in mind that there are different types
of locks. There are spinlocks, used to synchronize
code running on different CPUs in a
multiprocessor system. And there are locks related
to hardware components like network cards and
disk controllers. The first type of lock must be
fine-grained to avoid CPUs wasting lots of clock
cycles spinning for a lock; the second type must
be fine-grained to allow using as many different
hardware components in parallel as physically
possible.
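The second point can be illustrated with a user-space analogy. This is a sketch with Python threads and ordinary mutexes, not kernel code; the card names and the 50 ms sleep standing in for device I/O are assumptions. It counts how many "cards" are busy at the same moment under one big lock versus one lock per card.

```python
import threading
import time


def run(cards, lock_for):
    """Send once on each card in parallel; return the peak number of
    cards that were transmitting at the same moment."""
    active = 0
    peak = 0
    stats = threading.Lock()              # protects the two counters only
    start = threading.Barrier(len(cards))

    def send(card):
        nonlocal active, peak
        start.wait()                      # all senders begin together
        with lock_for(card):              # the lock under test
            with stats:
                active += 1
                peak = max(peak, active)
            time.sleep(0.05)              # stand-in for stack <-> card I/O
            with stats:
                active -= 1

    threads = [threading.Thread(target=send, args=(c,)) for c in cards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


cards = ["eth0", "eth1"]

big_lock = threading.Lock()               # one lock for the whole "stack"
coarse_peak = run(cards, lambda card: big_lock)

per_card = {c: threading.Lock() for c in cards}   # one lock per device
fine_peak = run(cards, lambda card: per_card[card])

print("coarse lock, peak busy cards:", coarse_peak)
print("per-card locks, peak busy cards:", fine_peak)
```

With the single lock only one card can ever be busy, whatever the scheduler does. With per-card locks both can be busy at once, which is the behaviour the finer locking is meant to buy.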
--
Kasper Dupont
------------------------------
From: [EMAIL PROTECTED] ("Alpesh K")
Subject: type of hard disk
Date: Fri, 20 Apr 2001 15:49:59 +0000 (UTC)
Hi:
Is it possible to find out what type of disk(s) (IDE or SCSI) are present
on a Linux system using a system call?
Do I have to write a kernel module for this? If yes,
what book/document should I read first, as I am a newbie in kernel programming?
Thanks,
Alpesh
------------------------------
From: Massimiliano Caovilla <[EMAIL PROTECTED]>
Subject: Re: kernel thread list
Date: Fri, 20 Apr 2001 16:06:21 GMT
> >Is there a way to get a kernel thread list detailed enough to see which
> >thread is doing what?
>
> There's a kernel debugging mode available via a patch originally
> authored by SGI I believe. I think it might be in the Alan Cox "ac"
> series of kernel patches, but Linus has made it very clear that those
> patches will never go into the mainstream kernel sources.
>
> Phil
Ok, thank you very much! By the way, where do I find that?
------------------------------
From: ChromeDome <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: Re: Is linux kernel preemptive??
Date: Fri, 20 Apr 2001 16:08:08 GMT
Greg Copeland wrote:
>
> I will admit that I'm getting into the merky grey area of my
> Linux kernel understanding, but I believe what I said to be accurate
> at least to it's broad meaning, if not to the letter.
>
It's obvious from following this thread that "preemptive" means
different things to different people. I've been a programmer (mostly
process control) for almost 50 years, so let me throw in my 2 cents
worth.
The definition I've always heard used for a "preemptive O/S" has nothing
to do with processes. It says that kernel services (i.e. system calls)
must be preemptible, which in practice also implies reentrancy.
Everyone I worked with complained that Unix didn't have reentrant system
calls and thus was not preemptive according to their definition. I
don't know if Linux has eliminated this problem or not, but from others'
postings it sounds like it has at least made a start.
Gene
--
Homo Sapiens is a goal, not a description.
------------------------------
** FOR YOUR REFERENCE **
The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:
Internet: [EMAIL PROTECTED]
You can send mail to the entire list by posting to the
comp.os.linux.development.system newsgroup.
Linux may be obtained via one of these FTP sites:
ftp.funet.fi pub/Linux
tsx-11.mit.edu pub/linux
sunsite.unc.edu pub/Linux
End of Linux-Development-System Digest
******************************