2.4.3 fails to boot with initrd - solved

2001-04-05 Thread Bill Davidsen

PROBLEM:

  kernel 2.4.3 will not boot on systems with initrd files

DESCRIPTION

  Building kernel 2.4.3 and attempting to boot it failed. The problem
turned out to be in the modutils-2.4.5 rpm for i386.

DETAIL

  After building the 2.4.3 kernel and moving the boot modules to the
initrd image, it was noted that the system stopped when trying to load
the modules for the root filesystem device. The first solution attempted
was to install the i386 rpm of the latest (2.4.5) modutils from
kernel.org and copy its insmod program to the initrd image.

  This fails, with the message "insmod: no such program" at boot.
Examination showed that the binary provided was not statically linked.
Getting the source from kernel.org and building it did not help: by
default the result still isn't statically linked. Changing the common
Makefile to set LDFLAGS to "-static -s" and building again, then
installing and copying the static insmod to the initrd image, resulted
in a bootable system.
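
  For anyone repeating this, the build goes roughly as below. This is a
sketch from memory, not a transcript: the tarball name, the location of
insmod in the source tree, and whether LDFLAGS on the make command line
is enough (rather than editing the common Makefile) all depend on the
modutils release.

    tar xzf modutils-2.4.5.tar.gz && cd modutils-2.4.5
    ./configure
    make LDFLAGS="-static -s"      # or edit the common Makefile as above
    file insmod/insmod             # should now report "statically linked"
    # then copy the static insmod into the unpacked initrd image and repack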

  While it is possible to copy the needed libraries to the initrd image,
the image then becomes larger than the default ramdisk size (at least on
my system). Building the drivers into the kernel instead hurts
portability and makes the kernel too large to boot from floppy.
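
  Another way around the size limit (an assumption on my part, not
something tried above) is to raise the default ramdisk size with the
ramdisk_size= boot parameter, for example in the boot loader
configuration; the value is in KB and 8192 is only an example:

    # lilo.conf fragment
    append="ramdisk_size=8192"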

SYSTEMS AFFECTED

  Red Hat 7.x and similar distributions, in configurations where the
root device driver is loaded from a module.

SUGGESTED FIX

  None needed, but the kernel "Changes" file should include a note that
people using initrd will need a statically linked modutils, along with
the note that a newer modutils is needed. Even for people who build
their own initrd files, this is NOT obvious!

-- 
bill davidsen [EMAIL PROTECTED]
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.




Documentation glitch in 2.4

2001-04-09 Thread Bill Davidsen

The Config help for kernel automount indicates that the pointer to the user
code for autofs is in the Documentation/Changes file. As far as I can tell
that isn't the case. Since search engines seem to be better at finding the
BSD and 2.2 software, it would be nice if the information were restored
along with all the other "get it here" info.

-- 
bill davidsen [EMAIL PROTECTED]
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.




Re: sched_yield proposals/rationale

2007-04-17 Thread Bill Davidsen

Mark Lord wrote:

[EMAIL PROTECTED] wrote:

From: Bill Davidsen

And having gotten same, are you going to code up what appears to be a
solution, based on this feedback?


The feedback was helpful in verifying whether there are any arguments 
against my approach. The real proof is in the pudding.


I'm running a kernel with these changes as we speak. Overall system 
throughput is up about 20%. By 'system throughput' I mean the measured 
performance of a rather large (experimental) system. The patch isn't 
even 24h old... Application latency has also improved.


Cool.  You *do know* that there is a brand new CPU scheduler
scheduled to replace the current one for the 2.6.22 Kernel, right?

Having tried both nicksched and Con's fair sched on some normal loads, 
as opposed to benchmarks, I sure hope Linus changes his mind about 
having several schedulers in the kernel. The one perfect and 
self-adjusting scheduler isn't here yet.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [PATCH][RFC] Kill off legacy power management stuff.

2007-04-17 Thread Bill Davidsen

Rafael J. Wysocki wrote:

[appropriate CCs added]

On Friday, 13 April 2007 02:33, Robert P. J. Day wrote:

just something i threw together, not in final form, but it represents
tossing the legacy PM stuff.  at the moment, the menuconfig entry for
PM_LEGACY lists it as DEPRECATED, while the help screen calls it
obsolete.  that's a good sign that it's getting close to the time
for it to go, and the removal is fairly straightforward, but there's
no mention of its removal in the feature removal schedule file.


It's been like this for a long long time.  I think you're right that it can be
dropped, but I don't know the details (eg. why it hasn't been dropped yet).
 
One reason was that there are (were?) a number of machines which only 
powered down properly using apm. It was discussed as part of shutting 
down after power failure when your UPS is running out of power.


I haven't checked on that in a while, I'm just supplying one reason 
since you wondered.
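
For reference, the APM-only power-off path being described is the one
enabled by kernel config options like the following (names from the i386
Kconfig of that era; whether a given box also needs the real-mode variant
is machine specific):

  CONFIG_APM=y
  CONFIG_APM_REAL_MODE_POWER_OFF=y   # some boards only power off this way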


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [PATCH][RFC] Kill off legacy power management stuff.

2007-04-18 Thread Bill Davidsen

Robert P. J. Day wrote:

On Tue, 17 Apr 2007, Bill Davidsen wrote:

  

Rafael J. Wysocki wrote:


[appropriate CCs added]

On Friday, 13 April 2007 02:33, Robert P. J. Day wrote:
  

just something i threw together, not in final form, but it represents
tossing the legacy PM stuff.  at the moment, the menuconfig entry for
PM_LEGACY lists it as DEPRECATED, while the help screen calls it
obsolete.  that's a good sign that it's getting close to the time
for it to go, and the removal is fairly straightforward, but there's
no mention of its removal in the feature removal schedule file.


It's been like this for a long long time.  I think you're right that it
can be dropped, but I don't know the details (eg. why it hasn't been
dropped yet).


One reason was that there are (were?) a number of machines which only powered
down properly using apm. It was discussed as part of shutting down after power
failure when your UPS is running out of power.



um ... what does APM have to do with legacy PM?  two different issues,
no?
  
Since the patches are going into apm.c, and apm was used for suspend and 
poweroff before ACPI was a feature of the hardware, I assume there's a 
relationship. As of 2.6.9 ACPI still couldn't power down one of my old 
boxes; that machine hasn't been updated since, so I can't say what later 
kernels will do.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Kaffeine problem with CFS

2007-04-20 Thread Bill Davidsen

S.Çağlar Onur wrote:
On Wednesday, 18 April 2007, Ingo Molnar wrote:

* S.Çağlar Onur [EMAIL PROTECTED] wrote:

-   schedule();
+   msleep(1);

which Ingo sent me to try also has the same effect for me. I cannot
reproduce the hangs anymore with that patch applied on top of CFS while
one console checks out SVN repos and another compiles a small test
program.

great! Could you please unapply the hack above and try the proper fix
below, does this one solve the hangs too?


Instead of that one, i tried CFSv3 and i cannot reproduce the hang anymore, 
Thanks!...


And that explains why CFS-v3 on 21-rc7-git3 wouldn't show me the hang. 
As a matter of fact, nothing I did showed any bad behavior! Note that I 
was doing actual badly behaved things which do sometimes glitch the 
standard scheduler, not running benchmarks.


This scheduler is boring, everything works. I am going to try some tests 
on a uniprocessor, though; I have been running everything on either SMP 
or HT CPUs. But so far it looks fine.



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-20 Thread Bill Davidsen

Mike Galbraith wrote:

On Tue, 2007-04-17 at 05:40 +0200, Nick Piggin wrote:

On Tue, Apr 17, 2007 at 04:29:01AM +0200, Mike Galbraith wrote:
 

Yup, and progress _is_ happening now, quite rapidly.

Progress as in progress on Ingo's scheduler. I still don't know how we'd
decide when to replace the mainline scheduler or with what.

I don't think we can say Ingo's is better than the alternatives, can we?


No, that would require massive performance testing of all alternatives.


If there is some kind of bakeoff, then I'd like one of Con's designs to
be involved, and mine, and Peter's...


The trouble with a bakeoff is that it's pretty darn hard to get people
to test in the first place, and then comes weighting the subjective and
hard performance numbers.  If they're close in numbers, do you go with
the one which starts the least flamewars or what?

Here we disagree... I picked a scheduler not by running benchmarks, but 
by running loads which piss me off with the mainline scheduler. And then 
I ran the other schedulers for a while to find the things, normal things 
I do, which resulted in bad behavior. And when I found one which had (so 
far) no such cases I called it my winner, but I haven't tested it under 
server load, so I can't begin to say it's the best.


What we need is for lots of people to run every scheduler in real life, 
and do worst case analysis by finding the cases which cause bad 
behavior. And if there were a way to easily choose another scheduler, 
call it pluggable, modular, or Russian Roulette, people who found a worst 
case would report it (aka bitch about it) and try another. But the 
average user is better able to boot with an option like sched=cfs (or 
sc, or nick, or ...) than to patch and build a kernel. So if we don't 
get easily switched schedulers people will not test nearly as well.
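
To illustrate what "easily switched" could mean for the average user (the 
sched= option itself is hypothetical here, as above, not an existing 
mainline parameter, and the root device is only an example), the choice 
would be one edit to a boot loader entry:

  # grub.conf fragment
  kernel /vmlinuz-2.6.21-rc7 ro root=/dev/md1 sched=cfs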


The best scheduler isn't the one 2% faster than the rest, it's the one 
with the fewest jackpot cases where it sucks. And if the mainline had 
multiple schedulers this testing would get done, authors would get more 
reports and have a better chance of fixing corner cases.


Note that we really need multiple schedulers to make people happy, 
because fairness is not the most desirable behavior on all machines, and 
adding knobs probably isn't the answer. I want a server to degrade 
gently, I want my desktop to show my movie and echo my typing, and if 
that's hard on compiles or the file transfer, so be it. Con doesn't want 
to compromise his goals; that's fine, but I want to have an option when I 
don't share them.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-20 Thread Bill Davidsen

Ingo Molnar wrote:

( Lets be cautious though: the jury is still out whether people actually 
  like this more than the current approach. While CFS feedback looks 
  promising after a whopping 3 days of it being released [ ;-) ], the 
  test coverage of all 'fairness centric' schedulers, even considering 
  years of availability is less than 1% i'm afraid, and that  1% was 
  mostly self-selecting. )


All of my testing has been on desktop machines, although in most cases 
they were really loaded desktops which had load avg 10..100 from time to 
time, and none were low memory machines. Up to CFS v3 I thought 
nicksched was my winner; now CFSv3 looks better, by not having stumbles 
under stupid loads.


I have not tested:
  1 - server loads, nntp, smtp, etc
  2 - low memory machines
  3 - uniprocessor systems

I think this should be done before drawing conclusions. Or if someone 
has tried this, perhaps they would report what they saw. People are 
talking about smoothness, but not how many pages per second come out of 
their overloaded web server.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-21 Thread Bill Davidsen

Matt Mackall wrote:

On Wed, Apr 18, 2007 at 08:37:11AM +0200, Nick Piggin wrote:



[2] It's trivial to construct two or more perfectly reasonable and
desirable definitions of fairness that are mutually incompatible.

Probably not if you use common sense, and in the context of a replacement
for the 2.6 scheduler.


Ok, trivial example. You cannot allocate equal CPU time to
processes/tasks and simultaneously allocate equal time to thread
groups. Is it common sense that a heavily-threaded app should be able
to get hugely more CPU than a well-written app? No. I don't want Joe's
stupid Java app to make my compile crawl.

On the other hand, if my heavily threaded app is, say, a voicemail
server serving 30 customers, I probably want it to get 30x the CPU of
my gzip job.

Matt, you tickled a thought... on one hand we have a single user running 
a threaded application, and it ideally should get the same total CPU as 
a user running a single-threaded process. On the other hand we have a 
threaded application, call it sendmail, nnrpd, httpd, bind, whatever. In 
that case each thread is really providing service for an independent 
user, and should get an appropriate share of the CPU.
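
A crude way to see the difference being described under any per-task-fair 
scheduler is to compare a one-task job against a many-task job, here 
approximated with processes rather than threads (figures assume a single 
CPU; on SMP the split differs but the imbalance remains):

  % yes > /dev/null &                                     # "single-threaded" job
  % for i in 1 2 3 4 5 6 7 8; do yes > /dev/null & done   # "heavily threaded" job
  % ps -eo pcpu,comm | grep yes     # the lone task gets roughly 1/9, not 1/2
  % killall yes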


Perhaps the solution is to add a means for identifying server processes, 
by capability, or by membership in a server group, or by having the 
initiating process set some flag at exec() time. That doesn't 
necessarily solve problems, but it may provide more information to allow 
them to be soluble.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-21 Thread Bill Davidsen

Linus Torvalds wrote:


On Wed, 18 Apr 2007, Matt Mackall wrote:

Why is X special? Because it does work on behalf of other processes?
Lots of things do this. Perhaps a scheduler should focus entirely on
the implicit and directed wakeup matrix and optimizing that
instead[1].


I 100% agree - the perfect scheduler would indeed take into account where 
the wakeups come from, and try to weigh processes that help other 
processes make progress more. That would naturally give server processes 
more CPU power, because they help others


I don't believe for a second that fairness means give everybody the 
same amount of CPU. That's a totally illogical measure of fairness. All 
processes are _not_ created equal.


That said, even trying to do fairness by effective user ID would 
probably already do a lot. In a desktop environment, X would get as much 
CPU time as the user processes, simply because it's in a different 
protection domain (and that's really what effective user ID means: it's 
not about users, it's really about protection domains).


And fairness by euid is probably a hell of a lot easier to do than 
trying to figure out the wakeup matrix.


You probably want to consider the controlling terminal as well...  do 
you want to have people starting 'at' jobs competing on equal footing 
with people typing at a terminal? I'm not offering an answer, just 
raising the question.


And for some database applications, everyone in a group may connect with 
the same login-id, then do sub authorization to the database 
application. euid may be an issue there as well.
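
A quick way to see how lopsided the split can be today is to sum CPU usage 
by effective user id, which is roughly the quantity a per-euid-fair 
scheduler would try to equalize (GNU ps assumed):

  % ps -eo euser,pcpu --no-headers | \
      awk '{cpu[$1] += $2} END {for (u in cpu) printf "%-12s %6.1f\n", u, cpu[u]}'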


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-21 Thread Bill Davidsen

Ingo Molnar wrote:

* Davide Libenzi [EMAIL PROTECTED] wrote:


The same user nicing two different multi-threaded processes would 
expect a predictable CPU distribution too. [...]


i disagree that the user 'would expect' this. Some users might. Others 
would say: 'my 10-thread rendering engine is more important than a 
1-thread job because it's using 10 threads for a reason'. And the CFS 
feedback so far strengthens this point: the default behavior of treating 
the thread as a single scheduling (and CPU time accounting) unit works 
pretty well on the desktop.


If by desktop you mean one and only one interactive user, that's true. 
On a shared machine it's very hard to preserve any semblance of fairness 
when one user gets far more than another, based not on the value of what 
they're doing but the tools they use to do it.


think about it in another, 'kernel policy' way as well: we'd like to 
_encourage_ more parallel user applications. Hurting them by accounting 
all threads together sends the exact opposite message.


Why is that? There are lots of things which are intrinsically single 
threaded; how are we hurting multi-threaded applications by refusing to 
give them more CPU than an application running on behalf of another 
user? By accounting all threads together we encourage writing an 
application in the most logical way. Threads are a solution, not a goal 
in themselves.


[...] Doing that efficiently (the old per-cpu run-queue is pretty nice 
from many POVs) is the real challenge.


yeah.

Ingo



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


[RFC] another scheduler beater

2007-04-23 Thread Bill Davidsen
The small attached script does a nice job of showing animation glitches 
in the glxgears animation. I have run one set of tests, and will have 
several more tomorrow. I'm off to a poker game, and would like to let 
people draw their own conclusions.


Based on just this script as load I would say renice on X isn't a good 
thing. Based on one small test, I would say that renice of X in 
conjunction with heavy disk i/o and a single fast scrolling xterm (think 
kernel compile) seems to slow the raid6 thread measurably. Results late 
tomorrow, it will be an early and long day :-(


glitch1.sh
Description: Bourne shell script
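
The attached script is not reproduced here; a minimal sketch of the kind
of load it generates (glxgears plus fast-scrolling output and a little
fork churn) would be something like:

  #!/bin/sh
  # sketch only -- NOT the attached glitch1.sh
  glxgears &
  gears=$!
  while :; do
      dmesg                                    # fast-scrolling text in the xterm
      for i in 1 2 3 4; do /bin/true & done    # some fork/exec churn
      wait
  done
  # kill $gears when done watching for animation glitches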


[REPORT] First glitch1 results, 2.6.21-rc7-git6-CFSv5

2007-04-23 Thread Bill Davidsen
I am not sure a binary attachment will go through; I will move the data 
to the web site if not.


GL2.6.21-rc7-git6-CFSv5_nice0_jump
Description: Binary data


GL2.6.21-rc7-git6-CFSv5_nice0_nojump
Description: Binary data


GL2.6.21-rc7-git6-CFSv5_nice19_nojump
Description: Binary data


GL2.6.21-rc7-git6-CFSv5_nice-19_jump
Description: Binary data


Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-08 Thread Bill Davidsen

Linus Torvalds wrote:


On Mon, 5 Mar 2007, Ed Tomlinson wrote:
The patch _does_ make a difference.  For instance reading mail with freenet working 
hard  (threaded java application) and gentoo's emerge triggering compiles to update the 
box is much smoother.


Think this scheduler needs serious looking at.  


I agree, partly because it's obviously been getting rave reviews so far, 
but mainly because it looks like you can think about behaviour a lot 
better, something that was always very hard with the interactivity 
boosters with process state history.


I'm not at all opposed to this, but we do need:
 - to not do it at this stage in the stable kernel
 - to let it sit in -mm for at least a short while
 - and generally more people testing more loads.

Please, could you now rethink pluggable schedulers as well? Even if one 
had to be chosen at boot time and couldn't be changed thereafter, it 
would still allow a few new thoughts to be included.


I don't actually worry too much about switching out a CPU scheduler: those 
things are places where you *can* largely read the source code and get an 
idea for them (although with the kind of history state that we currently 
have, it's really really hard). But at the very least they aren't likely 
to have subtle bugs that show up elsewhere, so...


I confess that the default scheduler works for me most of the time; i/o 
tuning is more productive. I want to test with a kvm load, but 
2.6.21-rc3-git3 doesn't want to run kvm at all. I'm looking to see what 
I broke, since nbd doesn't work either.


I'm collecting OOPS now, will forward when I have a few more.

So as long as the generic concerns above are under control, I'll happily 
try something like this if it can be merged early in a merge window..


Linus



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-08 Thread Bill Davidsen

Con Kolivas wrote:

On Wednesday 07 March 2007 04:50, Bill Davidsen wrote:



With luck I'll get to shake out that patch in combination with kvm later
today.


Great thanks!. I've appreciated all the feedback so far.

I did try, but 2.6.21-rc3-git3 doesn't want to run kvm for me, and your 
patch may not be doing what it should. I'm falling back to 2.6.20 and 
will retest after I document my kvm issues.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: Question: schedule()

2007-03-08 Thread Bill Davidsen

albcamus wrote:

your kthread IS preemptible unless you call preempt_disable or some
locking functions explicitly .

I think he's trying to go the other way: make his thread the highest 
priority to blow anything else in the system out of the water. See his 
previous post, "how to make kernel thread more faster?"
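
For completeness, the userspace analogue of what the original poster
seems to want is a real-time scheduling class, e.g. via chrt; the pid
below is just a placeholder, and doing the equivalent from inside the
kernel is a separate discussion:

  % chrt -f -p 50 12345    # put pid 12345 into SCHED_FIFO at priority 50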


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-09 Thread Bill Davidsen

Linus Torvalds wrote:

On Thu, 8 Mar 2007, Bill Davidsen wrote:
  

Please, could you now rethink pluggable schedulers as well? Even if one had to
be chosen at boot time and couldn't be changed thereafter, it would still allow
a few new thoughts to be included.



No. Really.

I absolutely *detest* pluggable schedulers. They have a huge downside: 
they allow people to think that it's ok to make special-case schedulers. 
  
But it IS okay for people to make special-case schedulers. Because it's 
MY machine, and how it behaves under mixed load is not a technical 
issue, it's a POLICY issue, and therefore the only way you can allow the 
admin to implement that policy is to either provide several schedulers 
or to provide all sorts of tunable knobs. And by having a few schedulers 
which have been heavily tested and reviewed, you can define the policy 
the scheduler implements and document it. Instead of people writing 
their own, or hacking the code, they could have a few well-tested 
choices, with known policy goals.

And I simply very fundamentally disagree.

If you want to play with a scheduler of your own, go wild. It's easy 
(well, you'll find out that getting good results isn't, but that's a 
different thing). But actual pluggable schedulers just cause people to 
think that "oh, the scheduler performs badly under circumstance X, so 
let's tell people to use special scheduler Y for that case".
  
And has that been a problem with io schedulers? I don't see any vast 
proliferation of them, I don't see contentious exchanges on LKML, or 
people asking how to get yet another into mainline. In fact, I would say 
that the io scheduler situation is as right as anything can be, choices 
for special cases, lack of requests for something else.
And CPU scheduling really isn't that complicated. It's *way* simpler than 
IO scheduling. There simply is *no* excuse for not trying to do it well 
enough for all cases, or for having special-case stuff.
  
This supposes that the desired behavior, the policy, is the same on all 
machines or that there is currently a way to set the target. If I want 
interactive response with no consideration to batch (and can't trust 
users to use nice), I want one policy. If I want a compromise, the 
current scheduler or RSDL are candidates, but they do different things.
But even IO scheduling actually ends up being largely the same. Yes, we 
have pluggable schedulers, and we even allow switching them, but in the 
end, we don't want people to actually do it. It's much better to have a 
scheduler that is good enough than it is to have five that are perfect 
for five particular cases.
  
We not only have multiple io schedulers, we have many tunable io 
parameters, all of which allow people to make their system behave the 
way they think is best. It isn't causing complaint, confusion, or 
instability. We have many people requesting a different scheduler, so 
obviously what we have isn't good enough and I doubt any one scheduler 
can be, given that the target behavior is driven by non-technical choices.
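
For readers who haven't played with them, the runtime choice referred to 
above looks like this on a 2.6 kernel (sda is just an example device):

  % cat /sys/block/sda/queue/scheduler
  noop anticipatory deadline [cfq]
  # echo deadline > /sys/block/sda/queue/scheduler   # switch one device at runtime
  # ...or boot with elevator=deadline to change the default for all devices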


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH] fix read past end of array in md/linear.c

2007-03-08 Thread Bill Davidsen

Andy Isaacson wrote:

When iterating through an array, one must be careful to test one's index
variable rather than another similarly-named variable.  


The loop will read off the end of conf->disks[] in the following
(pathological) case:

% dd bs=1 seek=840716287 if=/dev/zero of=d1 count=1
% for i in 2 3 4; do dd if=/dev/zero of=d$i bs=1k count=$(($i+150)); done
% ./vmlinux ubd0=root ubd1=d1 ubd2=d2 ubd3=d3 ubd4=d4
# mdadm -C /dev/md0 --level=linear --raid-devices=4 /dev/ubd[1234]

adding some printks, I saw this:
[42949374.96] hash_spacing = 821120
[42949374.96] cnt  = 4
[42949374.96] min_spacing  = 801
[42949374.96] j=0 size=820928 sz=820928
[42949374.96] i=0 sz=820928 hash_spacing=820928
[42949374.96] j=1 size=64 sz=64
[42949374.96] j=2 size=64 sz=128
[42949374.96] j=3 size=64 sz=192
[42949374.96] j=4 size=1515870810 sz=1515871002

Index: linus/drivers/md/linear.c
===
--- linus.orig/drivers/md/linear.c  2007-03-02 11:35:55.0 -0800
+++ linus/drivers/md/linear.c   2007-03-07 13:10:30.0 -0800
@@ -188,7 +188,7 @@
        for (i=0; i < cnt-1 ; i++) {
                sector_t sz = 0;
                int j;
-               for (j=i; i<cnt-1 && sz < min_spacing ; j++)
+               for (j=i; j<cnt-1 && sz < min_spacing ; j++)
                        sz += conf->disks[j].size;
                if (sz >= min_spacing && sz < conf->hash_spacing)
                        conf->hash_spacing = sz;


After looking at that code, I have to wonder how this ever worked, or if 
in fact anyone ever took this path. I assume that the value of sz caused 
the loop exit in all cases, since this has been in the code at least 
since 2.6.15, oldest thing I have handy.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PROBLEM] Can't start MD devices by using /dev/disk/by-id

2007-02-13 Thread Bill Davidsen

Patrick Ale wrote:

Hi chaps,

I just came home, rebooted my box to git8 and *gasp* a problem :)

I can't start my MD devices anymore by defining /dev/disk/by-id/*
devices in /etc/mdadm.conf.

When I do "mdadm --assemble /dev/md/1" it tells me "No devices found
for /dev/md/1".
When I edit the file /etc/mdadm.conf and change the /dev/disk/by-id/*
to whatever the symbolic links points to in the /dev directory, it
does work.

Just out of curiosity, why did you do this in such a manual way instead 
of just using the UUID? I would think every time you replace a failed 
drive you would have to go edit the files all over again.
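
For reference, the UUID-based approach is one ARRAY line per array in
/etc/mdadm.conf, and mdadm will generate it; the UUID value below is only
an example:

  # mdadm --detail --scan
  ARRAY /dev/md/1 level=raid1 num-devices=2 UUID=3aaa0122:29827cfa:5331ad66:ca767371
  # append that output to /etc/mdadm.conf and assembly no longer depends
  # on which /dev names the member disks happen to get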


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: smp and irq conflict

2007-02-14 Thread Bill Davidsen

Benny Amorsen wrote:

BD == Bill Davidsen [EMAIL PROTECTED] writes:



BD You may be able to move one board to another slot, but looking at
BD the bandwidth I suspect you may need a server motherboard with
BD multiple busses, preferably running at 66MHz 64bit. I don't think
BD this is a interrupt problem, but you can just try capture on two
BD channels which share an interrupt, like bttv0 and bttv7 to verify
BD that.

66MHz 64bit isn't much fun when the capture cards are 33MHz 32bit.

  


It doesn't help getting the video onto the bus, but multiple busses, 
giving a bus per card, would help; and assuming the data are being saved 
to disk using a decent disk controller which can use the additional 
bandwidth, at least some contention is avoided or reduced.


This is really a case of using general hardware to the utmost, I suspect 
more m/b bandwidth will be needed somewhere.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: AHCI - remove probing of ata2

2007-02-14 Thread Bill Davidsen

Greg Trounson wrote:

At the risk of sounding like a me too post:

I also have an Asus P5W-DH, with the following drives connected:

SATA: ST3250820AS, connected to sata1
PATA: HL-DT-ST GSA-H12N, ATAPI DVD Writer, Primary master

On bootup of 2.6.19 and 2.6.20, the kernel stalls for 1 minute when 
probing sata2, eventually giving up and continuing the boot process.  
There is no physical sata2 connector on the Motherboard, just solder 
lugs between sata1 and sata3.  From other users I understand this is 
really a Silicon Image SIL4723 SATA to 2-Port SATA splitter.  It is 
detected by the kernel as a disk, as below.


The relevant part of the boot process looks like:
...
libata version 2.00 loaded.
ahci 0000:00:1f.2: version 2.0
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 23 (level, low) -> IRQ 22
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq led clo pio slum part
ata1: SATA max UDMA/133 cmd 0xF882A900 ctl 0x0 bmdma 0x0 irq 219
ata2: SATA max UDMA/133 cmd 0xF882A980 ctl 0x0 bmdma 0x0 irq 219
ata3: SATA max UDMA/133 cmd 0xF882AA00 ctl 0x0 bmdma 0x0 irq 219
ata4: SATA max UDMA/133 cmd 0xF882AA80 ctl 0x0 bmdma 0x0 irq 219
scsi0 : ahci
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7, max UDMA/133, 488397168 sectors: LBA48 NCQ (depth 31/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

...waits 20 seconds...

ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)

...waits 5 seconds...

ata2: port is slow to respond, please be patient (Status 0x80)

...waits 30 seconds...

ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs

...waits 5 seconds...

ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-6, max UDMA/133, 640 sectors: LBA
ata2.00: ata2: dev 0 multi count 1
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
...

A bit of poking about shows:

fdisk -l /dev/sdb
Disk /dev/sdb: 0 MB, 327680 bytes
255 heads, 63 sectors/track, 0 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdb doesn't contain a valid partition table

So it presents itself as a 320k disk, filled with zeroes as below:

dd if=/dev/sdb |hexdump
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0050000

640+0 records in
640+0 records out
327680 bytes (328 kB) copied, 0.0106662 seconds, 30.7 MB/s

Note that this is not a fatal error.  The machine still boots 
eventually, but the seemingly mandatory 60 second pause makes startup 
rather cumbersome for the user.


So far none of the suggested fixes have managed to stop ata2 from being 
detected. (noprobe=ata2, irqpoll, etc).  I understand this problem 
wasn't present in 2.6.16 so the problem must lie in some patch since 
then.  I see Tejun is working towards patches for this and I would be 
happy to try them here.


Is this 320k of cache memory, or in any way some actual storage on the 
system? Have you tried to write to it out of curiosity? Seems odd that 
it would be detected if there were nothing at all present, although 
obviously it could be an artifact.
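
A quick way to answer the "does it store anything" question, destructive
and therefore only for a device you are certain is this 320k pseudo-disk
(the drop_caches step assumes a 2.6.16+ kernel):

  # dd if=/dev/urandom of=/tmp/probe bs=512 count=8
  # dd if=/tmp/probe of=/dev/sdb bs=512 count=8
  # echo 3 > /proc/sys/vm/drop_caches      # so the read-back isn't from cache
  # dd if=/dev/sdb of=/tmp/readback bs=512 count=8
  # cmp /tmp/probe /tmp/readback && echo "writes stick" || echo "writes ignored"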


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


[2.6.20-get13] KVM-12 won't build

2007-02-16 Thread Bill Davidsen

Goes out with an error message:

  cc -I /home/davidsen/downloads/kernel.org/linux-2.6.20-git13/include -MMD
     -MF ./.kvmctl.d -g -c -o kvmctl.o kvmctl.c
  kvmctl.c:29:2: error: #error libkvm: userspace and kernel version mismatch
  make[1]: *** [kvmctl.o] Error 1


I don't see a kvm-13 on the KVM website.

--
Bill Davidsen
 He was a full-time professional cat, not some moonlighting
ferret or weasel. He knew about these things.



Excessive dmesg whining in 2.6.20-git13

2007-02-16 Thread Bill Davidsen
The good news is that this kernel boots, so I can start testing. 
However, it seems to have a LOT of trouble coping with the idea that my 
only IDE device is a DVD burner. I am guessing from the hundreds of 
lines of nbd whining that nbd doesn't work; testing will continue after 
I go plow more snow.


If all this whining doesn't indicate a problem, I might suggest 
eliminating it, since it tends to hide any real problem.


compressed dmesg and config attached.

--
Bill Davidsen
 He was a full-time professional cat, not some moonlighting
ferret or weasel. He knew about these things.



config.gz
Description: GNU Zip compressed data


dmesg-2.6.20-git13.bz2
Description: BZip2 compressed data


Re: [2.6.20-get13] KVM-12 won't build

2007-02-16 Thread Bill Davidsen

Joerg Roedel wrote:

On Fri, Feb 16, 2007 at 11:32:13AM -0500, Bill Davidsen wrote:
  

Goes out with an error message:

  cc -I /home/davidsen/downloads/kernel.org/linux-2.6.20-git13/include -MMD
     -MF ./.kvmctl.d -g -c -o kvmctl.o kvmctl.c
  kvmctl.c:29:2: error: #error libkvm: userspace and kernel version mismatch
  make[1]: *** [kvmctl.o] Error 1
I don't see a kvm-13 on the KVM website.



You will find the kvm-13 release in the SourceForge download area of
KVM[1]. Kvm-12 is still required for 2.6.20 kernels.

Joerg

[1] http://sourceforge.net/project/showfiles.php?group_id=180599

  

I'll look, the download off the home page didn't seem to have it.

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [ANNOUNCE] DualFS: File System with Meta-data and Data Separation

2007-02-16 Thread Bill Davidsen

Jörn Engel wrote:

On Thu, 15 February 2007 23:59:14 +0100, Juan Piernas Canovas wrote:
Actually, the version of DualFS for Linux 2.4.19 implements a cleaner. In 
our case, the cleaner is not really a problem because there is not too 
much to clean (the meta-data device only contains meta-data blocks which 
are 5-6% of the file system blocks; you do not have to move data blocks).


That sounds as if you have not hit the interesting cases yet.  Fun
starts when your device is near-full and you have a write-intensive
workload.  In your case, that would be metadata-write-intensive.  For
one, this is where write performance of log-structured filesystems
usually goes down the drain.  And worse, it is where the cleaner can
run into a deadlock.

Being good where log-structured filesystems usually are horrible is a
challenge.  And I'm sure many people are more interested in those
performance number than in the ones you shine at. :)

Actually I am interested in the common case, where the machine is not 
out of space, or memory, or CPU, but is appropriately sized to the 
workload. Not that I lack interest in corner cases, but the running 
flat out case doesn't reflect the case where there's enough hardware 
and the o/s just needs to use it well.


The one high load benchmark I would love to see is a web server, running 
tux, with a load over a large (number of files) distributed data set. 
The much faster tar create times posted make me think that a server with 
a lot of files would benefit, when CPU and memory requirements are not a 
bottleneck.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: [ANNOUNCE] DualFS: File System with Meta-data and Data Separation

2007-02-17 Thread Bill Davidsen

Jörn Engel wrote:

On Fri, 16 February 2007 18:47:48 -0500, Bill Davidsen wrote:
  
Actually I am interested in the common case, where the machine is not 
out of space, or memory, or CPU, but is appropriately sized to the 
workload. Not that I lack interest in corner cases, but the running 
flat out case doesn't reflect the case where there's enough hardware 
and the o/s just needs to use it well.



There is one detail about this specific corner case you may be missing.
Most log-structured filesystems don't just drop in performance - they
can run into a deadlock and the only recovery from this is the lovely
backup-mkfs-restore procedure.
  

I missed that. Which corner case did you find triggers this in DualFS?

If it was just performance, I would agree with you.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979




Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-17 Thread Bill Davidsen

Cédric Augonnet wrote:

2007/2/15, Andrew Morton [EMAIL PROTECTED]:


Temporarily at

  http://userweb.kernel.org/~akpm/2.6.20-mm1/

Will appear later at


Changes since 2.6.20-rc6-mm3:

-minix-v3-support.patch


Hi Daniel,

On 2.6.20-rc6-mm3 and 2.6.20-mm1, i get an OOPS when using the minix 3
file system. I enclose the dmesg and the .config to that mail.

Here are the steps to reproduce this oops (they involve using qemu to
run Minix 3)
- First create a 2GB image using
 qemu-img create minix.img 2G
  (Please note that this seems to be producing an erroneous image)
- Then launch Minix inside qemu to make a minix partition on this
image using mkfs on the corresponding device.


That's two steps, right? First you make a partition on the disk qemu 
provides, then you put a filesystem on the partition? Or did you put a 
filesystem on the raw device?



- Mount the image on loopback using
 mount -t minix -o loop minix.img /mnt/qemu/


Does mount know to use Minix 3 with this command line?


- issue a df command on /mnt/qemu

This oops occurs every time I use df on this directory. However, this
does not occur if the image was for a 1MB partition. And it does not
occur if the partition on which we created minix.img was the same as
the partition on which qemu stands. Sounds like qemu has an issue and
creates an erroneous partition which linux does not handle correctly.

Regards, and thanks for your patch by the way !


Having been burned a few times by the fact that qemu provides disk 
images which then (normally) get partitions, I'm not sure you aren't 
having the same problem.
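
If that is what happened here, i.e. a partition table inside minix.img
rather than a bare filesystem, the loop mount needs an offset; the
63-sector start below is only the traditional default, not something
verified for this image:

  % fdisk -lu minix.img      # check where the partition actually starts
  # mount -t minix -o loop,offset=$((63*512)) minix.img /mnt/qemu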


None of which justifies the OOPS, of course; nice kernels don't go down.

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



[RFC] Time for a linux-kvm mailing list?

2007-02-17 Thread Bill Davidsen
There doesn't seem to be a great place for KVM user questions: this list 
is it at the moment, kvm-devel seems a poor place for user questions, and 
the chat room is real time and depends on the question and the answer 
being in the same place at the same time.


Just a thought on getting a dialogue going in the right place.

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [RFC] Time for a linux-kvm mailing list?

2007-02-18 Thread Bill Davidsen

Avi Kivity wrote:

Bill Davidsen wrote:
There doesn't seem to be a great place for KVM user questions, this 
is it, and kvm-devel seems a poor place for user questions, while the 
chat room is real time and depends on the question and the answer 
being in the same place at the same time.




kvm-devel is perfectly suitable for user queries.



Just a thought on getting a dialogue going in the right place.



You could have started by posting your idea on kvm-devel, where kvm 
developers and users would actually see it.


Why would I post it to a list where it's off-topic by list name? And how 
would anyone know that the list name can be ignored when so many other 
lists with devel in the name tell people with user questions to go 
elsewhere? Right now only users who ignore list names would even look there.


I was suggesting a way to improve user participation; since you don't 
think that's needed I'll stop trying to help. I guess since kvm needs 
more hardware you have fewer users and don't need a user support list 
like xen has.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-17 Thread Bill Davidsen

Con Kolivas wrote:

On Monday 12 March 2007 22:26, Al Boldi wrote:

Con Kolivas wrote:

On Monday 12 March 2007 15:42, Al Boldi wrote:

Con Kolivas wrote:

On Monday 12 March 2007 08:52, Con Kolivas wrote:

And thank you! I think I know what's going on now. I think each
rotation is followed by another rotation before the higher priority
task is getting a look in in schedule() to even get quota and add
it to the runqueue quota. I'll try a simple change to see if that
helps. Patch coming up shortly.

Can you try the following patch and see if it helps. There's also one
minor preemption logic fix in there that I'm planning on including.
Thanks!

Applied on top of v0.28 mainline, and there is no difference.

What's it look like on your machine?

The higher priority one always get 6-7ms whereas the lower priority one
runs 6-7ms and then one larger perfectly bound expiration amount.
Basically exactly as I'd expect. The higher priority task gets precisely
RR_INTERVAL maximum latency whereas the lower priority task gets
RR_INTERVAL min and full expiration (according to the virtual deadline)
as a maximum. That's exactly how I intend it to work. Yes I realise that
the max latency ends up being longer intermittently on the niced task but
that's -in my opinion- perfectly fine as a compromise to ensure the nice
0 one always gets low latency.

I think, it should be possible to spread this max expiration latency across
the rotation, should it not?


There is a way that I toyed with of creating maps of slots to use for each 
different priority, but it broke the O(1) nature of the virtual deadline 
management. Minimising algorithmic complexity seemed more important to 
maintain than getting slightly better latency spreads for niced tasks. It 
also appeared to be less cache friendly in design. I could certainly try and 
implement it but how much importance are we to place on latency of niced 
tasks? Are you aware of any usage scenario where latency sensitive tasks are 
ever significantly niced in the real world?


It depends on how you reconcile "completely fair" with order of 
magnitude blips in latency. It looks (from the results, not the code) 
as if nice is implemented by round-robin scheduling followed by, once in 
a while, just not giving the CPU to the niced task for a while. Given the 
smooth nature of the performance otherwise, it's more obvious than it 
would be if you weren't doing such a good job most of the time.
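
A crude way to see those intermittent blips from userspace, without a
benchmark: timestamp short sleeps in a nice 19 shell while a nice 0 hog
runs, and look for gaps much larger than the sleep interval (GNU date and
sleep assumed):

  % yes > /dev/null &
  % nice -n 19 sh -c 'while :; do date +%s.%N; sleep 0.01; done' | head -300 > samples
  % kill %1
  # adjacent samples normally differ by ~0.01s; the occasional much larger
  # gap is the expiration-sized latency being discussed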


Ugly stands out more on something beautiful!

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: is RSDL an unfair scheduler too?

2007-03-17 Thread Bill Davidsen

Con Kolivas wrote:

On Saturday 17 March 2007 23:28, Ingo Molnar wrote:

* Con Kolivas [EMAIL PROTECTED] wrote:

We're obviously disagreeing on what heuristics are [...]

that could very well be so - it would be helpful if you could provide
your own rough definition for the term, so that we can agree on how to
call things?

[ in any case, there's no rush here, please reply at your own pace, as
  your condition allows. I wish you a speedy recovery! ]


You're simply cashing in on the deep pipes that do kernel work for
other tasks. You know very well that I dropped the TASK_NONINTERACTIVE
flag from rsdl which checks that tasks are waiting on pipes and you're
exploiting it.

Con, i am not 'cashing in' on anything and i'm not 'exploiting'
anything. The TASK_NONINTERACTIVE flag is totally irrelevant to my
argument because i was not testing the vanilla scheduler, i was testing
RSDL. I could have written this test using plain sockets, because i was
testing RSDL's claim of not having heuristics, i was not testing the
vanilla scheduler.

I have simply replied to this claim of yours:

Despite the claims to the contrary, RSDL does not have _less_
heuristics, it does not have _any_. [...]

and i showed you a workload under _RSDL_ that clearly shows that RSDL is
an unfair scheduler too.

my whole point was to counter the myth of 'RSDL has no heuristics'. Of
course it has heuristics, which results in unfairness. (If it didnt have
any heuristics that tilt the balance of scheduling towards sleep-intense
tasks then a default Linux desktop would not be usable at all.)

so the decision is _not_ a puristic "do we want to have heuristics or
not", the question is a more practical "which heuristics are simpler,
which heuristics are more flexible, which heuristics result in better
behavior".

Ingo


Ok but please look at how it appears from my end (illness aside).

I spent 3 years just diddling with scheduler code trying my hardest to find a 
design that fixes a whole swag of problems we still have, and a swag of 
problems we might get with other fixes.


You initially said you were pleased with this design.

..lots of code, testing, bugfixes and good feedback.

Then Mike has one testcase that most other users disagree is worthy of being 
considered a regression. You latched onto that and basically called it a 
showstopper in spite of who knows how many other positive things.


Then you quickly produce a counter patch designed to kill off RSDL with a 
config option for mainline.


Then you boldly announce on LKML "is RSDL an unfair scheduler too?" with 
some test case you whipped up to try and find fault with the design.


No damn it! He's pointing out that you do have heuristics, they are just 
built into the design. And of course he's whipping up test cases, how 
else can anyone help you find corner cases where it behaves in an 
unexpected or undesirable manner?


I think he's trying to help, please stop taking it personally.


What am I supposed to think? Considering just how many problems I have 
addressed and tried to correct with RSDL successfully, I'm surprised that 
despite your enthusiasm for it initially you have spent the rest of the time 
trying to block it.


Please, either help me (and I'm in no shape to code at the moment despite what 
I have done so far), or say you have no intention of including it. I'm 
risking paralysis just by sitting at the computer right now so I'm dropping 
the code as is at the moment and will leave it up to your better judgement as 
to what to do with it.


Actually I think Ingo has tried to help get it in; that's his patch 
offered for CONFIG_SCHED_FAIR, which lets people try it and all.


Now for something constructive... by any chance is Mike running KDE 
instead of GNOME? I only had a short time to play because I had to look 
at another problem in 2.6.21-rc3 (nbd not working), so the test machine 
is in use. But it looked as if behavior was not as smooth with KDE. May 
that thought be useful.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: is RSDL an unfair scheduler too?

2007-03-17 Thread Bill Davidsen
 schedulers, because I don't believe any one can match the 
behavior goals of all users.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: Pluggable Schedulers (was: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler)

2007-03-17 Thread Bill Davidsen

David Lang wrote:

On Fri, 9 Mar 2007, Al Boldi wrote:




My preferred sphere of operation is the Manichean domain of faster vs.
slower, functionality vs. non-functionality, and the like. For me, such
design concerns are like the need for a kernel to format pagetables so
the x86 MMU decodes what was intended, or for a compiler to emit valid
assembly instructions, or for a programmer to write C the compiler
won't reject with parse errors.


Sure, but I think, even from a technical point of view, competition is a 
good thing to have.  Pluggable schedulers give us this kind of 
competition, that forces each scheduler to refine or become obsolete.  
Think evolution.


The point Linus is making is that with pluggable schedulers there isn't 
competition between them; the various developer teams would go off in 
their own direction, and any drawbacks to their scheduler could be 
answered with "that's not what we are good at, use a different 
scheduler", with the very real possibility that a person could get this 
answer from ALL schedulers, leaving them with nothing good to use.


Have you noticed that currently that is exactly what happens? If the 
default scheduler doesn't handle your load well you have the option of 
rewriting it and maintaining it, or doing without, or trying to fix your 
case without breaking others, or patching in some other, non-mainline, 
scheduler.


The default scheduler has been around long enough that I don't see it 
being tuned for any A without making some B perform worse. Thus multiple 
schedulers are a possible solution.


They don't need to be available as runtime choices; boot time selection 
would still allow reasonable testing. I can see myself using a compile 
time option and building multiple kernels, but not the average user.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-03-18 Thread Bill Davidsen

Andrew Morton wrote:

On Wed, 14 Mar 2007 21:42:46 +0900 Tomoki Sekiyama [EMAIL PROTECTED] wrote:

...


-Solution:

I consider that all of the dirty pages for the disk have been written
back and that the disk is clean if a process cannot write 'write_chunk'
pages in balance_dirty_pages().

To avoid using up the free memory with dirty pages by passing blocking,
this patchset adds a new threshold named vm.dirty_limit_ratio to sysctl.

It modifies balance_dirty_pages() not to block when the amount of
Dirty+Writeback is less than vm.dirty_limit_ratio percent of the memory.
In the other cases, writers are throttled as current Linux does.


In this patchset, vm.dirty_limit_ratio, instead of vm.dirty_ratio, is
used as the clamping level of Dirty+Writeback. And, vm.dirty_ratio is
used as the level at which a writers will itself start writeback of the
dirty pages.


Might be a reasonable solution - let's see what Peter comes up with too.

Comments on the patch:

- Please don't VM_DIRTY_LIMIT_RATIO: just use CTL_UNNUMBERED and leave
  sysctl.h alone.

- The 40% default is already too high.  Let's set this new upper limit to
  40% and decrease the non-blocking ratio.

- Please update the procfs documentation in ./Documentation/

- I wonder if dirty_limit_ratio is the best name we could choose. 
  vm_dirty_blocking_ratio, perhaps?  Dunno.


I don't like it, but I dislike it less than dirty_limit_ratio I guess. 
It would probably break things to change it now, including my 
sysctl.conf on a number of systems :-(


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [ck] Re: RSDL v0.31

2007-03-19 Thread Bill Davidsen

Kasper Sandberg wrote:

On Sun, 2007-03-18 at 08:38 +0100, Mike Galbraith wrote:

On Sun, 2007-03-18 at 08:22 +0100, Radoslaw Szkodzinski wrote:


I'd recon KDE regresses because of kioslaves waiting on a pipe
(communication with the app they're doing IO for) and then expiring.
That's why splitting IO from an app isn't exactly smart. It should at
least be ran in an another thread.

Hm.  Sounds rather a lot like the...
X sucks, fix X and RSDL will rock your world.  RSDL is perfect.
...that I've been getting.


Not really, only X sucks. KDE works at least as well with RSDL as
vanilla. I don't know who originally said KDE works worse, wasn't it just
something someone thought?

It was probably me, and I had the opinion that KDE is not as smooth as 
GNOME with RSDL. I haven't had time to measure, but using it for daily 
stuff for about an hour each way hasn't changed my opinion. Every once 
in a while KDE will KLUNK to a halt for 200-300ms doing mundane stuff 
like redrawing a page, scrolling, etc. I don't see it with GNOME.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: is RSDL an unfair scheduler too?

2007-03-19 Thread Bill Davidsen

Bill Huey (hui) wrote:

On Sun, Mar 18, 2007 at 06:24:40AM +0100, Willy Tarreau wrote:
Dunno. I guess a lot of people would like to then manage the classes, 
which would be painful as hell. 

Sure! I wouldn't like people to point the finger at Linux saying "hey
look, they can't write a good scheduler so you have to adjust the knobs
yourself!". I keep in mind that Solaris' scheduler is very good, both
fair and interactive. FreeBSD was good (I haven't tested for a long time).
We should manage to get something good for most usages, and optimize
later for specific uses.


Like I've said in a previous email, SGI schedulers have an interactive
term in addition to the normal nice values. If RSDL ends up being too
rigid for desktop use, then this might be a good idea to explore in
addition to priority manipulation.

However, it hasn't been completely proven that RSDL can't handle desktop
loads and that needs to be completely explored first. It certainly seems
like, from the .jpgs that were posted earlier in the thread regarding mysql
performance, that RSDL seems to have improved performance for those set
ups so it's not universally the case that it sucks for server loads. The
cause of this performance difference has yet to be pinpointed.


I would say that RSDL is probably a bit better than default for server 
use, although if the server starves for CPU, interactive processing at 
the console becomes leisurely indeed. The only thing I would like to 
address is the order-of-magnitude blips in latency of nice processes, 
which may be solved by playing with time slices. Con hasn't really 
commented on that (or I haven't read down to it).




Also, bandwidth schedulers like this are a new critical development for
things like the -rt patch. It would benefit greatly if the RSDL basic
mechanisms (RR and deadlines) were to somehow slip into that patch and
be used for a more strict -rt based scheduling class. It would be the basis
for first-class control over process resource usage and would be a first
in Linux or any mainstream kernel.


I don't think that RSDL and -rt should be merged, but that's for Ingo 
and Con to discuss. I would love to see RSDL in mainline as soon as it 
is practical, marked as EXPERIMENTAL.


This would be a powerful addition to Linux as a whole and RSDL should
not be dismissed without these considerations. If it can somehow be
integrated into the kernel with interactivity concerns addressed, then
it would be an all out win for the kernel in both these areas.

I don't think there are a lot of places where it underperforms the 
default scheduler, and it avoids a lot of jackpot cases where an 
overloaded system really bogs down. I would like to see more varied 
testing before any changes are made, unless a simple change would 
improve consistency of latency.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: [linux-pm] [2/6] 2.6.21-rc2: known regressions

2007-03-19 Thread Bill Davidsen

Jim Gettys wrote:

On Sun, 2007-03-18 at 17:07 +0100, Ingo Molnar wrote:

* Pavel Machek [EMAIL PROTECTED] wrote:

Some day we may have modesetting support in the kernel for some 
graphics hw, right now it's pretty damn spotty.

Yep, that's the way to go.

hey, i wildly supported this approach ever since 1996, when GGI came up
:-/



So wildly you wrote tons of code ;-).

More seriously, at the time, XFree86 would have spat in your face for
any such thing.  Thankfully, times are changing.

Also more seriously, a somewhat hybrid approach is in order for mode
setting: simple mode setting isn't much code and is required for sane
behavior on crash (it is nice to get oopses onto a screen); but the full
blown mode setting/configuration problem is so large that on some
hardware, it is likely best left to a helper process (not the X
server).

Also key to get sane behavior out of the scheduler is to get the X
server to yield (sleep in the kernel) rather than busy waiting when the
GPU is busy; a standardized interface for this for both fbdev and dri is
in order.  Right now, X is a misbehaving compute bound process rather
than the properly interactive process it can/should/will be, releasing
the CPU whenever the hardware is busy. Needless to say, this wastes
cycles and hurts interactivity with just about any scheduler you can
devise. It isn't as if this is hard; on UNIX systems we did it in 1984
or thereabouts.


What you say sounds good, assuming that the cost of a sleep is less than 
the cost of the busy wait. But this may be hardware dependent: the waits 
may be very small and frequent, and if it's hitting a small hardware 
window like retrace, delays in response will cause the time period to be 
missed completely. This is probably less critical with very smart cards, 
which many of us don't run.


Of course, in 1996, XFree86 would have ignored any such interfaces, in
its insane quest for operating system independent user space drivers
requiring no standard kernel interfaces (it is the second part of
this where the true insanity lay).
  - Jim




--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: [linux-pm] [2/6] 2.6.21-rc2: known regressions

2007-03-20 Thread Bill Davidsen

Jim Gettys wrote:

On Mon, 2007-03-19 at 16:33 -0400, Bill Davidsen wrote:

  
What you say sounds good, assuming that the cost of a sleep is less than 
the cost of the busy wait. But this may be hardware dependent: the waits 
may be very small and frequent, and if it's hitting a small hardware 
window like retrace, delays in response will cause the time period to be 
missed completely. This is probably less critical with very smart cards, 
which many of us don't run.



Actually, various strategies involving short busy waiting, or looking at
DMA address registers before sleeping were commonplace.  But a
syscall/sleep/wakeup is/was pretty fast.  If you have an operation
blitting the screen (e.g. scrolling), it takes a bit of time for the GPU
to execute the command.  I see this right now on OLPC, where a wonderful
music application needs to scroll (most of) the screen left,
periodically, and we're losing samples sometimes at those operations.
  
None of that conflicts with what I said, but what works on an LCD may 
not be appropriate for a CRT. With even moderate [EMAIL PROTECTED] timing the 
horizontal retrace happens ~50k/sec, and that's not an appropriate 
syscall rate. I'm just pointing out that some things a video interface 
does with simple hardware involve lots of very small windows. Don't read 
that as don't do it, just be careful HOW you do it.

Remember also, that being nice to everyone else by sleeping, there are
more cycles to go around, and the scheduler can nicely boost the X
server's priority as it will for interactive processes that are being
cooperative.
I'm going to cautiously guess that the problem might be not how much 
but how soon. That is, latency might be more important than giving the 
server a lot of CPU.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-03-22 Thread Bill Davidsen

Tomoki Sekiyama wrote:

Hi,

Thanks for your comments.
I'm sorry for my late reply.

Bill Davidsen wrote:
 Andrew Morton wrote:

 - I wonder if dirty_limit_ratio is the best name we could choose.
 vm_dirty_blocking_ratio, perhaps?  Dunno.

 I don't like it, but I dislike it less than dirty_limit_ratio I 
guess.

 It would probably break things to change it now, including my
 sysctl.conf on a number of systems  :-(

I'm wondering which interface is preferred...

1) Just rename dirty_limit_ratio to dirty_blocking_ratio.
   Those who had been changing dirty_ratio should additionally modify
   dirty_blocking_ratio in order to determine the upper limit of dirty 
pages.


2) Change dirty_ratio to a vector, consists of 2 values;
   {blocking ratio, writeback starting ratio}.
   For example, to change the both values:
 # echo 40 35 > /proc/sys/vm/dirty_ratio
   And to change only the first one:
 # echo 20 > /proc/sys/vm/dirty_ratio
   In the latter way the writeback starting ratio is regarded as the same 
   as the blocking ratio if the writeback starting ratio is smaller. And 
   then, the kernel behaves similarly to the current kernel.

3) Use dirty_ratio as the blocking ratio. And add
   start_writeback_ratio, and start writeback at
   start_writeback_ratio(default:90) * dirty_ratio / 100 [%].
   In this way, specifying blocking ratio can be done in the same way as
   current kernel, but high/low watermark algorithm is enabled.
I like 3 better, it should make tuning behavior more precise. You can 
make an argument for absolute values for writeback, if my disk will only 
write 70MB/s I may only want 203 sec of pending writes, regardless of 
available memory.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-03-26 Thread Bill Davidsen

Tomoki Sekiyama wrote:

Hi,
Thanks for your reply.

  

3) Use dirty_ratio as the blocking ratio. And add
  start_writeback_ratio, and start writeback at
  start_writeback_ratio(default:90) * dirty_ratio / 100 [%].
  In this way, specifying blocking ratio can be done in the same way
  as current kernel, but high/low watermark algorithm is enabled.
  

I like 3 better, it should make tuning behavior more precise.



Then, what do you think of the following idea?

(4) add `dirty_start_writeback_ratio' as percentage of memory,
at which a generator of dirty pages itself starts writeback
(that is, non-blocking ratio).

In this way, `dirty_ratio' is used as the blocking ratio, so we don't
need to modify the sysctl.conf etc. I think it's easier to understand
for administrators of systems, because the interface is similar to
`dirty_background_ratio' and `dirty_ratio.'

If this is OK, I'll repost the patch.
  
It sounds good to me, just be sure behavior is sane for both 
blocking less than start_writeback and vice versa.
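
To pin down what I mean by "sane", here is a toy model of the two-threshold
idea - purely illustrative, not the real mm/page-writeback.c logic, and the
ratio names are just the ones we have been discussing:

#include <stdio.h>

enum action { PROCEED, WRITEBACK, BLOCK };

/* decide what a writer should do given how dirty memory currently is */
static enum action throttle(long dirty, long total,
                            int writeback_ratio, int block_ratio)
{
        if (dirty * 100 > (long)block_ratio * total)
                return BLOCK;           /* hard clamp: writer must wait    */
        if (dirty * 100 > (long)writeback_ratio * total)
                return WRITEBACK;       /* writer starts its own writeback */
        return PROCEED;
}

int main(void)
{
        /* 30% dirty, start writeback at 20%, block at 40% -> WRITEBACK */
        printf("%d\n", throttle(300, 1000, 20, 40));
        /* swapped thresholds still behave: the blocking check wins first */
        printf("%d\n", throttle(300, 1000, 40, 20));
        return 0;
}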
  

You can make an argument for absolute values for writeback,
if my disk will only write 70MB/s I may only want 203 sec of
pending writes, regardless of available memory.



To realize tuning with absolute values, I consider that we need to
modify handling of `dirty_background_ratio,' `dirty_ratio' and so on as
well as `dirty_start_writeback_ratio.' I think this should be done in
another patch if this feature is required.

Regards,
--
Tomoki Sekiyama
Hitachi, Ltd., Systems Development Laboratory


  



--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Question: half-duplex and full-duplex serial driver

2007-04-05 Thread Bill Davidsen

Mockern wrote:

Hi,

Could you help me please: how can my serial driver work in half-duplex and 
full-duplex mode?

Thank you


Since you don't seem to have gotten an answer, and while this is 
probably the wrong list for your question, I can give you a pointer 
which may help.


The communications program kermit can do this, google for the source, 
or try kermit.columbia.edu first, and read the source to see how they do 
it. I'm reasonably sure ioctl() is the answer, but that's choice three 
for your research.
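
For what it's worth, the usual shape of serial-port setup from user space
looks like the sketch below; half- versus full-duplex as such is hardware
and driver specific, so the real answer may end up being a driver-private
ioctl rather than a termios flag. The device name and speed are only
placeholder assumptions.

#include <fcntl.h>
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int main(void)
{
        int fd = open("/dev/ttyS0", O_RDWR | O_NOCTTY);
        struct termios tio;

        if (fd < 0 || tcgetattr(fd, &tio) < 0) {
                perror("ttyS0");
                return 1;
        }
        tio.c_cflag &= ~(CSIZE | PARENB);
        tio.c_cflag |= CS8 | CLOCAL | CREAD;     /* 8 bits, ignore modem lines */
        tio.c_lflag &= ~(ICANON | ECHO | ISIG);  /* raw-ish input              */
        tio.c_iflag &= ~(IXON | IXOFF);          /* no software flow control   */
        cfsetispeed(&tio, B9600);
        cfsetospeed(&tio, B9600);
        if (tcsetattr(fd, TCSANOW, &tio) < 0)
                perror("tcsetattr");
        close(fd);
        return 0;
}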


--
bill davidsen [EMAIL PROTECTED]
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979


Re: [OT] the shortest thread of LKML !

2007-04-05 Thread Bill Davidsen

Willy Tarreau wrote:

On Wed, Mar 28, 2007 at 01:02:10PM -0700, David Miller wrote:

Please nobody reply to his posting, I'm shit-canning this thread from
the start as it's nothing but flame fodder.


He forgot the most important thing: there are *many* benevolent dictators,
all with their own domain of excellence ;-)

Good catch, David, you're like a spider on a web waiting for the naive
intruder !


Posted several days too early for April Fool...

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: was once: Samsung DVD writer.

2007-04-06 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:

Hi,

BTW: What happened to FreeBSD User Giacomo and his Samsung DVD writer ?

Any news to report ? A happy ending perchance ?


Bill Davidsen: [about filtering dangerous SCSI commands]
  

My suggestion would be to add an ioctl, like SET_SCSI_UNFILTERED, which
can only be used as root, and which would allow SCSI commands sent a
device to be persistently set unfiltered.



I understand that would be for programs like firmware
updaters, but not for the vanilla purpose of bringing
data onto an optical disc ?

Because ... if this is intended for daily usage:

I do not think that burning a disc on a desktop should
necessarily require root privileges. Let's leave it to
the sysadmin (or distro) to decide who may burn resp.
may endanger the device by malicious use of normally
harmless SCSI commands. (Like overworking the drive tray
motor ?)

  
I don't see a problem here requiring root. Once the filter is turned off 
for the device, say in rc.local, and the ownership and permissions are 
set, there's no issue. It can be owned by root, group cdwriters, have 
permissions 0660, and cdrecord or other trusted programs can be setgid 
once the kernel stops blocking such access.

After all, what is gained if one performs daily tasks
as a privileged user ? That only pierces the protection
against absent-minded mistakes and involuntary backdoors.

  
No need to do any daily tasks in a dangerous mode; access would be 
limited to users and programs regarded as trusted.

Actually i try to stay away from any kernel peculiarities
so i do not get addicted to something that might change.
A pointer to a list of forbidden commands would be welcome
thus.

  

If this could be added it would presumably not change.

Maybe cdrskin was up to now only tested on totally insecure
systems.  After all i never got reports of the ominous
command filtering interfering with burning. If it prevents
any of libburn's SCSI commands from being executed then it
does this silently and does not prevent burning success.
  


No, as I noted, programs other than cdrecord are clever enough to avoid 
requiring disallowed commands.

I would like to know, which commands and cease sending them. :))


libburn SCSI command list (commands in brackets are defined
but not in normal use):
spc.c: 00h, 03h, 12h, 1Eh, 55h, 5Ah,
sbc.c: 1Bh,
mmc.c: 04h, 23h, 2Ah, 35h, 43h, 46h, 4Ah, 
   51h, 52h, 53h, 54h, 5Bh, 5Ch, 5Dh,
   A1h,(AAh),ACh, B6h, BBh,(BEh), 



Have a nice day :)
  
I put lkml back on the recipients; I'm suggesting a new ioctl as a way 
around the decision to no longer have setuid/setgid be actually fully 
functional.
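
Roughly what I have in mind from the user side, as a sketch only - the
request name and number below are invented for illustration, nothing like
them exists in any kernel today, and the device path is just an example:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define SET_SCSI_UNFILTERED 0x53550001   /* invented request code */

int main(void)
{
        /* run once from rc.local (as root) to mark the drive unfiltered */
        int fd = open("/dev/sr0", O_RDWR | O_NONBLOCK);

        if (fd < 0 || ioctl(fd, SET_SCSI_UNFILTERED, 1) < 0) {
                perror("unfilter /dev/sr0");
                return 1;
        }
        close(fd);
        return 0;
}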


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [2.6.21-rc5-git1][KVM] exception on start VM

2007-04-06 Thread Bill Davidsen

Avi Kivity wrote:

Bill Davidsen wrote:

Starting a VM for Win98SE:


[ debug snip info ]

Known issue.  It will be a while before we can support the '95 family on 
Intel as it makes heavy use of real mode.


Thanks for the quick answer, I'll investigate other virtualization 
solutions.


Please copy kvm-devel@lists.sourceforge.net on kvm issues, as per 
MAINTAINERS.



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: [patch] remove artificial software max_loop limit

2007-04-06 Thread Bill Davidsen

Jan Engelhardt wrote:

On Apr 1 2007 11:10, Ken Chen wrote:

On 4/1/07, Tomas M [EMAIL PROTECTED] wrote:

I believe that IF you _really_ need to preserve the max_loop module 
parameter, then the parameter should _not_ be ignored; rather it 
should have the same function as before - to limit the loop driver 
so if you use max_loop=10 for example, it should not allow loop.c to 
create more than 10 loops.
Blame on the dual meaning of max_loop that it uses currently: to 
initialize a set of loop devices and as a side effect, it also sets 
the upper limit.  People are complaining about the former constraint, 
aren't they?  Does anyone use the 2nd meaning of upper limit?


Who cares if the user specifies max_loop=8 but still is able to open up 
/dev/loop8, loop9, etc.? max_loop=X basically meant (at least to me) 
have at least X loops ready.


You have just come up with a really good reason not to do unlimited 
loops. With the current limit people can count on a script mounting 
files, or similar, to neither loop for a VERY long time nor eat their 
memory. Whatever you think of programs without limit checking, this 
falls in the range of expecting an unsigned char to have a certain upper 
bound, and argues that the default limit should be the current limit and 
that setting a lower bound should work as a real and enforced limit.


If a new capability is being added, and I think it's a great one, then 
people using the capability should be the ones explicitly doing 
something different. Plauger's law of least astonishment.
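
The shape of what I'd expect, as a sketch only - this is not the real
drivers/block/loop.c code, just a trivial module showing a parameter used
as an enforced limit rather than a hint:

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/moduleparam.h>

static int max_loop = 8;
module_param(max_loop, int, 0444);
MODULE_PARM_DESC(max_loop, "maximum number of loop devices");

/* refuse any device index at or above the advertised limit */
static int demo_check_index(int i)
{
        if (i >= max_loop)
                return -ENXIO;
        return 0;
}

static int __init demo_init(void)
{
        return demo_check_index(0);
}

static void __exit demo_exit(void)
{
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");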


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [PATCH resend][CRYPTO]: RSA algorithm patch

2007-04-06 Thread Bill Davidsen

Tasos Parisinos wrote:

Andi Kleen wrote:

Tasos Parisinos [EMAIL PROTECTED] writes:

 

From: Tasos Parisinos [EMAIL PROTECTED]

This patch adds module rsa.ko in the kernel (built-in or as a kernel
module) and offers an API to do fast modular exponentiation, using the
Montgomery algorithm; thus the exponentiation is not generic but can be
used only when the modulus is odd, such as RSA public/private key pairs.
This module is the computational core (using multiple precision integer
arithmetic) and does not provide any means to do key management,
implement padding schemes etc., so the calling code should implement all
those as needed.

Signed-off-by: Tasos Parisinos [EMAIL PROTECTED]



You forgot to answer the most important question.

Who would want to use RSA in the kernel, and why? 
-Andi


  


The main purpose behind the creation of this module was to create the
cryptographic infrastructure to develop an in-kernel system of signed
modules.

I don't really see why this has to be in the kernel, even after reading 
your text below. This would be code which a tiny number of users would 
find useful, which someone in the future might find exploitable, and 
which performs a function that can be done in user space.
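
To illustrate what I mean by do-able in user space, here is a minimal
sketch of the core operation - plain square-and-multiply modular
exponentiation with toy numbers, not the Montgomery form or the
multi-precision arithmetic the patch actually provides:

#include <stdint.h>
#include <stdio.h>

/* works while mod < 2^32 so the intermediate product fits in 64 bits;
 * real RSA needs multi-precision integers for 1024+ bit moduli */
static uint64_t modexp(uint64_t base, uint64_t exp, uint64_t mod)
{
        uint64_t result = 1;

        base %= mod;
        while (exp) {
                if (exp & 1)
                        result = (result * base) % mod;
                base = (base * base) % mod;
                exp >>= 1;
        }
        return result;
}

int main(void)
{
        /* toy numbers: "encrypt" 42 with e=65537 mod a small odd modulus */
        printf("%llu\n", (unsigned long long)modexp(42, 65537, 3233));
        return 0;
}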



The best environment to deploy such functionality is in updating by
remote, executable code (programs, libs and modules) on embedded
devices running Linux, that have some form of kernel physical
security, so one can't tamper the kernel, but can read it. In this
case only a public key would be revealed. The vendor of the devices
can sign and distribute/update executable code to the devices, and
the kernel will not load/run any of them if they don't match with
their signatures. The signature can be embedded in the elf, so this
system is portable and centralized. 



Although this functionality can be achieved using userland helper
programs this may create the need to physically secure entire
filesystems which adds to the cost of developing such devices.


So to save cost on your end you want to make this feature part of the 
mainline kernel. Am I misreading your intent here?



In such cases one needs to use asymmetric cryptography because in the
case of symmetric it would be very easy to give away the key and end
with having all your devices being attacked.

Which makes a good argument for doing asymmetric anyway, it would seem. 
That way any updates can be checked off the target machine and validated 
as authentic.


There are already some systems that implement and utilize such 
functionality that use windows platforms, and other Linux distros

that use userland programs to do so, assuming physical security of
the host computer.


Exactly.


Moreover, a similar system that would use hashes is easier to break and
more difficult to update each time new code must be loaded to the
host devices.

See also this thread

http://lkml.org/lkml/2007/3/19/447

Having said all this, we have a boatload of other crypto in the kernel; 
if it's just a crypto module, like aes, anubis or michael_mic, and is 
GPL compatible, some people may agree. But if this is an embedded 
system, and you have the patch, why not just apply it to your kernel and 
forget mainline?


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: Lower HD transfer rate with NCQ enabled?

2007-04-06 Thread Bill Davidsen

Paa Paa wrote:
Q: What conclusion can I make on hdparm -t results or can I make 
any conclusions? Do I really have lower performance with NCQ or not? 
If I do, is this because of my HD or because of kernel?


What IO scheduler are you using? If AS or CFQ, could you try with 
deadline?


I was using CFQ. I now tried with Deadline and that doesn't seem to 
degrade the performance at all! With Deadline I got 60MB/s both with and 
without NCQ. This was with hdparm -t.


So what does this tell us?

It suggests that it's time to test with real load and see if deadline 
works well for you in the general case.
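
For testing, the elevator can be switched per-device at run time (needs
root); a quick sketch, assuming the drive is sda and the deadline elevator
is compiled in - it does the same thing as an echo into
/sys/block/sda/queue/scheduler from a shell:

#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/sys/block/sda/queue/scheduler", "w");

        /* note: with stdio buffering a write error may only surface at fclose */
        if (!f || fputs("deadline\n", f) < 0) {
                perror("set scheduler");
                return 1;
        }
        fclose(f);
        return 0;
}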


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: Performance Stats: Kernel patch

2007-04-06 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:


2) It arrived here with some line-wrapping damage, most likely due to the fact
that you posted it with Thunderbird.  There's a mystic Thunderbird incantation
to make it not do that, but I have no idea what it is - it's in the list
archives someplace.


I don't use TBird (seamonkey fan) but I assume the patch can just be 
attached rather than inlined. Some mailers are pretty arcane otherwise.


I do like the idea, but the issue of things which parse 
/proc/PID/status hasn't had comments. A good parser would ignore what it 
didn't understand, or take everything; not everyone has a good parser. ;-)


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [2.6 patch] the overdue removal of X86_SPEEDSTEP_CENTRINO_ACPI

2007-04-06 Thread Bill Davidsen

Adrian Bunk wrote:

This patch contains the overdue removal of X86_SPEEDSTEP_CENTRINO_ACPI.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

It would be really nice, when removing features used on computers which 
are only a few years old, if you noted what replaces this functionality. 
Yes, people can take 10-15 minutes to find and read previous discussion, 
but one or two sentences would generate less concern and noise on the list.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [PATCH resend][CRYPTO]: RSA algorithm patch

2007-04-06 Thread Bill Davidsen

Indan Zupancic wrote:

On Fri, April 6, 2007 23:30, Bill Davidsen wrote:
  

Tasos Parisinos wrote:


The main purpose behind the creation of this module was to create the
cryptographic infrastructure to develop an in-kernel system of signed
modules.

  


  

Although this functionality can be achieved using userland helper
programs this may create the need to physically secure entire
filesystems which adds to the cost of developing such devices.
  

So to save cost on your end you want to make this feature part of the
mainline kernel. Am I misreading your intent here?



(Tasos was talking about the cost of securing whole file systems versus only
the kernel binary.)

But if that entire filesystem is initramfs, I don't
see any problem. If it fits into the kernel, it also has enough room for an
initramfs with a user space program with the RSA signing. I said this before,
so please look up how initramfs works and tell us why that isn't sufficient
for this case.

I suspect your answer will be because it isn't the only part and a lot other
infrastructure is need in the kernel to do all the binary signing. But that
code you didn't post, only a MPI module, however nice, which is only a partial
solution to what you want to achieve. Combine that with the kernel policy to
not merge unused code, and you're in the current situation.

  

Having said all this, we have a boatload of other crypto in the kernel,
if it's just the crypto module, like aes, anubis or micheal_mic, and is
GPL compatible, some people may agree. But if this is an embedded
system, and you have the patch, why not just apply it to your kernel and
forget mainline?



Currently it's less than a cryptoapi module, as it only provides some functions
to do multi-precision integer calculations, which happen to be the tricky part
of implementing RSA.

That said, this implementation seems quite good, from a code size and complexity
point of view. So for that alone I think it wouldn't be bad to merge this or a
modified version of this, even if it's unused by the rest of the kernel, it 
might
be useful for other users. The burden to carry it along for the kernel is quite
small, while the code is worth something and might get improved by their users,
in the end having a central place to collect them. So I think from an open 
source
ecological point of view, it wouldn't be bad to merge it.

I see three possible way forwards (alternative is the status quo):

1) Move it to user space (into the initramfs embedded into the kernel).
But you'd still need to add binary (modules, libs and programs) load hooks.

2) Flesh it out into a ready to use, full blown RSA cryptoAPI module. Whatever
you said earlier, whether you want or not, it's just a block cipher, with the
modulo as block size (I suspect there's some room for code simplification when
assuming fixed block sizes too, by allocating blocksize * 2 space instead of
resizing when needed).

  
This would probably be the best solution, to provide most of the hooks 
while presenting the cryptoAPI for others to use if they wish. Good 
suggestion.

3) Go all the way, and post all the other kernel modifications too, to get the
whole binary signing you want to achieve.
Advantage will be that in the end you'll end up with something scrutinized to
death. Disadvantage is that it will be scrutinized to death, as that can take
a lot of time. Maybe you'll end up with a new LSM module, who knows?

The list is in increasing order of difficulty and quality of your end code.

It would help if you could find others who also wants something similar and
work together to get it into the kernel. But even if the last step fails,
you still have had people reviewing your code. And failing even that, you at
least shared your code with the rest of the world, which is already something
good (and required by the GPL. But doing it in the open is much more laudable
than hiding it on a website).

Greetings,

Indan


  
I think you have covered the possibilities; my read is that your item 
number two is most likely to be accepted.


--
Bill Davidsen [EMAIL PROTECTED]
 We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: [patch] remove artificial software max_loop limit

2007-04-07 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:

On Fri, 06 Apr 2007 16:33:32 EDT, Bill Davidsen said:
  

Jan Engelhardt wrote:



  
Who cares if the user specifies max_loop=8 but still is able to open up 
/dev/loop8, loop9, etc.? max_loop=X basically meant (at least to me) 
have at least X loops ready.


  
You have just come up with a really good reason not to do unlimited 
loops.



That, and I'd expect the intuitive name for have at least N ready to
be 'min_loop=N'.  'max_loop=N' means (to me, at least) If I ask for N+1,
something has obviously gone very wrong, so please shoot my process before
it gets worse.

Maybe what's needed is *both* a max_ and min_ parameter?
  
I think that max_loop is a sufficient statement of the highest number of 
devices needed, and can reasonably be interpreted as both "I may need this 
many" and "I won't legitimately want more".


As I recall memory is allocated as the device is set up, so unless you 
want to use the max memory at boot, just in case, the minimum won't be 
guaranteed anyway. Something else could eat memory.


In practice I think asking for way too many is more common than not 
being able to get to the max. It may happen but it's a corner case, and 
status is returned.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: need help

2007-04-08 Thread Bill Davidsen

vjn wrote:

In my project I want to code the kernel such that when I plug in my USB it
should ask for a password and check it in kernel space. Can anyone help
me?


I think the correct solution is to use an encrypted mount, and issue the 
mount command manually with the question asked in user space. There's no 
code to ask for input, nor any way to positively decide which connected 
terminal is the terminal to ask.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: init's children list is long and slows reaping children.

2007-04-09 Thread Bill Davidsen

Ingo Molnar wrote:

* Linus Torvalds [EMAIL PROTECTED] wrote:


On Fri, 6 Apr 2007, Davide Libenzi wrote:

or lets just face it and name it what it is: process_struct ;-)
That'd be fine too! Wonder if Linus would swallow a rename patch like 
that...
I don't really see the point. It's not even *true*. A process 
includes more than the shared signal-handling - it would include files 
and fs etc too.


So it's actually *more* correct to call it the shared signal state 
than it would be to call it process state.


we could call it structure for everything that we know to be ugly about 
POSIX process semantics ;-) The rest, like files and fs we've 
abstracted out already.


Ingo

So are you voting for ugly_struct? ;-)

I do think this is still waiting for a more descriptive name, like 
proc_misc_struct or some such. Kernel code should be treated as 
literature, intended to be both read and readable.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: REISER4 FOR INCLUSION IN THE LINUX KERNEL.

2007-04-09 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:

I actually believe that Hans made a reasonable case that 
Reiser4 had gone about as far as it could reasonably go with regard 
to testing, robustness, ... without the broader base of use that 
even an experimental filesystem in distribution tree would get.


Of course, this is an entirely reasonable request of Reiser's.
It was met with an array of unreasonable actions, but mainly STALLING, 
which has led to REISER4 never becoming part of the main kernel.


It has also led to many false claims about REISER4. Claims that are
never backed up with solid statistics, but used to keep REISER4 out of
the kernel and tar its reputation.


Keep that last sentence in mind for four lines...


I for one would at least play with it if it were in the distribution
tree.


I AM SURE THERE ARE A HUGE NUMBER OF PEOPLE WHO WOULD GIVE IT A TRY.
 

Claims that are never backed up with solid statistics, ...

As far as I could tell Hans did pretty much everything else that 
was demanded. Hans eventually caved and provided - albeit with much 
pissing and moaning, and holier-than-thou rhetoric.


It was not his pissing and moaning, etc,... these were just excuses to
keep REISER4 from succeeding. The truth is, that any excuse would do.

The real reasons are financial and backed by big money (sometimes, big
egos).

Yes all of the people who make millions on the other filesystems! 
Wait... identify who makes a penny more or less with or without Reiser4.


[ snip ]

Until Namesys is stable there's no support team. It's not my impression 
that there's much support otherwise.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: init's children list is long and slows reaping children.

2007-04-10 Thread Bill Davidsen

Davide Libenzi wrote:

On Mon, 9 Apr 2007, Linus Torvalds wrote:


On Mon, 9 Apr 2007, Kyle Moffett wrote:

Maybe struct posix_process is more descriptive?  struct process_posix?
Ugly POSIX process semantics data seems simple enough to stick in a struct
name.  struct uglyposix_process?

Guys, you didn't read my message.

It's *not* about process stuff.  Anything that tries to call it a 
process is *by*definition* worse than what it is now. Processes have all 
the things that we've cleanly separated out for filesystem, VM, SysV 
semaphore state, namespaces etc.


The struct signal_struct is the random *leftovers* from all the other 
stuff. It's *not* about processes. Never has been, and never will be. 


I proposed struct task_shared_ctx but you ducked :)


Descriptive, correct, I like it!

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: Performance Stats: Kernel patch

2007-04-11 Thread Bill Davidsen

Eric Dumazet wrote:

On Wed, 11 Apr 2007 15:59:16 +0400
Maxim Uvarov [EMAIL PROTECTED] wrote:

  

Patch adds Process Performance Statistics.
It make available to the user the following 
new per-process (thread) performance statistics:

   * Involuntary Context Switches
   * Voluntary Context Switches
   * Number of system calls
   
This data is useful for detecting hyperactivity 
patterns between processes.



Your description is not very clear about the semantic of your stats.

You currently returns stats only for thread(s) (not process as you claimed)
  
I'm not sure if you were confused by his use of thread in parentheses, 
but isn't the whole point of this to see which threads are doing what? 
Or am I wrong in reading that result as intentional?

Please check kernel/sys.c:k_getrusage() to see how getrusage() has to sum *lot* 
of individual fields to get precise process numbers (even counting stats for 
dead threads)

  

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: init's children list is long and slows reaping children.

2007-04-11 Thread Bill Davidsen

Oleg Nesterov wrote:

On 04/10, Eric W. Biederman wrote:


I'm trying to remember what the story is now.  There is a nasty
race somewhere with reparenting, a threaded parent setting SIGCHLD to
SIGIGN, and non-default signals that results in an zombie that no one
can wait for and reap.  It requires being reparented twice to trigger.


reparent_thread:

...

/* If we'd notified the old parent about this child's death,
 * also notify the new parent.
 */
if (!traced && p->exit_state == EXIT_ZOMBIE &&
    p->exit_signal != -1 && thread_group_empty(p))
        do_notify_parent(p, p->exit_signal);

We notified /sbin/init. If it ignores SIGCHLD, we should release the task.
We don't do this.

The best fix I believe is to cleanup the forget_original_parent/reparent_thread
interaction and factor out this exit_state == EXIT_ZOMBIE && exit_signal == -1
checks.

As long as the original parent is preserved for getppid(). There are 
programs out there which communicate between the parent and child with 
signals, and if the original parent dies, it undesirable to have the 
child getppid() and start sending signals to a program not expecting 
them. Invites undefined behavior.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: Add a norecovery option to ext3/4?

2007-04-11 Thread Bill Davidsen

Eric Sandeen wrote:

Phillip Susi wrote:

Eric Sandeen wrote:



In that case you are mounting the same filesystem under 2 different
operating systems simultaneously, which is, and always has been, a
recipe for disaster.  Flagging the fs as mounted already would
probably be a better solution, though it's harder than it sounds at
first glance.
No, it has not been.  Prior to poorly behaved journal playback, it was 
perfectly safe to mount a filesystem read only even if it was mounted 
read-write by another system ( possibly fsck or defrag ).  You might not 
read the correct data from it, but you would not damage the underlying 
data simply by mounting it read-only.


You might not damage the underlying filesystem, but you could sure go
off in the weeds trying to read it, if you stumbled upon some
half-updated metadata... so while it may be safe for the filesystem, I'm
not convinced that it's safe for the host reading the filesystem.

Exactly. If the data are protected you can use other software to access 
it. For ext3 an explicit ext2 mount might do it... but if you corrupt 
the underlying information, there's no going back.
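
A sketch of the explicit-ext2 route (device and mount point are only
examples): the ext2 driver has no journal to replay, so a read-only mount
really stays read-only at the device level. If the journal is flagged as
needing recovery, ext2 should refuse the mount outright, which is arguably
the safe outcome.

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
        /* mount an ext3 volume with the ext2 driver, strictly read-only */
        if (mount("/dev/sdb1", "/mnt/ro", "ext2", MS_RDONLY, NULL) < 0) {
                perror("ext2 ro mount");
                return 1;
        }
        return 0;
}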


In practice Linux has had lots of practice mounting garbage, and isn't 
likely to suffer terminal damage.


I wonder what happens if the device is really read-only and the o/s 
tries to replay the journal as part of a r/o mount? I suspect the system 
will refuse totally with an i/o error, not what you want.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: init's children list is long and slows reaping children.

2007-04-11 Thread Bill Davidsen

Eric W. Biederman wrote:

Bill Davidsen [EMAIL PROTECTED] writes:


As long as the original parent is preserved for getppid(). There are programs
out there which communicate between the parent and child with signals, and if
the original parent dies, it is undesirable to have the child getppid() and start
sending signals to a program not expecting them. Invites undefined behavior.


Then the programs are broken.  getppid is defined to change if the process
is reparented to init.


The short answer is that kthreads don't do this so it doesn't matter.

But user programs are NOT broken: currently getppid returns either the 
original parent or init, and a program can identify init. Reparenting to 
another pid would not be easily noted, and as SUS notes, no return values 
are reserved for errors. So there's no way to check, and no need for 
kthreads; I was prematurely paranoid.


Presumably user processes will still be reparented to init so that's not 
an issue. Since there's no atomic signal_parent() the ppid could change 
between getppid() and signal(), but that's never actually been a problem 
AFAIK.


Related: Is there a benefit from having separate queues for original 
children of init and reparented (to init) tasks? Even in a server would 
there be enough to save anything?


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Re: sched_yield proposals/rationale

2007-04-12 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:

-Original Message-



Besides - but I guess you're aware of it - any randomized
algorithms tend to drive benchmarkers and performance analysts
crazy because their performance cannot be repeated. So it's usually
better to avoid them unless there is really no alternative.


That could already solve your concern from above. Statistically

speaking, it will give them (benchmarkers) the smoothest curve they've
ever seen.


Please be aware that I'm just exploring options/insight here. It is

not something I intend to push inside the mainline kernel. I just want
to find reasonable and logic criticism as you and some others have
provided already. Thanks for that!

And having gotten same, are you going to code up what appears to be a 
solution, based on this feedback?


I'm curious how well it would run poorly written programs, having 
recently worked with a company which seemed to have a whole part of 
purchasing dedicated to buying same. :-(


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [ANNOUNCE] DualFS: File System with Meta-data and Data Separation

2007-02-20 Thread Bill Davidsen

Juan Piernas Canovas wrote:


The point of all the above is that you must improve the common case, 
and manage the worst case correctly. 
That statement made it to my quote file. Of course "correctly" hopefully 
means getting to the desired behavior without a performance hit so bad 
it becomes a jackpot case and is correct in result but too slow to be 
useful.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH 000 of 6] md: Assorted fixes and features for md for 2.6.21

2007-02-20 Thread Bill Davidsen

NeilBrown wrote:

Following 6 patches are against 2.6.20 and are suitable for 2.6.21.
They are not against -mm because the new plugging makes raid5 not work
and so not testable, and there are a few fairly minor intersections between
these patches and those patches.
There is also a very minor conflict with the hardware-xor patches - one line
of context is different.

Patch 1 should probably go in -stable - the bug could cause data
corruption in a fairly uncommon raid10 configuration, so that one and
this intro are Cc:ed to [EMAIL PROTECTED]

Thanks,
NeilBrown


 [PATCH 001 of 6] md: Fix raid10 recovery problem.
 [PATCH 002 of 6] md: RAID6: clean up CPUID and FPU enter/exit code
 [PATCH 003 of 6] md: Move warning about creating a raid array on partitions of 
the one device.
 [PATCH 004 of 6] md: Clean out unplug and other queue function on md shutdown
 [PATCH 005 of 6] md: Restart a (raid5) reshape that has been aborted due to a 
read/write error.
 [PATCH 006 of 6] md: Add support for reshape of a raid6


Every month or so there are a bunch of patches like this, which do 
various enhancements to the kernel. And these are usually based against 
the release kernel, and all is fine. But every once in a while there is 
a patch which is more urgent, in this case the RAID10 one, which is 
really desirable to get into every kernel running on a machine. Are 
patches marked as needed for -stable also fast tracked to -git inclusion?


If this isn't in -git14 I'm going to rebuild with it before testing 
Neil's NFS stuff. The NFS server test data is on RAID10 ;-)


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH 006 of 6] md: Add support for reshape of a raid6

2007-02-23 Thread Bill Davidsen

Andrew Morton wrote:

On Tue, 20 Feb 2007 17:35:16 +1100
NeilBrown [EMAIL PROTECTED] wrote:

  

+   for (i = conf->raid_disks ; i-- ;  ) {



That statement should be dragged out, shot, stomped on then ceremonially
incinerated.

What's wrong with doing

for (i = 0; i < conf->raid_disks; i++) {

in a manner which can be understood without alcoholic fortification?


I don't find either hard to read, but your suggestion isn't equivalent, 
since it increments rather than decrements the index.

I admit I probably would write it the same way Neil did...
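
For anyone following along, the two forms visit exactly the same indices,
just in opposite order - a throwaway demonstration:

#include <stdio.h>

int main(void)
{
        int raid_disks = 4, i;

        for (i = raid_disks; i--; )
                printf("down: %d\n", i);        /* 3 2 1 0 */
        for (i = 0; i < raid_disks; i++)
                printf("up:   %d\n", i);        /* 0 1 2 3 */
        return 0;
}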

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: 2.6.21-rc1: framebuffer/console boot failure

2007-02-27 Thread Bill Davidsen

Andrew wrote:

I have just discovered 2.6.21-rc1 boots with
pci=noacpi ...

Try setting the resolution and frame rate, video=XXX:[EMAIL PROTECTED] or 
such. Worked for me. I like pci=noacpi, though ;-)


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: [QUESTION] Sata RAID

2007-02-27 Thread Bill Davidsen

Patrick Ale wrote:

On 2/24/07, Patrick Ale [EMAIL PROTECTED] wrote:

On 2/24/07, Michael-Luke Jones [EMAIL PROTECTED] wrote:


One more question regarding this, I am aware its not *really* kernel
related but answering this question now will save yourself a lot of
bogus emails from me about MD oopses later and all, and I want to
setup my disks right once and for all and never witness what I
witnessed last weeks with my ATA disks.

Would you use MD at all, taking in account the disks come from the
same batch and all? I hear these things about MD/RAID being pointless
when you use disks from the same brand/type/batch since they most
likely will break shortly after each other.


Well, for values of "shortly" in months, in most cases. These are 
consumer goods; I would not expect units with consecutive serial numbers 
to fail separated by such a short time that you can't do a backup and/or 
replace and rebuild. If quality control were so good that they were likely 
to fail at the same time, it would be so good they would be obsolete 
before they failed.


That urban myth is a good reason to do backups, but a bad reason to 
avoid RAID.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


Re: SMP performance degradation with sysbench

2007-02-27 Thread Bill Davidsen

Paulo Marques wrote:

Rik van Riel wrote:

J.A. Magallón wrote:

[...]
Its the same to answer 4+4 queries than 8 at half the speed, isn't it ?


That still doesn't fix the potential Linux problem that this
benchmark identified.

To clarify: I don't care as much about MySQL performance as
I care about identifying and fixing this potential bug in
Linux.


IIRC a long time ago there was a change in the scheduler to prevent a 
low prio task running on a sibling of a hyperthreaded processor from 
slowing down a higher prio task on another sibling of the same processor.


Basically the scheduler would put the low prio task to sleep during an 
adequate task slice to allow the other sibling to run at full speed for 
a while.


I don't know the scheduler code well enough, but comments like this one 
make me think that the change is still in place:



/*
 * If an SMT sibling task has been put to sleep for priority
 * reasons reschedule the idle task to see if it can now run.
 */
if (rq->nr_running) {
	resched_task(rq->idle);
	ret = 1;
}


If that is the case, turning off CONFIG_SCHED_SMT would solve the problem.

That may be the case, but in my opinion if this helps it doesn't solve 
the problem, because the real problem is that a process which is not on 
a HT is being treated as if it were.


Note that Intel does make multicore HT processors, and hopefully when 
this code works as intended it will result in more total throughput. My 
supposition is that it currently is NOT working as intended, since 
disabling SMT scheduling is reported to help.


A test with MC on and SMT off would be informative for where to look next.

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: latencies due to disk writes

2007-02-27 Thread Bill Davidsen

[EMAIL PROTECTED] wrote:

Hello!

I'm experiencing extreme lags during disk writes. I have read somewhere (didn't 
save the URI, sigh) that this is actually related to bad (non-existing) write 
io priorities (CFQ only manages file reads).

I could imagine two quick, easy and probably quite effective ways to prevent 
such lags:

1.) don't flush buffers to disk at once more than necessary.

Actually, in many cases this is just what you do want, to avoid filling 
memory with buffered writes and then flushing them all at once on a timer 
or when memory runs out.


Investigate the /proc/sys/vm/dirty_* values.
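
Purely as an illustration, a little userspace C program that dumps the usual 
writeback knobs; it does no more than cat'ing the files by hand, and lowering 
dirty_ratio / dirty_background_ratio is the obvious experiment to try:

#include <stdio.h>

int main(void)
{
	const char *names[] = {
		"dirty_ratio", "dirty_background_ratio",
		"dirty_expire_centisecs", "dirty_writeback_centisecs",
	};
	char path[128], buf[64];
	unsigned int i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		FILE *f;

		snprintf(path, sizeof(path), "/proc/sys/vm/%s", names[i]);
		f = fopen(path, "r");
		if (!f)
			continue;	/* not every kernel has every knob */
		if (fgets(buf, sizeof(buf), f))
			printf("%-26s %s", names[i], buf);
		fclose(f);
	}
	return 0;
}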


2.) relate CPU niceness to max write buffer fill level (ie. the point where it 
gets forced to be flushed to disk -- a conservative estimate would be much 
better than nothing): (100-5*nicelevel)%, ie. writes for processes having nice 
level 19 are blocked/delayed until the write buffer is below 5%. That way, the 
accounting is done at a higher and probably easier to access level.

Maybe I'm just talking nonsense, but nonetheless, here are my 2 cents.

Best regards,
Mark

p.s. please CC me as I'm not subscribed to this list.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-rc1: CIFS cheers, NFS4 jeers

2007-02-27 Thread Bill Davidsen

Florin Iucha wrote:

Hello, it's me and my 70 GB of photos again.

I have tested both CIFS and NFSv4 clients in kernel 2.6.20-rc1 . CIFS
passed with flying colors and NFSv4 stalled after 7 GB.

Configuration:

   Server: PIII/1GHz, 512 MB RAM, Debian testing,
  distro kernel 2.6.18-3-vserver-686, Intel E1000 NIC, 
  filesystem 170 GB ext3 with default mkfs values on a SATA disk
  
   Client: AMD x2 4200+, 2 GB RAM, Debian testing/unstable

  kernel 2.6.20-rc1, Marvell SKGE onboard,
  filesystem 120 GB ext3 with default mkfs values on a SATA disk

Neil has been diddling NFS, I did some light testing with 2.6.20-git14 
with 190GB of mp3 and mpg files (library of congress folk music) without 
hangs. Just did "does it work" tests: copy 20-30GB to the server, do md5 on the 
data pulled back from the server.


Didn't hang, performance testing later.

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: bug in kernel 2.6.21-rc1-git1: conventional floppy drive cannot be mounted without hanging up the whole system

2007-02-27 Thread Bill Davidsen

Linus Torvalds wrote:


On Mon, 26 Feb 2007, Rene Herman wrote:

Other than these two, ECP parallel ports are the other remaining users.
Now, even though on a machine that still has a parallel port it might usually
indeed be set to ECP in its BIOS; having anything attached to the port also
use it as such seems quite seldom.


Well, if it's some kind of cache coherency problem (the same way much more 
modern CPU's have cache coherency issues with DMA during C3 sleep), then 
it's entirely possible that the normal ECP parallel port behaviour would 
never show it, since most people tend to use it for output only (yeah, I 
realize you can use it bidirectionally, but at least on old hardware it 
tends to be "talk AT printer" rather than "talk WITH printer").


The bidirectional use is/was PLIP, aka laplink connections. Yes, I 
still have a machine I installed that way, and it will run 2.2.19 
forever before I try it again. ;-)


I frankly forget what hardware platforms had problems with the DMA thing, 
and what the exact behaviour even was (I think there was some possibility 
of corrupt data on the floppy). We also used to have the nohlt flag to 
turn off hlt entirely, and that was due to some other legacy issues, iirc.


I seriously doubt we will ever see anybody who has this problem ever 
again, but on the other hand, I also seriously doubt that most modern 
machines even *have* a floppy drive any more, so I'd rather not even 
change it. It's just not worth even a miniscule risk..


Thank you.


Linus



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.21-rc1: CIFS cheers, NFS4 jeers

2007-02-28 Thread Bill Davidsen

Florin Iucha wrote:

On Tue, Feb 27, 2007 at 09:36:23PM -0500, Bill Davidsen wrote:
  

Florin Iucha wrote:


Hello, it's me and my 70 GB of photos again.

I have tested both CIFS and NFSv4 clients in kernel 2.6.20-rc1 . CIFS
passed with flying colors and NFSv4 stalled after 7 GB.
  


  
Neil has been diddling NFS, I did some light testing with 2.6.20-git14 
with 190GB of mp3 and mpg files (library of congress folk music) without 
hangs. Just did "does it work" tests: copy 20-30GB to the server, do md5 on the 
data pulled back from the server.


Didn't hang, performance testing later.



2.6.20-rcX used to copy all files then hang on certain operations that
I think used the VFS.  2.6.21-rc1 stalls the NFS transfer itself after
several GB.  The data was never corrupted.

Have you tried copying _ALL_ 190 GB to the server?


No, but as noted I was doing 20-30GB, so if several is a small number 
I'm not seeing that behavior. I'm using a Gbit connection if that is not 
the same as your setup. I have additional testing queued for time 
available this week.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] affinity is not defined in non-smp kernels - i386

2007-02-28 Thread Bill Davidsen

Eric W. Biederman wrote:

Fernando Luis Vázquez Cao [EMAIL PROTECTED] writes:


Initialize affinity only when building SMP kernels.


Reasonable.  I goofed here.

However I would prefer my patch that just deletes these problem lines.
These lines don't really contribute anything and are harmless to
remove.


Where is the initialization performed, then?

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] md: Fix for raid6 reshape.

2007-03-02 Thread Bill Davidsen

Neil Brown wrote:

On Thursday March 1, [EMAIL PROTECTED] wrote:
  

On Fri, 2 Mar 2007 15:56:55 +1100 NeilBrown [EMAIL PROTECTED] wrote:



-   conf->expand_progress = (sector_nr + i)*(conf->raid_disks-1);
+   conf->expand_progress = (sector_nr + i) * new_data_disks);
  

ahem.




It wasn't like that when I tested it, honest...
But the original got caught up with some other changes which were not
really related so I removed them all and just made this change
manually and totally messed it up (again).  Sorry.

Of course it should be

  

+   conf->expand_progress = (sector_nr + i) * new_data_disks;
  

Will the (real) fix be in 2.6.21?

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Question: 20 microseconds delay

2007-03-03 Thread Bill Davidsen

Mockern wrote:
The problem is that I need to use wait_event_timeout function 


AFAIK you can't do that, it's just not the right tool for the job. You 
can always use a single jiffy and be sure you will wait at least 20us, 
or use usleep.
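
To make that concrete, a minimal kernel-side sketch of the two options; the 
function names are made up for the example:

#include <linux/delay.h>
#include <linux/sched.h>

/* Busy-wait roughly 20us; usable even in atomic context, but it spins. */
static void example_wait_20us(void)
{
	udelay(20);
}

/* If the caller may sleep, one jiffy is the "at least 20us" approach;
 * the task gives up the CPU while it waits. */
static void example_wait_a_jiffy(void)
{
	set_current_state(TASK_UNINTERRUPTIBLE);
	schedule_timeout(1);
}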


No matter how much you need to use your snowblower, it won't mow your 
lawn.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: bugs in kernel 2.6.21 (both two release candidates) and kernel 2.6.20

2007-03-03 Thread Bill Davidsen

Uwe Bugla wrote:

Hi folks,
the floppy mount error I mentioned is gone now in 2.6.21-rc2, and my kernel is 
smaller. Good decision to rip out Stephane's stuff, Linus!
As I did not get a reply from Andrew I hope that the buggy stuff residing in 
2.6.20-mm1 ( freezing my apache services
- I already mentioned the problem some days ago - mm2 I did not try yet ) will 
never be pushed into vanilla mainline.
I own some old CDROM and CDRW devices manufactured by TEAC (bought sometime in 
1999): CDR 540 and CDRW 54.
Those old CD devices sometimes get confused with drive seek errors and status 
errors shown in dmesg.
The newer DVD devices (LG reading device and Yamakawa burning device) do not 
show those errors at all.
As I finished an enormous project 6 weeks ago (transforming some 500 
Audio CDs to MP3 format
with kaudiocreator and lame 3.97 (320 kbit quality - preset insane) and then 
burning the material on DVDs)
those old devices were an incredible help in some cases where the newer DVD 
devices refused to read some audio
CDs without errors. That's why I do not want to kick them off at all. Never had 
those troubles with kernel 2.6.19 and former ones.


Other than wanting to stay current, is there a reason why you need to go 
to a newer kernel to do this process? Would it be an option just to run 
on a kernel which works for the moment?


Assuming that you don't want to use these drives unless you can't read a 
CD any other way, would it be practical to (a) move these drives to 
another machine and run and old kernel, (b) try a newer CD (not DVD) 
reader, or (c) install one or both of these antiques in a USB external 
enclosure which would allow you to reinsert the drive rather than reboot?


It may take a while for this problem to be identified, I doubt there are 
many around for developers to test. If I have one in the old junk 
closet anyone is welcome to it, but I have donated a lot of built from 
parts machines to various people and causes, so anything that old is 
unlikely to be found.



Dmesg 1 says on my AMD machine with a CDR540 as /dev/hdd during boot process:
hdd: media error (bad sector): status=0x51 { DriveReady SeekComplete Error }
hdd: media error (bad sector): error=0x34 { AbortedCommand LastFailedSense=0x03 
}
ide: failed opcode was: unknown
ATAPI device hdd:
  Error: Medium error -- (Sense key=0x03)
  (reserved error code) -- (asc=0x02, ascq=0x00)
  The failed Read 10 packet command was:
  28 00 00 00 00 10 00 00 02 00 00 00 00 00 00 00 
end_request: I/O error, dev hdd, sector 64
Buffer I/O error on device hdd, logical block 8
hdd: media error (bad sector): status=0x51 { DriveReady SeekComplete Error }
hdd: media error (bad sector): error=0x34 { AbortedCommand LastFailedSense=0x03 
}
ide: failed opcode was: unknown
ATAPI device hdd:
  Error: Medium error -- (Sense key=0x03)
  (reserved error code) -- (asc=0x02, ascq=0x00)
  The failed Read 10 packet command was:
  28 00 00 00 00 10 00 00 02 00 00 00 00 00 00 00 
end_request: I/O error, dev hdd, sector 64
Buffer I/O error on device hdd, logical block 8

But even more crucial is this one:
Dmesg 2 says on the Intel machine with a TEAC CDRW54 as /dev/hdd:
hdd: status error: status=0x7f { DriveReady DeviceFault SeekComplete 
DataRequest CorrectedError Index Error }
hdd: status error: error=0x7f { IllegalLengthIndication EndOfMedia 
AbortedCommand MediaChangeRequested LastFailedSense=0x07 }
ide: failed opcode was: unknown
For about 1 second the whole system hangs while /dev/hdd is executing some kind 
of reinitialization, just as if you disconnect
the data and the 12 V / 6V cable and reconnect them again while the machine is 
up and running.
For a DVB-S record f. ex. the breakdown of the recording can be one consequence.
Question: Can someone reading this please confirm these errors? Please take old 
CD devices to find out, not newer ones or even DVD devices!
I am using the standard IDE driver with the following chipsets: Intel ICH4 and 
SIS 5513. And please take time, as these crucial errors do not happen
immediately, but about 4 times in about 8 - 10 hours while the machine is up 
and running.

Yours sincerely and thanks for all your efforts

Uwe
P. S.: I do not think this is a hardware error as I did not have those problems 
with kernels <= 2.6.19.




--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] libata: warn if speed limited due to 40-wire cable (v2)

2007-03-03 Thread Bill Davidsen

Alan Cox wrote:
it seems broken to manipulate xfer_mask after returning from the 
driver's ->mode_filter hook.


this patch is more than just a speed-limited warning printk, afaics


I actually suggested that order because the only way the printk can be
done correctly is for it to be the very last test made. Since the mode
filter is not told what mode will be used but just subtracts modes that
are not allowed this should be safe.


Far better to have a drive which works slowly than one which works 
unreliably.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] libata: warn if speed limited due to 40-wire cable (v2)

2007-03-04 Thread Bill Davidsen

Stephen Clark wrote:

Bill Davidsen wrote:


Alan Cox wrote:
 

it seems broken to manipulate xfer_mask after returning from the 
driver's ->mode_filter hook.


this patch is more than just a speed-limited warning printk, afaics


I actually suggested that order because the only way the printk can be
done correctly is for it to be the very last test made. Since the mode
filter is not told what mode will be used but just subtracts modes that
are not allowed this should be safe.
  


Far better to have a drive which works slowly than one which works 
unreliably.


 


That would be true if the 40 wire detection was 100% accurate!
The statement is completely correct, even though the detection may not 
be. ;-)


With the current set(s) of patches to do better detection, cable 
evaluation should be better. But even if not, a slow system is more 
useful than one which doesn't work, crashes because of swap i/o errors, etc.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.21-rc2-mm1

2007-03-05 Thread Bill Davidsen

Neil Brown wrote:

On Sunday March 4, [EMAIL PROTECTED] wrote:

On Mon, 5 Mar 2007 01:11:33 +0100 J.A. Magallón [EMAIL PROTECTED] wrote:


On Fri, 2 Mar 2007 03:00:26 -0800, Andrew Morton [EMAIL PROTECTED] wrote:


Temporarily at

  http://userweb.kernel.org/~akpm/2.6.21-rc2-mm1/

Will appear later at

  
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc2/2.6.21-rc2-mm1/


nfs blocks shutdown and reboot.
If I try to do 'service nfs stop', the box hangs, no login, no SysRQ-T or P,
S-U-B works at least.


The bug was added by
knfsd-use-recv_msg-to-get-peer-address-for-nfsd-instead-of-code-copying.patch.


Bother...
Looks like I need a MSG_DONTWAIT in there, don't I.

I'll resend.


Crap, that's probably in 2.6.20-git14 with all the NFS stuff I thought I 
had checked before trusting. At least there is a bug found, no more 
"works for me" reports. I'll revert to an FC6 kernel before load gets 
high this morning.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch] disable NMI watchdog by default

2007-03-05 Thread Bill Davidsen

Ingo Molnar wrote:

From: Ingo Molnar [EMAIL PROTECTED]
Subject: [patch] disable NMI watchdog by default

there's a new NMI watchdog related problem: KVM crashes on certain 
bzImages because ... we enable the NMI watchdog by default (even if the 
user does not ask for it) , and no other OS on this planet does that so 
KVM doesnt have emulation for that yet. So KVM injects a #GP, which 
crashes the Linux guest:


 general protection fault:  [#1]
 PREEMPT SMP
 Modules linked in:
 CPU:0
EIP:    0060:[<c011a8ae>]    Not tainted VLI
 EFLAGS: 0246   (2.6.20-rc5-rt0 #3)
 EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3

and no, i did /not/ request an nmi_watchdog on the boot command line!

Solution: turn off that darn thing! It's a debug tool, not a 'make life 
harder' tool!!


with this patch the KVM guest boots up just fine.

And with this my laptop (Lenovo T60) also stops its sporadic hard 
hanging (sometimes in acpi_init(), sometimes later during bootup, 
sometimes much later during actual use) as well. It hung with both 
nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI 
injection that is causing problems, not the NMI watchdog variant, nor 
any particular bootup code.


The patch is unintrusive.


I'm missing something, what limits this to systems running under kvm?

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [5/6] 2.6.21-rc2: known regressions

2007-03-05 Thread Bill Davidsen

Ingo Molnar wrote:

* Adrian Bunk [EMAIL PROTECTED] wrote:


Subject: i386: no boot with nmi_watchdog=1  (clockevents)
References : http://lkml.org/lkml/2007/2/21/208
Submitter  : Daniel Walker [EMAIL PROTECTED]
Caused-By  : Thomas Gleixner [EMAIL PROTECTED]
 commit e9e2cdb412412326c4827fc78ba27f410d837e6e
Handled-By : Thomas Gleixner [EMAIL PROTECTED]
Status : problem is being debugged


FYI, this is not a "wont boot" problem, this should be a "NMI watchdog 
does not work" problem - which has far lower severity. Also, Thomas did 
a fix for this which is now in -mm.


If a system normally runs a watchdog, and some do, then nmi would be 
forced on by grub.conf and the system would not boot. And if the system 
was counting on nmi to look for a hanging problem, "nmi does not work" 
would be a real problem if the failure was silent.


Actually, a lack of nmi would be worse than not booting, it would be a 
time bomb waiting for a bad moment to hang.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-05 Thread Bill Davidsen

jos poortvliet wrote:

On Sunday 04 March 2007, Willy Tarreau wrote:

Hi Con !

This was designed to be robust for any application since linux demands a
general purpose scheduler design, while preserving interactivity, instead
of optimising for one particular end use.

Well, I haven't tested it yet, but your design choices please me. As you
know, I've been one of those encountering big starvation problems with
the original scheduler, making 2.6 unusable for me in many situations. I
welcome your work and want to thank you for the time you spend trying to
fix it.

Keep up the good work,
Willy

PS: I've looked at your graphs, I hope you're on the way to something
really better than the 21 first 2.6 releases !
Well, imho his current staircase scheduler already does a better job compared 
to mainline, but it won't make it in (or at least, it's not likely). So we 
can hope this WILL make it into mainline, but I wouldn't count on it.


Wrong problem, what is really needed is to get CPU scheduler choice into 
mainline, just as i/o scheduler finally did. Con has noted that for some 
loads this will present suboptimal performance, as will his -ck patches, 
as will the default scheduler. Instead of trying to make ANY one size 
fit all, we should have a means to select, at runtime, between any of 
the schedulers, and preferably to define an interface by which a user 
can insert a new scheduler in the kernel (compile in, I don't mean 
plugable) with clear and well defined rules for how that can be done.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] libata: warn if speed limited due to 40-wire cable (v2)

2007-03-05 Thread Bill Davidsen

Stephen Clark wrote:

Bill Davidsen wrote:


Stephen Clark wrote:
 


Bill Davidsen wrote:

  

Alan Cox wrote:



it seems broken to manipulate xfer_mask after returning from the 
driver's ->mode_filter hook.


this patch is more than just a speed-limited warning printk, afaics

I actually suggested that order because the only way the printk can be 
done correctly is for it to be the very last test made. Since the mode 
filter is not told what mode will be used but just subtracts modes that 
are not allowed this should be safe.

Far better to have a drive which works slowly than one which works 
unreliably.






That would be true if the 40 wire detection was 100% accurate!
  
The statement is completely correct, even though the detection may 
not be. ;-)


With the current set(s) of patches to do better detection, cable 
evaluation should be better. But even if not, a slow system is more 
useful than one which doesn't work, crashes because of swap i/o 
errors, etc.


 

I have had problems with cable detection on my previous laptop and my 
current laptop. It almost made
my systems unusable. On my current laptop I was getting a thruput of a 
little over 1 mbps instead
of the 44 mbps I get with udma set to the correct value. It took hours 
to upgrade my laptop from

fc5 to fc6 because of this mis detection.

As far as I can see, if you are getting that low a speed, you have other 
problems. I have a system with old slow drives which are really on a 40 
pin cable, and they run at UDMA(33). One of the experts in this can 
undoubtedly tell us more, but your system should run faster than that, 
mine does, and I really HAVE a 40 pin cable (and drive).


If your system drops to PIO modes, I doubt cable is the only issue, I 
think there are other issues (acpi comes to mind).


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Raid 10 Problems?

2007-03-06 Thread Bill Davidsen

Marc Perkel wrote:

--- Jan Engelhardt [EMAIL PROTECTED] wrote:


On Mar 4 2007 19:17, Marc Perkel wrote:

Thanks - because of your suggestion I had found the
instructions. But you have some interesting options
set. 


-N nicearray -b internal -e 1.0

Are these important?

  -N? What's in a name? I suppose, it's not so important.
  (Arrays are identified by their UUID anyway. But maybe udev can do
  something with the name someday as it does today with /dev/disk/*.)


Not worth starting over for.


  -b internal -- seems like a good idea to speed up
  resynchronization.


I'm trying to figure out what the default is. 


  -e 1.0 -- I wonder why the new superblock format is not default
  in mdadm (0.90 is still used).



Looks interesting for big arrays but not sure it's
worth starting over for. Trying to get through a 2
hour sync using 4 500gb sata 2 drives.


That's exactly why you want the bitmap. Fortunately you can add it after 
the array is created. Now the bad news, you should read and understand 
the meaning of the far layout. Part of the information is in the mdadm 
man page under -p, some in the md man page. Use of far layout will 
affect the performance of the array, the balance of read vs. write 
performance, and (maybe) the reliability.


Two hours is a pretty short time to invest if you find that you have 
your layout wrong and would be better off for the life of the array with 
some other data layout. And the time to do the reading is worth it if 
you wind up convinced that the default settings are fine for you.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] BadRAM still not ready for inclusion ? (was: Re: Free Linux Driver Development!)

2007-03-06 Thread Bill Davidsen
imagine that such a feature would be possible?

Where are those efforts for fixing/integrating the fantastic cowloop?
What about the badram/badmem patch?
Compressed caching?
Somebody helping with development of dm-loop, or extending loop.c to support
more than 256 devices? A replacement for the proprietary, unstable and
inelegant vmware-loop for being able to mount vmware .vmdk files? The internal
spec for this is open, and dm-userspace could be some infrastructure for it,
but the author seems to have other priorities. dm-cow, zfs-fuse - anybody?
A kernel based target for AoE (ATA over Ethernet)? (there are two independent
implementations, but both got stuck at some early experimental stage)

Just my 2 cents. 


Roland K.
Sysadmin

ps:
This isn't meant to criticise any of you kernel developers since you're doing
fantastic work!




--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


CONFIG_SYSFS_DEPRECATED and similar issues

2007-03-06 Thread Bill Davidsen
Just a few comments on changes like this in general. Prompted by, but 
not otherwise related to the subject.


When any new feature in the kernel requires significant changes in 
userspace, attention should be given to documenting the user items which 
must change. Now "attention" doesn't mean handwaving b.s. like "user 
programs which depend on sysfs will need to be modified". It means a 
list of which common user features need to be updated and where to find 
new ones. That's what the whole Documentation directory is for, but it's 
rarely used for such a purpose.


In the most recent case, a user tool is needed which doesn't even exist 
as a release.


The other issue is to avoid trap door changes, which occur when a 
kernel change requires new user tools, and the user tools will not run 
with older kernels. That includes missing major features like having a 
display and being able to mount filesystems, even if the kernel is 
technically running. When there were stable and development kernel 
series that was not so much of an issue, now that every kernel is an 
adventure it would be nice to ensure that testing a new kernel is not an 
irrevocable step.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22 patch] the scheduled removal of OBSOLETE_OSS options

2007-03-06 Thread Bill Davidsen

Adrian Bunk wrote:
This patch contains the scheduled removal of the OBSOLETE_OSS options 
for 2.6.22.


If these are drivers for which there are thought to be useful ALSA 
drivers, would it be reasonable to leave a stub for a help file naming 
the driver which claims to support the hardware?


I'm not objecting to the removal of the drivers, just noting that 
identifying the new drivers can be made easier.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-06 Thread Bill Davidsen

Gene Heskett wrote:

On Monday 05 March 2007, Nicolas Mailhot wrote:

This looks like -mm stuff if you want it in 2.6.22


This needs to get to 2.6.21, it really is that big an improvement.

As Con pointed out, for some workloads and desired behaviour this is not 
as good as the existing scheduler. Therefore it should go in -mm and 
hopefully give the user an option to select which is appropriate.


With luck I'll get to shake out that patch in combination with kvm later 
today.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu scheduler

2007-03-06 Thread Bill Davidsen

Willy Tarreau wrote:

On Tue, Mar 06, 2007 at 11:18:44AM +1100, Con Kolivas wrote:
  

On Tuesday 06 March 2007 10:05, Bill Davidsen wrote:


jos poortvliet wrote:
  

Well, imho his current staircase scheduler already does a better job
compared to mainline, but it won't make it in (or at least, it's not
likely). So we can hope this WILL make it into mainline, but I wouldn't
count on it.


Wrong problem, what is really needed is to get CPU scheduler choice into
mainline, just as i/o scheduler finally did. Con has noted that for some
loads this will present suboptimal performance, as will his -ck patches,
as will the default scheduler. Instead of trying to make ANY one size
fit all, we should have a means to select, at runtime, between any of
the schedulers, and preferably to define an interface by which a user
can insert a new scheduler in the kernel (compile in, I don't mean
plugable) with clear and well defined rules for how that can be done.
  
Been there, done that. Wli wrote the infrastructure for plugsched; I took his 
code and got it booting and ported 3 or so different scheduler designs. It 
allowed you to build as few or as many different schedulers into the kernel 
and either boot the only one you built into your kernel, or choose a 
scheduler at boot time. That code got permavetoed by both Ingo and Linus. 
After that I gave up on that code and handed it over to Peter Williams who 
still maintains it. So please note that I pushed the plugsched barrow 
previously and still don't think it's a bad idea, but the maintainers think 
it's the wrong approach.



In a way, I think they are right. Let me explain. Pluggable schedulers are
useful when you want to switch away from the default one. This is very useful
during development of a new scheduler, as well as when you're not satisfied
with the default scheduler. Having this feature will incite many people to
develop their own scheduler for their very specific workload, and nothing
generic. It's a bit what happened after all : you, Peter, Nick, and Mike
have worked a lot trying to provide alternative solutions.

But when you think about it, there are other OSes which have only one scheduler
and which behave very well with tens of thousands of tasks and scale very well
with lots of CPUs (eg: solaris). So there is a real challenge here to try to
provide something at least as good and universal because we know that it can
exist. And this is what you finally did : work on a scheduler which ought to be
good with any workload.

  
The problem is not with "any workload", because that's not the issue; 
the issue is the definition of "good" matching the administrator's 
policy. And that's where the problem comes in. We have the default 
scheduler, which favors interactive jobs. We have Con's staircase 
scheduler which is part of an interactivity package. We have the 
absolutely fair scheduler which is, well... fair, and keeps things 
smooth and under reasonable load crisp.


There are other schedulers in the pluggable package, I did a doorknob 
scheduler for 2.2 (everybody gets a turn, special case of round-robin). 
I'm sure people have quietly hacked many more, which have never been 
presented to public view.


The point is that no one CPU scheduler will satisfy the policy needs of 
all users, any more than one i/o scheduler does so. We have realtime 
scheduling, preempt both voluntary and involuntary, why should we not 
have multiple CPU schedulers. If Linus has an objection to plugable 
schedulers, then let's identify what the problem is and address it. If 
that means one scheduler or the other must be compiled in, or all 
compiled in and selected, so be it.



Then, when we have a generic, good enough scheduler for most situations, I
think that it could be good to get the plugsched for very specific usages.
People working in HPC may prefer to allocate ressource differently for
instance. There may also be people refusing to mix tasks from different users
on two different siblings of one CPU for security reasons, etc... All those
would justify a plugable scheduler. But it should not be an excuse to provide
a set of bad schedulers and no good one.

  
Unless you force the definition of "good" to be what the default 
scheduler does, there can be no one good one. Choice is good, no one 
is calling for bizarre niche implementations, but we have at minimum 
three CPU schedulers which are best for a large number of users. 
(current default, and Con's fair and interactive flavors, before you ask).

The CPU scheduler is often compared to the I/O schedulers while in fact this
is a completely different story. The I/O schedulers are needed because the
hardware and filesystems may lead to very different behaviours, and the
workload may vary a lot (eg: news server, ftp server, cache, desktop, real
time streaming, ...). But at least, the default I/O scheduler was good enough
for most usages, and alternative ones are here to provide optimal solutions

Re: [2.6.22 patch] the scheduled removal of OBSOLETE_OSS options

2007-03-06 Thread Bill Davidsen

Adrian Bunk wrote:

On Tue, Mar 06, 2007 at 12:46:22PM -0500, Bill Davidsen wrote:
  

Adrian Bunk wrote:

This patch contains the scheduled removal of the OBSOLETE_OSS options 
for 2.6.22.


  
If these are drivers for which there are thought to be useful ALSA 
drivers, would it be reasonable to leave a stub for a help file naming 
the driver which claims to support the hardware?


I'm not objection to the removal of the drivers, just noting that 
identifying the new drivers can be made easier.



People compiling their own kernels aren't completely dumb - if you know 
about people having problems finding the right ALSA driver for their 
hardware, please name the concrete problems so that we can improve the 
description and/or help text of these ALSA options.
  
I'm not sure how my original note might have been clearer, but let me 
try again.


You are about to delete a number of OSS drivers because there are ALSA 
drivers for the hardware. I am assuming that for each of these drivers 
you have some ALSA driver in mind, rather than just general 
handwaving. I therefore suggest that it would be good if one person, 
that would be you, could do a little Kconfig magic so that when 'make 
oldconfig' on new kernel source fails to support sound, there might be a 
message in the output with a hint, like 'OSS driver <foo> has been 
deleted and ALSA driver <bar> should support this hardware.' So one 
person who I bet knows which replacement drivers are most likely could 
save some effort for many people who otherwise may have to read help on 
a number of drivers (naming is not always obvious), or grep through the 
driver source for board or chipset names giving a clue.


If Kconfig can't do this, fine, I haven't studied it in years, nor ever 
been an expert. If you have no idea what drivers replace the ones you 
are deleting and are only following orders, fine too (but I doubt that). 
But no improvement to ALSA help text would save as many people as much 
time as a one line message telling them the most likely driver to 
support similar hardware and avoiding the need to look at that text, or 
at least let the cautious look at the most likely text first.


Since you are the agent of change in breaking many existing configs I 
thought you might be inclined to at least give a clue if it were small 
effort on your part.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: RSDL v0.31

2007-03-28 Thread Bill Davidsen

David Schwartz wrote:

there were multiple attempts with renicing X under the vanilla
scheduler, and they were utter failures most of the time. _More_ people
complained about interactivity issues _after_ X has been reniced to -5
(or -10) than people complained about nice 0 interactivity issues to
begin with.


Unfortunately, nicing X is not going to work. It causes X to pre-empt any
local process that tries to batch requests to it, defeating the batching.
What you really want is X to get scheduled after the client pauses in
sending data to it or has sent more than a certain amount. It seems kind of
crazy to put such logic in a scheduler.

Perhaps when one process unblocks another, you put that other process at the
head of the run queue but don't pre-empt the currently running process. That
way, the process can continue to batch requests, but X's maximum latency
delay will be the quantum of the client program.


In general I think that's the right idea. See below for more...



The vanilla scheduler's auto-nice feature rewards _behavior_, so it gets
X right most of the time. The fundamental issue is that sometimes X is
very interactive - we boost it then, there's lots of scheduling but nice
low latencies. Sometimes it's a hog - we penalize it then and things
start to batch up more and we get out of the overload situation faster.
That's the case even if all you care about is desktop performance.

no doubt it's hard to get the auto-nice thing right, but one thing is
clear: currently RSDL causes problems in areas that worked well in the
vanilla scheduler for a long time, so RSDL needs to improve. RSDL should
not lure itself into the false promise of 'just renice X statically'. It
wont work. (You might want to rewrite X's request scheduling - but if so
then i'd like to see that being done _first_, because i just dont trust
such 10-mile-distance problem analysis.)


I am hopeful that there exists a heuristic that both improves this problem
and is also inherently fair. If that's true, then such a heuristic can be
added to RSDL without damaging its properties and without requiring any
special settings. Perhaps longer-term latency benefits to processes that
have yielded in the past?

I think there are certain circumstances, however, where it is inherently
reasonable to insist that 'nice' be used. If you want a CPU-starved task to
get more than 1/X of the CPU, where X is the number of CPU-starved tasks,
you should have to ask for that. If you want one CPU-starved task to get
better latency than other CPU-starved tasks, you should have to ask for
that.


I agree for giving a process more than a fair share, but I don't think 
latency is the best term for what you describe later. If you think of 
latency as the time between a process unblocking and the time when it 
gets CPU, that is a more traditional interpretation. I'm not really sure 
latency and CPU-starved are compatible.


I would like to see processes at the head of the queue (for latency) 
which were blocked for long term events, keyboard input, network input, 
mouse input, etc. Then processes blocked for short term events like 
disk, then processes which exhausted their time slice. This helps 
latency and responsiveness, while keeping all processes running.
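
Purely as an illustration of that ordering, and not taken from any real 
scheduler (all the names below are invented for the example):

enum wake_reason { WOKE_USER_INPUT, WOKE_NETWORK, WOKE_DISK, SLICE_EXPIRED };

/* Lower rank means closer to the head of the run queue. */
static int queue_rank(enum wake_reason why)
{
	switch (why) {
	case WOKE_USER_INPUT:		/* keyboard, mouse */
	case WOKE_NETWORK:
		return 0;		/* long term waits go to the head */
	case WOKE_DISK:
		return 1;		/* short term waits come next */
	default:
		return 2;		/* timeslice exhausted goes last */
	}
}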


A variation is to give those processes at the head of the queue short


Fundamentally, the scheduler cannot do it by itself. You can create cases
where the load is precisely identical and one person wants X and another
person wants Y. The scheduler cannot know what's important to you.

DS





--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: RSDL v0.31

2007-03-28 Thread Bill Davidsen

Linus Torvalds wrote:


On Tue, 20 Mar 2007, Willy Tarreau wrote:

Linus, you're unfair with Con. He initially was on this position, and lately
worked with Mike by proposing changes to try to improve his X responsiveness.


I was not actually so much speaking about Con, as about a lot of the 
tone in general here. And yes, it's not been entirely black and white. I 
was very happy to see the "try this patch" email from Al Boldi - not 
because I think that patch per se was necessarily the right fix (I have no 
idea), but simply because I think that's the kind of mindset we need to 
have.


Not a lot of people really *like* the old scheduler, but it's been tweaked 
over the years to try to avoid some nasty behaviour. I'm really hoping 
that RSDL would be a lot better (and by all accounts it has the potential 
for that), but I think it's totally naïve to expect that it won't need 
some tweaking too.


So I'll happily still merge RSDL right after 2.6.21 (and it won't even be 
a config option - if we want to make it good, we need to make sure 
*everybody* tests it), but what I want to see is that can do spirit wrt 
tweaking for issues that come up.


May I suggest that if you want proper testing that it not only should be 
a config option but a boot time option as well? Otherwise people will be 
comparing an old scheduler with an RSDL kernel, and they will diverge as 
time goes on.


More people would be willing to reboot and test on a similar load than 
will keep two versions of the kernel around. And if you get people 
testing RSDL against a vendor kernel which might be hacked, it will be 
even less meaningful.


Please consider the benefits of making RSDL the default scheduler, and 
leaving people with the old scheduler with an otherwise identical kernel 
as a fair and meaningful comparison.


There, that's a technical argument ;-)

Because let's face it - nothing is ever perfect. Even a really nice 
conceptual idea always ends up hitting the "but in real life, things are 
ugly and complex, and we've depended on behaviour X in the past and can't 
change it, so we need some tweaking for problem Y".


And everything is totally fixable - at least as long as people are willing 
to!


Linus



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6 patch] the scheduled eepro100 removal

2007-03-28 Thread Bill Davidsen

Adrian Bunk wrote:

This patch contains the scheduled removal of the eepro100 driver.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]


This keeps coming around, but I haven't seen an answer to the questions 
raised by Eric Piel or Kiszka. I do know that e100 didn't work on some 
IBM rackmount servers and eepro100 did, but since I'm no longer 
responsible for those machines I can't retest. Perhaps someone will be 
able to provide data points.


IBM current offerings as of about three years ago, I had a few dozen of 
them at one time.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: max_loop limit

2007-03-29 Thread Bill Davidsen

Tomas M wrote:

255 loop devices are insufficient? What kind of scenario do you have
in mind?




Thank you very much for replying.

In 1981, Bill Gates said that 64KB of memory is enough for everybody.
And you know how much RAM do you have right now. :)

Actually, I believe the number was 640K, the quote included the phrase 
"should be", and it referred to the available memory on the IBM PC. And 
this was after IBM 
decided to put the video adapter in memory at 640k, Intel decided to 
provide only 1MB of address space on the 8086, and was in the context of 
mainframes of the day, some of which could only address 1MB.


And having run clients with three users on an XT with just that 640kB 
and UNIX, I don't think he was wrong about the memory for that time, 
just the O/S.


BTW: anyone got a copy of PC/IX (SysIII for XT) around? I'd love to run 
that in a VM just for the comparison.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: max_loop limit

2007-03-29 Thread Bill Davidsen

roland wrote:

partitions on loop or device-mapper devices ?

you can use kpartx tool for this.

bryn m. reeves told me that it's probably possible to create udev 
rules that will automatically create partition maps when a new loop 
device is set up, which is better than adding partitioning logic into 
dm-loop for example.


It is certainly possible to create a partitionable RAID device from a 
loop device. Should be possible to use nbd as well, but I can't seem to 
get nbd to work on 2.6.21-rc (my working system runs 2.6.17).


example:

kpartx -a /dev/mapper/loop0

# ls /dev/mapper/loop0*
/dev/mapper/loop0  /dev/mapper/loop0p1  /dev/mapper/loop0p2
/dev/mapper/loop0p3


i have seen a patch for loop.c doing this, though. search the archives 
for this


regards
roland





On Thu, Mar 22, 2007 at 02:33:14PM +, Al Viro wrote:

Correction: current ABI is crap.  To set the thing up you need to open
it and issue an ioctl.  Which is a bloody bad idea, for obvious 
reasons...


Agreed.  What would be a right way?  Global device ala ptmx/tun/tap?
New syscall?  Something else?

 OG.



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-usb-devel] possible USB regression with 2.6.21-rc4: iPod doesn't work

2007-03-29 Thread Bill Davidsen

Tino Keitel wrote:

On Mon, Mar 26, 2007 at 17:15:53 -0400, Alan Stern wrote:

[...]


The lack of messages from the iPod seems to indicate that the hub isn't
working right.  You could try plugging the iPod into a different hub or
directly into the computer.  Or maybe into a different port of that hub.


Uh, I think I found the reason for the strange behaviour at
shutdown/suspend. When I unload the usblp module, then the iPod is
recognized.


And that's not the case with 2.6.20?

--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kmalloc() with size zero

2007-03-29 Thread Bill Davidsen

Stephane Eranian wrote:

Hi,

On Sun, Mar 25, 2007 at 06:30:34PM +0200, Folkert van Heusden wrote:

I'd say feature, glibc's malloc also returns an address on
malloc(0).


This is implementation defined - the standard allows for return of either
null or an address.

Entirely for entertainment: AIX (5.3) returns NULL, IRIX returns a valid
address.


That's interesting, so many different behaviors! Personally, I still prefer
when malloc(0) returns zero because it makes it easier to catch errors.

Exactly, the address returned is not really useful, the improved error 
checking is useful.
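
Since the quoted replies show the behaviour differs by platform, a tiny 
userspace test makes the point; plain C, nothing kernel specific:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	void *p = malloc(0);

	/* The standard allows either result: NULL, or a unique pointer
	 * that may only be handed back to free(). */
	printf("malloc(0) returned %p\n", p);
	free(p);	/* free(NULL) is a defined no-op, so this is safe */
	return 0;
}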


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[2.6.21-rc5-git1][KVM] exception on start VM

2007-03-29 Thread Bill Davidsen
Starting a VM for Win98SE:

posidon:root /usr/local/kvm-15/bin/qemu -m 128 -hda Win98SE-2.kvm 
exception 13 (0)
rax f000ff53 rbx  rcx 005a rdx 
000e
rsi 001100c4 rdi 0002a002 rsp 00086650 rbp 
667a
r8   r9   r10  r11 

r12  r13  r14  r15 

rip d350 rflags 00237206
cs fff8 (000fff80/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss 103f (000103f0/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs  (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr  (0885/2088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt  (/ p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 87244/2f
idt 0/3ff
cr0 6010 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
Aborted
posidon:root


Hope that's useful, I was looking at nbd issues and just tried for 
curiosity.

-- 
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6 patch] the scheduled eepro100 removal

2007-03-30 Thread Bill Davidsen

Brandeburg, Jesse wrote:

Roberto Nibali wrote:
  

Sounds sane to me.  My overall opinion on eepro100 removal is that
we're not there yet.  Rare problem cases remain where e100 fails
but eepro100 works, and it's older drivers so its low priority for
everybody. 


Needs to happen, though...


It seems that several Tyan Opteron based systems were using an IPMI
add-on card. The IPMI card shares the onboard Intel 100Mbit NIC. You need
to use eepro100 instead of e100, otherwise e100 will shut down the OOB
(out of band) connection for IPMI when the OS shuts down.
  

I find it hard to believe that something as common as IPMI in
conjunction with the IPMI technology wasn't tested in Intel's lab.
From my experience with Intel Server boards, onboard IPMI (all offered
versions) and e100/e1000 NICs, I've never ever experienced any
problems operating the BMC over the NIC. Also, I don't quite
understand you point about the IPMI card sharing the 100Mbit/s NIC
onboard? What exactly is shared?



It's a legit problem, but only with this *one* system.

  
Of course the eepro100 driver is not taking a lot of maintenance either, 
removing it is not critical as long as there is a legitimate need to 
support old hardware.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-04 Thread Bill Davidsen

Tejun Heo wrote:

[resending.  my mail service was down for more than a week and this
message didn't get delivered.]

[EMAIL PROTECTED] wrote:
  

Anyway, what's annoying is that I can't figure out how to bring the
drive back on line without resetting the box.  It's in a hot-swap
enclosure, but power cycling the drive doesn't seem to help.  I thought
libata hotplug was working?  (SiI3132 card, using the sil24 driver.)


Yeah, it's working, but failing resets are considered highly dangerous
(in that the controller status is unknown and may cause something
dangerous like screaming interrupts) and the port is muted after that.  The
plan is to handle this with polling hotplug, such that libata tries to
revive the port if a PHY status change is detected by polling.  Patches
are available but they need other things to be resolved to get integrated.
I think it'll happen before the summer.

Anyways, you can tell libata to retry the port by manually telling it to
rescan the port (echo - - - > /sys/class/scsi_host/hostX/scan).
  
I won't say that's voodoo, but if I ever did it I'd wipe down my 
keyboard with holy water afterward. ;-)


Well, I did save the message in my tricks file, but it sounds like a 
last-ditch effort after something has gone very wrong.
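
Added illustration, not part of the original exchange: the same rescan 
trigger issued from C rather than the shell one-liner above. The host 
number is an assumption; substitute the real hostX and run as root.

  #include <stdio.h>

  /* Sketch only: ask the SCSI layer to rescan a host by writing the
   * wildcard "- - -" (channel target lun) to its sysfs scan attribute,
   * exactly what the echo above does.  host0 is a placeholder. */
  int main(void)
  {
      const char *path = "/sys/class/scsi_host/host0/scan";
      FILE *f = fopen(path, "w");

      if (!f) {
          perror(path);
          return 1;
      }
      fputs("- - -\n", f);
      fclose(f);
      return 0;
  }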


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] fast file mapping for loop

2008-01-10 Thread Bill Davidsen

Jens Axboe wrote:

Hi,

loop.c currently uses the page cache interface to do IO to file backed
devices. This works reasonably well for simple things, like mapping an
iso9660 file for direct mount and other read-only workloads. Writing is
somewhat problematic, as anyone who has really used this feature can
attest to - it tends to confuse the vm (hello kswapd) since it breaks
dirty accounting and behaves very erratically on writeout. Did I mention
that it's pretty slow as well, for both reads and writes?

Since you are looking for comments, I'll mention a loop-related behavior 
I've been seeing and see if it gets comments or is useful, since it can 
be used to tickle bad behavior on demand.


I have a 6GB sparse file, which I mount with cryptoloop and populate as 
an ext3 filesystem (more later on why). I then copy ~5.8GB of data to 
the filesystem, which is unmounted to be burnt to a DVD. Before it's 
burned, the dvdisaster application is used to add some ECC information 
to the end and make an image which fits on a DVD-DL. Media will be 
burned and distributed to multiple locations.


The problem:

When copying with rsync, the copy runs at ~25MB/s for a while, then 
falls into a pattern of bursts of 25MB/s followed by 10-15 sec of iowait 
with no disk activity. So I tried doing the copy with cpio:

  find . -depth | cpio -pdm /mnt/loop
which shows exactly the same behavior. Then, for no good reason I tried
  find . -depth | cpio -pBdm /mnt/loop
and the copy ran at 25MB/s for the whole data set.

I was able to see similar results with a pure loop mount; I only mention 
the crypto for accuracy. Many of these images have been shipped over 
the last two years, so new loop code would only be useful in this case 
if it stayed compatible enough that the old data sets could still be read.



It also behaves differently from a real drive. For writes, completions
are done once they hit the page cache. Since loop queues bio's async and
hands them off to a thread, you can have a huge backlog of stuff to do.
It's hard to attempt to guarantee data safety for file systems on top of
loop without making it even slower than it currently is.

Back when loop was only used for iso9660 mounting and other simple
things, this didn't matter. Now it's often used in xen (and others)
setups where we do care about performance AND writing. So the below is a
attempt at speeding up loop and making it behave like a real device.
It's a somewhat quick hack and is still missing one piece to be
complete, but I'll throw it out there for people to play with and
comment on.

So how does it work? Instead of punting IO to a thread and passing it
through the page cache, we instead attempt to send the IO directly to the
filesystem block that it maps to. loop maintains a prio tree of known
extents in the file (populated lazily on demand, as needed). Advantages
of this approach:

- It's fast, loop will basically work at device speed.
- It's fast, it doesn't put a huge amount of system load on the
  system when busy. When I did comparison tests on my notebook with an
  external drive, running a simple tiobench on the current in-kernel
  loop with a sparse file backing rendered the notebook basically
  unusable while the test was ongoing. The remapper version had no more
  impact than it did when used directly on the external drive.
- It behaves like a real block device.
- It's easy to support IO barriers, which is needed to ensure safety
  especially in virtualized setups.

Disadvantages:

- The file block mappings must not change while loop is using the file.
  This means that we have to ensure exclusive access to the file and
  this is the bit that is currently missing in the implementation. It
  would be nice if we could just do this via open(), ideas welcome...
- It'll tie down a bit of memory for the prio tree. This is GREATLY
  offset by the reduced page cache foot print though.
- It cannot be used with the loop encryption stuff. dm-crypt should be
  used instead, on top of loop (which, I think, is even the recommended
  way to do this today, so not a big deal).
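
Added as a hedged illustration only, not code from the patch itself: a 
small user-space sketch of the kind of file-offset-to-filesystem-block 
mapping the remapper caches in its prio tree, done here with the 
long-standing FIBMAP/FIGETBSZ ioctls (needs root and a filesystem that 
supports FIBMAP).

  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/fs.h>          /* FIBMAP, FIGETBSZ */

  /* Sketch only: map the first few logical blocks of a file to physical
   * filesystem blocks.  The in-kernel remapper builds an extent tree for
   * the whole file; this just shows the underlying lookup. */
  int main(int argc, char **argv)
  {
      int fd, blocksize, i;

      if (argc != 2) {
          fprintf(stderr, "usage: %s <file>\n", argv[0]);
          return 1;
      }
      fd = open(argv[1], O_RDONLY);
      if (fd < 0) {
          perror("open");
          return 1;
      }
      if (ioctl(fd, FIGETBSZ, &blocksize) < 0) {
          perror("FIGETBSZ");
          return 1;
      }
      printf("filesystem block size: %d\n", blocksize);

      for (i = 0; i < 8; i++) {
          int block = i;          /* logical block in, physical block out */
          if (ioctl(fd, FIBMAP, &block) < 0) {
              perror("FIBMAP");
              break;
          }
          printf("logical %d -> physical %d%s\n", i, block,
                 block == 0 ? " (hole or unallocated)" : "");
      }
      close(fd);
      return 0;
  }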



--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] per-task I/O throttling

2008-01-10 Thread Bill Davidsen

Andrea Righi wrote:

Allow to limit the bandwidth of I/O-intensive processes, like backup
tools running in background, large files copy, checksums on huge files,
etc.

This kind of processes can noticeably impact the system responsiveness
for some time and playing with tasks' priority is not always an
acceptable solution.

This patch allows you to specify a maximum I/O rate in sectors per second
for each single process via /proc/PID/io_throttle (the default is zero,
which means no limit).

It would seem to me that this would be vastly more useful in the real 
world if there were a settable default, so that administrators could 
avoid having to find and tune individual user processes. And it would 
seem far less common that the admin would want to set the limit *up* for 
a given process, and it's likely to be one known to the admin, at least 
by name.


Of course, if you want to go to the effort of making it fully tunable, it 
could have a default by UID or GID. Useful on machines shared by students 
or managers.
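
Added sketch, assuming the interface exactly as described in the RFC 
above (/proc/PID/io_throttle taking a sectors-per-second value); this is 
not a mainline interface, just an illustration of how an admin helper 
might set a limit for one process.

  #include <stdio.h>

  /* Sketch only: write a sectors-per-second limit to the per-task file
   * proposed in the RFC.  The path and units are taken from the patch
   * description above; they are not part of a mainline kernel. */
  int main(int argc, char **argv)
  {
      char path[64];
      FILE *f;

      if (argc != 3) {
          fprintf(stderr, "usage: %s <pid> <sectors-per-second>\n", argv[0]);
          return 1;
      }
      snprintf(path, sizeof(path), "/proc/%s/io_throttle", argv[1]);
      f = fopen(path, "w");
      if (!f) {
          perror(path);
          return 1;
      }
      fprintf(f, "%s\n", argv[2]);
      fclose(f);
      return 0;
  }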


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   4   5   6   7   8   9   10   >