Re: [PATCH] sched: staircase deadline misc fixes

2007-03-28 Thread Mike Galbraith
Oh my, I'm on a roll here... somebody stop me ;-)

Some emphasis:

On Thu, 2007-03-29 at 08:29 +0200, Mike Galbraith wrote:
> On Thu, 2007-03-29 at 07:50 +0200, Mike Galbraith wrote:
> 
> > Opinion polls are nice, but I'm more interested in gathering numbers
> > which either validate or invalidate the claims of the design documents.
> 
> Suggestion: try the testcase that Satoru Takeuchi posted.  The numbers I
> got with latest SD were no better than the numbers I got with the patch
> I posted to try to solve it.  Seems to me the numbers with SD should
> have been much better, but they in fact were not.
> 
> Running that thing, mainline's GUI was not usable, even with my patch,
> but neither was it usable with SD.  What's the difference between
> horrible with mainline and merely terrible with SD?  In both, the GUI
> ends up doing round-robin with a slew of hogs.  In mainline, this
> happens because the history logic can and does get it wrong sometimes,
> which this exploit deliberately triggers.  With SD, it's by design.

The much maligned history mechanism in mainline didn't start its life
as an interactivity estimator; that's a name it acquired later.  What it
was first put there for was to ensure fairness for sleeping tasks.

I found it most ironic that the numbers I posted showed that mechanism
working perfectly, with an exploit that was designed specifically to
expose its weakness, despite the deliberate tweaks that have pushed it
very heavily in the unfair direction, and this went uncommented.  If I
had run more of them, it would have shown that weakness very well.  We
all know that weakness exists.

What the numbers clearly showed was that sleeping tasks did not get the
fairness RSDL advertised with the particular test I ran, yet it went
uncommented/uncontested.  Anyone could have tested with the trivial
proggy of their choice... but nobody did.

The history mechanism is not only about interactivity, and never was. 

-Mike

I'm gonna go piddle around with code now, much more fun than yacking :)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sched: staircase deadline misc fixes

2007-03-28 Thread Con Kolivas
On Thursday 29 March 2007 02:37, Con Kolivas wrote:
> I'm cautiously optimistic that we're at the thin edge of the bugfix wedge
> now.

My neck condition got a lot worse today. I'm forced offline for a week and 
will be uncontactable.

-- 
-ck


Re: [PATCH] sched: staircase deadline misc fixes

2007-03-28 Thread Mike Galbraith
On Thu, 2007-03-29 at 07:50 +0200, Mike Galbraith wrote:

> Opinion polls are nice, but I'm more interested in gathering numbers
> which either validate or invalidate the claims of the design documents.

Suggestion: try the testcase that Satoru Takeuchi posted.  The numbers I
got with latest SD were no better than the numbers I got with the patch
I posted to try to solve it.  Seems to me the numbers with SD should
have been much better, but they in fact were not.

Running that thing, mainline's GUI was not usable, even with my patch,
but neither was it usable with SD.  What's the difference between
horrible with mainline and merely terrible with SD?  In both, the GUI
ends up doing round-robin with a slew of hogs.  In mainline, this
happens because the history logic can and does get it wrong sometimes,
which this exploit deliberately triggers.  With SD, it's by design.

-Mike



Re: Software RAID (non-preempt) server blocking question. (2.6.20.4)

2007-03-28 Thread Neil Brown
On Tuesday March 27, [EMAIL PROTECTED] wrote:
> I ran a check on my SW RAID devices this morning.  However, when I did so, 
> I had a few lftp sessions open pulling files.  After I executed the check, 
> the lftp processes entered 'D' state and I could do 'nothing' in the 
> process until the check finished.  Is this normal?  Should a check block 
> all I/O to the device and put the processes writing to a particular device 
> in 'D' state until it is finished?

No, that shouldn't happen.  The 'check' should notice any other disk
activity and slow down if anything else is happening on the device.

Did the check run to completion?  And if so, did the 'lftp' start
working normally again?

Did you look at "cat /proc/mdstat" ?? What sort of speed was the check
running at?

NeilBrown


Re: [PATCH] sched: staircase deadline misc fixes

2007-03-28 Thread Mike Galbraith
On Thu, 2007-03-29 at 09:44 +1000, Con Kolivas wrote:
> On Thursday 29 March 2007 04:48, Ingo Molnar wrote:
> > hm, how about the questions Mike raised (there were a couple of cases of
> > friction between 'the design as documented and announced' and 'the code
> > as implemented')? As far as i saw they were still largely unanswered -
> > but let me know if they are all answered and addressed:
> 
> I spent less time emailing and more time coding. I have been working on 
> addressing whatever people brought up.
> 
> >  http://marc.info/?l=linux-kernel&m=117465220309006&w=2
> 
> Attended to.
> 
> >  http://marc.info/?l=linux-kernel&m=117489673929124&w=2
> 
> Attended to.
> 
> >  http://marc.info/?l=linux-kernel&m=117489831930240&w=2
> 
> Checked fine.

That one's not fine.

+static void recalc_task_prio(struct task_struct *p, struct rq *rq)
+{
+	struct prio_array *array = rq->active;
+	int queue_prio;
+
+	update_if_moved(p, rq);
+	if (p->rotation == rq->prio_rotation) {
+		if (p->array == array) {
+			if (p->time_slice > 0)
+				return;
+			p->time_slice = p->quota;
+		} else if (p->array == rq->expired) {

You implemented nanosecond accounting, but here you give a task which
has either missed the tick often enough, or accumulated enough cross-cpu
clock drift to have an I.O.U. in its wallet, a shiny new $8 bill.

WRT clock drift/timewarps, your latest code concedes that these do occur,
but where these timewarps can be anywhere between minuscule with Intel
same-package processors and up to a tick elsewhere, it charges a tick.
 
-   /* cpu scheduler quota accounting is performed here */
+	if (tick) {
+		/*
+		 * Called from scheduler_tick() there should be less than two
+		 * jiffies worth, and not negative/overflow.
+		 */
+		if (time_diff > JIFFIES_TO_NS(2) || time_diff < min_diff)
+			time_diff = JIFFIES_TO_NS(1);
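
As a minimal user-space sketch of what that clamp does (not the patch
itself): the name charge_ns(), the choice of HZ=1000 and the
JIFFIES_TO_NS definition below are assumptions for illustration, and
min_diff is passed in as a parameter because its value is not shown in
the quoted hunk.

```c
#include <stdint.h>

/* Assumed for illustration: HZ=1000, so one jiffy is 1,000,000 ns. */
#define HZ 1000
#define JIFFIES_TO_NS(j) ((int64_t)(j) * (1000000000LL / HZ))

/*
 * Mirrors the tick-path clamp quoted above: a delta larger than two
 * jiffies, or smaller than min_diff, is replaced by one full jiffy.
 * charge_ns() is a hypothetical name, not a function from the patch.
 */
int64_t charge_ns(int64_t time_diff, int64_t min_diff)
{
	if (time_diff > JIFFIES_TO_NS(2) || time_diff < min_diff)
		time_diff = JIFFIES_TO_NS(1);
	return time_diff;
}
```

Under these assumptions, with min_diff set to 0, a timewarp of -50 ns is
charged as a whole jiffy, while an in-range delta of 300000 ns is charged
as measured.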

> > and the numbers he posted:
> >
> >  http://marc.info/?l=linux-kernel&m=117448900626028&w=2
> 
> Attended to.

Hm.  How, where?

I'm getting inconsistent results with current, but sleeping tasks still
don't _appear_ to be able to compete with hogs on an equal footing, and
I don't see how they really can.

What happens if a sleeper sleeps after using say half of its slice, and
the hog it's sharing the CPU with then sleeps briefly after using most
of its slice?  That's the end of the rotation.  They are put back on an
equal footing, but what just happened to the differential in cpu usage?
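
A toy model of that question, under the assumption that a new rotation
simply refills each slice to its quota (my reading of the
p->time_slice = p->quota line quoted earlier). The struct, its fields
and both function names are hypothetical and are not SD's code.

```c
/* Toy model only: field and function names are hypothetical, not SD's. */
struct toy_task {
	long quota;       /* slice granted per rotation */
	long time_slice;  /* what is left of the current slice */
	long used_total;  /* CPU actually consumed, kept for comparison */
};

/* Consume ns units of CPU from the current slice. */
void toy_run(struct toy_task *t, long ns)
{
	t->time_slice -= ns;
	t->used_total += ns;
}

/* Assumed rotation behaviour: the slice is simply refilled to quota. */
void toy_new_rotation(struct toy_task *t)
{
	t->time_slice = t->quota;
}
```

If a sleeper with quota 8 runs for 4 and a hog with quota 8 runs for 7,
both start the next rotation with a slice of 8, and nothing in this
model remembers that the hog consumed 3 more.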

> > his test conclusion was that under CPU load, RSDL (SD) generally does
> > not hold up to mainline's interactivity.
> 
> There have been improvements since the earlier iterations but it's still a 
> fairness based design. Mike's "sticking point" test case should be improved 
> as well.

The behavior is different, and is less ragged, but I wouldn't say it's
really been improved.  The below was added as a workaround.

+ * This contains a bitmap for each dynamic priority level with empty slots
+ * for the valid priorities each different nice level can have. It allows
+ * us to stagger the slots where differing priorities run in a way that
+ * keeps latency differences between different nice levels at a minimum.
+ * ie, where 0 means a slot for that priority, priority running from left to
+ * right:
+ * nice -20 
+ * nice -10 1001000100100010001001000100010010001000
+ * nice   0 0101010101010101010101010101010101010101
+ * nice   5 1101011010110101101011010110101101011011
+ * nice  10 0110111011011101110110111011101101110111
+ * nice  15 0101101101011011
+ * nice  19 1110

I don't really know what to say about this.  I think it explains reduced
context switching, but I don't see how this could be a good thing.
Consider a nice -20 fast/light task trying to get CPU with nice 0 tasks
being constantly spawned.  How can this latency bound fast mover perform
if it can't preempt?  What am I missing?
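
For reading the rows of that comment, a small sketch that counts how
many slots a row offers, taking the comment's own convention that a 0
marks a slot. toy_count_slots() is a hypothetical helper for
illustration and does not appear in the patch.

```c
/*
 * Count the '0' characters in one row of the bitmap comment.
 * Per the quoted comment, 0 means "a slot for that priority".
 * Hypothetical helper, not part of the SD patch.
 */
int toy_count_slots(const char *row)
{
	int slots = 0;

	for (; *row; row++)
		if (*row == '0')
			slots++;
	return slots;
}
```

Feeding it the nice 0 and nice 10 rows as quoted gives a rough feel for
how much less often the higher nice level gets a slot per pass.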

> My call based on my own testing and feedback from users is: 
> 
> Under niced loads it is 99% in favour of SD.
> 
> Under light loads it is 95% in favour of SD.
> 
> Under Heavy loads it becomes proportionately in favour of mainline. The 
> crossover is somewhere around a load of 4.

Opinion polls are nice, but I'm more interested in gathering numbers
which either validate or invalidate the claims of the design documents.
 
WRT this subjective opinion thing, I see regressions with all loads, and
I don't see what a < 95% load really means.  If CPU isn't contended,
dishing it out is dirt simple.  Just give everybody frequent, and fairly
short chunks, and everybody is fairly happy.  The only time scheduling
becomes interesting is when there IS contention, and mainline seems to
do much better at this, with the caveat that the history 

Re: [ PATCH] Add suspend/resume for HPET was: Re: [3/6] 2.6.21-rc4: known regressions

2007-03-28 Thread Maxim
On Thursday 29 March 2007 07:08:58 Linus Torvalds wrote:
> 
> On Thu, 29 Mar 2007, Maxim wrote:
> >
> > I am sending here a patch that as was discussed here adds hpet to list 
> > of system devices
> > and adds suspend/resume hooks this way.
> > I tested it and it works fine.
> 
> Ok, it certainly looks better, but it *also* looks like it just assumes 
> the HPET is there. Which would work in testing _with_ a HPET, but would 
> likely break on hardware without one, no?
> 
> Shouldn't there be at least something like a
> 
>   if (!is_hpet_capable())
>   return 0;
> 
> at the top of that init routine? I'd also expect that you'd need to check 
> that "hpet_virt_address" is valid or something?
> 
> (Or, better yet, shouldn't we set "boot_hpet_disable" when we decide not 
> to use the HPET, and set hpet_virt_address to NULL?)

This is done here

out_nohpet:
iounmap(hpet_virt_address);
hpet_virt_address = NULL;
> 
>   Linus
> 

Hi, 
Of course, I forgot.

I was planning to put the sysdev code in hpet_enable(),
but that is not possible because this function is called too early.

Thus I put the sysdev initialization in a separate function, but forgot to 
test for HPET.

Thanks a lot.

Best regards
Maxim Levitsky

---
This adds support of suspend/resume on i386 for HPET
Signed-off-by: Maxim Levitsky <[EMAIL PROTECTED]>

---
 arch/i386/kernel/hpet.c |   68 +++
 1 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..7c67780 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -310,6 +312,7 @@ int __init hpet_enable(void)
 out_nohpet:
iounmap(hpet_virt_address);
hpet_virt_address = NULL;
+   boot_hpet_disable = 1;
return 0;
 }
 
@@ -524,3 +527,68 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	if (!is_hpet_capable())
+		return 0;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		err = sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif
-- 
1.4.4.2



[PATCH] pid: Properly detect orphaned process groups in exit_notify

2007-03-28 Thread Eric W. Biederman

In commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1, when converting
the orphaned process group handling to use struct pid,
I made a small mistake.  I accidentally replaced an == with a !=.

Besides just being a dumb thing to do, apparently this has a bad side
effect.  The improper orphaned process group detection causes kwin to
die after a suspend/resume cycle.

I'm amazed this patch has been around as long as it has without anyone
else noticing something funny going on.
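
A user-space sketch of the first two tests in that condition, to show
how the flipped operator changes the outcome. struct toy_proc and both
function names are hypothetical; t stands for the task the hunk compares
against (the exiting task's parent, as far as I can tell from
exit_notify()), and the remaining checks in the real condition are left
out.

```c
/* Hypothetical stand-in for the pgrp/session identity of a task. */
struct toy_proc {
	int pgrp;
	int session;
};

/* Fixed form: different process group, but the SAME session. */
int toy_check_fixed(const struct toy_proc *t, const struct toy_proc *tsk)
{
	return t->pgrp != tsk->pgrp && t->session == tsk->session;
}

/* Buggy form: the == on the session test was replaced by !=. */
int toy_check_buggy(const struct toy_proc *t, const struct toy_proc *tsk)
{
	return t->pgrp != tsk->pgrp && t->session != tsk->session;
}
```

For two tasks in the same session but different process groups, the
fixed test passes and goes on to the orphan checks, while the buggy one
never gets that far.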

And the following people deserve credit for spotting and helping
to reproduce this.

Thanks to: Sid Boyce <[EMAIL PROTECTED]>
Thanks to: "Michael Wu"

Signed-off-by: "Eric W. Biederman" <[EMAIL PROTECTED]>
---

diff --git a/kernel/exit.c b/kernel/exit.c
index f132349..b55ed4c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -790,7 +790,7 @@ static void exit_notify(struct task_struct *tsk)

pgrp = task_pgrp(tsk);
if ((task_pgrp(t) != pgrp) &&
-   (task_session(t) != task_session(tsk)) &&
+   (task_session(t) == task_session(tsk)) &&
will_become_orphaned_pgrp(pgrp, tsk) &&
has_stopped_jobs(pgrp)) {
__kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp);


Re: [linux-usb-devel] [RFC] HID bus design overview.

2007-03-28 Thread Li Yu
Jiri Kosina wrote:
> JFYI the preliminary version of the hidraw interface is now in the 
> hid/usbhid git tree, and has also been in a few recent -mm kernels 
> already.
>
>   
The shadow driver support works now.

The biggest problem is that HID/Bluetooth does not work now.  And I have
no Bluetooth input device to test with, so ...

I think I should port the current implementation to 2.6.21-rc5-mm2, add
hiddev support, then release it.

One last question: what's the future of hiddev?  Will it be merged
into hidraw later?  I think so, but I can't be sure.


Re: [PATCH 14/21] MSI: Use a list instead of the custom link structure

2007-03-28 Thread Eric W. Biederman
Michael Ellerman <[EMAIL PROTECTED]> writes:

>
> I thought about doing it in the MSI enable methods, but I think it
> really belongs in the (nonexistent) routine that allocs and sets up a
> pci_dev.

I agree that would be a good place for it as well.

> I think it's pretty dicey to be passing around a pci_dev with an
> uninitialised msi_list. Even if currently no code outside the MSI enable
> methods looks at it, I think we're asking for bugs in the future.

Reasonable.

> So I'll do a patch which adds alloc_pci_dev(), update the callers, and
> then put the msi_list initialisation in there.

Sounds good.  That will allow us to initialize all of the fields in struct
pci_dev to a default value in one place.
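
Roughly what that could look like, as a user-space mock rather than the
eventual kernel code: struct toy_pci_dev, toy_alloc_pci_dev() and the
local list_head helpers are stand-ins defined here for illustration, and
calloc() takes the place of whatever allocator the real routine uses.

```c
#include <stdlib.h>

/* Minimal stand-in for the kernel's doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list points at itself in both directions. */
void toy_init_list_head(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Hypothetical cut-down device; the real struct has many more fields. */
struct toy_pci_dev {
	struct list_head msi_list;
};

/* One place that zeroes the device and initialises msi_list. */
struct toy_pci_dev *toy_alloc_pci_dev(void)
{
	struct toy_pci_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	toy_init_list_head(&dev->msi_list);
	return dev;
}
```

With a single allocation routine, no caller can hand around a device
whose msi_list was never initialised.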

Eric


Re: [RFT] e100 driver on ARM

2007-03-28 Thread David Acker

Kok, Auke wrote:
> Lennert Buytenhek wrote:
>> On Mon, Sep 04, 2006 at 06:39:29AM -0400, Jeff Garzik wrote:
>>> 1) Does e100 driver work on ARM?
>>
>> FWIW, e100 seems to work okay for me on an intel ixp2400 (xscale based)
>> board, an ixp2850 (xscale based) board and an ixp2350 (xscale3 based)
>> board.  ixp2350 works both with hardware coherency turned on (cpu
>> snoops bus) and turned off (manual dma cache clean/invalidate as usual.)
>>
>> As for the other ARM platforms that I'm interested in / have hardware
>> for / maintain, the at91/ep93xx/pxa270 don't have PCI, and the other
>> two (iop32x/iop33x) I can't test because I don't have such systems with
>> e100 NICs, but I expect those would work, since they're both xscale
>> based like the ixp2400, and the ixp2400 works.
>
> I just got an iop342 board dropped on my lap. Once it's running, I'll 
> make sure to make this the first thing to test.

I have a pxa255 based system with PCI added to it.  The e100 would have 
memory corruption in its receive buffers detected by slab debugging 
unless I put in the patch to use the S-bit.


Here is a link to the patch posting:
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20-rc3/2.6.20-rc3-mm1/broken-out/git-netdev-all.patch
Search for e100.c.

http://www-gatago.com/linux/kernel/15457063.html - This discussion seems 
to hit the issue.


There appears to be a race on the cache line where the EL bit and the 
next packet info live. In my case the hardware appeared to write to a 
free packet.  The S-bit seems to make the hardware stop and spin on the 
bit, while the EL bit seems to let the hardware try to use that packet.


This race would occur less often when the receive buffer chain is always 
refilled before the hardware can use them up.  On our 400 MHz XScale, we 
can use up all 256 buffers if the PCI bus has another busy device on it.  
In our case it is an 802.11g miniPCI card, and our software was routing 
all ethernet packets to the wireless interface and vice versa while TCP 
streams were running across these connections.

-Ack


[RFC:PATCH]register memory init functions into white list of section mismatch.

2007-03-28 Thread Yasunori Goto

> > > > WARNING: mm/built-in.o - Section mismatch: reference to
> > > > .init.text:__alloc_bootmem_node from .text between 'sparse_init' (at
> > > > offset 0x15c8f) and '__section_nr'
> > > I took a look at this one.
> > > You have SPARSEMEM enabled in your config.
> > > And then I see that in sparse.c we call alloc_bootmem_node()
> > > from a function I thought should be marked __devinit (it
> > > is used by memory_hotplug.c).
> > > > But I am not familiar enough to judge whether __alloc_bootmem_node
> > > > is marked correctly with __init, or whether __devinit (to say
> > > > this is used in the HOTPLUG case) is more correct.
> > > Anyone?
> > > 
> > > > WARNING: mm/built-in.o - Section mismatch: reference to
> > > > .init.text:__alloc_bootmem_node from .text between 'sparse_init' (at
> > > > offset 0x15d02) and '__section_nr'
> > > Same as above
> > 
> > Memory hotplug code has __meminit for its purpose.
> > But, I suspect that many other places of memory hotplug code may have
> > same issue. I will chase them.


Hello.

  I chased section mismatches in the memory hotplug code. Many of
  the functions should be defined as __meminit. (This check was a great
  help for finding them. Thanks!)
  But I would like to add a new pattern to the white list for some of
  them. (I'll post another patch for the others.)
  
  sparse.c (sparse_index_alloc()) calls alloc_bootmem_node() as you mentioned.
  And zone_wait_table_init() calls it too.
  These functions call it only at boot time, and call
  vmalloc()/kmalloc() at hotplug time. The two cases are distinguished by
  the system_state value or slab_is_available(). Only the references
  remain in them after boot.
  
  Bootmem allocation functions are called by many functions and they must be
  used only at boot time. I think their __init should be kept for the
  section mismatch check. So I would like to register sparse_index_alloc()
  and zone_wait_table_init() into the white list.
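
  The proposed check, pulled out of modpost as a standalone sketch so it
  can be tried in isolation. toy_pattern9() is a hypothetical name; the
  section names and the two symbols are the ones named in this mail.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 when a reference from .text into .init.text comes from one
 * of the functions that only use bootmem at boot time, 0 otherwise.
 * Hypothetical standalone version of the proposed whitelist pattern.
 */
int toy_pattern9(const char *tosec, const char *fromsec, const char *atsym)
{
	static const char *const syms[] = {
		"sparse_index_alloc",
		"zone_wait_table_init",
		NULL
	};
	const char *const *s;

	if (strcmp(tosec, ".init.text") != 0 || strcmp(fromsec, ".text") != 0)
		return 0;
	for (s = syms; *s; s++)
		if (strcmp(atsym, *s) == 0)
			return 1;
	return 0;
}
```

  Any other symbol, or any other pair of sections, still falls through
  and would keep producing the warning.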

  Please comment. If there is a better way, please let me know...

Thanks.

P.S.
  Pattern 10 is for ia64 (not for memory hotplug). 
  ia64's .machvec section is a mixed table of .init functions and normal text.
  It is defined for platform dependent functions. This is also a cause of 
  warnings. I think this should be registered too. 


Signed-off-by: Yasunori Goto <[EMAIL PROTECTED]>

---
 mm/page_alloc.c   |2 +-
 mm/sparse.c   |2 +-
 scripts/mod/modpost.c |   29 +
 3 files changed, 31 insertions(+), 2 deletions(-)

Index: current_test/scripts/mod/modpost.c
===
--- current_test.orig/scripts/mod/modpost.c 2007-03-27 20:21:20.0 +0900
+++ current_test/scripts/mod/modpost.c  2007-03-29 14:16:05.0 +0900
@@ -643,6 +643,17 @@ static int strrcmp(const char *s, const 
  *  The pattern is:
  *  tosec= .init.text
  *  fromsec  = __ksymtab*
+ *
+ * Pattern 9:
+ *  Some functions are common code between boot time and hotplug
+ *  time. The bootmem allocator is called only at boot time in these
+ *  functions. So it's ok to reference.
+ *  tosec= .init.text
+ *
+ * Pattern 10:
+ *  ia64 has machvec table for each platform. It is mixture of function
+ *  pointer of .init.text and .text.
+ *  fromsec  = .machvec
  **/
 static int secref_whitelist(const char *modname, const char *tosec,
const char *fromsec, const char *atsym,
@@ -669,6 +680,12 @@ static int secref_whitelist(const char *
NULL
};
 
+   const char *pat4sym[] = {
+   "sparse_index_alloc",
+   "zone_wait_table_init",
+   NULL
+   };
+
/* Check for pattern 1 */
if (strcmp(tosec, ".init.data") != 0)
f1 = 0;
@@ -725,6 +742,18 @@ static int secref_whitelist(const char *
if ((strcmp(tosec, ".init.text") == 0) &&
(strncmp(fromsec, "__ksymtab", strlen("__ksymtab")) == 0))
return 1;
+
+   /* Check for pattern 9 */
+   if ((strcmp(tosec, ".init.text") == 0) &&
+   (strcmp(fromsec, ".text") == 0))
+   for (s = pat4sym; *s; s++)
+   if (strcmp(atsym, *s) == 0)
+   return 1;
+
+   /* Check for pattern 10 */
+   if (strcmp(fromsec, ".machvec") == 0)
+   return 1;
+
return 0;
 }
 
Index: current_test/mm/page_alloc.c
===
--- current_test.orig/mm/page_alloc.c   2007-03-27 16:04:41.0 +0900
+++ current_test/mm/page_alloc.c2007-03-29 14:14:42.0 +0900
@@ -2673,7 +2673,7 @@ void __init setup_per_cpu_pageset(void)
 
 #endif
 
-static __meminit
+static __meminit noinline
 int zone_wait_table_init(struct zone *zone, unsigned long zone_size_pages)
 {
int i;
Index: current_test/mm/sparse.c
===
--- current_test.orig/mm/sparse.c   2007-03-27 16:04:41.0 

Re: [ PATCH] Add suspend/resume for HPET was: Re: [3/6] 2.6.21-rc4: known regressions

2007-03-28 Thread Linus Torvalds


On Thu, 29 Mar 2007, Maxim wrote:
>
>   I am sending here a patch that as was discussed here adds hpet to list 
> of system devices
>   and adds suspend/resume hooks this way.
>   I tested it and it works fine.

Ok, it certainly looks better, but it *also* looks like it just assumes 
the HPET is there. Which would work in testing _with_ a HPET, but would 
likely break on hardware without one, no?

Shouldn't there be at least something like a

if (!is_hpet_capable())
return 0;

at the top of that init routine? I'd also expect that you'd need to check 
that "hpet_virt_address" is valid or something?

(Or, better yet, shouldn't we set "boot_hpet_disable" when we decide not 
to use the HPET, and set hpet_virt_address to NULL?)

Linus


Re: [PATCH] x86_64 irq: keep consistent for changing IRQ0_VECTOR from 0x20 to 0x30

2007-03-28 Thread Eric W. Biederman
"Yinghai Lu" <[EMAIL PROTECTED]> writes:

> On 3/7/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>> The comment fixes or some variation on them are needed.
>
> Please check the patch about comment.
>
> YH
>

Looks good to me.  I've cleaned up the description and placed the patch inline
for easier consumption.  Everything this patch touches is a comment, so
it is as safe as they come.  And the patch appears to apply to Linus's
latest tree.
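
For reference, the arithmetic behind the new comments, as a small
sketch. FIRST_EXTERNAL_VECTOR is taken to be 0x20 here, which is what
the old 0x20-0x2f comments and the + 0x10 offset in the patch imply;
toy_isa_irq_to_vector() is a hypothetical helper, and the macro is
parenthesised here only to keep the sketch safe in expressions.

```c
/* Assumed value, implied by the old comments and the 0x10 offset. */
#define FIRST_EXTERNAL_VECTOR 0x20
#define IRQ0_VECTOR (FIRST_EXTERNAL_VECTOR + 0x10)

/* Map a legacy ISA irq (0..15) to its vector. Hypothetical helper. */
int toy_isa_irq_to_vector(int irq)
{
	return IRQ0_VECTOR + irq;
}
```

IRQ 0-7 on the master land on 0x30-0x37 and IRQ 8-15 on the slave land
on 0x38-0x3f, matching the updated comments.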

---
From: Yinghai Lu <[EMAIL PROTECTED]>
Subject: x86_64 irq: Fix comments after changing IRQ0_VECTOR from 0x20 to 0x30

Signed-off-by: "Eric W. Biederman" <[EMAIL PROTECTED]>
Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/i8259.c b/arch/x86_64/kernel/i8259.c
index 21d95b7..4894266 100644
--- a/arch/x86_64/kernel/i8259.c
+++ b/arch/x86_64/kernel/i8259.c
@@ -45,7 +45,7 @@
 
 /*
  * ISA PIC or low IO-APIC triggered (INTA-cycle or APIC) interrupts:
- * (these are usually mapped to vectors 0x20-0x2f)
+ * (these are usually mapped to vectors 0x30-0x3f)
  */
 
 /*
@@ -299,7 +299,7 @@ void init_8259A(int auto_eoi)
 * outb_p - this has to work on a wide range of PC hardware.
 */
outb_p(0x11, 0x20); /* ICW1: select 8259A-1 init */
-   outb_p(IRQ0_VECTOR, 0x21);  /* ICW2: 8259A-1 IR0-7 mapped to 0x20-0x27 */
+   outb_p(IRQ0_VECTOR, 0x21);  /* ICW2: 8259A-1 IR0-7 mapped to 0x30-0x37 */
outb_p(0x04, 0x21); /* 8259A-1 (the master) has a slave on IR2 */
if (auto_eoi)
outb_p(0x03, 0x21); /* master does Auto EOI */
@@ -307,7 +307,7 @@ void init_8259A(int auto_eoi)
outb_p(0x01, 0x21); /* master expects normal EOI */
 
outb_p(0x11, 0xA0); /* ICW1: select 8259A-2 init */
-   outb_p(IRQ8_VECTOR, 0xA1);  /* ICW2: 8259A-2 IR0-7 mapped to 0x28-0x2f */
+   outb_p(IRQ8_VECTOR, 0xA1);  /* ICW2: 8259A-2 IR0-7 mapped to 0x38-0x3f */
outb_p(0x02, 0xA1); /* 8259A-2 is a slave on master's IR2 */
outb_p(0x01, 0xA1); /* (slave's support for AEOI in flat mode
is to be investigated) */
diff --git a/include/asm-x86_64/hw_irq.h b/include/asm-x86_64/hw_irq.h
index 2e4b7a5..6153ae5 100644
--- a/include/asm-x86_64/hw_irq.h
+++ b/include/asm-x86_64/hw_irq.h
@@ -38,7 +38,7 @@
 #define IRQ_MOVE_CLEANUP_VECTOR	FIRST_EXTERNAL_VECTOR
  
 /*
- * Vectors 0x20-0x2f are used for ISA interrupts.
+ * Vectors 0x30-0x3f are used for ISA interrupts.
  */
 #define IRQ0_VECTOR		FIRST_EXTERNAL_VECTOR + 0x10
 #define IRQ1_VECTOR		IRQ0_VECTOR + 1
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6 patch] the scheduled eepro100 removal

2007-03-28 Thread Yinghai Lu

On 3/28/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
> Kok, Auke wrote:
>
> Sounds sane to me.  My overall opinion on eepro100 removal is that we're
> not there yet.  Rare problem cases remain where e100 fails but eepro100
> works, and these are older drivers, so it's low priority for everybody.
>
> Needs to happen, though...

It seems that several Tyan Opteron-based systems use an IPMI add-on
card.  The IPMI card shares the onboard Intel 100Mhz NIC.  You need
to use eepro100 instead of e100; otherwise e100 will shut down the OOB
(out-of-band) connection for IPMI when the OS shuts down.

YH


Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Eric W. Biederman
Len Brown <[EMAIL PROTECTED]> writes:

>> Tony, Len, the way pci_disable_device is being used in a suspend/resume 
>> path by a few drivers is completely incompatible with the way irqs are 
>> allocated on ia64.  In particular, the following sequence occurs 
>> in several drivers.
>> 
>> probe:
>>   pci_enable_device(pdev);
>>   request_irq(pdev->irq);
>> suspend:
>>   pci_disable_device(pdev);
>> resume:
>>   pci_enable_device(pdev);
>> remove:
>>   free_irq(pdev->irq);
>>   pci_disable_device(pdev);
>
> There are no IA64 machines that support system suspend/resume today --
> so you have 0 chance of breaking the IA64 suspend/resume installed base.

Ok.  So that is why the inconsistency persists...

> My understanding is that Luming Yu has cobbled IA64 S4 support
> together for a future release though.
>
>> What I'm proposing we do is move the irq allocation code out of 
>> pci_enable_device and the irq freeing code out of pci_disable_device in 
>> the future.  If we move ia64 to a model where the irq number equals the 
>> gsi like we have for x86_64 and are in the middle of for i386 that 
>> should be pretty straightforward. It would even be relatively simple  
>> to delay vector allocation in that context until request_irq, if we 
>> needed the delayed allocation benefit.  Do you two have any problems 
>> with moving in that direction?
>
> I think consistency here would be _wonderful_.
> Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
> that tells you what IOAPIC pin you are using become moot with MSI --
> but hey, showing the IRQ number rather than the vector number
> is consistent and makes sense.

Yes.  It also allows for bigger machines.  And I can get a consistent
number out of MSI if we allocate irq numbers in a sufficiently non-sparse
way.  Something like bus|device|func|irq which is 8+5+3+12 or 28 bits...
I'll never get there though if I keep unearthing these long-standing bugs.
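
A sketch of that packing, with the field widths given above (8 bits of
bus, 5 of device, 3 of function, 12 of irq, 28 in total). The name
toy_pack_irq() and the exact bit positions are assumptions for
illustration; only the widths come from the paragraph above.

```c
#include <stdint.h>

/*
 * Pack bus|device|func|irq into one 28-bit number:
 *   bits 27..20 bus, 19..15 device, 14..12 function, 11..0 irq.
 * Hypothetical layout; each field is masked to its width.
 */
uint32_t toy_pack_irq(unsigned bus, unsigned dev, unsigned func, unsigned irq)
{
	return ((uint32_t)(bus & 0xff) << 20) |
	       ((uint32_t)(dev & 0x1f) << 15) |
	       ((uint32_t)(func & 0x7) << 12) |
	       (uint32_t)(irq & 0xfff);
}
```

With every field at its maximum the result is 0x0fffffff, so the whole
number fits in 28 bits.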

>> If fixing the arch code is unacceptable for some reason I'm not aware of 
>> we need to audit the 10-20 drivers that call pci_disable_device in their 
>> suspend/resume processing and ensure that they have freed all of the 
>> irqs before that point.  Given that I have bug reports on the msi path I 
>> know that isn't true.
>
> I think the suspend/resume interrupt logic needs some serious attention.
> We've had several schemes for suspend/resume of interrupts, several
> changes in strategy, and right now I think we are inconsistent,
> and frankly, I'm amazed it works at all.

What I have been doing lately is to aim at consistency in how a function
is called (and thus how it is expected to be used) and how it is actually
implemented.  When I have a choice I try to pick a forgiving implementation
so that driver writers don't have to follow a magic correct path for
things to work correctly.  

Removing the irq assignment from pci_enable_device is something that
matches implementation with use.

As for the rest it seems reasonable to me to allow an irq to be held
requested over suspend/resume and to save and restore apic and msi
capability state.  Especially since irq numbers are a kernel
abstraction we should be able to do with them what we need to.
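
A toy userspace model of the driver sequence quoted at the top of this message may help show why decoupling matters.  It is only a sketch: the toy_* names are invented, and "coupled" stands for an arch where enable/disable also allocate and free the irq, while "decoupled" stands for the proposed model where the irq number is fixed up front.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state, not kernel code. */
struct toy_dev {
	bool coupled;        /* enable/disable also allocate/free the irq */
	bool irq_mapped;     /* the irq number in pdev->irq is valid */
	bool handler_bound;  /* request_irq() done, free_irq() not yet called */
	int  dangling_frees; /* mapping torn down while a handler was bound */
};

static void toy_discover(struct toy_dev *d, bool coupled)
{
	d->coupled = coupled;
	d->irq_mapped = !coupled; /* decoupled: irq assigned once, at discovery */
	d->handler_bound = false;
	d->dangling_frees = 0;
}

static void toy_enable_device(struct toy_dev *d)
{
	if (d->coupled)
		d->irq_mapped = true;
}

static void toy_disable_device(struct toy_dev *d)
{
	if (!d->coupled)
		return;
	if (d->handler_bound)
		d->dangling_frees++; /* the problematic case */
	d->irq_mapped = false;
}

static void toy_request_irq(struct toy_dev *d)
{
	assert(d->irq_mapped);
	d->handler_bound = true;
}

static void toy_free_irq(struct toy_dev *d)
{
	d->handler_bound = false;
}

/* probe, suspend, resume, remove, in the order the drivers do it. */
static int toy_driver_lifecycle(bool coupled)
{
	struct toy_dev d;

	toy_discover(&d, coupled);
	toy_enable_device(&d);  /* probe */
	toy_request_irq(&d);
	toy_disable_device(&d); /* suspend */
	toy_enable_device(&d);  /* resume */
	toy_free_irq(&d);       /* remove */
	toy_disable_device(&d);
	return d.dangling_frees;
}
```

In the coupled model the suspend step frees the mapping under a live handler once; in the decoupled model the same driver sequence never does.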

Honestly, the whole suspend/resume thing is beyond me at this point, since
I'm laptop free...  But I do know how to make code consistent with itself.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 14/21] MSI: Use a list instead of the custom link structure

2007-03-28 Thread Michael Ellerman
On Wed, 2007-03-28 at 00:29 -0600, Eric W. Biederman wrote:
> Michael Ellerman <[EMAIL PROTECTED]> writes:
> 
> > The msi descriptors are linked together with what looks a lot like
> > a linked list, but isn't a struct list_head list. Make it one.
> >
> > The only complication is that previously we walked a list of irqs, and
> > got the descriptor for each with get_irq_msi(). Now we have a list of
> > descriptors and need to get the irq out of it, so it needs to be in the
> > actual struct msi_desc. We use 0 to indicate no irq is setup.
> >
> > At some point after a pci_dev is created we need to initialise its
> > msi_list. pci_device_add() looks like the right place to do that, although
> > I'm not convinced it's 100% safe. In drivers/char/agp/alpha-agp.c we create
> > a pci_dev and I don't see that it ever gets passed to pci_device_add(), but
> > we probably don't care.
> 
> Well that one appears to be a dummy place holder and probably should at
> least have a kzalloc to initialize all of the fields to known values.
> 
> Regardless the normal pci device allocation does use kzalloc so we will
> have well defined if not beautiful behavior if we try and use it.
> 
> Until we have a case where we need to use the msi_list outside of 
> where we enable and disable msi we should be perfectly fine initializing
> the list somewhere inside of pci_enable_msi, and pci_enable_msix.
> With dev->msi_enabled and dev->msix_enabled serving as flags to the
> rest of the world that it is safe to look at the list.
> 
> It certainly sounds safer to me than becoming too closely coupled with
> code that doesn't really care about how msi works.  Heck even though
> we repeat the call twice I bet it will even be less code.

I thought about doing it in the MSI enable methods, but I think it
really belongs in the (nonexistent) routine that allocs and sets up a
pci_dev.

I think it's pretty dicey to be passing around a pci_dev with an
uninitialised msi_list. Even if currently no code outside the MSI enable
methods looks at it, I think we're asking for bugs in the future.

So I'll do a patch which adds alloc_pci_dev(), update the callers, and
then put the msi_list initialisation in there.
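
Roughly what I have in mind, as a userspace sketch only.  The structure is heavily trimmed, calloc() stands in for kzalloc(), and every name ending in _sketch is hypothetical rather than the final interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical, trimmed-down device structure. */
struct pci_dev_sketch {
	unsigned int devfn;
	struct list_head msi_list;
};

/* One allocator for every caller: zeroed fields plus a valid, empty msi_list. */
static struct pci_dev_sketch *alloc_pci_dev_sketch(void)
{
	struct pci_dev_sketch *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	init_list_head(&dev->msi_list);
	assert(list_is_empty(&dev->msi_list));
	return dev;
}
```

With this, nobody can ever see a pci_dev whose msi_list contains NULL or stale pointers, whether or not MSI was ever enabled on it.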

> > --- msi-new.orig/include/linux/msi.h
> > +++ msi-new/include/linux/msi.h
> > @@ -1,6 +1,8 @@
> >  #ifndef LINUX_MSI_H
> >  #define LINUX_MSI_H
> >  
> > +#include 
> > +
> >  struct msi_msg {
> > u32 address_lo; /* low 32 bits of msi message address */
> > u32 address_hi; /* high 32 bits of msi message address */
> > @@ -24,10 +26,8 @@ struct msi_desc {
> > unsigned default_irq;   /* default pre-assigned irq   */
> > }msi_attrib;
> >  
> > -   struct {
> > -   __u16   head;
> > -   __u16   tail;
> > -   }link;
> > +   int irq;
> This should be "unsigned int irq"

Oops, I'll fix that.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person




Re: [patch resend v4] update ctime and mtime for mmaped write

2007-03-28 Thread Nick Piggin

[EMAIL PROTECTED] wrote:

>> But if you didn't notice until now, then the current implementation
>> must be pretty reasonable for your use as well.
>
> Oh, I definitely noticed.  As soon as I tried to port my application
> to 2.6, it broke - as evidenced by my complaints last year.  The
> current solution is simple - since it's running on dedicated boxes,
> leave them on 2.4.


Well I didn't know that was a change in behaviour vs 2.4 (or maybe I
did and forgot). That was probably a bit silly, unless there was a
good reason for it.

--
SUSE Labs, Novell Inc.


Re: [PATCH] x86_64 irq: keep consistent for changing IRQ0_VECTOR from 0x20 to 0x30

2007-03-28 Thread Yinghai Lu

On 3/7/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

> The comment fixes or some variation on them are needed.

Please check the patch below, which updates the comments.

YH
[PATCH] x86_64 irq: keep consistent for changing IRQ0_VECTOR from 0x20 to 0x30

FIRST_EXTERNAL_VECTOR is now used for IRQ_MOVE_CLEANUP_VECTOR, and IRQ0 starts at FIRST_EXTERNAL_VECTOR + 0x10 (0x30), so update the comments that still refer to vectors 0x20-0x2f.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

diff --git a/arch/x86_64/kernel/i8259.c b/arch/x86_64/kernel/i8259.c
index 21d95b7..4894266 100644
--- a/arch/x86_64/kernel/i8259.c
+++ b/arch/x86_64/kernel/i8259.c
@@ -45,7 +45,7 @@
 
 /*
  * ISA PIC or low IO-APIC triggered (INTA-cycle or APIC) interrupts:
- * (these are usually mapped to vectors 0x20-0x2f)
+ * (these are usually mapped to vectors 0x30-0x3f)
  */
 
 /*
@@ -299,7 +299,7 @@ void init_8259A(int auto_eoi)
 	 * outb_p - this has to work on a wide range of PC hardware.
 	 */
 	outb_p(0x11, 0x20);	/* ICW1: select 8259A-1 init */
-	outb_p(IRQ0_VECTOR, 0x21);	/* ICW2: 8259A-1 IR0-7 mapped to 0x20-0x27 */
+	outb_p(IRQ0_VECTOR, 0x21);	/* ICW2: 8259A-1 IR0-7 mapped to 0x30-0x37 */
 	outb_p(0x04, 0x21);	/* 8259A-1 (the master) has a slave on IR2 */
 	if (auto_eoi)
 		outb_p(0x03, 0x21);	/* master does Auto EOI */
@@ -307,7 +307,7 @@ void init_8259A(int auto_eoi)
 		outb_p(0x01, 0x21);	/* master expects normal EOI */
 
 	outb_p(0x11, 0xA0);	/* ICW1: select 8259A-2 init */
-	outb_p(IRQ8_VECTOR, 0xA1);	/* ICW2: 8259A-2 IR0-7 mapped to 0x28-0x2f */
+	outb_p(IRQ8_VECTOR, 0xA1);	/* ICW2: 8259A-2 IR0-7 mapped to 0x38-0x3f */
 	outb_p(0x02, 0xA1);	/* 8259A-2 is a slave on master's IR2 */
 	outb_p(0x01, 0xA1);	/* (slave's support for AEOI in flat mode
 is to be investigated) */
diff --git a/include/asm-x86_64/hw_irq.h b/include/asm-x86_64/hw_irq.h
index 2e4b7a5..6153ae5 100644
--- a/include/asm-x86_64/hw_irq.h
+++ b/include/asm-x86_64/hw_irq.h
@@ -38,7 +38,7 @@
 #define IRQ_MOVE_CLEANUP_VECTOR	FIRST_EXTERNAL_VECTOR
  
 /*
- * Vectors 0x20-0x2f are used for ISA interrupts.
+ * Vectors 0x30-0x3f are used for ISA interrupts.
  */
 #define IRQ0_VECTOR		FIRST_EXTERNAL_VECTOR + 0x10
 #define IRQ1_VECTOR		IRQ0_VECTOR + 1
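
The arithmetic behind the new comments can be checked with a short userspace sketch.  FIRST_EXTERNAL_VECTOR is taken to be 0x20 here, which is what the old 0x20-0x2f comments imply; isa_irq_to_vector() is a made-up helper, and IRQ0_VECTOR is parenthesized in the sketch to keep the expression safe.

```c
#include <assert.h>

#define FIRST_EXTERNAL_VECTOR    0x20 /* assumed, see the old comments */
#define IRQ_MOVE_CLEANUP_VECTOR  FIRST_EXTERNAL_VECTOR
#define IRQ0_VECTOR              (FIRST_EXTERNAL_VECTOR + 0x10)

/* Hypothetical helper: vector assigned to legacy ISA irq 0..15. */
static int isa_irq_to_vector(int irq)
{
	assert(irq >= 0 && irq < 16);
	return IRQ0_VECTOR + irq;
}
```

The master 8259A (IR0-7) therefore lands on 0x30-0x37 and the slave on 0x38-0x3f, leaving 0x20 free for the cleanup vector.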


Re: [PATCH 10/21] MSI: Add an arch_msi_supported()

2007-03-28 Thread Eric W. Biederman
Michael Ellerman <[EMAIL PROTECTED]> writes:

> I agree with most of that. I thought of doing that change, but didn't
> want to have the powerpc code stuck behind a huge pile of driver
> changes.
>
> My only other worry is that at some point we'll get a driver that does
> want to choose the entries it's allocated, and at that point we'll have
> to put back the msix_entry code (or something similar). I don't have any
> idea of when/if that sort of hardware/driver requirement is likely to
> surface though, if it's "not for a while" it might be worth ripping out
> the complexity until we really need it.

Yes.

Allocating everything and just requesting the irqs you really want
works as well.  So drivers like that would need to be common and the
savings significant before it would really be worthwhile to change
the API back to the way it is now.

>> I was tempted to drop nvec as well since our irq numbers are virtual,
>> we could always delay the failure into request_irq.  But there are
>> a few embedded architectures like the arm where the number irqs
>> numbers may stay limited for a long time and if the driver will never
>> use all of the irqs we get to save some resources and some work.  So
>> that makes sense.
>
> I think nvec should stay.

Agreed.

>> So can we please at least move this patch down to the end with the
>> rest of the RTAS arch support?
>> 
>> Moving it towards the end will allow it to be reviewed in the context
>> where it will be used and it will give us a chance to simplify
>> pci_enable_msix before we get there.
>
> I'm happy to move it to the end of the series. I'm also happy to stop
> passing the msix_entry into the arch.
>
> But I don't want to predicate the merge of our powerpc stuff on the
> removal of msix_entry entirely, there's too much risk that we'll slip to
> v23.

Sure.   But if we can kill msix_entry in the same time frame it would
be a good thing.

Eric




[PATCH] Add suspend/resume for HPET (was: Re: [3/6] 2.6.21-rc4: known regressions)

2007-03-28 Thread Maxim
On Wednesday 28 March 2007 18:38:48 Linus Torvalds wrote:
> 
> On Wed, 28 Mar 2007, Maxim wrote:
> > 
> > Now I don't have a clue how to set those bits if only HPET is used as 
> > clock source because now clocksources
> > don't have _any_ resume hook.
> 
> One thing that drives me wild about that "clocksource resume" thing is 
> that it seems to think that clocksources are somehow different from any 
> other system devices..
> 
> Why isn't the HPET considered a "device", and has it's own *device* 
> "suspend" and "resume"? Why do we seem to think that only "set_mode()" 
> etc should wake up clock sources?
> 
> It's a *device*, dammit. It should save and resume like one (probably as a 
> system device). The "set_mode()" etc stuff is at a completely different 
> (higher) conceptual level.
> 
> Thomas? It does seem like Maxim has hit the nail on the head (at least 
> partly) on the HPET timer resume problems..
> 
>   Linus
> 


Hi,
Here is a patch that, as discussed in this thread, adds the HPET to the
list of system devices and gives it suspend/resume hooks that way.
I tested it and it works fine.
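
The suspend side boils down to clearing two bits in the general configuration register.  A userspace sketch of just that masking follows; the bit values are what I understand the HPET layout to be (bit 0 enables the main counter, bit 1 selects legacy replacement routing), so treat them as assumptions, and hpet_cfg_quiesce() is an illustrative name only.

```c
#include <assert.h>
#include <stdint.h>

#define HPET_CFG_ENABLE  0x001u /* main counter running (assumed value) */
#define HPET_CFG_LEGACY  0x002u /* legacy replacement routing (assumed value) */

/* Value to write back to the config register before suspending. */
static uint32_t hpet_cfg_quiesce(uint32_t cfg)
{
	uint32_t out = cfg & ~(HPET_CFG_ENABLE | HPET_CFG_LEGACY);

	/* All other bits must be left exactly as they were read. */
	assert((out | HPET_CFG_ENABLE | HPET_CFG_LEGACY) ==
	       (cfg | HPET_CFG_ENABLE | HPET_CFG_LEGACY));
	return out;
}
```

Resume then has to turn the counter back on and, where the hardware supports it, re-enable legacy routing, which is what the resume hook in the patch does.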

---
Add suspend/resume support for HPET
Signed-off-by: Maxim Levitsky <[EMAIL PROTECTED]>

---
 arch/i386/kernel/hpet.c |   64 +++
 1 files changed, 64 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..ac41476 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -524,3 +526,65 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		err = sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif
-- 
1.4.4.2
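
A note on hpet_register_sysfs(): it follows the usual two-step register-and-unwind pattern, and the unwind branch can only run if the result of the second registration is stored in err.  Here is a userspace sketch of the pattern, with toy_* stubs standing in for the sysdev calls (the stubs and their failure switches are invented for illustration).

```c
#include <assert.h>

/* Toy registry; the fail_* switches let a caller force an error. */
struct toy_registry {
	int class_registered;
	int device_registered;
	int fail_class;
	int fail_device;
};

static int toy_class_register(struct toy_registry *r)
{
	if (r->fail_class)
		return -1;
	r->class_registered = 1;
	return 0;
}

static void toy_class_unregister(struct toy_registry *r)
{
	assert(r->class_registered);
	r->class_registered = 0;
}

static int toy_device_register(struct toy_registry *r)
{
	if (r->fail_device)
		return -1;
	r->device_registered = 1;
	return 0;
}

/* Register the class, then the device; undo the class if the device fails. */
static int toy_register_all(struct toy_registry *r)
{
	int err = toy_class_register(r);

	if (!err) {
		err = toy_device_register(r); /* keep the result for the check below */
		if (err)
			toy_class_unregister(r);
	}
	return err;
}
```

If the device step fails, the caller gets the error back and nothing is left half-registered.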



Re: [patch] MSI-X: fix resume crash

2007-03-28 Thread Len Brown
> Tony, Len the way pci_disable_device is being used in a suspend/resume 
> path by a few drivers is completely incompatible with the way irqs are 
> allocated on ia64.  In particular, the following sequence occurs
> in several drivers.
> 
> probe:
>   pci_enable_device(pdev);
>   request_irq(pdev->irq);
> suspend:
>   pci_disable_device(pdev);
> resume:
>   pci_enable_device(pdev);
> remove:
>   free_irq(pdev->irq);
>   pci_disable_device(pdev);

There are no IA64 machines that support system suspend/resume today --
so you have 0 chance of breaking the IA64 suspend/resume installed base.

My understanding is that Luming Yu has cobbled IA64 S4 support
together for a future release though.

> What I'm proposing we do is move the irq allocation code out of 
> pci_enable_device and the irq freeing code out of pci_disable_device in 
> the future.  If we move ia64 to a model where the irq number equal the 
> gsi like we have for x86_64 and are in the middle of for i386 that 
> should be pretty straight forward. It would even be relatively simple  
> to delay vector allocation in that context until request_irq, if we 
> needed the delayed allocation benefit.  Do you two have any problems 
> with moving in that direction?

I think consistency here would be _wonderful_.
Of course the beauty of having identity GSI=IRQ and a /proc/interrupts
that tells you what IOAPIC pin you are using become moot with MSI --
but hey, showing the IRQ number rather than the vector number
is consistent and makes sense.

> If fixing the arch code is unacceptable for some reason I'm not aware of 
> we need to audit the 10-20 drivers that call pci_disable_device in their 
> suspend/resume processing and ensure that they have freed all of the 
> irqs before that point.  Given that I have bug reports on the msi path I 
> know that isn't true.

I think the suspend/resume interrupt logic needs some serious attention.
We've had several schemes for suspend/resume of interrupts, several
changes in strategy, and right now I think we are inconsistent,
and frankly, I'm amazed it works at all.

-Len


> From: Eric W. Biederman <[EMAIL PROTECTED]>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  arch/cris/arch-v32/drivers/pci/bios.c |4 +++-
>  arch/frv/mb93090-mb00/pci-vdk.c   |3 ++-
>  arch/i386/pci/common.c|6 --
>  arch/ia64/pci/pci.c   |8 ++--
>  4 files changed, 15 insertions(+), 6 deletions(-)
> 
> Index: linux/arch/cris/arch-v32/drivers/pci/bios.c
> ===
> --- linux.orig/arch/cris/arch-v32/drivers/pci/bios.c
> +++ linux/arch/cris/arch-v32/drivers/pci/bios.c
> @@ -100,7 +100,9 @@ int pcibios_enable_device(struct pci_dev
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
>  
> - return pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + pcibios_enable_irq(dev);
> + return 0;
>  }
>  
>  int pcibios_assign_resources(void)
> Index: linux/arch/frv/mb93090-mb00/pci-vdk.c
> ===
> --- linux.orig/arch/frv/mb93090-mb00/pci-vdk.c
> +++ linux/arch/frv/mb93090-mb00/pci-vdk.c
> @@ -466,6 +466,7 @@ int pcibios_enable_device(struct pci_dev
>  
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
> - pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + pcibios_enable_irq(dev);
>   return 0;
>  }
> Index: linux/arch/i386/pci/common.c
> ===
> --- linux.orig/arch/i386/pci/common.c
> +++ linux/arch/i386/pci/common.c
> @@ -434,11 +434,13 @@ int pcibios_enable_device(struct pci_dev
>   if ((err = pcibios_enable_resources(dev, mask)) < 0)
>   return err;
>  
> - return pcibios_enable_irq(dev);
> + if (!dev->msi_enabled)
> + return pcibios_enable_irq(dev);
> + return 0;
>  }
>  
>  void pcibios_disable_device (struct pci_dev *dev)
>  {
> - if (pcibios_disable_irq)
> + if (!dev->msi_enabled && pcibios_disable_irq)
>   pcibios_disable_irq(dev);
>  }
> Index: linux/arch/ia64/pci/pci.c
> ===
> --- linux.orig/arch/ia64/pci/pci.c
> +++ linux/arch/ia64/pci/pci.c
> @@ -557,14 +557,18 @@ pcibios_enable_device (struct pci_dev *d
>   if (ret < 0)
>   return ret;
>  
> - return acpi_pci_irq_enable(dev);
> + if (!dev->msi_enabled)
> + return acpi_pci_irq_enable(dev);
> + return 0;
>  }
>  
>  void
>  pcibios_disable_device (struct pci_dev *dev)
>  {
>   BUG_ON(atomic_read(&dev->enable_cnt));
> - acpi_pci_irq_disable(dev);
> + if (!dev->msi_enabled)
> + acpi_pci_irq_disable(dev);
> + return 0;
>  }
>  
>  void

Re: [PATCH] max_loop limit, t2

2007-03-28 Thread Jan Engelhardt


On Mar 25 2007 10:40, Tomas M wrote:
>On ??, Jan Engelhardt wrote:
>
>> here's one. Allocates all the fluff dynamically. It does not
>> create any dev nodes by itself, so you need to do it (à la mdadm)
>
> I'm afraid that this would break a lot of things, for example mount
> -o loop will not work anymore unless you create /dev/loop* manually
> first, am I correct? In this case, this is unusable for many as it
> is not backward compatible with old loop.c, am I correct?

So here's another try. Use the max_auto_loop= module parameter to
define how many device nodes should be created in advance (defaults
to 8, like the original loop.c). More specifically, it is the number
of disks for which uevents are generated. This is because creating
all 1048576 possible loop disks in /dev (tmpfs!!) would really be
overkill and seldom good for memory usage.


On Mar 28 2007 23:54, Kyle Moffett wrote:

> Maybe an rbtree would work better here?  Maximum number of nodes
> traversed to get to the bottom of the tree given 2^(20) loop
> devices is 19 as opposed to the 2^(20) for a linked list.  Also, to
> preserve compatibility with existing userspace loop tools you
> should probably always allocate one extra loop device. Keep a
> "highest used loopdev" number and create the one after that so that
> udev will autocreate a dev node for it.

Yeah I already have a ... hack that creates /dev/loop[0-7] but it
segfaults ^_^ Perhaps someone knows why. The oops trace I get has
kobject_uevent() in it, but I don't think I missed something in the
_init function wrt. uevent generation, did I?

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>

Name: dynamic-loop-jengelh2.diff

Index: linux-2.6.21-rc5/drivers/block/Makefile
===
--- linux-2.6.21-rc5.orig/drivers/block/Makefile
+++ linux-2.6.21-rc5/drivers/block/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_VIODASD) += viodasd.o
 obj-$(CONFIG_BLK_DEV_SX8)  += sx8.o
 obj-$(CONFIG_BLK_DEV_UB)   += ub.o
 
+CFLAGS_loop.o += -O0
Index: linux-2.6.21-rc5/drivers/block/loop.c
===
--- linux-2.6.21-rc5.orig/drivers/block/loop.c
+++ linux-2.6.21-rc5/drivers/block/loop.c
@@ -77,9 +77,9 @@
 
 #include 
 
-static int max_loop = 8;
-static struct loop_device *loop_dev;
-static struct gendisk **disks;
+static unsigned int max_auto_loop = 8;
+static LIST_HEAD(loop_devices);
+static DEFINE_SPINLOCK(loop_devices_lock);
 
 /*
  * Transfer functions
@@ -183,7 +183,7 @@ figure_loop_size(struct loop_device *lo)
if (unlikely((loff_t)x != size))
return -EFBIG;
 
-   set_capacity(disks[lo->lo_number], x);
+   set_capacity(lo->lo_disk, x);
return 0;   
 }
 
@@ -812,7 +812,7 @@ static int loop_set_fd(struct loop_devic
lo->lo_queue->queuedata = lo;
lo->lo_queue->unplug_fn = loop_unplug;
 
-   set_capacity(disks[lo->lo_number], size);
+   set_capacity(lo->lo_disk, size);
bd_set_size(bdev, size << 9);
 
set_blocksize(bdev, lo_blocksize);
@@ -832,7 +832,7 @@ out_clr:
lo->lo_device = NULL;
lo->lo_backing_file = NULL;
lo->lo_flags = 0;
-   set_capacity(disks[lo->lo_number], 0);
+   set_capacity(lo->lo_disk, 0);
invalidate_bdev(bdev, 0);
bd_set_size(bdev, 0);
mapping_set_gfp_mask(mapping, lo->old_gfp_mask);
@@ -918,7 +918,7 @@ static int loop_clr_fd(struct loop_devic
memset(lo->lo_crypt_name, 0, LO_NAME_SIZE);
memset(lo->lo_file_name, 0, LO_NAME_SIZE);
invalidate_bdev(bdev, 0);
-   set_capacity(disks[lo->lo_number], 0);
+   set_capacity(lo->lo_disk, 0);
bd_set_size(bdev, 0);
mapping_set_gfp_mask(filp->f_mapping, gfp);
lo->lo_state = Lo_unbound;
@@ -1357,8 +1357,9 @@ static struct block_device_operations lo
 /*
  * And now the modules code and kernel interface.
  */
-module_param(max_loop, int, 0);
-MODULE_PARM_DESC(max_loop, "Maximum number of loop devices (1-256)");
+module_param(max_auto_loop, uint, S_IRUGO);
+MODULE_PARM_DESC(max_auto_loop, "Maximum number of auto-generated loop device "
+   "nodes (0-1048576)");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_BLOCKDEV_MAJOR(LOOP_MAJOR);
 
@@ -1383,7 +1384,7 @@ int loop_unregister_transfer(int number)
 
xfer_funcs[n] = NULL;
 
-   for (lo = &loop_dev[0]; lo < &loop_dev[max_loop]; lo++) {
+   list_for_each_entry(lo, &loop_devices, lo_list) {
 mutex_lock(&lo->lo_ctl_mutex);
 
if (lo->lo_encryption == xfer)
@@ -1398,102 +1399,120 @@ int loop_unregister_transfer(int number)
 EXPORT_SYMBOL(loop_register_transfer);
 EXPORT_SYMBOL(loop_unregister_transfer);
 
-static int __init loop_init(void)
+static struct loop_device *loop_find_dev(unsigned int number)
+{
+   struct loop_device *lo;
+   list_for_each_entry(lo, &loop_devices, lo_list)
+   if (lo->lo_number == number)
+   

Re: [PATCH 10/21] MSI: Add an arch_msi_supported()

2007-03-28 Thread Michael Ellerman
On Tue, 2007-03-27 at 23:54 -0600, Eric W. Biederman wrote:
> Michael Ellerman <[EMAIL PROTECTED]> writes:
> 
> > Add an arch_msi_supported(), which gives archs a chance to check the input
> > to pci_enable_msi/x. For MSI-X this routine might need the entry array, so
> > pass it in. For plain MSI, NULL is passed, the arch routine needs to cope
> > with that. Propagate the error value returned from the arch routine out to
> > the caller.
> 
> Ugh.  I'm not very comfortable with passing struct msix_entry into
> the architectures right now.
> 
> There are a couple of reasons.
> - Its irq field is too small (so we need to change it at some point)
> - Not a single driver that calls pci_enable_msix uses the scatter gather
>   feature (so the entry member is redundant).
> 
> So this struct msix_entry needs to change and we need to change the drivers
> along with it.  Having to change a couple of architectures as well sounds
> painful.  So we might as well fix that at the same time as we are
> adding the RTAS support so architectures don't have to deal with this
> nasty unused concept.
> 
> I'm thinking the same thing to do is to completely remove struct msix_entry
> and just let drivers walk the linked list you introduce a few patches
> later down.  All they need is to get their irq numbers anyway.

I agree with most of that. I thought of doing that change, but didn't
want to have the powerpc code stuck behind a huge pile of driver
changes.

My only other worry is that at some point we'll get a driver that does
want to choose the entries it's allocated, and at that point we'll have
to put back the msix_entry code (or something similar). I don't have any
idea of when/if that sort of hardware/driver requirement is likely to
surface though, if it's "not for a while" it might be worth ripping out
the complexity until we really need it.

> I was tempted to drop nvec as well since our irq numbers are virtual,
> we could always delay the failure into request_irq.  But there are
> a few embedded architectures like the arm where the number irqs
> numbers may stay limited for a long time and if the driver will never
> use all of the irqs we get to save some resources and some work.  So
> that makes sense.

I think nvec should stay.

> So can we please at least move this patch down to the end with the
> rest of the RTAS arch support?
> 
> Moving it towards the end will allow it to be reviewed in the context
> where it will be used and it will give us a chance to simplify
> pci_enable_msix before we get there.

I'm happy to move it to the end of the series. I'm also happy to stop
passing the msix_entry into the arch.

But I don't want to predicate the merge of our powerpc stuff on the
removal of msix_entry entirely, there's too much risk that we'll slip to
v23.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person




Re: 2.6.21-rc1 and 2.6.21-rc2 kwin dies silently

2007-03-28 Thread Eric W. Biederman

Sid, I think I have found the problem.  Could you try the following patch?
I believe I accidentally switched the sense of a test.

diff --git a/kernel/exit.c b/kernel/exit.c
index f132349..b55ed4c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -790,7 +790,7 @@ static void exit_notify(struct task_struct *tsk)

pgrp = task_pgrp(tsk);
if ((task_pgrp(t) != pgrp) &&
-   (task_session(t) != task_session(tsk)) &&
+   (task_session(t) == task_session(tsk)) &&
will_become_orphaned_pgrp(pgrp, tsk) &&
has_stopped_jobs(pgrp)) {
__kill_pgrp_info(SIGHUP, SEND_SIG_PRIV, pgrp);
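
For reference, the intent of the first two conditions, as a userspace sketch.  A process group is orphaned when no member has a parent that is in a different group of the same session, so the exiting task can only orphan its group if its parent was exactly that kind of link.  The toy_task structure and the function name are invented here; the kernel compares pid structures rather than plain integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy identifiers standing in for the task's process group and session. */
struct toy_task {
	int pgrp;
	int session;
};

/*
 * True when the parent sits in another process group of the same session,
 * i.e. when it is what currently keeps the child's group from being orphaned.
 */
static bool parent_anchors_pgrp(const struct toy_task *parent,
				const struct toy_task *tsk)
{
	assert(parent && tsk);
	return parent->pgrp != tsk->pgrp && parent->session == tsk->session;
}
```

With the inverted test, a parent in a different session was treated as the anchor, which is the opposite of the rule.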




Re: [PATCH] max_loop limit

2007-03-28 Thread Kyle Moffett

On Mar 23, 2007, at 19:26:34, Jan Engelhardt wrote:
> here's one. Allocates all the fluff dynamically. It does not create
> any dev nodes by itself, so you need to do it (à la mdadm), but
> you'll get all 1048576 available minors.
>
> +static LIST_HEAD(loop_devices);


Maybe an rbtree would work better here?  Maximum number of nodes  
traversed to get to the bottom of the tree given 2^(20) loop devices  
is 19 as opposed to the 2^(20) for a linked list.  Also, to preserve  
compatibility with existing userspace loop tools you should probably  
always allocate one extra loop device.  Keep a "highest used loopdev"  
number and create the one after that so that udev will autocreate a  
dev node for it.
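
To put rough numbers on the lookup cost, here is a userspace sketch that counts probes.  A binary search over the sorted minors stands in for a descent through a perfectly balanced tree; a real red-black tree is only approximately balanced, so its worst case can be up to about twice as deep.  The function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Probes needed to reach minor `key` when walking minors 0..n-1 in order. */
static size_t linear_probes(size_t n, size_t key)
{
	assert(key < n);
	return key + 1;
}

/* Probes needed by a binary search over the sorted minors 0..n-1. */
static size_t balanced_probes(size_t n, size_t key)
{
	size_t lo = 0, hi = n - 1, probes = 0;

	assert(key < n);
	for (;;) {
		size_t mid = lo + (hi - lo) / 2;

		probes++;
		if (mid == key)
			return probes;
		if (mid < key)
			lo = mid + 1;
		else
			hi = mid - 1;
	}
}
```

For 2^20 devices the list walk can take 1048576 probes for the last minor, while the balanced search never needs more than about 20.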


Cheers,
Kyle Moffett



Re: [KJ][PATCH] BIT macro cleanup

2007-03-28 Thread Richard Knutsson

Milind Arun Choudhary wrote:

BIT macro cleanup,now in bitops.h

Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---
  



diff --git a/drivers/net/s2io.h b/drivers/net/s2io.h
index 0de0c65..5aa3be5 100644
--- a/drivers/net/s2io.h
+++ b/drivers/net/s2io.h
@@ -14,6 +14,7 @@
 #define _S2IO_H
 
 #define TBD 0

+#undef BIT
 #define BIT(loc)   (0x8000000000000000ULL >> (loc))
 #define vBIT(val, loc, sz) (((u64)val) << (64-loc-sz))
 #define INV(d)  ((d&0xff)<<24) | (((d>>8)&0xff)<<16) | (((d>>16)&0xff)<<8)| ((d>>24)&0xff)
  

Why not use "LLBIT(63 - loc)" instead?
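
A quick userspace check of the equivalence I have in mind.  It assumes the driver's macro counts bit positions from the most significant bit of a 64-bit word, and that LLBIT(nr) would be defined as (1ULL << (nr)); both definitions below are assumptions for the sake of the example, and S2IO_BIT is just a local name for the driver's BIT.

```c
#include <assert.h>
#include <stdint.h>

/* Driver-style macro: position 0 is the most significant bit (assumed mask). */
#define S2IO_BIT(loc)  (0x8000000000000000ULL >> (loc))

/* Hypothetical generic helper: position 0 is the least significant bit. */
#define LLBIT(nr)      (1ULL << (nr))

/* Compare both forms for every position; returns how many were checked. */
static int check_bit_equivalence(void)
{
	int loc;

	for (loc = 0; loc < 64; loc++)
		assert(S2IO_BIT(loc) == LLBIT(63 - loc));
	return loc;
}
```

If that holds, the driver would not need to #undef the generic macro at all.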

Richard Knutsson



Re: [PATCH] kdump/kexec: calculate note size at compile time

2007-03-28 Thread Simon Horman
On Thu, Mar 29, 2007 at 09:14:21AM +0530, Vivek Goyal wrote:
> On Thu, Mar 29, 2007 at 12:30:59PM +0900, Simon Horman wrote:
> > Hi,
> > 
> > this is a(nother) minor update to this patch. 
> > Explanation below.
> > 
> > -- 
> > Horms
> >   H: http://www.vergenet.net/~horms/
> >   W: http://www.valinux.co.jp/en/
> > 
> > [PATCH] kdump/kexec: calculate note size at compile time
> > 
> > Currently the size of the per-cpu region reserved to save crash
> > notes is set by the per-architecture value MAX_NOTE_BYTES. Which
> > in turn is currently set to 1024 on all supported architectures.
> > 
> > While testing ia64 I recently discovered that this value is
> > in fact too small. The particular setup I was using actually
> > needs 1172 bytes. This led to a very tedious failure mode
> > where the tail of one elf note would overwrite the head of
> > another if they ended up being allocated sequentially by kmalloc,
> > which was often the case.
> > 
> > It seems to me that a far better approach is to calculate the size
> > that the area needs to be. This patch does just that.
> > 
> > If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X)
> > is needed then this should be as easy as making MAX_NOTE_BYTES
> > larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice.
> > However, I think that the approach in this patch is a much more robust
> > idea.
> > 
> > Update I:
> > 
> >   Changed KEXEC_NOTE_HEAD_BYTES to KEXEC_NOTE_DESC_BYTES in line
> >   with the name of the relevant field in struct elf_note
> > 
> > Update II:
> > 
> >   * Use KEXEC_NOTE_NAME instead of "CORE" in kernel/kexec.c and
> > arch/ia64/kernel/crash.c just to be extra sure that the data
> > used to calculate the size, and the data stuffed into the reserved
> > area is the same.
> > 
> > Incidentally, the ia64 code really ought to use the generic code.
> > I am working on a patch for this. But it is not urgent.
> > 
> 
> Looks good. Another patch to make ia64 also use generic kexec code
> for note generation would be nice.

Thanks, I will make it so :-)

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



Re: [PATCH] kdump/kexec: calculate note size at compile time

2007-03-28 Thread Vivek Goyal
On Thu, Mar 29, 2007 at 12:30:59PM +0900, Simon Horman wrote:
> Hi,
> 
> this is a(nother) minor update to this patch. 
> Explanation below.
> 
> -- 
> Horms
>   H: http://www.vergenet.net/~horms/
>   W: http://www.valinux.co.jp/en/
> 
> [PATCH] kdump/kexec: calculate note size at compile time
> 
> Currently the size of the per-cpu region reserved to save crash
> notes is set by the per-architecture value MAX_NOTE_BYTES. Which
> in turn is currently set to 1024 on all supported architectures.
> 
> While testing ia64 I recently discovered that this value is
> in fact too small. The particular setup I was using actually
> needs 1172 bytes. This led to a very tedious failure mode
> where the tail of one elf note would overwrite the head of
> another if they ended up being allocated sequentially by kmalloc,
> which was often the case.
> 
> It seems to me that a far better approach is to calculate the size
> that the area needs to be. This patch does just that.
> 
> If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X)
> is needed then this should be as easy as making MAX_NOTE_BYTES
> larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice.
> However, I think that the approach in this patch is a much more robust
> idea.
> 
> Update I:
> 
>   Changed KEXEC_NOTE_HEAD_BYTES to KEXEC_NOTE_DESC_BYTES in line
>   with the name of the relevant field in struct elf_note
> 
> Update II:
> 
>   * Use KEXEC_NOTE_NAME instead of "CORE" in kernel/kexec.c and
>     arch/ia64/kernel/crash.c just to be extra sure that the data
>     used to calculate the size, and the data stuffed into the reserved
>     area is the same.
> 
>     Incidentally, the ia64 code really ought to use the generic code.
>     I am working on a patch for this. But it is not urgent.
> 

Looks good. Another patch to make ia64 also use generic kexec code
for note generation would be nice.

Thanks
Vivek


Re: [PATCH] kdump/kexec: calculate note size at compile time

2007-03-28 Thread Simon Horman
Hi,

this is a(nother) minor update to this patch. 
Explanation below.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

[PATCH] kdump/kexec: calculate note size at compile time

Currently the size of the per-cpu region reserved to save crash
notes is set by the per-architecture value MAX_NOTE_BYTES. Which
in turn is currently set to 1024 on all supported architectures.

While testing ia64 I recently discovered that this value is
in fact too small. The particular setup I was using actually
needs 1172 bytes. This led to a very tedious failure mode
where the tail of one elf note would overwrite the head of
another if they ended up being allocated sequentially by kmalloc,
which was often the case.

It seems to me that a far better approach is to calculate the size
that the area needs to be. This patch does just that.
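
For readers following along, the size calculation can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not the
patch itself: NOTE_ALIGN, struct note_head and note_bytes() are made-up names,
struct note_head merely stands in for struct elf_note (three 32-bit words),
and desc_size stands in for sizeof(struct elf_prstatus), which differs per
architecture. The second header is counted because the patch multiplies the
header size by two, presumably to leave room for a terminating empty note.

```c
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of a (a must be a power of two). */
#define NOTE_ALIGN(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical stand-in for struct elf_note: three 32-bit words. */
struct note_head {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

#define NOTE_NAME "CORE"

/*
 * Bytes needed for one note (header + padded name + padded descriptor)
 * plus a second header, as in the patch's KEXEC_NOTE_BYTES.
 * desc_size is an assumption standing in for sizeof(struct elf_prstatus).
 */
static size_t note_bytes(size_t desc_size)
{
	size_t head = NOTE_ALIGN(sizeof(struct note_head), 4);
	size_t name = NOTE_ALIGN(sizeof(NOTE_NAME), 4);	/* sizeof counts the NUL */
	size_t desc = NOTE_ALIGN(desc_size, 4);

	return 2 * head + name + desc;
}
```

The point of computing this from sizeof rather than a fixed constant is that
the buffer grows with the descriptor automatically, so a larger prstatus on
one architecture can no longer overflow a hard-coded 1024 bytes.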

If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X)
is needed then this should be as easy as making MAX_NOTE_BYTES
larger in arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice.
However, I think that the approach in this patch is a much more robust
idea.

Update I:

  Changed KEXEC_NOTE_HEAD_BYTES to KEXEC_NOTE_DESC_BYTES in line
  with the name of the relevant field in struct elf_note

Update II:

  * Use KEXEC_NOTE_NAME instead of "CORE" in kernel/kexec.c and
    arch/ia64/kernel/crash.c just to be extra sure that the data
    used to calculate the size, and the data stuffed into the reserved
    area is the same.

    Incidentally, the ia64 code really ought to use the generic code.
    I am working on a patch for this. But it is not urgent.

  * Added Ack from Vivek, which was actually for the update I version
    of the patch. If this is wrong, please tell me.

Acked-by:  Vivek Goyal <[EMAIL PROTECTED]>
Signed-off-by: Simon Horman <[EMAIL PROTECTED]>

 arch/ia64/kernel/crash.c    |    2 +-
 include/asm-arm/kexec.h     |    2 --
 include/asm-i386/kexec.h    |    2 --
 include/asm-ia64/kexec.h    |    2 --
 include/asm-mips/kexec.h    |    2 --
 include/asm-powerpc/kexec.h |    2 --
 include/asm-s390/kexec.h    |    2 --
 include/asm-sh/kexec.h      |    2 --
 include/asm-x86_64/kexec.h  |    2 --
 include/linux/kexec.h       |   11 ++-
 kernel/kexec.c              |    2 +-
 11 files changed, 12 insertions(+), 19 deletions(-)

Index: linux-2.6/include/asm-ia64/kexec.h
===
--- linux-2.6.orig/include/asm-ia64/kexec.h 2007-03-28 18:50:25.0 +0900
+++ linux-2.6/include/asm-ia64/kexec.h  2007-03-29 12:19:10.0 +0900
@@ -14,8 +14,6 @@
 /* The native architecture */
 #define KEXEC_ARCH KEXEC_ARCH_IA_64
 
-#define MAX_NOTE_BYTES 1024
-
 #define kexec_flush_icache_page(page) do { \
 unsigned long page_addr = (unsigned long)page_address(page); \
 flush_icache_range(page_addr, page_addr + PAGE_SIZE); \
Index: linux-2.6/include/linux/kexec.h
===
--- linux-2.6.orig/include/linux/kexec.h 2007-03-28 18:50:25.0 +0900
+++ linux-2.6/include/linux/kexec.h 2007-03-29 12:19:10.0 +0900
@@ -7,6 +7,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 /* Verify architecture specific macros are defined */
@@ -31,6 +33,13 @@
 #error KEXEC_ARCH not defined
 #endif
 
+#define KEXEC_NOTE_NAME "CORE"
+#define KEXEC_NOTE_HEAD_BYTES ALIGN(sizeof(struct elf_note), 4)
+#define KEXEC_NOTE_NAME_BYTES ALIGN(strlen(KEXEC_NOTE_NAME) + 1, 4)
+#define KEXEC_NOTE_DESC_BYTES ALIGN(sizeof(struct elf_prstatus), 4)
+#define KEXEC_NOTE_BYTES ( (KEXEC_NOTE_HEAD_BYTES * 2) + \
+  KEXEC_NOTE_NAME_BYTES + KEXEC_NOTE_DESC_BYTES )
+
 /*
  * This structure is used to hold the arguments that are used when loading
  * kernel binaries.
@@ -136,7 +145,7 @@
 /* Location of a reserved region to hold the crash kernel.
  */
 extern struct resource crashk_res;
-typedef u32 note_buf_t[MAX_NOTE_BYTES/4];
+typedef u32 note_buf_t[KEXEC_NOTE_BYTES/4];
 extern note_buf_t *crash_notes;
 
 
Index: linux-2.6/include/asm-arm/kexec.h
===
--- linux-2.6.orig/include/asm-arm/kexec.h  2007-03-28 18:50:25.0 +0900
+++ linux-2.6/include/asm-arm/kexec.h   2007-03-29 12:19:10.0 +0900
@@ -16,8 +16,6 @@
 
 #ifndef __ASSEMBLY__
 
-#define MAX_NOTE_BYTES 1024
-
 struct kimage;
 /* Provide a dummy definition to avoid build failures. */
 static inline void crash_setup_regs(struct pt_regs *newregs,
Index: linux-2.6/include/asm-i386/kexec.h
===
--- linux-2.6.orig/include/asm-i386/kexec.h 2007-03-28 18:50:25.0 +0900
+++ linux-2.6/include/asm-i386/kexec.h  2007-03-29 12:19:10.0 +0900
@@ -47,8 +47,6 @@
 /* The native architecture */
 #define KEXEC_ARCH KEXEC_ARCH_386
 
-#define MAX_NOTE_BYTES 1024
-
 /* CPU 

Re: [KJ][PATCH] BIT macro cleanup

2007-03-28 Thread Richard Knutsson

Alexey Dobriyan wrote:
> On Wed, Mar 28, 2007 at 09:03:09AM +0530, Milind Arun Choudhary wrote:
> > --- a/include/linux/bitops.h
> > +++ b/include/linux/bitops.h
> > @@ -8,6 +8,9 @@
> >   */
> >  #include 
> > 
> > +#define BIT(nr) (1UL << ((nr) % BITS_PER_LONG))
> 
> I think this would be a disaster because something like
> 
> 	BIT(123)
> 
> would not even generate a warning.

There was a discussion about this on KJ when BIT was first used with a
modulo operation. I said the same thing as you do now, but a big user
of BIT is the input subsystem, which defines its BIT as above. It was
also mentioned that the compiler can only catch the static errors; a
variable input can still break it at runtime.

Also, if we _really_ want to check the tree for such warnings, it is easy
to remove the modulo operation temporarily (and keep away from input/).

I'm not saying I like this, just that it is a choice between possible errors.
Richard Knutsson
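
To make the concern concrete, here is a small userspace sketch of the
behaviour being debated. BIT() is the definition from the quoted patch;
BITS_PER_LONG is recomputed from sizeof(long) here because the kernel
header is not available in userspace, and bit_wraps() is a hypothetical
helper added only for illustration.

```c
#include <limits.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG ((unsigned int)(sizeof(long) * CHAR_BIT))

/* The definition under discussion: the shift count wraps modulo the word size. */
#define BIT(nr) (1UL << ((nr) % BITS_PER_LONG))

/* Hypothetical helper: returns 1 when BIT(nr) would silently wrap. */
static int bit_wraps(unsigned int nr)
{
	return nr >= BITS_PER_LONG;
}
```

With this definition BIT(BITS_PER_LONG + 3) evaluates to the same value as
BIT(3), and the compiler has no reason to warn, which is Alexey's objection.
Without the modulo, a constant out-of-range shift count is something the
compiler can diagnose, but a variable one still is not, which is Richard's
counterpoint.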



Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Tejun Heo

Jeff Garzik wrote:
> AN is a generic concept that I feel will propagate elsewhere.

I think SCSI already has it or am I imagining things again?  :-)

> Though perhaps it should be in a 'capability_flags' file rather than a
> 'media_change_event' file.

IMHO, if it's genhd.capability_flags then the flag should be
MEDIA_CHANGE_NOTIFY, not ASYNC_NOTIFICATION, because AN itself doesn't
imply any specific event.  It's just a notification mechanism: for ATAPI
devices it means media change, for PMP it has a different meaning.  So I
think we need to export the processed meaning, not the specific mechanism,
to userland.


Thanks.

--
tejun


Re: Corrupt XFS -Filesystems on new Hardware and Kernel

2007-03-28 Thread Linda Walsh


Oliver Joa wrote:
> [...] reason or another, xfs has detected a corrupted on-disk inode format
> which it cannot recognize, and shuts down.


Oh, one other thing that may or may not apply in your case.
Does your SATA disk support write caching?  Does it support
something called a barrier function?  (I'm not real clear on all
the ways this can go wrong, but I believe barriers are supposed
to guarantee previous data has been fixed on disk, not just in the
write cache.)  If the SATA controller issues a reset, it may very well
purge the write cache.  Theoretically, I can think of a _possibility_
that the reset disk would purge the write cache and the barrier
indicator would tell xfs to resume writing.  From a recent thread
on the xfs list, it would appear this could be a "bad" thing (like
crossing the streams a la "Ghostbusters", but in a data-integrity
context).

Just a "shot in the dark" -- absent knowing anything specific
about your hardware or situation...

If that's the case, you might want to turn off write
caching, since when xfs thinks "barriers" work, it turns
off some "protection", which can enable a significant
speedup in some situations. As an aside, some disks, I gather,
may "claim" to support barriers, but really don't.  Xfs tries
to verify the barrier claim, but I don't know that a reset
issued to the disk will have deterministic behavior across
all manufacturers' disks.  A bunch of "coulds" and "maybes",
but just thinking off the top of my head...

Linda




Re: sky2 PHY setup

2007-03-28 Thread Rob Sims
On Fri, Mar 16, 2007 at 02:16:48PM -0700, Stephen Hemminger wrote:
> On Fri, 16 Mar 2007 14:36:45 -0600
> Rob Sims <[EMAIL PROTECTED]> wrote:

> > Are there some debug hooks that can be activated?  My sky2 stops
> > responding (very light load) about twice a day.  The netdev watchdog
> > notices after a while and is able to reactivate the interface:

> > Mar 15 13:28:12 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
> > Mar 15 13:28:12 btd kernel: sky2 eth0: tx timeout
> > Mar 15 13:28:12 btd kernel: sky2 eth0: transmit ring 458 .. 435 report=458 done=458
> > Mar 15 13:28:12 btd kernel: sky2 eth0: disabling interface
> > Mar 15 13:28:12 btd kernel: sky2 eth0: enabling interface
> > Mar 15 13:28:12 btd kernel: sky2 eth0: ram buffer 48K
> > Mar 15 13:28:15 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
> 
> Use ethtool -S to see if there are any pause frames, etc. See if frames are
> still making it into PHY statistics but not being received.
> 
> Use ethtool -d to dump registers. Need current version of ethtool with decode
> logic.
> 
> Then look for things like is Ram buffer read/write pointer changing?
> 
> Is GMAC stuck in pause:
> 
> Normal is:
>   GMAC 1
>   Status   0x5010  (see GM_GPSR_XXX in sky2.h)
>   Control  0x1800
> 
> Stuck is
>   GMAC 1
>   Status   0x5810 (or 0x5A10)

First, here's the described hang in action, on the Core2 Duo on a 1Gb
hub:
GMAC 1 Status/Control remains at 0x5010/0x1800 until module is removed.
Read/write buffer pointers are changing.  Full ethtool output in
http://www.robsims.com/sky2.netmon.log.gz

This machine was also having major throughput problems - 17 kB/s.
Rebooting brought it to ~ 20 MB/s.  Booting into a kernel with the
proprietary sk98lin kernel module showed ~ 80MB/s.  Finally, returning
to sky2 gave 117 MB/s.  Tests run using netcat, dd, /dev/zero, and
/dev/null, transmitting from the problem box to an e1000 via a Netgear
GS108.  No hangs were observed during the "load test."

I also had a hang on a Pentium 4 w/sky2, 100Mb/s hub.  I neglected to
try removing and re-inserting the module before rebooting.
GMAC 1
Status   0xF004
Control  0x1800

RAMbuffer pointers not moving, Read buffer Read pointer != Write pointer.
http://www.robsims.com/sky2.ethtooldumps.tgz
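
As a reading aid for the status values traded in this thread, the check
Stephen describes can be sketched as below. Everything here is derived only
from the numbers quoted above: the mask is simply the bit that differs
between the "normal" (0x5010) and "stuck" (0x5810) readings. The names
STATUS_NORMAL, STATUS_STUCK, STATUS_PAUSE_MASK and gmac_in_pause() are made
up for this sketch; which GM_GPSR_* constant in sky2.h this bit corresponds
to is not confirmed here.

```c
#include <stdint.h>

/* GMAC status values quoted in the thread. */
#define STATUS_NORMAL 0x5010u
#define STATUS_STUCK  0x5810u

/* The single bit that differs between the two readings (0x0800). */
#define STATUS_PAUSE_MASK (STATUS_NORMAL ^ STATUS_STUCK)

/* Returns 1 if the status word has the "stuck in pause" bit set. */
static int gmac_in_pause(uint16_t status)
{
	return (status & STATUS_PAUSE_MASK) != 0;
}
```

By this test neither 0x5010 (the Core2 Duo hang) nor 0xF004 (the Pentium 4
hang) has that bit set, so neither reading matches the "stuck in pause"
pattern described earlier.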

Thanks for looking at this.
-- 
Rob




Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Jeff Garzik

Tejun Heo wrote:
> Jeff Garzik wrote:
> > Kristen Carlson Accardi wrote:
> > > Allow user space to determine if an ATAPI device supports
> > > async notification (AN) of media changes.  This is done by
> > > adding a new sysfs file "async_notification" to genhd.
> > > If the file reads 1, then the device supports async notification.  If
> > > the file reads 0, it does not. A flag is set in the generic disk to
> > > indicate whether or not AN is supported.  This flag is set by the SCSI
> > > subsystem when it registers with add_disk.  The SCSI
> > > system gets information from libata on whether the
> > > device supports AN during dev_configure.
> > > Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
> > 
> > 3) I would make the contents of 'media_change_events' be a list of
> > flags, rather than a boolean.  Thus, when AN is present,
> > media_change_events would return "AN\n".  It would return "\n" (no
> > flags) when AN is absent.  This permits future expansion of this
> > capabilities reporting variable.
> 
> I'm not sure about this.  AN is kind of a specific term for ATA while
> media change event is generic.  So, I think the original approach is
> okay.  No matter how the actual thing is implemented, it's the same
> media change event and as long as the event delivery interface is the
> same, the upper layer shouldn't care about how it's done.

AN is a generic concept that I feel will propagate elsewhere.

Though perhaps it should be in a 'capability_flags' file rather than a
'media_change_event' file.


Jeff
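
The flag-list format suggested above ("AN\n" when AN is present, "\n" when it
is absent) can be sketched in userspace C as follows. This is not code from
the patch series: MCE_FLAG_AN and format_media_change_events() are
hypothetical names, and only the one flag named in the thread is handled.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical flag bit; "AN" is the only flag named in the thread. */
#define MCE_FLAG_AN (1u << 0)

/*
 * Write the flag list, newline terminated, into buf.  An empty flag set
 * yields just "\n".  Returns the number of bytes written (not counting
 * the terminating NUL), or -1 if buf is too small.
 */
static int format_media_change_events(unsigned int flags, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s\n", (flags & MCE_FLAG_AN) ? "AN" : "");

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The attraction of a list over a boolean is that a reader which only knows
"AN" keeps working if more names are added later, whereas a 0/1 file has
no room to grow.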





Re: [patch 3/3] libata: handle AN interrupt

2007-03-28 Thread Tejun Heo

Kristen Carlson Accardi wrote:
> When we get an SDB FIS with the 'N' bit set, we should send
> an event to user space to indicate that there has been a
> media change.  The ahci host controller will send the
> event via KOBJ_CHANGE uevent.
> 
> Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
> 
> +static void async_notify_thread(struct work_struct *work)
> +{
> +	struct ata_device *atadev =
> +		container_of(work, struct ata_device, async_notify);
> +
> +	/*
> +	 * TBD - who should send this event?  I couldn't find an
> +	 * easy way to map an ata_device to a genhd device, so
> +	 * decided maybe the ata host should send the event and
> +	 * allow user space to figure out what happened?
> +	 */
> +	kobject_uevent(&atadev->ap->host->dev->kobj, KOBJ_CHANGE);
> +}

I don't think this is right.  If you're gonna make the media_change_event
capability generic, you gotta make event delivery generic too.  You can
make it a genhd event and make genhd supply the interface function, say,
genhd_notify_media_change(), which is then forwarded by the SCSI layer.


Thanks.

--
tejun


Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Tejun Heo

Jeff Garzik wrote:
> Kristen Carlson Accardi wrote:
> > Allow user space to determine if an ATAPI device supports
> > async notification (AN) of media changes.  This is done by
> > adding a new sysfs file "async_notification" to genhd.
> > If the file reads 1, then the device supports async notification.  If
> > the file reads 0, it does not.
> > A flag is set in the generic disk to indicate whether
> > or not AN is supported.  This flag is set by the SCSI
> > subsystem when it registers with add_disk.  The SCSI
> > system gets information from libata on whether the
> > device supports AN during dev_configure.
> > Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
> 
> 3) I would make the contents of 'media_change_events' be a list of
> flags, rather than a boolean.  Thus, when AN is present,
> media_change_events would return "AN\n".  It would return "\n" (no
> flags) when AN is absent.  This permits future expansion of this
> capabilities reporting variable.

I'm not sure about this.  AN is kind of a specific term for ATA while
media change event is generic.  So, I think the original approach is
okay.  No matter how the actual thing is implemented, it's the same
media change event and as long as the event delivery interface is the
same, the upper layer shouldn't care about how it's done.


Thanks.

--
tejun


Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Tejun Heo

Kristen Carlson Accardi wrote:
> Allow user space to determine if an ATAPI device supports
> async notification (AN) of media changes.  This is done by
> adding a new sysfs file "async_notification" to genhd.
> If the file reads 1, then the device supports async
> notification.  If the file reads 0, it does not.
> 
> A flag is set in the generic disk to indicate whether
> or not AN is supported.  This flag is set by the SCSI
> subsystem when it registers with add_disk.  The SCSI
> system gets information from libata on whether the
> device supports AN during dev_configure.

I'm not sure whether this should be in the generic block layer or in libata
proper.  The libata sysfs hierarchy isn't there yet but is scheduled to be
added soon.  Async notification of media change is a generic event for any
block device with removable media, so I guess it can belong to the generic
layer.  BTW, I think you also need to forward the flag in sd - a disk
device can be removable too.  And please cc linux-scsi@vger.kernel.org
to get the SCSI part reviewed.


Thanks.

--
tejun


Re: [patch 1/3] libata: check for AN support

2007-03-28 Thread Jeff Garzik

Tejun Heo wrote:
> Kristen Carlson Accardi wrote:
> > Check to see if an ATAPI device supports Asynchronous Notification.
> > If so, enable it.
> 
> Supporting AN needs a host interrupt handler change, so I think we need
> a host-supports-AN flag; otherwise, we might end up with screaming
> interrupts in the worst case.

Quite so.  Lacking a host flag, we need to know how each and every
controller behaves when AN is activated (and supported by the device).
I'm willing to bet some of the first-gen SATA controllers' ASIC state
machines croak when AN is activated.


Jeff





Re: [patch 1/3] libata: check for AN support

2007-03-28 Thread Tejun Heo

Kristen Carlson Accardi wrote:
> Check to see if an ATAPI device supports Asynchronous Notification.
> If so, enable it.

Supporting AN needs a host interrupt handler change, so I think we need
a host-supports-AN flag; otherwise, we might end up with screaming
interrupts in the worst case.


--
tejun


Re: [rfc][patch] queued spinlocks (i386)

2007-03-28 Thread Nick Piggin
On Wed, Mar 28, 2007 at 03:00:21PM -0700, Davide Libenzi wrote:
> On Wed, 28 Mar 2007, Davide Libenzi wrote:
> 
> > The method you propose is otherwise called "Ticket Lock":
> > 
> > http://en.wikipedia.org/wiki/Ticket_lock
> > http://www.cs.rochester.edu/research/synchronization/pseudocode/ss.html#ticket
> 
> The prior art for this work dates to 1991:
> 
> http://www.cs.rochester.edu/u/scott/papers/1991_TOCS_synch.pdf
> 
> So I would not worry too much about patents here. At least W2K MS ones ;)
> What I would worry though, is to add another class of locks. There's no 

No, as you see from my patch I just change the spinlock implementation
to a queued one. I agree it doesn't make sense to add a new type of lock.


> reason why Ticket Locks would perform worse than our spinlock, in both 
> contended and not-contended case, AFAICS. And they have a nice FIFO 
> behaviour.

In most cases, no. For the uncontended case they should be about the
same. They have the same spinning behaviour. However there is a little
window where they might be a bit slower I think... actually perhaps I'm
wrong!

Currently if you have 4 CPUs spinning and the lock is released, all 4
CPU cachelines will be invalidated, then they will be loaded again, and
found to be 0, so they all try to atomic_dec_return the counter, each
one invalidating others' cachelines. 1 gets through.

With my queued locks, all 4 cachelines are invalidated and loaded, but
only one will be allowed to proceed, and there are 0 atomic operations
or stores of any kind.

So I take that back: our current spinlocks have a worse thundering herd
behaviour under contention than my queued ones. So I'll definitely
push the patch through.
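
For anyone who wants to see the behaviour described above in code, here is a
minimal userspace ticket lock using C11 atomics. It is only a sketch of the
general technique and not the i386 patch under discussion; the struct layout
and the names ticket_lock(), ticket_unlock() and TICKET_LOCK_INIT are
assumptions made for this example.

```c
#include <stdatomic.h>

struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently allowed to hold the lock */
};

#define TICKET_LOCK_INIT { 0, 0 }

static void ticket_lock(struct ticket_lock *l)
{
	/* One atomic op per acquisition, however many CPUs are waiting. */
	unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
						    memory_order_relaxed);

	/* Waiters only read; nobody stores to the line while spinning. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
		;
}

static void ticket_unlock(struct ticket_lock *l)
{
	/* Only the holder writes owner, so load + store is sufficient. */
	unsigned int cur = atomic_load_explicit(&l->owner,
						memory_order_relaxed);

	atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

On release every waiter reloads owner, but only the one whose ticket matches
proceeds, and none of them issues an atomic read-modify-write. That is the
difference from the "everyone retries the atomic decrement" behaviour
described for the current spinlocks, and it is also where the FIFO ordering
comes from.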


Re: Inlining can be _very_bad...

2007-03-28 Thread Benjamin LaHaise
On Thu, Mar 29, 2007 at 01:18:38AM +0200, J.A. Magallón wrote:
> It looks like it is updating the stack on each iteration... This is -march=opteron
> code, the -march=pentium4 is similar. Same behaviour with gcc3 and gcc4.
> 
> tst.c and Makefile attached.
> 
> Nice, isn't it ? Please, probe where is my fault...

Yes, gcc sucks in its handling of large return values, news at 11.  I have 
several outstanding bugs on cases where gcc could keep things in registers 
but doesn't.

That said, it tends to do much better on plain integer code, as that is 
what it gets tuned for.  Do NOT propagate the blanket myth that inlining is 
a bad thing.  It is very useful for small functions where the overhead 
associated with call/ret sequences and register clobbers overshadows the 
work being done.  The call/ret updates alone can make a big difference when 
there are lots of other (more useful) memory transactions to complete.  Take 
a look at things like the notifier hooks for an example of something that 
does far too little work per function call and should really be inlined.

-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.


Re: [rfc][patch] queued spinlocks (i386)

2007-03-28 Thread Nick Piggin
On Wed, Mar 28, 2007 at 12:26:57PM -0700, Davide Libenzi wrote:
> On Wed, 28 Mar 2007, Nick Piggin wrote:
> 
> > On Sat, Mar 24, 2007 at 06:29:59PM +0100, Ingo Molnar wrote:
> > > 
> > > * Nikita Danilov <[EMAIL PROTECTED]> wrote:
> > > 
> > > > Indeed, this technique is very well known. E.g., 
> > > > http://citeseer.ist.psu.edu/anderson01sharedmemory.html has a whole 
> > > > section (3. Local-spin Algorithms) on them, citing papers from
> > > > 1990 onward.
> > > 
> > > that is a cool reference! So i'd suggest to do (redo?) the patch based 
> > > on those concepts and that terminology and not use 'queued spinlocks' 
> > > that are commonly associated with MS's stuff. And as a result the 
> > > contended case would be optimized some more via local-spin algorithms. 
> > > (which is not a key thing for us, but which would be nice to have 
> > > nevertheless)
> > 
> > Firstly, the terminology in that paper _is_ "queue lock", which isn't
> > really surprising. I don't really know or care about what MS calls their
> > locks, but I'd suggest that their queued spinlock is probably named in
> > reference to its queueing property rather than its local spin property.
> 
> The method you propose is otherwise called "Ticket Lock":
> 
> http://en.wikipedia.org/wiki/Ticket_lock
> http://www.cs.rochester.edu/research/synchronization/pseudocode/ss.html#ticket

Yes, a ticket-based FIFO queue isn't new... I think we have a lot of
examples already in the kernel. Using them to implement queue locks
obviously isn't new either.

I don't think we'd have to worry about patents.


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Benjamin LaHaise
On Wed, Mar 28, 2007 at 02:05:54PM -0700, Christoph Lameter wrote:
> Tried this also on x86_64 with an enhanced quicklist patch that also deals 
> with ptes (at the price of not guaranteeing the free after the tlb flush):
...
> Seems that there is a slight benefit but it's also barely above noise 
> level.

You're not running a test that will show any benefit in this area.  Run 
some heavy shell scripts or lmbench's fork() and exec() latency tests 
to get real numbers.

-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.


[ANNOUNCE] GIT 1.5.0.6

2007-03-28 Thread Junio C Hamano
The latest maintenance release GIT 1.5.0.6 is available at the
usual places:

  http://www.kernel.org/pub/software/scm/git/

  git-1.5.0.6.tar.{gz,bz2}  (tarball)
  git-htmldocs-1.5.0.6.tar.{gz,bz2} (preformatted docs)
  git-manpages-1.5.0.6.tar.{gz,bz2} (preformatted docs)
  RPMS/$arch/git-*-1.5.0.6-1.$arch.rpm  (RPM)

GIT v1.5.0.6 Release Notes
==========================

Fixes since v1.5.0.5
--------------------

* Bugfixes

  - a handful of small fixes to gitweb.

  - build procedure for user-manual is fixed not to require locally
    installed stylesheets.

  - "git commit $paths" on paths whose earlier contents were
    already updated in the index was failing out.

* Documentation

  - user-manual has better cross references.

  - gitweb installation/deployment procedure is now documented.



Changes since v1.5.0.5 are as follows:

J. Bruce Fields (5):
  user-manual: run xsltproc without --nonet option
  user-manual: Use def_ instead of ref_ for glossary references.
  glossary: stop generating automatically
  glossary: clean up cross-references
  user-manual: introduce "branch" and "branch head" differently

Jakub Narebski (4):
  gitweb: Fix "next" link in commit view
  gitweb: Don't escape attributes in CGI.pm HTML methods
  gitweb: Fix not marking signoff lines in "log" view
  gitweb: Add some installation notes in gitweb/INSTALL

Jeff King (1):
  commit: fix pretty-printing of messages with "\nencoding "

Jim Meyering (1):
  user-manual.txt: fix a tiny typo.

Johannes Schindelin (1):
  t4118: be nice to non-GNU sed

Junio C Hamano (2):
  git-commit: "read-tree -m HEAD" is not the right way to read-tree quickly
  GIT 1.5.0.6

Li Yang (1):
  gitweb: Change to use explicitly function call cgi->escapHTML()

Michael S. Tsirkin (1):
  fix typo in git-am manpage

Peter Eriksen (1):
  Documentation/pack-format.txt: Clear up description of types.




Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Christoph Lameter
On Wed, 28 Mar 2007, William Lee Irwin III wrote:

> NIH == "Not Invented Here." Basically a sort of idea theft, often used
> to grab credit for patches. You're not the one involved there. That was
> a digression. One could say, though, that a solution to the slab issues
> is to NIH slab allocators e.g. via quicklist.h/quicklist.c without the
> negative connotation.

Oh. The quicklists were actually taken from existing IA64 code. Not my idea 
either. I am not wedded to any solution and I was certainly not intending
to abscond with your idea. Would have been difficult given that there was 
a signoff line with your name on it.

> On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote:
> > We certainly see even from the rudimentary tests that I have done
> > that the limited pgd, pmd caching has some effect. Could we please
> > see your local patches? And I guess that you must have some sort of
> > benchmark that you run to test these?
> 
> Short answer: No.
> 
> Long answer: Most of the local patches are not likely to be of interest
> to the world at large. The ones I probably don't mind mentioning so
> much are things like ports of ipt_TARPIT.c to -CURRENT, support for
> mmap() of /proc/profile, things to make the notsc boot parameter
> actually do what you'd expect it to do instead of the kernel ignoring
> the option when you actually need it and mucking with the TSC behind
> your back, and so on. There are also things I'd rather keep under wraps
> so they don't mysteriously appear on lkml a few years later posted by
> someone else without any attribution to me (i.e. the NIH's that bother
> me). I've not got any of them ported to current mainline anyway, and
> some data loss from fried disks seems to have eaten most/all of the
> post-2.6.0 revisions of these patches anyway, though I've got compiled
> kernels with them on various kernel versions between then and 2.6.10
> (not to say that's any impediment to my hammering out fresh ports).

Ummm. So nothing concrete on the performance issues that we 
are considering here? We are talking about something that was lost
a couple of years ago? There are certainly other people who will have the 
same ideas given enough time. Software and patches age like groceries. 
Hiding them will just make them wither away.


Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-28 Thread Blaisorblade
On Thursday 29 March 2007, Blaisorblade wrote:
> On Wednesday 28 March 2007, Jeff Dike wrote:
> > [ This patch needs to get into 2.6.21, as it fixes a serious bug
> > introduced soon after 2.6.20 ]
> >
> > Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices
> > queues and locks, which was fine as far as it went, but left in place
> > a global which controlled access to submitting requests to the host.
> > This should have been made per-device as well, since it causes I/O
> > hangs when multiple block devices are in use.
> >
> > This patch fixes that by replacing the global with an activity flag in
> > the device structure in order to tell whether the queue is currently
> > being run.
>
> Finally that variable has an understandable name. However in a mail from
> Jens Axboe, titled:
> "Re: [uml-devel] [PATCH 06/11] uml ubd driver: ubd_io_lock usage fixup",
> with Date: Mon, 30 Oct 2006 09:26:48 +0100, he suggested removing this flag
> altogether, so we may explore this for the future:
> > > Add some comments about requirements for ubd_io_lock and expand its
> > > use.
> > >
> > > When an irq signals that the "controller" (i.e. another thread on the
> > > host, which does the actual requests and is the only one blocked on I/O
> > > on the host) has done some work, we call again the request function
> > > ourselves (do_ubd_request).
> > >
> > > We now do that with ubd_io_lock held - that's useful to protect against
> > > concurrent calls to elv_next_request and so on.
> >
> > Not only useful, required, as I think I complained about a year or more
> > ago :-)
> >
> > > XXX: Maybe we shouldn't call at all the request function. Input needed
> > > on this. Are we supposed to plug and unplug the queue? That code
> > > "indirectly" does that by setting a flag, called do_ubd, which makes
> > > the request function return (it's a residual of 2.4 block layer
> > > interface).
> >
> > Sometimes you need to. I'd probably just remove the do_ubd check and
> > always recall the request function when handling completions, it's
> > easier and safe.
>
> Anyway, the main speedups to do on the UBD driver are:
> * implement write barriers (so much less fsync) - this is performance
> killer n.1
>
> * possibly to use the new 2.6 request layout with scatter/gather I/O, and
> vectorized I/O on the host
> * while at vectorizing I/O using async I/O
>
> * to avoid passing requests on pipes (n.2) - on fast disk I/O becomes
> cpu-bound.
> To make a different but related example, with a SpeedScale laptop, it's
> interesting to double CPU frequency and observe tuntap speed double too.
> (with 1GHz I get on TCP numbers like 150 Mbit/s - 100 Mbit/s, depending
> whether UML transmits or receives data; with 2GHz double rates).
> Update: I now get 150Mbit / 200Mbit (Uml receives/Uml sends) at 1GHz, and
> still the double at 2GHz.
> This is a different UML though.
>
> * using futexes instead of pipes for synchronization (required for previous
> one).

I forgot one thing: remember ubd=mmap? Something like that could have been 
done using MAP_PRIVATE, so that write had still to be called explicitly but 
unchanged data was shared with the host.

Once a page gets dirty but is then cleaned, sharing it back is difficult - but 
even without that good savings could be achievable. That's to explore for the 
very future though.
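
The MAP_PRIVATE behaviour described above can be illustrated outside UML with a small user-space sketch. It is not UBD code: the temporary file, its path, and the function name private_mapping_demo are made up for illustration. It maps a file copy-on-write, dirties the mapping, checks that the file itself is still unchanged, and only then pushes the data back with an explicit pwrite().

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Illustrative sketch only (hypothetical helper, not part of the UBD driver).
 * Returns 0 if a store through a MAP_PRIVATE mapping left the backing file
 * untouched until an explicit pwrite(), and -1 on any failure.
 */
int private_mapping_demo(void)
{
	char path[] = "/tmp/ubd_cow_XXXXXX";	/* hypothetical backing file */
	char on_disk[4];
	char *map;
	int result = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);				/* file lives on through fd only */
	if (write(fd, "AAAA", 4) != 4)
		goto out;

	/* Clean pages of this mapping are shared with the page cache. */
	map = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	map[0] = 'B';				/* dirties a private copy only */
	if (pread(fd, on_disk, 4, 0) != 4 || memcmp(on_disk, "AAAA", 4) != 0)
		goto unmap;			/* file must still be unchanged */

	if (pwrite(fd, map, 4, 0) != 4)		/* the explicit write-back step */
		goto unmap;
	if (pread(fd, on_disk, 4, 0) != 4 || memcmp(on_disk, "BAAA", 4) != 0)
		goto unmap;
	result = 0;
unmap:
	munmap(map, 4);
out:
	close(fd);
	return result;
}
```

Once the page has been dirtied it stays a private copy, which is the "sharing it back is difficult" part; the sketch makes no attempt to address that.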
-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade


Re: [PATCH] Add support for deferrable timers (respun-Mar28)

2007-03-28 Thread Venki Pallipadi
On Wed, Mar 28, 2007 at 05:01:59PM -0700, Andrew Morton wrote:
> On Wed, 28 Mar 2007 16:00:21 -0700
> Venki Pallipadi <[EMAIL PROTECTED]> wrote:
> 
> > Please drop the patch you included yesterday and two incremental patches and
> > use the patch below.
> 
> As you saw, I went and turned it into an incremental patch again.  It makes
> it easier to see what changed, but harder to see the whole thing.
> 
> > Introduce a new flag for timers - deferrable:
> 
> OK, but there's nothing in-kernel which actually uses this.
> 
> It would be good to identify some timer users which can be switched over (as
> many as possible, really) so this thing actually gets some runtime testing.

ondemand is the biggest offender and the patch below reduces the number of
interrupts by 50% or more (depending on HZ) on different test systems here.

Yes. There are quite a few other timers inside kernel that can be
migrated. I will use timer_stats and track others and send in the patches
soon.

Thanks,
Venki


--

Add a new deferrable delayed work init. This can be used to schedule work
that is 'unimportant' while the CPU is idle and can be run later, when the
CPU eventually comes out of idle.

Use this init in cpufreq ondemand governor.
 
Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: new/drivers/cpufreq/cpufreq_ondemand.c
===
--- new.orig/drivers/cpufreq/cpufreq_ondemand.c 2007-03-28 10:03:21.0 -0800
+++ new/drivers/cpufreq/cpufreq_ondemand.c  2007-03-28 10:05:44.0 -0800
@@ -470,7 +470,7 @@
dbs_info->enable = 1;
ondemand_powersave_bias_init();
dbs_info->sample_type = DBS_NORMAL_SAMPLE;
-   INIT_DELAYED_WORK(&dbs_info->work, do_dbs_timer);
+   INIT_DELAYED_WORK_DEFERRABLE(&dbs_info->work, do_dbs_timer);
queue_delayed_work_on(dbs_info->cpu, kondemand_wq, &dbs_info->work,
  delay);
 }
Index: new/include/linux/workqueue.h
===
--- new.orig/include/linux/workqueue.h  2007-03-28 10:03:21.0 -0800
+++ new/include/linux/workqueue.h   2007-03-28 10:05:44.0 -0800
@@ -89,6 +89,12 @@
init_timer(&(_work)->timer);\
} while (0)
 
+#define INIT_DELAYED_WORK_DEFERRABLE(_work, _func) \
+   do {\
+   INIT_WORK(&(_work)->work, (_func)); \
+   init_timer_deferrable(&(_work)->timer); \
+   } while (0)
+
 /**
  * work_pending - Find out whether a work item is currently pending
  * @work: The work item in question
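
The effect of marking a timer deferrable can be sketched with a small user-space model. This is only an illustration of the idea described in the changelog above, not the kernel implementation: struct sim_timer, NO_WAKEUP and both functions are hypothetical names, and tick wraparound is ignored. An idle CPU picks its next wakeup from the non-deferrable timers only; deferrable timers that are already overdue simply run whenever the CPU wakes up for some other reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a timer; not the kernel's struct timer_list. */
struct sim_timer {
	unsigned long expires;	/* tick at which the timer is due */
	bool deferrable;	/* true: must not wake an idle CPU */
};

#define NO_WAKEUP (~0UL)	/* no timer requires the CPU to wake up */

/* Next tick at which an idle CPU has to wake up; deferrable timers are skipped. */
unsigned long next_idle_wakeup(const struct sim_timer *t, size_t n)
{
	unsigned long next = NO_WAKEUP;

	assert(t != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (t[i].deferrable)
			continue;
		if (t[i].expires < next)
			next = t[i].expires;
	}
	return next;
}

/* Timers of either kind that are due once the CPU is running at tick 'now'. */
size_t timers_due(const struct sim_timer *t, size_t n, unsigned long now)
{
	size_t due = 0;

	assert(t != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (t[i].expires <= now)
			due++;
	return due;
}
```

With one deferrable timer due at tick 10 and one normal timer due at tick 50, the model wakes the CPU once, at tick 50, and runs both timers then.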


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread William Lee Irwin III
On Wed, Mar 28, 2007 at 02:38:55PM -0700, Christoph Lameter wrote:
>>> No that was described in the patch. Quote:
>>> "i386 only provides support for caching constructed pgd and pmds. These 
>>> are comparatively rare to ptes so it is no surprise that the current 
>>> approach has only minimal effect. "

On Thu, Mar 29, 2007 at 01:28:59AM +0100, Alan Cox wrote:
> Whatever it was originally for and public or not, the above isn't true
> for some non Intel products...

Sorry if the descriptions here are misleading. This is basically an
attempt to have the kernel keep preconstructed pagetables around so
that the bitblitting hits need not be repetitively taken during fork()
and faults, where counterarguments revolve around whether this is
actually a hit at all and whether it's significant. It's not related to
the transparent quasi-ASID/ASN affairs AMD has based on %cr3 contents.


-- wli


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread William Lee Irwin III
On Wed, 28 Mar 2007, William Lee Irwin III wrote:
>> As far as kernel compiles being relevant to anything besides
>> potentially optimizing a particular major benchmark using gcc as one
>> of its components... yeah, right. It's too macro to be a microbenchmark
>> of anything and too micro to be pertinent to any meaningful
>> macrobenchmark such as those from major benchmark publishers (who can't
>> be named for trademark/etc. reasons). Hasn't it been at least 5 years
>> since people figured out kernel compiles were complete bulls**t as
>> benchmarks along with dbench for other reasons and several others? If
>> not, I don't know why I bother with this kernel at all.

On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote:
> All benchmarks have their specific drawbacks. I personally like to do
> code review and see what cachelines are touched but that is basically
> imagining what a cpu does. One's thinking may be led astray.

Well, the kernel compiles are just terrible at everything they could
plausibly be used to measure. I could, in principle, develop a benchmark
that simulates a forking server that does many things similar to what a
kernel compile is meant to measure without a number of its stupidities,
but I've got enough to do already.


On Wed, 28 Mar 2007, William Lee Irwin III wrote:
>> Even so, I already did this and am done with it. It's not like I'm
>> not carrying around numerous patches I know will never be merged all
>> the time anyway. If you want to back it all out so badly, just do it
>> and stop bothering me about it, and I'll merely continue maintaining my
>> local patches without ever posting them as I have been for years. I'm
>> not at all happy with the NIH situation, either, not that I'm at such a
>> loss for ideas to need to contest every petty NIH that flies past.

On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote:
> What is NIH? My main concern is to get the use of page struct fields of 
> the slab removed. We have to do special things to page sized allocs 
> because of these page struct uses. F.e. the private field is used by 
> compound pages and if any of the slabs allocate a higher order page that 
> field will be in use.

NIH == "Not Invented Here." Basically a sort of idea theft, often used
to grab credit for patches. You're not the one involved there. That was
a digression. One could say, though, that a solution to the slab issues
is to NIH slab allocators e.g. via quicklist.h/quicklist.c without the
negative connotation.


On Wed, Mar 28, 2007 at 04:44:01PM -0700, Christoph Lameter wrote:
> We certainly see even from the rudimentary tests that I have done
> that the limited pgd, pmd caching has some effect. Could we please
> see your local patches? And I guess that you must have some sort of
> benchmark that you run to test these?

Short answer: No.

Long answer: Most of the local patches are not likely to be of interest
to the world at large. The ones I probably don't mind mentioning so
much are things like ports of ipt_TARPIT.c to -CURRENT, support for
mmap() of /proc/profile, things to make the notsc boot parameter
actually do what you'd expect it to do instead of the kernel ignoring
the option when you actually need it and mucking with the TSC behind
your back, and so on. There are also things I'd rather keep under wraps
so they don't mysteriously appear on lkml a few years later posted by
someone else without any attribution to me (i.e. the NIH's that bother
me). I've not got any of them ported to current mainline anyway, and
some data loss from fried disks seems to have eaten most/all of the
post-2.6.0 revisions of these patches anyway, though I've got compiled
kernels with them on various kernel versions between then and 2.6.10
(not to say that's any impediment to my hammering out fresh ports).


-- wli


Re: new sysfs layout and ethernet device names

2007-03-28 Thread Bill Nottingham
Greg KH ([EMAIL PROTECTED]) said: 
> If you follow the rules in Documentation/ABI/testing/sysfs-class your
> program will not have any problems.

Oh, of *course*. We add interfaces and then claim years later,
after code has been written, "Oh, you shouldn't be using that!" in
documentation. Meanwhile, such code using the old interface will still
a) continue to compile b) continue to run without any sort of warnings.

If interfaces have to change, so be it. But changing the rules for
using them years after it's implemented and then claiming "you didn't
read the instructions" is pretty lame.

Bill


Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-28 Thread Blaisorblade
On Wednesday 28 March 2007, Jeff Dike wrote:
> [ This patch needs to get into 2.6.21, as it fixes a serious bug
> introduced soon after 2.6.20 ]
>
> Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices
> queues and locks, which was fine as far as it went, but left in place
> a global which controlled access to submitting requests to the host.
> This should have been made per-device as well, since it causes I/O
> hangs when multiple block devices are in use.
>
> This patch fixes that by replacing the global with an activity flag in
> the device structure in order to tell whether the queue is currently
> being run.
Finally that variable has an understandable name. However in a mail from Jens 
Axboe, titled:
"Re: [uml-devel] [PATCH 06/11] uml ubd driver: ubd_io_lock usage fixup" , with 
Date: Mon, 30 Oct 2006 09:26:48 +0100, he suggested removing this flag 
altogether, so we may explore this for the future:

> > Add some comments about requirements for ubd_io_lock and expand its use.
> >
> > When an irq signals that the "controller" (i.e. another thread on the
> > host, which does the actual requests and is the only one blocked on I/O
> > on the host) has done some work, we call again the request function
> > ourselves (do_ubd_request).
> >
> > We now do that with ubd_io_lock held - that's useful to protect against
> > concurrent calls to elv_next_request and so on.
>
> Not only useful, required, as I think I complained about a year or more
> ago :-)
>
> > XXX: Maybe we shouldn't call at all the request function. Input needed on
> >  this. Are we supposed to plug and unplug the queue? That code
> > "indirectly" does that by setting a flag, called do_ubd, which makes the
> > request function return (it's a residual of 2.4 block layer interface).
>
> Sometimes you need to. I'd probably just remove the do_ubd check and
> always recall the request function when handling completions, it's
> easier and safe.

Anyway, the main speedups to do on the UBD driver are:
* implement write barriers (so much less fsync) - this is performance killer 
n.1

* possibly to use the new 2.6 request layout with scatter/gather I/O, and 
vectorized I/O on the host
* while at vectorizing I/O using async I/O

* to avoid passing requests on pipes (n.2) - on fast disk I/O becomes 
cpu-bound.
To make a different but related example, with a SpeedScale laptop, it's 
interesting to double CPU frequency and observe tuntap speed double too. 
(with 1GHz I get on TCP numbers like 150 Mbit/s - 100 Mbit/s, depending 
whether UML transmits or receives data; with 2GHz double rates).
Update: I now get 150Mbit / 200Mbit (Uml receives/Uml sends) at 1GHz, and 
still the double at 2GHz.
This is a different UML though.

* using futexes instead of pipes for synchronization (required for previous 
one).

-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade


Re: Corrupt XFS -Filesystems on new Hardware and Kernel

2007-03-28 Thread Linda Walsh

Oliver Joa wrote:
[...] reason or another, xfs has detected a corrupted on-disk inode format 
which it cannot recognize, and shuts down.  It is likely the result 
of something which has gone wrong previously.  xfs_repair should fix 
it.  Are there other non-xfs messages in your logs indicating other 
problems prior to this?

i sent already the dmesg output to the list. there is nothing else.
I made a xfs_repair. Now I have some Files in lost+found.
So I tried it again with a new cable:

---
   I doubt it has changed significantly, but xfs was designed for
stable hardware.  That doesn't mean you can't pull the plug, but if
you are getting SATA resets, you may be getting some writes aborted,
with subsequent writes going through (speculation).  I know when
I had a flakey SCSI disk problem (was cable or connector in my
case), I'd get a rare XFS corruption (out of ~10 years of XFS use,
maybe 2-3 corruptions, all caused by loose connections, cables, etc).

   I'd strongly suggest you get to the bottom of the SATA reset
problem.  After that is fixed, then try to clean up your XFS disks (or
restore from backups).  Sometimes, after some intermittent hardware
problems, my xfs file system was too corrupt for me to repair (at
least with default xfs_repair options).  Doesn't mean it was irreparable,
just, I didn't know how to proceed and it was easier to restore from
a daily backup than attempt to manually repair the damage.

   The above is based solely on my own experience.  I use xfs
with max(8?) logbuffs, and noatime/nodiratime, and find it to have among
the best performance characteristics of any file system (overall;
lowest performance aspect was file delete).
   XFS has a low fragmentation rate, due to how it allocates
space and can delay writes.  Even so, it is also one of the few
file systems (only?) that comes with a "defragmenter"
(xfs_fsr (file system reorganizer)).

SGI used to ship systems with xfs_fsr configured to run
weekly to "watch out for" rare, degenerate cases (important for some
real-time video apps).  My cron runs it nightly,  but often it
will pass through all file systems making no changes.

Fix the flakey hw -- then see if your xfs probs don't "magically"
go away...however, YMMV...

Linda




Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Jeff Garzik

Kristen Carlson Accardi wrote:

Allow user space to determine if an ATAPI device supports
async notification (AN) of media changes.  This is done by
adding a new sysfs file "async_notification" to genhd.
If the file reads 1, then the device supports async 
notification.  If the file reads 0, it does not.  


A flag is set in the generic disk to indicate whether
or not AN is supported.  This flag is set by the SCSI
subsystem when it registers with add_disk.  The SCSI
system gets information from libata on whether the
device supports AN during dev_configure. 


Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-mm/block/genhd.c
===
--- 2.6-mm.orig/block/genhd.c
+++ 2.6-mm/block/genhd.c
@@ -372,6 +372,11 @@ static ssize_t disk_size_read(struct gen
 {
return sprintf(page, "%llu\n", (unsigned long long)get_capacity(disk));
 }
+static ssize_t disk_AN_read(struct gendisk * disk, char *page)
+{
+   return sprintf(page, "%d\n",
+   (disk->flags & GENHD_FL_ASYNC_NOTIFICATION ? 1 : 0));
+}
 
 static ssize_t disk_stats_read(struct gendisk * disk, char *page)
 {
@@ -419,6 +424,10 @@ static struct disk_attribute disk_attr_s
.attr = {.name = "stat", .mode = S_IRUGO },
.show   = disk_stats_read
 };
+static struct disk_attribute disk_attr_AN = {
+   .attr = {.name = "media_change_events", .mode = S_IRUGO },
+   .show   = disk_AN_read
+};
 
 #ifdef CONFIG_FAIL_MAKE_REQUEST
 
@@ -455,6 +464,7 @@ static struct attribute * default_attrs[
&disk_attr_removable.attr,
&disk_attr_size.attr,
&disk_attr_stat.attr,
+   &disk_attr_AN.attr,
 #ifdef CONFIG_FAIL_MAKE_REQUEST
&disk_attr_fail.attr,
 #endif
Index: 2.6-mm/include/linux/genhd.h
===
--- 2.6-mm.orig/include/linux/genhd.h
+++ 2.6-mm/include/linux/genhd.h
@@ -94,6 +94,7 @@ struct hd_struct {
 
 #define GENHD_FL_REMOVABLE                 1
 #define GENHD_FL_DRIVERFS                  2
+#define GENHD_FL_ASYNC_NOTIFICATION        4
 #define GENHD_FL_CD                        8
 #define GENHD_FL_UP                        16
 #define GENHD_FL_SUPPRESS_PARTITION_INFO   32
Index: 2.6-mm/include/scsi/scsi_device.h
===
--- 2.6-mm.orig/include/scsi/scsi_device.h
+++ 2.6-mm/include/scsi/scsi_device.h
@@ -126,7 +126,7 @@ struct scsi_device {
unsigned fix_capacity:1;/* READ_CAPACITY is too high by 1 */
unsigned guess_capacity:1;  /* READ_CAPACITY might be too high by 1 */
unsigned retry_hwerror:1;   /* Retry HARDWARE_ERROR */
-
+   unsigned async_notification:1;  /* device supports async notification */
unsigned int device_blocked;/* Device returned QUEUE_FULL. */
 
 	unsigned int max_device_blocked; /* what device_blocked counts down from  */

Index: 2.6-mm/drivers/ata/libata-scsi.c
===
--- 2.6-mm.orig/drivers/ata/libata-scsi.c
+++ 2.6-mm/drivers/ata/libata-scsi.c
@@ -899,6 +899,9 @@ static void ata_scsi_dev_config(struct s
blk_queue_max_hw_segments(q, q->max_hw_segments - 1);
}
 
+	if (dev->flags & ATA_DFLAG_AN)

+   sdev->async_notification = 1;
+
if (dev->flags & ATA_DFLAG_NCQ) {
int depth;
 
Index: 2.6-mm/drivers/scsi/sr.c

===
--- 2.6-mm.orig/drivers/scsi/sr.c
+++ 2.6-mm/drivers/scsi/sr.c
@@ -603,6 +603,8 @@ static int sr_probe(struct device *dev)
 
 	dev_set_drvdata(dev, cd);

disk->flags |= GENHD_FL_REMOVABLE;
+   if (sdev->async_notification)
+   disk->flags |= GENHD_FL_ASYNC_NOTIFICATION;
add_disk(disk);


(added linux-scsi to CC)

Comments:

1) From a procedural standpoint, you'll want to separate this patch into 
three patches:  generic block layer stuff, SCSI stuff, and libata stuff.


2) I don't claim to be a sysfs expert, but this seems like a reasonable 
approach for reporting async-notification capabilities


3) I would make the contents of 'media_change_events' be a list of 
flags, rather than a boolean.  Thus, when AN is present, 
media_change_events would return "AN\n".  It would return "\n" (no 
flags) when AN is absent.  This permits future expansion of this 
capabilities reporting variable.


4) Figure out some place to document 'media_change_events', in 
Documentation/*


5) I think the method of delivery probably needs discussing, and some 
work.  Presumably the normal hotplug paths should be traversed for this 
sort of thing.
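
Comment 3 above (a list of flags rather than a boolean) could look roughly like the following user-space sketch. It is not the sysfs show routine from the patch: DEMO_FL_ASYNC_NOTIFICATION merely mirrors the value the patch gives GENHD_FL_ASYNC_NOTIFICATION, and the table, the function name, its signature, and the space separator between names are assumptions made for illustration. Adding a capability later would mean adding one table entry.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_FL_ASYNC_NOTIFICATION 4	/* mirrors the value used in the patch */

/* Hypothetical flag-to-name table; extend it to report more capabilities. */
struct cap_name {
	unsigned int flag;
	const char *name;
};

static const struct cap_name cap_names[] = {
	{ DEMO_FL_ASYNC_NOTIFICATION, "AN" },
};

/*
 * Write the names of all set capabilities into 'page', separated by spaces
 * and terminated by a newline: "AN\n" with AN present, "\n" with no flags.
 * Returns the number of characters written, not counting the final NUL.
 */
int media_change_events_show(unsigned int flags, char *page, size_t len)
{
	size_t used = 0;

	assert(page != NULL && len >= 2);
	for (size_t i = 0; i < sizeof(cap_names) / sizeof(cap_names[0]); i++) {
		int n;

		if (!(flags & cap_names[i].flag))
			continue;
		n = snprintf(page + used, len - used, "%s%s",
			     used ? " " : "", cap_names[i].name);
		if (n < 0 || (size_t)n >= len - used - 1)
			break;	/* no room for this name plus the newline */
		used += (size_t)n;
	}
	page[used] = '\n';
	page[used + 1] = '\0';
	return (int)(used + 1);
}
```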




Re: [PATCH] Add support for deferrable timers (respun-Mar28)

2007-03-28 Thread Andrew Morton
On Wed, 28 Mar 2007 16:00:21 -0700
Venki Pallipadi <[EMAIL PROTECTED]> wrote:

> Please drop the patch you included yesterday and two incremental patches and
> use the patch below.

As you saw, I went and turned it into an incremental patch again.  It makes
it easier to see what changed, but harder to see the whole thing.

> Introduce a new flag for timers - deferrable:

OK, but there's nothing in-kernel which actually uses this.

It would be good to identify some timer users which can be switched over (as
many as possible, really) so this thing actually gets some runtime testing.


Re: [PATCH 0/4] UML - Code cleanups for 2.6.21

2007-03-28 Thread Andrew Morton
On Wed, 28 Mar 2007 11:28:45 -0400
Jeff Dike <[EMAIL PROTECTED]> wrote:

> These are tidying patches from Blaisorblade - 2.6.21 material.

The three net_kern.c patches invoked a reject storm against mainline,
presumably because of uml-network-interface-hotplug-error-handling.patch.

So I bumped those three patches into 2.6.22.


Re: [RFC] Virtual methods for devices and generalized GPIO support using it

2007-03-28 Thread H. Peter Anvin

Paul Sokolovsky wrote:


 By this criteria I happened to choose macros syntax. But it's still
merely a syntax, and I don't pledge for it. If there's more movement
towards using explicit low-level forms like 1) or 2) instead of
introducing new syntactic pattern, then macro syntax can be considered
to have fulfilled its introductory role and can be dropped.



"Movement towards?!"  That's been a fundamental part of Linux design 
since the very beginning.


-hpa


Re: [RFC] Virtual methods for devices and generalized GPIO support using it

2007-03-28 Thread Paul Sokolovsky
Hello H.,

Wednesday, March 28, 2007, 7:32:57 PM, you wrote:

> Paul Sokolovsky wrote:
>> 
>> In this respect, VTABLE(), METHOD() macros serve the same purpose as 
>> container_of() and list_for_each() - they are besides offering (more) 
>> convenient syntax, also carry important annotation and educational
>> messages, like "it's ok, and encouraged to embed one structure into 
>> another - use it!" or "list manipulation is a trivial operation for kernel,
>> and we want you to treat it as such and use in standard, easily 
>> distinguishable way".
>> 

> You realize, right, that the Linux kernel already has a much cleaner 
> way to do vtables in the kernel, without this kind of macro crappage? 
> It's called an _ops table, and is used in a patternized way:
>
>         foo->x_ops->func(foo, ...);
>
> ... all over the kernel.  We like it that way.

  Sure! I wrote it's nothing really new. And I hope it's clear why
those macros appeared in the first place: with the type of structures
the device virtual methods are intended to be used with, fairly
involved member selection and typecasting are always required. In
this regard, there were 3 choices:

1. Use long but explicit expressions, like

((struct dev_pdata*)pdev.dev->platform_device)->x_ops->func(dev)

2. Use temporary variables:

struct dev_pdata *tmp = (struct dev_pdata*)pdev.dev->platform_device;
tmp->x_ops->func(dev);

3. Introduce macros which would hide guts and would provide syntax
more resembling usual function call (especially for folks who remember
that preprocessor is unalienable part of C ;-) ).


 As I also noted in the original mail, macros are also a nice device
for in-place annotation - to emphasize the fact that this is not just
a mundane case of pointer manipulation, but paradigmatic thing.


 By this criteria I happened to choose macros syntax. But it's still
merely a syntax, and I don't pledge for it. If there's more movement
towards using explicit low-level forms like 1) or 2) instead of
introducing new syntactic pattern, then macro syntax can be considered
to have fulfilled its introductory role and can be dropped.
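
For readers unfamiliar with the _ops-table pattern discussed in this thread, here is a minimal user-space sketch of it, combined with the temporary-variable style of choice 2. None of the names come from the kernel or from the proposed patch: struct gpio_dev, struct gpio_ops, demo_gpio_ops and gpio_get are hypothetical and exist only to show the shape of the pattern.

```c
#include <assert.h>
#include <stddef.h>

struct gpio_dev;			/* hypothetical device type */

/* The "_ops table": one function pointer per virtual method. */
struct gpio_ops {
	int  (*get)(struct gpio_dev *dev, unsigned int pin);
	void (*set)(struct gpio_dev *dev, unsigned int pin, int value);
};

struct gpio_dev {
	const struct gpio_ops *ops;	/* chosen by whoever registers the device */
	unsigned int state;		/* one bit per pin, for this demo only */
};

static int demo_get(struct gpio_dev *dev, unsigned int pin)
{
	return (dev->state >> pin) & 1u;
}

static void demo_set(struct gpio_dev *dev, unsigned int pin, int value)
{
	if (value)
		dev->state |= 1u << pin;
	else
		dev->state &= ~(1u << pin);
}

/* One concrete implementation of the table. */
const struct gpio_ops demo_gpio_ops = {
	.get = demo_get,
	.set = demo_set,
};

/* A caller needs no macro: fetch the table once, then call through it. */
int gpio_get(struct gpio_dev *dev, unsigned int pin)
{
	const struct gpio_ops *ops;

	assert(dev != NULL && dev->ops != NULL && dev->ops->get != NULL);
	ops = dev->ops;			/* temporary variable, as in choice 2 */
	return ops->get(dev, pin);
}
```

A device would be set up with something like `struct gpio_dev d = { .ops = &demo_gpio_ops };` and driven with `d.ops->set(&d, 3, 1);` followed by `gpio_get(&d, 3)`.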




> -hpa



-- 
Best regards,
 Paul                            mailto:[EMAIL PROTECTED]



Re: [patch] cache pipe buf page address for non-highmem arch

2007-03-28 Thread Jeremy Fitzhardinge
Andrew Morton wrote:
> On Wed, 28 Mar 2007 16:21:04 -0700
> Zach Brown <[EMAIL PROTECTED]> wrote:
>
>   
>>> Does this look OK?
>>>   
>> Almost...
>>
>> 
>>> #ifdef CONFIG_HIGHMEM
>>> static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
>>> #else   /* CONFIG_HIGHMEM */
>>> static inline void pipe_kunmap_atomic(struct page *page, enum km_type type)
>>>   
>
> OK, I give up.  What are you telling me here?
>   

Also void *addr vs struct page *page.

J


[patch 1/3] libata: check for AN support

2007-03-28 Thread Kristen Carlson Accardi
Check to see if an ATAPI device supports Asynchronous Notification.
If so, enable it.

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-mm/drivers/ata/libata-core.c
===
--- 2.6-mm.orig/drivers/ata/libata-core.c
+++ 2.6-mm/drivers/ata/libata-core.c
@@ -71,6 +71,7 @@ const unsigned long sata_deb_timing_long
 static unsigned int ata_dev_init_params(struct ata_device *dev,
u16 heads, u16 sectors);
 static unsigned int ata_dev_set_xfermode(struct ata_device *dev);
+static unsigned int ata_dev_set_AN(struct ata_device *dev);
 static void ata_dev_xfermask(struct ata_device *dev);
 
 static unsigned int ata_print_id = 1;
@@ -1745,6 +1746,23 @@ int ata_dev_configure(struct ata_device 
}
dev->cdb_len = (unsigned int) rc;
 
+   /*
+* check to see if this ATAPI device supports
+* Asynchronous Notification
+*/
+   if (ata_id_has_AN(id))
+   {
+   /* issue SET feature command to turn this on */
+   rc = ata_dev_set_AN(dev);
+   if (rc) {
+   ata_dev_printk(dev, KERN_ERR,
+   "unable to set AN\n");
+   rc = -EINVAL;
+   goto err_out_nosup;
+   }
+   dev->flags |= ATA_DFLAG_AN;
+   }
+
if (ata_id_cdb_intr(dev->id)) {
dev->flags |= ATA_DFLAG_CDB_INTR;
cdb_intr_string = ", CDB intr";
@@ -3642,6 +3660,42 @@ static unsigned int ata_dev_set_xfermode
 }
 
 /**
+ * ata_dev_set_AN - Issue SET FEATURES - SATA FEATURES
+ *   with sector count set to indicate
+ *   Asynchronous Notification feature
+ * @dev: Device to which command will be sent
+ *
+ * Issue SET FEATURES - SATA FEATURES command to device @dev
+ * on port @ap.
+ *
+ * LOCKING:
+ * PCI/etc. bus probe sem.
+ *
+ * RETURNS:
+ * 0 on success, AC_ERR_* mask otherwise.
+ */
+static unsigned int ata_dev_set_AN(struct ata_device *dev)
+{
+   struct ata_taskfile tf;
+   unsigned int err_mask;
+
+   /* set up set-features taskfile */
+   DPRINTK("set features - SATA features\n");
+
+   ata_tf_init(dev, &tf);
+   tf.command = ATA_CMD_SET_FEATURES;
+   tf.feature = SETFEATURES_SATA_ENABLE;
+   tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
+   tf.protocol = ATA_PROT_NODATA;
+   tf.nsect = SATA_AN;
+
+   err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0);
+
+   DPRINTK("EXIT, err_mask=%x\n", err_mask);
+   return err_mask;
+}
+
+/**
  * ata_dev_init_params - Issue INIT DEV PARAMS command
  * @dev: Device to which command will be sent
  * @heads: Number of heads (taskfile parameter)
Index: 2.6-mm/include/linux/ata.h
===
--- 2.6-mm.orig/include/linux/ata.h
+++ 2.6-mm/include/linux/ata.h
@@ -193,6 +193,12 @@ enum {
SETFEATURES_WC_ON   = 0x02, /* Enable write cache */
SETFEATURES_WC_OFF  = 0x82, /* Disable write cache */
 
+   SETFEATURES_SATA_ENABLE = 0x10, /* Enable use of SATA feature */
+   SETFEATURES_SATA_DISABLE = 0x90, /* Disable use of SATA feature */
+
+   /* SETFEATURE Sector counts for SATA features */
+   SATA_AN = 0x05,  /* Asynchronous Notification */
+
/* ATAPI stuff */
ATAPI_PKT_DMA   = (1 << 0),
ATAPI_DMADIR= (1 << 2), /* ATAPI data dir:
@@ -298,6 +304,8 @@ struct ata_taskfile {
 #define ata_id_queue_depth(id) (((id)[75] & 0x1f) + 1)
 #define ata_id_removeable(id)  ((id)[0] & (1 << 7))
 #define ata_id_has_dword_io(id)((id)[50] & (1 << 0))
+#define ata_id_has_AN(id)  \
+   ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))
 #define ata_id_iordy_disable(id) ((id)[49] & (1 << 10))
 #define ata_id_has_iordy(id) ((id)[49] & (1 << 9))
 #define ata_id_u32(id,n)   \
Index: 2.6-mm/include/linux/libata.h
===
--- 2.6-mm.orig/include/linux/libata.h
+++ 2.6-mm/include/linux/libata.h
@@ -136,6 +136,7 @@ enum {
ATA_DFLAG_CDB_INTR  = (1 << 2), /* device asserts INTRQ when ready for CDB */
ATA_DFLAG_NCQ   = (1 << 3), /* device supports NCQ */
ATA_DFLAG_FLUSH_EXT = (1 << 4), /* do FLUSH_EXT instead of FLUSH */
+   ATA_DFLAG_AN        = (1 << 5), /* device supports Async notification */
ATA_DFLAG_CFG_MASK  = (1 << 8) - 1,
 
ATA_DFLAG_PIO   = (1 << 8), /* device limited to PIO mode */

-- 

[patch 3/3] libata: handle AN interrupt

2007-03-28 Thread Kristen Carlson Accardi
When we get an SDB FIS with the 'N' bit set, we should send
an event to user space to indicate that there has been a
media change.  The ahci host controller will send the
event via KOBJ_CHANGE uevent.

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
Index: 2.6-mm/drivers/ata/ahci.c
===
--- 2.6-mm.orig/drivers/ata/ahci.c
+++ 2.6-mm/drivers/ata/ahci.c
@@ -1164,6 +1164,26 @@ static void ahci_host_intr(struct ata_po
return;
}
 
+   if (status & PORT_IRQ_SDB_FIS) {
+   /*
+* if this is an ATAPI device with AN turned on,
+* then we should interrogate the device to
+* determine the cause of the interrupt
+*
+* for AN - this we should check the SDB FIS
+* and find the I and N bits set
+*/
+   const u32 *f = pp->rx_fis + RX_FIS_SDB;
+
+   /* check the 'N' bit in word 0 of the FIS */
+   if (f[0] & (1 << 15)) {
+   int port_addr =  ((f[0] & 0x0f00) >> 8);
+   struct ata_device *adev = &ap->device[port_addr];
+   ata_port_printk(ap, KERN_INFO, "N bit set on SDB FIS!\n");
+   if (adev->flags & ATA_DFLAG_AN)
+   ata_async_notify(adev);
+   }
+   }
if (ap->sactive)
qc_active = readl(port_mmio + PORT_SCR_ACT);
else
Index: 2.6-mm/include/linux/libata.h
===
--- 2.6-mm.orig/include/linux/libata.h
+++ 2.6-mm/include/linux/libata.h
@@ -492,6 +492,7 @@ struct ata_device {
/* ACPI objects info */
acpi_handle obj_handle;
 #endif
+   struct work_struct  async_notify;
 };
 
 /* Offset into struct ata_device.  Fields above it are maintained
@@ -826,6 +827,7 @@ extern void ata_scsi_slave_destroy(struc
 extern int ata_scsi_change_queue_depth(struct scsi_device *sdev,
   int queue_depth);
 extern struct ata_device *ata_dev_pair(struct ata_device *adev);
+extern void ata_async_notify(struct ata_device *atadev);
 extern int ata_do_set_mode(struct ata_port *ap, struct ata_device **r_failed_dev);
 extern u8 ata_irq_on(struct ata_port *ap);
 extern u8 ata_dummy_irq_on(struct ata_port *ap);
Index: 2.6-mm/drivers/ata/libata-core.c
===
--- 2.6-mm.orig/drivers/ata/libata-core.c
+++ 2.6-mm/drivers/ata/libata-core.c
@@ -1576,6 +1576,26 @@ static void ata_dev_config_ncq(struct at
snprintf(desc, desc_sz, "NCQ (depth %d/%d)", hdepth, ddepth);
 }
 
+static void async_notify_thread(struct work_struct *work)
+{
+   struct ata_device *atadev =
+   container_of(work, struct ata_device, async_notify);
+
+   /*
+* TBD - who should send this event?  I couldn't find an
+* easy way to map an ata_device to a genhd device, so
+* decided maybe the ata host should send the event and
+* allow user space to figure out what happened?
+*/
+   kobject_uevent(&atadev->ap->host->dev->kobj, KOBJ_CHANGE);
+}
+
+void ata_async_notify(struct ata_device *atadev)
+{
+   schedule_work(&atadev->async_notify);
+}
+
+
 /**
  * ata_dev_configure - Configure the specified ATA/ATAPI device
  * @dev: Target device to configure
@@ -1761,6 +1781,7 @@ int ata_dev_configure(struct ata_device 
goto err_out_nosup;
}
dev->flags |= ATA_DFLAG_AN;
+   INIT_WORK(&dev->async_notify, async_notify_thread);
}
 
if (ata_id_cdb_intr(dev->id)) {
@@ -6650,6 +6671,7 @@ EXPORT_SYMBOL_GPL(ata_dummy_irq_on);
 EXPORT_SYMBOL_GPL(ata_irq_ack);
 EXPORT_SYMBOL_GPL(ata_dummy_irq_ack);
 EXPORT_SYMBOL_GPL(ata_dev_try_classify);
+EXPORT_SYMBOL_GPL(ata_async_notify);
 
 EXPORT_SYMBOL_GPL(ata_cable_40wire);
 EXPORT_SYMBOL_GPL(ata_cable_80wire);
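
The bit tests in the ahci.c hunk above can be tried out in user space.
This is only a sketch: the field positions (bit 15 for 'N', bits 8-11
for the port) are taken from the patch itself, while the macro and
function names are hypothetical and do not exist in libata.

```c
#include <stdint.h>

/* Field layout as used by the ahci.c hunk: bit 15 = 'N', bits 8-11 = port. */
#define SDB_FIS_N_BIT		(1u << 15)
#define SDB_FIS_PORT_MASK	0x0f00u
#define SDB_FIS_PORT_SHIFT	8

/* Returns the port number (0-15) if the 'N' bit is set, or -1 otherwise. */
static int sdb_fis_notify_port(uint32_t word0)
{
	if (!(word0 & SDB_FIS_N_BIT))
		return -1;
	return (int)((word0 & SDB_FIS_PORT_MASK) >> SDB_FIS_PORT_SHIFT);
}
```

A caller would pass word 0 of the received SDB FIS and only look up the
device when the return value is not -1.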

-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Kristen Carlson Accardi
Allow user space to determine if an ATAPI device supports
async notification (AN) of media changes.  This is done by
adding a new sysfs file "media_change_events" to genhd.
If the file reads 1, then the device supports async
notification.  If the file reads 0, it does not.

A flag is set in the generic disk to indicate whether
or not AN is supported.  This flag is set by the SCSI
subsystem when it registers with add_disk.  The SCSI
system gets information from libata on whether the
device supports AN during dev_configure. 

Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>

Index: 2.6-mm/block/genhd.c
===
--- 2.6-mm.orig/block/genhd.c
+++ 2.6-mm/block/genhd.c
@@ -372,6 +372,11 @@ static ssize_t disk_size_read(struct gen
 {
return sprintf(page, "%llu\n", (unsigned long long)get_capacity(disk));
 }
+static ssize_t disk_AN_read(struct gendisk * disk, char *page)
+{
+   return sprintf(page, "%d\n",
+   (disk->flags & GENHD_FL_ASYNC_NOTIFICATION ? 1 : 0));
+}
 
 static ssize_t disk_stats_read(struct gendisk * disk, char *page)
 {
@@ -419,6 +424,10 @@ static struct disk_attribute disk_attr_s
.attr = {.name = "stat", .mode = S_IRUGO },
.show   = disk_stats_read
 };
+static struct disk_attribute disk_attr_AN = {
+   .attr = {.name = "media_change_events", .mode = S_IRUGO },
+   .show   = disk_AN_read
+};
 
 #ifdef CONFIG_FAIL_MAKE_REQUEST
 
@@ -455,6 +464,7 @@ static struct attribute * default_attrs[
&disk_attr_removable.attr,
&disk_attr_size.attr,
&disk_attr_stat.attr,
+   &disk_attr_AN.attr,
 #ifdef CONFIG_FAIL_MAKE_REQUEST
&disk_attr_fail.attr,
 #endif
Index: 2.6-mm/include/linux/genhd.h
===
--- 2.6-mm.orig/include/linux/genhd.h
+++ 2.6-mm/include/linux/genhd.h
@@ -94,6 +94,7 @@ struct hd_struct {
 
 #define GENHD_FL_REMOVABLE 1
 #define GENHD_FL_DRIVERFS  2
+#define GENHD_FL_ASYNC_NOTIFICATION    4
 #define GENHD_FL_CD                    8
 #define GENHD_FL_UP                    16
 #define GENHD_FL_SUPPRESS_PARTITION_INFO   32
Index: 2.6-mm/include/scsi/scsi_device.h
===
--- 2.6-mm.orig/include/scsi/scsi_device.h
+++ 2.6-mm/include/scsi/scsi_device.h
@@ -126,7 +126,7 @@ struct scsi_device {
unsigned fix_capacity:1;        /* READ_CAPACITY is too high by 1 */
unsigned guess_capacity:1;      /* READ_CAPACITY might be too high by 1 */
unsigned retry_hwerror:1;       /* Retry HARDWARE_ERROR */
-
+   unsigned async_notification:1;  /* device supports async notification */
unsigned int device_blocked;    /* Device returned QUEUE_FULL. */
 
unsigned int max_device_blocked; /* what device_blocked counts down from  */
Index: 2.6-mm/drivers/ata/libata-scsi.c
===
--- 2.6-mm.orig/drivers/ata/libata-scsi.c
+++ 2.6-mm/drivers/ata/libata-scsi.c
@@ -899,6 +899,9 @@ static void ata_scsi_dev_config(struct s
blk_queue_max_hw_segments(q, q->max_hw_segments - 1);
}
 
+   if (dev->flags & ATA_DFLAG_AN)
+   sdev->async_notification = 1;
+
if (dev->flags & ATA_DFLAG_NCQ) {
int depth;
 
Index: 2.6-mm/drivers/scsi/sr.c
===
--- 2.6-mm.orig/drivers/scsi/sr.c
+++ 2.6-mm/drivers/scsi/sr.c
@@ -603,6 +603,8 @@ static int sr_probe(struct device *dev)
 
dev_set_drvdata(dev, cd);
disk->flags |= GENHD_FL_REMOVABLE;
+   if (sdev->async_notification)
+   disk->flags |= GENHD_FL_ASYNC_NOTIFICATION;
add_disk(disk);
 
sdev_printk(KERN_DEBUG, sdev,



Re: Corrupt XFS -Filesystems on new Hardware and Kernel

2007-03-28 Thread David Chinner
On Wed, Mar 28, 2007 at 02:42:00PM +0200, Oliver Joa wrote:
> Hi,
> 
> David Chinner wrote:
> 
> [...]
> 
> >What is the corruption message in the log from XFS?
> >Can you please post that? Without it we really can't help you.
> >
> >Also, please check to see if there are any I/O errors
> >in the log around the time the corruption message appears.
> 
> Ok, here is a test:
> 
> test:/# find / -xdev | cpio -padm /test/
> cpio: /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt: 
> Structure needs cleaning
> 3648371 blocks
> test:/#
> 
> test:/home/olli# uname -a
> Linux test 2.6.20.4-majestix-1 #1 SMP PREEMPT Tue Mar 27 12:15:41 CEST 
> 2007 i686 GNU/Linux
> 
> dmesg gives the following:
> [15442.935941] Filesystem "sda3": XFS internal error xfs_iformat(6) at 
> line 492 of file fs/xfs/xfs_inode.c.  Caller 0xc0211f94
> [15442.936003]  [] xfs_iread+0x4ee/0x6e8
> [15442.936039]  [] xfs_iget+0x2e4/0x714
> [15442.936071]  [] xfs_iget+0x2e4/0x714
> [15442.936101]  [] xfs_dir_lookup_int+0x7d/0xd4

So we have a corrupt inode. The error tells me that the
corrupted inode is either a regular file, directory or link.
Unfortunately it doesn't tell us the inode number that is
corrupted.

> test:/# rm /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt
> rm: cannot remove 
> `/usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt': 
> Structure needs cleaning
> test:/#

Once the filesystem shuts down this will happen to every operation.

Next time you get a shutdown, can you unmount the filesystem and
run xfs_check and then "xfs_repair -n" on it? These will
tell you the inode numbers that are bad. Can you post the errors
reported by these tools?

Once you have the bad inode numbers, can you run the following
on the bad inodes:

# xfs_db -r -c "inode <inode number>" -c "p" <device>

E.g.:

# xfs_db -r -c "inode 128" -c p /dev/sdb8
core.magic = 0x494e
core.mode = 040755
core.version = 2
core.format = 2 (extents)
..

and post the output for us? That will enable us to see exactly what
the corruption is on the inode.
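
For reading the first two fields of output like the example above, here
is a small stand-alone sketch.  It assumes the usual POSIX file-type
bits in the mode word and takes the magic value 0x494e (ASCII "IN")
from the sample output; the helper names are made up and are not part
of xfs_db or xfsprogs.

```c
#include <stdint.h>

#define INODE_MAGIC	0x494e	/* ASCII "IN", as in the sample output */

/* Returns non-zero if the magic field matches the sample's value. */
static int magic_ok(uint16_t magic)
{
	return magic == INODE_MAGIC;
}

/* Classifies a mode word by its file-type bits (POSIX S_IFMT layout). */
static const char *mode_type(uint16_t mode)
{
	switch (mode & 0170000) {
	case 0100000:
		return "regular file";
	case 0040000:
		return "directory";
	case 0120000:
		return "symlink";
	default:
		return "other";
	}
}
```

With the sample values, magic_ok(0x494e) is true and mode_type(040755)
reports a directory.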

Cheers,

Dave.


> 
> I got:
> 
> [18359.750604] Filesystem "sda3": XFS internal error xfs_iformat(6) at 
> line 492 of file fs/xfs/xfs_inode.c.  Caller 0xc0211f94
> [18359.750701]  [] xfs_iread+0x4ee/0x6e8
> [18359.750755]  [] xfs_iget+0x2e4/0x714
> [18359.750802]  [] xfs_iget+0x2e4/0x714
> [18359.750849]  [] xfs_dir_lookup_int+0x7d/0xd4
> [18359.750897]  [] xfs_lookup+0x52/0x78
> [18359.750943]  [] xfs_vn_lookup+0x3b/0x70
> [18359.750990]  [] do_lookup+0xa3/0x140
> [18359.751036]  [] __link_path_walk+0x73d/0xb5e
> [18359.751086]  [] link_path_walk+0x44/0xb3
> [18359.751133]  [] rb_insert_color+0x4c/0xad
> [18359.751180]  [] vma_link+0x54/0xcd
> [18359.751226]  [] do_path_lookup+0x176/0x191
> [18359.751273]  [] getname+0x59/0x8f
> [18359.751318]  [] __user_walk_fd+0x2f/0x45
> [18359.751364]  [] vfs_lstat_fd+0x16/0x3d
> [18359.751410]  [] rb_insert_color+0x4c/0xad
> [18359.751457]  [] vma_link+0x54/0xcd
> [18359.751501]  [] sys_lstat64+0xf/0x23
> [18359.751546]  [] do_page_fault+0x277/0x526
> [18359.751595]  [] do_page_fault+0x0/0x526
> [18359.751640]  [] syscall_call+0x7/0xb
> [18359.751686]  [] rsc_parse+0x6f/0x37f
> [18359.751732]  ===
> [18359.751784] Filesystem "sda3": XFS internal error xfs_iformat(6) at 
> line 492 of file fs/xfs/xfs_inode.c.  Caller 0xc0211f94
> [18359.751859]  [] xfs_iread+0x4ee/0x6e8
> [18359.751906]  [] xfs_iget+0x2e4/0x714
> [18359.751952]  [] xfs_iget+0x2e4/0x714
> [18359.751998]  [] xfs_dir_lookup_int+0x7d/0xd4
> [18359.752047]  [] xfs_lookup+0x52/0x78
> [18359.752094]  [] xfs_vn_lookup+0x3b/0x70
> [18359.752140]  [] __lookup_hash+0xb1/0xe1
> [18359.752191]  [] do_unlinkat+0x5f/0x126
> [18359.752237]  [] do_page_fault+0x277/0x526
> [18359.752285]  [] syscall_call+0x7/0xb
> [18359.752331]  [] rsc_parse+0x6f/0x37f
> [18359.752376]  ===
> 
> 
> 
> Thanks a Lot
> 
> Oliver

-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


[patch 0/3] Asynchronous Notification for SATA ATAPI devices

2007-03-28 Thread Kristen Carlson Accardi
This patch series implements Asynchronous Notification (AN) for SATA ATAPI
devices as defined in SATA 2.5 and AHCI 1.1.  Drives which support this
feature will send a notification when new media is inserted into the
drive, preventing the need for user space to poll for new media.  This
support is exposed to user space via a file in sysfs (/sys/block/sr*)
called "media_change_events".  If the drive supports AN, this file will
read 1, otherwise 0.  User space can disable polling for new media if this
file reads 1.  When new media is inserted into the ATAPI drive, the ahci
driver will send a KOBJ_CHANGE event.
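
As a rough illustration of the user-space side, the check could look
like the sketch below.  The function name an_supported() is
hypothetical, and the device name in the usage note (sr0) is only an
example; the attribute name is the one described above.

```c
#include <stdio.h>

/*
 * Returns 1 if the attribute file reads "1", 0 if it reads any other
 * integer, and -1 if the file cannot be opened or parsed.
 */
static int an_supported(const char *attr_path)
{
	FILE *f = fopen(attr_path, "r");
	int value;
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", &value) == 1);
	fclose(f);
	if (!ok)
		return -1;
	return value == 1;
}
```

A polling daemon might call
an_supported("/sys/block/sr0/media_change_events") once per drive and
stop polling only when the result is 1, treating -1 (file missing, for
example on a kernel without this series) the same as 0.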

I would really like feedback on the user interface - both the location
of the sysfs file which indicates AN support, as well as the type of
uevent etc.  I have not yet tested AN on eject (I assume it doesn't require
anything special) as my test drive which supports AN is a bit "quirky" in 
this respect.  Please take a look and let me know what you think.

Thanks,
Kristen



Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Christoph Lameter
On Wed, 28 Mar 2007, William Lee Irwin III wrote:

> As far as kernel compiles being relevant to anything besides
> potentially optimizing a particular major benchmark using gcc as one
> of its components... yeah, right. It's too macro to be a microbenchmark
> of anything and too micro to be pertinent to any meaningful
> macrobenchmark such as those from major benchmark publishers (who can't
> be named for trademark/etc. reasons). Hasn't it been at least 5 years
> since people figured out kernel compiles were complete bulls**t as
> benchmarks along with dbench for other reasons and several others? If
> not, I don't know why I bother with this kernel at all.

All benchmarks have their specific drawbacks. I personally like to do code
review and see what cachelines are touched, but that is basically imagining
what a cpu does. One's thinking may be led astray.

> Even so, I already did this and am done with it. It's not like I'm
> not carrying around numerous patches I know will never be merged all
> the time anyway. If you want to back it all out so badly, just do it
> and stop bothering me about it, and I'll merely continue maintaining my
> local patches without ever posting them as I have been for years. I'm
> not at all happy with the NIH situation, either, not that I'm at such a
> loss for ideas to need to contest every petty NIH that flies past.

What is NIH? My main concern is to get the use of page struct fields of
the slab removed. We have to do special things to page sized allocs
because of these page struct uses. For example, the private field is used by
compound pages, and if any of the slabs allocate a higher order page that
field will be in use.

We certainly see even from the rudimentary tests that I have done that 
the limited pgd, pmd caching has some effect. Could we please see your local 
patches? And I guess that you must have some sort of benchmark that you 
run to test these?


Re: RSDL v0.31

2007-03-28 Thread Bill Davidsen

Linus Torvalds wrote:


On Tue, 20 Mar 2007, Willy Tarreau wrote:

Linus, you're unfair with Con. He initially was on this position, and lately
worked with Mike by proposing changes to try to improve his X responsiveness.


I was not actually so much speaking about Con, as about a lot of the 
tone in general here. And yes, it's not been entirely black and white. I 
was very happy to see the "try this patch" email from Al Boldi - not 
because I think that patch per se was necessarily the right fix (I have no 
idea), but simply because I think that's the kind of mindset we need to 
have.


Not a lot of people really *like* the old scheduler, but it's been tweaked 
over the years to try to avoid some nasty behaviour. I'm really hoping 
that RSDL would be a lot better (and by all accounts it has the potential 
for that), but I think it's totally naïve to expect that it won't need 
some tweaking too.


So I'll happily still merge RSDL right after 2.6.21 (and it won't even be 
a config option - if we want to make it good, we need to make sure 
*everybody* tests it), but what I want to see is that "can do" spirit wrt 
tweaking for issues that come up.


May I suggest that if you want proper testing that it not only should be 
a config option but a boot time option as well? Otherwise people will be 
comparing an old scheduler with an RSDL kernel, and they will diverge as 
time goes on.


More people would be willing to reboot and test on a similar load than 
will keep two versions of the kernel around. And if you get people 
testing RSDL against a vendor kernel which might be hacked, it will be 
even less meaningful.


Please consider the benefits of making RSDL the default scheduler, and 
leaving people with the old scheduler with an otherwise identical kernel 
as a fair and meaningful comparison.


There, that's a technical argument ;-)

Because let's face it - nothing is ever perfect. Even a really nice 
conceptual idea always ends up hitting the "but in real life, things are 
ugly and complex, and we've depended on behaviour X in the past and can't 
change it, so we need some tweaking for problem Y".


And everything is totally fixable - at least as long as people are willing 
to!


Linus



--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Re: [PATCH] max_loop limit

2007-03-28 Thread Karel Zak
On Sun, Mar 25, 2007 at 10:40:10AM +0200, Tomas M wrote:
> >here's one. Allocates all the fluff dynamically. It does not create any
> >dev nodes by itself, so you need to do it (à la mdadm)
> 
> I'm afraid that this would break a lot of things, for example mount -o 
> loop will not work anymore unless you create /dev/loop* manually first, 

 Yes, "losetup" and "mount -o loop" call stat( /dev/loopN ) when looking
 for an (un)used loop device.

> am I correct? In this case, this is unusable for many as it is not 
> backward compatible with old loop.c, am I correct?

 udev ?

Karel

-- 
 Karel Zak  <[EMAIL PROTECTED]>


Re: RSDL v0.31

2007-03-28 Thread Bill Davidsen

David Schwartz wrote:

there were multiple attempts with renicing X under the vanilla
scheduler, and they were utter failures most of the time. _More_ people
complained about interactivity issues _after_ X has been reniced to -5
(or -10) than people complained about "nice 0" interactivity issues to
begin with.


Unfortunately, nicing X is not going to work. It causes X to pre-empt any
local process that tries to batch requests to it, defeating the batching.
What you really want is X to get scheduled after the client pauses in
sending data to it or has sent more than a certain amount. It seems kind of
crazy to put such logic in a scheduler.

Perhaps when one process unblocks another, you put that other process at the
head of the run queue but don't pre-empt the currently running process. That
way, the process can continue to batch requests, but X's maximum latency
delay will be the quantum of the client program.


In general I think that's the right idea. See below for more...



The vanilla scheduler's auto-nice feature rewards _behavior_, so it gets
X right most of the time. The fundamental issue is that sometimes X is
very interactive - we boost it then, there's lots of scheduling but nice
low latencies. Sometimes it's a hog - we penalize it then and things
start to batch up more and we get out of the overload situation faster.
That's the case even if all you care about is desktop performance.

no doubt it's hard to get the auto-nice thing right, but one thing is
clear: currently RSDL causes problems in areas that worked well in the
vanilla scheduler for a long time, so RSDL needs to improve. RSDL should
not lure itself into the false promise of 'just renice X statically'. It
wont work. (You might want to rewrite X's request scheduling - but if so
then i'd like to see that being done _first_, because i just dont trust
such 10-mile-distance problem analysis.)


I am hopeful that there exists a heuristic that both improves this problem
and is also inherently fair. If that's true, then such a heuristic can be
added to RSDL without damaging its properties and without requiring any
special settings. Perhaps longer-term latency benefits to processes that
have yielded in the past?

I think there are certain circumstances, however, where it is inherently
reasonable to insist that 'nice' be used. If you want a CPU-starved task to
get more than 1/X of the CPU, where X is the number of CPU-starved tasks,
you should have to ask for that. If you want one CPU-starved task to get
better latency than other CPU-starved tasks, you should have to ask for
that.


I agree for giving a process more than a fair share, but I don't think 
"latency" is the best term for what you describe later. If you think of 
latency as the time between a process unblocking and the time when it 
gets CPU, that is a more traditional interpretation. I'm not really sure 
latency and CPU-starved are compatible.


I would like to see processes at the head of the queue (for latency) 
which were blocked for long term events, keyboard input, network input, 
mouse input, etc. Then processes blocked for short term events like 
disk, then processes which exhausted their time slice. This helps 
latency and responsiveness, while keeping all processes running.
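
To make the ordering concrete, here is a toy user-space model of it.
Nothing in it comes from RSDL or the mainline scheduler: the three
classes, the struct and the function names are all invented for the
sketch, and a real run queue would not be sorted with qsort().

```c
#include <stdlib.h>

/* Why a task last stopped running, in the order proposed above. */
enum wake_class {
	WAKE_LONG_WAIT,		/* was blocked on keyboard, mouse, network input */
	WAKE_SHORT_WAIT,	/* was blocked on a short event such as disk I/O */
	WAKE_SLICE_USED		/* exhausted its time slice */
};

struct toy_task {
	int id;
	enum wake_class cls;
};

/* qsort() comparator: lower class value means closer to the queue head. */
static int by_wake_class(const void *a, const void *b)
{
	const struct toy_task *ta = a;
	const struct toy_task *tb = b;

	return (int)ta->cls - (int)tb->cls;
}

/* Reorders q[0..n-1] so long-wait tasks come first, slice-exhausted last. */
static void order_run_queue(struct toy_task *q, size_t n)
{
	qsort(q, n, sizeof(*q), by_wake_class);
}
```

Every task still gets to run; only the position in the queue changes,
which is the point of the proposal.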


A variation is to give those processes at the head of the queue short


Fundamentally, the scheduler cannot do it by itself. You can create cases
where the load is precisely identical and one person wants X and another
person wants Y. The scheduler cannot know what's important to you.

DS





--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


Re: max_loop limit

2007-03-28 Thread Karel Zak
On Thu, Mar 22, 2007 at 04:09:13PM +, Pádraig Brady wrote:
> William Lee Irwin III wrote:
> > Any chance we can get some kind of devices set up for partitions of
> > loop devices if we're going to redo loopdev setup? That's been a thorn
> > in my side for some time.
> 
> This script might be of use:
> http://www.pixelbeat.org/scripts/lomount.sh

 Ah, lomount... very popular name ;-) Xen guys have lomount too.
 
 Unfortunately, these solutions are useless with LVM volumes.
 kpartx is more usable:

 
http://fedoraproject.org/wiki/FedoraXenQuickstartFC6?highlight=%28Xen%29#head-9c5408e750e8184aece3efe822be0ef6dd1871cd

Karel

-- 
 Karel Zak  <[EMAIL PROTECTED]>


Re: [patch] cache pipe buf page address for non-highmem arch

2007-03-28 Thread Andrew Morton
On Wed, 28 Mar 2007 16:21:04 -0700
Zach Brown <[EMAIL PROTECTED]> wrote:

> > Does this look OK?
> 
> Almost...
> 
> > #ifdef CONFIG_HIGHMEM
> > static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
> > #else   /* CONFIG_HIGHMEM */
> > static inline void pipe_kunmap_atomic(struct page *page, enum  
> > km_type type)
> 

OK, I give up.  What are you telling me here?



argh, enum km_type isn't defined if !CONFIG_HIGHMEM, which is extravagantly
dumb.


From: Andrew Morton <[EMAIL PROTECTED]>

Cc: "Ken Chen" <[EMAIL PROTECTED]>
Cc: Zach Brown <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/pipe.c |   31 +--
 1 files changed, 25 insertions(+), 6 deletions(-)

diff -puN fs/pipe.c~cache-pipe-buf-page-address-for-non-highmem-arch-fix-tidy 
fs/pipe.c
--- a/fs/pipe.c~cache-pipe-buf-page-address-for-non-highmem-arch-fix-tidy
+++ a/fs/pipe.c
@@ -22,17 +22,36 @@
 #include 
 
 #ifdef CONFIG_HIGHMEM
-#define pipe_kmap          kmap
-#define pipe_kmap_atomic   kmap_atomic
-#define pipe_kunmap        kunmap
-#define pipe_kunmap_atomic kunmap_atomic
+static inline void *pipe_kmap(struct page *page)
+{
+   return kmap(page);
+}
+
+static inline void pipe_kunmap(struct page *page)
+{
+   kunmap(page);
+}
+
+static inline void *pipe_kmap_atomic(struct page *page, enum km_type type)
+{
+   return kmap_atomic(page, type);
+}
+
+static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
+{
+   kunmap_atomic(addr, type);
+}
 #else  /* CONFIG_HIGHMEM */
 static inline void *pipe_kmap(struct page *page)
 {
-   return (void *) page->private;
+   return (void *)page->private;
 }
+
+static inline void pipe_kunmap(struct page *page)
+{
+}
+
 #define pipe_kmap_atomic(page, type)   pipe_kmap(page)
-#define pipe_kunmap(page)  do { } while (0)
 #define pipe_kunmap_atomic(page, type) do { } while (0)
 #endif
 
_
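
The patch above turns pipe_kunmap() from an argument-dropping macro into
an empty inline function.  A minimal user-space sketch of why that
matters follows; stub_macro(), stub_inline() and the counter are made-up
names, not anything from fs/pipe.c.

```c
/* Macro stub: the argument text is discarded by the preprocessor. */
#define stub_macro(arg)	do { } while (0)

/* Inline stub: the argument is still type-checked and evaluated. */
static inline void stub_inline(int arg)
{
	(void)arg;
}

static int calls;

/* Helper with a visible side effect, so evaluation can be counted. */
static int next_value(void)
{
	return ++calls;
}

/* Returns how many times next_value() actually ran. */
static int count_evaluations(void)
{
	calls = 0;
	stub_macro(next_value());	/* expands to nothing: no call is made */
	stub_inline(next_value());	/* argument is evaluated exactly once */
	return calls;
}
```

count_evaluations() returns 1.  Because the macro never shows its
argument to the compiler, a misspelled variable or a wrong type passed
to it would also compile silently, whereas the inline stub rejects it.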



[PATCH][2.6.21] uml: fix unreasonably long udelay

2007-03-28 Thread Paolo 'Blaisorblade' Giarrusso
Currently we have a confused udelay implementation.

* __const_udelay does not accept usecs but xloops on i386 and x86_64
* our implementation requires usecs as arg
* it gets an xloops count when called by asm/arch/delay.h

Bugs related to this (extremely long shutdown times) were reported by some
x86_64 users, especially using Device Mapper.

To hit this bug, a compile-time constant time parameter must be passed - that's
why UML seems to work most times.
Fix this with a simple udelay implementation.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
---

 arch/um/sys-i386/delay.c   |   11 ---
 arch/um/sys-x86_64/delay.c |   11 ---
 include/asm-um/delay.h |   17 ++---
 3 files changed, 14 insertions(+), 25 deletions(-)

diff --git a/arch/um/sys-i386/delay.c b/arch/um/sys-i386/delay.c
index 2c11b97..d623e07 100644
--- a/arch/um/sys-i386/delay.c
+++ b/arch/um/sys-i386/delay.c
@@ -27,14 +27,3 @@ void __udelay(unsigned long usecs)
 }
 
 EXPORT_SYMBOL(__udelay);
-
-void __const_udelay(unsigned long usecs)
-{
-   int i, n;
-
-   n = (loops_per_jiffy * HZ * usecs) / MILLION;
-for(i=0;i 2) ? \
+   __bad_udelay() : __udelay(n))
+
+/* It appears that ndelay is not used at all for UML, and has never been
+ * implemented. */
+extern void __unimplemented_ndelay(void);
+#define ndelay(n) __unimplemented_ndelay()
+
 #endif
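
To give a feel for the size of the error described in the changelog,
here is a small arithmetic sketch.  It assumes the i386 convention in
which a microsecond count is scaled by 0x10c7 (2^32 / 10^6, rounded up)
before being handed to __const_udelay(); the helper names are invented
for the example.

```c
#include <stdint.h>

#define XLOOPS_PER_USEC	0x10c7ULL	/* 2^32 / 10^6, rounded up (4295) */

/* Scales a microsecond count into the xloops unit. */
static uint64_t usecs_to_xloops(uint64_t usecs)
{
	return usecs * XLOOPS_PER_USEC;
}

/*
 * Delay (in microseconds) that results when a callee treats the
 * already-scaled xloops value as if it were a microsecond count.
 */
static uint64_t misread_delay_usecs(uint64_t intended_usecs)
{
	return usecs_to_xloops(intended_usecs);
}
```

Under that assumption a constant 100 microsecond delay becomes 429500
microseconds, roughly 0.43 seconds, which fits the reports of very long
shutdown times.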





Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Alan Cox
On Wed, 28 Mar 2007 15:01:31 -0700
William Lee Irwin III <[EMAIL PROTECTED]> wrote:

> On Wed, Mar 28, 2007 at 02:38:55PM -0700, Christoph Lameter wrote:
> > No that was described in the patch. Quote:
> > "i386 only provides support for caching constructed pgd and pmds. These 
> > are comparatively rare to ptes so it is no surprise that the current 
> > approach has only minimal effect. "

Whatever it was originally for, and public or not, the above isn't true
for some non-Intel products...


Re: Odd log message associated with NFS

2007-03-28 Thread J. Bruce Fields
On Wed, Mar 28, 2007 at 07:05:36PM +, Thorsten Kranzkowski wrote:
> I'll let a tcpdump run this evening and see if I can correlate the message
> with anything. 
> 
> If you have a printk or other patch for me to try, just let me know.

Well, just for fun, you could try something like this--should dump some
data the first time it hits the "bad direction" error.

--b.

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index 9e340fa..c2349d2 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -35,6 +35,45 @@ struct xdr_netobj {
  */
 typedef int(*kxdrproc_t)(void *rqstp, __be32 *data, void *obj);
 
+/* dump the buffer in `emacs-hexl' style */
+#define isprintable(c)  ((c > 0x1f) && (c < 0x7f))
+
+static inline void dump_hex(void *p, u_int length)
+{
+   u_int i, j, jm;
+   u8 c, *cp;
+
+   printk("RPC: print_hexl: length %d\n",length);
+   cp = p;
+
+   for (i = 0; i < length; i += 0x10) {
+   printk("  %04x: ", (u_int)i);
+   jm = length - i;
+   jm = jm > 16 ? 16 : jm;
+
+   for (j = 0; j < jm; j++) {
+   if ((j % 2) == 1)
+   printk("%02x ", (u_int)cp[i+j]);
+   else
+   printk("%02x", (u_int)cp[i+j]);
+   }
+   for (; j < 16; j++) {
+   if ((j % 2) == 1)
+   printk("   ");
+   else
+   printk("  ");
+   }
+   printk(" ");
+
+   for (j = 0; j < jm; j++) {
+   c = cp[i+j];
+   c = isprintable(c) ? c : '.';
+   printk("%c", c);
+   }
+   printk("\n");
+   }
+}
+
 /*
  * Basic structure for transmission/reception of a client XDR message.
  * Features a header (for a linear buffer containing RPC headers
@@ -61,6 +100,18 @@ struct xdr_buf {
 
 };
 
+static inline void dump_xdr_buf(struct xdr_buf *buf)
+{
+   printk("buf->head[0].iov_base = %p, buf->head[0].iov_len = %d\n",
+   buf->head[0].iov_base, buf->head[0].iov_len);
+   printk("buf->tail[0].iov_base = %p, buf->tail[0].iov_len = %d\n",
+   buf->tail[0].iov_base, buf->tail[0].iov_len);
+   printk("pages = %p, page_base = %d, page_len = %d\n",
+   buf->pages, buf->page_base, buf->page_len);
+   printk("buflen = %d, len = %d\n", buf->buflen, buf->len);
+   return;
+}
+
 /*
  * pre-xdr'ed macros.
  */
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index b4db53f..977056e 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -776,6 +776,26 @@ svc_register(struct svc_serv *serv, int proto, unsigned 
short port)
return error;
 }
 
+static void
+dump_once(struct svc_rqst *rqstp, __be32 *orig_start)
+{
+   static int done = 0;
+   struct kvec *argv = &rqstp->rq_arg.head[0];
+   char buf[RPC_MAX_ADDRBUFLEN];
+
+   if (done)
+   return;
+   done++;
+
+   printk("dumping request; rq_addr = %s, rq_deferred = %p, rq_arg:\n",
+   svc_print_addr(rqstp, buf, sizeof(buf)), rqstp->rq_deferred);
+   dump_xdr_buf(&rqstp->rq_arg);
+
+   printk("head data (from %p):\n", orig_start);
+   dump_hex(orig_start, (argv->iov_base + argv->iov_len)
+   - (void *)orig_start);
+}
+
 /*
  * Process the RPC request.
  */
@@ -794,6 +814,7 @@ svc_process(struct svc_rqst *rqstp)
__be32  auth_stat, rpc_stat;
int auth_res;
__be32  *reply_statp;
+   __be32  *start;
 
rpc_stat = rpc_success;
 
@@ -819,6 +840,7 @@ svc_process(struct svc_rqst *rqstp)
if (rqstp->rq_prot == IPPROTO_TCP)
svc_putnl(resv, 0);
 
+   start = argv->iov_base;
rqstp->rq_xid = svc_getu32(argv);
svc_putu32(resv, rqstp->rq_xid);
 
@@ -971,6 +993,7 @@ err_short_len:
 err_bad_dir:
if (net_ratelimit())
printk("svc: bad direction %d, dropping request\n", dir);
+   dump_once(rqstp, start);
 
serv->sv_stats->rpcbadfmt++;
goto dropit;/* drop request */


Re: [patch] cache pipe buf page address for non-highmem arch

2007-03-28 Thread Zach Brown

Does this look OK?


Almost...


#ifdef CONFIG_HIGHMEM
static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
#else   /* CONFIG_HIGHMEM */
static inline void pipe_kunmap_atomic(struct page *page, enum  
km_type type)


- z


Inlining can be _very_bad...

2007-03-28 Thread J.A. Magallón
Hi all...

I post this here as it may be of direct interest for kernel development
(I recall many discussions about whether or not to inline...).

While testing other problems, I finally ran into this issue: the same short
and stupid loop takes 3 to 5 times longer when it is in main() than
when it is in an out-of-line function. The same (bad thing) happens if
the function is inlined.

The basic code is like this:

float   data[];

[inline] double one()
{
double sum;
sum = 0;
for (i=0; i tst
T0: 1145.12 ms
S0: 268435456.00
T1: 457.19 ms
S1: 268435456.00

With one() inlined:

apolo:~/e4> tst
T0: 1200.52 ms
S0: 268435456.00
T1: 1200.14 ms
S1: 268435456.00

Looking at the assembler, the non-inlined version does:

.L2:
        cvtss2sd (%rdx,%rax,4), %xmm0
        incq     %rax
        cmpq     $268435456, %rax
        addsd    %xmm0, %xmm1
        jne      .L2

and the inlined

.L13:
        cvtss2sd (%rdx,%rax,4), %xmm0
        incq     %rax
        cmpq     $268435456, %rax
        addsd    8(%rsp), %xmm0
        movsd    %xmm0, 8(%rsp)
        jne      .L13

It looks like it is updating the stack on each iteration... This is -march=opteron
code; -march=pentium4 is similar. Same behaviour with gcc3 and gcc4.
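
For reference, the kind of loop being measured can be sketched as follows.
This is a minimal sketch, not the attached tst.c: the name sum_floats and
its parameters are illustrative assumptions. The accumulator is a plain
local variable, which leaves the compiler free to keep it in a register
(as in the .L2 loop) instead of reloading it from the stack on every
iteration (as in the .L13 loop).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from tst.c: sum n floats into a double. */
double sum_floats(const float *data, size_t n)
{
	double sum = 0.0;	/* local accumulator; may live in a register */
	size_t i;

	assert(data != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += data[i];	/* each float is widened to double first */
	return sum;
}
```

Comparing the assembler generated for a loop like this, inlined and not
inlined, is the quickest way to see whether the accumulator stays in a
register.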

tst.c and Makefile attached.

Nice, isn't it? Please show me where my mistake is...

--
J.A. Magallon  \   Software is like sex:
 \ It's better when it's free
Mandriva Linux release 2007.1 (Cooker) for i586
Linux 2.6.20-jam06 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #1 SMP PREEMPT


Makefile
Description: Binary data
#include 
#include 
#include 

#define SIZE 256*1024*1024

#define elap(t0,t1) \
	((1000*t1.tv_sec+0.001*t1.tv_usec) - (1000*t0.tv_sec+0.001*t0.tv_usec))

double  one();

float	*data;

#ifdef INLINE
inline
#endif
double one()
{
	int i;
	double sum;

	sum = 0;
	asm("#FBGN");
	for (i=0; i

Re: [patch] cache pipe buf page address for non-highmem arch

2007-03-28 Thread Andrew Morton
On Tue, 27 Mar 2007 15:57:53 -0700
Zach Brown <[EMAIL PROTECTED]> wrote:

> > +#define pipe_kmap_atomic(page, type)   pipe_kmap(page)
> > +#define pipe_kunmap(page)  do { } while (0)
> > +#define pipe_kunmap_atomic(page, type) do { } while (0)
> 
> Please don't drop arguments in stubs.  It can let completely broken  
> code compile, like:
> 
>   pipe_kunmap(SOME_COMPLETE_NONSENSE);
> 
> Static inlines with empty bodies are the gold standard.
> 

yup.



Does this look OK?

#ifdef CONFIG_HIGHMEM
static inline void *pipe_kmap(struct page *page)
{
return kmap(page);
}

static inline void pipe_kunmap(struct page *page)
{
kunmap(page);
}

static inline void *pipe_kmap_atomic(struct page *page, enum km_type type)
{
return kmap_atomic(page, type);
}

static inline void pipe_kunmap_atomic(void *addr, enum km_type type)
{
kunmap_atomic(addr, type);
}
#else   /* CONFIG_HIGHMEM */
static inline void *pipe_kmap(struct page *page)
{
return (void *)page->private;
}

static inline void pipe_kunmap(struct page *page)
{
}

static inline void *pipe_kmap_atomic(struct page *page, enum km_type type)
{
return (void *)page->private;
}

static inline void pipe_kunmap_atomic(struct page *page, enum km_type type)
{
}
#endif
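
The difference between the two kinds of stub can be tried outside the
kernel. The following is a minimal userspace sketch, assuming a stand-in
struct page that has only a private field; the demo_ names are
hypothetical and not part of the patch. The macro stub discards its
argument, so the compiler never looks at it, while the empty static
inline still type-checks every call.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; only the field used here. */
struct page {
	unsigned long private;
};

/* Macro stub: the argument vanishes, so nonsense arguments still compile. */
#define demo_kunmap_macro(page)	do { } while (0)

/* Empty static inline: generates no work, but the argument type is checked. */
static inline void demo_kunmap(struct page *page)
{
	(void)page;
}

/* Non-highmem style kmap: the address is assumed cached in page->private. */
static inline void *demo_kmap(struct page *page)
{
	assert(page != NULL);
	return (void *)page->private;
}
```

With these definitions, demo_kunmap_macro(SOME_COMPLETE_NONSENSE) compiles
silently, whereas demo_kunmap(SOME_COMPLETE_NONSENSE) is rejected by the
compiler because the identifier is undeclared.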



Re: possible mistake in linux kernel header file -- kernel: 2.6.16.29 file: mod_devicetable.h

2007-03-28 Thread Robert Hancock

smitchel wrote:
> I am not sure where to post this, maybe you can direct me what to do, if
> anything.
>
> We have two computers running slackware for amd64 version 11.0.
> Tonight we compiled mplayer on each of the systems.
>
> On the first, everything compiled fine--it has a core 2 duo cpu and is
> running a stock kernel off the install DVD for slackware-amd64.
>
> it is kernel 2.6.16.29.
>
> On the second it would not compile, and it has dual opteron 250 cpus and
> is running a kernel that we compiled to add some things to
> for sound, etc.   This was from a kernel source that we downloaded a few
> days ago.
>
> it is kernel 2.6.16.29--same as first machine.
>
> The error is stopping in the file /usr/include/linux/mod_devicetable.h.
>
> It appears that there are 4 extra lines that have been added to the
> mod_devicetable.h that was part of the kernel source that we downloaded.
>
> They are in the first screenful of the file:
>
> #ifdef __KERNEL__
> #include 
> typedef unsigned long kernel_ulong_t;
> #endif
>
> They are not in the same file in the kernel source from the slackware
> amd-64 install DVD. ( included somewhere else?)
>
> Googling we found:
> __KERNEL__ is defined for programs that run in kernel mode instead of
> user programs (whatever that means).
>
> A few lines later in mod_devicetable.h it uses the type kernel_ulong_t
> (in the same file--what if the ifdef path is not taken?)


These compile errors are from compiling mplayer? Something is not right 
here, it shouldn't be including that header file at all - and I'm not 
sure how anything in /usr/include could be ending up trying to do so.


__KERNEL__ is only supposed to be defined when building the kernel 
itself. Current kernels (not sure if 2.6.16 had this though) have a 
process which generates header files suitable for userspace from the 
kernel's header files and strips out everything inside #ifdef __KERNEL__.
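
As a rough illustration of the guard being discussed, the fragment below
is a hypothetical header excerpt, not the real mod_devicetable.h; all
demo_ and DEMO_ names are made up for this sketch. When the file is
compiled as part of an ordinary user program, __KERNEL__ is not defined,
so the typedef inside the guard is never seen and any later use of that
type would fail to compile.

```c
#include <assert.h>

#ifdef __KERNEL__
/* Only visible when building the kernel itself. */
typedef unsigned long demo_kernel_ulong_t;
#define DEMO_HAVE_KERNEL_TYPES 1
#else
/* Userspace build: the kernel-only typedef above does not exist here. */
#define DEMO_HAVE_KERNEL_TYPES 0
#endif

/* Returns 1 when compiled with __KERNEL__ defined, 0 otherwise. */
static inline int demo_built_for_kernel(void)
{
	return DEMO_HAVE_KERNEL_TYPES;
}

/* Sanity check meant to be called from a plain user program. */
static inline void demo_check_userspace_build(void)
{
	assert(demo_built_for_kernel() == 0);
}
```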


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: [PATCH] sched: staircase deadline misc fixes

2007-03-28 Thread Con Kolivas
On Thursday 29 March 2007 04:48, Ingo Molnar wrote:
> hm, how about the questions Mike raised (there were a couple of cases of
> friction between 'the design as documented and announced' and 'the code
> as implemented')? As far as i saw they were still largely unanswered -
> but let me know if they are all answered and addressed:

I spent less time emailing and more time coding. I have been working on 
addressing whatever people brought up.

>  http://marc.info/?l=linux-kernel&m=117465220309006&w=2

Attended to.

>  http://marc.info/?l=linux-kernel&m=117489673929124&w=2

Attended to.

>  http://marc.info/?l=linux-kernel&m=117489831930240&w=2

Checked fine.

> and the numbers he posted:
>
>  http://marc.info/?l=linux-kernel&m=117448900626028&w=2

Attended to.

> his test conclusion was that under CPU load, RSDL (SD) generally does
> not hold up to mainline's interactivity.

There have been improvements since the earlier iterations but it's still a 
fairness based design. Mike's "sticking point" test case should be improved 
as well.

My call based on my own testing and feedback from users is: 

Under niced loads it is 99% in favour of SD.

Under light loads it is 95% in favour of SD.

Under heavy loads it becomes proportionately in favour of mainline. The 
crossover is somewhere around a load of 4.

If the reluctance to renice X goes away I'd say it was 99% across the board 
and to much higher loads.

>   Ingo

-- 
-ck


Re: [RFT] e100 driver on ARM

2007-03-28 Thread Kok, Auke

Lennert Buytenhek wrote:

> On Mon, Sep 04, 2006 at 06:39:29AM -0400, Jeff Garzik wrote:
>
> > 1) Does e100 driver work on ARM?
>
> FWIW, e100 seems to work okay for me on an intel ixp2400 (xscale based)
> board, an ixp2850 (xscale based) board and an ixp2350 (xscale3 based)
> board.  ixp2350 works both with hardware coherency turned on (cpu
> snoops bus) and turned off (manual dma cache clean/invalidate as usual.)
>
> As for the other ARM platforms that I'm interested in / have hardware
> for / maintain, the at91/ep93xx/pxa270 don't have PCI, and the other
> two (iop32x/iop33x) I can't test because I don't have such systems with
> e100 NICs, but I expect those would work, since they're both xscale
> based like the ixp2400, and the ixp2400 works.


I just got an iop342 board dropped on my lap. Once it's running, I'll make sure 
to make this the first thing to test.


Cheers,

Auke


Re: [PATCH] Add support for deferrable timers (respun-Mar28)

2007-03-28 Thread Venki Pallipadi

Andrew,

Please drop the patch you included yesterday and two incremental patches and
use the patch below.

This patch is - yesterday's patch + Your tidy cleanup +
minor changes based on comments from Oleg and Andi. This is a lot
cleaner (and smaller) than earlier patches.

Thanks,
Venki


Introduce a new flag for timers - deferrable:
Deferrable timers work normally when the system is busy, but they will not
cause a CPU to come out of idle just to service them. Instead, such a timer
is serviced when the CPU eventually wakes up for a subsequent
non-deferrable timer.

The main advantage of this is to avoid unnecessary timer interrupts when the
CPU is idle. If the routine currently called by a timer can wait until the
next event without any issues, this new timer can be used to set up the timer
event for that routine. Together with dynticks, this allows CPUs to be lazy
and stay idle for extended periods of time by reducing unnecessary wakeups,
thereby reducing power consumption.

This patch:
Builds this new timer on top of the existing timer infrastructure. It uses
the lowest bit of the 'base' pointer in the timer_list structure to store
the deferrable timer flag. The __next_timer_interrupt() function
skips over these deferrable timers when the CPU looks for the
next timer event for which it has to wake up.

This is exported by a new interface, init_timer_deferrable(), that can
be called in place of the regular init_timer().
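
The flag-in-pointer trick can be tried in userspace. This is a minimal
sketch under stated assumptions: struct demo_base and the demo_ helpers
are hypothetical stand-ins for tvec_base_t and the tbase_/timer_ helpers
in the patch, and the base object is assumed to be at least 2-byte
aligned so that bit 0 of its address is always zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DEFERRABLE_FLAG ((uintptr_t)1)

struct demo_base { long dummy; };	/* aligned to more than 2 bytes */
struct demo_timer { struct demo_base *base; };

/* Extract the flag stored in the lowest bit of the pointer. */
static inline unsigned int demo_get_deferrable(struct demo_base *base)
{
	return (unsigned int)((uintptr_t)base & DEMO_DEFERRABLE_FLAG);
}

/* Strip the flag to recover the real pointer. */
static inline struct demo_base *demo_get_base(struct demo_base *base)
{
	return (struct demo_base *)((uintptr_t)base & ~DEMO_DEFERRABLE_FLAG);
}

/* Mark the timer deferrable without changing which base it points to. */
static inline void demo_set_deferrable(struct demo_timer *t)
{
	t->base = (struct demo_base *)((uintptr_t)t->base | DEMO_DEFERRABLE_FLAG);
}

/* Change the base while preserving the flag already stored in the timer. */
static inline void demo_set_base(struct demo_timer *t, struct demo_base *new_base)
{
	assert(((uintptr_t)new_base & DEMO_DEFERRABLE_FLAG) == 0);
	t->base = (struct demo_base *)((uintptr_t)new_base |
				       demo_get_deferrable(t->base));
}
```

Note that every reader of the pointer has to go through the stripping
helper, which is why the patch touches each place that used timer->base
directly.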

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: new/kernel/timer.c
===
--- new.orig/kernel/timer.c 2007-03-22 16:27:44.0 -0800
+++ new/kernel/timer.c  2007-03-28 10:05:38.0 -0800
@@ -74,7 +74,7 @@
tvec_t tv3;
tvec_t tv4;
tvec_t tv5;
-} cacheline_aligned_in_smp;
+} cacheline_aligned;
 
 typedef struct tvec_t_base_s tvec_base_t;
 
@@ -82,6 +82,37 @@
 EXPORT_SYMBOL(boot_tvec_bases);
 static DEFINE_PER_CPU(tvec_base_t *, tvec_bases) = &boot_tvec_bases;
 
+/*
+ * Note that all tvec_bases is 2 byte aligned and lower bit of
+ * base in timer_list is guaranteed to be zero. Use the LSB for
+ * the new flag to indicate whether the timer is deferrable
+ */
+#define TBASE_DEFERRABLE_FLAG  (0x1)
+
+/* Functions below help us manage 'deferrable' flag */
+static inline unsigned int tbase_get_deferrable(tvec_base_t *base)
+{
+   return ((unsigned int)(unsigned long)base & TBASE_DEFERRABLE_FLAG);
+}
+
+static inline tvec_base_t *tbase_get_base(tvec_base_t *base)
+{
+   return ((tvec_base_t *)((unsigned long)base & ~TBASE_DEFERRABLE_FLAG));
+}
+
+static inline void timer_set_deferrable(struct timer_list *timer)
+{
+   timer->base = ((tvec_base_t *)((unsigned long)(timer->base) |
+  TBASE_DEFERRABLE_FLAG));
+}
+
+static inline void
+timer_set_base(struct timer_list *timer, tvec_base_t *new_base)
+{
+   timer->base = (tvec_base_t *)((unsigned long)(new_base) |
+ tbase_get_deferrable(timer->base));
+}
+
 /**
  * __round_jiffies - function to round jiffies to a full second
  * @j: the time in (absolute) jiffies that should be rounded
@@ -295,6 +326,13 @@
 }
 EXPORT_SYMBOL(init_timer);
 
+void fastcall init_timer_deferrable(struct timer_list *timer)
+{
+   init_timer(timer);
+   timer_set_deferrable(timer);
+}
+EXPORT_SYMBOL(init_timer_deferrable);
+
 static inline void detach_timer(struct timer_list *timer,
int clear_pending)
 {
@@ -325,10 +363,11 @@
tvec_base_t *base;
 
for (;;) {
-   base = timer->base;
+   tvec_base_t *prelock_base = timer->base;
+   base = tbase_get_base(prelock_base);
if (likely(base != NULL)) {
spin_lock_irqsave(&base->lock, *flags);
-   if (likely(base == timer->base))
+   if (likely(prelock_base == timer->base))
return base;
/* The timer has migrated to another CPU */
spin_unlock_irqrestore(&base->lock, *flags);
@@ -365,11 +404,11 @@
 */
if (likely(base->running_timer != timer)) {
/* See the comment in lock_timer_base() */
-   timer->base = NULL;
+   timer_set_base(timer, NULL);
spin_unlock(&base->lock);
base = new_base;
spin_lock(&base->lock);
-   timer->base = base;
+   timer_set_base(timer, base);
}
}
 
@@ -397,7 +436,7 @@
timer_stats_timer_set_start_info(timer);
BUG_ON(timer_pending(timer) || !timer->function);
spin_lock_irqsave(&base->lock, flags);
-   timer->base = base;
+   timer_set_base(timer, base);
internal_add_timer(base, timer);
spin_unlock_irqrestore(&base->lock, flags);
 }
@@ -548,7 

[Repost][PATCH] Remove "obsolete" label from ISDN4Linux

2007-03-28 Thread Tilman Schmidt
From: Tilman Schmidt <[EMAIL PROTECTED]>

Remove incorrect "obsolete" label from ISDN4Linux.

Signed-off-by: Tilman Schmidt <[EMAIL PROTECTED]>

---

--- a/drivers/isdn/Kconfig  2006-11-29 22:57:37.0 +0100
+++ b/drivers/isdn/Kconfig  2007-02-21 01:19:19.0 +0100
@@ -25,7 +25,7 @@ menu "Old ISDN4Linux"
depends on NET && ISDN

 config ISDN_I4L
-   tristate "Old ISDN4Linux (obsolete)"
+   tristate "Old ISDN4Linux subsystem"
---help---
  This driver allows you to use an ISDN-card for networking
  connections and as dialin/out device.  The isdn-tty's have a built
@@ -38,8 +38,8 @@ config ISDN_I4L

  ISDN support in the linux kernel is moving towards a new API,
  called CAPI (Common ISDN Application Programming Interface).
- Therefore the old ISDN4Linux layer is becoming obsolete. It is
- still usable, though, if you select this option.
+ The old ISDN4Linux layer is still available for use with cards
+ that are not supported by the new CAPI subsystem yet.

 if ISDN_I4L
 source "drivers/isdn/i4l/Kconfig"

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
In theory, there is no difference between theory and practice.
In practice, there is.





[PATCH -rt] Fix build on MIPS

2007-03-28 Thread Deepak Saxena

Extra #endif got into atomic.h

Signed-off-by: Deepak Saxena <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/include/asm-mips/atomic.h
===
--- linux-2.6.21-rc5.orig/include/asm-mips/atomic.h
+++ linux-2.6.21-rc5/include/asm-mips/atomic.h
@@ -566,7 +566,6 @@ static __inline__ long atomic64_add_retu
raw_local_irq_restore(flags);
}
 #endif
-#endif
 
smp_mb();
 

-- 
Deepak Saxena - [EMAIL PROTECTED] - http://www.plexity.net

In the end, they will not say, "those were dark times,"  they will ask
"why were their poets silent?" - Bertolt Brecht


Linux 2.6.16.46-rc1

2007-03-28 Thread Adrian Bunk
Location:
ftp://ftp.kernel.org/pub/linux/kernel/people/bunk/linux-2.6.16.y/testing/

git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git

RSS feed of the git tree:
http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.16.y.git;a=rss


Changes since 2.6.16.45:

Adrian Bunk (1):
  Linux 2.6.16.46-rc1

Akinobu Mita (1):
  md: fix /proc/mdstat refcounting

Alan Stern (10):
  usb-storage: unusual_devs entry for Nikon DSC D70s
  USB: unusual_devs entry for Nokia N80
  USB: unusual_devs entry for Nokia N91
  USB: unusual_devs entry for Nokia E61
  USB: unusual_devs entry for Lacie DVD+-RW
  USB: unusual-devs entry for Nokia E60
  USB: unusual_devs entry for Nokia 6131
  USB: unusual_devs entry for Nokia 6234
  unusual_devs update for UCR-61S2B
  USB: unusual_devs update for Sony P990i phone

Amol Lad (1):
  sound/pci/au88x0/au88x0.c: ioremap balanced with iounmap

Andrew Nayenko (1):
  USB storage: Nokia 6288 unusual_devs entry

Andy Isaacson (1):
  fix read past end of array in md/linear.c

Clemens Ladisch (1):
  usb-audio: work around wrong frequency in CM6501 descriptors

David Kuehling (1):
  USB: unusual_devs entry for A-VOX WSX-300ER MP3 player

Davide Perini (1):
  usb-storage: unusual_devs entry for Motorola RAZR V3x

Dylan Taft (1):
  USB Storage: US_FL_IGNORE_RESIDUE needed for Aiptek MP3 Player

Eric Sesterhenn (1):
  [ALSA] fix NULL pointer dereference in sound/synth/emux/soundfont.c

Ernis (1):
  USB: unusual_devs entry for Samsung MP3 player

Florin Malita (1):
  [ALSA] Dereference after free in snd_hwdep_release()

Guennadi Liakhovetski (1):
  [PPP]: Don't leak an sk_buff on interface destruction.

Jaco Kroon (1):
  USB: add Digitech USB-Storage to unusual_devs.h

Jürgen Mell (1):
  USB floppy drive SAMSUNG SFD-321U/EP was detected 8 times

Lars Ellenberg (1):
  md: pass down BIO_RW_SYNC in raid{1,10}

Lars Jacob (1):
  USB: unusual_devs entry for Sony DSC-H5

Luiz Fernando N. Capitulino (1):
  USB: unusual_devs.h for Sony floppy

Manuel Osdoba (1):
  USB: unusual_devs.h entry for nokia 6233

Mario Rettig (1):
  USB: unusual_devs entry for Nokia 3250

Mikko Honkala (1):
  USB: Nokia E70 is an unusual device

Neil Brown (2):
  MD: Fix problem where hot-added drives are not resynced.
  md: Fix bug where spares don't always get rebuilt properly when they become live

Nick Piggin (1):
  mm: fix madvise infinine loop

Olivier Blondeau (1):
  USB: storage: atmel unusual dev update

Patrick McHardy (3):
  [NET_SCHED]: Fix endless loops caused by inaccurate qlen counters
  [NET_SCHED]: cls_basic: fix NULL pointer dereference
  [NET_SCHED]: Fix ingress locking

Pete Zaitcev (3):
  USB storage: fix ipod ejecting issue
  USB: unusual_devs.h for 0x046b:ff40
  USB: RAZR v3i unusual_devs

Phil Dibowitz (11):
  USB: storage: sandisk unusual_devices entry
  USB: storage: another unusual_devs.h entry
  USB: storage: unusual_devs.h entry 0420:0001
  USB: Storage: unusual devs update
  USB Storage: US_FL_MAX_SECTORS_64 flag
  USB: another unusual device
  USB Storage: unusual_devs.h for Sony Ericsson M600i
  USB: unusual_dev entry for Sony P990i
  USB: usb-storage: Unusual_dev update
  USB Storage: unusual_devs: add supertop drives
  USB: Fix UCR-61S2B unusual_dev entry

Rodolfo Quesada (1):
  USB: storage: new unusual_devs.h entry: Mitsumi 7in1 Card Reader

Russell King (1):
  [SERIAL] Fix oops when removing suspended serial port

Stefan Richter (1):
  ieee1394: dv1394: fix CardBus card ejection

Takashi Iwai (6):
  [ALSA] hda-codec - Don't return error at initialization of modem codec
  [ALSA] hda-intel - Don't try to probe invalid codecs
  [ALSA] Fix invalid assignment of PCI revision
  [ALSA] cmipci - Fix a typo in 'PC Speaker Playback Switch' control
  [ALSA] cs4281 - Fix the check of right channel
  [ALSA] ca0106 - Add missing sysfs device assignment

Tobias Lorenz (1):
  USB: Mitsumi USB FDD 061M: UNUSUAL_DEV multilun fix

YOSHIFUJI Hideaki (1):
  [IPV6] HASHTABLES: Use appropriate seed for caluculating ehash index.


 Makefile   |2 
 drivers/ieee1394/dv1394.c  |   17 -
 drivers/md/linear.c|2 
 drivers/md/md.c|3 
 drivers/md/raid1.c |   13 -
 drivers/md/raid10.c|   11 -
 drivers/net/ppp_generic.c  |3 
 drivers/serial/serial_core.c   |9 
 drivers/usb/storage/scsiglue.c |   12 -
 drivers/usb/storage/unusual_devs.h |  285 +++--
 drivers/usb/storage/usb.h  |4 
 include/linux/serial_core.h|1 
 include/linux/usb_usual.h  |2 
 include/net/sch_generic.h  |4 
 include/sound/ymfpci.h |2 
 mm/madvise.c 

[PATCH] scsi: megaraid_sas - intercepts cmd timeout and throttle io

2007-03-28 Thread Sumant Patro

The eh_timed_out callback (megasas_reset_timer) is used to throttle I/O to
the adapter when it is called the first time for a scmd. The
MEGASAS_FW_BUSY flag is set and can_queue is reduced to 16. can_queue is
restored from the completion routine once both of the following conditions
hold: 5 seconds have elapsed and the number of outstanding cmds in FW
is < 17.
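
The throttling logic described above can be modelled as a small state
machine. This is a userspace sketch under stated assumptions, not the
driver code: struct demo_adapter, the demo_ functions and the use of
plain seconds instead of jiffies are all hypothetical, and locking is
left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FW_BUSY		0x1
#define DEMO_THROTTLE_QUEUE	16	/* can_queue while throttled */
#define DEMO_RESTORE_DELAY	5	/* seconds before a restore is considered */
#define DEMO_RESTORE_LIMIT	17	/* restore only below this many cmds */

struct demo_adapter {
	int flag;
	int can_queue;
	int max_queue;		/* value can_queue is restored to */
	long last_time;		/* time (seconds) when throttling started */
	int fw_outstanding;	/* commands currently held by the firmware */
};

/* Timeout path: throttle only if not already throttled. */
static void demo_on_timeout(struct demo_adapter *a, long now)
{
	assert(a != NULL);
	if (!(a->flag & DEMO_FW_BUSY)) {
		a->can_queue = DEMO_THROTTLE_QUEUE;
		a->last_time = now;
		a->flag |= DEMO_FW_BUSY;
	}
}

/* Completion path: restore can_queue once both conditions hold. */
static void demo_on_completion(struct demo_adapter *a, long now)
{
	assert(a != NULL);
	if ((a->flag & DEMO_FW_BUSY) &&
	    now > a->last_time + DEMO_RESTORE_DELAY &&
	    a->fw_outstanding < DEMO_RESTORE_LIMIT) {
		a->flag &= ~DEMO_FW_BUSY;
		a->can_queue = a->max_queue;
	}
}
```

In this model a second timeout while the flag is set leaves last_time
untouched, so the 5 second window is measured from the first timeout.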

Signed-off-by: Sumant Patro <[EMAIL PROTECTED]>
---
 drivers/scsi/megaraid/megaraid_sas.c |   65 +++--
 drivers/scsi/megaraid/megaraid_sas.h |   13 +++--
 2 files changed, 70 insertions(+), 8 deletions(-)

This patch requires the patch submitted by James with subject line : 

[PATCH] expose eh_timed_out to the host template

diff -uprN linux-2.6.orig/drivers/scsi/megaraid/megaraid_sas.c linux-2.6.new/drivers/scsi/megaraid/megaraid_sas.c
--- linux-2.6.orig/drivers/scsi/megaraid/megaraid_sas.c 2007-03-28 08:41:49.0 -0700
+++ linux-2.6.new/drivers/scsi/megaraid/megaraid_sas.c  2007-03-28 08:36:38.0 -0700
@@ -10,7 +10,7 @@
  *2 of the License, or (at your option) any later version.
  *
  * FILE: megaraid_sas.c
- * Version : v00.00.03.10-rc1
+ * Version : v00.00.03.10-rc3
  *
  * Authors:
  * (email-id : [EMAIL PROTECTED])
@@ -886,6 +886,7 @@ megasas_queue_command(struct scsi_cmnd *
goto out_return_cmd;
 
cmd->scmd = scmd;
+   scmd->SCp.ptr = (char *)cmd;
 
/*
 * Issue the command to the FW
@@ -981,8 +982,8 @@ static int megasas_generic_reset(struct 
 
instance = (struct megasas_instance *)scmd->device->host->hostdata;
 
-   scmd_printk(KERN_NOTICE, scmd, "megasas: RESET -%ld cmd=%x\n",
-  scmd->serial_number, scmd->cmnd[0]);
+   scmd_printk(KERN_NOTICE, scmd, "megasas: RESET -%ld cmd=%x retries=%x\n",
+scmd->serial_number, scmd->cmnd[0], scmd->retries);
 
if (instance->hw_crit_error) {
printk(KERN_ERR "megasas: cannot recover from previous reset "
@@ -1000,6 +1001,40 @@ static int megasas_generic_reset(struct 
 }
 
 /**
+ * megasas_reset_timer - quiesce the adapter if required
+ * @scmd:  scsi cmnd
+ *
+ * Sets the FW busy flag and reduces the host->can_queue if the
+ * cmd has not been completed within the timeout period.
+ */
+static enum
+scsi_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
+{
+   struct megasas_cmd *cmd = (struct megasas_cmd *)scmd->SCp.ptr;
+   struct megasas_instance *instance;
+   unsigned long flags;
+
+   if (cmd) {
+   if (time_after(jiffies, scmd->jiffies_at_alloc + 170 * HZ))
+   return EH_NOT_HANDLED;
+
+   instance = cmd->instance;
+   if (!(instance->flag & MEGASAS_FW_BUSY)) {
+   /* FW is busy, throttle IO */
+   spin_lock_irqsave(&instance->throttle_io_lock, flags);
+
+   instance->host->can_queue = 16;
+   instance->last_time = jiffies;
+   instance->flag |= MEGASAS_FW_BUSY;
+
+   spin_unlock_irqrestore(&instance->throttle_io_lock, flags);
+   }
+   return EH_RESET_TIMER;
+   }
+   return EH_HANDLED;
+}
+
+/**
  * megasas_reset_device -  Device reset handler entry point
  */
 static int megasas_reset_device(struct scsi_cmnd *scmd)
@@ -1112,6 +1147,7 @@ static struct scsi_host_template megasas
.eh_device_reset_handler = megasas_reset_device,
.eh_bus_reset_handler = megasas_reset_bus_host,
.eh_host_reset_handler = megasas_reset_bus_host,
+   .eh_timed_out = megasas_reset_timer,
.bios_param = megasas_bios_param,
.use_clustering = ENABLE_CLUSTERING,
 };
@@ -1215,9 +1251,8 @@ megasas_complete_cmd(struct megasas_inst
int exception = 0;
struct megasas_header *hdr = &cmd->frame->hdr;
 
-   if (cmd->scmd) {
+   if (cmd->scmd)
cmd->scmd->SCp.ptr = (char *)0;
-   }
 
switch (hdr->cmd) {
 
@@ -1806,6 +1841,7 @@ static void megasas_complete_cmd_dpc(uns
u32 context;
struct megasas_cmd *cmd;
struct megasas_instance *instance = (struct megasas_instance *)instance_addr;
+   unsigned long flags;
 
/* If we have already declared adapter dead, donot complete cmds */
if (instance->hw_crit_error)
@@ -1828,6 +1864,22 @@ static void megasas_complete_cmd_dpc(uns
}
 
*instance->consumer = producer;
+
+   /*
+* Check if we can restore can_queue
+*/
+   if (instance->flag & MEGASAS_FW_BUSY
+   && time_after(jiffies, instance->last_time + 5 * HZ)
+   && atomic_read(&instance->fw_outstanding) < 17) {
+
+   spin_lock_irqsave(&instance->throttle_io_lock, flags);
+
+   instance->flag &= ~MEGASAS_FW_BUSY;
+   instance->host->can_queue =
+   instance->max_fw_cmds - MEGASAS_INT_CMDS;
+
+   

Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Chris Wright
* Zachary Amsden ([EMAIL PROTECTED]) wrote:
> William Lee Irwin III wrote:
> >>clone_pgd_range() for consistency?  and it seems we lost a
> >>paravirt_alloc_pd_clone() in there somewhere.
> >>
> >
> >Yes, another reason why it shouldn't have been posted as-is. It was not
> >intended to for anything more than comparative benchmarking on systems
> >without graphics running on the bare metal as opposed to Xen/etc. guests.
> >  
> 
> So clone_pgd_range is mostly useless now.  Originally, I intended it to 
> take the part of paravirt_alloc_pd_clone.  We should probably merge the 
> two into just one function, unless someone thinks clone_pgd_range is 
> actually useful for something.

No, I was going to suggest just that.  It was originally introduced as
the place holder for that IIRC.

thanks,
-chris


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Chris Wright
* William Lee Irwin III ([EMAIL PROTECTED]) wrote:
> * Christoph Lameter ([EMAIL PROTECTED]) wrote:
> >> +#ifdef CONFIG_HIGHMEM64G
> >> +#define __pgd_alloc() kmem_cache_alloc(pgd_cache, 
> >> GFP_KERNEL|__GFP_REPEAT)
> >> +#define __pgd_free(pgd)   kmem_cache_free(pgd_cache, pgd)
> 
> On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> > I must've glazed over something, I thought this was removal of slabs?
> 
> The pgd slab is not fully removable in the PAE case because a dedicated
> slab is the only way to enforce alignment for allocations as small as
> PAE PGD's.

Heh, yeah "page sized" is the part i glazed over, my fault.

> On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> > BTW, this will interact shared_kernel_pmd patch that Jeremy's posted a
> > few times (I know at least wli has looked over that one).  We need to
> > make sure that PAE under at least Xen hypervisor has a page-sized pgd,
> > although the mmlist chaining looks nice to me.
> 
> That, not to mention the total lack of verification of the pageattr.c
> code, are among the reasons I didn't want it posted.
> 
> 
> * Christoph Lameter ([EMAIL PROTECTED]) wrote:
> >> +  memcpy([USER_PTRS_PER_PGD], _pg_dir[USER_PTRS_PER_PGD],
> >> +  KERNEL_PGD_PTRS*sizeof(pgd_t));
> 
> On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> > clone_pgd_range() for consistency?  and it seems we lost a
> > paravirt_alloc_pd_clone() in there somewhere.
> 
> Yes, another reason why it shouldn't have been posted as-is. It was not
> intended to for anything more than comparative benchmarking on systems
> without graphics running on the bare metal as opposed to Xen/etc. guests.

OK, all good here.  Just wanted to make sure things didn't collide
too badly.

thanks,
-chris


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Zachary Amsden

William Lee Irwin III wrote:

>>clone_pgd_range() for consistency?  and it seems we lost a
>>paravirt_alloc_pd_clone() in there somewhere.
>>
>
>Yes, another reason why it shouldn't have been posted as-is. It was not
>intended to for anything more than comparative benchmarking on systems
>without graphics running on the bare metal as opposed to Xen/etc. guests.
>

So clone_pgd_range is mostly useless now.  Originally, I intended it to 
take the part of paravirt_alloc_pd_clone.  We should probably merge the 
two into just one function, unless someone thinks clone_pgd_range is 
actually useful for something.


Zach


Re: [2.6 patch] the scheduled eepro100 removal

2007-03-28 Thread Jeff Garzik

Kok, Auke wrote:

Bill Davidsen wrote:

Adrian Bunk wrote:

This patch contains the scheduled removal of the eepro100 driver.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>


This keeps coming around, but I haven't seen an answer to the 
questions raised by Eric Piel or Kiszka. I do know that e100 didn't 
work on some IBM rackmount servers and eepro100 did, but since I'm no 
longer responsible for those machines I can't retest. Perhaps someone 
will be able to provide data points.


IBM current offerings as of about three years ago, I had a few dozen 
of them at one time.


We have provided a (test) driver which allows e100 to use IO to 
communicate with the device, which seems to have helped for one person. 
I think we need to work with those changes and see if it helps the other 
people resolve their e100 issues. Unfortunately it keeps slipping off to 
the low priority list for us.


I suggest that we should push this code into -mm for people to test or 
something. It's fairly low risk as by default the patch won't enable IO 
and thus use the old method of writing to the adapter.


Sounds sane to me.  My overall opinion on eepro100 removal is that we're 
not there yet.  Rare problem cases remain where e100 fails but eepro100 
works, and it's older drivers so its low priority for everybody.


Needs to happen, though...

Jeff





Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread William Lee Irwin III
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
>> +#ifdef CONFIG_HIGHMEM64G
>> +#define __pgd_alloc()   kmem_cache_alloc(pgd_cache, 
>> GFP_KERNEL|__GFP_REPEAT)
>> +#define __pgd_free(pgd) kmem_cache_free(pgd_cache, pgd)

On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> I must've glazed over something, I thought this was removal of slabs?

The pgd slab is not fully removable in the PAE case because a dedicated
slab is the only way to enforce alignment for allocations as small as
PAE PGD's.
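
To make the alignment constraint concrete: a PAE PGD holds 4 entries of
8 bytes each, 32 bytes in total, and the hardware requires it to be
32-byte aligned. The following userspace sketch uses C11 aligned_alloc()
as a stand-in for the dedicated slab; demo_pgd_alloc and the DEMO_
constants are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PTRS_PER_PGD	4	/* PAE: 4 top-level entries */
#define DEMO_PGD_SIZE		(DEMO_PTRS_PER_PGD * sizeof(uint64_t))	/* 32 */

/* Allocate a zeroed, naturally aligned 32-byte table; release with free(). */
static uint64_t *demo_pgd_alloc(void)
{
	uint64_t *pgd = aligned_alloc(DEMO_PGD_SIZE, DEMO_PGD_SIZE);

	if (pgd) {
		/* The allocator is asked for size-aligned memory; verify it. */
		assert(((uintptr_t)pgd % DEMO_PGD_SIZE) == 0);
		memset(pgd, 0, DEMO_PGD_SIZE);
	}
	return pgd;
}
```

A general-purpose allocator that only guarantees word alignment would
not satisfy this check for every allocation, which is the role the
dedicated pgd_cache plays in the kernel.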


On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> BTW, this will interact shared_kernel_pmd patch that Jeremy's posted a
> few times (I know at least wli has looked over that one).  We need to
> make sure that PAE under at least Xen hypervisor has a page-sized pgd,
> although the mmlist chaining looks nice to me.

That, not to mention the total lack of verification of the pageattr.c
code, are among the reasons I didn't want it posted.


* Christoph Lameter ([EMAIL PROTECTED]) wrote:
>> +memcpy([USER_PTRS_PER_PGD], _pg_dir[USER_PTRS_PER_PGD],
>> +KERNEL_PGD_PTRS*sizeof(pgd_t));

On Wed, Mar 28, 2007 at 03:26:56PM -0700, Chris Wright wrote:
> clone_pgd_range() for consistency?  and it seems we lost a
> paravirt_alloc_pd_clone() in there somewhere.

Yes, another reason why it shouldn't have been posted as-is. It was not
intended for anything more than comparative benchmarking on systems
without graphics running on the bare metal as opposed to Xen/etc. guests.


-- wli


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread William Lee Irwin III
On Wed, 28 Mar 2007, William Lee Irwin III wrote:
>> I already went over the methodological issues with kernel compiles.
>> You may be able to prove this, but not this way.

On Wed, Mar 28, 2007 at 02:20:20PM -0700, Christoph Lameter wrote:
> But this way is an established kernel way of doing things. Seems that my 
> AIM9 stuff was not convincing and I am not sure what other tests would be 
> acceptable. Could you post some of the data regarding the improvements
> possible through your patches?

What I did, I did a number of years ago. Even if I could find the
results (and I don't even recall order-of-magnitude estimates) they
would be effectively irrelevant to modern kernels. The disaster in
all this was that the PTE caching never got merged. It's not much of
an observation to note that the primary bottleneck is still there
when the patch to resolve it never got merged.

As far as kernel compiles being relevant to anything besides
potentially optimizing a particular major benchmark using gcc as one
of its components... yeah, right. It's too macro to be a microbenchmark
of anything and too micro to be pertinent to any meaningful
macrobenchmark such as those from major benchmark publishers (who can't
be named for trademark/etc. reasons). Hasn't it been at least 5 years
since people figured out kernel compiles were complete bulls**t as
benchmarks along with dbench for other reasons and several others? If
not, I don't know why I bother with this kernel at all.

Even so, I already did this and am done with it. It's not like I'm
not carrying around numerous patches I know will never be merged all
the time anyway. If you want to back it all out so badly, just do it
and stop bothering me about it, and I'll merely continue maintaining my
local patches without ever posting them as I have been for years. I'm
not at all happy with the NIH situation, either, not that I'm at such a
loss for ideas to need to contest every petty NIH that flies past.


-- wli


Re: Linux 2.6.21-rc5

2007-03-28 Thread Tilman Schmidt
On 27.03.2007 08:17, Andrew Morton wrote:
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
[...]
> Maintainers are cc'ed.  Please promptly ack, nack or otherwise quack, else
> I'll be making my own decisions ;)

[CC list trimmed]

It's not on that list, but would you mind slipping
drivers-isdn-gigaset-mark-some-static-data-as-const-v2.patch
into 2.6.21 too? It's largely trivial but I'd like to get it
out of the door.

Thanks,
Tilman

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Chris Wright
* Christoph Lameter ([EMAIL PROTECTED]) wrote:
> +#ifdef CONFIG_HIGHMEM64G
> +#define __pgd_alloc()    kmem_cache_alloc(pgd_cache, GFP_KERNEL|__GFP_REPEAT)
> +#define __pgd_free(pgd)  kmem_cache_free(pgd_cache, pgd)

I must've glazed over something, I thought this was removal of slabs?

BTW, this will interact with the shared_kernel_pmd patch that Jeremy's
posted a few times (I know at least wli has looked over that one).  We need
to make sure that PAE under at least Xen hypervisor has a page-sized pgd,
although the mmlist chaining looks nice to me.

> +static struct kmem_cache *pgd_cache;
> +
> +void __init pgtable_cache_init(void)
> +{
> + pgd_cache = kmem_cache_create("pgd",
> + PTRS_PER_PGD*sizeof(pgd_t),
> + PTRS_PER_PGD*sizeof(pgd_t),
> + SLAB_PANIC,
> + NULL,
> + NULL);
> +}
> +#else /* !CONFIG_HIGHMEM64G */
> +#define __pgd_alloc()    ((pgd_t *)get_zeroed_page(GFP_KERNEL|__GFP_REPEAT))
> +#define __pgd_free(pgd)  free_page((unsigned long)(pgd))
> +#endif /* !CONFIG_HIGHMEM64G */
>  
>  pgd_t *pgd_alloc(struct mm_struct *mm)
>  {
>   int i;
> - pgd_t *pgd = kmem_cache_alloc(pgd_cache, GFP_KERNEL);
> + pgd_t *pgd = __pgd_alloc();
>  
> - if (PTRS_PER_PMD == 1 || !pgd)
> + if (!pgd)
> + return NULL;
> + memcpy(&pgd[USER_PTRS_PER_PGD], &swapper_pg_dir[USER_PTRS_PER_PGD],
> + KERNEL_PGD_PTRS*sizeof(pgd_t));

clone_pgd_range() for consistency?  and it seems we lost a
paravirt_alloc_pd_clone() in there somewhere.
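
A userspace sketch of what the suggestion amounts to, assuming
clone_pgd_range() is a thin wrapper that copies a count of pgd entries
rather than a count of bytes. The sketch_* names and the 768/256 split of a
1024-entry table are illustrative assumptions, not values taken from the
patch.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t val; } sketch_pgd_t;

/* Illustrative sizes only: 1024 entries, the top 256 covering the kernel. */
enum {
    SKETCH_PTRS_PER_PGD      = 1024,
    SKETCH_USER_PTRS_PER_PGD = 768,
    SKETCH_KERNEL_PGD_PTRS   = SKETCH_PTRS_PER_PGD - SKETCH_USER_PTRS_PER_PGD
};

/* Copy `count` entries; the caller never multiplies by sizeof(). */
static void sketch_clone_pgd_range(sketch_pgd_t *dst, const sketch_pgd_t *src,
                                   int count)
{
    memcpy(dst, src, (size_t)count * sizeof(sketch_pgd_t));
}

/* Populate the kernel part of a new pgd from a reference table. */
static void sketch_copy_kernel_part(sketch_pgd_t *pgd,
                                    const sketch_pgd_t *reference)
{
    sketch_clone_pgd_range(pgd + SKETCH_USER_PTRS_PER_PGD,
                           reference + SKETCH_USER_PTRS_PER_PGD,
                           SKETCH_KERNEL_PGD_PTRS);
}
```

Keeping the copy behind one helper also gives a single place to hang a
hook, which is the kind of call that appears to have gone missing here.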

> + if (PTRS_PER_PMD == 1)
>   return pgd;


Re: [linux-pm] [3/6] 2.6.21-rc4: known regressions

2007-03-28 Thread Maxim
On Wednesday 28 March 2007 22:42:00 Linus Torvalds wrote:
> 
> On Wed, 28 Mar 2007, David Brownell wrote:
> >
> > On Wednesday 28 March 2007 9:38 am, Linus Torvalds wrote:
> > 
> > > > It's a *device*, dammit. It should save and resume like one (probably
> > > > as a system device). The "set_mode()" etc stuff is at a completely
> > > > different (higher) conceptual level.
> > 
> > Agreed, except about "probably as a system device".
> > 
> > Last I checked, there was no good reason to use sysdev suspend()/resume()
> > rather than platform_device suspend_late()/early_resume().  Which more
> > or less means no good reason to use sysdev in new code...
> 
> I won't disagree - it might well be much nicer to just show it in the 
> "real" device tree. I'm not 100% sure where in the tree it would go, 
> though. It should probably be "inside" the root entry, before any of the 
> PCI buses. It's generally what we've used those "system device" things 
> for, but I agree that it would be better to just make system devices show 
> up early on the regular device list than it is to have them be special 
> cases.
> 
> But I think that's a separate (and fairly small) issue compared to the 
> "don't use the clocksource infrastructure as a make-believe suspend/resume 
> mechanism" problem that Maxim's patch had.
> 
> (Maxim, don't take that the wrong way - I think your analysis and patch 
> were great, I just think another organization would be better)

Exactly, I agree completely.
I said that my patch was a temporary fix, and I agree that the best way is to
create a new system device and use its suspend/resume hooks to bring HPET
back to life on resume.

> 
> > Also, making HPET use the legacy mode seems like a step backwards.
> 
> I don't think that's actually "legacy" in any sense but the interrupt 
> delivery, where the "legacy mode" bit is not so much that the HPET itself 
> is "legacy" but that it *replaces* legacy devices.
> 
> But I may have misunderstood the thing. I'm an old fart, so I know the old 
> timers much better than I know the new ones ;). Somebody feel free to hit 
> me with the clue-2x4.
> 
>   Linus
> 

Best regards,
Maxim Levitsky


Re: [2.6 patch] the scheduled eepro100 removal

2007-03-28 Thread Kok, Auke

Bill Davidsen wrote:
> Adrian Bunk wrote:
>> This patch contains the scheduled removal of the eepro100 driver.
>>
>> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
> This keeps coming around, but I haven't seen an answer to the questions
> raised by Eric Piel or Kiszka. I do know that e100 didn't work on some
> IBM rackmount servers and eepro100 did, but since I'm no longer
> responsible for those machines I can't retest. Perhaps someone will be
> able to provide data points.
>
> IBM current offerings as of about three years ago, I had a few dozen of
> them at one time.


We have provided a (test) driver which allows e100 to use IO to communicate with 
the device, which seems to have helped for one person. I think we need to work 
with those changes and see if it helps the other people resolve their e100 
issues. Unfortunately it keeps slipping off to the low priority list for us.


I suggest that we push this code into -mm for people to test. It's fairly low
risk, as by default the patch won't enable IO and will thus keep using the old
method of writing to the adapter.


Auke


Re: [RFC] i386: Remove page sized slabs for pgds and pmds

2007-03-28 Thread Christoph Lameter
On Wed, 28 Mar 2007, William Lee Irwin III wrote:

> On Wed, Mar 28, 2007 at 02:38:55PM -0700, Christoph Lameter wrote:
> > No that was described in the patch. Quote:
> > "i386 only provides support for caching constructed pgd and pmds. These 
> > are comparatively rare to ptes so it is no surprise that the current 
> > approach has only minimal effect. "
> 
> And where was the mention of this being a patch I sent you in a private
> reply verbatim, and furthermore asked you not to post?

Yes it was private and you told me to be careful about "waving this patch 
around". No mention of not posting it.

And I repeat that I am sorry to have removed the paragraph that mentioned 
you being the author during rewrites of the text. Signoff line is there. 
This is an RFC so we can still add this if we want to apply it at all.

I think we need to discuss this openly. It seems that I am getting into 
unknown minefields of an ancient discussion between you and Andrew.





Re: FF layer restrictions [Was: [PATCH 1/1] Input: add sensable phantom driver]

2007-03-28 Thread Jiri Slaby
Jiri Slaby wrote:
> Dmitry Torokhov wrote:
>> On Tuesday 27 March 2007 17:34, johann deneux wrote:
>>> What about adding a member to ff_effect which would be the number of the 
>>> motor?
>>> We can't change the layout of ff_effect too much though, so we have to
>>> find unused bits and put them to work.
>>>
>>> For instance, we could replace
>>>
>>> __u16 type;
>>>
>>> by
>>>
>>> __u8 motor;
>>> __u8 type;
>>>
>> Splitting type field seems to be a good idea.
> 
> Maybe stupid question, but what about endianness + backward compatibility?
> If we split it into motor,type sequence, it would break LE (untouched BE),
> if we do type,motor, it is OK for LE (broken BE).

Aha, and the question is: do

#ifdef __BIG_ENDIAN
#else
#endif

?
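
A userspace sketch of what such a conditional could look like, assuming
existing type values fit in 8 bits. The struct names are hypothetical, and
the GCC/Clang predefined __BYTE_ORDER__ macro stands in for the kernel's
__BIG_ENDIAN; where it is not defined the sketch simply assumes little
endian.

```c
#include <stdint.h>
#include <string.h>

/* Old ABI: one 16-bit field. */
struct old_type_field { uint16_t type; };

/*
 * Hypothetical split. The member order follows the byte order so that
 * `type` always overlays the low byte of the old 16-bit value and
 * `motor` the high byte.
 */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
struct split_type_field { uint8_t motor; uint8_t type; };
#else
struct split_type_field { uint8_t type; uint8_t motor; };
#endif

/* View a value written through the old layout via the split layout. */
static struct split_type_field reinterpret_old(uint16_t old_type)
{
    struct old_type_field o = { old_type };
    struct split_type_field s;

    memcpy(&s, &o, sizeof s);
    return s;
}
```

With this ordering an old binary that writes a type below 256 is seen with
the same type and motor 0 on either byte order.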

regards,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
 B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E

Hnus <[EMAIL PROTECTED]> is an alias for /dev/null


Re: [PATCH] Fix "Section mismatch" compile warning

2007-03-28 Thread Andrew Morton
On Mon, 26 Mar 2007 17:19:33 +0200
Bernhard Walle <[EMAIL PROTECTED]> wrote:

> Fix "Section mismatch" warnings in arch/x86_64/kernel/time.c
> 

Please always quote the warnings in the changelog when fixing them, thanks.


Re: [2.6 patch] the scheduled eepro100 removal

2007-03-28 Thread Bill Davidsen

Adrian Bunk wrote:
> This patch contains the scheduled removal of the eepro100 driver.
>
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>


This keeps coming around, but I haven't seen an answer to the questions 
raised by Eric Piel or Kiszka. I do know that e100 didn't work on some 
IBM rackmount servers and eepro100 did, but since I'm no longer 
responsible for those machines I can't retest. Perhaps someone will be 
able to provide data points.


IBM current offerings as of about three years ago, I had a few dozen of 
them at one time.


--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Re: FF layer restrictions [Was: [PATCH 1/1] Input: add sensable phantom driver]

2007-03-28 Thread Jiri Slaby
Dmitry Torokhov wrote:
> On Tuesday 27 March 2007 17:34, johann deneux wrote:
>> What about adding a member to ff_effect which would be the number of the 
>> motor?
>> We can't change the layout of ff_effect too much though, so we have to
>> find unused bits and put them to work.
>>
>> For instance, we could replace
>>
>> __u16 type;
>>
>> by
>>
>> __u8 motor;
>> __u8 type;
>>
> 
> Splitting type field seems to be a good idea.

Maybe stupid question, but what about endianness + backward compatibility?
If we split it into motor,type sequence, it would break LE (untouched BE),
if we do type,motor, it is OK for LE (broken BE).
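
The concern can be checked with a small userspace sketch. It assumes that
an old binary writes a 16-bit type smaller than 256 and that the new code
reads the same two bytes through one of the proposed splits; all struct and
function names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Existing layout: a single 16-bit type. */
struct old_layout { uint16_t type; };

/* The two candidate splits discussed above. */
struct motor_first { uint8_t motor; uint8_t type; };
struct type_first  { uint8_t type;  uint8_t motor; };

/* Returns 1 if the low byte of a 16-bit value is stored first. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* The type the new code would see with the type,motor ordering. */
static uint8_t type_seen_type_first(uint16_t old_type)
{
    struct old_layout o = { old_type };
    struct type_first n;

    memcpy(&n, &o, sizeof n);
    return n.type;
}

/* The type the new code would see with the motor,type ordering. */
static uint8_t type_seen_motor_first(uint16_t old_type)
{
    struct old_layout o = { old_type };
    struct motor_first n;

    memcpy(&n, &o, sizeof n);
    return n.type;
}
```

On a little-endian host only the type,motor ordering returns the old value
unchanged; on a big-endian host only motor,type does.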

regards,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
 B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E

Hnus <[EMAIL PROTECTED]> is an alias for /dev/null

