Re: irq0 stops working

2007-10-08 Thread Jan Engelhardt

On Oct 9 2007 09:26, Vasily Averin wrote:
>
>On one of our servers the timer interrupt (i.e. irq0) stops working. As a result,
>kernel timers never fire, and tasks waiting for signals from timers
>hang forever.

Which kernel? And have you tried CONFIG_NO_HZ=n?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/RFT] kbuild: save ARCH & CROSS_COMPILE

2007-10-08 Thread Randy Dunlap
On Mon, 8 Oct 2007 21:53:16 -0700 Randy Dunlap wrote:

> On Tue, 9 Oct 2007 06:17:43 +0200 Sam Ravnborg wrote:
> 
> > > 
> > > What about, that this is the first ever prompt, that must be shown and
> > > written to the .config?
> > Two issues to fix before we can do this:
> > 1) choice values cannot have more than one prompt
> > 2) We need to share much more Kconfig* between the individual architectures
> >    First step is to let all arches use drivers/Kconfig
> 
> 2) isn't terribly difficult, just takes some time and willingness
> of $arch maintainers to make some changes, but please explain a bit more
> why it is needed...?

Maybe I didn't read carefully:  "to add arch selection to kconfig"..

arch/cris using drivers/Kconfig: patch is below (maintainer is
cc-ed)

> > Let's get the two items above solved then we can revisit adding arch selection
> > to kconfig (where it belongs in the end).
> > And neither require a rewrite of kconfig...
> > 
> > > Also, i'd like to propose sequencing of config-enable-build-this-unit
> > > in config file(s), thus Makefile(s) (sometimes very small and stupid)
> > > will be not necessary. Additional link ordering can be supplied as
> > > meta-config information there. Shell scripting, very ugly in the view
> > > of make syntax, will be natural in config files. Extending build
> > > process to get hidden dependencies or right linking/other magic is
> > > part of particular configuration. Hm?
> > Discussed before but so far no patches have shown up.

---

From: Randy Dunlap <[EMAIL PROTECTED]>

Move arch/cris to using drivers/Kconfig for its drivers config list.
When all arches do this, Sam can make more interesting improvements
to .config files.

Using drivers/Kconfig adds these kconfig files to cris:
connector, misc, ata, message/fusion (not avail.), macintosh (not avail.),
i2c, spi, w1, power, hwmon, mfd, video, hid, mmc, leds,
infiniband (not avail.), edac (not avail.), rtc, dma, auxdisplay,
kvm (not avail.), uio, and lguest (not avail.).

Many of these are already enabled/disabled per arch., so adding that
for cris can be done as required.

"not avail." means that this menu is not valid for this arch.
and won't be presented to users when running 'make *config'.

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 arch/cris/Kconfig |   40 +---
 1 file changed, 1 insertion(+), 39 deletions(-)

--- linux-2.6.23-rc9-git6.orig/arch/cris/Kconfig
+++ linux-2.6.23-rc9-git6/arch/cris/Kconfig
@@ -153,49 +153,11 @@ source arch/cris/arch-v10/drivers/Kconfi
 
 endmenu
 
-source "drivers/base/Kconfig"
-
 # standard linux drivers
-source "drivers/mtd/Kconfig"
-
-source "drivers/parport/Kconfig"
-
-source "drivers/pnp/Kconfig"
-
-source "drivers/block/Kconfig"
-
-source "drivers/md/Kconfig"
-
-source "drivers/ide/Kconfig"
-
-source "drivers/scsi/Kconfig"
-
-source "drivers/ieee1394/Kconfig"
-
-source "drivers/message/i2o/Kconfig"
-
-source "drivers/net/Kconfig"
-
-source "drivers/isdn/Kconfig"
-
-source "drivers/telephony/Kconfig"
-
-#
-# input before char - char/joystick depends on it. As does USB.
-#
-source "drivers/input/Kconfig"
-
-source "drivers/char/Kconfig"
-
-#source drivers/misc/Config.in
-source "drivers/media/Kconfig"
+source "drivers/Kconfig"
 
 source "fs/Kconfig"
 
-source "sound/Kconfig"
-
-source "drivers/usb/Kconfig"
-
 source "arch/cris/Kconfig.debug"
 
 source "security/Kconfig"


irq0 stops working

2007-10-08 Thread Vasily Averin
On one of our servers the timer interrupt (i.e. irq0) stops working. As a result,
kernel timers never fire, and tasks waiting for signals from timers
hang forever.

The most noticeable effect is that all write operations to disk stall,
and nobody can log in to the node.

At the same time, all existing shells on the node keep working. I'm able to read
interrupt statistics from the /proc/interrupts file, and it shows that all other
interrupt counters still advance when their devices are accessed: disk on the SATA
controller, network, CD-ROM on the IDE controller, keyboard, serial console, LOC
interrupts.
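
A minimal sketch of how the stall could be confirmed from userspace: sum the
per-CPU counters on the irq0 line of /proc/interrupts at two points in time
and compare them. The helper names irq_total() and irq_stalled() are
hypothetical, and the parsing assumes the usual "label: count count ... name"
line layout.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line whose label
 * (the text before ':') equals `label`, e.g. "0" for the timer interrupt.
 * Returns -1 if the line has no ':' or carries a different label.
 */
long long irq_total(const char *line, const char *label)
{
	const char *colon = strchr(line, ':');
	const char *p = line;
	long long total = 0;

	if (!colon)
		return -1;

	/* Skip the padding in front of the label. */
	while (p < colon && isspace((unsigned char)*p))
		p++;
	if ((size_t)(colon - p) != strlen(label) ||
	    strncmp(p, label, strlen(label)) != 0)
		return -1;

	/* Counters follow the colon, one per CPU, up to the first non-number. */
	p = colon + 1;
	for (;;) {
		char *end;

		while (isspace((unsigned char)*p))
			p++;
		if (!isdigit((unsigned char)*p))
			break;
		total += (long long)strtoull(p, &end, 10);
		p = end;
	}
	return total;
}

/* The interrupt looks stalled if its total did not advance between samples. */
int irq_stalled(long long before, long long after)
{
	return before >= 0 && after == before;
}
```

Sampling the "0" line twice, a few seconds apart, and passing both totals to
irq_stalled() would distinguish a dead irq0 from a merely slow one.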

I've also found that disabling the irqbalance service on the node works around
this issue, although of course it fixes nothing.

All details about the hardware and logs can be found at
http://bugzilla.kernel.org/show_bug.cgi?id=8650

I'm able to reproduce this situation, but I have run out of ideas for how to
continue investigating the problem.

Could anybody please suggest new ways to investigate this issue?

Thank you,
Vasily Averin

OpenVZ Linux Kernel Team


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Miller
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Date: Tue, 09 Oct 2007 15:13:36 +1000

> I'm not even sure module load order is 100% fault proof here since
> khubd spawns as a thread...

I'm concerned about that as well, thanks for bringing it up.

My understanding, however, is that the critical thing is that the EHCI
device reset being done by the EHCI driver probe occurs and completes
first.  If that is true, then just making sure EHCI loads initially is
a sufficient constraint to fix this problem.


Re: [13/18] x86_64: Allow fallback for the stack

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:36, Christoph Lameter wrote:
> On Sun, 7 Oct 2007, Nick Piggin wrote:
> > > The problem can become non-rare on special low memory machines doing
> > > wild swapping things though.
> >
> > But only your huge systems will be using huge stacks?
>
> I have no idea who else would be using such a feature. Relaxing the tight
> memory restrictions on stack use may allow placing larger structures on
> the stack in general.

The tight memory restrictions on stack usage do not come about because
of the difficulty in increasing the stack size :) It is because we want to
keep stack sizes small!

Increasing the stack size by 4K uses another 4MB of memory for every 1000
threads you have, right?
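
The arithmetic behind that estimate, as a small sketch. The function name
extra_stack_bytes() is hypothetical; it only multiplies the per-stack growth
by the thread count.

```c
/*
 * Extra memory, in bytes, consumed when each of `threads` kernel stacks
 * grows by `extra_per_stack` bytes.
 */
unsigned long long extra_stack_bytes(unsigned long long extra_per_stack,
				     unsigned long long threads)
{
	return extra_per_stack * threads;
}
```

With 4096 extra bytes per stack and 1000 threads this gives 4,096,000 bytes,
which is roughly the 4MB quoted above.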

It would take a lot of good reason to move away from the general direction
we've been taking over the past years that 4/8K stacks are a good idea for
regular 32 and 64 bit builds in general.


> I have some concerns about the medium NUMA systems (a few dozen of nodes)
> also running out of stack since more data is placed on the stack through
> the policy layer and since we may end up with a couple of stacked
> filesystems. Most of the current NUMA systems on x86_64 are basically
> two nodes on one motherboard. The use of NUMA controls is likely
> limited there and the complexity of the filesystems is also not high.

The solution has until now always been to fix the problems so they don't
use so much stack. Maybe a bigger stack is OK for you for 1024+ CPU
systems, but I don't think you'd be able to make that assumption for most
normal systems.


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Miller
From: David Brownell <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 22:00:19 -0700

> Assuming PCI is present, /sys/bus/pci/devices/*/class can tell
> if EHCI is present (0x0c0320) ... if so, load that driver.
> Then repeat for OHCI (0x0c0310) and UHCI (0x0c0300).

These are facts all of us know very well, but implementing this in
userspace in a failsafe manner isn't practical.  That's what we're
discussing.

There are things that autoload USB drivers way before udev or similar
even get started.

For example, the first thing some distributions do is try to load the
correct keyboard maps.  Guess what that can do?  It triggers a load of
all of the modular USB host controller drivers in case we have a USB
keyboard.

The only real solution is in the kernel, because it is the only
clean place to trap all of the potential module load events.


Re: gigabit ethernet power consumption

2007-10-08 Thread Willy Tarreau
Hi Auke,

On Mon, Oct 08, 2007 at 03:31:51PM -0700, Kok, Auke wrote:
> Pavel Machek wrote:
> > Hi!
> > 
> > I've found that gbit vs. 100mbit power consumption difference is about
> > 1W -- pretty significant. (Maybe powertop should include it in the
> > tips section? :).
> > 
> > Energy Star people insist that machines should switch down to 100mbit
> > when network is idle, and I guess that makes a lot of sense -- you
> > save 1W locally and 1W on the router.
> > 
> > Question is, how to implement it correctly? Daemon that would watch
> > data rates and switch speeds using mii-tool would be simple, but is
> > that enough?
> 
> you most certainly want to do this in userspace I think.
> 
> One of the biggest problems is that link negotiation can take a significant amount
> of time, well over several seconds (1 to 3 seconds typical) with gigabit, and
> having your ethernet connection go offline for 3 seconds may not be the desired
> effect for when you want to get more bandwidth in the first place.
> 
> However, when a laptop is in battery mode, switching down from gigabit to 100mbit
> makes a lot more sense, so this is something I would recommend. This can be as
> easy as changing the advertisement mask of the interface and renegotiating the
> link. Userspace could handle that very easily.

If something does that, it must *only* be in userspace so that we can
easily disable it. It's amazing how many laptops consider that you
don't want any performance when you run off batteries. I've seen a
2GHz laptop falling back to 600 MHz when running on batteries, which
was very inconvenient when the laptop in question was used to go
sniffing gigabit traffic in datacenters... I would even go as far
as to say that my notebook runs lowpower only when it's plugged into
the wall because it's when I'm typing or doing low activity things.

In my opinion, battery != low power, battery == mobility. It is the user's
choice that should imply low power, so this must be done with a dedicated
daemon.

Regards,
Willy



Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread Benjamin Herrenschmidt
> Yes, that's why I asked about EHCI.  My speculation would be that
> OHCI starts the reset, and EHCI claims the port before it completes;
> or contrariwise OHCI starts the reset right after EHCI claims it.
> 
> And there's some point in that process where a hardware race makes
> the trouble you've observed.  I believe there are plenty of other
> places where it's perfectly fine if EHCI grabs the port, or this
> little race would have shown up many times before.

Since we can't know which O/UHCI is paired with which EHCI, we can't
really have generic code to deal with that race, but maybe we can be
smart and basically mutex khubd activity such as port reset vs.
registration of any new HCD ?

I'm not even sure module load order is 100% fault proof here since khubd
spawns as a thread...
 
Ben.




Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread Benjamin Herrenschmidt

On Mon, 2007-10-08 at 21:47 -0700, David Miller wrote:
> From: Greg KH <[EMAIL PROTECTED]>
> Date: Mon, 8 Oct 2007 21:39:09 -0700
> 
> > No, nothing cute in udev itself, but it seems that all distros that I
> > know of have a "load these modules now" type setting in their init
> > scripts that can be used here.
> > 
> > I can't think of a way to enforce this load order on the modules
> > themselves due to the fact that OHCI might not even be needed for EHCI
> > devices on UHCI (Intel) based chipsets :(
> > 
> > Can anyone else?
> 
> The three modules perhaps should be a bundle of whatever ones you have
> enabled, and internally we can dispatch the initialization to occur in
> the correct order from a top-level module_init().
> 
> If the devices need to be initialized in a certain order in a
> situation like this, it really seems like it is the kernel's job to
> enforce it.

Is the problem strictly an ordering problem or just a race ? In the
latter case, maybe some better arbitration by the USB core to make
sure things are quiescent or in a known state when letting a new HCD
register might help ?

Ben.




Re: gigabit ethernet power consumption

2007-10-08 Thread Chris Snook

Pavel Machek wrote:
> Hi!
> 
> I've found that gbit vs. 100mbit power consumption difference is about
> 1W -- pretty significant. (Maybe powertop should include it in the
> tips section? :).
> 
> Energy Star people insist that machines should switch down to 100mbit
> when network is idle, and I guess that makes a lot of sense -- you
> save 1W locally and 1W on the router.
> 
> Question is, how to implement it correctly? Daemon that would watch
> data rates and switch speeds using mii-tool would be simple, but is
> that enough?


I believe you misspelled "ethtool".

While you're at it, why stop at 100Mb?  I believe you save even more power at 
10Mb, which is why WOL puts the card in 10Mb mode.  In my experience, you 
generally want either the maximum setting or the minimum setting when going for 
power savings, because of the race-to-idle effect.  Workloads that have a 
sustained fractional utilization are rare.  Right now I'm at home, hooked up to 
a cable modem, so anything over 4Mb is wasted, unless I'm talking to the box 
across the room, which is rare.
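
A rough sketch of the mask arithmetic such a daemon would need when capping
the advertised link speed. The bit positions mirror the ADVERTISED_* link-mode
flags in <linux/ethtool.h>; the ADV_* names and cap_advertised() itself are
hypothetical, and actually applying the mask and renegotiating the link is
left out.

```c
#include <stdint.h>

/* Bit values mirror the ADVERTISED_* link-mode flags in <linux/ethtool.h>. */
enum {
	ADV_10_HALF   = 1u << 0,
	ADV_10_FULL   = 1u << 1,
	ADV_100_HALF  = 1u << 2,
	ADV_100_FULL  = 1u << 3,
	ADV_1000_HALF = 1u << 4,
	ADV_1000_FULL = 1u << 5,
};

/*
 * Hypothetical helper: clear every advertised mode faster than max_mbit
 * and leave all other bits (autoneg, port type, ...) untouched.
 */
uint32_t cap_advertised(uint32_t advertised, unsigned max_mbit)
{
	uint32_t drop = 0;

	if (max_mbit < 1000)
		drop |= ADV_1000_HALF | ADV_1000_FULL;
	if (max_mbit < 100)
		drop |= ADV_100_HALF | ADV_100_FULL;
	return advertised & ~drop;
}
```

Capping a card that advertises everything (0x3f) at 100Mb leaves 0x0f, and
capping it at 10Mb leaves 0x03. A mask like that is what ethtool's
"advertise" setting takes, e.g. "ethtool -s eth0 advertise 0x00f", with eth0
as a placeholder interface name.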


Talk to the NetworkManager folks.  This is right up their alley.

-- Chris


Re: [Linux-fbdev-devel] [PATCH 0/6] Patch series to add of_platform binding to xilinxfb

2007-10-08 Thread Antonino A. Daplas
On Mon, 2007-10-08 at 22:43 -0600, Grant Likely wrote:
> On 10/2/07, Antonino A. Daplas <[EMAIL PROTECTED]> wrote:
> > On Mon, 2007-10-01 at 09:57 -0600, Grant Likely wrote:
> > > Assuming there are no major issues, I'd like to get this patch series
> > > queued up for inclusion in 2.6.24.
> >
> > Okay.
> >
> > Tony
> 
> BTW, what path do framebuffer patches take to get into Linus' tree?
> Does he pull your tree directly, or do they go through someone else's
> tree?

They all go to -mm tree, unless it's a needed fix, then to Linus's.

Tony




Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Brownell
> > > The old /etc/hotplug/usb.rc script made sure to load those modules
> > > in the correct order:  EHCI first.
> > 
> > I expected to find something cute attempting to handle this under
> > /etc/udev, I have failed so far :-)
>
> No, nothing cute in udev itself, but it seems that all distros that I
> know of have a "load these modules now" type setting in their init
> scripts that can be used here.
>
> I can't think of a way to enforce this load order on the modules
> themselves due to the fact that OHCI might not even be needed for EHCI
> devices on UHCI (Intel) based chipsets :(

Assuming PCI is present, /sys/bus/pci/devices/*/class can tell
if EHCI is present (0x0c0320) ... if so, load that driver.
Then repeat for OHCI (0x0c0310) and UHCI (0x0c0300).
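
A minimal sketch of that ordering rule, assuming the class values have already
been read from /sys/bus/pci/devices/*/class. The function names are
hypothetical, and the module names ehci-hcd, ohci-hcd and uhci-hcd are an
assumption about how the drivers are named on the target system.

```c
#include <stddef.h>
#include <stdint.h>

/* PCI class codes for USB host controllers, as quoted above. */
#define PCI_CLASS_USB_UHCI 0x0c0300u
#define PCI_CLASS_USB_OHCI 0x0c0310u
#define PCI_CLASS_USB_EHCI 0x0c0320u

/* Map a PCI class code to a driver module name, or NULL if it is not a USB HCD. */
const char *hcd_module_name(uint32_t pci_class)
{
	switch (pci_class) {
	case PCI_CLASS_USB_EHCI: return "ehci-hcd";
	case PCI_CLASS_USB_OHCI: return "ohci-hcd";
	case PCI_CLASS_USB_UHCI: return "uhci-hcd";
	default:                 return NULL;
	}
}

/*
 * Fill `out` with the modules to load, EHCI first, then OHCI, then UHCI,
 * listing each driver once even if several controllers share it.
 * Returns the number of entries written (at most 3).
 */
size_t hcd_load_order(const uint32_t *classes, size_t n, const char *out[3])
{
	static const uint32_t order[3] = {
		PCI_CLASS_USB_EHCI, PCI_CLASS_USB_OHCI, PCI_CLASS_USB_UHCI,
	};
	size_t count = 0;

	for (size_t i = 0; i < 3; i++) {
		for (size_t j = 0; j < n; j++) {
			if (classes[j] == order[i]) {
				out[count++] = hcd_module_name(order[i]);
				break;
			}
		}
	}
	return count;
}
```

This only decides the order; it does nothing about modules that get loaded by
some other path before the script runs.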


Re: [RFC/RFT] kbuild: save ARCH & CROSS_COMPILE

2007-10-08 Thread Randy Dunlap
On Tue, 9 Oct 2007 06:17:43 +0200 Sam Ravnborg wrote:

> > 
> > What about, that this is the first ever prompt, that must be shown and
> > written to the .config?
> Two issues to fix before we can do this:
> 1) choice values cannot have more than one prompt
> 2) We need to share much more Kconfig* between the individual architectures
>    First step is to let all arches use drivers/Kconfig

2) isn't terribly difficult, just takes some time and willingness
of $arch maintainers to make some changes, but please explain a bit more
why it is needed...?


> Let's get the two items above solved then we can revisit adding arch selection
> to kconfig (where it belongs in the end).
> And neither require a rewrite of kconfig...
> 
> > Also, i'd like to propose sequencing of config-enable-build-this-unit
> > in config file(s), thus Makefile(s) (sometimes very small and stupid)
> > will be not necessary. Additional link ordering can be supplied as
> > meta-config information there. Shell scripting, very ugly in the view
> > of make syntax, will be natural in config files. Extending build
> > process to get hidden dependencies or right linking/other magic is
> > part of particular configuration. Hm?
> Discussed before but so far no patches have shown up.


---
~Randy


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Miller
From: Greg KH <[EMAIL PROTECTED]>
Date: Mon, 8 Oct 2007 21:39:09 -0700

> No, nothing cute in udev itself, but it seems that all distros that I
> know of have a "load these modules now" type setting in their init
> scripts that can be used here.
> 
> I can't think of a way to enforce this load order on the modules
> themselves due to the fact that OHCI might not even be needed for EHCI
> devices on UHCI (Intel) based chipsets :(
> 
> Can anyone else?

The three modules perhaps should be a bundle of whatever ones you have
enabled, and internally we can dispatch the initialization to occur in
the correct order from a top-level module_init().

If the devices need to be initialized in a certain order in a
situation like this, it really seems like it is the kernel's job to
enforce it.
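
A userspace sketch of the dispatch idea, not kernel code: one top-level entry
point walks a fixed table of init functions, EHCI first, and skips the drivers
that are not enabled. All names here (usb_hcd_bundle_init, the *_init_stub
functions, init_log) are hypothetical stand-ins for the real per-driver init
routines.

```c
#include <stddef.h>

typedef int (*hcd_init_fn)(void);

/* Record of which stubs ran, in order, so the sequence can be inspected. */
#define INIT_LOG_MAX 8
char init_log[INIT_LOG_MAX];
size_t init_log_len;

static int record(char tag)
{
	if (init_log_len < INIT_LOG_MAX)
		init_log[init_log_len++] = tag;
	return 0;
}

/* Stand-ins for the real driver init routines. */
int ehci_init_stub(void) { return record('e'); }
int ohci_init_stub(void) { return record('o'); }
int uhci_init_stub(void) { return record('u'); }

/*
 * Run the init functions in table order. A NULL slot means that driver
 * is not enabled in this build. Stops at the first failure and returns
 * its error code; returns 0 if every enabled driver initialized.
 */
int usb_hcd_bundle_init(const hcd_init_fn *inits, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err;

		if (!inits[i])
			continue;
		err = inits[i]();
		if (err)
			return err;
	}
	return 0;
}
```

With a table of { ehci_init_stub, NULL, uhci_init_stub } the EHCI stub always
runs before the UHCI one, whatever order the individual drivers would
otherwise have been loaded in.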


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Miller
From: David Brownell <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 21:36:43 -0700

> Don't need this "limit_1" timeout; "reset_done" handles all
> the timeout needed there.  The regs->fmnumber is essentially
> a millisecond counter.

If the hardware hangs and the register stops incrementing,
the entire kernel will hang.  That is unacceptable.

We do need it.

> 
> > +   int limit_2;
> > +
> > /* spin until any current reset finishes */
> > -   for (;;) {
> > +   limit_2 = PORT_RESET_MSEC * 2;
> 
> This is the loop that didn't terminate for you, right?
> PORT_RESET_HW_MSEC is the ceiling you should use here,
> not PORT_RESET_MSEC.

Ok, fixed.

> What values do you see for "portstat"?

0x111

> I suspect there will be some flag set which would allow a more
> immediate exit from that loop.  RH_PS_CCS might clear, for example.

Absolutely nothing clears in the register from its initial value.

Here is the patch with the limit_2 initial value fixed.

I kept limit_1 in there, it is necessary.  No kernel code should
hang in an endless loop because of malfunctioning hardware.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/drivers/usb/host/ohci-hub.c b/drivers/usb/host/ohci-hub.c
index bb9cc59..9149593 100644
--- a/drivers/usb/host/ohci-hub.c
+++ b/drivers/usb/host/ohci-hub.c
@@ -563,14 +563,19 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
 	u32	temp;
 	u16	now = ohci_readl(ohci, &ohci->regs->fmnumber);
 	u16	reset_done = now + PORT_RESET_MSEC;
+	int	limit_1;
 
 	/* build a "continuous enough" reset signal, with up to
 	 * 3msec gap between pulses.  scheduler HZ==100 must work;
 	 * this might need to be deadline-scheduled.
 	 */
-	do {
+	limit_1 = 100;
+	while (--limit_1 >= 0) {
+		int limit_2;
+
 		/* spin until any current reset finishes */
-		for (;;) {
+		limit_2 = PORT_RESET_HW_MSEC * 2;
+		while (--limit_2 >= 0) {
 			temp = ohci_readl (ohci, portstat);
 			/* handle e.g. CardBus eject */
 			if (temp == ~(u32)0)
@@ -579,6 +584,10 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
 				break;
 			udelay (500);
 		}
+		if (limit_2 < 0) {
+			ohci_warn(ohci, "Root port inner-loop reset timeout, "
+				  "portstat[%08x]\n", temp);
+		}
 
 		if (!(temp & RH_PS_CCS))
 			break;
@@ -589,7 +598,14 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
 		ohci_writel (ohci, RH_PS_PRS, portstat);
 		msleep(PORT_RESET_HW_MSEC);
 		now = ohci_readl(ohci, &ohci->regs->fmnumber);
-	} while (tick_before(now, reset_done));
+		if (!tick_before(now, reset_done))
+			break;
+	}
+	if (limit_1 < 0) {
+		ohci_warn(ohci, "Root port outer-loop reset timeout, "
+			  "now[%04x] reset_done[%04x]\n",
+			  now, reset_done);
+	}
 	/* caller synchronizes using PRSC */
 
 	return 0;
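
The bounded-retry idea in the patch can be tried outside the kernel. This is a
minimal sketch, assuming a simulated register in place of ohci_readl();
poll_until_clear(), stuck_port() and clearing_port() are hypothetical names,
and RH_PS_PRS is given the reset-status bit value 0x10 from the OHCI root hub
port status layout.

```c
#include <stdint.h>

#define RH_PS_PRS 0x00000010u	/* port reset status bit */

typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll a register until `bit` clears or `max_polls` reads have been made.
 * Returns 0 when the bit cleared, -1 on timeout. The last value read is
 * stored in *last (left untouched if max_polls is 0 or negative).
 */
int poll_until_clear(read_reg_fn read_reg, void *ctx, uint32_t bit,
		     int max_polls, uint32_t *last)
{
	while (--max_polls >= 0) {
		*last = read_reg(ctx);
		if (!(*last & bit))
			return 0;
		/* a real driver would delay between reads, e.g. udelay(500) */
	}
	return -1;
}

/* Simulated port status stuck at 0x111, the value reported in the thread. */
uint32_t stuck_port(void *ctx)
{
	(void)ctx;
	return 0x111u;
}

/* Simulated port status whose reset bit clears after *ctx more reads. */
uint32_t clearing_port(void *ctx)
{
	int *reads_left = ctx;

	if (*reads_left > 0) {
		(*reads_left)--;
		return 0x111u;
	}
	return 0x101u;
}
```

Against stuck_port() the helper gives up after max_polls reads instead of
spinning forever, which is the behaviour the limit_2 counter adds to the
inner loop.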


Re: [Linux-fbdev-devel] [PATCH 0/6] Patch series to add of_platform binding to xilinxfb

2007-10-08 Thread Grant Likely
On 10/2/07, Antonino A. Daplas <[EMAIL PROTECTED]> wrote:
> On Mon, 2007-10-01 at 09:57 -0600, Grant Likely wrote:
> > Assuming there are no major issues, I'd like to get this patch series
> > queued up for inclusion in 2.6.24.
>
> Okay.
>
> Tony

BTW, what path do framebuffer patches take to get into Linus' tree?
Does he pull your tree directly, or do they go through someone else's
tree?

Thanks,
g.




-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
[EMAIL PROTECTED]
(403) 399-0195


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Brownell
> Regardless, here is a patch that hardens the OHCI reset handling
> loops so that they break out instead of hanging the entire system
> should this condition occur.  It's at least better than what the
> code does to a user right now which is hang the box completely:
>
> [USB] ohci: Do not hang the system if port reset does not complete.
>
> Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
>
> diff --git a/drivers/usb/host/ohci-hub.c b/drivers/usb/host/ohci-hub.c
> index bb9cc59..77ae5b4 100644
> --- a/drivers/usb/host/ohci-hub.c
> +++ b/drivers/usb/host/ohci-hub.c
> @@ -563,14 +563,19 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
>   u32 temp;
>   u16 now = ohci_readl(ohci, &ohci->regs->fmnumber);
>   u16 reset_done = now + PORT_RESET_MSEC;
> + int limit_1;
>  
>   /* build a "continuous enough" reset signal, with up to
>* 3msec gap between pulses.  scheduler HZ==100 must work;
>* this might need to be deadline-scheduled.
>*/
> - do {
> + limit_1 = 100;
> + while (--limit_1 >= 0) {

Don't need this "limit_1" timeout; "reset_done" handles all
the timeout needed there.  The regs->fmnumber is essentially
a millisecond counter.


> + int limit_2;
> +
>   /* spin until any current reset finishes */
> - for (;;) {
> + limit_2 = PORT_RESET_MSEC * 2;

This is the loop that didn't terminate for you, right?
PORT_RESET_HW_MSEC is the ceiling you should use here,
not PORT_RESET_MSEC.


> + while (--limit_2 >= 0) {
>   temp = ohci_readl (ohci, portstat);
>   /* handle e.g. CardBus eject */
>   if (temp == ~(u32)0)
> @@ -579,6 +584,10 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
>   break;
>   udelay (500);
>   }
> + if (limit_2 < 0) {
> + ohci_warn(ohci, "Root port inner-loop reset timeout, "
> +   "portstat[%08x]\n", temp);
> + }

What values do you see for "portstat"?

I suspect there will be some flag set which would allow a more
immediate exit from that loop.  RH_PS_CCS might clear, for example.

And in any case, if that fails I don't see any reason not to just
break, and return immediately.

>  
>   if (!(temp & RH_PS_CCS))
>   break;
> @@ -589,7 +598,14 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
>   ohci_writel (ohci, RH_PS_PRS, portstat);
>   msleep(PORT_RESET_HW_MSEC);
>   now = ohci_readl(ohci, &ohci->regs->fmnumber);
> - } while (tick_before(now, reset_done));
> + if (!tick_before(now, reset_done))
> + break;
> + }
> + if (limit_1 < 0) {
> + ohci_warn(ohci, "Root port outer-loop reset timeout, "
> +   "now[%04x] reset_done[%04x]\n",
> +   now, reset_done);
> + }
>   /* caller synchronizes using PRSC */
>  
>   return 0;
>
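
On the point above that regs->fmnumber is essentially a millisecond counter:
it is read as a u16, so the deadline comparison has to survive wraparound. A
sketch of a wrap-safe comparison in the spirit of the driver's tick_before();
tick_before16() is a hypothetical name, and the cast to int16_t assumes the
usual two's-complement conversion.

```c
#include <stdint.h>

/*
 * True when 16-bit tick t1 comes before t2, treating the counter as
 * circular: the difference is taken modulo 2^16 and read as signed.
 */
int tick_before16(uint16_t t1, uint16_t t2)
{
	return (int16_t)(uint16_t)(t1 - t2) < 0;
}
```

With now = 0xfff0 and a deadline 50 ticks ahead (an illustrative value), the
deadline wraps to 0x0022, and tick_before16(0xfff0, 0x0022) is still true.
The comparison only helps while the counter keeps advancing, which is the
case the limit_1 discussion is about.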


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread Greg KH
On Mon, Oct 08, 2007 at 08:42:36PM -0700, David Miller wrote:
> From: David Brownell <[EMAIL PROTECTED]>
> Date: Mon, 08 Oct 2007 20:34:12 -0700
> 
> > > However, when both OHCI and EHCI are built as modules (or, similarly
> > > I guess, OHCI is built-in and EHCI is modular) there appears to be
> > > nothing in userspace which makes sure EHCI gets loaded first.
> > 
> > The old /etc/hotplug/usb.rc script made sure to load those modules
> > in the correct order:  EHCI first.
> 
> I expected to find something cute attempting to handle this under
> /etc/udev, I have failed so far :-)

No, nothing cute in udev itself, but it seems that all distros that I
know of have a "load these modules now" type setting in their init
scripts that can be used here.

I can't think of a way to enforce this load order on the modules
themselves due to the fact that OHCI might not even be needed for EHCI
devices on UHCI (Intel) based chipsets :(

Can anyone else?

thanks,

greg k-h


Re: sleepy linux 2.6.23-rc9

2007-10-08 Thread Antonino A. Daplas
On Tue, 2007-10-09 at 00:05 +0200, Pavel Machek wrote:
> Hi!
> 
> I played with powertop a bit, and found a fairly interesting failure
> mode. If I boot init=/bin/bash vga=1, I get ~2 wakeups a second, nice.
> 
> When I boot init=/bin/bash vga=791 (vesa framebuffer), most wakeups
> are caused by cursor painting (I should fix that some day, I
> guess). But... the cursor blinking does not even work properly!
> 
> It blinks at normal speed, then (randomly) it blinks slowly, then gets
> back to normal speed, then inserts longer delay.
> 
> The effect is so nice that I thought about youtube ;-). Thinkpad
> x60.. question is, how to debug it? 

The cursor blinking is done by software via a timer. It's in
drivers/video/console/fbcon.c.

With the latest -rc kernel you can turn off the blinking with

echo 0 > /sys/class/graphics/fbcon/cursor_blink
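
The same write can be done from a program. A small sketch, assuming the
attribute accepts a plain "0" or "1" followed by a newline, as the echo
command above sends; write_flag() is a hypothetical helper name.

```c
#include <stdio.h>

/*
 * Write a single 0/1 flag to a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_flag(const char *path, int enabled)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", enabled ? 1 : 0) < 0) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes; a failed flush means the write did not land. */
	return fclose(f) == 0 ? 0 : -1;
}
```

Calling write_flag("/sys/class/graphics/fbcon/cursor_blink", 0) would then
have the same effect as the echo, given sufficient privileges.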

Tony




Re: [PATCH RT] fix rt-task scheduling issue

2007-10-08 Thread Gregory Haskins
Hi Guys,
  Nice find!  Comment inline..

(adding linux-rt-users)

For reference, see:

 http://lkml.org/lkml/2007/10/8/252

On Mon, 2007-10-08 at 22:46 -0400, Steven Rostedt wrote:
> Index: linux-2.6.23-rc9-rt2/kernel/sched.c
> ===
> --- linux-2.6.23-rc9-rt2.orig/kernel/sched.c
> +++ linux-2.6.23-rc9-rt2/kernel/sched.c
> @@ -2207,7 +2207,7 @@ static inline void finish_task_switch(st
>* If we pushed an RT task off the runqueue,
>* then kick other CPUs, they might run it:
>*/
> - if (unlikely(rt_task(current) && prev->se.on_rq && rt_task(prev))) {
> + if (unlikely(rt_task(current) && rq->rt_nr_running > 1)) {
>   schedstat_inc(rq, rto_schedule);
>   smp_send_reschedule_allbutself_cpumask(current->cpus_allowed);

the current->cpus_allowed I think probably should have been
"prev->cpus_allowed" in the original code?  However, in light of the new
findings with this bug Mike found, this should probably be sent to
allbutself() without the mask since you don't know what could have been
queued behind you.

Unless I am missing something?

Regards,
-Greg




Re: [RFC/RFT] kbuild: save ARCH & CROSS_COMPILE

2007-10-08 Thread Sam Ravnborg
> 
> What about, that this is the first ever prompt, that must be shown and
> written to the .config?
Two issues to fix before we can do this:
1) choice values cannot have more than one prompt
2) We need to share much more Kconfig* between the individual architectures
   First step is to let all arches use drivers/Kconfig

Let's get the two items above solved then we can revisit adding arch selection
to kconfig (where it belongs in the end).
And neither require a rewrite of kconfig...

> Also, i'd like to propose sequencing of config-enable-build-this-unit
> in config file(s), thus Makefile(s) (sometimes very small and stupid)
> will be not necessary. Additional link ordering can be supplied as
> meta-config information there. Shell scripting, very ugly in the view
> of make syntax, will be natural in config files. Extending build
> process to get hidden dependencies or right linking/other magic is
> part of particular configuration. Hm?
Discussed before but so far no patches have shown up.

Sam


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Brownell
> To add some more information here, I think the EHCI idea might
> hold some water.
>
> What I have here are two NEC OHCI USB interfaces and one NEC EHCI
> USB interface on PCI.  Apparently they all go through a shared
> USB hub, mapped like this:
>
> HUB Port 1: OHCI #1, EHCI
> HUB Port 2: OHCI #2, EHCI
> HUB Port 3: OHCI #1, EHCI
> HUB Port 4: OHCI #2, EHCI
> HUB Port 5: OHCI #1, EHCI
>
> The OHCI ports go out to external USB connectors on the back panel of
> the machine, whereas the EHCI is connected up to an internal USB
> storage CDROM device and what appears to be another USB hub.

There's actually no such thing as an "EHCI port" or an "OHCI port".
Instead, there's a set of ports, each of which can be switched so
the USB differential data signals go up to either controller.

When EHCI starts, that switch points to EHCI so that devices can try
enumerating with high speed signaling.  When a device doesn't respond
to that "chirp", the EHCI root hub driver switches the port to the
companion controller.  (Which is OHCI here, UHCI on some PCs, etc.)


> The problem seems to be very strongly tied to timing.  For example
> simply adding "ignore_loglevel" to the kernel boot command line can
> make the problem go away.
>
> This got me thinking about your EHCI comment.
>
> If these controllers are going through the same HUB, things might go
> south if OHCI initialized first, then khubd et al. are asynchronously
> accessing the segments behind OHCI at the same time that the EHCI
> driver is initializing.  Perhaps, this is the kind of sequence of
> events which makes one of the root ports reset in such a way that
> the reset bit never clears.
>
> Given that this machine has 64 cpus, the likelihood of such parallel
> accesses is very high :-)
>
> Does this make any sense?

Yes, that's why I asked about EHCI.  My speculation would be that
OHCI starts the reset, and EHCI claims the port before it completes;
or contrariwise OHCI starts the reset right after EHCI claims it.

And there's some point in that process where a hardware race causes
the trouble you've observed.  I believe there are plenty of other
places where it's perfectly fine if EHCI grabs the port, or this
little race would have shown up many times before.

- Dave
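
The port switching described in this message can be pictured with a small userspace model. This is only a sketch, not code from the EHCI or OHCI drivers: the enum, the struct, and both function names are invented for illustration. The only behaviour it encodes is the one stated above, namely that ports start out routed to EHCI and that a device which does not answer the high speed chirp is handed to the companion controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: which controller a root port's data lines go to. */
enum port_owner { OWNER_COMPANION, OWNER_EHCI };

struct root_port {
	enum port_owner owner;
	bool connected;        /* a device is plugged in */
	bool high_speed;       /* the device answers the high speed "chirp" */
};

/* When EHCI starts, every port is switched to it. */
void ehci_start(struct root_port *ports, size_t n)
{
	assert(ports != NULL);
	for (size_t i = 0; i < n; i++)
		ports[i].owner = OWNER_EHCI;
}

/*
 * Enumerate one port as EHCI would: keep it if the device chirps,
 * otherwise hand it to the companion (OHCI or UHCI).
 */
enum port_owner ehci_enumerate(struct root_port *port)
{
	assert(port != NULL);
	if (port->connected && !port->high_speed)
		port->owner = OWNER_COMPANION;
	return port->owner;
}
```

In this model the owner of a port can change underneath the companion controller whenever ehci_start() runs, which is the window the speculated race would need.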




Re: [PATCH] lockdep: Avoid /proc/lockdep & lock_stat infinite output

2007-10-08 Thread Tim Pepper
On Tue 09 Oct at 02:30:11 +0100 [EMAIL PROTECTED] said:
> On Mon, Oct 08, 2007 at 06:15:51PM -0700, Tim Pepper wrote:
> > 
> > -   if (&class->lock_entry == all_lock_classes.next)
> > +   if (*pos == 0)
> >         seq_printf(m, "all lock classes:\n");
> 
> Do not generate output outside of ->show() and you won't have these
> problems.  That's where your infinite output crap comes from.
> 
> IOW, NAK - fix the underlying problem.

Aaah...OK.  Can we add something like the following then:




Document that output must only come from _show() and SEQ_START_TOKEN is how
a _start() indicates a header is to be printed.

Signed-off-by: Tim Pepper <[EMAIL PROTECTED]>
Cc: Al Viro <[EMAIL PROTECTED]>

---

--- linux-2.6.orig/include/linux/seq_file.h
+++ linux-2.6.23-rc9/include/linux/seq_file.h
@@ -36,9 +36,10 @@ ssize_t seq_read(struct file *, char __u
 loff_t seq_lseek(struct file *, loff_t, int);
 int seq_release(struct inode *, struct file *);
 int seq_escape(struct seq_file *, const char *, const char *);
+
+/* these may only be called from a (*show) function */
 int seq_putc(struct seq_file *m, char c);
 int seq_puts(struct seq_file *m, const char *s);
-
 int seq_printf(struct seq_file *, const char *, ...)
__attribute__ ((format (printf,2,3)));
 
@@ -48,6 +49,11 @@ int single_open(struct file *, int (*)(s
 int single_release(struct inode *, struct file *);
 int seq_release_private(struct inode *, struct file *);
 
+/*
+ * return SEQ_START_TOKEN in your (*start) function and test for
+ * (v == SEQ_START_TOKEN) in your (*show) function in order to
+ * print a header before your seq data
+ */
 #define SEQ_START_TOKEN ((void *)1)
 
 /*
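
The rule documented by the patch above (output only from show, header signalled by SEQ_START_TOKEN) can be tried out in userspace. What follows is a minimal sketch and not kernel code: demo_start(), demo_show() and demo_read_all() are hypothetical stand-ins for the start and show callbacks and for the read loop, and the item list is made up. Only the SEQ_START_TOKEN value is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SEQ_START_TOKEN ((void *)1)   /* same value as in seq_file.h */

/* Hypothetical data set standing in for the list of lock classes. */
static const char *const items[] = { "class-a", "class-b", "class-c" };
#define NR_ITEMS (sizeof(items) / sizeof(items[0]))

/* Position 0 is the header; position n (n >= 1) is items[n - 1]. */
static void *demo_start(size_t pos)
{
	if (pos == 0)
		return SEQ_START_TOKEN;
	return pos <= NR_ITEMS ? (void *)&items[pos - 1] : NULL;
}

/* All output is generated here, never in demo_start(). */
static void demo_show(char *buf, size_t len, void *v)
{
	size_t used = strlen(buf);

	assert(used < len);
	if (v == SEQ_START_TOKEN)
		snprintf(buf + used, len - used, "all lock classes:\n");
	else
		snprintf(buf + used, len - used, "%s\n", *(const char *const *)v);
}

/* Walk the whole sequence, restarting from demo_start() for every record. */
void demo_read_all(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t pos = 0; ; pos++) {
		void *v = demo_start(pos);

		if (v == NULL)
			break;
		demo_show(buf, len, v);
	}
}
```

The loop deliberately calls demo_start() once per record, mimicking a reader with a tiny buffer. Because the start function produces no output of its own, the header still appears exactly once.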


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Miller
From: David Brownell <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 20:34:12 -0700

> > However, when both OHCI and EHCI are built as modules (or, similarly
> > I guess, OHCI is built-in and EHCI is modular) there appears to be
> > nothing in userspace which makes sure EHCI gets loaded first.
> 
> The old /etc/hotplug/usb.rc script made sure to load those modules
> in the correct order:  EHCI first.

I expected to find something cute attempting to handle this under
/etc/udev, but I have failed so far :-)


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Stephen Hemminger
On Mon, 8 Oct 2007 16:06:03 -0700
Randy Dunlap <[EMAIL PROTECTED]> wrote:

> On Mon, 08 Oct 2007 16:43:10 -0600 Jonathan Corbet wrote:
> 
> > Sam Ravnborg <[EMAIL PROTECTED]> wrote:
> > 
> > > > Or maybe we need something much less formal that explains the purpose of
> > > > the four tags we use:
> > 
> > ...or maybe a combination?  How does the following patch look as a way
> > to describe how the tags are used and what Reviewed-by, in particular,
> > means?
> > 
> > Perhaps the DCO should move to this file as well?
> > 
> > jon
> 
> Just typos noted below...
> 
> > ---
> > 
> > Add a document on patch tags.
> > 
> > Signed-off-by: Jonathan Corbet <[EMAIL PROTECTED]>
> > 
> > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > index 43e89b1..fa1518b 100644
> > --- a/Documentation/00-INDEX
> > +++ b/Documentation/00-INDEX
> > @@ -284,6 +284,8 @@ parport.txt
> > - how to use the parallel-port driver.
> >  parport-lowlevel.txt
> > - description and usage of the low level parallel port functions.
> > +patch-tags
> > +   - description of the tags which can be added to patches
> >  pci-error-recovery.txt
> > - info on PCI error recovery.
> >  pci.txt
> > diff --git a/Documentation/patch-tags b/Documentation/patch-tags
> > new file mode 100644
> > index 000..fb5f8e1
> > --- /dev/null
> > +++ b/Documentation/patch-tags
> > @@ -0,0 +1,66 @@
> > +Patches headed for the mainline may contain a variety of tags documenting
> > +who played a hand in (or was at least aware of) its progress.  All of these
> > +tags have the form:
> > +
> > +   Something-done-by: Full name <[EMAIL PROTECTED]>
> > +
> > +These tags are:
> > +
> > +Signed-off-by:  A person adding a Signed-off-by tag is attesting that the
> > +   patch is, to the best of his or her knowledge, legally able
> > +   to be merged into the mainline and distributed under the
> > +   terms of the GNU General Public License, version 2.
All changes are licensed under the terms of the file modified.

(Some people seem not to understand that if the file is dual licensed,
then the changes are dual licensed.  If the file is GPL v2 only, then
the changes are GPL v2 only, ...)

> >  See
> > +   the Developer's Certificate of Origin, found in
> > +   Documentation/SubmittingPatches, for the precise meaning of
> > +   Signed-off-by.


> > +Acked-by:  The person named (who should be an active developer in the
> > +   area addressed by the patch) is aware of the patch and has
> > +   no objection to its inclusion.  An Acked-by tag does not
> > +   imply any involvement in the development of the patch or
> > +   that a detailed review was done.
> > +
> > +Reviewed-by:   The patch has been reviewed and found acceptible according
> 
>   acceptable
> 
> > +   to the Reviewer's Statement as found at the bottom of this
> > +   file.  A Reviewed-by tag is a statement of opinion that the
> > +   patch is an appropriate modification of the kernel without
> > +   any remaining serious technical issues.  Any interested
> > +   reviewer (who has done the work) can offer a Reviewed-by
> > +   tag for a patch.
> > +
> > +Cc:The person named was given the opportunity to comment on
> > +   the patch.  This is the only tag which might be added
> > +   without an explicit action by the person it names.
> > +
> > +Tested-by: The patch has been successfully tested (in some
> > +   environment) by the person named.
> > +
>

IMHO the other tags actually are a poor substitute for providing a
more complete description of the reviewer's involvement. It would be better
to have more complete responses like "the patch should be merged as is for
2.6.X but the following should be fixed, ..." etc. The certificate of origin
has meaning for legal things that have a more concrete definition, but the
existing process is about people making good (or bad) decisions based on
feedback and other data. Trying to reduce the feedback down to 3 Acks and
1 Review seems like noise. The problem is getting good reviews of new code in
a timely manner, not the descriptions of the result.


-- 
Stephen Hemminger <[EMAIL PROTECTED]>



Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Brownell
> However, when both OHCI and EHCI are built as modules (or, similarly
> I guess, OHCI is built-in and EHCI is modular) there appears to be
> nothing in userspace which makes sure EHCI gets loaded first.

The old /etc/hotplug/usb.rc script made sure to load those modules
in the correct order:  EHCI first.



Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Miller
From: Greg KH <[EMAIL PROTECTED]>
Date: Mon, 8 Oct 2007 20:10:49 -0700

> Yes it does, I'm seeing reports from some hardware companies of the very
> same thing.  If you serialize and load the ehci driver first, and then
> the ohci driver, that should fix the problem.
> 
> Does that also work for you?  Or are these drivers built into the
> kernel?

As coincidence would have it, I finally found a recipe for triggering
the issue, and it ties into what you're talking about here.

It happens only if I make sure OHCI gets loaded first and then EHCI
right afterwards.

It seems that indeed it is important for EHCI to get loaded first,
and in-kernel this is ensured by the link ordering.

However, when both OHCI and EHCI are built as modules (or, similarly
I guess, OHCI is built-in and EHCI is modular) there appears to be
nothing in userspace which makes sure EHCI gets loaded first.

When this triggers, in OHCI's root_port_reset(), the port status
register reads 0x111 in that inner-loop and the value never changes.
It stays like this forever.


Re: [PATCH 6/6] Use one zonelist that is filtered by nodemask

2007-10-08 Thread Nishanth Aravamudan
On 08.10.2007 [18:56:05 -0700], Christoph Lameter wrote:
> On Mon, 8 Oct 2007, Nishanth Aravamudan wrote:
> 
> > >  struct page * fastcall
> > >  __alloc_pages(gfp_t gfp_mask, unsigned int order,
> > >   struct zonelist *zonelist)
> > >  {
> > > + /*
> > > +  * Use a temporary nodemask for __GFP_THISNODE allocations. If the
> > > +  * cost of allocating on the stack or the stack usage becomes
> > > > +  * noticeable, allocate the nodemasks per node at boot or compile time
> > > +  */
> > > + if (unlikely(gfp_mask & __GFP_THISNODE)) {
> > > + nodemask_t nodemask;
> > > +
> > > + return __alloc_pages_internal(gfp_mask, order,
> > > + zonelist, nodemask_thisnode());
> > > + }
> > > +
> > >   return __alloc_pages_internal(gfp_mask, order, zonelist, NULL);
> > >  }
> > 
> > 
> > 
> > So alloc_pages_node() calls here and for THISNODE allocations, we go ask
> > nodemask_thisnode() for a nodemask...
> 
> H... nodemask_thisnode needs to be passed the zonelist.
> 
> > And nodemask_thisnode() always gives us a nodemask with only the node
> > the current process is running on set, I think?
> 
> Right.
> 
> 
> > That seems really wrong -- and would explain what Lee was seeing while
> > using my patches for the hugetlb pool allocator to use THISNODE
> > allocations. All the allocations would end up coming from whatever node
> > the process happened to be running on. This obviously messes up hugetlb
> > accounting, as I rely on THISNODE requests returning NULL if they go
> > off-node.
> > 
> > I'm not sure how this would be fixed, as __alloc_pages() no longer has
> > the nid to set in the mask.
> > 
> > Am I wrong in my analysis?
> 
> No you are right on target. The thisnode function must determine the
> node from the first zone of the zonelist.

It seems like I would use zonelist_node_idx() for this, along the lines of:

static nodemask_t *nodemask_thisnode(nodemask_t *nodemask,
					struct zonelist *zonelist)
{
	int nid = zonelist_node_idx(zonelist->_zonerefs);
	/* Build a nodemask for just this node */
	nodes_clear(*nodemask);
	node_set(nid, *nodemask);

	return nodemask;
}

But I think I need to check that zonelist->_zonerefs->zone is non-NULL, given
this definition of zonelist_node_idx():

static inline int zonelist_node_idx(struct zoneref *zoneref)
{
#ifdef CONFIG_NUMA
	/* zone_to_nid not available in this context */
	return zoneref->zone->node;
#else
	return 0;
#endif /* CONFIG_NUMA */
}

and this comment in __alloc_pages_internal():

	z = zonelist->_zonerefs;  /* the list of zones suitable for gfp_mask */

	if (unlikely(!z->zone)) {
		/*
		 * Happens if we have an empty zonelist as a result of
		 * GFP_THISNODE being used on a memoryless node
		 */
		return NULL;
	}
	...

It seems like zoneref->zone may be NULL in zonelist_node_idx()? Maybe
someone else should look into resolving this :)

Thanks,
Nish

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center
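
The NULL check discussed in this message can be sketched in userspace as follows. This is an illustration under stated assumptions, not the eventual kernel fix: the struct definitions are simplified stand-ins for the kernel types named in the thread, the nodemask is reduced to a single word, and returning NULL for an empty zonelist is one possible convention.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types named in the thread. */
struct zone { int node; };
struct zoneref { struct zone *zone; };
struct zonelist { struct zoneref _zonerefs[4]; };
typedef struct { unsigned long bits; } nodemask_t;   /* up to one word of nodes */

/*
 * Build a mask holding only the node of the first zone in the zonelist.
 * Returns NULL for an empty zonelist (first zone pointer is NULL), which
 * is the memoryless-node case mentioned in __alloc_pages_internal().
 */
nodemask_t *nodemask_thisnode(nodemask_t *nodemask, struct zonelist *zonelist)
{
	struct zoneref *z;

	assert(nodemask != NULL && zonelist != NULL);
	z = zonelist->_zonerefs;
	if (z->zone == NULL)
		return NULL;

	nodemask->bits = 1UL << z->zone->node;
	return nodemask;
}
```

One caveat with this convention: in the quoted __alloc_pages(), a NULL nodemask means "no filtering", so a caller would have to fail the allocation on NULL rather than pass it straight through.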


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread Greg KH
On Mon, Oct 08, 2007 at 04:54:20PM -0700, David Miller wrote:
> From: David Miller <[EMAIL PROTECTED]>
> Date: Sun, 07 Oct 2007 00:51:56 -0700 (PDT)
> 
> > From: David Brownell <[EMAIL PROTECTED]>
> > Date: Sun, 07 Oct 2007 00:31:41 -0700
> > 
> > > Are the other ports still behaving?  Is EHCI maybe trying to switch
> > > ownership of that port?  Is maybe the (newish) autosuspend stuff
> > > kicking in?
> > 
> > I wouldn't know, the machine hangs and doesn't get any further.
> 
> To add some more information here, I think the EHCI idea might
> hold some water.
> 
> What I have here are two NEC OHCI USB interfaces and one NEC EHCI
> USB interface on PCI.  Apparently they all go through a shared
> USB hub, mapped like this:
> 
> HUB Port 1: OHCI #1, EHCI
> HUB Port 2: OHCI #2, EHCI
> HUB Port 3: OHCI #1, EHCI
> HUB Port 4: OHCI #2, EHCI
> HUB Port 5: OHCI #1, EHCI
> 
> The OHCI ports go out to external USB connectors on the back panel of
> the machine, whereas the EHCI is connected up to an internal USB
> storage CDROM device and what appears to be another USB hub.
> 
> The problem seems to be very strongly tied to timing.  For example
> simply adding "ignore_loglevel" to the kernel boot command line can
> make the problem go away.
> 
> This got me thinking about your EHCI comment.
> 
> If these controllers are going through the same HUB, things might go
> south if OHCI initialized first, then khubd et al. are asynchronously
> accessing the segments behind OHCI at the same time that the EHCI
> driver is initializing.  Perhaps, this is the kind of sequence of
> events which makes one of the root ports reset in such a way that
> the reset bit never clears.
> 
> Given that this machine has 64 cpus, the likelihood of such parallel
> accesses is very high :-)
> 
> Does this make any sense?

Yes it does, I'm seeing reports from some hardware companies of the very
same thing.  If you serialize and load the ehci driver first, and then
the ohci driver, that should fix the problem.

Does that also work for you?  Or are these drivers built into the
kernel?

thanks,

greg k-h


Re: -rt more realtime scheduling issues

2007-10-08 Thread Steven Rostedt
On Mon, Oct 08, 2007 at 11:45:23AM -0700, Mike Kravetz wrote:
> On Fri, Oct 05, 2007 at 07:15:48PM -0700, Mike Kravetz wrote:
> > After applying the fix to try_to_wake_up() I was still seeing some large
> > latencies for realtime tasks.
> 
> I've been looking for places in the code where reschedule IPIs should
> be sent in the case of 'overload' to redistribute RealTime tasks based
> on priority.  However, an even more basic question to ask might be:  Is
> the use of reschedule IPIs reliable enough for this purpose?  In the
> code, there is the following comment:
> 
> /*
>  * this function sends a 'reschedule' IPI to another CPU.
>  * it goes straight through and wastes no time serializing
>  * anything. Worst case is that we lose a reschedule ...
>  */
> 
> After a quick read of the code, it does appear that reschedules can
> be lost if the IPI is sent at just the right time in schedule
> processing.  Can someone confirm this is actually the case?
> 
> The issue I see is that the 'rt_overload' mechanism depends on reschedule
> IPIs for RealTime scheduling semantics.  If this is not a reliable
> mechanism then this can lead to breakdowns in RealTime scheduling semantics.
> 
> Are these accurate statements?  I'll start working on a reliable delivery
> mechanism for RealTime scheduling.  But, I just want to make sure that
> is really necessary.

For i386 I don't think so. Seems that the interrupt handler will set the
current task to "need_resched" and on exit of the interrupt handler, the
schedule should take place. I don't see the race (that doesn't mean
there isn't one).

For x86_64 though, I don't think that we schedule. All the reschedule
vector does is return with a comment:

/*
 * Reschedule call back. Nothing to do,
 * all the work is done automatically when
 * we return from the interrupt.
 */
asmlinkage void smp_reschedule_interrupt(void)
{
	ack_APIC_irq();
}

I'm thinking that this was the case for i386 a while back, and we fixed
it for RT.

/me does a quick search...

http://lkml.org/lkml/2005/5/13/174

Yep!  This is a bug in x86_64. I'll fix this up tomorrow and send out a
patch.

-- Steve



[PATCH RT] fix rt-task scheduling issue

2007-10-08 Thread Steven Rostedt
Mike,

Can you attach your Signed-off-by to this patch, please.


On Fri, Oct 05, 2007 at 07:15:48PM -0700, Mike Kravetz wrote:
> Hi Ingo,
> 
> After applying the fix to try_to_wake_up() I was still seeing some large
> latencies for realtime tasks.  Some debug code pointed out two additional
> causes of these latencies.  I have put fixes into my 'old' kernel and the
> scheduler related latencies have gone away.  I'm pretty confident that
> one of these bugs still exists in the latest RT patch set.  Not so sure
> about the other.  But, I wanted to describe in detail so that you could
> address in the latest version of the code if applicable.
> 
> finish_task_switch() contains the following code:
> 
> #if defined(CONFIG_PREEMPT_RT) && defined(CONFIG_SMP)
>   /*
>* If we pushed an RT task off the runqueue,
>* then kick other CPUs, they might run it:
>*/
>   if (unlikely(rt_task(current) && prev->se.on_rq && rt_task(prev))) {
>   schedstat_inc(rq, rto_schedule);
>   smp_send_reschedule_allbutself_cpumask(current->cpus_allowed);
>   }
> #endif
> 
> My debug code found instances where more than one realtime task got
> put on the runqueue before the __schedule() was invoked.  So, current
> would be a realtime task, but prev was not realtime.  And, there was
> another (lesser priority, or last in) realtime task on the queue.  I
> believe that in this case we would still want to send the IPIs.  In my
> kernel I changed the test to be:
> 
>   if (unlikely(rt_task(current) && rq->rt_nr_running > 1)) {
> 
> After this change, I definitely saw some long latencies go away.

I definitely agree with your analysis.

> 
> The other place of concern is in the routine pull_task().  I was a
> little surprised to see realtime tasks moved around via normal load
> balancing.  But, my debug code did point this out.  In the code for
> my old kernel, the routines end with:
> 
> /*
>  * Note that idle threads have a prio of MAX_PRIO, for this test
>  * to be always true for them.
>  */
> if (TASK_PREEMPTS_CURR(p, this_rq))
> resched_task(this_rq->curr);
> 
> This reminded me very much of the situation/code in try_to_wake_up().
> If pull_task() pulled in a realtime task, then I think it should also
> deal with the case where TASK_PREEMPTS_CURR(p, this_rq) is false.  So
> I changed the code in my kernel to be:
> 
>   /*
>* Note that idle threads have a prio of MAX_PRIO, for this test
>* to be always true for them.
>*/
>   if (TASK_PREEMPTS_CURR(p, this_rq)) {
>   resched_task(this_rq->curr);
> 
>   } else if (unlikely(rt_task(p))) {
>   /* no appropriate rt_overload counter goes here */
>   smp_send_reschedule_allbutself();
>   }

I'm thinking that the first change would actually make this one
obsolete. The checking at the time of scheduling should cover most cases
where multiple rt tasks are being queued on the same CPU.  When we see
that the rt tasks are bunching up on a queue we should handle it then.
Which I would think is at the time of schedule, and the time a task is
queued (try_to_wake_up). Hopefully this is enough.

> 
> To be perfectly honest, I don't know if this change helped eliminate
> any of the large latencies I was seeing.  I made this change first,
> and was still seeing some large latencies.  I then made the modification
> to finish_task_switch() and all my scheduler related latencies went
> away.  Entirely possible this change had no impact.  Also, the above

I'm thinking it may have had little to no effect. The first change seems
to be the culprit.

> code is replaced in the latest kernels with:
> 
>   check_preempt_curr(this_rq, p);
> 
> What check_preempt_curr() does is not immediately obvious to me. So,
> this may not apply at all.  Just something to think about.

I also don't want to put too many IPI reschedules when we see that we
have more than one rt task on queue. I can imagine an IPI scheduling
storm if we have one more rt task than CPUs. So sending the IPI when a
task switch actually occurs seems appropriate.

-- Steve

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>


Index: linux-2.6.23-rc9-rt2/kernel/sched.c
===
--- linux-2.6.23-rc9-rt2.orig/kernel/sched.c
+++ linux-2.6.23-rc9-rt2/kernel/sched.c
@@ -2207,7 +2207,7 @@ static inline void finish_task_switch(st
 * If we pushed an RT task off the runqueue,
 * then kick other CPUs, they might run it:
 */
-   if (unlikely(rt_task(current) && prev->se.on_rq && rt_task(prev))) {
+   if (unlikely(rt_task(current) && rq->rt_nr_running > 1)) {
schedstat_inc(rq, rto_schedule);
smp_send_reschedule_allbutself_cpumask(current->cpus_allowed);
}

Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread Mark Fasheh
On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote:
> > block_page_mkwrite() is just using generic interfaces to do this,
> > same as pretty much any write() system call. The idea was to make it
> > as similar to the write() call path as possible...
> >
> > However, unlike generic_file_buffered_write(), we are not calling
> > balance_dirty_pages_ratelimited(mapping) between
> > ->prepare/commit_write call pairs.  Perhaps this should be added to
> > block_page_mkwrite() after the page is unlocked
> 
> That sounds pretty sane, in terms of matching with
> generic_file_buffered_write.

I agree. We could also insert a call to balance_dirty_pages_ratelimited() in
__ocfs2_page_mkwrite.
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[EMAIL PROTECTED]


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Steven Rostedt
On Mon, Oct 08, 2007 at 10:16:26PM +0200, Rafael J. Wysocki wrote:
> 
> Tested-by: is sort of trivial for a fix patch, for example, if a bug reporter
> confirms that the proposed patch actually fixes the issue.  IMHO it wouldn't
> be practical to complicate that.
>

I see two types of Tested-by.

1) As you stated, a fix to a problem that the reporter has seen. So
that someone could state a "fixes issue" in the change log and that
would simply mean that the tester has seen a problem, and the attached
patch fixes it.

2) Someone has a test suite for the area that the change affects. So if
someone has developed a networking test suite and a patch changes some
networking logic, the Tested-by could be that the tester actually ran
specific tests.  This should require a more detailed explanation of what
was done. Or, at the very least, a pointer to a web page of the tests that
were run.

So for the user that sees an issue, then gets a patch, perhaps all they
need to do is add a "fixed problem" or "works now" in the change log to
denote that the patch actually fixes (or seems to fix) the problem that
they previously saw. This shouldn't be too hard.

But for those that run test suites, they should be smart enough to put
in more documentation into the change log to state how it was tested.

Perhaps we need to add yet another sign-off tag.

"Verified-by", which could be for the user that saw an issue and the
patch now fixes it. That user could just add the "Verified-by" to the
patch to acknowledge (and record) that the patch did fix the issue.

The "Tested-by" can be used for patches that are run through a test
suite.

Just a thought.

-- Steve



Re: [PATCH 6/6] Use one zonelist that is filtered by nodemask

2007-10-08 Thread Christoph Lameter
On Mon, 8 Oct 2007, Nishanth Aravamudan wrote:

> >  struct page * fastcall
> >  __alloc_pages(gfp_t gfp_mask, unsigned int order,
> > struct zonelist *zonelist)
> >  {
> > +   /*
> > +* Use a temporary nodemask for __GFP_THISNODE allocations. If the
> > +* cost of allocating on the stack or the stack usage becomes
> > +* noticeable, allocate the nodemasks per node at boot or compile time
> > +*/
> > +   if (unlikely(gfp_mask & __GFP_THISNODE)) {
> > +   nodemask_t nodemask;
> > +
> > +   return __alloc_pages_internal(gfp_mask, order,
> > +   zonelist, nodemask_thisnode());
> > +   }
> > +
> > return __alloc_pages_internal(gfp_mask, order, zonelist, NULL);
> >  }
> 
> 
> 
> So alloc_pages_node() calls here and for THISNODE allocations, we go ask
> nodemask_thisnode() for a nodemask...

H... nodemask_thisnode needs to be passed the zonelist.

> And nodemask_thisnode() always gives us a nodemask with only the node
> the current process is running on set, I think?

Right.

 
> That seems really wrong -- and would explain what Lee was seeing while
> using my patches for the hugetlb pool allocator to use THISNODE
> allocations. All the allocations would end up coming from whatever node
> the process happened to be running on. This obviously messes up hugetlb
> accounting, as I rely on THISNODE requests returning NULL if they go
> off-node.
> 
> I'm not sure how this would be fixed, as __alloc_pages() no longer has
> the nid to set in the mask.
> 
> Am I wrong in my analysis?

No you are right on target. The thisnode function must determine the node 
from the first zone of the zonelist.


Re: parallel networking

2007-10-08 Thread Jeff Garzik

David Miller wrote:
> From: Jeff Garzik <[EMAIL PROTECTED]>
> Date: Mon, 08 Oct 2007 10:22:28 -0400
> 
> > In terms of overall parallelization, both for TX as well as RX, my gut 
> > feeling is that we want to move towards an MSI-X, multi-core friendly 
> > model where packets are LIKELY to be sent and received by the same set 
> > of [cpus | cores | packages | nodes] as the [userland] processes 
> > dealing with the data.
> 
> The problem is that the packet schedulers want global guarantees
> on packet ordering, not flow centric ones.
> 
> That is the issue Jamal is concerned about.


Oh, absolutely.

I think, fundamentally, any amount of cross-flow resource management 
done in software is an obstacle to concurrency.


That's not a value judgement, just a statement of fact.

"traffic cops" are intentional bottlenecks we add to the process, to 
enable features like priority flows, filtering, or even simple socket 
fairness guarantees.  Each of those bottlenecks serves a valid purpose, 
but at the end of the day, it's still a bottleneck.


So, improving concurrency may require turning off useful features that 
nonetheless hurt concurrency.




> The more I think about it, the more inevitable it seems that we really
> might need multiple qdiscs, one for each TX queue, to pull this full
> parallelization off.
> 
> But the semantics of that don't smell so nice either.  If the user
> attaches a new qdisc to "ethN", does it go to all the TX queues, or
> what?
> 
> All of the traffic shaping technology deals with the device as a unary
> object.  It doesn't fit to multi-queue at all.


Well the easy solutions to networking concurrency are

* use virtualization to carve up the machine into chunks

* use multiple net devices

Since new NIC hardware is actively trying to be friendly to 
multi-channel/virt scenarios, either of these is reasonably 
straightforward given the current state of the Linux net stack.  Using 
multiple net devices is especially attractive because it works very well 
with the existing packet scheduling.


Both unfortunately impose a burden on the developer and admin, to force 
their apps to distribute flows across multiple [VMs | net devs].



The third alternative is to use a single net device, with SMP-friendly 
packet scheduling.  Here you run into the problems you described "device 
as a unary object" etc. with the current infrastructure.


With multiple TX rings, consider that we are pushing the packet 
scheduling from software to hardware...  which implies

* hardware-specific packet scheduling
* some TC/shaping features not available, because hardware doesn't 
support it
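
To make the flow-centric vs. global ordering trade-off concrete, here is a
minimal userspace sketch of steering a flow to one of several TX queues.
Everything in it is an assumption for illustration: struct flow_key, mix32()
and pick_tx_queue() are hypothetical names, and the hash is a generic integer
mixer, not anything the net stack or a NIC actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key; the field names are illustrative only. */
struct flow_key {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Generic 32-bit integer mixer (an assumption, not a kernel hash). */
static uint32_t mix32(uint32_t x)
{
	x ^= x >> 16;
	x *= 0x7feb352dU;
	x ^= x >> 15;
	x *= 0x846ca68bU;
	x ^= x >> 16;
	return x;
}

/*
 * Map a flow to one of nqueues TX queues.  Packets of the same flow
 * always land on the same queue, so ordering holds per flow, but
 * nothing orders packets across different queues.
 */
unsigned int pick_tx_queue(const struct flow_key *k, unsigned int nqueues)
{
	uint32_t h;

	assert(nqueues != 0);
	h = mix32(k->saddr);
	h = mix32(h ^ k->daddr);
	h = mix32(h ^ (((uint32_t)k->sport << 16) | k->dport));
	return h % nqueues;
}
```

With a mapping like this, a qdisc that wants one global order across all
flows has nothing to hook into, which is the semantic problem described
above.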


Jeff






Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Yan Zheng
2007/10/9, Andrew Morton <[EMAIL PROTECTED]>:
> Perhaps Yan Zheng can tell us what test was used to demonstrate this?

I found it by review; I only tested that remap_file_pages works
when the VM_CAN_NONLINEAR flag is set.


Re: [PATCH] lockdep: Avoid /proc/lockdep & lock_stat infinite output

2007-10-08 Thread Al Viro
On Mon, Oct 08, 2007 at 06:15:51PM -0700, Tim Pepper wrote:
> 
> When a read() requests an amount of data smaller than the amount of data
> that the seq_file's foo_show() outputs, the output starts looping and
> outputs the "stuck" element's data infinitely.  There may be multiple
> sequential calls to foo_start(), foo_next()/foo_show(), and foo_stop()
> for a single open with sequential read of the file.  The _start() does not
> have to start with the 0th element and _show() might be called multiple
> times in a row for the same element for a given open/read of the seq_file.
>  
>  static void *l_start(struct seq_file *m, loff_t *pos)
>  {
> - struct lock_class *class = m->private;
> + struct lock_class *class;
> + loff_t i = 0;
>  
> - if (&class->lock_entry == all_lock_classes.next)
> + if (*pos == 0)
>   seq_printf(m, "all lock classes:\n");

Do not generate output outside of ->show() and you won't have these
problems.  That's where your infinite output crap comes from.

IOW, NAK - fix the underlying problem.
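
A userspace sketch of that advice, for illustration only: the header is
treated as an ordinary record at position 0 (in the spirit of the kernel's
SEQ_START_TOKEN convention) and is emitted from the show step, never from
start.  The names it_start(), it_show() and dump() are hypothetical, and
dump() only imitates repeated read() calls; it is not the seq_file
implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define START_TOKEN ((const char *)1)	/* stands in for SEQ_START_TOKEN */

static const char *items[] = { "lock-a", "lock-b", "lock-c" };
#define NITEMS (sizeof(items) / sizeof(items[0]))

/* start(): purely a function of pos.  pos 0 is the header token. */
static const char *it_start(long pos)
{
	if (pos == 0)
		return START_TOKEN;
	return (size_t)pos <= NITEMS ? items[pos - 1] : NULL;
}

/* show(): the only place that produces output. */
static int it_show(const char *v, char *buf, size_t len)
{
	if (v == START_TOKEN)
		return snprintf(buf, len, "all lock classes:\n");
	return snprintf(buf, len, "%s\n", v);
}

/*
 * Imitate a series of read() calls that each show at most per_call
 * records and then restart from the saved position.
 */
size_t dump(char *out, size_t outlen, int per_call)
{
	size_t used = 0;
	long pos = 0;
	const char *v;

	assert(per_call > 0);
	while ((v = it_start(pos)) != NULL) {	/* a new "read()" begins */
		int n;

		for (n = 0; v != NULL && n < per_call; n++) {
			int w = it_show(v, out + used, outlen - used);

			assert(w >= 0 && (size_t)w < outlen - used);
			used += (size_t)w;
			v = it_start(++pos);	/* next(): advance position */
		}
	}
	return used;
}
```

Because start depends only on the position, the output is the same no
matter how many records each call manages to show, and the header appears
exactly once.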


Re: [PATCH] param_sysfs_builtin memchr argument fix

2007-10-08 Thread Dave Young
> > If memchr argument is longer than strlen(kp->name), there will be some
> > weird result.
> 
> Just to clarify:  this was causing duplicate filenames in sysfs ?
Yes, it will cause duplicate filenames in sysfs. For me, the "nousb"
parameter causes "usbcore" to be created twice.
> 
> 
> > Signed-off-by: Dave Young <[EMAIL PROTECTED]>
> >
> > ---
> > kernel/params.c |8 +++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff -upr linux/kernel/params.c linux.new/kernel/params.c
> > --- linux/kernel/params.c 2007-10-08 14:30:06.0 +0800
> > +++ linux.new/kernel/params.c 2007-10-08 15:13:04.0 +0800
> > @@ -592,11 +592,17 @@ static void __init param_sysfs_builtin(v
> >
> >   for (i=0; i < __stop___param - __start___param; i++) {
> >   char *dot;
> > + size_t kplen;
> >
> >   kp = &__start___param[i];
> > + kplen = strlen(kp->name);
> >
> >   /* We do not handle args without periods. */
> > - dot = memchr(kp->name, '.', MAX_KBUILD_MODNAME);
> > + if (kplen > MAX_KBUILD_MODNAME) {
> > + DEBUGP("kernel parameter %s is too long\n", kp->name);
> 
> how about
> kernel parameter name %s is too long
> or
> kernel parameter name is too long: %s
> 
> (primary is addition of "name")
Yes, "name" should be added, thanks.
> 
> > + continue;
> > + }
> > + dot = memchr(kp->name, '.', kplen);
> >   if (!dot) {
> >   DEBUGP("couldn't find period in %s\n", kp->name);
> >   continue;
> > -
> 

Regards
dave



Signed-off-by: Dave Young <[EMAIL PROTECTED]> 

---
kernel/params.c |8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff -upr linux/kernel/params.c linux.new/kernel/params.c
--- linux/kernel/params.c   2007-10-08 14:30:06.0 +0800
+++ linux.new/kernel/params.c   2007-10-09 09:16:55.0 +0800
@@ -592,11 +592,17 @@ static void __init param_sysfs_builtin(v
 
for (i=0; i < __stop___param - __start___param; i++) {
char *dot;
+   size_t kplen;
 
kp = &__start___param[i];
+   kplen = strlen(kp->name);
 
/* We do not handle args without periods. */
-   dot = memchr(kp->name, '.', MAX_KBUILD_MODNAME);
+   if (kplen > MAX_KBUILD_MODNAME) {
+   DEBUGP("kernel parameter name is too long: %s\n", kp->name);
+   continue;
+   }
+   dot = memchr(kp->name, '.', kplen);
if (!dot) {
DEBUGP("couldn't find period in %s\n", kp->name);
continue;
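
The effect of bounding the search by strlen() can be reproduced in
userspace.  This is only a sketch: find_dot() is a hypothetical helper, and
MAX_NAME is a stand-in whose value is an assumption, not the real
MAX_KBUILD_MODNAME.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 32	/* stand-in for MAX_KBUILD_MODNAME (assumed value) */

/*
 * Return a pointer to the first '.' inside the NUL-terminated name,
 * or NULL if there is none or the name is too long.  The search never
 * looks past the terminating NUL.
 */
const char *find_dot(const char *name)
{
	size_t len = strlen(name);

	assert(name != NULL);
	if (len > MAX_NAME)
		return NULL;
	return memchr(name, '.', len);
}
```

If two names sit back to back in memory, e.g. "nousb" immediately followed
by "usbcore.autosuspend", a fixed-length memchr() over the first name runs
into the second one and reports its period, while find_dot() on the first
name returns NULL.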


[PATCH] lockdep: Avoid /proc/lockdep & lock_stat infinite output

2007-10-08 Thread Tim Pepper

When a read() requests an amount of data smaller than the amount of data
that the seq_file's foo_show() outputs, the output starts looping and
outputs the "stuck" element's data infinitely.  There may be multiple
sequential calls to foo_start(), foo_next()/foo_show(), and foo_stop()
for a single open with sequential read of the file.  The _start() does not
have to start with the 0th element and _show() might be called multiple
times in a row for the same element for a given open/read of the seq_file.

Signed-off-by: Tim Pepper <[EMAIL PROTECTED]>
Cc: Peter Zijlstra <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>

---

Assuming people are fine with this, it should probably find its way
to stable.

If you haven't seen the infinite output: for me a simple
'cat /proc/lockdep' generally triggers it, as does a 'cat /proc/lock_stat'
piped to a file.  For either file, a dd with the default bs=512 (or
smaller) should also do the job.

With this change to the lock_stat handler the data->iter member no longer
attempts to hold state across calls, so it could be taken out of the
lock_stat_seq struct and replaced by a local variable in each function
but that isn't a clear win to me so I just left it.

--- linux-2.6.23-rc9.orig/kernel/lockdep_proc.c
+++ linux-2.6.23-rc9/kernel/lockdep_proc.c
@@ -34,19 +34,23 @@ static void *l_next(struct seq_file *m, 
  lock_entry);
else
class = NULL;
-   m->private = class;
 
return class;
 }
 
 static void *l_start(struct seq_file *m, loff_t *pos)
 {
-   struct lock_class *class = m->private;
+   struct lock_class *class;
+   loff_t i = 0;
 
-   if (&class->lock_entry == all_lock_classes.next)
+   if (*pos == 0)
seq_printf(m, "all lock classes:\n");
 
+   list_for_each_entry(class, &all_lock_classes, lock_entry) {
+   if (i++ == *pos)
+   return class;
+   }
+   return NULL;
-   return class;
 }
 
 static void l_stop(struct seq_file *m, void *v)
@@ -101,7 +105,7 @@ static void print_name(struct seq_file *
 static int l_show(struct seq_file *m, void *v)
 {
unsigned long nr_forward_deps, nr_backward_deps;
-   struct lock_class *class = m->private;
+   struct lock_class *class = v;
struct lock_list *entry;
char c1, c2, c3, c4;
 
@@ -523,12 +527,15 @@ static void *ls_start(struct seq_file *m
 {
struct lock_stat_seq *data = m->private;
 
-   if (data->iter == data->stats)
-   seq_header(m);
+   data->iter = data->stats;
+   data->iter += *pos;
 
-   if (data->iter == data->iter_end)
+   if (data->iter >= data->iter_end)
data->iter = NULL;
 
+   if (data->iter == data->stats)
+   seq_header(m);
+
return data->iter;
 }
 


Re: [PATCH 6/6] Use one zonelist that is filtered by nodemask

2007-10-08 Thread Nishanth Aravamudan
On 28.09.2007 [15:25:27 +0100], Mel Gorman wrote:
> 
> Two zonelists exist so that GFP_THISNODE allocations will be guaranteed
> to use memory only from a node local to the CPU. As we can now filter the
> zonelist based on a nodemask, we filter the standard node zonelist for zones
> on the local node when GFP_THISNODE is specified.
> 
> When GFP_THISNODE is used, a temporary nodemask is created with only the
> node local to the CPU set. This allows us to eliminate the second zonelist.
> 
> Signed-off-by: Mel Gorman <[EMAIL PROTECTED]>
> Acked-by: Christoph Lameter <[EMAIL PROTECTED]>



> diff -rup -X /usr/src/patchset-0.6/bin//dontdiff 
> linux-2.6.23-rc8-mm2-030_filter_nodemask/include/linux/gfp.h 
> linux-2.6.23-rc8-mm2-040_use_one_zonelist/include/linux/gfp.h
> --- linux-2.6.23-rc8-mm2-030_filter_nodemask/include/linux/gfp.h  
> 2007-09-28 15:49:57.0 +0100
> +++ linux-2.6.23-rc8-mm2-040_use_one_zonelist/include/linux/gfp.h 
> 2007-09-28 15:55:03.0 +0100

[Reordering the chunks to make my comments a little more logical]



> -static inline struct zonelist *node_zonelist(int nid, gfp_t flags)
> +static inline struct zonelist *node_zonelist(int nid)
>  {
> - return NODE_DATA(nid)->node_zonelists + gfp_zonelist(flags);
> + return &NODE_DATA(nid)->node_zonelist;
>  }
> 
>  #ifndef HAVE_ARCH_FREE_PAGE
> @@ -198,7 +186,7 @@ static inline struct page *alloc_pages_n
>   if (nid < 0)
>   nid = numa_node_id();
> 
> - return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
> + return __alloc_pages(gfp_mask, order, node_zonelist(nid));
>  }

This is alloc_pages_node(), and converting the nid to a zonelist means
that lower levels (specifically __alloc_pages() here) are not aware of
nids, as far as I can tell. This isn't a change, I just want to make
sure I understand...



>  struct page * fastcall
>  __alloc_pages(gfp_t gfp_mask, unsigned int order,
>   struct zonelist *zonelist)
>  {
> + /*
> +  * Use a temporary nodemask for __GFP_THISNODE allocations. If the
> +  * cost of allocating on the stack or the stack usage becomes
> +  * noticable, allocate the nodemasks per node at boot or compile time
> +  */
> + if (unlikely(gfp_mask & __GFP_THISNODE)) {
> + nodemask_t nodemask;
> +
> + return __alloc_pages_internal(gfp_mask, order,
> + zonelist, nodemask_thisnode(&nodemask));
> + }
> +
>   return __alloc_pages_internal(gfp_mask, order, zonelist, NULL);
>  }



So alloc_pages_node() calls here and for THISNODE allocations, we go ask
nodemask_thisnode() for a nodemask...

> +static nodemask_t *nodemask_thisnode(nodemask_t *nodemask)
> +{
> + /* Build a nodemask for just this node */
> + int nid = numa_node_id();
> +
> + nodes_clear(*nodemask);
> + node_set(nid, *nodemask);
> +
> + return nodemask;
> +}



And nodemask_thisnode() always gives us a nodemask with only the node
the current process is running on set, I think?

That seems really wrong -- and would explain what Lee was seeing while
using my patches for the hugetlb pool allocator to use THISNODE
allocations. All the allocations would end up coming from whatever node
the process happened to be running on. This obviously messes up hugetlb
accounting, as I rely on THISNODE requests returning NULL if they go
off-node.

I'm not sure how this would be fixed, as __alloc_pages() no longer has
the nid to set in the mask.

Am I wrong in my analysis?
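
To illustrate the concern in isolation, here is a small userspace sketch of
a mask built from the node the caller asked for rather than from the node
the code happens to run on.  All names here (nodemask_for_node(),
node_isset_in(), MAX_NODES, this nodemask_t layout) are hypothetical
stand-ins, not the kernel's nodemask API, and this is not a proposed fix.

```c
#include <assert.h>

#define MAX_NODES 64	/* assumption for this sketch */

typedef struct { unsigned long long bits; } nodemask_t;

/* Build a mask containing only the node that was explicitly requested. */
nodemask_t *nodemask_for_node(nodemask_t *mask, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	mask->bits = 1ULL << nid;
	return mask;
}

/* Return 1 if nid is set in the mask, 0 otherwise. */
int node_isset_in(const nodemask_t *mask, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return (int)((mask->bits >> nid) & 1ULL);
}
```

A THISNODE request for node 2 issued while running on node 0 would then be
filtered against node 2 only, which is what the hugetlb accounting relies
on.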

Thanks,
Nish

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center


Re: sleepy linux 2.6.23-rc9

2007-10-08 Thread H. Peter Anvin

Clemens Koller wrote:


> > When I boot init=/bin/bash vga=791 (vesa framebuffer), most wakeups
> > are caused by cursor painting (I should fix that some day, I
> > guess). But... the cursor blinking does not even work properly!
> >
> > It blinks at normal speed, then (randomly) it blinks slowly, then gets
> > back to normal speed, then inserts longer delay.
>
> Is the effect a beat that it has roughly the frequency of your Notebooks
> screen refresh rate (60Hz)? (in german: Schwebung)
>
> > The effect is so nice that I thought about youtube ;-). Thinkpad
> > x60.. question is, how to debug it?
>
> No idea... check where the register of the HW cursor blink rate
> gets written? But as it seems to be so nice, please submit a patch
> which enables this for all platforms. ;-)



For the VESA framebuffer I would assume the cursor blinking is done in 
software (if done at all.)


-hpa


[PATCH 2/2] aic94xx: Use sas_request_addr() to provide SAS addr if the adapter lacks one

2007-10-08 Thread Darrick J. Wong
If the aic94xx chip doesn't have a SAS address in the chip's flash memory,
make libsas get one for us.  Also clean out some old code that had been
used to do this in the past.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/aic94xx/aic94xx.h  |   16 
 drivers/scsi/aic94xx/aic94xx_hwi.c  |   21 ++---
 drivers/scsi/aic94xx/aic94xx_init.c |2 --
 3 files changed, 10 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/aic94xx/aic94xx.h b/drivers/scsi/aic94xx/aic94xx.h
index 32f513b..aee235f 100644
--- a/drivers/scsi/aic94xx/aic94xx.h
+++ b/drivers/scsi/aic94xx/aic94xx.h
@@ -58,7 +58,6 @@
 
 extern struct kmem_cache *asd_dma_token_cache;
 extern struct kmem_cache *asd_ascb_cache;
-extern char sas_addr_str[2*SAS_ADDR_SIZE + 1];
 
 static inline void asd_stringify_sas_addr(char *p, const u8 *sas_addr)
 {
@@ -68,21 +67,6 @@ static inline void asd_stringify_sas_addr(char *p, const u8 
*sas_addr)
*p = '\0';
 }
 
-static inline void asd_destringify_sas_addr(u8 *sas_addr, const char *p)
-{
-   int i;
-   for (i = 0; i < SAS_ADDR_SIZE; i++) {
-   u8 h, l;
-   if (!*p)
-   break;
-   h = isdigit(*p) ? *p-'0' : *p-'A'+10;
-   p++;
-   l = isdigit(*p) ? *p-'0' : *p-'A'+10;
-   p++;
-   sas_addr[i] = (h<<4) | l;
-   }
-}
-
 struct asd_ha_struct;
 struct asd_ascb;
 
diff --git a/drivers/scsi/aic94xx/aic94xx_hwi.c 
b/drivers/scsi/aic94xx/aic94xx_hwi.c
index 0cd7eed..1dc5400 100644
--- a/drivers/scsi/aic94xx/aic94xx_hwi.c
+++ b/drivers/scsi/aic94xx/aic94xx_hwi.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "aic94xx.h"
 #include "aic94xx_reg.h"
@@ -38,16 +39,14 @@ u32 MBAR0_SWB_SIZE;
 
 /* -- Initialization -- */
 
-static void asd_get_user_sas_addr(struct asd_ha_struct *asd_ha)
+static int asd_get_user_sas_addr(struct asd_ha_struct *asd_ha)
 {
-   extern char sas_addr_str[];
-   /* If the user has specified a WWN it overrides other settings
-*/
-   if (sas_addr_str[0] != '\0')
-   asd_destringify_sas_addr(asd_ha->hw_prof.sas_addr,
-sas_addr_str);
-   else if (asd_ha->hw_prof.sas_addr[0] != 0)
-   asd_stringify_sas_addr(sas_addr_str, asd_ha->hw_prof.sas_addr);
+   /* adapter came with a sas address */
+   if (asd_ha->hw_prof.sas_addr[0])
+   return 0;
+
+   return sas_request_addr(asd_ha->sas_ha.core.shost,
+   asd_ha->hw_prof.sas_addr);
 }
 
 static void asd_propagate_sas_addr(struct asd_ha_struct *asd_ha)
@@ -657,8 +657,7 @@ int asd_init_hw(struct asd_ha_struct *asd_ha)
 
asd_init_ctxmem(asd_ha);
 
-   asd_get_user_sas_addr(asd_ha);
-   if (!asd_ha->hw_prof.sas_addr[0]) {
+   if (asd_get_user_sas_addr(asd_ha)) {
asd_printk("No SAS Address provided for %s\n",
   pci_name(asd_ha->pcidev));
err = -ENODEV;
diff --git a/drivers/scsi/aic94xx/aic94xx_init.c 
b/drivers/scsi/aic94xx/aic94xx_init.c
index b70d6e7..5c99f27 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -54,8 +54,6 @@ MODULE_PARM_DESC(collector, "\n"
"\tThe aic94xx SAS LLDD supports both modes.\n"
"\tDefault: 0 (Direct Mode).\n");
 
-char sas_addr_str[2*SAS_ADDR_SIZE + 1] = "";
-
 static struct scsi_transport_template *aic94xx_transport_template;
 static int asd_scan_finished(struct Scsi_Host *, unsigned long);
 static void asd_scan_start(struct Scsi_Host *);


[PATCH 1/2] libsas: Provide a transport-level facility to request SAS addrs

2007-10-08 Thread Darrick J. Wong
Use the request_firmware() interface to get a SAS address from userspace.
This way, there's no debate as to who or how an address gets generated;
it's up to the administrator to provide one if the driver can't find one
on its own.

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/libsas/sas_scsi_host.c |   41 +++
 include/scsi/libsas.h   |3 +++
 2 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c 
b/drivers/scsi/libsas/sas_scsi_host.c
index 7663841..0fa0296 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -24,6 +24,8 @@
  */
 
 #include 
+#include 
+#include 
 
 #include "sas_internal.h"
 
@@ -1047,6 +1049,45 @@ void sas_target_destroy(struct scsi_target *starget)
return;
 }
 
+static void sas_parse_addr(u8 *sas_addr, const char *p)
+{
+   int i;
+   for (i = 0; i < SAS_ADDR_SIZE; i++) {
+   u8 h, l;
+   if (!*p)
+   break;
+   h = isdigit(*p) ? *p-'0' : toupper(*p)-'A'+10;
+   p++;
+   l = isdigit(*p) ? *p-'0' : toupper(*p)-'A'+10;
+   p++;
+   sas_addr[i] = (h<<4) | l;
+   }
+}
+
+#define SAS_STRING_ADDR_SIZE   16
+
+int sas_request_addr(struct Scsi_Host *shost, u8 *addr)
+{
+   int res;
+   const struct firmware *fw;
+
+   res = request_firmware(&fw, "sas_addr", &shost->shost_gendev);
+   if (res)
+   return res;
+
+   if (fw->size < SAS_STRING_ADDR_SIZE) {
+   res = -ENODEV;
+   goto out;
+   }
+
+   sas_parse_addr(addr, fw->data);
+
+out:
+   release_firmware(fw);
+   return res;
+}
+EXPORT_SYMBOL_GPL(sas_request_addr);
+
 EXPORT_SYMBOL_GPL(sas_queuecommand);
 EXPORT_SYMBOL_GPL(sas_target_alloc);
 EXPORT_SYMBOL_GPL(sas_slave_configure);
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 8dda2d6..58aa2aa 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -676,4 +676,7 @@ extern int sas_ioctl(struct scsi_device *sdev, int cmd, 
void __user *arg);
 
 extern int sas_smp_handler(struct Scsi_Host *shost, struct sas_rphy *rphy,
   struct request *req);
+
+int sas_request_addr(struct Scsi_Host *shost, u8 *addr);
+
 #endif /* _SASLIB_H_ */
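
For comparison, a userspace sketch of the same hex parsing with input
validation added.  The parser in the patch maps any non-digit character
through toupper(c)-'A'+10 without checking it; the variant below rejects
such input and leaves the destination untouched on failure.  hex_val() and
parse_sas_addr() are hypothetical names for this illustration, not part of
libsas.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAS_ADDR_SIZE 8

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
	if (isdigit((unsigned char)c))
		return c - '0';
	c = (char)toupper((unsigned char)c);
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse 2*SAS_ADDR_SIZE hex digits from p (len bytes available).
 * Returns 0 on success, -1 on short or malformed input; addr is
 * only written on success.
 */
int parse_sas_addr(uint8_t *addr, const char *p, size_t len)
{
	uint8_t tmp[SAS_ADDR_SIZE];
	int i;

	assert(addr != NULL && p != NULL);
	if (len < 2 * SAS_ADDR_SIZE)
		return -1;
	for (i = 0; i < SAS_ADDR_SIZE; i++) {
		int h = hex_val(p[2 * i]);
		int l = hex_val(p[2 * i + 1]);

		if (h < 0 || l < 0)
			return -1;
		tmp[i] = (uint8_t)((h << 4) | l);
	}
	memcpy(addr, tmp, sizeof(tmp));
	return 0;
}
```

Whether the transport should be this strict about what userspace hands back
is a design choice; the sketch only shows what the check would cost.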


Re: lockdep: how to tell it multiple pte locks is OK?

2007-10-08 Thread Jeremy Fitzhardinge
Arjan van de Ven wrote:
> s/implemented/merged/ :)
>
> IN fact shared pagetables are already there for hugepages.
> For small pages it's a patch at this point.
>   

Is it kept up to date?  Where does it live?

> no I'm not saying that. I'm just saying that I'm worried about the
> locking robustness of your trick in general.
>   

Hm, well I won't need to re-pin shared ptes anyway, so I think it's moot.

J


Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 09:36, David Chinner wrote:
> On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
> > On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:

> > > Force a balance call if ->page_mkwrite() was successful.
> >
> > Would it be better to just have the callers set_page_dirty_balance()?
>
> block_page_mkwrite() is just using generic interfaces to do this,
> same as pretty much any write() system call. The idea was to make it
> as similar to the write() call path as possible...
>
> However, unlike generic_file_buffered_write(), we are not calling
> balance_dirty_pages_ratelimited(mapping) between
> ->prepare/commit_write call pairs.  Perhaps this should be added to
> block_page_mkwrite() after the page is unlocked

That sounds pretty sane, in terms of matching with
generic_file_buffered_write.


Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:51, Andrew Morton wrote:
> On Mon, 8 Oct 2007 10:28:43 -0700

> > I'll now add remap_file_pages soon.
> > Maybe those other 2 tests aren't strong enough (?).
> > Or maybe they don't return a non-0 exit status even when they fail...
> > (I'll check.)
>
> Perhaps Yan Zheng can tell us what test was used to demonstrate this?

Was probably found by review. Otherwise, you could probably reproduce
it by mmaping, say, drm device node, running remap_file_pages() on it
to create a nonlinear mapping, and then finding that you get the wrong
data.

> > > I'm surprised that LTP doesn't have any remap_file_pages() tests.
> >
> > quick grep didn't find any for me.
>
> Me either.  There are a few lying around the place which could be
> integrated.
>
> It would be good if LTP were to have some remap_file_pages() tests
> (please).  As we see here, it is something which we can easily break, and
> leave broken for some time.

Here is Ingo's old test, since cleaned up and fixed a bit by me.
I'm sure he would distribute it under the GPL, but I've cc'ed him because
I didn't find an explicit statement about that.

/*
 * Copyright (C) Ingo Molnar, 2002
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096
#define PAGE_WORDS (PAGE_SIZE/sizeof(int))

#define CACHE_PAGES 1024
#define CACHE_SIZE (CACHE_PAGES*PAGE_SIZE)

#define WINDOW_PAGES 16
#define WINDOW_SIZE (WINDOW_PAGES*PAGE_SIZE)

#define WINDOW_START 0x4800

static char cache_contents [CACHE_SIZE];

static void test_nonlinear(int fd)
{
	char *data = NULL;
	int i, j, repeat = 2;

	for (i = 0; i < CACHE_PAGES; i++) {
		int *page = (int *) (cache_contents + i*PAGE_SIZE);

		for (j = 0; j < PAGE_WORDS; j++)
			page[j] = i;
	}

	if (write(fd, cache_contents, CACHE_SIZE) != CACHE_SIZE)
		perror("write"), exit(1);

	data = mmap((void *)WINDOW_START,
			WINDOW_SIZE,
			PROT_READ|PROT_WRITE, 
			MAP_FIXED | MAP_SHARED 
			, fd, 0);

	if (data == MAP_FAILED)
		perror("mmap"), exit(1);

again:
	for (i = 0; i < WINDOW_PAGES; i += 2) {
		char *page = data + i*PAGE_SIZE;

		if (remap_file_pages(page, PAGE_SIZE * 2, 0,
				(WINDOW_PAGES-i-2), 0) == -1)
			perror("remap_file_pages"), exit(1);
	}

	for (i = 0; i < WINDOW_PAGES; i++) {
		/*
		 * Double-check the correctness of the mapping:
		 */
		if (i & 1) {
			if (data[i*PAGE_SIZE] != WINDOW_PAGES-i) {
				printf("hm, mapped incorrect data!\n");
				exit(1);
			}
		} else {
			if (data[i*PAGE_SIZE] != WINDOW_PAGES-i-2) {
				printf("hm, mapped incorrect data!\n");
				exit(1);
			}
		}
	}

	if (--repeat)
		goto again;
}

int main(int argc, char **argv)
{
	int fd;

	fd = open("/dev/shm/cache", O_RDWR|O_CREAT|O_TRUNC,S_IRWXU);
	if (fd < 0)
		perror("open"), exit(1);
	test_nonlinear(fd);
	if (close(fd) == -1)
		perror("close"), exit(1);
	printf("nonlinear shm file OK\n");

	fd = open("/tmp/cache", O_RDWR|O_CREAT|O_TRUNC,S_IRWXU);
	if (fd < 0)
		perror("open"), exit(1);
	test_nonlinear(fd);
	if (close(fd) == -1)
		perror("close"), exit(1);
	printf("nonlinear /tmp/ file OK\n");

	exit(0);
}
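
The two branches of the check in test_nonlinear() follow from the remap
loop, which takes the window two pages at a time and points each pair at
the mirrored pair of the file.  A small sketch of that arithmetic, with
expected_file_page() as a hypothetical helper that is not part of the test
above:

```c
#include <assert.h>

#define WINDOW_PAGES 16

/*
 * File page expected behind window page i after the pairwise remap:
 * the pair starting at an even i is mapped to file pages
 * WINDOW_PAGES-i-2 and WINDOW_PAGES-i-1.
 */
int expected_file_page(int i)
{
	int pair = i & ~1;	/* first window page of this 2-page pair */

	assert(i >= 0 && i < WINDOW_PAGES);
	return (WINDOW_PAGES - pair - 2) + (i & 1);
}
```

For even i this gives WINDOW_PAGES-i-2, and for odd i it gives
WINDOW_PAGES-i, the two values the test compares against.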



Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one

2007-10-08 Thread Andrew Vasquez
On Mon, 08 Oct 2007, Darrick J. Wong wrote:

> On Mon, Oct 08, 2007 at 03:48:32PM -0700, Andrew Vasquez wrote:
> 
> > So how about factoring that out to a transport-level interface.  How
> > about something along the lines of the following patch, whereby the
> > software driver upon detecting no valid WWPN, makes an upcall to each
> > interface's 'request_wwn()'.  The data passed in from shost_gendev
> > > should be enough for some helper script to cull relevant device bits
> > and perhaps offer some level of persistence...  Off base?
> 
> Hrm... jejb made a remark that it might be better to pass the
> scsi_host's device into request_firmware() as your example does, so I'll
> pitch in a patch to do likewise with libsas--the scsi_host knows the
> actual device it's coming from, and userland can sort that all out later
> anyway via DEVPATH.
> 
> I suppose one could also have multiple scsi_hosts per PCI device, which
> means that my first patch would stumble horribly in more than a few
> cases.

This is done already in the FC case -- NPIV.  Though with that
interface, the administrator is already responsible for assigning
proper WWNN/WWPN during creation.

> > Darrick, forgive the FC example, I don't do SAS...
> 
> That's ok, I don't do FC. :)  Looks mostly good to me...

--
av


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Neil Brown
On Monday October 8, [EMAIL PROTECTED] wrote:

I find it is always good to know *why* we have the tags.  That
information is a useful complement to what they mean, and can guide
people in adding them.

So below I present some "Purposes", YetAnotherTag, and a comment on
the RSO.

(And I'd like to add a vote for "Blame-Shared-By:" rather than
"Reviewed-by:", however I don't think I'll get much support...)

> diff --git a/Documentation/patch-tags b/Documentation/patch-tags
> new file mode 100644
> index 000..fb5f8e1
> --- /dev/null
> +++ b/Documentation/patch-tags
> @@ -0,0 +1,66 @@
> +Patches headed for the mainline may contain a variety of tags documenting
> +who played a hand in (or was at least aware of) its progress.  All of these
> +tags have the form:
> +
> + Something-done-by: Full name <[EMAIL PROTECTED]>
> +
> +These tags are:

From:   The Author, Primary Author, or Authors of the patch.
        Authors should also provide a Signed-off-by: tag.

        Purpose: to give credit to authors.
> +
> +Signed-off-by:  A person adding a Signed-off-by tag is attesting that the
> + patch is, to the best of his or her knowledge, legally able
> + to be merged into the mainline and distributed under the
> + terms of the GNU General Public License, version 2.  See
> + the Developer's Certificate of Origin, found in
> + Documentation/SubmittingPatches, for the precise meaning of
> + Signed-off-by.

Purpose: to allow subsequent review of the originality of 
the contribution should copyright questions arise.
> +
> +Acked-by:The person named (who should be an active developer in the
> + area addressed by the patch) is aware of the patch and has
> + no objection to its inclusion.  An Acked-by tag does not
> + imply any involvement in the development of the patch or
> + that a detailed review was done.

Purpose:  to inform upstream aggregators that
consensus was achieved for the change.  This is
particularly relevant for changes that affect multiple
Maintenance Domains.

> +
> +Reviewed-by: The patch has been reviewed and found acceptible according
> + to the Reviewer's Statement as found at the bottom of this
> + file.  A Reviewed-by tag is a statement of opinion that the
> + patch is an appropriate modification of the kernel without
> + any remaining serious technical issues.  Any interested
> + reviewer (who has done the work) can offer a Reviewed-by
> + tag for a patch.

Purpose: to inform upstream aggregators that due
diligence has been performed to ensure correctness of
the change.  Also to give credit to reviewers.

> +
> +Cc:  The person named was given the opportunity to comment on
> + the patch.  This is the only tag which might be added
> + without an explicit action by the person it names.

Purpose: to ensure that interested parties are
included in subsequent discussions of the change.

> +
> +Tested-by:   The patch has been successfully tested (in some
> + environment) by the person named.

Purpose: to give credit to testers.

> +
> +
> +
> +
> +Reviewer's statement of oversight, v0.02
> +
> +By offering my Reviewed-by: tag, I state that:
> +
> + (a) I have carried out a technical review of this patch to evaluate its
> + appropriateness and readiness for inclusion into the mainline kernel. 
> +
> + (b) Any problems, concerns, or questions relating to the patch have been
> + communicated back to the submitter.  I am satisfied with how the
> + submitter has responded to my comments.

This seems more detailed than necessary.  The process (communicated
back / responded) is not really relevant.  I would go for something
like:

(b) I have no outstanding problems, concerns, or questions about
this patch (except as noted in the above comments).

and in fact, given (c2), (b) might not be needed at all.

NeilBrown


> +
> + (c) While there may (or may not) be things which could be improved with
> + this submission, I believe that it is, at this time, (1) a worthwhile
> + modification to the kernel, and (2) free of known issues which would
> + argue against its inclusion.
> +
> + (d) While I have reviewed the patch and believe it to be sound, I can not
> + (unless explicitly stated elsewhere) make any warranties or guarantees
> + that it will achieve its stated purpose or function properly in any
> + given situation.
> +
> + (e) I understand and agree that this project and the contribution are
> + public and that a record of the contribution (including my Reviewed-by
> + tag and any associated public 

Re: [ofa-general] Updated InfiniBand/RDMA merge plans for 2.6.24

2007-10-08 Thread Roland Dreier
 > No mention about the iwarp port space issue?

I don't think we're at a stage where I'm prepared to merge something--
we all agree the latest patch has serious drawbacks, and it commits us
to a suboptimal interface that is userspace-visible.

 > I'm at a loss as to how to proceed.

Could we try to do some cleanups to the net core to make the alias
stuff less painful?  eg is there any sane way to make it possible for
a device that creates 'eth0' to also create an 'iw0' alias without
assigning an address?

 - R.


Re: OHCI root_port_reset() deadly loop...

2007-10-08 Thread David Miller
From: David Miller <[EMAIL PROTECTED]>
Date: Sun, 07 Oct 2007 00:51:56 -0700 (PDT)

> From: David Brownell <[EMAIL PROTECTED]>
> Date: Sun, 07 Oct 2007 00:31:41 -0700
> 
> > Are the other ports still behaving?  Is EHCI maybe trying to switch
> > ownership of that port?  Is maybe the (newish) autosuspend stuff
> > kicking in?
> 
> I wouldn't know, the machine hangs and doesn't get any further.

To add some more information here, I think the EHCI idea might
hold some water.

What I have here are two NEC OHCI USB interfaces and one NEC EHCI
USB interface on PCI.  Apparently they all go through a shared
USB hub, mapped like this:

HUB Port 1: OHCI #1, EHCI
HUB Port 2: OHCI #2, EHCI
HUB Port 3: OHCI #1, EHCI
HUB Port 4: OHCI #2, EHCI
HUB Port 5: OHCI #1, EHCI

The OHCI ports go out to external USB connectors on the back panel of
the machine, whereas the EHCI is connected up to an internal USB
storage CDROM device and what appears to be another USB hub.

The problem seems to be very strongly tied to timing.  For example
simply adding "ignore_loglevel" to the kernel boot command line can
make the problem go away.

This got me thinking about your EHCI comment.

If these controllers are going through the same HUB, things might go
south if OHCI initializes first, and then khubd et al. asynchronously
access the segments behind OHCI at the same time that the EHCI
driver is initializing.  Perhaps this is the kind of sequence of
events which makes one of the root ports reset in such a way that
the reset bit never clears.

Given that this machine has 64 cpus, such parallel accesses are
very likely :-)

Does this make any sense?

Regardless, here is a patch that hardens the OHCI reset handling
loops so that they break out instead of hanging the entire system
should this condition occur.  It's at least better than what the
code does to a user right now, which is to hang the box completely:

[USB] ohci: Do not hang the system if port reset does not complete.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/drivers/usb/host/ohci-hub.c b/drivers/usb/host/ohci-hub.c
index bb9cc59..77ae5b4 100644
--- a/drivers/usb/host/ohci-hub.c
+++ b/drivers/usb/host/ohci-hub.c
@@ -563,14 +563,19 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
u32 temp;
u16 now = ohci_readl(ohci, &ohci->regs->fmnumber);
u16 reset_done = now + PORT_RESET_MSEC;
+   int limit_1;
 
/* build a "continuous enough" reset signal, with up to
 * 3msec gap between pulses.  scheduler HZ==100 must work;
 * this might need to be deadline-scheduled.
 */
-   do {
+   limit_1 = 100;
+   while (--limit_1 >= 0) {
+   int limit_2;
+
/* spin until any current reset finishes */
-   for (;;) {
+   limit_2 = PORT_RESET_MSEC * 2;
+   while (--limit_2 >= 0) {
temp = ohci_readl (ohci, portstat);
/* handle e.g. CardBus eject */
if (temp == ~(u32)0)
@@ -579,6 +584,10 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
break;
udelay (500);
}
+   if (limit_2 < 0) {
+   ohci_warn(ohci, "Root port inner-loop reset timeout, "
+ "portstat[%08x]\n", temp);
+   }
 
if (!(temp & RH_PS_CCS))
break;
@@ -589,7 +598,14 @@ static inline int root_port_reset (struct ohci_hcd *ohci, unsigned port)
ohci_writel (ohci, RH_PS_PRS, portstat);
msleep(PORT_RESET_HW_MSEC);
now = ohci_readl(ohci, &ohci->regs->fmnumber);
-   } while (tick_before(now, reset_done));
+   if (!tick_before(now, reset_done))
+   break;
+   }
+   if (limit_1 < 0) {
+   ohci_warn(ohci, "Root port outer-loop reset timeout, "
+ "now[%04x] reset_done[%04x]\n",
+ now, reset_done);
+   }
/* caller synchronizes using PRSC */
 
return 0;
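
Stripped of the OHCI specifics, the idea in the patch above is a polling
loop with an iteration budget. A minimal user-space sketch of that
pattern follows. Every name in it (poll_with_limit(), fake_status(),
reset_completes(), the 0x10 busy bit) is invented for illustration and
is not a kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: reads a status word; stands in for ohci_readl(). */
typedef uint32_t (*read_status_fn)(void *ctx);

/*
 * Poll until (status & busy_mask) clears or the budget runs out.
 * Returns true if the bit cleared, false on timeout.
 */
static bool poll_with_limit(read_status_fn read_status, void *ctx,
                            uint32_t busy_mask, int limit, uint32_t *last)
{
    while (--limit >= 0) {
        *last = read_status(ctx);
        if (!(*last & busy_mask))
            return true;
        /* a real driver would delay here, e.g. udelay(500) */
    }
    return false;
}

/* Fake register: reports the busy bit for the first *ctx reads. */
static uint32_t fake_status(void *ctx)
{
    int *stuck_reads = ctx;

    if (*stuck_reads > 0) {
        (*stuck_reads)--;
        return 0x10;
    }
    return 0;
}

/* Does a "reset" that stays busy for stuck_reads polls finish in time? */
static bool reset_completes(int stuck_reads, int limit)
{
    uint32_t last = 0;

    return poll_with_limit(fake_status, &stuck_reads, 0x10, limit, &last);
}
```

A caller that gets false back can log the last status word and carry on,
which is what the ohci_warn() calls in the patch do.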


Re: [PATCH 1/2] Colored kernel output (run3)

2007-10-08 Thread Antonino A. Daplas
On Tue, 2007-10-09 at 01:31 +0200, Jan Engelhardt wrote:
> On Oct 9 2007 07:12, Antonino A. Daplas wrote:
> >> 
> >> References: http://lkml.org/lkml/2007/4/1/162
> >>http://lkml.org/lkml/2007/10/5/199
> >
> >This is quite a long thread :-)
> 
> It was a patch series after all. But as Greg puts it, be persistent.
> 
> >> +config VT_PRINTK_COLOR
> >> +  hex "Colored kernel message output"
> >> +  range 0x00 0xFF
> >> +  depends on VT_CKO
> >> +  default 0x07
> >> +  ---help---
> >> +  This option defines with which color kernel messages will be
> >> +  printed to the console.
> >> +
> >> +  The value you need to enter here is the value is composed
> >
> >The more correct term for "The value" is probably "The attribute".
> 
> "The value for this kconfig entry" it should read in the minds.
> 
> >> +  (Foreground colors 0x08 to 0x0F do not work when a VGA
> >> +  console font with 512 glyphs is used.)
> >
> >You might have to include a warning that those values or attributes are 
> >interpreted differently depending on the driver used, and the above is
> >mostly true for 16-color console drivers only.
> 
> Are there any other drivers besides vgacon and fbcon that use vt.c?

All drivers under drivers/video/console. That would be:

vgacon
dummycon
fbcon
newport_con
sticon
promcon
mdacon

There are perhaps a few more drivers outside this directory, such as
sisusbcon or something.



> >You may want to leave out the blink attribute (0x80) from this part.
> >Otherwise setterm -blink on|off will produce the opposite effect. 
> 
> But 0x80 might be interpreted in a different fashion for some othercon, 
> yielding for example superbold rather than blinking.

That's right. But setting the blink attribute is done with an XOR (^).
So 'setterm -blink on' will unset the blink attribute (0x80 ^ 0x80).
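
A small user-space sketch of that effect. It assumes, as described
above, that a blink request simply XORs 0x80 into the current
attribute. The names ATTR_BLINK and toggle_blink() are made up for
illustration and are not taken from vt.c.

```c
#include <stdint.h>

#define ATTR_BLINK 0x80u  /* hypothetical name for the blink bit */

/* Toggle the blink bit the way an XOR-based implementation would. */
static uint8_t toggle_blink(uint8_t attr)
{
    return (uint8_t)(attr ^ ATTR_BLINK);
}
```

With a default attribute of 0x87 (blink already set through the
configured color), toggle_blink(0x87) gives 0x07, so "blink on" ends
up clearing the bit.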

> I'll have to try this, because usually, setterm operates on TTYs
> rather than VCs.

Yes, but if the tty driver type is a virtual console, then vt.c is still
affected. 

Well the blink attribute is ignored by most drivers, if I'm not
mistaken. So you generally won't see the effect :-). But with fbcon, the
blink attribute is interpreted as "change background color from black to
light gray".

Tony



Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one

2007-10-08 Thread Darrick J. Wong
On Mon, Oct 08, 2007 at 03:48:32PM -0700, Andrew Vasquez wrote:

> So how about factoring that out to a transport-level interface.  How
> about something along the lines of the following patch, whereby the
> software driver upon detecting no valid WWPN, makes an upcall to each
> interface's 'request_wwn()'.  The data passed in from shost_gendev
> should be enough for some helper script to cull relevant device bits
> and perhaps offer some level of persistence...  Off base?

Hrm... jejb made a remark that it might be better to pass the
scsi_host's device into request_firmware() as your example does, so I'll
pitch in a patch to do likewise with libsas--the scsi_host knows the
actual device it's coming from, and userland can sort that all out later
anyway via DEVPATH.

I suppose one could also have multiple scsi_hosts per PCI device, which
means that my first patch would stumble horribly in more than a few
cases.

> Darrick, forgive the FC example, I don't do SAS...

That's ok, I don't do FC. :)  Looks mostly good to me...

--D


Re: sleepy linux 2.6.23-rc9

2007-10-08 Thread Clemens Koller

Pavel Machek wrote:

> I played with powertop a bit, and found a fairly interesting failure
> mode. If I boot init=/bin/bash vga=1, I get ~2 wakeups a second, nice.
>
> When I boot init=/bin/bash vga=791 (vesa framebuffer), most wakeups
> are caused by cursor painting (I should fix that some day, I
> guess). But... the cursor blinking does not even work properly!
>
> It blinks at normal speed, then (randomly) it blinks slowly, then gets
> back to normal speed, then inserts longer delay.

Is the effect a beat (in German: Schwebung) with roughly the frequency
of your notebook's screen refresh rate (60Hz)?

> The effect is so nice that I thought about youtube ;-). Thinkpad
> x60.. question is, how to debug it?


No idea... check where the register of the HW cursor blink rate
gets written? But as it seems to be so nice, please submit a patch
which enables this for all platforms. ;-)

Regards,
--
Clemens Koller
___
R Imaging Devices
Anagramm GmbH
Rupert-Mayer-Str. 45/1
81379 Muenchen
Germany

http://www.anagramm-technology.com
Phone: +49-89-741518-50
Fax: +49-89-741518-19


Re: parallel networking

2007-10-08 Thread jamal
On Mon, 2007-08-10 at 15:33 -0700, David Miller wrote:

> Multiply whatever effect you think you might be able to measure due to
> that on your 2 or 4 way system, and multiply it up to 64 cpus or so
> for machines I am using.  This is where machines are going, and is
> going to become the norm.

Yes, i keep forgetting that ;-> I need to train my brain to remember
that.

cheers,
jamal





Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Stefan Richter
Jonathan Corbet wrote:
> All of these
> +tags have the form:
> +
> + Something-done-by: Full name <[EMAIL PROTECTED]>

To be precise:
Something-done-by: Full name <[EMAIL PROTECTED]> [optional random stuff]

"Some people also put extra tags at the end.  They'll just be ignored
for now, but you can do this to mark internal company procedures or just
point out some special detail about the sign-off.", says
SubmittingPatches.  I actually do so on occasion.
-- 
Stefan Richter
-=-=-=== =-=- -=--=
http://arcgraph.de/sr/


Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread David Chinner
On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
> On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
> > It seems that with the recent usage of ->page_mkwrite() a little detail
> > was overlooked.
> >
> > .22-rc1 merged OCFS2 usage of this hook
> > .23-rc1 merged XFS usage
> > .24-rc1 will most likely merge NFS usage
> >
> > Please consider this for .23 final and maybe even .22.x
> >
> > ---
> > Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
> >
> > All the current page_mkwrite() implementations also set the page dirty.
> > Which results in the set_page_dirty_balance() call to _not_ call balance,
> > because the page is already found dirty.
> >
> > This allows us to dirty a _lot_ of pages without ever hitting
> > balance_dirty_pages().  Not good (tm).
> >
> > Force a balance call if ->page_mkwrite() was successful.
> 
> Would it be better to just have the callers set_page_dirty_balance()?

block_page_mkwrite() is just using generic interfaces to do this,
same as pretty much any write() system call. The idea was to make it
as similar to the write() call path as possible...

However, unlike generic_file_buffered_write(), we are not calling
balance_dirty_pages_ratelimited(mapping) between
->prepare/commit_write call pairs.  Perhaps this should be added to
block_page_mkwrite() after the page is unlocked?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:04, Andrew Morton wrote:
> On Mon, 8 Oct 2007 19:45:08 +0800 "Yan Zheng" <[EMAIL PROTECTED]> wrote:
> > Hi all
> >
> > The test for VM_CAN_NONLINEAR always fails
> >
> > Signed-off-by: Yan Zheng<[EMAIL PROTECTED]>
> > 
> > diff -ur linux-2.6.23-rc9/mm/fremap.c linux/mm/fremap.c
> > --- linux-2.6.23-rc9/mm/fremap.c   2007-10-07 15:03:33.0 +0800
> > +++ linux/mm/fremap.c   2007-10-08 19:33:44.0 +0800
> > @@ -160,7 +160,7 @@
> > if (vma->vm_private_data && !(vma->vm_flags & VM_NONLINEAR))
> > goto out;
> >
> > -   if (!vma->vm_flags & VM_CAN_NONLINEAR)
> > +   if (!(vma->vm_flags & VM_CAN_NONLINEAR))
> > goto out;
> >
> > if (end <= start || start < vma->vm_start || end > vma->vm_end)
>
> Lovely.  From this we can deduce that nobody has run remap_file_pages()
> since 2.6.23-rc1 and that nobody (including the developer who made that
> change) ran it while that change was in -mm.

But you'd be wrong. remap_file_pages was tested both with my own tester
and Ingo's test program.

vm_flags != 0, !vm_flags = 0, 0 & x = 0, so the test always falls
through. Of course, what I _should_ have done is also test a driver which
does not have VM_CAN_NONLINEAR... but even I wouldn't rewrite half
the nonlinear mapping code without once testing it ;)
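
The precedence issue can be reproduced outside the kernel. The sketch
below is only an illustration: CAN_NONLINEAR is a stand-in constant and
the two helper names are invented here. Because unary ! binds tighter
than &, the buggy expression is evaluated as (!flags) & CAN_NONLINEAR,
which is written out explicitly so the parse is visible.

```c
#include <stdbool.h>

/* Hypothetical flag value, for illustration only. */
#define CAN_NONLINEAR 0x08000000UL

/* Buggy form: this is how "!flags & CAN_NONLINEAR" is parsed. */
static bool rejects_buggy(unsigned long flags)
{
    return (!flags) & CAN_NONLINEAR;
}

/* Fixed form: true exactly when the flag is absent. */
static bool rejects_fixed(unsigned long flags)
{
    return !(flags & CAN_NONLINEAR);
}
```

Since !flags is either 0 or 1 and the flag does not occupy bit 0, the
buggy check never rejects anything, whether or not the flag is set.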

FWIW, Oracle (maybe the sole real user of this) has been testing it, which
I'm very happy about (rather than testing after 2.6.23 is released).


[PATCH] device-mapper: fix bd_mount_sem corruption

2007-10-08 Thread Jun'ichi Nomura
Hi,

This patch fixes a bd_mount_sem counter corruption bug in device-mapper.

thaw_bdev() should be called only when freeze_bdev() was called for the
device.
Otherwise, thaw_bdev() will up bd_mount_sem and corrupt the semaphore counter.
struct block_device with the corrupted semaphore may remain in slab cache
and be reused later.

The attached patch fixes it by calling unlock_fs() instead.
unlock_fs() determines whether it should call thaw_bdev()
by checking whether the device is frozen.
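
The imbalance can be modelled in user space with a plain counter
standing in for bd_mount_sem. This is only a sketch: struct fake_dev
and every function below are invented for illustration, and the model
follows the description above (thaw only if frozen) rather than the
actual dm.c code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a block device with a mount semaphore. */
struct fake_dev {
    int sem_count;  /* 1 means "free", like a fresh semaphore */
    bool frozen;
};

static void fake_freeze(struct fake_dev *d)
{
    d->sem_count--;  /* down() */
    d->frozen = true;
}

/* Unconditional thaw: always does up(). */
static void fake_thaw(struct fake_dev *d)
{
    d->sem_count++;  /* up() */
    d->frozen = false;
}

/* Guarded thaw: only undo a freeze that actually happened. */
static void fake_unlock(struct fake_dev *d)
{
    if (!d->frozen)
        return;
    fake_thaw(d);
}

/* Counter value after a number of suspend/remove rounds. */
static int count_after(int cycles, bool freeze_first, bool guarded)
{
    struct fake_dev d = { .sem_count = 1, .frozen = false };

    for (int i = 0; i < cycles; i++) {
        if (freeze_first)
            fake_freeze(&d);
        if (guarded)
            fake_unlock(&d);
        else
            fake_thaw(&d);
    }
    return d.sem_count;
}
```

Without a preceding freeze, the unguarded path raises the counter by
one per round, while the guarded path leaves it at 1.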

Easy reproducer is:
  #!/bin/sh
  while [ 1 ]; do
 dmsetup --notable create a
 dmsetup --nolockfs suspend a
 dmsetup remove a
  done

The effect of the corrupted semaphore is not easy to see,
so I tested by putting the printk below in bdev_alloc_inode():
if (atomic_read(&ei->bdev.bd_mount_sem.count) != 1)
printk(KERN_DEBUG "Incorrect semaphore count = %d (%p)\n",
atomic_read(&ei->bdev.bd_mount_sem.count),
&ei->bdev);

Without the patch, I saw something like:
 Incorrect semaphore count = 17 (f2ab91c0)

With the patch, the message didn't appear.


Signed-off-by: Jun'ichi Nomura <[EMAIL PROTECTED]>

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 2120155..998d450 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1064,12 +1064,14 @@ static struct mapped_device *alloc_dev(int minor)
return NULL;
 }
 
+static void unlock_fs(struct mapped_device *md);
+
 static void free_dev(struct mapped_device *md)
 {
int minor = md->disk->first_minor;
 
if (md->suspended_bdev) {
-   thaw_bdev(md->suspended_bdev, NULL);
+   unlock_fs(md);
bdput(md->suspended_bdev);
}
mempool_destroy(md->tio_pool);


Re: [PATCH 1/2] Colored kernel output (run3)

2007-10-08 Thread Jan Engelhardt

On Oct 9 2007 07:12, Antonino A. Daplas wrote:
>> 
>> References: http://lkml.org/lkml/2007/4/1/162
>>  http://lkml.org/lkml/2007/10/5/199
>
>This is quite a long thread :-)

It was a patch series after all. But as Greg puts it, be persistent.

>> +config VT_PRINTK_COLOR
>> +hex "Colored kernel message output"
>> +range 0x00 0xFF
>> +depends on VT_CKO
>> +default 0x07
>> +---help---
>> +This option defines with which color kernel messages will be
>> +printed to the console.
>> +
>> +The value you need to enter here is the value is composed
>
>The more correct term for "The value" is probably "The attribute".

"The value for this kconfig entry" it should read in the minds.

>> +(Foreground colors 0x08 to 0x0F do not work when a VGA
>> +console font with 512 glyphs is used.)
>
>You might have to include a warning that those values or attributes are 
>interpreted differently depending on the driver used, and the above is
>mostly true for 16-color console drivers only.

Are there any other drivers besides vgacon and fbcon that use vt.c?

>For 2-colors [...] With a 4-color fb console (4-level grayscale) [...]
>With an 8-color console, only the first 8 values are considered.
>With a 16-color console, that is also not consistent:[...]

I see. That probably means the explanation of values moves from Kconfig 
to Documentation/. Somehow I think we could do without doc and let 
interested starts find out for themselves and learn a little about 
vgacon/fbcon. ;)

>With vgacon, it supports 16-color foreground (fg), 8-color
>background (bg) at 256 chars. Becomes 8 fg and 8 bg with 512 chars.
>
>With fbcon, it supports 16 fg and 16 bg at 256, 16 fg and 8 bg at
>512 chars.

And then there is fbiterm, which supports at least 16 fg/16 bg with ... 
the whole Unicode set of chars. :)

>And for drivers that have their own con_build_attr() hook, they will be
>interpreted differently again.

>> +Background:
>> +0x00 = black,   0x40 = blue,
>> +0x10 = red, 0x50 = magenta,
>> +0x20 = green,   0x60 = cyan,
>> +0x30 = brown,   0x70 = gray,
>> +
>> +For example, 0x1F would yield white on red.
>
>You may need to specify that the values here are the console default,
>ie, the default_blue|grn|red boot options are not filled up.

>> +static inline void vc_set_color(struct vc_data *vc, unsigned char color)
>> +{
>> +vc->vc_color = color_table[color & 0xF] |
>> +   (color_table[(color >> 4) & 0x7] << 4) |
>> +   (color & 0x80);
>
>You may want to leave out the blink attribute (0x80) from this part.
>Otherwise setterm -blink on|off will produce the opposite effect. 

But 0x80 might be interpreted in a different fashion for some othercon, 
yielding for example superbold rather than blinking.
I'll have to try this, because usually, setterm operates on TTYs
rather than VCs.


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread J. Bruce Fields
On Mon, Oct 08, 2007 at 04:43:10PM -0600, Jonathan Corbet wrote:
> + (e) I understand and agree that this project and the contribution are
> + public and that a record of the contribution (including my Reviewed-by
> + tag and any associated public communications) is maintained
> + indefinitely and may be redistributed consistent with this project or
> + the open source license(s) involved.

Is this paragraph really necessary?  (For example, is there some history
of problems that this is addressing?)

--b.


Re: [PATCH] Version 3 (2.6.23-rc8) Smack: Simplified Mandatory Access Control Kernel

2007-10-08 Thread Bill Davidsen

Serge E. Hallyn wrote:

> (tongue-in-cheek)
>
> No no, everyone knows you don't build simpler things on top of more
> complicated ones, you go the other way around.  So what he was
> suggesting was that selinux be re-written on top of smack.


Having gone from proposing a simpler and easier to use security system 
as an alternative to SELinux, you now propose to change the one working 
security system we have. And yes, it's hard to use, but it works. Let's 
keep this a patch, people who want adventure can have one, and people 
who have gotten Linux accepted "if SELinux is enabled" will avoid one.


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Monday 08 October 2007 23:37, Hugh Dickins wrote:
> On Mon, 8 Oct 2007, Yan Zheng wrote:
> > The test for VM_CAN_NONLINEAR always fails
>
> Good catch indeed.  Though I was puzzled how we do nonlinear at all,
> until I realized it's "The test for not VM_CAN_NONLINEAR always fails".
>
> It's not as serious as it appears, since code further down has been
> added more recently to simulate nonlinear on non-RAM-backed filesystems,
> instead of going the real nonlinear way; so most filesystems are now not
> required to do what VM_CAN_NONLINEAR was put in to ensure they could do.

Well, I think all filesystems can do VM_CAN_NONLINEAR anyway. Device
drivers and "weird" things tend to have trouble...


> I'm confused as to where that leaves us: is this actually a fix that
> needs to go into 2.6.23?  or will it suddenly disable a system call
> which has been silently working fine on various filesystems which did
> not add VM_CAN_NONLINEAR?  could we just rip out VM_CAN_NONLINEAR?

We probably should keep VM_CAN_NONLINEAR for the moment, I think.
But now that we have the fallback path, we _could_ use that instead of
failing. I doubt anybody will be using nonlinear mappings on anything but
regular files for the time being, but as a trivial fix, I think this probably
should go into 2.6.23.

Thanks for spotting this problem.
Acked-by: Nick Piggin <[EMAIL PROTECTED]>

> I hope Nick or Miklos is clearer on what the risks are.
>
> (Apologies for all the "not"s and "non"s here, I'm embarrassed
> after just criticizing Ingo's SCHED_NO_NO_OMIT_FRAME_POINTER!)
>
> Hugh
>
> > Signed-off-by: Yan Zheng<[EMAIL PROTECTED]>
> > 
> > diff -ur linux-2.6.23-rc9/mm/fremap.c linux/mm/fremap.c
> > --- linux-2.6.23-rc9/mm/fremap.c   2007-10-07 15:03:33.0 +0800
> > +++ linux/mm/fremap.c   2007-10-08 19:33:44.0 +0800
> > @@ -160,7 +160,7 @@
> > if (vma->vm_private_data && !(vma->vm_flags & VM_NONLINEAR))
> > goto out;
> >
> > -   if (!vma->vm_flags & VM_CAN_NONLINEAR)
> > +   if (!(vma->vm_flags & VM_CAN_NONLINEAR))
> > goto out;
> >
> > if (end <= start || start < vma->vm_start || end > vma->vm_end)


Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
> It seems that with the recent usage of ->page_mkwrite() a little detail
> was overlooked.
>
> .22-rc1 merged OCFS2 usage of this hook
> .23-rc1 merged XFS usage
> .24-rc1 will most likely merge NFS usage
>
> Please consider this for .23 final and maybe even .22.x
>
> ---
> Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
>
> All the current page_mkwrite() implementations also set the page dirty.
> Which results in the set_page_dirty_balance() call to _not_ call balance,
> because the page is already found dirty.
>
> This allows us to dirty a _lot_ of pages without ever hitting
> balance_dirty_pages().  Not good (tm).
>
> Force a balance call if ->page_mkwrite() was successful.

Would it be better to just have the callers set_page_dirty_balance()?


> Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
> ---
>  include/linux/writeback.h |2 +-
>  mm/memory.c   |9 +++--
>  mm/page-writeback.c   |4 ++--
>  3 files changed, 10 insertions(+), 5 deletions(-)
>
> Index: linux-2.6/include/linux/writeback.h
> ===
> --- linux-2.6.orig/include/linux/writeback.h
> +++ linux-2.6/include/linux/writeback.h
> @@ -137,7 +137,7 @@ int sync_page_range(struct inode *inode,
>   loff_t pos, loff_t count);
>  int sync_page_range_nolock(struct inode *inode, struct address_space
> *mapping, loff_t pos, loff_t count);
> -void set_page_dirty_balance(struct page *page);
> +void set_page_dirty_balance(struct page *page, int page_mkwrite);
>  void writeback_set_ratelimit(void);
>
>  /* pdflush.c */
> Index: linux-2.6/mm/memory.c
> ===
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -1559,6 +1559,7 @@ static int do_wp_page(struct mm_struct *
>   struct page *old_page, *new_page;
>   pte_t entry;
>   int reuse = 0, ret = 0;
> + int page_mkwrite = 0;
>   struct page *dirty_page = NULL;
>
>   old_page = vm_normal_page(vma, address, orig_pte);
> @@ -1607,6 +1608,8 @@ static int do_wp_page(struct mm_struct *
>   page_cache_release(old_page);
>   if (!pte_same(*page_table, orig_pte))
>   goto unlock;
> +
> + page_mkwrite = 1;
>   }
>   dirty_page = old_page;
>   get_page(dirty_page);
> @@ -1691,7 +1694,7 @@ unlock:
>* do_no_page is protected similarly.
>*/
>   wait_on_page_locked(dirty_page);
> - set_page_dirty_balance(dirty_page);
> + set_page_dirty_balance(dirty_page, page_mkwrite);
>   put_page(dirty_page);
>   }
>   return ret;
> @@ -2238,6 +2241,7 @@ static int __do_fault(struct mm_struct *
>   struct page *dirty_page = NULL;
>   struct vm_fault vmf;
>   int ret;
> + int page_mkwrite = 0;
>
>   vmf.virtual_address = (void __user *)(address & PAGE_MASK);
>   vmf.pgoff = pgoff;
> @@ -2315,6 +2319,7 @@ static int __do_fault(struct mm_struct *
>   anon = 1; /* no anon but release 
> vmf.page */
>   goto out;
>   }
> + page_mkwrite = 1;
>   }
>   }
>
> @@ -2375,7 +2380,7 @@ out_unlocked:
>   if (anon)
>   page_cache_release(vmf.page);
>   else if (dirty_page) {
> - set_page_dirty_balance(dirty_page);
> + set_page_dirty_balance(dirty_page, page_mkwrite);
>   put_page(dirty_page);
>   }
>
> Index: linux-2.6/mm/page-writeback.c
> ===
> --- linux-2.6.orig/mm/page-writeback.c
> +++ linux-2.6/mm/page-writeback.c
> @@ -460,9 +460,9 @@ static void balance_dirty_pages(struct a
>   pdflush_operation(background_writeout, 0);
>  }
>
> -void set_page_dirty_balance(struct page *page)
> +void set_page_dirty_balance(struct page *page, int page_mkwrite)
>  {
> - if (set_page_dirty(page)) {
> + if (set_page_dirty(page) || page_mkwrite) {
>   struct address_space *mapping = page_mapping(page);
>
>   if (mapping)
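
The behaviour the patch description complains about can be shown with a
toy model. Everything below is invented for illustration (struct
toy_page, toy_set_dirty(), the balance counter). It only mirrors the
control flow of the quoted change: balancing runs when the dirty call
reports a clean-to-dirty transition, or when the caller says
page_mkwrite succeeded.

```c
#include <stdbool.h>

struct toy_page { bool dirty; };

static int balance_calls;  /* how often the toy balancer ran */

/* Returns true only on a clean -> dirty transition. */
static bool toy_set_dirty(struct toy_page *p)
{
    if (p->dirty)
        return false;
    p->dirty = true;
    return true;
}

/* Same shape as the patched condition in the quoted diff. */
static void toy_set_dirty_balance(struct toy_page *p, bool page_mkwrite)
{
    if (toy_set_dirty(p) || page_mkwrite)
        balance_calls++;
}

/* Fault on a page that a mkwrite-style hook has already dirtied. */
static int balance_calls_for(bool pass_mkwrite_flag)
{
    struct toy_page p = { .dirty = true };

    balance_calls = 0;
    toy_set_dirty_balance(&p, pass_mkwrite_flag);
    return balance_calls;
}
```

With the flag left false (the old behaviour) the already-dirty page
never reaches the balancer. Passing true forces the call.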


Re: [PATCH 1/2] Colored kernel output (run3)

2007-10-08 Thread Antonino A. Daplas
On Sat, 2007-10-06 at 22:09 +0200, Jan Engelhardt wrote: 
> Colored kernel message output (1/2)
> 
> This patch makes it possible to give kernel messages a selectable
> color. It can be chosen at compile time, overridden at boot time,
> and changed at run time.
> 
> References: http://lkml.org/lkml/2007/4/1/162
>   http://lkml.org/lkml/2007/10/5/199

This is quite a long thread :-)

> 
> Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>
> 
> ---
>  drivers/char/Kconfig |   42 ++
>  drivers/char/vt.c|   23 +++
>  2 files changed, 65 insertions(+)
> 
> Index: linux-2.6.23/drivers/char/Kconfig
> ===
> --- linux-2.6.23.orig/drivers/char/Kconfig
> +++ linux-2.6.23/drivers/char/Kconfig
> @@ -58,6 +58,48 @@ config VT_CONSOLE
>  
> If unsure, say Y.
>  
> +config VT_CKO
> + bool "Colored kernel message output"
> + depends on VT_CONSOLE
> + ---help---
> + This option enables kernel messages to be emitted in
> + colors other than the default.
> +
> + If unsure, say N.
> +
> +config VT_PRINTK_COLOR
> + hex "Colored kernel message output"
> + range 0x00 0xFF
> + depends on VT_CKO
> + default 0x07
> + ---help---
> + This option defines with which color kernel messages will be
> + printed to the console.
> +
> + The value you need to enter here is the value is composed

The more correct term for "The value" is probably "The attribute".

> + (OR-ed) of a foreground and a background color.
> +
> + Foreground:
> + 0x00 = black,   0x08 = dark gray,
> + 0x01 = red, 0x09 = light red,
> + 0x02 = green,   0x0A = light green,
> + 0x03 = brown,   0x0B = yellow,
> + 0x04 = blue,0x0C = light blue,
> + 0x05 = magenta, 0x0D = light magenta,
> + 0x06 = cyan,0x0E = light cyan,
> + 0x07 = gray,0x0F = white,
> +
> + (Foreground colors 0x08 to 0x0F do not work when a VGA
> + console font with 512 glyphs is used.)

You might have to include a warning that those values or attributes are 
interpreted differently depending on the driver used, and the above is
mostly true for 16-color console drivers only.

For 2-colors (we still have quite a few of them) only bit 0 is true for
color (0x00 and 0x01). The rest of the bits are interpreted as
attributes:

0x02 - italic
0x04 - underline
0x08 - bold
0x80 - blink

The italic, underline and bold attributes will show up in a 2-color
framebuffer console. The blink attribute is ignored.

With a 4-color fb console (4-level grayscale), those values are again
interpreted differently.

0x00 - 0x00 : black
0x01 - 0x06 : white
0x07 - 0x08 : gray  
the rest: intense white

(If by mistake 0x0106 is used, it will produce a white on white display)

With an 8-color console, only the first 8 values are considered.

With a 16-color console, that is also not consistent:

With vgacon, it supports 16-color foreground (fg), 8-color
background (bg) at 256 chars. Becomes 8 fg and 8 bg with 512 chars.

With fbcon, it supports 16 fg and 16 bg at 256, 16 fg and 8 bg at
512 chars.

And for drivers that have their own con_build_attr() hook, they will be
interpreted differently again.

> +
> + Background:
> + 0x00 = black,   0x40 = blue,
> + 0x10 = red, 0x50 = magenta,
> + 0x20 = green,   0x60 = cyan,
> + 0x30 = brown,   0x70 = gray,
> +
> + For example, 0x1F would yield white on red.
> +

You may need to specify that the values here are the console default,
ie, the default_blue|grn|red boot options are not filled up.
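
To make the layout of the Kconfig value concrete, here is a sketch that
splits an attribute byte into the fields the help text describes: low
nibble foreground, bits 4 to 6 background, bit 7 blink. The function
names are invented for illustration, and the masks follow the
vc_set_color() helper in the patch. As noted above, how a driver
actually renders these fields may differ.

```c
#include <stdint.h>

/* Foreground color index: low four bits. */
static unsigned int attr_fg(uint8_t attr)
{
    return attr & 0x0F;
}

/* Background color index: bits 4 to 6. */
static unsigned int attr_bg(uint8_t attr)
{
    return (attr >> 4) & 0x07;
}

/* Blink flag: bit 7. */
static unsigned int attr_blink(uint8_t attr)
{
    return (attr & 0x80) != 0;
}
```

For the help text's example value 0x1F this gives foreground 0x0F
(white) and background 1 (red, 0x10 in the table), with blink clear.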

>  config HW_CONSOLE
>   bool
>   depends on VT && !S390 && !UML
> Index: linux-2.6.23/drivers/char/vt.c
> ===
> --- linux-2.6.23.orig/drivers/char/vt.c
> +++ linux-2.6.23/drivers/char/vt.c
> @@ -73,6 +73,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -2344,6 +2345,23 @@ struct tty_driver *console_driver;
>  
>  #ifdef CONFIG_VT_CONSOLE
>  
> +static unsigned int printk_color __read_mostly = CONFIG_VT_PRINTK_COLOR;
> +#ifdef CONFIG_VT_CKO
> +module_param(printk_color, uint, S_IRUGO | S_IWUSR);
> +
> +static inline void vc_set_color(struct vc_data *vc, unsigned char color)
> +{
> + vc->vc_color = color_table[color & 0xF] |
> +(color_table[(color >> 4) & 0x7] << 4) |
> +(color & 0x80);

You may want to leave out the blink attribute (0x80) from this part.
Otherwise setterm -blink on|off will produce the opposite effect. 

Tony





Re: + fix-vm_can_nonlinear-check-in-sys_remap_file_pages.patch added to -mm tree

2007-10-08 Thread Ray Lee
On 10/8/07, Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> On Mon, Oct 08, 2007 at 10:05:40AM -0700, [EMAIL PROTECTED] wrote:
> > --- a/mm/fremap.c~fix-vm_can_nonlinear-check-in-sys_remap_file_pages
> > +++ a/mm/fremap.c
> > @@ -160,7 +160,7 @@ asmlinkage long sys_remap_file_pages(uns
> >   if (vma->vm_private_data && !(vma->vm_flags & VM_NONLINEAR))
> >   goto out;
> >
> > - if (!vma->vm_flags & VM_CAN_NONLINEAR)
> > + if (!(vma->vm_flags & VM_CAN_NONLINEAR))
>
> Ick.

Perhaps a good candidate for checkpatch.pl? (Andy cc:d.)

Ray
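
A small user-space sketch of why the original test never fires. In C, unary
! binds tighter than binary &, so "!flags & BIT" is evaluated as
"(!flags) & BIT"; !flags is 0 or 1, and ANDing that with any bit other than
bit 0 gives 0. DEMO_CAN_NONLINEAR is a stand-in value chosen for the demo,
not taken from the kernel headers, and the function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical flag value for illustration only. */
#define DEMO_CAN_NONLINEAR 0x08000000UL

static_assert((DEMO_CAN_NONLINEAR & 1UL) == 0,
	      "the demo needs a flag that is not bit 0");

/* Buggy form from the '-' line: parsed as (!flags) & DEMO_CAN_NONLINEAR. */
static int buggy_lacks_flag(unsigned long flags)
{
	return (!flags & DEMO_CAN_NONLINEAR) != 0;
}

/* Fixed form from the '+' line: true when the flag bit is clear. */
static int fixed_lacks_flag(unsigned long flags)
{
	return !(flags & DEMO_CAN_NONLINEAR);
}
```

buggy_lacks_flag() returns 0 for every input, so the "goto out" guarded by it
could never be taken; fixed_lacks_flag() returns 1 exactly when the bit is
missing.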


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Randy Dunlap
On Mon, 08 Oct 2007 16:43:10 -0600 Jonathan Corbet wrote:

> Sam Ravnborg <[EMAIL PROTECTED]> wrote:
> 
> > Or maybe we need something much less formal that explain the purpose of the
> > four tags we use:
> 
> ...or maybe a combination?  How does the following patch look as a way
> to describe how the tags are used and what Reviewed-by, in particular,
> means?
> 
> Perhaps the DCO should move to this file as well?
> 
> jon

Just typos noted below...

> ---
> 
> Add a document on patch tags.
> 
> Signed-off-by: Jonathan Corbet <[EMAIL PROTECTED]>
> 
> diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> index 43e89b1..fa1518b 100644
> --- a/Documentation/00-INDEX
> +++ b/Documentation/00-INDEX
> @@ -284,6 +284,8 @@ parport.txt
>   - how to use the parallel-port driver.
>  parport-lowlevel.txt
>   - description and usage of the low level parallel port functions.
> +patch-tags
> + - description of the tags which can be added to patches
>  pci-error-recovery.txt
>   - info on PCI error recovery.
>  pci.txt
> diff --git a/Documentation/patch-tags b/Documentation/patch-tags
> new file mode 100644
> index 000..fb5f8e1
> --- /dev/null
> +++ b/Documentation/patch-tags
> @@ -0,0 +1,66 @@
> +Patches headed for the mainline may contain a variety of tags documenting
> +who played a hand in (or was at least aware of) its progress.  All of these
> +tags have the form:
> +
> + Something-done-by: Full name <[EMAIL PROTECTED]>
> +
> +These tags are:
> +
> +Signed-off-by:  A person adding a Signed-off-by tag is attesting that the
> + patch is, to the best of his or her knowledge, legally able
> + to be merged into the mainline and distributed under the
> + terms of the GNU General Public License, version 2.  See
> + the Developer's Certificate of Origin, found in
> + Documentation/SubmittingPatches, for the precise meaning of
> + Signed-off-by.
> +
> +Acked-by:The person named (who should be an active developer in the
> + area addressed by the patch) is aware of the patch and has
> + no objection to its inclusion.  An Acked-by tag does not
> + imply any involvement in the development of the patch or
> + that a detailed review was done.
> +
> +Reviewed-by: The patch has been reviewed and found acceptible according

  acceptable

> + to the Reviewer's Statement as found at the bottom of this
> + file.  A Reviewed-by tag is a statement of opinion that the
> + patch is an appropriate modification of the kernel without
> + any remaining serious technical issues.  Any interested
> + reviewer (who has done the work) can offer a Reviewed-by
> + tag for a patch.
> +
> +Cc:  The person named was given the opportunity to comment on
> + the patch.  This is the only tag which might be added
> + without an explicit action by the person it names.
> +
> +Tested-by:   The patch has been successfully tested (in some
> + environment) by the person named.
> +
> +
> +
> +
> +Reviewer's statement of oversight, v0.02
> +
> +By offering my Reviewed-by: tag, I state that:
> +
> + (a) I have carried out a technical review of this patch to evaluate its
> + appropriateness and readiness for inclusion into the mainline kernel. 
> +
> + (b) Any problems, concerns, or questions relating to the patch have been
> + communicated back to the submitter.  I am satisfied with how the
> + submitter has responded to my comments.
> +
> + (c) While there may (or may not) be things which could be improved with
> + this submission, I believe that it is, at this time, (1) a worthwhile
> + modification to the kernel, and (2) free of known issues which would
> + argue against its inclusion.
> +
> + (d) While I have reviewed the patch and believe it to be sound, I can not

 cannot

> + (unless explicitly stated elsewhere) make any warranties or guarantees
> + that it will achieve its stated purpose or function properly in any
> + given situation.
> +
> + (e) I understand and agree that this project and the contribution are
> + public and that a record of the contribution (including my Reviewed-by
> + tag and any associated public communications) is maintained
> + indefinitely and may be redistributed consistent with this project or
> + the open source license(s) involved.
> -


---
~Randy


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Oleg Verych
* Mon, 8 Oct 2007 17:38:52 -0400
>
> On Mon, Oct 08, 2007 at 01:33:38PM -0700, H. Peter Anvin wrote:
>> Uhm, no.  There is no reason an "unimportant" person couldn't review a 
>> patch, and therefore perform a potentially highly valuable service to 
>> the maintainer.
>> 
>> None of these are indicative of the authority of the person acking, 
>> reviewing, testing, or nacking.  That's only as good as the trust in the 
>> person signing.
>
> I would tend to agree.  Right now I think the problem is that we are
> getting too few reviews, not too many.  And someone who reviews
> patches, even if unknown, could be building up expertise that
> eventually would make them a valued developer, even while they are
> doing us a service.   

Such as the experience of convincing an experienced patch author that some
things in the patch are wrong :)

[]
> We could ask reviewers to include a URL to an LKML archive of their
> review, to make it easier to find a review of a patch so later on
> people can judge how effective their review was.

I vote (again) for more short summaries in the `Subject'. Long, boring
threads, where the whole threading part of the screen is empty because
every subject is the same, aren't fun when only some of the thousands of
messages have interesting stuff inside.

And this helps not only mailing list readers now, but also archive
readers and readers of www search results (whoever they may be):

google.com/search?q=reviewed+crashkernel

The first hit is a patch review I happened to make. I just thought
"hell, just string parsing, what can be simpler?", yet there was
productive discussion and bug fixing. After I saw convincing statements
about testing, I placed my review mark, though I'm really an
"unimportant" random hacker.
--
-o--=O`C
 #oo'L O
<___=E M


[PATCH] [NetLabel] Introduce a new kernel configuration API for NetLabel - for Smack Version 5

2007-10-08 Thread Casey Schaufler
From: Paul Moore <[EMAIL PROTECTED]>

Add a new set of configuration functions to the NetLabel/LSM API so that
LSMs can perform their own configuration of the NetLabel subsystem without
relying on assistance from userspace.

Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
---

This update fixes a memory leak on error conditions.

 include/net/netlabel.h |   47 --
 net/ipv4/cipso_ipv4.c  |4 -
 net/netlabel/netlabel_cipso_v4.c   |2 
 net/netlabel/netlabel_cipso_v4.h   |3 +
 net/netlabel/netlabel_domainhash.h |1 
 net/netlabel/netlabel_kapi.c   |  177 
 6 files changed, 225 insertions(+), 9 deletions(-)

diff --git a/include/net/netlabel.h b/include/net/netlabel.h
index 2e5b2f6..facaf68 100644
--- a/include/net/netlabel.h
+++ b/include/net/netlabel.h
@@ -36,6 +36,8 @@
 #include 
 #include 
 
+struct cipso_v4_doi;
+
 /*
  * NetLabel - A management interface for maintaining network packet label
  *mapping tables for explicit packet labling protocols.
@@ -99,12 +101,6 @@ struct netlbl_audit {
uid_t loginuid;
 };
 
-/* Domain mapping definition struct */
-struct netlbl_dom_map;
-
-/* Domain mapping operations */
-int netlbl_domhsh_remove(const char *domain, struct netlbl_audit *audit_info);
-
 /* LSM security attributes */
 struct netlbl_lsm_cache {
atomic_t refcount;
@@ -285,6 +281,19 @@ static inline void netlbl_secattr_free(struct netlbl_lsm_secattr *secattr)
 
 #ifdef CONFIG_NETLABEL
 /*
+ * LSM configuration operations
+ */
+int netlbl_cfg_map_del(const char *domain, struct netlbl_audit *audit_info);
+int netlbl_cfg_unlbl_add_map(const char *domain,
+struct netlbl_audit *audit_info);
+int netlbl_cfg_cipsov4_add(struct cipso_v4_doi *doi_def,
+  struct netlbl_audit *audit_info);
+int netlbl_cfg_cipsov4_add_map(struct cipso_v4_doi *doi_def,
+  const char *domain,
+  struct netlbl_audit *audit_info);
+int netlbl_cfg_cipsov4_del(u32 doi, struct netlbl_audit *audit_info);
+
+/*
  * LSM security attribute operations
  */
 int netlbl_secattr_catmap_walk(struct netlbl_lsm_secattr_catmap *catmap,
@@ -318,6 +327,32 @@ void netlbl_cache_invalidate(void);
 int netlbl_cache_add(const struct sk_buff *skb,
 const struct netlbl_lsm_secattr *secattr);
 #else
+static inline int netlbl_cfg_map_del(const char *domain,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_unlbl_add_map(const char *domain,
+  struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_cipsov4_add(struct cipso_v4_doi *doi_def,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_cipsov4_add_map(struct cipso_v4_doi *doi_def,
+const char *domain,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_cipsov4_del(u32 doi,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
 static inline int netlbl_secattr_catmap_walk(
  struct netlbl_lsm_secattr_catmap *catmap,
  u32 offset)
diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
index ab56a05..714461c 100644
--- a/net/ipv4/cipso_ipv4.c
+++ b/net/ipv4/cipso_ipv4.c
@@ -557,8 +557,8 @@ int cipso_v4_doi_remove(u32 doi,
spin_unlock(_v4_doi_list_lock);
list_for_each_entry_rcu(dom_iter, _def->dom_list, list)
if (dom_iter->valid)
-   netlbl_domhsh_remove(dom_iter->domain,
-audit_info);
+   netlbl_cfg_map_del(dom_iter->domain,
+  audit_info);
cipso_v4_cache_invalidate();
rcu_read_unlock();
 
diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c
index c060e3f..07f7fd4 100644
--- a/net/netlabel/netlabel_cipso_v4.c
+++ b/net/netlabel/netlabel_cipso_v4.c
@@ -89,7 +89,7 @@ static const struct nla_policy netlbl_cipsov4_genl_policy[NLBL_CIPSOV4_A_MAX + 1
  * safely.
  *
  */
-static void netlbl_cipsov4_doi_free(struct rcu_head *entry)
+void netlbl_cipsov4_doi_free(struct rcu_head *entry)
 {
struct cipso_v4_doi *ptr;
 
diff --git a/net/netlabel/netlabel_cipso_v4.h b/net/netlabel/netlabel_cipso_v4.h
index f03cf9b..220cb9d 100644
--- a/net/netlabel/netlabel_cipso_v4.h
+++ b/net/netlabel/netlabel_cipso_v4.h
@@ -163,4 +163,7 @@ enum {
 /* NetLabel protocol functions */
 int netlbl_cipsov4_genl_init(void);
 
+/* Free the memory 

possible recursive locking detected... in __wake_up

2007-10-08 Thread Stefan Richter
Hi list,

how could this ever happen?

>>   =
>>   [ INFO: possible recursive locking detected ]
>>   2.6.23-0.222.rc9.git4.fc8 #1
>>   -
>>   X/2522 is trying to acquire lock:
>>(>lock){++..}, at: [] __wake_up+0x15/0x42
>>
>>   but task is already holding lock:
>>(>lock){++..}, at: [] __wake_up+0x15/0x42
>>
>>   other info that might help us debug this:
>>   2 locks held by X/2522:
>>#0:  (>lock){.+..}, at: [] queue_event+0x2b/0x68 [firewire_core]
>>#1:  (>lock){++..}, at: [] __wake_up+0x15/0x42
>>
>>   stack backtrace:
>>[] show_trace_log_lvl+0x1a/0x2f
>>[] show_trace+0x12/0x14
>>[] dump_stack+0x16/0x18
>>[] __lock_acquire+0x189/0xc67
>>[] lock_acquire+0x7b/0x9e
>>[] _spin_lock_irqsave+0x4a/0x77
>>[] __wake_up+0x15/0x42
>>[] ep_poll_safewake+0x86/0xa8
>>[] ep_poll_callback+0x9f/0xaa
>>[] __wake_up_common+0x32/0x55
>>[] __wake_up+0x31/0x42
>>[] queue_event+0x57/0x68 [firewire_core]
>>[] handle_request+0xd8/0xe0 [firewire_core]
>>[] fw_core_handle_request+0x215/0x23c [firewire_core]
>>[] handle_ar_packet+0xd7/0xeb [firewire_ohci]
>>[] ar_context_tasklet+0xb6/0xc4 [firewire_ohci]
>>[] tasklet_action+0x68/0xd3
>>[] __do_softirq+0x78/0xff
>>[] do_softirq+0x74/0xf7
>>===
(from https://bugzilla.redhat.com/show_bug.cgi?id=323411)

We wake up the queue from a workqueue context (rarely) and from tasklet
context (frequently).  However, since __wake_up disables local IRQs, it
should be entirely impossible for __wake_up to take q->lock twice before
releasing it.  What's the deal?
-- 
Stefan Richter
-=-=-=== =-=- -=--=
http://arcgraph.de/sr/


Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Yan Zheng
2007/10/8, Hugh Dickins <[EMAIL PROTECTED]>:
> On Mon, 8 Oct 2007, Yan Zheng wrote:
> >
> > The test for VM_CAN_NONLINEAR always fails
>
> Good catch indeed.  Though I was puzzled how we do nonlinear at all,
> until I realized it's "The test for not VM_CAN_NONLINEAR always fails".
>
> It's not as serious as it appears, since code further down has been
> added more recently to simulate nonlinear on non-RAM-backed filesystems,
> instead of going the real nonlinear way; so most filesystems are now not
> required to do what VM_CAN_NONLINEAR was put in to ensure they could do.
>
> I'm confused as to where that leaves us: is this actually a fix that
> needs to go into 2.6.23?  or will it suddenly disable a system call
> which has been silently working fine on various filesystems which did
> not add VM_CAN_NONLINEAR?  could we just rip out VM_CAN_NONLINEAR?
>
> I hope Nick or Miklos is clearer on what the risks are.
>
> (Apologies for all the "not"s and "non"s here, I'm embarrassed
> after just criticizing Ingo's SCHED_NO_NO_OMIT_FRAME_POINTER!)
>
> Hugh

Yes, I meant "The test for not VM_CAN_NONLINEAR always fails".  Please
forgive my poor English.


Re: [PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one

2007-10-08 Thread Andrew Vasquez
On Mon, 08 Oct 2007, Darrick J. Wong wrote:

> If the aic94xx chip doesn't have a SAS address in the chip's flash memory,
> use the request_firmware() interface to get one from userspace.  This
> way, there's no debate as to who or how an address gets generated--it's
> totally up to the administrator to provide it if the card doesn't have one.

So how about factoring that out into a transport-level interface?
Something along the lines of the following patch, whereby the software
driver, upon detecting no valid WWPN, makes an upcall to each
interface's 'request_wwn()'.  The data passed in from shost_gendev
should be enough for some helper script to cull the relevant device bits
and perhaps offer some level of persistence...  Off base?

Darrick, forgive the FC example, I don't do SAS...
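
To make the upcall idea concrete, here is a user-space sketch of the parsing
step such a helper would need. It is an assumption, not the kernel's
fc_parse_wwn(): the thread does not show what format that function accepts,
so this sketch simply takes exactly 16 hexadecimal digits, which fits the
"fw->size < 16" check in the patch below. parse_wwn_hex() is a hypothetical
name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn the first 16 hex digits of buf into a 64-bit
 * WWN. Returns 0 on success, -1 if the buffer is too short or contains a
 * non-hex character. *wwn is only written on success.
 */
static int parse_wwn_hex(const char *buf, size_t len, uint64_t *wwn)
{
	uint64_t v = 0;
	size_t i;

	assert(buf != NULL && wwn != NULL);

	if (len < 16)
		return -1;	/* same minimum length as the quoted patch */

	for (i = 0; i < 16; i++) {
		char c = buf[i];
		unsigned int d;

		if (c >= '0' && c <= '9')
			d = (unsigned int)(c - '0');
		else if (c >= 'a' && c <= 'f')
			d = (unsigned int)(c - 'a') + 10;
		else if (c >= 'A' && c <= 'F')
			d = (unsigned int)(c - 'A') + 10;
		else
			return -1;
		v = (v << 4) | d;
	}

	*wwn = v;
	return 0;
}
```

A script answering the firmware request would only have to write those 16
digits; how the address is chosen and kept persistent stays with the
administrator, as proposed.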

--
av

--

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 7a7cfe5..5e0d953 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "scsi_priv.h"
 #include "scsi_transport_fc_internal.h"
 
@@ -3251,6 +3252,30 @@ fc_vport_sched_delete(struct work_struct *work)
vport->channel, stat);
 }
 
+int
+fc_request_wwn(struct Scsi_Host *shost, u64 *wwn)
+{
+   const struct firmware *fw;
+   int stat;
+
+   stat = request_firmware(&fw, "fc_addr", &shost->shost_gendev);
+   if (stat)
+   return stat;
+
+   if (fw->size < 16) {
+   stat = -EINVAL;
+   goto out;
+   }
+
+   stat = fc_parse_wwn(fw->data, wwn);
+   if (stat)
+   goto out;
+
+out:
+   release_firmware(fw);
+   return stat;
+}
+EXPORT_SYMBOL(fc_request_wwn);
 
 /* Original Author:  Martin Hicks */
 MODULE_AUTHOR("James Smart");
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index e466d88..e80c36c 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -734,4 +734,6 @@ void fc_host_post_vendor_event(struct Scsi_Host *shost, u32 event_number,
 */
 int fc_vport_terminate(struct fc_vport *vport);
 
+int fc_request_wwn(struct Scsi_Host *, u64 *);
+
 #endif /* SCSI_TRANSPORT_FC_H */


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Jonathan Corbet
Sam Ravnborg <[EMAIL PROTECTED]> wrote:

> Or maybe we need something much less formal that explain the purpose of the
> four tags we use:

...or maybe a combination?  How does the following patch look as a way
to describe how the tags are used and what Reviewed-by, in particular,
means?

Perhaps the DCO should move to this file as well?

jon

---

Add a document on patch tags.

Signed-off-by: Jonathan Corbet <[EMAIL PROTECTED]>

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 43e89b1..fa1518b 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -284,6 +284,8 @@ parport.txt
- how to use the parallel-port driver.
 parport-lowlevel.txt
- description and usage of the low level parallel port functions.
+patch-tags
+   - description of the tags which can be added to patches
 pci-error-recovery.txt
- info on PCI error recovery.
 pci.txt
diff --git a/Documentation/patch-tags b/Documentation/patch-tags
new file mode 100644
index 000..fb5f8e1
--- /dev/null
+++ b/Documentation/patch-tags
@@ -0,0 +1,66 @@
+Patches headed for the mainline may contain a variety of tags documenting
+who played a hand in (or was at least aware of) its progress.  All of these
+tags have the form:
+
+   Something-done-by: Full name <[EMAIL PROTECTED]>
+
+These tags are:
+
+Signed-off-by:  A person adding a Signed-off-by tag is attesting that the
+   patch is, to the best of his or her knowledge, legally able
+   to be merged into the mainline and distributed under the
+   terms of the GNU General Public License, version 2.  See
+   the Developer's Certificate of Origin, found in
+   Documentation/SubmittingPatches, for the precise meaning of
+   Signed-off-by.
+
+Acked-by:  The person named (who should be an active developer in the
+   area addressed by the patch) is aware of the patch and has
+   no objection to its inclusion.  An Acked-by tag does not
+   imply any involvement in the development of the patch or
+   that a detailed review was done.
+
+Reviewed-by:   The patch has been reviewed and found acceptible according
+   to the Reviewer's Statement as found at the bottom of this
+   file.  A Reviewed-by tag is a statement of opinion that the
+   patch is an appropriate modification of the kernel without
+   any remaining serious technical issues.  Any interested
+   reviewer (who has done the work) can offer a Reviewed-by
+   tag for a patch.
+
+Cc:The person named was given the opportunity to comment on
+   the patch.  This is the only tag which might be added
+   without an explicit action by the person it names.
+
+Tested-by: The patch has been successfully tested (in some
+   environment) by the person named.
+
+
+
+
+Reviewer's statement of oversight, v0.02
+
+By offering my Reviewed-by: tag, I state that:
+
+ (a) I have carried out a technical review of this patch to evaluate its
+ appropriateness and readiness for inclusion into the mainline kernel. 
+
+ (b) Any problems, concerns, or questions relating to the patch have been
+ communicated back to the submitter.  I am satisfied with how the
+ submitter has responded to my comments.
+
+ (c) While there may (or may not) be things which could be improved with
+ this submission, I believe that it is, at this time, (1) a worthwhile
+ modification to the kernel, and (2) free of known issues which would
+ argue against its inclusion.
+
+ (d) While I have reviewed the patch and believe it to be sound, I can not
+ (unless explicitly stated elsewhere) make any warranties or guarantees
+ that it will achieve its stated purpose or function properly in any
+ given situation.
+
+ (e) I understand and agree that this project and the contribution are
+ public and that a record of the contribution (including my Reviewed-by
+ tag and any associated public communications) is maintained
+ indefinitely and may be redistributed consistent with this project or
+ the open source license(s) involved.


[PATCH] Correct Makefile rule for generating custom keymap.

2007-10-08 Thread Maarten Bressers
When building a custom keymap, after setting GENERATE_KEYMAP := 1 in
drivers/char/Makefile, the kernel build fails like this:

  CC  drivers/char/vt.o
make[2]: *** No rule to make target `drivers/char/%.map', needed by
`drivers/char/defkeymap.c'.  Stop.
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2

This was caused by commit af8b128719f5248e542036ea994610a29d0642a6,
which deleted a necessary colon from the Makefile rule that generates
the keymap, since that rule contains both a target and a target-pattern.
The following patch puts the colon back:

Signed-off-by: Maarten Bressers <[EMAIL PROTECTED]>


--- a/drivers/char/Makefile 2007-10-08 23:46:47.0 +0200
+++ b/drivers/char/Makefile 2007-10-08 23:46:57.0 +0200
@@ -129,7 +129,7 @@ $(obj)/defkeymap.o:  $(obj)/defkeymap.c
 
 ifdef GENERATE_KEYMAP
 
-$(obj)/defkeymap.c $(obj)/%.c: $(src)/%.map
+$(obj)/defkeymap.c: $(obj)/%.c: $(src)/%.map
loadkeys --mktable $< > $@.tmp
sed -e 's/^static *//' $@.tmp > $@
rm $@.tmp


Re: x86-64 sporadic hang in 2.6.23rc7 and 2.6.22

2007-10-08 Thread Helge Hafting

Thomas Gleixner wrote:
> On Sat, 29 Sep 2007, Helge Hafting wrote:
>> Thomas Gleixner wrote:
>>>> I have gone back to 2.6.22rc4, which seems to work.
>>>>
>>>> This is a single opteron, although on a dual-slot board.
>>>
>>> Can you switch to serial console, so we can get some information out of
>>> that box? Sysrq-B is working, so we can get info from other sysrq
>>> functions as well.
>>
>> I didn't need the serial - it crashes during console work too.
>> I think a "make clean" was in progress at the time. There must be work going
>> on in order to crash.
>>
>> This time 2.6.22rc4 died on me with a general protection fault
>>
>> I got two reports, the first one scrolled partially off screen but
>> the whole trace was there:
>
> That's why I asked for a serial console. That way we can get all the
> information from the reports including the register dumps

I got another crash - with a full dump.  I have also discovered
files with lots of single-bit errors, so this is probably just some kind
of hw problem. :-(

Replace memory or the motherboard with everything on it . . . :-(

Helge Hafting



RE: parallel networking

2007-10-08 Thread Waskiewicz Jr, Peter P
> Multiply whatever effect you think you might be able to 
> measure due to that on your 2 or 4 way system, and multiply 
> it up to 64 cpus or so for machines I am using.  This is 
> where machines are going, and is going to become the norm.

That, along with speeds going to 10 GbE with multiple Tx/Rx queues (with
40 and 100 GbE under discussion now), where multiple CPUs hitting the
driver are needed to push line rate without cratering the entire
machine.

-PJ Waskiewicz


Re: parallel networking

2007-10-08 Thread David Miller
From: jamal <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 18:30:18 -0400

> Very quickly there are no more packets for it to dequeue from the
> qdisc or the driver is stopped and it has to get out of there. If you
> don't have any interrupt tied to a specific cpu then you can have many
> cpus enter and leave that region all the time.

With the lock shuttling back and forth between those cpus, which is
what we're trying to avoid.

Multiply whatever effect you think you might be able to measure due to
that on your 2 or 4 way system, and multiply it up to 64 cpus or so
for machines I am using.  This is where machines are going, and is
going to become the norm.


Re: gigabit ethernet power consumption

2007-10-08 Thread Kok, Auke
Pavel Machek wrote:
> Hi!
> 
> I've found that gbit vs. 100mbit power consumption difference is about
> 1W -- pretty significant. (Maybe powertop should include it in the
> tips section? :).
> 
> Energy Star people insist that machines should switch down to 100mbit
> when network is idle, and I guess that makes a lot of sense -- you
> save 1W locally and 1W on the router.
> 
> Question is, how to implement it correctly? Daemon that would watch
> data rates and switch speeds using mii-tool would be simple, but is
> that enough?

You most certainly want to do this in userspace, I think.

One of the biggest problems is that link negotiation can take a significant
amount of time with gigabit (1 to 3 seconds is typical, and it can run to
several seconds), and having your ethernet connection go offline for 3
seconds may not be the desired effect when you want to get more bandwidth
in the first place.

However, when a laptop is in battery mode, switching down from gigabit to
100mbit makes a lot more sense, so this is something I would recommend.
This can be as easy as changing the advertisement mask of the interface and
renegotiating the link. Userspace could handle that very easily.

Auke
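
A rough sketch of the "change the advertisement mask" step, reduced to the
bit arithmetic so it can be read without a NIC at hand. The ADV_* names are
local stand-ins that are assumed to mirror the ADVERTISED_* bit layout in
<linux/ethtool.h>; cap_at_100mbit() is a hypothetical helper, and reading or
writing the mask and restarting autonegotiation (for instance through the
ethtool interface) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins, assumed to match the ADVERTISED_* bits in ethtool.h. */
#define ADV_10_HALF	(1u << 0)
#define ADV_10_FULL	(1u << 1)
#define ADV_100_HALF	(1u << 2)
#define ADV_100_FULL	(1u << 3)
#define ADV_1000_HALF	(1u << 4)
#define ADV_1000_FULL	(1u << 5)
#define ADV_GIGABIT	(ADV_1000_HALF | ADV_1000_FULL)

static_assert(ADV_GIGABIT == 0x30, "gigabit modes occupy bits 4 and 5");

/* Drop the gigabit modes from an advertisement mask, keep everything else. */
static uint32_t cap_at_100mbit(uint32_t advertising)
{
	return advertising & ~ADV_GIGABIT;
}
```

Restoring the saved original mask when the machine goes back on AC power
would bring gigabit back, at the cost of another renegotiation pause.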


Re: parallel networking

2007-10-08 Thread jamal
On Mon, 2007-08-10 at 14:11 -0700, David Miller wrote:

> The problem is that the packet schedulers want global guarantees
> on packet ordering, not flow centric ones.
> 
> That is the issue Jamal is concerned about.

indeed, thank you for giving it better wording. 

> The more I think about it, the more inevitable it seems that we really
> might need multiple qdiscs, one for each TX queue, to pull this full
> parallelization off.
> 
> But the semantics of that don't smell so nice either.  If the user
> attaches a new qdisc to "ethN", does it go to all the TX queues, or
> what?
> 
> All of the traffic shaping technology deals with the device as a unary
> object.  It doesn't fit to multi-queue at all.

If you let only one CPU at a time access the "xmit path" you solve all
the reordering. If you want to be more fine grained you make the
serialization point as low as possible in the stack - perhaps in the
driver.
But I think even what we have today with only one cpu entering the
dequeue/scheduler region, _for starters_, is not bad actually ;->  What
I am finding (and I can tell you I have been trying hard ;->) is that a
sufficiently fast cpu doesn't sit in the dequeue area for "too long" (and
batching reduces the time spent further). Very quickly there are no more
packets for it to dequeue from the qdisc or the driver is stopped and it
has to get out of there. If you don't have any interrupt tied to a
specific cpu then you can have many cpus enter and leave that region all
the time.

cheers,
jamal
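
The "only one cpu in the dequeue region" idea can be sketched in user space
with a single test-and-set flag. This is an illustration under assumptions,
not the kernel's implementation: struct demo_txq and the demo_* functions are
invented names, the backlog counter stands in for a real qdisc, and the
enqueue side (which would need its own protection) is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_txq {
	atomic_flag running;	/* set while some CPU is draining the queue */
	int backlog;		/* packets waiting; touched only by the owner */
};

/* Try to become the single CPU inside the dequeue region. */
static bool demo_try_enter(struct demo_txq *q)
{
	return !atomic_flag_test_and_set(&q->running);
}

/* Leave the region so that another CPU may enter. */
static void demo_leave(struct demo_txq *q)
{
	atomic_flag_clear(&q->running);
}

/*
 * Drain up to 'budget' packets. Returns the number sent, or -1 when another
 * CPU already owns the region and the caller should simply go away.
 */
static int demo_xmit(struct demo_txq *q, int budget)
{
	int sent = 0;

	assert(budget >= 0);
	if (!demo_try_enter(q))
		return -1;
	while (q->backlog > 0 && sent < budget) {
		q->backlog--;	/* stands in for handing a packet to the driver */
		sent++;
	}
	demo_leave(q);
	return sent;
}
```

A CPU that gets -1 never waits, which is the point made above: the owner
leaves quickly once the backlog is empty, so the region changes hands often
without anyone spinning on it.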



Re: 2.6.23-rc9 compile error drivers/video/fbmon.c

2007-10-08 Thread Helge Hafting

Adrian Bunk wrote:
> On Tue, Oct 09, 2007 at 12:00:38AM +0200, Helge Hafting wrote:
>>  CC  drivers/video/fbmon.o
>> drivers/video/fbmon.c: In function ‘fb_parse_edid’:
>> drivers/video/fbmon.c:867: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attrib
>> _’ before ‘*’ token
>> drivers/video/fbmon.c:867: error: ‘block’ undeclared (first use in this func
>> )
>>
>> This line reads:
>>    unsigned char$*block;
>>
>> Source error, or is my tree simply corrupt?
>
> The $ is a space character in my copy of the tree, so it seems to be a
> corrupted tree with a bit error on your side ($ and the space character
> differ by only one bit).

I downloaded a new tree.
That file has quite a few single-bit errors in my old tree.
Mostly of the +16 variety.
Seems I have to look for bad memory :-( :-( :-(

Helge Hafting


Re: 2.6.23-rc9 compile error drivers/video/fbmon.c

2007-10-08 Thread Adrian Bunk
On Tue, Oct 09, 2007 at 12:00:38AM +0200, Helge Hafting wrote:
>  CC  drivers/video/fbmon.o
> drivers/video/fbmon.c: In function ‘fb_parse_edid’:
> drivers/video/fbmon.c:867: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attrib
> _’ before ‘*’ token
> drivers/video/fbmon.c:867: error: ‘block’ undeclared (first use in this func
> )
>
> This line reads:
>unsigned char$*block;
>
> Source error, or is my tree simply corrupt?

The $ is a space character in my copy of the tree, so it seems to be a 
corrupted tree with a bit error on your side ($ and the space character 
differ by only one bit).

> Helge Hafting

cu
Adrian
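
The single-bit claim is easy to check by hand: in ASCII '$' is 0x24 and the
space character is 0x20, so they differ only in the bit with value 0x04. A
minimal sketch of that check follows; bits_differing() is a hypothetical
helper name, and an ASCII-compatible character set is assumed.

```c
#include <assert.h>

static_assert(('$' ^ ' ') == 0x04, "assumes an ASCII-compatible charset");

/* Number of bit positions in which two bytes differ (Hamming distance). */
static int bits_differing(unsigned char a, unsigned char b)
{
	unsigned int x = (unsigned int)(a ^ b);
	int n = 0;

	while (x) {
		n += (int)(x & 1u);	/* count the lowest bit, then shift it out */
		x >>= 1;
	}
	return n;
}
```

Running the same comparison over a suspect file and a known-good copy, byte
by byte, is one way to tell isolated bit flips (typical of bad RAM) from
larger corruption.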

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: 2.6.23-rc9 compile error drivers/video/fbmon.c

2007-10-08 Thread Randy Dunlap
On Tue, 09 Oct 2007 00:00:38 +0200 Helge Hafting wrote:

>   CC  drivers/video/fbmon.o
> drivers/video/fbmon.c: In function ‘fb_parse_edid’:
> drivers/video/fbmon.c:867: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token
> drivers/video/fbmon.c:867: error: ‘block’ undeclared (first use in this function)
> 
> This line reads:
> unsigned char$*block;
> 
> Source error, or is my tree simply corrupt?

Hi,
It's a space in my kernel source file.

---
~Randy


gigabit ethernet power consumption

2007-10-08 Thread Pavel Machek
Hi!

I've found that gbit vs. 100mbit power consumption difference is about
1W -- pretty significant. (Maybe powertop should include it in the
tips section? :).

Energy Star people insist that machines should switch down to 100mbit
when network is idle, and I guess that makes a lot of sense -- you
save 1W locally and 1W on the router.

Question is, how to implement it correctly? Daemon that would watch
data rates and switch speeds using mii-tool would be simple, but is
that enough?
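
One way to picture the daemon's decision logic is below. This is only a
sketch: the thresholds, the struct and the function name are assumptions
of mine, and the actual speed change (mii-tool or similar) is left out.
Since renegotiating the link usually interrupts traffic briefly, the
sketch only drops to 100mbit after several consecutive idle samples and
goes back up as soon as the rate is high.

```c
#include <stdint.h>

enum link_speed { LINK_100M = 100, LINK_1000M = 1000 };

/* Hypothetical policy state; all thresholds are caller-chosen. */
struct speed_policy {
	uint64_t up_bytes_per_sec;   /* at or above this: go to gigabit */
	uint64_t down_bytes_per_sec; /* below this: sample counts as idle */
	unsigned int idle_needed;    /* idle samples required before dropping */
	unsigned int idle_seen;      /* consecutive idle samples so far */
	enum link_speed current;
};

/*
 * Feed one measured data rate (bytes/s) into the policy and return the
 * speed the link should run at afterwards.
 */
static enum link_speed policy_update(struct speed_policy *p, uint64_t rate)
{
	if (rate >= p->up_bytes_per_sec) {
		p->idle_seen = 0;
		p->current = LINK_1000M;
	} else if (rate < p->down_bytes_per_sec) {
		if (p->idle_seen < p->idle_needed)
			p->idle_seen++;
		if (p->idle_seen >= p->idle_needed)
			p->current = LINK_100M;
	} else {
		/* Moderate traffic: keep the current speed, restart idle count. */
		p->idle_seen = 0;
	}
	return p->current;
}
```

A real daemon would call policy_update() once per sampling interval with
the byte counters' delta and only touch the hardware when the returned
speed differs from the previous one.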
Pavel 
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


sleepy linux 2.6.23-rc9

2007-10-08 Thread Pavel Machek
Hi!

I played with powertop a bit, and found a fairly interesting failure
mode. If I boot init=/bin/bash vga=1, I get ~2 wakeups a second, nice.

When I boot init=/bin/bash vga=791 (vesa framebuffer), most wakeups
are caused by cursor painting (I should fix that some day, I
guess). But... the cursor blinking does not even work properly!

It blinks at normal speed, then (randomly) it blinks slowly, then gets
back to normal speed, then inserts longer delay.

The effect is so nice that I thought about youtube ;-). Thinkpad
x60.. question is, how to debug it? 

(config attached, I did all the stuff powertop told me to, and then some.)
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


config.gz
Description: Binary data


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Rafael J. Wysocki
On Monday, 8 October 2007 23:38, Theodore Tso wrote:
> On Mon, Oct 08, 2007 at 01:33:38PM -0700, H. Peter Anvin wrote:
> > Uhm, no.  There is no reason an "unimportant" person couldn't review a 
> > patch, and therefore perform a potentially highly valuable service to 
> > the maintainer.
> > 
> > None of these are indicative of the authority of the person acking, 
> > reviewing, testing, or nacking.  That's only as good as the trust in the 
> > person signing.
> 
> I would tend to agree.  Right now I think the problem is that we are
> getting too few reviews, not too many.  And someone who reviews
> patches, even if unknown, could be building up expertise that
> eventually would make them a valued developer, even while they are
> doing us a service.   
> 
> The concern that I suspect some people have is what if this gets
> abused by people who don't really bother to do a full review of a
> patch before they ack it.  We could ask reviewers to include a URL to
> an LKML archive of their review, to make it easier to find a review of
> a patch so later on people can judge how effective their review
> was.  Unfortunately, this would be an added burden for the regular
> reviewers, so I doubt this would be well accepted as a practice.  My
> suggestion is to not worry about this for now, and see how well it
> works out in practice.  If we start getting half a dozen or more
> Reviewed-by: where the patch is pretty clearly not getting adequately
> reviewed, or where someone is obviously abusing the system, and social
> pressures aren't working, we can try to figure out then how we want to
> address that problem.  Let's not make the process too complicated
> unless we know it's necessary.  Premature complexity is almost as bad
> as premature optimization.

I agree.

Greetings,
Rafael


2.6.23-rc9 compile error drivers/video/fbmon.c

2007-10-08 Thread Helge Hafting

 CC  drivers/video/fbmon.o
drivers/video/fbmon.c: In function ‘fb_parse_edid’:
drivers/video/fbmon.c:867: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token
drivers/video/fbmon.c:867: error: ‘block’ undeclared (first use in this function)

This line reads:
   unsigned char$*block;

Source error, or is my tree simply corrupt?

Helge Hafting


[PATCH] NTFS error messages: replace static char pointers by static char arrays

2007-10-08 Thread Dmitri Vorobiev

Hi,

The patch below contains a small code clean-up for the NTFS driver: all
static char pointers to error message strings have been replaced by 
static char arrays.
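
For readers wondering what the change buys: the post does not spell it
out, but the usual reason is that the pointer form creates a pointer
object in addition to the string, while the array form is just the
string. A small user-space sketch (the variable and function names are
illustrative only, not taken from the driver):

```c
#include <stddef.h>

/* Pointer form: a pointer variable plus the string literal it refers to. */
static const char *es_ptr = " Unmount and run chkdsk.";

/* Array form: only the characters themselves, including the NUL. */
static const char es_arr[] = " Unmount and run chkdsk.";

/* Size of the named object in the pointer form: one pointer. */
static size_t ptr_object_size(void)
{
	return sizeof(es_ptr);
}

/* Size of the named object in the array form: the whole string. */
static size_t arr_object_size(void)
{
	return sizeof(es_arr);
}
```

With the array form sizeof gives the string length plus the terminating
NUL, and no separate pointer has to be stored or loaded before use.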


Please apply if you like it.

Signed-off-by: Dmitri Vorobiev <[EMAIL PROTECTED]>
---
diff --git a/fs/ntfs/attrib.c b/fs/ntfs/attrib.c
index 1c08fef..b883537 100644
--- a/fs/ntfs/attrib.c
+++ b/fs/ntfs/attrib.c
@@ -869,7 +869,7 @@ static int ntfs_external_attr_find(const ATTR_TYPE type,
ntfschar *al_name;
u32 al_name_len;
int err = 0;
-   static const char *es = " Unmount and run chkdsk.";
+   static const char es[] = " Unmount and run chkdsk.";

ni = ctx->ntfs_ino;
base_ni = ctx->base_ntfs_ino;
diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c
index b532a73..82ac179 100644
--- a/fs/ntfs/inode.c
+++ b/fs/ntfs/inode.c
@@ -1870,7 +1870,7 @@ int ntfs_read_inode_mount(struct inode *vi)
} else /* if (!err) */ {
ATTR_LIST_ENTRY *al_entry, *next_al_entry;
u8 *al_end;
-   static const char *es = "  Not allowed.  $MFT is corrupt.  "
+   static const char es[] = "  Not allowed.  $MFT is corrupt.  "
"You should run chkdsk.";

ntfs_debug("Attribute list attribute found in $MFT.");
@@ -2332,7 +2332,7 @@ int ntfs_show_options(struct seq_file *sf, struct vfsmount *mnt)

 #ifdef NTFS_RW

-static const char *es = "  Leaving inconsistent metadata.  Unmount and run "
+static const char es[] = "  Leaving inconsistent metadata.  Unmount and run "
"chkdsk.";

 /**
@@ -2368,7 +2368,7 @@ int ntfs_truncate(struct inode *vi)
ntfs_attr_search_ctx *ctx;
MFT_RECORD *m;
ATTR_RECORD *a;
-   const char *te = "  Leaving file length out of sync with i_size.";
+   const char te[] = "  Leaving file length out of sync with i_size.";
int err, mp_size, size_change, alloc_change;
u32 attr_len;

diff --git a/fs/ntfs/mft.c b/fs/ntfs/mft.c
index 2ad5c8b..a560bb0 100644
--- a/fs/ntfs/mft.c
+++ b/fs/ntfs/mft.c
@@ -409,7 +409,7 @@ void __mark_mft_record_dirty(ntfs_inode *ni)
__mark_inode_dirty(VFS_I(base_ni), I_DIRTY_SYNC | I_DIRTY_DATASYNC);
 }

-static const char *ntfs_please_email = "Please email "
+static const char ntfs_please_email[] = "Please email "
"[EMAIL PROTECTED] and say that you saw "
"this message.  Thank you.";

@@ -1106,7 +1106,7 @@ bool ntfs_may_write_mft_record(ntfs_volume *vol, const unsigned long mft_no,
return true;
 }

-static const char *es = "  Leaving inconsistent metadata.  Unmount and run "
+static const char es[] = "  Leaving inconsistent metadata.  Unmount and run "
"chkdsk.";

 /**
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index 90c4e3a..a03ca16 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -460,7 +460,7 @@ static int ntfs_remount(struct super_block *sb, int *flags, char *opt)
 * have occured.
 */
if ((sb->s_flags & MS_RDONLY) && !(*flags & MS_RDONLY)) {
-   static const char *es = ".  Cannot remount read-write.";
+   static const char es[] = ".  Cannot remount read-write.";

/* Remounting read-write. */
if (NVolErrors(vol)) {
@@ -647,7 +647,7 @@ not_ntfs:
 static struct buffer_head *read_ntfs_boot_sector(struct super_block *sb,
const int silent)
 {
-   const char *read_err_str = "Unable to read %s boot sector.";
+   const char read_err_str[] = "Unable to read %s boot sector.";
struct buffer_head *bh_primary, *bh_backup;
sector_t nr_blocks = NTFS_SB(sb)->nr_blocks;

@@ -1756,9 +1756,9 @@ static bool load_system_files(ntfs_volume *vol)
 #ifdef NTFS_RW
/* Get mft mirror inode compare the contents of $MFT and $MFTMirr. */
if (!load_and_init_mft_mirror(vol) || !check_mft_mirror(vol)) {
-   static const char *es1 = "Failed to load $MFTMirr";
-   static const char *es2 = "$MFTMirr does not match $MFT";
-   static const char *es3 = ".  Run ntfsfix and/or chkdsk.";
+   static const char es1[] = "Failed to load $MFTMirr";
+   static const char es2[] = "$MFTMirr does not match $MFT";
+   static const char es3[] = ".  Run ntfsfix and/or chkdsk.";

/* If a read-write mount, convert it to a read-only mount. */
if (!(sb->s_flags & MS_RDONLY)) {
@@ -1880,11 +1880,12 @@ get_ctx_vol_failed:
 #ifdef NTFS_RW
/* Make sure that no unsupported volume flags are set. */
if (vol->vol_flags & VOLUME_MUST_MOUNT_RO_MASK) {
-   static const char *es1a = "Volume is dirty";
-   static const char *es1b = "Volume has been modified by chkdsk";
-   static const char *es1c = "Volume has unsupported flags set";
-   static const char *es2a = ".  Run chkdsk and mount in Windows.";
-  

Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.

2007-10-08 Thread Steve Wise



David Miller wrote:
> From: Sean Hefty <[EMAIL PROTECTED]>
> Date: Thu, 09 Aug 2007 14:40:16 -0700
> 
>> Steve Wise wrote:
>>> Any more comments?
>> Does anyone have ideas on how to reserve the port space without using a 
>> struct socket?
> 
> How about we just remove the RDMA stack altogether?  I am not at all
> kidding.  If you guys can't stay in your sand box and need to cause
> problems for the normal network stack, it's unacceptable.  We were
> told all along that if RDMA went into the tree none of this kind of
> stuff would be an issue.
> 
> These are exactly the kinds of problems that people like myself
> were dreading.  These subsystems have no business using the TCP port
> space of the Linux software stack, absolutely none.
> 
> After TCP port reservation, what's next?  It seems an at least
> bi-monthly event that the RDMA folks need to put their fingers
> into something else in the normal networking stack.  No more.
> 
> I will NACK any patch that opens up sockets to eat up ports or
> anything stupid like that.


Hey Dave,

The hack to use a socket and bind it to claim the port was just for 
demonstrating the idea.  The correct solution, IMO, is to enhance the 
core low level 4-tuple allocation services to be more generic (eg: not 
be tied to a struct sock).  Then the host tcp stack and the host rdma 
stack can allocate TCP/iWARP ports/4tuples from this common exported 
service and share the port space.  This allocation service could also be 
used by other deep adapters like iscsi adapters if needed.
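
To make the idea concrete, here is a toy user-space sketch of a port
allocator that is not tied to a socket object. It is a deliberate
simplification: it tracks only local port numbers rather than full
4-tuples, and every name in it is invented for illustration; none of it
is an existing kernel interface.

```c
#include <stdint.h>

/* Who currently holds a port in the shared space. */
enum port_owner { PORT_FREE = 0, PORT_HOST_TCP, PORT_RDMA };

#define NPORTS 65536

/* One owner tag per port; zero-initialized, so every port starts free. */
static uint8_t port_table[NPORTS];

/* Reserve a port for one stack. Returns 0 on success, -1 if unavailable. */
static int port_reserve(unsigned int port, enum port_owner who)
{
	if (port == 0 || port >= NPORTS || who == PORT_FREE)
		return -1;
	if (port_table[port] != PORT_FREE)
		return -1;
	port_table[port] = (uint8_t)who;
	return 0;
}

/* Release a port, but only if the caller is the stack that reserved it. */
static void port_release(unsigned int port, enum port_owner who)
{
	if (port < NPORTS && port_table[port] == (uint8_t)who)
		port_table[port] = PORT_FREE;
}

/* Report the current owner of a port (PORT_FREE if none or out of range). */
static enum port_owner port_owner_of(unsigned int port)
{
	return port < NPORTS ? (enum port_owner)port_table[port] : PORT_FREE;
}
```

The point of the sketch is only that both stacks go through the same
reserve/release calls, so neither can hand out a port the other holds.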


Will you NAK such a solution if I go implement it and submit it for review?
The dual ip subnet solution really sux, and I'm trying one more time 
to see if you will entertain the common port space solution, if done 
correctly.


Thanks,

Steve.


Re: [PATCH] Version 3 (2.6.23-rc8) Smack: Simplified Mandatory Access Control Kernel

2007-10-08 Thread Crispin Cowan
Eric W. Biederman wrote:
> My very practical question:  How do I run selinux in one container,
> and SMACK in another?
>   
In AppArmor, we plan to 'containerize' (not sure what to call it) policy
so that you can have an AppArmor policy per container. This is not
currently the case, it is just the direction we want to go. We think it
would be very useful for virtual hosts to be able to have their own
AppArmor policy, independent of what other hosts are doing.

The major step towards this goal so far is that AppArmor rules are now
canonicalized to the name space.

However, I have never considered the idea of separate LSM modules per
container. The idea doesn't really make sense to me. It is kind of like
asking for private device drivers, or even a private kernel, per name
space. If that's what you want, use virtualization like KVM, Xen, or VMware.

Crispin

-- 
Crispin Cowan, Ph.D.   http://crispincowan.com/~crispin/
   Itanium. Vista. GPLv3. Complexity at work



Re: 2.6.23 regression: do_nanosleep will not return

2007-10-08 Thread Thomas Gleixner

On Mon, 8 Oct 2007, Bernd Schubert wrote:

> Hi,
> 
> we have a system here where e.g. "sleep 1" will never finish. This is an 
> issue of 2.6.23; on all older kernel versions it worked fine.
> 
> Seems to hang in do_nanosleep() 
> 
> [  153.775792] sleep S  0  5372   5341
> [  153.782385]  81007f0a9ea8 0082  8efc
> [  153.790635]  81007f0a9e48 802447b4 81007f0c3080 0003
> [  153.798938]  81007f0c39c8 81007f0c37c0 4001d908
> [  153.806991] Call Trace:
> [  153.809937]  [] do_nanosleep+0x42/0x75
> [  153.815727]  [<0001>]
> [  153.819383]
> [  153.775792] sleep S  0  5372   5341
> 
> 
> [  330.669444] SysRq : Show Pending Timers
> [  330.673552] Timer List Version: v0.3
> [  330.677326] HRTIMER_MAX_CLOCK_BASES: 2
> [  330.681282] now at 255011372633 nsecs
> [  330.829981] active timers:
> [  330.832859]  #0: , hrtimer_wakeup, S:01
> [  330.838805]  # expires at 260156346358 nsecs [in 5144973725 nsecs]
> 
> [  337.046189] now at 261387685432 nsecs
> [  337.194966] active timers:
> [  337.197834]  #0: , hrtimer_wakeup, S:01
> [  337.203793]  # expires at 260156346358 nsecs [in 18446744072478212542 nsecs]

timer already expired -
(~1.2 seconds)

hmm, signedness problem (only in the display) -^^^
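
The huge "in ... nsecs" value is what a negative difference looks like
when printed as an unsigned 64-bit number. A user-space sketch that
reproduces it from the two timestamps in the log above (the helper names
are mine; this is not the kernel's timer_list code):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Nanoseconds until expiry; negative when the timer is already overdue. */
static int64_t ns_until_expiry(int64_t expires, int64_t now)
{
	return expires - now;
}

/* Format the delta the wrong way: reinterpreted as an unsigned value. */
static void format_unsigned(char *buf, size_t len, int64_t delta)
{
	snprintf(buf, len, "%" PRIu64, (uint64_t)delta);
}

/* Format the delta as a signed value, which shows the overdue time. */
static void format_signed(char *buf, size_t len, int64_t delta)
{
	snprintf(buf, len, "%" PRId64, delta);
}
```

With expires = 260156346358 and now = 261387685432 the signed result is
-1231339074 ns, i.e. the timer is about 1.2 seconds overdue, and the
unsigned rendering of the same bits is 18446744072478212542.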

Can you please put a complete system description, your .config and a boot 
log into bugzilla ?

Thanks,

tglx



Re: Bug #4580: hda: lost interrupt when resuming from S3 - Sony VGN-T1XP

2007-10-08 Thread Bartlomiej Zolnierkiewicz

Hi,

On Sunday 07 October 2007, Rafael J. Wysocki wrote:
> Hi,
> 
> IDE/ATA wizards are kindly requested to have a look at:
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=4580#c21
> 
> as I have no idea of what we can do about the "hard drive password vs suspend"
> problem, if anything.

CONFIG_BLK_DEV_IDEACPI=y is not enough since IDE ACPI _GTF support is disabled
by default.  To enable it "ide=acpigtf" kernel parameter should be used.

Sometimes it may also be worth trying to enable the use of IDE ACPI methods on
boot ("ide=acpionboot" kernel parameter), which is also disabled by default.

We probably should switch both options to be on by default if recent libata
transition to ACPI on by default turns out to be successful.

PS I've updated bug #4580 accordingly.

Thanks,
Bart


Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Theodore Tso
On Mon, Oct 08, 2007 at 01:33:38PM -0700, H. Peter Anvin wrote:
> Uhm, no.  There is no reason an "unimportant" person couldn't review a 
> patch, and therefore perform a potentially highly valuable service to 
> the maintainer.
> 
> None of these are indicative of the authority of the person acking, 
> reviewing, testing, or nacking.  That's only as good as the trust in the 
> person signing.

I would tend to agree.  Right now I think the problem is that we are
getting too few reviews, not too many.  And someone who reviews
patches, even if unknown, could be building up expertise that
eventually would make them a valued developer, even while they are
doing us a service.   

The concern that I suspect some people have is what if this gets
abused by people who don't really bother to do a full review of a
patch before they ack it.  We could ask reviewers to include a URL to
an LKML archive of their review, to make it easier to find a review of
a patch so later on people can judge how effective their review
was.  Unfortunately, this would be an added burden for the regular
reviewers, so I doubt this would be well accepted as a practice.  My
suggestion is to not worry about this for now, and see how well it
works out in practice.  If we start getting half a dozen or more
Reviewed-by: where the patch is pretty clearly not getting adequately
reviewed, or where someone is obviously abusing the system, and social
pressures aren't working, we can try to figure out then how we want to
address that problem.  Let's not make the process too complicated
unless we know it's necessary.  Premature complexity is almost as bad
as premature optimization.

- Ted


[PATCH 1/3] Audit: break up execve argument lists into multiple records

2007-10-08 Thread Eric Paris
Break the auditing of a list of execve arguments into smaller records if
there are too many.  The limit is currently around 7.5k of arguments, as
userspace has an 8k buffer limit and will drop messages which are longer.

Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
---
Basically the same patch as last time, used a #define, cleaned up a
memory leak on a malloc failure code path.  Other than that it's the
same.

7500 is also a good size because it means we never need more than a 2-page
allocation.  I'd say this is a good thing, as even if we used a full netlink
message of 32k it's just making it harder on the kernel to have the memory
it needs.  I think userspace will want to get fixed eventually to handle a
full 32k just in case, but keeping the kernel under 8k when we know we can
seems like a good idea.
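
As a rough illustration of the accounting the patch below does, this
user-space sketch counts how many records a given argument list would
need. It mirrors the len_sent bookkeeping (argument length plus 3 plus
the digits of the index) but works on plain lengths; count_records() and
digits() are names made up for the example.

```c
#define MAX_EXECVE_AUDIT_LEN 7500

/* Number of decimal digits in a non-negative argument index. */
static int digits(int i)
{
	int n = 1;

	while (i >= 10) {
		i /= 10;
		n++;
	}
	return n;
}

/*
 * Count the audit records needed for argc arguments whose lengths are
 * given in lens[].  The first record is assumed to be open already, as
 * it is when the caller hands in an audit buffer.
 */
static int count_records(const long *lens, int argc)
{
	long len_sent = 0;
	int records = 1;
	int i;

	for (i = 0; i < argc; i++) {
		long cost = lens[i] + 3 + digits(i);

		len_sent += cost;
		if (len_sent > MAX_EXECVE_AUDIT_LEN) {
			/* Start a new record; this argument is its first entry. */
			len_sent = cost;
			records++;
		}
	}
	return records;
}
```

For example, lengths {4000, 4000, 100} need two records: the second
argument would push the first record past 7500, so it opens a new one
and the third argument still fits alongside it.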

 kernel/auditsc.c |   39 +--
 1 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 04f3ffb..4176db6 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -78,6 +78,9 @@ extern struct list_head audit_filter_list[];
 /* Indicates that audit should log the full pathname. */
 #define AUDIT_NAME_FULL -1
 
+/* no execve audit message should be longer than this (userspace limits) */
+#define MAX_EXECVE_AUDIT_LEN 7500
+
 /* number of audit rules */
 int audit_n_rules;
 
@@ -819,11 +822,12 @@ static int audit_log_pid_context(struct audit_context *context, pid_t pid,
return rc;
 }
 
-static void audit_log_execve_info(struct audit_buffer *ab,
+static void audit_log_execve_info(struct audit_context *context,
+   struct audit_buffer **ab,
struct audit_aux_data_execve *axi)
 {
int i;
-   long len, ret;
+   long len, ret, len_sent = 0;
const char __user *p;
char *buf;
 
@@ -833,7 +837,11 @@ static void audit_log_execve_info(struct audit_buffer *ab,
p = (const char __user *)axi->mm->arg_start;
 
for (i = 0; i < axi->argc; i++, p += len) {
+   char tmp_buf[12];
+   /* how many digits are in i? */
+   int i_len = snprintf(tmp_buf, 12, "%d", i);
len = strnlen_user(p, MAX_ARG_STRLEN);
+
/*
 * We just created this mm, if we can't find the strings
 * we just copied into it something is _very_ wrong. Similar
@@ -862,9 +870,28 @@ static void audit_log_execve_info(struct audit_buffer *ab,
send_sig(SIGKILL, current, 0);
}
 
-   audit_log_format(ab, "a%d=", i);
-   audit_log_untrustedstring(ab, buf);
-   audit_log_format(ab, "\n");
+   /*
+* If there are a lot of args just break them into multiple
+* messages.  the last ab started will get closed by the
+* caller.
+*
+* + 3 + i_len because we know at least a = and \n will be sent
+* as well as the number of digits in i (i_len).
+*/
+   len_sent += (len + 3 + i_len);
+   if (len_sent > MAX_EXECVE_AUDIT_LEN) {
+   len_sent = len + 3 + i_len;
+   audit_log_end(*ab);
+   *ab = audit_log_start(context, GFP_KERNEL, AUDIT_EXECVE);
+   if (!*ab) {
+   kfree(buf);
+   return;
+   }
+   }
+
+   audit_log_format(*ab, "a%d=", i);
+   audit_log_untrustedstring(*ab, buf);
+   audit_log_format(*ab, "\n");
 
kfree(buf);
}
@@ -1010,7 +1037,7 @@ static void audit_log_exit(struct audit_context *context, struct task_struct *ts
 
case AUDIT_EXECVE: {
struct audit_aux_data_execve *axi = (void *)aux;
-   audit_log_execve_info(ab, axi);
+   audit_log_execve_info(context, &ab, axi);
break; }
 
case AUDIT_SOCKETCALL: {




[PATCH 3/3] Audit: remove the limit on execve arguments when audit is running

2007-10-08 Thread Eric Paris
Remove the limitation on argv size.  The audit system now logs arguments in
smaller chunks (currently about 8k due to userspace audit system buffer
sizes), so this is no longer a requirement.

Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
Acked-by: Peter Zijlstra <[EMAIL PROTECTED]>

---

This patch hasn't changed since the last series, just reposted as 3/3 and
rediffed.

 kernel/auditsc.c |   10 --
 kernel/sysctl.c  |   11 ---
 2 files changed, 0 insertions(+), 21 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index ffc8d4b..5d39727 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1917,8 +1917,6 @@ int __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t gid, mode_t mode
return 0;
 }
 
-int audit_argv_kb = 32;
-
 int audit_bprm(struct linux_binprm *bprm)
 {
struct audit_aux_data_execve *ax;
@@ -1927,14 +1925,6 @@ int audit_bprm(struct linux_binprm *bprm)
if (likely(!audit_enabled || !context || context->dummy))
return 0;
 
-   /*
-* Even though the stack code doesn't limit the arg+env size any more,
-* the audit code requires that _all_ arguments be logged in a single
-* netlink skb. Hence cap it :-(
-*/
-   if (bprm->argv_len > (audit_argv_kb << 10))
-   return -E2BIG;
-
ax = kmalloc(sizeof(*ax), GFP_KERNEL);
if (!ax)
return -ENOMEM;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 53a456e..88e5d06 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -77,7 +77,6 @@ extern int percpu_pagelist_fraction;
 extern int compat_log;
 extern int maps_protect;
 extern int sysctl_stat_interval;
-extern int audit_argv_kb;
 
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static int maxolduid = 65535;
@@ -347,16 +346,6 @@ static ctl_table kern_table[] = {
.mode   = 0644,
.proc_handler   = &proc_dointvec,
},
-#ifdef CONFIG_AUDITSYSCALL
-   {
-   .ctl_name   = CTL_UNNUMBERED,
-   .procname   = "audit_argv_kb",
-   .data   = &audit_argv_kb,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = &proc_dointvec,
-   },
-#endif
{
.ctl_name   = KERN_CORE_PATTERN,
.procname   = "core_pattern",




[PATCH 2/3] Audit: break a large single execve argument into smaller records

2007-10-08 Thread Eric Paris
Support single arguments that are large, not just large lists of execve
args.  This also means we never have to get a kernel buffer larger than
MAX_EXECVE_AUDIT_LEN no matter how large the argument is.  Before this patch
we could need to allocate 32 consecutive pages to hold one argument, which
could pretty easily OOM.

A single argument larger than MAX_EXECVE_AUDIT_LEN is broken into multiple
records, which have a format like a10[0] a10[1] a10[2] etc.
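
A small user-space sketch of the naming and chunk count described above.
It assumes each chunk carries at most MAX_EXECVE_AUDIT_LEN bytes, as the
loop in the patch does; count_chunks() and record_label() are
hypothetical helpers, not functions from kernel/auditsc.c.

```c
#include <stdio.h>

#define MAX_EXECVE_AUDIT_LEN 7500

/* Number of records needed for one argument of the given length. */
static long count_chunks(long len)
{
	if (len <= 0)
		return 0;
	return (len + MAX_EXECVE_AUDIT_LEN - 1) / MAX_EXECVE_AUDIT_LEN;
}

/*
 * Build the field name for chunk `chunk` of argument `arg`, e.g. a10[2].
 * Returns the number of characters written, as snprintf() does.
 */
static int record_label(char *buf, size_t size, int arg, int chunk)
{
	return snprintf(buf, size, "a%d[%d]", arg, chunk);
}
```

An argument of about 17000 bytes would therefore be logged as three
records (7500 + 7500 + 2000 bytes), named a10[0], a10[1] and a10[2] if it
is argument number 10.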

Signed-off-by: Eric Paris <[EMAIL PROTECTED]>
---

example audit log (about 50k long) for the whole patch series can be
found at http://people.redhat.com/~eparis/audit/audit.log the execve in
question was something like:

program_name [about 50 arguments] [one argument which is about 17k long] [about 1000 arguments]

 kernel/auditsc.c |   42 +
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 4176db6..ffc8d4b 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -853,6 +853,48 @@ static void audit_log_execve_info(struct audit_context *context,
send_sig(SIGKILL, current, 0);
}
 
+   if (unlikely(len > MAX_EXECVE_AUDIT_LEN)) {
+   /* deal with single arguments > MAX_EXECVE_AUDIT_LEN */
+   int j;
+   const long tmplen = sizeof(char) * MAX_EXECVE_AUDIT_LEN;
+
+   buf = kmalloc(tmplen + 1, GFP_KERNEL);
+   if (!buf) {
+   audit_panic("out of memory for argv string\n");
+   return;
+   }
+   buf[tmplen] = '\0';
+   for (j = 0; len > 0; j++) {
+   if (len > tmplen) {
+   ret = copy_from_user(buf, p, tmplen);
+   p += tmplen;
+   len -= tmplen;
+   } else {
+   ret = copy_from_user(buf, p, len);
+   /* p is at the next arg */
+   p += len;
+   /* 27 is the max length of a%d[%d] */
+   len_sent = len + 27;
+   len  = 0;
+   }
+   if (ret) {
+   WARN_ON(1);
+   send_sig(SIGKILL, current, 0);
+   }
+   audit_log_end(*ab);
+   *ab = audit_log_start(context, GFP_KERNEL,
+ AUDIT_EXECVE);
+   if (!*ab) {
+   kfree(buf);
+   return;
+   }
+   audit_log_format(*ab, "a%d[%d]=", i, j);
+   audit_log_untrustedstring(*ab, buf);
+   audit_log_format(*ab, "\n");
+   }
+   continue;
+   }
+
buf = kmalloc(len, GFP_KERNEL);
if (!buf) {
audit_panic("out of memory for argv string\n");




Re: [RFC/RFT] kbuild: save ARCH & CROSS_COMPILE

2007-10-08 Thread Giacomo Catenazzi
Adrian Bunk wrote:
> 
> BTW: I'm currently trying without success to understand why the
>  drivers/infiniband/{hw/amso1100,ulp/srp}/Kbuild files are not
>  named "Makefile".

or the inverse ;-)  Why are all the Makefiles (except the top-level one) not
named Kbuild, considering that they are not valid (standalone) Makefiles?

ciao
cate


[PATCH] aic94xx: Use request_firmware() to provide SAS address if the adapter lacks one

2007-10-08 Thread Darrick J. Wong
If the aic94xx chip doesn't have a SAS address in the chip's flash memory,
use the request_firmware() interface to get one from userspace.  This
way, there's no debate as to who or how an address gets generated--it's
totally up to the administrator to provide it if the card doesn't have one.
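
For illustration, this is roughly what turning the 16-character hex
string supplied by userspace into an 8-byte SAS address involves. It is
a user-space sketch with made-up helper names (hex_val, parse_sas_addr),
not the driver's asd_destringify_sas_addr(); unlike the patch, which
only checks the size, it also rejects non-hex characters.

```c
#include <stddef.h>
#include <stdint.h>

#define SAS_ADDR_SIZE        8
#define SAS_STRING_ADDR_SIZE 16

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Convert the first 16 hex characters of data into an 8-byte address.
 * Returns 0 on success, -1 if the blob is too short or not valid hex.
 */
static int parse_sas_addr(uint8_t out[SAS_ADDR_SIZE],
			  const char *data, size_t size)
{
	size_t i;

	if (size < SAS_STRING_ADDR_SIZE)
		return -1;

	for (i = 0; i < SAS_ADDR_SIZE; i++) {
		int hi = hex_val(data[2 * i]);
		int lo = hex_val(data[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i] = (uint8_t)((hi << 4) | lo);
	}
	return 0;
}
```

The administrator would provide such a string through whatever firmware
loading mechanism the system uses; the example address in the test data
below is arbitrary.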

Signed-off-by: Darrick J. Wong <[EMAIL PROTECTED]>
---

 drivers/scsi/aic94xx/aic94xx.h  |1 -
 drivers/scsi/aic94xx/aic94xx_hwi.c  |   40 +--
 drivers/scsi/aic94xx/aic94xx_init.c |2 --
 3 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/aic94xx/aic94xx.h b/drivers/scsi/aic94xx/aic94xx.h
index 32f513b..935d558 100644
--- a/drivers/scsi/aic94xx/aic94xx.h
+++ b/drivers/scsi/aic94xx/aic94xx.h
@@ -58,7 +58,6 @@
 
 extern struct kmem_cache *asd_dma_token_cache;
 extern struct kmem_cache *asd_ascb_cache;
-extern char sas_addr_str[2*SAS_ADDR_SIZE + 1];
 
 static inline void asd_stringify_sas_addr(char *p, const u8 *sas_addr)
 {
diff --git a/drivers/scsi/aic94xx/aic94xx_hwi.c b/drivers/scsi/aic94xx/aic94xx_hwi.c
index 0cd7eed..82a12cc 100644
--- a/drivers/scsi/aic94xx/aic94xx_hwi.c
+++ b/drivers/scsi/aic94xx/aic94xx_hwi.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "aic94xx.h"
 #include "aic94xx_reg.h"
@@ -38,16 +39,34 @@ u32 MBAR0_SWB_SIZE;
 
 /* -- Initialization -- */
 
-static void asd_get_user_sas_addr(struct asd_ha_struct *asd_ha)
+#define SAS_STRING_ADDR_SIZE   16
+static int asd_get_user_sas_addr(struct asd_ha_struct *asd_ha)
 {
-   extern char sas_addr_str[];
-   /* If the user has specified a WWN it overrides other settings
-*/
-   if (sas_addr_str[0] != '\0')
-   asd_destringify_sas_addr(asd_ha->hw_prof.sas_addr,
-sas_addr_str);
-   else if (asd_ha->hw_prof.sas_addr[0] != 0)
-   asd_stringify_sas_addr(sas_addr_str, asd_ha->hw_prof.sas_addr);
+   const struct firmware *fw;
+   int res;
+
+   /* adapter came with a sas address */
+   if (asd_ha->hw_prof.sas_addr[0])
+   return 0;
+
+   ASD_DPRINTK("No address found for %s; asking for one...\n",
+   pci_name(asd_ha->pcidev));
+
+   /* else go ask userspace */
+   res = request_firmware(&fw, "sas_addr", &asd_ha->pcidev->dev);
+   if (res)
+   return res;
+
+   if (fw->size < SAS_STRING_ADDR_SIZE) {
+   res = -ENODEV;
+   goto out;
+   }
+
+   asd_destringify_sas_addr(asd_ha->hw_prof.sas_addr, fw->data);
+
+out:
+   release_firmware(fw);
+   return res;
 }
 
 static void asd_propagate_sas_addr(struct asd_ha_struct *asd_ha)
@@ -657,8 +676,7 @@ int asd_init_hw(struct asd_ha_struct *asd_ha)
 
asd_init_ctxmem(asd_ha);
 
-   asd_get_user_sas_addr(asd_ha);
-   if (!asd_ha->hw_prof.sas_addr[0]) {
+   if (asd_get_user_sas_addr(asd_ha)) {
asd_printk("No SAS Address provided for %s\n",
   pci_name(asd_ha->pcidev));
err = -ENODEV;
diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c
index b70d6e7..5c99f27 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -54,8 +54,6 @@ MODULE_PARM_DESC(collector, "\n"
"\tThe aic94xx SAS LLDD supports both modes.\n"
"\tDefault: 0 (Direct Mode).\n");
 
-char sas_addr_str[2*SAS_ADDR_SIZE + 1] = "";
-
 static struct scsi_transport_template *aic94xx_transport_template;
 static int asd_scan_finished(struct Scsi_Host *, unsigned long);
 static void asd_scan_start(struct Scsi_Host *);


Re: [PATCH] Version 3 (2.6.23-rc8) Smack: Simplified Mandatory Access Control Kernel

2007-10-08 Thread Alan Cox
> My very practical question:  How do I run selinux in one container,
> and SMACK in another?

In the LSM model you don't because you could have the same container
objects visible in different containers at the same time and subject to
different LSMs. What does it mean to pass an SELinux protected object
over an AppArmor protected unix domain socket into a SMACK protected
container ?

If you want consistency then you probably need to put the container id
into the LSM calls and provide the ability in one system to do container
specific checks. Right now I suspect the way to do it is to complete the
work to convert SMACK rulesets into SELinux rulesets with tools.

Really it's the same problem as "I'd like to use different file permission
systems on different process identifiers" and it would be very hard to
get right simply because objects can pass between two different security
models.

Pyramid tried to do the "simple" case of BSD and System 5 on the same box
and got caught out even with that because of the different rules on stuff
like chgrp..


Re: [RFC/RFT] kbuild: save ARCH & CROSS_COMPILE

2007-10-08 Thread Adrian Bunk
On Mon, Oct 08, 2007 at 10:02:55PM +0200, Sam Ravnborg wrote:
>...
> The settings are stored in the build directory in a file
> named "Kbuild.config" (should it be a .dot file?).
>...

A .dot file sounds better.

And even if not, generated files should IMHO not share the Kbuild* 
namespace with non-generated files.

Apart from this I like the patch.

>   Sam
>...

cu
Adrian

BTW: I'm currently trying without success to understand why the
 drivers/infiniband/{hw/amso1100,ulp/srp}/Kbuild files are not
 named "Makefile".

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: parallel networking

2007-10-08 Thread David Miller
From: Jeff Garzik <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 10:22:28 -0400

> In terms of overall parallelization, both for TX as well as RX, my gut 
> feeling is that we want to move towards an MSI-X, multi-core friendly 
> model where packets are LIKELY to be sent and received by the same set 
> of [cpus | cores | packages | nodes] as the [userland] processes 
> dealing with the data.

The problem is that the packet schedulers want global guarantees
on packet ordering, not flow centric ones.

That is the issue Jamal is concerned about.

The more I think about it, the more inevitable it seems that we really
might need multiple qdiscs, one for each TX queue, to pull this full
parallelization off.

But the semantics of that don't smell so nice either.  If the user
attaches a new qdisc to "ethN", does it go to all the TX queues, or
what?

All of the traffic shaping technology deals with the device as a unary
object.  It doesn't fit to multi-queue at all.
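
A minimal sketch of the ordering problem described above, assuming a toy
device with two TX queues, each with its own FIFO standing in for a
per-queue qdisc, and a hypothetical flow-to-queue mapping of
flow_id % NUM_TXQ. None of these names come from the kernel. It only
illustrates that per-queue FIFOs keep order within a flow, while the
order across flows depends on how the queues are drained.

```c
#include <stddef.h>

#define NUM_TXQ   2   /* hypothetical number of hardware TX queues */
#define QUEUE_CAP 8   /* toy capacity of each per-queue FIFO */

struct pkt {
    int flow_id;  /* flow the packet belongs to */
    int seq;      /* global arrival order: 0, 1, 2, ... */
};

/* One FIFO per TX queue, standing in for a per-queue qdisc. */
struct txq {
    struct pkt buf[QUEUE_CAP];
    size_t head, tail;
};

/* Append a packet; returns -1 once the toy FIFO has used all slots. */
static int txq_enqueue(struct txq *q, struct pkt p)
{
    if (q->tail == QUEUE_CAP)
        return -1;
    q->buf[q->tail++] = p;
    return 0;
}

/* Remove the oldest packet; returns -1 if the FIFO is empty. */
static int txq_dequeue(struct txq *q, struct pkt *out)
{
    if (q->head == q->tail)
        return -1;
    *out = q->buf[q->head++];
    return 0;
}

/* Flow-centric steering: every packet of a flow lands on one queue. */
static size_t select_txq(int flow_id)
{
    return (size_t)flow_id % NUM_TXQ;
}

/*
 * Drain queue 0 completely, then queue 1, and so on, writing packets to
 * wire[] in transmit order. Returns the number of packets written.
 */
static size_t drain_queue_by_queue(struct txq *qs, struct pkt *wire,
                                   size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < NUM_TXQ; i++) {
        struct pkt p;

        while (n < cap && txq_dequeue(&qs[i], &p) == 0)
            wire[n++] = p;
    }
    return n;
}
```

Feeding packets with flow ids 1, 0, 1, 0 and draining this way puts both
flow-0 packets on the wire before either flow-1 packet: order within
each flow is kept, the global arrival order is not.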


Re: A bit of kconfig rewrite (Re: [PATCH] 9p: fix compile error if !CONFIG_SYSCTL)

2007-10-08 Thread Oleg Verych
On Mon, Oct 08, 2007 at 10:22:01PM +0200, Sam Ravnborg wrote:
> > 
> > And there's no working alternative to build/config
> > system. Thus, let me have my try OK? Thanks!
> 
> I would prefer if you used your time to do small incremental improvements
> to what we have today rather than rewriting from scratch.
> 
> But it's your decision and not mine.

In case anybody is interested:

Newsgroups: gmane.linux.kbuild.devel,gmane.linux.kernel
Subject: Re: [RFC/RFT] kbuild: save ARCH & CROSS_COMPILE
Date: Mon, 8 Oct 2007 22:50:48 +0200
Message-ID: <[EMAIL PROTECTED]>
Archived-At: 


