[PATCH] lockdep: debug_show_all_locks & debug_show_held_locks vs. debug_locks

2007-03-21 Thread Jarek Poplawski

And here is some addition.

[PATCH] lockdep: debug_show_all_locks &  debug_show_held_locks vs. debug_locks
 
lockdep's data shouldn't be used when debug_locks == 0
because it's not updated after this, so it's more misleading
than helpful.

PS: probably lockdep's current-> fields should be reset after
it turns debug_locks off: so, after printing a bug report, but
before returning from exported functions. But there are really
a lot of these possibilities (e.g. after DEBUG_LOCKS_WARN_ON),
so something could be missed. (Of course, direct use of these
fields isn't recommended either.)
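
The guard this patch adds can be modelled in user space. The following is only a sketch under stated assumptions: struct task_model and show_held_locks() are hypothetical stand-ins for task_struct and debug_show_held_locks(), and the global flag merely imitates the kernel's debug_locks.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the kernel's global debug_locks flag (1 = lockdep active). */
static int debug_locks = 1;

/* Hypothetical per-task state; the kernel keeps this in task_struct. */
struct task_model {
	int lockdep_depth;	/* number of held locks recorded by lockdep */
};

/*
 * Model of debug_show_held_locks(): returns the number of locks it
 * reported, or -1 if it refused because lockdep had been turned off.
 */
static int show_held_locks(const struct task_model *task)
{
	if (!debug_locks) {
		printf("INFO: lockdep is turned off.\n");
		return -1;
	}
	printf("%d lock(s) held\n", task->lockdep_depth);
	return task->lockdep_depth;
}
```

Once the flag is cleared, the recorded depth is no longer maintained, so the model refuses to print it rather than report stale data.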

Reported-by: Folkert van Heusden <[EMAIL PROTECTED]>
Inspired-by: Oleg Nesterov <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.21-rc4-git4-/kernel/lockdep.c 2.6.21-rc4-git4/kernel/lockdep.c
--- 2.6.21-rc4-git4-/kernel/lockdep.c   2007-03-21 22:46:26.0 +0100
+++ 2.6.21-rc4-git4/kernel/lockdep.c   2007-03-21 23:05:17.0 +0100
@@ -2742,6 +2742,10 @@ void debug_show_all_locks(void)
int count = 10;
int unlock = 1;
 
+   if (unlikely(!debug_locks)) {
+   printk("INFO: lockdep is turned off.\n");
+   return;
+   }
printk("\nShowing all locks held in the system:\n");
 
/*
@@ -2785,6 +2789,10 @@ EXPORT_SYMBOL_GPL(debug_show_all_locks);
 
 void debug_show_held_locks(struct task_struct *task)
 {
+   if (unlikely(!debug_locks)) {
+   printk("INFO: lockdep is turned off.\n");
+   return;
+   }
lockdep_print_held_locks(task);
 }
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: AIO, FIO and Threads ...

2007-03-21 Thread Jens Axboe
On Wed, Mar 21 2007, Davide Libenzi wrote:
> On Wed, 21 Mar 2007, Jens Axboe wrote:
> 
> > On Tue, Mar 20 2007, Davide Libenzi wrote:
> > > 
> > > I was looking at Jens' FIO stuff, and I decided to cook a quick patch for 
> > > FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):
> > > 
> > > http://www.xmailserver.org/guasi-lib.html
> > > 
> > > I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
> > > and 8GB of RAM.
> > > Mind that I'm not an FIO expert, like at all, but I got some interesting 
> > > results when comparing GUASI with libaio at 8/1000/1 depths.
> > > If I read those results correctly (Jens may help), GUASI output is more 
> > > than double the libaio one.
> > > Lots of context switches, yes. But the throughput looks like 2+ times.
> > > Can someone try to repeat the measures and/or spot the error?
> > > Or tell me which other tests to run?
> > > This is kinda a surprise for me ...
> > 
> > I don't know guasi at all, but libaio requires O_DIRECT to be async. I'm
> > sure you know this, but you may not know that fio defaults to buffered IO
> > so you have to tell it to use O_DIRECT :-)
> > 
> > So try adding a --direct=1 (or --buffered=0, same thing) as an extra
> > option when comparing depths > 1.
> 
> I knew about AIO and O_DIRECT, but I thought FIO was using it by default :)

It actually used to, but I changed the default a few months ago as I
think that is more appropriate.

> I used it for the first time yesterday night, and there are a pretty wide 
> set of options. Will re-run today with --direct.

Yep, I try to add good explanations to all of them though, also
available through --cmdhelp or --cmdhelp=option so you don't have to
look up the documentation all the time.

-- 
Jens Axboe



Re: [PATCH 2/2] Replace pid_t in autofs with struct pid reference

2007-03-21 Thread Eric W. Biederman
"Serge E. Hallyn" <[EMAIL PROTECTED]> writes:
>
> So is the pid used for anything other than debugging?
>
> In any case, here is a replacement patch which sends the pid number
> in the pid_namespace of the process which did the autofs4 mount.
>
> Still not sure whether that is actually what makes sense...
>
> From: "Serge E. Hallyn" <[EMAIL PROTECTED]>
> Subject: [PATCH] autofs: prevent pid wraparound in waitqs
>
> Instead of storing pid numbers for waitqs, store references
> to struct pids.  Also store a reference to the mounter's pid
> namespace in the autofs4 sb info so that pid numbers for
> mount miss and expiry msgs can send the pid# in the mounter's
> pidns.

Hmm.  Not quite what I would have expected but given that
we are sending data over a pipe that sounds reasonable.

If it wasn't a pipe we would really want to do this in
the context of the process receiving the message, but since
a pipe can receive a message, and then be passed to another
process we clearly can't know the pid namespace of the
process receiving the message.

Therefore just caching the pid namespace either on pipe
open or on mount makes sense.  pipe open might be better.

Serge, we really need to introduce __pid_nr in a separate
patch.  And we really seem to be confusing Ian.

Plus we have some pid namespace ref counting issues we need
to handle carefully.

Let's stop working on autofs4 for a bit, fix the pid namespace
infrastructure so there is enough of it to handle autofs4 and
then come back.

Either that or take autofs4 in two passes.  Pass one we do what
we can with the current infrastructure.  Pass two after we fix up
the infrastructure including introducing __pid_nr we come back
and update autofs4 to handle multiple pid namespaces properly.
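
To make the idea of "the pid number as seen from the mounter's pid namespace" concrete, here is a toy user-space model. It is a sketch only: ns_model, pid_model and pid_nr_in_ns() are hypothetical names, the real namespaces form a tree rather than a simple level counter, and nothing here claims to be what __pid_nr would actually look like.

```c
#include <assert.h>

#define MAX_NS_LEVEL 4	/* arbitrary depth for this toy model */

/* Toy namespace: only its nesting level matters here. */
struct ns_model {
	int level;
};

/* Toy struct pid: one numeric id per namespace level it is visible in. */
struct pid_model {
	int level;			/* deepest level this pid lives at */
	int numbers[MAX_NS_LEVEL];	/* numbers[i] = id as seen from level i */
};

/*
 * Hypothetical analogue of the proposed __pid_nr(): the id of pid as
 * seen from namespace ns, or 0 if the pid is not visible there.
 */
static int pid_nr_in_ns(const struct pid_model *pid, const struct ns_model *ns)
{
	if (ns->level < 0 || ns->level > pid->level || ns->level >= MAX_NS_LEVEL)
		return 0;
	return pid->numbers[ns->level];
}
```

Caching the namespace at mount (or pipe open) time then amounts to keeping one ns_model around and always translating through it before writing to the pipe.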

Eric


Re: libata - 2.6.21-rc4-git5, ata channel still badly configured

2007-03-21 Thread Tejun Heo
Lukas Hejtmanek wrote:
>> Subject: ata_piix: PATA UDMA/100 configured as UDMA/33
>> References : http://lkml.org/lkml/2007/2/20/294
>> Submitter  : Fabio Comolli <[EMAIL PROTECTED]>
>> Status : unknown
> 
> ata_piix :00:1f.1: version 2.10ac1
> ACPI: PCI Interrupt :00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
> PCI: Setting latency timer of device :00:1f.1 to 64
> ata1: PATA max UDMA/100 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x0001ffa0 irq 14
> ata2: PATA max UDMA/100 cmd 0x00010170 ctl 0x00010376 bmdma 0x0001ffa8 irq 15
> scsi0 : ata_piix
> Time: acpi_pm clocksource has been installed.
> Switched to high resolution mode on CPU 0
> ata1.00: ATA-6: ST9100824A, 3.04, max UDMA/100
> ata1.00: 195371568 sectors, multi 16: LBA48
> ata1.01: ATAPI, max UDMA/33
> ata1.00: configured for UDMA/33
> ata1.01: configured for UDMA/33
> scsi1 : ata_piix
> ATA: abnormal status 0x7F on port 0x00010177
> scsi 0:0:0:0: Direct-Access ATA  ST9100824A   3.04 PQ: 0 ANSI: 5
> SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
> sda: Write Protect is off
> sda: Mode Sense: 00 3a 00 00
> SCSI device sda: write cache: enabled, read cache: enabled, doesn't support 
> DPO or FUA
> SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)

Does this fix your problem?

-- 
tejun
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 361953a..c89664a 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1743,12 +1743,17 @@ static int ata_eh_revalidate_and_attach(struct ata_port *ap,
 {
struct ata_eh_context *ehc = &ap->eh_context;
struct ata_device *dev;
+   unsigned int new_mask = 0;
unsigned long flags;
int i, rc = 0;
 
DPRINTK("ENTER\n");
 
-   for (i = 0; i < ATA_MAX_DEVICES; i++) {
+   /* For PATA drive side cable detection to work, IDENTIFY must
+* be done backwards such that PDIAG- is released by the slave
+* device before the master device is identified.
+*/
+   for (i = ATA_MAX_DEVICES - 1; i >= 0; i--) {
unsigned int action, readid_flags = 0;
 
dev = &ap->device[i];
@@ -1760,13 +1765,13 @@ static int ata_eh_revalidate_and_attach(struct ata_port *ap,
if (action & ATA_EH_REVALIDATE && ata_dev_ready(dev)) {
if (ata_port_offline(ap)) {
rc = -EIO;
-   break;
+   goto err;
}
 
ata_eh_about_to_do(ap, dev, ATA_EH_REVALIDATE);
rc = ata_dev_revalidate(dev, readid_flags);
if (rc)
-   break;
+   goto err;
 
ata_eh_done(ap, dev, ATA_EH_REVALIDATE);
 
@@ -1784,40 +1789,53 @@ static int ata_eh_revalidate_and_attach(struct ata_port *ap,
 
rc = ata_dev_read_id(dev, &dev->class, readid_flags,
 dev->id);
-   if (rc == 0) {
-   ehc->i.flags |= ATA_EHI_PRINTINFO;
-   rc = ata_dev_configure(dev);
-   ehc->i.flags &= ~ATA_EHI_PRINTINFO;
-   } else if (rc == -ENOENT) {
+   switch (rc) {
+   case 0:
+   new_mask |= 1 << i;
+   break;
+   case -ENOENT:
/* IDENTIFY was issued to non-existent
 * device.  No need to reset.  Just
 * thaw and kill the device.
 */
ata_eh_thaw_port(ap);
dev->class = ATA_DEV_UNKNOWN;
-   rc = 0;
-   }
-
-   if (rc) {
-   dev->class = ATA_DEV_UNKNOWN;
break;
+   default:
+   dev->class = ATA_DEV_UNKNOWN;
+   goto err;
}
+   }
+   }
 
-   if (ata_dev_enabled(dev)) {
-   spin_lock_irqsave(ap->lock, flags);
-   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
-   spin_unlock_irqrestore(ap->lock, flags);
+   /* Configure new devices forward such that user doesn't see
+* device detection messages backwards.
+*/
+   for (i = 0; i < ATA_MAX_DEVICES; i++) {
+   dev = &ap->device[i];
 
-   /* new device discovered, configure xfermode */
-   ehc->i.flags |= ATA_EHI_SETMODE;
-   }
-   }
+   if (!(new_mask & (1 << i)))
+  
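
The two-pass structure the patch introduces (IDENTIFY backwards, configure forwards) can be sketched outside the kernel. This is an illustrative model only: revalidate_model(), the present[] array and the order-recording arrays are hypothetical, and error handling from the real function is left out.

```c
#include <assert.h>

#define MAX_DEVICES 2	/* models ATA_MAX_DEVICES: master (0) and slave (1) */

/* Order in which each pass touched the devices, recorded for inspection. */
static int identify_order[MAX_DEVICES];
static int configure_order[MAX_DEVICES];

/*
 * present[i] != 0 means device i answers IDENTIFY.
 * Returns the bitmask of newly identified devices.
 */
static unsigned int revalidate_model(const int present[MAX_DEVICES],
				     int *n_identified, int *n_configured)
{
	unsigned int new_mask = 0;
	int i;

	*n_identified = 0;
	*n_configured = 0;

	/* Pass 1: IDENTIFY backwards, slave before master. */
	for (i = MAX_DEVICES - 1; i >= 0; i--) {
		if (!present[i])
			continue;
		identify_order[(*n_identified)++] = i;
		new_mask |= 1u << i;
	}

	/* Pass 2: configure forwards so messages appear in device order. */
	for (i = 0; i < MAX_DEVICES; i++) {
		if (!(new_mask & (1u << i)))
			continue;
		configure_order[(*n_configured)++] = i;
	}
	return new_mask;
}
```

The bitmask is what decouples the two passes: detection order can follow the hardware requirement while the user-visible order stays unchanged.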

Re: [RFC] : Is /proc/kcore still useful and/or maintained ?

2007-03-21 Thread Eric Dumazet
On Thu, 22 Mar 2007 02:04:50 +0200
Maxim <[EMAIL PROTECTED]> wrote:


> Hi,
>   Yes, you are right, you have a different problem than I had.
> 
>   But why do you need llseek?

I don't personally, but tools do need llseek.

> 
>   Why not mmap it?
>   It is the natural thing to do with files that represent memory.

Because I don't want to rewrite gdb, file, and various other tools?



Re: [PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock

2007-03-21 Thread Andrew Morton
On Thu, 22 Mar 2007 07:11:19 +0100 Jarek Poplawski <[EMAIL PROTECTED]> wrote:

> 
> Here is some joke:
> 
> [PATCH] lockdep: lockdep_depth vs. debug_locks
> 
> lockdep really shouldn't be used when debug_locks == 0!
> 

This isn't a very good changelog.

> 
> Reported-by: Folkert van Heusden <[EMAIL PROTECTED]>
> Inspired-by: Oleg Nesterov <[EMAIL PROTECTED]>
> Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
> 
> ---
> 
> diff -Nurp 2.6.21-rc4-git4-/include/linux/lockdep.h 2.6.21-rc4-git4/include/linux/lockdep.h
> --- 2.6.21-rc4-git4-/include/linux/lockdep.h  2007-03-20 20:24:17.0 +0100
> +++ 2.6.21-rc4-git4/include/linux/lockdep.h   2007-03-21 22:32:41.0 +0100
> @@ -245,7 +245,7 @@ extern void lock_release(struct lockdep_
>  
>  # define INIT_LOCKDEP .lockdep_recursion = 0,
>  
> -#define lockdep_depth(tsk)   ((tsk)->lockdep_depth)
> +#define lockdep_depth(tsk)   (debug_locks ? (tsk)->lockdep_depth : 0)
>  
>  #else /* !LOCKDEP */
>  

What problem does this solve, and how does it solve it?

I assume that some codepath is incrementing ->lockdep_depth even when
debug_locks==0.  Isn't that wrong of it?


[GIT PATCH] ACPI patches for 2.6.21-rc4

2007-03-21 Thread Len Brown
Hi Linus,

please pull from: 

git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git release

This batch includes fixes for two visible 2.6.21 regressions: the immediate
suspend wakeup and the acpi_serialize deadlock. The latter is a revert that
touches a lot of code, but should be okay even in -rc4 as it returns the code
to how it was in 2.6.20.

This will update the files shown below.

thanks!

-Len

ps. individual patches are available on linux-acpi@vger.kernel.org
and a consolidated plain patch is available here:
ftp://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.21/acpi-release-20070126-2.6.21-rc4.diff.gz

 Documentation/crypto/api-intro.txt  |2 
 Documentation/kernel-parameters.txt |6 +
 MAINTAINERS |5 -
 arch/ia64/Kconfig   |1 
 arch/ia64/kernel/crash.c|2 
 arch/ia64/kernel/setup.c|   22 +
 arch/ia64/mm/contig.c   |   30 +-
 arch/ia64/mm/discontig.c|4 
 arch/ia64/mm/init.c |   16 ---
 arch/ia64/sn/kernel/setup.c |6 -
 crypto/scatterwalk.c|4 
 crypto/tcrypt.c |2 
 drivers/acpi/events/evmisc.c|8 +
 drivers/acpi/events/evregion.c  |   15 ++-
 drivers/acpi/events/evxface.c   |6 -
 drivers/acpi/executer/excreate.c|5 -
 drivers/acpi/executer/exsystem.c|   30 +-
 drivers/acpi/executer/exutils.c |  104 
 drivers/acpi/hardware/hwsleep.c |5 +
 drivers/acpi/ibm_acpi.c |   19 +++-
 drivers/acpi/namespace/nseval.c |   11 ++
 drivers/acpi/namespace/nsinit.c |7 +
 drivers/acpi/namespace/nsxfeval.c   |   11 +-
 drivers/acpi/processor_idle.c   |   38 ++--
 drivers/acpi/tables.c   |   57 -
 include/acpi/acinterp.h |6 -
 include/acpi/actypes.h  |2 
 include/asm-ia64/meminit.h  |1 
 28 files changed, 244 insertions(+), 181 deletions(-)

through these commits:

Alexey Starikovskiy (1):
  ACPI: resolve HP nx6125 S3 immediate wakeup regression

Henrique de Moraes Holschuh (1):
  ACPI: ibm-acpi: allow module to load when acpi notifiers can't be set (v2)

Len Brown (5):
  ACPI: Add support to parse 2nd MADT
  ACPICA: revert "acpi_serialize" changes
  ACPI: parse 2nd MADT by default
  ACPI: IA64: fix allnoconfig build
  ACPI: IA64: fix %ll build warnings

Mattia Dongili (1):
  sony-laptop: MAINTAINERS fix entry, add L: and W:

Thomas Renninger (1):
  ACPI: Only use IPI on known broken machines (AMD, Dothan/BaniasPentium M)

with this log:

commit cddece4beccaa72dcb57d64a7f1e496b2e61a16b
Merge: b25e844... 25496ca...
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Tue Mar 20 11:06:37 2007 -0400

Pull c2 into release branch

commit b25e84425ee21c5560fcaec15afcf58fe4a0a414
Merge: f5ea908... 09fe583...
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Tue Mar 20 11:06:18 2007 -0400

Pull bugzilla-7465 into release branch

commit f5ea908c8fca3921c1545e6ac52edbbb353640f5
Merge: 54b8c39... a8f4af6...
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Tue Mar 20 11:06:00 2007 -0400

Pull bugzilla-8171 into release branch

commit 54b8c39fbd76a7341b66e49de677ea366737fce7
Merge: 0a14fe6... 0cd4554...
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Tue Mar 20 11:05:41 2007 -0400

Pull misc-for-upstream into release branch

commit 0cd4554df0c261f7ba74786e471ccaa0e3725fb9
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Mon Mar 19 23:51:36 2007 -0400

ACPI: IA64: fix %ll build warnings

acpi_integer is 64-bits on all platforms, and so was defined as a u64.

i386 and x86_64 define u64 as unsigned long long.
ia64 defines u64 as long.

While these are all 64-bits, the kernel build warns about formatting
a "long" with %ll:

drivers/ata/libata-acpi.c:176: warning: long long unsigned int format, 
acpi_integer arg (arg 5)

So skip using "u64" and define acpi_integer as "unsigned long long"
to make gcc happy with %ll.
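
The effect of that type choice can be shown in plain C. This is a sketch, not the ACPICA header: acpi_integer_model and format_acpi_integer() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mirrors the commit's choice: a type that always matches %llu. */
typedef unsigned long long acpi_integer_model;

/* Format a value with %llu; no cast is needed on any platform. */
static int format_acpi_integer(char *buf, size_t len, acpi_integer_model v)
{
	return snprintf(buf, len, "%llu", v);
}
```

Had the typedef been "unsigned long" (as u64 is on ia64), the same format string would draw the warning quoted above even though the width is identical.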

Signed-off-by: Len Brown <[EMAIL PROTECTED]>

commit 8140a90ec180192b202af086e7a582e5937c5580
Author: Len Brown <[EMAIL PROTECTED]>
Date:   Fri Mar 16 22:00:43 2007 -0400

ACPI: IA64: fix allnoconfig build

The evils of Kconfig's select bite us once again...
ia64/Kconfig selects ACPI, which depends on PM.
But select ignores dependencies, allnoconfig
chooses CONFIG_PM=n, and thus the menu of sub-options
under ACPI vanish, which breaks the build.

Manually select PM along with ACPI for now.
Some day, we should delete them both, or fix select.

Cc: Tony Luck <[EMAIL PROTECTED]>
Signed-off-by: Len Brown <[EMAIL PROTECTED]>

commit 25496caec111481161e7f06bbfa12a533c43cc6f
Author: Thomas Renninger <[EMAIL PROTECTED]>
Date:   Tue Feb 27 12:13:00 2007 -0500

ACPI: Only use IPI on 

Re: [PATCH] Delete obsolete RAW driver feature.

2007-03-21 Thread Dave Jones
On Thu, Mar 22, 2007 at 06:53:06AM +0100, Willy Tarreau wrote:

 > At most, they will ask their distro vendor for continued support of the
 > feature (which there will be in the same minor release), and if vendors'
 > feedback shows there is enough demand, then we will just have to delay the
 > removal.

Err, this is where we are _today_

Dave

-- 
http://www.codemonkey.org.uk


[PATCH] lockdep: lockdep_depth vs. debug_locks Re: [2.6.20] BUG: workqueue leaked lock

2007-03-21 Thread Jarek Poplawski

Here is some joke:

[PATCH] lockdep: lockdep_depth vs. debug_locks

lockdep really shouldn't be used when debug_locks == 0!
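
What the changed macro does can be tried in user space. A minimal sketch, assuming a stand-in debug_locks flag and a hypothetical struct task_model in place of task_struct:

```c
#include <assert.h>

/* Stand-in for the kernel's global debug_locks flag. */
static int debug_locks = 1;

/* Hypothetical task state holding the depth lockdep last recorded. */
struct task_model {
	int lockdep_depth;
};

/* Same shape as the patched macro: report 0 once lockdep is off. */
#define lockdep_depth(tsk)	(debug_locks ? (tsk)->lockdep_depth : 0)
```

After debug_locks drops to 0 the stored depth is no longer updated, so callers that compare depths before and after some work see 0 on both sides instead of a stale count.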


Reported-by: Folkert van Heusden <[EMAIL PROTECTED]>
Inspired-by: Oleg Nesterov <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.21-rc4-git4-/include/linux/lockdep.h 2.6.21-rc4-git4/include/linux/lockdep.h
--- 2.6.21-rc4-git4-/include/linux/lockdep.h    2007-03-20 20:24:17.0 +0100
+++ 2.6.21-rc4-git4/include/linux/lockdep.h 2007-03-21 22:32:41.0 +0100
@@ -245,7 +245,7 @@ extern void lock_release(struct lockdep_
 
 # define INIT_LOCKDEP  .lockdep_recursion = 0,
 
-#define lockdep_depth(tsk) ((tsk)->lockdep_depth)
+#define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0)
 
 #else /* !LOCKDEP */
 


[RFC/PATCH 13/15] get_unmapped_area handles MAP_FIXED in /dev/mem (nommu)

2007-03-21 Thread Benjamin Herrenschmidt
This also fixes a bug, I think: the function used to return a pgoff (pfn)
instead of an address. (To split?)
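
The two behaviours described here can be checked with a small user-space model. It is only a sketch: MODEL_PAGE_SHIFT assumes 4 KiB pages, MAP_FIXED_MODEL is a stand-in flag value, and range_ok() is a hypothetical always-true replacement for valid_mmap_phys_addr_range().

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SHIFT 12	/* assumed 4 KiB pages */
#define MAP_FIXED_MODEL 0x10	/* stand-in for MAP_FIXED */

/* Always-true stand-in for valid_mmap_phys_addr_range(). */
static int range_ok(unsigned long pgoff, unsigned long len)
{
	(void)pgoff;
	(void)len;
	return 1;
}

static unsigned long get_unmapped_area_mem_model(unsigned long addr,
						 unsigned long len,
						 unsigned long pgoff,
						 unsigned long flags)
{
	/* A fixed request must already name the address backing pgoff. */
	if ((flags & MAP_FIXED_MODEL) && (addr >> MODEL_PAGE_SHIFT) != pgoff)
		return (unsigned long)-EINVAL;
	if (!range_ok(pgoff, len))
		return (unsigned long)-EINVAL;
	/* Return an address, not a page frame number. */
	return pgoff << MODEL_PAGE_SHIFT;
}
```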

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 drivers/char/mem.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux-cell/drivers/char/mem.c
===
--- linux-cell.orig/drivers/char/mem.c  2007-03-22 16:24:04.0 +1100
+++ linux-cell/drivers/char/mem.c   2007-03-22 16:26:30.0 +1100
@@ -246,9 +246,12 @@ static unsigned long get_unmapped_area_m
   unsigned long pgoff,
   unsigned long flags)
 {
+   if (flags & MAP_FIXED)
+   if ((addr >> PAGE_SHIFT) != pgoff)
+   return (unsigned long) -EINVAL;
if (!valid_mmap_phys_addr_range(pgoff, len))
return (unsigned long) -EINVAL;
-   return pgoff;
+   return pgoff << PAGE_SHIFT;
 }
 
 /* can't do an in-place private mapping if there's no MMU */


[RFC/PATCH 12/15] get_unmapped_area handles MAP_FIXED in ffb DRM

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 drivers/char/drm/ffb_drv.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-cell/drivers/char/drm/ffb_drv.c
===
--- linux-cell.orig/drivers/char/drm/ffb_drv.c  2007-03-22 16:21:22.0 +1100
+++ linux-cell/drivers/char/drm/ffb_drv.c   2007-03-22 16:23:13.0 +1100
@@ -191,6 +191,12 @@ unsigned long ffb_get_unmapped_area(stru
if ((kvirt & (SHMLBA - 1)) != (addr & (SHMLBA - 1))) {
unsigned long koff, aoff;
 
+   /* Address needs adjusting which can't be done
+* for MAP_FIXED
+*/
+   if (flags & MAP_FIXED)
+   return -EINVAL;
+
koff = kvirt & (SHMLBA - 1);
aoff = addr & (SHMLBA - 1);
if (koff < aoff)


[RFC/PATCH 10/15] get_unmapped_area handles MAP_FIXED in hugetlbfs

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 fs/hugetlbfs/inode.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-cell/fs/hugetlbfs/inode.c
===
--- linux-cell.orig/fs/hugetlbfs/inode.c    2007-03-22 16:12:56.0 +1100
+++ linux-cell/fs/hugetlbfs/inode.c 2007-03-22 16:16:02.0 +1100
@@ -115,6 +115,12 @@ hugetlb_get_unmapped_area(struct file *f
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = ALIGN(addr, HPAGE_SIZE);
vma = find_vma(mm, addr);
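
For readers who want to try the MAP_FIXED branch in isolation, here is a user-space sketch. Everything in it is an assumption made for illustration: 2 MiB huge pages, a stand-in flag value, and range_is_hugepage_ready(), which imitates only the alignment checks one would expect from prepare_hugepage_range().

```c
#include <assert.h>
#include <errno.h>

#define HPAGE_SIZE_MODEL (2UL << 20)		/* assumed 2 MiB huge pages */
#define HPAGE_MASK_MODEL (~(HPAGE_SIZE_MODEL - 1))
#define FLAG_FIXED 0x10				/* stand-in for MAP_FIXED */

/* Simplified stand-in for prepare_hugepage_range(): alignment only. */
static int range_is_hugepage_ready(unsigned long addr, unsigned long len)
{
	if (len & ~HPAGE_MASK_MODEL)
		return -EINVAL;
	if (addr & ~HPAGE_MASK_MODEL)
		return -EINVAL;
	return 0;
}

/* Fixed requests are validated and returned as-is; others are rounded up. */
static unsigned long hugetlb_area_model(unsigned long addr, unsigned long len,
					unsigned long flags)
{
	if (flags & FLAG_FIXED) {
		if (range_is_hugepage_ready(addr, len))
			return (unsigned long)-EINVAL;
		return addr;
	}
	return (addr + HPAGE_SIZE_MODEL - 1) & HPAGE_MASK_MODEL;
}
```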


[RFC/PATCH 15/15] get_unmapped_area doesn't need hugetlbfs hacks anymore

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 mm/mmap.c |   16 
 1 file changed, 16 deletions(-)

Index: linux-cell/mm/mmap.c
===
--- linux-cell.orig/mm/mmap.c   2007-03-22 16:30:24.0 +1100
+++ linux-cell/mm/mmap.c    2007-03-22 16:30:48.0 +1100
@@ -1381,22 +1381,6 @@ get_unmapped_area(struct file *file, uns
if (addr & ~PAGE_MASK)
return -EINVAL;
 
-   if (file && is_file_hugepages(file))  {
-   /*
-* Check if the given range is hugepage aligned, and
-* can be made suitable for hugepages.
-*/
-   ret = prepare_hugepage_range(addr, len, pgoff);
-   } else {
-   /*
-* Ensure that a normal request is not falling in a
-* reserved hugepage range.  For some archs like IA-64,
-* there is a separate region for hugepages.
-*/
-   ret = is_hugepage_only_range(current->mm, addr, len);
-   }
-   if (ret)
-   return -EINVAL;
return addr;
 }
 


[RFC/PATCH 14/15] get_unmapped_area handles MAP_FIXED in generic code

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 mm/mmap.c |   25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

Index: linux-cell/mm/mmap.c
===
--- linux-cell.orig/mm/mmap.c   2007-03-22 16:29:22.0 +1100
+++ linux-cell/mm/mmap.c    2007-03-22 16:30:06.0 +1100
@@ -1199,6 +1199,9 @@ arch_get_unmapped_area(struct file *filp
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED)
+   return addr;
+
if (addr) {
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
@@ -1272,6 +1275,9 @@ arch_get_unmapped_area_topdown(struct fi
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED)
+   return addr;
+
/* requesting a specific address */
if (addr) {
addr = PAGE_ALIGN(addr);
@@ -1360,22 +1366,21 @@ get_unmapped_area(struct file *file, uns
unsigned long pgoff, unsigned long flags)
 {
unsigned long ret;
+   unsigned long (*get_area)(struct file *, unsigned long,
+ unsigned long, unsigned long, unsigned long);
 
-   if (!(flags & MAP_FIXED)) {
-   unsigned long (*get_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
-
-   get_area = current->mm->get_unmapped_area;
-   if (file && file->f_op && file->f_op->get_unmapped_area)
-   get_area = file->f_op->get_unmapped_area;
-   addr = get_area(file, addr, len, pgoff, flags);
-   if (IS_ERR_VALUE(addr))
-   return addr;
-   }
+   get_area = current->mm->get_unmapped_area;
+   if (file && file->f_op && file->f_op->get_unmapped_area)
+   get_area = file->f_op->get_unmapped_area;
+   addr = get_area(file, addr, len, pgoff, flags);
+   if (IS_ERR_VALUE(addr))
+   return addr;
 
if (addr > TASK_SIZE - len)
return -ENOMEM;
if (addr & ~PAGE_MASK)
return -EINVAL;
+
if (file && is_file_hugepages(file))  {
/*
 * Check if the given range is hugepage aligned, and
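
The dispatch change in get_unmapped_area() can be illustrated with a toy model in which the callback is consulted for fixed and non-fixed requests alike. All names are hypothetical (get_area_fn, default_get_area, picky_get_area, FLAG_FIXED), and the 64 KiB alignment rule is an invented example of a file-specific constraint.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FLAG_FIXED 0x10	/* stand-in for MAP_FIXED */

typedef unsigned long (*get_area_fn)(unsigned long addr, unsigned long len,
				     unsigned long flags);

/* Default allocator: honours fixed requests, otherwise picks a base. */
static unsigned long default_get_area(unsigned long addr, unsigned long len,
				      unsigned long flags)
{
	(void)len;
	if (flags & FLAG_FIXED)
		return addr;
	return addr ? addr : 0x40000000UL;	/* arbitrary illustrative base */
}

/* File-specific allocator: rejects fixed addresses not 64 KiB aligned. */
static unsigned long picky_get_area(unsigned long addr, unsigned long len,
				    unsigned long flags)
{
	(void)len;
	if ((flags & FLAG_FIXED) && (addr & 0xffffUL))
		return (unsigned long)-EINVAL;
	return addr;
}

/* The file's callback, when present, overrides the default one. */
static unsigned long get_unmapped_area_model(get_area_fn file_op,
					     unsigned long addr,
					     unsigned long len,
					     unsigned long flags)
{
	get_area_fn get_area = file_op ? file_op : default_get_area;

	return get_area(addr, len, flags);
}
```

Because the callback now sees MAP_FIXED, each architecture or filesystem hook has to handle it explicitly, which is what the rest of this series does.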


[RFC/PATCH 9/15] get_unmapped_area handles MAP_FIXED on x86_64

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/sys_x86_64.c |3 +++
 1 file changed, 3 insertions(+)

Index: linux-cell/arch/x86_64/kernel/sys_x86_64.c
===
--- linux-cell.orig/arch/x86_64/kernel/sys_x86_64.c 2007-03-22 16:10:10.0 +1100
+++ linux-cell/arch/x86_64/kernel/sys_x86_64.c  2007-03-22 16:11:06.0 +1100
@@ -93,6 +93,9 @@ arch_get_unmapped_area(struct file *filp
unsigned long start_addr;
unsigned long begin, end;

+   if (flags & MAP_FIXED)
+   return addr;
+
find_start_end(flags, &begin, &end);
 
if (len > end)


[RFC/PATCH 11/15] get_unmapped_area handles MAP_FIXED on ramfs (nommu)

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 fs/ramfs/file-nommu.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux-cell/fs/ramfs/file-nommu.c
===
--- linux-cell.orig/fs/ramfs/file-nommu.c   2007-03-22 16:18:27.0 +1100
+++ linux-cell/fs/ramfs/file-nommu.c    2007-03-22 16:20:14.0 +1100
@@ -238,7 +238,10 @@ unsigned long ramfs_nommu_get_unmapped_a
struct page **pages = NULL, **ptr, *page;
loff_t isize;
 
-   if (!(flags & MAP_SHARED))
+   /* Deal with MAP_FIXED differently ? Forbid it ? Need help from some nommu
+* folks there... --BenH.
+*/
+   if ((flags & MAP_FIXED) || !(flags & MAP_SHARED))
return addr;
 
/* the mapping mustn't extend beyond the EOF */


[RFC/PATCH 8/15] get_unmapped_area handles MAP_FIXED on sparc64

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/sparc64/mm/hugetlbpage.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-cell/arch/sparc64/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/sparc64/mm/hugetlbpage.c   2007-03-22 16:12:57.0 +1100
+++ linux-cell/arch/sparc64/mm/hugetlbpage.c    2007-03-22 16:15:33.0 +1100
@@ -175,6 +175,12 @@ hugetlb_get_unmapped_area(struct file *f
if (len > task_size)
return -ENOMEM;
 
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = ALIGN(addr, HPAGE_SIZE);
vma = find_vma(mm, addr);


[RFC/PATCH 6/15] get_unmapped_area handles MAP_FIXED on ia64

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/ia64/kernel/sys_ia64.c |7 +++
 arch/ia64/mm/hugetlbpage.c  |8 
 2 files changed, 15 insertions(+)

Index: linux-cell/arch/ia64/kernel/sys_ia64.c
===
--- linux-cell.orig/arch/ia64/kernel/sys_ia64.c 2007-03-22 15:10:45.0 +1100
+++ linux-cell/arch/ia64/kernel/sys_ia64.c  2007-03-22 15:10:47.0 +1100
@@ -33,6 +33,13 @@ arch_get_unmapped_area (struct file *fil
if (len > RGN_MAP_LIMIT)
return -ENOMEM;
 
+   /* handle fixed mapping: prevent overlap with huge pages */
+   if (flags & MAP_FIXED) {
+   if (is_hugepage_only_range(mm, addr, len))
+   return -EINVAL;
+   return addr;
+   }
+
 #ifdef CONFIG_HUGETLB_PAGE
if (REGION_NUMBER(addr) == RGN_HPAGE)
addr = 0;
Index: linux-cell/arch/ia64/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/ia64/mm/hugetlbpage.c  2007-03-22 15:12:32.0 +1100
+++ linux-cell/arch/ia64/mm/hugetlbpage.c   2007-03-22 15:12:39.0 +1100
@@ -148,6 +148,14 @@ unsigned long hugetlb_get_unmapped_area(
return -ENOMEM;
if (len & ~HPAGE_MASK)
return -EINVAL;
+
+   /* Handle MAP_FIXED */
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
/* This code assumes that RGN_HPAGE != 0. */
if ((REGION_NUMBER(addr) != RGN_HPAGE) || (addr & (HPAGE_SIZE - 1)))
addr = HPAGE_REGION_BASE;
 


[RFC/PATCH 7/15] get_unmapped_area handles MAP_FIXED on parisc

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/parisc/kernel/sys_parisc.c |5 +
 1 file changed, 5 insertions(+)

Index: linux-cell/arch/parisc/kernel/sys_parisc.c
===
--- linux-cell.orig/arch/parisc/kernel/sys_parisc.c 2007-03-22 15:28:05.0 +1100
+++ linux-cell/arch/parisc/kernel/sys_parisc.c  2007-03-22 15:29:08.0 +1100
@@ -106,6 +106,11 @@ unsigned long arch_get_unmapped_area(str
 {
if (len > TASK_SIZE)
return -ENOMEM;
+   /* Might want to check for cache aliasing issues for MAP_FIXED case
+* like ARM or MIPS ??? --BenH.
+*/
+   if (flags & MAP_FIXED)
+   return addr;
if (!addr)
addr = TASK_UNMAPPED_BASE;
 


[RFC/PATCH 2/15] get_unmapped_area handles MAP_FIXED on alpha

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/alpha/kernel/osf_sys.c |3 +++
 1 file changed, 3 insertions(+)

Index: linux-cell/arch/alpha/kernel/osf_sys.c
===
--- linux-cell.orig/arch/alpha/kernel/osf_sys.c 2007-03-22 14:58:33.0 +1100
+++ linux-cell/arch/alpha/kernel/osf_sys.c  2007-03-22 14:58:44.0 +1100
@@ -1267,6 +1267,9 @@ arch_get_unmapped_area(struct file *filp
if (len > limit)
return -ENOMEM;
 
+   if (flags & MAP_FIXED)
+   return addr;
+
/* First, see if the given suggestion fits.
 
   The OSF/1 loader (/sbin/loader) relies on us returning an


[RFC/PATCH 3/15] get_unmapped_area handles MAP_FIXED on arm

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/arm/mm/mmap.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-cell/arch/arm/mm/mmap.c
===
--- linux-cell.orig/arch/arm/mm/mmap.c  2007-03-22 14:59:51.0 +1100
+++ linux-cell/arch/arm/mm/mmap.c   2007-03-22 15:00:01.0 +1100
@@ -49,8 +49,7 @@ arch_get_unmapped_area(struct file *filp
 #endif
 
/*
-* We should enforce the MAP_FIXED case.  However, currently
-* the generic kernel code doesn't allow us to handle this.
+* We enforce the MAP_FIXED case.
 */
if (flags & MAP_FIXED) {
if (aliasing && flags & MAP_SHARED && addr & (SHMLBA - 1))


[RFC/PATCH 4/15] get_unmapped_area handles MAP_FIXED on frv

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/frv/mm/elf-fdpic.c |4 
 1 file changed, 4 insertions(+)

Index: linux-cell/arch/frv/mm/elf-fdpic.c
===
--- linux-cell.orig/arch/frv/mm/elf-fdpic.c 2007-03-22 15:00:50.0 +1100
+++ linux-cell/arch/frv/mm/elf-fdpic.c  2007-03-22 15:01:06.0 +1100
@@ -64,6 +64,10 @@ unsigned long arch_get_unmapped_area(str
if (len > TASK_SIZE)
return -ENOMEM;
 
+   /* handle MAP_FIXED */
+   if (flags & MAP_FIXED)
+   return addr;
+
/* only honour a hint if we're not going to clobber something doing so */
if (addr) {
addr = PAGE_ALIGN(addr);


[RFC/PATCH 5/15] get_unmapped_area handles MAP_FIXED on i386

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/i386/mm/hugetlbpage.c |6 ++
 1 file changed, 6 insertions(+)

Index: linux-cell/arch/i386/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/i386/mm/hugetlbpage.c  2007-03-22 16:08:12.0 +1100
+++ linux-cell/arch/i386/mm/hugetlbpage.c   2007-03-22 16:14:19.0 +1100
@@ -367,6 +367,12 @@ hugetlb_get_unmapped_area(struct file *f
if (len > TASK_SIZE)
return -ENOMEM;
 
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = ALIGN(addr, HPAGE_SIZE);
vma = find_vma(mm, addr);


[RFC/PATCH 0/15] Pass MAP_FIXED down to get_unmapped_area

2007-03-21 Thread Benjamin Herrenschmidt
!!! This is a first cut, and there are still cleanups to be done in various
areas touched by that code. I also haven't done descriptions yet for the
individual patches.

The current get_unmapped_area code calls the f_ops->get_unmapped_area or
the arch one (via the mm) only when MAP_FIXED is not passed. That makes
it impossible for archs to impose proper constraints on regions of the
virtual address space. To work around that, get_unmapped_area() then
calls some hugetlbfs specific hacks.

This causes several problems, among others:

 - It makes it impossible for a driver or filesystem to do the same thing
that hugetlbfs does (for example, to allow a driver to use larger page
sizes to map external hardware) if that requires applying a constraint
on the addresses (constraining that mapping in certain regions and other
mappings out of those regions).

 - Some archs like arm, mips, sparc, sparc64, sh and sh64 already want
MAP_FIXED to be passed down in order to deal with aliasing issues.
The code is there to handle it... but is never called.

This series of patches moves the logic to handle MAP_FIXED down to the
various arch/driver get_unmapped_area() implementations, and then changes
the generic code to always call them. The hugetlbfs hacks then disappear
from the generic code.
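
The change described above can be modelled in userspace as a minimal
sketch. Nothing here is kernel code: demo_arch_gua(), the flag and error
constants and the reserved window are hypothetical stand-ins chosen for
illustration. The old path returns a MAP_FIXED address without consulting
the hook, while the new path always calls the hook, so the hook can veto
the address.

```c
#define MAP_FIXED_FLAG 0x10UL   /* illustrative stand-in for MAP_FIXED */
#define EINVAL_ERR     (-22L)   /* illustrative stand-in for -EINVAL */

/* Shape of a (hypothetical) arch/driver get_unmapped_area hook. */
typedef long (*gua_fn)(unsigned long addr, unsigned long len,
                       unsigned long flags);

/* Hypothetical hook: refuses fixed mappings that overlap a reserved
 * window, honours other fixed addresses, else picks its own address. */
static long demo_arch_gua(unsigned long addr, unsigned long len,
                          unsigned long flags)
{
    const unsigned long win_start = 0x10000000UL;
    const unsigned long win_end   = 0x20000000UL;

    if (flags & MAP_FIXED_FLAG) {
        if (addr < win_end && addr + len > win_start)
            return EINVAL_ERR;
        return (long)addr;
    }
    return 0x40000000L;
}

/* Old generic behaviour: the hook is skipped when MAP_FIXED is set. */
static long old_get_unmapped_area(gua_fn hook, unsigned long addr,
                                  unsigned long len, unsigned long flags)
{
    if (flags & MAP_FIXED_FLAG)
        return (long)addr;
    return hook(addr, len, flags);
}

/* New generic behaviour: the hook is always called. */
static long new_get_unmapped_area(gua_fn hook, unsigned long addr,
                                  unsigned long len, unsigned long flags)
{
    return hook(addr, len, flags);
}
```

With a fixed request inside the reserved window, the old path hands the
address back unchecked, whereas the new path lets the hook reject it.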

Since I need to do some special 64K page mappings for SPEs on cell, I need
to work around the first problem at least. I thus have further patches
implementing a "slices" layer that handles multiple page sizes through
slices of the address space for use by hugetlbfs, the SPE code, and possibly
others, but it requires this series of patches first.

There is still a potential (but not practical) issue due to the fact that
filesystems/drivers implementing g_u_a will effectively bypass all arch
checks. This is not an issue in practice, as the only users that actually
do so are doing it through arch hooks in the first place.

There is also a problem with mremap that will completely bypass all arch
checks. I'll try to address that separately mostly by making it not work
when the vma has a file whose f_ops has a get_unmapped_area callback,
and by making it use is_hugepage_only_range() before expanding into a
new area.

Also, I want to turn is_hugepage_only_range() into a more generic
is_normal_page_range() as that's really what it will end up meaning
when used in stack grow, brk grow and mremap.



[RFC/PATCH 1/15] get_unmapped_area handles MAP_FIXED on powerpc

2007-03-21 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 arch/powerpc/mm/hugetlbpage.c |   21 +
 1 file changed, 21 insertions(+)

Index: linux-cell/arch/powerpc/mm/hugetlbpage.c
===
--- linux-cell.orig/arch/powerpc/mm/hugetlbpage.c   2007-03-22 14:52:07.0 +1100
+++ linux-cell/arch/powerpc/mm/hugetlbpage.c    2007-03-22 14:57:40.0 +1100
@@ -572,6 +572,13 @@ unsigned long arch_get_unmapped_area(str
if (len > TASK_SIZE)
return -ENOMEM;
 
+   /* handle fixed mapping: prevent overlap with huge pages */
+   if (flags & MAP_FIXED) {
+   if (is_hugepage_only_range(mm, addr, len))
+   return -EINVAL;
+   return addr;
+   }
+
if (addr) {
addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
@@ -647,6 +654,13 @@ arch_get_unmapped_area_topdown(struct fi
if (len > TASK_SIZE)
return -ENOMEM;
 
+   /* handle fixed mapping: prevent overlap with huge pages */
+   if (flags & MAP_FIXED) {
+   if (is_hugepage_only_range(mm, addr, len))
+   return -EINVAL;
+   return addr;
+   }
+
/* dont allow allocations above current base */
if (mm->free_area_cache > base)
mm->free_area_cache = base;
@@ -829,6 +843,13 @@ unsigned long hugetlb_get_unmapped_area(
/* Paranoia, caller should have dealt with this */
BUG_ON((addr + len)  < addr);
 
+   /* Handle MAP_FIXED */
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(addr, len, pgoff))
+   return -EINVAL;
+   return addr;
+   }
+
if (test_thread_flag(TIF_32BIT)) {
curareas = current->mm->context.low_htlb_areas;
 


Re: [PATCH] Delete obsolete RAW driver feature.

2007-03-21 Thread Willy Tarreau
On Thu, Mar 22, 2007 at 01:02:50AM -0400, Dave Jones wrote:
> On Thu, Mar 22, 2007 at 05:45:40AM +0100, Willy Tarreau wrote:
>  > On Thu, Mar 22, 2007 at 12:17:51AM -0400, Dave Jones wrote:
>  > > On Thu, Mar 22, 2007 at 05:12:42AM +0100, Willy Tarreau wrote:
>  > >  > On Wed, Mar 21, 2007 at 07:43:18PM -0400, Dave Jones wrote:
>  > >  > > On Thu, Mar 22, 2007 at 12:24:33AM +0100, Willy Tarreau wrote:
>  > >  > >  
>  > >  > >  > Then a printk() on every open() should be enough. We've all been seeing
>  > >  > >  > "Warning: tcpdump uses obsolete AF_PACKET"... and it finally disappeared.
>  > >  > > 
>  > >  > > There's a difference.  We have the source for tcpdump.
>  > >  > 
>  > >  > But what's the problem with "warning: process XXX uses obsolete raw driver
>  > >  > and may not work anymore after 2007/XX/XX if not fixed" ?
>  > > 
>  > > The target audience isn't going to read it.
>  > 
>  > Yes they will if you write it with KERN_CRIT.
> 
> *no*.
> 
> Users will see it. The developers of the software those users are running won't.
> We're talking about apps here that we don't have the source to, and vendors
> want extortionate amount of money to change.

Dave, I think you don't get it. People are paying to run those apps. When
you pay 10s or 100s of K$ a year for support and you see such messages
appear in your logs, you ask the app vendor what will happen at the given
date. Those people are already afraid of end of support on products without
any warning message. I sincerely believe that they will insist even more
when the message tells them about the end of compatibility date.

Of course, the vendor will say "we've been informed that this is caused by
a change in distro XXX. Distro YYY does not have it, you may want to migrate".
But those responses rarely satisfy customers who have to revalidate everything
to change a distro.

At most, they will ask their distro vendor for continued support of the
feature (which there will be in the same minor release), and if vendors'
feedback shows there is enough demand, then we will just have to delay the
removal.

Regards,
Willy



Re: [PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-03-21 Thread Tomoki Sekiyama

Hi,

Thanks for your comments.
I'm sorry for my late reply.

Bill Davidsen wrote:
> Andrew Morton wrote:
>>> On Wed, 14 Mar 2007 21:42:46 +0900 Tomoki Sekiyama
>>> <[EMAIL PROTECTED]> wrote:
>>>
>>> ...
>>>
>>>
>>> -Solution:
>>>
>>> I consider that all of the dirty pages for the disk have been written
>>> back and that the disk is clean if a process cannot write 'write_chunk'
>>> pages in balance_dirty_pages().
>>>
>>> To avoid using up the free memory with dirty pages by bypassing blocking,
>>> this patchset adds a new threshold named vm.dirty_limit_ratio to sysctl.
>>>
>>> It modifies balance_dirty_pages() not to block when the amount of
>>> Dirty+Writeback is less than vm.dirty_limit_ratio percent of the memory.
>>> In the other cases, writers are throttled as current Linux does.
>>>
>>>
>>> In this patchset, vm.dirty_limit_ratio, instead of vm.dirty_ratio, is
>>> used as the clamping level of Dirty+Writeback. And, vm.dirty_ratio is
>>> used as the level at which a writer will itself start writeback of the
>>> dirty pages.
>>
>> Might be a reasonable solution - let's see what Peter comes up with too.
>>
>> Comments on the patch:
>>
>> - Please don't add VM_DIRTY_LIMIT_RATIO: just use CTL_UNNUMBERED and leave
>>   sysctl.h alone.
>>
>> - The 40% default is already too high.  Let's set this new upper limit to
>>   40% and decrease the non-blocking ratio.
>>
>> - Please update the procfs documentation in ./Documentation/

OK, I'm going to fix them and repost the patchset.


>> - I wonder if dirty_limit_ratio is the best name we could choose.
>> vm_dirty_blocking_ratio, perhaps?  Dunno.
>>
> I don't like it, but I dislike it less than "dirty_limit_ratio" I guess.
> It would probably break things to change it now, including my
> sysctl.conf on a number of systems  :-(

I'm wondering which interface is preferred...

1) Just rename "dirty_limit_ratio" to "dirty_blocking_ratio."
   Those who had been changing dirty_ratio should additionally modify
   dirty_blocking_ratio in order to determine the upper limit of dirty pages.

2) Change "dirty_ratio" to a vector consisting of 2 values:
   {blocking ratio, writeback starting ratio}.
   For example, to change the both values:
 # echo 40 35 > /proc/sys/vm/dirty_ratio
   And to change only the first one:
 # echo 20 > /proc/sys/vm/dirty_ratio
   In the latter way the writeback starting ratio is regarded as the same as the
   blocking ratio if the writeback starting ratio is smaller. And then, the
   kernel behaves similarly to the current kernel.

3) Use "dirty_ratio" as the blocking ratio. And add
   "start_writeback_ratio", and start writeback at
   start_writeback_ratio(default:90) * dirty_ratio / 100 [%].
   In this way, specifying blocking ratio can be done in the same way as
   current kernel, but high/low watermark algorithm is enabled.
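
The arithmetic in option 3 can be sketched as follows. This is only an
illustration of the formula quoted above: the struct and function names
are hypothetical, and the page count is an arbitrary example rather than
anything taken from the patchset.

```c
/* Thresholds expressed in pages (hypothetical helper type). */
struct dirty_thresholds {
    unsigned long block_pages;      /* writers block above this */
    unsigned long writeback_pages;  /* writeback starts above this */
};

/*
 * Option 3: dirty_ratio is the blocking ratio (percent of memory), and
 * writeback starts at start_writeback_ratio * dirty_ratio / 100 percent.
 * Note: the multiplications can overflow for very large page counts on
 * 32-bit; a real implementation would need to guard against that.
 */
static struct dirty_thresholds
compute_thresholds(unsigned long total_pages, unsigned int dirty_ratio,
                   unsigned int start_writeback_ratio)
{
    struct dirty_thresholds t;

    t.block_pages = total_pages * dirty_ratio / 100;
    t.writeback_pages = t.block_pages * start_writeback_ratio / 100;
    return t;
}
```

For example, with 100000 pages, dirty_ratio=40 and the proposed default
start_writeback_ratio=90, writers would block at 40000 dirty pages and
writeback would start at 36000, i.e. 36% of memory.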


Regards,
--
Tomoki Sekiyama
Hitachi, Ltd., Systems Development Laboratory



Re: [PATCH -mm 1/4] Blackfin: architecture update patch

2007-03-21 Thread Wu, Bryan
On Wed, 2007-03-21 at 20:08 -0800, Andrew Morton wrote:
> On Thu, 22 Mar 2007 10:24:51 +0800 "Wu, Bryan" <[EMAIL PROTECTED]> wrote:
> 
> > > 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/cplbinit.h.rej
> > > 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/mach-bf535/bf535.h.rej
> > > 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/scatterlist.h.rej
> > > 
> > > This seems to be against a kernel which did not include
> > > blackfin-arch-balance-parenthesis-in-macros.patch.  But 2.6.21-rc4-mm1 did
> > > include blackfin-arch-balance-parenthesis-in-macros.patch, so I'm not sure
> > > what is going on here.
> > > 
> > 
> > Sorry for this mess up. you know, as
> > blackfin-arch-balance-parenthesis-in-macros.patch was posted, it was
> > applied to our SVN tree. And I just merged these weeks SVN change to
> > this update patch. So this patch includes
> > blackfin-arch-balance-parenthesis-in-macros.patch. 
> 
> Well, you'd better get used to it.  As I told you a few weeks ago, once this
> code is merged into mainline you no longer own the master copy of the blackfin
> tree - Linus does.
> 

We agree with this; please give us some time to push our
patches to the mainline tree. We still have tons of code that should be
submitted to mainline, and before submission the code should be
reviewed and fixed. This is my main task, and I think this internal
review can reduce your and other kernel maintainers' effort.

But our team still has release pressure from customers. The whole
distribution, including bootloader/toolchain/kernel/uClinux-dist, should
be delivered to customers on time. If our kernel blackfin-arch is
merged eventually, we will definitely move to Linus's tree and ask our
customers to use the mainline tree.

As you know, our latest U-Boot blackfin code was accepted by U-Boot
upstream mainline recently. The internal U-Boot SVN tree will then be
replaced by the upstream git-blackfin tree in U-Boot, and our development
will move to that mainline tree.

At this moment, I will try my best to speed up the blackfin-arch and
drivers merge progress.

> If you're not careful, you'll end up submitting patches which revert other
> people's fixes.   And there's an excellent chance that you will submit patches
> which have dependencies on other stuff in your private tree and which are
> insufficiently-tested-against, and possibly incorrect-against Linus's tree.

Apologies. I will resend the blackfin update patch based on other
patches.

Thanks Andrew
-Bryan Wu


Re: $CHECK can't be overridden

2007-03-21 Thread Keith Owens
Dave Jones (on Thu, 22 Mar 2007 01:37:14 -0400) wrote:
>On Thu, Mar 22, 2007 at 04:26:39PM +1100, Keith Owens wrote:
> > Dave Jones (on Thu, 22 Mar 2007 01:15:25 -0400) wrote:
> > >make help implies that supplying $CHECK on the command line
> > >should override sparse as the checker used when building with C=1
> > >Yet, this doesn't seem to be the case.
> > >
> > >This would be useful for cases where for eg, sparse isn't in
> > >the $PATH, allowing an explicit path to the executable to be
> > >passed in automated build environments.
> > 
> > Works for me.
> > 
> > # make C=1 CHECK=foo
>
>Ah, my bad. I was thinking it was an environment var rather
>than a makefile var.  I was using 'CHECK=foo make bzImage C=1'

The default for 'make' is that environment variables do _not_ override
variables defined in the Makefiles.  You can change that behaviour with
the -e flag, so 'CHECK=foo make -e bzImage C=1' should work.  'info make'
recommends against using -e, since changing environments can lead to
unexpected side effects.



Re: [PATCH -mm 4/4] Blackfin: on-chip Two Wire Interface I2C driver

2007-03-21 Thread Mike Frysinger

On 3/21/07, Jean Delvare <[EMAIL PROTECTED]> wrote:
> > + p_adap->class = I2C_CLASS_ALL;
>
> This pretty much voids the point of these probing classes. You should
> only select the classes matching devices which may actually be probed
> for on this bus. If different boards have different needs, get the
> right classes from the platform data.


i asked about the class issue previously specifically for this bus
driver and was told that they werent really fully defined ... the
on-chip I2C interface on the Blackfin chip can handle any I2C device
which is why i added this line

any examples of how to go about doing it via boards ?  i dont see any
other I2C bus driver doing it that way ...
-mike


Re: $CHECK can't be overridden

2007-03-21 Thread Dave Jones
On Thu, Mar 22, 2007 at 04:26:39PM +1100, Keith Owens wrote:
 > Dave Jones (on Thu, 22 Mar 2007 01:15:25 -0400) wrote:
 > >make help implies that supplying $CHECK on the command line
 > >should override sparse as the checker used when building with C=1
 > >Yet, this doesn't seem to be the case.
 > >
 > >This would be useful for cases where for eg, sparse isn't in
 > >the $PATH, allowing an explicit path to the executable to be
 > >passed in automated build environments.
 > 
 > Works for me.
 > 
 > # make C=1 CHECK=foo

Ah, my bad. I was thinking it was an environment var rather
than a makefile var.  I was using 'CHECK=foo make bzImage C=1'

Thanks,

Dave

-- 
http://www.codemonkey.org.uk


Re: $CHECK can't be overridden

2007-03-21 Thread Keith Owens
Dave Jones (on Thu, 22 Mar 2007 01:15:25 -0400) wrote:
>make help implies that supplying $CHECK on the command line
>should override sparse as the checker used when building with C=1
>Yet, this doesn't seem to be the case.
>
>This would be useful for cases where for eg, sparse isn't in
>the $PATH, allowing an explicit path to the executable to be
>passed in automated build environments.

Works for me.

# make C=1 CHECK=foo
Using somedir/linux as source for kernel
GEN someobj/Makefile
CHK include/linux/version.h
CHK include/linux/utsrelease.h
CHK include/linux/compile.h
CHECK   somedir/linux/kdb/modules/kdb_bt2.c
/bin/sh: foo: command not found




Re: [PATCH] Unset msi and msix flags on pci_device_disable

2007-03-21 Thread Eric W. Biederman
Thomas Meyer <[EMAIL PROTECTED]> writes:

> The commit f5f2b13129a6541debf8851bae843cbbf48298b7 broke suspend/resume
> to disk two or more  times in a row. This patches fixes the problem:

Please clue me in on what the problem is. I see why my patch would
have changed things, but I don't see how it is breaking things.  Clearing
the device bits has no effect on the hardware and it leaks our
msi handling data structures.

What is the user doing that this code broke?

I will be happy to work through a proper fix but this isn't it.

If this actually helps something I am tempted to make the
added lines say WARN_ON.

But I would really like to understand the nature of the problem
so that we can do something to fix it properly.

Eric


$CHECK can't be overridden

2007-03-21 Thread Dave Jones
make help implies that supplying $CHECK on the command line
should override sparse as the checker used when building with C=1
Yet, this doesn't seem to be the case.

This would be useful for cases where for eg, sparse isn't in
the $PATH, allowing an explicit path to the executable to be
passed in automated build environments.

Dave

-- 
http://www.codemonkey.org.uk


Re: PNPACPI probes serial twice, messes up serial console

2007-03-21 Thread Bjorn Helgaas
On Wednesday 21 March 2007 22:23, Keith Owens wrote:
> The aim of the patch looks sensible, but it will not compile for
> 2.6.21-rc4.  8250_x86.c tests pnp_platform_devices, which does not
> exist.  Also the combination of CONFIG_SERIAL_8250_X86=y and
> CONFIG_SERIAL_8250_PNP=m would result in 8250_x86.o being built into
> vmlinux but referring to serial8250_nopnp in module 8250_pnp.o, kernel
> to module references are tricky.

OK, I'll rework it and update it.  It will probably take me
a couple weeks though.


Re: [PATCH] Delete obsolete RAW driver feature.

2007-03-21 Thread Dave Jones
On Thu, Mar 22, 2007 at 05:45:40AM +0100, Willy Tarreau wrote:
 > On Thu, Mar 22, 2007 at 12:17:51AM -0400, Dave Jones wrote:
 > > On Thu, Mar 22, 2007 at 05:12:42AM +0100, Willy Tarreau wrote:
 > >  > On Wed, Mar 21, 2007 at 07:43:18PM -0400, Dave Jones wrote:
 > >  > > On Thu, Mar 22, 2007 at 12:24:33AM +0100, Willy Tarreau wrote:
 > >  > >  
 > >  > >  > Then a printk() on every open() should be enough. We've all been seeing
 > >  > >  > "Warning: tcpdump uses obsolete AF_PACKET"... and it finally disappeared.
 > >  > > 
 > >  > > There's a difference.  We have the source for tcpdump.
 > >  > 
 > >  > But what's the problem with "warning: process XXX uses obsolete raw driver
 > >  > and may not work anymore after 2007/XX/XX if not fixed" ?
 > > 
 > > The target audience isn't going to read it.
 > 
 > Yes they will if you write it with KERN_CRIT.

*no*.

Users will see it. The developers of the software those users are running won't.
We're talking about apps here that we don't have the source to, and vendors
want extortionate amount of money to change.

Dave

-- 
http://www.codemonkey.org.uk


Re: [PATCH] sched: rsdl improvements

2007-03-21 Thread Con Kolivas
On Thursday 22 March 2007 10:36, Andrew Morton wrote:
> On Thu, 22 Mar 2007 04:29:44 +1100
>
> Con Kolivas <[EMAIL PROTECTED]> wrote:
> > Further improve the deterministic nature of the RSDL cpu scheduler and
> > make the rr_interval tunable.
>
> I might actually need to drop RSDL from next -mm, see if those sched oopses
> which several people have reported go away.

I did mention them in the changelog further down. While it may not be 
immediately apparent from the minimal emails I'm sending, I am trying hard to 
address every known regression in the time allotted. Without access to the 
hardware though I'm reliant on others testing it so I can't know for certain 
if I've fixed them.

-- 
-ck


Re: [PATCH 2/7] Initial implementation of the trec driver and include files

2007-03-21 Thread Wink Saville

On 3/21/07, Johannes Weiner <[EMAIL PROTECTED]> wrote:
> Hi,
>
> On Wed, Mar 21, 2007 at 09:49:12AM -0700, Wink Saville wrote:
> > >>Please don't use camel-case - in general.
> > >>
> > Would p_next, p_cur and p_end be OK?
>
> I think it's generally disliked. Quoting Documentation/CodingStyle:
>
> ``Encoding the type of a function into the name (so-called Hungarian
> notation) is brain damaged - the compiler knows the types anyway and can
> check those, and it only confuses the programmer.  No wonder MicroSoft
> makes buggy programs.''

I'll change it to: next, cur, end (when in Rome, do as the Romans).

> You can implement a sysctl-setting to enable/disable the whole system
> (which then would be runtime changeable).


Generally the way I've used this in the kernel is sprinkle them liberally
where I'm trying to track down a bug or understand how some code works
and then disable some of them using ZREC's but then reenable as
necessary. Finally, I remove them all usually by reverting to the original
code.

In userland I have a more sophisticated version which combines TREC's
and printf's in one macro with a "debug variable" defined as a
set of bits controlling the enable/disable sets of these. For example:

  DPRP1(1 << 2, "This is a DPR with parameter=%d", xyz);

#define DPRP1(__bits, __format, __param) \
do { \
  if (((__bits) << 16) & dbg_variable) \
    printf((__format), (__param)); \
  if ((__bits) & dbg_variable) \
    TREC1((__param)); \
} while (0)

The above divides "dbg_variable" into two 16-bit fields: the upper 16
bits control whether the printf is enabled and the lower 16 bits control
whether the TREC1 is enabled. What I envision is possibly exposing
"dbg_variable" via procfs, where it could "easily" be
modified from userland.
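
A self-contained userspace sketch of that two-field scheme follows. It is
an illustration only: TREC1 is replaced here by a hypothetical stub that
just counts calls, and the print_calls counter is added so the effect of
each half of "dbg_variable" can be observed.

```c
#include <stdio.h>

static unsigned int dbg_variable; /* upper 16 bits: printf, lower 16: TREC1 */
static int trec_calls;            /* counts calls to the stub TREC1 */
static int print_calls;           /* counts printf invocations */

/* Hypothetical stand-in for the real TREC1 trace macro. */
#define TREC1(p) do { (void)(p); trec_calls++; } while (0)

#define DPRP1(bits, fmt, param) \
do { \
  if ((((unsigned int)(bits)) << 16) & dbg_variable) { \
    printf((fmt), (param)); \
    print_calls++; \
  } \
  if (((unsigned int)(bits)) & dbg_variable) \
    TREC1(param); \
} while (0)
```

Setting bit 2 in the lower half enables only the trace record for
DPRP1(1 << 2, ...); setting bit 18 (bit 2 shifted up by 16) enables only
the printf.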

Anyway, for my immediate needs I just needed the TREC's so I could
understand the inner workings of the kernel better and assist in debugging.
I've submitted it as a patch in case anyone might be interested.

Actually, are you interested in using them?

In any case, thanks for the feedback.

Wink


Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems

2007-03-21 Thread David Chinner
On Wed, Mar 21, 2007 at 10:38:33PM +0100, Rafael J. Wysocki wrote:
> 
> I think this is the XFS problem with freezable workqueues.
> 
> Maxim, please try to apply the appended patch and see if it helps.
> 
> Greetings,
> Rafael
> 
> 
> ---
> Since freezable workqueues are broken in 2.6.21-rc
> (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
> http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
> it's better to remove them altogether for 2.6.21 and change the only user of
> them (XFS) accordingly.
> 
> ---
>  fs/xfs/linux-2.6/xfs_buf.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
> ===
> --- linux-2.6.21-rc4.orig/fs/xfs/linux-2.6/xfs_buf.c
> +++ linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
> @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
>   if (!xfs_buf_zone)
>   goto out_free_trace_buf;
>  
> - xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> + xfslogd_workqueue = create_workqueue("xfslogd");
>   if (!xfslogd_workqueue)
>   goto out_free_buf_zone;
>  
> - xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> + xfsdatad_workqueue = create_workqueue("xfsdatad");
>   if (!xfsdatad_workqueue)
>   goto out_destroy_xfslogd_workqueue;
>  

Acked-by: Dave Chinner <[EMAIL PROTECTED]>

Rafael, it sounds like this really needs to go into the next -rc
kernel. Can you push it to Linus?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: RSDL v0.31

2007-03-21 Thread Willy Tarreau
On Wed, Mar 21, 2007 at 06:07:33PM +0100, Mike Galbraith wrote:
> On Wed, 2007-03-21 at 16:11 +0100, Paolo Ornati wrote:
> > On Wed, 21 Mar 2007 15:57:44 +0100
> > Mike Galbraith <[EMAIL PROTECTED]> wrote:
> > 
> > > I was more than a bit surprised that mainline did this well, considering
> > > that the proggy was one someone posted long time ago to demonstrate
> > > starvation issues with the interactivity estimator.  (source not
> > > available unfortunately, was apparently still on my old PIII box along
> > > with the one Willy posted when I installed opensuse 10.2 on it.  damn.
> > > trivial thing though)
> > 
> > This one?  :)
> 
> No, but that one went to bit heaven too ;-)

Mike, if you need my old scheddos, I can resend it to you as well as to
any people working on the scheduler and asking for it. Although trivial,
I'm a bit reluctant to publish it to the whole world because I suspect
that distros based on older kernels are still vulnerable and the fixes
may not be easy. Anyway, it has absolutely no effect on non-interactive
schedulers.

Regards,
Willy



Re: [PATCH] Delete obsolete RAW driver feature.

2007-03-21 Thread Willy Tarreau
On Thu, Mar 22, 2007 at 12:17:51AM -0400, Dave Jones wrote:
> On Thu, Mar 22, 2007 at 05:12:42AM +0100, Willy Tarreau wrote:
>  > On Wed, Mar 21, 2007 at 07:43:18PM -0400, Dave Jones wrote:
>  > > On Thu, Mar 22, 2007 at 12:24:33AM +0100, Willy Tarreau wrote:
>  > >  
>  > >  > Then a printk() on every open() should be enough. We've all been seeing
>  > >  > "Warning: tcpdump uses obsolete AF_PACKET"... and it finally disappeared.
>  > > 
>  > > There's a difference.  We have the source for tcpdump.
>  > 
>  > But what's the problem with "warning: process XXX uses obsolete raw driver
>  > and may not work anymore after 2007/XX/XX if not fixed" ?
> 
> The target audience isn't going to read it.

Yes they will if you write it with KERN_CRIT. It will show up on all their
consoles every time their process starts up. What is right is that those
people preferably use vendors' distro and generally do not compile their
kernels themselves, unless there are internal skills and good motivation
to do so.

Willy



Re: PNPACPI probes serial twice, messes up serial console

2007-03-21 Thread Keith Owens
Bjorn Helgaas (on Wed, 21 Mar 2007 10:35:38 -0600) wrote:
>On Tuesday 20 March 2007 08:32, Bjorn Helgaas wrote:
>> On Tuesday 20 March 2007 00:46, Keith Owens wrote:
>> > Booting with 'console=tty console=ttyS0,9600'.  The serial console on
>> > ttyS0 (0x3f8, irq 4) is probed twice, once from serial8250_init() and
>> > again from serial_pnp_probe().
>> 
>> I played with this last summer, but was too timid to finish it
>> and post it.  My plan was to remove the legacy SERIAL_PORT_DFNS,
>> make platform devices for them, and only register the platform
>> devices in the absence of PNP.
>> 
>> My motivation at the time was to prevent 8250 from claiming IRDA
>> devices that happened to live at legacy UART addresses.  I also
>> wanted to make IRDA (smsc-ircc2 in my case) smart enough to use
>> PNP to locate its devices, since 8250 would no longer claim them.
>> 
>> Here's the dusty patch (against 2.6.18-rc1-mm2).  If it seems
>> like a reasonable thing to do, I can update it, polish it up,
>> add a changelog, and post it.
>
>Keith, does this patch help?  Russell didn't complain about it, so
>if it fixes your problem, maybe we could put it in -mm and see if
>it breaks anything else.

The aim of the patch looks sensible, but it will not compile for
2.6.21-rc4.  8250_x86.c tests pnp_platform_devices, which does not
exist.  Also the combination of CONFIG_SERIAL_8250_X86=y and
CONFIG_SERIAL_8250_PNP=m would result in 8250_x86.o being built into
vmlinux but referring to serial8250_nopnp in module 8250_pnp.o, kernel
to module references are tricky.



Re: [PATCH] Delete obsolete RAW driver feature.

2007-03-21 Thread Dave Jones
On Thu, Mar 22, 2007 at 05:12:42AM +0100, Willy Tarreau wrote:
 > On Wed, Mar 21, 2007 at 07:43:18PM -0400, Dave Jones wrote:
 > > On Thu, Mar 22, 2007 at 12:24:33AM +0100, Willy Tarreau wrote:
 > >  
 > >  > Then a printk() on every open() should be enough. We've all been seeing
 > >  > "Warning: tcpdump uses obsolete AF_PACKET"... and it finally disappeared.
 > > 
 > > There's a difference.  We have the source for tcpdump.
 > 
 > But what's the problem with "warning: process XXX uses obsolete raw driver
 > and may not work anymore after 2007/XX/XX if not fixed" ?

The target audience isn't going to read it.

Dave

-- 
http://www.codemonkey.org.uk


Re: [1/6] 2.6.21-rc4: known regressions

2007-03-21 Thread Nick Piggin

Linus Torvalds wrote:

> In contrast, the hang reported by Mariusz Kozlowski has a slightly
> different feel to it, but there's a tantalizing pattern in there too:
>
>   http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/1243.html
>
> Call Trace:
> [] io_schedule+0x42/0x59
> [] sleep_on_buffer+0x8/0xc
> [] __wait_on_bit+0x47/0x6c
> [] out_of_line_wait_on_bit+0x5b/0x64
> [] __wait_on_buffer+0x27/0x2d
> [] journal_commit_transaction+0x707/0x127f
> [] kjournald+0xac/0x1ed
> [] kthread+0xa2/0xc9
> [] kernel_thread_helper+0x7/0x1c
>
> which certainly also looks like an IO never completed (or completed but
> never woke anything up).
>
> It also seems to be related to *buffers*. Maybe the whole bh layer thing
> is a fluke, but it's not waiting for normal data, it's very much waiting
> for those journal things that all use buffer heads. Which just makes me
> worry about those patches by Nick (which did come in through Andrew). I
> don't think it's the memorder one (it looks safe and shouldn't matter on
> x86 anyway!), but what about the
>
> fs: fix __block_write_full_page error case buffer submission
>
> locking change for example? Or that "fs: fix nobh data leak" thing with
> its fix? It uses "SetPageUptodate(page);" without waking up anybody who
> might wait for it (but the waiters here seem to wait on buffers, so that's
> probably not it)..

Nothing sleeps on PageUptodate, so I don't think that could explain it.

The fs: fix __block_write_full_page error case buffer submission patch
does change the locking, but I'd be really surprised if that was the
problem, because it changes locking to match the regular non-error path
submission.

It could be possible that ext3 is doing something weird and expecting
the old behaviour if it failed get_block, but that seems pretty weird
to do, and would need fixing.

fs: nobh data leak... again hard to see how it could cause an unlock/wakeup
to get lost. Is Mariusz using the nobh mount option?

It wouldn't hurt to test with these patches backed out...

> Alternatively, maybe it really is an _io_ problem (and the buffer-head
> thing is just a red herring, and it could happen to other IO, it's just
> that metadata IO uses buffer heads), and it's the scheduler changes since
> 2.6.20..

I see what you mean. Could it be an ext3 or jbd change I wonder?

--
SUSE Labs, Novell Inc.


Re: [PATCH] Delete obsolete RAW driver feature.

2007-03-21 Thread Willy Tarreau
On Wed, Mar 21, 2007 at 07:43:18PM -0400, Dave Jones wrote:
> On Thu, Mar 22, 2007 at 12:24:33AM +0100, Willy Tarreau wrote:
>  
>  > Then a printk() on every open() should be enough. We've all been seeing
>  > "Warning: tcpdump uses obsolete AF_PACKET"... and it finally disappeared.
> 
> There's a difference.  We have the source for tcpdump.

But what's the problem with "warning: process XXX uses obsolete raw driver
and may not work anymore after 2007/XX/XX if not fixed" ?

Willy



Re: Oops after cd /sys/.../cpufreq/; rmmod; cat stats/time_in_state

2007-03-21 Thread Andrew Morton
On Wed, 21 Mar 2007 21:00:06 -0700 Greg KH <[EMAIL PROTECTED]> wrote:

> On Wed, Mar 21, 2007 at 11:51:04PM -0400, Dave Jones wrote:
> > On Wed, Mar 21, 2007 at 08:07:53PM -0700, Greg KH wrote:
> >  
> >  > > After modprobe/rmmod cpufreq/stats directory appears but doesn't get
> >  > > removed. Should it?
> >  > Well, one can argue that those stats should never be in sysfs at all
> >  > anyway, I mean come on, a histogram in sysfs?  That's, not ok.
> > 
> > Meh, it's only a cheesy debug thing, so it's not really that big a deal imo.
> > it could probably move to debugfs (we didn't have that when it was merged iirc)
> > I doubt anyone really cares enough to bother though I wouldn't
> > be averse to a patch.
> 
> Yeah, I realize this, I'm not trying to find fault, sorry if it came
> across that way.  And yes, it should move to debugfs some day, but if it
> does, I'll lose my "what not to put in sysfs" example I use in
> presentations :)
> 

I ain't picky, but as a short-term thing it'd be kinda nice if it didn't
oops the kernel.


Re: [PATCH -mm 1/4] Blackfin: architecture update patch

2007-03-21 Thread Andrew Morton
On Thu, 22 Mar 2007 10:24:51 +0800 "Wu, Bryan" <[EMAIL PROTECTED]> wrote:

> > 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/cplbinit.h.rej
> > 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/mach-bf535/bf535.h.rej
> > 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/scatterlist.h.rej
> > 
> > This seems to be against a kernel which did not include
> > blackfin-arch-balance-parenthesis-in-macros.patch.  But 2.6.21-rc4-mm1 did
> > include blackfin-arch-balance-parenthesis-in-macros.patch, so I'm not sure
> > what is going on here.
> > 
> 
> Sorry for this mess up. you know, as
> blackfin-arch-balance-parenthesis-in-macros.patch was posted, it was
> applied to our SVN tree. And I just merged these weeks SVN change to
> this update patch. So this patch includes
> blackfin-arch-balance-parenthesis-in-macros.patch. 

Well, you'd better get used to it.  As I told you a few weeks ago, once this
code is merged into mainline you no longer own the master copy of the blackfin
tree - Linus does.

If you're not careful, you'll end up submitting patches which revert other
people's fixes.   And there's an excellent chance that you will submit patches
which have dependencies on other stuff in your private tree and which are
insufficiently-tested-against, and possibly incorrect-against Linus's tree.


Re: [PATCH] Use X86_EFLAGS_IF in irqflags.h, lguest.

2007-03-21 Thread Keith Owens
Rusty Russell (on Thu, 22 Mar 2007 14:52:29 +1100) wrote:
>On Thu, 2007-03-22 at 14:24 +1100, Rusty Russell wrote:
>> Belay this: there's a X86_EFLAGS_IF in asm/processor.h which we should
>> use.  Will send patch.
>
>How's this.  There may be other users, but they're not easy to grep for.

One less set of definitions for KDB to create - Thanks.



Re: Oops after cd /sys/.../cpufreq/; rmmod; cat stats/time_in_state

2007-03-21 Thread Greg KH
On Wed, Mar 21, 2007 at 11:51:04PM -0400, Dave Jones wrote:
> On Wed, Mar 21, 2007 at 08:07:53PM -0700, Greg KH wrote:
>  
>  > > After modprobe/rmmod cpufreq/stats directory appears but doesn't get
>  > > removed. Should it?
>  > Well, one can argue that those stats should never be in sysfs at all
>  > anyway, I mean come on, a histogram in sysfs?  That's, not ok.
> 
> Meh, it's only a cheesy debug thing, so it's not really that big a deal imo.
> it could probably move to debugfs (we didn't have that when it was merged iirc)
> I doubt anyone really cares enough to bother though I wouldn't
> be averse to a patch.

Yeah, I realize this, I'm not trying to find fault, sorry if it came
across that way.  And yes, it should move to debugfs some day, but if it
does, I'll lose my "what not to put in sysfs" example I use in
presentations :)

thanks,

greg k-h


[PATCH] Use X86_EFLAGS_IF in irqflags.h, lguest.

2007-03-21 Thread Rusty Russell
On Thu, 2007-03-22 at 14:24 +1100, Rusty Russell wrote:
> Belay this: there's a X86_EFLAGS_IF in asm/processor.h which we should
> use.  Will send patch.

How's this.  There may be other users, but they're not easy to grep for.

==
Move X86_EFLAGS_IF et al out to a new header: processor-flags.h, so we
can include it from irqflags.h and use it in raw_irqs_disabled_flags().

As a side-effect, we could now use these flags in .S files.

Lguest also modified to use the flags.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r bd0a803d9948 include/asm-i386/processor-flags.h
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/include/asm-i386/processor-flags.h        Thu Mar 22 14:31:41 2007 +1100
@@ -0,0 +1,26 @@
+#ifndef __ASM_I386_PROCESSOR_FLAGS_H
+#define __ASM_I386_PROCESSOR_FLAGS_H
+/* Various flags defined: can be included from assembler. */
+
+/*
+ * EFLAGS bits
+ */
+#define X86_EFLAGS_CF   0x00000001 /* Carry Flag */
+#define X86_EFLAGS_PF   0x00000004 /* Parity Flag */
+#define X86_EFLAGS_AF   0x00000010 /* Auxillary carry Flag */
+#define X86_EFLAGS_ZF   0x00000040 /* Zero Flag */
+#define X86_EFLAGS_SF   0x00000080 /* Sign Flag */
+#define X86_EFLAGS_TF   0x00000100 /* Trap Flag */
+#define X86_EFLAGS_IF   0x00000200 /* Interrupt Flag */
+#define X86_EFLAGS_DF   0x00000400 /* Direction Flag */
+#define X86_EFLAGS_OF   0x00000800 /* Overflow Flag */
+#define X86_EFLAGS_IOPL 0x00003000 /* IOPL mask */
+#define X86_EFLAGS_NT   0x00004000 /* Nested Task */
+#define X86_EFLAGS_RF   0x00010000 /* Resume Flag */
+#define X86_EFLAGS_VM   0x00020000 /* Virtual Mode */
+#define X86_EFLAGS_AC   0x00040000 /* Alignment Check */
+#define X86_EFLAGS_VIF  0x00080000 /* Virtual Interrupt Flag */
+#define X86_EFLAGS_VIP  0x00100000 /* Virtual Interrupt Pending */
+#define X86_EFLAGS_ID   0x00200000 /* CPUID detection flag */
+
+#endif /* __ASM_I386_PROCESSOR_FLAGS_H */
diff -r bd0a803d9948 include/asm-i386/irqflags.h
--- a/include/asm-i386/irqflags.h   Thu Mar 22 14:13:31 2007 +1100
+++ b/include/asm-i386/irqflags.h   Thu Mar 22 14:47:21 2007 +1100
@@ -9,6 +9,7 @@
  */
 #ifndef _ASM_IRQFLAGS_H
 #define _ASM_IRQFLAGS_H
+#include 
 
 #ifndef __ASSEMBLY__
 static inline unsigned long native_save_fl(void)
@@ -119,7 +120,7 @@ static inline unsigned long __raw_local_
 
 static inline int raw_irqs_disabled_flags(unsigned long flags)
 {
-   return !(flags & (1 << 9));
+   return !(flags & X86_EFLAGS_IF);
 }
 
 static inline int raw_irqs_disabled(void)
diff -r bd0a803d9948 include/asm-i386/processor.h
--- a/include/asm-i386/processor.h  Thu Mar 22 14:13:31 2007 +1100
+++ b/include/asm-i386/processor.h  Thu Mar 22 14:31:58 2007 +1100
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* flag for disabling the tsc */
 extern int tsc_disable;
@@ -125,27 +126,6 @@ extern void detect_ht(struct cpuinfo_x86
 #else
 static inline void detect_ht(struct cpuinfo_x86 *c) {}
 #endif
-
-/*
- * EFLAGS bits
- */
-#define X86_EFLAGS_CF   0x00000001 /* Carry Flag */
-#define X86_EFLAGS_PF   0x00000004 /* Parity Flag */
-#define X86_EFLAGS_AF   0x00000010 /* Auxillary carry Flag */
-#define X86_EFLAGS_ZF   0x00000040 /* Zero Flag */
-#define X86_EFLAGS_SF   0x00000080 /* Sign Flag */
-#define X86_EFLAGS_TF   0x00000100 /* Trap Flag */
-#define X86_EFLAGS_IF   0x00000200 /* Interrupt Flag */
-#define X86_EFLAGS_DF   0x00000400 /* Direction Flag */
-#define X86_EFLAGS_OF   0x00000800 /* Overflow Flag */
-#define X86_EFLAGS_IOPL 0x00003000 /* IOPL mask */
-#define X86_EFLAGS_NT   0x00004000 /* Nested Task */
-#define X86_EFLAGS_RF   0x00010000 /* Resume Flag */
-#define X86_EFLAGS_VM   0x00020000 /* Virtual Mode */
-#define X86_EFLAGS_AC   0x00040000 /* Alignment Check */
-#define X86_EFLAGS_VIF  0x00080000 /* Virtual Interrupt Flag */
-#define X86_EFLAGS_VIP  0x00100000 /* Virtual Interrupt Pending */
-#define X86_EFLAGS_ID   0x00200000 /* CPUID detection flag */
 
 static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
 unsigned int *ecx, unsigned int *edx)
diff -r bd0a803d9948 arch/i386/lguest/interrupts_and_traps.c
--- a/arch/i386/lguest/interrupts_and_traps.c   Thu Mar 22 14:13:31 2007 +1100
+++ b/arch/i386/lguest/interrupts_and_traps.c   Thu Mar 22 14:46:15 2007 +1100
@@ -42,7 +42,7 @@ static void reflect_trap(struct lguest *
   (it's always 0, since irqs are enabled when guest is running). */
eflags = regs->eflags;
get_user(irq_enable, >lguest_data->irq_enabled);
-   eflags |= (irq_enable & 512);
+   eflags |= (irq_enable & X86_EFLAGS_IF);
 
push_guest_stack(lg, , eflags);
push_guest_stack(lg, , regs->cs);
@@ -86,7 +86,7 @@ void maybe_do_interrupt(struct lguest *l
/* If they're halted, we re-enable interrupts. */
if (lg->halted) {
/* Re-enable interrupts. */
-   put_user(512, >lguest_data->irq_enabled);
+   put_user(X86_EFLAGS_IF, 

Re: Oops after cd /sys/.../cpufreq/; rmmod; cat stats/time_in_state

2007-03-21 Thread Dave Jones
On Wed, Mar 21, 2007 at 08:07:53PM -0700, Greg KH wrote:
 
 > > After modprobe/rmmod cpufreq/stats directory appears but doesn't get
 > > removed. Should it?
 > Well, one can argue that those stats should never be in sysfs at all
 > anyway, I mean come on, a histogram in sysfs?  That's, not ok.

Meh, it's only a cheesy debug thing, so it's not really that big a deal imo.
it could probably move to debugfs (we didn't have that when it was merged iirc)
I doubt anyone really cares enough to bother though I wouldn't
be averse to a patch.
To the best of my knowledge, nothing in userspace is relying on that stuff
being present (it'd have to cope with it not being there anyway given it's
optional).

Dave

-- 
http://www.codemonkey.org.uk


Re: [1/6] 2.6.21-rc4: known regressions

2007-03-21 Thread Linus Torvalds


On Sun, 18 Mar 2007, Adrian Bunk wrote:
> 
> Subject: weird system hangs
> References : http://lkml.org/lkml/2007/3/16/288
> Submitter  : Michal Piotrowski <[EMAIL PROTECTED]>
>  Mariusz Kozlowski <[EMAIL PROTECTED]>
> Status : unknown

According to the console log, it seems to be hung because a lot of 
processes are stuck in D state in various variations of this:

Call Trace:
 [] start_this_handle+0x2d7/0x355
 [] journal_start+0xb3/0xe1
 [] ext3_journal_start_sb+0x48/0x4a
 [] ext3_create+0x47/0xe2
 [] vfs_create+0xcd/0x13e
 [] open_namei+0x176/0x5b5
 [] do_filp_open+0x26/0x3b
 [] do_sys_open+0x43/0xc2
 [] sys_open+0x1c/0x1e
 [] syscall_call+0x7/0xb

and then you have "kget" (whatever that is) which is doing

Call Trace:
 [] schedule_timeout+0x70/0x8e
 [] schedule_timeout_uninterruptible+0x15/0x17
 [] journal_stop+0xe2/0x1e6
 [] journal_force_commit+0x1d/0x1f
 [] ext3_force_commit+0x22/0x24
 [] ext3_write_inode+0x34/0x3a
 [] __writeback_single_inode+0x1c5/0x2cb
 [] sync_inode+0x1c/0x2e
 [] ext3_sync_file+0xab/0xc0
 [] do_fsync+0x4b/0x98
 [] __do_fsync+0x20/0x2f
 [] sys_fdatasync+0x10/0x12
 [] syscall_call+0x7/0xb

with kjournald in D sleep at

 [] journal_commit_transaction+0x15d/0x11d3
 [] kjournald+0xab/0x1e8
 [] kthread+0xb5/0xe0
 [] kernel_thread_helper+0x7/0x10

which certainly looks like something is waiting for an IO to finish.

In contrast, the hang reported by Mariusz Kozlowski has a slightly 
different feel to it, but there's a tantalizing pattern in there too:

  http://www.ussg.iu.edu/hypermail/linux/kernel/0703.0/1243.html

Call Trace:
[] io_schedule+0x42/0x59
[] sleep_on_buffer+0x8/0xc
[] __wait_on_bit+0x47/0x6c
[] out_of_line_wait_on_bit+0x5b/0x64
[] __wait_on_buffer+0x27/0x2d
[] journal_commit_transaction+0x707/0x127f
[] kjournald+0xac/0x1ed
[] kthread+0xa2/0xc9
[] kernel_thread_helper+0x7/0x1c

which certainly also looks like an IO never completed (or completed but 
never woke anything up).

It also seems to be related to *buffers*. Maybe the whole bh layer thing 
is a fluke, but it's not waiting for normal data, it's very much waiting 
for those journal things that all use buffer heads.Which just makes me 
worry about those patches by Nick (which did come in through Andrew). I 
don't think it's the memorder one (it looks safe and shouldn't matter on 
x86 anyway!), but what about the

fs: fix __block_write_full_page error case buffer submission

locking change for example? Or that "fs: fix nobh data leak" thing with 
its fix? It uses "SetPageUptodate(page);" without waking up anybody who 
might wait for it (but the waiters here seem to wait on buffers, so that's 
probably not it)..

Alternatively, maybe it really is an _io_ problem (and the buffer-head 
thing is just a red herring, and it could happen to other IO, it's just 
that metadata IO uses buffer heads), and it's the scheduler changes since 
2.6.20..

Jens, Nick.. Could you take a look?

Linus


Re: [BUG] no boot with 2.6.21-rc3 and later

2007-03-21 Thread Bob Tracy
john stultz wrote:
> > > Also, does booting w/ "clocksource=jiffies" change the behavior?

Works fine with 2.6.21-rc4.  I'm running on that kernel as I type this.

> Also trying booting w/ "notsc" would be a useful data point.

Boot hangs at the point indicated in my original message.  I *did*
notice the blurb in the console messages about the pit clocksource
being selected/used.  There was also a complaint about it being
unstable, with a negative delta.  I think these messages are consistent
with the pre-bad-commit case other than where they appear in the boot
messages.

> (...) a pre-bad-commit dmesg would help.

Sent under separate cover to John.

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: [PATCH] Define EFLAGS_IF

2007-03-21 Thread Rusty Russell
On Thu, 2007-03-22 at 14:16 +1100, Rusty Russell wrote:
> There is now more than one place where we use the fact that bit 9 of
> eflags is the interrupt-enabled flag, so define EFLAGS_IF.  We make it
> 512 so it can be used in asm, too.

Belay this: there's a X86_EFLAGS_IF in asm/processor.h which we should
use.  Will send patch.

Thanks to Jeremy Fitzhardinge for the clue.

Rusty.




Re: Oops after cd /sys/.../cpufreq/; rmmod; cat stats/time_in_state

2007-03-21 Thread Greg KH
On Tue, Mar 20, 2007 at 01:06:34PM +0300, Alexey Dobriyan wrote:
> On Mon, Mar 19, 2007 at 01:41:25PM -0700, Greg KH wrote:
> > On Mon, Mar 19, 2007 at 06:30:13PM +0300, Alexey Dobriyan wrote:
> > > Steps to reproduce:
> > >
> > >   # modprobe p4-clockmod
> > >   $ cd /sys/devices/system/cpu/cpu0/cpufreq/
> > >   # rmmod p4-clockmod
> > >   $ cat stats/time_in_state
> > >   Segmentation fault
> >
> > Has this always happened?  Or is it new?
> 
> I've checked 2.6.17 and up and it happens too.
> 
> Some .config peculiarities:
> 
>   CONFIG_CPU_FREQ=y
>   CONFIG_CPU_FREQ_STAT=y
>   CONFIG_X86_P4_CLOCKMOD=m
> 
> After modprobe/rmmod cpufreq/stats directory appears but doesn't get
> removed. Should it?

Yes.

Well, one can argue that those stats should never be in sysfs at all
anyway, I mean come on, a histogram in sysfs?  That's, not ok.

thanks,

greg k-h


Re: [PATCH 2/2] Replace pid_t in autofs with struct pid reference

2007-03-21 Thread Ian Kent
On Wed, 2007-03-21 at 21:19 -0500, Serge E. Hallyn wrote:
> Quoting Ian Kent ([EMAIL PROTECTED]):
> > On Tue, 2007-03-20 at 16:01 -0600, Eric W. Biederman wrote:
> > > "Serge E. Hallyn" <[EMAIL PROTECTED]> writes:
> > > 
> > > >> >  void autofs4_dentry_release(struct dentry *);
> > > >> >  extern void autofs4_kill_sb(struct super_block *);
> > > >> > diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> > > >> > index 9857543..4a9ad9b 100644
> > > >> > --- a/fs/autofs4/waitq.c
> > > >> > +++ b/fs/autofs4/waitq.c
> > > >> > @@ -141,8 +141,8 @@ static void autofs4_notify_daemon(struct
> > > >> >  packet->ino = wq->ino;
> > > >> >  packet->uid = wq->uid;
> > > >> >  packet->gid = wq->gid;
> > > >> > -packet->pid = wq->pid;
> > > >> > -packet->tgid = wq->tgid;
> > > >> > +packet->pid = pid_nr(wq->pid);
> > > >> > +packet->tgid = pid_nr(wq->tgid);
> > > >> >  break;
> > > >> 
> > > >> I'm assuming we build the packet in the process context of the
> > > >> daemon we are sending it to.  If not we have a problem here.
> > > >
> > > > Yes this is data being sent to a userspace daemon (Ian pls correct me if
> > > > I'm wrong) so the pid_nr is the only thing we can send.
> > > 
> > > Agreed.  The question is are we in the user space daemon's process when
> > > we generate the pid_nr.  Or do we stuff this in some kind of socket,
> > > and the socket switch locations of the packet.
> > 
> > The context here is the automount daemon only for expire runs.
> > 
> > Mount request packets are triggered by user processes walking over an
> > autofs mount point directory. So "current" in this case isn't the autofs
> > daemon.
> > 
> > Requests are sent via a pipe to the daemon.
> 
> So is the pid used for anything other than debugging?
> 
> In any case, here is a replacement patch which sends the pid number
> in the pid_namespace of the process which did the autofs4 mount.
> 
> Still not sure whether that is actually what makes sense...
> 
> From: "Serge E. Hallyn" <[EMAIL PROTECTED]>
> Subject: [PATCH] autofs: prevent pid wraparound in waitqs
> 
> Instead of storing pid numbers for waitqs, store references
> to struct pids.  Also store a reference to the mounter's pid
> namespace in the autofs4 sb info so that pid numbers for
> mount miss and expiry msgs can send the pid# in the mounter's
> pidns.

I think this amounts to what I suggested in my previous replies.
Hopefully my comments are enough to clear up any questions on
correctness of this approach.

Sorry to be a pain but I'm having a little trouble reviewing the patch
because I'm not clear on where the code to handle the automount process
group (so called oz_pgrp), from the first patch, fits in with this.

Is this patch in addition to the original?
If so are the references to pid_nr still OK?

Ian

> 
> Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]>
> 
> ---
> 
>  fs/autofs4/autofs_i.h |   14 --
>  fs/autofs4/inode.c|5 +
>  fs/autofs4/waitq.c|   12 ++--
>  include/linux/pid.h   |1 +
>  kernel/pid.c  |   13 ++---
>  5 files changed, 34 insertions(+), 11 deletions(-)
> 
> e4184e6923f811f8a025b831ea33541fa820fd62
> diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
> index 3ccec0a..55026dd 100644
> --- a/fs/autofs4/autofs_i.h
> +++ b/fs/autofs4/autofs_i.h
> @@ -79,8 +79,8 @@ struct autofs_wait_queue {
>   u64 ino;
>   uid_t uid;
>   gid_t gid;
> - pid_t pid;
> - pid_t tgid;
> + struct pid *pid;
> + struct pid *tgid;
>   /* This is for status reporting upon return */
>   int status;
>   atomic_t wait_ctr;
> @@ -97,6 +97,7 @@ struct autofs_sb_info {
>   int pipefd;
>   struct file *pipe;
>   struct pid *oz_pgrp;
> + struct pid_namespace *pidns;
>   int catatonic;
>   int version;
>   int sub_version;
> @@ -228,5 +229,14 @@ out:
>   return ret;
>  }
>  
> +static inline void autofs_free_wait_queue(struct autofs_wait_queue *wq)
> +{
> + if (wq->pid)
> + put_pid(wq->pid);
> + if (wq->tgid)
> + put_pid(wq->tgid);
> + kfree(wq);
> +}
> +
>  void autofs4_dentry_release(struct dentry *);
>  extern void autofs4_kill_sb(struct super_block *);
> diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
> index c34131a..294efd8 100644
> --- a/fs/autofs4/inode.c
> +++ b/fs/autofs4/inode.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "autofs_i.h"
>  #include 
>  
> @@ -164,6 +165,7 @@ void autofs4_kill_sb(struct super_block 
>   autofs4_catatonic_mode(sbi); /* Free wait queues, close pipe */
>  
>   put_pid(sbi->oz_pgrp);
> + put_pid_ns(sbi->pidns);
>  
>   /* Clean up and release dangling references */
>   autofs4_force_release(sbi);
> @@ -334,6 +336,8 @@ int autofs4_fill_super(struct super_bloc
>   sbi->type = 0;
>   sbi->min_proto = 0;
>   

[PATCH] Define EFLAGS_IF

2007-03-21 Thread Rusty Russell
There is now more than one place where we use the fact that bit 9 of
eflags is the interrupt-enabled flag, so define EFLAGS_IF.  We make it
512 so it can be used in asm, too.
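
For illustration (not part of the patch), a minimal userspace sketch of the bit test described above. The names EFLAGS_IF_SKETCH and irqs_disabled_flags_sketch() are made up for this example, and the flags value is passed in rather than read from the CPU.

```c
#include <assert.h>

/* Bit 9 of EFLAGS is the interrupt-enable flag: 1 << 9 == 512 == 0x200. */
#define EFLAGS_IF_SKETCH 512UL

/* The plain integer and the shifted form name the same bit. */
static_assert(EFLAGS_IF_SKETCH == (1UL << 9), "IF is bit 9 of EFLAGS");

/* Userspace stand-in for raw_irqs_disabled_flags(): nonzero when IF is clear. */
static inline int irqs_disabled_flags_sketch(unsigned long flags)
{
	return !(flags & EFLAGS_IF_SKETCH);
}
```

With these definitions a saved flags word of 0x202 reports interrupts enabled, and 0x002 reports them disabled.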

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

--- a/arch/i386/lguest/lguest.c
+++ b/arch/i386/lguest/lguest.c
@@ -107,9 +107,8 @@ static void fastcall irq_disable(void)
 
 static void fastcall irq_enable(void)
 {
-   /* Linux i386 code expects bit 9 set. */
/* FIXME: Check if interrupt pending... */
-   lguest_data.irq_enabled = 512;
+   lguest_data.irq_enabled = EFLAGS_IF;
 }
 
 static void fastcall lguest_load_gdt(const struct Xgt_desc_struct *desc)
@@ -394,7 +393,7 @@ static fastcall void lguest_write_idt_en
extern const char start_##name[], end_##name[]; \
asm("start_" #name ": " code "; end_" #name ":")
 DEF_LGUEST(cli, "movl $0," LGUEST_IRQ);
-DEF_LGUEST(sti, "movl $512," LGUEST_IRQ);
+DEF_LGUEST(sti, "movl $"__stringify(EFLAGS_IF)"," LGUEST_IRQ);
 DEF_LGUEST(popf, "movl %eax," LGUEST_IRQ);
 DEF_LGUEST(pushf, "movl " LGUEST_IRQ ",%eax");
 DEF_LGUEST(pushf_cli, "movl " LGUEST_IRQ ",%eax; movl $0," LGUEST_IRQ);
===
--- a/include/asm-i386/irqflags.h
+++ b/include/asm-i386/irqflags.h
@@ -87,6 +87,9 @@ static inline unsigned long __raw_local_
 #endif /* __ASSEMBLY__ */
 #endif /* CONFIG_PARAVIRT */
 
+/* Bit 9 of eflags means interrupts are enabled: a raw int for asm. */
+#define EFLAGS_IF 512
+
 #ifndef __ASSEMBLY__
 #define raw_local_save_flags(flags) \
do { (flags) = __raw_local_save_flags(); } while (0)
@@ -96,7 +99,7 @@ static inline unsigned long __raw_local_
 
 static inline int raw_irqs_disabled_flags(unsigned long flags)
 {
-   return !(flags & (1 << 9));
+   return !(flags & EFLAGS_IF);
 }
 
 static inline int raw_irqs_disabled(void)
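
As a side note on the DEF_LGUEST(sti, ...) change above, here is a small userspace sketch of how a stringify macro pastes a numeric constant into an asm template string. STRINGIFY() and EFLAGS_IF_SKETCH are stand-ins written for this example (the kernel's own macro is __stringify), and the LGUEST_IRQ operand is omitted because its definition is not shown here.

```c
#include <assert.h>
#include <string.h>

#define EFLAGS_IF_SKETCH 512

/* Two-level expansion so the argument is macro-expanded before '#' applies. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Adjacent string literals are concatenated by the compiler. */
static const char sti_insn[] = "movl $" STRINGIFY(EFLAGS_IF_SKETCH) ",";

/* Returns nonzero when the template carries the literal 512. */
static int sti_uses_literal_512(void)
{
	return strcmp(sti_insn, "movl $512,") == 0;
}
```

Defining the constant as a bare integer means the stringified text is just "512"; an expression such as (1 << 9) would be pasted into the template verbatim instead.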




[PATCH] lguest: Compile hypervisor.S into the lg module directly

2007-03-21 Thread Rusty Russell
Because of legacy-induced blindness, I insisted on separately building
the hypervisor.S switcher code (which is mapped at 0xFFC00000 in host
and guest).  However, the lguest64 patches showed the error of my
ways: it has no relocations, so it can be linked into the module like
normal then remapped.

The only downside is that we can no longer use sizeof(hypervisor_blob),
so we need to allocate our page-array dynamically.
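
A rough userspace sketch of the two calculations this implies, under stated assumptions: a 4096-byte page, calloc() standing in for the kernel allocator, and helper names (total_hype_pages(), alloc_hype_page_array(), remapped_addr()) invented for this example. The text size comes from link-time symbols, so it is not a compile-time constant and cannot dimension a static array.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SIZE_SKETCH 4096UL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Pages for the switcher text itself, then two pages per CPU. */
static size_t total_hype_pages(size_t text_bytes, unsigned int ncpus)
{
	return DIV_ROUND_UP(text_bytes, PAGE_SIZE_SKETCH) + 2 * (size_t)ncpus;
}

/*
 * The page count is only known at run time, so the page-pointer array is
 * allocated dynamically. Returns NULL on allocation failure.
 */
static void **alloc_hype_page_array(size_t text_bytes, unsigned int ncpus)
{
	size_t n = total_hype_pages(text_bytes, ncpus);

	assert(n > 0);
	return calloc(n, sizeof(void *));
}

/* Address of a symbol once the text is remapped from link_start to map_addr. */
static unsigned long remapped_addr(unsigned long sym, unsigned long link_start,
				   unsigned long map_addr)
{
	return sym + (map_addr - link_start);
}
```

The last helper mirrors the idea behind hype_offset() in the patch: because the switcher has no relocations, a symbol's address in the mapped copy is its linked address plus a single constant offset.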

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r 9d462a93e1fa arch/i386/lguest/Makefile
--- a/arch/i386/lguest/Makefile Wed Mar 21 08:56:52 2007 +1100
+++ b/arch/i386/lguest/Makefile Thu Mar 22 11:43:20 2007 +1100
@@ -4,19 +4,4 @@ obj-$(CONFIG_LGUEST_GUEST) += lguest.o l
 # Host requires the other files, which can be a module.
 obj-$(CONFIG_LGUEST)   += lg.o
 lg-objs := core.o hypercalls.o page_tables.o interrupts_and_traps.o \
-   segments.o io.o lguest_user.o
-
-# We use top 4MB for hypervisor. */
-HYPE_ADDR := 0xFFC00000
-# The data is only 1k (256 interrupt handler pointers)
-HYPE_DATA_SIZE := 1024
-CFLAGS += -DHYPE_ADDR="$(HYPE_ADDR)" -DHYPE_DATA_SIZE="$(HYPE_DATA_SIZE)"
-
-$(obj)/core.o: $(obj)/hypervisor-blob.c
-# This links the hypervisor in the right place and turns it into a C array.
-$(obj)/hypervisor-raw: $(obj)/hypervisor.o
-   @$(LD) -static -Tdata=`printf %#x $$(($(HYPE_ADDR)))` -Ttext=`printf %#x $$(($(HYPE_ADDR)+$(HYPE_DATA_SIZE)))` -o $@ $< && $(OBJCOPY) -O binary $@
-$(obj)/hypervisor-blob.c: $(obj)/hypervisor-raw
-   @od -tx1 -An -v $< | sed -e 's/^ /0x/' -e 's/$$/,/' -e 's/ /,0x/g' > $@
-
-clean-files := hypervisor-blob.c hypervisor-raw
+   segments.o io.o lguest_user.o hypervisor.o
diff -r 9d462a93e1fa arch/i386/lguest/core.c
--- a/arch/i386/lguest/core.c   Wed Mar 21 08:56:52 2007 +1100
+++ b/arch/i386/lguest/core.c   Thu Mar 22 11:44:17 2007 +1100
@@ -19,17 +19,21 @@
 #include 
 #include "lg.h"
 
-/* This is our hypervisor, compiled from hypervisor.S. */
-static char __initdata hypervisor_blob[] = {
-#include "hypervisor-blob.c"
-};
+/* Found in hypervisor.S */
+extern char start_hyper_text[], end_hyper_text[], switch_to_guest[];
+extern unsigned long default_idt_entries[];
 
 /* Every guest maps the core hypervisor blob. */
-#define SHARED_HYPERVISOR_PAGES DIV_ROUND_UP(sizeof(hypervisor_blob),PAGE_SIZE)
+#define SHARED_HYPERVISOR_PAGES \
+   DIV_ROUND_UP(end_hyper_text - start_hyper_text, PAGE_SIZE)
+/* Pages for hypervisor itself, then two pages per cpu */
+#define TOTAL_HYPE_PAGES (SHARED_HYPERVISOR_PAGES + 2 * NR_CPUS)
+
+/* We map at -4M for ease of mapping into the guest (one PTE page). */
+#define HYPE_ADDR 0xFFC00000
 
 static struct vm_struct *hypervisor_vma;
-/* Pages for hypervisor itself, then two pages per cpu */
-static struct page *hype_page[SHARED_HYPERVISOR_PAGES+2*NR_CPUS];
+static struct page **hype_page;
 
 static int cpu_had_pge;
 static struct {
@@ -43,16 +47,10 @@ static DEFINE_PER_CPU(struct lguest *, l
 #define MAX_LGUEST_GUESTS 16
 struct lguest lguests[MAX_LGUEST_GUESTS];
 
-/* IDT entries are at start of hypervisor. */
-static const unsigned long *lguest_default_idt_entries(void)
-{
-   return (void *)HYPE_ADDR;
-}
-
-/* Next is switch_to_guest */
-static void *__lguest_switch_to_guest(void)
-{
-   return (void *)HYPE_ADDR + HYPE_DATA_SIZE;
+/* Offset from where hypervisor.S was compiled to where we've copied it */
+static unsigned long hype_offset(void)
+{
+   return HYPE_ADDR - (unsigned long)start_hyper_text;
 }
 
 /* This cpu's struct lguest_pages. */
@@ -65,9 +63,15 @@ static __init int map_hypervisor(void)
 static __init int map_hypervisor(void)
 {
int i, err;
-   struct page **pagep = hype_page;
-
-   for (i = 0; i < ARRAY_SIZE(hype_page); i++) {
+   struct page **pagep;
+
+   hype_page = kmalloc(sizeof(hype_page[0])*TOTAL_HYPE_PAGES, GFP_KERNEL);
+   if (!hype_page) {
+   err = -ENOMEM;
+   goto out;
+   }
+
+   for (i = 0; i < TOTAL_HYPE_PAGES; i++) {
unsigned long addr = get_zeroed_page(GFP_KERNEL);
if (!addr) {
err = -ENOMEM;
@@ -76,7 +80,7 @@ static __init int map_hypervisor(void)
hype_page[i] = virt_to_page(addr);
}
 
-   hypervisor_vma = __get_vm_area(ARRAY_SIZE(hype_page) * PAGE_SIZE,
+   hypervisor_vma = __get_vm_area(TOTAL_HYPE_PAGES * PAGE_SIZE,
   VM_ALLOC, HYPE_ADDR, VMALLOC_END);
if (!hypervisor_vma) {
err = -ENOMEM;
@@ -84,12 +88,18 @@ static __init int map_hypervisor(void)
goto free_pages;
}
 
+   pagep = hype_page;
err = map_vm_area(hypervisor_vma, PAGE_KERNEL, &pagep);
if (err) {
printk("lguest: map_vm_area failed: %i\n", err);
goto free_vma;
}
-   memcpy(hypervisor_vma->addr, hypervisor_blob, sizeof(hypervisor_blob));
+   memcpy(hypervisor_vma->addr, start_hyper_text,
+  
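
The sizing logic in this patch can be sketched in userspace C. This is only an illustration under stated assumptions: the PAGE_SIZE and NR_CPUS values, the helper names total_hype_pages() and alloc_page_array(), and the use of calloc() in place of kmalloc() are stand-ins, not lguest code.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL   /* assumed page size (i386) */
#define NR_CPUS   4        /* assumed; the kernel fixes this at build time */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Pages for a hypervisor image of text_len bytes, plus two pages per cpu. */
static size_t total_hype_pages(size_t text_len)
{
    return DIV_ROUND_UP(text_len, PAGE_SIZE) + 2 * NR_CPUS;
}

/*
 * The image length is a difference of two link-time symbols, so it is not
 * a compile-time constant and cannot size a static array; allocate instead.
 * Returns NULL on failure. void * stands in for struct page * here.
 */
static void **alloc_page_array(size_t text_len)
{
    return calloc(total_hype_pages(text_len), sizeof(void *));
}
```

With these assumed values, a 4097-byte image needs two shared pages plus eight per-cpu pages.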

[PATCH] lguest: clean up some "references .init.text" warnings

2007-03-21 Thread Rusty Russell
Thanks to Andrew for pointing these out.

This patch moves the paravirtprobe section into .init.data: it's only
used in very very early boot, and for similar reasons, puts
lguest_maybe_init and lguest_memory_setup in .init.text.

As well as fixing some warnings, this frees up a tiny bit more memory.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r ecec388180b2 arch/i386/kernel/vmlinux.lds.S
--- a/arch/i386/kernel/vmlinux.lds.S Mon Mar 19 14:58:08 2007 +1100
+++ b/arch/i386/kernel/vmlinux.lds.S Tue Mar 20 12:10:39 2007 +1100
@@ -81,12 +81,6 @@ SECTIONS
CONSTRUCTORS
} :data
 
-  .paravirtprobe : AT(ADDR(.paravirtprobe) - LOAD_OFFSET) {
-   __start_paravirtprobe = .;
-   *(.paravirtprobe)
-   __stop_paravirtprobe = .;
-  }
-
   . = ALIGN(4096);
   .data_nosave : AT(ADDR(.data_nosave) - LOAD_OFFSET) {
__nosave_begin = .;
@@ -151,7 +145,12 @@ SECTIONS
*(.init.text)
_einittext = .;
   }
-  .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) { *(.init.data) }
+  .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) {
+   *(.init.data)
+   __start_paravirtprobe = .;
+   *(.paravirtprobe)
+   __stop_paravirtprobe = .;
+  }
   . = ALIGN(16);
   .init.setup : AT(ADDR(.init.setup) - LOAD_OFFSET) {
__setup_start = .;
diff -r ecec388180b2 arch/i386/lguest/lguest.c
--- a/arch/i386/lguest/lguest.c Mon Mar 19 14:58:08 2007 +1100
+++ b/arch/i386/lguest/lguest.c Tue Mar 20 11:34:07 2007 +1100
@@ -136,7 +136,7 @@ static struct notifier_block paniced = {
.notifier_call = lguest_panic
 };
 
-static char *lguest_memory_setup(void)
+static __init char *lguest_memory_setup(void)
 {
/* We do this here because lockcheck barfs if before start_kernel */
atomic_notifier_chain_register(&panic_notifier_list, &paniced);
@@ -549,7 +549,8 @@ static __attribute_used__ __init void lg
start_kernel();
 }
 
-asm("lguest_maybe_init:\n"
+asm(".section .init.text\n"
+"lguest_maybe_init:\n"
 "  cmpl $"__stringify(LGUEST_MAGIC_EBP)", %ebp\n"
 "  jne 1f\n"
 "  cmpl $"__stringify(LGUEST_MAGIC_EDI)", %edi\n"


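
The __start_paravirtprobe/__stop_paravirtprobe pair in the linker script brackets a table that early boot code walks. A rough userspace analogue, assuming an ELF target and a GNU-compatible toolchain (which defines __start_X and __stop_X automatically for a section X whose name is a valid C identifier), might look like the following. The section name "probes" and every function name here are invented for illustration.

```c
#include <stddef.h>

typedef int (*probe_fn)(void);

/* Place a pointer to fn in the "probes" section (hypothetical name). */
#define REGISTER_PROBE(fn) \
    static probe_fn probe_entry_##fn \
        __attribute__((used, section("probes"))) = fn

static int probe_native(void) { return 1; }
static int probe_guest(void)  { return 2; }

REGISTER_PROBE(probe_native);
REGISTER_PROBE(probe_guest);

/* Provided by the linker, not by any C definition in this file. */
extern probe_fn __start_probes[];
extern probe_fn __stop_probes[];

/* Call every registered probe, add up the results, return the probe count. */
static size_t run_probes(int *sum)
{
    size_t n = 0;

    for (probe_fn *p = __start_probes; p < __stop_probes; p++, n++)
        *sum += (*p)();
    return n;
}
```

In the kernel the markers are written out explicitly in vmlinux.lds.S, as the hunk above shows; placing them inside .init.data means the table is discarded along with the rest of the init data.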


[PATCH] make KVM conform to sucky rdmsr interface

2007-03-21 Thread Rusty Russell
Grrr Andi refused to take my "rdmsr64" patch which moved to a
function-like interface for MSRs, dismissing it as pointless churn.

paravirt_ops cleanups changed a macro to an inline and spotted this
kvm bug.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r 47c6ee74a5c5 drivers/kvm/vmx.c
--- a/drivers/kvm/vmx.c Thu Mar 22 12:57:44 2007 +1100
+++ b/drivers/kvm/vmx.c Thu Mar 22 13:38:24 2007 +1100
@@ -1127,7 +1127,7 @@ static int vmx_vcpu_setup(struct kvm_vcp
u64 data;
int j = vcpu->nmsrs;
 
-   if (rdmsr_safe(index, &data_low, &data_high) < 0)
+   if (rdmsr_safe(index, data_low, data_high) < 0)
continue;
if (wrmsr_safe(index, data_low, data_high) < 0)
continue;
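
The difference between a macro-style and a function-style MSR read can be sketched as follows. This is a userspace illustration only: fake_read_msr(), rdmsr_macro() and rdmsr_func() are invented names, and neither definition is claimed to match any kernel tree's rdmsr_safe.

```c
#include <stdint.h>

/* Stand-in for the rdmsr instruction: returns a made-up 64-bit value. */
static uint64_t fake_read_msr(uint32_t index)
{
    return ((uint64_t)index << 32) | 0xabcdu;
}

/* Macro style: lo and hi are named directly and assigned to, with no '&'. */
#define rdmsr_macro(index, lo, hi) do {          \
        uint64_t v_ = fake_read_msr(index);      \
        (lo) = (uint32_t)v_;                     \
        (hi) = (uint32_t)(v_ >> 32);             \
    } while (0)

/* Function style: the caller passes addresses and the compiler checks them. */
static inline void rdmsr_func(uint32_t index, uint32_t *lo, uint32_t *hi)
{
    uint64_t v = fake_read_msr(index);

    *lo = (uint32_t)v;
    *hi = (uint32_t)(v >> 32);
}
```

A caller written for one convention does not fit the other, and once arguments pass through a typed inline the mismatch shows up at compile time.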




[PATCH] Cleanup: rationalize paravirt wrappers

2007-03-21 Thread Rusty Russell
paravirt.c used to implement native versions of all low-level
functions.  Far cleaner is to have the native versions exposed in the
headers and as inline native_XXX, and if !CONFIG_PARAVIRT, then simply
#define XXX native_XXX.

There are four nice side effects:

1) write_dt_entry() now takes the correct "struct Xgt_desc_struct *"
   not "void *".

2) load_TLS goes back to using a for loop, rather than being manually
   unrolled with a #error in case the bounds ever change.

3) Macros become inlines, with type checking.

4) Access to the native versions is trivial for KVM, lguest, Xen and
   others who might want it.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

---
 arch/i386/kernel/paravirt.c  |  289 --
 include/asm-i386/desc.h  |   90 -
 include/asm-i386/io.h|   17 +-
 include/asm-i386/irqflags.h  |   61 ++--
 include/asm-i386/msr.h   |  156 ++
 include/asm-i386/paravirt.h  |   16 +-
 include/asm-i386/processor.h |   96 ++---
 include/asm-i386/system.h|  139 
 8 files changed, 383 insertions(+), 481 deletions(-)

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -93,292 +93,9 @@ static unsigned native_patch(u8 type, u1
return insn_len;
 }
 
-static unsigned long native_get_debugreg(int regno)
-{
-   unsigned long val = 0;  /* Damn you, gcc! */
-
-   switch (regno) {
-   case 0:
-   asm("movl %%db0, %0" :"=r" (val)); break;
-   case 1:
-   asm("movl %%db1, %0" :"=r" (val)); break;
-   case 2:
-   asm("movl %%db2, %0" :"=r" (val)); break;
-   case 3:
-   asm("movl %%db3, %0" :"=r" (val)); break;
-   case 6:
-   asm("movl %%db6, %0" :"=r" (val)); break;
-   case 7:
-   asm("movl %%db7, %0" :"=r" (val)); break;
-   default:
-   BUG();
-   }
-   return val;
-}
-
-static void native_set_debugreg(int regno, unsigned long value)
-{
-   switch (regno) {
-   case 0:
-   asm("movl %0,%%db0" : /* no output */ :"r" (value));
-   break;
-   case 1:
-   asm("movl %0,%%db1" : /* no output */ :"r" (value));
-   break;
-   case 2:
-   asm("movl %0,%%db2" : /* no output */ :"r" (value));
-   break;
-   case 3:
-   asm("movl %0,%%db3" : /* no output */ :"r" (value));
-   break;
-   case 6:
-   asm("movl %0,%%db6" : /* no output */ :"r" (value));
-   break;
-   case 7:
-   asm("movl %0,%%db7" : /* no output */ :"r" (value));
-   break;
-   default:
-   BUG();
-   }
-}
-
 void init_IRQ(void)
 {
paravirt_ops.init_IRQ();
-}
-
-static void native_clts(void)
-{
-   asm volatile ("clts");
-}
-
-static unsigned long native_read_cr0(void)
-{
-   unsigned long val;
-   asm volatile("movl %%cr0,%0\n\t" :"=r" (val));
-   return val;
-}
-
-static void native_write_cr0(unsigned long val)
-{
-   asm volatile("movl %0,%%cr0": :"r" (val));
-}
-
-static unsigned long native_read_cr2(void)
-{
-   unsigned long val;
-   asm volatile("movl %%cr2,%0\n\t" :"=r" (val));
-   return val;
-}
-
-static void native_write_cr2(unsigned long val)
-{
-   asm volatile("movl %0,%%cr2": :"r" (val));
-}
-
-static unsigned long native_read_cr3(void)
-{
-   unsigned long val;
-   asm volatile("movl %%cr3,%0\n\t" :"=r" (val));
-   return val;
-}
-
-static void native_write_cr3(unsigned long val)
-{
-   asm volatile("movl %0,%%cr3": :"r" (val));
-}
-
-static unsigned long native_read_cr4(void)
-{
-   unsigned long val;
-   asm volatile("movl %%cr4,%0\n\t" :"=r" (val));
-   return val;
-}
-
-static unsigned long native_read_cr4_safe(void)
-{
-   unsigned long val;
-   /* This could fault if %cr4 does not exist */
-   asm("1: movl %%cr4, %0  \n"
-   "2: \n"
-   ".section __ex_table,\"a\"  \n"
-   ".long 1b,2b\n"
-   ".previous  \n"
-   : "=r" (val): "0" (0));
-   return val;
-}
-
-static void native_write_cr4(unsigned long val)
-{
-   asm volatile("movl %0,%%cr4": :"r" (val));
-}
-
-static unsigned long native_save_fl(void)
-{
-   unsigned long f;
-   asm volatile("pushfl ; popl %0":"=g" (f): /* no input */);
-   return f;
-}
-
-static void native_restore_fl(unsigned long f)
-{
-   asm volatile("pushl %0 ; popfl": /* no output */
-:"g" (f)
-:"memory", "cc");
-}
-
-static void native_irq_disable(void)
-{
-   asm volatile("cli": : :"memory");
-}
-
-static void 
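
The header layout described in this changelog (typed native_XXX inlines, with XXX defined as native_XXX when CONFIG_PARAVIRT is off) can be sketched in plain C. This is a userspace model under assumptions: fake_cr0 stands in for a real control register, and struct paravirt_ops_sketch is a cut-down, hypothetical ops table.

```c
/* Define CONFIG_PARAVIRT to route calls through the ops table instead. */
/* #define CONFIG_PARAVIRT 1 */

static unsigned long fake_cr0 = 0x11;   /* stand-in for a control register */

/* Native versions are exposed as typed inlines. */
static inline unsigned long native_read_cr0(void) { return fake_cr0; }
static inline void native_write_cr0(unsigned long val) { fake_cr0 = val; }

#ifdef CONFIG_PARAVIRT
struct paravirt_ops_sketch {
    unsigned long (*read_cr0)(void);
    void (*write_cr0)(unsigned long);
};

static struct paravirt_ops_sketch pv_ops_sketch = {
    .read_cr0  = native_read_cr0,
    .write_cr0 = native_write_cr0,
};

static inline unsigned long read_cr0(void) { return pv_ops_sketch.read_cr0(); }
static inline void write_cr0(unsigned long val) { pv_ops_sketch.write_cr0(val); }
#else
/* Without paravirt, the generic names are plain aliases for the natives. */
#define read_cr0  native_read_cr0
#define write_cr0 native_write_cr0
#endif
```

Either way the native_ helpers stay directly callable, which corresponds to side effect 4 in the list above.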

Re: [PATCH 2/2] Replace pid_t in autofs with struct pid reference

2007-03-21 Thread Ian Kent
On Wed, 2007-03-21 at 15:58 -0500, Serge E. Hallyn wrote:
> Quoting Eric W. Biederman ([EMAIL PROTECTED]):
> > "Serge E. Hallyn" <[EMAIL PROTECTED]> writes:
> > 
> > >> >  void autofs4_dentry_release(struct dentry *);
> > >> >  extern void autofs4_kill_sb(struct super_block *);
> > >> > diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> > >> > index 9857543..4a9ad9b 100644
> > >> > --- a/fs/autofs4/waitq.c
> > >> > +++ b/fs/autofs4/waitq.c
> > >> > @@ -141,8 +141,8 @@ static void autofs4_notify_daemon(struct
> > >> >packet->ino = wq->ino;
> > >> >packet->uid = wq->uid;
> > >> >packet->gid = wq->gid;
> > >> > -  packet->pid = wq->pid;
> > >> > -  packet->tgid = wq->tgid;
> > >> > +  packet->pid = pid_nr(wq->pid);
> > >> > +  packet->tgid = pid_nr(wq->tgid);
> > >> >break;
> > >> 
> > >> I'm assuming we build the packet in the process context of the
> > >> daemon we are sending it to.  If not we have a problem here.
> > >
> > > Yes this is data being sent to a userspace daemon (Ian pls correct me if
> > > I'm wrong) so the pid_nr is the only thing we can send.
> > 
> > Agreed.  The question is are we in the user space daemon's process when
> > we generate the pid_nr.  Or do we stuff this in some kind of socket,
> > and the socket switch locations of the packet.
> > 
> > Basically I'm just trying to be certain we are calling pid_nr in the
> > proper context.  Otherwise we could get the wrong pid when we have
> > multiple pid namespaces in play.
> 
> We need to know what the userspace daemon being written to is doing
> with autofs_ptype_{missing,expire}_{in,}direct() messages.

At the moment autofs only uses the packet->pid for logging purposes.
This solves an age old problem of not knowing who is causing mount
requests.

I'm not aware of any other applications that use version 5 yet but that
of course could change. So we can't really know what will be done with
these ids at some point in the future.

> 
> If I understand correctly, the pid being sent is of a process which
> tried to automount some directory.  The message is being sent to the
> autofs daemon, which should be running in the root pid namespace.

Yes, but it could be the autofs daemon itself in the expire case.

Usually it doesn't make sense to run an automounting application as
other than "root" but I'm not familiar with other possible userspace
applications. Perhaps User Mode Linux could be an issue?

> 
> So as it is, the pid_nr(wq->pid) should be done under the init
> pid_namespace, since it's a kthread.  So as long as the userspace
> automount daemon is started in the root pid namespace, the pid it gets
> will be the right one.
> 
> Ian, does what I'm saying make sense, or am I wrong about how things
> work for autofs?

Yep. That's the way it is.

> 
> thanks,
> -serge
> 
> PS
> Note that if I'm right, but some machine starts autofs in a child
> pid_namespace, the pid_nr() the way I have it is wrong.  I'm not sure in
> that case how we go about fixing that.  Somehow we need to store the
> autofs userspace daemon's pid namespace pointer to help us find the
> proper pid_nr.

In order for any user space application to use the module it must mount
the autofs file system, passing a file handle for the pipe to use for
communication. This must always be done. Can we grab the process pid
namespace at that time and store it in the superblock info structure?

How does this affect getting ids for waitq request packets of other user
space processes triggering mounts? I'm guessing that they would need to
belong to the appropriate namespace for this mechanism to function
correctly.

Ian
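
To make the concern concrete, here is a toy userspace model of one task having a different number in each pid namespace it is visible from. None of this is kernel API: every toy_ name, the fixed nesting limit and the "0 means not visible" convention are assumptions made for the sketch.

```c
#include <stddef.h>

#define MAX_NS_LEVEL 4   /* hypothetical nesting limit for this sketch */

struct toy_pid_ns {
    int level;                        /* 0 is the root namespace */
    const struct toy_pid_ns *parent;  /* NULL for the root */
};

struct toy_pid {
    const struct toy_pid_ns *ns;      /* innermost namespace of the task */
    int nr[MAX_NS_LEVEL];             /* nr[i]: number as seen from level i */
};

/* Number of pid as seen from ns, or 0 if pid is not visible there. */
static int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_pid_ns *ns)
{
    const struct toy_pid_ns *p = pid->ns;

    /* ns must be the task's own namespace or one of its ancestors. */
    while (p != NULL && p != ns)
        p = p->parent;
    if (p == NULL)
        return 0;
    return pid->nr[ns->level];
}
```

In this model, recording the mounter's namespace at mount time amounts to remembering which ns argument to pass when a packet is built later, regardless of which process triggers the request.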




[PATCH -mm] cpuidle: unsigned bitfield

2007-03-21 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

A 1-bit bitfield has no room for a sign bit.
drivers/cpuidle/governors/ladder.c:54:16: error: dubious bitfield without explicit `signed' or `unsigned'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/cpuidle/governors/ladder.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.21-rc4-mm1.orig/drivers/cpuidle/governors/ladder.c
+++ linux-2.6.21-rc4-mm1/drivers/cpuidle/governors/ladder.c
@@ -51,7 +51,7 @@ struct ladder_device_state {
 
 struct ladder_device {
struct ladder_device_state states[CPUIDLE_STATE_MAX];
-   int bm_check:1;
+   unsigned int bm_check:1;
unsigned long bm_check_timestamp;
unsigned long bm_activity; /* FIXME: bm activity should be global */
int last_state_idx;
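
A small standalone illustration of why the sign matters. Whether a plain "int" bit-field is signed is implementation-defined in C, so the sketch spells out both cases; the struct and function names are invented.

```c
struct flags_signed   { signed int   bit:1; };
struct flags_unsigned { unsigned int bit:1; };

/* Store v in a signed 1-bit field and read it back. */
static int roundtrip_signed(int v)
{
    struct flags_signed f = { 0 };

    f.bit = v;      /* the only sign bit is the value bit: 1 cannot be held */
    return f.bit;
}

/* Store v in an unsigned 1-bit field and read it back. */
static int roundtrip_unsigned(int v)
{
    struct flags_unsigned f = { 0 };

    f.bit = v;
    return f.bit;
}
```

With the unsigned field, a test such as "bm_check == 1" behaves as expected; with a signed 1-bit field, storing 1 does not read back as 1.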


[PATCH -mm] ipwireless: use ANSI func. decl.

2007-03-21 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Fix function declaration:
drivers/char/pcmcia/ipwireless_cs_tty.c:730:29: warning: non-ANSI function declaration of function 'ipwireless_tty_release'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/char/pcmcia/ipwireless_cs_tty.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.21-rc4-mm1.orig/drivers/char/pcmcia/ipwireless_cs_tty.c
+++ linux-2.6.21-rc4-mm1/drivers/char/pcmcia/ipwireless_cs_tty.c
@@ -727,7 +727,7 @@ int ipwireless_tty_init(int major)
return 0;
 }
 
-void ipwireless_tty_release()
+void ipwireless_tty_release(void)
 {
int ret, loops = 0;
remove_proc_entry(IPWIRELESS_PCCARD_NAME, &proc_root);
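
For reference, the two declaration forms side by side, as a minimal sketch with invented function names. Before C23, an empty parameter list in a declaration says nothing about the arguments, so the compiler cannot check calls against it.

```c
/* Old-style declarator: says nothing about the parameters (before C23). */
int release_old();

/* Prototype: explicitly takes no arguments, so calls can be checked. */
int release_new(void);

int release_old() { return 0; }
int release_new(void) { return 0; }
```

Both behave the same when called correctly; the "(void)" form simply lets the compiler reject a call that passes arguments.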


[PATCH -mm] nozomi: use NULL for pointers

2007-03-21 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Use NULL instead of 0 for pointers:
drivers/char/nozomi.c:1028:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1029:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1031:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1032:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1034:70: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1035:70: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1037:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1039:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1040:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1042:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1043:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1045:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1046:68: warning: Using plain integer as NULL pointer
drivers/char/nozomi.c:1048:64: warning: Using plain integer as NULL pointer

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/char/nozomi.c |   28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

--- linux-2.6.21-rc4-mm1.orig/drivers/char/nozomi.c
+++ linux-2.6.21-rc4-mm1/drivers/char/nozomi.c
@@ -1025,27 +1025,27 @@ static char *interrupt2str(u16 interrupt
static char buf[TMP_BUF_MAX];
char *p = buf;
 
-   interrupt & MDM_DL1 ? p += snprintf(p, TMP_BUF_MAX, "MDM_DL1 ") : 0;
-   interrupt & MDM_DL2 ? p += snprintf(p, TMP_BUF_MAX, "MDM_DL2 ") : 0;
+   interrupt & MDM_DL1 ? p += snprintf(p, TMP_BUF_MAX, "MDM_DL1 ") : NULL;
+   interrupt & MDM_DL2 ? p += snprintf(p, TMP_BUF_MAX, "MDM_DL2 ") : NULL;
 
-   interrupt & MDM_UL1 ? p += snprintf(p, TMP_BUF_MAX, "MDM_UL1 ") : 0;
-   interrupt & MDM_UL2 ? p += snprintf(p, TMP_BUF_MAX, "MDM_UL2 ") : 0;
+   interrupt & MDM_UL1 ? p += snprintf(p, TMP_BUF_MAX, "MDM_UL1 ") : NULL;
+   interrupt & MDM_UL2 ? p += snprintf(p, TMP_BUF_MAX, "MDM_UL2 ") : NULL;
 
-   interrupt & DIAG_DL1 ? p += snprintf(p, TMP_BUF_MAX, "DIAG_DL1 ") : 0;
-   interrupt & DIAG_DL2 ? p += snprintf(p, TMP_BUF_MAX, "DIAG_DL2 ") : 0;
+   interrupt & DIAG_DL1 ? p += snprintf(p, TMP_BUF_MAX, "DIAG_DL1 ") : NULL;
+   interrupt & DIAG_DL2 ? p += snprintf(p, TMP_BUF_MAX, "DIAG_DL2 ") : NULL;
 
-   interrupt & DIAG_UL ? p += snprintf(p, TMP_BUF_MAX, "DIAG_UL ") : 0;
+   interrupt & DIAG_UL ? p += snprintf(p, TMP_BUF_MAX, "DIAG_UL ") : NULL;
 
-   interrupt & APP1_DL ? p += snprintf(p, TMP_BUF_MAX, "APP1_DL ") : 0;
-   interrupt & APP2_DL ? p += snprintf(p, TMP_BUF_MAX, "APP2_DL ") : 0;
+   interrupt & APP1_DL ? p += snprintf(p, TMP_BUF_MAX, "APP1_DL ") : NULL;
+   interrupt & APP2_DL ? p += snprintf(p, TMP_BUF_MAX, "APP2_DL ") : NULL;
 
-   interrupt & APP1_UL ? p += snprintf(p, TMP_BUF_MAX, "APP1_UL ") : 0;
-   interrupt & APP2_UL ? p += snprintf(p, TMP_BUF_MAX, "APP2_UL ") : 0;
+   interrupt & APP1_UL ? p += snprintf(p, TMP_BUF_MAX, "APP1_UL ") : NULL;
+   interrupt & APP2_UL ? p += snprintf(p, TMP_BUF_MAX, "APP2_UL ") : NULL;
 
-   interrupt & CTRL_DL ? p += snprintf(p, TMP_BUF_MAX, "CTRL_DL ") : 0;
-   interrupt & CTRL_UL ? p += snprintf(p, TMP_BUF_MAX, "CTRL_UL ") : 0;
+   interrupt & CTRL_DL ? p += snprintf(p, TMP_BUF_MAX, "CTRL_DL ") : NULL;
+   interrupt & CTRL_UL ? p += snprintf(p, TMP_BUF_MAX, "CTRL_UL ") : NULL;
 
-   interrupt & RESET ? p += snprintf(p, TMP_BUF_MAX, "RESET ") : 0;
+   interrupt & RESET ? p += snprintf(p, TMP_BUF_MAX, "RESET ") : NULL;
 
return buf;
 }
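
As an aside, the ternary-as-statement pattern can be avoided altogether with a lookup table. The following is a hedged userspace sketch, not the driver's code: the bit values, the flag_names table and flags2str() are hypothetical, and it tracks the space remaining in the buffer rather than passing the full size on every call.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bit values; the real MDM_DL1 etc. are defined by the driver. */
struct flag_name { unsigned int bit; const char *name; };

static const struct flag_name flag_names[] = {
    { 0x0001, "MDM_DL1" },
    { 0x0002, "MDM_DL2" },
    { 0x0004, "MDM_UL1" },
};

/* Append the name of every set bit to buf, never writing past size bytes. */
static char *flags2str(unsigned int flags, char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return buf;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
        int n;

        if (!(flags & flag_names[i].bit))
            continue;
        n = snprintf(buf + used, size - used, "%s ", flag_names[i].name);
        if (n < 0 || (size_t)n >= size - used)
            break;              /* error or truncated: stop appending */
        used += (size_t)n;
    }
    return buf;
}
```

The result is always NUL-terminated, and an undersized buffer yields a truncated string instead of an overrun.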


Re: [PATCH -mm 1/4] Blackfin: architecture update patch

2007-03-21 Thread Wu, Bryan
On Wed, 2007-03-21 at 15:53 -0700, Andrew Morton wrote:
> On Wed, 21 Mar 2007 18:19:23 +0800
> "Wu, Bryan" <[EMAIL PROTECTED]> wrote:
> 
> > 1) Some issues are fixed according to LKML patch review.
> > 2) Remove not supported BF535 code
> > 3) Fixed some bugs from blackfin.uclinux.org SVN update
> > Here is the updated patch for 2.6.21-rc4-mm1
> 
> 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/cplbinit.h.rej
> 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/mach-bf535/bf535.h.rej
> 1 out of 1 hunk FAILED -- saving rejects to file include/asm-blackfin/scatterlist.h.rej
> 
> This seems to be against a kernel which did not include
> blackfin-arch-balance-parenthesis-in-macros.patch.  But 2.6.21-rc4-mm1 did
> include blackfin-arch-balance-parenthesis-in-macros.patch, so I'm not sure
> what is going on here.
> 

Sorry for the mess. When
blackfin-arch-balance-parenthesis-in-macros.patch was posted, it was
applied to our SVN tree, and I then merged the last few weeks of SVN
changes into this update patch. So this patch already includes
blackfin-arch-balance-parenthesis-in-macros.patch.

Please hold on: once I have fixed some issues raised in Arnd's and Paul's
latest review, I will send a new update patch.

Sorry for sending an all-in-one update patch. Arnd and Paul gave lots of
good ideas about our last blackfin-arch.patch, and most of their requests
have been applied in the past few days. If I split the all-in-one update
patch into small fixes, there would be maybe 20 or 30 patches.

After this blackfin-arch-update.patch is merged into -mm, I will post
small fix patches, as Arnd advised.

Thanks
-Bryan Wu


[PATCH 2.6.21 5/4] cxgb3 - fix white spaces in drivers/net/Kconfig

2007-03-21 Thread divy
From: Divy Le Ray <[EMAIL PROTECTED]>

Use tabs instead of white spaces for CHELSIO_T3 entry.

Signed-off-by: Divy Le Ray <[EMAIL PROTECTED]>
---

 drivers/net/Kconfig |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 1b6459b..c3f9f59 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2372,23 +2372,23 @@ config CHELSIO_T1_NAPI
  when the driver is receiving lots of packets from the card.
 
 config CHELSIO_T3
-tristate "Chelsio Communications T3 10Gb Ethernet support"
-depends on PCI
+   tristate "Chelsio Communications T3 10Gb Ethernet support"
+   depends on PCI
select FW_LOADER
-help
-  This driver supports Chelsio T3-based gigabit and 10Gb Ethernet
-  adapters.
+   help
+ This driver supports Chelsio T3-based gigabit and 10Gb Ethernet
+ adapters.
 
-  For general information about Chelsio and our products, visit
-  our website at .
+ For general information about Chelsio and our products, visit
+ our website at .
 
-  For customer support, please visit our customer support page at
-  .
+ For customer support, please visit our customer support page at
+ .
 
-  Please send feedback to <[EMAIL PROTECTED]>.
+ Please send feedback to <[EMAIL PROTECTED]>.
 
-  To compile this driver as a module, choose M here: the module
-  will be called cxgb3.
+ To compile this driver as a module, choose M here: the module
+ will be called cxgb3.
 
 config EHEA
tristate "eHEA Ethernet support"


Re: [PATCH] FRV: Fix unannotated variable declarations

2007-03-21 Thread Yasunori Goto
> From: David Howells <[EMAIL PROTECTED]>
> 
> Fix unannotated variable declarations.  Variables that have allocation section
> annotations (such as __meminitdata) on their definitions must also have them
> on their declarations as not doing so may affect the addressing mode used by
> the compiler and may result in a linker error.

Right. Thanks.

Acked-by: Yasunori Goto <[EMAIL PROTECTED]>


-- 
Yasunori Goto 




Re: [PATCH 2/2] Replace pid_t in autofs with struct pid reference

2007-03-21 Thread Serge E. Hallyn
Quoting Ian Kent ([EMAIL PROTECTED]):
> On Tue, 2007-03-20 at 16:01 -0600, Eric W. Biederman wrote:
> > "Serge E. Hallyn" <[EMAIL PROTECTED]> writes:
> > 
> > >> >  void autofs4_dentry_release(struct dentry *);
> > >> >  extern void autofs4_kill_sb(struct super_block *);
> > >> > diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> > >> > index 9857543..4a9ad9b 100644
> > >> > --- a/fs/autofs4/waitq.c
> > >> > +++ b/fs/autofs4/waitq.c
> > >> > @@ -141,8 +141,8 @@ static void autofs4_notify_daemon(struct
> > >> >packet->ino = wq->ino;
> > >> >packet->uid = wq->uid;
> > >> >packet->gid = wq->gid;
> > >> > -  packet->pid = wq->pid;
> > >> > -  packet->tgid = wq->tgid;
> > >> > +  packet->pid = pid_nr(wq->pid);
> > >> > +  packet->tgid = pid_nr(wq->tgid);
> > >> >break;
> > >> 
> > >> I'm assuming we build the packet in the process context of the
> > >> daemon we are sending it to.  If not we have a problem here.
> > >
> > > Yes this is data being sent to a userspace daemon (Ian pls correct me if
> > > I'm wrong) so the pid_nr is the only thing we can send.
> > 
> > Agreed.  The question is are we in the user space daemon's process when
> > we generate the pid_nr.  Or do we stuff this in some kind of socket,
> > and the socket switch locations of the packet.
> 
> The context here is the automount daemon only for expire runs.
> 
> Mount request packets are triggered by user processes walking over an
> autofs mount point directory. So "current" in this case isn't the autofs
> daemon.
> 
> Requests are sent via a pipe to the daemon.

So is the pid used for anything other than debugging?

In any case, here is a replacement patch which sends the pid number
in the pid_namespace of the process which did the autofs4 mount.

Still not sure whether that is actually what makes sense...

From: "Serge E. Hallyn" <[EMAIL PROTECTED]>
Subject: [PATCH] autofs: prevent pid wraparound in waitqs

Instead of storing pid numbers for waitqs, store references
to struct pids.  Also store a reference to the mounter's pid
namespace in the autofs4 sb info so that mount miss and expiry
msgs can send the pid# in the mounter's pidns.

Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]>

---

 fs/autofs4/autofs_i.h |   14 --
 fs/autofs4/inode.c|5 +
 fs/autofs4/waitq.c|   12 ++--
 include/linux/pid.h   |1 +
 kernel/pid.c  |   13 ++---
 5 files changed, 34 insertions(+), 11 deletions(-)

e4184e6923f811f8a025b831ea33541fa820fd62
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 3ccec0a..55026dd 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -79,8 +79,8 @@ struct autofs_wait_queue {
u64 ino;
uid_t uid;
gid_t gid;
-   pid_t pid;
-   pid_t tgid;
+   struct pid *pid;
+   struct pid *tgid;
/* This is for status reporting upon return */
int status;
atomic_t wait_ctr;
@@ -97,6 +97,7 @@ struct autofs_sb_info {
int pipefd;
struct file *pipe;
struct pid *oz_pgrp;
+   struct pid_namespace *pidns;
int catatonic;
int version;
int sub_version;
@@ -228,5 +229,14 @@ out:
return ret;
 }
 
+static inline void autofs_free_wait_queue(struct autofs_wait_queue *wq)
+{
+   if (wq->pid)
+   put_pid(wq->pid);
+   if (wq->tgid)
+   put_pid(wq->tgid);
+   kfree(wq);
+}
+
 void autofs4_dentry_release(struct dentry *);
 extern void autofs4_kill_sb(struct super_block *);
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index c34131a..294efd8 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "autofs_i.h"
 #include 
 
@@ -164,6 +165,7 @@ void autofs4_kill_sb(struct super_block 
autofs4_catatonic_mode(sbi); /* Free wait queues, close pipe */
 
put_pid(sbi->oz_pgrp);
+   put_pid_ns(sbi->pidns);
 
/* Clean up and release dangling references */
autofs4_force_release(sbi);
@@ -334,6 +336,8 @@ int autofs4_fill_super(struct super_bloc
sbi->type = 0;
sbi->min_proto = 0;
sbi->max_proto = 0;
+   sbi->pidns = task_pid_ns(current);
+   get_pid_ns(sbi->pidns);
mutex_init(&sbi->wq_mutex);
spin_lock_init(&sbi->fs_lock);
sbi->queues = NULL;
@@ -435,6 +439,7 @@ fail_iput:
 fail_ino:
kfree(ino);
 fail_free:
+   put_pid_ns(sbi->pidns);
kfree(sbi);
s->s_fs_info = NULL;
 fail_unlock:
diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
index 9857543..30e2a90 100644
--- a/fs/autofs4/waitq.c
+++ b/fs/autofs4/waitq.c
@@ -141,8 +141,8 @@ static void autofs4_notify_daemon(struct
packet->ino = wq->ino;
packet->uid = wq->uid;
packet->gid = wq->gid;
-   packet->pid = wq->pid;

Re: [PATCH 2.6.21 3/4] cxgb3 - Fix potential MAC hang

2007-03-21 Thread Divy Le Ray

Andrew Morton wrote:
> On Sun, 18 Mar 2007 13:10:12 -0700
> [EMAIL PROTECTED] wrote:
>
> > From: Divy Le Ray <[EMAIL PROTECTED]>
> >
> > Under rare conditions, the MAC might hang while generating a pause frame.
> > This patch fine tunes the MAC settings to avoid the issue, allows for
> > periodic MAC state check, and triggers a recovery if hung.
> >
> > Also fix one MAC statistics counter for the rev board T3B2.
>
> This conflicts with your previously-submitted, not-yet-merged-by-Jeff
> cxgb3-add-sw-lro-support.patch.
>
> What should we do about this?

I can send you a patch against the -mm tree, if it is acceptable.

Cheers,
Divy



[PATCH] sched: rsdl check for niced tasks lowering prio level

2007-03-21 Thread Con Kolivas
Here is the best fix for the bug pointed out. Thanks.

I'll try and find pc time to wrap these two patches together and make a v0.32
available.

---
Ensure niced tasks are not inappropriately limiting sleeping unniced tasks
by explicitly checking what the best static priority that has run this
major rotation was.

Reimplement SCHED_BATCH using this check.

Signed-off-by: Con Kolivas <[EMAIL PROTECTED]>

---
 kernel/sched.c |   33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

Index: linux-2.6.21-rc4-mm1/kernel/sched.c
===
--- linux-2.6.21-rc4-mm1.orig/kernel/sched.c 2007-03-22 12:44:05.0 +1100
+++ linux-2.6.21-rc4-mm1/kernel/sched.c 2007-03-22 12:58:26.0 +1100
@@ -201,8 +201,11 @@ struct rq {
struct prio_array *active, *expired, arrays[2];
unsigned long *dyn_bitmap, *exp_bitmap;
 
-   int prio_level;
-   /* The current dynamic priority level this runqueue is at */
+   int prio_level, best_static_prio;
+   /*
+* The current dynamic priority level this runqueue is at, and the
+* best static priority queued this major rotation.
+*/
 
unsigned long prio_rotation;
/* How many times we have rotated the priority queue */
@@ -704,16 +707,24 @@ static inline int entitled_slot(int stat
 
 /*
  * Find the first unused slot by this task that is also in its prio_matrix
- * level.
+ * level. Ensure that the prio_level is not unnecessarily low by checking
+ * that best_static_prio this major rotation was not a niced task.
+ * SCHED_BATCH tasks do not perform this check so they do not induce
+ * latencies in tasks of any nice level.
  */
 static inline int next_entitled_slot(struct task_struct *p, struct rq *rq)
 {
-   DECLARE_BITMAP(tmp, PRIO_RANGE);
+   if (p->static_prio < rq->best_static_prio && p->policy != SCHED_BATCH)
+   return SCHED_PRIO(find_first_zero_bit(p->bitmap, PRIO_RANGE));
+   else {
+   DECLARE_BITMAP(tmp, PRIO_RANGE);
 
-   bitmap_or(tmp, p->bitmap, prio_matrix[USER_PRIO(p->static_prio)],
- PRIO_RANGE);
-   return SCHED_PRIO(find_next_zero_bit(tmp, PRIO_RANGE,
-   USER_PRIO(rq->prio_level)));
+   bitmap_or(tmp, p->bitmap,
+ prio_matrix[USER_PRIO(p->static_prio)],
+ PRIO_RANGE);
+   return SCHED_PRIO(find_next_zero_bit(tmp, PRIO_RANGE,
+   USER_PRIO(rq->prio_level)));
+   }
 }
 
 static void queue_expired(struct task_struct *p, struct rq *rq)
@@ -3315,6 +3326,7 @@ static inline void major_prio_rotation(s
rq->active = new_array;
rq->exp_bitmap = rq->expired->prio_bitmap;
rq->dyn_bitmap = rq->active->prio_bitmap;
+   rq->best_static_prio = MAX_PRIO;
rq->prio_rotation++;
 }
 
@@ -3640,10 +3652,12 @@ need_resched_nonpreemptible:
}
 switch_tasks:
if (next == rq->idle) {
+   rq->best_static_prio = MAX_PRIO;
rq->prio_level = MAX_RT_PRIO;
rq->prio_rotation++;
schedstat_inc(rq, sched_goidle);
-   }
+   } else if (next->static_prio < rq->best_static_prio)
+   rq->best_static_prio = next->static_prio;
prefetch(next);
prefetch_stack(next);
clear_tsk_need_resched(prev);
@@ -7093,6 +7107,7 @@ void __init sched_init(void)
lockdep_set_class(&rq->lock, &rq->rq_lock_key);
rq->nr_running = 0;
rq->prio_rotation = 0;
+   rq->best_static_prio = MAX_PRIO;
rq->prio_level = MAX_RT_PRIO;
rq->active = rq->arrays;
rq->expired = rq->arrays + 1;

-- 
-ck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
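
The selection logic that the sched.c patch above adds to
next_entitled_slot() can be tried out in user space. What follows is a
minimal sketch, not the kernel code: it assumes a 40-entry priority range
held in one 64-bit word, drops the USER_PRIO()/SCHED_PRIO() offset
conversions, and uses hypothetical struct task / struct runqueue types and a
hypothetical is_batch flag in place of task_struct, rq and p->policy. As in
the patch, a set bit means the slot is unavailable, either because the task
already used it or because its row of the priority matrix excludes it.

```c
#include <assert.h>
#include <stdint.h>

#define PRIO_RANGE 40   /* assumed: nice -20..19 mapped to 0..39 */

/* Index of the first zero bit at or after 'start', or PRIO_RANGE if none. */
static int next_zero_bit(uint64_t map, int start)
{
    assert(start >= 0);
    for (int i = start; i < PRIO_RANGE; i++)
        if (!(map & (UINT64_C(1) << i)))
            return i;
    return PRIO_RANGE;
}

/* Hypothetical stand-ins for the kernel structures. */
struct task {
    int      static_prio;  /* 0 = best priority, 39 = most niced */
    int      is_batch;     /* stands in for p->policy == SCHED_BATCH */
    uint64_t used;         /* slots already consumed this rotation */
};

struct runqueue {
    int prio_level;        /* level the rotation has reached */
    int best_static_prio;  /* best static priority run this rotation */
};

/*
 * matrix[prio] has a bit set for every slot a task of that static
 * priority is NOT entitled to.
 */
static int next_entitled_slot(const struct task *p, const struct runqueue *rq,
                              const uint64_t *matrix)
{
    /*
     * Nothing as good as this task has run yet this rotation, so only
     * niced tasks pushed prio_level down: take the first unused slot.
     * Batch tasks never take this shortcut.
     */
    if (p->static_prio < rq->best_static_prio && !p->is_batch)
        return next_zero_bit(p->used, 0);

    /* Otherwise search from the current level, honouring the matrix. */
    return next_zero_bit(p->used | matrix[p->static_prio], rq->prio_level);
}
```

With matrix[20] = 0x5, used = 0x2 and prio_level = 0, the matrix path
returns slot 3; once best_static_prio is numerically larger than the task's
own static priority, the same non-batch task gets slot 0 instead.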


Re: AIO, FIO and Threads ...

2007-03-21 Thread Davide Libenzi
On Wed, 21 Mar 2007, Jens Axboe wrote:

> On Tue, Mar 20 2007, Davide Libenzi wrote:
> > 
> > I was looking at Jens FIO stuff, and I decided to cook a quick patch for 
> > FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):
> > 
> > http://www.xmailserver.org/guasi-lib.html
> > 
> > I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
> > and 8GB of RAM.
> > Mind that I'm not a FIO expert, like at all, but I got some interesting 
> > results when comparing GUASI with libaio at 8/1000/1 depths.
> > If I read those results correctly (Jens may help), GUASI output is more 
> > than double the libaio one.
> > Lots of context switches, yes. But the throughput looks like 2+ times.
> > Can someone try to repeat the measures and/or spot the error?
> > Or tell me which other tests to run?
> > This is kinda a surprise for me ...
> 
> I don't know guasi at all, but libaio requires O_DIRECT to be async. I'm
> sure you know this, but you may not know that fio defaults to buffered IO
> so you have to tell it to use O_DIRECT :-)
> 
> So try adding a --direct=1 (or --buffered=0, same thing) as an extra
> option when comparing depths > 1.

This is a much smaller box. P4 with 1GB of RAM, and on a 1GB RW test. 
Now libaio performs well, even though GUASI can keep the pace. Quite a 
surprise, nonetheless ...
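
For reference, this is roughly what --direct=1 asks for at the system-call
level: the file is opened with O_DIRECT and the I/O buffer has to be suitably
aligned. A minimal sketch, assuming Linux and a 4096-byte alignment (the real
requirement depends on the device and filesystem); the helper names are
hypothetical and this is not fio's code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0      /* not exposed here: the sketch degrades to buffered I/O */
#endif

#define IO_ALIGN 4096   /* assumed alignment for buffer, offset and length */

/* Allocate a buffer that satisfies the assumed O_DIRECT alignment. */
static void *alloc_direct_buffer(size_t size)
{
    void *buf = NULL;

    assert(size % IO_ALIGN == 0);
    if (posix_memalign(&buf, IO_ALIGN, size) != 0)
        return NULL;
    return buf;
}

/*
 * Read 'size' bytes at 'offset' from 'path', bypassing the page cache.
 * Returns the byte count from pread(), or -1 if the open or read fails.
 */
static ssize_t read_block_direct(const char *path, void *buf, size_t size,
                                 off_t offset)
{
    ssize_t n;
    int fd = open(path, O_RDONLY | O_DIRECT);

    if (fd < 0)
        return -1;
    n = pread(fd, buf, size, offset);
    close(fd);
    return n;
}
```

Some filesystems reject O_DIRECT at open time, so a caller should be ready
for open() to fail and fall back to buffered I/O if that is acceptable.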




- Davide



*** fio --name=global --rw=randrw --size=1g --bs=4k --direct=1 --ioengine=guasi --name=job1 --iodepth=100 --thread --runtime=20
job1: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=guasi, iodepth=100
Starting 1 thread
Jobs: 1: [m] [100.0% done] [   609/ 0 kb/s] [eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=21862
  read : io=8,300KiB, bw=412KiB/s, iops=100, runt= 20599msec
slat (msec): min=0, max=2, avg= 0.04, stdev= 0.29
clat (msec): min=   11, max= 1790, avg=564.76, stdev=350.58
bw (KiB/s) : min=   47, max=  692, per=121.80%, avg=501.83, stdev=196.86
  write: io=11,344KiB, bw=563KiB/s, iops=137, runt= 20599msec
slat (msec): min=0, max=2, avg= 0.04, stdev= 0.28
clat (msec): min=2, max=  643, avg=311.86, stdev=108.85
bw (KiB/s) : min=0, max= 1695, per=143.52%, avg=808.00, stdev=632.11
  cpu  : usr=0.19%, sys=1.94%, ctx=28036
  IO depths: 1=0.0%, 2=0.0%, 4=0.1%, 8=0.2%, 16=0.3%, 32=0.7%, >=64=98.7%
 lat (msec): 2=0.0%, 4=0.0%, 10=0.0%, 20=0.3%, 50=1.2%, 100=2.2%
 lat (msec): 250=16.4%, 500=51.4%, 750=16.8%, 1000=6.5%, >=2000=5.1%

Run status group 0 (all jobs):
   READ: io=8,300KiB, aggrb=412KiB/s, minb=412KiB/s, maxb=412KiB/s, mint=20599msec, maxt=20599msec
  WRITE: io=11,344KiB, aggrb=563KiB/s, minb=563KiB/s, maxb=563KiB/s, mint=20599msec, maxt=20599msec

Disk stats (read/write):
  sda: ios=2074/2846, merge=1/26, ticks=945564/14060, in_queue=959624, util=97.65%



*** fio --name=global --rw=randrw --size=1g --bs=4k --direct=1 --ioengine=libaio --name=job1 --iodepth=100 --thread --runtime=20
job1: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=100
Starting 1 thread
Jobs: 1: [m] [100.0% done] [   406/   438 kb/s] [eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=21860
  read : io=8,076KiB, bw=403KiB/s, iops=98, runt= 20495msec
slat (msec): min=0, max=  494, avg= 0.55, stdev=14.75
clat (msec): min=0, max= 1788, avg=509.38, stdev=391.43
bw (KiB/s) : min=   20, max=  682, per=104.55%, avg=421.32, stdev=153.91
  write: io=11,024KiB, bw=550KiB/s, iops=134, runt= 20495msec
slat (msec): min=0, max=  441, avg= 0.23, stdev= 8.40
clat (msec): min=0, max= 1695, avg=368.51, stdev=308.11
bw (KiB/s) : min=0, max= 1787, per=105.78%, avg=581.78, stdev=438.43
  cpu  : usr=0.06%, sys=0.76%, ctx=6185
  IO depths: 1=0.1%, 2=0.2%, 4=0.3%, 8=0.7%, 16=1.3%, 32=2.7%, >=64=94.7%
 lat (msec): 2=0.4%, 4=0.1%, 10=0.7%, 20=2.5%, 50=7.0%, 100=9.4%
 lat (msec): 250=20.2%, 500=23.2%, 750=17.8%, 1000=10.9%, >=2000=7.9%

Run status group 0 (all jobs):
   READ: io=8,076KiB, aggrb=403KiB/s, minb=403KiB/s, maxb=403KiB/s, mint=20495msec, maxt=20495msec
  WRITE: io=11,024KiB, aggrb=550KiB/s, minb=550KiB/s, maxb=550KiB/s, mint=20495msec, maxt=20495msec

Disk stats (read/write):
  sda: ios=2019/2788, merge=0/38, ticks=988100/1048096, in_queue=2036196, util=99.38%



*** fio --name=global --rw=randrw --size=1g --bs=4k --direct=1 --ioengine=guasi --name=job1 --iodepth=1000 --thread --runtime=20
job1: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=guasi, iodepth=1000
Starting 1 thread
Jobs: 1: [m] [1.9% done] [  2348/  1710 kb/s] [eta 21m:00s]s]
job1: (groupid=0, jobs=1): err= 0: pid=19471
  read : io=10,640KiB, bw=436KiB/s, iops=106, runt= 24972msec
slat (msec): min=0, max=   26, avg= 3.46, stdev= 7.68
clat (msec): min=   27, max= 9800, avg=5048.47, stdev=1728.19
bw (KiB/s) : min=   44, max=  689, per=98.52%, avg=429.54, stdev=190.74
  write: io=9,748KiB, bw=399KiB/s, iops=97, runt= 24972msec
slat (msec): min=   

Re: [PATCH 2/2] Replace pid_t in autofs with struct pid reference

2007-03-21 Thread Ian Kent
On Tue, 2007-03-20 at 16:01 -0600, Eric W. Biederman wrote:
> "Serge E. Hallyn" <[EMAIL PROTECTED]> writes:
> 
> >> >  void autofs4_dentry_release(struct dentry *);
> >> >  extern void autofs4_kill_sb(struct super_block *);
> >> > diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> >> > index 9857543..4a9ad9b 100644
> >> > --- a/fs/autofs4/waitq.c
> >> > +++ b/fs/autofs4/waitq.c
> >> > @@ -141,8 +141,8 @@ static void autofs4_notify_daemon(struct
> >> >  packet->ino = wq->ino;
> >> >  packet->uid = wq->uid;
> >> >  packet->gid = wq->gid;
> >> > -packet->pid = wq->pid;
> >> > -packet->tgid = wq->tgid;
> >> > +packet->pid = pid_nr(wq->pid);
> >> > +packet->tgid = pid_nr(wq->tgid);
> >> >  break;
> >> 
> >> I'm assuming we build the packet in the process context of the
> >> daemon we are sending it to.  If not we have a problem here.
> >
> > Yes this is data being sent to a userspace daemon (Ian pls correct me if
> > I'm wrong) so the pid_nr is the only thing we can send.
> 
> Agreed.  The question is are we in the user space daemon's process when
> we generate the pid_nr.  Or do we stuff this in some kind of socket,
> and the socket switch locations of the packet.

The context here is the automount daemon only for expire runs.

Mount request packets are triggered by user processes walking over an
autofs mount point directory. So "current" in this case isn't the autofs
daemon.

Requests are sent via a pipe to the daemon.

Ian




Re: [PATCH 2.6.21 2/4] cxgb3 - Auto-load FW if mismatch detected

2007-03-21 Thread Divy Le Ray

Andrew Morton wrote:

On Sun, 18 Mar 2007 13:10:06 -0700
[EMAIL PROTECTED] wrote:

  

 config CHELSIO_T3
 tristate "Chelsio Communications T3 10Gb Ethernet support"
 depends on PCI
+   select FW_LOADER



Something has gone wrong with the indenting there.
  


The added line is fine. The surrounding lines are not. They use white 
spaces.

I'll send a patch over the last series to use tabs in drivers/net/Kconfig.

Cheers,
Divy


Re: 2.6.21-rc1 and 2.6.21-rc2 kwin dies silently

2007-03-21 Thread Randy Dunlap
On Thu, 22 Mar 2007 01:32:36 + Sid Boyce wrote:

> Eric W. Biederman wrote:
> > Adrian Bunk <[EMAIL PROTECTED]> writes:
> >
> >   
> >> On Wed, Mar 21, 2007 at 05:43:11PM +, Sid Boyce wrote:
> >> 
> >>> Sid Boyce wrote:
> >>>   
>  Andrew Morton wrote:
>  
> > (cc restored.  Please always do reply-to-all)
> >
> >
> >   
> >> On Wed, 28 Feb 2007 18:05:13 +0200 [EMAIL PROTECTED] wrote:
> >> On Wednesday 28 February 2007 17:19, Sid Boyce wrote:
> >>   
> >> 
> >>> openSUSE 10.3 Alpha and KDE-3.5.6, xorg-x11-7.2. KDE is setup not to
> >>> require a password to unlock, but it asks for password. When the 
> >>> screen
> >>> unlocks, kwin is gone with no errors logged in /var/log/kdm or
> >>> /var/log/messages. No problems with 2.6.20.
> >>>
> >>> Same problem on openSUSE 10.2 x86_64, KDE-3.5.5 and 2.6.21-rc2.
> >>> Regards
> >>> Sid.
> >>>  
> >>>   
> >> This is the linux kernel mailing list. Perhaps you should post your 
> >> problem to the opensuse mailing list.
> >>
> >> 
> > 2.6.20 worked.
> >
> > 2.6.21-rc2 did not.
> >
> > Working theory: the kernel broke.
> >
> > Sid, the chances that anyone can work out what caused this are pretty 
> > low. It would be great if you could perform a git bisection search 
> > sometime in
> > the next few weeks, work out which commit caused this.
> >
> > Thanks.
> >
> >
> >
> >  
> >   
>  I shall go back to 2.6.20-git3 and work forward. Up to 2.6.20-git2 was 
>  OK.
>  Regards
>  Sid.
> 
>  
> >>> I tracked the problem down to 2.6.20-git11. Up to 2.6.20-git10 is OK, 
> >>> but from 2.6.20-git11 up to current 2.6.21-rc4-git2 all exhibit the 
> >>> problem.
> >>>   
> >> Thanks for this search.
> >>
> >> Looking at the changes between 2.6.20-git10 and 2.6.20-git11, the only 
> >> suspicious changes are the 60 sysctl patches by Eric.
> >>
> >> Eric, can you look at this issue?
> >> 
> >
> > git bisect between git10 (ac98695d6c1508b724f246f38ce57fb4e3cec356)
> > and git11 (86a71dbd3e81e8870d0f0e56b87875f57e58222b) is likely the most
> > productive thing that can be done right now.  
> >
> > I can't think of anything in my sysctl patches that would kill an
> > application.  My sysctl work is right on the border with user space
> > so it is a good candidate but at the same time there should have
> > been no user visible changes.  There were a few places where
> > I removed sys_sysctl support (but not /proc/sys support) but I don't
> > think any of those were on x86, and they were in such a messed up
> > state I don't think anyone could have reasonably used them anyway.
> >
> > So I think either we poke blindly making random guesses by hand or
> > we let git-bisect do it.
> >
> > Sid do you think you can figure out git-bisect?
> > git-bisect start
> > git-bisect bad 86a71dbd3e81e8870d0f0e56b87875f57e58222b
> > git-bisect good ac98695d6c1508b724f246f38ce57fb4e3cec356
> >
> > It should narrow the problem down to a single commit in 6-8 tries
> > after which point we should have enough information to start
> > making intelligent guesses. 
> >
> > Eric
> >
> >
> >   
> Reading the manpage doesn't help, so I shall have to delve into the 
> docs or further help is needed.

There's not a lot of docs out there.

The man-page:  http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html

Linus's email doc:
http://www.kernel.org/pub/software/scm/git/docs/howto/isolate-bugs-with-bisect.txt

I worked on something over last weekend, but it doesn't really add
much to the references above.

> :/usr/src/linux-2.6.20-git11 # git-bisect good 
> ac98695d6c1508b724f246f38ce57fb4e3cec356
> No revs to be shown.

Did you tell git where to begin with good and bad?  I.e., you have
to tell it the bracketing of where to do the binary search.
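
The bracketing matters because each test halves the range between the known
good and known bad points. A minimal sketch of that search, assuming a
strictly linear history with commits numbered from 'good' to 'bad' (real git
history is a graph and git-bisect chooses its midpoints differently) and a
hypothetical test callback standing in for "build, boot, check kwin":

```c
#include <assert.h>

/* Returns nonzero if the given commit shows the bug. */
typedef int (*test_fn)(int commit, void *ctx);

/*
 * 'good' is known to work, 'bad' is known to fail, and good < bad.
 * Returns the first failing commit and stores the number of commits
 * that had to be tested in *steps.
 */
static int bisect(int good, int bad, test_fn is_bad, void *ctx, int *steps)
{
    assert(good < bad);
    *steps = 0;
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;

        (*steps)++;
        if (is_bad(mid, ctx))
            bad = mid;      /* bug already present: look earlier */
        else
            good = mid;     /* still fine: look later */
    }
    return bad;
}

/* Example callback: every commit from *ctx onwards is broken. */
static int bad_from(int commit, void *ctx)
{
    return commit >= *(const int *)ctx;
}
```

With 100 commits between the two endpoints and the breakage introduced at
commit 73, this needs 7 tests. That is in line with the 6-8 tries quoted
above if the two snapshots are on the order of a hundred or so commits
apart; the thread does not give the actual count.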

> :/usr/src/linux-2.6.20-git11 # ls .git/refs/bisect/
> bad
> good-ac98695d6c1508b724f246f38ce57fb4e3cec356
> 
> :/usr/src/linux-2.6.20-git11 # less 
> .git/refs/bisect/good-ac98695d6c1508b724f246f38ce57fb4e3cec356
> ac98695d6c1508b724f246f38ce57fb4e3cec356
> 
> :/usr/src/linux-2.6.20-git11 # less .git/refs/bisect/bad
> 86a71dbd3e81e8870d0f0e56b87875f57e58222b


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


mmap and swap

2007-03-21 Thread Phy Prabab

Hello,

I have a mysterious issue with swapping.  I have a 32b machine running
2.4.21.x (RHEE30) w/4G RAM and 8G swap.  If I run one application and
pause it after having allocated 2.5G and then run another application
(or just another instance of the same app) and try to allocate another
2.5G I would have imagined that the first would be swapped out.  But
this is not the case, rather, only part of the first application is
swapped before allocation for the second application fails trying to
allocate more space.  What is even more interesting is that the first
application seems to only swap out memory allocated via sbrk.   So
within this application I allocate memory via sbrk until failure
(~890M) and then switch to mmap.  The glibc version is 2.3.2 and I am
mmaping memory via private|anonymous, so am I doing something
incorrect or is it that mmap'ed memory cannot be swapped out?
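
For anyone trying to reproduce this, the mmap half of the allocation pattern
described above looks roughly like the following. It is a minimal sketch
under assumptions, not the poster's program: the helper names are
hypothetical, and it only maps private anonymous memory and touches every
page so that the pages are actually instantiated.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS with glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map 'len' bytes of private anonymous memory; returns NULL on failure. */
static void *alloc_anon(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Write one byte per page so every page gets backing memory.
 * Returns the number of pages touched.
 */
static size_t touch_pages(void *base, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t touched = 0;

    assert(page > 0);
    for (size_t off = 0; off < len; off += (size_t)page) {
        ((volatile unsigned char *)base)[off] = 1;
        touched++;
    }
    return touched;
}
```

Checking the return value of mmap() against MAP_FAILED and printing errno
at the point of failure would show whether the second allocation fails for
lack of memory or for lack of address space.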

TIA,
Phy


Re: 2.6.21-rc1 and 2.6.21-rc2 kwin dies silently

2007-03-21 Thread Sid Boyce

Eric W. Biederman wrote:

Adrian Bunk <[EMAIL PROTECTED]> writes:

  

On Wed, Mar 21, 2007 at 05:43:11PM +, Sid Boyce wrote:


Sid Boyce wrote:
  

Andrew Morton wrote:


(cc restored.  Please always do reply-to-all)


  

On Wed, 28 Feb 2007 18:05:13 +0200 [EMAIL PROTECTED] wrote:
On Wednesday 28 February 2007 17:19, Sid Boyce wrote:
  


openSUSE 10.3 Alpha and KDE-3.5.6, xorg-x11-7.2. KDE is setup not to
require a password to unlock, but it asks for password. When the screen
unlocks, kwin is gone with no errors logged in /var/log/kdm or
/var/log/messages. No problems with 2.6.20.

Same problem on openSUSE 10.2 x86_64, KDE-3.5.5 and 2.6.21-rc2.
Regards
Sid.
 
  
This is the linux kernel mailing list. Perhaps you should post your 
problem to the opensuse mailing list.
   


2.6.20 worked.

2.6.21-rc2 did not.

Working theory: the kernel broke.

Sid, the chances that anyone can work out what caused this are pretty 
low. It would be great if you could perform a git bisection search 
sometime in

the next few weeks, work out which commit caused this.

Thanks.



 
  

I shall go back to 2.6.20-git3 and work forward. Up to 2.6.20-git2 was OK.
Regards
Sid.


I tracked the problem down to 2.6.20-git11. Up to 2.6.20-git10 is OK, 
but from 2.6.20-git11 up to current 2.6.21-rc4-git2 all exhibit the problem.
  

Thanks for this search.

Looking at the changes between 2.6.20-git10 and 2.6.20-git11, the only 
suspicious changes are the 60 sysctl patches by Eric.


Eric, can you look at this issue?



git bisect between git10 (ac98695d6c1508b724f246f38ce57fb4e3cec356)
and git11 (86a71dbd3e81e8870d0f0e56b87875f57e58222b) is likely the most
productive thing that can be done right now.  


I can't think of anything in my sysctl patches that would kill an
application.  My sysctl work is right on the border with user space
so it is a good candidate but at the same time there should have
been no user visible changes.  There were a few places where
I removed sys_sysctl support (but not /proc/sys support) but I don't
think any of those were on x86, and they were in such a messed up
state I don't think anyone could have reasonably used them anyway.

So I think either we poke blindly making random guesses by hand or
we let git-bisect do it.

Sid do you think you can figure out git-bisect?
git-bisect start
git-bisect bad 86a71dbd3e81e8870d0f0e56b87875f57e58222b
git-bisect good ac98695d6c1508b724f246f38ce57fb4e3cec356

It should narrow the problem down to a single commit in 6-8 tries
after which point we should have enough information to start
making intelligent guesses. 


Eric


  
Reading the manpage doesn't help, so I shall have to delve into the 
docs or further help is needed.
:/usr/src/linux-2.6.20-git11 # git-bisect good 
ac98695d6c1508b724f246f38ce57fb4e3cec356

No revs to be shown.

:/usr/src/linux-2.6.20-git11 # ls .git/refs/bisect/
bad
good-ac98695d6c1508b724f246f38ce57fb4e3cec356


:/usr/src/linux-2.6.20-git11 # less 
.git/refs/bisect/good-ac98695d6c1508b724f246f38ce57fb4e3cec356

ac98695d6c1508b724f246f38ce57fb4e3cec356

:/usr/src/linux-2.6.20-git11 # less .git/refs/bisect/bad
86a71dbd3e81e8870d0f0e56b87875f57e58222b
Regards
Sid.

--
Sid Boyce ... Hamradio License G3VBV, Licensed Private Pilot
Emeritus IBM/Amdahl Mainframes and Sun/Fujitsu Servers Tech Support Specialist, 
Cricket Coach
Microsoft Windows Free Zone - Linux used for all Computing Tasks




the lighter side of janitorial work

2007-03-21 Thread Robert P. J. Day

include/asm-v850/io.h:

...
#if 0
/* This is really stupid; don't define it.  */
#define page_to_bus(page)   page_to_phys (page)
#endif
...


rday

-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page



Re: [PATCH 0/5] [RFC] AF_RXRPC socket family implementation [try #3]

2007-03-21 Thread David Howells
David Howells <[EMAIL PROTECTED]> wrote:

> > - recvmsg not supporting MSG_TRUNC is rather weird and really ought to be
> > fixed one day as its useful to find out the sizeof message pending when
> > combined with MSG_PEEK
> 
> Hmmm...  I hadn't considered that.  I assumed MSG_TRUNC not to be useful as
> arbitrarily chopping bits out of the request or reply would seem to be
> pointless.

But why do I need to support MSG_TRUNC?  I currently have things arranged so
that if you do a recvmsg() that doesn't pull everything out of a packet then
the next time you do a recvmsg() you'll get the next part of the data in that
packet.  MSG_EOR is flagged when recvmsg copies across the last byte of data
of a particular phase.

I might at some point in the future enable recvmsg() to keep pulling packets
off the Rx queue and copying them into userspace until the userspace buffer is
full or we find that the next packet is not the logical next in sequence.

Hmmm...  I'm actually overloading MSG_EOR.  MSG_EOR is flagged on the last
data read, and is also flagged for terminal messages (end or reply data,
abort, net error, final ACK, etc).  I wonder if I should use MSG_MORE (or its
lack) instead to indicate the end of data, and only set MSG_EOR on the
terminal message.

MSG_MORE is set by the app to flag to sendmsg() that there's more data to
come, so it would be consistent to use it for recvmsg() too.
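
For comparison, this is the MSG_PEEK plus MSG_TRUNC idiom from the quoted
review comment as it works on ordinary datagram sockets. A minimal sketch,
assuming Linux and a socket type for which recv() honours MSG_TRUNC (UDP and
AF_UNIX datagram sockets on current kernels); it says nothing about AF_RXRPC,
and the helper name is hypothetical.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Return the full length of the next queued datagram on 'fd' without
 * consuming it, or -1 on error. MSG_PEEK leaves the datagram queued;
 * MSG_TRUNC makes recv() report the real length even though the probe
 * buffer is only one byte long.
 */
static ssize_t pending_datagram_size(int fd)
{
    char probe;

    assert(fd >= 0);
    return recv(fd, &probe, sizeof(probe), MSG_PEEK | MSG_TRUNC);
}
```

A caller can use the returned size to allocate a buffer of the right length
and then issue a normal recv() to consume the message.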

David


Re: [PATCH] sched: rsdl improvements

2007-03-21 Thread Con Kolivas
On Thursday 22 March 2007 11:24, Con Kolivas wrote:
> On Thursday 22 March 2007 10:48, Jeffrey Hundstad wrote:
> > Artur Skawina wrote:
> > > Con Kolivas wrote:
> > >> Note no interactive boost idea here.
> > >>
> > >> Patch is for 2.6.21-rc4-mm1. I have not spent the time trying to bring
> > >> other bases in sync.
> > >
> > > I've tried RSDLv.31+this on 2.6.20.3 as i'm not tracking -mm.
> > >
> > >> Further improve the deterministic nature of the RSDL cpu scheduler and
> > >> make the rr_interval tunable.
> > >>
> > >> By only giving out priority slots to tasks at the current runqueue's
> > >> prio_level or below we can make the cpu allocation not altered by
> > >> accounting issues across major_rotation periods. This makes the cpu
> > >> allocation and latencies more deterministic, and decreases maximum
> > >> latencies substantially. This change removes the possibility that
> > >> tasks can get bursts of cpu activity which can favour towards
> > >> interactive tasks but also favour towards cpu bound tasks which happen
> > >> to wait on other activity (such as I/O) and is a net gain.
> > >
> > > I'm not sure this is going in the right direction... I'm writing
> > > this while compiling a kernel w/ "nice -20 make -j2" and X is almost
> >
> > Did you mean "nice -20"?  If so, that should have slowed X quite a bit.
> > Try "nice 19" instead.
> >
> > nice(1):
> >Run  COMMAND  with an adjusted niceness, which affects process
> > scheduling.  With no COMMAND, print the current  niceness.   Nicenesses
> > range from -20 (most favorable scheduling) to 19 (least favorable).
>
> No he's right. Something scrambled my brain and I've completely left out
> the part where I offer the old bursts as a tunable option as well, which
> unintentionally killed off SCHED_BATCH as an entity. I'll have to put that
> as an additional patch sorry as this by itself is not always a win. Hang in
> there.

Actually, reworking the priority matrix to always have a slot at position 1 
should fix this without needing a tunable. That is a better approach so I'll 
do that.

-- 
-ck


Re: pagetable_ops: Hugetlb character device example

2007-03-21 Thread Matt Mackall
On Wed, Mar 21, 2007 at 04:35:28PM -0700, William Lee Irwin III wrote:
> On Wed, Mar 21, 2007 at 03:26:59PM -0700, William Lee Irwin III wrote:
> >> My exit strategy was to make hugetlbfs an alias for ramfs when ramfs
> >> acquired the necessary functionality until expand-on-mmap() was merged.
> >> That would've allowed rm -rf fs/hugetlbfs/ outright. A compatibility
> >> wrapper for expand-on-mmap() around ramfs once ramfs acquires the
> >> necessary functionality is now the exit strategy.
> 
> On Wed, Mar 21, 2007 at 05:53:48PM -0500, Matt Mackall wrote:
> > Can you describe what ramfs needs here in a bit more detail?
> > If it's non-trivial, I'd rather see any new functionality go into
> > shmfs/tmpfs, as ramfs has done a good job at staying a minimal fs thus
> > far.
> 
> I was referring to fully-general multiple pagesize support. ramfs
> would inherit the functionality by virtue of generic pagecache and TLB
> handling in such an arrangement. It doesn't make sense to modify ramfs
> as a special case; hugetlb is as it stands a ramfs special-cased for
> such purposes.

Ahh, I see.

Good luck!

-- 
Mathematics is the supreme nostalgia of our time.


warning: #warning You have not edited mcdx.h

2007-03-21 Thread roland
while looking at some more warnings i got with allyesconfig i came across 
this really weird one:


In file included from drivers/cdrom/mcdx.c:78:
drivers/cdrom/mcdx.h:180:2: warning: #warning You have not edited mcdx.h
drivers/cdrom/mcdx.h:181:2: warning: #warning Perhaps irq and i/o settings 
are wrong.


looking into the code :

#ifndef I_WAS_HERE
#ifndef MODULE
#warning You have not edited mcdx.h
#warning Perhaps irq and i/o settings are wrong.
#endif
#endif

and before:
/* *** make the following line uncommented, if you're sure,
* *** all configuration is done */
/* #define I_WAS_HERE */

huh?

is this file meant to be edited before compile ?


searched the list and came across some patch from adrian bunk:

[Patch] (new version) configure mcdx without editing mcdx.h

at http://marc.info/?l=linux-kernel&m=94772887906874&w=2
and http://marc.info/?l=linux-kernel&m=94724271712637&w=2
and http://marc.info/?l=linux-kernel&m=94699770112439&w=2

has this never been merged ?

regards
roland





Re: 2.6.21-rc4-mm1 - problem with cpuidle routine

2007-03-21 Thread Venkatesh Pallipadi
On Wed, Mar 21, 2007 at 01:38:15PM -0700, Andrew Morton wrote:
> On Wed, 21 Mar 2007 13:49:58 -0500
> Larry Finger <[EMAIL PROTECTED]> wrote:
> 
> > When I configure 'CPU Idle PM Support' on my HP dv2125nr notebook with a 
> > Turion X64 X2 processor and
> > X86_64 architecture selected, the computer freezes on bootup. I have 
> > included a portion the
> > configuration file and part of /var/log/boot.msg for my working system. 
> > Please let me know if
> > further info from my system is required. I would be happy to test any 
> > patches, etc.
> > 
> > Larry
> > 
> > ==
> > 
> > The console log ends with the following entries:
> > 
> > ACPI: Processor [CPU0] (supports 8 throttling states)
> > ACPI: Processor [CPU1] (supports 8 throttling states)
> > cpuidle: driver acpi_idle failed to attach to cpu 0
> > cpuidle: driver acpi_idle failed to attach to cpu 0
> > cpuidle: using driver acpi_idle
> > Loading thermal
> > 
> > At this point, the system hangs.
> > 
> > =
> > 
> > The beginning section of my .config is as follows:
> > 
> 
> Thanks.   Cc's added..

Patch below resolves this issue.


Patch for cpuidle boot hang reported by Larry Finger here.
http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/2025.html

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: new/drivers/cpuidle/cpuidle.c
===
--- new.orig/drivers/cpuidle/cpuidle.c  2007-03-21 14:25:11.0 -0800
+++ new/drivers/cpuidle/cpuidle.c   2007-03-21 14:25:33.0 -0800
@@ -119,6 +119,7 @@
 
dev = &per_cpu(cpuidle_devices, cpu);
 
+   dev->cpu = cpu;
mutex_lock(&cpuidle_lock);
if (cpu_is_offline(cpu)) {
mutex_unlock(&cpuidle_lock);
@@ -129,15 +130,26 @@
mutex_unlock(&cpuidle_lock);
return 0;
}
-   dev->status |= CPUIDLE_STATUS_DETECTED;
-   list_add(&dev->device_list, &cpuidle_detected_devices);
-   cpuidle_add_sysfs(sys_dev);
-   if (cpuidle_curr_driver)
-   cpuidle_attach_driver(dev);
-   if (cpuidle_curr_governor)
-   cpuidle_attach_governor(dev);
+   if (cpuidle_curr_driver) {
+   if (cpuidle_attach_driver(dev))
+   goto err_ret;
+   }
+   
+   if (cpuidle_curr_governor) {
+   if (cpuidle_attach_governor(dev)) {
+   cpuidle_detach_driver(dev);
+   goto err_ret;
+   }
+   }
+
if (cpuidle_device_can_idle(dev))
cpuidle_install_idle_handler();
+
+   list_add(&dev->device_list, &cpuidle_detected_devices);
+   cpuidle_add_sysfs(sys_dev);
+   dev->status |= CPUIDLE_STATUS_DETECTED;
+
+err_ret:
mutex_unlock(&cpuidle_lock);
 
return 0;
Index: new/drivers/cpuidle/driver.c
===
--- new.orig/drivers/cpuidle/driver.c   2007-03-21 14:25:15.0 -0800
+++ new/drivers/cpuidle/driver.c2007-03-21 14:25:53.0 -0800
@@ -37,7 +37,7 @@
ret = cpuidle_curr_driver->init(dev);
if (ret) {
module_put(cpuidle_curr_driver->owner);
-   printk(KERN_ERROR "cpuidle: driver %s failed to attach to cpu %d\n",
+   printk(KERN_INFO "cpuidle: driver %s failed to attach to cpu %d\n",
cpuidle_curr_driver->name, dev->cpu);
} else {
if (dev->status & CPUIDLE_STATUS_GOVERNOR_ATTACHED)


Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems

2007-03-21 Thread Maxim
On Thursday 22 March 2007 01:14:05 Maxim wrote:
> On Wednesday 21 March 2007 23:22:40 Nigel Cunningham wrote:
> > Hi.
> > 
> > On Wed, 2007-03-21 at 18:40 +0200, Maxim Levitsky wrote:
> > > Hi,
> > > 
> > > Starting with 2.6.21-rc1 suspend to ram and disk doesn't work anymore on 
> > > my system.
> > > 
> > > I did a git-bisect and found that those commits break it:
> > > 
> > > e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change 
> > > code ordering in main.c
> > > ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change 
> > > code ordering in disk.c
> > > 259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change 
> > > code ordering in user.c
> > > 
> > > I already reported about it, but now i know the reason why suspend breaks.
> > > 
> > > The problem is that both cpu_up/cpu_down were allowed to sleep until now, 
> > > and it did work because those functions could be called only in process 
> > > context
> > > (the one that writes to /sys/devices/system/cpu/cpu*/online) or  idle 
> > > thread  that does smp_init()).
> > > 
> > > But now they are called _after_ all tasks were suspended, so if cpu_down 
> > > tries for example to take a lock
> > > that is taken by a different process, it can't since the different process 
> > > is frozen and can't release the lock.
> > > 
> > > I tested this and all results are positive:
> > > 
> > > I disabled the 2nd cpu by hand, and then suspend to ram was successful.
> > > Suspend to disk went correctly, but it hung on resume, and I know why.
> > > It hung in the old kernel trying to disable the 2nd cpu that was enabled by it.
> > > 
> > > I was able using kdb to confirm that this is true because it was still 
> > > possible to enter kdb, and see that
> > > idle thread (swapper) was active, and uswsusp was waiting on mutex inside 
> > > workqueue_cpu_callback.
> > > 
> > > The solution for this problem seems to be either a complete audit of code 
> > > that uses register_cpu_notifier,
> > > to ensure that it doesn't sleep. 
> > > Also documentation should be changed to note about it.
> > > 
> > > Or, it is also possible to revert this change.
> > 
> > Do you know exactly which mutex was being waited on and where it was
> > taken? If you can say that, it would be much more helpful.
> > 
> > Regards,
> > 
> > Nigel
> > 
> > 
> 
> Hello,
> 
>   It is workqueue_mutex
>   and it is taken in kernel/workqueue.c:797
> 
>   this is the fault of freezable work queues, and XFS uses them (and I use XFS)
> 
>   Thanks to Rafael J. Wysocki for pointing it out to me.
> 
>   Regards,
>   Maxim Levitsky
> 


I think that I made a mistake.

I have now reverted the patch that fixes the above error and wrote down the
back-trace from this hang. It appears that the system hangs in kthread_stop.

This is what I wrote down on paper, using kdb:

workqueue_cpu_callback ->
cleanup_workqueue_thread ->
kthread_stop ->

and then wake_up_process (I think; I didn't write this one down on paper, I
will check again)

Regards,
Maxim Levitsky


[PATCH] More "deprecated" spellos

2007-03-21 Thread Randy Dunlap
On Thu, 22 Feb 2007 04:39:40 -0500 (EST) Robert P. J. Day wrote:

>   Fix remaining misspellings of "depreciated" to "deprecated."

More of these.
---

From: Randy Dunlap <[EMAIL PROTECTED]>

Fix more "deprecated" spellos.

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 Documentation/fb/aty128fb.txt  |4 ++--
 Documentation/filesystems/proc.txt |2 +-
 drivers/pci/pci-driver.c   |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

--- linux-2.6.21-rc4-git5.orig/drivers/pci/pci-driver.c
+++ linux-2.6.21-rc4-git5/drivers/pci/pci-driver.c
@@ -133,7 +133,7 @@ static inline int pci_create_newid_file(
  * system is in its list of supported devices.  Returns the matching
  * pci_device_id structure or %NULL if there is no match.
  *
- * Depreciated, don't use this as it will not catch any dynamic ids
+ * Deprecated, don't use this as it will not catch any dynamic ids
  * that a driver might want to check for.
  */
 const struct pci_device_id *pci_match_id(const struct pci_device_id *ids,
--- linux-2.6.21-rc4-git5.orig/Documentation/fb/aty128fb.txt
+++ linux-2.6.21-rc4-git5/Documentation/fb/aty128fb.txt
@@ -54,8 +54,8 @@ Accepted options:
 
 noaccel  - do not use acceleration engine. It is default.
 accel- use acceleration engine. Not finished.
-vmode:x  - chooses PowerMacintosh video mode . Depreciated.
-cmode:x  - chooses PowerMacintosh colour mode . Depreciated.
+vmode:x  - chooses PowerMacintosh video mode . Deprecated.
+cmode:x  - chooses PowerMacintosh colour mode . Deprecated.
 <[EMAIL PROTECTED]>  - selects startup videomode. See modedb.txt for detailed
   explanation. Default is 640x480x8bpp.
 
--- linux-2.6.21-rc4-git5.orig/Documentation/filesystems/proc.txt
+++ linux-2.6.21-rc4-git5/Documentation/filesystems/proc.txt
@@ -228,7 +228,7 @@ Table 1-3: Kernel info in /proc 
  mounts  Mounted filesystems   
  net Networking info (see text)
  partitions  Table of partitions known to the system   
- pciDepreciated info of PCI bus (new way -> /proc/bus/pci/, 
+ pciDeprecated info of PCI bus (new way -> /proc/bus/pci/,
  decoupled by lspci(2.4)
  rtc Real time clock   
  scsiSCSI info (see text)  


[patch] s390 kprobes: Align probe address

2007-03-21 Thread David Wilder
[This patch applies to both linux and mm trees.  Please send comments 
off list, thanks]
Running a probe on s390 with a probe address that is not 4 byte aligned
results in a Kernel BUG.  The problem is that the stura instruction used
by swap_instruction requires the destination address to be 4 byte aligned.
As stura only writes 4 bytes, aligning to the next 4 byte aligned address
results in the breakpoint instruction being stored past the probe address.
The fix is to align the address backward (to the previous 4 byte aligned
address) and write the two byte breakpoint instruction into the appropriate
bytes.

Signed-off-by: David Wilder <[EMAIL PROTECTED]>

diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
index 8af549e..993f353 100644
--- a/arch/s390/kernel/kprobes.c
+++ b/arch/s390/kernel/kprobes.c
@@ -167,7 +167,7 @@ static int __kprobes swap_instruction(vo
 	 * shall not cross any page boundaries (vmalloc area!) when writing
 	 * the new instruction.
 	 */
-	addr = (u32 *)ALIGN((unsigned long)args->ptr, 4);
+	addr = (u32 *)((unsigned long)args->ptr & -4UL);
 	if ((unsigned long)args->ptr & 2)
 		instr = ((*addr) & 0x) | args->new;
 	else
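
The alignment argument in the description can be checked in user space. This
is a minimal sketch, assuming a big-endian 32-bit word as on s390; the helper
names are hypothetical, the instruction value 0x0002 is only an example, and
the two masks are an assumption chosen to match the description, since the
constants are cut off in the patch context.

```c
#include <assert.h>
#include <stdint.h>

/* Round up to the next multiple of 4, which is what ALIGN(x, 4) does. */
static uint64_t align_up4(uint64_t addr)
{
	return (addr + 3) & ~(uint64_t)3;
}

/* Round down to the previous multiple of 4, which is what "& -4UL" does. */
static uint64_t align_down4(uint64_t addr)
{
	return addr & ~(uint64_t)3;
}

/*
 * Merge a 2-byte instruction into the 4-byte word that contains it.
 * The word is treated as a big-endian value: the halfword at byte
 * offset 0 is the upper 16 bits, the one at offset 2 the lower 16 bits.
 */
static uint32_t merge_halfword(uint32_t word, uint64_t probe_addr, uint16_t insn)
{
	if (probe_addr & 2)
		return (word & 0xffff0000u) | insn;            /* keep upper half */
	return (word & 0x0000ffffu) | ((uint32_t)insn << 16);  /* keep lower half */
}

void demo_s390_align(void)
{
	uint64_t probe = 0x1002;  /* 2-byte aligned but not 4-byte aligned */

	assert(align_up4(probe) == 0x1004);    /* old code: word after the probe */
	assert(align_down4(probe) == 0x1000);  /* fixed code: word containing it */
	assert(merge_halfword(0x11112222u, probe, 0x0002) == 0x11110002u);
	assert(merge_halfword(0x11112222u, 0x1000, 0x0002) == 0x00022222u);
}
```

For an address that is already 4 byte aligned both roundings agree, which is
why only probes at offset 2 within a word hit the bug.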


Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems

2007-03-21 Thread Maxim
On Thursday 22 March 2007 01:47:05 Nigel Cunningham wrote:
> Hi.
> 
> On Wed, 2007-03-21 at 22:38 +0100, Rafael J. Wysocki wrote:
> > > Do you know exactly which mutex was being waited on and where it was
> > > taken? If you can say that, it would be much more helpful.
> 
> Yeah, me too, but assuming too much sometimes bites me :)
> 
> > I think this is the XFS problem with freezable workqueues.
> > 
> > Maxim, please try to apply the appended patch and see if it helps.
> 
> Thanks for your subsequent messages, Maxim. Could you confirm for us
> that the patch Rafael attached fixes it?
> 
> Regards,
> 
> Nigel
> 
> > ---
> > Since freezable workqueues are broken in 2.6.21-rc
> > (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
> > it's better to remove them altogether for 2.6.21 and change the only user of
> > them (XFS) accordingly.
> > 
> > ---
> >  fs/xfs/linux-2.6/xfs_buf.c |4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > Index: linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
> > ===
> > --- linux-2.6.21-rc4.orig/fs/xfs/linux-2.6/xfs_buf.c
> > +++ linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
> > @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
> > if (!xfs_buf_zone)
> > goto out_free_trace_buf;
> >  
> > -   xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> > +   xfslogd_workqueue = create_workqueue("xfslogd");
> > if (!xfslogd_workqueue)
> > goto out_free_buf_zone;
> >  
> > -   xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> > +   xfsdatad_workqueue = create_workqueue("xfsdatad");
> > if (!xfsdatad_workqueue)
> > goto out_destroy_xfslogd_workqueue;
> >  
> 
> 

Hello,

I can now confirm that the above patch works.

First, as I said, I tried to suspend with this patch applied and without XFS,
and it worked.
Then I reverted the patch, and the system still suspends correctly without the
xfs module loaded. (I didn't tell you that I use ext3 now and that I generally
compile everything into the kernel, so I included XFS too, because I used it
once and I still have an XFS disk image.)

But with the patch reverted the system hangs when XFS is loaded, so this patch
works.

Regards,
Maxim Levitsky




Re: [PATCH] sched: rsdl improvements

2007-03-21 Thread Con Kolivas
On Thursday 22 March 2007 10:48, Jeffrey Hundstad wrote:
> Artur Skawina wrote:
> > Con Kolivas wrote:
> >> Note no interactive boost idea here.
> >>
> >> Patch is for 2.6.21-rc4-mm1. I have not spent the time trying to bring
> >> other bases in sync.
> >
> > I've tried RSDLv.31+this on 2.6.20.3 as i'm not tracking -mm.
> >
> >> Further improve the deterministic nature of the RSDL cpu scheduler and
> >> make the rr_interval tunable.
> >>
> >> By only giving out priority slots to tasks at the current runqueue's
> >> prio_level or below we can make the cpu allocation not altered by
> >> accounting issues across major_rotation periods. This makes the cpu
> >> allocation and latencies more deterministic, and decreases maximum
> >> latencies substantially. This change removes the possibility that tasks
> >> can get bursts of cpu activity which can favour towards interactive
> >> tasks but also favour towards cpu bound tasks which happen to wait on
> >> other activity (such as I/O) and is a net gain.
> >
> > I'm not sure this is going in the right direction... I'm writing
> > this while compiling a kernel w/ "nice -20 make -j2" and X is almost
>
> Did you mean "nice -20"?  If so, that should have slowed X quite a bit.
> Try "nice 19" instead.
>
> nice(1):
>Run  COMMAND  with an adjusted niceness, which affects process
> scheduling.  With no COMMAND, print the current  niceness.   Nicenesses
> range from -20 (most favorable scheduling) to 19 (least favorable).

No he's right. Something scrambled my brain and I've completely left out the 
part where I offer the old bursts as a tunable option as well, which 
unintentionally killed off SCHED_BATCH as an entity. I'll have to put that as 
an additional patch sorry as this by itself is not always a win. Hang in 
there.

-- 
-ck


Re: [PATCH] sched: rsdl improvements

2007-03-21 Thread Artur Skawina
Jeffrey Hundstad wrote:
>> I'm not sure this is going in the right direction... I'm writing
>> this while compiling a kernel w/ "nice -20 make -j2" and X is almost
>>   
> Did you mean "nice -20"?  If so, that should have slowed X quite a bit. 
> Try "nice 19" instead.

i did try "nice --20" too :) Resulted in long X stalls, but i don't
think that's a reasonable load so I did not mention it.
"nice -20 cmd" runs cmd at nice==19.

Usage: nice [OPTION] [COMMAND [ARG]...]
Run COMMAND with an adjusted niceness, which affects process scheduling.
With no COMMAND, print the current niceness.  Nicenesses range from
-20 (most favorable scheduling) to 19 (least favorable).

  -n, --adjustment=N   add integer N to the niceness (default 10)
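
The two spellings discussed here differ only in the sign of the adjustment.
A minimal sketch of the resulting niceness, assuming a starting niceness of 0
and the -20..19 range quoted from the usage text; adjusted_niceness() is a
hypothetical helper, not part of nice(1).

```c
#include <assert.h>

#define NICE_MIN (-20)  /* most favorable scheduling */
#define NICE_MAX 19     /* least favorable scheduling */

/* Niceness after applying an adjustment, clamped to the valid range. */
static int adjusted_niceness(int current, int adjustment)
{
	int n = current + adjustment;

	if (n < NICE_MIN)
		return NICE_MIN;
	if (n > NICE_MAX)
		return NICE_MAX;
	return n;
}

void demo_nice(void)
{
	/* "nice -20 cmd": old-style -N form, an adjustment of +20, clamped to 19 */
	assert(adjusted_niceness(0, 20) == 19);
	/* "nice --20 cmd": an adjustment of -20 */
	assert(adjusted_niceness(0, -20) == -20);
	/* plain "nice cmd": the default adjustment of 10 */
	assert(adjusted_niceness(0, 10) == 10);
}
```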


Re: [PATCH] Allow per-cpu variables to be page-aligned

2007-03-21 Thread Rusty Russell
On Wed, 2007-03-21 at 10:49 -0600, Eric W. Biederman wrote:
> Rusty Russell <[EMAIL PROTECTED]> writes:
> 
> > On Wed, 2007-03-21 at 03:21 -0600, Eric W. Biederman wrote:
> >> Do we really want to allow modules to be able to allocate page sized
> >> per cpu memory.
> >
> > Hi Eric!
> >
> > They always could, of course, they just wouldn't get correct alignment.
> > I think the principle of least surprise says that if we support this, it
> > will also work in modules...
> 
> The module load would fail.

Hi again Eric,

Unfortunately not.  It probably should, though: people ignore printks.
I was probably thinking that large alignment constraints were only for
performance when I wrote this code, but a page-aligned requirement for
hypervisors changes that.

> > Looking at the module per-cpu code again, the rounding up of the memory
> > used by the kernel seems unnecessary though.  I'll try ripping that
> > out...
>
> I want to say that when dealing with cpu stuff aligning to a cache
> line makes sense as it prevents multiple variables from sharing
> the same cache line.  However we rarely access per cpu variables from
> other cpus (the point) so the extra alignment doesn't seem to have
> a justification in this context.

Um, yes, always good to remember.  I wrote the per-cpu infrastructure,
and I haven't forgotten 8)

> Although I'm not quite certain what this will do to the per cpu
> memory allocator...

It should Just Work.  My only hesitation is that I obviously thought
different when I wrote this code, so am I smarter now, or then?

> After increasing NR_IRQS on x86_64 to (NR_CPUS*32) the per cpu irq
> stats got much bigger especially as NR_CPUS went up.  The only
> reasonable way I could see to fix this at the time was to just make
> PER_CPU_ENOUGH_ROOM do the right thing and change size dynamically
> with the size of the per cpu section.  I added PERCPU_MODULE_RESERVE
> to allocate the amount that we did not have compile information on.
> 8K was roughly what we had left over for modules before I made the
> change so I just preserved that.

This makes a lot of sense.  A fixed constant seemed sensible at the
time, but now we know that the majority of per-cpu vars are in code
which can never be a module.  Reasons are obvious, and seem unlikely to
change.

> > It means the x86 cpu_pda initialization would have to be done in
> > smp_prepare_boot_cpu tho...
> 
> Well that is earlier than trap_init so it shouldn't be a problem...

But it doesn't get called on UP.  Don't know if that matters, but it
wasn't immediately obvious.

Thanks,
Rusty.
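
A rough user-space sketch of what honoring a page-sized alignment in a per-cpu
style allocator involves. The percpu_area structure and percpu_alloc() are
hypothetical and much simpler than the kernel's module per-cpu code; the sketch
also assumes the area itself starts on a page boundary, so an aligned offset
gives an aligned address.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u

/* Round offset up to a power-of-two alignment. */
static size_t align_up(size_t offset, size_t align)
{
	return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical bump allocator over one CPU's per-cpu area. */
struct percpu_area { size_t used; size_t size; };

/* Returns the offset of the new block, or (size_t)-1 if it does not fit. */
static size_t percpu_alloc(struct percpu_area *a, size_t size, size_t align)
{
	size_t start = align_up(a->used, align);

	if (start + size > a->size)
		return (size_t)-1;  /* fail rather than hand out a bad block */
	a->used = start + size;
	return start;
}

void demo_percpu(void)
{
	struct percpu_area a = { .used = 100, .size = 4 * SKETCH_PAGE_SIZE };
	size_t off = percpu_alloc(&a, SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);

	assert(off == SKETCH_PAGE_SIZE);       /* skipped ahead to a page boundary */
	assert(off % SKETCH_PAGE_SIZE == 0);
	assert(percpu_alloc(&a, 4 * SKETCH_PAGE_SIZE, 64) == (size_t)-1);
}
```

Returning an error when the request cannot be satisfied, instead of only
printing a message, matches the point that people ignore printks.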



Re: [RFC] : Is /proc/kcore still usefull and/or maintained ?

2007-03-21 Thread Maxim
On Thursday 22 March 2007 01:53:10 Eric Dumazet wrote:
> I stand corrected : This is a new bug
> 
> The /proc/kcore problem appears with linux-2.6.21-rc4-mm1
> 
> fd = open("/proc/kcore", 0);
> llseek(fd, ...) returns an -EINVAL error
> 
> 
> Quick code inspection (before going to sleep...) shows that
> 
> proc_reg_llseek() (file fs/proc/inode.c)
> 
> is doing something like :
> 
> rv = -EINVAL;
> llseek = pde->proc_fops->llseek;
> spin_unlock(&pde->pde_unload_lock);
> if (llseek)
>   rv = llseek(file, offset, whence);
> 
> As kcore dont have a .llseek handler, proc_reg_llseek() returns -EINVAL;
> 
> Previous kernel was probably calling a default llseek() handler.
> 
> if (!llseek)
>   llseek = default_llseek;
> 
> Hum ???
> 

Hi,
Yes, you are right, your problem is different from the one I had.

But why do you need llseek?

Why not mmap it?
That is the natural thing to do with files that represent memory.

Regards,
Maxim Levitsky


Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems

2007-03-21 Thread Maxim
On Thursday 22 March 2007 01:53:54 Rafael J. Wysocki wrote:
> On Thursday, 22 March 2007 00:39, Maxim wrote:
> > On Thursday 22 March 2007 01:24:25 Rafael J. Wysocki wrote:
> > > On Thursday, 22 March 2007 00:09, Maxim wrote:
> > > > On Thursday 22 March 2007 00:39:02 you wrote:
> > > > > On Wednesday, 21 March 2007 23:21, Pavel Machek wrote:
> > > > > > Hi!
> > > > > > 
> > > > > > > Starting with 2.6.21-rc1 suspend to ram and disk doesn't work 
> > > > > > > anymore on my system.
> > > > > > > 
> > > > > > > I did a git-bisect and found that those commits break it:
> > > > > > > 
> > > > > > > e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: 
> > > > > > > Change code ordering in main.c
> > > > > > > ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] 
> > > > > > > swsusp: Change code ordering in disk.c
> > > > > > > 259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] 
> > > > > > > swsusp: Change code ordering in user.c
> > > > > > > 
> > > > > > 
> > > > > > (Yep, it was in my "to analyze" queue).
> > > > > > 
> > > > > > > I already reported about it, but now i know the reason why 
> > > > > > > suspend breaks.
> > > > > > > 
> > > > > > > The problem is that both cpu_up/cpu_down were allowed to sleep 
> > > > > > > until now, 
> > > > > > > and it did work because those functions could be called only in 
> > > > > > > process context
> > > > > > > (the one that writes to /sys/devices/system/cpu/cpu*/online) or  
> > > > > > > idle thread  that does smp_init()).
> > > > > > > 
> > > > > > > But now they are called _after_ all tasks were suspended, so if 
> > > > > > > cpu_down tries for example to take a lock
> > > > > > > that is taken by different process, it can't since the different 
> > > > > > > proccess is frozen and can't release the lock.
> > > > > > > 
> > > > > > 
> > > > > > Thanks for detailed explanation.
> > > > > > 
> > > > > > ...but, on my machine suspend works ok in -rc4. I'm not seeing this.
> > > > > > 
> > > > > > ...by design, "frozen" tasks must not hold any locks. If frozen task
> > > > > > holds a lock, that's a bug.
> > > > > > 
> > > > > > > Or, it is also possible to revert this change.
> > > > > > 
> > > > > > Are you using xfs?
> > > > > 
> > > > > Well, this is the only case that can trigger it.  There are no other 
> > > > > freezable
> > > > > workqueues.
> > > > > 
> > > > > Greetings,
> > > > > Rafael
> > > > > 
> > > > 
> > > > Hello,
> > > > 
> > > > Yes, you are right and it is XFS
> > > > 
> > > > System suspends and resumes with xfs and your patch correctly,
> > > 
> > > Could you please sent this information to the list?  I'd like it to reach 
> > > all
> > > of the CCed parites. ;-)
> > 
> > I did now ( sorry I just keep using this Answer command, instead of Answer 
> > to everybody)
> > I didn't intend to send private email.
> > > 
> > > > Of course I need to mention that I had to unload microcode 
> > > > update driver because it prevented resume,
> > > > because it calls firmware loader helper, and again sleeps on 
> > > > lock
> > > 
> > > This is interesting.  Did it happen before or is it a regression?
> > 
> > It is from the same group of bugs , I mean hang because cpu_up/down is 
> > called with frozen tasks
> > Of course it didn't happen before those reordering commits were introduced
> 
> Well, we want cpu_up/down to be called after processes have been frozen, for
> various reasons (one of them being that applications shouldn't see us playing
> with the CPUs).
> 
> Thanks for reporting this, I'll have a look at the microcode update driver.
> 
> > > > And also I noticed now that system oopses on second attempt to 
> > > > suspend ether to ram or disk
> > > > in pci_restore_msi_state which is called indirectly by 
> > > > ahci_pci_device_resume, I will investigate this soon.
> > > 
> > > Thanks.  We've had such reports earlier, but I think the problem is still 
> > > unresolved.  Any
> > > additional information will be valuable.
> > 
> > I will do my best,
> > Also I want to note that the above problem is 100% repeatable, and happens 
> > independently whenever suspend to disk
> > or suspend to ram was used in first successful try ( or at least, I got 
> > back-trace using kdb, after suspend to disk, after
> > suspend to ram system hang, so I assume, that this it is same problem , 
> > because it didn't hang of first try)
> 
> Thanks for the information.
> 
> BTW, what's the last kernel you have tested?
> 
> Rafael
> 

Hello,
Thanks for the quick response.
I will continue to test my system.

I am using the very latest git tree from Linus.

The kernel that works is 2.6.20. Apart from a very weird hang that happens
sometimes (about 1 in 5 or 6 times) on resume from RAM, everything works there.
I described it in http://lkml.org/lkml/2007/3/16/126

Regards,
Maxim Levitsky

fix extra BIOS invocation during resume

2007-03-21 Thread Pavel Machek
It causes extra moon icons blinking on x60, and breaks at least two
other systems.

During resume, we do not know that the "reboot"/"shutdown" method was
used, so we assume "platform" and call BIOS, anyway...

This is 2.6.21 material, and should fix 2 or 3 regressions from 2.6.20.

Signed-off-by: Pavel Machek <[EMAIL PROTECTED]>
Acked-by:  "Rafael J. Wysocki" <[EMAIL PROTECTED]>


 kernel/power/disk.c |8 
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/kernel/power/disk.c b/kernel/power/disk.c
index 873cdf8..dee0ff4 100644
--- a/kernel/power/disk.c
+++ b/kernel/power/disk.c
@@ -241,18 +241,11 @@ static int software_resume(void)
goto Done;
}
 
-   error = platform_prepare();
-   if (error) {
-   swsusp_free();
-   goto Thaw;
-   }
-
pr_debug("PM: Reading swsusp image.\n");
 
error = swsusp_read();
if (error) {
swsusp_free();
-   platform_finish();
goto Thaw;
}
 
@@ -270,7 +263,6 @@ static int software_resume(void)
enable_nonboot_cpus();
  Free:
swsusp_free();
-   platform_finish();
device_resume();
resume_console();
  Thaw:


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH 2/4] i386 GDT cleanups: Use per-cpu GDT immediately upon boot

2007-03-21 Thread Rusty Russell
On Wed, 2007-03-21 at 10:51 -0600, Eric W. Biederman wrote:
> Rusty Russell <[EMAIL PROTECTED]> writes:
> 
> > On Wed, 2007-03-21 at 03:31 -0600, Eric W. Biederman wrote:
> >> Rusty Russell <[EMAIL PROTECTED]> writes:
> >> > -/*
> >> > - * The boot_gdt_table must mirror the equivalent in setup.S and is
> >> > - * used only for booting.
> >> > - */
> >> 
> >> It looks like you are killing a useful comment here for no good reason.
> >
> > Hi Eric,
> >
> > I think one has to look harder, then.  There is no "equivalent in
> > setup.S": there is no setup.S, and it's certainly not clear what GDT
> > this "must mirror": it doesn't mirror any GDT at the moment.
> 
> see the gdt in:
> arch/i386/boot/setup.S

Erk, what a dumb mistake.  Apologies for my snarky comment above 8(

> If anything the comment should read these values are fixed by the boot
> protocol and we can't change them.

Since lguest doesn't use setup.S, it's outside my experience.  I'll just
leave the comment, and try to pretend this never happened 8)

Thanks muchly,
Rusty.
==
Now we are no longer dynamically allocating the GDT, we don't need the
"cpu_gdt_table" at all: we can switch straight from "boot_gdt_table"
to the per-cpu GDT.  This means initializing the cpu_gdt array in C.

The boot CPU uses the per-cpu var directly, then in smp_prepare_cpus()
it switches to the per-cpu copy just allocated.  For secondary CPUs,
the early_gdt_descr is set to point directly to their per-cpu copy.

For UP the code is very simple: it keeps using the "per-cpu" GDT as
per SMP, but we never have to move.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
---
 arch/i386/kernel/cpu/common.c|   74 --
 arch/i386/kernel/head.S  |   65 -
 arch/i386/kernel/smpboot.c   |   59 ++-
 arch/i386/mach-voyager/voyager_smp.c |6 --
 include/asm-i386/desc.h  |2
 include/asm-i386/processor.h |1
 6 files changed, 77 insertions(+), 130 deletions(-)

diff -r 9db59163584b arch/i386/kernel/cpu/common.c
--- a/arch/i386/kernel/cpu/common.c Thu Mar 22 10:54:53 2007 +1100
+++ b/arch/i386/kernel/cpu/common.c Thu Mar 22 10:56:49 2007 +1100
@@ -25,7 +25,33 @@ DEFINE_PER_CPU(struct Xgt_desc_struct, c
 DEFINE_PER_CPU(struct Xgt_desc_struct, cpu_gdt_descr);
 EXPORT_PER_CPU_SYMBOL(cpu_gdt_descr);
 
-DEFINE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]);
+DEFINE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]) = {
+   [GDT_ENTRY_KERNEL_CS] = { 0x, 0x00cf9a00 },
+   [GDT_ENTRY_KERNEL_DS] = { 0x, 0x00cf9200 },
+   [GDT_ENTRY_DEFAULT_USER_CS] = { 0x, 0x00cffa00 },
+   [GDT_ENTRY_DEFAULT_USER_DS] = { 0x, 0x00cff200 },
+   /*
+* Segments used for calling PnP BIOS have byte granularity.
+* They code segments and data segments have fixed 64k limits,
+* the transfer segment sizes are set at run time.
+*/
+   [GDT_ENTRY_PNPBIOS_CS32] = { 0x, 0x00409a00 },/* 32-bit code */
+   [GDT_ENTRY_PNPBIOS_CS16] = { 0x, 0x9a00 },/* 16-bit code */
+   [GDT_ENTRY_PNPBIOS_DS] = { 0x, 0x9200 }, /* 16-bit data */
+   [GDT_ENTRY_PNPBIOS_TS1] = { 0x, 0x9200 },/* 16-bit data */
+   [GDT_ENTRY_PNPBIOS_TS2] = { 0x, 0x9200 },/* 16-bit data */
+   /*
+* The APM segments have byte granularity and their bases
+* are set at run time.  All have 64k limits.
+*/
+   [GDT_ENTRY_APMBIOS_BASE] = { 0x, 0x00409a00 },/* 32-bit code */
+   /* 16-bit code */
+   [GDT_ENTRY_APMBIOS_BASE+1] = { 0x, 0x9a00 },
+   [GDT_ENTRY_APMBIOS_BASE+2] = { 0x, 0x00409200 }, /* data */
+
+   [GDT_ENTRY_ESPFIX_SS] = { 0x, 0x00c09200 },
+   [GDT_ENTRY_PDA] = { 0x, 0x00c09200 }, /* set in setup_pda */
+};
 
 DEFINE_PER_CPU(struct i386_pda, _cpu_pda);
 EXPORT_PER_CPU_SYMBOL(_cpu_pda);
@@ -618,46 +644,6 @@ struct i386_pda boot_pda = {
.pcurrent = &init_task,
 };
 
-static inline void set_kernel_fs(void)
-{
-   /* Set %fs for this CPU's PDA.  Memory clobber is to create a
-  barrier with respect to any PDA operations, so the compiler
-  doesn't move any before here. */
-   asm volatile ("mov %0, %%fs" : : "r" (__KERNEL_PDA) : "memory");
-}
-
-/* Initialize the CPU's GDT and PDA.  This is either the boot CPU doing itself
-   (still using cpu_gdt_table), or a CPU doing it for a secondary which
-   will soon come up. */
-__cpuinit void init_gdt(int cpu, struct task_struct *idle)
-{
-   struct Xgt_desc_struct *cpu_gdt_descr = &per_cpu(cpu_gdt_descr, cpu);
-   struct desc_struct *gdt = per_cpu(cpu_gdt, cpu);
-   struct i386_pda *pda = &per_cpu(_cpu_pda, cpu);
-
-   memcpy(gdt, cpu_gdt_table, GDT_SIZE);
-   cpu_gdt_descr->address = (unsigned long)gdt;
-   cpu_gdt_descr->size = GDT_SIZE - 1;
-
-   pack_descriptor((u32 

Re: [PATCH] I/O space boot parameter

2007-03-21 Thread Greg KH
On Wed, Mar 21, 2007 at 09:37:52AM -0400, Daniel Yeisley wrote:
> On Tue, 2007-03-20 at 13:26 -0700, Greg KH wrote:
> > On Tue, Mar 20, 2007 at 01:25:38PM -0400, Daniel Yeisley wrote:
> > > On Tue, 2007-03-20 at 11:00 -0700, Greg KH wrote:
> > > > On Tue, Mar 20, 2007 at 12:18:24PM -0400, Daniel Yeisley wrote:
> > > > > It has been mentioned before that large systems with a lot of PCI 
> > > > > buses
> > > > > have issues with the 64k I/O space limit.  The ES7000 has a BIOS 
> > > > > option
> > > > > to either assign I/O space to all adapters, or only to those that need
> > > > > it.  A list of supported adapters that don't need it is kept in the
> > > > > BIOS.  When this option is used, the kernel sees the BARs on the
> > > > > adapters and still tries to assign I/O space (until it runs out).  
> > > > > I've
> > > > > written a patch to implement a boot parameter that tells the kernel 
> > > > > not
> > > > > to assign I/O space if the BIOS hasn't.  
> > > > 
> > > > How prelevant are machines like this?  And why are the BARs on these
> > > > devices wrong?
> > > > 
> > > I don't have any sales numbers, but I can tell you that our current
> > > systems can have up to 64 PCI buses.  
> > > 
> > > I've been working with Emulex cards, and my understanding is that the
> > > BARs on the devices aren't wrong, but we can't allocate 4k of I/O space
> > > for each one.  So we maintain a list in the BIOS of devices that don't
> > > actually need I/O space and then don't assign it.  I've tested an a
> > > x86_64 system with 20+ adapters and saw all the disks attached without
> > > any problems.
> > 
> > Ah.  Others are working on providing a fix for this too, but it is being
> > done in the drivers themselves, not in the pci core.  Look in the
> > linux-pci mailing list archives for those patches (I don't think they
> > every went into mainline for some reason, but I might be wrong...)
> > 
> > I suggest you work with those developers, as they have the same issue
> > that you are trying to solve here.
> > 
> 
> I have seen some patches that make the drivers I/O port free here:
> http://lkml.org/lkml/2006/2/26/261

Ah, yes, those are the ones.

> I checked and they still aren't in the mainline.  

Poke the developer to get them there :)

> I don't know that it matters though because I see all the disks attached
> to the system regardless of whether or not the adapters get I/O space.
> The real issue I have is with all the error messages I get at boot.  I
> see 40+ messages that say "PCI: Failed to allocate I/O
> resource..." (from setup-res.c) when the kernel tries to allocate the
> I/O space and can't.  The modules load fine.  I see all the disks just
> fine.  But that many error messages tends to raise concerns and causes
> support calls from customers.

If this isn't an issue for functionality, why not fix your BIOS then?

And doesn't the above linked patch set also solve your issue with the
noise in the syslog?

thanks,

greg k-h
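
The arithmetic behind the limit discussed in this thread, as a small sketch.
It uses only the two figures quoted above (64k of I/O space, 4k per adapter)
and ignores any ports already taken by other devices; the helper names are
hypothetical.

```c
#include <assert.h>

#define IO_SPACE_BYTES  (64u * 1024u)  /* 64k I/O space limit quoted above */
#define IO_WINDOW_BYTES (4u * 1024u)   /* 4k of I/O space per adapter */

/* Upper bound on how many 4k windows fit into the 64k I/O space. */
static unsigned int max_io_windows(void)
{
	return IO_SPACE_BYTES / IO_WINDOW_BYTES;
}

/* Nonzero if every adapter could get its own 4k window. */
static int io_space_sufficient(unsigned int adapters)
{
	return adapters <= max_io_windows();
}

void demo_io_space(void)
{
	assert(max_io_windows() == 16);
	assert(io_space_sufficient(16));
	assert(!io_space_sufficient(20));  /* the 20+ adapter test system */
}
```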


Re: [PATCH] Delete obsolete RAW driver feature.

2007-03-21 Thread Andrew Morton
On Wed, 21 Mar 2007 19:42:36 -0400
Dave Jones <[EMAIL PROTECTED]> wrote:

> On Wed, Mar 21, 2007 at 04:27:17PM -0700, Andrew Morton wrote:
>  > > [1] Though admittedly the one in RHEL deviates from upstream
>  > > as it contains performance enhancements that were vetoed from
>  > > upstream acceptance due to it being "deprecated".
>  > 
>  > What enhancements are they?
> 
> Hmm, actually it seems I was mistaken, we didn't merge those after all,
> we seem to be shipping what was upstream in 2.6.9/2.6.18.
> 
> The patches I was thinking of were a bunch of optimisations for higher
> throughput from someone at Intel iirc. Ken Chen maybe?

Ah, OK.

Ken had an initial set of patches which weren't very popular.  Months
later, he had a second set which did get merged into mainline, but they
broke, so that code is presently mucking up fs/block_dev.c, inside #if 0,
awaiting possible repair.




Re: [PATCH] - fix compile warning: `found' might be used uninitialized in this function

2007-03-21 Thread Johannes Weiner
Hi,

On Thu, Mar 22, 2007 at 12:48:05AM +0100, roland wrote:
> fs/block_dev.c: In function `bd_claim_by_kobject':
> fs/block_dev.c:953: warning: `found' might be used uninitialized in this 
> function

found actually _is_ used uninitialized if the call to bd_claim() returns
anything but 0.  Thank you!

=Hannes


Re: [RFC] : Is /proc/kcore still usefull and/or maintained ?

2007-03-21 Thread Eric Dumazet
I stand corrected : This is a new bug

The /proc/kcore problem appears with linux-2.6.21-rc4-mm1

fd = open("/proc/kcore", 0);
llseek(fd, ...) returns an -EINVAL error


Quick code inspection (before going to sleep...) shows that

proc_reg_llseek() (file fs/proc/inode.c)

is doing something like :

rv = -EINVAL;
llseek = pde->proc_fops->llseek;
spin_unlock(&pde->pde_unload_lock);
if (llseek)
rv = llseek(file, offset, whence);

As kcore dont have a .llseek handler, proc_reg_llseek() returns -EINVAL;

Previous kernel was probably calling a default llseek() handler.

if (!llseek)
llseek = default_llseek;

Hum ???
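
The difference between the two behaviours can be modelled in user space. This
is only a sketch of the dispatch logic described above: fake_fops,
fake_default_llseek() and the two seek helpers are hypothetical stand-ins, not
the kernel's file_operations or default_llseek().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef long long (*llseek_fn)(long long pos, long long offset);

/* Stand-in for a file_operations table with an optional llseek handler. */
struct fake_fops { llseek_fn llseek; };

/* Stand-in default handler: treats every seek as absolute. */
static long long fake_default_llseek(long long pos, long long offset)
{
	(void)pos;
	return offset;
}

/* Behaviour described in the report: no handler means -EINVAL. */
static long long seek_without_fallback(const struct fake_fops *fops,
				       long long pos, long long offset)
{
	long long rv = -EINVAL;

	if (fops->llseek)
		rv = fops->llseek(pos, offset);
	return rv;
}

/* Suggested behaviour: substitute a default handler when none is set. */
static long long seek_with_fallback(const struct fake_fops *fops,
				    long long pos, long long offset)
{
	llseek_fn llseek = fops->llseek;

	if (!llseek)
		llseek = fake_default_llseek;
	return llseek(pos, offset);
}

void demo_llseek(void)
{
	struct fake_fops kcore_like = { .llseek = NULL };

	assert(seek_without_fallback(&kcore_like, 0, 4096) == -EINVAL);
	assert(seek_with_fallback(&kcore_like, 0, 4096) == 4096);
}
```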


Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems

2007-03-21 Thread Rafael J. Wysocki
On Thursday, 22 March 2007 00:39, Maxim wrote:
> On Thursday 22 March 2007 01:24:25 Rafael J. Wysocki wrote:
> > On Thursday, 22 March 2007 00:09, Maxim wrote:
> > > On Thursday 22 March 2007 00:39:02 you wrote:
> > > > On Wednesday, 21 March 2007 23:21, Pavel Machek wrote:
> > > > > Hi!
> > > > > 
> > > > > > Starting with 2.6.21-rc1 suspend to ram and disk doesn't work 
> > > > > > anymore on my system.
> > > > > > 
> > > > > > I did a git-bisect and found that those commits break it:
> > > > > > 
> > > > > > e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: 
> > > > > > Change code ordering in main.c
> > > > > > ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: 
> > > > > > Change code ordering in disk.c
> > > > > > 259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: 
> > > > > > Change code ordering in user.c
> > > > > > 
> > > > > 
> > > > > (Yep, it was in my "to analyze" queue).
> > > > > 
> > > > > > I already reported about it, but now i know the reason why suspend 
> > > > > > breaks.
> > > > > > 
> > > > > > > The problem is that both cpu_up/cpu_down were allowed to sleep until
> > > > > > > now, and it did work because those functions could be called only in
> > > > > > > process context (the one that writes to
> > > > > > > /sys/devices/system/cpu/cpu*/online, or the idle thread that does
> > > > > > > smp_init()).
> > > > > > > 
> > > > > > > But now they are called _after_ all tasks were suspended, so if
> > > > > > > cpu_down tries, for example, to take a lock that is held by a
> > > > > > > different process, it can't, since that process is frozen and can't
> > > > > > > release the lock.
> > > > > > 
> > > > > 
> > > > > Thanks for detailed explanation.
> > > > > 
> > > > > ...but, on my machine suspend works ok in -rc4. I'm not seeing this.
> > > > > 
> > > > > ...by design, "frozen" tasks must not hold any locks. If frozen task
> > > > > holds a lock, that's a bug.
> > > > > 
> > > > > > Or, it is also possible to revert this change.
> > > > > 
> > > > > Are you using xfs?
> > > > 
> > > > Well, this is the only case that can trigger it.  There are no other 
> > > > freezable
> > > > workqueues.
> > > > 
> > > > Greetings,
> > > > Rafael
> > > > 
> > > 
> > > Hello,
> > > 
> > >   Yes, you are right and it is XFS
> > > 
> > >   System suspends and resumes with xfs and your patch correctly,
> > 
> > Could you please send this information to the list?  I'd like it to reach
> > all of the CCed parties. ;-)
> 
> I did now (sorry, I just keep using this Answer command instead of Answer to
> everybody).
> I didn't intend to send private email.
> > 
> > >   Of course I need to mention that I had to unload microcode update 
> > > driver because it prevented resume,
> > >   because it calls firmware loader helper, and again sleeps on lock
> > 
> > This is interesting.  Did it happen before or is it a regression?
> 
> It is from the same group of bugs, I mean a hang because cpu_up/down is
> called with frozen tasks.
> Of course it didn't happen before those reordering commits were introduced.

Well, we want cpu_up/down to be called after processes have been frozen, for
various reasons (one of them being that applications shouldn't see us playing
with the CPUs).

Thanks for reporting this, I'll have a look at the microcode update driver.

> > >   And also I noticed now that the system oopses on the second attempt to
> > > suspend either to ram or disk
> > >   in pci_restore_msi_state, which is called indirectly by
> > > ahci_pci_device_resume. I will investigate this soon.
> > 
> > Thanks.  We've had such reports earlier, but I think the problem is still 
> > unresolved.  Any
> > additional information will be valuable.
> 
> I will do my best,
> Also I want to note that the above problem is 100% repeatable, and happens
> regardless of whether suspend to disk or suspend to ram was used in the
> first successful try (or at least, I got a back-trace using kdb after
> suspend to disk; after suspend to ram the system hung, so I assume that it
> is the same problem, because it didn't hang on the first try).

Thanks for the information.

BTW, what's the last kernel you have tested?

Rafael


Re: [PATCH 1/3] add pfn_valid_within helper for sub-MAX_ORDER hole detection

2007-03-21 Thread Andrew Morton
On Thu, 22 Mar 2007 10:23:27 +1100
Nick Piggin <[EMAIL PROTECTED]> wrote:

> Andy Whitcroft wrote:
> > Generally we work under the assumption that the mem_map
> > array is contiguous and valid out to MAX_ORDER_NR_PAGES block
> > of pages, ie. that if we have validated any page within this
> > MAX_ORDER_NR_PAGES block we need not check any other.  This is not
> > true when CONFIG_HOLES_IN_ZONE is set and we must check each and
> > every reference we make from a pfn.
> > 
> > Add a pfn_valid_within() helper which should be used when scanning
> > pages within a MAX_ORDER_NR_PAGES block when we have already
> > checked the validity of the block normally with pfn_valid().
> > This can then be optimised away when we do not have holes within
> > a MAX_ORDER_NR_PAGES block of pages.
> 
> Nice cleanup. Horrible name ;) Calls read like "is the pfn valid
> within pfn".

yeah
 
> I can't think of anything really good, but I think, say,
> pfn_valid_within_block or pfn_valid_within_valid_block would be a
> bit better. You still get a slight net savings in keystrokes!

Neither of those identifiers seems to really fit, and I can't think of anything
suitable either.  Oh well, at least pfn_valid_within() has a nice comment
explaining what it does.
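
For readers without the patch at hand, the helper Andy describes can be
sketched roughly as below. This is a userspace model under stated
assumptions, not the patch itself: pfn_valid_model() is a hypothetical
stand-in for pfn_valid() that treats one pfn as a hole, HOLES_IN_ZONE_MODEL
is an illustrative switch standing in for CONFIG_HOLES_IN_ZONE, and
count_valid_in_block() is an invented example caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical memory layout for the model: pfn 5 is a hole. */
#define MODEL_HOLE_PFN 5UL

/* Stand-in for the kernel's pfn_valid(); assumption for illustration. */
static bool pfn_valid_model(unsigned long pfn)
{
	return pfn != MODEL_HOLE_PFN;
}

/* Illustrative switch standing in for CONFIG_HOLES_IN_ZONE. */
#define HOLES_IN_ZONE_MODEL 1

#if HOLES_IN_ZONE_MODEL
/* Holes possible inside a block: every pfn has to be checked. */
#define pfn_valid_within(pfn) pfn_valid_model(pfn)
#else
/* No holes inside a block: the check compiles away to a constant. */
#define pfn_valid_within(pfn) (1)
#endif

/*
 * Count the valid pages in [start, start + nr). The caller is expected to
 * have validated the block via its first pfn already, as described above.
 */
static unsigned long count_valid_in_block(unsigned long start,
					  unsigned long nr)
{
	unsigned long pfn, count = 0;

	assert(pfn_valid_model(start));
	for (pfn = start; pfn < start + nr; pfn++) {
		if (!pfn_valid_within(pfn))
			continue;
		count++;
	}
	return count;
}
```

With HOLES_IN_ZONE_MODEL set to 0 the per-pfn test disappears from the loop
entirely, which is the optimisation the changelog refers to.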

