date:20071103

Re: [PATCH] replace "make ARCH=i386/x86_64 with make ARCH=x86"

2007-11-03 Thread Jeremy Fitzhardinge

Jeff Garzik wrote:
> This also opens a chicken-and-egg problem...  What kind of config is
> generated by allmodconfig when ARCH==x86?  There is no good answer.

With a unified x86 architecture, the decision to compile in 32 or
64-bit mode isn't really different from SMP vs UP, PAE vs non-PAE and so
on.  It's just a config option with global effects.  Over time, the
number of config options which are really 32 or 64-bit specific will
probably be pretty small.

> The existing tradition of switching between 32-bit and 64-bit was
> quite nice, and it was done in The Obvious Way(tm) -- via the method
> for specifying the architecture/platform.  Switching to Kconfig for
> that decision is a step backwards in usability and IMO violates the
> Principle of Least Surprise.

The architecture is now x86, with a further 32 or 64-bit parameter.  We
already have config options for setting the sub-architecture.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22: pcspkr driver no longer loads automatically

2007-11-03 Thread Dmitry Torokhov

Hi,

On Wednesday 08 August 2007 15:32, Bill Nottingham wrote:
> Kay Sievers ([EMAIL PROTECTED]) said: 
> > It doesn't have any aliases, so seems it was never autoloaded.
> 
> It was - prior kernels loaded it via the uevent generated from 
> /devices/platform/pcspkr. Newer kernels seem to never actually
> trigger a uevent from that (tested with a combination of
> udevmonitor and 'udevtrigger --subsystem-match=platform'.)
> 

The patch below should restore generation of uevents for pcspkr devices.
Since the devices are created not in the pcspkr module but in arch setup
code, this is the right (and safe) thing to do.

-- 
Dmitry


pcspkr: restore uevent generation

Make sure that we generate uevents when creating pcspkr devices
so that userspace will load the pcspkr driver.

Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]>
---
 arch/alpha/kernel/setup.c          |    2 ++
 arch/mips/kernel/pcspeaker.c       |    2 ++
 arch/powerpc/kernel/setup-common.c |    2 ++
 arch/x86/kernel/pcspeaker.c        |    2 ++
 4 files changed, 8 insertions(+)

Index: work/arch/alpha/kernel/setup.c
===
--- work.orig/arch/alpha/kernel/setup.c
+++ work/arch/alpha/kernel/setup.c
@@ -1501,6 +1501,8 @@ static __init int add_pcspkr(void)
 	if (!pd)
 		return -ENOMEM;
 
+	pd->dev.uevent_suppress = 0;
+
 	ret = platform_device_add(pd);
 	if (ret)
 		platform_device_put(pd);
Index: work/arch/mips/kernel/pcspeaker.c
===
--- work.orig/arch/mips/kernel/pcspeaker.c
+++ work/arch/mips/kernel/pcspeaker.c
@@ -19,6 +19,8 @@ static __init int add_pcspkr(void)
 	if (!pd)
 		return -ENOMEM;
 
+	pd->dev.uevent_suppress = 0;
+
 	ret = platform_device_add(pd);
 	if (ret)
 		platform_device_put(pd);
Index: work/arch/powerpc/kernel/setup-common.c
===
--- work.orig/arch/powerpc/kernel/setup-common.c
+++ work/arch/powerpc/kernel/setup-common.c
@@ -454,6 +454,8 @@ static __init int add_pcspkr(void)
 	if (!pd)
 		return -ENOMEM;
 
+	pd->dev.uevent_suppress = 0;
+
 	ret = platform_device_add(pd);
 	if (ret)
 		platform_device_put(pd);
Index: work/arch/x86/kernel/pcspeaker.c
===
--- work.orig/arch/x86/kernel/pcspeaker.c
+++ work/arch/x86/kernel/pcspeaker.c
@@ -11,6 +11,8 @@ static __init int add_pcspkr(void)
 	if (!pd)
 		return -ENOMEM;
 
+	pd->dev.uevent_suppress = 0;
+
 	ret = platform_device_add(pd);
 	if (ret)
 		platform_device_put(pd);

Massive slowdown when re-querying large nfs dir

2007-11-03 Thread Al Boldi

There is a massive (3-18x) slowdown when re-querying a large nfs dir (2k+ 
entries) using a simple ls -l.

On 2.6.23 client and server running userland rpc.nfs.V2:
first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
more tries: time -p ls -l <2k+ entry dir>  in ~8sec

first  try: time -p ls -l <5k+ entry dir>  in ~9sec
more tries: time -p ls -l <5k+ entry dir>  in ~180sec

On 2.6.23 client and 2.4.31 server running userland rpc.nfs.V2:
first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
more tries: time -p ls -l <2k+ entry dir>  in ~7sec

first  try: time -p ls -l <5k+ entry dir>  in ~8sec
more tries: time -p ls -l <5k+ entry dir>  in ~43sec

Remounting the nfs-dir on the client resets the problem.

Any ideas?


Thanks!

--
Al


[PATCH] Fix i2c module parameter permissions for read/write

2007-11-03 Thread Jon Smirl

The permissions of i2c module parameters were set to zero making the
parameters invisible and unsettable from the kernel command line. This
patch changes the permissions to the standard 0644 read/write.

Signed-off-by: Jon Smirl <[EMAIL PROTECTED]>
---

diff --git a/include/linux/i2c.h b/include/linux/i2c.h
index 8033e6b..395e430 100644
--- a/include/linux/i2c.h
+++ b/include/linux/i2c.h
@@ -588,7 +588,7 @@ union i2c_smbus_data {
 #define I2C_CLIENT_MODULE_PARM(var,desc) \
   static unsigned short var[I2C_CLIENT_MAX_OPTS] = I2C_CLIENT_DEFAULTS; \
   static unsigned int var##_num; \
-  module_param_array(var, short, &var##_num, 0); \
+  module_param_array(var, short, &var##_num, 0644); \
   MODULE_PARM_DESC(var,desc)

 #define I2C_CLIENT_MODULE_PARM_FORCE(name) \


-- 
Jon Smirl
[EMAIL PROTECTED]
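
For readers unfamiliar with the last argument of module_param_array(): it is
an octal permission mode for the parameter's sysfs file, and a mode of 0 means
no sysfs file is created at all. The following is only an illustrative
userspace sketch, not kernel code; the helper name mode_to_string is
hypothetical and exists only to show what the bits in 0644 stand for.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): render the low nine permission
 * bits of an octal mode as "rwxrwxrwx"-style text.
 * out must have room for at least 10 bytes.
 */
void mode_to_string(unsigned int mode, char out[10])
{
	static const char flags[] = "rwxrwxrwx";
	size_t i;

	/* Walk from the owner-read bit (0400) down to other-execute (0001). */
	for (i = 0; i < 9; i++)
		out[i] = (mode & (1u << (8 - i))) ? flags[i] : '-';
	out[9] = '\0';
}
```

Called with 0644 this yields "rw-r--r--" (owner read/write, everyone else
read-only); called with 0 it yields "---------".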

User Mode Linux still broken in 2.6.23.1

2007-11-03 Thread Rob Landley

Building with the attached .config on x86-64, it does this:

  CC  arch/um/kernel/smp.o
In file included from include/asm/arch/tlb.h:11,
                 from include/asm/tlb.h:4,
                 from arch/um/kernel/smp.c:8:
include/asm-generic/tlb.h: In function ‘tlb_flush_mmu’:
include/asm-generic/tlb.h:76: error: implicit declaration of function ‘release_pages’
include/asm-generic/tlb.h: In function ‘tlb_remove_page’:
include/asm-generic/tlb.h:105: error: implicit declaration of function ‘page_cache_release’
make[1]: *** [arch/um/kernel/smp.o] Error 1
make: *** [arch/um/kernel] Error 2

I've been doing the following to fix it.  I know it's not the right fix
(see the earlier thread about it at http://lkml.org/lkml/2007/8/24/441 ),
but could the one-line fix go into the -stable queue for 2.6.23 while a
proper fix goes into 2.6.24?

From: Rob Landley <[EMAIL PROTECTED]>

Fix build break in User Mode Linux 2.6.23.1.

Signed-off-by: Rob Landley <[EMAIL PROTECTED]>

---

 arch/um/kernel/smp.c |    1 +
 1 file changed, 1 insertion(+)

--- linux-2.6.23-rc3/arch/um/kernel/smp.c
+++ linux-2.6.23-new/arch/um/kernel/smp.c
@@ -5,6 +5,7 @@

 #include "linux/percpu.h"
 #include "asm/pgalloc.h"
+#include "linux/pagemap.h"
 #include "asm/tlb.h"

 /* For some reason, mmu_gathers are referenced when CONFIG_SMP is off. */
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23.1
# Sat Nov  3 20:20:52 2007
#
CONFIG_DEFCONFIG_LIST="arch/$ARCH/defconfig"
CONFIG_GENERIC_HARDIRQS=y
CONFIG_UML=y
CONFIG_MMU=y
CONFIG_NO_IOMEM=y
# CONFIG_TRACE_IRQFLAGS_SUPPORT is not set
CONFIG_LOCKDEP_SUPPORT=y
# CONFIG_STACKTRACE_SUPPORT is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_BUG=y
CONFIG_IRQ_RELEASE_METHOD=y

#
# UML-specific options
#
# CONFIG_STATIC_LINK is not set
CONFIG_MODE_SKAS=y
CONFIG_UML_X86=y
CONFIG_64BIT=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_TOP_ADDR=0x8000
CONFIG_3_LEVEL_PGTABLES=y
CONFIG_STUB_CODE=0x7fbfffe000
CONFIG_STUB_DATA=0x7fb000
CONFIG_STUB_START=0x7fbfffe000
# CONFIG_ARCH_HAS_SC_SIGNALS is not set
# CONFIG_ARCH_REUSE_HOST_VSYSCALL_AREA is not set
CONFIG_SMP_BROKEN=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
# CONFIG_SPARSEMEM_STATIC is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_RESOURCES_64BIT=y
CONFIG_ZONE_DMA_FLAG=0
CONFIG_VIRT_TO_BUS=y
CONFIG_LD_SCRIPT_DYN=y
# CONFIG_NET is not set
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_MISC is not set
CONFIG_HOSTFS=y
# CONFIG_MCONSOLE is not set
CONFIG_NEST_LEVEL=0
CONFIG_KERNEL_STACK_ORDER=1
# CONFIG_UML_REAL_TIME_CLOCK is not set

#
# General setup
#
# CONFIG_EXPERIMENTAL is not set
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=128
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
# CONFIG_SWAP is not set
# CONFIG_SYSVIPC is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_MODULES is not set
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_AS is not set
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
CONFIG_DEFAULT_NOOP=y
CONFIG_DEFAULT_IOSCHED="noop"
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_UBD is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
# CONFIG_MMAPPER is not set
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_RAM is not set

#
# Character Devices
#
CONFIG_STDERR_CONSOLE=y
CONFIG_STDIO_CONSOLE=y
# CONFIG_SSL is not set
# CONFIG_NULL_CHAN is not set
# CONFIG_PORT_CHAN is not set
# CONFIG_PTY_CHAN is not set
# CONFIG_TTY_CHAN is not set
# CONFIG_XTERM_CHAN is not set
CONFIG_NOCONFIG_CHAN=y
CONFIG_CON_ZERO_CHAN="fd:0,fd:1"
CONFIG_CON_CHAN="xterm"
CONFIG_SSL_CHAN="pty"
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_WATCHDOG is not set
# CONFIG_UML_SOUND is not set
# CONFIG_SOUND is not set
# CONFIG_HOSTAUDIO is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_UML_RANDOM is not set

#
# Generic Driver Options
#
CONFIG_STANDALONE=y
# CONFIG_PREVENT_FIRMWARE_BUILD is not set
# CONFIG_FW_LOADER is not set
# CONFIG_SYS_HYPERVISOR is not set

#
# Networking
#

#
# File

Re: "Fix ATAPI transfer lengths" causes CD writing regression

2007-11-03 Thread Albert Lee

Tejun Heo wrote:
> Daniel Drake wrote:
> 
>>Tejun Heo wrote:
>>
>>>><4>ata2.00: HSM violation: eh_analyze_tf: BUSY|DRQ
>>>><3>ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>>>><3>ata2.00: cmd a0/00:00:00:0a:00/00:00:00:00:00/a0 tag 0 cdb 0x5a data 10 in
>>>><4>         res 58/00:02:00:0a:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
>>>><3>ata2.00: status: { DRDY DRQ }
>>>><6>ata2: soft resetting link
>>>><6>ata2.00: configured for UDMA/33
>>>><6>ata2: EH complete
>>>
>>>Does this patch fix the problem?
>>
>>That fixes it, thanks! There is no more ugly error in dmesg, the test
>>prog doesn't print any sense data, and brasero works OK too. However,
>>these messages appear in the kernel log every time I run the test app
>>(or when brasero does its thing):
>>
>><4>ata2.00: 10 bytes trailing data
>><4>ata2.00: 10 bytes trailing data
>><4>ata2.00: 10 bytes trailing data
>><4>ata2.00: 10 bytes trailing data
>><4>ata2.00: 10 bytes trailing data
>><4>ata2.00: 10 bytes trailing data
>><4>ata2.00: 6 bytes trailing data
> 
> 
> Yeah, that's expected.  What's going on here is that your drive sends
> full mode sense data (76bytes) regardless of allocation size in CDB but
> it does honor transfer chunk set in the PACKET TF, which is set to the
> same value as allocation size by Alan's patch.  So, now the drive sends
> the 76 bytes in 8 chunks.  The first chunk is transferred into the sg
> buffer and the following chunks are thrown away.
> 
> Previously, transfer chunk was set to 8k, so the drive claims to
> transfer 76 bytes from the beginning; libata transfers the leading 10
> bytes into the user buffer and throws away what's remaining.
> The change caused a problem because libata HSM always switches to
> HSM_ST_LAST (command sequence completion) after filling the command
> buffer completely.  So, throwing away is activated iff the extra data to
> throw away is transferred together with the last chunk of useful data.
> 
> With the chunk size reduced to allocation size, the initial chunk fills
> the data buffer completely and all the extra bytes are transferred in
> separate chunks.  However, libata HSM expects command sequence to
> complete after the initial chunk but the drive asserts DRQ for the next
> chunk on the following interrupt, so HSM violation is triggered.
> 
> The patch modifies HSM such that it keeps throwing away extra data as
> long as the drive asserts DRQ which is how IDE driver does it.
> 

From past experience, some dead ATAPI devices get stuck at DRQ=1. We should
take care of such a situation; otherwise the HSM might get into an infinite
loop of waiting for the dead ATAPI device to say DRQ=0 and discarding
endless "trailing data".

Maybe we could set a limit here. If the ATAPI device keeps DRQ=1 and
exceeds the limit, we consider it an HSM violation and have EH handle it.
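
To make the idea concrete, here is a minimal userspace sketch of a bounded
drain loop. It is not libata code: struct fake_atapi_dev, drq_asserted(),
read_chunk(), transfer_and_drain() and the max_drain parameter are all
hypothetical names invented for this illustration, and the "device" is just
a byte counter.

```c
#include <stddef.h>

/* Hypothetical device model, for illustration only. */
struct fake_atapi_dev {
	size_t bytes_left;	/* data the device still wants to send */
	size_t chunk;		/* bytes offered per DRQ assertion */
	int stuck_drq;		/* non-zero: DRQ never deasserts (dead device) */
};

/* DRQ stays asserted while data remains, or forever if the device is stuck. */
int drq_asserted(const struct fake_atapi_dev *dev)
{
	return dev->stuck_drq || dev->bytes_left > 0;
}

/* Take one chunk from the device and return its size in bytes. */
size_t read_chunk(struct fake_atapi_dev *dev)
{
	size_t n;

	if (dev->stuck_drq)
		return dev->chunk;
	n = dev->bytes_left < dev->chunk ? dev->bytes_left : dev->chunk;
	dev->bytes_left -= n;
	return n;
}

/*
 * Fill up to buf_len useful bytes, then keep discarding trailing data for
 * as long as DRQ is asserted, but give up once more than max_drain bytes
 * have been discarded.  Returns the number of discarded bytes, or -1 when
 * the limit is exceeded (the caller would then treat it as an HSM
 * violation and hand over to EH).
 */
long transfer_and_drain(struct fake_atapi_dev *dev, size_t buf_len,
			size_t max_drain)
{
	size_t filled = 0, discarded = 0;

	while (drq_asserted(dev)) {
		size_t n = read_chunk(dev);
		size_t useful = 0;

		if (n == 0)
			return -1;	/* DRQ set but no data: give up */
		if (filled < buf_len) {
			useful = buf_len - filled;
			if (useful > n)
				useful = n;
			filled += useful;
		}
		discarded += n - useful;
		if (discarded > max_drain)
			return -1;
	}
	return (long)discarded;
}
```

With 76 bytes pending, a 10-byte chunk and a 10-byte buffer, the sketch
discards 66 bytes, which matches the six "10 bytes trailing data" lines and
the one "6 bytes trailing data" line in the log above. With stuck_drq set it
returns -1 once the limit is crossed instead of looping forever.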

--
albert




Re: 2.6.24-rc1 freezes on powerbook at first boot stage

2007-11-03 Thread Nathan Lynch

(cc'ing linuxppc-dev, see
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg221770.html
for original post and .config)

Elimar Riesebieter wrote:
> On Wed, 24 Oct 2007 the mental interface of
> Elimar Riesebieter told:
> 
> [...]
> > The kernel is loaded from firmware but freezes at the moment to load
> > the radeon framebuffer. I can't get any debug info (don't know
> > how?).
> 
> Screen dump till freeze:
> 
> Using PowerMac machine description
> Total memory = 1024MB; using 2048kB for hash table (at cfe0)
> Linux version 2.6.24-rc1-aragorn ([EMAIL PROTECTED]) (gcc version 4.2.3 
> 20071014 (prelelease) (Debian 4.2.2-3)) #1 Wed Oct 24 12:48:27 CEST 2007
> Found UniNorth memory controller & host bridge @ 0xf800 revision: 0xd2
> Mapped at 0xfdfc
> Found a Intrepid mac-io controller, rev: 0, mapped at 0xfdf4
> PowerMac motherboard: PowerBook G4 15"
> console [udbg0] enabled
> setup_arch: bootmem
> Found UniNorth PCI host bridge at 0xf000. Firmware bus number: 
> 0->1
> Found UniNorth PCI host bridge at 0xf200. Firmware bus number: 
> 0->1
> Found UniNorth PCI host bridge at 0xf400. Firmware bus number: 
> 0->1
> via-pmu: Server Mode is disabled
> PMU driver v2 initialized for Core99, firmware: 0c
> nvram: Checking bank 0...
> nvram: gen0=741, gen1=740
> nvram: Active bank is: 0
> nvram: OF partition at 0x410
> nvram: XP partition at 0x1020
> nvram: NR partition at 0x1120
> arch: exit
> Zone PFN ranges:
>   DMA 0 ->   196608
>   Normal 196608 ->   196608
>   HighMem196608 ->   262144
> Movable zone start PFN for each node
> early_node_map[1] active PFN ranges
> 0:0 ->   262144
> Built 1 zonelists in Zone order.  Total pages: 260096
> Kernel command line: root=/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL 
> PROTECTED]:5 root=/dev/hda5
> mpic: Setting up MPIC " MPIC 1   " version 1.2 at 8004, max 4 CPUs
> mpic: ISU size: 64, shift: 6, mask: 3f
> mpic: Initializing for 64 sources
> PID hash table entries: 4096 (order: 12, 16384 bytes)
> GMT Delta read from XPRAM: 0 minutes, DST: off
> clocksource: timebase mult[d9038e4] shift [22] registered
> clockevent: decremeter mult[4b7] shift[16] cpu[0]
> Console: colour dummy device 80x25
> console handover: boot [udbg0] -> real [tty0]

Does 2.6.23 (or any earlier kernel) work?


Re: CLOCK_TICK_RATE in NTP code

2007-11-03 Thread Roman Zippel

Hi,

On Thursday 01 November 2007, Ralf Baechle wrote:

> kernel/time/ntp.c contains the following piece of code:
>
> #define CLOCK_TICK_OVERFLOW	(LATCH * HZ - CLOCK_TICK_RATE)
> #define CLOCK_TICK_ADJUST	(((s64)CLOCK_TICK_OVERFLOW * NSEC_PER_SEC) / \
> 					(s64)CLOCK_TICK_RATE)
>
> static void ntp_update_frequency(void)
> {
> 	u64 second_length = (u64)(tick_usec * NSEC_PER_USEC * USER_HZ)
> 				<< TICK_LENGTH_SHIFT;
> 	second_length += (s64)CLOCK_TICK_ADJUST << TICK_LENGTH_SHIFT;
> 	second_length += (s64)time_freq << (TICK_LENGTH_SHIFT - SHIFT_NSEC);
>
> 	tick_length_base = second_length;
>
> 	do_div(second_length, HZ);
> 	tick_nsec = second_length >> TICK_LENGTH_SHIFT;
>
> 	do_div(tick_length_base, NTP_INTERVAL_FREQ);
> }
>
> So it uses CLOCK_TICK_RATE which on many systems but not all is defined to
> the i8253 input clock.  But timekeeping on anything remotely modern makes
> little use of the i8253 so I wonder what the intent was here.

The basic idea is to provide a base frequency adjustment.  When I wrote this I
already wasn't entirely happy that it was hardcoded like this, but in the end
I simply reimplemented what the old code did.
It's not strictly needed, so if someone wants to add something like:

#ifndef CLOCK_TICK_RATE
#define CLOCK_TICK_ADJUST 0
#else
...

it would be fine with me.
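
For anyone wondering how large this base adjustment is, the arithmetic can
be tried in userspace. This is only a sketch: it assumes LATCH is defined as
(CLOCK_TICK_RATE + HZ/2) / HZ, i.e. timer ticks per jiffy rounded to
nearest, and the function names are made up for the example. The 1193182 Hz
rate used in the note below is the usual i8253 input clock and is likewise
an assumption, not something taken from this thread.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Assumed LATCH definition: timer ticks per jiffy, rounded to nearest. */
int64_t latch(int64_t clock_tick_rate, int64_t hz)
{
	return (clock_tick_rate + hz / 2) / hz;
}

/* Ticks gained (or lost, if negative) per second by rounding LATCH. */
int64_t clock_tick_overflow(int64_t clock_tick_rate, int64_t hz)
{
	return latch(clock_tick_rate, hz) * hz - clock_tick_rate;
}

/* The same rounding error expressed in nanoseconds per second. */
int64_t clock_tick_adjust(int64_t clock_tick_rate, int64_t hz)
{
	return clock_tick_overflow(clock_tick_rate, hz) * NSEC_PER_SEC /
	       clock_tick_rate;
}
```

With a rate of 1193182 and HZ=1000 this gives LATCH = 1193, an overflow of
-182 ticks and an adjustment of -152533 ns per second; with HZ=250 the
overflow is 68 ticks and the adjustment 56990 ns. When the rate divides
evenly by HZ the adjustment is 0, which is what the #ifndef fallback would
hardcode.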

bye, Roman

Re: [PATCH] replace "make ARCH=i386/x86_64 with make ARCH=x86"

2007-11-03 Thread Adrian Bunk

On Sat, Nov 03, 2007 at 10:02:19PM -0400, Jeff Garzik wrote:
> Sam Ravnborg wrote:
>> This patchset unify the i386 and x86_64 Kconfig
>> files for x86.
>> In addition it replaces the use of ARCH=i386 and
>> ARCH=x86_64 with the more intuitive ARCH=x86.
>> The primary purpose of this patch serie is to enable make ARCH=x86 and let 
>> the config decide
>> if we are building for 32 or 64 bit.
>
> Yuck, I dislike.  Please don't take away this nice development workflow.  
> the current workflow of
>
>   make ARCH=i386 allmodconfig && make ARCH=i386 -sj5
>
> no longer works.  Now, the new and ungainly step of editing the .config is 
> added, with vi or sed.

You could say the same about allmodconfig and 32bit on powerpc.
Or about allmodconfig and 64bit on mips.
Or about allmodconfig and little endian on mips.
Or about allmodconfig and drivers depending on !SMP on x86.
...

> This also opens a chicken-and-egg problem...  What kind of config is 
> generated by allmodconfig when ARCH==x86?  There is no good answer.

This option won't be different from all the other similar options 
allmodconfig already handles in a non-random way...

> The existing tradition of switching between 32-bit and 64-bit was quite 
> nice, and it was done in The Obvious Way(tm) -- via the method for 
> specifying the architecture/platform.  Switching to Kconfig for that 
> decision is a step backwards in usability and IMO violates the Principle of 
> Least Surprise.

The existing tradition on the powerpc and mips ports that have 32bit and 
64bit together in one directory is to switch in Kconfig, and doing it 
differently in the x86 port would violate the Principle of the Least 
Surprise.

>   Jeff

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


patch, reboot fixup for WRAP 2C board (SC1100 based)

2007-11-03 Thread Denys

Hi

Strange that no one has noticed this issue with the WRAP 2C, a board popular
for wireless use. Probably most developers have been fixing it silently in
their own codebases without submitting a patch to the mainline kernel.
The symptom is simple: the board does not reboot at all until I apply this
patch. Works for me.

Can anyone format it as required by kernel policy? It is trivial.

--- linux-2.6.23/arch/i386/kernel/reboot_fixups.c   2007-10-09 23:31:38.0 +0300
+++ linux-2.6.23-new/arch/i386/kernel/reboot_fixups.c   2007-11-04 04:34:44.0 +0200
@@ -42,6 +42,7 @@
 static struct device_fixup fixups_table[] = {
 { PCI_VENDOR_ID_CYRIX, PCI_DEVICE_ID_CYRIX_5530_LEGACY, cs5530a_warm_reset },
 { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CS5536_ISA, cs5536_warm_reset },
+{ PCI_VENDOR_ID_NS, PCI_DEVICE_ID_NS_SC1100_BRIDGE, cs5530a_warm_reset },
 };

 /*





--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.


Re: [PATCH] replace "make ARCH=i386/x86_64 with make ARCH=x86"

2007-11-03 Thread Jeff Garzik


Sam Ravnborg wrote:

> This patchset unify the i386 and x86_64 Kconfig
> files for x86.
> In addition it replaces the use of ARCH=i386 and
> ARCH=x86_64 with the more intuitive ARCH=x86.
>
> The primary purpose of this patch serie is to
> enable make ARCH=x86 and let the config decide
> if we are building for 32 or 64 bit.


Yuck, I dislike.  Please don't take away this nice development workflow. 
 the current workflow of


make ARCH=i386 allmodconfig && make ARCH=i386 -sj5

no longer works.  Now, the new and ungainly step of editing the .config 
is added, with vi or sed.


This also opens a chicken-and-egg problem...  What kind of config is 
generated by allmodconfig when ARCH==x86?  There is no good answer.


The existing tradition of switching between 32-bit and 64-bit was quite 
nice, and it was done in The Obvious Way(tm) -- via the method for 
specifying the architecture/platform.  Switching to Kconfig for that 
decision is a step backwards in usability and IMO violates the Principle 
of Least Surprise.


Jeff



Re: 2.6.24-rc1-82798a1 compile failure (x86_64)

2007-11-03 Thread Adrian Bunk

On Sat, Nov 03, 2007 at 01:11:49PM +0100, Sam Ravnborg wrote:
> On Sat, Nov 03, 2007 at 11:04:36AM +0100, Thomas Bächler wrote:
> > Thomas Bächler schrieb:
> > > 
> > > I just remembered, a friend of mine got it to compile with the exact
> > > same toolchain, but with a different configuration (which I don't have).
> > > He used a snapshot tarball from yesterday though, not the git tree.
> > > 
> > 
> > I found the problem and eliminated it. While this is my own fault, it is
> > still a bug in either the kernel or the build system: I had CFLAGS set
> > to "-Wall -O3 -march=native -pipe". I always thought the kernel would
> > ignore those and set its own CFLAGS, but I was wrong. Either the -O3 or
> > the -march=native break the build process on gcc 4.2.2.
> > 
> The kernel will now honour the users CFLAGS setting as you just discovered.
> The flags will be appended to the flags specified by the kernel.
>...

I think this should be changed:

I also have CFLAGS set on some computers in my environments since for 
packages using GNU autoconf that's the correct way to set the compiler 
flags.

The kernel already sets all flags correctly, and a user wanting to 
change the flags for the kernel is an exception with very special needs
(I'd even claim so special that he could simply edit the Makefile...).

Using the *FLAGS automatically in the kernel doesn't sound right, can we 
prefix all environment variables the kernel honours with KERNEL_ ?

>   Sam 

cu
Adrian


Re: [PATCH 02/10] x86: start unification of arch/x86/Kconfig.*

2007-11-03 Thread Adrian Bunk

On Sun, Nov 04, 2007 at 12:51:12AM +0100, Sam Ravnborg wrote:
>...
> --- a/arch/x86/Kconfig.x86_64
> +++ b/arch/x86/Kconfig.x86_64
>...
>  # x86-64 doesn't support PCI BIOS access from long mode so always go direct.
>  config PCI_DIRECT
> @@ -737,36 +723,11 @@ config PCI_DIRECT
>   depends on PCI
>   default y
>  
> -config PCI_MMCONFIG
> - bool "Support mmconfig PCI config space access"
> - depends on PCI && ACPI
> -
>  config PCI_DOMAINS
>   bool
>   depends on PCI
>   default y
>  
> -config DMAR
> - bool "Support for DMA Remapping Devices (EXPERIMENTAL)"
> - depends on PCI_MSI && ACPI && EXPERIMENTAL
> - help
> -   DMA remapping (DMAR) devices support enables independent address
> -   translations for Direct Memory Access (DMA) from devices.
> -   These DMA remapping devices are reported via ACPI tables
> -   and include PCI device scope covered by these DMA
> -   remapping devices.
> -
> -config DMAR_GFX_WA
> - bool "Support for Graphics workaround"
> - depends on DMAR
> - default y
> - help
> -  Current Graphics drivers tend to use physical address
> -  for DMA and avoid using DMA APIs. Setting this config
> -  option permits the IOMMU driver to set a unity map for
> -  all the OS-visible memory. Hence the driver can continue
> -  to use physical addresses for DMA.
> -
>  config DMAR_FLOPPY_WA
>   bool
>   depends on DMAR
>...

In patch 8 the remaining PCI_* options and DMAR_FLOPPY_WA end in a 
completely different place in the Kconfig file than the options moved 
here.

Please keep options that belong together grouped together no matter 
whether all of them are user visible.

cu
Adrian


Re: Quad core CPU detected but shows as single core in 2.6.23.1

2007-11-03 Thread Chris Snook


Zurk Tech wrote:

> dmesg (new) with disabled GART error reporting if anyone wants to
> compare to previous dmesg with GART error reporting :


A few unrelated observations about Barcelona support...


> Marking TSC unstable due to TSCs unsynchronized


This is probably wrong.  The TSC is on the northbridge on Barcelona chips, so 
every core on the die should be in sync.  Hypothetically you could have 
different speed northbridges in different sockets, but we've never tried very 
hard to support that case anyway.  We should probably be marking the TSC as 
stable on Barcelona chips.



> xor: automatically using best checksumming function: generic_sse
>    generic_sse:  7449.000 MB/sec
> xor: using function: generic_sse (7449.000 MB/sec)


We should probably also implement an SSE5 function to take advantage of the 
128-bit SSE operations supported on newer processors.



> pnp: the driver 'system' has been registered
> pnp: match found with the PnP device '00:08' and the driver 'system'
> pnp: match found with the PnP device '00:09' and the driver 'system'
> pnp: 00:09: ioport range 0x580-0x58f has been reserved
> pnp: 00:09: ioport range 0x590-0x593 has been reserved
> pnp: 00:09: ioport range 0x700-0x703 has been reserved
> pnp: 00:09: ioport range 0xca0-0xcaf has been reserved
> pnp: 00:09: iomem range 0xfec0-0xfec00fff could not be reserved
> pnp: 00:09: iomem range 0xfec01000-0xfec01fff could not be reserved
> pnp: 00:09: iomem range 0xfec02000-0xfec02fff could not be reserved
> pnp: 00:09: iomem range 0xfee0-0xfee00fff could not be reserved
> pnp: match found with the PnP device '00:0a' and the driver 'system'
> pnp: 00:0a: ioport range 0x600-0x61f has been reserved
> pnp: 00:0a: ioport range 0x520-0x53f has been reserved
> pnp: 00:0a: ioport range 0x540-0x54f has been reserved
> pnp: 00:0a: ioport range 0x640-0x65f has been reserved
> pnp: match found with the PnP device '00:0b' and the driver 'system'
> pnp: 00:0b: iomem range 0xe000-0xefff has been reserved
> pnp: match found with the PnP device '00:0c' and the driver 'system'
> pnp: 00:0c: iomem range 0x0-0x9 could not be reserved
> pnp: 00:0c: iomem range 0x0-0x0 could not be reserved
> pnp: 00:0c: iomem range 0xe-0xf could not be reserved
> pnp: 00:0c: iomem range 0x10-0xc7ff could not be reserved
> PCI: Bridge: :01:0d.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:01.0
>   IO window: a000-bfff
>   MEM window: ff40-ff4f
>   PREFETCH window: disabled.
> PCI: Bridge: :00:06.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:07.0
>   IO window: disabled.
>   MEM window: ff50-ff5f
>   PREFETCH window: cfe0-cfef
> PCI: Bridge: :00:08.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:09.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:0a.0
>   IO window: disabled.
>   MEM window: disabled.
>   PREFETCH window: disabled.
> PCI: Bridge: :00:0b.0
>   IO window: disabled.
>   MEM window: disabled.


Hmmm... perhaps we're not handling the new mmconfig stuff correctly?  Or maybe 
the BIOS isn't.



> hwmon-vid: Unknown VRM version of your x86 CPU
>  : Not supporting VRM 0.0


This code probably needs an update for Barcelona.


> raid6: int64x1   1920 MB/s
> raid6: int64x2   2353 MB/s
> raid6: int64x4   2331 MB/s
> raid6: int64x8   1254 MB/s
> raid6: sse2x1    2664 MB/s
> raid6: sse2x2    4214 MB/s
> raid6: sse2x4    4905 MB/s
> raid6: using algorithm sse2x4 (4905 MB/s)


An update here for SSE5 might be in order as well.

-- Chris

Re: [PATCH 01/10] x86: unification of cfufreq/Kconfig

2007-11-03 Thread Adrian Bunk

On Sun, Nov 04, 2007 at 12:51:11AM +0100, Sam Ravnborg wrote:
>...
>  config X86_SPEEDSTEP_CENTRINO
> - tristate "Intel Enhanced SpeedStep"
> + tristate "Intel Enhanced SpeedStep (deprecated)"
>   select CPU_FREQ_TABLE
> - select X86_SPEEDSTEP_CENTRINO_TABLE
> + select X86_SPEEDSTEP_CENTRINO_TABLE if X86_32
>   help
>...

depends on ACPI_PROCESSOR if X86_64



cu
Adrian


Re: Linux Security Module Framework (Was: LSM conversion to static interface)

2007-11-03 Thread Peter Dolding

On 11/1/07, David Newall <[EMAIL PROTECTED]> wrote:
> Jan Engelhardt wrote:
> > On Nov 1 2007 12:51, Peter Dolding wrote:
> >
> >> This is above me doing code.   No matter how many fixes I do to the
> >> core that will not fix dysfunction in the LSM section.  Strict
> >> policies on fixing the main security model will be required.
> >>
> >
> > If there is no one wanting to fix the existing code, then the
> > perceived problem is not a problem.
>
> What an absurd claim.
>
I agree.  If they can provide a reason, a correct reason why it's not
being fixed, then the perceived problem does not exist.  Until then it
is the common human flaw of tunnel vision.  People normally don't look
at the big picture.

A common fake reason is that Linus does not approve.  The history of
patches completely disagrees with that.  The parts Linus has blocked
have been out of alignment with the security model built into the
kernel.  Yet other parts that were in alignment with the model got in
during the same time.  It is a perfect example of dysfunction: making up
a lie because things will not go into the kernel exactly how they want.

LSM is nothing more than a testing and model zone, a place where
important features should not be.  It was put there because the model
designs could not get along.  Did that mean that LSM was where the
features were intended to stay?  No, the goal should simply be to get
as many good features into the mainline as possible while staying in
alignment with the main kernel's model.  I don't know where the wrong
idea that the mainline did not have a security model came from
either.  Something does not have to be an LSM to be a security model.

XEN, KVM and lguest are not a suitable workaround for the problem.  It's
more a case of LSM developers trying to say it's not needed so they
don't have to work with each other.  I am not always using x86 machines,
so at times not one of those solutions fits.

Containers in the Linux kernel get to be processor-neutral in features,
so they will work no matter what the processor chip is.  So the correct
solution to running many LSMs somehow has to be done with containers.

Note that calling me a know-it-all is not an answer either.  Either they
have a good explanation for their failing or they need their asses
kicked.  Heck, if I am wrong I need my ass kicked and am perfectly
prepared to accept it.  The problem is that I am not a person to accept
invalid answers, which is what they have been giving me so far.

My main base is system administration, not coding.  Please note that
system administrators are the final clients.  If you want someone with
a system administration background to take up the leadership of LSM and
bash it into a system-administrator-friendly shape, I am more than
prepared to do so.  I can bet that a system administrator in charge,
looking from a flexibility and security point of view, is going to put
noses really badly out of joint.  The flexibility bit is currently
missing.  It's not always possible to reboot a server just because the
security framework is not up to the job or a client wants you to use a
tighter model.

So yes, people trying to lie to me is something I have very little
tolerance for.  Paperwork like a PhD doesn't scare me off.  I have had
to repair networks destroyed by people with PhDs and masters in computer
programming because they ran a BIOS-destroying virus from an outside
source.  So let's just say my trust has to be earned, and using
incorrect facts doesn't earn trust from me.

There is a bigger issue than just containers.  It's called Linux on the
desktop.  Somehow security models will have to tolerate being
controlled from a central server, preferably one model so that any
number of Linux distros can be used in a network, just like different
versions of Windows can now.  So somehow we have to get to one master
model, even if the other models are just feature tweaks.  Being
application-controlled brings PAM and LDAP into play.

SELinux jammed in does not really suit what is needed.  The world of
Linux is changing; the LSMs need to get their butt into gear and catch
up.

Peter Dolding

Re: [PATCH] replace "make ARCH=i386/x86_64 with make ARCH=x86"

2007-11-03 Thread Adrian Bunk

On Sat, Nov 03, 2007 at 04:52:47PM -0700, Arjan van de Ven wrote:
> On Sun, 4 Nov 2007 00:48:26 +0100
> Sam Ravnborg <[EMAIL PROTECTED]> wrote:
> 
> > This patchset unifies the i386 and x86_64 Kconfig
> > files for x86.
> > In addition it replaces the use of ARCH=i386 and
> > ARCH=x86_64 with the more intuitive ARCH=x86.
> > 
> > The primary purpose of this patch series is to
> > enable make ARCH=x86 and let the config decide
> > if we are building for 32 or 64 bit.
> > 
> > But we will break quite a high number of
> > scripts with this change.
> > What is the desired behaviour when specifying:
> > make ARCH=i386
> > and
> > make ARCH=x86_64
> 
> I would say.. just print a very very nasty warning, wait 20 seconds,
> and then pretend it was passed as x86
> and then remove in the next release (assuming the code isn't too ugly)

I'd suggest exiting after printing a helpful note.

The user will anyway have to adjust his build environment, so let's make 
a clear cut now.

> other than that.. just break it now in 2.6.24; people at least remember
> .24 as the release that unified, so they expect changes there... they
> won't expect them for .25 or later

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: drive side 80-wire cable detection failed

2007-11-03 Thread Tobias Hoffmann


Alan Cox wrote:

>> On my NV3 board with a Samsung SP1634N Harddisk I wrongly got
>> "drive side 80-wire cable detection failed" with the current kernel.
>
> Does the drive cable detect correctly on a non Nvidia chipset ?


I have no box here to test this - my laptop won't help.

But I did another test:

I replaced the 80-wire cable with a 40-wire cable. Then the BIOS warns
about "no 80-wire cable connected". The kernel boots and seems to enable
UDMA100, which leads to some BadCRC errors. After that the drive is in
udma3 mode (per hdparm) IIRC, not in udma5 as it is with the right cable.


Maybe it's not that clever to add the drive to the ivb_list?

As far as I understood it, pata_acpi should fix the problem, but I'm not
sure about the required configuration. Do I have to disable the normal
IDE support? How 'experimental' is the PATA support?



  Tobias

Re: [patch] PID namespace design bug, workaround

2007-11-03 Thread david


On Sat, 3 Nov 2007, Arjan van de Ven wrote:


> On Sat, 3 Nov 2007 15:40:48 -0700 (PDT)
> Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>> I don't understand how you can call this a "PID namespace design
>> bug", when it clearly has nothing what-so-ever to do with pid
>> namespaces, and everything to do with the *futexes* that blithely
>> assume that pid's are unique and that made it part of the
>> user-visible interface.
>>
>> OF COURSE any pid namespace design will always break such
>> assumptions, but that's not because of any PID namespace bugs. It's
>> what the whole *point* of PID namespaces are. If you use pid's
>> (instead of some opaque cookies), you will not be able to use such
>> things across pid-separation.
>
> well... kind of.
> There are 2 things around pid namespaces: which pids you can see/touch
> (in proc or signals or otherwise), and the non-uniqueness.
>
> For containers you clearly want the first part... but... is there a
> strong reason to not just *not* create duplicate pids even across
> namespaces? there's no rule in posix or anything similar to fd's afaik
> concerning which pids we can hand out... so we could just make them
> unique globally but just with limited visibility


two problems that I can think of

1. the container people would like to eventually have the ability to
migrate containers from one system to another (or to suspend a
container). in this sort of case, trying to fit the allocated PIDs from
the container into a running system is a problem if PIDs are not allowed
to overlap.


2. it seems to me that there is probably a latent security issue in
having a global PID namespace with just limited visibility. the types of
bugs that may let you affect a process seem easier to make if the only
protection is visibility rather than complete separation.
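
To make the first point concrete, here is a small sketch of the collision
that migration runs into when PIDs must be globally unique.  All names
(migration_conflicts, the example PID sets, the "ct1" namespace label) are
hypothetical; this models the bookkeeping only, not any kernel interface.

```python
def migration_conflicts(container_pids, host_pids):
    """PIDs the migrated container needs that the target host already uses."""
    return sorted(set(container_pids) & set(host_pids))


# Made-up example: a container frozen on machine A, to be resumed on B.
container = {1, 2, 57, 300}
host_b = {1, 2, 3, 57, 4000}

# Globally unique PIDs: every number in use on both sides is a conflict.
print(migration_conflicts(container, host_b))

# Per-namespace PIDs: processes are keyed by (namespace, pid), so the same
# numbers can coexist as long as the namespaces differ.
table = {("host", pid) for pid in host_b} | {("ct1", pid) for pid in container}
print(len(table) == len(host_b) + len(container))
```

With a single global PID space the restored container would have to be
renumbered (which its processes may not tolerate) or the migration refused;
with overlapping per-namespace PIDs neither is needed.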


David Lang

Re: "Fix ATAPI transfer lengths" causes CD writing regression

2007-11-03 Thread Tejun Heo

Daniel Drake wrote:
> Tejun Heo wrote:
>>> <4>ata2.00: HSM violation: eh_analyze_tf: BUSY|DRQ
>>> <3>ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>>> <3>ata2.00: cmd a0/00:00:00:0a:00/00:00:00:00:00/a0 tag 0 cdb 0x5a data
>>> 10 in
>>> <4> res 58/00:02:00:0a:00/00:00:00:00:00/a0 Emask 0x2 (HSM
>>> violation)
>>> <3>ata2.00: status: { DRDY DRQ }
>>> <6>ata2: soft resetting link
>>> <6>ata2.00: configured for UDMA/33
>>> <6>ata2: EH complete
>>
>> Does this patch fix the problem?
> 
> That fixes it, thanks! There is no more ugly error in dmesg, the test
> prog doesn't print any sense data, and brasero works OK too. However,
> these messages appear in the kernel log every time I run the test app
> (or when brasero does its thing):
> 
> <4>ata2.00: 10 bytes trailing data
> <4>ata2.00: 10 bytes trailing data
> <4>ata2.00: 10 bytes trailing data
> <4>ata2.00: 10 bytes trailing data
> <4>ata2.00: 10 bytes trailing data
> <4>ata2.00: 10 bytes trailing data
> <4>ata2.00: 6 bytes trailing data

Yeah, that's expected.  What's going on here is that your drive sends
full mode sense data (76 bytes) regardless of the allocation size in the
CDB, but it does honor the transfer chunk size set in the PACKET TF,
which Alan's patch sets to the same value as the allocation size.  So
now the drive sends the 76 bytes in 8 chunks.  The first chunk is
transferred into the sg buffer and the following chunks are thrown away.

Previously, the transfer chunk was set to 8k, so the drive claimed to
transfer all 76 bytes from the beginning; libata transferred the leading
10 bytes into the user buffer and threw away what remained.
The change caused a problem because the libata HSM always switches to
HSM_ST_LAST (command sequence completion) after filling the command
buffer completely.  So throwing away is activated iff the extra data to
throw away is transferred together with the last chunk of useful data.

With the chunk size reduced to allocation size, the initial chunk fills
the data buffer completely and all the extra bytes are transferred in
separate chunks.  However, the libata HSM expects the command sequence to
complete after the initial chunk but the drive asserts DRQ for the next
chunk on the following interrupt, so an HSM violation is triggered.

The patch modifies the HSM such that it keeps throwing away extra data as
long as the drive asserts DRQ, which is how the IDE driver does it.
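
A toy model of the data phase may help here.  The sketch below is not
libata code; drain_transfer and its parameters are invented for
illustration, and it assumes the drive always sends the full 76 bytes in
chunks of at most the size set in the TF.

```python
def drain_transfer(total_bytes, chunk_size, buffer_size):
    """Model a data-in phase where the host keeps accepting chunks.

    Hypothetical helper, not a libata function.  Returns the number of
    bytes stored in the host buffer and a list of the bytes discarded
    from each chunk that did not fit.
    """
    stored = 0
    trailing = []
    remaining = total_bytes
    while remaining > 0:                 # drive still asserts DRQ
        chunk = min(chunk_size, remaining)
        keep = min(chunk, buffer_size - stored)
        stored += keep
        if chunk > keep:                 # the rest of the chunk is thrown away
            trailing.append(chunk - keep)
        remaining -= chunk
    return stored, trailing


print(drain_transfer(76, 10, 10))        # chunk size == allocation size
print(drain_transfer(76, 8192, 10))      # old behaviour: 8k chunk
```

With a 10-byte chunk and a 10-byte buffer the model gives one stored chunk
followed by six 10-byte and one 6-byte discarded chunks, which matches the
"trailing data" messages quoted above; with an 8k chunk everything arrives
in a single chunk and 66 bytes are discarded along with the useful data.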

However, there are still remaining issues.  What happens if you raise
the allocation length and buffer size of the test program to 16?  I.e.
change 0x0a in cmd[] to 0x10 and increase buffer[10] to buffer[16].

Thanks.

-- 
tejun

Laptop's HDD

2007-11-03 Thread Alberto Gonzalez

Hi,

Maybe some of you have heard lately about a problem with laptop hard
disk drives being killed by *insert Linux distro here* [1].

The problem comes from a very high rate of load/unload cycles of the heads,
which reaches the 300,000-600,000 limit in 2-3 years (with smartmontools you
can check it with "smartctl -A /dev/sda"). There are reports of HDDs dying
even earlier from this problem [2].

From what I've read, it's not that Linux is doing anything special to your
hard disk; it's the BIOS settings that take care of killing your disk sooner
rather than later. However, I'm asking on this list because the problem
seems to have started with kernel 2.6.10 [3].

Windows seems to override the BIOS settings, so hardware vendors have never 
cared about this problem.

So my question is: Is this something the (Linux) kernel should care about or 
should distributions care about it with userspace tools?

By the way, these settings seem to be there in order to save power. However,
loading/unloading the heads ~3 times per minute doesn't seem like a very good
power-saving policy. Couldn't this be one of the reasons why Linux generally
uses more power than Windows?
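
To put the numbers above in perspective, here is a back-of-the-envelope
calculation.  The function name and the hours-per-day figures are
assumptions made for this sketch, not measurements; only the 3 cycles per
minute and the 300,000-600,000 range come from the text above.

```python
def days_until_limit(rated_cycles, cycles_per_minute, hours_on_per_day):
    """Days of use before the load/unload count reaches rated_cycles."""
    cycles_per_day = cycles_per_minute * 60 * hours_on_per_day
    return rated_cycles / cycles_per_day


for rated in (300_000, 600_000):
    for hours in (2, 8, 24):             # assumed daily powered-on time
        days = days_until_limit(rated, 3, hours)
        print(f"{rated} cycles, {hours} h/day: {days:.0f} days")
```

At a steady 3 cycles per minute the rated range is reached after roughly
1,700 to 3,300 powered-on hours, so how many calendar years that takes
depends heavily on how long the laptop runs each day.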

Regards,
Alberto.

[1] - http://beranger.org/index.php?page=diary&2007/10/24/18/07/21
- http://ubuntudemon.wordpress.com/2007/10/26/laptop-hardrive-killer-bug/

[2] - 
http://ubuntudemon.wordpress.com/2007/10/27/laptop-hardrive-killer-bug-is-worse-than-i-thought/#comment-31490
http://paul.luon.net/journal/hacking/BrokenHDDs.html

[3] - https://www.redhat.com/archives/fedora-list/2005-March/msg00463.html

Re: [patch] PID namespace design bug, workaround

2007-11-03 Thread Arjan van de Ven

On Sat, 3 Nov 2007 15:40:48 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> I don't understand how you can call this a "PID namespace design
> bug", when it clearly has nothing what-so-ever to do with pid
> namespaces, and everything to do with the *futexes* that blithely
> assume that pid's are unique and that made it part of the
> user-visible interface.
> 
> OF COURSE any pid namespace design will always break such
> assumptions, but that's not because of any PID namespace bugs. It's
> what the whole *point* of PID namespaces are. If you use pid's
> (instead of some opaque cookies), you will not be able to use such
> things across pid-separation.

well... kind of.
There are 2 things around pid namespaces: which pids you can see/touch
(in proc or signals or otherwise), and the non-uniqueness.

For containers you clearly want the first part... but... is there a
strong reason to not just *not* create duplicate pids even across
namespaces? there's no rule in posix or anything similar to fd's afaik
concerning which pids we can hand out... so we could just make them
unique globally but just with limited visibility

Re: [PATCH] replace "make ARCH=i386/x86_64 with make ARCH=x86"

2007-11-03 Thread Arjan van de Ven

On Sun, 4 Nov 2007 00:48:26 +0100
Sam Ravnborg <[EMAIL PROTECTED]> wrote:

> This patchset unifies the i386 and x86_64 Kconfig
> files for x86.
> In addition it replaces the use of ARCH=i386 and
> ARCH=x86_64 with the more intuitive ARCH=x86.
> 
> The primary purpose of this patch series is to
> enable make ARCH=x86 and let the config decide
> if we are building for 32 or 64 bit.
> 
> But we will break quite a high number of
> scripts with this change.
> What is the desired behaviour when specifying:
> make ARCH=i386
> and
> make ARCH=x86_64

I would say.. just print a very very nasty warning, wait 20 seconds,
and then pretend it was passed as x86
and then remove in the next release (assuming the code isn't too ugly)

other than that.. just break it now in 2.6.24; people at least remember
.24 as the release that unified, so they expect changes there... they
won't expect them for .25 or later

[PATCH 06/10] x86: copy x86_64 specific Kconfig symbols to Kconifg.i386

2007-11-03 Thread Sam Ravnborg

No functional changes.
A preparatory step towards full unification.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.i386 |  124 +
 1 files changed, 124 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig.i386 b/arch/x86/Kconfig.i386
index af72240..890c258 100644
--- a/arch/x86/Kconfig.i386
+++ b/arch/x86/Kconfig.i386
@@ -218,6 +218,14 @@ config X86_ES7000
  Only choose this option if you have such a system, otherwise you
  should say N here.
 
+config X86_VSMP
+   bool "Support for ScaleMP vSMP"
+   depends on X86_64 && PCI
+help
+ Support for ScaleMP vSMP systems.  Say 'Y' here if this kernel is
+ supposed to run on these EM64T-based machines.  Only choose this option
+ if you have one of these machines.
+
 endchoice
 
 config SCHED_NO_NO_OMIT_FRAME_POINTER
@@ -313,6 +321,54 @@ config HPET_EMULATE_RTC
depends on HPET_TIMER && RTC=y
default y
 
+# Mark as embedded because too many people got it wrong.
+# The code disables itself when not needed.
+config GART_IOMMU
+   bool "GART IOMMU support" if EMBEDDED
+   default y
+   select SWIOTLB
+   select AGP
+   depends on X86_64 && PCI
+   help
+ Support for full DMA access of devices with 32bit memory access only
+ on systems with more than 3GB. This is usually needed for USB,
+ sound, many IDE/SATA chipsets and some other devices.
+ Provides a driver for the AMD Athlon64/Opteron/Turion/Sempron GART
+ based hardware IOMMU and a software bounce buffer based IOMMU used
+ on Intel systems and as fallback.
+ The code is only active when needed (enough memory and limited
+ device) unless CONFIG_IOMMU_DEBUG or iommu=force is specified
+ too.
+
+config CALGARY_IOMMU
+   bool "IBM Calgary IOMMU support"
+   select SWIOTLB
+   depends on X86_64 && PCI && EXPERIMENTAL
+   help
+ Support for hardware IOMMUs in IBM's xSeries x366 and x460
+ systems. Needed to run systems with more than 3GB of memory
+ properly with 32-bit PCI devices that do not support DAC
+ (Double Address Cycle). Calgary also supports bus level
+ isolation, where all DMAs pass through the IOMMU.  This
+ prevents them from going anywhere except their intended
+ destination. This catches hard-to-find kernel bugs and
+ mis-behaving drivers and devices that do not use the DMA-API
+ properly to set up their DMA buffers.  The IOMMU can be
+ turned off at boot time with the iommu=off parameter.
+ Normally the kernel will make the right choice by itself.
+ If unsure, say Y.
+
+config CALGARY_IOMMU_ENABLED_BY_DEFAULT
+   bool "Should Calgary be enabled by default?"
+   default y
+   depends on CALGARY_IOMMU
+   help
+ Should Calgary be enabled by default? if you choose 'y', Calgary
+ will be used (if it exists). If you choose 'n', Calgary will not be
+ used even if it exists. If you choose 'n' and would like to use
+ Calgary anyway, pass 'iommu=calgary' on the kernel command line.
+ If unsure, say Y.
+
 config NR_CPUS
int "Maximum number of CPUs (2-255)"
range 2 255
@@ -424,6 +480,22 @@ config X86_MCE_P4THERMAL
  Enabling this feature will cause a message to be printed when the P4
  enters thermal throttling.
 
+config X86_MCE_INTEL
+   bool "Intel MCE features"
+   depends on X86_64 && X86_MCE && X86_LOCAL_APIC
+   default y
+   help
+  Additional support for intel specific MCE features such as
+  the thermal monitor.
+
+config X86_MCE_AMD
+   bool "AMD MCE features"
+   depends on X86_64 && X86_MCE && X86_LOCAL_APIC
+   default y
+   help
+  Additional support for AMD specific MCE features such as
+  the DRAM Error Threshold.
+
 config VM86
bool "Enable VM86 support" if EMBEDDED
default y
@@ -661,6 +733,34 @@ config NUMA
 comment "NUMA (Summit) requires SMP, 64GB highmem support, ACPI"
depends on X86_32 && X86_SUMMIT && (!HIGHMEM64G || !ACPI)
 
+config K8_NUMA
+   bool "Old style AMD Opteron NUMA detection"
+   depends on X86_64 && NUMA && PCI
+   default y
+   help
+Enable K8 NUMA node topology detection.  You should say Y here if
+you have a multi processor AMD K8 system. This uses an old
+method to read the NUMA configuration directly from the builtin
+Northbridge of Opteron. It is recommended to use X86_64_ACPI_NUMA
+instead, which also takes priority if both are compiled in.
+
+# Dummy CONFIG option to select ACPI_NUMA from drivers/acpi/Kconfig.
+config X86_64_ACPI_NUMA
+   bool "ACPI NUMA detection"
+   depends on X86_64 && NUMA && ACPI && PCI
+   select ACPI_NUMA
+   default y
+   help
+ Enable

[PATCH 09/10] x86: select i386 or x86_64 at config time

2007-11-03 Thread Sam Ravnborg

Like powerpc and others, we now select the actual
architecture during configuration.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig |   26 +-
 arch/x86/Kconfig.i386|   17 +++--
 arch/x86/Kconfig.x86_64  |   16 +++-
 scripts/kconfig/Makefile |7 +--
 4 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c325606..d51f3c8 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1,4 +1,28 @@
+mainmenu "Linux Kernel Configuration"
 
+choice
+   prompt "Architecture?"
+   default X86_32
+
+config X86_32
+   bool "i386"
+   help
+ This is Linux's home port.  Linux was originally native to the Intel
+ 386, and runs on all the later x86 processors including the Intel
+ 486, 586, Pentiums, and various instruction-set-compatible chips by
+ AMD, Cyrix, and others.
+
+config X86_64
+   bool "x86_64"
+   help
+ Port to the x86-64 architecture. x86-64 is a 64-bit extension to the
+ classical 32-bit x86 architecture. For details see
+ .
+
+endchoice
+
+source "arch/x86/Kconfig.i386"
+source "arch/x86/Kconfig.x86_64"
 source "init/Kconfig"
 
 menu "Processor type and features"
@@ -792,8 +816,8 @@ config CRASH_DUMP
 
 config PHYSICAL_START
hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP)
-   default "0x100" if X86_NUMAQ
default "0x20"  if X86_64
+   default "0x100" if X86_NUMAQ
default "0x10"
help
  This gives the physical address where the kernel is loaded.
diff --git a/arch/x86/Kconfig.i386 b/arch/x86/Kconfig.i386
index 9fe6c36..c8cc0f2 100644
--- a/arch/x86/Kconfig.i386
+++ b/arch/x86/Kconfig.i386
@@ -1,18 +1,7 @@
+# i386 specific settings
 #
-# For a description of the syntax of this configuration file,
-# see Documentation/kbuild/kconfig-language.txt.
-#
-
-mainmenu "Linux Kernel Configuration"
 
-config X86_32
-   bool
-   default y
-   help
- This is Linux's home port.  Linux was originally native to the Intel
- 386, and runs on all the later x86 processors including the Intel
- 486, 586, Pentiums, and various instruction-set-compatible chips by
- AMD, Cyrix, and others.
+if X86_32
 
 config GENERIC_TIME
bool
@@ -296,4 +285,4 @@ config KTIME_SCALAR
bool
default y
 
-source "arch/x86/Kconfig"
+endif  # X86_32
diff --git a/arch/x86/Kconfig.x86_64 b/arch/x86/Kconfig.x86_64
index 8721902..6eb8a09 100644
--- a/arch/x86/Kconfig.x86_64
+++ b/arch/x86/Kconfig.x86_64
@@ -1,21 +1,11 @@
-#
-# For a description of the syntax of this configuration file,
-# see Documentation/kbuild/kconfig-language.txt.
+# X86_64 specific settings
 #
 # Note: ISA is disabled and will hopefully never be enabled.
 # If you managed to buy an ISA x86-64 box you'll have to fix all the
 # ISA drivers you need yourself.
 #
 
-mainmenu "Linux Kernel Configuration"
-
-config X86_64
-   bool
-   default y
-   help
- Port to the x86-64 architecture. x86-64 is a 64-bit extension to the
- classical 32-bit x86 architecture. For details see
- .
+if X86_64
 
 config 64BIT
def_bool y
@@ -300,4 +290,4 @@ config SYSVIPC_COMPAT
depends on COMPAT && SYSVIPC
default y
 
-source "arch/x86/Kconfig"
+endif # X86_64
diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
index 5959412..1ad6f7f 100644
--- a/scripts/kconfig/Makefile
+++ b/scripts/kconfig/Makefile
@@ -4,12 +4,7 @@
 
 PHONY += oldconfig xconfig gconfig menuconfig config silentoldconfig update-po-config
 
-# If a arch/$(SRCARCH)/Kconfig.$(ARCH) file exist use it
-ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/Kconfig.$(ARCH)),)
-Kconfig := arch/$(SRCARCH)/Kconfig.$(ARCH)
-else
-Kconfig := arch/$(SRCARCH)/Kconfig
-endif
+Kconfig := arch/$(SRCARCH)/Kconfig
 
 xconfig: $(obj)/qconf
$< $(Kconfig)
-- 
1.5.3.4.1157.g0e74-dirty


[PATCH 01/10] x86: unification of cfufreq/Kconfig

2007-11-03 Thread Sam Ravnborg

Merge the two Kconfig files to a single file.
Checked by comparing menuconfig before and after.
Except for some slight reordering in the x86_64 case
the result is equal.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.i386  |2 +-
 arch/x86/Kconfig.x86_64|2 +-
 .../x86/kernel/cpu/cpufreq/{Kconfig_32 => Kconfig} |   68 
 arch/x86/kernel/cpu/cpufreq/Kconfig_64 |  108 
 4 files changed, 48 insertions(+), 132 deletions(-)
 rename arch/x86/kernel/cpu/cpufreq/{Kconfig_32 => Kconfig} (77%)
 delete mode 100644 arch/x86/kernel/cpu/cpufreq/Kconfig_64

diff --git a/arch/x86/Kconfig.i386 b/arch/x86/Kconfig.i386
index 7331efe..b6f2fd0 100644
--- a/arch/x86/Kconfig.i386
+++ b/arch/x86/Kconfig.i386
@@ -1092,7 +1092,7 @@ config APM_REAL_MODE_POWER_OFF
 
 endif # APM
 
-source "arch/x86/kernel/cpu/cpufreq/Kconfig_32"
+source "arch/x86/kernel/cpu/cpufreq/Kconfig"
 
 source "drivers/cpuidle/Kconfig"
 
diff --git a/arch/x86/Kconfig.x86_64 b/arch/x86/Kconfig.x86_64
index cc468ea..8d6b534 100644
--- a/arch/x86/Kconfig.x86_64
+++ b/arch/x86/Kconfig.x86_64
@@ -719,7 +719,7 @@ config ARCH_HIBERNATION_HEADER
 
 source "drivers/acpi/Kconfig"
 
-source "arch/x86/kernel/cpu/cpufreq/Kconfig_64"
+source "arch/x86/kernel/cpu/cpufreq/Kconfig"
 
 source "drivers/cpuidle/Kconfig"
 
diff --git a/arch/x86/kernel/cpu/cpufreq/Kconfig_32 b/arch/x86/kernel/cpu/cpufreq/Kconfig
similarity index 77%
rename from arch/x86/kernel/cpu/cpufreq/Kconfig_32
rename to arch/x86/kernel/cpu/cpufreq/Kconfig
index d8c6f13..a84cea6 100644
--- a/arch/x86/kernel/cpu/cpufreq/Kconfig_32
+++ b/arch/x86/kernel/cpu/cpufreq/Kconfig
@@ -19,6 +19,9 @@ config X86_ACPI_CPUFREQ
  Processor Performance States.
  This driver also supports Intel Enhanced Speedstep.
 
+ To compile this driver as a module, choose M here: the
+ module will be called acpi-cpufreq.
+
  For details, take a look at .
 
  If in doubt, say N.
@@ -26,7 +29,7 @@ config X86_ACPI_CPUFREQ
 config ELAN_CPUFREQ
tristate "AMD Elan SC400 and SC410"
select CPU_FREQ_TABLE
-   depends on X86_ELAN
+   depends on X86_32 && X86_ELAN
---help---
  This adds the CPUFreq driver for AMD Elan SC400 and SC410
  processors.
@@ -42,7 +45,7 @@ config ELAN_CPUFREQ
 config SC520_CPUFREQ
tristate "AMD Elan SC520"
select CPU_FREQ_TABLE
-   depends on X86_ELAN
+   depends on X86_32 && X86_ELAN
---help---
  This adds the CPUFreq driver for AMD Elan SC520 processor.
 
@@ -54,6 +57,7 @@ config SC520_CPUFREQ
 config X86_POWERNOW_K6
tristate "AMD Mobile K6-2/K6-3 PowerNow!"
select CPU_FREQ_TABLE
+   depends on X86_32
help
  This adds the CPUFreq driver for mobile AMD K6-2+ and mobile
  AMD K6-3+ processors.
@@ -65,6 +69,7 @@ config X86_POWERNOW_K6
 config X86_POWERNOW_K7
tristate "AMD Mobile Athlon/Duron PowerNow!"
select CPU_FREQ_TABLE
+   depends on X86_32
help
  This adds the CPUFreq driver for mobile AMD K7 mobile processors.
 
@@ -76,23 +81,27 @@ config X86_POWERNOW_K7_ACPI
bool
depends on X86_POWERNOW_K7 && ACPI_PROCESSOR
depends on !(X86_POWERNOW_K7 = y && ACPI_PROCESSOR = m)
+   depends on X86_32
default y
 
 config X86_POWERNOW_K8
tristate "AMD Opteron/Athlon64 PowerNow!"
select CPU_FREQ_TABLE
-   depends on EXPERIMENTAL
help
  This adds the CPUFreq driver for mobile AMD Opteron/Athlon64 processors.
 
+ To compile this driver as a module, choose M here: the
+ module will be called powernow-k8.
+
  For details, take a look at .
 
  If in doubt, say N.
 
 config X86_POWERNOW_K8_ACPI
-   bool "ACPI Support"
-   select ACPI_PROCESSOR
-   depends on ACPI && X86_POWERNOW_K8
+   bool
+   prompt "ACPI Support" if X86_32
+   depends on ACPI && X86_POWERNOW_K8 && ACPI_PROCESSOR
+   depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m)
default y
help
  This provides access to the K8s Processor Performance States via ACPI.
@@ -104,7 +113,7 @@ config X86_POWERNOW_K8_ACPI
 
 config X86_GX_SUSPMOD
tristate "Cyrix MediaGX/NatSemi Geode Suspend Modulation"
-   depends on PCI
+   depends on X86_32 && PCI
help
 This add the CPUFreq driver for NatSemi Geode processors which
 support suspend modulation.
@@ -114,15 +123,19 @@ config X86_GX_SUSPMOD
 If in doubt, say N.
 
 config X86_SPEEDSTEP_CENTRINO
-   tristate "Intel Enhanced SpeedStep"
+   tristate "Intel Enhanced SpeedStep (deprecated)"
select CPU_FREQ_TABLE
-   select X86_SPEEDSTEP_CENTRINO_TABLE
+   select X86_SPEEDSTEP_CENTRINO_TABLE if X86_32
help
+ This is deprecated and this functionality is now merged into
+

[PATCH 07/10] x86: add remaning bits from x86_64 to Kconfig.i386

2007-11-03 Thread Sam Ravnborg

A few symbols remained in Kconfig.x86_64 where the
dependencies were different or the help text was
different from the i386 one.
Modify all relevant Kconfig.i386 symbols such that
they have X86_64 dependencies for x86_64 specific items
and update the help text as appropriate.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.i386 |   51 
 1 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/arch/x86/Kconfig.i386 b/arch/x86/Kconfig.i386
index 890c258..4984904 100644
--- a/arch/x86/Kconfig.i386
+++ b/arch/x86/Kconfig.i386
@@ -307,9 +307,17 @@ source "arch/x86/Kconfig.cpu"
 config HPET_TIMER
bool
prompt "HPET Timer Support" if X86_32
+   default y if X86_64
help
- This enables the use of the HPET for the kernel's internal timer.
+ Use the IA-PC HPET (High Precision Event Timer) to manage
+ time in preference to the PIT and RTC, if a HPET is
+ present.
  HPET is the next generation timer replacing legacy 8254s.
+  The HPET provides a stable time base on SMP
+ systems, unlike the TSC, but it is more expensive to access,
+ as it is off-chip.  You can find the HPET spec at
+ .
+
  You can safely choose Y here.  However, HPET will only be
  activated if the platform and the BIOS support this feature.
  Otherwise the 8254 will be used for timing services.
@@ -377,8 +385,8 @@ config NR_CPUS
default "8"
help
  This allows you to specify the maximum number of CPUs which this
- kernel will support.  The maximum supported value is 255 and the
- minimum value which makes sense is 2.
+ kernel will support. Current maximum is 255 CPUs due to
+ APIC addressing limits. Less depending on the hardware.
 
  This is purely to save memory - each supported CPU adds
  approximately eight kilobytes to the kernel image.
@@ -444,8 +452,10 @@ config X86_VISWS_APIC
default y
 
 config X86_MCE
-   bool "Machine Check Exception"
+   bool
+   prompt "Machine Check Exception" if X86_32 || (X86_64 && EMBEDDED)
depends on !X86_VOYAGER
+   default y if X86_64
---help---
  Machine Check Exception support allows the processor to notify the
  kernel if it detects a problem (e.g. overheating, component failure).
@@ -459,6 +469,9 @@ config X86_MCE
  problem on some new non-standard machine, you can boot with "nomce"
  to disable it.  MCE support simply ignores non-MCE processors like
  the 386 and 486, so nearly everyone can say Y here.
+ This version for x86_64 will require the mcelog utility to decode
+ some machine check error logs. See
+ ftp://ftp.x86-64.org/pub/linux/tools/mcelog
 
 config X86_MCE_NONFATAL
tristate "Check for non-fatal errors on AMD Athlon/Duron / Intel Pentium 4"
@@ -577,6 +590,8 @@ config MICROCODE
 
  To compile this driver as a module, choose M here: the
  module will be called microcode.
+ If you use modprobe or kmod you may also want to add the line
+ 'alias char-major-10-184 microcode' to your /etc/modules.conf file.
 
 config MICROCODE_OLD_INTERFACE
bool
@@ -722,13 +737,19 @@ config X86_PAE
 # Common NUMA Features
 config NUMA
bool "Numa Memory Allocation and Scheduler Support (EXPERIMENTAL)"
-   depends on X86_32 && SMP && HIGHMEM64G && (X86_NUMAQ || (X86_SUMMIT || X86_GENERICARCH) && ACPI) && EXPERIMENTAL
+   depends on (X86_32 && SMP && HIGHMEM64G && (X86_NUMAQ || (X86_SUMMIT || X86_GENERICARCH) && ACPI) && EXPERIMENTAL) || (X86_64 && SMP)
default n if X86_PC
default y if (X86_NUMAQ || X86_SUMMIT)
help
- NUMA support for i386. This is currently highly experimental
- and should be only used for kernel development. It might also
- cause boot failures.
+ NUMA (Non Uniform Memory Access) support.
+ For i386 this is currently highly experimental and should be
+ only used for kernel development. It might also cause boot failures.
+ NUMA support for X86_64 is working fine. The kernel
+ will try to allocate memory used by a CPU on the local memory
+ controller of the CPU and add some more NUMA awareness to the kernel.
+ This code is recommended on all multiprocessor Opteron systems.
+ If the system is EM64T, you should say N unless your system is EM64T
+ NUMA.
 
 comment "NUMA (Summit) requires SMP, 64GB highmem support, ACPI"
depends on X86_32 && X86_SUMMIT && (!HIGHMEM64G || !ACPI)
@@ -879,6 +900,8 @@ config MTRR
 
  You can safely say Y even if your machine doesn't have MTRRs, you'll
  just add about 9 KB to your kernel.
+ For x86_64 you should just say Y here, all x86_64 machines
+
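
Read as a boolean expression, the new "depends on" line for NUMA in the hunk above can be sketched in C as follows. This is only an illustration: struct x86_cfg and numa_selectable() are hypothetical names, not kernel code, and the grouping assumes the usual Kconfig precedence where && binds tighter than ||.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the config symbols the NUMA entry depends on. */
struct x86_cfg {
	bool x86_32, x86_64, smp, highmem64g;
	bool numaq, summit, genericarch, acpi, experimental;
};

/* Mirrors the "depends on" expression of config NUMA after the patch. */
static bool numa_selectable(const struct x86_cfg *c)
{
	/* Unchanged i386 condition, now wrapped in one group. */
	bool i386_ok = c->x86_32 && c->smp && c->highmem64g &&
		       (c->numaq || ((c->summit || c->genericarch) && c->acpi)) &&
		       c->experimental;
	/* New alternative: any x86_64 SMP configuration. */
	bool x86_64_ok = c->x86_64 && c->smp;

	return i386_ok || x86_64_ok;
}
```

With this reading, an x86_64 SMP config can select NUMA without HIGHMEM64G or EXPERIMENTAL, while the i386 requirements stay exactly as before.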

[PATCH 02/10] x86: start unification of arch/x86/Kconfig.*

2007-11-03 Thread Sam Ravnborg

This step introduces the file arch/x86/Kconfig
which contains all the menus from "Power Management"
and below.

The main parts of the new Kconfig file are shared
and the remaining i386/x86_64 specific parts
are covered by dependencies.

All config options without prompt are kept in the
i386/x86_64 specific files to keep them
together.

The patch has been tested by comparing the menus available
in menuconfig before and after the patch.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig|  345 +++
 arch/x86/Kconfig.i386   |  299 +
 arch/x86/Kconfig.x86_64 |   91 +
 3 files changed, 352 insertions(+), 383 deletions(-)
 create mode 100644 arch/x86/Kconfig

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
new file mode 100644
index 000..0f08145
--- /dev/null
+++ b/arch/x86/Kconfig
@@ -0,0 +1,345 @@
+menu "Power management options"
+   depends on !X86_VOYAGER
+
+source "kernel/power/Kconfig"
+
+source "drivers/acpi/Kconfig"
+
+menuconfig APM
+   tristate "APM (Advanced Power Management) BIOS support"
+   depends on X86_32 && PM_SLEEP && !X86_VISWS
+   ---help---
+ APM is a BIOS specification for saving power using several different
+ techniques. This is mostly useful for battery powered laptops with
+ APM compliant BIOSes. If you say Y here, the system time will be
+ reset after a RESUME operation, the /proc/apm device will provide
+ battery status information, and user-space programs will receive
+ notification of APM "events" (e.g. battery status change).
+
+ If you select "Y" here, you can disable actual use of the APM
+ BIOS by passing the "apm=off" option to the kernel at boot time.
+
+ Note that the APM support is almost completely disabled for
+ machines with more than one CPU.
+
+ In order to use APM, you will need supporting software. For location
+ and more information, read  and the
+ Battery Powered Linux mini-HOWTO, available from
+ .
+
+ This driver does not spin down disk drives (see the hdparm(8)
+ manpage ("man 8 hdparm") for that), and it doesn't turn off
+ VESA-compliant "green" monitors.
+
+ This driver does not support the TI 4000M TravelMate and the ACER
+ 486/DX4/75 because they don't have compliant BIOSes. Many "green"
+ desktop machines also don't have compliant BIOSes, and this driver
+ may cause those machines to panic during the boot phase.
+
+ Generally, if you don't have a battery in your machine, there isn't
+ much point in using this driver and you should say N. If you get
+ random kernel OOPSes or reboots that don't seem to be related to
+ anything, try disabling/enabling this option (or disabling/enabling
+ APM in your BIOS).
+
+ Some other things you should try when experiencing seemingly random,
+ "weird" problems:
+
+ 1) make sure that you have enough swap space and that it is
+ enabled.
+ 2) pass the "no-hlt" option to the kernel
+ 3) switch on floating point emulation in the kernel and pass
+ the "no387" option to the kernel
+ 4) pass the "floppy=nodma" option to the kernel
+ 5) pass the "mem=4M" option to the kernel (thereby disabling
+ all but the first 4 MB of RAM)
+ 6) make sure that the CPU is not over clocked.
+ 7) read the sig11 FAQ at 
+ 8) disable the cache from your BIOS settings
+ 9) install a fan for the video card or exchange video RAM
+ 10) install a better fan for the CPU
+ 11) exchange RAM chips
+ 12) exchange the motherboard.
+
+ To compile this driver as a module, choose M here: the
+ module will be called apm.
+
+if APM
+
+config APM_IGNORE_USER_SUSPEND
+   bool "Ignore USER SUSPEND"
+   help
+ This option will ignore USER SUSPEND requests. On machines with a
+ compliant APM BIOS, you want to say N. However, on the NEC Versa M
+ series notebooks, it is necessary to say Y because of a BIOS bug.
+
+config APM_DO_ENABLE
+   bool "Enable PM at boot time"
+   ---help---
+ Enable APM features at boot time. From page 36 of the APM BIOS
+ specification: "When disabled, the APM BIOS does not automatically
+ power manage devices, enter the Standby State, enter the Suspend
+ State, or take power saving steps in response to CPU Idle calls."
+ This driver will make CPU Idle calls when Linux is idle (unless this
+ feature is turned off -- see "Do CPU IDLE calls", below). This
+ should always save battery power, but more complicated APM features
+ will be dependent on your BIOS implementation. You may

[PATCH 03/10] x86: arch/x86/Kconfig.cpu unification

2007-11-03 Thread Sam Ravnborg

Move all CPU definitions to Kconfig.cpu

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.cpu|   83 ++
 arch/x86/Kconfig.x86_64 |   61 +--
 2 files changed, 69 insertions(+), 75 deletions(-)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 0e2adad..6fc19c1 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -3,11 +3,12 @@ if !X86_ELAN
 
 choice
prompt "Processor family"
-   default M686
+   default M686 if X86_32
+   default GENERIC_CPU if X86_64
 
 config M386
bool "386"
-   depends on !UML
+   depends on X86_32 && !UML
---help---
  This is the processor type of your CPU. This information is used for
  optimizing purposes. In order to compile a kernel that can run on
@@ -49,6 +50,7 @@ config M386
 
 config M486
bool "486"
+   depends on X86_32
help
  Select this for a 486 series processor, either Intel or one of the
  compatible processors from AMD, Cyrix, IBM, or Intel.  Includes DX,
@@ -57,6 +59,7 @@ config M486
 
 config M586
bool "586/K5/5x86/6x86/6x86MX"
+   depends on X86_32
help
  Select this for an 586 or 686 series processor such as the AMD K5,
  the Cyrix 5x86, 6x86 and 6x86MX.  This choice does not
@@ -64,18 +67,21 @@ config M586
 
 config M586TSC
bool "Pentium-Classic"
+   depends on X86_32
help
  Select this for a Pentium Classic processor with the RDTSC (Read
  Time Stamp Counter) instruction for benchmarking.
 
 config M586MMX
bool "Pentium-MMX"
+   depends on X86_32
help
  Select this for a Pentium with the MMX graphics/multimedia
  extended instructions.
 
 config M686
bool "Pentium-Pro"
+   depends on X86_32
help
  Select this for Intel Pentium Pro chips.  This enables the use of
  Pentium Pro extended instructions, and disables the init-time guard
@@ -83,6 +89,7 @@ config M686
 
 config MPENTIUMII
bool "Pentium-II/Celeron(pre-Coppermine)"
+   depends on X86_32
help
  Select this for Intel chips based on the Pentium-II and
  pre-Coppermine Celeron core.  This option enables an unaligned
@@ -92,6 +99,7 @@ config MPENTIUMII
 
 config MPENTIUMIII
bool "Pentium-III/Celeron(Coppermine)/Pentium-III Xeon"
+   depends on X86_32
help
  Select this for Intel chips based on the Pentium-III and
  Celeron-Coppermine core.  This option enables use of some
@@ -100,19 +108,14 @@ config MPENTIUMIII
 
 config MPENTIUMM
bool "Pentium M"
+   depends on X86_32
help
  Select this for Intel Pentium M (not Pentium-4 M)
  notebook chips.
 
-config MCORE2
-   bool "Core 2/newer Xeon"
-   help
- Select this for Intel Core 2 and newer Core 2 Xeons (Xeon 51xx and 53xx)
- CPUs. You can distinguish newer from older Xeons by the CPU family
- in /proc/cpuinfo. Newer ones have 6 and older ones 15 (not a typo)
-
 config MPENTIUM4
bool "Pentium-4/Celeron(P4-based)/Pentium-4 M/older Xeon"
+   depends on X86_32
help
  Select this for Intel Pentium 4 chips.  This includes the
  Pentium 4, Pentium D, P4-based Celeron and Xeon, and
@@ -148,6 +151,7 @@ config MPENTIUM4
 
 config MK6
bool "K6/K6-II/K6-III"
+   depends on X86_32
help
  Select this for an AMD K6-family processor.  Enables use of
  some extended instructions, and passes appropriate optimization
@@ -155,6 +159,7 @@ config MK6
 
 config MK7
bool "Athlon/Duron/K7"
+   depends on X86_32
help
  Select this for an AMD Athlon K7-family processor.  Enables use of
  some extended instructions, and passes appropriate optimization
@@ -169,6 +174,7 @@ config MK8
 
 config MCRUSOE
bool "Crusoe"
+   depends on X86_32
help
  Select this for a Transmeta Crusoe processor.  Treats the processor
  like a 586 with TSC, and sets some GCC optimization flags (like a
@@ -176,11 +182,13 @@ config MCRUSOE
 
 config MEFFICEON
bool "Efficeon"
+   depends on X86_32
help
  Select this for a Transmeta Efficeon processor.
 
 config MWINCHIPC6
bool "Winchip-C6"
+   depends on X86_32
help
  Select this for an IDT Winchip C6 chip.  Linux and GCC
  treat this chip as a 586TSC with some extended instructions
@@ -188,6 +196,7 @@ config MWINCHIPC6
 
 config MWINCHIP2
bool "Winchip-2"
+   depends on X86_32
help
  Select this for an IDT Winchip-2.  Linux and GCC
  treat this chip as a 586TSC with some extended instructions
@@ -195,6 +204,7 @@ config MWINCHIP2
 
 config MWINCHIP3D
bool "Winchip-2A/Winchip-3"
+   depends on X86_32
help

[PATCH 10/10] x86: enable make ARCH=x86

2007-11-03 Thread Sam Ravnborg

Note that this will break all scripts that use
make ARCH=i386 / make ARCH=x86_64

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 Makefile  |5 ++---
 arch/x86/Makefile |2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 188c3b6..eb22003 100644
--- a/Makefile
+++ b/Makefile
@@ -165,7 +165,8 @@ export srctree objtree VPATH TOPDIR
 # then ARCH is assigned, getting whatever value it gets normally, and 
 # SUBARCH is subsequently ignored.
 
-SUBARCH := $(shell uname -m | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ \
+SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
+ -e s/sun4u/sparc64/ \
  -e s/arm.*/arm/ -e s/sa110/arm/ \
  -e s/s390x/s390/ -e s/parisc64/parisc/ \
  -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
@@ -197,8 +198,6 @@ CROSS_COMPILE   ?=
 UTS_MACHINE:= $(ARCH)
 SRCARCH:= $(ARCH)
 
-# for i386 and x86_64 we use SRCARCH equal to x86
-SRCARCH := $(if $(filter x86_64 i386,$(SRCARCH)),x86,$(SRCARCH))
 
 KCONFIG_CONFIG ?= .config
 
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 3095973..c0a4e18 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -6,7 +6,7 @@ KBUILD_DEFCONFIG := $(ARCH)_defconfig
 # # No need to remake these files
 $(srctree)/arch/x86/Makefile%: ;
 
-ifeq ($(ARCH),i386)
+ifeq ($(CONFIG_X86_32),y)
 include $(srctree)/arch/x86/Makefile_32
 else
 include $(srctree)/arch/x86/Makefile_64
-- 
1.5.3.4.1157.g0e74-dirty
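
The effect of the two x86 sed rules in the SUBARCH hunk above can be sketched in C. This is a simplified stand-in, not the build system's logic: x86_subarch() is a hypothetical helper, and unlike sed it only matches the whole machine string, which is assumed to be enough for typical `uname -m` output.

```c
#include <string.h>

/*
 * Simplified stand-in for the sed rules s/i.86/x86/ and s/x86_64/x86/.
 * Both 32-bit and 64-bit x86 machine names map to the single "x86".
 */
static const char *x86_subarch(const char *machine)
{
	/* i386, i486, i586, i686: 'i', any one character, then "86". */
	if (strlen(machine) == 4 && machine[0] == 'i' &&
	    strcmp(machine + 2, "86") == 0)
		return "x86";
	if (strcmp(machine, "x86_64") == 0)
		return "x86";
	/* Other architectures are left untouched in this sketch. */
	return machine;
}
```

Because both names collapse to "x86", the 32 versus 64-bit choice can no longer come from the machine name and has to come from the config, which is what the CONFIG_X86_32 test in arch/x86/Makefile does.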

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 04/10] x86: add X86_32 dependency to i386 specific symbols in Kconfig.i386

2007-11-03 Thread Sam Ravnborg

To ease unification of Kconfig.i386 and Kconfig.x86_64
add X86_32 dependencies to all i386 specific symbols.

This patch introduces no functional changes but is one step
towards unification. This smaller step is used to ease
review of the patch set.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.i386 |   52 -
 1 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/arch/x86/Kconfig.i386 b/arch/x86/Kconfig.i386
index 8f79cdd..af72240 100644
--- a/arch/x86/Kconfig.i386
+++ b/arch/x86/Kconfig.i386
@@ -137,6 +137,7 @@ config X86_PC
 
 config X86_ELAN
bool "AMD Elan"
+   depends on X86_32
help
  Select this for an AMD Elan processor.
 
@@ -146,6 +147,7 @@ config X86_ELAN
 
 config X86_VOYAGER
bool "Voyager (NCR)"
+   depends on X86_32
select SMP if !BROKEN
help
  Voyager is an MCA-based 32-way capable SMP architecture proprietary
@@ -160,6 +162,7 @@ config X86_NUMAQ
bool "NUMAQ (IBM/Sequent)"
select SMP
select NUMA
+   depends on X86_32
help
  This option is used for getting Linux to run on a (IBM/Sequent) NUMA
  multiquad box. This changes the way that processors are bootstrapped,
@@ -169,7 +172,7 @@ config X86_NUMAQ
 
 config X86_SUMMIT
bool "Summit/EXA (IBM x440)"
-   depends on SMP
+   depends on X86_32 && SMP
help
  This option is needed for IBM systems that use the Summit/EXA chipset.
  In particular, it is needed for the x440.
@@ -179,7 +182,7 @@ config X86_SUMMIT
 
 config X86_BIGSMP
bool "Support for other sub-arch SMP systems with more than 8 CPUs"
-   depends on SMP
+   depends on X86_32 && SMP
help
  This option is needed for the systems that have more than 8 CPUs
  and if the system is not of any sub-arch type above.
@@ -188,6 +191,7 @@ config X86_BIGSMP
 
 config X86_VISWS
bool "SGI 320/540 (Visual Workstation)"
+   depends on X86_32
help
  The SGI Visual Workstation series is an IA32-based workstation
  based on SGI systems chips with some legacy PC hardware attached.
@@ -199,6 +203,7 @@ config X86_VISWS
 
 config X86_GENERICARCH
bool "Generic architecture (Summit, bigsmp, ES7000, default)"
+   depends on X86_32
help
   This option compiles in the Summit, bigsmp, ES7000, default subarchitectures.
  It is intended for a generic binary kernel.
@@ -206,7 +211,7 @@ config X86_GENERICARCH
 
 config X86_ES7000
bool "Support for Unisys ES7000 IA32 series"
-   depends on SMP
+   depends on X86_32 && SMP
help
  Support for Unisys ES7000 systems.  Say 'Y' here if this kernel is
  supposed to run on an IA32-based Unisys ES7000 system.
@@ -218,6 +223,7 @@ endchoice
 config SCHED_NO_NO_OMIT_FRAME_POINTER
bool "Single-depth WCHAN output"
default y
+   depends on X86_32
help
  Calculate simpler /proc/<PID>/wchan values. If this option
  is disabled then wchan values will recurse back to the
@@ -228,7 +234,7 @@ config SCHED_NO_NO_OMIT_FRAME_POINTER
 
 config PARAVIRT
bool
-   depends on !(X86_VISWS || X86_VOYAGER)
+   depends on X86_32 && !(X86_VISWS || X86_VOYAGER)
help
  This changes the kernel so it can modify itself when it is run
  under a hypervisor, potentially improving performance significantly
@@ -237,6 +243,7 @@ config PARAVIRT
 
 menuconfig PARAVIRT_GUEST
bool "Paravirtualized guest support"
+   depends on X86_32
help
  Say Y here to get to see options related to running Linux under
  various hypervisors.  This option alone does not add any kernel code.
@@ -290,7 +297,8 @@ config ES7000_CLUSTERED_APIC
 source "arch/x86/Kconfig.cpu"
 
 config HPET_TIMER
-   bool "HPET Timer Support"
+   bool
+   prompt "HPET Timer Support" if X86_32
help
  This enables the use of the HPET for the kernel's internal timer.
  HPET is the next generation timer replacing legacy 8254s.
@@ -341,7 +349,7 @@ source "kernel/Kconfig.preempt"
 
 config X86_UP_APIC
bool "Local APIC support on uniprocessors"
-   depends on !SMP && !(X86_VISWS || X86_VOYAGER || X86_GENERICARCH)
+   depends on X86_32 && !SMP && !(X86_VISWS || X86_VOYAGER || X86_GENERICARCH)
help
  A local APIC (Advanced Programmable Interrupt Controller) is an
  integrated interrupt controller in the CPU. If you have a single-CPU
@@ -398,7 +406,7 @@ config X86_MCE
 
 config X86_MCE_NONFATAL
tristate "Check for non-fatal errors on AMD Athlon/Duron / Intel Pentium 4"
-   depends on X86_MCE
+   depends on X86_32 && X86_MCE
help
  Enabling this feature starts a timer that triggers every 5 seconds which
  will look at the machine check registers to see if

[PATCH 05/10] x86: add X86_64 dependency to x86_64 specific symbols in Kconfig.x86_64

2007-11-03 Thread Sam Ravnborg

To ease unification of Kconfig.i386 and Kconfig.x86_64
add X86_64 dependencies to all x86_64 specific symbols.

This patch introduces no functional changes but is one step
towards unification. This smaller step is used to ease
review of the patch set.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.x86_64 |   18 +-
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/Kconfig.x86_64 b/arch/x86/Kconfig.x86_64
index 9fd69a0..5b7ada6 100644
--- a/arch/x86/Kconfig.x86_64
+++ b/arch/x86/Kconfig.x86_64
@@ -151,7 +151,7 @@ config X86_PC
 
 config X86_VSMP
bool "Support for ScaleMP vSMP"
-   depends on PCI
+   depends on X86_64 && PCI
 help
  Support for ScaleMP vSMP systems.  Say 'Y' here if this kernel is
  supposed to run on these EM64T-based machines.  Only choose this option
@@ -297,7 +297,7 @@ config NUMA
 
 config K8_NUMA
bool "Old style AMD Opteron NUMA detection"
-   depends on NUMA && PCI
+   depends on X86_64 && NUMA && PCI
default y
help
 Enable K8 NUMA node topology detection.  You should say Y here if
@@ -315,7 +315,7 @@ config NODES_SHIFT
 
 config X86_64_ACPI_NUMA
bool "ACPI NUMA detection"
-   depends on NUMA
+   depends on X86_64 && NUMA
select ACPI 
select PCI
select ACPI_NUMA
@@ -325,7 +325,7 @@ config X86_64_ACPI_NUMA
 
 config NUMA_EMU
bool "NUMA emulation"
-   depends on NUMA
+   depends on X86_64 && NUMA
help
  Enable NUMA emulation. A flat machine will be split
  into virtual nodes when booted with "numa=fake=N", where N is the
@@ -421,7 +421,7 @@ config GART_IOMMU
default y
select SWIOTLB
select AGP
-   depends on PCI
+   depends on X86_64 && PCI
help
  Support for full DMA access of devices with 32bit memory access only
  on systems with more than 3GB. This is usually needed for USB,
@@ -436,7 +436,7 @@ config GART_IOMMU
 config CALGARY_IOMMU
bool "IBM Calgary IOMMU support"
select SWIOTLB
-   depends on PCI && EXPERIMENTAL
+   depends on X86_64 && PCI && EXPERIMENTAL
help
  Support for hardware IOMMUs in IBM's xSeries x366 and x460
  systems. Needed to run systems with more than 3GB of memory
@@ -483,7 +483,7 @@ config X86_MCE
 
 config X86_MCE_INTEL
bool "Intel MCE features"
-   depends on X86_MCE && X86_LOCAL_APIC
+   depends on X86_64 && X86_MCE && X86_LOCAL_APIC
default y
help
   Additional support for intel specific MCE features such as
@@ -491,7 +491,7 @@ config X86_MCE_INTEL
 
 config X86_MCE_AMD
bool "AMD MCE features"
-   depends on X86_MCE && X86_LOCAL_APIC
+   depends on X86_64 && X86_MCE && X86_LOCAL_APIC
default y
help
   Additional support for AMD specific MCE features such as
@@ -598,7 +598,7 @@ config SECCOMP
 
 config CC_STACKPROTECTOR
bool "Enable -fstack-protector buffer overflow detection (EXPERIMENTAL)"
-   depends on EXPERIMENTAL
+   depends on X86_64 && EXPERIMENTAL
help
  This option turns on the -fstack-protector GCC feature. This
  feature puts, at the beginning of critical functions, a canary
-- 
1.5.3.4.1157.g0e74-dirty


[PATCH] replace "make ARCH=i386/x86_64 with make ARCH=x86"

2007-11-03 Thread Sam Ravnborg

This patchset unifies the i386 and x86_64 Kconfig
files for x86.
In addition it replaces the use of ARCH=i386 and
ARCH=x86_64 with the more intuitive ARCH=x86.

The primary purpose of this patch series is to
enable make ARCH=x86 and let the config decide
if we are building for 32 or 64 bit.

But we will break quite a high number of
scripts with this change.
What is the desired behaviour when specifying:
make ARCH=i386
and
make ARCH=x86_64
??

For now it just errors out like this:
$ make ARCH=i386
Makefile:503: /home/sam/kernel/x86.git/arch/i386/Makefile: No such file or directory
make: *** No rule to make target `/home/sam/kernel/x86.git/arch/i386/Makefile'.  Stop.


What is the required functionality here?
Note - only patches 9 and 10 change any behaviour.
So if we decide to keep the current functionality then
the first 8 patches still make a lot of sense.


The patch series contains:
Sam Ravnborg (10):
  x86: unification of cpufreq/Kconfig
  x86: start unification of arch/x86/Kconfig.*
  x86: arch/x86/Kconfig.cpu unification
  x86: add X86_32 dependency to i386 specific symbols in Kconfig.i386
  x86: add X86_64 dependency to x86_64 specific symbols in Kconfig.x86_64
  x86: copy x86_64 specific Kconfig symbols to Kconfig.i386
  x86: add remaining bits from x86_64 to Kconfig.i386
  x86: combine all config options with prompts in Kconfig
  x86: select i386 or x86_64 at config time
  x86: enable make ARCH=x86

The diffstat looks promising:
 9 files changed, 975 insertions(+), 2639 deletions(-)

A patch that makes the kernel more than 1500 lines smaller
is a good patch!

Full diffstat (as git can show it):

$ git diff --stat -C -M -B HEAD~10..HEAD
 Makefile   |5 +-
 arch/x86/{Kconfig.i386 => Kconfig} |  600 
 arch/x86/Kconfig.cpu   |   83 +-
 arch/x86/Kconfig.i386  | 1609 
 arch/x86/Kconfig.x86_64| 1132 --
 arch/x86/Makefile  |2 +-
 .../x86/kernel/cpu/cpufreq/{Kconfig_32 => Kconfig} |   68 +-
 arch/x86/kernel/cpu/cpufreq/Kconfig_64 |  108 --
 scripts/kconfig/Makefile   |7 +-
 9 files changed, 975 insertions(+), 2639 deletions(-)

The series can be pulled from:

ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/x86.git


Patches will follow in a separate thread.
This series replaces the single patch I posted the other day.
The series is split up to ease review, so a few of the patches
do not change functionality but just prepare for later steps.

Sam


[PATCH, RESEND] tmpfs: fix mounts when size is less than the page size

2007-11-03 Thread Michael Marineau

When tmpfs is mounted with a size less than one page the number of blocks
is set to 0, which makes the tmpfs mount unlimited. This can lead to a quick
and surprising death if someone mistypes a tmpfs mount command and writes too much.

tmpfs can still be mounted as unlimited if size or nr_blocks is exactly 0.

Signed-off-by: Michael Marineau <[EMAIL PROTECTED]>
---
Somehow I was a moron and didn't notice the compiler shouting insults at me,
here is the correct version of the patch. Sorry for the noise :-(

 mm/shmem.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 253d205..86b47d8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2138,6 +2138,8 @@ static int shmem_parse_options(char *options, int *mode, uid_t *uid,
if (*rest)
goto bad_val;
*blocks = size >> PAGE_CACHE_SHIFT;
+   if (size && *blocks == 0)
+   *blocks = 1;
} else if (!strcmp(this_char,"nr_blocks")) {
*blocks = memparse(value,&rest);
if (*rest)
-- 
1.5.1.6
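
The rounding rule the patch adds can be shown in isolation. The following is a minimal userspace sketch, assuming 4 KiB pages (a shift of 12); SKETCH_PAGE_SHIFT and size_to_blocks() are illustrative names, not identifiers from mm/shmem.c.

```c
#include <stdint.h>

/* Assumed page size for the sketch: 4 KiB pages, i.e. a shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Convert a tmpfs "size=" value in bytes to a block (page) limit.
 * A result of 0 means "unlimited", so any non-zero size smaller than
 * one page is rounded up to a single block, as the patch does.
 */
static uint64_t size_to_blocks(uint64_t size)
{
	uint64_t blocks = size >> SKETCH_PAGE_SHIFT;

	if (size && blocks == 0)
		blocks = 1;
	return blocks;
}
```

With this rule a request such as size=100 yields a limit of one block instead of silently becoming unlimited, while size=0 keeps its meaning of "no limit".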


High cpu load due to pdflush

2007-11-03 Thread Bob Gill

Hi.  My computer is constipated.  The load average when idle never goes 
below 100%.  Top shows this:

top - 11:01:27 up 5 min,  2 users,  load average: 2.63, 1.73, 0.76
Tasks:  91 total,   2 running,  89 sleeping,   0 stopped,   0 zombie
Cpu(s): 13.5%us,  4.4%sy,  1.6%ni,  9.9%id, 70.2%wa,  0.3%hi,  0.2%si,  0.0%st
Note that we are at 70% for the wait state, 9.9% idle.  Now when I run  
ps aux | grep pdflush | grep -v grep  ...I get:

root   144  0.0  0.0  0 0 ?  S  10:56   0:00 [pdflush]
root   145  0.0  0.0  0 0 ?  D  10:56   0:00 [pdflush]
The second pdflush is in state D (uninterruptible sleep) and won't write.  I haven't changed 
anything in /proc/sys/vm and am running the 2.6.24-rc1-git12 kernel on 
Ubuntu Gutsy.  The original Gutsy kernel worked fine.  The only way I 
can get around the problem is to go into a terminal and run:

while true
do
    sync
    sleep 10
done
...which will force writes and give me (from top)
top - 11:06:55 up 10 min,  3 users,  load average: 0.42, 1.23, 0.85
Tasks:  93 total,   2 running,  91 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.0%us,  0.9%sy,  0.0%ni, 95.6%id,  0.3%wa,  0.3%hi,  0.0%si,  0.0%st

(on the very same load as before and the load is still dropping).

The only real change I made to Ubuntu was in 
/etc/init.d/mountdevsubfs.sh where I uncommented:

# Magic to make /proc/bus/usb work
#
mkdir -p /dev/bus/usb/.usbfs
domount usbfs "" /dev/bus/usb/.usbfs -obusmode=0700,devmode=0600,listmode=0644

ln -s .usbfs/devices /dev/bus/usb/devices
mount --rbind /dev/bus/usb /proc/bus/usb

...so that usb would work with a custom kernel...

Oh, and btw, my cpu is an Intel P4.
Also, I applied the following patch and built/ran a kernel (but it did 
not reduce the cpu load on my system):
> -- fs/jbd/transaction.c -
> index cceaf57..d38e0d5 100644
> @@ -55,7 +55,7 @@ get_transaction(journal_t *journal, transaction_t *transaction)
>  spin_lock_init(&transaction->t_handle_lock);
>
>  /* Set up the commit timer for the new transaction. */
> - journal->j_commit_timer.expires = round_jiffies(transaction->t_expires);
> + journal->j_commit_timer.expires = transaction->t_expires;
>  add_timer(&journal->j_commit_timer);
>
>  J_ASSERT(journal->j_running_transaction == NULL);

Any ideas?  Please mail me if you need more information. 
Thanks in Advance,

Bob

Re: x86_64 ten times slower than i386

2007-11-03 Thread H. Peter Anvin


Bo Brantén wrote:
> On Sat, 3 Nov 2007, Matt Mackall wrote:
>
>> This is typically due to a problem with the setup of your MTRRs. Try
>> booting with mem=nnnM where nnn is some number smaller than your
>> actual amount of memory.
>
> Thank you for that advice, the system has 4GB and if I boot with
> mem=3072M it will run as fast as normal while if I don't use the mem
> option it will run 10 times slower, however if I use a figure like
> mem=3500M the kernel will panic, is there any way to determine the
> highest usable figure without trial and error?

Yes, look at how your MTRRs are set up (cat /proc/mtrr).

-hpa
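
As a rough illustration of reading /proc/mtrr for this purpose, the sketch below decodes one entry and reports where its range ends. It assumes each line starts with "regNN: base=0x... (NNNMB), size=NNNMB" and names the memory type somewhere on the line; struct mtrr_entry, parse_mtrr_line() and mtrr_end_mb() are hypothetical helpers, and the exact line layout may differ between kernel versions.

```c
#include <stdio.h>
#include <string.h>

/* One decoded /proc/mtrr entry; the *_mb fields are in megabytes. */
struct mtrr_entry {
	unsigned long long base;     /* base address in bytes */
	unsigned long long base_mb;  /* base as printed in parentheses */
	unsigned long long size_mb;  /* size of the range */
	int write_back;              /* non-zero if the type is write-back */
};

/* Returns 0 on success, -1 if the line does not look like an MTRR entry. */
static int parse_mtrr_line(const char *line, struct mtrr_entry *e)
{
	if (sscanf(line, "reg%*d: base=%llx (%lluMB), size=%lluMB",
		   &e->base, &e->base_mb, &e->size_mb) != 3)
		return -1;
	e->write_back = strstr(line, "write-back") != NULL;
	return 0;
}

/* End of the range covered by this entry, in megabytes. */
static unsigned long long mtrr_end_mb(const struct mtrr_entry *e)
{
	return e->base_mb + e->size_mb;
}
```

The idea is that the highest address still covered by a write-back entry bounds the mem= value that should run at full speed. The actual MTRR layout of the machine in this thread was not posted, so which value applies there remains unknown.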

Re: device struct bloat

2007-11-03 Thread Greg KH

On Sat, Nov 03, 2007 at 12:48:23PM -0700, Stephen Hemminger wrote:
> The sizeof(struct device) is way too big, especially in the network
> device case.
> We want to support 1000s of devices and the change from class_device to
> net_device has caused needless bloat.
> 
> sizeof(struct device) = 272
> sizeof(struct class_device) = 92
>   * note the class_id in class_device could also be removed or changed to
>  a ptr, since it is redundant for net_devices.

I agree that struct device is bigger than perhaps it should be (Kay is
working on getting rid of the bus_id field and we both just trimmed down
the base kobject by about 20 bytes) but is this really a problem that is
noticeable by anyone?

I'm all for saving memory, but 1000's of struct devices is not anything
that the kernel should even notice.  s390 machines create tens of
thousands of these all the time, and they are severely memory limited,
with no apparent problem.

And I'm guessing that embedded systems would not be the ones that would
be creating 1000's of network devices, right?  Are these virtual devices
or backed by real, physical devices?

If it is an issue, we can start to work on slimming the structure down.
At first glance, I'm sure we can save memory by just rearranging the
fields to get rid of some structure padding that I'm sure is there.

After that, I'm sure we can push a lot of other fields out into a
separate structure to handle if the device is "virtual" or not, which
would let us drop a bunch of the dma and other resource-type things.

thanks,

greg k-h
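
To illustrate the point about structure padding, here is a small sketch with two made-up structs that hold the same fields in different orders. The field names are hypothetical and are not taken from struct device; the exact sizes depend on the ABI.

```c
#include <stddef.h>

/* Hypothetical layout: small fields interleaved with pointer-sized ones. */
struct padded_dev {
	char   flag_a;
	void  *parent;
	char   flag_b;
	void  *driver_data;
	short  id;
};

/* Same fields, ordered from largest alignment to smallest. */
struct reordered_dev {
	void  *parent;
	void  *driver_data;
	short  id;
	char   flag_a;
	char   flag_b;
};

/* Bytes saved purely by reordering, with no field removed. */
static size_t padding_saved(void)
{
	return sizeof(struct padded_dev) - sizeof(struct reordered_dev);
}
```

On a typical LP64 ABI the first layout takes 40 bytes and the second 24, so the reordering alone saves 16 bytes per instance. How much a similar exercise would save in the real structure has not been measured here.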

[PATCH] Remove unneeded lock_kernel() in drivers/pci/syscall.c.

2007-11-03 Thread Diego Woitasen

sys_pciconfig_{read,write}() are protected against PCI removal by the 
reference count in struct pci_dev. Concurrent calls to the 
pci_user_{read,write}_config_* functions are already serialized by pci_lock in 
drivers/pci/access.c.

Signed-off-by: Diego Woitasen <[EMAIL PROTECTED]>
---
 drivers/pci/syscall.c |5 -
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/syscall.c b/drivers/pci/syscall.c
index 2ac050d..645d7a6 100644
--- a/drivers/pci/syscall.c
+++ b/drivers/pci/syscall.c
@@ -34,7 +34,6 @@ sys_pciconfig_read(unsigned long bus, unsigned long dfn,
if (!dev)
goto error;
 
-   lock_kernel();
switch (len) {
case 1:
cfg_ret = pci_user_read_config_byte(dev, off, &byte);
@@ -47,10 +46,8 @@ sys_pciconfig_read(unsigned long bus, unsigned long dfn,
break;
default:
err = -EINVAL;
-   unlock_kernel();
goto error;
};
-   unlock_kernel();
 
err = -EIO;
if (cfg_ret != PCIBIOS_SUCCESSFUL)
@@ -107,7 +104,6 @@ sys_pciconfig_write(unsigned long bus, unsigned long dfn,
if (!dev)
return -ENODEV;
 
-   lock_kernel();
switch(len) {
case 1:
err = get_user(byte, (u8 __user *)buf);
@@ -140,7 +136,6 @@ sys_pciconfig_write(unsigned long bus, unsigned long dfn,
err = -EINVAL;
break;
}
-   unlock_kernel();
pci_dev_put(dev);
return err;
 }
-- 
1.5.2.4


Re: drive side 80-wire cable detection failed

2007-11-03 Thread Alan Cox

On Sun, 04 Nov 2007 00:09:49 +0100
Tobias Hoffmann <[EMAIL PROTECTED]> wrote:

> Hi!
> 
> On my NV3 board with a Samsung SP1634N Harddisk I wrongly got
> "drive side 80-wire cable detection failed" with the current kernel.

Does the drive-side cable detection work correctly on a non-Nvidia chipset?

drive side 80-wire cable detection failed

2007-11-03 Thread Tobias Hoffmann


Hi!

On my NV3 board with a Samsung SP1634N Harddisk I wrongly got
"drive side 80-wire cable detection failed" with the current kernel.

A possible fix is attached.

Please CC any replies as I'm not on the list.

 Tobias


diff --git a/drivers/ide/ide-iops.c b/drivers/ide/ide-iops.c
index dcda0f1..e3cfd6c 100644
--- a/drivers/ide/ide-iops.c
+++ b/drivers/ide/ide-iops.c
@@ -588,6 +588,7 @@ EXPORT_SYMBOL_GPL(ide_in_drive_list);
static const struct drive_list_entry ivb_list[] = {
   { "QUANTUM FIREBALLlct10 05", "A03.0900"},
   { "TSSTcorp CDDVDW SH-S202J", "SB00"},
+   { "SAMSUNG SP1634N" , "UZ100-03"},
   { NULL  , NULL  }
};


[RFC PATCH 6/10] split anon and file LRUs

2007-11-03 Thread Rik van Riel

Split the LRU lists in two, one set for pages that are backed by
real file systems ("file") and one for pages that are backed by
memory and swap ("anon").  The latter includes tmpfs.

Eventually mlocked pages will be taken off the LRUs altogether.
A patch for that already exists and just needs to be integrated
into this series.

This patch mostly has the infrastructure and a basic policy to
balance how much we scan the anon lists and how much we scan
the file lists. Fancy policy changes will be in separate patches.

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
Signed-off-by: Lee Schermerhorn <[EMAIL PROTECTED]>

Index: linux-2.6.23-mm1/fs/proc/proc_misc.c
===
--- linux-2.6.23-mm1.orig/fs/proc/proc_misc.c
+++ linux-2.6.23-mm1/fs/proc/proc_misc.c
@@ -149,43 +149,47 @@ static int meminfo_read_proc(char *page,
 * Tagged format, for easy grepping and expansion.
 */
len = sprintf(page,
-   "MemTotal:     %8lu kB\n"
-   "MemFree:      %8lu kB\n"
-   "Buffers:      %8lu kB\n"
-   "Cached:       %8lu kB\n"
-   "SwapCached:   %8lu kB\n"
-   "Active:       %8lu kB\n"
-   "Inactive:     %8lu kB\n"
+   "MemTotal:       %8lu kB\n"
+   "MemFree:        %8lu kB\n"
+   "Buffers:        %8lu kB\n"
+   "Cached:         %8lu kB\n"
+   "SwapCached:     %8lu kB\n"
+   "Active(anon):   %8lu kB\n"
+   "Inactive(anon): %8lu kB\n"
+   "Active(file):   %8lu kB\n"
+   "Inactive(file): %8lu kB\n"
 #ifdef CONFIG_HIGHMEM
-   "HighTotal:    %8lu kB\n"
-   "HighFree:     %8lu kB\n"
-   "LowTotal:     %8lu kB\n"
-   "LowFree:      %8lu kB\n"
-#endif
-   "SwapTotal:    %8lu kB\n"
-   "SwapFree:     %8lu kB\n"
-   "Dirty:        %8lu kB\n"
-   "Writeback:    %8lu kB\n"
-   "AnonPages:    %8lu kB\n"
-   "Mapped:       %8lu kB\n"
-   "Slab:         %8lu kB\n"
-   "SReclaimable: %8lu kB\n"
-   "SUnreclaim:   %8lu kB\n"
-   "PageTables:   %8lu kB\n"
-   "NFS_Unstable: %8lu kB\n"
-   "Bounce:       %8lu kB\n"
-   "CommitLimit:  %8lu kB\n"
-   "Committed_AS: %8lu kB\n"
-   "VmallocTotal: %8lu kB\n"
-   "VmallocUsed:  %8lu kB\n"
-   "VmallocChunk: %8lu kB\n",
+   "HighTotal:      %8lu kB\n"
+   "HighFree:       %8lu kB\n"
+   "LowTotal:       %8lu kB\n"
+   "LowFree:        %8lu kB\n"
+#endif
+   "SwapTotal:      %8lu kB\n"
+   "SwapFree:       %8lu kB\n"
+   "Dirty:          %8lu kB\n"
+   "Writeback:      %8lu kB\n"
+   "AnonPages:      %8lu kB\n"
+   "Mapped:         %8lu kB\n"
+   "Slab:           %8lu kB\n"
+   "SReclaimable:   %8lu kB\n"
+   "SUnreclaim:     %8lu kB\n"
+   "PageTables:     %8lu kB\n"
+   "NFS_Unstable:   %8lu kB\n"
+   "Bounce:         %8lu kB\n"
+   "CommitLimit:    %8lu kB\n"
+   "Committed_AS:   %8lu kB\n"
+   "VmallocTotal:   %8lu kB\n"
+   "VmallocUsed:    %8lu kB\n"
+   "VmallocChunk:   %8lu kB\n",
K(i.totalram),
K(i.freeram),
K(i.bufferram),
K(cached),
K(total_swapcache_pages),
-   K(global_page_state(NR_ACTIVE)),
-   K(global_page_state(NR_INACTIVE)),
+   K(global_page_state(NR_ACTIVE_ANON)),
+   K(global_page_state(NR_INACTIVE_ANON)),
+   K(global_page_state(NR_ACTIVE_FILE)),
+   K(global_page_state(NR_INACTIVE_FILE)),
 #ifdef CONFIG_HIGHMEM
K(i.totalhigh),
K(i.freehigh),
Index: linux-2.6.23-mm1/fs/cifs/file.c
===
--- linux-2.6.23-mm1.orig/fs/cifs/file.c
+++ linux-2.6.23-mm1/fs/cifs/file.c
@@ -1740,7 +1740,7 @@ static void cifs_copy_cache_pages(struct
SetPageUptodate(page);
unlock_page(page);
if (!pagevec_add(plru_pvec, page))
-   __pagevec_lru_add(plru_pvec);
+   __pagevec_lru_add_file(plru_pvec);
data += PAGE_CACHE_SIZE;
}
return;
@@ -1878,7 +1878,7 @@ static int cifs_readpages(struct file *f
bytes_read = 0;
}
 
-   pagevec_lru_add(&lru_pvec);
+   pagevec_lru_add_file(&lru_pvec);
 
 /* need to free smb_read_data buf before exit */
if (smb_read_data) {
Index: linux-2.6.23-mm1/fs/ntfs/file.c

Linux 2.6.16.57-rc1

2007-11-03 Thread Adrian Bunk

Security fixes since 2.6.16.56:
- CVE-2007-3740: CIFS should honour umask
- CVE-2007-4308: aacraid: fix security hole
- CVE-2007-4997: [IEEE80211]: avoid integer underflow for runt rx frames
- CVE-2007-5093: USB: fix DoS in pwc USB video driver


Location:
ftp://ftp.kernel.org/pub/linux/kernel/people/bunk/linux-2.6.16.y/testing/

git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git

RSS feed of the git tree:
http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.16.y.git;a=rss


Changes since 2.6.16.56:

Adit Ranadive (1):
  [PKTGEN]: srcmac fix

Adrian Bunk (1):
  Linux 2.6.16.57-rc1

Alan Cox (1):
  aacraid: fix security hole (CVE-2007-4308)

Alexey Dobriyan (1):
  optical /proc/ide/*/media

Chris Wright (1):
  [SPARC64] pass correct addr in get_fb_unmapped_area(MAP_FIXED)

Danny Kukawka (1):
  ide: add "optical" to sysfs "media" attribute

David S. Miller (1):
  [SPARC64]: Fix show_stack() when stack argument is NULL.

Herbert Xu (1):
  [SNAP]: Check packet length before reading

John W. Linville (1):
  [IEEE80211]: avoid integer underflow for runt rx frames (CVE-2007-4997)

Neil Brown (1):
  knfsd: allow nfsd READDIR to return 64bit cookies

Nick Piggin (1):
  buffer: memorder fix

Ohad Ben-Cohen (2):
  [Bluetooth] Fix unintentional fall-through in HCI line discipline
  [Bluetooth] Fix NULL pointer dereference in HCI line discipline

Oliver Neukum (2):
  USB: fix DoS in pwc USB video driver (CVE-2007-5093)
  Fix oops in pwc v4l driver

Patrick McHardy (1):
  [ICMP]: Fix icmp_errors_use_inbound_ifaddr sysctl

Ranko Zivojnovic (1):
  [NET]: gen_estimator deadlock fix

Steve French (1):
  CIFS should honour umask (CVE-2007-3740)


 Makefile                        |    2
 arch/sparc64/kernel/sys_sparc.c |    2
 arch/sparc64/kernel/traps.c     |   18 ++--
 arch/sparc64/mm/fault.c         |    5 -
 drivers/bluetooth/hci_ldisc.c   |    5 -
 drivers/ide/ide-proc.c          |    2
 drivers/ide/ide.c               |    2
 drivers/scsi/aacraid/linit.c    |    4
 drivers/usb/media/pwc/pwc-if.c  |  142 +++-
 drivers/usb/media/pwc/pwc.h     |    1
 fs/buffer.c                     |    1
 fs/cifs/dir.c                   |    6 -
 fs/cifs/inode.c                 |    5 -
 fs/nfsd/nfs3xdr.c               |    6 -
 net/802/psnap.c                 |   17 ++-
 net/core/gen_estimator.c        |   81 +++---
 net/core/pktgen.c               |   10 ++
 net/ieee80211/ieee80211_rx.c    |    6 +
 net/ipv4/icmp.c                 |   12 ++
 19 files changed, 229 insertions(+), 98 deletions(-)

[RFC PATCH 2/10] free swap space entries if vm_swap_full()

2007-11-03 Thread Rik van Riel

Rik van Riel's patch to free swap space on swap-in/activation,
forward ported by Lee Schermerhorn.

Against:  2.6.23-rc2-mm2 atop:
+ lts' convert anon_vma list lock to reader/write lock patch
+ Nick Piggin's move and rework isolate_lru_page() patch

Patch Description:  quick attempt by lts

Free swap cache entries when swapping in pages if vm_swap_full()
[swap space > 1/2 used?].  Uses new pagevec to reduce pressure
on locks.
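
For readers who want the idea without the kernel plumbing, here is a
rough userspace sketch of the check described above.  Everything named
toy_* is made up for illustration and is not part of the patch; the
real code calls vm_swap_full() in the callers and uses
remove_exclusive_swap_page() under the page lock, which can also refuse
to free a slot for reasons this toy ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's swap counters. */
struct toy_swap {
	long free_pages;   /* swap slots not in use */
	long total_pages;  /* swap slots configured */
};

/* Hypothetical page: only the state this sketch looks at. */
struct toy_page {
	bool in_swap_cache; /* page still owns a swap slot */
	bool locked;        /* someone else holds the page lock */
};

/* "More than half of swap in use", the condition quoted above. */
static bool toy_swap_full(const struct toy_swap *s)
{
	return s->free_pages * 2 < s->total_pages;
}

/*
 * Walk a batch of pages and give back the swap slot of every page
 * that is in the swap cache and can be locked without waiting.
 * Returns the number of slots released.
 */
static int toy_swap_free(struct toy_page *pages, int n, struct toy_swap *s)
{
	int i, freed = 0;

	if (!toy_swap_full(s))
		return 0;

	for (i = 0; i < n; i++) {
		if (pages[i].in_swap_cache && !pages[i].locked) {
			pages[i].in_swap_cache = false;
			s->free_pages++;
			freed++;
		}
	}
	return freed;
}
```

Pages that are already locked are simply skipped, which mirrors the
trylock in pagevec_swap_free(): the batch never sleeps waiting for a
page.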

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
Signed-off-by: Lee Schermerhorn <[EMAIL PROTECTED]>

 include/linux/pagevec.h |    1 +
 mm/swap.c               |   18 ++
 mm/vmscan.c             |   16 +++-
 3 files changed, 30 insertions(+), 5 deletions(-)

Index: linux-2.6.23-rc6-mm1/mm/vmscan.c
===
--- linux-2.6.23-rc6-mm1.orig/mm/vmscan.c   2007-09-25 15:20:05.0 -0400
+++ linux-2.6.23-rc6-mm1/mm/vmscan.c        2007-09-25 15:25:04.0 -0400
@@ -613,6 +613,9 @@ free_it:
continue;
 
 activate_locked:
+   /* Not a candidate for swapping, so reclaim swap space. */
+   if (PageSwapCache(page) && vm_swap_full())
+   remove_exclusive_swap_page(page);
SetPageActive(page);
pgactivate++;
 keep_locked:
@@ -1142,14 +1145,13 @@ force_reclaim_mapped:
}
}
__mod_zone_page_state(zone, NR_INACTIVE, pgmoved);
+   spin_unlock_irq(&zone->lru_lock);
pgdeactivate += pgmoved;
-   if (buffer_heads_over_limit) {
-   spin_unlock_irq(&zone->lru_lock);
-   pagevec_strip(&pvec);
-   spin_lock_irq(&zone->lru_lock);
-   }
 
+   if (buffer_heads_over_limit)
+   pagevec_strip(&pvec);
pgmoved = 0;
+   spin_lock_irq(&zone->lru_lock);
while (!list_empty(&l_active)) {
page = lru_to_page(&l_active);
prefetchw_prev_lru_page(page, &l_active, flags);
@@ -1163,6 +1165,8 @@ force_reclaim_mapped:
__mod_zone_page_state(zone, NR_ACTIVE, pgmoved);
pgmoved = 0;
spin_unlock_irq(&zone->lru_lock);
+   if (vm_swap_full())
+   pagevec_swap_free(&pvec);
__pagevec_release(&pvec);
spin_lock_irq(&zone->lru_lock);
}
@@ -1172,6 +1176,8 @@ force_reclaim_mapped:
__count_zone_vm_events(PGREFILL, zone, pgscanned);
__count_vm_events(PGDEACTIVATE, pgdeactivate);
spin_unlock_irq(&zone->lru_lock);
+   if (vm_swap_full())
+   pagevec_swap_free(&pvec);
 
pagevec_release(&pvec);
 }
Index: linux-2.6.23-rc6-mm1/mm/swap.c
===
--- linux-2.6.23-rc6-mm1.orig/mm/swap.c 2007-09-25 15:20:05.0 -0400
+++ linux-2.6.23-rc6-mm1/mm/swap.c  2007-09-25 15:22:51.0 -0400
@@ -421,6 +421,24 @@ void pagevec_strip(struct pagevec *pvec)
}
 }
 
+/*
+ * Try to free swap space from the pages in a pagevec
+ */
+void pagevec_swap_free(struct pagevec *pvec)
+{
+   int i;
+
+   for (i = 0; i < pagevec_count(pvec); i++) {
+   struct page *page = pvec->pages[i];
+
+   if (PageSwapCache(page) && !TestSetPageLocked(page)) {
+   if (PageSwapCache(page))
+   remove_exclusive_swap_page(page);
+   unlock_page(page);
+   }
+   }
+}
+
 /**
  * pagevec_lookup - gang pagecache lookup
  * @pvec:  Where the resulting pages are placed
Index: linux-2.6.23-rc6-mm1/include/linux/pagevec.h
===
--- linux-2.6.23-rc6-mm1.orig/include/linux/pagevec.h   2007-09-25 15:20:02.0 -0400
+++ linux-2.6.23-rc6-mm1/include/linux/pagevec.h        2007-09-25 15:22:51.0 -0400
@@ -26,6 +26,7 @@ void __pagevec_free(struct pagevec *pvec
 void __pagevec_lru_add(struct pagevec *pvec);
 void __pagevec_lru_add_active(struct pagevec *pvec);
 void pagevec_strip(struct pagevec *pvec);
+void pagevec_swap_free(struct pagevec *pvec);
 unsigned pagevec_lookup(struct pagevec *pvec, struct address_space *mapping,
pgoff_t start, unsigned nr_pages);
 unsigned pagevec_lookup_tag(struct pagevec *pvec,

[RFC PATCH 9/10] split VM and memory controllers

2007-11-03 Thread Rik van Riel

The memory controller code is still quite simple, so don't do
anything fancy for now trying to make it work better with the
split VM code.

Will be merged into 6/10 soon.

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

Index: linux-2.6.23-mm1/mm/memcontrol.c
===
--- linux-2.6.23-mm1.orig/mm/memcontrol.c
+++ linux-2.6.23-mm1/mm/memcontrol.c
@@ -210,7 +210,6 @@ unsigned long mem_cgroup_isolate_pages(u
struct list_head *src;
struct page_cgroup *pc;
 
-//TODO:  memory container maintain separate file/anon lists?
if (active)
src = &mem_cont->active_list;
else
@@ -222,6 +221,9 @@ unsigned long mem_cgroup_isolate_pages(u
page = pc->page;
VM_BUG_ON(!pc);
 
+   /*
+* TODO: play better with lumpy reclaim, grabbing anything.
+*/
if (PageActive(page) && !active) {
__mem_cgroup_move_lists(pc, true);
scan--;
@@ -240,6 +242,9 @@ unsigned long mem_cgroup_isolate_pages(u
if (page_zone(page) != z)
continue;
 
+   if (file != !!page_file_cache(page))
+   continue;
+
/*
 * Check if the meta page went away from under us
 */

[RFC PATCH 1/10] move isolate_lru_page to vmscan.c

2007-11-03 Thread Rik van Riel

move isolate_lru_page() to vmscan.c

Against 2.6.23-rc4-mm1

V1 -> V2 [lts]:
+  fix botched merge -- add back "get_page_unless_zero()"

From: Nick Piggin <[EMAIL PROTECTED]>
To: Linux Memory Management <[EMAIL PROTECTED]>
Subject: [patch 1/4] mm: move and rework isolate_lru_page
Date:   Mon, 12 Mar 2007 07:38:44 +0100 (CET)

isolate_lru_page logically belongs in vmscan.c rather than migrate.c.

It is a tough call, because we don't need that function without memory
migration, so there is a valid argument for keeping it in migrate.c. However,
a subsequent patch needs to make use of it in the core mm, so we can happily
move it to vmscan.c.

Also, make the function a little more generic by not requiring that it
adds an isolated page to a given list. Callers can do that.

Note that we now have '__isolate_lru_page()', that does
something quite different, visible outside of vmscan.c
for use with memory controller.  Methinks we need to
rationalize these names/purposes.   --lts

Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
Signed-off-by: Lee Schermerhorn <[EMAIL PROTECTED]>

 include/linux/migrate.h |    3 ---
 mm/internal.h           |    2 ++
 mm/mempolicy.c          |   10 --
 mm/migrate.c            |   47 ++-
 mm/vmscan.c             |   41 +
 5 files changed, 61 insertions(+), 42 deletions(-)

Index: Linux/include/linux/migrate.h
===
--- Linux.orig/include/linux/migrate.h  2007-07-08 19:32:17.0 -0400
+++ Linux/include/linux/migrate.h   2007-09-20 10:21:52.0 -0400
@@ -25,7 +25,6 @@ static inline int vma_migratable(struct 
return 1;
 }
 
-extern int isolate_lru_page(struct page *p, struct list_head *pagelist);
 extern int putback_lru_pages(struct list_head *l);
 extern int migrate_page(struct address_space *,
struct page *, struct page *);
@@ -42,8 +41,6 @@ extern int migrate_vmas(struct mm_struct
 static inline int vma_migratable(struct vm_area_struct *vma)
{ return 0; }
 
-static inline int isolate_lru_page(struct page *p, struct list_head *list)
-   { return -ENOSYS; }
 static inline int putback_lru_pages(struct list_head *l) { return 0; }
 static inline int migrate_pages(struct list_head *l, new_page_t x,
unsigned long private) { return -ENOSYS; }
Index: Linux/mm/internal.h
===
--- Linux.orig/mm/internal.h    2007-09-20 09:09:36.0 -0400
+++ Linux/mm/internal.h 2007-09-20 10:21:52.0 -0400
@@ -34,6 +34,8 @@ static inline void __put_page(struct pag
atomic_dec(&page->_count);
 }
 
+extern int isolate_lru_page(struct page *page);
+
 extern void fastcall __init __free_pages_bootmem(struct page *page,
unsigned int order);
 
Index: Linux/mm/migrate.c
===
--- Linux.orig/mm/migrate.c 2007-09-20 10:21:51.0 -0400
+++ Linux/mm/migrate.c  2007-09-20 10:21:52.0 -0400
@@ -36,36 +36,6 @@
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
 
 /*
- * Isolate one page from the LRU lists. If successful put it onto
- * the indicated list with elevated page count.
- *
- * Result:
- *  -EBUSY: page not on LRU list
- *  0: page removed from LRU list and added to the specified list.
- */
-int isolate_lru_page(struct page *page, struct list_head *pagelist)
-{
-   int ret = -EBUSY;
-
-   if (PageLRU(page)) {
-   struct zone *zone = page_zone(page);
-
-   spin_lock_irq(&zone->lru_lock);
-   if (PageLRU(page) && get_page_unless_zero(page)) {
-   ret = 0;
-   ClearPageLRU(page);
-   if (PageActive(page))
-   del_page_from_active_list(zone, page);
-   else
-   del_page_from_inactive_list(zone, page);
-   list_add_tail(&page->lru, pagelist);
-   }
-   spin_unlock_irq(&zone->lru_lock);
-   }
-   return ret;
-}
-
-/*
  * migrate_prep() needs to be called before we start compiling a list of pages
  * to be migrated using isolate_lru_page().
  */
@@ -850,14 +820,17 @@ static int do_move_pages(struct mm_struc
!migrate_all)
goto put_and_set;
 
-   err = isolate_lru_page(page, &pagelist);
+   err = isolate_lru_page(page);
+   if (err) {
 put_and_set:
-   /*
-* Either remove the duplicate refcount from
-* isolate_lru_page() or drop the page ref if it was
-* not isolated.
-*/
-

[RFC PATCH 8/10] make split VM and lumpy reclaim work together

2007-11-03 Thread Rik van Riel

Make lumpy reclaim and the split VM code work together better, by
allowing both file and anonymous pages to be reclaimed together.

Will be merged into patch 6/10 soon, split out for the benefit of
people who have looked at the older code in the past.

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

Index: linux-2.6.23-mm1/mm/vmscan.c
===
--- linux-2.6.23-mm1.orig/mm/vmscan.c
+++ linux-2.6.23-mm1/mm/vmscan.c
@@ -752,10 +752,6 @@ static unsigned long isolate_lru_pages(u
 
cursor_page = pfn_to_page(pfn);
 
-   /* Don't lump pages of different types:  file vs anon */
-   if (!PageLRU(page) || (file != !!page_file_cache(cursor_page)))
-   break;
-
/* Check that we have not crossed a zone boundary. */
if (unlikely(page_zone_id(cursor_page) != zone_id))
continue;
@@ -799,16 +795,22 @@ static unsigned long isolate_pages_globa
  * clear_active_flags() is a helper for shrink_active_list(), clearing
  * any active bits from the pages in the list.
  */
-static unsigned long clear_active_flags(struct list_head *page_list)
+static unsigned long clear_active_flags(struct list_head *page_list,
+   unsigned int *count)
 {
int nr_active = 0;
+   int lru;
struct page *page;
 
-   list_for_each_entry(page, page_list, lru)
+   list_for_each_entry(page, page_list, lru) {
+   lru = page_file_cache(page);
if (PageActive(page)) {
+   lru += LRU_ACTIVE;
ClearPageActive(page);
nr_active++;
}
+   count[lru]++;
+   }
 
return nr_active;
 }
@@ -876,24 +878,25 @@ static unsigned long shrink_inactive_lis
unsigned long nr_scan;
unsigned long nr_freed;
unsigned long nr_active;
+   unsigned int count[NR_LRU_LISTS] = { 0, };
+   int mode = (sc->order > PAGE_ALLOC_COSTLY_ORDER) ?
+   ISOLATE_BOTH : ISOLATE_INACTIVE;
 
nr_taken = sc->isolate_pages(sc->swap_cluster_max,
-&page_list, &nr_scan, sc->order,
-(sc->order > PAGE_ALLOC_COSTLY_ORDER)?
-ISOLATE_BOTH : ISOLATE_INACTIVE,
+&page_list, &nr_scan, sc->order, mode,
zone, sc->mem_cgroup, 0, file);
-   nr_active = clear_active_flags(&page_list);
+   nr_active = clear_active_flags(&page_list, count);
__count_vm_events(PGDEACTIVATE, nr_active);
 
-   if (file) {
-   __mod_zone_page_state(zone, NR_ACTIVE_FILE, -nr_active);
-   __mod_zone_page_state(zone, NR_INACTIVE_FILE,
-   -(nr_taken - nr_active));
-   } else {
-   __mod_zone_page_state(zone, NR_ACTIVE_ANON, -nr_active);
-   __mod_zone_page_state(zone, NR_INACTIVE_ANON,
-   -(nr_taken - nr_active));
-   }
+   __mod_zone_page_state(zone, NR_ACTIVE_FILE,
+   -count[LRU_ACTIVE_FILE]);
+   __mod_zone_page_state(zone, NR_INACTIVE_FILE,
+   -count[LRU_INACTIVE_FILE]);
+   __mod_zone_page_state(zone, NR_ACTIVE_ANON,
+   -count[LRU_ACTIVE_ANON]);
+   __mod_zone_page_state(zone, NR_INACTIVE_ANON,
+   -count[LRU_INACTIVE_ANON]);
+
zone->pages_scanned += nr_scan;
spin_unlock_irq(&zone->lru_lock);
 
@@ -914,7 +917,7 @@ static unsigned long shrink_inactive_lis
 * The attempt at page out may have made some
 * of the pages active, mark them inactive again.
 */
-   nr_active = clear_active_flags(&page_list);
+   nr_active = clear_active_flags(&page_list, count);
count_vm_events(PGDEACTIVATE, nr_active);
 
nr_freed += shrink_page_list(&page_list, sc,
@@ -943,11 +946,11 @@ static unsigned long shrink_inactive_lis
VM_BUG_ON(PageLRU(page));
SetPageLRU(page);
list_del(&page->lru);
-   if (file) {
+   if (page_file_cache(page)) {
l += LRU_FILE;
-   zone->recent_rotated_file += sc->activated;
+   zone->recent_rotated_file++;
} else {
-

[RFC PATCH 5/10] use an indexed array for LRU lists and variables

2007-11-03 Thread Rik van Riel

Use an indexed array for LRU variables.  This makes the rest
of the split VM code a lot cleaner.

V1 -> V2 [lts]:
+ Remove extraneous  __dec_zone_state(zone, NR_ACTIVE) pointed
  out by Mel G.

>From [EMAIL PROTECTED] Wed Aug 29 11:39:51 2007

Currently we are defining explicit variables for the inactive
and active list. An indexed array can be more generic and avoid
repeating similar code in several places in the reclaim code.

We are saving a few bytes in terms of code size:

Before:

   text    data     bss     dec     hex filename
4097753  573120 4092484 8763357  85b7dd vmlinux

After:

   text    data     bss     dec     hex filename
4097729  573120 4092484 8763333  85b7c5 vmlinux

Having an easy way to add new lru lists may ease future work on
the reclaim code.
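
As a rough illustration of why the array form is less repetitive, here
is a small userspace sketch.  The enum and the for_each_lru() macro are
the ones this patch adds to mmzone.h; struct toy_zone and the toy_*
helpers are hypothetical and only keep counters, where the kernel keeps
list heads plus per-zone statistics.

```c
#include <assert.h>

/* Same shape as the enum this patch adds to mmzone.h. */
enum lru_list {
	LRU_INACTIVE,
	LRU_ACTIVE,
	NR_LRU_LISTS
};

#define for_each_lru(l) for (l = 0; l < NR_LRU_LISTS; l++)

/* Hypothetical zone: one counter per LRU list instead of named fields. */
struct toy_zone {
	unsigned long nr_pages[NR_LRU_LISTS];
	unsigned long nr_scan[NR_LRU_LISTS];
};

/* One helper serves every list; the list is just an index. */
static void toy_add_page(struct toy_zone *z, enum lru_list l)
{
	z->nr_pages[l]++;
}

static void toy_del_page(struct toy_zone *z, enum lru_list l)
{
	z->nr_pages[l]--;
}

/* Code that used to name each list explicitly becomes a loop. */
static unsigned long toy_total_pages(const struct toy_zone *z)
{
	unsigned long total = 0;
	int l;

	for_each_lru(l)
		total += z->nr_pages[l];
	return total;
}
```

Adding another list then means adding one enum entry before
NR_LRU_LISTS; the helpers and loops pick it up without further edits.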

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
Signed-off-by: Lee Schermerhorn <[EMAIL PROTECTED]>
Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

 include/linux/mm_inline.h |   34 ---
 include/linux/mmzone.h    |   17 +++--
 mm/page_alloc.c           |    9 +--
 mm/swap.c                 |    2
 mm/vmscan.c               |  132 ++
 mm/vmstat.c               |    3 -
 6 files changed, 107 insertions(+), 90 deletions(-)

Index: linux-2.6.23-rc8-mm2-vm/include/linux/mmzone.h
===
--- linux-2.6.23-rc8-mm2-vm.orig/include/linux/mmzone.h
+++ linux-2.6.23-rc8-mm2-vm/include/linux/mmzone.h
@@ -82,8 +82,8 @@ struct zone_padding {
 enum zone_stat_item {
/* First 128 byte cacheline (assuming 64 bit words) */
NR_FREE_PAGES,
-   NR_INACTIVE,
-   NR_ACTIVE,
+   NR_INACTIVE,/* must match order of LRU_[IN]ACTIVE */
+   NR_ACTIVE,  /*  " " "   "   " */
NR_ANON_PAGES,  /* Mapped anonymous pages */
NR_FILE_MAPPED, /* pagecache pages mapped into pagetables.
   only modified from process context */
@@ -107,6 +107,13 @@ enum zone_stat_item {
 #endif
NR_VM_ZONE_STAT_ITEMS };
 
+enum lru_list {
+   LRU_INACTIVE,   /* must match order of NR_[IN]ACTIVE */
+   LRU_ACTIVE, /*  " " "   "   "*/
+   NR_LRU_LISTS };
+
+#define for_each_lru(l) for (l = 0; l < NR_LRU_LISTS; l++)
+
 struct per_cpu_pages {
int count;  /* number of pages in the list */
int high;   /* high watermark, emptying needed */
@@ -260,10 +267,8 @@ struct zone {
 
/* Fields commonly accessed by the page reclaim scanner */
spinlock_t  lru_lock;   
-   struct list_headactive_list;
-   struct list_headinactive_list;
-   unsigned long   nr_scan_active;
-   unsigned long   nr_scan_inactive;
+   struct list_headlist[NR_LRU_LISTS];
+   unsigned long   nr_scan[NR_LRU_LISTS];
unsigned long   pages_scanned; /* since last reclaim */
unsigned long   flags; /* zone flags, see below */
 
Index: linux-2.6.23-rc8-mm2-vm/include/linux/mm_inline.h
===
--- linux-2.6.23-rc8-mm2-vm.orig/include/linux/mm_inline.h
+++ linux-2.6.23-rc8-mm2-vm/include/linux/mm_inline.h
@@ -30,43 +30,55 @@ static inline int page_swap_cache(struct
 }
 
 static inline void
+add_page_to_lru_list(struct zone *zone, struct page *page, enum lru_list l)
+{
+   list_add(&page->lru, &zone->list[l]);
+   __inc_zone_state(zone, NR_INACTIVE + l);
+}
+
+static inline void
+del_page_from_lru_list(struct zone *zone, struct page *page, enum lru_list l)
+{
+   list_del(&page->lru);
+   __dec_zone_state(zone, NR_INACTIVE + l);
+}
+
+
+static inline void
 add_page_to_active_list(struct zone *zone, struct page *page)
 {
-   list_add(&page->lru, &zone->active_list);
-   __inc_zone_state(zone, NR_ACTIVE);
+   add_page_to_lru_list(zone, page, LRU_ACTIVE);
 }
 
 static inline void
 add_page_to_inactive_list(struct zone *zone, struct page *page)
 {
-   list_add(&page->lru, &zone->inactive_list);
-   __inc_zone_state(zone, NR_INACTIVE);
+   add_page_to_lru_list(zone, page, LRU_INACTIVE);
 }
 
 static inline void
 del_page_from_active_list(struct zone *zone, struct page *page)
 {
-   list_del(&page->lru);
-   __dec_zone_state(zone, NR_ACTIVE);
+   del_page_from_lru_list(zone, page, LRU_ACTIVE);
 }
 
 static inline void
 del_page_from_inactive_list(struct zone *zone, struct page *page)
 {
-   list_del(&page->lru);
-   __dec_zone_state(zone, NR_INACTIVE);
+   del_page_from_lru_list(zone, page, LRU_INACTIVE);
 }
 
 static inline void
 del_page_from_lru(struct zone *zone, struct page *page)
 {
+   enum lru_list l = LRU_INACTIVE;
+
list_del(&page->lru);
if (PageActive(page)) {
__ClearPageActive(page);
-   __dec_zone_state(zone, NR_ACTIVE);
-   } else {
-   __dec_zone_state(zone, NR_INACTIVE);
+

[RFC PATCH 7/10] clean up the LRU array arithmetic

2007-11-03 Thread Rik van Riel

Make the LRU arithmetic more explicit.  Hopefully this will make
the code a little easier to read and less prone to future errors.
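
A small userspace sketch of the arithmetic, for illustration only.  The
numeric values below are an assumption: they are chosen to match
page_file_cache() returning 2 in patch 4/10 and LRU_ACTIVE being added
as a single step, but the enum itself is cut off in this posting.
lru_index() is a hypothetical helper, not something the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, see the note above. */
#define LRU_BASE   0
#define LRU_ACTIVE 1
#define LRU_FILE   2

enum lru_list {
	LRU_INACTIVE_ANON = LRU_BASE,
	LRU_ACTIVE_ANON   = LRU_BASE + LRU_ACTIVE,
	LRU_INACTIVE_FILE = LRU_BASE + LRU_FILE,
	LRU_ACTIVE_FILE   = LRU_BASE + LRU_FILE + LRU_ACTIVE,
	NR_LRU_LISTS
};

/*
 * Hypothetical helper: build the list index from the two properties,
 * the same way the patched code does with "l += LRU_ACTIVE" and
 * "l += LRU_FILE".
 */
static int lru_index(bool active, bool file)
{
	int l = LRU_BASE;

	if (active)
		l += LRU_ACTIVE;
	if (file)
		l += LRU_FILE;
	return l;
}
```

With offsets named this way, each addition says which property it
encodes, instead of relying on differences such as
LRU_INACTIVE_FILE - LRU_INACTIVE_ANON.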

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

Index: linux-2.6.23-mm1/include/linux/mm_inline.h
===
--- linux-2.6.23-mm1.orig/include/linux/mm_inline.h
+++ linux-2.6.23-mm1/include/linux/mm_inline.h
@@ -28,7 +28,7 @@ static inline int page_file_cache(struct
return 0;
 
/* The page is page cache backed by a normal filesystem. */
-   return (LRU_INACTIVE_FILE - LRU_INACTIVE_ANON);
+   return LRU_FILE;
 }
 
 static inline void
Index: linux-2.6.23-mm1/mm/swap.c
===
--- linux-2.6.23-mm1.orig/mm/swap.c
+++ linux-2.6.23-mm1/mm/swap.c
@@ -180,12 +180,12 @@ void fastcall activate_page(struct page 
 
spin_lock_irq(&zone->lru_lock);
if (PageLRU(page) && !PageActive(page)) {
-   int l = LRU_INACTIVE_ANON;
+   int l = LRU_BASE;
l += page_file_cache(page);
del_page_from_lru_list(zone, page, l);
 
SetPageActive(page);
-   l += LRU_ACTIVE_ANON - LRU_INACTIVE_ANON;
+   l += LRU_ACTIVE;
add_page_to_lru_list(zone, page, l);
__count_vm_event(PGACTIVATE);
mem_cgroup_move_lists(page_get_page_cgroup(page), true);
Index: linux-2.6.23-mm1/mm/vmscan.c
===
--- linux-2.6.23-mm1.orig/mm/vmscan.c
+++ linux-2.6.23-mm1/mm/vmscan.c
@@ -786,11 +786,11 @@ static unsigned long isolate_pages_globa
struct mem_cgroup *mem_cont,
int active, int file)
 {
-   int l = LRU_INACTIVE_ANON;
+   int l = LRU_BASE;
if (active)
-   l += LRU_ACTIVE_ANON - LRU_INACTIVE_ANON;
+   l += LRU_ACTIVE;
if (file)
-   l += LRU_INACTIVE_FILE - LRU_INACTIVE_ANON;
+   l += LRU_FILE;
return isolate_lru_pages(nr, &z->list[l], dst, scanned, order,
mode, !!file);
 }
@@ -842,7 +842,7 @@ int isolate_lru_page(struct page *page)
 
spin_lock_irq(&zone->lru_lock);
if (PageLRU(page) && get_page_unless_zero(page)) {
-   int l = LRU_INACTIVE_ANON;
+   int l = LRU_BASE;
ret = 0;
ClearPageLRU(page);
 
@@ -938,19 +938,19 @@ static unsigned long shrink_inactive_lis
 * Put back any unfreeable pages.
 */
while (!list_empty(&page_list)) {
-   int l = LRU_INACTIVE_ANON;
+   int l = LRU_BASE;
page = lru_to_page(&page_list);
VM_BUG_ON(PageLRU(page));
SetPageLRU(page);
list_del(&page->lru);
if (file) {
-   l += LRU_INACTIVE_FILE - LRU_INACTIVE_ANON;
+   l += LRU_FILE;
zone->recent_rotated_file += sc->activated;
} else {
zone->recent_rotated_anon += sc->activated;
}
if (PageActive(page))
-   l += LRU_ACTIVE_ANON - LRU_INACTIVE_ANON;
+   l += LRU_ACTIVE;
add_page_to_lru_list(zone, page, l);
if (!pagevec_add(&pvec, page)) {
spin_unlock_irq(&zone->lru_lock);
@@ -1051,7 +1051,7 @@ static void shrink_active_list(unsigned 
 */
pagevec_init(&pvec, 1);
pgmoved = 0;
-   l = LRU_INACTIVE_ANON + file * (LRU_INACTIVE_FILE - LRU_INACTIVE_ANON);
+   l = LRU_BASE + file * LRU_FILE;
spin_lock_irq(&zone->lru_lock);
while (!list_empty(&list[LRU_INACTIVE_ANON])) {
page = lru_to_page(&list[LRU_INACTIVE_ANON]);
@@ -1083,7 +1083,7 @@ static void shrink_active_list(unsigned 
if (buffer_heads_over_limit)
pagevec_strip(&pvec);
pgmoved = 0;
-   l = LRU_ACTIVE_ANON + file * (LRU_ACTIVE_FILE - LRU_ACTIVE_ANON);
+   l = LRU_ACTIVE + file * LRU_FILE;
spin_lock_irq(&zone->lru_lock);
while (!list_empty(&list[LRU_ACTIVE_ANON])) {
page = lru_to_page(&list[LRU_ACTIVE_ANON]);
Index: linux-2.6.23-mm1/include/linux/mmzone.h
===
--- linux-2.6.23-mm1.orig/include/linux/mmzone.h
+++ linux-2.6.23-mm1/include/linux/mmzone.h
@@ -107,11 +107,22 @@ enum zone_stat_item {
 #endif
NR_VM_ZONE_STAT_ITEMS };
 
+/*
+ * We do arithmetic on the LRU lists in various places in the code,
+ * so it is important to keep the active lists LRU_ACTIVE higher in
+

[RFC PATCH 4/10] debug page_file_cache

2007-11-03 Thread Rik van Riel

Debug whether we end up classifying the wrong pages as
filesystem backed.  This has not triggered in stress
tests on my system, but who knows...

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

Index: linux-2.6.23-mm1/include/linux/mm_inline.h
===
--- linux-2.6.23-mm1.orig/include/linux/mm_inline.h
+++ linux-2.6.23-mm1/include/linux/mm_inline.h
@@ -1,6 +1,8 @@
 #ifndef LINUX_MM_INLINE_H
 #define LINUX_MM_INLINE_H
 
+#include <linux/fs.h>  /* for struct address_space */
+
 /**
  * page_file_cache(@page)
  * Returns !0 if @page is page cache page backed by a regular file,
@@ -9,11 +11,19 @@
  * We would like to get this info without a page flag, but the state
  * needs to propagate to whereever the page is last deleted from the LRU.
  */
+extern const struct address_space_operations shmem_aops;
 static inline int page_file_cache(struct page *page)
 {
+   struct address_space * mapping = page_mapping(page);
+
if (PageSwapBacked(page))
return 0;
 
+   /* These pages should all be marked PG_swapbacked */
+   WARN_ON(PageAnon(page));
+   WARN_ON(PageSwapCache(page));
+   WARN_ON(mapping && mapping->a_ops && mapping->a_ops == &shmem_aops);
+
/* The page is page cache backed by a normal filesystem. */
return 2;
 }
Index: linux-2.6.23-mm1/mm/shmem.c
===
--- linux-2.6.23-mm1.orig/mm/shmem.c
+++ linux-2.6.23-mm1/mm/shmem.c
@@ -180,7 +180,7 @@ static inline void shmem_unacct_blocks(u
 }
 
 static const struct super_operations shmem_ops;
-static const struct address_space_operations shmem_aops;
+const struct address_space_operations shmem_aops;
 static const struct file_operations shmem_file_operations;
 static const struct inode_operations shmem_inode_operations;
 static const struct inode_operations shmem_dir_inode_operations;
@@ -2344,7 +2344,7 @@ static void destroy_inodecache(void)
kmem_cache_destroy(shmem_inode_cachep);
 }
 
-static const struct address_space_operations shmem_aops = {
+const struct address_space_operations shmem_aops = {
.writepage  = shmem_writepage,
.set_page_dirty = __set_page_dirty_no_writeback,
 #ifdef CONFIG_TMPFS

[RFC PATCH 10/10] add swapped in pages to the inactive list

2007-11-03 Thread Rik van Riel

Swapin_readahead can read in a lot of data that the processes in
memory never need.  Adding swap cache pages to the inactive list
prevents them from putting too much pressure on the working set.

This has the potential to help the programs that are already in
memory, but it could also be a disadvantage to processes that
are trying to get swapped in.

In short, this patch needs testing.

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

Index: linux-2.6.23-mm1/mm/swap_state.c
===
--- linux-2.6.23-mm1.orig/mm/swap_state.c
+++ linux-2.6.23-mm1/mm/swap_state.c
@@ -370,7 +370,7 @@ struct page *read_swap_cache_async(swp_e
/*
 * Initiate read into locked page and return.
 */
-   lru_cache_add_active_anon(new_page);
+   lru_cache_add_anon(new_page);
swap_readpage(NULL, new_page);
return new_page;
}

[RFC PATCH 0/10] split anon and file LRUs

2007-11-03 Thread Rik van Riel

The current page replacement scheme in Linux has a number of problems,
which can be boiled down to:
- Sometimes the kernel evicts the wrong pages, which can result in
  bad performance.
- The kernel scans over pages that should not be evicted.  On systems
  with a few GB of RAM, this can result in the VM using an annoying
  amount of CPU.  On systems with >128GB of RAM, this can knock the
  system out for hours since excess CPU use is compounded with lock
  contention and other issues.

This patch series tries to address the issues by splitting the LRU
lists into two sets, one for swap/ram backed pages ("anon") and
one for filesystem backed pages ("file").

The current version only has the infrastructure.  Large changes to
the page replacement policy will follow later.
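
As a rough illustration of the split described above, here is a
user-space sketch of a zone that keeps separate anon and file list
sets.  All names (toy_page, toy_zone, toy_lru_add, ...) are made up
for this example and are not taken from the patches; the real series
works on struct zone and struct page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; names are not from the patch series. */
enum lru_set { LRU_ANON, LRU_FILE, NR_LRU_SETS };
enum lru_state { LRU_INACTIVE, LRU_ACTIVE, NR_LRU_STATES };

struct toy_page {
	bool swap_backed;	/* anonymous, tmpfs or otherwise swap backed */
	bool active;		/* belongs on the active list of its set */
	struct toy_page *next;
};

struct toy_zone {
	/* One inactive and one active list per set instead of one shared pair. */
	struct toy_page *lists[NR_LRU_SETS][NR_LRU_STATES];
	unsigned long nr[NR_LRU_SETS][NR_LRU_STATES];
};

/* Pick the list set from what backs the page. */
static enum lru_set page_lru_set(const struct toy_page *page)
{
	return page->swap_backed ? LRU_ANON : LRU_FILE;
}

/* Push the page on the head of the matching list and count it. */
static void toy_lru_add(struct toy_zone *zone, struct toy_page *page)
{
	enum lru_set set = page_lru_set(page);
	enum lru_state state = page->active ? LRU_ACTIVE : LRU_INACTIVE;

	page->next = zone->lists[set][state];
	zone->lists[set][state] = page;
	zone->nr[set][state]++;
}
```

With this layout a scan of the file lists never has to walk over
swap backed pages, and the other way around.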

More details can be found on this page:

http://linux-mm.org/PageReplacementDesign

TODO:
- have any mlocked and ramfs pages live off of the LRU list,
  so we do not need to scan these pages
- switch to SEQ replacement for the anon LRU lists, so the
  worst case number of pages to scan is reduced greatly.
- figure out if the file LRU lists need page replacement
  changes to help with worst case scenarios
- implement and benchmark a scalable non-resident page
  tracking implementation in the radix tree, this may make
  the anon/file balancing algorithm more stable and could
  allow for further simplifications in the balancing algorithm

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

[RFC PATCH 3/10] define page_file_cache

2007-11-03 Thread Rik van Riel

Define page_file_cache() function to answer the question:
is page backed by a file?

Originally part of Rik van Riel's split-lru patch.  Extracted
to make available for other, independent reclaim patches.

Moved inline function to linux/mm_inline.h where it will
be needed by subsequent "split LRU" and "noreclaim" patches.  

Unfortunately this needs to use a page flag, since the
PG_swapbacked state needs to be preserved all the way
to the point where the page is last removed from the
LRU.  Trying to derive the status from other info in
the page resulted in wrong VM statistics in earlier
split VM patchsets.


Signed-off-by:  Rik van Riel <[EMAIL PROTECTED]>
Signed-off-by:  Lee Schermerhorn <[EMAIL PROTECTED]>


Index: linux-2.6.23-mm1/include/linux/mm_inline.h
===
--- linux-2.6.23-mm1.orig/include/linux/mm_inline.h
+++ linux-2.6.23-mm1/include/linux/mm_inline.h
@@ -1,3 +1,23 @@
+#ifndef LINUX_MM_INLINE_H
+#define LINUX_MM_INLINE_H
+
+/**
+ * page_file_cache(@page)
+ * Returns !0 if @page is a page cache page backed by a regular file,
+ * or 0 if @page is anonymous, tmpfs or otherwise swap backed.
+ *
+ * We would like to get this info without a page flag, but the state
+ * needs to propagate to wherever the page is last deleted from the LRU.
+ */
+static inline int page_file_cache(struct page *page)
+{
+   if (PageSwapBacked(page))
+   return 0;
+
+   /* The page is page cache backed by a normal filesystem. */
+   return 2;
+}
+
 static inline void
 add_page_to_active_list(struct zone *zone, struct page *page)
 {
@@ -38,3 +58,4 @@ del_page_from_lru(struct zone *zone, str
}
 }
 
+#endif
Index: linux-2.6.23-mm1/mm/shmem.c
===
--- linux-2.6.23-mm1.orig/mm/shmem.c
+++ linux-2.6.23-mm1/mm/shmem.c
@@ -1267,6 +1267,7 @@ repeat:
goto failed;
}
 
+   SetPageSwapBacked(filepage);
spin_lock(&info->lock);
entry = shmem_swp_alloc(info, idx, sgp);
if (IS_ERR(entry))
Index: linux-2.6.23-mm1/include/linux/page-flags.h
===
--- linux-2.6.23-mm1.orig/include/linux/page-flags.h
+++ linux-2.6.23-mm1/include/linux/page-flags.h
@@ -89,6 +89,7 @@
 #define PG_mappedtodisk    16  /* Has blocks allocated on-disk */
 #define PG_reclaim 17  /* To be reclaimed asap */
 #define PG_buddy   19  /* Page is free, on buddy lists */
+#define PG_swapbacked  20  /* Page is backed by RAM/swap */
 
 /* PG_readahead is only used for file reads; PG_reclaim is only for writes */
 #define PG_readahead   PG_reclaim /* Reminder to do async read-ahead */
@@ -216,6 +217,10 @@ static inline void SetPageUptodate(struc
 #define ClearPageReclaim(page) clear_bit(PG_reclaim, &(page)->flags)
 #define TestClearPageReclaim(page) test_and_clear_bit(PG_reclaim, &(page)->flags)
 
+#define PageSwapBacked(page)   test_bit(PG_swapbacked, &(page)->flags)
+#define SetPageSwapBacked(page)    set_bit(PG_swapbacked, &(page)->flags)
+#define __ClearPageSwapBacked(page)    __clear_bit(PG_swapbacked, &(page)->flags)
+
 #define PageCompound(page) test_bit(PG_compound, &(page)->flags)
 #define __SetPageCompound(page)    __set_bit(PG_compound, &(page)->flags)
 #define __ClearPageCompound(page) __clear_bit(PG_compound, &(page)->flags)
Index: linux-2.6.23-mm1/mm/memory.c
===
--- linux-2.6.23-mm1.orig/mm/memory.c
+++ linux-2.6.23-mm1/mm/memory.c
@@ -1669,6 +1669,7 @@ gotten:
ptep_clear_flush(vma, address, page_table);
set_pte_at(mm, address, page_table, entry);
update_mmu_cache(vma, address, entry);
+   SetPageSwapBacked(new_page);
lru_cache_add_active(new_page);
page_add_new_anon_rmap(new_page, vma, address);
 
@@ -2198,6 +2199,7 @@ static int do_anonymous_page(struct mm_s
if (!pte_none(*page_table))
goto release;
inc_mm_counter(mm, anon_rss);
+   SetPageSwapBacked(page);
lru_cache_add_active(page);
page_add_new_anon_rmap(page, vma, address);
set_pte_at(mm, address, page_table, entry);
@@ -2351,6 +2353,7 @@ static int __do_fault(struct mm_struct *
set_pte_at(mm, address, page_table, entry);
if (anon) {
 inc_mm_counter(mm, anon_rss);
+   SetPageSwapBacked(page);
 lru_cache_add_active(page);
 page_add_new_anon_rmap(page, vma, address);
} else {
Index: linux-2.6.23-mm1/mm/swap_state.c
===
---

Re: x86_64 ten times slower than i386

2007-11-03 Thread Matt Mackall

On Sat, Nov 03, 2007 at 11:38:24PM +0100, Bo Brantén wrote:
> On Sat, 3 Nov 2007, Matt Mackall wrote:
> 
> >This is typically due to a problem with the setup of your MTRRs. Try
> >booting with mem=nnnM where nnn is some number smaller than your
> >actual amount of memory.
> 
> Thank you for that advice, the system has 4GB and if I boot with mem=3072M 
> it will run as fast as normal while if I don't use the mem option it will 
> run 10 times slower, however if I use a figure like mem=3500M the kernel 
> will panic, is there any way to determine the highest usable figure 
> without trial and error?

This is not really my area, but I suspect if you send us your dmesg
output, someone here will be able to tell you how to optimize things.
How much memory does the system report at normal boot? It's not
uncommon for BIOSes to do the wrong thing with memory approaching 4G,
even on supposedly 64-bit boxes.

Also, please send us your panic message (take a digital photo if you
need to), as that shouldn't happen either.

-- 
Mathematics is the supreme nostalgia of our time.

Re: [patch] PID namespace design bug, workaround

2007-11-03 Thread Linus Torvalds

On Sat, 3 Nov 2007, Ingo Molnar wrote:
> 
> - one problem is that this condition is 'invisible'. If two namespaces 
>   happen to access the same robust futex (say a yum update from two 
>   PID namespaces sharing the same read-mostly filesystem) there's silent
>   breakage and data corruption due to PID overlap.

.. and this is in *no* way different from thousands of applications that 
write their pid to lock-files, and others decide that it's "stale" because 
using "kill(pid, 0)" returns that the pid doesn't exist any more.

The solution? You can't do that kind of locking over NFS, or across pid 
namespaces. Nobody blames NFS or pid namespaces for it. 
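
For reference, the lock-file staleness check mentioned above usually
looks something like the sketch below.  The helper names
(pid_looks_alive, lock_is_stale) are invented for this example.  The
answer is only meaningful when the pid in the lock file was written
by a process in the caller's own pid namespace on the same machine.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Returns true if a process with this pid exists as seen from the
 * caller.  kill() with signal 0 only performs the existence and
 * permission checks; no signal is delivered.
 */
static bool pid_looks_alive(pid_t pid)
{
	/* pid <= 0 addresses process groups, never a single lock owner. */
	if (pid <= 0)
		return false;
	if (kill(pid, 0) == 0)
		return true;
	/* EPERM: the process exists, we just may not signal it. */
	return errno == EPERM;
}

/* Hypothetical helper: is a lock file that records owner_pid stale? */
static bool lock_is_stale(pid_t owner_pid)
{
	return !pid_looks_alive(owner_pid);
}
```

A pid written by a process in another namespace (or on another NFS
client) can name an unrelated process here, or none at all, so the
check gives a wrong answer in exactly the way described above.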

> - so via this we isolate an important category of syscalls from
>   cross-namespace use perhaps forever.

So? That's inherent to how those stupid robust mutexes work.

I don't understand how you can call this a "PID namespace design bug", 
when it clearly has nothing what-so-ever to do with pid namespaces, and 
everything to do with the *futexes* that blithely assume that pid's are 
unique and that made it part of the user-visible interface.

OF COURSE any pid namespace design will always break such assumptions, but 
that's not because of any PID namespace bugs. It's the whole *point* 
of PID namespaces. If you use pid's (instead of some opaque cookies), 
you will not be able to use such things across pid-separation.

Linus

Re: x86_64 ten times slower than i386

2007-11-03 Thread Bo Brantén


On Sat, 3 Nov 2007, Matt Mackall wrote:


>This is typically due to a problem with the setup of your MTRRs. Try
>booting with mem=nnnM where nnn is some number smaller than your
>actual amount of memory.


Thank you for that advice, the system has 4GB and if I boot with mem=3072M 
it will run as fast as normal while if I don't use the mem option it will 
run 10 times slower, however if I use a figure like mem=3500M the kernel 
will panic, is there any way to determine the highest usable figure 
without trial and error?


Re: [patch 0/4] ticket spinlocks for x86

2007-11-03 Thread Jeremy Fitzhardinge

Nick Piggin wrote:
> Just for fun I also had a shot at merging the headers, as they become a
> lot more similar after this with the removal of the paravirt crud.

Glommer posted a set of patches the other day to implement x86-64
paravirt, which unifies lots of things including spinlocks.  But if
you've removed the need to diddle with sti/cli, then that works too...

J

Re: [PATCH] Smackv10: Smack rules grammar + their stateful parser

2007-11-03 Thread Ahmed S. Darwish

On 11/3/07, Kyle Moffett <[EMAIL PROTECTED]> wrote:
> On Nov 03, 2007, at 12:43:06, Ahmed S. Darwish wrote:
> > Bashv3 builtin "echo" behaves very strangely to -EINVAL. It sends
> > all the buffers that causes -EINVAL again in subsequent echo
> > invocations.
> >
> > i.e.
> > echo "Invalid Rule" > /smack/load  # -EINVAL returned
> > echo "Valid Rule" > /smack/load
> >
> > In the second iteration, echo sends the first invalid buffer again
> > then sends the new one. This causes a "Invalid Rule\nValid Rule"
> > buffer sent to write().
> >
> > IMHO, this is a bug in builtin echo. The external /bin/echo doesn't
> > cause such strange behaviour.
>
> Actually, what causes problems here is something between a bug and a
> feature in libc's buffering.  Basically the -EINVAL error causes libc
> to leave its data in the file-output buffer despite the file being
> closed and reopened. Since a standalone echo just exits that buffer
> is discarded, but for the bash builtin it hangs around in the buffer
> for a while and ends up getting prepended to the following echo
> statement.  There's actually multiple ways to make this fail; this is
> just the simplest.
>

Thanks a lot for such useful info. Is there a way on my side to keep
subsequent echo invocations from being affected by previous failed
ones?

Regards,

-- 
Ahmed S. Darwish
Homepage: http://darwish.07.googlepages.com
Blog: http://darwish-07.blogspot.com

[PATCH] tmpfs: fix mounts when size is less than the page size

2007-11-03 Thread Michael Marineau

When tmpfs is mounted with a size less than one page the number of blocks
is set to 0 which makes the tmpfs mount unlimited. This can lead to a quick
and surprising death if someone typos a tmpfs mount command and writes too much.

tmpfs can still be mounted as unlimited if size or nr_blocks is exactly 0.
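
The intended rounding can be modelled in plain C as below.  This is
only a sketch: size_to_blocks and TOY_PAGE_SHIFT are names made up
here, and the shift of 12 assumes 4 KiB pages rather than using the
kernel's PAGE_CACHE_SHIFT.

```c
#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Convert a size= mount option in bytes to a block (page) limit. */
static unsigned long size_to_blocks(unsigned long long size)
{
	unsigned long blocks = (unsigned long)(size >> TOY_PAGE_SHIFT);

	/*
	 * A non-zero size below one page must not collapse to 0,
	 * because a limit of 0 blocks means "unlimited".
	 */
	if (size && blocks == 0)
		blocks = 1;
	return blocks;
}
```

So size=100 gives a limit of 1 block, size=8192 gives 2, and only
size=0 still yields 0 (unlimited).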

Signed-off-by: Michael Marineau <[EMAIL PROTECTED]>
---
 mm/shmem.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 253d205..66b07f2 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2138,6 +2138,8 @@ static int shmem_parse_options(char *options, int *mode, uid_t *uid,
if (*rest)
goto bad_val;
*blocks = size >> PAGE_CACHE_SHIFT;
+   if (size && *blocks == 0)
+   *blocks = 1;
} else if (!strcmp(this_char,"nr_blocks")) {
*blocks = memparse(value,&rest);
if (*rest)
-- 
1.5.1.6


Re: [PATCH] file capabilities: allow sigcont within session (v2)

2007-11-03 Thread Andrew Morgan

Serge E. Hallyn wrote:
> Quoting Stephen Smalley ([EMAIL PROTECTED]):
>> On Wed, 2007-10-31 at 18:49 -0500, Serge E. Hallyn wrote:
[..]
>>> Also don't do file-capabilities signaling checks when uids for
>>> the processes don't match, since the standard check_kill_permission
>>> will have done those checks.
>> Description doesn't match the code.
> 
> Egads.  I knew I should've just kept that part out of it for the first
> patch...
> 
> New patch on top of previous one is appended.

Dang! I stared at the code a long time to see what you were doing...

And concluded that you had coded what you intended; allow processes that
share UIDs to kill one another - independent of capabilities. The fact
that this is the reverse of the words you used to introduce your patch,
I didn't notice.

I totally missed the fact that this was (unwanted) new functionality!!
Mea culpa for the bad review.

I certainly Sign off the revised patch.

Cheers

Andrew

[PATCH] [POWERPC] Fix fs_enet module build

2007-11-03 Thread Jochen Friedrich


If fs_enet is built as a module, mii-fec/mii-bitbang should be built as
modules as well. Otherwise some symbols remain undefined.

 Building modules, stage 2.
 MODPOST 5 modules
ERROR: "fs_scc_ops" [drivers/net/fs_enet/fs_enet.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

Signed-off-by: Jochen Friedrich <[EMAIL PROTECTED]>
---

This can be pulled from git://git.bocc.de/dbox2.git for-2.6.24

drivers/net/fs_enet/Makefile |   12 ++--
1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/fs_enet/Makefile b/drivers/net/fs_enet/Makefile
index 02d4dc1..2f7563a 100644
--- a/drivers/net/fs_enet/Makefile
+++ b/drivers/net/fs_enet/Makefile
@@ -4,7 +4,15 @@

obj-$(CONFIG_FS_ENET) += fs_enet.o

-obj-$(CONFIG_8xx) += mac-fec.o mac-scc.o mii-fec.o
-obj-$(CONFIG_CPM2) += mac-fcc.o mii-bitbang.o
+fs_enet-$(CONFIG_8xx) += mac-fec.o mac-scc.o
+fs_enet-$(CONFIG_CPM2) += mac-fcc.o
+
+ifeq ($(CONFIG_8xx),y)
+obj-$(CONFIG_FS_ENET) += mii-fec.o
+endif
+
+ifeq ($(CONFIG_CPM2),y)
+obj-$(CONFIG_FS_ENET) += mii-bitbang.o
+endif

fs_enet-objs := fs_enet-main.o
--
1.5.3.4


Re: [RFC][PATCH 3/3] PM: Suspend/hibernation debug documentation update (rev. 2)

2007-11-03 Thread Rafael J. Wysocki

On Saturday, 3 November 2007 21:08, Frans Pop wrote:
> Rafael J. Wysocki wrote:
> > +core run in a test mode.  There are 5 test modes available:
> > +
> > +freezer - test the freezing of processes
> > +devices - test the freezing of processes and suspending of devices
> > +platform - test the freezing of processes, suspending of devices and platform
> > +   global control methods(*)
> > +processors - test the freezing of processes, suspending of devices, platform
> > + global control methods(*) and the disabling of nonboot CPUs
> > +core - test the freezing of processes, suspending of devices, platform global
> > +   control methods(*), the disabling of nonboot CPUs and suspending of
> > +   platform/system devices
> 
> I'd suggest to indent this as follows:
> - freezer
>   Test the freezing of processes.
> - devices
>   Test the freezing of processes and suspending of devices.
> - platform
>   Test the freezing of processes, suspending of devices and platform global
>   control methods(*).
> - processors
>   Test the freezing of processes, suspending of devices, platform global
>   control methods(*) and the disabling of nonboot CPUs.
> - core
>   Test the freezing of processes, suspending of devices, platform global
>   control methods(*), the disabling of nonboot CPUs and suspending of
>   platform/system devices.

Yup, this way it looks better.

Thanks,
Rafael

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> Neither of us has yet posted a correct patch which applies to 2.6.23
> and 2.6.22.  I'm testing your 2.6.24-rc patch overnight, and if that's
> fine then one of us will post the version for -stable.  I thought I'd
> better leave that to you, after I've reported back.

Ok.


Re: [patch] PID namespace design bug, workaround

2007-11-03 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> On Fri, 2 Nov 2007, Dave Hansen wrote:
> > 
> > There are certainly more of these, but here is one.  In the futex 
> > userspace address, we install the current pid's vnr into a userspace 
> > address.
> 
> Now, realistically, why not just say "you can't use these things 
> across namespaces"? Does anybody really care? After all, somebody who 
> screws this up only screws himself, not anybody else.

i see two main categories of problems:

- one problem is that this condition is 'invisible'. If two namespaces 
  happen to access the same robust futex (say a yum update from two 
  PID namespaces sharing the same read-mostly filesystem) there's silent
  breakage and data corruption due to PID overlap. The other
  namespaces have no such problems. I think the "dont do that" answer is
  lame because most apps _will_ work across PID namespaces because 
  things like fcntl based locking does work. And there's no valid
  technical excuse why futexes shouldnt work: it's all controlled by the
  same native kernel, there's no untrusted network separating the nodes,
  etc.

- so via this we isolate an important category of syscalls from
  cross-namespace use perhaps forever. Pick just about any other kernel
  resource and they can be shared between namespaces. But not futexes -
  which happen to be the most scalable locking primitive and people will
  almost certainly want to use them across namespaces. A
  completely new breed of futexes has to be introduced and trickled
  through userspace and all the architectures to make it work again
  across namespaces. Who will do that work? Generally the people who
  introduce a new concept are the ones who should do that. But in this
  case they are apparently not interested in making it generic enough
  (they are concentrated on their 'isolate it all' aspect) so
  nobody else will do it and we are stuck with an incomplete concept.

The answer of user-space/apps is predictable: they'll gravitate towards 
the path of least resistance, and that will be "dont use futexes". PID 
namespaces basically single out an important API category and use the 
natural pressure of the other 300 syscalls and tens of thousands of apps 
against this category. Linux is basically used against itself. The 
counter-force is relatively weak and there's no solution available _at 
all_ presently so it's not even the fight of patches against each other, 
it's the sheer lack of a feature which has an obvious end-result.

We've already got way too many incomplete concepts and APIs in the 
kernel. Maybe i'm over-worrying, but i fear we end up like with 
capabilities or sendfile - code merged too soon and never completed for 
many years - perhaps never completed at all. VMS and WNT did those 
things a bit better i think - their API frameworks were/are pervasive 
and complete, even in the corner cases.

Whether it's the right approach to force reasonable perfection of 
frameworks like this from the get go is another question - but in 
practice even for relatively popular new APIs like epoll we see a way 
too slow movement towards the 'completion of the API', and that hinders 
adoption of new APIs very much. (With splice being a notable exception - 
there the central concept was so strong that it quickly pushed itself to 
total completion - combined with a capable maintainer of the API.) But 
it's not that easy for futexes and we put another roadblock in the path 
of futexes.

Ingo

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Christoph Lameter wrote:
> On Sat, 3 Nov 2007, Hugh Dickins wrote:
> > On Sat, 3 Nov 2007, Christoph Lameter wrote:
> > > On Sat, 3 Nov 2007, Hugh Dickins wrote:
> > > 
> > > > Later Christoph noticed that I'm not handling the SlabDebug case right.
> > > > So stable should ignore my patch, and he will come up with another.
> > > 
> > > Hmmm? I thought you wanted to test the patch provided?
> > 
> > Yes.  Sounds like you see a contradiction there - I don't see it.
> 
> Maybe language issues. I read this as meaning that there is no fix 
> available yet.

Neither of us has yet posted a correct patch which applies to 2.6.23
and 2.6.22.  I'm testing your 2.6.24-rc patch overnight, and if that's
fine then one of us will post the version for -stable.  I thought I'd
better leave that to you, after I've reported back.

Hugh

Re: RFC: Reproducible oops with lockdep on count_matching_names()

2007-11-03 Thread Michael Buesch

On Saturday 03 November 2007 20:58:09 Luis R. Rodriguez wrote:
> I was using SLAB and ran into other strange oops, as the one below,
> but after switching to SLUB, after Michael Buesch's suggestion that
> one went away... The lockdep segfault is still present, however.

Who is responsible for slab btw?
I mean, someone should be interested in getting this bug fixed. :)
When using slab I see random corruptions. I think related to rmmod, but
I'm not sure. I don't see this with slub.

-- 
Greetings Michael.

Re: [PATCH 2/2] slub: fix Objects count

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Christoph Lameter wrote:
> On Sat, 3 Nov 2007, Hugh Dickins wrote:
> 
> > I was afraid you might say something like that.
> > Perhaps it'll be a patch I need to use in my own builds.
> > Though I'd have thought others would want that accuracy too.
> > Didn't SLAB give it?  (The "r*gr*ss**n" word!)
> 
> Slab also only counts objects that are not in the queues. See free_block() 
> f.e.

I'll take your word for it, and apologize for my slur on slub!
(Slub has a great deal to admire in it, I should say.)

> 
> We could improve the situation by flushing all cpu slabs before counts are 
> determined.
> 
> Which can be done manually. Run
> 
>   slabinfo -s
> 
> and then look at the numbers.

Mmm, I'd been doing slabinfo -v sometimes.  These are fine in some
situations, but it's always better when the observer can avoid
interfering with the observed.  Impossible, we know, but...

Also, many caches too quickly re-equip themselves
with cpu slabs which again obscure the numbers.

> > > Adds too much overhead to the fast paths
> > 
> > You've come to that conclusion very quickly!
> 
> I have just spent a few weeks optimizing the fast and slow paths and there 
> is some additional overhead that I am still trying to eliminate.
> 
> > Any numbers to back it up?
> 
> The performance in the fast paths depends on updating only a single word
> for an allocation. Adding another counter makes that impossible.

Gosh, that's a tighter corner than any I've been in.

> 
> See the recent post on SLUB regression on SMP.

I'll have to read up on that, thanks for the pointer.

Hugh

Re: [patch] PID namespace design bug, workaround

2007-11-03 Thread sukadev

Pavel Emelianov [EMAIL PROTECTED] wrote:
| Ulrich Drepper wrote:
| > -BEGIN PGP SIGNED MESSAGE-
| > Hash: SHA1
| > 
| > Pavel Emelyanov wrote:
| >>> Isn't it this?
| >>>
| >>> http://lkml.org/lkml/2007/11/1/141
| >> That was the initial problem, and I already answered to Ingo about
| >> it
| > 
| > No, look at my old mail which Ingo referenced in that posting.
| 
| You pointed out only one problem that is not a variation of "how do 
| we handle the case when we pass our pid outside the namespace".
| 
| This problem with signals is now being resolved at IBM by Sukadev 
| and Serge (I put them in Cc), so this is about to be fixed by the
| time 2.6.24 releases (I hope).

Yes. We (Oleg, Eric included in Cc) have a patchset to address signals
issues in child pid namespaces. It is being discussed on Containers list:

https://lists.linux-foundation.org/pipermail/containers/2007-October/008240.html

We will post the patchset to LKML soon.

| 
| As far as the "passing the pid outside the namespace" is concerned, 
| is my answer "pids should never be used outside the namespace they
| came from, otherwise userspace won't work as expected" satisfactory?
| 
| So is "everything else", you mentioned, covered with the problems
| above?
| 
| > - --
| > ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
| > -BEGIN PGP SIGNATURE-
| > Version: GnuPG v1.4.7 (GNU/Linux)
| > 
| > iD8DBQFHKy692ijCOnn/RHQRAtYLAJ98EXTGl3HMlCbVXOkL7TJRFfw4DACfcgYI
| > HHz5f7TfM05Dps+ruPRiUrU=
| > =IjS4
| > -END PGP SIGNATURE-
| > 

Re: "Fix ATAPI transfer lengths" causes CD writing regression

2007-11-03 Thread Daniel Drake


Tejun Heo wrote:

<4>ata2.00: HSM violation: eh_analyze_tf: BUSY|DRQ
<3>ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
<3>ata2.00: cmd a0/00:00:00:0a:00/00:00:00:00:00/a0 tag 0 cdb 0x5a data 10 in
<4> res 58/00:02:00:0a:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
<3>ata2.00: status: { DRDY DRQ }
<6>ata2: soft resetting link
<6>ata2.00: configured for UDMA/33
<6>ata2: EH complete


Does this patch fix the problem?


That fixes it, thanks! There is no more ugly error in dmesg, the test
prog doesn't print any sense data, and brasero works OK too. However,
these messages appear in the kernel log every time I run the test app
(or when brasero does its thing):

<4>ata2.00: 10 bytes trailing data
<4>ata2.00: 10 bytes trailing data
<4>ata2.00: 10 bytes trailing data
<4>ata2.00: 10 bytes trailing data
<4>ata2.00: 10 bytes trailing data
<4>ata2.00: 10 bytes trailing data
<4>ata2.00: 6 bytes trailing data

Daniel


Re: RFC: Reproducible oops with lockdep on count_matching_names()

2007-11-03 Thread Luis R. Rodriguez

On 11/2/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> On Thu, 2007-11-01 at 19:26 -0400, Michael Wu wrote:
> > On Thursday 01 November 2007 15:17:16 Luis R. Rodriguez wrote:
> > > [EMAIL PROTECTED]:~/devel/wireless-2.6$ git-describe
> > > v2.6.24-rc1-146-g2280253
> > >
> > > So I hit segfault with lockdep on count_matching_names() on the
> > > strcmp() multiple times now. This is reproducible and with different
> > > wireless drivers.
> > >
> > I've found the problem. It appears to be in lockdep. struct lock_class has a
> > const char *name field which points to a statically allocated string that
> > comes from the code which uses the lock. If that code/string is in a module
> > and gets unloaded, the pointer in |name| is no longer valid. Next time this
> > field is dereferenced (count_matching_names, in this case), we crash.
> >
> > The following patch fixes the issue but there's probably a better way.
>
> Thanks, and indeed. From my understanding lockdep_free_key_range()
> should destroy all classes of a module on module unload.
>
> So I'm not quite sure what has gone wrong here..

I've tried digging more and just am still not sure what caused this.
At first I thought perhaps all_lock_classes list had some element not
yet removed as lockdep_free_key_range() iterates over the hash tables
but this doesn't seem to be the case.
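
To make the dangling-name scenario Michael Wu describes above
concrete, here is a user-space sketch of one way to avoid it: the
class keeps its own copy of the name instead of a pointer into the
registering module's memory.  This is not his patch (which is not
reproduced here) and not how lockdep is implemented; toy_lock_class
and its helpers are hypothetical names for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lock class that owns its name. */
struct toy_lock_class {
	char *name;	/* private copy, stays valid after the owner is gone */
};

/* Copy the caller's string; returns 0 on success, -1 on allocation failure. */
static int toy_class_init(struct toy_lock_class *class, const char *name)
{
	size_t len = strlen(name) + 1;

	class->name = malloc(len);
	if (!class->name)
		return -1;
	memcpy(class->name, name, len);
	return 0;
}

/* Release the copy and clear the pointer so stale use is easier to spot. */
static void toy_class_destroy(struct toy_lock_class *class)
{
	free(class->name);
	class->name = NULL;
}
```

The cost is an allocation per class, which may be why tearing down
all of a module's classes on unload is the approach Peter expects
instead.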

I was using SLAB and ran into other strange oops, as the one below,
but after switching to SLUB, after Michael Buesch's suggestion that
one went away... The lockdep segfault is still present, however.

Just not sure what's going on. Any ideas?

- oops with slab, not reproducible with slub:

[EMAIL PROTECTED]:~$ sudo rmmod tg3
[EMAIL PROTECTED]:~$ sudo rmmod sr_mod

*** dmesg -c

ACPI: PCI interrupt for device :02:00.0 disabled
BUG: unable to handle kernel paging request at virtual address f88a4a05
printing eip: f88a4a05 *pde = 0267 *pte = 
Oops:  [#1]
Modules linked in: sr_mod uinput thinkpad_acpi hwmon backlight nvram
ipv6 acpi_cpufreq cpufreq_userspace cpufreq_powersave cpufreq_ondemand
cpufreq_conservative dock arc4 ecb blkcipher cryptomgr crypto_algapi
rc80211_simple ath5k mac80211 cfg80211 pcmcia crc32 snd_hda_intel
snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_hwdep snd_seq_oss
ipw2200 snd_seq_midi_event ieee80211 ieee80211_crypt sg ehci_hcd
uhci_hcd yenta_socket rsrc_nonstatic snd_seq snd_timer snd_seq_device
firmware_class cdrom pcmcia_core usbcore evdev rng_core rtc snd
soundcore

Pid: 2908, comm: modprobe Not tainted (2.6.24-rc1 #18)
EIP: 0060:[] EFLAGS: 00010086 CPU: 0
EIP is at 0xf88a4a05
EAX: c20b75c8 EBX: c2f86f38 ECX: f88a4a05 EDX: c2f86f38
ESI: c20b75c8 EDI: c2f89c00 EBP: c3897bfc ESP: c3897be0
 DS: 007b ES: 007b FS:  GS: 0033 SS: 0068
Process modprobe (pid: 2908, ti=c3896000 task=c3935150 task.ti=c3896000)
Stack: c01b2afc c2f82d98 c3897bf4 c01ba8b6 c2f86f38 c20b75c8 c2f82c00 c3897c24
   c02186dd c2f86f38 c3897c24 c01b54c0 c20b75c8 0001 c20b75c8 c2f86f38
   c20b75c8 c3897c30 c01b54ed 0001 c3897c54 c01b556c 0001 c3897cd4
Call Trace:
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_stack_log_lvl+0x9d/0xa5
 [] show_registers+0xad/0x17c
 [] die+0xf5/0x1c6
 [] do_page_fault+0x450/0x537
 [] error_code+0x6a/0x70
 [] scsi_request_fn+0x5f/0x2ec
 [] __generic_unplug_device+0x20/0x23
 [] blk_execute_rq_nowait+0x7c/0x8f
 [] blk_execute_rq+0xb1/0xcf
 [] scsi_execute+0xc4/0xd7
 [] scsi_execute_req+0xae/0xcb
 [] sr_probe+0x1d5/0x557 [sr_mod]
 [] driver_probe_device+0xe8/0x168
 [] __driver_attach+0x6a/0xa1
 [] bus_for_each_dev+0x36/0x5b
 [] driver_attach+0x19/0x1b
 [] bus_add_driver+0x73/0x1aa
 [] driver_register+0x67/0x6c
 [] scsi_register_driver+0xf/0x11
 [] init_sr+0x23/0x3d [sr_mod]
 [] sys_init_module+0x1142/0x1262
 [] sysenter_past_esp+0x5f/0xa5
 ===
Code:  Bad EIP value.
EIP: [] 0xf88a4a05 SS:ESP 0068:c3897be0

  Luis

Re: [PATCH] Spelling fix: lenght->length

2007-11-03 Thread Frans Pop

Looks like word-wrapping by your mail client has corrupted the patch.
Suggest you resend it.

Cheers,
FJP

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> On Sat, 3 Nov 2007, Christoph Lameter wrote:
> > On Sat, 3 Nov 2007, Hugh Dickins wrote:
> > 
> > > Later Christoph noticed that I'm not handling the SlabDebug case right.
> > > So stable should ignore my patch, and he will come up with another.
> > 
> > Hmmm? I thought you wanted to test the patch provided?
> 
> Yes.  Sounds like you see a contradiction there - I don't see it.

Maybe language issues. I read this as meaning that there is no fix 
available yet.
 

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Christoph Lameter wrote:
> On Sat, 3 Nov 2007, Hugh Dickins wrote:
> 
> > Later Christoph noticed that I'm not handling the SlabDebug case right.
> > So stable should ignore my patch, and he will come up with another.
> 
> Hmmm? I thought you wanted to test the patch provided?

Yes.  Sounds like you see a contradiction there - I don't see it.

Hugh

device struct bloat

2007-11-03 Thread Stephen Hemminger

The sizeof(struct device) is way too big, especially in the network device case.
We want to support thousands of devices, and the change from class_device to
net_device has caused needless bloat.

sizeof(struct device) = 272
sizeof(struct class_device) = 92
  * note: the class_id in class_device could also be removed or changed to
 a ptr, since it is redundant for net_devices.
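
To make the array-versus-pointer point concrete, here is a small sketch
comparing two layouts. The struct names and the 20-byte identifier size
are assumptions chosen for illustration; they are not the kernel's
definitions and the numbers will not match the sizes quoted above.

```c
#include <stddef.h>

#define DEMO_ID_SIZE 20	/* assumed length of the inline identifier */

/* Layout that stores a copy of the identifier in every object. */
struct demo_dev_inline_id {
	void *parent;
	char class_id[DEMO_ID_SIZE];
};

/* Layout that only points at an identifier stored elsewhere. */
struct demo_dev_ptr_id {
	void *parent;
	const char *class_id;
};

/* Bytes saved per object by switching from the inline array to a pointer. */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_dev_inline_id) -
	       sizeof(struct demo_dev_ptr_id);
}

/* Total saving for a given number of devices. */
static size_t demo_total_saved(size_t ndevices)
{
	return ndevices * demo_bytes_saved();
}
```

The per-object saving depends on pointer size and padding, so it is
best read from sizeof on the target rather than computed by hand.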


Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> Later Christoph noticed that I'm not handling the SlabDebug case right.
> So stable should ignore my patch, and he will come up with another.

Hmmm? I thought you wanted to test the patch provided?


[RFC][PATCH 3/3] PM: Suspend/hibernation debug documentation update (rev. 2)

2007-11-03 Thread Rafael J. Wysocki

From: Rafael J. Wysocki <[EMAIL PROTECTED]>

Update the suspend/hibernation debugging and testing documentation to describe
the newly introduced testing facilities.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 Documentation/power/basic-pm-debugging.txt |  203 -
 Documentation/power/drivers-testing.txt|   30 ++--
 2 files changed, 163 insertions(+), 70 deletions(-)

Index: linux-2.6/Documentation/power/drivers-testing.txt
===
--- linux-2.6.orig/Documentation/power/drivers-testing.txt
+++ linux-2.6/Documentation/power/drivers-testing.txt
@@ -6,9 +6,9 @@ Testing suspend and resume support in de
 Unfortunately, to effectively test the support for the system-wide suspend and
 resume transitions in a driver, it is necessary to suspend and resume a fully
 functional system with this driver loaded.  Moreover, that should be done
-several times, preferably several times in a row, and separately for the suspend
-to disk (STD) and the suspend to RAM (STR) transitions, because each of these
-cases involves different ordering of operations and different interactions with
+several times, preferably several times in a row, and separately for hibernation
+(aka suspend to disk or STD) and suspend to RAM (STR), because each of these
+cases involves slightly different operations and different interactions with
 the machine's BIOS.
 
 Of course, for this purpose the test system has to be known to suspend and
@@ -22,20 +22,24 @@ for more information about the debugging
 Once you have resolved the suspend/resume-related problems with your test system
 without the new driver, you are ready to test it:
 
-a) Build the driver as a module, load it and try the STD in the test mode (see:
-Documents/power/basic-pm-debugging.txt, 1a)).
+a) Build the driver as a module, load it and try the test modes of hibernation
+   (see: Documents/power/basic-pm-debugging.txt, 1).
 
-b) Load the driver and attempt to suspend to disk in the "reboot", "shutdown"
-and "platform" modes (see: Documents/power/basic-pm-debugging.txt, 1).
+b) Load the driver and attempt to hibernate in the "reboot", "shutdown" and
+   "platform" modes (see: Documents/power/basic-pm-debugging.txt, 1).
 
-c) Compile the driver directly into the kernel and try the STD in the test mode.
+c) Compile the driver directly into the kernel and try the test modes of
+   hibernation.
 
-d) Attempt to suspend to disk with the driver compiled directly into the kernel
-in the "reboot", "shutdown" and "platform" modes.
+d) Attempt to hibernate with the driver compiled directly into the kernel
+   in the "reboot", "shutdown" and "platform" modes.
 
-e) Attempt to suspend to RAM using the s2ram tool with the driver loaded (see:
-Documents/power/basic-pm-debugging.txt, 2).  As far as the STR tests are
-concerned, it should not matter whether or not the driver is built as a module.
+e) Try the test modes of suspend (see: Documents/power/basic-pm-debugging.txt,
+   2).  [As far as the STR tests are concerned, it should not matter whether or
+   not the driver is built as a module.]
+
+f) Attempt to suspend to RAM using the s2ram tool with the driver loaded
+   (see: Documents/power/basic-pm-debugging.txt, 2).
 
 Each of the above tests should be repeated several times and the STD tests
 should be mixed with the STR tests.  If any of them fails, the driver cannot be
Index: linux-2.6/Documentation/power/basic-pm-debugging.txt
===
--- linux-2.6.orig/Documentation/power/basic-pm-debugging.txt
+++ linux-2.6/Documentation/power/basic-pm-debugging.txt
@@ -1,45 +1,102 @@
-Debugging suspend and resume
+Debugging hibernation and suspend
(C) 2007 Rafael J. Wysocki <[EMAIL PROTECTED]>, GPL
 
-1. Testing suspend to disk (STD)
+1. Testing hibernation (aka suspend to disk or STD)
 
-To verify that the STD works, you can try to suspend in the "reboot" mode:
+To check if hibernation works, you can try to hibernate in the "reboot" mode:
 
 # echo reboot > /sys/power/disk
 # echo disk > /sys/power/state
 
-and the system should suspend, reboot, resume and get back to the command prompt
-where you have started the transition.  If that happens, the STD is most likely
-to work correctly, but you need to repeat the test at least a couple of times in
-a row for confidence.  This is necessary, because some problems only show up on
-a second attempt at suspending and resuming the system.  You should also test
-the "platform" and "shutdown" modes of suspend:
+and the system should create a hibernation image, reboot, resume and get back to
+the command prompt where you have started the transition.  If that happens,
+hibernation is most likely to work correctly.  Still, you need to repeat the
+test at least a couple of times in a row for confidence.  [This is necessary,
+because some problems only show up on a second attempt at

[RFC][PATCH 0/3] Suspend and hibernation test facility (rev. 2)

2007-11-03 Thread Rafael J. Wysocki

Hi,

This is the second iteration of the patches that add a new testing facility for
suspend and hibernation.

The first patch adds the possibility to test the suspend (STR) core code
without actually suspending, which is useful for tracking problems with drivers
etc.

The second one modifies the hibernation core so that it can use the same
facility (it's a bit more powerful than the existing hibernation test modes,
since they really can't test the ACPI global methods).

The third one modifies the documentation in accordance with the two previous ones.

Comments welcome.

Greetings,
Rafael


[RFC][PATCH 1/3] Suspend: Testing facility (rev. 2)

2007-11-03 Thread Rafael J. Wysocki

From: Rafael J. Wysocki <[EMAIL PROTECTED]>

Introduce sysfs attribute /sys/power/pm_test allowing one to test the suspend
core code.  Namely, writing one of the strings:

freezer
devices
platform
processors
core

to this file causes the suspend code to work in one of the test modes defined as
follows:

freezer - test the freezing of processes
devices - test the freezing of processes and suspending of devices
platform - test the freezing of processes, suspending of devices and platform
   global control methods(*)
processors - test the freezing of processes, suspending of devices, platform
 global control methods and the disabling of nonboot CPUs
core - test the freezing of processes, suspending of devices, platform global
   control methods, the disabling of nonboot CPUs and suspending of
   platform/system devices

(*) These are ACPI global control methods on ACPI systems

Then, if a suspend is started by normal means, the suspend core will perform
its normal operations up to the point indicated by given test level.  Next, it
will wait for 5 seconds and carry out the resume operations needed to transition
the system back to the fully functional state.
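
As a rough illustration of the staged behaviour described above, the
sketch below walks the suspend steps in the listed order and stops at
the selected test point. The demo_* names and the enum are hypothetical
and exist only for this example; the real logic is in the patch that
follows.

```c
#include <stddef.h>

/* Hypothetical test levels mirroring the strings listed above. */
enum demo_test_level {
	DEMO_TEST_NONE,
	DEMO_TEST_CORE,
	DEMO_TEST_CPUS,
	DEMO_TEST_PLATFORM,
	DEMO_TEST_DEVICES,
	DEMO_TEST_FREEZER,
};

/* Suspend steps in execution order, each tagged with its test point. */
static const enum demo_test_level demo_step_level[] = {
	DEMO_TEST_FREEZER,	/* freeze processes */
	DEMO_TEST_DEVICES,	/* suspend devices */
	DEMO_TEST_PLATFORM,	/* platform global control methods */
	DEMO_TEST_CPUS,		/* disable nonboot CPUs */
	DEMO_TEST_CORE,		/* suspend platform/system devices */
};

#define DEMO_NSTEPS (sizeof(demo_step_level) / sizeof(demo_step_level[0]))

struct demo_result {
	int steps_done;		/* suspend steps carried out */
	int entered_sleep;	/* 1 only if the sleep state was really entered */
};

/* Run the steps until the test point is hit, or all the way if none is set. */
static struct demo_result demo_suspend(enum demo_test_level test)
{
	struct demo_result r = { 0, 0 };

	for (size_t i = 0; i < DEMO_NSTEPS; i++) {
		r.steps_done++;			/* perform step i */
		if (test == demo_step_level[i])
			return r;		/* wait, then start resuming */
	}
	r.entered_sleep = 1;			/* no test level: suspend for real */
	return r;
}
```

With "devices" selected, for instance, two steps run and the sleep
state is never entered; with "none", all five run and the system
suspends normally.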

Writing "none" to /sys/power/pm_test turns the testing off.

When open for reading, /sys/power/pm_test contains a space-separated list of all
available tests (including "none" that represents the normal functionality) in
which the current test level is indicated by square brackets.
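
The read and write formats described above can be tried out in
userspace with a sketch like the following. The demo_* names are
hypothetical and the table order is an assumption; the code only
mirrors the bracketed-list format and the newline-tolerant matching,
not the sysfs plumbing.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed list of test names; index 0 stands for normal operation. */
static const char *const demo_tests[] = {
	"none", "core", "processors", "platform", "devices", "freezer",
};

#define DEMO_NTESTS (sizeof(demo_tests) / sizeof(demo_tests[0]))

static int demo_level;	/* index of the currently selected test */

/* Write the list with the current level in brackets; -1 if buf is too small. */
static int demo_show(char *buf, size_t size)
{
	size_t used = 0;

	for (size_t i = 0; i < DEMO_NTESTS; i++) {
		int n;

		if ((int)i == demo_level)
			n = snprintf(buf + used, size - used, "[%s] ", demo_tests[i]);
		else
			n = snprintf(buf + used, size - used, "%s ", demo_tests[i]);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}
	if (used > 0)
		buf[used - 1] = '\n';	/* convert the last space to a newline */
	return (int)used;
}

/* Select a test by name; one trailing newline is ignored. 0 or -1. */
static int demo_store(const char *buf, size_t n)
{
	const char *nl = memchr(buf, '\n', n);
	size_t len = nl ? (size_t)(nl - buf) : n;

	for (size_t i = 0; i < DEMO_NTESTS; i++) {
		if (len == strlen(demo_tests[i]) &&
		    strncmp(buf, demo_tests[i], len) == 0) {
			demo_level = (int)i;
			return 0;
		}
	}
	return -1;
}
```

Comparing the length before calling strncmp() keeps a prefix such as
"core" from matching a longer or shorter name by accident.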

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 kernel/power/main.c  |  109 ++-
 kernel/power/power.h |   18 
 2 files changed, 118 insertions(+), 9 deletions(-)

Index: linux-2.6/kernel/power/main.c
===
--- linux-2.6.orig/kernel/power/main.c
+++ linux-2.6/kernel/power/main.c
@@ -31,6 +31,80 @@ DEFINE_MUTEX(pm_mutex);
 unsigned int pm_flags;
 EXPORT_SYMBOL(pm_flags);
 
+#ifdef CONFIG_PM_DEBUG
+int pm_test_level = TEST_NONE;
+
+static int suspend_test(int level)
+{
+   if (pm_test_level == level) {
+   printk(KERN_INFO "suspend debug: Waiting for 5 seconds.\n");
+   mdelay(5000);
+   return 1;
+   }
+   return 0;
+}
+
+static const char * const pm_tests[__TEST_AFTER_LAST] = {
+   [TEST_NONE] = "none",
+   [TEST_CORE] = "core",
+   [TEST_CPUS] = "processors",
+   [TEST_PLATFORM] = "platform",
+   [TEST_DEVICES] = "devices",
+   [TEST_FREEZER] = "freezer",
+};
+
+static ssize_t pm_test_show(struct kset *kset, char *buf)
+{
+   char *s = buf;
+   int level;
+
+   for (level = TEST_FIRST; level <= TEST_MAX; level++)
+   if (pm_tests[level]) {
+   if (level == pm_test_level)
+   s += sprintf(s, "[%s] ", pm_tests[level]);
+   else
+   s += sprintf(s, "%s ", pm_tests[level]);
+   }
+
+   if (s != buf)
+   /* convert the last space to a newline */
+   *(s-1) = '\n';
+
+   return (s - buf);
+}
+
+static ssize_t pm_test_store(struct kset *kset, const char *buf, size_t n)
+{
+   const char * const *s;
+   int level;
+   char *p;
+   int len;
+   int error = -EINVAL;
+
+   p = memchr(buf, '\n', n);
+   len = p ? p - buf : n;
+
+   mutex_lock(&pm_mutex);
+
+   level = TEST_FIRST;
+   for (s = &pm_tests[level]; level <= TEST_MAX; s++, level++)
+   if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
+   pm_test_level = level;
+   error = 0;
+   break;
+   }
+
+   mutex_unlock(&pm_mutex);
+
+   return error ? error : n;
+}
+
+power_attr(pm_test);
+#else /* !CONFIG_PM_DEBUG */
+static inline int suspend_test(int level) { return 0; }
+#endif /* !CONFIG_PM_DEBUG */
+
+
 #ifdef CONFIG_SUSPEND
 
 /* This is just an arbitrary number */
@@ -136,7 +210,10 @@ static int suspend_enter(suspend_state_t
printk(KERN_ERR "Some devices failed to power down\n");
goto Done;
}
-   error = suspend_ops->enter(state);
+
+   if (!suspend_test(TEST_CORE))
+   error = suspend_ops->enter(state);
+
device_power_up();
  Done:
arch_suspend_enable_irqs();
@@ -167,16 +244,25 @@ int suspend_devices_and_enter(suspend_st
printk(KERN_ERR "Some devices failed to suspend\n");
goto Resume_console;
}
+
+   if (suspend_test(TEST_DEVICES))
+   goto Resume_devices;
+
if (suspend_ops->prepare) {
error = suspend_ops->prepare();
if (error)
goto Resume_devices;
}
+
+   if (suspend_test(TEST_PLATFORM))
+   goto Finish;
+
error = disable_nonboot_cpus();
-   if (!error)
+   if (!error &&

[RFC][PATCH 2/3] Hibernation: New testing facility (rev. 2)

2007-11-03 Thread Rafael J. Wysocki

From: Rafael J. Wysocki <[EMAIL PROTECTED]>

Make it possible to test the hibernation core code with the help of the
/sys/power/pm_test attribute introduced for suspend testing in the previous
patch.

Writing an appropriate string to this file causes the hibernation code to work
in one of the test modes defined as follows:

freezer - test the freezing of processes
devices - test the freezing of processes and suspending of devices
platform - test the freezing of processes, suspending of devices and platform
   global control methods(*)
processors - test the freezing of processes, suspending of devices, platform
 global control methods(*) and the disabling of nonboot CPUs
core - test the freezing of processes, suspending of devices, platform global
   control methods(*), the disabling of nonboot CPUs and suspending of
   platform/system devices

(*) - the platform global control methods are only available on ACPI systems
  and are only tested if the hibernation mode is set to "platform"

Then, if a hibernation is started by normal means, the hibernation core will
perform its normal operations up to the point indicated by given test level.
Next, it will wait for 5 seconds and carry out the resume operations needed to
transition the system back to the fully functional state.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 kernel/power/disk.c  |   70 ---
 kernel/power/power.h |2 +
 2 files changed, 57 insertions(+), 15 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -70,6 +70,35 @@ void hibernation_set_ops(struct platform
mutex_unlock(&pm_mutex);
 }
 
+#ifdef CONFIG_PM_DEBUG
+static void hibernation_debug_sleep(void)
+{
+   printk(KERN_INFO "hibernation debug: Waiting for 5 seconds.\n");
+   mdelay(5000);
+}
+
+static int hibernation_testmode(int mode)
+{
+   if (hibernation_mode == mode) {
+   hibernation_debug_sleep();
+   return 1;
+   }
+   return 0;
+}
+
+static int hibernation_test(int level)
+{
+   if (pm_test_level == level) {
+   hibernation_debug_sleep();
+   return 1;
+   }
+   return 0;
+}
+#else /* !CONFIG_PM_DEBUG */
+static int hibernation_testmode(int mode) { return 0; }
+static int hibernation_test(int level) { return 0; }
+#endif /* !CONFIG_PM_DEBUG */
+
 /**
  * platform_start - tell the platform driver that we're starting
  * hibernation
@@ -167,6 +196,10 @@ int create_image(int platform_mode)
goto Enable_irqs;
}
 
+   if (hibernation_test(TEST_CORE))
+   goto Power_up;
+
+   in_suspend = 1;
save_processor_state();
error = swsusp_arch_suspend();
if (error)
@@ -175,6 +208,7 @@ int create_image(int platform_mode)
restore_processor_state();
if (!in_suspend)
platform_leave(platform_mode);
+ Power_up:
/* NOTE:  device_power_up() is just a resume() for devices
 * that suspended with irqs off ... no overall powerup.
 */
@@ -211,24 +245,29 @@ int hibernation_snapshot(int platform_mo
if (error)
goto Resume_console;
 
-   error = platform_pre_snapshot(platform_mode);
-   if (error)
+   if (hibernation_test(TEST_DEVICES))
goto Resume_devices;
 
+   error = platform_pre_snapshot(platform_mode);
+   if (error || hibernation_test(TEST_PLATFORM))
+   goto Finish;
+
error = disable_nonboot_cpus();
if (!error) {
-   if (hibernation_mode != HIBERNATION_TEST) {
-   in_suspend = 1;
-   error = create_image(platform_mode);
-   /* Control returns here after successful restore */
-   } else {
-   printk("swsusp debug: Waiting for 5 seconds.\n");
-   mdelay(5000);
-   }
+   if (hibernation_test(TEST_CPUS))
+   goto Enable_cpus;
+
+   if (hibernation_testmode(HIBERNATION_TEST))
+   goto Enable_cpus;
+
+   error = create_image(platform_mode);
+   /* Control returns here after successful restore */
}
+ Enable_cpus:
enable_nonboot_cpus();
- Resume_devices:
+ Finish:
platform_finish(platform_mode);
+ Resume_devices:
device_resume();
  Resume_console:
resume_console();
@@ -406,11 +445,12 @@ int hibernate(void)
if (error)
goto Finish;
 
-   if (hibernation_mode == HIBERNATION_TESTPROC) {
-   printk("swsusp debug: Waiting for 5 seconds.\n");
-   mdelay(5000);
+   if (hibernation_test(TEST_FREEZER))
goto Thaw;
-   }
+
+   if

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Hugh Dickins wrote:
> On Sat, 3 Nov 2007, Hugh Dickins wrote:
> > On Sat, 3 Nov 2007, Olivér Pintér wrote:
> > > Q: It's needed auch to 2.6.22-stable?
> 
> Okay, here's a version for 2.6.23 and 2.6.22...
> Christoph, you've now Acked the 2.6.24 one, thanks:
> do you agree this patch below should go to -stable?

Later Christoph noticed that I'm not handling the SlabDebug case right.
So stable should ignore my patch, and he will come up with another.

Hugh

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Christoph Lameter wrote:
> On Sat, 3 Nov 2007, Hugh Dickins wrote:
> > On Sat, 3 Nov 2007, Christoph Lameter wrote:
> > > On Sat, 3 Nov 2007, Hugh Dickins wrote:
> > > 
> > > > Which fixes the leakage: Objects and Partials then remain stable.
> > > 
> > > Well this code is just an optimization for a rare case.
> > > Your patch may not handle the debug situation the right way.
> > 
> > Oh?  How?
> 
> If SLAB_DEBUG is set then your patch does not do the proper checks for the 
> object, tracing information is not written and the poisoning is not done. 
> See alloc_debug_processing().

Yup, you're right, thanks.

I'll followup that version I CC'ed to stable,
to stop it and say you'll supply a corrected version.

Hugh

Re: [PATCH 2/2] slub: fix Objects count

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> I was afraid you might say something like that.
> Perhaps it'll be a patch I need to use in my own builds.
> Though I'd have thought others would want that accuracy too.
> Didn't SLAB give it?  (The "r*gr*ss**n" word!)

Slab also only counts objects that are not in the queues. See free_block(),
for example.

We could improve the situation by flushing all cpu slabs before counts are 
determined.

Which can be done manually. Run

slabinfo -s

and then look at the numbers.
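
A toy model of why the count is approximate, and why flushing helps,
might look like this. The struct and the demo_* functions are
hypothetical and far simpler than the allocator; the only assumption
carried over from the thread is that a cpu slab is reported as entirely
in use.

```c
#include <stddef.h>

/* Hypothetical per-slab bookkeeping for the example. */
struct demo_slab {
	unsigned objects;	/* capacity of the slab */
	unsigned inuse;		/* objects actually allocated */
	int is_cpu_slab;	/* currently owned by a cpu */
};

/* Reported count: a cpu slab is treated as completely in use. */
static unsigned demo_count_approx(const struct demo_slab *s, size_t n)
{
	unsigned total = 0;

	for (size_t i = 0; i < n; i++)
		total += s[i].is_cpu_slab ? s[i].objects : s[i].inuse;
	return total;
}

/* True count of allocated objects. */
static unsigned demo_count_exact(const struct demo_slab *s, size_t n)
{
	unsigned total = 0;

	for (size_t i = 0; i < n; i++)
		total += s[i].inuse;
	return total;
}

/* Hand every cpu slab back, so its real usage becomes visible. */
static void demo_flush_cpu_slabs(struct demo_slab *s, size_t n)
{
	for (size_t i = 0; i < n; i++)
		s[i].is_cpu_slab = 0;
}
```

In this model the reported number is only off by the free objects
sitting in cpu slabs, and the two counts agree right after a flush.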

> > Adds to much overhead to the fast paths
> 
> You've come to that conclusion very quickly!

I have just spent a few weeks optimizing the fast and slow paths and there 
is some additional overhead that I am still trying to eliminate.

> Any numbers to back it up?

The performance in the fast paths depends on updating only a single word
for an allocation. Adding another counter makes that impossible.

See the recent post on SLUB regression on SMP.

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> On Sat, 3 Nov 2007, Christoph Lameter wrote:
> > On Sat, 3 Nov 2007, Hugh Dickins wrote:
> > 
> > > Which fixes the leakage: Objects and Partials then remain stable.
> > 
> > Well this code is just an optimization for a rare case.
> > Your patch may not handle the debug situation the right way.
> 
> Oh?  How?

If SLAB_DEBUG is set then your patch does not do the proper checks for the 
object, tracing information is not written and the poisoning is not done. 
See alloc_debug_processing().


[PATCH] [POWERPC] Fix typo #ifdef -> #ifndef

2007-11-03 Thread Jochen Friedrich


fpi->cp_command should be overwritten only if CONFIG_PPC_CPM_NEW_BINDING
is NOT set. Otherwise it is already set from the device tree.
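
The effect of the inverted conditional can be seen in a standalone
sketch. DEMO_NEW_BINDING, the struct and the command value are
hypothetical stand-ins, not the driver's real symbols; the point is only
that the overwrite must be compiled in when the option is absent.

```c
/* Hypothetical stand-in for the platform info; not the driver's type. */
struct demo_platform_info {
	unsigned cp_command;
};

#define DEMO_LEGACY_COMMAND 0x40u	/* made-up value for the example */

/*
 * Define DEMO_NEW_BINDING at build time to model a configuration in
 * which the value was already filled in from elsewhere.
 */
static void demo_setup(struct demo_platform_info *fpi)
{
#ifndef DEMO_NEW_BINDING
	/* Legacy path: nothing has set the value yet, so compute it here. */
	fpi->cp_command = DEMO_LEGACY_COMMAND;
#else
	/* New path: keep the value that was supplied earlier. */
	(void)fpi;
#endif
}
```

Building with -DDEMO_NEW_BINDING leaves the field untouched, while a
plain build overwrites it, which is the behaviour the fix restores.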

Signed-off-by: Jochen Friedrich <[EMAIL PROTECTED]>
---

This can be pulled from git://git.bocc.de/dbox2.git for-2.6.24

drivers/net/fs_enet/mac-scc.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/fs_enet/mac-scc.c b/drivers/net/fs_enet/mac-scc.c
index 03134f4..48f2f30 100644
--- a/drivers/net/fs_enet/mac-scc.c
+++ b/drivers/net/fs_enet/mac-scc.c
@@ -158,7 +158,7 @@ static int setup_data(struct net_device *dev)
{
struct fs_enet_private *fep = netdev_priv(dev);

-#ifdef CONFIG_PPC_CPM_NEW_BINDING
+#ifndef CONFIG_PPC_CPM_NEW_BINDING
struct fs_platform_info *fpi = fep->fpi;

fep->scc.idx = fs_get_scc_index(fpi->fs_no);
--
1.5.3.4


Re: VIA VT6307 OHCI version?

2007-11-03 Thread Stefan Richter

Krzysztof Halasa wrote:
> The difference is of course at 0x0E, not 0x1E. Maybe the byte at 0x0A
> is 0x92 for 4 IR contents and 0xA2 for 8 contents. That would also
> make sense wrt the broken 6306 as it has 0x00 there.

Somebody pointed me to this thread in a support forum of Asustek:
http://vip.asus.com/forum/view.aspx?id=20070710054607250_id=1=M2A-VM+HDMI
For special occasions, Asustek hand out a testing and reprogramming
utility from VIA to their customers (viafire.exe, together with EEPROM
image files and instructions; see the link to the RAR archive at this
page).  The documentation of the tool also contains a partial
description of the EEPROM contents.
-- 
Stefan Richter
-=-=-=== =-== ---=-
http://arcgraph.de/sr/

Re: [PATCH 2/2] slub: fix Objects count

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Christoph Lameter wrote:
> On Sat, 3 Nov 2007, Hugh Dickins wrote:
> 
> > The count of active Objects shown by Slub's slabinfo is too approximate,
> > because each cpu slab is counted as all in use, even when lots are free.
> > That makes tracing leaks harder than it need be.
> 
> True but that is the way it is for performance reasons.

I was afraid you might say something like that.
Perhaps it'll be a patch I need to use in my own builds.
Though I'd have thought others would want that accuracy too.
Didn't SLAB give it?  (The "r*gr*ss**n" word!)

> > Add a free count into kmem_cache_cpu (which doesn't enlarge it on 64-bit),
> > to keep that count in the hot and dirty per-cpu cacheline.
> 
> Adds to much overhead to the fast paths

You've come to that conclusion very quickly!
Any numbers to back it up?

> and will make the current optimizations in mm impossible.

I'll have to wait and see what those are: you move too fast for me.

Hugh

[PATCH] Spelling fix: lenght->length

2007-11-03 Thread Paulius Zaleckas

Signed-off-by: Paulius Zaleckas <[EMAIL PROTECTED]>

---
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 2ce3806..58fccc9 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -333,7 +333,7 @@ setup_rt_frame(int sig, struct k_sigaction *ka,
siginfo_t *info,
flush_user_icache_range((unsigned long) >tramp[0],
   (unsigned long) >tramp[TRAMP_SIZE]);
 
-   /* TRAMP Words 0-4, Lenght 5 = SIGRESTARTBLOCK_TRAMP
+   /* TRAMP Words 0-4, Length 5 = SIGRESTARTBLOCK_TRAMP
 * TRAMP Words 5-9, Length 4 = SIGRETURN_TRAMP
 * So the SIGRETURN_TRAMP is at the end of SIGRESTARTBLOCK_TRAMP
 */
diff --git a/arch/powerpc/sysdev/mmio_nvram.c
b/arch/powerpc/sysdev/mmio_nvram.c
index e073e24..7b49633 100644
--- a/arch/powerpc/sysdev/mmio_nvram.c
+++ b/arch/powerpc/sysdev/mmio_nvram.c
@@ -99,7 +99,7 @@ int __init mmio_nvram_init(void)
nvram_addr = r.start;
mmio_nvram_len = r.end - r.start + 1;
if ( (!mmio_nvram_len) || (!nvram_addr) ) {
-   printk(KERN_WARNING "nvram: address or lenght is 0\n");
+   printk(KERN_WARNING "nvram: address or length is 0\n");
ret = -EIO;
goto out;
}
diff --git a/arch/ppc/syslib/ppc_sys.c b/arch/ppc/syslib/ppc_sys.c
index 2d48018..837183c 100644
--- a/arch/ppc/syslib/ppc_sys.c
+++ b/arch/ppc/syslib/ppc_sys.c
@@ -185,7 +185,7 @@ void platform_notify_map(const struct
platform_notify_dev_map *map,
  */
 
 /*
-   Here we'll replace .name pointers with fixed-lenght strings
+   Here we'll replace .name pointers with fixed-length strings
Hereby, this should be called *before* any func stuff triggeded.
  */
 void ppc_sys_device_initfunc(void)
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 97aa50d..ad0ede2 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -305,7 +305,7 @@ void *sbus_alloc_consistent(struct sbus_dev *sdev,
long len, u32 *dma_addrp)
struct resource *res;
int order;
 
-   /* XXX why are some lenghts signed, others unsigned? */
+   /* XXX why are some lengths signed, others unsigned? */
if (len <= 0) {
return NULL;
}
@@ -393,7 +393,7 @@ void sbus_free_consistent(struct sbus_dev *sdev,
long n, void *p, u32 ba)
  */
 dma_addr_t sbus_map_single(struct sbus_dev *sdev, void *va, size_t len,
int direction)
 {
-   /* XXX why are some lenghts signed, others unsigned? */
+   /* XXX why are some lengths signed, others unsigned? */
if (len <= 0) {
return 0;
}
diff --git a/drivers/atm/firestream.c b/drivers/atm/firestream.c
index f8f7139..c662d68 100644
--- a/drivers/atm/firestream.c
+++ b/drivers/atm/firestream.c
@@ -171,8 +171,8 @@ static char *res_strings[] = {
"packet purged", 
"packet ageing timeout", 
"channel ageing timeout", 
-   "calculated lenght error", 
-   "programmed lenght limit error", 
+   "calculated length error",
+   "programmed length limit error",
"aal5 crc32 error", 
"oam transp or transpc crc10 error", 
"reserved 25", 
diff --git a/drivers/i2c/busses/i2c-powermac.c
b/drivers/i2c/busses/i2c-powermac.c
index 0ab4f26..7813127 100644
--- a/drivers/i2c/busses/i2c-powermac.c
+++ b/drivers/i2c/busses/i2c-powermac.c
@@ -94,7 +94,7 @@ static s32 i2c_powermac_smbus_xfer(   struct
i2c_adapter*adap,
break;
 
/* Note that these are broken vs. the expected smbus API where
-* on reads, the lenght is actually returned from the function,
+* on reads, the length is actually returned from the function,
 * but I think the current API makes no sense and I don't want
 * any driver that I haven't verified for correctness to go
 * anywhere near a pmac i2c bus anyway ...
diff --git a/drivers/ide/ide-timing.h b/drivers/ide/ide-timing.h
index daffbb9..20de3f3 100644
--- a/drivers/ide/ide-timing.h
+++ b/drivers/ide/ide-timing.h
@@ -201,7 +201,7 @@ static int ide_timing_compute(ide_drive_t *drive,
short speed, struct ide_timing
}
 
 /*
- * Lenghten active & recovery time so that cycle time is correct.
+ * Lengthen active & recovery time so that cycle time is correct.
  */
 
if (t->act8b + t->rec8b < t->cyc8b) {
diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c
b/drivers/infiniband/hw/cxgb3/cxio_hal.c
index eec6a30..26b8c0e 100644
--- a/drivers/infiniband/hw/cxgb3/cxio_hal.c
+++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c
@@ -584,7 +584,7 @@ static int cxio_hal_ctrl_qp_write_mem(struct
cxio_rdev *rdev_p, u32 addr,
 {
u32 i, nr_wqe, copy_len;
u8 *copy_data;
-   u8 wr_len, utx_len; /* lenght in 8 byte flit */
+   u8 wr_len, utx_len; /* length in 8 byte flit */
enum t3_wr_flags flag;
__be64 *wqe;
u64 utx_cmd;
diff --git a/drivers/isdn/hardware/eicon/message.c

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Christoph Lameter wrote:
> On Sat, 3 Nov 2007, Hugh Dickins wrote:
> 
> > Which fixes the leakage: Objects and Partials then remain stable.
> 
> Well this code is just an optimization for a rare case.
> Your patch may not handle the debug situation the right way.

Oh?  How?

> We could just remove it.

Hmm, that does seem a possibility.  It is going to increase the
number of partials in use, but they should go away later once
they fall out of use (in a repetitive test like mine).

I'll give it a try overnight and report back then.

Hugh

[PATCH2/2] [POWERPC] fs_enet: select PHYLIB as the driver needs it.

2007-11-03 Thread Jochen Friedrich


Add a select PHYLIB to config FS_ENET as the driver uses functions of
libphy.

LD  .tmp_vmlinux1
drivers/built-in.o: In function `fs_ioctl':
drivers/net/fs_enet/fs_enet-main.c:952: undefined reference to `phy_mii_ioctl'
[...]
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Jochen Friedrich <[EMAIL PROTECTED]>
---
drivers/net/fs_enet/Kconfig |1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/fs_enet/Kconfig b/drivers/net/fs_enet/Kconfig
index 2765e49..24502d2 100644
--- a/drivers/net/fs_enet/Kconfig
+++ b/drivers/net/fs_enet/Kconfig
@@ -2,6 +2,7 @@ config FS_ENET
tristate "Freescale Ethernet Driver"
depends on CPM1 || CPM2
select MII
+   select PHYLIB

config FS_ENET_HAS_SCC
bool "Chip has an SCC usable for ethernet"
--
1.5.3.4





[PATCH0/2] [POWERPC] Two bug fixes for 2.6.24

2007-11-03 Thread Jochen Friedrich


Here is a series fixing some bugs for 8xx powerpc CPUs.

1. [POWERPC] Kill non-existant symbols from ksyms and commproc.h
2. [POWERPC] fs_enet: select PHYLIB as the driver needs it

This series can be pulled from git://git.bocc.de/dbox2.git for-2.6.24

Thanks,
Jochen



[PATCH1/2] [POWERPC] Kill non-existant symbols from ksyms and commproc.h

2007-11-03 Thread Jochen Friedrich


Remove exports of __res and cpm_install_handler/cpm_free_handler.
Remove cpm_install_handler/cpm_free_handler from the commproc.h as well.
Both were used for ARCH=ppc and aren't defined for ARCH=powerpc.

CC  arch/powerpc/kernel/ppc_ksyms.o
arch/powerpc/kernel/ppc_ksyms.c:180: error: '__res' undeclared here (not in a function)
arch/powerpc/kernel/ppc_ksyms.c:180: warning: type defaults to 'int' in declaration of '__res'
make[1]: *** [arch/powerpc/kernel/ppc_ksyms.o] Error 1
make: *** [arch/powerpc/kernel] Error 2

LD  .tmp_vmlinux1
arch/powerpc/kernel/built-in.o:(__ksymtab+0x198): undefined reference to `cpm_free_handler'
arch/powerpc/kernel/built-in.o:(__ksymtab+0x1a0): undefined reference to `cpm_install_handler'
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Jochen Friedrich <[EMAIL PROTECTED]>
---
arch/powerpc/kernel/ppc_ksyms.c |   12 
include/asm-powerpc/commproc.h  |3 ---
2 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index c6b1aa3..13ebeb2 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -45,10 +45,6 @@
#include 
#include 

-#ifdef  CONFIG_8xx
-#include 
-#endif
-
#ifdef CONFIG_PPC64
EXPORT_SYMBOL(local_irq_restore);
#endif
@@ -172,14 +168,6 @@ EXPORT_SYMBOL(console_drivers);
EXPORT_SYMBOL(cacheable_memcpy);
#endif

-#ifdef  CONFIG_8xx
-EXPORT_SYMBOL(cpm_install_handler);
-EXPORT_SYMBOL(cpm_free_handler);
-#endif /* CONFIG_8xx */
-#if defined(CONFIG_8xx)
-EXPORT_SYMBOL(__res);
-#endif
-
#ifdef CONFIG_PPC32
EXPORT_SYMBOL(next_mmu_context);
EXPORT_SYMBOL(set_context);
diff --git a/include/asm-powerpc/commproc.h b/include/asm-powerpc/commproc.h
index 0307c84..5dff922 100644
--- a/include/asm-powerpc/commproc.h
+++ b/include/asm-powerpc/commproc.h
@@ -698,9 +698,6 @@ typedef struct risc_timer_pram {
#define CICR_IEN((uint)0x0080)  /* Int. enable */
#define CICR_SPS((uint)0x0001)  /* SCC Spread */

-extern void cpm_install_handler(int vec, void (*handler)(void *), void *dev_id);
-extern void cpm_free_handler(int vec);
-
#define IMAP_ADDR   (get_immrbase())

#define CPM_PIN_INPUT 0
--
1.5.3.4






Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Hugh Dickins

On Sat, 3 Nov 2007, Hugh Dickins wrote:
> On Sat, 3 Nov 2007, Olivér Pintér wrote:
> > Q: Is it also needed for 2.6.22-stable?
> 
> I guess so: though SLUB wasn't on by default in 2.6.22; and it being
> only a slow leak rather than a corruption, I was less inclined to
> agitate about it for releases further back.
> 
> But your question makes me realize I never even looked at 2.6.23 or
> 2.6.22 hereabouts, just assumed they were the same; let alone patch
> or build or test them.  The patches reject as such because quite a
> lot has changed around (there was no struct kmem_cache_cpu in either).
> 
> A hurried look suggests that the leakage problem was there in both,
> but let's wait to hear Christoph's expert opinion.

Okay, here's a version for 2.6.23 and 2.6.22...
Christoph, you've now Acked the 2.6.24 one, thanks:
do you agree this patch below should go to -stable?

Slub has been quite leaky under load.  Taking mm_struct as an example, in
a loop of swapping kernel builds, after the first iteration slabinfo shows:
Name                   Objects Objsize    Space Slabs/Part/Cpu  O/S O %Fr %Ef Flg
mm_struct                   55     840    73.7K         18/7/4    4 0  38  62 A
but Objects and Partials steadily creep up - after the 340th iteration:
mm_struct                  110     840   188.4K        46/36/4    4 0  78  49 A
(example taken from 2.6.24-rc1: YMMV).

The culprit turns out to be __slab_alloc(), where it copes with the race
that another task has assigned the cpu slab while we were allocating one.
Don't rush off to load_freelist there: that assumes page->lockless_freelist
is empty, and will lose all its free slots when page->freelist is not empty.
Instead just do a local allocation from lockless_freelist when it has one.

Which fixes the leakage: Objects and Partials then remain stable.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---
Version of patch suitable and recommended for both 2.6.23-stable and
2.6.22-stable.  I've not run tests on either to observe the mounting
leakage; but a version of the patch below with a printk announcing
when non-empty freelist would overwrite non-empty lockless_freelist
does indeed show up in both (though notably less frequently than in
2.6.24-rc1 - something else seems to be making it more likely now).
But please wait for Christoph's Ack before committing to -stable.

 mm/slub.c |6 ++
 1 file changed, 6 insertions(+)

--- 2.6.23/mm/slub.c2007-10-09 21:31:38.0 +0100
+++ linux/mm/slub.c 2007-11-03 18:23:07.0 +
@@ -1517,6 +1517,12 @@ new_slab:
 */
discard_slab(s, page);
page = s->cpu_slab[cpu];
+   if (page->lockless_freelist) {
+   object = page->lockless_freelist;
+   page->lockless_freelist =
+   object[page->offset];
+   return object;
+   }
slab_lock(page);
goto load_freelist;
}
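
Editorial illustration of the leak described above. This is a minimal user-space sketch, not kernel code: the struct layout and the helper names (list_len, chain, alloc_patched, reachable_after_alloc) are assumptions made for the example. It models one slab page with a regular freelist and a lockless freelist, and shows that overwriting a non-empty lockless freelist in the load_freelist step drops its objects, while allocating from the lockless freelist first keeps them reachable.

```c
#include <stddef.h>

/* A free object stores the pointer to the next free object. */
struct object {
	struct object *next;
};

/* Simplified stand-in for the two freelists of a cpu slab page. */
struct slab_page {
	struct object *freelist;          /* regular freelist */
	struct object *lockless_freelist; /* per-cpu fast-path freelist */
};

size_t list_len(const struct object *o)
{
	size_t n = 0;

	for (; o; o = o->next)
		n++;
	return n;
}

/* Link objs[0..n) into a NULL-terminated chain and return its head. */
struct object *chain(struct object *objs, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++)
		objs[i].next = &objs[i + 1];
	if (n)
		objs[n - 1].next = NULL;
	return n ? objs : NULL;
}

/*
 * Models the load_freelist step: hand out the first object of the
 * regular freelist and move the rest to the lockless freelist,
 * overwriting whatever the lockless freelist held before.
 */
struct object *load_freelist(struct slab_page *page)
{
	struct object *object = page->freelist;

	if (!object)
		return NULL;
	page->lockless_freelist = object->next;
	page->freelist = NULL;
	return object;
}

/* Models the patched path: use the lockless freelist when it has an object. */
struct object *alloc_patched(struct slab_page *page)
{
	if (page->lockless_freelist) {
		struct object *object = page->lockless_freelist;

		page->lockless_freelist = object->next;
		return object;
	}
	return load_freelist(page);
}

/* Number of free objects still reachable after one allocation. */
size_t reachable_after_alloc(int patched)
{
	struct object objs[4];
	struct slab_page page;

	page.lockless_freelist = chain(&objs[0], 2); /* two free objects */
	page.freelist = chain(&objs[2], 2);          /* two more free objects */

	if (patched)
		alloc_patched(&page);
	else
		load_freelist(&page);

	return list_len(page.freelist) + list_len(page.lockless_freelist);
}
```

With four free objects and one allocation, three should remain reachable. In this model reachable_after_alloc(1) returns 3, while reachable_after_alloc(0) returns 1: the two objects that sat on the lockless freelist are no longer on any list.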

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Christoph Lameter

The fix against mm:


SLUB: Fix memory leak by not reusing cpu_slab

Fix the memory leak that may occur when we attempt to reuse a cpu_slab
that was allocated while we reenabled interrupts in order to be able to
grow a slab cache. The per cpu freelist may contain objects and in that
situation we may overwrite the per cpu freelist pointer, losing objects.
This only occurs if we find that the concurrently allocated slab fits
our allocation needs.

If we simply always deactivate the slab then the freelist will be properly
reintegrated and the memory leak will go away.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 mm/slub.c |   19 +--
 1 file changed, 1 insertion(+), 18 deletions(-)

Index: linux-2.6/mm/slub.c
===
--- linux-2.6.orig/mm/slub.c2007-11-03 11:49:20.0 -0700
+++ linux-2.6/mm/slub.c 2007-11-03 11:49:29.0 -0700
@@ -1529,25 +1529,8 @@ static noinline unsigned long get_new_sl
return 0;
 
*pc = c = get_cpu_slab(s, smp_processor_id());
-   if (c->page) {
-   /*
-* Someone else populated the cpu_slab while we
-* enabled interrupts, or we have gotten scheduled
-* on another cpu. The page may not be on the
-* requested node even if __GFP_THISNODE was
-* specified. So we need to recheck.
-*/
-   if (node_match(c, node)) {
-   /*
-* Current cpuslab is acceptable and we
-* want the current one since its cache hot
-*/
-   discard_slab(s, page);
-   return slab_lock(c->page);
-   }
-   /* New slab does not fit our expectations */
+   if (c->page)
flush_slab(s, c);
-   }
c->page = page;
return slab_lock(page) | FROZEN;
 }


Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> Which fixes the leakage: Objects and Partials then remain stable.

Well this code is just an optimization for a rare case. Your patch may not 
handle the debug situation the right way. We could just remove it.



SLUB: Fix memory leak by not reusing cpu_slab

Fix the memory leak that may occur when we attempt to reuse a cpu_slab 
that was allocated while we reenabled interrupts in order to be able to 
grow a slab cache. The per cpu freelist may contain objects and in that
situation we may overwrite the per cpu freelist pointer, losing objects.
This only occurs if we find that the concurrently allocated slab fits
our allocation needs.

If we simply always deactivate the slab then the freelist will be properly 
reintegrated and the memory leak will go away.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 mm/slub.c |   20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

Index: linux-2.6/mm/slub.c
===
--- linux-2.6.orig/mm/slub.c2007-11-03 11:41:58.0 -0700
+++ linux-2.6/mm/slub.c 2007-11-03 11:42:12.0 -0700
@@ -1511,26 +1511,8 @@ new_slab:
 
if (new) {
c = get_cpu_slab(s, smp_processor_id());
-   if (c->page) {
-   /*
-* Someone else populated the cpu_slab while we
-* enabled interrupts, or we have gotten scheduled
-* on another cpu. The page may not be on the
-* requested node even if __GFP_THISNODE was
-* specified. So we need to recheck.
-*/
-   if (node_match(c, node)) {
-   /*
-* Current cpuslab is acceptable and we
-* want the current one since its cache hot
-*/
-   discard_slab(s, new);
-   slab_lock(c->page);
-   goto load_freelist;
-   }
-   /* New slab does not fit our expectations */
+   if (c->page)
flush_slab(s, c);
-   }
slab_lock(new);
SetSlabFrozen(new);
c->page = new;

Re: 2.6.24-rc1-g74521c28: oops during boot [] :power_supply:power_supply_show_property+0x94/0x150

2007-11-03 Thread Thomas Bächler

Rafael J. Wysocki wrote:
> On Saturday, 3 November 2007 12:31, Thomas Bächler wrote:
>> I am trying to boot 2.6.24-rc1-g74521c28 from the linux-2.6 git tree.
>> During boot, I get a kernel oops when udevtrigger is running, thus most
>> devices are not created and the boot stalls.
>>
>> Fortunately, the services are still started (though I don't see anything
>> due to missing ttys) and syslog caught the oops. The logfile and config
>> are attached.
> 
> Can you please attach these files to the bugzilla entry at:
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=9299

Done.




Re: [RFC][PATCH 1/3] Suspend: Testing facility

2007-11-03 Thread Rafael J. Wysocki

Hi,

On Saturday, 3 November 2007 19:23, Pavel Machek wrote:
> Hi!
> 
> > Introduce /sys/power/pm_test_level attribute allowing one to test the
> > suspend core code.  Namely, writing a number (1-5) to this file causes
> > the suspend code to work in one of the test modes defined as follows:
> > 
> > 5 - test the freezing of processes
> > 4 - test the freezing of processes and suspending of devices
> > 3 - test the freezing of processes, suspending of devices and platform
> >     global control methods
> > 2 - test the freezing of processes, suspending of devices, platform
> >     global control methods and the disabling of nonboot CPUs
> > 1 - test the freezing of processes, suspending of devices, platform
> >     global control methods, the disabling of nonboot CPUs and suspending
> >     of platform/system devices
> > 
> > Then, if a suspend is started by normal means, the suspend core will
> > perform its normal operations up to the point indicated by the test
> > level.  Next, it will wait for 5 seconds and carry out the resume
> > operations needed to transition the system back to the fully functional
> > state.
> > 
> > Writing 0 to /sys/power/pm_test_level turns the testing off.  The
> > current test level may be read from /sys/power/pm_test_level .
> > 
> > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> 
> ACK on whole series, but... should we also remove the old debugging
> infrastructure? (Or did I just miss it?)

Well, I don't want to remove it just yet, but it's going to be deprecated.

Also, after Johannes' remark I thought it wouldn't be a good idea to export the
bare test levels to the user.  For this reason I reworked the patches to use
strings instead of numbers for the test setting (the attribute is now called
"pm_test" and is used in a similar way to "disk").

I'll post the reworked series as soon as I update the changelogs and docs.

Greetings,
Rafael
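
Editorial illustration of the numeric test levels quoted above. This is a small sketch, not taken from the patch: the table pm_stages and the helpers pm_test_stage_count and pm_test_last_stage are hypothetical names. It only encodes the rule from the description that a lower level runs more suspend stages before the resume path is taken.

```c
#include <stddef.h>

/* Suspend stages in the order the quoted description lists them. */
static const char *const pm_stages[] = {
	"freeze processes",
	"suspend devices",
	"platform global control methods",
	"disable nonboot CPUs",
	"suspend platform/system devices",
};

#define PM_NR_STAGES (sizeof(pm_stages) / sizeof(pm_stages[0]))

/*
 * Number of stages exercised at a given test level.
 * Level 5 runs only the first stage, level 1 runs all five.
 * Returns 0 for level 0 (testing off) and for out-of-range values.
 */
size_t pm_test_stage_count(int level)
{
	if (level < 1 || level > (int)PM_NR_STAGES)
		return 0;
	return PM_NR_STAGES - (size_t)level + 1;
}

/* Name of the last stage reached at a test level, or NULL if none. */
const char *pm_test_last_stage(int level)
{
	size_t n = pm_test_stage_count(level);

	return n ? pm_stages[n - 1] : NULL;
}
```

For example, pm_test_last_stage(3) yields "platform global control methods", matching the description of level 3. The reworked series mentioned in the reply replaces these numbers with strings, so the numeric mapping applies only to the version quoted here.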

Re: iwl3945 in 2.6.24-rc1 dies under load

2007-11-03 Thread Pavel Machek

Hi!

> > iwl3945: 00 0x007b  403761122
> > iwl3945: 02 0x007d  403761129
> > iwl3945: I iwl_irq_handle_error Restarting adapter due to uCode error.
> > iwl3945: Error Reply type 0x0005 cmd REPLY_TX (0x1C) seq 0x0203 ser 0x004B
> > iwl3945: Can't stop Rx DMA.
> > wlan0: No ProbeResp from current AP 00:11:2f:0e:95:a0 - assume out of range
> 
> This firmware error dump is useful. Thanks!

Good.

BTW... the wireless light does not seem to work on x60. It used to
work with some -mm versions...

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

PROBLEM: noauto option prevents codepages from being processed properly by smbfs

2007-11-03 Thread Александр Дунаев

Hi,
Below is my bug report, as suggested in REPORTING-BUGS:

1. noauto option prevents codepages from being processed properly by smbfs

2. I had the following line in my /etc/fstab :
//host/share /path   smbfs   noauto,iocharset=koi8-r,codepage=cp866   0   0
Mounting such a share caused the following warning to be written to
the system log:
kernel: smbfs: Unrecognized mount option noauto

The warning itself is harmless, but the painful side effect was that
other options, e.g. iocharset and codepage, were ignored.  I originally
had several mounts to remote samba shares with Russian filenames that
include Cyrillic characters, and they displayed fine until I added
noauto to the options (the boot was slow when the remote machine was
down).  After I added noauto, the names became unreadable
(charset problems).

The problem is probably that mount passes everything to smbfs
(according to mount(8)), but noauto is not recognized by smbfs,
and for some reason this option prevents all other options from being
processed properly.

3. modules smbfs

4. Linux version 2.6.21 (2.6.21-5) ([EMAIL PROTECTED]) (gcc version 4.1.3
20070629 (prerelease) (Debian 4.1.2-13)) #1 SMP Wed Jul 4 23:26:38
NOVST 2007

(Other items mentioned in REPORTING-BUGS are omitted as not necessary.)

The proposed patch is attached.

Please keep me in CC if possible.

Thanks,
-- 
Alexander Dunaev


smbfs-noauto-patch
Description: Binary data
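
Editorial illustration of one plausible failure mode behind this report. This is a sketch under stated assumptions: it is neither the smbfs option parser nor the attached patch, and struct mount_opts and parse_opts are hypothetical names. It shows how a parser that stops at the first unrecognized option would leave iocharset and codepage unset when noauto comes first, and how skipping unknown options avoids that.

```c
#include <string.h>

/* Hypothetical holder for the two options named in the report. */
struct mount_opts {
	char iocharset[32];
	char codepage[32];
};

/*
 * Parse a comma-separated option string into opts.
 * If stop_on_unknown is nonzero, parsing ends at the first option that
 * is not recognized (the assumed faulty behaviour); otherwise unknown
 * options are skipped. Returns the number of options applied.
 */
int parse_opts(const char *data, struct mount_opts *opts, int stop_on_unknown)
{
	char buf[256];
	int applied = 0;

	memset(opts, 0, sizeof(*opts));
	strncpy(buf, data, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (char *tok = strtok(buf, ","); tok; tok = strtok(NULL, ",")) {
		if (strncmp(tok, "iocharset=", 10) == 0) {
			/* opts was zeroed, so the copy stays NUL-terminated */
			strncpy(opts->iocharset, tok + 10,
				sizeof(opts->iocharset) - 1);
			applied++;
		} else if (strncmp(tok, "codepage=", 9) == 0) {
			strncpy(opts->codepage, tok + 9,
				sizeof(opts->codepage) - 1);
			applied++;
		} else if (stop_on_unknown) {
			break; /* everything after this option is ignored */
		}
	}
	return applied;
}
```

With the option string from the fstab line above, parse_opts(..., 1) applies nothing because noauto is seen first, whereas parse_opts(..., 0) applies both iocharset and codepage. Whether the real code fails this way is not established by the report.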

Re: [PATCH 1/2] slub: fix leakage

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> The culprit turns out to be __slab_alloc(), where it copes with the race
> that another task has assigned the cpu slab while we were allocating one.
> Don't rush off to load_freelist there: that assumes c->freelist is empty,
> and will lose all of its free slots when c->page->freelist is not empty.
> Instead just do a local allocation from c->freelist when it has one.

Hmmm.. Right. This will require some fixes to the optimizations in mm.

Acked-by: Christoph Lameter <[EMAIL PROTECTED]>

Re: [PATCH] Smackv10: Smack rules grammar + their stateful parser

2007-11-03 Thread Kyle Moffett


On Nov 03, 2007, at 12:43:06, Ahmed S. Darwish wrote:
> Bashv3 builtin "echo" behaves very strangely on -EINVAL. It sends
> all the buffers that caused -EINVAL again in subsequent echo
> invocations.
>
> i.e.
> echo "Invalid Rule" > /smack/load  # -EINVAL returned
> echo "Valid Rule" > /smack/load
>
> In the second iteration, echo sends the first invalid buffer again,
> then sends the new one. This causes an "Invalid Rule\nValid Rule"
> buffer to be sent to write().
>
> IMHO, this is a bug in builtin echo. The external /bin/echo doesn't
> cause such strange behaviour.


Actually, what causes problems here is something between a bug and a
feature in libc's buffering.  Basically the -EINVAL error causes libc
to leave its data in the file-output buffer despite the file being
closed and reopened.  Since a standalone echo just exits, that buffer
is discarded, but for the bash builtin it hangs around in the buffer
for a while and ends up getting prepended to the following echo
statement.  There are actually multiple ways to make this fail; this is
just the simplest.


Cheers,
Kyle Moffett
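
Editorial illustration of the buffering behaviour described above. This is a toy user-space model, not libc or bash code: struct stream, struct sink, sink_write, echo_keep_on_error and echo_fresh are all hypothetical names, and the sink simply rejects any buffer that begins with "Invalid". It shows how output that stays in a long-lived buffer after a failed write ends up in front of the next line, while a fresh buffer per invocation (as with a standalone echo process) does not have that problem.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Long-lived output buffer, NUL-terminated, as a shell builtin would keep. */
struct stream {
	char pending[256];
};

/* Toy destination that records the last buffer handed to it. */
struct sink {
	char last_attempt[256];
};

/* Rejects buffers that start with "Invalid", like a failing rule parser. */
int sink_write(struct sink *s, const char *buf)
{
	snprintf(s->last_attempt, sizeof(s->last_attempt), "%s", buf);
	return strncmp(buf, "Invalid", 7) == 0 ? -EINVAL : 0;
}

/*
 * Append a line to the stream and try to flush it. The buffer is only
 * emptied on success, so data from a failed write stays queued.
 */
int echo_keep_on_error(struct stream *st, struct sink *s, const char *line)
{
	size_t used = strlen(st->pending);
	int ret;

	snprintf(st->pending + used, sizeof(st->pending) - used, "%s\n", line);
	ret = sink_write(s, st->pending);
	if (ret == 0)
		st->pending[0] = '\0';
	return ret;
}

/* A separate process starts with an empty buffer on every invocation. */
int echo_fresh(struct sink *s, const char *line)
{
	struct stream st = { "" };

	return echo_keep_on_error(&st, s, line);
}
```

Calling echo_keep_on_error with "Invalid Rule" and then "Valid Rule" on the same stream makes the second write carry "Invalid Rule\nValid Rule\n", which is the combined buffer from the original report. The same two lines sent through echo_fresh reach the sink separately.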


Re: [PATCH 2/2] slub: fix Objects count

2007-11-03 Thread Christoph Lameter

On Sat, 3 Nov 2007, Hugh Dickins wrote:

> The count of active Objects shown by Slub's slabinfo is too approximate,
> because each cpu slab is counted as all in use, even when lots are free.
> That makes tracing leaks harder than it need be.

True but that is the way it is for performance reasons.

> Add a free count into kmem_cache_cpu (which doesn't enlarge it on 64-bit),
> to keep that count in the hot and dirty per-cpu cacheline.

Adds too much overhead to the fast paths and will make the current
optimizations in mm impossible.

