date:20080108

Re: The ext3 way of journalling

2008-01-08 Thread Valdis . Kletnieks

On Tue, 08 Jan 2008 22:21:02 EST, Kyle Moffett said:

> lvcreate -s -n "${VOLUME}-snap" "${VG}/${VOLUME}"


> Basically you can fsck the offline snapshot in the background.

Something the lvcreate manpage is specifically not clear about is:

Does this create a snapshot of the *disk* at that moment, or does it capture
"disk plus still-to-be-written blocks in the cache"? (Phrased differently, does
it Do The Right Thing regarding "blocks queued before lvcreate" and "blocks
queued for write after lvcreate")?

If the snapshot doesn't capture the blocks queued but still unwritten by
kjournald and similar, then you're still hitting the same old problems that
you always get when you fsck an "active disk".



Re: RE : [tipc-discussion] /net/tipc/port.c: Use tipc_port_unlock

2008-01-08 Thread David Miller

From: Jon Paul Maloy <[EMAIL PROTECTED]>
Date: Tue, 8 Jan 2008 10:34:58 -0500 (EST)

> I have no objections.

I've applied this patch, thanks everyone.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFD] Incremental fsck

2008-01-08 Thread Valerie Henson

On Jan 8, 2008 8:40 PM, Al Boldi <[EMAIL PROTECTED]> wrote:
> Rik van Riel wrote:
> > Al Boldi <[EMAIL PROTECTED]> wrote:
> > > Has there been some thought about an incremental fsck?
> > >
> > > You know, somehow fencing a sub-dir to do an online fsck?
> >
> > Search for "chunkfs"
>
> Sure, and there is TileFS too.
>
> But why wouldn't it be possible to do this on the current fs infrastructure,
> using just a smart fsck, working incrementally on some sub-dir?

Several data structures are file system wide and require finding every
allocated file and block to check that they are correct.  In
particular, block and inode bitmaps can't be checked per subdirectory.

http://infohost.nmt.edu/~val/review/chunkfs.pdf

-VAL
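
To illustrate the point about file-system-wide structures, here is a toy sketch in C. It is not how e2fsck works; the 16-block "disk", `struct toy_file`, `rebuild_bitmap` and `unaccounted_blocks` are all assumptions made up for this example. It shows that a bitmap rebuilt from only some of the files cannot distinguish leaked blocks from blocks owned by files that were not visited.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy file: a list of block numbers, each below 16 (hypothetical layout). */
struct toy_file {
    const unsigned *blocks;
    size_t nblocks;
};

/* Rebuild an allocation bitmap from the blocks the given files claim. */
static uint16_t rebuild_bitmap(const struct toy_file *files, size_t nfiles)
{
    uint16_t map = 0;

    for (size_t i = 0; i < nfiles; i++)
        for (size_t j = 0; j < files[i].nblocks; j++)
            map |= (uint16_t)(1u << files[i].blocks[j]);
    return map;
}

/* Count bits set in the on-disk bitmap that the walk did not account for. */
static unsigned unaccounted_blocks(uint16_t on_disk, uint16_t rebuilt)
{
    unsigned n = 0;

    for (uint16_t diff = on_disk & (uint16_t)~rebuilt; diff; diff >>= 1)
        n += diff & 1u;
    return n;
}
```

If only the files under one subdirectory are passed to `rebuild_bitmap`, every block owned by a file elsewhere shows up as unaccounted, so the checker cannot tell a leak from a file it never looked at.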

Re: [PATCH] CONNECTOR: don't touch queue dev after decrement of ref count

2008-01-08 Thread David Miller

From: Li Zefan <[EMAIL PROTECTED]>
Date: Wed, 09 Jan 2008 13:44:07 +0800

> 
> cn_queue_free_callback() will touch 'dev'(i.e. cbq->pdev),
> so it should be called before atomic_dec(&dev->refcnt).
> 
> Signed-off-by: Li Zefan <[EMAIL PROTECTED]>

Excellent catch, patch applied.

Thanks.

Re: [PATCH] Kprobes: Add kprobes smoke tests that run on boot

2008-01-08 Thread David Miller

From: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
Date: Tue, 8 Jan 2008 12:03:34 +0530

> Here is a quick and naive smoke test for kprobes.

Thanks very much for writing this.

It will come in handy for me when I work on sparc64
kretprobe support.

Re: 2.6.24-rc7, intel audio: alsa doesn't say a beep

2008-01-08 Thread Takashi Iwai

At Wed, 09 Jan 2008 07:03:18 +0100,
Harald Dunkel wrote:
> 
> Takashi Iwai wrote:
> > 
> > Did you enable CONFIG_SND_HDA_POWER_SAVE feature?  And which hardware
> > (laptop, product name, whatever) exactly?
> > 
> 
> CONFIG_SND_HDA_POWER_SAVE is not set.

That's fine.

> Hardware is a Dell XPS M1330. CPU is Core2 Duo T7500, 2.20GHz,
> 2 GByte RAM. lspci:
> 
> 00:00.0 Host bridge: Intel Corporation Mobile Memory Controller Hub (rev 0c)
> 00:01.0 PCI bridge: Intel Corporation Mobile PCI Express Root Port (rev 0c)
> 00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
> 00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
> 00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
> 00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
> 00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
> 00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 02)
> 00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 4 (rev 02)
> 00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
> 00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
> 00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
> 00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
> 00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
> 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f2)
> 00:1f.0 ISA bridge: Intel Corporation Mobile LPC Interface Controller (rev 02)
> 00:1f.1 IDE interface: Intel Corporation Mobile IDE Controller (rev 02)
> 00:1f.2 SATA controller: Intel Corporation Mobile SATA AHCI Controller (rev 02)
> 00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
> 01:00.0 VGA compatible controller: nVidia Corporation Unknown device 0427 (rev a1)
> 03:01.0 FireWire (IEEE 1394): Ricoh Co Ltd Unknown device 0832 (rev 05)
> 03:01.1 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22)
> 03:01.2 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 12)
> 03:01.3 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12)
> 09:00.0 Ethernet controller: Broadcom Corporation Unknown device 1713 (rev 02)
> 0c:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network Connection (rev 02)
> 
> > Also, please show the contents of /proc/asound/card0/codec#* files.
> > Do you see difference in these files between with and without the
> > patch?
> > 
> 
> See below. There is no difference between both.

Thanks.  Then a possible cause is a register that doesn't appear in
this proc output, such as GPIO.
Could you try the patch below on rc7, together with the latency patch
you reverted?


Takashi

diff -r d773ad622068 sound/pci/hda/patch_sigmatel.c
--- a/sound/pci/hda/patch_sigmatel.c  Tue Jan 08 18:13:27 2008 +0100
+++ b/sound/pci/hda/patch_sigmatel.c  Wed Jan 09 08:29:49 2008 +0100
@@ -1624,12 +1624,13 @@ static void stac92xx_enable_gpio_mask(st
                                   AC_VERB_SET_GPIO_DIRECTION, spec->gpio_mask);
        /* Configure GPIOx as CMOS */
        snd_hda_codec_write_cache(codec, codec->afg, 0, 0x7e7, 0x);
+       /* Enable GPIOx */
+       snd_hda_codec_write_cache(codec, codec->afg, 0,
+                                 AC_VERB_SET_GPIO_MASK, spec->gpio_mask);
+       msleep(1);
        /* Assert GPIOx */
        snd_hda_codec_write_cache(codec, codec->afg, 0,
                                  AC_VERB_SET_GPIO_DATA, spec->gpio_data);
-       /* Enable GPIOx */
-       snd_hda_codec_write_cache(codec, codec->afg, 0,
-                                 AC_VERB_SET_GPIO_MASK, spec->gpio_mask);
 }
 
 /*

Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread David Miller

From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Wed, 9 Jan 2008 08:19:45 +0100

> On Wed, Jan 09, 2008 at 03:55:20AM +0000, Dave Airlie wrote:
> > now because Linus said send him a patch to revert regressions rather than 
> > just complain,
> 
> this is not a regression by any definition.  You were abusing exported
> symbols for out of tree junk, so you'll lose.

And furthermore, they don't even need it, use a kprobe.

Re: [patch] split MMC_CAP_4_BIT_DATA

2008-01-08 Thread Pierre Ossman

On Wed, 9 Jan 2008 11:21:40 +0800
"Cai, Cliff" <[EMAIL PROTECTED]> wrote:

> 
> Hi all,
> 
> I'd like to say something about this issue.
> Currently, the Blackfin on-chip SD host supports ONLY 1-bit MMC, while it
> supports 1-bit/4-bit SD/SDIO.
> We want our driver to support both 1-bit MMC and 4-bit SD/SDIO, but
> the current MMC driver framework
> only allows us to set one kind of bus width, either 1-bit or 4-bit. So in
> order to handle our case, we need a more flexible mechanism
> to inform the upper common driver of our situation.
> 

That's just reiterating what's already been said. My claim is that 4-bit is 
4-bit, regardless of whether it's MMC or SD. So if you want this patch to go in 
you need to explain why there is a difference for the blackfin controller.

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org

Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread David Miller

From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Wed, 9 Jan 2008 08:17:27 +0100

> NACK.   If you want to do it you'll need a much better reason and an
> in-tree user.  And if you want to redo it it should be available for
> all platforms with a consistent API.

I majorly NACK this as well, we don't want to bring this thing
back especially for specialized debugging hacks.

You can set a kprobe on the x86 fault handler to do things like
mmiotrace.

Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Christoph Hellwig

On Wed, Jan 09, 2008 at 03:55:20AM +0000, Dave Airlie wrote:
> now because Linus said send him a patch to revert regressions rather than 
> just complain,

this is not a regression by any definition.  You were abusing exported
symbols for out of tree junk, so you'll lose.


Re: [patch] split MMC_CAP_4_BIT_DATA

2008-01-08 Thread Pierre Ossman

On Tue, 8 Jan 2008 16:44:08 -0500
"Mike Frysinger" <[EMAIL PROTECTED]> wrote:

> On Jan 8, 2008 3:49 PM, Pierre Ossman <[EMAIL PROTECTED]> wrote:
> > So, again, if you feel that there is a hardware difference between 4-bit 
> > MMC and 4-bit SD then please elaborate as it is my understanding that they 
> > are identical.
> 
> you may be 100% correct, i have no idea, i'm not really familiar with
> MMC/SD/SDIO at all.

The patch adds complexity to the system. So until you can convince me that 
complexity is actually needed, I'm afraid the answer is NAK.

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org

Re: [PATCH] Change paride driver to use unlocked_ioctl instead of ioctl

2008-01-08 Thread Nikanth Karthikesan

Sorry, I missed the function prototype and includes earlier.
Here is the corrected patch. Build tested.

The ioctl handler is called with the BKL held. Register an
unlocked_ioctl handler instead and take the BKL inside the handler.

Signed-off-by: Nikanth Karthikesan <[EMAIL PROTECTED]>

---

diff --git a/drivers/block/paride/pt.c b/drivers/block/paride/pt.c
index b91accf..860b946 100644
--- a/drivers/block/paride/pt.c
+++ b/drivers/block/paride/pt.c
@@ -146,6 +146,7 @@ static int (*drives[4])[6] = {&drive0, &drive1, &drive2, &drive3};
 #include 
 #include 
 #include <linux/sched.h>        /* current, TASK_*, schedule_timeout() */
+#include <linux/smp_lock.h>

 #include 

@@ -189,8 +190,8 @@ module_param_array(drive3, int, NULL, 0);
 #define ATAPI_LOG_SENSE         0x4d

 static int pt_open(struct inode *inode, struct file *file);
-static int pt_ioctl(struct inode *inode, struct file *file,
-                    unsigned int cmd, unsigned long arg);
+static long pt_ioctl(struct file *file, unsigned int cmd,
+                     unsigned long arg);
 static int pt_release(struct inode *inode, struct file *file);
 static ssize_t pt_read(struct file *filp, char __user *buf,
                        size_t count, loff_t * ppos);
@@ -236,7 +237,7 @@ static const struct file_operations pt_fops = {
        .owner = THIS_MODULE,
        .read = pt_read,
        .write = pt_write,
-       .ioctl = pt_ioctl,
+       .unlocked_ioctl = pt_ioctl,
        .open = pt_open,
        .release = pt_release,
 };
@@ -685,36 +686,44 @@ out:
        return err;
 }

-static int pt_ioctl(struct inode *inode, struct file *file,
-                    unsigned int cmd, unsigned long arg)
+static long pt_ioctl(struct file *file, unsigned int cmd,
+                     unsigned long arg)
 {
        struct pt_unit *tape = file->private_data;
        struct mtop __user *p = (void __user *)arg;
        struct mtop mtop;

+       lock_kernel();
+
        switch (cmd) {
        case MTIOCTOP:
-               if (copy_from_user(&mtop, p, sizeof(struct mtop)))
+               if (copy_from_user(&mtop, p, sizeof(struct mtop))) {
+                       unlock_kernel();
                        return -EFAULT;
+               }

                switch (mtop.mt_op) {

                case MTREW:
                        pt_rewind(tape);
+                       unlock_kernel();
                        return 0;

                case MTWEOF:
                        pt_write_fm(tape);
+                       unlock_kernel();
                        return 0;

                default:
                        printk("%s: Unimplemented mt_op %d\n", tape->name,
                               mtop.mt_op);
+                       unlock_kernel();
                        return -EINVAL;
                }

        default:
                printk("%s: Unimplemented ioctl 0x%x\n", tape->name, cmd);
+               unlock_kernel();
                return -EINVAL;

        }
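
The patch above releases the BKL before each return. One alternative shape, offered only as a sketch and not as what the patch does, funnels every path through a single unlock so that a later change cannot forget one. The code below is a user-space model: `lock_stub`/`unlock_stub` merely count nesting in place of lock_kernel()/unlock_kernel(), and the `SKETCH_*` codes and `tape_ioctl_sketch` are hypothetical names invented for this example.

```c
#include <errno.h>

/* Stand-ins for lock_kernel()/unlock_kernel(); they only track nesting. */
static int lock_depth;
static void lock_stub(void)   { lock_depth++; }
static void unlock_stub(void) { lock_depth--; }

/* Hypothetical command and operation codes for this sketch. */
enum { SKETCH_CMD_TAPE_OP = 1 };
enum { SKETCH_OP_REWIND = 1, SKETCH_OP_WRITE_EOF = 2 };

static long tape_ioctl_sketch(unsigned int cmd, int op)
{
    long ret;

    lock_stub();
    switch (cmd) {
    case SKETCH_CMD_TAPE_OP:
        switch (op) {
        case SKETCH_OP_REWIND:
        case SKETCH_OP_WRITE_EOF:
            ret = 0;        /* a real driver would do the tape operation here */
            break;
        default:
            ret = -EINVAL;  /* unknown tape operation */
            break;
        }
        break;
    default:
        ret = -EINVAL;      /* unknown ioctl command */
        break;
    }
    unlock_stub();          /* single exit: the lock is always released */
    return ret;
}
```

After any call, `lock_depth` is back at zero regardless of which branch was taken.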

Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Christoph Hellwig

On Wed, Jan 09, 2008 at 02:34:46AM +0000, Dave Airlie wrote:
> 
> [This an initial RFC but I'd like to have this patch in before 2.6.24 goes 
> final as it really breaks this useful feature]
> 
> mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> used this notifier interface and is planned on being pushed upstream.
> 
> Having users able to just use the tracer module without having to rebuild 
> their kernel to add in a page fault handler hack means we get a lot 
> greater coverage for reverse engineering efforts.
> 
> Signed-off-by: David Airlie <[EMAIL PROTECTED]>
> 
> This reverts commit 74a0b5762713a26496db72eac34fbbed46f20fce.
> Conflicts:

NACK.   If you want to do it you'll need a much better reason and an
in-tree user.  And if you want to redo it it should be available for
all platforms with a consistent API.


Re: [PATCH] system timer: fix crash in <100Hz system timer

2008-01-08 Thread Andrew Morton

On Sat, 5 Jan 2008 16:16:55 -0600 David Fries <[EMAIL PROTECTED]> wrote:

> --- a/kernel/time.c
> +++ b/kernel/time.c
> @@ -565,7 +565,11 @@ EXPORT_SYMBOL(jiffies_to_timeval);
>  clock_t jiffies_to_clock_t(long x)
>  {
>  #if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
> + #if HZ < USER_HZ
> + return x * (USER_HZ / HZ);
> + #else
>   return x / (HZ / USER_HZ);
> + #endif
>  #else
>   u64 tmp = (u64)x * TICK_NSEC;
>   do_div(tmp, (NSEC_PER_SEC / USER_HZ));
> @@ -598,7 +602,12 @@ EXPORT_SYMBOL(clock_t_to_jiffies);
>  u64 jiffies_64_to_clock_t(u64 x)
>  {
>  #if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
> - do_div(x, HZ / USER_HZ);
> + #if HZ < USER_HZ
> + x *= USER_HZ;
> + do_div(x, HZ);
> + #else
> + do_div(x, HZ / USER_HZ);
> + #endif
>  #else

Somewhat off-topic:

I guess HZ=USER_HZ is a not-uncommon case, and it's pretty silly calling
do_div(x, 1) all the time.  How about we optimise that case?

Perhaps there are other places...


--- a/kernel/time.c~speed-up-jiffies-conversion-functions-if-hz==user_hz
+++ a/kernel/time.c
@@ -618,8 +618,10 @@ u64 jiffies_64_to_clock_t(u64 x)
 # if HZ < USER_HZ
x *= USER_HZ;
do_div(x, HZ);
-# else
+# elif HZ > USER_HZ
do_div(x, HZ / USER_HZ);
+# else
+   /* Nothing to do */
 # endif
 #else
/*
_
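
As an illustration of the three-way split in the patch above, here is a minimal user-space sketch. It takes the tick rates as run-time parameters, whereas the kernel selects the branch with the preprocessor, and the names `ticks_to_clock_t`, `hz` and `user_hz` are assumptions made for this example. It only models the case where one rate divides the other evenly, which is the `#if` branch shown in the diff.

```c
#include <stdint.h>

/* Convert a tick count at rate hz into a count at rate user_hz. */
static uint64_t ticks_to_clock_t(uint64_t x, unsigned int hz,
                                 unsigned int user_hz)
{
    if (hz < user_hz) {
        /* hz / user_hz would be 0 in integer arithmetic, so scale up first. */
        x *= user_hz;
        x /= hz;
    } else if (hz > user_hz) {
        x /= hz / user_hz;
    }
    /* hz == user_hz: nothing to do, no division at all. */
    return x;
}
```

With `hz = 50` and `user_hz = 100`, the unpatched form `x / (hz / user_hz)` would divide by zero, which is the crash the original patch addresses; with equal rates the value passes through untouched.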


Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-08 Thread Dave Young

On Jan 9, 2008 2:37 PM, Dave Young <[EMAIL PROTECTED]> wrote:
>
> On Jan 9, 2008 2:13 PM, Dave Young <[EMAIL PROTECTED]> wrote:
> >
> > On Wed, Jan 09, 2008 at 09:32:48AM +0800, Dave Young wrote:
> > > On Jan 9, 2008 6:48 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > > > On Tue, Jan 08, 2008 at 03:05:10PM +0800, Dave Young wrote:
> > > > > On Jan 8, 2008 1:20 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > > > > > On Mon, Jan 07, 2008 at 06:13:37PM +0100, Stefan Richter wrote:
> > > > > > > It's already in the driver core to the most part.  It remains to 
> > > > > > > be seen
> > > > > > > what is less complicated in the end:  Transparent mutex-protected 
> > > > > > > list
> > > > > > > accesses provided by driver core (requires the iterator), or all 
> > > > > > > the
> > > > > > > necessary locking done by the drivers themselves (requires some 
> > > > > > > more
> > > > > > > lock-taking but perhaps fewer lock instances overall in the 
> > > > > > > drivers, and
> > > > > > > respective redefinitions and documentation of the driver core 
> > > > > > > API).
> > > > > >
> > > > > > I favor changing the driver core api and doing this kind of thing 
> > > > > > there.
> > > > > > It keeps the drivers simpler and should hopefully make their lives
> > > > > > easier.
> > > > >
> > > > > What about this?
> > > > >
> > > > > > #define class_for_each_dev(pos, head, member) \
> > > > > >     for (mutex_lock(&(container_of(head, struct class, devices))->mutex), \
> > > > > >          pos = list_entry((head)->next, typeof(*pos), member); \
> > > > > >          prefetch(pos->member.next), &pos->member != (head) ? 1 : \
> > > > > >          (mutex_unlock(&(container_of(head, struct class, devices))->mutex), 0); \
> > > > > >          pos = list_entry(pos->member.next, typeof(*pos), member))
> > > >
> > > I'm wrong, it's same as before indeed.
> > >
> > > > Eeek, just make the thing a function please, where you pass the iterator
> > > > function in, like the driver core has (driver_for_each_device)
> > >
> > > Ok, so need a new member of knode_class, I will update the patch later.
> > > Thanks.
> >
> > Withdraw my post, sorry :)
> >
> > For now the mutex patch, I will only use the mutex to lock the devices list 
> > and write an iterator function.
> > Most of the iterating is for finding some device in the list, so maybe need 
> > a match function just like drivers do?
> >
>
> Drop one more mail address of David Brownell in cc list.
> Sorry for this, david
>
The Gmail web client is driving me crazy.

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-08 Thread Dave Young

On Jan 9, 2008 2:13 PM, Dave Young <[EMAIL PROTECTED]> wrote:
>
> On Wed, Jan 09, 2008 at 09:32:48AM +0800, Dave Young wrote:
> > On Jan 9, 2008 6:48 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > > On Tue, Jan 08, 2008 at 03:05:10PM +0800, Dave Young wrote:
> > > > On Jan 8, 2008 1:20 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > > > > On Mon, Jan 07, 2008 at 06:13:37PM +0100, Stefan Richter wrote:
> > > > > > It's already in the driver core to the most part.  It remains to be 
> > > > > > seen
> > > > > > what is less complicated in the end:  Transparent mutex-protected 
> > > > > > list
> > > > > > accesses provided by driver core (requires the iterator), or all the
> > > > > > necessary locking done by the drivers themselves (requires some more
> > > > > > lock-taking but perhaps fewer lock instances overall in the 
> > > > > > drivers, and
> > > > > > respective redefinitions and documentation of the driver core API).
> > > > >
> > > > > I favor changing the driver core api and doing this kind of thing 
> > > > > there.
> > > > > It keeps the drivers simpler and should hopefully make their lives
> > > > > easier.
> > > >
> > > > What about this?
> > > >
> > > > > #define class_for_each_dev(pos, head, member) \
> > > > >     for (mutex_lock(&(container_of(head, struct class, devices))->mutex), \
> > > > >          pos = list_entry((head)->next, typeof(*pos), member); \
> > > > >          prefetch(pos->member.next), &pos->member != (head) ? 1 : \
> > > > >          (mutex_unlock(&(container_of(head, struct class, devices))->mutex), 0); \
> > > > >          pos = list_entry(pos->member.next, typeof(*pos), member))
> > >
> > I'm wrong, it's same as before indeed.
> >
> > > Eeek, just make the thing a function please, where you pass the iterator
> > > function in, like the driver core has (driver_for_each_device)
> >
> > Ok, so need a new member of knode_class, I will update the patch later.
> > Thanks.
>
> Withdraw my post, sorry :)
>
> For now the mutex patch, I will only use the mutex to lock the devices list 
> and write an iterator function.
> Most of the iterating is for finding some device in the list, so maybe need a 
> match function just like drivers do?
>

I dropped one more of David Brownell's mail addresses from the Cc list.
Sorry about this, David.
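
For readers following the thread, the function-based iterator Greg asks for can be sketched in user space as below. This is only an illustration under stated assumptions: a pthread mutex stands in for the kernel mutex, and `struct dev_node`, `struct dev_class`, `class_for_each_dev_sketch`, `struct match_ctx` and `match_id` are hypothetical names, not the driver core API. The point of the shape is that the lock is taken and released inside one function, so a callback that stops early cannot leave the mutex held, which a `break` inside the macro above would do.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device list node and class, for illustration only. */
struct dev_node {
    int id;
    struct dev_node *next;
};

struct dev_class {
    pthread_mutex_t lock;       /* stands in for the class mutex */
    struct dev_node *devices;
};

/*
 * Call fn on each device with the lock held. Iteration stops as soon
 * as fn returns non-zero, and that value is returned to the caller.
 */
static int class_for_each_dev_sketch(struct dev_class *cls, void *data,
                                     int (*fn)(struct dev_node *, void *))
{
    int ret = 0;

    pthread_mutex_lock(&cls->lock);
    for (struct dev_node *d = cls->devices; d; d = d->next) {
        ret = fn(d, data);
        if (ret)
            break;
    }
    pthread_mutex_unlock(&cls->lock);   /* released on every path */
    return ret;
}

/* A match callback, in the spirit of the "match function" idea above. */
struct match_ctx {
    int wanted;
    struct dev_node *found;
};

static int match_id(struct dev_node *d, void *data)
{
    struct match_ctx *ctx = data;

    if (d->id != ctx->wanted)
        return 0;
    ctx->found = d;
    return 1;
}
```

A caller fills in a `struct match_ctx`, passes `match_id` as the callback, and reads `found` afterwards; a real implementation would also need to take a reference on the device before dropping the lock.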

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-08 Thread Tetsuo Handa

Hello.

[EMAIL PROTECTED] wrote:
> Good summary - probably should add that to the patch, drop it into
> Documentation/syaoran-config.txt or similar...
I see.

> Modification while reading *is* an issue, but can probably be worked around
> with some clever locking.  The race condition I was thinking of was if you
> had the mount and the policy load be 2 separate events, you could see:
> 
> (a) issue mount request
> (b) do something malicious in /dev while..
> (c) load the policy that would have prevented (b).
> 
> This is partly why SELinux has init load the policy *very* early on, before
> any other userspace have had a chance to run and do things that would have
> been prevented by policy.
So you suggest loading the policy before the mount() request, so that
this filesystem can prevent attackers from doing something malicious
by minimizing the latency (i.e. implementing it as a non-blocking operation)
between the userland process's call to mount() and the nodes becoming visible
to userland processes.

I didn't take such cases into account.
My assumed usage of this filesystem is to run a script like

 #!/bin/sh
 mount -t syaoran -o accept=/etc/ccs/syaoran.conf none /dev
 exec /sbin/init "$@"

by passing "init=/path/to/this/script" on the kernel command line
so that /sbin/init can create /dev/initctl on this filesystem.
If you mount this filesystem after /sbin/init starts,
it will shadow the /dev/initctl opened by /sbin/init.

> Which basically ends up meaning that anybody who can trick the mount into
> happening can reset the permitted list and create (for example) a mode 666
> entry for a hard drive, and go scribbling around at will.  Note that you
> don't seem to do any sanity checking on the path (for instance, that each
> component is owned by root, and not world-writable) - so anybody who finds
> a way to get the mount to happen can supply their own list in 
> /home/joeuser/blat
> or /tmp/surprise-mount-list  or wherever.
I assume that being able to reach this location means the caller of mount() is 
root.
But patches to allow mount() by non-root users are in progress:
http://lkml.org/lkml/2008/1/8/131
Maybe I should add some sanity checking on the path.

Thank you.
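
The sanity check suggested in the quoted text (every path component owned by root and not world-writable) could look roughly like the user-space sketch below. It is an illustration only, not the syaoran code: `component_ok` and `path_is_trusted` are hypothetical names, and an in-kernel check would have to run during path resolution to avoid racing with renames. Because lstat() is used, symbolic links are rejected too, since Linux reports them with mode 0777.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* A component is acceptable if root owns it and "other" cannot write it. */
static int component_ok(uid_t owner, mode_t mode)
{
    return owner == 0 && (mode & S_IWOTH) == 0;
}

/* Return 1 if every component of an absolute path passes component_ok(). */
static int path_is_trusted(const char *path)
{
    char prefix[PATH_MAX];
    size_t len = strlen(path);
    struct stat st;

    if (path[0] != '/' || len >= sizeof(prefix))
        return 0;

    /* Check the root directory itself first. */
    if (lstat("/", &st) != 0 || !component_ok(st.st_uid, st.st_mode))
        return 0;

    /* Then check each prefix that ends at a '/' or at the end of the path. */
    for (size_t i = 1; i <= len; i++) {
        if (path[i] != '/' && path[i] != '\0')
            continue;
        memcpy(prefix, path, i);
        prefix[i] = '\0';
        if (lstat(prefix, &st) != 0 || !component_ok(st.st_uid, st.st_mode))
            return 0;
    }
    return 1;
}
```

Under this rule a list placed in /tmp or in a user's home directory is refused, because /tmp is world-writable and the home directory is not owned by root.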

Re: Do SATA tape drives work?

2008-01-08 Thread Jonathan Woithe

Mark Lord wrote:
> I wouldn't buy anything with "Sony" on it,

Any particular reason?

> but Albert thinks ATAPI tapes should be working now
> (he has my old drive now).

Thanks for the info.

Regards
  jonathan

Re: [PATCHv3] kprobes: Introduce kprobe_handle_fault()

2008-01-08 Thread Harvey Harrison

On Wed, 2008-01-09 at 07:14 +0100, Heiko Carstens wrote:
> > +/*
> > + * If it is a kprobe pagefault we can not be premptible so return before
> 
> Missing 'e' in preemptible.

OK.

> However, the old code you removed had a lot of preempt_disable/enable calls
> that you removed. Hope you checked that preemption was always disabled
> already and the calls were not necessary (true at least for s390).
> 
> Are there cases where this code could be called with preemption enabled?
> If so then that looks like a bug anyway. I'd say the preemptible check
> should be removed or turned into a WARN_ON.
> 
> I like this better (not including any other changes):
> 
>   if (!user_mode(regs) && !preemptible() && kprobe_running())
>   return kprobe_fault_handler(regs, trapnr);
>   return 0;

I could live with that too, will defer to kprobes maintainers if they
prefer that as a follow-on.

Regarding the preempt_enable/disable, the reasoning behind it comes from
the following, I stole the changelog from x86.git which has a good
description of why this should be safe:

commit 6624c638928acce52fbe57d73284efcf9f86abd2
Author: Quentin Barnes <[EMAIL PROTECTED]>
Date:   Wed Jan 9 02:32:57 2008 +0100

Code clarification patch to Kprobes arch code

When developing the Kprobes arch code for ARM, I ran across some code
found in x86 and s390 Kprobes arch code which I didn't consider as
good as it could be.

Once I figured out what the code was doing, I changed the code
for ARM Kprobes to work the way I felt was more appropriate.
I've tested the code this way in ARM for about a year and would
like to push the same change to the other affected architectures.

The code in question is in kprobe_exceptions_notify() which
does:

  /* kprobe_running() needs smp_processor_id() */
  preempt_disable();
  if (kprobe_running() &&
  kprobe_fault_handler(args->regs, args->trapnr))
  ret = NOTIFY_STOP;
  preempt_enable();

For the moment, ignore the code having the preempt_disable()/
preempt_enable() pair in it.

The problem is that kprobe_running() needs to call smp_processor_id()
which will assert if preemption is enabled.  That sanity check by
smp_processor_id() makes perfect sense since calling it with preemption
enabled would return an unreliable result.

But the function kprobe_exceptions_notify() can be called from a
context where preemption could be enabled.  If that happens, the
assertion in smp_processor_id() happens and we're dead.  So what
the original author did (speculation on my part!) is put in the
preempt_disable()/preempt_enable() pair to simply defeat the check.

Once I figured out what was going on, I considered this an
inappropriate approach.  If kprobe_exceptions_notify() is called
from a preemptible context, we can't be in a kprobe processing
context at that time anyways since kprobes requires preemption to
already be disabled, so just check for preemption enabled, and if
so, blow out before ever calling kprobe_running().  I wrote the ARM
kprobe code like this:

  /* To be potentially processing a kprobe fault and to
   * trust the result from kprobe_running(), we have to
   * be non-preemptible. */
  if (!preemptible() && kprobe_running() &&
  kprobe_fault_handler(args->regs, args->trapnr))
  ret = NOTIFY_STOP;

The above code has been working fine for ARM Kprobes for a year.
So I changed the x86 code (2.6.24-rc6) to be the same way and ran
the Systemtap tests on that kernel.  As on ARM, Systemtap on x86
comes up with the same test results either way, so it's a neutral
external functional change (as expected).

This issue has been discussed previously on linux-arm-kernel and the
Systemtap mailing lists.  Pointers to the by base for the two
discussions:

http://lists.arm.linux.org.uk/lurker/message/20071219.223225.1f5c2a5e.en.html
http://sourceware.org/ml/systemtap/2007-q1/msg00251.html

Cheers,

Harvey


Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"

2008-01-08 Thread Willy Tarreau

On Tue, Jan 08, 2008 at 08:10:42PM -0800, Andrew Morton wrote:
[...]
> I must say that the number of bugs which actually go away when the user
> stops using nvidia/fglrx/ndiswrapper/etc is a small minority.

[...]
> But people who think that removing the nvidia driver will
> magically fix that khubd-got-stuck-in-D-state bug are urinating up an
> incline.
> 
> 
> Facts:
> 
> - lots of people use nvidia/etc
> 
> - most bugs they report aren't caused by nvidia/etc
> 
> - we need lots of testers
> 
> draw your own conclusions.

Thanks Andrew for this demonstration. At least now I know I'm not the only
one to think that. And no, I do not have any nvidia/etc. It's just that I
value their users' reports as much as the other ones just because otherwise
we would only track some elite's bugs, thus reducing the amount of information
we have to understand the circumstances under which it happens.

Cheers,
Willy


Re: [PATCHv3] kprobes: Introduce kprobe_handle_fault()

2008-01-08 Thread Heiko Carstens

> +/*
> + * If it is a kprobe pagefault we can not be premptible so return before

Missing 'e' in preemptible.
However, the old code you removed had a lot of preempt_disable/enable calls
that you removed. Hope you checked that preemption was always disabled
already and the calls were not necessary (true at least for s390).

Are there cases where this code could be called with preemption enabled?
If so then that looks like a bug anyway. I'd say the preemptible check
should be removed or turned into a WARN_ON.

> + * calling kprobe_running() as it will assert on smp_processor_id if
> + * preemption is enabled.
> + */
> +static inline int kprobe_handle_fault(struct pt_regs *regs, int trapnr)
> +{
> + if (!user_mode(regs) && !preemptible() && kprobe_running() &&
> + kprobe_fault_handler(regs, trapnr))
> + return 1;
> + else
> + return 0;

I like this better (not including any other changes):

if (!user_mode(regs) && !preemptible() && kprobe_running())
return kprobe_fault_handler(regs, trapnr);
return 0;
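
To compare the two forms side by side, here is a small userspace sketch. It is only a model: `struct fault_ctx` and both function names are invented for illustration, and the kernel predicates are replaced by plain fields. It shows that the forms agree whenever the handler returns 0 or 1, and differ only in that the shorter form passes any other non-zero value through unchanged.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel predicates; filled in by the caller. */
struct fault_ctx {
    bool user_mode;
    bool preemptible;
    bool kprobe_running;
    int  handler_result;   /* what the fault handler would return */
};

/* Form from the patch: normalises the result to 0 or 1. */
static int handle_fault_patch(const struct fault_ctx *c)
{
    if (!c->user_mode && !c->preemptible && c->kprobe_running &&
        c->handler_result)
        return 1;
    else
        return 0;
}

/* Shorter form: returns the handler's value as-is. */
static int handle_fault_short(const struct fault_ctx *c)
{
    if (!c->user_mode && !c->preemptible && c->kprobe_running)
        return c->handler_result;
    return 0;
}
```

Callers that only test the result for truth would see no difference between the two.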

Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-08 Thread Joerg Platte

Am Mittwoch, 9. Januar 2008 schrieb Fengguang Wu:
> > /dev/sda6 on / type ext3 (rw,noatime,errors=remount-ro,acl)
> > tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
> > proc on /proc type proc (rw,noexec,nosuid,nodev)
> > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> > procbususb on /proc/bus/usb type usbfs (rw)
> > udev on /dev type tmpfs (rw,mode=0755)
> > tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
> > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
> > fusectl on /sys/fs/fuse/connections type fusectl (rw)
> > /dev/sda7 on /tmp type ext2 (rw,noatime,errors=remount-ro,acl)
> > /dev/sda8 on /export type ext3 (rw,noatime,errors=remount-ro,acl)
> > /dev/sda1 on /winxp type ntfs (rw,umask=002,gid=1,nls=utf8)
>
> So they are ext3/ext2/ntfs.  What if you umount ntfs? and ext2 if possible?

Unmounting ntfs doesn't help, so I converted the remaining ext2 filesystem
to ext3, modified the fstab entry accordingly and rebooted. Now everything
seems to be fine! Top reports an idle system and there is no abnormal iowait
any longer! It seems ext2 was causing this! Later today I can try to
remount the filesystem as ext2 to be sure the bug shows up again.

regards,
Jörg

-- 
PGP Key: send mail with subject 'SEND PGP-KEY' PGP Key-ID: FD 4E 21 1D
PGP Fingerprint: 388A872AFC5649D3 BCEC65778BE0C605

Re: [PATCH 0/7] convert semaphore to mutex in struct class

2008-01-08 Thread Dave Young

On Wed, Jan 09, 2008 at 09:32:48AM +0800, Dave Young wrote:
> On Jan 9, 2008 6:48 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > On Tue, Jan 08, 2008 at 03:05:10PM +0800, Dave Young wrote:
> > > On Jan 8, 2008 1:20 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > > > On Mon, Jan 07, 2008 at 06:13:37PM +0100, Stefan Richter wrote:
> > > > > > It's already in the driver core to the most part.  It remains to be seen
> > > > > > what is less complicated in the end:  Transparent mutex-protected list
> > > > > > accesses provided by driver core (requires the iterator), or all the
> > > > > > necessary locking done by the drivers themselves (requires some more
> > > > > > lock-taking but perhaps fewer lock instances overall in the drivers, and
> > > > > > respective redefinitions and documentation of the driver core API).
> > > >
> > > > I favor changing the driver core api and doing this kind of thing there.
> > > > It keeps the drivers simpler and should hopefully make their lives
> > > > easier.
> > >
> > > What about this?
> > >
> > > > #define class_for_each_dev(pos, head, member) \
> > > > for (mutex_lock(&(container_of(head, struct class, devices))->mutex), \
> > > >      pos = list_entry((head)->next, typeof(*pos), member); \
> > > >      prefetch(pos->member.next), &pos->member != (head) ? 1 : \
> > > >      (mutex_unlock(&(container_of(head, struct class, devices))->mutex), 0); \
> > > >      pos = list_entry(pos->member.next, typeof(*pos), member))
> >
> I'm wrong, it's same as before indeed.
> 
> > Eeek, just make the thing a function please, where you pass the iterator
> > function in, like the driver core has (driver_for_each_device)
> 
> Ok, so need a new member of knode_class, I will update the patch later.
> Thanks.

Withdraw my post, sorry :)

For the mutex patch, for now I will only use the mutex to lock the devices
list and write an iterator function.
Most of the iteration is for finding some device in the list, so maybe it also
needs a match function, just like the drivers have?
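
As a rough illustration of the iterator-plus-match idea, here is a userspace model. Everything in it is an assumption made for the sketch: `class_model`, `device_node`, `class_model_for_each` and `class_model_find` are made-up names, not driver core API, and a pthread mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are driver core API. */
struct device_node {
    int id;
    struct device_node *next;
};

struct class_model {
    pthread_mutex_t mutex;          /* protects the devices list */
    struct device_node *devices;
};

/*
 * Call fn(dev, data) for each device with the mutex held.
 * Stops early and returns fn's value as soon as it is non-zero.
 */
static int class_model_for_each(struct class_model *cls, void *data,
                                int (*fn)(struct device_node *, void *))
{
    struct device_node *dev;
    int ret = 0;

    pthread_mutex_lock(&cls->mutex);
    for (dev = cls->devices; dev; dev = dev->next) {
        ret = fn(dev, data);
        if (ret)
            break;
    }
    pthread_mutex_unlock(&cls->mutex);
    return ret;
}

/* Return the first device for which match() is non-zero, or NULL. */
static struct device_node *class_model_find(struct class_model *cls, void *data,
                                            int (*match)(struct device_node *, void *))
{
    struct device_node *dev, *found = NULL;

    pthread_mutex_lock(&cls->mutex);
    for (dev = cls->devices; dev; dev = dev->next) {
        if (match(dev, data)) {
            found = dev;
            break;
        }
    }
    pthread_mutex_unlock(&cls->mutex);
    return found;
}

/* Example callbacks: match on id, and count every device visited. */
static int match_id(struct device_node *dev, void *data)
{
    return dev->id == *(int *)data;
}

static int count_one(struct device_node *dev, void *data)
{
    (void)dev;
    ++*(int *)data;
    return 0;
}
```

With this shape the callers never touch the lock. A real version would presumably also need to take a reference on the device it found before dropping the mutex, which this model leaves out.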

Regards
dave
> 
> >
> > thanks,
> >
> > greg k-h
> >

[PATCH] Change paride driver to use unlocked_ioctl instead of ioctl

2008-01-08 Thread Nikanth Karthikesan

The ioctl handler is called with the BKL held. Registering
unlocked_ioctl handler instead of registering ioctl handler.


Signed-off-by: Nikanth Karthikesan <[EMAIL PROTECTED]>

---

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c
b/arch/x86/kernel/cpu/mcheck/mce_64.c
diff --git a/drivers/block/paride/pt.c b/drivers/block/paride/pt.c
index b91accf..d4fa468 100644
--- a/drivers/block/paride/pt.c
+++ b/drivers/block/paride/pt.c
@@ -236,7 +236,7 @@ static const struct file_operations pt_fops = {
.owner = THIS_MODULE,
.read = pt_read,
.write = pt_write,
-   .ioctl = pt_ioctl,
+   .unlocked_ioctl = pt_ioctl,
.open = pt_open,
.release = pt_release,
 };
@@ -685,36 +685,43 @@ out:
return err;
 }

-static int pt_ioctl(struct inode *inode, struct file *file,
-unsigned int cmd, unsigned long arg)
+static long pt_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
struct pt_unit *tape = file->private_data;
struct mtop __user *p = (void __user *)arg;
struct mtop mtop;

+   lock_kernel();
+
switch (cmd) {
case MTIOCTOP:
-   if (copy_from_user(&mtop, p, sizeof(struct mtop)))
+   if (copy_from_user(&mtop, p, sizeof(struct mtop))) {
+   unlock_kernel();
return -EFAULT;
+   }

switch (mtop.mt_op) {

case MTREW:
pt_rewind(tape);
+   unlock_kernel();
return 0;

case MTWEOF:
pt_write_fm(tape);
+   unlock_kernel();
return 0;

default:
printk("%s: Unimplemented mt_op %d\n", tape->name,
   mtop.mt_op);
+   unlock_kernel();
return -EINVAL;
}

default:
printk("%s: Unimplemented ioctl 0x%x\n", tape->name, cmd);
+   unlock_kernel();
return -EINVAL;

}
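
The patch above releases the BKL before each return. One possible alternative shape, sketched here in userspace, funnels every path through a single unlock. This is only a model under stated assumptions: a depth counter stands in for the BKL, and the `model_` functions and `MODEL_` constants are invented for illustration and are not the driver's own names or values.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the BKL: a depth counter, so lock balance can be checked. */
static int bkl_depth;

static void model_lock_kernel(void)   { bkl_depth++; }
static void model_unlock_kernel(void) { bkl_depth--; }

/* Hypothetical command and operation numbers, not the real ioctl values. */
enum { MODEL_MTIOCTOP = 1 };
enum { MODEL_MTWEOF, MODEL_MTREW };

static long model_pt_ioctl(unsigned int cmd, int mt_op)
{
    long err;

    model_lock_kernel();

    switch (cmd) {
    case MODEL_MTIOCTOP:
        switch (mt_op) {
        case MODEL_MTREW:
        case MODEL_MTWEOF:
            /* the real driver would rewind or write a filemark here */
            err = 0;
            break;
        default:
            err = -EINVAL;
            break;
        }
        break;
    default:
        err = -EINVAL;
        break;
    }

    model_unlock_kernel();   /* single exit: one unlock covers every path */
    return err;
}
```

The trade-off is the usual one: a single exit makes it harder to forget an unlock when a new case is added, at the cost of carrying the result in a local variable.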

Re: Do SATA tape drives work?

2008-01-08 Thread Mark Lord


Tejun Heo wrote:
> [cc'ing linux-ide]
>
> Jonathan Woithe wrote:
> > Hi guys
> >
> > I was wondering whether anyone can shed any light on the status of SATA tape
> > drives.  There's very little info on the net about this at least in the
> > places I've checked; the only thing of any significance I've found thus far
> > is a note in a Bacula document dated April 2007 which states that drives
> > other than real SCSI units don't generally work with Bacula.
> >
> > To put this into context, I'm looking at purchasing a Sony SDX470VRB SATA
> > AIT-1 tape drive for use with the SATA controller on an Intel DG31PR
> > mainboard.  The drive will be used primarily with tar/cpio.  Obviously
> > however I only want to make the purchase if there's a reasonable chance of
> > it working.
> >
> > I would appreciate any information you can shed on this issue.
>
> It's supposed to with recent updates.  Mark, right?

..

I wouldn't buy anything with "Sony" on it,
but Albert thinks ATAPI tapes should be working now
(he has my old drive now).



Re: 2.6.24-rc7, intel audio: alsa doesn't say a beep

2008-01-08 Thread Harald Dunkel


Takashi Iwai wrote:
> Did you enable CONFIG_SND_HDA_POWER_SAVE feature?  And which hardware
> (laptop, product name, whatever) exactly?

CONFIG_SND_HDA_POWER_SAVE is not set.

Hardware is a Dell XPS M1330. CPU is Core2 Duo T7500, 2.20GHz,
2 GByte RAM. lspci:

00:00.0 Host bridge: Intel Corporation Mobile Memory Controller Hub (rev 0c)
00:01.0 PCI bridge: Intel Corporation Mobile PCI Express Root Port (rev 0c)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 4 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation Mobile LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation Mobile IDE Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation Mobile SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation Unknown device 0427 (rev a1)
03:01.0 FireWire (IEEE 1394): Ricoh Co Ltd Unknown device 0832 (rev 05)
03:01.1 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22)
03:01.2 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 12)
03:01.3 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12)
09:00.0 Ethernet controller: Broadcom Corporation Unknown device 1713 (rev 02)
0c:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network Connection (rev 02)


> Also, please show the contents of /proc/asound/card0/codec#* files.
> Do you see difference in these files between with and without the
> patch?

See below. There is no difference between both.


Regards

Harri

Codec: SigmaTel STAC9228
Address: 0
Vendor Id: 0x83847616
Subsystem Id: 0x10280209
Revision Id: 0x100201
No Modem Function Group found
Default PCM:
rates [0x7e0]: 44100 48000 88200 96000 176400 192000
bits [0xe]: 16 20 24
formats [0x1]: PCM
Default Amp-In caps: ofs=0x00, nsteps=0x0e, stepsize=0x05, mute=0
Default Amp-Out caps: ofs=0x7f, nsteps=0x7f, stepsize=0x02, mute=1
Node 0x02 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out
  Amp-Out caps: N/A
  Amp-Out vals:  [0x7f 0x7f]
  Power: 0x0
Node 0x03 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out
  Amp-Out caps: N/A
  Amp-Out vals:  [0xff 0xff]
  Power: 0x0
Node 0x04 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out
  Amp-Out caps: N/A
  Amp-Out vals:  [0xff 0xff]
  Power: 0x0
Node 0x05 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out
  Amp-Out caps: N/A
  Amp-Out vals:  [0xff 0xff]
  Power: 0x0
Node 0x06 [Vendor Defined Widget] wcaps 0xfd0c05: Stereo Amp-Out
  Amp-Out caps: N/A
  Amp-Out vals:  [0xff 0xff]
  Power: 0x0
Node 0x07 [Audio Input] wcaps 0x1d0541: Stereo
  Power: 0x0
  Connection: 1
 0x1b
Node 0x08 [Audio Input] wcaps 0x1d0541: Stereo
  Power: 0x0
  Connection: 1
 0x1c
Node 0x09 [Audio Input] wcaps 0x1d0541: Stereo
  Power: 0x0
  Connection: 1
 0x1d
Node 0x0a [Pin Complex] wcaps 0x400181: Stereo
  Pincap 0x08173f: IN OUT HP Detect
  Pin Default 0x02214020: [Jack] HP Out at Ext Front
Conn = 1/8, Color = Green
  Pin-ctls: 0xc0: OUT HP
  Connection: 2
 0x02* 0x03
Node 0x0b [Pin Complex] wcaps 0x400181: Stereo
  Pincap 0x08173f: IN OUT HP Detect
  Pin Default 0x02a19080: [Jack] Mic at Ext Front
Conn = 1/8, Color = Pink
  Pin-ctls: 0x24: IN
  Connection: 2
 0x02 0x03*
Node 0x0c [Pin Complex] wcaps 0x400181: Stereo
  Pincap 0x081737: IN OUT Detect
  Pin Default 0x0181304e: [Jack] Line In at Ext Rear
Conn = 1/8, Color = Blue
  Pin-ctls: 0x20: IN
  Connection: 1
 0x03
Node 0x0d [Pin Complex] wcaps 0x400181: Stereo
  Pincap 0x08173f: IN OUT HP Detect
  Pin Default 0x01014010: [Jack] Line Out at Ext Rear
Conn = 1/8, Color = Green
  Pin-ctls: 0x40: OUT
  Connection: 1
 0x02
Node 0x0e [Pin Complex] wcaps 0x400181: Stereo
  Pincap 0x081737: IN OUT Detect
  Pin Default 0x01a19040: [Jack] Mic at Ext Rear
Conn = 1/8, Color = Pink
  Pin-ctls: 0x24:

Re: [PATCH] Change x86 Machine check handler to use unlocked_ioctl instead of ioctl

2008-01-08 Thread Andi Kleen

On Thu, Jan 10, 2008 at 11:25:14AM +0530, Nikanth Karthikesan wrote:
> The Machine check handler registers ioctl handler that is called
> with the BKL held. Changing to register unlocked_ioctl instead.
> Also mce ioctl handler does not seem to need any lock protection.

Thanks, but I already did that here on my own.

-Andi

[PATCH] Change x86 Machine check handler to use unlocked_ioctl instead of ioctl

2008-01-08 Thread Nikanth Karthikesan

The Machine check handler registers ioctl handler that is called
with the BKL held. Changing to register unlocked_ioctl instead.
Also mce ioctl handler does not seem to need any lock protection.

To: Andi Kleen <[EMAIL PROTECTED]>
Cc: linux-kernel@vger.kernel.org
Cc: [EMAIL PROTECTED]

Change the Machine check handler to use unlocked_ioctl instead of
ioctl handler. Also the mce ioctl handler does not need any lock
protection.

Signed-off-by: Nikanth Karthikesan <[EMAIL PROTECTED]>

---

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 4b21d29..d3baa62 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -634,8 +634,7 @@ static unsigned int mce_poll(struct file *file, poll_table *wait)
return 0;
 }

-static int mce_ioctl(struct inode *i, struct file *f,unsigned int cmd,
-unsigned long arg)
+static long mce_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
 {
int __user *p = (int __user *)arg;

@@ -664,7 +663,7 @@ static const struct file_operations mce_chrdev_ops = {
.release = mce_release,
.read = mce_read,
.poll = mce_poll,
-   .ioctl = mce_ioctl,
+   .unlocked_ioctl = mce_ioctl,
 };

 static struct miscdevice mce_log_device = {

[PATCH] CONNECTOR: don't touch queue dev after decrement of ref count

2008-01-08 Thread Li Zefan


cn_queue_free_callback() will touch 'dev' (i.e. cbq->pdev),
so it should be called before atomic_dec(&dev->refcnt).

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>

---
 drivers/connector/cn_queue.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c
index 23cc87a..5732ca3 100644
--- a/drivers/connector/cn_queue.c
+++ b/drivers/connector/cn_queue.c
@@ -99,8 +99,8 @@ int cn_queue_add_callback(struct cn_queue_dev *dev, char *name, struct cb_id *id
spin_unlock_bh(&dev->queue_lock);
 
if (found) {
-   atomic_dec(&dev->refcnt);
cn_queue_free_callback(cbq);
+   atomic_dec(&dev->refcnt);
return -EINVAL;
}
 
-- 
1.5.3.rc7


Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread Christer Weinigel

On Tue, 08 Jan 2008 18:52:42 -0800
Zachary Amsden <[EMAIL PROTECTED]> wrote:

> On Tue, 2008-01-08 at 14:15 -0500, David P. Reed wrote:
> > Alan Cox wrote:
> > > The natsemi docs here say otherwise. I trust them not you.
> > >   
> > As well you should. I am honestly curious (for my own satisfaction)
> > as to what the natsemi docs say the delay code should do  (can't
> > imagine they say "use io port 80 because it is unused").  I don't
> > have any 
> 
> What is the outcome of this thread?  Are we going to use timing based
> port delays, or can we finally drop these things entirely on 64-bit
> architectures?
> 
> I have a doubly vested interest in this, both as the owner of an
> affected HP dv9210us laptop and as a maintainer of paravirt code - and
> would like 64-bit Linux code to stop using I/O to port 0x80 in both
> cases (as I suspect would every other person involved with
> virtualization).
> 
> I've tried to follow this thread, but with all the jabs, 1-ups, and
> obscure legacy hardware pageantry going on, it isn't clear what we're
> really doing.

I believe Alan Cox is doing a review of some drivers, to see if they
actually need the I/O port delay.  A lot of drivers probably use outb_p
just because it was copy-pasted from some other driver and it can be
removed.  Alan's review has also brought to light a lack of locking in
some drivers, so I think Alan has been adding proper locking to some of
the watchdog drivers.

Most old ISA only device drivers can keep using OUT 80h.  They are not
used on modern machines and it's better to keep them unchanged to avoid
unnecessary incompatibilities.

As far as I know, the 8253 PIT timer code needs outb_p on some older
platforms, and this is one of the most troublesome since the same PIT
controller (or a register compatible one) has been used since the
original IBM PC, and it is frequently executed code.  Ingo Molnar has
done an alternate implementation of the PIT clock source which uses
udelay instead of OUT 80h to delay accesses to the ports. The kernel
could make a choice of which variant to use based on the DMI year, if
compiling for x86_64, or something similar.  Maybe have a command line
option too.
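
A minimal sketch of such a selection, written as a plain function so the policy is easy to see. The names, the cutoff year parameter and the override encoding are all assumptions made for illustration; nothing here is taken from an existing patch.

```c
#include <assert.h>

enum pit_delay { PIT_DELAY_PORT80, PIT_DELAY_UDELAY };

/* Value meaning "no command line override was given" (made up for this sketch). */
#define DELAY_NO_OVERRIDE (-1)

/*
 * Pick the delay method for the PIT code.
 * dmi_year <= 0 means the BIOS date is unknown, so keep the legacy delay.
 * cutoff_year is a hypothetical tunable: machines at least that new use udelay.
 */
static enum pit_delay choose_pit_delay(int dmi_year, int cutoff_year, int override)
{
    if (override != DELAY_NO_OVERRIDE)
        return (enum pit_delay)override;
    if (dmi_year > 0 && dmi_year >= cutoff_year)
        return PIT_DELAY_UDELAY;
    return PIT_DELAY_PORT80;
}
```

Defaulting to port 0x80 when the year is unknown keeps old machines on the path that has been tested the longest.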

The keyboard controller on some platforms needs the delay, and the same
driver is used on both ancient and modern systems. I think it can be
changed to udelay since it's not very time-critical code.

The 8259 interrupt controller on some platforms needs the delay. I think
it can be changed to udelay since it's only some setup code that uses
outb_p.  I guess there are time critical accesses to the interrupt
controller from assembly code somewhere to acknowledge interrupts, and
that code needs a review.

The floppy controller code uses outb_p.  Even though there might be
floppy controllers on modern systems, I'd rather leave the floppy code
alone since it's supposed to be very fragile.  If you still use
floppies you deserve what you get.

Some specific drivers, such as drivers for 8390 or 8390 clone based
network cards are also a bit troublesome, they do need outb_p (and
the delay for the original 8390 chip is specified in bus cycles), and
there can be a big performance loss if pessimistic udelays are used for
the delay.  There are still a bunch of PCMCIA cards based on that chip
which means that those cards can be used with modern machines.  There
are also PCI and memory mapped variants of the 8390, some of them new
designs which are only register compatible, some other designs are
using a real 8390 with a FPGA used as glue logic. I think Alan
suggested compiling two versions of that driver, one with OUT 80h, and
one with udelay.  Old machines can choose the old driver, and new
machines can use the new one.  Other drivers can probably do the same
thing, or if not time critical, always use a pessimistic udelay.

As for the implementation, I like the suggestion to split outb_p into
two calls, one to outb and one to isa_slow_down_io.  It makes it very
obvious that it is really two function calls, and that it needs
locking.  For those uses that are not ISA port accesses,
isa_slow_down_io should be changed to an appropriate udelay instead.
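
To make the split concrete, here is a userspace model in which port writes and delays are only recorded, never performed. The `model_` functions, `write_then_delay` and the trace array are invented for this sketch; only the (value, port) argument order follows the usual Linux outb convention.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model: port writes and delays are recorded, not performed. */
enum ev_kind { EV_OUTB, EV_DELAY };

struct event {
    enum ev_kind kind;
    uint16_t port;
    uint8_t value;
};

static struct event trace[8];
static int n_events;

static void record(enum ev_kind kind, uint16_t port, uint8_t value)
{
    if (n_events < 8)
        trace[n_events++] = (struct event){ kind, port, value };
}

/* Stand-in for outb(): just notes which value went to which port. */
static void model_outb(uint8_t value, uint16_t port)
{
    record(EV_OUTB, port, value);
}

/* Stand-in for isa_slow_down_io(); a real one might write to port 0x80. */
static void model_isa_slow_down_io(void)
{
    record(EV_DELAY, 0, 0);
}

/* What outb_p does implicitly, written as two visible steps. */
static void write_then_delay(uint8_t value, uint16_t port)
{
    model_outb(value, port);
    model_isa_slow_down_io();
}
```

Written this way, a reader of the driver sees two separate operations, which makes it more obvious that something else could run between them unless a lock is held.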

The goal is anyway that a modern machine should not do OUT 80h, and old
machines keep doing it since it has been working well for some 15-odd
years, both in DOS device drivers and on Linux.  Using an alternate
port may be a workaround, but it's probably not a good idea since
alternate ports have received less testing and there's bound to be some
platform out there that has problems with any alternate port we
might choose.  Allowing an alternate port will also add code bloat
(OUT 80h, AL becomes MOV DX, alternate_port; OUT DX, AL) for a
dubious gain.

Did I miss anything?

  /Christer

Re: More breakage in native_rdtsc out of line in git-x86 II

2008-01-08 Thread Andi Kleen

> I think the problem is that the vsyscall/vdso code calls it through
> vread and for that it has to be exported. There seems to be also
> another bug with the old style vsyscalls not using the TSC vread
> that masks it on older glibc
> 
> Stepping with gdb through old style vgettimeofday() confirms that RDTSC is 
> not used.
> 
> A long time ago we had a similar problem once and it was because of a 
> problem exporting the vsyscall variables in vmlinux.lds.S -- looks like that 
> has reappeared.
> 
> I think the new glibc shows it because it uses the vDSO not 
> the older vsyscall and the new vDSO probably still works. Anyways haven't 
> investigated why that is in detail yet, but that's a separate 
> regression.

Actually that seems to be because the test system using the older
glibc didn't use the TSC because it was marked unstable due to an
unsynchronized TSC. It should not have been: this is a Core2
dual core single socket. Will investigate later what happened
there.

-Andi

Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread H. Peter Anvin


Zachary Amsden wrote:


BTW, it isn't ever safe to pass port 0x80 through to hardware from a
virtual machine; some OSes use port 0x80 as a hardware available scratch
register (I believe Darwin/x86 did/does this during boot).


That's funny, because there is definitely no guarantee that you get back 
what you read (well, perhaps there is on Apple.)


-hpa

Re: Do SATA tape drives work?

2008-01-08 Thread Tejun Heo

[cc'ing linux-ide]

Jonathan Woithe wrote:
> Hi guys
> 
> I was wondering whether anyone can shed any light on the status of SATA tape
> drives.  There's very little info on the net about this at least in the
> places I've checked; the only thing of any significance I've found thus far
> is a note in a Bacula document dated April 2007 which states that drives
> other than real SCSI units don't generally work with Bacula.
> 
> To put this into context, I'm looking at purchasing a Sony SDX470VRB SATA
> AIT-1 tape drive for use with the SATA controller on an Intel DG31PR
> mainboard.  The drive will be used primarily with tar/cpio.  Obviously
> however I only want to make the purchase if there's a reasonable chance of
> it working.
> 
> I would appreciate any information you can shed on this issue.

It's supposed to with recent updates.  Mark, right?

-- 
tejun

Re: Kbuild update

2008-01-08 Thread Sam Ravnborg

On Wed, Jan 09, 2008 at 10:32:39AM +0800, WANG Cong wrote:
> 
> >> > If we can make this to be an offical project for Linux kernel, I
> >> > think it won't be a big problem.
> >> 
> >> We don't even manage to maintain the English language texts properly,
> >> and I am therefore not overly optimistic that we'll have the 
> >> translations maintained properly for many years.
> >Italian was 100% translated at one point in time.
> >And the Linux Kernel Translation project has a number of
> >spelling error fixes in queue (I dunno if they have been applied).
> >
> >So even when run as an external project it was ok for some languages,
> >and having it official and someone taking patches to .po files would
> >for sure allow more users to build a kernel.
> >
> 
> Agreed.
> 
> That's the goal of TLKTP. Sam, can you contact to the author of
> TLKTP? Maybe we can talk to him to see if we can restart the
> project. If so, I can help with the Chinese translation part.

My first try bounced, found another address for Egry Gabor -
let's see if I have more luck.

The associated list is spam only so I did not try that one.

Sam

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-08 Thread Valdis . Kletnieks

On Tue, 08 Jan 2008 22:50:43 +0900, Tetsuo Handa said:

> Yes. It is a line-by-line processable format defined as:
> 
>   filename permission owner group flags type [ symlink_data | major minor ]
> 
> where flags are bit-wised combinations of
> 
>   *  1: Allow creation of the file.
>   *  2: Allow deletion of the file.
>   *  4: Allow changing permissions of the file.
>   *  8: Allow changing owner or group of the file.
>   * 16: For internal use. Remembers whether this file is opened or not.
>   * 32: Don't create this file at mount time.
> 
> and here are some example entries:
> 
>   pts 755 0   0   0   d

Good summary - probably should add that to the patch, drop it into
Documentation/syaoran-config.txt or similar...
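
For such a document, a short parsing example might help readers. The following is a userspace sketch only, not the patch's parser: the `F_` macro names, `struct entry` and `parse_entry` are made up here, and only the field order and the flag bit values come from the description quoted above. It reads the six common fields and ignores whatever follows (symlink data or major/minor).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Flag bits as listed in the quoted description; the macro names are invented. */
#define F_MAY_CREATE          1u
#define F_MAY_DELETE          2u
#define F_MAY_CHMOD           4u
#define F_MAY_CHOWN           8u
#define F_IS_OPEN            16u   /* internal use */
#define F_NO_CREATE_AT_MOUNT 32u

struct entry {
    char name[64];
    unsigned int perm;              /* parsed as octal, e.g. 755 */
    unsigned int owner, group, flags;
    char type;                      /* e.g. 'd' in the quoted example */
};

/* Parse the six common fields; returns 0 on success, -1 on a malformed line. */
static int parse_entry(const char *line, struct entry *e)
{
    int n = sscanf(line, "%63s %o %u %u %u %c",
                   e->name, &e->perm, &e->owner, &e->group,
                   &e->flags, &e->type);

    return n == 6 ? 0 : -1;
}
```

Applied to the quoted example line `pts 755 0 0 0 d`, this yields a directory entry with permission 0755 and no flag bits set.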

> > the idea of passing a file to be read by the kernel, but I also understand
> > that if it isn't done before mount, you have a race condition betweet the
> > mount and the load.
> What race condition is possible?
> Are you worrying that the file gets modified while reading?

Modification while reading *is* an issue, but can probably be worked around
with some clever locking.  The race condition I was thinking of was if you
had the mount and the policy load be 2 separate events, you could see:

(a) issue mount request
(b) do something malicious in /dev while..
(c) load the policy that would have prevented (b).

This is partly why SELinux has init load the policy *very* early on, before
any other userspace has had a chance to run and do things that would have
been prevented by policy.

>> Does this do what you think it does if run in a chroot process or if
>> some creative person does "accept=../../path/to/bad_data.cfg"?
> sys_open() calls open_pathname() with AT_FDCWD.
> So, it is the same thing as calling
> open("../../path/to/bad_data.cfg", O_RDONLY) from the userland.

Which basically ends up meaning that anybody who can trick the mount into
happening can reset the permitted list and create (for example) a mode 666
entry for a hard drive, and go scribbling around at will.  Note that you
don't seem to do any sanity checking on the path (for instance, that each
component is owned by root, and not world-writable) - so anybody who finds
a way to get the mount to happen can supply their own list in /home/joeuser/blat
or /tmp/surprise-mount-list  or wherever.
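
The kind of check I mean can be sketched in userspace as follows. This is an illustration under assumptions, not code from the patch: `component_ok` and `path_is_trusted` are hypothetical names, only absolute paths are handled, and the check is still open to a race if a component changes between the check and the later open.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <limits.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* A component is acceptable if root owns it and "other" cannot write to it. */
static int component_ok(uid_t uid, mode_t mode)
{
    return uid == 0 && !(mode & S_IWOTH);
}

/*
 * Check "/" and then every prefix of an absolute path ("/a", "/a/b", ...).
 * Returns 1 if all components pass, 0 otherwise (including lstat failures).
 * lstat() does not follow symlinks; on Linux a symlink normally reports
 * mode 0777, so symlink components are rejected by this check.
 */
static int path_is_trusted(const char *path)
{
    char buf[PATH_MAX];
    struct stat st;
    size_t len = strlen(path);
    size_t i;

    if (path[0] != '/' || len >= sizeof(buf))
        return 0;
    if (lstat("/", &st) != 0 || !component_ok(st.st_uid, st.st_mode))
        return 0;

    for (i = 1; i <= len; i++) {
        if (path[i] != '/' && path[i] != '\0')
            continue;               /* still inside a component */
        if (path[i - 1] == '/')
            continue;               /* empty component: "//" or trailing '/' */
        memcpy(buf, path, i);
        buf[i] = '\0';
        if (lstat(buf, &st) != 0 || !component_ok(st.st_uid, st.st_mode))
            return 0;
    }
    return 1;
}
```

With a check like this, a list under /tmp or under a user's home directory would be refused, since those directories are either world-writable or not owned by root.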

>> That printk should be KERN_ERR, I think.
> May be. But I think KERN_WARNING is enough because this is not such emergent 
> error.

OK, I can live with WARNING.  You just want to be sure it's above INFO...



Re: [patch 1/1] Convert drivers in drivers/char/drm to use .unlocked_ioctl

2008-01-08 Thread Dave Airlie

> On Wed, Jan 09, 2008 at 03:37:50AM +, Dave Airlie wrote:
> > 
> > > The drm drivers in this patch all used drm_ioctl to perform their
> > > ioctl calls.  The common function is converted to use lock_kernel()
> > > and unlock_kernel() and the drivers are converted to use .unlocked_ioctl
> > > 
> > 
> > NAK
> 
> Did you actually read Kevin's patch? 

Kevin's patch adds the lock/unlock to drm_ioctl which is exactly what I 
don't want, I want to have drm_ioctl become drm_unlocked_ioctl, and 
drm_ioctl to wrap it with the lock/unlocks, then the drivers can all
use unlocked_ioctl like Kevin's patch pointing to drm_ioctl, and can 
migrate over to drm_unlocked_ioctl post lock auditing, the new latest i915 
driver seems to be fine with unlocked ioctls so far..
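
In outline, the layering I have in mind looks like the userspace model below. The `model_` names are invented for illustration and a depth counter stands in for the BKL; the real functions take different arguments.

```c
#include <assert.h>

static int lock_depth;              /* stand-in for the BKL */
static int depth_seen_by_handler;   /* recorded so the effect can be checked */

static void model_lock_kernel(void)   { lock_depth++; }
static void model_unlock_kernel(void) { lock_depth--; }

/* The real work, with no locking of its own. */
static long model_drm_unlocked_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)cmd;
    depth_seen_by_handler = lock_depth;
    return (long)arg;               /* placeholder result */
}

/* Compatibility wrapper: same work, but done under the lock. */
static long model_drm_ioctl(unsigned int cmd, unsigned long arg)
{
    long ret;

    model_lock_kernel();
    ret = model_drm_unlocked_ioctl(cmd, arg);
    model_unlock_kernel();
    return ret;
}
```

A driver that has not been audited points its .unlocked_ioctl at the wrapper and keeps today's behaviour; once audited, it switches to the unlocked function directly.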

Yes I can use Kevin's patch as a base most likely, but it doesn't do what 
I want yet, and I've already started to do it properly in the drm upstream 
trees

Dave.

Re: [PATCH] call sysrq_timer_list_show from a workqueue

2008-01-08 Thread Andrew Morton

On Wed, 9 Jan 2008 15:27:59 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote:

> > >  I assume you've
> > > queued these because you're thinking of applying them before 2.6.24?  I'd
> > > say only
> > > modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch
> > > warrants that (the other is unlikely and not a regression).
> >
> > Actually I was thinking 2.6.25 on both.
> 
> Then, you should get them next time you grab my series, no?  Or is that 
> particular lever not working yet?
> 
> Hmm, I see my link was not updated (damn, ln -sfn, not ln -sf!).  Fixed now:
>   http://ozlabs.org/~rusty/kernel/rr-latest/
> 
> More goodies there than a UK comedy convention...

My 850-email backlog is down to 759.  You're in there somewhere.  I'm
wondering if I can spin it out to next Christmas.  

I may end up throwing up my hands, trolling it all for bugfixes and then
having an accident with the rest.


Re: [RFD] Incremental fsck

2008-01-08 Thread Al Boldi

Rik van Riel wrote:
> Al Boldi <[EMAIL PROTECTED]> wrote:
> > Has there been some thought about an incremental fsck?
> >
> > You know, somehow fencing a sub-dir to do an online fsck?
>
> Search for "chunkfs"

Sure, and there is TileFS too.

But why wouldn't it be possible to do this on the current fs infrastructure, 
using just a smart fsck, working incrementally on some sub-dir?


Thanks!

--
Al


Re: [patch 05/19] split LRU lists into anon & file sets

2008-01-08 Thread KAMEZAWA Hiroyuki

I like this patch set, thank you.

On Tue, 08 Jan 2008 15:59:44 -0500
Rik van Riel <[EMAIL PROTECTED]> wrote:
> Index: linux-2.6.24-rc6-mm1/mm/memcontrol.c
> ===
> --- linux-2.6.24-rc6-mm1.orig/mm/memcontrol.c 2008-01-07 11:55:09.0 -0500
> +++ linux-2.6.24-rc6-mm1/mm/memcontrol.c  2008-01-07 17:32:53.0 -0500


> -enum mem_cgroup_zstat_index {
> - MEM_CGROUP_ZSTAT_ACTIVE,
> - MEM_CGROUP_ZSTAT_INACTIVE,
> -
> - NR_MEM_CGROUP_ZSTAT,
> -};
> -
>  struct mem_cgroup_per_zone {
>   /*
>    * spin_lock to protect the per cgroup LRU
>    */
>   spinlock_t  lru_lock;
> - struct list_head active_list;
> - struct list_head inactive_list;
> - unsigned long count[NR_MEM_CGROUP_ZSTAT];
> + struct list_head lists[NR_LRU_LISTS];
> + unsigned long   count[NR_LRU_LISTS];
>  };
>  /* Macro for accessing counter */
>  #define MEM_CGROUP_ZSTAT(mz, idx) ((mz)->count[(idx)])
> @@ -160,6 +152,7 @@ struct page_cgroup {
>  };
>  #define PAGE_CGROUP_FLAG_CACHE   (0x1)   /* charged as cache */
>  #define PAGE_CGROUP_FLAG_ACTIVE (0x2)    /* page is active in this cgroup */
> +#define PAGE_CGROUP_FLAG_FILE    (0x4)   /* page is file system backed */
> 

Now, we don't have control_type and a feature for accounting only CACHE.
Balbir-san, do you have some new plan ?

BTW, is it better to use PageSwapBacked(pc->page) rather than adding a new flag
PAGE_CGROUP_FLAG_FILE ?


PAGE_CGROUP_FLAG_ACTIVE is used because global reclaim can change
ACTIVE/INACTIVE attribute without accessing memory cgroup.
(Then, we cannot trust PageActive(pc->page))

Can the ANON <-> FILE attribute change dynamically (after the page is added to the LRU)?

If not, using page_file_cache(pc->page) will be easy.

Thanks,
-Kame


Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-08 Thread Tetsuo Handa

Hello.

Indan Zupancic wrote:
> I think you focus too much on your way of enforcing filename/attributes
> pairs.
So?

> The same can be achieved by creating the device nodes with
> expected attributes, and preventing processes from changing those files.
The device nodes have to be deletable if some process (including udev) needs to
delete them.
Thus, you cannot unconditionally prevent processes from changing those files.

> This because expected combinations are known beforehand.
Yes.

> And once those files are present, the MAC system used doesn't have to have 
> special
> device nodes attributes support. Protecting those files is enough to
> guarantee filename/attributes pairs.
If the MAC system needn't support this filesystem's functionality,
who creates those files with a guarantee of the expected attributes? udev?
If udev is exploited, who can guarantee them?

> No, this is because rename permission was given for files that it shouldn't 
> had.
Do you think all MAC implementations have the same granularity and
functionality?
I don't think so. Not all MAC implementations can control access with such
granularity.
This filesystem is designed to be combined with any MAC,
although the MAC used with this filesystem should be able to restrict
namespace manipulation requests so that this filesystem remains mounted on /dev
and visible to userland applications.

> Either you want a process to manage device names and attributes, and then you
> give it permission to do that, or you want to enforce certain 
> filename/attribute
> pairs and then you just do it yourself.
If I modify udev to enforce certain filename/attribute pairs and the modified
udev is exploited, who can guarantee them?
"Don't trust userland applications" is the basis of restricting access in kernel
space.
If you can trust userland applications, you don't need in-kernel access control.


> Will your filesystem prevent the trivial case of
> 
> rm /dev/hda1
> ln -s /dev/hda2 /dev/hda1
> 
Of course. To permit the above operation, the following permissions are needed.

  hda1 660 0   6   2   b   3   1
  hda1 777 0   0  33   l   .

> Rename permission can be given for /dev in general, but prohibited for
> certain files in /dev, the ones you want to have specific attributes.
> It isn't all or nothing.
Do you think all MAC implementations can prohibit renaming of certain files in
/dev?

> It's "forbid modifying certain nodes that process needn't to modify"
> versus "forbid breaking filename/attribute pairs of certain nodes".
> 
> Both have the same effect, except that the first one is generic and
> can be done by existing MAC systems, while the second one needs
> a special filesystem and a handful of MAC rules to make it effective.
Do you think all MAC implementations can do that?
I think the first one is implementation specific and the second one is generic.

> It doesn't matter where they are, it's that a different fs than yours could be
> mounted over it. You say a MAC can prevent that from happening, but a
> MAC can also prevent all processes except for udev from modifying /dev.
But a MAC cannot prevent udev from modifying /dev. And what if udev is exploited?
Not all MACs can enforce access control over all processes with the granularity
you are talking about. And what if a process exists that cannot be controlled
with your boolean-level granularity (e.g. an administrator running
administrative applications that require modification of /dev)?

A crazy example of administrative applications:
(Please don't say "Don't use such crazy application".)

  #! /bin/sh
  rm -f /dev/either-null-or-zero
  read
  mknod /dev/either-null-or-zero c 1 $REPLY && echo "Administrative task finished successfully." | mail root

This filesystem can guarantee /dev/either-null-or-zero is either char-1-3 or
char-1-5 by using a policy

  either-null-or-zero 666 0   0   3   c   1   3
  either-null-or-zero 666 0   0  35   c   1   5

The boolean level granularity (e.g. forbid all processes except for udev ,
and modify udev to perform name/attribute pair enforcement) is not generic.
Userland application sometimes misbehaves.
I assume kernel process doesn't misbehave.
If you doubt my assumption, you have to doubt in-kernel MAC implementation too.
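
The name/attribute check described above can be sketched in userspace C. This is only an illustration, assuming a policy is a table of permitted (name, type, major, minor) tuples; the struct and function names are hypothetical and are not taken from the filesystem's source, and the mode, owner, and flag columns of the policy lines are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy entry: one permitted (name, type, major, minor) tuple. */
struct dev_policy {
    const char *name;     /* file name relative to /dev */
    char type;            /* 'c' = char device, 'b' = block device */
    unsigned int major;
    unsigned int minor;
};

/* Mirrors the two example policy lines for either-null-or-zero. */
const struct dev_policy example_policy[] = {
    { "either-null-or-zero", 'c', 1, 3 },   /* char-1-3 */
    { "either-null-or-zero", 'c', 1, 5 },   /* char-1-5 */
};

/* Return 1 if the requested node matches some policy entry, else 0. */
int mknod_allowed(const struct dev_policy *policy, size_t n,
                  const char *name, char type,
                  unsigned int major, unsigned int minor)
{
    size_t i;

    assert(policy != NULL && name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(policy[i].name, name) == 0 &&
            policy[i].type == type &&
            policy[i].major == major &&
            policy[i].minor == minor)
            return 1;
    }
    return 0;
}
```

With this table, the "crazy" script succeeds only when the administrator types 3 or 5; any other minor number fails the lookup regardless of which process issued the request.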

> I don't. What I complain about is that it's too specific and does it one 
> chosen
> job badly. It lacks abstraction. As far as I can see any decent MAC can 
> achieve
> the same end result as your filesystem, without directly enforcing name/attr
> pairs.
Can SELinux guarantee the same result as my filesystem even if udev or
administrative programs have to be able to modify /dev ?

> The thing is, all special device nodes that are expected to exist by 
> applications
> are known beforehand.
Yes.

> Thus they can be created statically and can be protected
> against any modifications with any MAC system.
But sometimes some modifications need to be permitted.
Who can

Re: [TOMOYO #6 retry 08/21] Utility functions and policy manipulation interface.

2008-01-08 Thread Kentaro Takeda

Hello.

James Morris wrote:
> Why aren't you using securityfs for this?  (It was designed for LSMs).
We are using securityfs mounted on /sys/kernel/security/ .
Thanks.

Kentaro Takeda


Re: [TOMOYO #6 retry 08/21] Utility functions and policy manipulation interface.

2008-01-08 Thread James Morris

On Wed, 9 Jan 2008, James Morris wrote:

> On Wed, 9 Jan 2008, Kentaro Takeda wrote:
> 
> > Common functions for TOMOYO Linux.
> > 
> > TOMOYO Linux uses /sys/kernel/security/tomoyo interface for configuration.
> 
> Why aren't you using securityfs for this?  (It was designed for LSMs).

Doh, it is using securityfs, don't worry.


-- 
James Morris
<[EMAIL PROTECTED]>

Re: [PATCH 1/4] add task handling notifier: base definitions

2008-01-08 Thread Matthew Helsley

On Thu, 2007-12-20 at 13:12 +, Jan Beulich wrote:
> This is the base patch, adding notification for task creation and
> deletion.
> 
> Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
> ---
>  include/linux/sched.h |8 +++-
>  kernel/fork.c |   11 +++
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> --- 2.6.24-rc5-notify-task.orig/include/linux/sched.h
> +++ 2.6.24-rc5-notify-task/include/linux/sched.h
> @@ -80,7 +80,7 @@ struct sched_param {
>  #include 
>  #include 
>  #include 
> -
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1700,6 +1700,12 @@ extern int do_execve(char *, char __user
>  extern long do_fork(unsigned long, unsigned long, struct pt_regs *, unsigned long, int __user *, int __user *);
>  struct task_struct *fork_idle(int);
> 
> +#define TASK_NEW 1
> +#define TASK_DELETE 2
> +
> +extern struct blocking_notifier_head task_notifier_list;
> +extern struct atomic_notifier_head atomic_task_notifier_list;
> +
>  extern void set_task_comm(struct task_struct *tsk, char *from);
>  extern void get_task_comm(char *to, struct task_struct *tsk);
> 
> --- 2.6.24-rc5-notify-task.orig/kernel/fork.c
> +++ 2.6.24-rc5-notify-task/kernel/fork.c
> @@ -46,6 +46,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -71,6 +72,11 @@ DEFINE_PER_CPU(unsigned long, process_co
> 
>  __cacheline_aligned DEFINE_RWLOCK(tasklist_lock);  /* outer */
> 
> +BLOCKING_NOTIFIER_HEAD(task_notifier_list);
> +EXPORT_SYMBOL_GPL(task_notifier_list);
> +ATOMIC_NOTIFIER_HEAD(atomic_task_notifier_list);
> +EXPORT_SYMBOL_GPL(atomic_task_notifier_list);
> +

When these global notifier lists were proposed years ago folks at SGI
loudly objected with concerns over anticipated cache line bouncing on
512+ cpu machines. Is that no longer a concern?

>  int nr_processes(void)
>  {
>   int cpu;
> @@ -121,6 +127,9 @@ void __put_task_struct(struct task_struc
>   WARN_ON(atomic_read(&tsk->usage));
>   WARN_ON(tsk == current);
> 
> + atomic_notifier_call_chain(&atomic_task_notifier_list,
> +TASK_DELETE, tsk);
> +
>   security_task_free(tsk);
>   free_uid(tsk->user);
>   put_group_info(tsk->group_info);

Would the atomic notifier call chain be necessary if you hooked into an
earlier section of do_exit() instead?
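
For readers unfamiliar with notifier chains, the mechanism under discussion can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel implementation: the struct and function names are hypothetical, there is no locking or RCU, and only the TASK_NEW and TASK_DELETE values come from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>

#define TASK_NEW    1
#define TASK_DELETE 2

/* Simplified stand-in for the kernel's struct notifier_block. */
struct notifier {
    int (*call)(struct notifier *self, unsigned long event, void *data);
    struct notifier *next;
};

/* Head of a singly linked callback chain (no locking in this model). */
struct notifier_head {
    struct notifier *first;
};

/* Append a callback to the end of the chain. */
void chain_register(struct notifier_head *head, struct notifier *n)
{
    struct notifier **pp;

    assert(head != NULL && n != NULL);
    for (pp = &head->first; *pp != NULL; pp = &(*pp)->next)
        ;
    n->next = NULL;
    *pp = n;
}

/* Invoke every callback in order; return how many were called. */
int chain_call(struct notifier_head *head, unsigned long event, void *data)
{
    struct notifier *n;
    int called = 0;

    assert(head != NULL);
    for (n = head->first; n != NULL; n = n->next) {
        n->call(n, event, data);
        called++;
    }
    return called;
}

/* Example subscriber: counts the creations and deletions it has seen. */
struct task_counter {
    struct notifier nb;   /* first member, so a notifier pointer can be cast back */
    int created;
    int deleted;
};

int count_event(struct notifier *self, unsigned long event, void *data)
{
    struct task_counter *tc = (struct task_counter *)self;

    (void)data;           /* the task pointer is unused in this example */
    if (event == TASK_NEW)
        tc->created++;
    else if (event == TASK_DELETE)
        tc->deleted++;
    return 0;
}
```

In this model every fork and exit walks the same `notifier_head`, which is the single piece of shared state that the cache line bouncing question above refers to.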

Cheers,
-Matt Helsley


Re: [PATCH] call sysrq_timer_list_show from a workqueue

2008-01-08 Thread Rusty Russell

On Wednesday 09 January 2008 14:33:50 Andrew Morton wrote:
> On Wed, 9 Jan 2008 14:20:18 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > On Wednesday 09 January 2008 11:21:59 Andrew Morton wrote:
> > > The string handling in here has become a bit scruffy.
> >
> > Yes, that patch also evokes a const warning.  Fixed below.
>
> No patch was included.

Yes, I decided it's a secret.  Mine, all mine!

> >  I assume you've
> > queued these because you're thinking of applying them before 2.6.24?  I'd
> > say only
> > modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch
> > warrants that (the other is unlikely and not a regression).
>
> Actually I was thinking 2.6.25 on both.

Then, you should get them next time you grab my series, no?  Or is that 
particular lever not working yet?

Hmm, I see my link was not updated (damn, ln -sfn, not ln -sf!).  Fixed now:
http://ozlabs.org/~rusty/kernel/rr-latest/

More goodies there than a UK comedy convention...

> OK, 2.6.24 seems reasonable.

Kyle acked it at least...

> Yes, it could all do with a revisit.

And it goes without saying that glory awaits they who succeed...
Rusty.

Re: [TOMOYO #6 retry 08/21] Utility functions and policy manipulation interface.

2008-01-08 Thread James Morris

On Wed, 9 Jan 2008, Kentaro Takeda wrote:

> Common functions for TOMOYO Linux.
> 
> TOMOYO Linux uses /sys/kernel/security/tomoyo interface for configuration.

Why aren't you using securityfs for this?  (It was designed for LSMs).


- James
-- 
James Morris
<[EMAIL PROTECTED]>

Re: [patch 03/19] define page_file_cache() function

2008-01-08 Thread KAMEZAWA Hiroyuki

On Tue, 8 Jan 2008 17:28:56 -0500
Rik van Riel <[EMAIL PROTECTED]> wrote:

> On Tue, 8 Jan 2008 14:18:40 -0800 (PST)
> Christoph Lameter <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 8 Jan 2008, Rik van Riel wrote:
> > 
> > > Define page_file_cache() function to answer the question:
> > >   is page backed by a file?
> > 
> > > +static inline int page_file_cache(struct page *page)
> > > +{
> > > + if (PageSwapBacked(page))
> > > + return 0;
> > 
> > Could we call this PageNotFileBacked or so? PageSwapBacked is true for 
> > pages that are RAM based. Its a bit confusing.
> 
> PageNotFileBacked confuses me a little, since shared memory segments live
> in tmpfs and are kinda sorta file backed, but go to swap instead of to a
> filesystem when there is memory pressure.
> 
How about PageIsNotCache() ? :)

When a page is a cache, there is an original data somewhere and can be dropped
out.
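
The predicate being named in this thread can be modelled as follows. This is a userspace sketch only: the flag bit and struct are hypothetical stand-ins, and the return value for the non-swap-backed case is an assumption, since the quoted hunk shows only the swap-backed branch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for the kernel's swap-backed page flag. */
#define MODEL_PG_SWAPBACKED 0x1u

struct model_page {
    unsigned int flags;
};

/* 1 if the page goes to swap under memory pressure (anon, tmpfs/shmem). */
int model_page_swap_backed(const struct model_page *page)
{
    assert(page != NULL);
    return (page->flags & MODEL_PG_SWAPBACKED) != 0;
}

/*
 * 1 if the page is a cache of data that lives in a filesystem, so it can
 * be dropped or written back there; 0 for swap-backed pages.
 * Returning 1 here is an assumption of this sketch.
 */
int model_page_file_cache(const struct model_page *page)
{
    return !model_page_swap_backed(page);
}
```

A tmpfs page would carry the swap-backed bit in this model and so would not count as file cache, which matches the point made in the quoted reply about shared memory segments.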

Thanks,
-Kame


Re: [patch] split MMC_CAP_4_BIT_DATA

2008-01-08 Thread Bryan Wu

On Jan 9, 2008 2:29 AM, Mike Frysinger <[EMAIL PROTECTED]> wrote:
> The on-chip Blackfin MMC/SD/SDIO host controller has the ability to do 1-bit
> MMC, 1-bit/4-bit SD, and 1-bit/4-bit SDIO.  Thus the current convention of
> MMC_CAP_4_BIT_DATA meaning "your host controller can do 1-bit or 4-bit for all
> modes" is insufficient for our needs.  The attached patch splits
> MMC_CAP_4_BIT_DATA into MMC_CAP_MMC_4_BIT_DATA and MMC_CAP_SD_4_BIT_DATA and
> updates all host controllers to include these in their caps and then changes
> existing code to check the new defines.  At the moment, SD/SDIO are lumped
> into MMC_CAP_SD_4_BIT_DATA ... should I bother with splitting that into SD and
> SDIO as well while I'm doing this ?
>
> Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> index 68c0e3b..ca12db7 100644
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -397,7 +397,7 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
>  * Activate wide bus (if supported).
>  */
> if ((card->csd.mmca_vsn >= CSD_SPEC_VER_4) &&
> -   (host->caps & MMC_CAP_4_BIT_DATA)) {
> +   (host->caps & MMC_CAP_MMC_4_BIT_DATA)) {
> err = mmc_switch(card, EXT_CSD_CMD_SET_NORMAL,
> EXT_CSD_BUS_WIDTH, EXT_CSD_BUS_WIDTH_4);
> if (err)
> diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
> index d1c1e0f..974b63d 100644
> --- a/drivers/mmc/core/sd.c
> +++ b/drivers/mmc/core/sd.c
> @@ -441,7 +441,7 @@ static int mmc_sd_init_card(struct mmc_host *host, u32 ocr,
> /*
>  * Switch to wider bus (if supported).
>  */
> -   if ((host->caps & MMC_CAP_4_BIT_DATA) &&
> +   if ((host->caps & MMC_CAP_SD_4_BIT_DATA) &&
> (card->scr.bus_widths & SD_SCR_BUS_WIDTH_4)) {
> err = mmc_app_set_bus_width(card, MMC_BUS_WIDTH_4);
> if (err)
> diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
> index 87a50f4..1d389c8 100644
> --- a/drivers/mmc/core/sdio.c
> +++ b/drivers/mmc/core/sdio.c
> @@ -143,7 +143,7 @@ static int sdio_enable_wide(struct mmc_card *card)
> int ret;
> u8 ctrl;
>
> -   if (!(card->host->caps & MMC_CAP_4_BIT_DATA))
> +   if (!(card->host->caps & MMC_CAP_SD_4_BIT_DATA))
> return 0;
>
> if (card->cccr.low_speed && !card->cccr.wide_bus)
> diff --git a/drivers/mmc/host/at91_mci.c b/drivers/mmc/host/at91_mci.c
> index b1edcef..63d89b0 100644
> --- a/drivers/mmc/host/at91_mci.c
> +++ b/drivers/mmc/host/at91_mci.c
> @@ -851,7 +851,7 @@ static int __init at91_mci_probe(struct platform_device *pdev)
> host->board = pdev->dev.platform_data;
> if (host->board->wire4) {
> if (cpu_is_at91sam9260() || cpu_is_at91sam9263())
> -   mmc->caps |= MMC_CAP_4_BIT_DATA;
> +   mmc->caps |= (MMC_CAP_SD_4_BIT_DATA | MMC_CAP_MMC_4_BIT_DATA);
> else
> printk("AT91 MMC: 4 wire bus mode not supported"
> " - using 1 wire\n");
> diff --git a/drivers/mmc/host/imxmmc.c b/drivers/mmc/host/imxmmc.c
> index f2070a1..67d4bc0 100644
> --- a/drivers/mmc/host/imxmmc.c
> +++ b/drivers/mmc/host/imxmmc.c
> @@ -975,7 +975,7 @@ static int imxmci_probe(struct platform_device *pdev)
> mmc->f_min = 15;
> mmc->f_max = CLK_RATE/2;
> mmc->ocr_avail = MMC_VDD_32_33;
> -   mmc->caps = MMC_CAP_4_BIT_DATA;
> +   mmc->caps = (MMC_CAP_SD_4_BIT_DATA | MMC_CAP_MMC_4_BIT_DATA);
>
> /* MMC core transfer sizes tunable parameters */
> mmc->max_hw_segs = 64;
> diff --git a/drivers/mmc/host/omap.c b/drivers/mmc/host/omap.c
> index 971e18b..b1ae793 100644
> --- a/drivers/mmc/host/omap.c
> +++ b/drivers/mmc/host/omap.c
> @@ -1079,7 +1079,7 @@ static int __init mmc_omap_probe(struct platform_device *pdev)
> mmc->caps = MMC_CAP_MULTIWRITE | MMC_CAP_BYTEBLOCK;
>
> if (minfo->wire4)
> -mmc->caps |= MMC_CAP_4_BIT_DATA;
> +mmc->caps |= (MMC_CAP_SD_4_BIT_DATA | MMC_CAP_MMC_4_BIT_DATA);
>
> /* Use scatterlist DMA to reduce per-transfer costs.
>  * NOTE max_seg_size assumption that small blocks aren't
> diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c
> index 1654a33..4fa00f1 100644
> --- a/drivers/mmc/host/pxamci.c
> +++ b/drivers/mmc/host/pxamci.c
> @@ -527,7 +527,7 @@ static int pxamci_probe(struct platform_device *pdev)
> mmc->caps = 0;
> host->cmdat = 0;
> if (!cpu_is_pxa21x() && !cpu_is_pxa25x()) {
> -   mmc->caps |= MMC_CAP_4_BIT_DATA | MMC_CAP_SDIO_IRQ;
> +   mmc->caps |= MMC_CAP_SD_4_BIT_DATA | MMC_CAP_MMC_4_BIT_DATA | MMC_CAP_SDIO_IRQ;
> host->cmdat |= CMDAT_SDIO_INT_EN;
> }
>
> diff --git a/drivers/mmc/host/sdhci.c
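
The effect of splitting the capability bit, as described in the quoted patch summary, can be sketched in userspace C. The bit values, the enum, and the function name are placeholders for illustration and are not the ones in the patch; a real host driver would also check what the card itself supports.

```c
#include <assert.h>

/* Placeholder bit values for illustration, not taken from the patch. */
#define MMC_CAP_MMC_4_BIT_DATA (1u << 0)
#define MMC_CAP_SD_4_BIT_DATA  (1u << 1)

enum card_type { CARD_MMC, CARD_SD, CARD_SDIO };

/* Return the widest bus (in bits) the host may use for this card type. */
int host_bus_width(unsigned int caps, enum card_type type)
{
    unsigned int need;

    assert(type == CARD_MMC || type == CARD_SD || type == CARD_SDIO);
    /* SD and SDIO share one capability bit, as in the quoted patch. */
    need = (type == CARD_MMC) ? MMC_CAP_MMC_4_BIT_DATA
                              : MMC_CAP_SD_4_BIT_DATA;
    return (caps & need) ? 4 : 1;
}
```

A host like the one described (1-bit MMC, 4-bit SD/SDIO) would set only `MMC_CAP_SD_4_BIT_DATA`, and the function then returns 1 for MMC cards and 4 for SD and SDIO cards.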

Re: NIC as RS232

2008-01-08 Thread Valdis . Kletnieks

On Tue, 08 Jan 2008 08:48:35 +0200, Thanasis said:

> Is there a kernel driver that would make a NIC's port work as a RS232
> port, using the serial cables that are RJ45 on one side and DB9 or DB25
> on the other? Maybe null modem cables of that type ? Or for example
> those used by cisco as console port cables?
> 
> (or may be I'm dreaming ;-)

What I *have* seen are connectors that go from DB9/25 to RJ11, not RJ45.
Basically, using the RJ11 to terminate a 4-conductor cable wired up for
serial use.  It's often hard to tell an 11 from a 45 unless you look at
it *real* close



[PATCHv3] kprobes: Introduce kprobe_handle_fault()

2008-01-08 Thread Harvey Harrison

Use a central kprobe_handle_fault() inline in kprobes.h to remove
all of the arch-dependent, practically identical implementations in
avr32, ia64, powerpc, s390, sparc64, and x86.

avr32 was the only arch without the preempt_disable/enable pair
in its notify_page_fault implementation.

This uncovered a possible bug in the s390 version as that purely
copied the x86 version unconditionally passing 14 as the trapnr
rather than the error_code parameter.

powerpc:
Tested-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>

X86-64
Acked-by: Masami Hiramatsu <[EMAIL PROTECTED]>

Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]>
---
 arch/avr32/mm/fault.c   |   21 +
 arch/ia64/mm/fault.c|   24 +---
 arch/powerpc/mm/fault.c |   25 +
 arch/s390/mm/fault.c|   25 +
 arch/sparc64/mm/fault.c |   23 +--
 arch/x86/mm/fault_64.c  |   26 ++
 include/linux/kprobes.h |   19 +++
 7 files changed, 26 insertions(+), 137 deletions(-)

diff --git a/arch/avr32/mm/fault.c b/arch/avr32/mm/fault.c
index 6560cb1..e41953e 100644
--- a/arch/avr32/mm/fault.c
+++ b/arch/avr32/mm/fault.c
@@ -20,25 +20,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_KPROBES
-static inline int notify_page_fault(struct pt_regs *regs, int trap)
-{
-   int ret = 0;
-
-   if (!user_mode(regs)) {
-   if (kprobe_running() && kprobe_fault_handler(regs, trap))
-   ret = 1;
-   }
-
-   return ret;
-}
-#else
-static inline int notify_page_fault(struct pt_regs *regs, int trap)
-{
-   return 0;
-}
-#endif
-
 int exception_trace = 1;
 
 /*
@@ -66,7 +47,7 @@ asmlinkage void do_page_fault(unsigned long ecr, struct pt_regs *regs)
int code;
int fault;
 
-   if (notify_page_fault(regs, ecr))
+   if (kprobe_handle_fault(regs, ecr))
return;
 
address = sysreg_read(TLBEAR);
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 7571076..bfc83e8 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -18,28 +18,6 @@
 
 extern void die (char *, struct pt_regs *, long);
 
-#ifdef CONFIG_KPROBES
-static inline int notify_page_fault(struct pt_regs *regs, int trap)
-{
-   int ret = 0;
-
-   if (!user_mode(regs)) {
-   /* kprobe_running() needs smp_processor_id() */
-   preempt_disable();
-   if (kprobe_running() && kprobes_fault_handler(regs, trap))
-   ret = 1;
-   preempt_enable();
-   }
-
-   return ret;
-}
-#else
-static inline int notify_page_fault(struct pt_regs *regs, int trap)
-{
-   return 0;
-}
-#endif
-
 /*
  * Return TRUE if ADDRESS points at a page in the kernel's mapped segment
  * (inside region 5, on ia64) and that page is present.
@@ -106,7 +84,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
/*
 * This is to handle the kprobes on user space access instructions
 */
-   if (notify_page_fault(regs, TRAP_BRKPT))
+   if (kprobe_handle_fault(regs, TRAP_BRKPT))
return;
 
down_read(&mm->mmap_sem);
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 8135da0..ff64bd3 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -39,29 +39,6 @@
 #include 
 #include 
 
-
-#ifdef CONFIG_KPROBES
-static inline int notify_page_fault(struct pt_regs *regs)
-{
-   int ret = 0;
-
-   /* kprobe_running() needs smp_processor_id() */
-   if (!user_mode(regs)) {
-   preempt_disable();
-   if (kprobe_running() && kprobe_fault_handler(regs, 11))
-   ret = 1;
-   preempt_enable();
-   }
-
-   return ret;
-}
-#else
-static inline int notify_page_fault(struct pt_regs *regs)
-{
-   return 0;
-}
-#endif
-
 /*
  * Check whether the instruction at regs->nip is a store using
  * an update addressing form which will update r1.
@@ -164,7 +141,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
 
-   if (notify_page_fault(regs))
+   if (kprobe_handle_fault(regs, 11))
return 0;
 
if (trap == 0x300) {
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 2456b52..a9033cf 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -51,29 +51,6 @@ extern int sysctl_userprocess_debug;
 
 extern void die(const char *,struct pt_regs *,long);
 
-#ifdef CONFIG_KPROBES
-static inline int notify_page_fault(struct pt_regs *regs, long err)
-{
-   int ret = 0;
-
-   /* kprobe_running() needs smp_processor_id() */
-   if (!user_mode(regs)) {
-   preempt_disable();
-   if (kprobe_running() && kprobe_fault_handler(regs, 14))
-   ret = 1;
-   preempt_enable();

Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Andi Kleen


> An alternative might be to come up with something decent and target 2.6.24.x

If you want zero cache line cost, the only way is to handle that using Mathieu's
inline patch infrastructure. Having a generic notifier type based on that would
probably be a good idea.

-Andi

Re: [patch 07/19] (NEW) add some sanity checks to get_scan_ratio

2008-01-08 Thread KAMEZAWA Hiroyuki

On Tue, 08 Jan 2008 15:59:46 -0500
Rik van Riel <[EMAIL PROTECTED]> wrote:

> The access ratio based scan rate determination in get_scan_ratio
> works ok in most situations, but needs to be corrected in some
> corner cases:
> - if we run out of swap space, do not bother scanning the anon LRUs
> - if we have already freed all of the page cache, we need to scan
>   the anon LRUs
> 
> Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
> 
> Index: linux-2.6.24-rc6-mm1/mm/vmscan.c
> ===
> --- linux-2.6.24-rc6-mm1.orig/mm/vmscan.c 2008-01-07 17:33:50.0 -0500
> +++ linux-2.6.24-rc6-mm1/mm/vmscan.c  2008-01-07 17:57:49.0 -0500
> @@ -1182,7 +1182,7 @@ static unsigned long shrink_list(enum lr
>  static void get_scan_ratio(struct zone *zone, struct scan_control * sc,
>   unsigned long *percent)
>  {
> - unsigned long anon, file;
> + unsigned long anon, file, free;
>   unsigned long anon_prio, file_prio;
>   unsigned long rotate_sum;
>   unsigned long ap, fp;
> @@ -1230,6 +1230,20 @@ static void get_scan_ratio(struct zone *
>   else if (fp > 100)
>   fp = 100;
>   percent[1] = fp;
> +
> + free = zone_page_state(zone, NR_FREE_PAGES);
> +
> + /*
> +  * If we have no swap space, do not bother scanning anon pages
> +  */
> + if (nr_swap_pages <= 0)
> + percent[0] = 0;
Doesn't this mean that swap-cache in ACTIVE_ANON_LIST is not scanned ?
Or swap-cache is in File-Cache list ?
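
The two corner cases listed in the quoted patch description can be modelled in userspace C. Only the swap check is visible in the quoted hunk; the second condition (`file + free <= pages_high`), its threshold parameter, and the ordering of the two checks are assumptions of this sketch, as is the function name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Adjust scan percentages for two corner cases.
 * percent[0] is the anon scan percentage, percent[1] the file one.
 */
void model_scan_ratio_fixup(long nr_swap_pages, unsigned long file,
                            unsigned long free, unsigned long pages_high,
                            unsigned long percent[2])
{
    assert(percent != NULL);
    if (nr_swap_pages <= 0) {
        /* No swap space left: do not bother scanning the anon lists. */
        percent[0] = 0;
    } else if (file + free <= pages_high) {
        /* Page cache is (almost) gone: assumed rule, scan anon only. */
        percent[0] = 100;
        percent[1] = 0;
    }
}
```

In this model a zero in `percent[0]` means nothing on the anon lists is scanned at all, so any pages that sit there would be skipped, which is the situation the question above asks about.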

Thanks,
-Kame


Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"

2008-01-08 Thread Andrew Morton

On Tue, 08 Jan 2008 23:01:07 -0500 [EMAIL PROTECTED] wrote:

> On Mon, 07 Jan 2008 16:19:30 MST, Matthew Wilcox said:
> > On Mon, Jan 07, 2008 at 06:04:25PM -0500, [EMAIL PROTECTED] wrote:
> > > Theoretically, at least.  Sometimes, in the real world, other constraints
> > > enter into it...
> > 
> > So you're saying that you can't find reliable ways to reproduce problems
> > on demand?  Those are some of the lower quality bug reports, so I don't
> > think we're losing much by having you not report them.
> 
> I'm sure that *everybody* on this list would *love* to know how you find
> a reliable way to reproduce all the bugs that start off with "after X days of
> uptime".   But when you're chasing what might be a race condition with a
> very small timing hole, you may need an event to happen several million times
> before the accumulated chance of hitting it becomes appreciable.
> 

I must say that the number of bugs which actually go away when the user
stops using nvidia/fglrx/ndiswrapper/etc is a small minority.

And you can usually tell beforehand too: if the user reports bad_page
warnings or pte table scroggage or whatever and they're using nvidia I just
hit 'd'.  But people who think that removing the nvidia driver will
magically fix that khubd-got-stuck-in-D-state bug are urinating up an
incline.


Facts:

- lots of people use nvidia/etc

- most bugs they report aren't caused by nvidia/etc

- we need lots of testers

draw your own conclusions.

Re: [patch 1/1] Convert drivers in drivers/char/drm to use .unlocked_ioctl

2008-01-08 Thread Andi Kleen

On Wed, Jan 09, 2008 at 03:37:50AM +, Dave Airlie wrote:
> 
> > The drm drivers in this patch all used drm_ioctl to perform their
> > ioctl calls.  The common function is converted to use lock_kernel()
> > and unlock_kernel() and the drivers are converted to use .unlocked_ioctl
> > 
> 
> NAK

Did you actually read Kevin's patch? 

> 
> I've started looking at this already in the drm git tree, I'm going to 
> provide both locked and unlocked paths for drivers to choose, as we need 

If you do that you'll exactly need Kevin's patch as a base.

-Andi

Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"

2008-01-08 Thread Valdis . Kletnieks

On Mon, 07 Jan 2008 16:19:30 MST, Matthew Wilcox said:

> So you're saying that you can't find reliable ways to reproduce problems
> on demand?  Those are some of the lower quality bug reports, so I don't
> think we're losing much by having you not report them.

And in the next e-mail in my lkml folder we see:

On Mon, 07 Jan 2008 18:21:45 EST, Parag Warudkar said:
> BTW, I have so far tested 2.6.24-rc4/5/6/7 and 2.6.23.12 - all of
> which have this problem.
> 
> Yesterday I went back to using 2.6.22.15 and after a day's uptime it
> has not reproduced with the same config.
> 
> Time for git-bisect I suppose? (the only problem is that this takes
> anywhere between 20 minutes to 8 hrs to confirm reliably.)

Are you saying that we're not losing much if Parag says "screw it" and
doesn't report the problem?



Do SATA tape drives work?

2008-01-08 Thread Jonathan Woithe

Hi guys

I was wondering whether anyone can shed any light on the status of SATA tape
drives.  There's very little info on the net about this at least in the
places I've checked; the only thing of any significance I've found thus far
is a note in a Bacula document dated April 2007 which states that drives
other than real SCSI units don't generally work with Bacula.

To put this into context, I'm looking at purchasing a Sony SDX470VRB SATA
AIT-1 tape drive for use with the SATA controller on an Intel DG31PR
mainboard.  The drive will be used primarily with tar/cpio.  Obviously
however I only want to make the purchase if there's a reasonable chance of
it working.

I would appreciate any information you can shed on this issue.

Regards
  jonathan

Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"

2008-01-08 Thread Valdis . Kletnieks

On Mon, 07 Jan 2008 16:19:30 MST, Matthew Wilcox said:
> On Mon, Jan 07, 2008 at 06:04:25PM -0500, [EMAIL PROTECTED] wrote:
> > Theoretically, at least.  Sometimes, in the real world, other constraints
> > enter into it...
> 
> So you're saying that you can't find reliable ways to reproduce problems
> on demand?  Those are some of the lower quality bug reports, so I don't
> think we're losing much by having you not report them.

I'm sure that *everybody* on this list would *love* to know how you find
a reliable way to reproduce all the bugs that start off with "after X days of
uptime".   But when you're chasing what might be a race condition with a
very small timing hole, you may need an event to happen several million times
before the accumulated chance of hitting it becomes appreciable.
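
The "accumulated chance" argument can be made concrete with a small calculation. This sketch assumes each event hits the race independently with the same probability p, which real workloads may not satisfy; the function name is made up for illustration.

```c
#include <assert.h>

/*
 * Probability of hitting a race at least once in n independent events,
 * each with per-event probability p: 1 - (1 - p)^n.
 * The power is computed by repeated squaring, so no libm is needed.
 */
double hit_probability(double p, unsigned long n)
{
    double miss = 1.0;        /* accumulates (1 - p)^n */
    double base = 1.0 - p;

    assert(p >= 0.0 && p <= 1.0);
    while (n > 0) {
        if (n & 1)
            miss *= base;
        base *= base;
        n >>= 1;
    }
    return 1.0 - miss;
}
```

For a one-in-a-million timing hole, a million events give only a roughly 63% chance of seeing the bug once, so a reproducer that fires after days of uptime is consistent with a very small window.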


More breakage in native_rdtsc out of line in git-x86

2008-01-08 Thread Andi Kleen


I had some boot failures here with git-x86 with init and hotplug all 
segfaulting early on userland with new glibc.  Bisecting found

commit 6aea5bc37fa790eaf3a942f0785985914568e214
Author: Ingo Molnar <[EMAIL PROTECTED]>
Date:   Sat Jan 5 13:27:08 2008 +0100

x86: move native_read_tsc() offline

move native_read_tsc() offline.

I think the problem is that the vsyscall/vdso code calls it through
vread and for that it has to be exported. There also seems to be
another bug: the old style vsyscalls do not use the TSC vread,
which masks the problem on older glibc.

Stepping with gdb through old style vgettimeofday() confirms that RDTSC is 
not used.

A long time ago we had a similar problem once and it was because of a 
problem exporting the vsyscall variables in vmlinux.lds.S -- looks like that 
has reappeared.

I think the new glibc shows it because it uses the vDSO not 
the older vsyscall and the new vDSO probably still works. Anyways haven't 
investigated why that is in detail yet, but that's a separate 
regression.

Back to the boot failure:

Unfortunately simply adding __vsyscall_fn to native_read_tsc doesn't 
work -- causes early kernel faults like

PANIC: early exception rip ff600105 error 10 cr2 ff600105
Pid: 0, comm: swapper Not tainted 2.6.24-rc6 #58

Call Trace:
 [] native_sched_clock+0x9/0x3f
 [] init_idle+0x33/0xd1
 [] sched_init+0x26d/0x283
 [] start_kernel+0x10b/0x2bd
 [] _sinittext+0x114/0x11b

Not sure why that is -- in theory the vsyscall functions should be callable 
from the main kernel. Might be a binutils problem or another code
regression.

Anyways it looks like the only good fix is to either revert that or
fork into two functions one for vread() and another for normal tsc ->read()

This is all in addition to the problem of it having incorrect barriers.
I note that my original patch didn't have any of these problems.

I'm using the appended revert patch here as a workaround for now.

-Andi


Revert rdtsc out of line change

Reverts 

commit 6aea5bc37fa790eaf3a942f0785985914568e214
Author: Ingo Molnar <[EMAIL PROTECTED]>
Date:   Sat Jan 5 13:27:08 2008 +0100

x86: move native_read_tsc() offline

move native_read_tsc() offline.

The function is called by vsyscalls in ring 3, so it can't be out of line this 
way.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux/arch/x86/kernel/rtc.c
===
--- linux.orig/arch/x86/kernel/rtc.c
+++ linux/arch/x86/kernel/rtc.c
@@ -194,14 +194,3 @@ int update_persistent_clock(struct times
 {
return set_rtc_mmss(now.tv_sec);
 }
-
-unsigned long long native_read_tsc(void)
-{
-   DECLARE_ARGS(val, low, high);
-
-   asm volatile("rdtsc" : EAX_EDX_RET(val, low, high));
-   rdtsc_barrier();
-
-   return EAX_EDX_VAL(val, low, high);
-}
-
Index: linux/include/asm-x86/msr.h
===
--- linux.orig/include/asm-x86/msr.h
+++ linux/include/asm-x86/msr.h
@@ -91,7 +91,13 @@ static inline int native_write_msr_safe(
return err;
 }
 
-extern unsigned long long native_read_tsc(void);
+static inline unsigned long long native_read_tsc(void)
+{
+   DECLARE_ARGS(val, low, high);
+
+   asm volatile("rdtsc" : EAX_EDX_RET(val, low, high));
+   return EAX_EDX_VAL(val, low, high);
+}
 
 static inline unsigned long long native_read_pmc(int counter)
 {



Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Dave Airlie


> 
> An alternative might be to come up with something decent and target 2.6.24.x

I don't see mmiotrace getting merged into a stable kernel... I do
however see it getting cleaned up for 2.6.25 now that people know how
fragile the kernel hooks for it are..

> We put the crappy code back in for 2.6.24 then take it out immediately
> after 2.6.24 and put something else in to support mmiotrace and perhaps the
> other new mystery features to which you refer below.  hm.

(I think the other mystery feature is actually a Novell kernel debugger
but I'm not sure, madwifi use it for similar reasons to mmiotrace I
think..)

> > >   all that crap
> > >   }
> > > 
> > > 
> > > But that's all speculation.  Has anyone actually measured the pagefault
> > > latency impact of this change?

Message-Id: <[EMAIL PROTECTED]>
Subject: [patch 20/38] Minor fault path optimization.
Date:   Fri, 27 Apr 2007 16:05:23 +0200

was a patch to do exactly that.. hch decided the feature wasn't useful and 
posted a patch to remove it..

> 
> That change has been in the mainline tree for nearly three months.  All
> these affected parties have left it until the eve of 2.6.24 to actually
> tell us about it.  This is causing me sympathy problems :(
> 

Jan first complained on the 4th of December last year. I'm just posting this
now because Linus said to send him a patch to revert regressions rather than
just complain, so I've prepared the patch to put back the old behaviour from
2.6.23. This was only brought to my notice this morning but I'm not going
to let that stop me from trying to find a correct fix rather than just
ripping the feature out..

I think we could apply the page fault cleanup patch I mentioned earlier on 
top of this patch and get back the 300 cycles and that would make people 
happy, it makes sense for mmiotrace to use kprobes hooks and not have to 
do this stuff directly but if that is what is wanted the mmiotrace guys 
can do it directly in the future.

Dave.

Re: pnpacpi : exceeded the max number of IO resources

2008-01-08 Thread Len Brown


> > Well, yes, the warning is actually new as well. Previously your kernel 
> > just silently ignored 8 more mem resources than it does now it seems.
> > 
> > Given that people are hitting these limits, it might make sense to just 
> > do away with the warning for 2.6.24 again while waiting for the dynamic 
> > code?
> 
> Ping. Should these warnings be reverted for 2.6.24?

No. I don't think hiding this issue again is a good idea.
I'd rather live with people complaining about an additional dmesg line.

-Len


Re: [PATCH 5/6] syslets: add generic syslets infrastructure

2008-01-08 Thread Rusty Russell

On Wednesday 09 January 2008 14:00:04 Zach Brown wrote:
> > Firstly, why not just specify an address for the return value and be
> > done with it?  This infrastructure seems overkill, and you can always
> > extend later if required.
>
> Sorry, which infrastructure?
>
> Providing the function and stack to return to?  Sure, I could certainly
> entertain the idea of not having syslet tasks return to userspace in the
> first pass.  Ingo sure seemed excited by the idea.
>
> Or do you mean the syscall return value ending up in the userspace
> completion event ring?  That's mostly about being able to wait for
> pending syslets to complete.

The latter.  A ring is optimal for processing a huge number of requests, but 
if you're really going to be firing off syslet threads all over the place 
you're not going to be optimal anyway.  And being able to point the return 
value to the stack or into some datastructure is way nicer to code (zero 
setup == easy to understand and easy to convert).

For notification, see below.

> > Secondly, you really should allow integration with an eventfd so you
> > don't make the posix AIO mistake of providing a poll-incompatible
> > interface.
>
> Yeah, this seems straight forward enough that I haven't made it an
> initial priority.  I'm sure it will be helpful for people who are stuck
> integrating with entrenched software that wants to wait for pollable fds.

Unfortunately, waiting for someone to write a killer app which uses your new 
API is the road to disappointment.  The real target is convincing the handful 
of important apps (Samba, Apache, ...) to #ifdef around some small piece of 
code in order to get performance.  And a mere single design wart could mean 
that never happens.  Look at epoll, it's probably been the most successful 
and it's still damn niche.
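
To make the eventfd suggestion concrete, here is a minimal userspace sketch of a poll-compatible completion channel. It is not the syslet API; signal_completion() and wait_completions() are hypothetical helper names, and only eventfd(2), poll(2), read(2) and write(2) are real interfaces.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical completion side: report one finished request by adding 1
 * to the eventfd counter.  Returns 0 on success, -1 on failure.
 */
int signal_completion(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Hypothetical waiter: uses plain poll(), so the same fd could sit in an
 * existing event loop next to sockets and pipes.  Returns the number of
 * completions signalled since the last call, 0 on timeout, -1 on error.
 */
long wait_completions(int efd, int timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    uint64_t count;
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;
    /* Reading a non-semaphore eventfd returns the counter and resets it. */
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (long)count;
}
```

A caller would create the descriptor with eventfd(0, 0) and hand it to whatever issues the requests; the waiting side never needs to know about anything but a pollable fd.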

> For more flexible software, though, it's compelling to now be able to
> aggregate waiting for completion of the existing waiting syscalls (poll,
> epoll_wait, futexes, whatever) by issuing them as concurrent syslets.

Is replacing epoll with syslets really going to win, even if you're writing 
apps from scratch?  Anyway a fast notification mechanism is a different 
problem than syslets, and should be separated.

> > Finally, and probably most alarmingly, AFAICT randomly changing TID will
> > break all threaded programs, which means this won't be fitted into
> > existing code bases, making it YA niche Linux-only API 8(
>
> I wonder if there isn't an opportunity to add a clone() flag which
> juggles the association between TIDs and task_structs.  I don't relish
> the idea of investigating the life cycles of task_struct references that
> derive from TIDs and seeing how those would race with a syslet blocking
> and cloning, but, well, maybe that's what needs to be done.

This must be solved, yet all avenues seem crawling with worms.  Redirecting 
find_task_by_pid() to find the original and converting all the places where 
we return tids to userspace?  Swapping tids when we clone?  Duplicate tids, 
with only the non-syslet one being returned from find_task_by_pid()?

> This all isn't my area of expertise, though, sadly.  It would be swell
> if someone wanted to look into it before I'm forced to learn yet another
> weird corner of the kernel.

Let's just tell Ingo it's impossible to solve :)

Rusty.

[PATCH][AGP] intel_agp: add new chipset ids

2008-01-08 Thread Zhenyu Wang


Dave,

This patch adds new PCI IDs for an Intel integrated graphics
chipset, along with the GTT access change it needs and the new
GTT size definitions.

Thanks.

Signed-off-by: Zhenyu Wang <[EMAIL PROTECTED]>
---
 drivers/char/agp/agp.h   |3 +++
 drivers/char/agp/intel-agp.c |   31 ++-
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h
index b83824c..d132914 100644
--- a/drivers/char/agp/agp.h
+++ b/drivers/char/agp/agp.h
@@ -235,6 +235,9 @@ struct agp_bridge_data {
 #define I965_PGETBL_SIZE_512KB (0 << 1)
 #define I965_PGETBL_SIZE_256KB (1 << 1)
 #define I965_PGETBL_SIZE_128KB (2 << 1)
+#define I965_PGETBL_SIZE_1MB   (3 << 1)
+#define I965_PGETBL_SIZE_2MB   (4 << 1)
+#define I965_PGETBL_SIZE_1_5MB (5 << 1)
 #define G33_PGETBL_SIZE_MASK(3 << 8)
 #define G33_PGETBL_SIZE_1M  (1 << 8)
 #define G33_PGETBL_SIZE_2M  (2 << 8)
diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index d879619..091e765 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -30,13 +30,16 @@
 #define PCI_DEVICE_ID_INTEL_Q35_IG  0x29B2
 #define PCI_DEVICE_ID_INTEL_Q33_HB  0x29D0
 #define PCI_DEVICE_ID_INTEL_Q33_IG  0x29D2
+#define PCI_DEVICE_ID_INTEL_IGD_HB  0x2A40
+#define PCI_DEVICE_ID_INTEL_IGD_IG  0x2A42
 
 #define IS_I965 (agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82946GZ_HB || \
  agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965G_1_HB || \
  agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965Q_HB || \
  agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965G_HB || \
  agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GM_HB || \
- agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GME_HB)
+ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GME_HB || \
+ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_IGD_HB)
 
 #define IS_G33 (agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_G33_HB || \
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_Q35_HB || \
@@ -456,6 +459,15 @@ static void intel_i830_init_gtt_entries(void)
case I965_PGETBL_SIZE_512KB:
size = 512;
break;
+   case I965_PGETBL_SIZE_1MB:
+   size = 1024;
+   break;
+   case I965_PGETBL_SIZE_2MB:
+   size = 2048;
+   break;
+   case I965_PGETBL_SIZE_1_5MB:
+   size = 1024 + 512;
+   break;
default:
printk(KERN_INFO PFX "Unknown page table size, "
   "assuming 512KB\n");
@@ -981,6 +993,7 @@ static int intel_i965_create_gatt_table(struct agp_bridge_data *bridge)
struct aper_size_info_fixed *size;
int num_entries;
u32 temp;
+   int gtt_offset, gtt_size;
 
size = agp_bridge->current_size;
page_order = size->page_order;
@@ -990,13 +1003,18 @@ static int intel_i965_create_gatt_table(struct agp_bridge_data *bridge)
pci_read_config_dword(intel_private.pcidev, I915_MMADDR, &temp);
 
temp &= 0xfff00000;
-   intel_private.gtt = ioremap((temp + (512 * 1024)) , 512 * 1024);
 
-   if (!intel_private.gtt)
-   return -ENOMEM;
+   if (agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_IGD_HB)
+  gtt_offset = gtt_size = MB(2);
+   else
+  gtt_offset = gtt_size = KB(512);
+
+   intel_private.gtt = ioremap((temp + gtt_offset) , gtt_size);
 
+   if (!intel_private.gtt)
+  return -ENOMEM;
 
-   intel_private.registers = ioremap(temp,128 * 4096);
+   intel_private.registers = ioremap(temp, 128 * 4096);
if (!intel_private.registers) {
iounmap(intel_private.gtt);
return -ENOMEM;
@@ -1884,6 +1902,8 @@ static const struct intel_driver_description {
NULL, &intel_g33_driver },
{ PCI_DEVICE_ID_INTEL_Q33_HB, PCI_DEVICE_ID_INTEL_Q33_IG, 0, "Q33",
NULL, &intel_g33_driver },
+   { PCI_DEVICE_ID_INTEL_IGD_HB, PCI_DEVICE_ID_INTEL_IGD_IG, 0,
+   "Intel Integrated Graphics Device", NULL, &intel_i965_driver },
{ 0, 0, 0, NULL, NULL, NULL }
 };
 
@@ -2073,6 +2093,7 @@ static struct pci_device_id agp_intel_pci_table[] = {
ID(PCI_DEVICE_ID_INTEL_G33_HB),
ID(PCI_DEVICE_ID_INTEL_Q35_HB),
ID(PCI_DEVICE_ID_INTEL_Q33_HB),
+   ID(PCI_DEVICE_ID_INTEL_IGD_HB),
{ }
 };
 
-- 
1.5.3.7
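
For readers following the new size encodings, the switch in intel_i830_init_gtt_entries() boils down to the mapping sketched below. This is a standalone illustration, not driver code: gtt_size_kb() is a hypothetical name, and the mask value is an assumption because the patch does not show it.

```c
#include <assert.h>
#include <stdint.h>

/* Encodings copied from the agp.h hunk above. */
#define I965_PGETBL_SIZE_512KB  (0 << 1)
#define I965_PGETBL_SIZE_256KB  (1 << 1)
#define I965_PGETBL_SIZE_128KB  (2 << 1)
#define I965_PGETBL_SIZE_1MB    (3 << 1)
#define I965_PGETBL_SIZE_2MB    (4 << 1)
#define I965_PGETBL_SIZE_1_5MB  (5 << 1)

/* Assumption: the size field is three bits wide starting at bit 1. */
#define PGETBL_SIZE_MASK_ASSUMED (7 << 1)

/* Decode the page table size field of a PGETBL_CTL value into KB. */
unsigned int gtt_size_kb(uint32_t pgetbl_ctl)
{
    unsigned int size;

    switch (pgetbl_ctl & PGETBL_SIZE_MASK_ASSUMED) {
    case I965_PGETBL_SIZE_128KB:  size = 128;        break;
    case I965_PGETBL_SIZE_256KB:  size = 256;        break;
    case I965_PGETBL_SIZE_512KB:  size = 512;        break;
    case I965_PGETBL_SIZE_1MB:    size = 1024;       break;
    case I965_PGETBL_SIZE_2MB:    size = 2048;       break;
    case I965_PGETBL_SIZE_1_5MB:  size = 1024 + 512; break;
    default:                      size = 512;        break; /* same fallback as the driver */
    }

    /* Every size listed above is a multiple of 128KB. */
    assert(size % 128 == 0);
    return size;
}
```

Note that the 1.5MB encoding sorts after 2MB, so the field cannot be treated as a simple shift count.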


Re: [patch 1/1] Convert drivers in drivers/char/drm to use .unlocked_ioctl

2008-01-08 Thread Dave Airlie


> The drm drivers in this patch all used drm_ioctl to perform their
> ioctl calls.  The common function is converted to use lock_kernel()
> and unlock_kernel() and the drivers are converted to use .unlocked_ioctl
> 

NAK

I've started looking at this already in the drm git tree. I'm going to
provide both locked and unlocked paths for drivers to choose, as we need
to audit the drivers on a per-driver basis. The other option is to provide
wrappers in each driver to do the lock/unlock kernel and leave drm_ioctl
alone..

I'll take a look; the kmalloc failure case sounds like a bug though..
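
For illustration only, the usual way to avoid that kind of early-return leak is a single exit path. The sketch below is a userspace stand-in, not drm code: struct fake_dev, handle_ioctl() and the fail_alloc switch are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device with an in-flight ioctl counter. */
struct fake_dev {
    int ioctl_count;
};

/*
 * Every exit, including allocation failure, goes through one label so the
 * count taken on entry is always dropped.  fail_alloc simulates the
 * allocator returning NULL.
 */
int handle_ioctl(struct fake_dev *dev, size_t size, int fail_alloc)
{
    int retcode = -EINVAL;
    char *kdata = NULL;

    assert(dev != NULL);
    dev->ioctl_count++;

    kdata = fail_alloc ? NULL : malloc(size);
    if (!kdata) {
        retcode = -ENOMEM;
        goto out;           /* a bare return here would leak the count */
    }

    retcode = 0;            /* the real work would happen here */

out:
    free(kdata);            /* free(NULL) is a no-op */
    dev->ioctl_count--;
    return retcode;
}
```

The same shape also gives a single place to drop a lock taken at the top of the function.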

Dave.

> Signed-off-by: Kevin Winchester <[EMAIL PROTECTED]>
> 
> ---
> 
> I also noted that in the failed kmalloc case in drm_ioctl(), the function
> immediately returns -ENOMEM, rather than following the error path that
> calls atomic_dec(&dev->ioctl_count).  I'm not sure if the ioctl_count
> is just not important in the -ENOMEM case, or if this is a bug.
> 
>  drivers/char/drm/drmP.h   |3 +--
>  drivers/char/drm/drm_drv.c|   10 ++
>  drivers/char/drm/i810_dma.c   |2 +-
>  drivers/char/drm/i810_drv.c   |2 +-
>  drivers/char/drm/i830_dma.c   |2 +-
>  drivers/char/drm/i830_drv.c   |2 +-
>  drivers/char/drm/i915_drv.c   |2 +-
>  drivers/char/drm/mga_drv.c|2 +-
>  drivers/char/drm/r128_drv.c   |2 +-
>  drivers/char/drm/radeon_drv.c |2 +-
>  drivers/char/drm/savage_drv.c |2 +-
>  drivers/char/drm/sis_drv.c|2 +-
>  drivers/char/drm/tdfx_drv.c   |2 +-
>  drivers/char/drm/via_drv.c|2 +-
>  14 files changed, 19 insertions(+), 18 deletions(-)
> 
> Index: v2.6.24-rc7/drivers/char/drm/drmP.h
> ===
> --- v2.6.24-rc7.orig/drivers/char/drm/drmP.h
> +++ v2.6.24-rc7/drivers/char/drm/drmP.h
> @@ -833,8 +833,7 @@ static inline int drm_mtrr_del(int handl
>   /* Driver support (drm_drv.h) */
>  extern int drm_init(struct drm_driver *driver);
>  extern void drm_exit(struct drm_driver *driver);
> -extern int drm_ioctl(struct inode *inode, struct file *filp,
> -  unsigned int cmd, unsigned long arg);
> +extern long drm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
>  extern long drm_compat_ioctl(struct file *filp,
>unsigned int cmd, unsigned long arg);
>  extern int drm_lastclose(struct drm_device *dev);
> Index: v2.6.24-rc7/drivers/char/drm/drm_drv.c
> ===
> --- v2.6.24-rc7.orig/drivers/char/drm/drm_drv.c
> +++ v2.6.24-rc7/drivers/char/drm/drm_drv.c
> @@ -438,7 +438,6 @@ static int drm_version(struct drm_device
>  /**
>   * Called whenever a process performs an ioctl on /dev/drm.
>   *
> - * \param inode device inode.
>   * \param file_priv DRM file private.
>   * \param cmd command.
>   * \param arg user argument.
> @@ -447,8 +446,7 @@ static int drm_version(struct drm_device
>   * Looks up the ioctl function in the ::ioctls table, checking for root
>   * previleges if so required, and dispatches to the respective function.
>   */
> -int drm_ioctl(struct inode *inode, struct file *filp,
> -   unsigned int cmd, unsigned long arg)
> +long drm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>  {
>   struct drm_file *file_priv = filp->private_data;
>   struct drm_device *dev = file_priv->head->dev;
> @@ -458,6 +456,7 @@ int drm_ioctl(struct inode *inode, struc
>   int retcode = -EINVAL;
>   char *kdata = NULL;
>  
> + lock_kernel();
>   atomic_inc(&dev->ioctl_count);
>   atomic_inc(&dev->counts[_DRM_STAT_IOCTLS]);
>   ++file_priv->ioctl_count;
> @@ -494,8 +493,10 @@ int drm_ioctl(struct inode *inode, struc
>   } else {
>   if (cmd & (IOC_IN | IOC_OUT)) {
>   kdata = kmalloc(_IOC_SIZE(cmd), GFP_KERNEL);
> - if (!kdata)
> + if (!kdata) {
> + unlock_kernel();
>   return -ENOMEM;
> + }
>   }
>  
>   if (cmd & IOC_IN) {
> @@ -520,6 +521,7 @@ int drm_ioctl(struct inode *inode, struc
>   atomic_dec(&dev->ioctl_count);
>   if (retcode)
>   DRM_DEBUG("ret = %x\n", retcode);
> + unlock_kernel();
>   return retcode;
>  }
>  
> Index: v2.6.24-rc7/drivers/char/drm/i810_dma.c
> ===
> --- v2.6.24-rc7.orig/drivers/char/drm/i810_dma.c
> +++ v2.6.24-rc7/drivers/char/drm/i810_dma.c
> @@ -115,7 +115,7 @@ static int i810_mmap_buffers(struct file
>  static const struct file_operations i810_buffer_fops = {
>   .open = drm_open,
>   .release = drm_release,
> - .ioctl = drm_ioctl,
> + .unlocked_ioctl = drm_ioctl,
>   .mmap = i810_mmap_buffers,
>   .fasync = drm_fasync,
>  };
> Index: v2.6.24-rc7/drivers/char/drm/i810_drv.c
>

[RFC] x86: Add oops_begin, oops_end to X86_32

2008-01-08 Thread Harvey Harrison

Some more work is needed on this patch, but I'm looking for
some feedback about the general direction.  X86_64's
implementation seems nicer and it would be useful to use
a common base for further unification in the oops handling.

Modify the X86_32 implementation of die() using helpers
oops_begin()/oops_end().  Small whitespace change in
traps_64.c for easier comparison between the two.

Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]>
---
 arch/x86/kernel/traps_32.c |  137 +++-
 arch/x86/kernel/traps_64.c |   11 +--
 2 files changed, 76 insertions(+), 72 deletions(-)

diff --git a/arch/x86/kernel/traps_32.c b/arch/x86/kernel/traps_32.c
index 5f2b38e..a4092ed 100644
--- a/arch/x86/kernel/traps_32.c
+++ b/arch/x86/kernel/traps_32.c
@@ -352,10 +352,61 @@ int is_valid_bugaddr(unsigned long ip)
return ud2 == 0x0b0f;
 }
 
-static int die_counter;
+static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED;
+static int die_owner = -1;
+static unsigned int die_nest_count;
+
+unsigned long __kprobes oops_begin(void)
+{
+   int cpu;
+   unsigned long flags;
+
+   oops_enter();
+
+   raw_local_irq_save(flags);
+   cpu = smp_processor_id();
+   /* racy, but better than risking deadlock. */
+   if (!__raw_spin_trylock(&die_lock) && cpu != die_owner) {
+   __raw_spin_lock(&die_lock);
+   }
+   die_nest_count++;
+   die_owner = cpu;
+   console_verbose();
+   bust_spinlocks(1);
+   return flags;
+}
+
+void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
+{ 
+   die_owner = -1;
+   bust_spinlocks(0);
+   die_nest_count--;
+   if (!die_nest_count)
+   /* Nest count reaches zero, release the lock. */
+   __raw_spin_unlock(&die_lock);
+   raw_local_irq_restore(flags);
+
+   if (!regs) {
+   oops_exit();
+   return;
+   }
+
+   if (kexec_should_crash(current))
+   crash_kexec(regs);
+
+   if (in_interrupt())
+   panic("Fatal exception in interrupt");
+
+   if (panic_on_oops)
+   panic("Fatal exception");
+
+   oops_exit();
+   do_exit(signr);
+}
 
 int __kprobes __die(const char * str, struct pt_regs * regs, long err)
 {
+   static int die_counter;
unsigned long sp;
unsigned short ss;
 
@@ -371,24 +422,23 @@ int __kprobes __die(const char * str, struct pt_regs * regs, long err)
 #endif
printk("\n");
 
-   if (notify_die(DIE_OOPS, str, regs, err,
-   current->thread.trap_no, SIGSEGV) !=
-   NOTIFY_STOP) {
-   show_registers(regs);
-   /* Executive summary in case the oops scrolled away */
-   sp = (unsigned long) (&regs->sp);
-   savesegment(ss, ss);
-   if (user_mode(regs)) {
-   sp = regs->sp;
-   ss = regs->ss & 0xffff;
-   }
-   printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip);
-   print_symbol("%s", regs->ip);
-   printk(" SS:ESP %04x:%08lx\n", ss, sp);
-   return 0;
-   } else {
+   if (notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no,
+   SIGSEGV) == NOTIFY_STOP)
return 1;
+
+   show_registers(regs);
+   add_taint(TAINT_DIE);
+   /* Executive summary in case the oops scrolled away */
+   sp = (unsigned long) (&regs->sp);
+   savesegment(ss, ss);
+   if (user_mode(regs)) {
+   sp = regs->sp;
+   ss = regs->ss & 0xffff;
}
+   printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip);
+   print_symbol("%s", regs->ip);
+   printk(" SS:ESP %04x:%08lx\n", ss, sp);
+   return 0;
 }
 
 /*
@@ -397,58 +447,15 @@ int __kprobes __die(const char * str, struct pt_regs * regs, long err)
  */
 void die(const char * str, struct pt_regs * regs, long err)
 {
-   static struct {
-   raw_spinlock_t lock;
-   u32 lock_owner;
-   int lock_owner_depth;
-   } die = {
-   .lock = __RAW_SPIN_LOCK_UNLOCKED,
-   .lock_owner =   -1,
-   .lock_owner_depth = 0
-   };
-   unsigned long flags;
+   unsigned long flags = oops_begin();
 
-   oops_enter();
-
-   if (die.lock_owner != raw_smp_processor_id()) {
-   console_verbose();
-   raw_local_irq_save(flags);
-   __raw_spin_lock(&die.lock);
-   die.lock_owner = smp_processor_id();
-   die.lock_owner_depth = 0;
-   bust_spinlocks(1);
-   } else
-   raw_local_irq_save(flags);
-
-   if (++die.lock_owner_depth < 3) {
+   if (!user_mode(regs))
report_bug(regs->ip, regs);
 
-   if (__die(str, regs, err))
-   regs = NULL;
-   } else {
-   printk(KERN_EMERG

Re: [PATCH] call sysrq_timer_list_show from a workqueue

2008-01-08 Thread Andrew Morton

On Wed, 9 Jan 2008 14:20:18 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Wednesday 09 January 2008 11:21:59 Andrew Morton wrote:
> > The string handling in here has become a bit scruffy.
> 
> Yes, that patch also evokes a const warning.  Fixed below.

No patch was included.

>  I assume you've
> queued these because you're thinking of applying them before 2.6.24?  I'd say
> only modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch
> warrants that (the other is unlikely and not a regression).

Actually I was thinking 2.6.25 on both.



   Kyle McMartin reports sysrq_timer_list_show() can hit the module
   mutex; these paths don't need to though, since we long ago changed all
   the module list manipulation to occur via stop_machine().

   Disabling preemption is enough.

Ah.  sysrq_timer_list_show() is called from interrupt.



OK, 2.6.24 seems reasonable.

> > afacit the `namebuf[KSYM_NAME_LEN - 1] = 0;' would be unneeded if we were
> > to use strlcpy() and I suspect the `namebuf[0] = 0;' isn't needed either.
> >
> > And the use of strlcpy() means we don't need to subtract 1 from
> > KSYM_NAME_LEN and we don't need to fret about weird strncpy semantics when
> > the input string is too large.
> >
> >
> > And the fact that incoming arg `namebuf' MUST point at a
> > KSYM_NAME_LEN-sized buffer could be better communicated by using a
> > dedicated struct for this, or by giving the arg a type of `char
> > namebuf[KSYM_NAME_LEN]'.  Or by adding a comment. Or by just ignoring
> > me and doing something more useful.
> 
> Or better, rework all the name lookup interfaces, rather than having: 
> 
> struct module *module_text_address(unsigned long addr);
> struct module *__module_text_address(unsigned long addr);
> int is_module_address(unsigned long addr);
> int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
>   char *name, char *module_name, int *exported);
> char *module_address_lookup(unsigned long addr,
>   unsigned long *symbolsize,
>   unsigned long *offset,
>   char **modname,
>   char *namebuf);
> int lookup_module_symbol_name(unsigned long addr, char *symname);
> int lookup_module_symbol_attrs(unsigned long addr, unsigned long *size,
>  unsigned long *offset, char *modname, char 
> *name);
> unsigned long module_kallsyms_lookup_name(const char *name);
> 
> unsigned long kallsyms_lookup_name(const char *name);
> extern int kallsyms_lookup_size_offset(unsigned long addr,
> unsigned long *symbolsize,
> unsigned long *offset);
> const char *kallsyms_lookup(unsigned long addr,
>   unsigned long *symbolsize,
>   unsigned long *offset,
>   char **modname, char *namebuf);
> extern int sprint_symbol(char *buffer, unsigned long address);
> extern void __print_symbol(const char *fmt, unsigned long address);
> int lookup_symbol_name(unsigned long addr, char *symname);
> int lookup_symbol_attrs(unsigned long addr, unsigned long *size,
>   unsigned long *offset, char *modname, char *name);

Yes, it could all do with a revisit.
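
As a footnote to the strlcpy() remarks quoted earlier in this message, here is a small userspace sketch of the difference. bounded_copy() is a hypothetical stand-in written to follow strlcpy() semantics (strlcpy() itself is not available in every libc), and strncpy_copy() shows the hand-terminated pattern being criticised.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * strlcpy()-style copy: writes at most size - 1 bytes, always terminates
 * the destination when size > 0, and returns strlen(src) so the caller
 * can detect truncation.
 */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/*
 * The strncpy() pattern: copy size - 1 bytes, then add the terminator by
 * hand, because strncpy() does not write one when src is too long.
 */
void strncpy_copy(char *dst, const char *src, size_t size)
{
    assert(size > 0);
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}
```

Both end up with the same buffer contents; the first needs no size arithmetic at the call site and reports truncation through its return value.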

Re: [PATCH 00/11] writeback bug fixes and simplifications

2008-01-08 Thread WU Fengguang

On Sat, Dec 29, 2007 at 03:56:59PM +0100, Hans-Peter Jansen wrote:
> Am Freitag, 28. Dezember 2007 schrieb Sascha Warner:
> > Andrew Morton wrote:
> > > On Thu, 27 Dec 2007 23:08:40 +0100 Sascha Warner <[EMAIL PROTECTED]> wrote:
> > >> Hi,
> > >>
> > >> I applied your patches to 2.6.24-rc6-mm1, but now I am faced with one
> > >> pdflush often using 100% CPU for a long time. There seem to be some
> > >> rare pauses from its 100% usage, however.
> > >>
> > >> On ~23 minutes uptime i have ~19 minutes pdflush runtime.
> > >>
> > >> This is on E6600, x86_64, 2 Gig RAM, SATA HDD, running on gentoo
> > >> ~x64_64
> > >>
> > >> Let me know if you need more info.
> > >
> > > (some) cc's restored.  Please, always do reply-to-all.
> >
> > Hi Wu,
> 
> Sascha, if you want to address Fengguang by his first name, note that
> Chinese and Bavarians (and some others I forgot now, too) typically use the
> order:
>
>   lastname firstname
>
> when they spell their names. Further evidence is that Wu is a
> pretty common Chinese family name.
> 
> Fengguang, if it's the other way around, correct me please (and I'm going to 
> wear a big brown paper bag for the rest of the day..). 

You are right. We normally do "Fengguang" or "Mr. Wu" :-)
For LKML the first name is less ambiguous.

Thanks,
Fengguang


Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Andrew Morton

On Wed, 9 Jan 2008 03:17:37 + (GMT) Dave Airlie <[EMAIL PROTECTED]> wrote:

> 
> > On Wed, 9 Jan 2008 02:34:46 + (GMT) Dave Airlie <[EMAIL PROTECTED]> wrote:
> > 
> > > 
> > > [This is an initial RFC but I'd like to have this patch in before 2.6.24
> > > goes final as it really breaks this useful feature]
> > > 
> > > mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> > > used this notifier interface and is planned on being pushed upstream.
> > > 
> > > Having users able to just use the tracer module without having to rebuild 
> > > their kernel to add in a page fault handler hack means we get a lot 
> > > greater coverage for reverse engineering efforts.
> > 
> > Sorry, but that's a really really small benefit.  This very small number of
> > fairly (or very) technical users will be able to work out a way of getting
> > this to work in 2.6.24.  And in 2.6.25 with a merged mmiotrace we can do
> > something different.
> 
> mmiotrace isn't targeted at fairly or very technical users, its whole
> usefulness is that you don't need a kernel re-build, the distro kernels 
> all contain enough support for us to just get a user to grab mmiotrace, 
> run make and get a trace so in my eyes this a major feature regression 
> to have to go back to custom kernel builds...

An alternative might be to come up with something decent and target 2.6.24.x

> > It's a modest convenience to a very small number of people.  And the cost? 
> > Multiple functions calls and multiple cachelines hit for every pagefault
> > on, what?  Tens of millions of machines?
> 
> Which has been happening for how many months? perhaps if we merge 
> mmiotrace in 2.6.25 we can clean up this function, otherwise I just count 
> it as a feature regression...

We put the crappy code back in for 2.6.24 then take it out immediately
after 2.6.24 and put something else in to support mmiotrace and perhaps the
other new mystery features to which you refer below.  hm.

> > pagefault it populates a struct on the stack, passes that around for a
> > while, does a bit of RCU stuff only to find that there was nothing to do. 
> > Surely we should at least be doing something along the lines of
> > 
> > if (unlikely(notify_page_fault_chain.notifier_call != NULL)) {
> > all that crap
> > }
> > 
> > 
> > But that's all speculation.  Has anyone actually measured the pagefault
> > latency impact of this change?

^^ this.
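
For anyone who wants to measure it, the cheap check suggested in the quoted text amounts to something like the following userspace sketch. It is not the kernel notifier API: the struct, the global names and sample_handler() are hypothetical, and there is no locking or RCU here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified notifier chain entry. */
struct notifier {
    int (*call)(void *data);
    struct notifier *next;
};

struct notifier *fault_chain;       /* NULL while nobody is registered */
unsigned long slow_path_entries;    /* how often the chain was walked */

void register_notifier(struct notifier *n)
{
    assert(n != NULL && n->call != NULL);
    n->next = fault_chain;
    fault_chain = n;
}

/* Example callback: claims the event when it is given some data. */
int sample_handler(void *data)
{
    return data != NULL;
}

/* Returns nonzero if some notifier claimed the event. */
int notify_fault(void *data)
{
    struct notifier *n;

    /*
     * Fast path: one pointer test and nothing else.  In kernel code this
     * is where an unlikely() hint would go.
     */
    if (fault_chain == NULL)
        return 0;

    slow_path_entries++;
    for (n = fault_chain; n != NULL; n = n->next)
        if (n->call(data))
            return 1;
    return 0;
}
```

With nothing registered, each call costs a single load and compare, which is the case that matters for the machines that never load a tracer.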

> > > +/*
> > > + * These are only here because kprobes.c wants them to implement a
> > > + * blatant layering violation.  Will hopefully go away soon once all
> > > + * architectures are updated.
> > > + */
> > > +static inline int register_page_fault_notifier(struct notifier_block *nb)
> > > +{
> > > + return 0;
> > > +}
> > > +static inline int unregister_page_fault_notifier(struct notifier_block *nb)
> > > +{
> > > + return 0;
> > > +}
> > > +
> > 
> > And this doesn't look very good either.  For how long did this fixme remain
> > unfixed?
> > 
> > 
> > So I'd suggest that we leave things as they are for 2.6.24 - mmiotrace
> > people will work something out, I'm sure.  For 2.6.25 if we merge mmiotrace
> > we can look at doing something which is vaguely efficient and tasteful.
> > 
> 
> I just reverted Christoph's patch I didn't try and work out if the old code
> had problems no one has fixed...
> 
> So all distros with 2.6.24 kernels are useless to mmiotrace I don't see 
> why leaving things as is until a suitable replacement mechanism can be 
> used.. I've heard others give out about this also madwifi and SuSE kernel 
> folks...

That change has been in the mainline tree for nearly three months.  All
these affected parties have left it until the eve of 2.6.24 to actually
tell us about it.  This is causing me sympathy problems :(

[PATCH] x86_64: cleanup setup_node_zones called by paging_init

2008-01-08 Thread Yinghai Lu

[PATCH] x86_64: cleanup setup_node_zones called by paging_init

setup_node_zones calculates some variables but only uses them when
CONFIG_FLAT_NODE_MEM_MAP is set,

so move the #ifdef to avoid the unnecessary calculation.

Also make the function static.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

Index: linux-2.6/arch/x86/mm/numa_64.c
===
--- linux-2.6.orig/arch/x86/mm/numa_64.c
+++ linux-2.6/arch/x86/mm/numa_64.c
@@ -227,15 +227,16 @@ void __init setup_node_bootmem(int nodei
srat_reserve_add_area(nodeid);
 #endif
node_set_online(nodeid);
-} 
+}
 
+#ifdef CONFIG_FLAT_NODE_MEM_MAP
 /* Initialize final allocator for a zone */
-void __init setup_node_zones(int nodeid)
-{ 
+static void __init setup_node_zones(int nodeid)
+{
unsigned long start_pfn, end_pfn, memmapsize, limit;
 
-   start_pfn = node_start_pfn(nodeid);
-   end_pfn = node_end_pfn(nodeid);
+   start_pfn = node_start_pfn(nodeid);
+   end_pfn = node_end_pfn(nodeid);
 
Dprintk(KERN_INFO "Setting up memmap for node %d %lx-%lx\n",
nodeid, start_pfn, end_pfn);
@@ -244,14 +245,13 @@ void __init setup_node_zones(int nodeid)
   memory. */
memmapsize = sizeof(struct page) * (end_pfn-start_pfn);
limit = end_pfn << PAGE_SHIFT;
-#ifdef CONFIG_FLAT_NODE_MEM_MAP
-   NODE_DATA(nodeid)->node_mem_map = 
-   __alloc_bootmem_core(NODE_DATA(nodeid)->bdata, 
-   memmapsize, SMP_CACHE_BYTES, 
-   round_down(limit - memmapsize, PAGE_SIZE), 
+   NODE_DATA(nodeid)->node_mem_map =
+   __alloc_bootmem_core(NODE_DATA(nodeid)->bdata,
+   memmapsize, SMP_CACHE_BYTES,
+   round_down(limit - memmapsize, PAGE_SIZE),
limit);
+}
 #endif
-} 
 
 void __init numa_init_array(void)
 {
@@ -570,9 +570,11 @@ void __init paging_init(void)
sparse_memory_present_with_active_regions(MAX_NUMNODES);
sparse_init();
 
+#ifdef CONFIG_FLAT_NODE_MEM_MAP
for_each_online_node(i) {
-   setup_node_zones(i); 
+   setup_node_zones(i);
}
+#endif
 
free_area_init_nodes(max_zone_pfns);
 } 

RE: [patch] split MMC_CAP_4_BIT_DATA

2008-01-08 Thread Cai, Cliff

Hi all,

I'd like to say something about this issue.
Currently, the Blackfin on-chip SD host supports ONLY 1-bit MMC, while it
supports 1-bit/4-bit SD/SDIO.
We want our driver to support both 1-bit MMC and 4-bit SD/SDIO, but the
current MMC driver framework only allows us to set one bus width, either
1-bit or 4-bit. So to handle our case we need a more flexible mechanism
to inform the upper common driver of our situation.
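
To make the request concrete, here is a minimal userspace sketch of what a
split capability could look like. The flag names CAP_SD_4BIT and
CAP_MMC_4BIT, the card_type enum and pick_bus_width() are all hypothetical
and only illustrate the idea; they are not the names used by the MMC core
or by the patch under discussion.

```c
#include <assert.h>

/* Hypothetical capability bits: illustrative names, not the kernel's. */
#define CAP_SD_4BIT   (1u << 0)  /* host can run SD/SDIO cards 4 bits wide */
#define CAP_MMC_4BIT  (1u << 1)  /* host can run MMC cards 4 bits wide */

enum card_type { CARD_MMC, CARD_SD, CARD_SDIO };

/* Return the widest bus (1 or 4 data lines) the host allows for this card. */
int pick_bus_width(unsigned int host_caps, enum card_type type)
{
	unsigned int needed;

	assert(type == CARD_MMC || type == CARD_SD || type == CARD_SDIO);

	/* MMC and SD/SDIO are checked against separate capability bits. */
	needed = (type == CARD_MMC) ? CAP_MMC_4BIT : CAP_SD_4BIT;
	return (host_caps & needed) ? 4 : 1;
}
```

A host like the one described above would advertise only CAP_SD_4BIT, so
pick_bus_width() would give 4 for SD/SDIO cards and 1 for MMC cards.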

Cliff

-----Original Message-----
From: Bryan Wu [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 09, 2008 10:33 AM
To: Pierre Ossman; [EMAIL PROTECTED]
Cc: Mike Frysinger; linux-kernel@vger.kernel.org
Subject: Re: [patch] split MMC_CAP_4_BIT_DATA

On Jan 9, 2008 4:49 AM, Pierre Ossman <[EMAIL PROTECTED]> wrote:
> On Tue, 8 Jan 2008 14:40:49 -0500
> Mike Frysinger <[EMAIL PROTECTED]> wrote:
>
> >
> > i dont understand what's confusing.  the Blackfin on chip host 
> > controller only supports 1-bit MMC, but it supports 4-bit SD/SDIO.  
> > this is a fact.  while it may be a stupid decision, it is what it 
> > is, and i need the framework made more flexible in order to get the 
> > Blackfin driver merged cleanly.  we do software for hardware, we
> > dont do hardware.
>
> Well, since I've seen no _hardware_ differences between 4-bit MMC and
> 4-bit SD, "support" in this case must mean "vendor will guarantee it
> works". And that is not the kind of "support" that needs a distinction
> in the code.
>
> So, again, if you feel that there is a hardware difference between
> 4-bit MMC and 4-bit SD then please elaborate as it is my understanding
> that they are identical.
>

As Mike said, the reason for splitting this flag is a limitation of the
Blackfin on-chip SDIO controller.
Cliff has been working on it for a long time, so I've pulled him in. I hope
he can clarify the confusing points.

Thanks
-Bryan Wu

Re: [PATCH 0/4] add task handling notifier

2008-01-08 Thread Matthew Helsley

On Tue, 2008-01-08 at 18:24 -0800, Matt Helsley wrote:
> On Sun, 2007-12-23 at 12:26 +, Christoph Hellwig wrote:
> > On Thu, Dec 20, 2007 at 01:11:24PM +, Jan Beulich wrote:
> > > With more and more sub-systems/sub-components leaving their footprint
> > > in task handling functions, it seems reasonable to add notifiers that
> > > these components can use instead of having them all patch themselves
> > > directly into core files.
> > 
> > > I agree that we probably want something like this.  As do some others,
> > > so we already had a few attempts at similar things.  The first one
> > > is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> > > includes allocating per-task data for its users.  Then also from SGI
> > > there has been a simplified version called pnotify that's also available
> > > from the website above.
> > 
> > Later Matt Helsley had something called "Task Watchers" which lwn has
> > an article on: http://lwn.net/Articles/208117/.
> 
> Apologies for the late reply -- I haven't had internet access for the
> last few weeks.
> 
> > > For some reason neither ever made a lot of progress (performance
> > > problems?).
> 
> Yeah. Some discussion on measuring the performance of Task Watchers:
> http://thread.gmane.org/gmane.linux.lse/4698
> 
> The requirements for Task Watchers were:
> 
> Allow sleeping in most/all notifier functions in these paths:
>   fork
>   exec
>   exit
>   change [re][ug]id
> No performance overhead
> One "chain" per path ("I only care about exec().")
> Easy to use
> Scales to large numbers of CPUs
> Useful to make most in-tree code more readable. Task Watchers took
> direct calls to these pieces of code out of the fork/exec/exit paths:
>   audit
>   semundo
>   cpusets
>   mempolicy
>   trace irqflags
>   lockdep
>   keys (for processes -- not for thread groups)
>   process events connector
> Useful for loadable modules
> 
> Performance overhead in microbenchmarks was measurable at around 1% (see
> the URL above). Overhead on benchmarks like kernbench on the other hand
> were in the noise margins (which were around 1.6%) and hence I couldn't
> determine the overhead there.
> 
> I never got the loadable module part completely working due to races
> between notifier functions and the module unload path. The solution to
> the races seemed to require adding more overhead to the notifier
> function paths (SRCU-like grace periods).
> 
> I stopped pushing the patch set because I hadn't found any new
> optimizations to offset the overheads while still meeting all the
> requirements and Andrew still felt that the "make it more readable"
> argument was not sufficient to justify its inclusion.

Oops. It's been nearly two years so I've forgotten exactly where Task
Watchers v2 was when I stopped pushing it. After a bit more searching I
found a more recent posting:
http://lkml.org/lkml/2006/12/14/384

And here's why I think the microbenchmark results improved to the point
there was a small performance improvement over mainline:
http://lkml.org/lkml/2006/12/19/124

I seem to recall kernbench was still too noisy to tell.

The patch allowing modules to register Task Watchers still isn't posted
there for the reasons I've already described.

Cheers,
-Matt Helsley


Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-08 Thread Fengguang Wu

On Mon, Jan 07, 2008 at 02:40:13PM +0100, Joerg Platte wrote:
> Am Montag, 7. Januar 2008 schrieb Peter Zijlstra:
> > On Mon, 2008-01-07 at 14:24 +0100, Joerg Platte wrote:
> >
> > This is from: 2.6.24-rc7
> >
> > > kernel: pdflush   D f41c2f14 0 18822  2
> > > kernel:f673f000 0046 0286 f41c2f14 f5194ce0 0286
> > > 0286 f41c2f14 kernel:00175279 f41c2f6c  c0271f6c
> > > f5ff363c f5ff3644 c0354a90 c0354a90 kernel:00175279 c0123251
> > > f5194b80 c03546c0 c0271f67 6c666470 00687375  kernel: Call Trace:
> > > kernel:  [] schedule_timeout+0x6e/0x8b
> > > kernel:  [] process_timeout+0x0/0x5
> > > kernel:  [] schedule_timeout+0x69/0x8b
> > > kernel:  [] __sched_text_start+0x3a/0x70
> > > kernel:  [] congestion_wait+0x4e/0x62
> > > kernel:  [] autoremove_wake_function+0x0/0x33
> > > kernel:  [] pdflush+0x0/0x1bf
> > > kernel:  [] wb_kupdate+0x8c/0xd1
> > > kernel:  [] pdflush+0x0/0x1bf
> > > kernel:  [] pdflush+0x11b/0x1bf
> > > kernel:  [] wb_kupdate+0x0/0xd1
> > > kernel:  [] kthread+0x36/0x5d
> > > kernel:  [] kthread+0x0/0x5d
> > > kernel:  [] kernel_thread_helper+0x7/0x10
> > > kernel:  ===
> >
> > What filesystem are you using?
> 
> Here you can see all currently mounted filesystems:
> 
> /dev/sda6 on / type ext3 (rw,noatime,errors=remount-ro,acl)
> tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
> proc on /proc type proc (rw,noexec,nosuid,nodev)
> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> procbususb on /proc/bus/usb type usbfs (rw)
> udev on /dev type tmpfs (rw,mode=0755)
> tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
> fusectl on /sys/fs/fuse/connections type fusectl (rw)
> /dev/sda7 on /tmp type ext2 (rw,noatime,errors=remount-ro,acl)
> /dev/sda8 on /export type ext3 (rw,noatime,errors=remount-ro,acl)
> /dev/sda1 on /winxp type ntfs (rw,umask=002,gid=1,nls=utf8)

So they are ext3/ext2/ntfs.  What happens if you umount ntfs? And ext2 too, if possible?


Re: Kernel Oops?

2008-01-08 Thread Stoyan Gaydarov

On Jan 8, 2008 9:02 PM, Alan Cox <[EMAIL PROTECTED]> wrote:
> > Except this time when rebooting the machine i got a kernel oops
> > message and it didn't boot completely. I could not copy it but I did
> > take a picture and now I have re-written the screen here(sorry about
>
> That is interesting - that sort of error usually points at memory
> corruption and early on tends to point at hardware (but not always). What
> hardware is in this system, and does it have over 4GB of RAM?
>
>

There are 2GB of RAM, the motherboard is a DFI and it has a dual-core
Intel CPU. If you need the specifics I could look them up.

Re: [PATCH 00 of 10] x86: unify asm/pgtable.h

2008-01-08 Thread Jeremy Fitzhardinge


Andi Kleen wrote:
>> Yeah, that may be true, but this particular tree is weird, and I'm trying
>> to understand what's going on here.  Specifically, 64-bit ioremap()s
>> *don't* set _PAGE_GLOBAL, which appears to be an accident resulting from
>> the strange definitions of __PAGE_KERNEL_* vs PAGE_KERNEL_*.
>
> ioremap() should set G agreed.
>
>> For example, ioremap_64.c:__ioremap() creates a vma for the io mapping, and
>> explicitly sets _PAGE_GLOBAL in the vma's version of pgprot - but then it
>> calls ioremap_page_range() to actually create the mapping, which ends up
>> making a non-global mapping, because it's rolling its own version of
>> PAGE_KERNEL by using pgprot(__PAGE_KERNEL) - which is not the actual
>> definition of PAGE_KERNEL.
>
> That should not really matter because ioremap_change_attr()->c_p_a is only
> called when flags is != 0 and that means it is already different from
> PAGE_KERNEL.
>
>> I think there's a bug around here, but I think it's currently being hidden
>
> There's one Jan pointed out: iounmap does not subtract the guard page size
> so it ends up resetting one page too much. That is probably what causes your
> problem. But again you should be passing in G in the first place.
>
> -Andi
>
> Here was Jan's patch; it incidentally fixes the G problem too
  


OK, great.  Ingo, that means we can use this and go back to folding 
_PAGE_GLOBAL into __PAGE_KERNEL_*.  Well, at least give it a try.


   J


> snip
>
> Additionally I found it necessary to fix ioremap_64.c's use of
> change_page_attr_addr():
>
> --- a/arch/x86/mm/ioremap_64.c
> +++ b/arch/x86/mm/ioremap_64.c
> @@ -48,7 +48,7 @@ ioremap_change_attr(unsigned long phys_a
>  		 * Must use a address here and not struct page because the phys addr
>  		 * can be a in hole between nodes and not have an memmap entry.
>  		 */
> -		err = change_page_attr_addr(vaddr,npages,__pgprot(__PAGE_KERNEL|flags));
> +		err = change_page_attr_addr(vaddr,npages,MAKE_GLOBAL(__PAGE_KERNEL|flags));
>  		if (!err)
>  			global_flush_tlb();
>  	}
> @@ -199,7 +199,7 @@ void iounmap(volatile void __iomem *addr
>  
>  	/* Reset the direct mapping. Can block */
>  	if (p->flags >> 20)
> -		ioremap_change_attr(p->phys_addr, p->size, 0);
> +		ioremap_change_attr(p->phys_addr, get_vm_area_size(p), 0);
>  
>  	/* Finally remove it */
>  	o = remove_vm_area((void *)addr);
>
> Other extra changes I had in my version could possibly be counted as
> enhancements...
>
> Jan
  



Re: The ext3 way of journalling

2008-01-08 Thread Kyle Moffett


On Jan 08, 2008, at 15:51:53, Andi Kleen wrote:
> Theodore Tso <[EMAIL PROTECTED]> writes:
>> Now, there are good reasons for doing periodic checks every N
>> mounts and after M months.  And it has to do with PC class
>> hardware.  (Ted's aphorism: "PC class hardware is cr*p").
>
> If these reasons are good ones (some skepticism here) then the
> correct way to really handle this would be to do regular background
> scrubbing during runtime; ideally with metadata checksums so that
> you can actually detect all corruption.


Poor man's background scrubbing:

(A)  Use LVM like virtually all modern distros offer
(B)  Leave some extra space in your LVM volume group (enough for 1  
snapshot over the time it takes to do an FSCK).

(C)  Periodically run the following scriptlet:

set -e
START="$(date +'%Y%m%d%H%M%S')"
lvcreate -s -n "${VOLUME}-snap" "${VG}/${VOLUME}"
if nice +20 fsck -fy "/dev/mapper/${VG}_${VOLUME}-snap"; then
        echo 'Background scrubbing succeeded!'
        tune2fs -T "${START}" "/dev/mapper/${VG}_${VOLUME}"
else
        echo 'Background scrubbing failed!  Reboot to fsck soon!'
        tune2fs -C 16383 -T "19000101" "/dev/mapper/${VG}_${VOLUME}"
fi
lvremove "${VG}/${VOLUME}-snap"

Basically you can fsck the offline snapshot in the background.  If it  
succeeds you can adjust the "last checked" date to the time when the  
snapshot was taken and if it fails you can schedule an FSCK at next  
reboot (and possibly remount the filesystem read-only or reboot  
immediately).


You can do the same thing for your /boot volume, although you  
probably have to manually use dmsetup since most bootloaders can't  
interpret LVM volumes.


I've always been surprised that distros like RedHat which  
automatically use LVM don't stuff this in their weekly or monthly  
checks on desktop systems.  User experience could also be  
dramatically improved with automated smartd configuration and user- 
interactive logging and warning messages.



> But since fsck is so slow and disks are so big this whole thing is
> a ticking time bomb now. e.g. it is not uncommon to require tens of
> minutes or even hours of fsck time and some server that reboots
> only every few months will eat that when it happens to reboot. This
> means you get a quite long downtime.


My servers all have an "interval-between-checks" of 2-6 weeks and are  
configured to run nice +20 background "fsck" checks during off-hours  
between once every few days and once every few weeks.  I also have  
the "max mount count" numbers set to primes between 7 and 37  
(depending on the filesystem) so that troubled or frequently-rebooted  
systems are more frequently verified.  The end result is that I  
almost never have the dreaded 4-hour-fsck-on-boot problem.  A drive
has certainly been fscked within the last few weeks of operation, and
I will only very rarely have multiple large filesystems all fscked at
the same time (once every lcm of their max-mount-counts).
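
As a rough worked example of why prime max-mount-counts help: if two
filesystems are mounted once per boot and their mount counters start out
aligned, both become due for a check on the same boot once every least
common multiple of the two counts. The sketch below computes that spacing;
the function names are made up for illustration and the once-per-boot,
aligned-start model is an assumption.

```c
#include <assert.h>

/* Greatest common divisor by Euclid's algorithm. */
unsigned long gcd(unsigned long a, unsigned long b)
{
	while (b != 0) {
		unsigned long t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Number of mounts between boots on which both filesystems hit their
 * max-mount-count together: the least common multiple of the counts.
 */
unsigned long mounts_between_joint_checks(unsigned long count_a,
					  unsigned long count_b)
{
	assert(count_a > 0 && count_b > 0);
	/* Divide before multiplying to keep the intermediate value small. */
	return count_a / gcd(count_a, count_b) * count_b;
}
```

With counts of 7 and 37 the two checks coincide only every 259 mounts,
whereas counts of 20 and 30 would coincide every 60.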


Cheers,
Kyle Moffett


Re: [PATCH 0/4] add task handling notifier

2008-01-08 Thread Andrew Morton

On Tue, 08 Jan 2008 18:47:00 -0800 Matt Helsley <[EMAIL PROTECTED]> wrote:

> > > 
> ...
> > > Am I to conclude then that there's no point in addressing the issues
> > > other people pointed out? While I (obviously, since I submitted the
> > > patch) disagree, I'm not certain how others feel. My main point for
> > > disagreement here is (I'm sorry to repeat this) that as long as certain
> > > code isn't allowed into the kernel I think it is not unreasonable to at
> > > least expect the kernel to provide some fundamental infrastructure that
> > > can be used for those (supposedly unacceptable) bits. All I did here was
> > > utilizing the base infrastructure I want added to clean up code that
> > > appeared pretty ad-hoc.
> > > 
> > 
> > Ah.  That's a brand new requirement.
> 
> In all fairness it's not really a brand new requirement -- just one that
> wasn't strongly emphasized during prior attempts to get something like
> this in.
> 
> I had a mostly-working patch for this on top of the Task Watchers v2
> patch set. I never posted that specific patch because it had a race with
> module unloading and the fix only increased the overhead you were
> unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X]
> description for Task Watchers v2 (http://lwn.net/Articles/207873/):
> 
> "TODO:
> ...
> I'm working on three more patches that add support for creating a task
> watcher from within a module using an ELF section. They haven't received
> as much attention since I've been focusing on measuring the performance
> impact of these patches."
> 
> 
> 
> Would tainting the kernel upon registration of out-of-tree "notifiers"
> be more acceptable?

How does that work?  module.c does the register/deregister on behalf of the
module?

I certainly encourage people to disagree with me here, but my current
thinking is:

- the cleanup aspect isn't worth the runtime overhead and

- the support-modular-users aspect is largely new and would need a lot
  more description and justification (with examples) before we can even
  begin to evaluate it.


Re: [PATCH] call sysrq_timer_list_show from a workqueue

2008-01-08 Thread Rusty Russell

On Wednesday 09 January 2008 11:21:59 Andrew Morton wrote:
> The string handling in here has become a bit scruffy.

Yes, that patch also evokes a const warning.  Fixed below.  I assume you've
queued these because you're thinking of applying them before 2.6.24?  I'd say
only modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch
warrants that (the other is unlikely and not a regression).

> afacit the `namebuf[KSYM_NAME_LEN - 1] = 0;' would be unneeded if we were
> to use strlcpy() and I suspect the `namebuf[0] = 0;' isn't needed either.
>
> And the use of strlcpy() means we don't need to subtract 1 from
> KSYM_NAME_LEN and we don't need to fret about weird strncpy semantics when
> the input string is too large.
>
>
> And the fact that incoming arg `namebuf' MUST point at a
> KSYM_NAME_LEN-sized buffer could be better communicated by using a
> dedicated struct for this, or by giving the arg a type of `char
> namebuf[KSYM_NAME_LEN]'.  Or by adding a comment. Or by just ignoring
> me and doing something more useful.

Or better, rework all the name lookup interfaces, rather than having: 

struct module *module_text_address(unsigned long addr);
struct module *__module_text_address(unsigned long addr);
int is_module_address(unsigned long addr);
int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
char *name, char *module_name, int *exported);
char *module_address_lookup(unsigned long addr,
unsigned long *symbolsize,
unsigned long *offset,
char **modname,
char *namebuf);
int lookup_module_symbol_name(unsigned long addr, char *symname);
int lookup_module_symbol_attrs(unsigned long addr, unsigned long *size,
   unsigned long *offset, char *modname, char 
*name);
unsigned long module_kallsyms_lookup_name(const char *name);

unsigned long kallsyms_lookup_name(const char *name);
extern int kallsyms_lookup_size_offset(unsigned long addr,
  unsigned long *symbolsize,
  unsigned long *offset);
const char *kallsyms_lookup(unsigned long addr,
unsigned long *symbolsize,
unsigned long *offset,
char **modname, char *namebuf);
extern int sprint_symbol(char *buffer, unsigned long address);
extern void __print_symbol(const char *fmt, unsigned long address);
int lookup_symbol_name(unsigned long addr, char *symname);
int lookup_symbol_attrs(unsigned long addr, unsigned long *size,
unsigned long *offset, char *modname, char *name);
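
For the strncpy()/strlcpy() point quoted earlier in this mail, here is a
small userspace sketch of the difference. NAME_LEN stands in for
KSYM_NAME_LEN and is deliberately tiny, and copy_with_strlcpy_semantics()
is a local helper written for this example that mimics strlcpy() (it is
not the kernel's implementation, and plain strlcpy() is avoided because
not every libc ships it).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 8	/* stand-in for KSYM_NAME_LEN, small for the demo */

/* strncpy() style: the caller has to terminate by hand on truncation. */
void copy_with_strncpy(char dst[NAME_LEN], const char *src)
{
	assert(dst != NULL && src != NULL);
	strncpy(dst, src, NAME_LEN - 1);
	dst[NAME_LEN - 1] = '\0';
}

/*
 * strlcpy()-like helper: takes the full buffer size, always terminates
 * when size is non-zero, and returns strlen(src) so that the caller can
 * detect truncation by comparing the result with size.
 */
size_t copy_with_strlcpy_semantics(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst != NULL && src != NULL);
	len = strlen(src);
	if (size != 0) {
		size_t n = (len >= size) ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Both helpers leave the same truncated string in the buffer; the second one
needs neither the "- 1" nor the explicit terminator at the call site.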

Cheers,
Rusty.


Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Dave Airlie


> On Wed, 9 Jan 2008 02:34:46 + (GMT) Dave Airlie <[EMAIL PROTECTED]> wrote:
> 
> > 
> > [This an initial RFC but I'd like to have this patch in before 2.6.24 goes 
> > final as it really breaks this useful feature]
> > 
> > mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> > used this notifier interface and is planned on being pushed upstream.
> > 
> > Having users able to just use the tracer module without having to rebuild 
> > their kernel to add in a page fault handler hack means we get a lot 
> > greater coverage for reverse engineering efforts.
> 
> Sorry, but that's a really really small benefit.  This very small number of
> fairly (or very) technical users will be able to work out a way of getting
> this to work in 2.6.24.  And in 2.6.25 with a merged mmiotrace we can do
> something different.

mmiotrace isn't targeted at fairly (or very) technical users. Its whole
usefulness is that you don't need a kernel rebuild: the distro kernels
all contain enough support for us to just get a user to grab mmiotrace,
run make and get a trace. So in my eyes it is a major feature regression
to have to go back to custom kernel builds...

> It's a modest convenience to a very small number of people.  And the cost? 
> Multiple functions calls and multiple cachelines hit for every pagefault
> on, what?  Tens of millions of machines?

Which has been happening for how many months? Perhaps if we merge
mmiotrace in 2.6.25 we can clean up this function; otherwise I just count
it as a feature regression...

> pagefault it populates a struct on the stack, passes that around for a
> while, does a bit of RCU stuff only to find that there was nothing to do. 
> Surely we should at least be doing something along the lines of
> 
>   if (unlikely(notify_page_fault_chain.notifier_call != NULL)) {
>   all that crap
>   }
> 
> 
> But that's all speculation.  Has anyone actually measured the pagefault
> latency impact of this change?
> 
> > +/*
> > + * These are only here because kprobes.c wants them to implement a
> > + * blatant layering violation.  Will hopefully go away soon once all
> > + * architectures are updated.
> > + */
> > +static inline int register_page_fault_notifier(struct notifier_block *nb)
> > +{
> > +   return 0;
> > +}
> > +static inline int unregister_page_fault_notifier(struct notifier_block *nb)
> > +{
> > +   return 0;
> > +}
> > +
> 
> And this doesn't look very good either.  For how long did this fixme remain
> unfixed?
> 
> 
> So I'd suggest that we leave things as they are for 2.6.24 - mmiotrace
> people will work something out, I'm sure.  For 2.6.25 if we merge mmiotrace
> we can look at doing something which is vaguely efficient and tasteful.
> 

I just reverted Christoph's patch; I didn't try to work out whether the old
code had problems no one has fixed...

So all distros with 2.6.24 kernels are useless to mmiotrace. I don't see
why we can't leave things as they were until a suitable replacement
mechanism can be used. I've heard others complain about this too: madwifi
and SuSE kernel folks...

Dave.

Re: umount -l , getcwd and /proc/<pid>/cwd inconsistent

2008-01-08 Thread Ian Kent

On Mon, 2008-01-07 at 12:17 +0900, Ian Kent wrote:
> 
> Basically, from a bash shell, setting working directory to a mounted
> directory all is fine with "pwd" and "/proc/<pid>/cwd". Following a
> "umount -l" on the mount "pwd" continues to return the expected string
> but "/proc/<pid>/cwd" returns an empty string.
> 
> What I'm really after is why this happens because sys_getcwd and
> proc_pid_readlink appear to do essentially the same thing to get the
> string.

I think I understand what happens here now.

Basically, following a "umount -l", anything that calls d_path from
within the unlinked mount and doesn't have a d_name dentry ops method
can no longer walk back up to the root to get the path.

Of course this makes perfect sense as the mount has been unlinked from
the tree.

But it can also prevent processes still using the mount from
successfully running through to completion to release the mount. I
expect this was never the intent of the functionality but I think it
should be. Especially since the VFS appears to handle this really well
otherwise.

So, I'm after suggestions:
Does anyone feel strongly that this case shouldn't be handled for some
reason? Why?
Does anyone have any suggestions about how this should be done?
Does anyone have any concerns about what shouldn't be done to deal with
this?

Ian


Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Andrew Morton

On Wed, 9 Jan 2008 02:34:46 + (GMT) Dave Airlie <[EMAIL PROTECTED]> wrote:

> 
> [This an initial RFC but I'd like to have this patch in before 2.6.24 goes 
> final as it really breaks this useful feature]
> 
> mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> used this notifier interface and is planned on being pushed upstream.
> 
> Having users able to just use the tracer module without having to rebuild 
> their kernel to add in a page fault handler hack means we get a lot 
> greater coverage for reverse engineering efforts.

Sorry, but that's a really really small benefit.  This very small number of
fairly (or very) technical users will be able to work out a way of getting
this to work in 2.6.24.  And in 2.6.25 with a merged mmiotrace we can do
something different.

It's a modest convenience to a very small number of people.  And the cost? 
Multiple functions calls and multiple cachelines hit for every pagefault
on, what?  Tens of millions of machines?

Plus the code which is getting restored isn't even very good.  For every
pagefault it populates a struct on the stack, passes that around for a
while, does a bit of RCU stuff only to find that there was nothing to do. 
Surely we should at least be doing something along the lines of

if (unlikely(notify_page_fault_chain.notifier_call != NULL)) {
all that crap
}

But that's all speculation.  Has anyone actually measured the pagefault
latency impact of this change?
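
The guard suggested above can be sketched in plain userspace C. Everything
here is a simplified stand-in: struct fault_notifier, fault_chain and the
function names are hypothetical, the unlikely() macro is defined locally
using the GCC/Clang builtin, and the real chain is RCU-protected, which
this sketch ignores. It only shows the shape of "test one pointer, skip
all the work when nobody is registered".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's notifier_block. */
struct fault_notifier {
	int (*call)(struct fault_notifier *self, unsigned long addr);
	struct fault_notifier *next;
};

/* Head of the chain; NULL while nothing is registered (the common case). */
static struct fault_notifier *fault_chain;

/* Counts how often the expensive path runs, so the effect is observable. */
int slow_path_runs;

/* GCC/Clang branch hint, mirroring the kernel's unlikely() macro. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Add a notifier at the front of the chain. */
void register_fault_notifier(struct fault_notifier *n)
{
	assert(n != NULL && n->call != NULL);
	n->next = fault_chain;
	fault_chain = n;
}

/* Slow path: walk the chain until a notifier claims the fault. */
static int run_fault_chain(unsigned long addr)
{
	struct fault_notifier *n;

	slow_path_runs++;
	for (n = fault_chain; n != NULL; n = n->next) {
		int ret = n->call(n, addr);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Called on every fault: a single pointer test when the chain is empty. */
int notify_fault(unsigned long addr)
{
	if (unlikely(fault_chain != NULL))
		return run_fault_chain(addr);
	return 0;
}

/* Example notifier that claims every fault it sees. */
int claim_fault(struct fault_notifier *self, unsigned long addr)
{
	(void)self;
	(void)addr;
	return 1;
}
```

Whether this actually helps pagefault latency would still have to be
measured; the sketch says nothing about that.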

> +/*
> + * These are only here because kprobes.c wants them to implement a
> + * blatant layering violation.  Will hopefully go away soon once all
> + * architectures are updated.
> + */
> +static inline int register_page_fault_notifier(struct notifier_block *nb)
> +{
> + return 0;
> +}
> +static inline int unregister_page_fault_notifier(struct notifier_block *nb)
> +{
> + return 0;
> +}
> +

And this doesn't look very good either.  For how long did this fixme remain
unfixed?

So I'd suggest that we leave things as they are for 2.6.24 - mmiotrace
people will work something out, I'm sure.  For 2.6.25 if we merge mmiotrace
we can look at doing something which is vaguely efficient and tasteful.


Re: Kernel Oops?

2008-01-08 Thread Alan Cox

> Except this time when rebooting the machine i got a kernel oops
> message and it didn't boot completely. I could not copy it but I did
> take a picture and now I have re-written the screen here(sorry about

That is interesting - that sort of error usually points at memory
corruption and early on tends to point at hardware (but not always). What
hardware is in this system, and does it have over 4GB of RAM?


[PATCH -mm 2/2] kexec/i386: kexec page table code clean up - page table setup in C

2008-01-08 Thread Huang, Ying

This patch converts the kexec page table setup code from assembler
to C code in machine_kexec_prepare. This improves readability and
reduces the line count.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 arch/x86/kernel/machine_kexec_32.c   |   50 +++
 arch/x86/kernel/relocate_kernel_32.S |  114 ---
 include/asm-x86/kexec_32.h   |   18 -
 3 files changed, 40 insertions(+), 142 deletions(-)

--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -86,6 +86,42 @@ static void free_page_tables(struct kima
free_page((unsigned long)image->arch_kimage.pte1);
 }
 
+static void page_table_set_one(pgd_t *pgd, pmd_t *pmd, pte_t *pte,
+  unsigned long vaddr, unsigned long paddr)
+{
+   pud_t *pud;
+
+   pgd += pgd_index(vaddr);
+#ifdef CONFIG_X86_PAE
+   if (!(pgd_val(*pgd) & _PAGE_PRESENT))
+   set_pgd(pgd, __pgd(__pa(pmd) | _PAGE_PRESENT));
+#endif
+   pud = pud_offset(pgd, vaddr);
+   pmd = pmd_offset(pud, vaddr);
+   if (!(pmd_val(*pmd) & _PAGE_PRESENT))
+   set_pmd(pmd, __pmd(__pa(pte) | _PAGE_TABLE));
+   pte = pte_offset_kernel(pmd, vaddr);
+   set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC));
+}
+
+static void prepare_page_tables(struct kimage *image)
+{
+   void *control_page;
+   pmd_t *pmd = 0;
+
+   control_page = page_address(image->control_code_page);
+#ifdef CONFIG_X86_PAE
+   pmd = image->arch_kimage.pmd0;
+#endif
+   page_table_set_one(image->arch_kimage.pgd, pmd, image->arch_kimage.pte0,
+  (unsigned long)relocate_kernel, __pa(control_page));
+#ifdef CONFIG_X86_PAE
+   pmd = image->arch_kimage.pmd1;
+#endif
+   page_table_set_one(image->arch_kimage.pgd, pmd, image->arch_kimage.pte1,
+  __pa(control_page), __pa(control_page));
+}
+
 /*
  * A architecture hook called to validate the
  * proposed image and prepare the control pages
@@ -98,6 +134,7 @@ static void free_page_tables(struct kima
  * later.
  *
  * - Allocate page tables
+ * - Setup page tables
  */
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -112,6 +149,7 @@ int machine_kexec_prepare(struct kimage 
free_page_tables(image);
return -ENOMEM;
}
+   prepare_page_tables(image);
return 0;
 }
 
@@ -140,19 +178,7 @@ NORET_TYPE void machine_kexec(struct kim
memcpy(control_page, relocate_kernel, PAGE_SIZE);
 
page_list[PA_CONTROL_PAGE] = __pa(control_page);
-   page_list[VA_CONTROL_PAGE] = (unsigned long)relocate_kernel;
page_list[PA_PGD] = __pa(image->arch_kimage.pgd);
-   page_list[VA_PGD] = (unsigned long)image->arch_kimage.pgd;
-#ifdef CONFIG_X86_PAE
-   page_list[PA_PMD_0] = __pa(image->arch_kimage.pmd0);
-   page_list[VA_PMD_0] = (unsigned long)image->arch_kimage.pmd0;
-   page_list[PA_PMD_1] = __pa(image->arch_kimage.pmd1);
-   page_list[VA_PMD_1] = (unsigned long)image->arch_kimage.pmd1;
-#endif
-   page_list[PA_PTE_0] = __pa(image->arch_kimage.pte0);
-   page_list[VA_PTE_0] = (unsigned long)image->arch_kimage.pte0;
-   page_list[PA_PTE_1] = __pa(image->arch_kimage.pte1);
-   page_list[VA_PTE_1] = (unsigned long)image->arch_kimage.pte1;
 
/* The segment registers are funny things, they have both a
 * visible and an invisible part.  Whenever the visible part is
--- a/arch/x86/kernel/relocate_kernel_32.S
+++ b/arch/x86/kernel/relocate_kernel_32.S
@@ -16,126 +16,12 @@
 
 #define PTR(x) (x << 2)
 #define PAGE_ALIGNED (1 << PAGE_SHIFT)
-#define PAGE_ATTR 0x63 /* _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY */
-#define PAE_PGD_ATTR 0x01 /* _PAGE_PRESENT */
 
.text
.align PAGE_ALIGNED
.globl relocate_kernel
 relocate_kernel:
movl    8(%esp), %ebp /* list of pages */
-
-#ifdef CONFIG_X86_PAE
-   /* map the control page at its virtual address */
-
-   movl    PTR(VA_PGD)(%ebp), %edi
-   movl    PTR(VA_CONTROL_PAGE)(%ebp), %eax
-   andl    $0xc0000000, %eax
-   shrl    $27, %eax
-   addl    %edi, %eax
-
-   movl    PTR(PA_PMD_0)(%ebp), %edx
-   orl     $PAE_PGD_ATTR, %edx
-   movl    %edx, (%eax)
-
-   movl    PTR(VA_PMD_0)(%ebp), %edi
-   movl    PTR(VA_CONTROL_PAGE)(%ebp), %eax
-   andl    $0x3fe00000, %eax
-   shrl    $18, %eax
-   addl    %edi, %eax
-
-   movl    PTR(PA_PTE_0)(%ebp), %edx
-   orl     $PAGE_ATTR, %edx
-   movl    %edx, (%eax)
-
-   movl    PTR(VA_PTE_0)(%ebp), %edi
-   movl    PTR(VA_CONTROL_PAGE)(%ebp), %eax
-   andl    $0x001ff000, %eax
-   shrl    $9, %eax
-   addl    %edi, %eax
-
-   movl    PTR(PA_CONTROL_PAGE)(%ebp), %edx
-   orl     $PAGE_ATTR, %edx
-   movl    %edx, (%eax)
-
-   /* identity map the control page at its physical address */
-
-   movl

Re: [PATCH 5/6] syslets: add generic syslets infrastructure

2008-01-08 Thread Zach Brown


> Firstly, why not just specify an address for the return value and be done 
> with it?  This infrastructure seems overkill, and you can always extend later 
> if required.

Sorry, which infrastructure?

Providing the function and stack to return to?  Sure, I could certainly
entertain the idea of not having syslet tasks return to userspace in the
first pass.  Ingo sure seemed excited by the idea.

Or do you mean the syscall return value ending up in the userspace
completion event ring?  That's mostly about being able to wait for
pending syslets to complete.

> Secondly, you really should allow integration with an eventfd so you don't 
> make the posix AIO mistake of providing a poll-incompatible interface.

Yeah, this seems straightforward enough that I haven't made it an
initial priority.  I'm sure it will be helpful for people who are stuck
integrating with entrenched software that wants to wait for pollable fds.

For more flexible software, though, it's compelling to now be able to
aggregate waiting for completion of the existing waiting syscalls (poll,
epoll_wait, futexes, whatever) by issuing them as concurrent syslets.

> Finally, and probably most alarmingly, AFAICT randomly changing TID will 
> break 
> all threaded programs, which means this won't be fitted into existing code 
> bases, making it YA niche Linux-only API 8(

Yeah, this still needs to be investigated.  I haven't yet and I haven't
heard of anyone else trying their hand at it.

In the YANLOA mode apps would know that executing syslets is an implicit
clone() and would act accordingly.  "8(", indeed.

I wonder if there isn't an opportunity to add a clone() flag which
juggles the association between TIDs and task_structs.  I don't relish
the idea of investigating the life cycles of task_struct references that
derive from TIDs and seeing how those would race with a syslet blocking
and cloning, but, well, maybe that's what needs to be done.

This all isn't my area of expertise, though, sadly.  It would be swell
if someone wanted to look into it before I'm forced to learn yet another
weird corner of the kernel.

- z

[PATCH -mm 1/2] kexec/i386: kexec page table code clean up - add arch_kimage

2008-01-08 Thread Huang, Ying

This patch adds an architecture-specific struct arch_kimage to struct
kimage. Three pointers (five with PAE) to the page table pages used by
kexec are added to struct arch_kimage. The page table pages are
dynamically allocated in machine_kexec_prepare instead of statically
from the BSS segment. This saves up to 20 KB of memory when no kexec
image is loaded.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
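The allocation scheme described above (grab every table, then check them all
at once and release everything if any one failed) can be sketched in user
space roughly as follows. This is not the patch code: struct tables, the
function names and TABLE_SIZE are stand-ins invented for this note, and
calloc()/free() take the place of get_zeroed_page()/free_page().

```c
#include <stdlib.h>

#define TABLE_SIZE 4096		/* assumed page size for this sketch */

/* Illustrative stand-in for struct arch_kimage (non-PAE case). */
struct tables {
	void *pgd;
	void *pte0;
	void *pte1;
};

static void tables_free(struct tables *t)
{
	/* free(NULL) is a no-op, so a partial allocation is handled too. */
	free(t->pgd);
	free(t->pte0);
	free(t->pte1);
	t->pgd = t->pte0 = t->pte1 = NULL;
}

/* Returns 0 on success, -1 if any of the allocations failed. */
static int tables_alloc(struct tables *t)
{
	t->pgd = calloc(1, TABLE_SIZE);
	t->pte0 = calloc(1, TABLE_SIZE);
	t->pte1 = calloc(1, TABLE_SIZE);
	if (!t->pgd || !t->pte0 || !t->pte1) {
		tables_free(t);
		return -1;
	}
	return 0;
}
```

Checking once after all allocations keeps the error path to a single branch,
at the cost of attempting the later allocations even when an earlier one has
already failed.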
 arch/x86/kernel/machine_kexec_32.c |   68 +
 include/asm-x86/kexec_32.h |   12 ++
 include/linux/kexec.h  |4 ++
 3 files changed, 63 insertions(+), 21 deletions(-)

--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -21,15 +22,6 @@
 #include 
 #include 
 
-#define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
-static u32 kexec_pgd[1024] PAGE_ALIGNED;
-#ifdef CONFIG_X86_PAE
-static u32 kexec_pmd0[1024] PAGE_ALIGNED;
-static u32 kexec_pmd1[1024] PAGE_ALIGNED;
-#endif
-static u32 kexec_pte0[1024] PAGE_ALIGNED;
-static u32 kexec_pte1[1024] PAGE_ALIGNED;
-
 static void set_idt(void *newidt, __u16 limit)
 {
struct Xgt_desc_struct curidt;
@@ -72,6 +64,28 @@ static void load_segments(void)
 #undef __STR
 }
 
+static void alloc_page_tables(struct kimage *image)
+{
+   image->arch_kimage.pgd = (pgd_t *)get_zeroed_page(GFP_KERNEL);
+#ifdef CONFIG_X86_PAE
+   image->arch_kimage.pmd0 = (pmd_t *)get_zeroed_page(GFP_KERNEL);
+   image->arch_kimage.pmd1 = (pmd_t *)get_zeroed_page(GFP_KERNEL);
+#endif
+   image->arch_kimage.pte0 = (pte_t *)get_zeroed_page(GFP_KERNEL);
+   image->arch_kimage.pte1 = (pte_t *)get_zeroed_page(GFP_KERNEL);
+}
+
+static void free_page_tables(struct kimage *image)
+{
+   free_page((unsigned long)image->arch_kimage.pgd);
+#ifdef CONFIG_X86_PAE
+   free_page((unsigned long)image->arch_kimage.pmd0);
+   free_page((unsigned long)image->arch_kimage.pmd1);
+#endif
+   free_page((unsigned long)image->arch_kimage.pte0);
+   free_page((unsigned long)image->arch_kimage.pte1);
+}
+
 /*
  * A architecture hook called to validate the
  * proposed image and prepare the control pages
@@ -83,10 +97,21 @@ static void load_segments(void)
  * reboot code buffer to allow us to avoid allocations
  * later.
  *
- * Currently nothing.
+ * - Allocate page tables
  */
 int machine_kexec_prepare(struct kimage *image)
 {
+   alloc_page_tables(image);
+   if (!image->arch_kimage.pgd ||
+#ifdef CONFIG_X86_PAE
+   !image->arch_kimage.pmd0 ||
+   !image->arch_kimage.pmd1 ||
+#endif
+   !image->arch_kimage.pte0 ||
+   !image->arch_kimage.pte1) {
+   free_page_tables(image);
+   return -ENOMEM;
+   }
return 0;
 }
 
@@ -96,6 +121,7 @@ int machine_kexec_prepare(struct kimage 
  */
 void machine_kexec_cleanup(struct kimage *image)
 {
+   free_page_tables(image);
 }
 
 /*
@@ -115,18 +141,18 @@ NORET_TYPE void machine_kexec(struct kim
 
page_list[PA_CONTROL_PAGE] = __pa(control_page);
page_list[VA_CONTROL_PAGE] = (unsigned long)relocate_kernel;
-   page_list[PA_PGD] = __pa(kexec_pgd);
-   page_list[VA_PGD] = (unsigned long)kexec_pgd;
+   page_list[PA_PGD] = __pa(image->arch_kimage.pgd);
+   page_list[VA_PGD] = (unsigned long)image->arch_kimage.pgd;
 #ifdef CONFIG_X86_PAE
-   page_list[PA_PMD_0] = __pa(kexec_pmd0);
-   page_list[VA_PMD_0] = (unsigned long)kexec_pmd0;
-   page_list[PA_PMD_1] = __pa(kexec_pmd1);
-   page_list[VA_PMD_1] = (unsigned long)kexec_pmd1;
-#endif
-   page_list[PA_PTE_0] = __pa(kexec_pte0);
-   page_list[VA_PTE_0] = (unsigned long)kexec_pte0;
-   page_list[PA_PTE_1] = __pa(kexec_pte1);
-   page_list[VA_PTE_1] = (unsigned long)kexec_pte1;
+   page_list[PA_PMD_0] = __pa(image->arch_kimage.pmd0);
+   page_list[VA_PMD_0] = (unsigned long)image->arch_kimage.pmd0;
+   page_list[PA_PMD_1] = __pa(image->arch_kimage.pmd1);
+   page_list[VA_PMD_1] = (unsigned long)image->arch_kimage.pmd1;
+#endif
+   page_list[PA_PTE_0] = __pa(image->arch_kimage.pte0);
+   page_list[VA_PTE_0] = (unsigned long)image->arch_kimage.pte0;
+   page_list[PA_PTE_1] = __pa(image->arch_kimage.pte1);
+   page_list[VA_PTE_1] = (unsigned long)image->arch_kimage.pte1;
 
/* The segment registers are funny things, they have both a
 * visible and an invisible part.  Whenever the visible part is
--- a/include/asm-x86/kexec_32.h
+++ b/include/asm-x86/kexec_32.h
@@ -94,6 +94,18 @@ relocate_kernel(unsigned long indirectio
unsigned long start_address,
unsigned int has_pae) ATTRIB_NORET;
 
+#define ARCH_HAS_ARCH_KIMAGE
+
+struct arch_kimage {
+   pgd_t *pgd;
+#ifdef CONFIG_X86_PAE
+   pmd_t *pmd0;
+   pmd_t *pmd1;
+#endif
+   pte_t *pte0;
+   pte_t *pte1;
+};
+
 #endif /* __ASSEMBLY__ */

[PATCH -mm 0/2] kexec/i386: kexec page table code clean up

2008-01-08 Thread Huang, Ying

This patchset cleans up page table setup code of kexec on i386.

This patchset is based on 2.6.24-rc5-mm1 and has been tested on i386
with/without PAE enabled.

Best Regards,
Huang Ying


Re: [PATCH 0/5][V2]PCI: x86 MMCONFIG: Preamble

2008-01-08 Thread Tony Camuso


Greg,

Let me know what you think, and if there's anything you want me
to fix/change.


[EMAIL PROTECTED] wrote:

OVERVIEW


This patch-set is being resubmitted after some discussion
and in response to critiques of the original submission
made by the lkml community.

The patches should be applied in sequence to obviate any
possible build problems.

The patch-set was built against 2.6.24-rc6

The large amount of text in the explanation below is due to
the nature of the problem and the discussion engendered on
lkml by my first submission.

 arch/x86/pci/common.c  |   69 
 arch/x86/pci/direct.c  |   49 
 arch/x86/pci/init.c|   18 +--
 arch/x86/pci/mmconfig-shared.c |3 +-
 arch/x86/pci/pci.h |3 ++
 drivers/pci/pci.c  |9 +
 drivers/pci/pci.h  |1 +
 drivers/pci/probe.c|5 +++
 8 files changed, 146 insertions(+), 11 deletions(-)

Description
===========

There exist northbridges that do not respond correctly to
PCI MMCONFIG accesses in x86 platforms. Among them are
the AMD 8132. Here is an excerpt from an errata page
published by AMD at the following link.
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf

The base configuration space of the AMD-8132 and
PCI(-X) devices attached to it are accessible using
only the mechanism defined in PCI 2.3. Registers of
PCI-X Mode 2 devices attached to the AMD-8132 in the
extended configuration space are not accessible. The
AMD-8132 has no registers in the extended configuration
space.

Fix Planned
No

On bus numbers above that defined by PCI_MMCFG_MAX_CHECK_BUS, and
whose pci_ops field points to the mmconf ops, each device is
checked for mmconf compliance by comparing an MMCONFIG read
to a Legacy PCI config read of the vendor/device dword.

A miscompare means that a device does not correctly respond
to MMCONFIG accesses. When the patch code detects this
condition, the bus that serves this device, and all
subordinate buses, will be programmed to use Legacy PCI
Config accesses.
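
The check and fallback described above might look roughly like the sketch
below. It is an illustration only, not code from the patch set: the accessor
type, struct bus_state, the function names and the two fake accessors are all
invented here so the logic can be exercised outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accessor: returns the vendor/device dword at offset 0. */
typedef uint32_t (*cfg_read_fn)(unsigned bus, unsigned devfn);

struct bus_state {
	unsigned secondary;	/* this bus's own number */
	unsigned subordinate;	/* highest bus number behind it */
	bool use_legacy;	/* set once MMCONFIG is distrusted */
};

/* Fake accessors standing in for real hardware reads (made-up value). */
static uint32_t fake_legacy_read(unsigned bus, unsigned devfn)
{
	(void)bus; (void)devfn;
	return 0x12345678u;
}

static uint32_t fake_broken_mmconf_read(unsigned bus, unsigned devfn)
{
	(void)bus; (void)devfn;
	return 0xffffffffu;	/* what a non-responding device might return */
}

/* True if both mechanisms agree on the vendor/device dword. */
static bool mmconf_matches_legacy(cfg_read_fn mmconf, cfg_read_fn legacy,
				  unsigned bus, unsigned devfn)
{
	return mmconf(bus, devfn) == legacy(bus, devfn);
}

/* Mark the failing bus and every bus in its subordinate range. */
static void force_legacy(struct bus_state *buses, unsigned nbuses,
			 const struct bus_state *bad)
{
	for (unsigned i = 0; i < nbuses; i++)
		if (buses[i].secondary >= bad->secondary &&
		    buses[i].secondary <= bad->subordinate)
			buses[i].use_legacy = true;
}
```

The range test relies on bridges numbering their child buses contiguously
between the secondary and subordinate bus numbers.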

This patch set does not scan the first few buses, a number
defined by PCI_MMCFG_MAX_CHECK_BUS, because the routine
unreachable_devices() in arch/x86/pci/mmconfig-shared.c
already does this with device granularity using a bitmap.


Alternatives Considered
=======================

We chose not to extend the bitmap mechanism, since it would
have become too large in order to cover all possible buses
on all possible segments, and having the lookup into such a
large bitmap inline with every pci config access would
have had an adverse effect on performance.
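
To put a rough number on the size concern, here is a back-of-the-envelope
helper. It assumes one bit per device, 32 device slots per bus, up to 256
buses per segment and a 16-bit segment number; the function name is invented
for this note and nothing like it appears in the patches.

```c
#include <stddef.h>

/* Bytes needed for a bitmap with one bit per device slot. */
static size_t devbitmap_bytes(size_t segments, size_t buses_per_segment)
{
	const size_t devices_per_bus = 32;

	return segments * buses_per_segment * devices_per_bus / 8;
}
```

Under those assumptions, covering 16 buses of one segment takes 64 bytes and
a full segment takes 1 KB, while every bus on every possible segment would
take 64 MB.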

An alternative would have been to allocate a bitmap on a
per-bus basis, so every bus would have a bitmap of its own
unreachable devices. This could be done with a new field
in the pci_bus struct.

However, the only devices that need to perform a mmconfig
translation, and have problems with it, are northbridges.
Once the translation is made and forwarded on the pci bus,
the consumers of the pci config address do not know or care
whether it was generated by an mmconfig or legacy pci access
mechanism.

This being the case, the secondary and subordinate buses
also require legacy pci access, even though they are not
aware of the mechanism, because the pci config access must
still be translated by the root bridge to get to them.

Also considered in the discussion on lkml was a suggestion
by Loic Prylli to always use legacy pci configuration for
the first 256 bytes of config space. This would certainly
have fixed the problem of configuring and booting.

It would also have fixed the problem with bus sizing code
programming devices to claim MMIO space that belongs to
MMCONFIG and thereby hang the system (see below).

However, there are devices (tg3) that make a lot of runtime
use of that area of pci config space, so forcing legacy pci
config access on all devices for the few situations where
such a measure would be necessary, when in most situations
mmconfig works just fine, was a performance penalty the
consensus was unwilling to permit.


What this patch set does not fix


This patch-set does not detect or fix the condition where bus
sizing code programs a device to consume MMIO space that also
happens to include the MMCONFIG address range. This is a
BIOS bug that we have seen in more than one system.

When BIOS maps MMCONFIG space into an MMIO region below 4GB,
some devices, typically graphics chips that want 256 MB or more
of MMIO, will be inadvertently programmed by bus sizing code
to claim this space. At that point, no further boot progress
can be made.
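
Again, this patch set does not attempt it, but the condition itself is an
ordinary range-overlap test. A minimal sketch, with struct range and
ranges_overlap() being names invented for this note:

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, start + len). */
struct range {
	uint64_t start;
	uint64_t len;
};

/* True if the two ranges share at least one address. */
static bool ranges_overlap(struct range a, struct range b)
{
	return a.start < b.start + b.len && b.start < a.start + a.len;
}
```

For example, with an MMCONFIG window of 256 MB at 0xe0000000 (an assumed
placement), a 512 MB BAR at 0xd0000000 collides with it, while a 256 MB BAR
at 0xd0000000 merely ends where the window begins.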

Up to now, the workaround for such systems is to type
"pci=nommconf" at the boot command line.

There was a suggestion made by Ivan Kokshaysky to limit accesses
to pci config space at offsets within

Re: [PATCH 0/4] add task handling notifier

2008-01-08 Thread Matt Helsley

On Tue, 2008-01-08 at 14:14 -0800, Andrew Morton wrote:
> On Tue, 08 Jan 2008 13:38:03 +0000
> "Jan Beulich" <[EMAIL PROTECTED]> wrote:
> 
> > >>> Andrew Morton <[EMAIL PROTECTED]> 25.12.07 23:05 >>>
> > >On Sun, 23 Dec 2007 12:26:21 +0000 Christoph Hellwig <[EMAIL PROTECTED]>
> > >wrote:
> > >
> > >> On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> > >> > With more and more sub-systems/sub-components leaving their footprint
> > >> > in task handling functions, it seems reasonable to add notifiers that
> > >> > these components can use instead of having them all patch themselves
> > >> > directly into core files.
> > >> 
> > >> I agree that we probably want something like this.  As do some others,
> > >> so we already had a few attempts at similar things.  The first one
> > >> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> > >> includes allocating per-task data for its users.  Then also from SGI
> > >> there has been a simplified version called pnotify that's also available
> > >> from the website above.
> > >> 
> > >> Later Matt Helsley had something called "Task Watchers" which lwn has
> > >> an article on: http://lwn.net/Articles/208117/.
> > >> 
> > >> For some reason neither ever made a lot of progress (performance
> > >> problems?).
> > >> 
> > >
> > >I had it in -mm, sorted out all the problems but ended up not pulling the
> > >trigger.
> > >
> > >Problem is, it adds runtime overhead purely for the convenience of kernel
> > >programmers, and I don't think that's a good tradeoff.
> > >
> > >Sprinkling direct calls into a few well-known sites won't kill us, and
> > >we've survived this long.  Why not keep doing that, and save everyone a few
> > >cycles?
> > 
> > Am I to conclude then that there's no point in addressing the issues other
> > people pointed out? While I (obviously, since I submitted the patch 
> > disagree),
> > I'm not certain how others feel. My main point for disagreement here is (I'm
> > sorry to repeat this) that as long as certain code isn't allowed into the 
> > kernel
> > I think it is not unreasonable to at least expect the kernel to provide some
> > fundamental infrastructure that can be used for those (supposedly
> > unacceptable) bits. All I did here was utilizing the base infrastructure I 
> > want
> > added to clean up code that appeared pretty ad-hoc.
> > 
> 
> Ah.  That's a brand new requirement.

In all fairness it's not really a brand new requirement -- just one that
wasn't strongly emphasized during prior attempts to get something like
this in.

I had a mostly-working patch for this on top of the Task Watchers v2
patch set. I never posted that specific patch because it had a race with
module unloading and the fix only increased the overhead you were
unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X]
description for Task Watchers v2 (http://lwn.net/Articles/207873/):

"TODO:
...
I'm working on three more patches that add support for creating a task
watcher from within a module using an ELF section. They haven't received
as much attention since I've been focusing on measuring the performance
impact of these patches."



Would tainting the kernel upon registration of out-of-tree "notifiers"
be more acceptable?

Cheers,
-Matt Helsley


Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread Zachary Amsden

On Tue, 2008-01-08 at 14:15 -0500, David P. Reed wrote:
> Alan Cox wrote:
> > The natsemi docs here say otherwise. I trust them not you.
> >   
> As well you should. I am honestly curious (for my own satisfaction) as 
> to what the natsemi docs say the delay code should do  (can't imagine 
> they say "use io port 80 because it is unused").  I don't have any 

What is the outcome of this thread?  Are we going to use timing based
port delays, or can we finally drop these things entirely on 64-bit
architectures?

I have a doubly vested interest in this, both as the owner of an
affected HP dv9210us laptop and as a maintainer of paravirt code - and
would like 64-bit Linux code to stop using I/O to port 0x80 in both
cases (as I suspect would every other person involved with
virtualization).

BTW, it isn't ever safe to pass port 0x80 through to hardware from a
virtual machine; some OSes use port 0x80 as a hardware available scratch
register (I believe Darwin/x86 did/does this during boot).  This means
simultaneous execution of two virtual machines can interleave port 0x80
values or share data with a hardware provided covert channel.  This
means KVM should be trapping port 0x80 access, which is really
expensive, or alternatively, Linux should not be using port 0x80 for
timing bus access on modern (64-bit) hardware.

I've tried to follow this thread, but with all the jabs, 1-ups, and
obscure legacy hardware pageantry going on, it isn't clear what we're
really doing.

Thanks,

Zach


Re: [patch 05/19] split LRU lists into anon & file sets

2008-01-08 Thread Rik van Riel

On Tue, 8 Jan 2008 14:42:03 -0800 (PST)
Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Tue, 8 Jan 2008, Rik van Riel wrote:
> 
> > > Also would it be possible to create generic functions that can move pages 
> > > in pagevecs to an arbitrary lru list?
> > 
> > What would you use those functions for?
> 
> We keep on duplicating the pagevec lru operation functions in mm/swap.c. 
> Some generic stuff would reduce the code size.

Good idea.  Added to my TODO list :)

-- 
All rights reversed.

Re: [PATCH] libata and starting/stopping ATAPI floppy devices

2008-01-08 Thread Tejun Heo

Ondrej Zary wrote:
> Hello,
> I switched to libata drivers for my onboard PATA controller (PIIX4) recently. 
> Everything works fine except that kernel tries to start not only my hard 
> drive (sda) but also LS-120 floppy drive (sdb) which does not like it:
> 
> sd 0:0:0:0: [sda] Starting disk
> ata1.00: configured for UDMA/33
> sd 0:0:0:0: [sda] 58633344 512-byte hardware sectors (30020 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
> DPO or FUA
> sd 1:0:1:0: [sdb] Starting disk
> ata2.00: configured for UDMA/33
> ata2.01: configured for PIO2
> sd 1:0:1:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
> sd 1:0:1:0: [sdb] Sense Key : 0x2 [current]
> sd 1:0:1:0: [sdb] ASC=0x3a ASCQ=0x0
> 
> 
> The question is: is it correct? Or a patch like this should be applied?

Yeah, looks good to me.  Please reformat the message w/ S-O-B.

Acked-by: Tejun Heo <[EMAIL PROTECTED]>

-- 
tejun

Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Andi Kleen

On Wed, Jan 09, 2008 at 02:34:46AM +0000, Dave Airlie wrote:
> 
> [This is an initial RFC, but I'd like to have this patch in before 2.6.24 goes
> final as it really breaks this useful feature]
> 
> mmiotrace, the MMIO access tracer used to reverse engineer binary blobs,
> used this notifier interface and is planned to be pushed upstream.
> 
> Having users able to just use the tracer module without having to rebuild 
> their kernel to add in a page fault handler hack means we get a lot 
> greater coverage for reverse engineering efforts.
> 
> Signed-off-by: David Airlie <[EMAIL PROTECTED]>

Acked-by: Andi Kleen <[EMAIL PROTECTED]>

I never liked the original patch.

-Andi


[PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"

2008-01-08 Thread Dave Airlie


[This is an initial RFC, but I'd like to have this patch in before 2.6.24 goes
final as it really breaks this useful feature]

mmiotrace, the MMIO access tracer used to reverse engineer binary blobs,
used this notifier interface and is planned to be pushed upstream.

Having users able to just use the tracer module without having to rebuild 
their kernel to add in a page fault handler hack means we get a lot 
greater coverage for reverse engineering efforts.

Signed-off-by: David Airlie <[EMAIL PROTECTED]>

This reverts commit 74a0b5762713a26496db72eac34fbbed46f20fce.
Conflicts:

include/asm-avr32/kprobes.h
include/asm-ia64/kprobes.h
include/asm-s390/kprobes.h
include/asm-x86/kdebug_32.h
include/asm-x86/kdebug_64.h
include/asm-x86/kprobes_64.h
---
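For anyone unfamiliar with the mechanism being restored, a notifier chain is
essentially a list of callbacks that is walked on each event until one of
them claims it. The toy user-space version below is only meant to show that
shape. Every name in it is invented for this note, and it has none of the
locking/RCU protection of the kernel's atomic notifier chains.

```c
#include <stddef.h>

#define SKETCH_DONE 0	/* handler did not claim the event */
#define SKETCH_STOP 1	/* handler claimed it; stop walking */

struct notifier {
	int (*call)(struct notifier *self, unsigned long event, void *data);
	struct notifier *next;
};

static struct notifier *chain_head;

static void chain_register(struct notifier *nb)
{
	nb->next = chain_head;
	chain_head = nb;
}

static void chain_unregister(struct notifier *nb)
{
	for (struct notifier **p = &chain_head; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			return;
		}
	}
}

/* Walk the chain; return the last handler's verdict. */
static int chain_call(unsigned long event, void *data)
{
	int ret = SKETCH_DONE;

	for (struct notifier *nb = chain_head; nb; nb = nb->next) {
		ret = nb->call(nb, event, data);
		if (ret == SKETCH_STOP)
			break;
	}
	return ret;
}

/* Example subscriber: counts every event and claims event 14. */
static int count_handler(struct notifier *self, unsigned long event,
			 void *data)
{
	(void)self;
	++*(int *)data;
	return event == 14 ? SKETCH_STOP : SKETCH_DONE;
}
```

The point relevant to this revert is that a module can hook in with a
register call at load time and an unregister call at unload time, without the
fault handler itself having to be patched.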
 arch/x86/kernel/kprobes_32.c  |3 +-
 arch/x86/kernel/kprobes_64.c  |1 +
 arch/x86/mm/fault_32.c|   43 ++-
 arch/x86/mm/fault_64.c|   44 +++-
 include/asm-avr32/kdebug.h|   16 ++
 include/asm-avr32/kprobes.h   |1 +
 include/asm-ia64/kdebug.h |   15 ++
 include/asm-ia64/kprobes.h|1 +
 include/asm-powerpc/kdebug.h  |   19 +
 include/asm-powerpc/kprobes.h |1 +
 include/asm-s390/kdebug.h |   15 ++
 include/asm-s390/kprobes.h|1 +
 include/asm-sh/kdebug.h   |2 +
 include/asm-sparc64/kdebug.h  |   18 
 include/asm-sparc64/kprobes.h |1 +
 include/asm-x86/kdebug.h  |3 ++
 include/asm-x86/kprobes_32.h  |2 +-
 include/asm-x86/kprobes_64.h  |1 +
 kernel/kprobes.c  |   39 +--
 19 files changed, 183 insertions(+), 43 deletions(-)

diff --git a/arch/x86/kernel/kprobes_32.c b/arch/x86/kernel/kprobes_32.c
index 3a020f7..1ba8fee 100644
--- a/arch/x86/kernel/kprobes_32.c
+++ b/arch/x86/kernel/kprobes_32.c
@@ -586,7 +586,7 @@ out:
return 1;
 }
 
-int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
+static int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 {
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -668,6 +668,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
ret = NOTIFY_STOP;
break;
case DIE_GPF:
+   case DIE_PAGE_FAULT:
/* kprobe_running() needs smp_processor_id() */
preempt_disable();
if (kprobe_running() &&
diff --git a/arch/x86/kernel/kprobes_64.c b/arch/x86/kernel/kprobes_64.c
index 5df19a9..279cea7 100644
--- a/arch/x86/kernel/kprobes_64.c
+++ b/arch/x86/kernel/kprobes_64.c
@@ -654,6 +654,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
ret = NOTIFY_STOP;
break;
case DIE_GPF:
+   case DIE_PAGE_FAULT:
/* kprobe_running() needs smp_processor_id() */
preempt_disable();
if (kprobe_running() &&
diff --git a/arch/x86/mm/fault_32.c b/arch/x86/mm/fault_32.c
index a2273d4..f03cc93 100644
--- a/arch/x86/mm/fault_32.c
+++ b/arch/x86/mm/fault_32.c
@@ -25,7 +25,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
@@ -33,27 +32,33 @@
 
 extern void die(const char *,struct pt_regs *,long);
 
-#ifdef CONFIG_KPROBES
-static inline int notify_page_fault(struct pt_regs *regs)
+static ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain);
+
+int register_page_fault_notifier(struct notifier_block *nb)
 {
-   int ret = 0;
-
-   /* kprobe_running() needs smp_processor_id() */
-   if (!user_mode_vm(regs)) {
-   preempt_disable();
-   if (kprobe_running() && kprobe_fault_handler(regs, 14))
-   ret = 1;
-   preempt_enable();
-   }
+   vmalloc_sync_all();
+   return atomic_notifier_chain_register(&notify_page_fault_chain, nb);
+}
+EXPORT_SYMBOL_GPL(register_page_fault_notifier);
 
-   return ret;
+int unregister_page_fault_notifier(struct notifier_block *nb)
+{
+   return atomic_notifier_chain_unregister(&notify_page_fault_chain, nb);
 }
-#else
-static inline int notify_page_fault(struct pt_regs *regs)
+EXPORT_SYMBOL_GPL(unregister_page_fault_notifier);
+
+static inline int notify_page_fault(struct pt_regs *regs, long err)
 {
-   return 0;
+   struct die_args args = {
+   .regs = regs,
+   .str = "page fault",
+   .err = err,
+   .trapnr = 14,
+   .signr = SIGSEGV
+   };
+   return atomic_notifier_call_chain(&notify_page_fault_chain,
+ DIE_PAGE_FAULT, &args);
 }
-#endif
 
 /*
  * Return EIP plus the CS segment base.  The segment limit is also
@@ -331,7 +336,7 @@ fastcall void __kprobes do_page_fault(struct pt_regs *regs,
if (unlikely(address

Re: [PATCH] AMD Thermal Interrupt Support

2008-01-08 Thread Andi Kleen

On Tue, Jan 08, 2008 at 06:28:18PM -0800, Russell Leidich wrote:
> On Jan 8, 2008 3:52 PM, Andi Kleen <[EMAIL PROTECTED]> wrote:
> > >  ENTRY(thermal_interrupt)
> > > - apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt
> > > + apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt(%rip)
> >
> > Are you sure a * is not needed?  I would have thought it would jump
> > to the variable instead of through it. But if it works it's ok for me.
> 
> I will test to make sure it works.  I don't think stars mean anything
> in AT&T X86-64.

% cat t.s
call foo
call *foo
% as -o t.o t.s
% objdump -S t.o

t.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0:   e8 00 00 00 00          callq  0x5
   5:   ff 14 25 00 00 00 00    callq  *0x0
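
In C terms, the two encodings above correspond roughly to calling a function
by name versus calling through a variable that holds its address. The sketch
below uses made-up names and is only an analogy. A compiler is free to turn
the second form back into a direct call when it can see the pointer's value.

```c
/* Calling this by name is the analogue of `call foo`. */
static int handler(void)
{
	return 42;
}

/*
 * A variable holding a function's address.  Calling through it is the
 * analogue of `call *foo`: the target is loaded from memory first.
 */
static int (*handler_ptr)(void) = handler;

static int call_direct(void)
{
	return handler();
}

static int call_indirect(void)
{
	return handler_ptr();
}
```

If the symbol names a variable rather than code, only the indirect form ends
up at the intended target.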

-Andi

Re: Kbuild update

2008-01-08 Thread WANG Cong


>> > If we can make this an official project for the Linux kernel, I
>> > think it won't be a big problem.
>> 
>> We don't even manage to maintain the English language texts properly,
>> and I am therefore not overly optimistic that we'll have the 
>> translations maintained properly for many years.
>Italian was 100% translated at one point in time.
>And the Linux Kernel Translation project has a number of
>spelling error fixes in queue (I dunno if they have been applied).
>
>So even when run as an external project it was ok for some languages,
>and having it official and someone taking patches to .po files would
>for sure allow more users to build a kernel.
>

Agreed.

That's the goal of TLKTP. Sam, can you contact the author of
TLKTP? Maybe we can talk to him to see if we can restart the
project. If so, I can help with the Chinese translation part.

Best regards.


 Cong


Re: [patch] split MMC_CAP_4_BIT_DATA

2008-01-08 Thread Bryan Wu

On Jan 9, 2008 4:49 AM, Pierre Ossman <[EMAIL PROTECTED]> wrote:
> On Tue, 8 Jan 2008 14:40:49 -0500
> Mike Frysinger <[EMAIL PROTECTED]> wrote:
>
> >
> > i dont understand what's confusing.  the Blackfin on chip host controller
> > only supports 1-bit MMC, but it supports 4-bit SD/SDIO.  this is a fact.
> > while it may be a stupid decision, it is what it is, and i need the
> > framework made more flexible in order to get the Blackfin driver merged
> > cleanly.  we do software for hardware, we dont do hardware.
>
> Well, since I've seen no _hardware_ differences between 4-bit MMC and 4-bit
> SD, "support" in this case must mean "vendor will guarantee it works". And
> that is not the kind of "support" that needs a distinction in the code.
>
> So, again, if you feel that there is a hardware difference between 4-bit MMC
> and 4-bit SD then please elaborate as it is my understanding that they are
> identical.
>

As Mike said, the reason for splitting this flag is a limitation of the
Blackfin on-chip SDIO controller.
Cliff has been working on it for a long time, so I have added him to the
thread. I hope he can clarify the confusing points.

Thanks
-Bryan Wu

Re: Believed resolved: SATA kern-buffRd read slow: based on promise driver bug

2008-01-08 Thread Tejun Heo

Linda Walsh wrote:
>Is 'main' diff between NCQ/TCQ that TCQ can re-arrange 'write'
> priority under driver control, whereas NCQ is mostly a FIFO queue?

No, NCQ can reorder, although I recently heard that Windows issues
overlapping NCQ commands and expects them to be processed in order (what
were they thinking?).

The biggest difference between TCQ and NCQ is that TCQ is for SCSI while
NCQ is for ATA.  Functional differences include a larger number of
available tags and ordered tags for TCQ.  The former doesn't matter for a
single disk.  The latter may make some difference, but on a single disk
not by much.

> Am trying to differentiate NCQ/TCQ and SAS v. SCSI benefits.
> It seems both support (SAS & SATA) some type of port-multiplier/
> multiplexor/ option to allow more disks/port.
> 
> However, (please correct?) SATA uses a hub type architecture while
> SAS uses a switch architecture.  My experience with network hubs vs.
> switches is that network hubs can be much slower if there is
> communication contention.  Is the word 'hub' being used in the
> "shared-communication media sense", or is someone using the term
> 'hub' as a [sic] replacement for a 'switch'?

A port multiplier is a switch too.  It doesn't broadcast anything and
definitely has forwarding buffers inside.  An analogy that makes more
sense: an expander is like a router and a port multiplier is like a
switch.  Unless you wanna nest them, they aren't that different.

-- 
tejun

Re: Kbuild update

2008-01-08 Thread WANG Cong


>
>"only" is the wrong word in this context.
>
>If someone would update the translations for one language every
>3 months for the next years that would be great and disprove my
>concerns.
>
>After all, updates every 3 months would beat the maintenance level of
>at least three of our architectures...

Hmm, yes.

>
>And don't underestimate the amount of work required - even when talking 
>about requiring "only" 10% of the help texts translated that's a four 
>digit number of lines to translate.

Thanks for your point. I agree that the initial work is not so easy.

>
>> If we can make this an official project for the Linux kernel, I
>> think it won't be a big problem.
>
>We don't even manage to maintain the English language texts properly,
>and I am therefore not overly optimistic that we'll have the 
>translations maintained properly for many years.
>
>OTOH, if someone wouldn't just blindly translate the outdated English 
>texts but also review the English texts when translating this alone 
>might be worth it...

Fully agreed.

Maybe we can restart TLKTP?



Re: [PATCH] AMD Thermal Interrupt Support

2008-01-08 Thread Russell Leidich

On Jan 8, 2008 3:52 PM, Andi Kleen <[EMAIL PROTECTED]> wrote:
> >  ENTRY(thermal_interrupt)
> > - apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt
> > + apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt(%rip)
>
> Are you sure a * is not needed?  I would have thought it would jump
> to the variable instead of through it. But if it works it's ok for me.

I will test to make sure it works.  I don't think stars mean anything
in AT&T x86-64.
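
For reference on the `*` question: in GNU assembler AT&T syntax the `*`
prefix marks an indirect jump/call operand, so `call *ptr(%rip)` calls
through the pointer stored at `ptr` rather than branching to `ptr`
itself. A small C sketch of the same distinction, using hypothetical
names (thermal_handler, set_thermal_handler) that are not taken from the
patch:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*thermal_handler_t)(void);

static int default_hits, vendor_hits;

static void default_thermal_handler(void) { default_hits++; }
static void vendor_thermal_handler(void)  { vendor_hits++; }

/* Hypothetical handler variable that the interrupt stub calls through. */
static thermal_handler_t thermal_handler = default_thermal_handler;

/* Swap the handler at run time, e.g. once a vendor driver registers. */
static void set_thermal_handler(thermal_handler_t h)
{
	assert(h != NULL);
	thermal_handler = h;
}

/*
 * Indirect call: the pointer is loaded from memory and then called.
 * Written by hand in AT&T syntax this would be the form
 * `call *thermal_handler(%rip)`; a direct call would name the function.
 */
static void thermal_interrupt(void)
{
	thermal_handler();
}
```

Whether the entry macro needs the explicit `*` still has to be checked
against the assembler actually used, so the test mentioned above remains
the deciding factor.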

>
> The rest of the patch looks ok to me.

Thank you!  I will give it a final test and submit the official patch this week.

>
> -Andi
>
>



-- 
Russell Leidich

Re: Kernel Oops?

2008-01-08 Thread Stoyan Gaydarov

On Jan 7, 2008 5:30 PM, Alan Cox <[EMAIL PROTECTED]> wrote:
> On Mon, 7 Jan 2008 17:15:01 -0600
> "Stoyan Gaydarov" <[EMAIL PROTECTED]> wrote:
>
> > Today I upgraded my kernel from 2.6.23.9 to 2.6.23.12 and in the past
> > 30 minutes I have had to restart my computer twice.
> > I believe its a kernel oops or a kernel panic because when the
> > computer freezes it blinks the caps and scroll lock LEDs.
> > I don't know what is causing the problem but I am willing to help, I
> > can provide you with any information you need.
> > The only problem is that I don't know how to debug the system myself.
> > If anyone can tell me what to do to I can do it and give back the
> > information.
>
> When the machine hangs in graphical mode its quite hard to get the data
> out - one of the long term todo items is to fix that.
>
> Boot the machine and leave it in text mode (or if it boots to graphical
> mode then switch to a text console/text mode) and wait.. with "luck" it
> will show the same problem in text mode and give you a meaningful screen
> dump you can then write down (or grab with a digital camera)
>
> Alan
>

I reverted to a clean install of Slackware 12.0 after trying, without
luck, to get it to fail again. Then I installed the 2.6.23.9 kernel and
continued to use it regularly. A few minutes ago I had to restart the
computer because it had frozen again, the same way. This time, though,
I got a kernel oops message while rebooting and the machine didn't boot
completely. I could not copy the message, but I took a picture and
have retyped the screen here (sorry about the formatting):

Stack: 0010 00d0 0001 00d0 c20fb980 c2104000 c2103e00 0246
  c0a32fc0 47807ae8  c23eeaa0 00d0 0282
c20fb980 c026661b
  c23eeaa0  f586df04 c23eeaa0 c02227f2 0246
 c225c480
Call Trace:
[] kmem_cache_alloc+0x6b/0x90
[] dup_fd+0x22/0x2c0
[] getnstimeofday+0x36/0xc0
[] copy_files+0x41/0x60
[] copy_process+0x488/0x11a0
[] alloc_pid+0x152/0x280
[] do_fork+0x76/0x230
[] recalc_sigpending+0x5d/0xe0
[] sigprocmask+0x5d/0xe0
[] sys_clone+0x32/0x40
[] syscall_call+0x7/0xb
[] __mutex_lock_interruptible_slowpath+0xb0/0xc0
===
Code: 5b 5e 5f 5d c3 8b 7a 10 89 d0 c7 42 34 01 00 00 00 83 c0 10 39 c7 74 b6 8b
 4c 24 10 8b 77 10 3b b1 98 00 00 00 0f 82 1d ff ff ff <0f> 0b eb fe 8b 4c 24 18
 8b 54 24 18 8b 41 08 83 c2 08 89 78 04
EIP: [] cache_alloc_refill+0x1bd/0x540 SS:ESP 0068:f586de7c
INIT: Entering runlevel: 4
Going multiuser...
Updating shared library links:  /sbin/ldconfig &


Hope that someone can find the problem and fix it

Re: [PATCH 0/4] add task handling notifier

2008-01-08 Thread Matt Helsley

On Sun, 2007-12-23 at 12:26 +, Christoph Hellwig wrote:
> On Thu, Dec 20, 2007 at 01:11:24PM +, Jan Beulich wrote:
> > With more and more sub-systems/sub-components leaving their footprint
> > in task handling functions, it seems reasonable to add notifiers that
> > these components can use instead of having them all patch themselves
> > directly into core files.
> 
> I agree that we probably want something like this.  As do some others,
> so we already had a few attempts at similar things.  The first one
> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> includes allocating per-task data for its users.  Then also from SGI
> there has been a simplified version called pnotify that's also available
> from the website above.
> 
> Later Matt Helsley had something called "Task Watchers" which lwn has
> an article on: http://lwn.net/Articles/208117/.

Apologies for the late reply -- I haven't had internet access for the
last few weeks.

> For some reason neither ever made a lot of progress (performance
> problems?).

Yeah. Some discussion on measuring the performance of Task Watchers:
http://thread.gmane.org/gmane.linux.lse/4698

The requirements for Task Watchers were:

  - Allow sleeping in most/all notifier functions in these paths:
        fork
        exec
        exit
        change [re][ug]id
  - No performance overhead
  - One "chain" per path ("I only care about exec().")
  - Easy to use
  - Scales to large numbers of CPUs
  - Useful to make most in-tree code more readable. Task Watchers took
    direct calls to these pieces of code out of the fork/exec/exit paths:
        audit
        semundo
        cpusets
        mempolicy
        trace irqflags
        lockdep
        keys (for processes -- not for thread groups)
        process events connector
  - Useful for loadable modules
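
To illustrate the "one chain per path" requirement, here is a minimal
user-space sketch. It is not the Task Watchers code: the names
(task_event, task_watcher, watch_task_event, notify_task_watchers) are
hypothetical, and it leaves out locking, sleeping and module unload
handling entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types: one chain per task-handling path. */
enum task_event { TASK_FORK, TASK_EXEC, TASK_EXIT, TASK_EVENT_MAX };

struct task_watcher {
	void (*notify)(int pid, void *data);
	void *data;
	struct task_watcher *next;
};

/* One singly linked chain of watchers per event. */
static struct task_watcher *chains[TASK_EVENT_MAX];

/* Append a watcher to the chain of exactly one event. */
static void watch_task_event(enum task_event ev, struct task_watcher *w)
{
	struct task_watcher **p;

	assert(ev < TASK_EVENT_MAX);
	for (p = &chains[ev]; *p; p = &(*p)->next)
		;
	w->next = NULL;
	*p = w;
}

/* Called from the (simulated) fork/exec/exit path; returns how many ran. */
static int notify_task_watchers(enum task_event ev, int pid)
{
	struct task_watcher *w;
	int n = 0;

	for (w = chains[ev]; w; w = w->next, n++)
		w->notify(pid, w->data);
	return n;
}

/* Example watcher: counts how often its event fired. */
static void count_event(int pid, void *data)
{
	(void)pid;
	++*(int *)data;
}
```

A component that only cares about exec() registers on TASK_EXEC and is
never called on fork or exit; the cost on every other path is walking an
empty chain.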

Performance overhead in microbenchmarks was measurable at around 1% (see
the URL above). Overhead on benchmarks like kernbench, on the other hand,
was within the noise margin (around 1.6%), so I couldn't determine the
overhead there.

I never got the loadable module part completely working due to races
between notifier functions and the module unload path. The solution to
the races seemed to require adding more overhead to the notifier
function paths (SRCU-like grace periods).

I stopped pushing the patch set because I hadn't found any new
optimizations to offset the overheads while still meeting all the
requirements and Andrew still felt that the "make it more readable"
argument was not sufficient to justify its inclusion.

Jan, instead of adding notifiers could utrace be used or made to work
for modules? Also, please add me to the Cc list for any reposts of the
entire series. Thanks!

Cheers,
-Matt Helsley


Re: translations (Re: Kbuild update)

2008-01-08 Thread WANG Cong


>"I will use ...
>http://images.google.cz/images?svnum=100=1=cs=firefox-a=org.mozilla%3Acs%3Aofficial=I+will+use+Google+before=Hledat+obr%C3%A1zky
>... for making translations..."
>http://www.google.com/translate?u=http%3A%2F%2Flxr.linux.no%2Flinux%2FDocumentation%2FHOWTO=en%7Czh-TW=en=UTF8
>?
>
>In case if people will help Google to have better quality of translation,
>that will be better generally for much bigger number of *people*,
>especially in China, isn't it?

Perhaps yes.

But at least for now, that kind of translation still sucks. It can't
satisfy me.

>
>Making any official world-domination/new-world-order projects with
>Linux will not help IMHO. Very fast code flow and almost no up to date
>documentation is still relevant and google search + email archives
>are not going to be obsolete in the near future.
>
>Also, future of the linux codebase with Chinese comments in C or in
>ASM is kind of wired nightmare. Those, who cannot read actual source
>code (i.e. C) will not go too far.
>
>So, translation guys, maybe you will stop making noise and will start
>to make e.g. less buggy Linux? Greg KH have much more stuff to care,
>than some translations IMHO.

I never said we should translate C comments. What we want to translate
are the strings in Kconfig.

I absolutely agree that we should focus on the existing bugs in Linux,
but like Greg's inclusion of some kernel doc translations, this kind
of work is really helpful for attracting kernel newbies from
non-English-speaking countries. Even if we can't make it an official
effort, unofficial community work like TLKTP is still worthwhile. Believe
me, I am leading a local LUG at my college and I found that one _big_
reason why newbies are afraid of the Linux kernel is English, rather
than the C tricks or low-level programming.

Regards.


 Cong


[PATCH] x86: Use fixup_exception() in traps_64.c

2008-01-08 Thread Harvey Harrison

Use the fixup_exception() helper instead of the open-coded
search_exception_tables() users.

Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]>
---
Ingo, this depends on my patch in x86.git unifying extable.c that
introduces fixup_exception() to X86_64.
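
For readers without the extable.c patch at hand, a simplified user-space
model of the pattern being folded into the helper, based only on the
open-coded lines removed below. The struct layouts, the `insn` field
name and the toy table are assumptions for illustration, not the kernel
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; `insn` is an assumed name. */
struct exception_table_entry { unsigned long insn, fixup; };
struct pt_regs { unsigned long ip; };

/* Toy table: a fault at 0x1000 resumes at 0x2000, and so on. */
static const struct exception_table_entry extable[] = {
	{ 0x1000, 0x2000 },
	{ 0x1040, 0x2010 },
};

/* Linear lookup, for illustration only. */
static const struct exception_table_entry *
search_exception_tables(unsigned long addr)
{
	size_t i;

	for (i = 0; i < sizeof(extable) / sizeof(extable[0]); i++)
		if (extable[i].insn == addr)
			return &extable[i];
	return NULL;
}

/* Returns 1 and redirects regs->ip if a fixup exists, otherwise 0. */
static int fixup_exception(struct pt_regs *regs)
{
	const struct exception_table_entry *fixup;

	assert(regs != NULL);
	fixup = search_exception_tables(regs->ip);
	if (!fixup)
		return 0;
	regs->ip = fixup->fixup;
	return 1;
}
```

Each call site then shrinks to a single test of the return value, which
is the shape of the three hunks in this patch.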

 arch/x86/kernel/traps_64.c |   47 ++-
 1 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/traps_64.c b/arch/x86/kernel/traps_64.c
index e3d1ca1..c173687 100644
--- a/arch/x86/kernel/traps_64.c
+++ b/arch/x86/kernel/traps_64.c
@@ -606,19 +606,12 @@ static void __kprobes do_trap(int trapnr, int signr, char *str,
 	}
 
 
-	/* kernel trap */
-	{
-		const struct exception_table_entry *fixup;
-		fixup = search_exception_tables(regs->ip);
-		if (fixup)
-			regs->ip = fixup->fixup;
-		else {
-			tsk->thread.error_code = error_code;
-			tsk->thread.trap_no = trapnr;
-			die(str, regs, error_code);
-		}
-		return;
+	if (!fixup_exception(regs)) {
+		tsk->thread.error_code = error_code;
+		tsk->thread.trap_no = trapnr;
+		die(str, regs, error_code);
 	}
+	return;
 }
 
 #define DO_ERROR(trapnr, signr, str, name) \
@@ -707,22 +700,15 @@ asmlinkage void __kprobes do_general_protection(struct pt_regs * regs,
 		return;
 	}
 
-	/* kernel gp */
-	{
-		const struct exception_table_entry *fixup;
-		fixup = search_exception_tables(regs->ip);
-		if (fixup) {
-			regs->ip = fixup->fixup;
-			return;
-		}
+	if (fixup_exception(regs))
+		return;
 
-		tsk->thread.error_code = error_code;
-		tsk->thread.trap_no = 13;
-		if (notify_die(DIE_GPF, "general protection fault", regs,
-					error_code, 13, SIGSEGV) == NOTIFY_STOP)
-			return;
-		die("general protection fault", regs, error_code);
-	}
+	tsk->thread.error_code = error_code;
+	tsk->thread.trap_no = 13;
+	if (notify_die(DIE_GPF, "general protection fault", regs,
+				error_code, 13, SIGSEGV) == NOTIFY_STOP)
+		return;
+	die("general protection fault", regs, error_code);
 }
 
 static __kprobes void
@@ -914,12 +900,9 @@ clear_TF_reenable:
 
 static int kernel_math_error(struct pt_regs *regs, const char *str, int trapnr)
 {
-	const struct exception_table_entry *fixup;
-	fixup = search_exception_tables(regs->ip);
-	if (fixup) {
-		regs->ip = fixup->fixup;
+	if (fixup_exception(regs))
 		return 1;
-	}
+
 	notify_die(DIE_GPF, str, regs, 0, trapnr, SIGFPE);
 	/* Illegal floating point operation in the kernel */
 	current->thread.trap_no = trapnr;
-- 
1.5.4.rc2.1164.g6451

