Re: The ext3 way of journalling
On Tue, 08 Jan 2008 22:21:02 EST, Kyle Moffett said:

> lvcreate -s -n "${VOLUME}-snap" "${VG}/${VOLUME}"
> Basically you can fsck the offline snapshot in the background.

Something the lvcreate manpage is specifically not clear about is: Does this create a snapshot of the *disk* at that moment, or does it capture "disk plus still-to-be-written blocks in the cache"? (Phrased differently, does it Do The Right Thing regarding "blocks queued before lvcreate" and "blocks queued for write after lvcreate"?)

If the snapshot doesn't capture the blocks queued but still unwritten by kjournald and similar, then you're still hitting the same old problems that you always get when you fsck an "active disk".
Re: RE : [tipc-discussion] /net/tipc/port.c: Use tipc_port_unlock
From: Jon Paul Maloy <[EMAIL PROTECTED]> Date: Tue, 8 Jan 2008 10:34:58 -0500 (EST) > I have no objections. I've applied this patch, thanks everyone. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFD] Incremental fsck
On Jan 8, 2008 8:40 PM, Al Boldi <[EMAIL PROTECTED]> wrote:
> Rik van Riel wrote:
> > Al Boldi <[EMAIL PROTECTED]> wrote:
> > > Has there been some thought about an incremental fsck?
> > >
> > > You know, somehow fencing a sub-dir to do an online fsck?
> >
> > Search for "chunkfs"
>
> Sure, and there is TileFS too.
>
> But why wouldn't it be possible to do this on the current fs infrastructure,
> using just a smart fsck, working incrementally on some sub-dir?

Several data structures are file system wide and require finding every allocated file and block to check that they are correct. In particular, block and inode bitmaps can't be checked per subdirectory.

http://infohost.nmt.edu/~val/review/chunkfs.pdf

-VAL
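The point about file-system-wide structures can be illustrated with a toy model in user-space C. This is only a sketch and not e2fsck code: the names toy_file, rebuild_bitmap, count_unaccounted, check_scope and NBLOCKS are invented here. It rebuilds a block bitmap from the files it is allowed to walk and compares the result with the on-disk bitmap. When only one subdirectory is walked, blocks owned by files elsewhere cannot be told apart from leaked blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBLOCKS 16

/* Hypothetical toy file: a directory id and the blocks it owns. */
struct toy_file {
	int dir;        /* which subdirectory the file lives in */
	int blocks[4];  /* block numbers, -1 terminates the list */
};

/* Mark every block referenced by files in 'dir' (all files if dir < 0). */
static void rebuild_bitmap(const struct toy_file *files, size_t nfiles,
			   int dir, unsigned char *bitmap)
{
	memset(bitmap, 0, NBLOCKS);
	for (size_t i = 0; i < nfiles; i++) {
		if (dir >= 0 && files[i].dir != dir)
			continue;
		for (int j = 0; j < 4 && files[i].blocks[j] >= 0; j++) {
			assert(files[i].blocks[j] < NBLOCKS);
			bitmap[files[i].blocks[j]] = 1;
		}
	}
}

/* Count blocks marked in use on disk that the walk did not account for. */
static int count_unaccounted(const unsigned char *on_disk,
			     const unsigned char *rebuilt)
{
	int n = 0;

	for (int b = 0; b < NBLOCKS; b++)
		if (on_disk[b] && !rebuilt[b])
			n++;
	return n;
}

/* Number of apparently leaked blocks when only 'dir' is walked. */
static int check_scope(int dir)
{
	static const struct toy_file files[] = {
		{ 0, { 1, 2, -1, -1 } },
		{ 0, { 3, -1, -1, -1 } },
		{ 1, { 7, 8, 9, -1 } },
	};
	unsigned char on_disk[NBLOCKS], rebuilt[NBLOCKS];

	/* The (consistent) on-disk bitmap covers every file. */
	rebuild_bitmap(files, 3, -1, on_disk);
	rebuild_bitmap(files, 3, dir, rebuilt);
	return count_unaccounted(on_disk, rebuilt);
}
```

In this model a full walk, check_scope(-1), accounts for every block, while a walk restricted to one directory reports the other directory's blocks as unaccounted even though the file system is consistent.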
Re: [PATCH] CONNECTOR: don't touch queue dev after decrement of ref count
From: Li Zefan <[EMAIL PROTECTED]>
Date: Wed, 09 Jan 2008 13:44:07 +0800

>
> cn_queue_free_callback() will touch 'dev'(i.e. cbq->pdev),
> so it should be called before atomic_dec(>refcnt).
>
> Signed-off-by: Li Zefan <[EMAIL PROTECTED]>

Excellent catch, patch applied. Thanks.
Re: [PATCH] Kprobes: Add kprobes smoke tests that run on boot
From: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
Date: Tue, 8 Jan 2008 12:03:34 +0530

> Here is a quick and naive smoke test for kprobes.

Thanks very much for writing this. It will come in handy for me when I work on sparc64 kretprobe support.
Re: 2.6.24-rc7, intel audio: alsa doesn't say a beep
At Wed, 09 Jan 2008 07:03:18 +0100, Harald Dunkel wrote:
>
> Takashi Iwai wrote:
> >
> > Did you enable CONFIG_SND_HDA_POWER_SAVE feature? And which hardware
> > (laptop, product name, whatever) exactly?
> >
>
> CONFIG_SND_HDA_POWER_SAVE is not set.

That's fine.

> Hardware is a Dell XPS M1330. CPU is Core2 Duo T7500, 2.20GHz,
> 2 GByte RAM. lspci:
>
> 00:00.0 Host bridge: Intel Corporation Mobile Memory Controller Hub (rev 0c)
> 00:01.0 PCI bridge: Intel Corporation Mobile PCI Express Root Port (rev 0c)
> 00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
> 00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
> 00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
> 00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
> 00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
> 00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 02)
> 00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 4 (rev 02)
> 00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
> 00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
> 00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
> 00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
> 00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
> 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f2)
> 00:1f.0 ISA bridge: Intel Corporation Mobile LPC Interface Controller (rev 02)
> 00:1f.1 IDE interface: Intel Corporation Mobile IDE Controller (rev 02)
> 00:1f.2 SATA controller: Intel Corporation Mobile SATA AHCI Controller (rev 02)
> 00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
> 01:00.0 VGA compatible controller: nVidia Corporation Unknown device 0427 (rev a1)
> 03:01.0 FireWire (IEEE 1394): Ricoh Co Ltd Unknown device 0832 (rev 05)
> 03:01.1 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22)
> 03:01.2 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 12)
> 03:01.3 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12)
> 09:00.0 Ethernet controller: Broadcom Corporation Unknown device 1713 (rev 02)
> 0c:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network Connection (rev 02)
>
> > Also, please show the contents of /proc/asound/card0/codec#* files.
> > Do you see difference in these files between with and without the
> > patch?
> >
>
> See below. There is no difference between both.

Thanks. Then the possible reason might be the registers that don't appear in this proc output, such as GPIO.

Could you try the patch below with the latency patch (you reverted) in rc7?

Takashi

diff -r d773ad622068 sound/pci/hda/patch_sigmatel.c
--- a/sound/pci/hda/patch_sigmatel.c	Tue Jan 08 18:13:27 2008 +0100
+++ b/sound/pci/hda/patch_sigmatel.c	Wed Jan 09 08:29:49 2008 +0100
@@ -1624,12 +1624,13 @@ static void stac92xx_enable_gpio_mask(st
 				  AC_VERB_SET_GPIO_DIRECTION, spec->gpio_mask);
 	/* Configure GPIOx as CMOS */
 	snd_hda_codec_write_cache(codec, codec->afg, 0, 0x7e7, 0x);
+	/* Enable GPIOx */
+	snd_hda_codec_write_cache(codec, codec->afg, 0,
+				  AC_VERB_SET_GPIO_MASK, spec->gpio_mask);
+	msleep(1);
 	/* Assert GPIOx */
 	snd_hda_codec_write_cache(codec, codec->afg, 0,
 				  AC_VERB_SET_GPIO_DATA, spec->gpio_data);
-	/* Enable GPIOx */
-	snd_hda_codec_write_cache(codec, codec->afg, 0,
-				  AC_VERB_SET_GPIO_MASK, spec->gpio_mask);
 }
 
 /*
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Wed, 9 Jan 2008 08:19:45 +0100

> On Wed, Jan 09, 2008 at 03:55:20AM +, Dave Airlie wrote:
> > now because Linus said send him a patch to revert regressions rather than
> > just complain,
>
> this is not a regression by any definition. You were abusing exported
> symbols for out of tree junk, so you'll lose.

And furthermore, they don't even need it, use a kprobe.
Re: [patch] split MMC_CAP_4_BIT_DATA
On Wed, 9 Jan 2008 11:21:40 +0800 "Cai, Cliff" <[EMAIL PROTECTED]> wrote:

>
> Hi,all
>
> I'd like to say something about this issue.
> Currently,the blackfin on chip SD host ONLY support 1-bit MMC while
> support 1-bit/4-bit SD/SDIO.
> And we want our driver to support both 1-bit MMC and 4-bit SD/SDIO.but
> the current MMC driver framework
> Only allow us to set one kind of bus width,either 1-bit or 4-bit.So in
> order to meet our case,we need more flexible mechanism
> To inform the upper commom driver to know our situation.
>

That's just iterating what's already been said. My claim is that 4-bit is 4-bit, regardless if it's MMC or SD. So if you want this patch to go in you need to explain why there is a difference for the blackfin controller.

Rgds
--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Wed, 9 Jan 2008 08:17:27 +0100

> NACK. If you want to do it you'll need a much better reason and an
> in-tree user. And if you want to redo it it should be available for
> all platforms with a consistant API.

I majorly NACK this as well, we don't want to bring this thing back especially for specialized debugging hacks.

You can set a kprobe on the x86 fault handler to do things like mmiotrace.
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
On Wed, Jan 09, 2008 at 03:55:20AM +, Dave Airlie wrote:
> now because Linus said send him a patch to revert regressions rather than
> just complain,

this is not a regression by any definition. You were abusing exported symbols for out of tree junk, so you'll lose.
Re: [patch] split MMC_CAP_4_BIT_DATA
On Tue, 8 Jan 2008 16:44:08 -0500 "Mike Frysinger" <[EMAIL PROTECTED]> wrote:

> On Jan 8, 2008 3:49 PM, Pierre Ossman <[EMAIL PROTECTED]> wrote:
> > So, again, if you feel that there is a hardware difference between 4-bit
> > MMC and 4-bit SD then please elaborate as it is my understanding that they
> > are identical.
>
> you may be 100% correct, i have no idea, i'm not really familiar with
> MMC/SD/SDIO at all.

The patch adds complexity to the system. So until you can convince me that complexity is actually needed, I'm afraid the answer is NAK.

Rgds
--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
Re: [PATCH] Change paride driver to use unlocked_ioctl instead of ioctl
Sorry, I missed the function prototype and includes earlier. Here is the corrected patch. Build tested.

The ioctl handler is called with the BKL held. Register an unlocked_ioctl handler instead of an ioctl handler, and take the BKL inside it.

Signed-off-by: Nikanth Karthikesan <[EMAIL PROTECTED]>

---

diff --git a/drivers/block/paride/pt.c b/drivers/block/paride/pt.c
index b91accf..860b946 100644
--- a/drivers/block/paride/pt.c
+++ b/drivers/block/paride/pt.c
@@ -146,6 +146,7 @@ static int (*drives[4])[6] = {, , , };
 #include
 #include
 #include/* current, TASK_*, schedule_timeout() */
+#include
 #include
@@ -189,8 +190,8 @@ module_param_array(drive3, int, NULL, 0);
 #define ATAPI_LOG_SENSE0x4d
 static int pt_open(struct inode *inode, struct file *file);
-static int pt_ioctl(struct inode *inode, struct file *file,
-		    unsigned int cmd, unsigned long arg);
+static long pt_ioctl(struct file *file, unsigned int cmd,
+		     unsigned long arg);
 static int pt_release(struct inode *inode, struct file *file);
 static ssize_t pt_read(struct file *filp, char __user *buf,
 		       size_t count, loff_t * ppos);
@@ -236,7 +237,7 @@ static const struct file_operations pt_fops = {
 	.owner = THIS_MODULE,
 	.read = pt_read,
 	.write = pt_write,
-	.ioctl = pt_ioctl,
+	.unlocked_ioctl = pt_ioctl,
 	.open = pt_open,
 	.release = pt_release,
 };
@@ -685,36 +686,44 @@ out:
 	return err;
 }
-static int pt_ioctl(struct inode *inode, struct file *file,
-		    unsigned int cmd, unsigned long arg)
+static long pt_ioctl(struct file *file, unsigned int cmd,
+		     unsigned long arg)
 {
 	struct pt_unit *tape = file->private_data;
 	struct mtop __user *p = (void __user *)arg;
 	struct mtop mtop;
+	lock_kernel();
+
 	switch (cmd) {
 	case MTIOCTOP:
-		if (copy_from_user(&mtop, p, sizeof(struct mtop)))
+		if (copy_from_user(&mtop, p, sizeof(struct mtop))) {
+			unlock_kernel();
 			return -EFAULT;
+		}
 		switch (mtop.mt_op) {
 		case MTREW:
 			pt_rewind(tape);
+			unlock_kernel();
 			return 0;
 		case MTWEOF:
 			pt_write_fm(tape);
+			unlock_kernel();
 			return 0;
 		default:
 			printk("%s: Unimplemented mt_op %d\n", tape->name,
 			       mtop.mt_op);
+			unlock_kernel();
 			return -EINVAL;
 		}
 	default:
 		printk("%s: Unimplemented ioctl 0x%x\n", tape->name, cmd);
+		unlock_kernel();
 		return -EINVAL;
 	}
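The patch above releases the BKL before each return. One possible alternative shape, shown here only as a user-space sketch, is a single-exit function that takes the lock once and releases it once. Nothing below is taken from pt.c: a pthread mutex stands in for lock_kernel()/unlock_kernel(), and toy_ioctl, TOY_IOCTOP, TOY_REW, TOY_WEOF and lock_is_free are hypothetical names.

```c
#include <errno.h>
#include <pthread.h>

/* Stand-in for the Big Kernel Lock; the real patch uses lock_kernel(). */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical command and operation numbers, for illustration only. */
enum { TOY_IOCTOP = 1 };
enum { TOY_WEOF = 5, TOY_REW = 6 };

static int rewinds, filemarks;	/* observable side effects */

/* Single-exit variant: every path falls through to one unlock. */
static long toy_ioctl(unsigned int cmd, int op)
{
	long ret;

	pthread_mutex_lock(&big_lock);

	switch (cmd) {
	case TOY_IOCTOP:
		switch (op) {
		case TOY_REW:
			rewinds++;
			ret = 0;
			break;
		case TOY_WEOF:
			filemarks++;
			ret = 0;
			break;
		default:
			ret = -EINVAL;
			break;
		}
		break;
	default:
		ret = -EINVAL;
		break;
	}

	pthread_mutex_unlock(&big_lock);
	return ret;
}

/* Returns 1 if the lock is free again, i.e. no path leaked it. */
static int lock_is_free(void)
{
	if (pthread_mutex_trylock(&big_lock) != 0)
		return 0;
	pthread_mutex_unlock(&big_lock);
	return 1;
}
```

The trade-off is a matter of taste: the per-return unlock keeps the diff against the old handler small, while the single-exit form makes it harder to add a new return path that forgets to unlock.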
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
On Wed, Jan 09, 2008 at 02:34:46AM +, Dave Airlie wrote:
>
> [This an initial RFC but I'd like to have this patch in before 2.6.24 goes
> final as it really breaks this useful feature]
>
> mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> used this notifier interface and is planned on being pushed upstream.
>
> Having users able to just use the tracer module without having to rebuild
> their kernel to add in a page fault handler hack means we get a lot
> greater coverage for reverse engineering efforts.
>
> Signed-off-by: David Airlie <[EMAIL PROTECTED]>
>
> This reverts commit 74a0b5762713a26496db72eac34fbbed46f20fce.
> Conflicts:

NACK. If you want to do it you'll need a much better reason and an in-tree user. And if you want to redo it it should be available for all platforms with a consistent API.
Re: [PATCH] system timer: fix crash in <100Hz system timer
On Sat, 5 Jan 2008 16:16:55 -0600 David Fries <[EMAIL PROTECTED]> wrote:

> --- a/kernel/time.c
> +++ b/kernel/time.c
> @@ -565,7 +565,11 @@ EXPORT_SYMBOL(jiffies_to_timeval);
>  clock_t jiffies_to_clock_t(long x)
>  {
>  #if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
> +	#if HZ < USER_HZ
> +	return x * (USER_HZ / HZ);
> +	#else
>  	return x / (HZ / USER_HZ);
> +	#endif
>  #else
>  	u64 tmp = (u64)x * TICK_NSEC;
>  	do_div(tmp, (NSEC_PER_SEC / USER_HZ));
> @@ -598,7 +602,12 @@ EXPORT_SYMBOL(clock_t_to_jiffies);
>  u64 jiffies_64_to_clock_t(u64 x)
>  {
>  #if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
> -	do_div(x, HZ / USER_HZ);
> +	#if HZ < USER_HZ
> +	x *= USER_HZ;
> +	do_div(x, HZ);
> +	#else
> +	do_div(x, HZ / USER_HZ);
> +	#endif
>  #else

Somewhat off-topic: I guess HZ=USER_HZ is a not-uncommon case, and it's pretty silly calling do_div(x, 1) all the time. How about we optimise that case?

Perhaps there are other places...

--- a/kernel/time.c~speed-up-jiffies-conversion-functions-if-hz==user_hz
+++ a/kernel/time.c
@@ -618,8 +618,10 @@ u64 jiffies_64_to_clock_t(u64 x)
 # if HZ < USER_HZ
 	x *= USER_HZ;
 	do_div(x, HZ);
-# else
+# elif HZ > USER_HZ
 	do_div(x, HZ / USER_HZ);
+# else
+	/* Nothing to do */
 # endif
 #else
 	/*
_
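The three cases discussed above can be modelled in plain user-space C. This is a sketch under assumptions, not the kernel code: hz and user_hz are runtime parameters here, whereas HZ and USER_HZ are compile-time constants in the kernel, and toy_jiffies_to_clock_t and old_divisor are names made up for the illustration. It only covers the branch where the tick length divides evenly.

```c
#include <stdint.h>

/*
 * Toy model of jiffies_64_to_clock_t() for the evenly-dividing case.
 * hz and user_hz must be non-zero.
 */
static uint64_t toy_jiffies_to_clock_t(uint64_t x, unsigned hz,
					unsigned user_hz)
{
	if (hz < user_hz)
		return x * user_hz / hz;	/* multiply first, then divide */
	if (hz > user_hz)
		return x / (hz / user_hz);	/* hz / user_hz is at least 1 */
	return x;				/* equal rates: nothing to do */
}

/* The divisor the unpatched code would pass to do_div(). */
static unsigned old_divisor(unsigned hz, unsigned user_hz)
{
	return hz / user_hz;			/* 0 when hz < user_hz */
}
```

With hz = 50 and user_hz = 100, old_divisor() evaluates to 0, which is the division by zero behind the crash in the subject line; the patched logic multiplies by user_hz and divides by hz instead.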
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Jan 9, 2008 2:37 PM, Dave Young <[EMAIL PROTECTED]> wrote: > > On Jan 9, 2008 2:13 PM, Dave Young <[EMAIL PROTECTED]> wrote: > > > > On Wed, Jan 09, 2008 at 09:32:48AM +0800, Dave Young wrote: > > > On Jan 9, 2008 6:48 AM, Greg KH <[EMAIL PROTECTED]> wrote: > > > > On Tue, Jan 08, 2008 at 03:05:10PM +0800, Dave Young wrote: > > > > > On Jan 8, 2008 1:20 AM, Greg KH <[EMAIL PROTECTED]> wrote: > > > > > > On Mon, Jan 07, 2008 at 06:13:37PM +0100, Stefan Richter wrote: > > > > > > > It's already in the driver core to the most part. It remains to > > > > > > > be seen > > > > > > > what is less complicated in the end: Transparent mutex-protected > > > > > > > list > > > > > > > accesses provided by driver core (requires the iterator), or all > > > > > > > the > > > > > > > necessary locking done by the drivers themselves (requires some > > > > > > > more > > > > > > > lock-taking but perhaps fewer lock instances overall in the > > > > > > > drivers, and > > > > > > > respective redefinitions and documentation of the driver core > > > > > > > API). > > > > > > > > > > > > I favor changing the driver core api and doing this kind of thing > > > > > > there. > > > > > > It keeps the drivers simpler and should hopefully make their lives > > > > > > easier. > > > > > > > > > > What about this? > > > > > > > > > > #define class_for_each_dev(pos, head, member) \ > > > > > for (mutex_lock(&(container_of(head, struct class, > > > > > devices))->mutex), po > > > > > s = list_entry((head)->next, typeof(*pos), member); \ > > > > > prefetch(pos->member.next), >member != (head) ? 1 : > > > > > (mutex_unlock(& > > > > > (container_of(head, struct class, devices))->mutex), 0); \ > > > > > pos = list_entry(pos->member.next, typeof(*pos), member)) > > > > > > > I'm wrong, it's same as before indeed. 
> > > > > > > Eeek, just make the thing a function please, where you pass the iterator > > > > function in, like the driver core has (driver_for_each_device) > > > > > > Ok, so need a new member of knode_class, I will update the patch later. > > > Thanks. > > > > Withdraw my post, sorry :) > > > > For now the mutex patch, I will only use the mutex to lock the devices list > > and write an iterater function. > > Most of the iterating is for finding some device in the list, so maybe need > > a match function just like drivers do? > > > > Drop one more mail address of David Brownell in cc list. > Sorry for this, david > gmail web client make me crazy. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Jan 9, 2008 2:13 PM, Dave Young <[EMAIL PROTECTED]> wrote: > > On Wed, Jan 09, 2008 at 09:32:48AM +0800, Dave Young wrote: > > On Jan 9, 2008 6:48 AM, Greg KH <[EMAIL PROTECTED]> wrote: > > > On Tue, Jan 08, 2008 at 03:05:10PM +0800, Dave Young wrote: > > > > On Jan 8, 2008 1:20 AM, Greg KH <[EMAIL PROTECTED]> wrote: > > > > > On Mon, Jan 07, 2008 at 06:13:37PM +0100, Stefan Richter wrote: > > > > > > It's already in the driver core to the most part. It remains to be > > > > > > seen > > > > > > what is less complicated in the end: Transparent mutex-protected > > > > > > list > > > > > > accesses provided by driver core (requires the iterator), or all the > > > > > > necessary locking done by the drivers themselves (requires some more > > > > > > lock-taking but perhaps fewer lock instances overall in the > > > > > > drivers, and > > > > > > respective redefinitions and documentation of the driver core API). > > > > > > > > > > I favor changing the driver core api and doing this kind of thing > > > > > there. > > > > > It keeps the drivers simpler and should hopefully make their lives > > > > > easier. > > > > > > > > What about this? > > > > > > > > #define class_for_each_dev(pos, head, member) \ > > > > for (mutex_lock(&(container_of(head, struct class, > > > > devices))->mutex), po > > > > s = list_entry((head)->next, typeof(*pos), member); \ > > > > prefetch(pos->member.next), >member != (head) ? 1 : > > > > (mutex_unlock(& > > > > (container_of(head, struct class, devices))->mutex), 0); \ > > > > pos = list_entry(pos->member.next, typeof(*pos), member)) > > > > > I'm wrong, it's same as before indeed. > > > > > Eeek, just make the thing a function please, where you pass the iterator > > > function in, like the driver core has (driver_for_each_device) > > > > Ok, so need a new member of knode_class, I will update the patch later. > > Thanks. 
> > Withdraw my post, sorry :) > > For now the mutex patch, I will only use the mutex to lock the devices list > and write an iterater function. > Most of the iterating is for finding some device in the list, so maybe need a > match function just like drivers do? > Drop one more mail address of David Brownell in cc list. Sorry for this, david -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][RFC] Simple tamper-proof device filesystem.
Hello.

[EMAIL PROTECTED] wrote:
> Good summary - probably should add that to the patch, drop it into
> Documentation/syaoran-config.txt or similar...

I see.

> Modification while reading *is* an issue, but can probably be worked around
> with some clever locking. The race condition I was thinking of was if you
> had the mount and the policy load be 2 separate events, you could see:
>
> (a) issue mount request
> (b) do something malicious in /dev while..
> (c) load the policy that would have prevented (b).
>
> This is partly why SELinux has init load the policy *very* early on, before
> any other userspace have had a chance to run and do things that would have
> been prevented by policy.

So you suggest loading the policy before the mount() request, so that this filesystem can prevent attackers from doing something malicious by minimizing (i.e. implementing as a non-blocking operation) the latency between the userland process's call of mount() and the nodes becoming visible to the userland process. I didn't take such cases into account.

My assumed usage of this filesystem is to run a script like

  #!/bin/sh
  mount -t syaoran -o accept=/etc/ccs/syaoran.conf none /dev
  exec /sbin/init "$@"

by passing "init=/path/to/this/script" on the kernel command line, so that /sbin/init can create /dev/initlog on this filesystem. If you mount this filesystem after /sbin/init starts, it will shadow /dev/initctl opened by /sbin/init.

> Which basically ends up meaning that anybody who can trick the mount into
> happening can reset the permitted list and create (for example) a mode 666
> entry for a hard drive, and go scribbling around at will. Note that you
> don't seem to do any sanity checking on the path (for instance, that each
> component is owned by root, and not world-writable) - so anybody who finds
> a way to get the mount to happen can supply their own list in
> /home/joeuser/blat
> or /tmp/surprise-mount-list or wherever.

I assume that being able to reach this location means the caller of mount() is root. But are the patches to allow mount() by non-root in progress? http://lkml.org/lkml/2008/1/8/131

Maybe I should add some sanity checking on the path.

Thank you.
Re: Do SATA tape drives work?
Mark Lord wrote:
> I wouldn't buy anything with "Sony" on it,

Any particular reason?

> but Albert thinks ATAPI tapes should be working now
> (he has my old drive now).

Thanks for the info.

Regards
jonathan
Re: [PATCHv3] kprobes: Introduce kprobe_handle_fault()
On Wed, 2008-01-09 at 07:14 +0100, Heiko Carstens wrote:
> > +/*
> > + * If it is a kprobe pagefault we can not be premptible so return before
>
> Missing 'e' in preemptible.

OK.

> However, the old code you removed had a lot of preempt_disable/enable calls
> that you removed. Hope you checked that preemption was always disabled
> already and the calls were not necessary (true at least for s390).
>
> Are there cases where this code could be called with preemption enabled?
> If so then that looks like a bug anyway. I'd say the preemptible check
> should be removed or turned into a WARN_ON.
>
> I like this better (not including any other changes):
>
> 	if (!user_mode(regs) && !preemptible() && kprobe_running())
> 		return kprobe_fault_handler(regs, trapnr);
> 	return 0;

I could live with that too, will defer to kprobes maintainers if they prefer that as a follow-on.

Regarding the preempt_enable/disable, the reasoning behind it comes from the following, I stole the changelog from x86.git which has a good description of why this should be safe:

commit 6624c638928acce52fbe57d73284efcf9f86abd2
Author: Quentin Barnes <[EMAIL PROTECTED]>
Date:   Wed Jan 9 02:32:57 2008 +0100

    Code clarification patch to Kprobes arch code

    When developing the Kprobes arch code for ARM, I ran across some code
    found in x86 and s390 Kprobes arch code which I didn't consider as
    good as it could be.

    Once I figured out what the code was doing, I changed the code for ARM
    Kprobes to work the way I felt was more appropriate. I've tested the
    code this way in ARM for about a year and would like to push the same
    change to the other affected architectures.

    The code in question is in kprobe_exceptions_notify() which does:

    	/* kprobe_running() needs smp_processor_id() */
    	preempt_disable();
    	if (kprobe_running() &&
    	    kprobe_fault_handler(args->regs, args->trapnr))
    		ret = NOTIFY_STOP;
    	preempt_enable();

    For the moment, ignore the code having the preempt_disable()/
    preempt_enable() pair in it.

    The problem is that kprobe_running() needs to call smp_processor_id()
    which will assert if preemption is enabled. That sanity check by
    smp_processor_id() makes perfect sense since calling it with preemption
    enabled would return an unreliable result.

    But the function kprobe_exceptions_notify() can be called from a
    context where preemption could be enabled. If that happens, the
    assertion in smp_processor_id() happens and we're dead. So what the
    original author did (speculation on my part!) is put in the
    preempt_disable()/preempt_enable() pair to simply defeat the check.

    Once I figured out what was going on, I considered this an
    inappropriate approach. If kprobe_exceptions_notify() is called from
    a preemptible context, we can't be in a kprobe processing context at
    that time anyways since kprobes requires preemption to already be
    disabled, so just check for preemption enabled, and if so, blow out
    before ever calling kprobe_running(). I wrote the ARM kprobe code
    like this:

    	/* To be potentially processing a kprobe fault and to
    	 * trust the result from kprobe_running(), we have
    	 * be non-preemptible. */
    	if (!preemptible() && kprobe_running() &&
    	    kprobe_fault_handler(args->regs, args->trapnr))
    		ret = NOTIFY_STOP;

    The above code has been working fine for ARM Kprobes for a year.
    So I changed the x86 code (2.6.24-rc6) to be the same way and ran the
    Systemtap tests on that kernel. As on ARM, Systemtap on x86 comes up
    with the same test results either way, so it's a neutral external
    functional change (as expected).

    This issue has been discussed previously on linux-arm-kernel and the
    Systemtap mailing lists. Pointers to the by base for the two
    discussions:
    http://lists.arm.linux.org.uk/lurker/message/20071219.223225.1f5c2a5e.en.html
    http://sourceware.org/ml/systemtap/2007-q1/msg00251.html

Cheers,

Harvey
Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"
On Tue, Jan 08, 2008 at 08:10:42PM -0800, Andrew Morton wrote:
[...]
> I must say that the number of bugs which actually go away when the user
> stops using nvidia/fglrx/ndiswrapper/etc is a small minority.
[...]
> But people who think that removing the nvidia driver will
> magically fix that khubd-got-stuck-in-D-state bug are urinating up an
> incline.
>
>
> Facts:
>
> - lots of people use nvidia/etc
>
> - most bugs they report aren't caused by nvidia/etc
>
> - we need lots of testers
>
> draw you own conclusions.

Thanks Andrew for this demonstration. At least now I know I'm not the only one to think that. And no, I do not have any nvidia/etc. It's just that I value their users' reports as much as the other ones just because otherwise we would only track some elite's bugs, thus reducing the amount of information we have to understand the circumstances under which it happens.

Cheers,
Willy
Re: [PATCHv3] kprobes: Introduce kprobe_handle_fault()
> +/*
> + * If it is a kprobe pagefault we can not be premptible so return before

Missing 'e' in preemptible.

However, the old code you removed had a lot of preempt_disable/enable calls that you removed. Hope you checked that preemption was always disabled already and the calls were not necessary (true at least for s390).

Are there cases where this code could be called with preemption enabled? If so then that looks like a bug anyway. I'd say the preemptible check should be removed or turned into a WARN_ON.

> + * calling kprobe_running() as it will assert on smp_processor_id if
> + * preemption is enabled.
> + */
> +static inline int kprobe_handle_fault(struct pt_regs *regs, int trapnr)
> +{
> +	if (!user_mode(regs) && !preemptible() && kprobe_running() &&
> +	    kprobe_fault_handler(regs, trapnr))
> +		return 1;
> +	else
> +		return 0;

I like this better (not including any other changes):

	if (!user_mode(regs) && !preemptible() && kprobe_running())
		return kprobe_fault_handler(regs, trapnr);
	return 0;
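The two shapes of the helper discussed above can be compared with a small user-space mock. This is a sketch with stand-ins, not kernel code: the flags in_user_mode, is_preemptible and probe_running replace user_mode(regs), preemptible() and kprobe_running(), and mock_fault_handler, handle_fault_v1, handle_fault_v2 and set_state are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel predicates; set them per case. */
static bool in_user_mode, is_preemptible, probe_running;
static int handler_result;	/* what the mocked fault handler returns */
static int handler_calls;	/* how often the handler was invoked */

static int mock_fault_handler(void)
{
	handler_calls++;
	return handler_result;
}

/* Shape of the posted helper: the handler is part of the condition. */
static int handle_fault_v1(void)
{
	if (!in_user_mode && !is_preemptible && probe_running &&
	    mock_fault_handler())
		return 1;
	else
		return 0;
}

/* Shape of the suggested rewrite: return the handler's value directly. */
static int handle_fault_v2(void)
{
	if (!in_user_mode && !is_preemptible && probe_running)
		return mock_fault_handler();
	return 0;
}

/* Configure the mock state in one call and reset the call counter. */
static void set_state(bool user, bool preempt, bool running, int result)
{
	in_user_mode = user;
	is_preemptible = preempt;
	probe_running = running;
	handler_result = result;
	handler_calls = 0;
}
```

In this mock both forms skip the handler whenever the context is preemptible, and they agree as long as the handler returns 0 or 1. They differ only if the handler returns some other non-zero value, which the first form normalises to 1 and the second passes through.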
Re: regression: 100% io-wait with 2.6.24-rcX
On Wednesday, 9 January 2008, Fengguang Wu wrote:
> > /dev/sda6 on / type ext3 (rw,noatime,errors=remount-ro,acl)
> > tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
> > proc on /proc type proc (rw,noexec,nosuid,nodev)
> > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> > procbususb on /proc/bus/usb type usbfs (rw)
> > udev on /dev type tmpfs (rw,mode=0755)
> > tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
> > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
> > fusectl on /sys/fs/fuse/connections type fusectl (rw)
> > /dev/sda7 on /tmp type ext2 (rw,noatime,errors=remount-ro,acl)
> > /dev/sda8 on /export type ext3 (rw,noatime,errors=remount-ro,acl)
> > /dev/sda1 on /winxp type ntfs (rw,umask=002,gid=1,nls=utf8)
>
> So they are ext3/ext2/ntfs. What if you umount ntfs? and ext2 if possible?

Unmounting ntfs doesn't help, hence I converted the remaining ext2 filesystem to ext3, modified the fstab entry accordingly and rebooted. Now everything seems to be fine! Top reports an idle system and there is no abnormal iowait any longer! It seems ext2 was causing this!

Later today I can try to remount the filesystem as ext2 to be sure the bug shows up again.

regards,
Jörg

--
PGP Key: send mail with subject 'SEND PGP-KEY'
PGP Key-ID: FD 4E 21 1D
PGP Fingerprint: 388A872AFC5649D3 BCEC65778BE0C605
Re: [PATCH 0/7] convert semaphore to mutex in struct class
On Wed, Jan 09, 2008 at 09:32:48AM +0800, Dave Young wrote:
> On Jan 9, 2008 6:48 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > On Tue, Jan 08, 2008 at 03:05:10PM +0800, Dave Young wrote:
> > > On Jan 8, 2008 1:20 AM, Greg KH <[EMAIL PROTECTED]> wrote:
> > > > On Mon, Jan 07, 2008 at 06:13:37PM +0100, Stefan Richter wrote:
> > > > > It's already in the driver core to the most part. It remains to be seen
> > > > > what is less complicated in the end: Transparent mutex-protected list
> > > > > accesses provided by driver core (requires the iterator), or all the
> > > > > necessary locking done by the drivers themselves (requires some more
> > > > > lock-taking but perhaps fewer lock instances overall in the drivers, and
> > > > > respective redefinitions and documentation of the driver core API).
> > > >
> > > > I favor changing the driver core api and doing this kind of thing there.
> > > > It keeps the drivers simpler and should hopefully make their lives
> > > > easier.
> > >
> > > What about this?
> > >
> > > #define class_for_each_dev(pos, head, member) \
> > > 	for (mutex_lock(&(container_of(head, struct class, devices))->mutex), pos = list_entry((head)->next, typeof(*pos), member); \
> > > 	     prefetch(pos->member.next), &pos->member != (head) ? 1 : (mutex_unlock(&(container_of(head, struct class, devices))->mutex), 0); \
> > > 	     pos = list_entry(pos->member.next, typeof(*pos), member))
> > >
> > > I'm wrong, it's same as before indeed.
> >
> > Eeek, just make the thing a function please, where you pass the iterator
> > function in, like the driver core has (driver_for_each_device)
>
> Ok, so need a new member of knode_class, I will update the patch later.
> Thanks.

Withdraw my post, sorry :)

For the mutex patch, I will for now only use the mutex to lock the devices list and write an iterator function. Most of the iteration is for finding some device in the list, so maybe it needs a match function, just like drivers do?

Regards
dave

> >
> > thanks,
> >
> > greg k-h
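The iterator-plus-match idea discussed above can be sketched in userland C. This is only an illustration under stated assumptions: `demo_class`, `demo_device` and every `demo_*` function are hypothetical names, a pthread mutex stands in for the class mutex, and nothing here is the driver core's real API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct class / struct device. */
struct demo_device {
	int id;
	struct demo_device *next;
};

struct demo_class {
	pthread_mutex_t lock;		/* protects the devices list */
	struct demo_device *devices;
};

/*
 * Call fn on every device with the list lock held. Iteration stops at
 * the first non-zero return value, which is passed back to the caller.
 */
static int demo_class_for_each_device(struct demo_class *cls, void *data,
				      int (*fn)(struct demo_device *dev, void *data))
{
	struct demo_device *dev;
	int ret = 0;

	pthread_mutex_lock(&cls->lock);
	for (dev = cls->devices; dev; dev = dev->next) {
		ret = fn(dev, data);
		if (ret)
			break;
	}
	pthread_mutex_unlock(&cls->lock);
	return ret;
}

/*
 * Return the first device for which match() is non-zero, or NULL.
 * A real implementation would take a reference on the device before
 * dropping the lock; this sketch does not model reference counting.
 */
static struct demo_device *demo_class_find_device(struct demo_class *cls, void *data,
						  int (*match)(struct demo_device *dev, void *data))
{
	struct demo_device *dev;

	pthread_mutex_lock(&cls->lock);
	for (dev = cls->devices; dev; dev = dev->next)
		if (match(dev, data))
			break;
	pthread_mutex_unlock(&cls->lock);
	return dev;
}

/* Example match callback: compare against an id passed through data. */
static int demo_match_id(struct demo_device *dev, void *data)
{
	return dev->id == *(int *)data;
}

/* Example iterator callback: accumulate ids into the int behind data. */
static int demo_sum_ids(struct demo_device *dev, void *data)
{
	*(int *)data += dev->id;
	return 0;
}
```

With this shape a driver never touches the lock itself: it only supplies a callback, which is the property Greg asks for when he points at driver_for_each_device.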
[PATCH] Change paride driver to use unlocked_ioctl instead of ioctl
The ioctl handler is called with the BKL held. Register an unlocked_ioctl handler instead of an ioctl handler.

Signed-off-by: Nikanth Karthikesan <[EMAIL PROTECTED]>

---

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
diff --git a/drivers/block/paride/pt.c b/drivers/block/paride/pt.c
index b91accf..d4fa468 100644
--- a/drivers/block/paride/pt.c
+++ b/drivers/block/paride/pt.c
@@ -236,7 +236,7 @@ static const struct file_operations pt_fops = {
 	.owner = THIS_MODULE,
 	.read = pt_read,
 	.write = pt_write,
-	.ioctl = pt_ioctl,
+	.unlocked_ioctl = pt_ioctl,
 	.open = pt_open,
 	.release = pt_release,
 };
@@ -685,36 +685,43 @@ out:
 	return err;
 }
 
-static int pt_ioctl(struct inode *inode, struct file *file,
-	unsigned int cmd, unsigned long arg)
+static long pt_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct pt_unit *tape = file->private_data;
 	struct mtop __user *p = (void __user *)arg;
 	struct mtop mtop;
 
+	lock_kernel();
+
 	switch (cmd) {
 	case MTIOCTOP:
-		if (copy_from_user(&mtop, p, sizeof(struct mtop)))
+		if (copy_from_user(&mtop, p, sizeof(struct mtop))) {
+			unlock_kernel();
 			return -EFAULT;
+		}
 
 		switch (mtop.mt_op) {
 
 		case MTREW:
 			pt_rewind(tape);
+			unlock_kernel();
 			return 0;
 
 		case MTWEOF:
 			pt_write_fm(tape);
+			unlock_kernel();
 			return 0;
 
 		default:
 			printk("%s: Unimplemented mt_op %d\n", tape->name,
 			       mtop.mt_op);
+			unlock_kernel();
 			return -EINVAL;
 		}
 
 	default:
 		printk("%s: Unimplemented ioctl 0x%x\n", tape->name, cmd);
+		unlock_kernel();
 		return -EINVAL;
 	}
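The patch above adds an unlock_kernel() call in front of every return. A common alternative, shown here only as a userland sketch and not as what the patch does, is to route every path through a single exit so the lock is released in one place. All `demo_*` names and the `DEMO_*` command values are hypothetical; a depth counter stands in for the BKL so that an imbalance would be visible.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the BKL: a depth counter instead of a real lock. */
static int demo_lock_depth;

static void demo_lock(void)
{
	demo_lock_depth++;
}

static void demo_unlock(void)
{
	demo_lock_depth--;
}

/* Hypothetical command and operation numbers, not the real MT* values. */
enum { DEMO_CMD_TOP = 1 };
enum { DEMO_OP_REWIND = 10, DEMO_OP_WEOF = 11 };

/*
 * Every path sets err and falls through to the single unlock at the
 * bottom, so a newly added case cannot forget to release the lock.
 */
static long demo_ioctl(unsigned int cmd, int op)
{
	long err = 0;

	demo_lock();

	switch (cmd) {
	case DEMO_CMD_TOP:
		switch (op) {
		case DEMO_OP_REWIND:
		case DEMO_OP_WEOF:
			/* a real handler would drive the tape here */
			break;
		default:
			err = -EINVAL;
			break;
		}
		break;
	default:
		err = -EINVAL;
		break;
	}

	demo_unlock();
	return err;
}
```

The trade-off is one extra local variable against fewer places where the lock and unlock calls can drift out of balance.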
Re: Do SATA tape drives work?
Tejun Heo wrote:
[cc'ing linux-ide]

Jonathan Woithe wrote:
Hi guys

I was wondering whether anyone can shed any light on the status of SATA tape drives. There's very little info on the net about this at least in the places I've checked; the only thing of any significance I've found thus far is a note in a Bacula document dated April 2007 which states that drives other than real SCSI units don't generally work with Bacula.

To put this into context, I'm looking at purchasing a Sony SDX470VRB SATA AIT-1 tape drive for use with the SATA controller on an Intel DG31PR mainboard. The drive will be used primarily with tar/cpio. Obviously however I only want to make the purchase if there's a reasonable chance of it working.

I would appreciate any information you can shed on this issue.

It's supposed to with recent updates. Mark, right?
..

I wouldn't buy anything with "Sony" on it, but Albert thinks ATAPI tapes should be working now (he has my old drive now).
Re: 2.6.24-rc7, intel audio: alsa doesn't say a beep
Takashi Iwai wrote: Did you enable CONFIG_SND_HDA_POWER_SAVE feature? And which hardware (laptop, product name, whatever) exactly? CONFIG_SND_HDA_POWER_SAVE is not set. Hardware is a Dell XPS M1330. CPU is Core2 Duo T7500, 2.20GHz, 2 GByte RAM. lspci: 00:00.0 Host bridge: Intel Corporation Mobile Memory Controller Hub (rev 0c) 00:01.0 PCI bridge: Intel Corporation Mobile PCI Express Root Port (rev 0c) 00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02) 00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02) 00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02) 00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02) 00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02) 00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 02) 00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 4 (rev 02) 00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02) 00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02) 00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02) 00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02) 00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02) 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f2) 00:1f.0 ISA bridge: Intel Corporation Mobile LPC Interface Controller (rev 02) 00:1f.1 IDE interface: Intel Corporation Mobile IDE Controller (rev 02) 00:1f.2 SATA controller: Intel Corporation Mobile SATA AHCI Controller (rev 02) 00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02) 01:00.0 VGA compatible controller: nVidia Corporation Unknown device 0427 (rev a1) 03:01.0 FireWire (IEEE 1394): Ricoh Co Ltd Unknown device 0832 (rev 05) 03:01.1 Generic system 
peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22) 03:01.2 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 12) 03:01.3 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12) 09:00.0 Ethernet controller: Broadcom Corporation Unknown device 1713 (rev 02) 0c:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network Connection (rev 02) Also, please show the contents of /proc/asound/card0/codec#* files. Do you see difference in these files between with and without the patch? See below. There is no difference between both. Regards Harri Codec: SigmaTel STAC9228 Address: 0 Vendor Id: 0x83847616 Subsystem Id: 0x10280209 Revision Id: 0x100201 No Modem Function Group found Default PCM: rates [0x7e0]: 44100 48000 88200 96000 176400 192000 bits [0xe]: 16 20 24 formats [0x1]: PCM Default Amp-In caps: ofs=0x00, nsteps=0x0e, stepsize=0x05, mute=0 Default Amp-Out caps: ofs=0x7f, nsteps=0x7f, stepsize=0x02, mute=1 Node 0x02 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out Amp-Out caps: N/A Amp-Out vals: [0x7f 0x7f] Power: 0x0 Node 0x03 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out Amp-Out caps: N/A Amp-Out vals: [0xff 0xff] Power: 0x0 Node 0x04 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out Amp-Out caps: N/A Amp-Out vals: [0xff 0xff] Power: 0x0 Node 0x05 [Audio Output] wcaps 0xd0c05: Stereo Amp-Out Amp-Out caps: N/A Amp-Out vals: [0xff 0xff] Power: 0x0 Node 0x06 [Vendor Defined Widget] wcaps 0xfd0c05: Stereo Amp-Out Amp-Out caps: N/A Amp-Out vals: [0xff 0xff] Power: 0x0 Node 0x07 [Audio Input] wcaps 0x1d0541: Stereo Power: 0x0 Connection: 1 0x1b Node 0x08 [Audio Input] wcaps 0x1d0541: Stereo Power: 0x0 Connection: 1 0x1c Node 0x09 [Audio Input] wcaps 0x1d0541: Stereo Power: 0x0 Connection: 1 0x1d Node 0x0a [Pin Complex] wcaps 0x400181: Stereo Pincap 0x08173f: IN OUT HP Detect Pin Default 0x02214020: [Jack] HP Out at Ext Front Conn = 1/8, Color = Green Pin-ctls: 0xc0: OUT HP Connection: 2 0x02* 0x03 Node 
0x0b [Pin Complex] wcaps 0x400181: Stereo Pincap 0x08173f: IN OUT HP Detect Pin Default 0x02a19080: [Jack] Mic at Ext Front Conn = 1/8, Color = Pink Pin-ctls: 0x24: IN Connection: 2 0x02 0x03* Node 0x0c [Pin Complex] wcaps 0x400181: Stereo Pincap 0x081737: IN OUT Detect Pin Default 0x0181304e: [Jack] Line In at Ext Rear Conn = 1/8, Color = Blue Pin-ctls: 0x20: IN Connection: 1 0x03 Node 0x0d [Pin Complex] wcaps 0x400181: Stereo Pincap 0x08173f: IN OUT HP Detect Pin Default 0x01014010: [Jack] Line Out at Ext Rear Conn = 1/8, Color = Green Pin-ctls: 0x40: OUT Connection: 1 0x02 Node 0x0e [Pin Complex] wcaps 0x400181: Stereo Pincap 0x081737: IN OUT Detect Pin Default 0x01a19040: [Jack] Mic at Ext Rear Conn = 1/8, Color = Pink Pin-ctls: 0x24:
Re: [PATCH] Change x86 Machine check handler to use unlocked_ioctl instead of ioctl
On Thu, Jan 10, 2008 at 11:25:14AM +0530, Nikanth Karthikesan wrote: > The Machine check handler registers ioctl handler that is called > with the BKL held. Changing to register unlocked_ioctl instead. > Also mce ioctl handler does not seem to need any lock protection. Thanks, but I already did that here on my own. -Andi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] Change x86 Machine check handler to use unlocked_ioctl instead of ioctl
To: Andi Kleen <[EMAIL PROTECTED]>
Cc: linux-kernel@vger.kernel.org
Cc: [EMAIL PROTECTED]

The machine check handler registers an ioctl handler that is called with the BKL held. Change it to register unlocked_ioctl instead. The mce ioctl handler does not seem to need any lock protection.

Signed-off-by: Nikanth Karthikesan <[EMAIL PROTECTED]>

---

diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index 4b21d29..d3baa62 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -634,8 +634,7 @@ static unsigned int mce_poll(struct file *file, poll_table *wait)
 	return 0;
 }
 
-static int mce_ioctl(struct inode *i, struct file *f,unsigned int cmd,
-	unsigned long arg)
+static long mce_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
 {
 	int __user *p = (int __user *)arg;
 
@@ -664,7 +663,7 @@ static const struct file_operations mce_chrdev_ops = {
 	.release = mce_release,
 	.read = mce_read,
 	.poll = mce_poll,
-	.ioctl = mce_ioctl,
+	.unlocked_ioctl = mce_ioctl,
 };
 
 static struct miscdevice mce_log_device = {
[PATCH] CONNECTOR: don't touch queue dev after decrement of ref count
cn_queue_free_callback() will touch 'dev' (i.e. cbq->pdev), so it should be called before atomic_dec(&dev->refcnt).

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>

---
 drivers/connector/cn_queue.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/connector/cn_queue.c b/drivers/connector/cn_queue.c
index 23cc87a..5732ca3 100644
--- a/drivers/connector/cn_queue.c
+++ b/drivers/connector/cn_queue.c
@@ -99,8 +99,8 @@ int cn_queue_add_callback(struct cn_queue_dev *dev, char *name, struct cb_id *id
 	spin_unlock_bh(&dev->queue_lock);
 
 	if (found) {
-		atomic_dec(&dev->refcnt);
 		cn_queue_free_callback(cbq);
+		atomic_dec(&dev->refcnt);
 		return -EINVAL;
 	}
 
--
1.5.3.rc7
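The ordering rule in this patch (finish every access to an object before dropping the reference that keeps it alive) can be shown with a small userland sketch. The `demo_*` types and functions are hypothetical stand-ins for the connector structures, and a plain int replaces the atomic counter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the queue device. */
struct demo_dev {
	int refcnt;		/* plain int here; atomic_t in the kernel */
	int callbacks_freed;	/* bookkeeping touched by the free path */
};

/* Hypothetical stand-in for a callback entry pointing at its device. */
struct demo_cb {
	struct demo_dev *pdev;
};

/* Freeing a callback dereferences cb->pdev, so the device must be alive. */
static void demo_free_callback(struct demo_cb *cb)
{
	cb->pdev->callbacks_freed++;
	free(cb);
}

/*
 * Error path for a rejected callback: free it first, while our
 * reference still pins the device, and only then drop the reference.
 */
static void demo_reject_callback(struct demo_dev *dev, struct demo_cb *cb)
{
	demo_free_callback(cb);
	dev->refcnt--;
}
```

With the two statements in the opposite order, another holder could in principle release the device between the decrement and the free, leaving demo_free_callback() to touch freed memory.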
Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.
On Tue, 08 Jan 2008 18:52:42 -0800 Zachary Amsden <[EMAIL PROTECTED]> wrote:
> On Tue, 2008-01-08 at 14:15 -0500, David P. Reed wrote:
> > Alan Cox wrote:
> > > The natsemi docs here say otherwise. I trust them not you.
> > >
> > As well you should. I am honestly curious (for my own satisfaction)
> > as to what the natsemi docs say the delay code should do (can't
> > imagine they say "use io port 80 because it is unused"). I don't
> > have any
>
> What is the outcome of this thread? Are we going to use timing based
> port delays, or can we finally drop these things entirely on 64-bit
> architectures?
>
> I have a doubly vested interest in this, both as the owner of an
> affected HP dv9210us laptop and as a maintainer of paravirt code - and
> would like 64-bit Linux code to stop using I/O to port 0x80 in both
> cases (as I suspect would every other person involved with
> virtualization).
>
> I've tried to follow this thread, but with all the jabs, 1-ups, and
> obscure legacy hardware pageantry going on, it isn't clear what we're
> really doing.

I believe Alan Cox is doing a review of some drivers, to see if they actually need the I/O port delay. A lot of drivers probably use outb_p just because it was copy-pasted from some other driver, and there it can be removed. Alan's review has also brought to light a lack of locking in some drivers, so I think Alan has been adding proper locking to some of the watchdog drivers.

Most old ISA-only device drivers can keep using OUT 80h. They are not used on modern machines and it's better to keep them unchanged to avoid unnecessary incompatibilities.

As far as I know, the 8253 PIT timer code needs outb_p on some older platforms, and this is one of the most troublesome cases, since the same PIT controller (or a register compatible one) has been used since the original IBM PC and the code is executed frequently. Ingo Molnar has done an alternate implementation of the PIT clock source which uses udelay instead of OUT 80h to delay accesses to the ports. The kernel could choose which variant to use based on the DMI year, or on whether it is compiled for x86_64, or something similar. Maybe have a command line option too.

The keyboard controller on some platforms needs the delay, and the same driver is used on both ancient and modern systems. I think it can be changed to udelay since it's not very time critical code.

The 8259 interrupt controller on some platforms needs the delay. I think it can be changed to udelay since it's only some setup code that uses outb_p. I guess there are time critical accesses to the interrupt controller from assembly code somewhere to acknowledge interrupts, and that code needs a review.

The floppy controller code uses outb_p. Even though there might be floppy controllers on modern systems, I'd rather leave the floppy code alone since it's supposed to be very fragile. If you still use floppies you deserve what you get.

Some specific drivers, such as drivers for 8390 or 8390 clone based network cards, are also a bit troublesome. They do need outb_p (and the delay for the original 8390 chip is specified in bus cycles), and there can be a big performance loss if pessimistic udelays are used for the delay. There are still a bunch of PCMCIA cards based on that chip, which means that those cards can be used with modern machines. There are also PCI and memory mapped variants of the 8390; some of them are new designs which are only register compatible, while other designs use a real 8390 with an FPGA as glue logic. I think Alan suggested compiling two versions of that driver, one with OUT 80h and one with udelay. Old machines can choose the old driver, and new machines can use the new one.

Other drivers can probably do the same thing or, if not time critical, always use a pessimistic udelay.

As for the implementation, I like the suggestion to split outb_p into two calls, one to outb and one to isa_slow_down_io. It makes it very obvious that it is really two function calls, and that it needs locking. For those uses that are not ISA port accesses, isa_slow_down_io should be changed to an appropriate udelay instead.

The goal is anyway that a modern machine should not do OUT 80h, and that old machines keep doing it, since it has been working well for some 15-odd years, both in DOS device drivers and on Linux.

Using an alternate port may be a workaround, but it's probably not a good idea since alternate ports have received less testing and there's bound to be some platform out there that has problems with any alternate port we might choose. Allowing an alternate port will also add code bloat (OUT 80h, AL becomes MOV DX, alternate_port; OUT DX, AL) for a dubious gain.

Did I miss anything?

/Christer
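The proposed split of outb_p into a port write plus an explicit delay call can be sketched as follows. This is a userland illustration only: the `demo_*` functions are hypothetical stand-ins that count calls instead of touching hardware, and the run-time selector is just one assumed way of choosing between the OUT 80h and udelay variants mentioned above. Ports 0x43 and 0x40 are the PIT mode and channel 0 ports.

```c
#include <assert.h>
#include <stdint.h>

/* Counters replace real hardware access so the sequence can be checked. */
static unsigned int demo_port_writes;
static unsigned int demo_port80_delays;
static unsigned int demo_udelay_delays;

/* Stand-in for outb(): only records that a write happened. */
static void demo_outb(uint8_t value, uint16_t port)
{
	(void)value;
	(void)port;
	demo_port_writes++;
}

/* Old machines: the delay would be a dummy write to port 0x80. */
static void demo_delay_port80(void)
{
	demo_port80_delays++;
}

/* Modern machines: the delay would be a short timed wait instead. */
static void demo_delay_udelay(void)
{
	demo_udelay_delays++;
}

/* The delay is a separate call, chosen once at start-up. */
static void (*demo_isa_slow_down_io)(void) = demo_delay_port80;

static void demo_select_io_delay(int modern_machine)
{
	demo_isa_slow_down_io = modern_machine ? demo_delay_udelay
					       : demo_delay_port80;
}

/* Program PIT channel 0: each former outb_p is now two visible calls. */
static void demo_pit_set_divisor(uint16_t divisor)
{
	demo_outb(0x34, 0x43);		/* channel 0, lobyte/hibyte, mode 2 */
	demo_isa_slow_down_io();
	demo_outb(divisor & 0xff, 0x40);
	demo_isa_slow_down_io();
	demo_outb(divisor >> 8, 0x40);
}
```

Written this way, the caller can see that a write and its delay are two steps that may need to be covered by the same lock.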
Re: More breakage in native_rdtsc out of line in git-x86 II
> I think the problem is that the vsyscall/vdso code calls it through
> vread and for that it has to be exported. There seems to be also
> another bug with the old style vsyscalls not using the TSC vread
> that masks it on older glibc
>
> Stepping with gdb through old style vgettimeofday() confirms that RDTSC is
> not used.
>
> A long time ago we had a similar problem once and it was because of a
> problem exporting the vsyscall variables in vmlinux.lds.S -- looks like that
> has reappeared.
>
> I think the new glibc shows it because it uses the vDSO not
> the older vsyscall and the new vDSO probably still works. Anyways haven't
> investigated why that is in detail yet, but that's a separate
> regression.

Actually that seems to be because the test system using the older glibc didn't use the TSC: it was marked unstable due to an unsynchronized TSC. It should not have been -- this is a Core2 dual core single socket. Will investigate later what happened there.

-Andi
Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.
Zachary Amsden wrote: BTW, it isn't ever safe to pass port 0x80 through to hardware from a virtual machine; some OSes use port 0x80 as a hardware available scratch register (I believe Darwin/x86 did/does this during boot). That's funny, because there is definitely no guarantee that you get back what you read (well, perhaps there is on Apple.) -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Do SATA tape drives work?
[cc'ing linux-ide]

Jonathan Woithe wrote:
> Hi guys
>
> I was wondering whether anyone can shed any light on the status of SATA tape
> drives. There's very little info on the net about this at least in the
> places I've checked; the only thing of any significance I've found thus far
> is a note in a Bacula document dated April 2007 which states that drives
> other than real SCSI units don't generally work with Bacula.
>
> To put this into context, I'm looking at purchasing a Sony SDX470VRB SATA
> AIT-1 tape drive for use with the SATA controller on an Intel DG31PR
> mainboard. The drive will be used primarily with tar/cpio. Obviously
> however I only want to make the purchase if there's a reasonable chance of
> it working.
>
> I would appreciate any information you can shed on this issue.

It's supposed to with recent updates. Mark, right?

--
tejun
Re: Kbuild update
On Wed, Jan 09, 2008 at 10:32:39AM +0800, WANG Cong wrote:
>
> >> > If we can make this to be an official project for Linux kernel, I
> >> > think it won't be a big problem.
> >>
> >> We don't even manage to maintain the English language texts properly,
> >> and I am therefore not overly optimistic that we'll have the
> >> translations maintained properly for many years.
> >Italian was 100% translated at one point in time.
> >And the Linux Kernel Translation project has a number of
> >spelling error fixes in queue (I dunno if they have been applied).
> >
> >So even when run as an external project it was ok for some languages,
> >and having it official and someone taking patches to .po files would
> >for sure allow more users to build a kernel.
> >
>
> Agreed.
>
> That's the goal of TLKTP. Sam, can you contact the author of
> TLKTP? Maybe we can talk to him to see if we can restart the
> project. If so, I can help with the Chinese translation part.

My first try bounced, found another address for Egry Gabor - let's see if I have more luck. The associated list is spam only so I did not try that one.

	Sam
Re: [PATCH][RFC] Simple tamper-proof device filesystem.
On Tue, 08 Jan 2008 22:50:43 +0900, Tetsuo Handa said:
> Yes. It is a line-by-line processable format defined as:
>
> filename permission owner group flags type [ symlink_data | major minor ]
>
> where flags are bitwise combinations of
>
> * 1: Allow creation of the file.
> * 2: Allow deletion of the file.
> * 4: Allow changing permissions of the file.
> * 8: Allow changing owner or group of the file.
> * 16: For internal use. Remembers whether this file is opened or not.
> * 32: Don't create this file at mount time.
>
> and here are some example entries:
>
> pts 755 0 0 0 d

Good summary - probably should add that to the patch, drop it into Documentation/syaoran-config.txt or similar...

> > the idea of passing a file to be read by the kernel, but I also understand
> > that if it isn't done before mount, you have a race condition between the
> > mount and the load.
> What race condition is possible?
> Are you worrying that the file gets modified while reading?

Modification while reading *is* an issue, but can probably be worked around with some clever locking. The race condition I was thinking of was if you had the mount and the policy load be 2 separate events, you could see:

(a) issue mount request
(b) do something malicious in /dev while..
(c) load the policy that would have prevented (b).

This is partly why SELinux has init load the policy *very* early on, before any other userspace has had a chance to run and do things that would have been prevented by policy.

>> Does this do what you think it does if run in a chroot process or if
>> some creative person does "accept=../../path/to/bad_data.cfg"?
> sys_open() calls open_pathname() with AT_FDCWD.
> So, it is the same thing as calling
> open("../../path/to/bad_data.cfg", O_RDONLY) from the userland.

Which basically ends up meaning that anybody who can trick the mount into happening can reset the permitted list and create (for example) a mode 666 entry for a hard drive, and go scribbling around at will.

Note that you don't seem to do any sanity checking on the path (for instance, that each component is owned by root, and not world-writable) - so anybody who finds a way to get the mount to happen can supply their own list in /home/joeuser/blat or /tmp/surprise-mount-list or wherever.

>> That printk should be KERN_ERR, I think.
> May be. But I think KERN_WARNING is enough because this is not such emergent
> error.

OK, I can live with WARNING. You just want to be sure it's above INFO...
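The entry format quoted at the top of this message can be decoded with very little code. The sketch below is a userland illustration, not the parser from the patch: `demo_entry`, `demo_parse_entry` and the `DEMO_*` flag names are hypothetical, and only the six leading fields are read (the optional symlink data or major/minor pair is ignored).

```c
#include <assert.h>
#include <stdio.h>

/* Flag bits as listed in the message; the names are hypothetical. */
enum demo_flags {
	DEMO_MAY_CREATE		= 1,	/* allow creation of the file */
	DEMO_MAY_DELETE		= 2,	/* allow deletion of the file */
	DEMO_MAY_CHMOD		= 4,	/* allow changing permissions */
	DEMO_MAY_CHOWN		= 8,	/* allow changing owner or group */
	DEMO_INTERNAL_OPENED	= 16,	/* internal: file is opened */
	DEMO_NO_CREATE_AT_MOUNT	= 32	/* don't create at mount time */
};

struct demo_entry {
	char name[64];
	unsigned int perm;	/* parsed as octal, e.g. 755 -> 0755 */
	unsigned int owner;
	unsigned int group;
	unsigned int flags;
	char type;		/* 'd', 'b', 'c', 'l', ... */
};

/*
 * Parse "filename permission owner group flags type" from one line.
 * Returns 0 on success and -1 if any of the six fields is missing.
 */
static int demo_parse_entry(const char *line, struct demo_entry *e)
{
	int n = sscanf(line, "%63s %o %u %u %u %c",
		       e->name, &e->perm, &e->owner, &e->group,
		       &e->flags, &e->type);

	return n == 6 ? 0 : -1;
}
```

For the quoted example entry `pts 755 0 0 0 d`, this yields a directory with mode 0755 and no flag bits set, meaning none of the listed changes are allowed on it.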
Re: [patch 1/1] Convert drivers in drivers/char/drm to use .unlocked_ioctl
> On Wed, Jan 09, 2008 at 03:37:50AM +, Dave Airlie wrote: > > > > > The drm drivers in this patch all used drm_ioctl to perform their > > > ioctl calls. The common function is converted to use lock_kernel() > > > and unlock_kernel() and the drivers are converted to use .unlocked_ioctl > > > > > > > NAK > > Did you actually read Kevin's patch? Kevin's patch adds the lock/unlock to drm_ioctl which is exactly what I don't want, I want to have drm_ioctl become drm_unlocked_ioctl, and drm_ioctl to wrap it with the lock/unlocks, then the drivers can all use unlocked_ioctl like Kevins patch pointing to drm_ioctl, and can migrate over to drm_unlocked_ioctl post lock auditing, the new latest i915 driver seems to be fine with unlocked ioctls so far.. Yes I can use Kevin's patch as a base most likely, but it doesn't do what I want yet, and I've already started to do it properly in the drm upstream trees Dave. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
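The layering described here (an unlocked entry point, plus a thin wrapper that takes the big lock for drivers that have not been audited yet) can be sketched in userland C. The `demo_*` names are hypothetical, a depth counter stands in for the BKL, and the handler body is a placeholder that only records whether the lock was held.

```c
#include <assert.h>

/* Stand-in for the BKL: a depth counter instead of a real lock. */
static int demo_lock_depth;

/* Records whether the most recent handler call ran under the lock. */
static int demo_handler_saw_lock;

static void demo_lock(void)
{
	demo_lock_depth++;
}

static void demo_unlock(void)
{
	demo_lock_depth--;
}

/* The real work: safe to call without the lock once a driver is audited. */
static long demo_unlocked_ioctl(unsigned int cmd, unsigned long arg)
{
	demo_handler_saw_lock = demo_lock_depth > 0;
	return (long)(cmd + arg);	/* placeholder result */
}

/* Compatibility wrapper: same behaviour, but with the lock held. */
static long demo_ioctl(unsigned int cmd, unsigned long arg)
{
	long ret;

	demo_lock();
	ret = demo_unlocked_ioctl(cmd, arg);
	demo_unlock();
	return ret;
}
```

A driver would point its ioctl hook at the wrapper at first and switch to the unlocked function after its locking has been audited, which is the migration path outlined in the message.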
Re: [PATCH] call sysrq_timer_list_show from a workqueue
On Wed, 9 Jan 2008 15:27:59 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote: > > > I assume you've > > > queued these because you're thinking of applying them before 2.6.24? I'd > > > say only > > > modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch > > > warrants that (the other is unlikely and not a regression). > > > > Actually I was thinking 2.6.25 on both. > > Then, you should get them next time you grab my series, no? Or is that > particular lever not working yet? > > Hmm, I see my link was not updated (damn, ln -sfn, not ln -sf!). Fixed now: > http://ozlabs.org/~rusty/kernel/rr-latest/ > > More goodies there than a UK comedy convention... My 850-email backlog is down to 759. You're in there somewhere. I'm wondering if I can spin it out to next Christmas. I may end up throwing up my hands, trolling it all for bugfixes and then having an accident with the rest. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFD] Incremental fsck
Rik van Riel wrote: > Al Boldi <[EMAIL PROTECTED]> wrote: > > Has there been some thought about an incremental fsck? > > > > You know, somehow fencing a sub-dir to do an online fsck? > > Search for "chunkfs" Sure, and there is TileFS too. But why wouldn't it be possible to do this on the current fs infrastructure, using just a smart fsck, working incrementally on some sub-dir? Thanks! -- Al -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 05/19] split LRU lists into anon & file sets
I like this patch set, thank you.

On Tue, 08 Jan 2008 15:59:44 -0500 Rik van Riel <[EMAIL PROTECTED]> wrote:
> Index: linux-2.6.24-rc6-mm1/mm/memcontrol.c
> ===
> --- linux-2.6.24-rc6-mm1.orig/mm/memcontrol.c 2008-01-07 11:55:09.0 -0500
> +++ linux-2.6.24-rc6-mm1/mm/memcontrol.c 2008-01-07 17:32:53.0 -0500
> -enum mem_cgroup_zstat_index {
> -	MEM_CGROUP_ZSTAT_ACTIVE,
> -	MEM_CGROUP_ZSTAT_INACTIVE,
> -
> -	NR_MEM_CGROUP_ZSTAT,
> -};
> -
>  struct mem_cgroup_per_zone {
>  	/*
>  	 * spin_lock to protect the per cgroup LRU
>  	 */
>  	spinlock_t lru_lock;
> -	struct list_head active_list;
> -	struct list_head inactive_list;
> -	unsigned long count[NR_MEM_CGROUP_ZSTAT];
> +	struct list_head lists[NR_LRU_LISTS];
> +	unsigned long count[NR_LRU_LISTS];
>  };
>  /* Macro for accessing counter */
>  #define MEM_CGROUP_ZSTAT(mz, idx)	((mz)->count[(idx)])
> @@ -160,6 +152,7 @@ struct page_cgroup {
>  };
>  #define PAGE_CGROUP_FLAG_CACHE	(0x1)	/* charged as cache */
>  #define PAGE_CGROUP_FLAG_ACTIVE	(0x2)	/* page is active in this cgroup */
> +#define PAGE_CGROUP_FLAG_FILE	(0x4)	/* page is file system backed */
>

Now we don't have control_type, nor a feature for accounting only CACHE. Balbir-san, do you have some new plan?

BTW, is it better to use PageSwapBacked(pc->page) rather than adding a new flag PAGE_CGROUP_FLAG_FILE?

PAGE_CGROUP_FLAG_ACTIVE is used because global reclaim can change the ACTIVE/INACTIVE attribute without accessing the memory cgroup. (So we cannot trust PageActive(pc->page).)

Can the ANON <-> FILE attribute be changed dynamically (after the page is added to the LRU)? If not, using page_file_cache(pc->page) will be easy.

Thanks,
-Kame
Re: [PATCH][RFC] Simple tamper-proof device filesystem.
Hello.

Indan Zupancic wrote:
> I think you focus too much on your way of enforcing filename/attributes
> pairs.

So?

> The same can be achieved by creating the device nodes with
> expected attributes, and preventing processes from changing those files.

The device nodes have to be deletable if some process (including udev) needs
to delete them. Thus, you cannot unconditionally prevent processes from
changing those files.

> This because expected combinations are known beforehand.

Yes.

> And once those files are present, the MAC system used doesn't have to have
> special device nodes attributes support. Protecting those files is enough to
> guarantee filename/attributes pairs.

If the MAC system needn't support this filesystem's functionality, who creates
those files with a guarantee of the expected attributes? Does udev?
If udev is exploited, who can guarantee it?

> No, this is because rename permission was given for files that it shouldn't
> had.

Do you think all MAC implementations have the same granularity and
functionality? I don't think so. Not all MAC implementations can control
access with such granularity.

This filesystem is designed to be combined with any MAC, although the MAC
used with this filesystem should be able to restrict namespace manipulation
requests so that this filesystem can remain mounted on /dev and visible to
userland applications.

> Either you want a process to manage device names and attributes, and then you
> give it permission to do that, or you want to enforce certain
> filename/attribute pairs and then you just do it yourself.

If I modify udev to enforce certain filename/attribute pairs and the
modified udev is exploited, who can guarantee them?
"Don't trust userland applications" is the basis of restricting access in
kernel space. If you can trust userland applications, you don't need
in-kernel access control.

> Will your filesystem prevent the trivial case of
>
> rm /dev/hda1
> ln -s /dev/hda2 /dev/hda1
>

Of course.
To permit the above operation, the following permissions are needed.

  hda1 660 0 6 2 b 3 1
  hda1 777 0 0 33 l .

> Rename permission can be given for /dev in general, but prohibited for
> certain files in /dev, the ones you want to have specific attributes.
> It isn't all or nothing.

Do you think all MAC implementations can prohibit renaming for certain files
in /dev ?

> It's "forbid modifying certain nodes that process needn't to modify"
> versus "forbid breaking filename/attribute pairs of certain nodes".
>
> Both have the same effect, except that the first one is generic and
> can be done by existing MAC systems, while the second one needs
> a special filesystem and a handful of MAC rules to make it effective.

Do you think all MAC implementations can do that?
I think the first one is implementation specific and the second one is generic.

> It doesn't matter where they are, it's that a different fs than yours could be
> mounted over it. You say a MAC can prevent that from happening, but a
> MAC can also prevent all processes except for udev from modifying /dev.

But the MAC cannot prevent udev from modifying /dev . And what if udev is
exploited? Not all MACs can enforce access control over all processes with
the granularity you are talking about.
And what if a process exists that cannot be controlled with your boolean
level granularity (e.g. an administrator running his/her administrative
applications that require modification of /dev )?

A crazy example of an administrative application:
(Please don't say "Don't use such crazy application".)

  #! /bin/sh
  rm -f /dev/either-null-or-zero
  # Read the minor number into REPLY (named explicitly for plain sh).
  read REPLY
  mknod /dev/either-null-or-zero c 1 $REPLY && echo "Administrative task finished successfully." | mail root

This filesystem can guarantee /dev/either-null-or-zero is either char-1-3
or char-1-5 by using a policy

  either-null-or-zero 666 0 0 3 c 1 3
  either-null-or-zero 666 0 0 35 c 1 5

The boolean level granularity (e.g. forbid all processes except for udev ,
and modify udev to perform name/attribute pair enforcement) is not generic.
Userland applications sometimes misbehave. I assume kernel processes don't
misbehave. If you doubt my assumption, you have to doubt in-kernel MAC
implementations too.

> I don't. What I complain about is that it's too specific and does it one
> chosen job badly. It lacks abstraction. As far as I can see any decent MAC can
> achieve the same end result as your filesystem, without directly enforcing
> name/attr pairs.

Can SELinux guarantee the same result as my filesystem even if udev or
administrative programs have to be able to modify /dev ?

> The thing is, all special device nodes that are expected to exist by
> applications are known beforehand.

Yes.

> Thus they can be created statically and can be protected
> against any modifications with any MAC system.

But sometimes some modifications need to be permitted. Who can
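A minimal user-space sketch of the name/attribute check argued for above, assuming a policy entry carries only a name, a node type and a major/minor pair. The struct `dev_policy`, the table and the function `mknod_allowed()` are hypothetical names for illustration; the mode, owner and flag columns of the policy lines in the message are left out, and this is not the filesystem's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy entry: one permitted name/attribute pair. */
struct dev_policy {
	const char *name;	/* node name relative to /dev */
	char type;		/* 'c' for char device, 'b' for block device */
	unsigned int major;
	unsigned int minor;
};

/* Mirrors the two either-null-or-zero entries from the message. */
static const struct dev_policy policy[] = {
	{ "either-null-or-zero", 'c', 1, 3 },
	{ "either-null-or-zero", 'c', 1, 5 },
};

/* Return 1 if creating `name` with these attributes matches an entry, else 0. */
static int mknod_allowed(const char *name, char type,
			 unsigned int major, unsigned int minor)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(policy) / sizeof(policy[0]); i++) {
		const struct dev_policy *p = &policy[i];

		if (strcmp(p->name, name) == 0 && p->type == type &&
		    p->major == major && p->minor == minor)
			return 1;
	}
	return 0;
}
```

With this table the administrative script can create the node as char 1:3 or char 1:5 whatever it reads as input, while any other minor number, or any name not listed, is refused.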
Re: [TOMOYO #6 retry 08/21] Utility functions and policy manipulation interface.
Hello. James Morris wrote: > Why aren't you using securityfs for this? (It was designed for LSMs). We are using securityfs mounted on /sys/kernel/security/ . Thanks. Kentaro Takeda -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [TOMOYO #6 retry 08/21] Utility functions and policy manipulation interface.
On Wed, 9 Jan 2008, James Morris wrote:
> On Wed, 9 Jan 2008, Kentaro Takeda wrote:
>
> > Common functions for TOMOYO Linux.
> >
> > TOMOYO Linux uses /sys/kernel/security/tomoyo interface for configuration.
>
> Why aren't you using securityfs for this? (It was designed for LSMs).

Doh, it is using securityfs, don't worry.

--
James Morris <[EMAIL PROTECTED]>
Re: [PATCH 1/4] add task handling notifier: base definitions
On Thu, 2007-12-20 at 13:12 +, Jan Beulich wrote:
> This is the base patch, adding notification for task creation and
> deletion.
>
> Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
> ---
>  include/linux/sched.h |8 +++-
>  kernel/fork.c | 11 +++
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> --- 2.6.24-rc5-notify-task.orig/include/linux/sched.h
> +++ 2.6.24-rc5-notify-task/include/linux/sched.h
> @@ -80,7 +80,7 @@ struct sched_param {
>  #include
>  #include
>  #include
> -
> +#include
>  #include
>  #include
>  #include
> @@ -1700,6 +1700,12 @@ extern int do_execve(char *, char __user
>  extern long do_fork(unsigned long, unsigned long, struct pt_regs *, unsigned long, int __user *, int __user *);
>  struct task_struct *fork_idle(int);
>
> +#define TASK_NEW 1
> +#define TASK_DELETE 2
> +
> +extern struct blocking_notifier_head task_notifier_list;
> +extern struct atomic_notifier_head atomic_task_notifier_list;
> +
>  extern void set_task_comm(struct task_struct *tsk, char *from);
>  extern void get_task_comm(char *to, struct task_struct *tsk);
>
> --- 2.6.24-rc5-notify-task.orig/kernel/fork.c
> +++ 2.6.24-rc5-notify-task/kernel/fork.c
> @@ -46,6 +46,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -71,6 +72,11 @@ DEFINE_PER_CPU(unsigned long, process_co
>
>  __cacheline_aligned DEFINE_RWLOCK(tasklist_lock); /* outer */
>
> +BLOCKING_NOTIFIER_HEAD(task_notifier_list);
> +EXPORT_SYMBOL_GPL(task_notifier_list);
> +ATOMIC_NOTIFIER_HEAD(atomic_task_notifier_list);
> +EXPORT_SYMBOL_GPL(atomic_task_notifier_list);
> +

When these global notifier lists were proposed years ago folks at SGI
loudly objected with concerns over anticipated cache line bouncing on
512+ cpu machines. Is that no longer a concern?

>  int nr_processes(void)
>  {
>  	int cpu;
> @@ -121,6 +127,9 @@ void __put_task_struct(struct task_struc
>  	WARN_ON(atomic_read(&tsk->usage));
>  	WARN_ON(tsk == current);
>
> +	atomic_notifier_call_chain(&atomic_task_notifier_list,
> +				   TASK_DELETE, tsk);
> +
>  	security_task_free(tsk);
>  	free_uid(tsk->user);
>  	put_group_info(tsk->group_info);

Would the atomic notifier call chain be necessary if you hooked into an
earlier section of do_exit() instead?

Cheers,
	-Matt Helsley
Re: [PATCH] call sysrq_timer_list_show from a workqueue
On Wednesday 09 January 2008 14:33:50 Andrew Morton wrote:
> On Wed, 9 Jan 2008 14:20:18 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > On Wednesday 09 January 2008 11:21:59 Andrew Morton wrote:
> > > The string handling in here has become a bit scruffy.
> >
> > Yes, that patch also evokes a const warning. Fixed below.
>
> No patch was included.

Yes, I decided it's a secret. Mine, all mine!

> > I assume you've queued these because you're thinking of applying them
> > before 2.6.24? I'd say only
> > modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch
> > warrants that (the other is unlikely and not a regression).
>
> Actually I was thinking 2.6.25 on both.

Then, you should get them next time you grab my series, no? Or is that
particular lever not working yet?

Hmm, I see my link was not updated (damn, ln -sfn, not ln -sf!). Fixed now:

	http://ozlabs.org/~rusty/kernel/rr-latest/

More goodies there than a UK comedy convention...

> OK, 2.6.24 seems reasonable.

Kyle acked it at least...

> Yes, it could all do with a revisit.

And it goes without saying that glory awaits they who succeed...

Rusty.
Re: [TOMOYO #6 retry 08/21] Utility functions and policy manipulation interface.
On Wed, 9 Jan 2008, Kentaro Takeda wrote:
> Common functions for TOMOYO Linux.
>
> TOMOYO Linux uses /sys/kernel/security/tomoyo interface for configuration.

Why aren't you using securityfs for this? (It was designed for LSMs).

- James
--
James Morris <[EMAIL PROTECTED]>
Re: [patch 03/19] define page_file_cache() function
On Tue, 8 Jan 2008 17:28:56 -0500 Rik van Riel <[EMAIL PROTECTED]> wrote:
> On Tue, 8 Jan 2008 14:18:40 -0800 (PST)
> Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 8 Jan 2008, Rik van Riel wrote:
> >
> > > Define page_file_cache() function to answer the question:
> > >	is page backed by a file?
> >
> > > +static inline int page_file_cache(struct page *page)
> > > +{
> > > +	if (PageSwapBacked(page))
> > > +		return 0;
> >
> > Could we call this PageNotFileBacked or so? PageSwapBacked is true for
> > pages that are RAM based. Its a bit confusing.
>
> PageNotFileBacked confuses me a little, since shared memory segments live
> in tmpfs and are kinda sorta file backed, but go to swap instead of to a
> filesystem when there is memory pressure.
>

How about PageIsNotCache() ? :)
When a page is cache, the original data exists somewhere else and the page
can be dropped.

Thanks,
-Kame
Re: [patch] split MMC_CAP_4_BIT_DATA
On Jan 9, 2008 2:29 AM, Mike Frysinger <[EMAIL PROTECTED]> wrote: > The on-chip Blackfin MMC/SD/SDIO host controller has the ability to do 1-bit > MMC, 1-bit/4-bit SD, and 1-bit/4-bit SDIO. Thus the current convention of > MMC_CAP_4_BIT_DATA meaning "your host controller can do 1-bit or 4-bit for all > modes" is insufficient for our needs. The attached patch splits > MMC_CAP_4_BIT_DATA into MMC_CAP_MMC_4_BIT_DATA and MMC_CAP_SD_4_BIT_DATA and > updates all host controllers to include these in their caps and then changes > existing code to check the new defines. At the moment, SD/SDIO are lumped > into MMC_CAP_SD_4_BIT_DATA ... should I bother with splitting that into SD and > SDIO as well while I'm doing this ? > > Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]> > --- > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c > index 68c0e3b..ca12db7 100644 > --- a/drivers/mmc/core/mmc.c > +++ b/drivers/mmc/core/mmc.c > @@ -397,7 +397,7 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr, > * Activate wide bus (if supported). > */ > if ((card->csd.mmca_vsn >= CSD_SPEC_VER_4) && > - (host->caps & MMC_CAP_4_BIT_DATA)) { > + (host->caps & MMC_CAP_MMC_4_BIT_DATA)) { > err = mmc_switch(card, EXT_CSD_CMD_SET_NORMAL, > EXT_CSD_BUS_WIDTH, EXT_CSD_BUS_WIDTH_4); > if (err) > diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c > index d1c1e0f..974b63d 100644 > --- a/drivers/mmc/core/sd.c > +++ b/drivers/mmc/core/sd.c > @@ -441,7 +441,7 @@ static int mmc_sd_init_card(struct mmc_host *host, u32 > ocr, > /* > * Switch to wider bus (if supported). 
> */ > - if ((host->caps & MMC_CAP_4_BIT_DATA) && > + if ((host->caps & MMC_CAP_SD_4_BIT_DATA) && > (card->scr.bus_widths & SD_SCR_BUS_WIDTH_4)) { > err = mmc_app_set_bus_width(card, MMC_BUS_WIDTH_4); > if (err) > diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c > index 87a50f4..1d389c8 100644 > --- a/drivers/mmc/core/sdio.c > +++ b/drivers/mmc/core/sdio.c > @@ -143,7 +143,7 @@ static int sdio_enable_wide(struct mmc_card *card) > int ret; > u8 ctrl; > > - if (!(card->host->caps & MMC_CAP_4_BIT_DATA)) > + if (!(card->host->caps & MMC_CAP_SD_4_BIT_DATA)) > return 0; > > if (card->cccr.low_speed && !card->cccr.wide_bus) > diff --git a/drivers/mmc/host/at91_mci.c b/drivers/mmc/host/at91_mci.c > index b1edcef..63d89b0 100644 > --- a/drivers/mmc/host/at91_mci.c > +++ b/drivers/mmc/host/at91_mci.c > @@ -851,7 +851,7 @@ static int __init at91_mci_probe(struct platform_device > *pdev) > host->board = pdev->dev.platform_data; > if (host->board->wire4) { > if (cpu_is_at91sam9260() || cpu_is_at91sam9263()) > - mmc->caps |= MMC_CAP_4_BIT_DATA; > + mmc->caps |= (MMC_CAP_SD_4_BIT_DATA | > MMC_CAP_MMC_4_BIT_DATA); > else > printk("AT91 MMC: 4 wire bus mode not supported" > " - using 1 wire\n"); > diff --git a/drivers/mmc/host/imxmmc.c b/drivers/mmc/host/imxmmc.c > index f2070a1..67d4bc0 100644 > --- a/drivers/mmc/host/imxmmc.c > +++ b/drivers/mmc/host/imxmmc.c > @@ -975,7 +975,7 @@ static int imxmci_probe(struct platform_device *pdev) > mmc->f_min = 15; > mmc->f_max = CLK_RATE/2; > mmc->ocr_avail = MMC_VDD_32_33; > - mmc->caps = MMC_CAP_4_BIT_DATA; > + mmc->caps = (MMC_CAP_SD_4_BIT_DATA | MMC_CAP_MMC_4_BIT_DATA); > > /* MMC core transfer sizes tunable parameters */ > mmc->max_hw_segs = 64; > diff --git a/drivers/mmc/host/omap.c b/drivers/mmc/host/omap.c > index 971e18b..b1ae793 100644 > --- a/drivers/mmc/host/omap.c > +++ b/drivers/mmc/host/omap.c > @@ -1079,7 +1079,7 @@ static int __init mmc_omap_probe(struct platform_device > *pdev) > mmc->caps = MMC_CAP_MULTIWRITE 
| MMC_CAP_BYTEBLOCK; > > if (minfo->wire4) > -mmc->caps |= MMC_CAP_4_BIT_DATA; > +mmc->caps |= (MMC_CAP_SD_4_BIT_DATA | > MMC_CAP_MMC_4_BIT_DATA); > > /* Use scatterlist DMA to reduce per-transfer costs. > * NOTE max_seg_size assumption that small blocks aren't > diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c > index 1654a33..4fa00f1 100644 > --- a/drivers/mmc/host/pxamci.c > +++ b/drivers/mmc/host/pxamci.c > @@ -527,7 +527,7 @@ static int pxamci_probe(struct platform_device *pdev) > mmc->caps = 0; > host->cmdat = 0; > if (!cpu_is_pxa21x() && !cpu_is_pxa25x()) { > - mmc->caps |= MMC_CAP_4_BIT_DATA | MMC_CAP_SDIO_IRQ; > + mmc->caps |= MMC_CAP_SD_4_BIT_DATA | MMC_CAP_MMC_4_BIT_DATA | > MMC_CAP_SDIO_IRQ; > host->cmdat |= CMDAT_SDIO_INT_EN; > } > > diff --git a/drivers/mmc/host/sdhci.c
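A rough user-space sketch of what the split capability bits allow, assuming the two flags are independent bits in the host's caps word. The bit values, the `card_type` enum and `host_can_use_4bit()` are made up for illustration; the excerpt does not show the header hunk that defines the real values.

```c
#include <assert.h>

/* Hypothetical bit positions; the patch's real values are not shown here. */
#define MMC_CAP_MMC_4_BIT_DATA	(1u << 0)
#define MMC_CAP_SD_4_BIT_DATA	(1u << 1)

/* Illustrative card classes; SD and SDIO share one flag, as in the patch. */
enum card_type { CARD_MMC, CARD_SD, CARD_SDIO };

/* Return 1 if the host may switch this kind of card to a 4-bit bus. */
static int host_can_use_4bit(unsigned int caps, enum card_type type)
{
	/* The split only helps if the two capabilities are distinct bits. */
	assert((MMC_CAP_MMC_4_BIT_DATA & MMC_CAP_SD_4_BIT_DATA) == 0);

	switch (type) {
	case CARD_MMC:
		return (caps & MMC_CAP_MMC_4_BIT_DATA) != 0;
	case CARD_SD:
	case CARD_SDIO:
		return (caps & MMC_CAP_SD_4_BIT_DATA) != 0;
	}
	return 0;
}
```

A controller like the one described for Blackfin would set only MMC_CAP_SD_4_BIT_DATA, so SD and SDIO cards get the wide bus while MMC cards stay at 1 bit. Hosts that previously set MMC_CAP_4_BIT_DATA set both bits and behave as before.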
Re: NIC as RS232
On Tue, 08 Jan 2008 08:48:35 +0200, Thanasis said:
> Is there a kernel driver that would make a NIC's port work as a RS232
> port, using the serial cables that are RJ45 on one side and DB9 or DB25
> on the other? Maybe null modem cables of that type ? Or for example
> those used by cisco as console port cables?
>
> (or may be I'm dreaming ;-)

What I *have* seen are connectors that go from DB9/25 to RJ11, not RJ45.
Basically, using the RJ11 to terminate a 4-conductor cable wired up for
serial use. It's often hard to tell an 11 from a 45 unless you look at it
*real* close.
[PATCHv3] kprobes: Introduce kprobe_handle_fault()
Use a central kprobe_handle_fault() inline in kprobes.h to remove all of the arch-dependant, practically identical implementations in avr32, ia64, powerpc, s390, sparc64, and x86. avr32 was the only arch without the preempt_disable/enable pair in its notify_page_fault implementation. This uncovered a possible bug in the s390 version as that purely copied the x86 version unconditionally passing 14 as the trapnr rather than the error_code parameter. powerpc: Tested-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> X86-64 Acked-by: Masami Hiramatsu <[EMAIL PROTECTED]> Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]> --- arch/avr32/mm/fault.c | 21 + arch/ia64/mm/fault.c| 24 +--- arch/powerpc/mm/fault.c | 25 + arch/s390/mm/fault.c| 25 + arch/sparc64/mm/fault.c | 23 +-- arch/x86/mm/fault_64.c | 26 ++ include/linux/kprobes.h | 19 +++ 7 files changed, 26 insertions(+), 137 deletions(-) diff --git a/arch/avr32/mm/fault.c b/arch/avr32/mm/fault.c index 6560cb1..e41953e 100644 --- a/arch/avr32/mm/fault.c +++ b/arch/avr32/mm/fault.c @@ -20,25 +20,6 @@ #include #include -#ifdef CONFIG_KPROBES -static inline int notify_page_fault(struct pt_regs *regs, int trap) -{ - int ret = 0; - - if (!user_mode(regs)) { - if (kprobe_running() && kprobe_fault_handler(regs, trap)) - ret = 1; - } - - return ret; -} -#else -static inline int notify_page_fault(struct pt_regs *regs, int trap) -{ - return 0; -} -#endif - int exception_trace = 1; /* @@ -66,7 +47,7 @@ asmlinkage void do_page_fault(unsigned long ecr, struct pt_regs *regs) int code; int fault; - if (notify_page_fault(regs, ecr)) + if (kprobe_handle_fault(regs, ecr)) return; address = sysreg_read(TLBEAR); diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c index 7571076..bfc83e8 100644 --- a/arch/ia64/mm/fault.c +++ b/arch/ia64/mm/fault.c @@ -18,28 +18,6 @@ extern void die (char *, struct pt_regs *, long); -#ifdef CONFIG_KPROBES -static inline int notify_page_fault(struct pt_regs *regs, int trap) -{ - int ret = 0; - - if 
(!user_mode(regs)) { - /* kprobe_running() needs smp_processor_id() */ - preempt_disable(); - if (kprobe_running() && kprobes_fault_handler(regs, trap)) - ret = 1; - preempt_enable(); - } - - return ret; -} -#else -static inline int notify_page_fault(struct pt_regs *regs, int trap) -{ - return 0; -} -#endif - /* * Return TRUE if ADDRESS points at a page in the kernel's mapped segment * (inside region 5, on ia64) and that page is present. @@ -106,7 +84,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re /* * This is to handle the kprobes on user space access instructions */ - if (notify_page_fault(regs, TRAP_BRKPT)) + if (kprobe_handle_fault(regs, TRAP_BRKPT)) return; down_read(>mmap_sem); diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index 8135da0..ff64bd3 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -39,29 +39,6 @@ #include #include - -#ifdef CONFIG_KPROBES -static inline int notify_page_fault(struct pt_regs *regs) -{ - int ret = 0; - - /* kprobe_running() needs smp_processor_id() */ - if (!user_mode(regs)) { - preempt_disable(); - if (kprobe_running() && kprobe_fault_handler(regs, 11)) - ret = 1; - preempt_enable(); - } - - return ret; -} -#else -static inline int notify_page_fault(struct pt_regs *regs) -{ - return 0; -} -#endif - /* * Check whether the instruction at regs->nip is a store using * an update addressing form which will update r1. 
@@ -164,7 +141,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address, is_write = error_code & ESR_DST; #endif /* CONFIG_4xx || CONFIG_BOOKE */ - if (notify_page_fault(regs)) + if (kprobe_handle_fault(regs, 11)) return 0; if (trap == 0x300) { diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index 2456b52..a9033cf 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -51,29 +51,6 @@ extern int sysctl_userprocess_debug; extern void die(const char *,struct pt_regs *,long); -#ifdef CONFIG_KPROBES -static inline int notify_page_fault(struct pt_regs *regs, long err) -{ - int ret = 0; - - /* kprobe_running() needs smp_processor_id() */ - if (!user_mode(regs)) { - preempt_disable(); - if (kprobe_running() && kprobe_fault_handler(regs, 14)) - ret = 1; - preempt_enable();
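The include/linux/kprobes.h hunk is cut off in this excerpt, so the following is only a guess at the shape of the shared helper, modelled on the per-arch notify_page_fault() copies the patch removes. Everything except the body of kprobe_handle_fault() is a user-space stand-in (`struct pt_regs`, the preempt counters, the stubbed kprobe calls and the `probe_active`, `handler_result` and `last_trapnr` variables are hypothetical) so the control flow can be exercised outside the kernel.

```c
#include <assert.h>

/* User-space stand-ins for kernel facilities; all hypothetical. */
struct pt_regs { int user; };		/* non-zero means a user-mode fault */

static int preempt_depth;		/* tracks disable/enable pairing */
static int probe_active;		/* pretend a kprobe is running */
static int handler_result;		/* what the stub fault handler reports */
static int last_trapnr = -1;		/* last trap number the stub saw */

static int user_mode(const struct pt_regs *regs) { return regs->user; }
static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)
{
	assert(preempt_depth > 0);	/* catch unbalanced enable */
	preempt_depth--;
}
static int kprobe_running(void) { return probe_active; }
static int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
{
	(void)regs;
	last_trapnr = trapnr;
	return handler_result;
}

/* Guessed common helper: same logic as the removed per-arch copies. */
static inline int kprobe_handle_fault(struct pt_regs *regs, int trapnr)
{
	int ret = 0;

	if (!user_mode(regs)) {
		/* kprobe_running() needs smp_processor_id() */
		preempt_disable();
		if (kprobe_running() && kprobe_fault_handler(regs, trapnr))
			ret = 1;
		preempt_enable();
	}
	return ret;
}
```

The point of passing trapnr through is visible in the s390 case mentioned in the changelog: a shared helper forces each caller to state which trap number it hands to the fault handler instead of inheriting a hard-coded 14.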
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
> An alternative might be to come up with something decent and target 2.6.24.x

If you want zero cache line cost the only way is to handle that using
Mathieu's inline patch infrastructure. Having a generic notifier type
based on that would be probably a good idea.

-Andi
Re: [patch 07/19] (NEW) add some sanity checks to get_scan_ratio
On Tue, 08 Jan 2008 15:59:46 -0500 Rik van Riel <[EMAIL PROTECTED]> wrote:
> The access ratio based scan rate determination in get_scan_ratio
> works ok in most situations, but needs to be corrected in some
> corner cases:
> - if we run out of swap space, do not bother scanning the anon LRUs
> - if we have already freed all of the page cache, we need to scan
>   the anon LRUs
>
> Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
>
> Index: linux-2.6.24-rc6-mm1/mm/vmscan.c
> ===
> --- linux-2.6.24-rc6-mm1.orig/mm/vmscan.c	2008-01-07 17:33:50.0 -0500
> +++ linux-2.6.24-rc6-mm1/mm/vmscan.c	2008-01-07 17:57:49.0 -0500
> @@ -1182,7 +1182,7 @@ static unsigned long shrink_list(enum lr
>  static void get_scan_ratio(struct zone *zone, struct scan_control * sc,
>  		unsigned long *percent)
>  {
> -	unsigned long anon, file;
> +	unsigned long anon, file, free;
>  	unsigned long anon_prio, file_prio;
>  	unsigned long rotate_sum;
>  	unsigned long ap, fp;
> @@ -1230,6 +1230,20 @@ static void get_scan_ratio(struct zone *
>  	else if (fp > 100)
>  		fp = 100;
>  	percent[1] = fp;
> +
> +	free = zone_page_state(zone, NR_FREE_PAGES);
> +
> +	/*
> +	 * If we have no swap space, do not bother scanning anon pages
> +	 */
> +	if (nr_swap_pages <= 0)
> +		percent[0] = 0;

Doesn't this mean that swap-cache in ACTIVE_ANON_LIST is not scanned ?
Or swap-cache is in File-Cache list ?

Thanks,
-Kame
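A simplified user-space sketch of the two corner cases named in the changelog, assuming percent[0] is the anon share and percent[1] the file share as in the quoted hunk. The function `sanitize_scan_ratio()` and its `file_pages` parameter are hypothetical; the quote stops after the first check, so the second branch here is only one plausible reading of "we have already freed all of the page cache", not the patch's actual test.

```c
#include <assert.h>
#include <stddef.h>

/*
 * percent[0]: share of scanning aimed at anon pages
 * percent[1]: share of scanning aimed at file pages
 */
static void sanitize_scan_ratio(long nr_swap_pages, unsigned long file_pages,
				unsigned long percent[2])
{
	assert(percent != NULL);

	if (nr_swap_pages <= 0) {
		/* No swap space: anon pages cannot be evicted, skip them. */
		percent[0] = 0;
	} else if (file_pages == 0) {
		/* Assumed second case: no page cache left, scan anon only. */
		percent[0] = 100;
		percent[1] = 0;
	}
}
```

The question raised in the reply still applies to the first branch: if swap cache pages sit on the anon lists, zeroing percent[0] means they are not scanned either.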
Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"
On Tue, 08 Jan 2008 23:01:07 -0500 [EMAIL PROTECTED] wrote:
> On Mon, 07 Jan 2008 16:19:30 MST, Matthew Wilcox said:
> > On Mon, Jan 07, 2008 at 06:04:25PM -0500, [EMAIL PROTECTED] wrote:
> > > Theoretically, at least. Sometimes, in the real world, other constraints
> > > enter into it...
> >
> > So you're saying that you can't find reliable ways to reproduce problems
> > on demand? Those are some of the lower quality bug reports, so I don't
> > think we're losing much by having you not report them.
>
> I'm sure that *everybody* on this list would *love* to know how you find
> a reliable way to reproduce all the bugs that start off with "after X days of
> uptime". But when you're chasing what might be a race condition with a
> very small timing hole, you may need an event to happen several million times
> before the accumulated chance of hitting it becomes appreciable.
>

I must say that the number of bugs which actually go away when the user
stops using nvidia/fglrx/ndiswrapper/etc is a small minority.

And you can usually tell beforehand too: if the user reports bad_page
warnings or pte table scroggage or whatever and they're using nvidia I just
hit 'd'.

But people who think that removing the nvidia driver will magically fix
that khubd-got-stuck-in-D-state bug are urinating up an incline.

Facts:

- lots of people use nvidia/etc

- most bugs they report aren't caused by nvidia/etc

- we need lots of testers

Draw your own conclusions.
Re: [patch 1/1] Convert drivers in drivers/char/drm to use .unlocked_ioctl
On Wed, Jan 09, 2008 at 03:37:50AM +, Dave Airlie wrote:
>
> > The drm drivers in this patch all used drm_ioctl to perform their
> > ioctl calls. The common function is converted to use lock_kernel()
> > and unlock_kernel() and the drivers are converted to use .unlocked_ioctl
> >
>
> NAK

Did you actually read Kevin's patch?

>
> I've started looking at this already in the drm git tree, I'm going to
> provide both locked and unlocked paths for drivers to choose, as we need

If you do that you'll exactly need Kevin's patch as a base.

-Andi
Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"
On Mon, 07 Jan 2008 16:19:30 MST, Matthew Wilcox said:

> So you're saying that you can't find reliable ways to reproduce problems
> on demand? Those are some of the lower quality bug reports, so I don't
> think we're losing much by having you not report them.

And in the next e-mail in my lkml folder we see:

On Mon, 07 Jan 2008 18:21:45 EST, Parag Warudkar said:
> BTW, I have so far tested 2.6.24-rc4/5/6/7 and 2.6.23.12 - all of
> which have this problem.
>
> Yesterday I went back to using 2.6.22.15 and after a day's uptime it
> has not reproduced with the same config.
>
> Time for git-bisect I suppose? (the only problem is that this takes
> anywhere between 20 minutes to 8 hrs to confirm reliably.)

Are you saying that we're not losing much if Parag says "screw it" and
doesn't report the problem?
Do SATA tape drives work?
Hi guys

I was wondering whether anyone can shed any light on the status of SATA tape
drives. There's very little info on the net about this, at least in the
places I've checked; the only thing of any significance I've found thus far
is a note in a Bacula document dated April 2007 which states that drives
other than real SCSI units don't generally work with Bacula.

To put this into context, I'm looking at purchasing a Sony SDX470VRB SATA
AIT-1 tape drive for use with the SATA controller on an Intel DG31PR
mainboard. The drive will be used primarily with tar/cpio. Obviously,
however, I only want to make the purchase if there's a reasonable chance of
it working.

I would appreciate any light you can shed on this issue.

Regards
jonathan
Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"
On Mon, 07 Jan 2008 16:19:30 MST, Matthew Wilcox said:
> On Mon, Jan 07, 2008 at 06:04:25PM -0500, [EMAIL PROTECTED] wrote:
> > Theoretically, at least. Sometimes, in the real world, other constraints
> > enter into it...
>
> So you're saying that you can't find reliable ways to reproduce problems
> on demand? Those are some of the lower quality bug reports, so I don't
> think we're losing much by having you not report them.

I'm sure that *everybody* on this list would *love* to know how you find
a reliable way to reproduce all the bugs that start off with "after X days of
uptime". But when you're chasing what might be a race condition with a
very small timing hole, you may need an event to happen several million times
before the accumulated chance of hitting it becomes appreciable.
More breakage in native_rdtsc out of line in git-x86
I had some boot failures here with git-x86 with init and hotplug all
segfaulting early on userland with new glibc.

Bisecting found

commit 6aea5bc37fa790eaf3a942f0785985914568e214
Author: Ingo Molnar <[EMAIL PROTECTED]>
Date:   Sat Jan 5 13:27:08 2008 +0100

    x86: move native_read_tsc() offline

    move native_read_tsc() offline.

I think the problem is that the vsyscall/vdso code calls it through
vread and for that it has to be exported.

There seems to be also another bug with the old style vsyscalls not using
the TSC vread that masks it on older glibc. Stepping with gdb through old
style vgettimeofday() confirms that RDTSC is not used. A long time ago we
had a similar problem once and it was because of a problem exporting the
vsyscall variables in vmlinux.lds.S -- looks like that has reappeared.
I think the new glibc shows it because it uses the vDSO not the older
vsyscall and the new vDSO probably still works. Anyways haven't investigated
why that is in detail yet, but that's a separate regression.

Back to the boot failure: Unfortunately simply adding __vsyscall_fn to
native_read_tsc doesn't work -- causes early kernel faults like

PANIC: early exception rip ff600105 error 10 cr2 ff600105
Pid: 0, comm: swapper Not tainted 2.6.24-rc6 #58

Call Trace:
 [] native_sched_clock+0x9/0x3f
 [] init_idle+0x33/0xd1
 [] sched_init+0x26d/0x283
 [] start_kernel+0x10b/0x2bd
 [] _sinittext+0x114/0x11b

Not sure why that is -- in theory the vsyscall functions should be callable
from the main kernel. Might be a binutils problem or another code regression.

Anyways it looks like the only good fix is to either revert that or fork
into two functions, one for vread() and another for normal tsc ->read()

This is all in addition to the problem of it having incorrect barriers.

I note that my original patch didn't have any of these problems.

I'm using the appended revert patch here as a workaround for now.

-Andi

Revert rdtsc out of line change

Reverts

commit 6aea5bc37fa790eaf3a942f0785985914568e214
Author: Ingo Molnar <[EMAIL PROTECTED]>
Date:   Sat Jan 5 13:27:08 2008 +0100

    x86: move native_read_tsc() offline

    move native_read_tsc() offline.

The function is called by vsyscalls in ring 3, so it can't be out of line
this way.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux/arch/x86/kernel/rtc.c
===
--- linux.orig/arch/x86/kernel/rtc.c
+++ linux/arch/x86/kernel/rtc.c
@@ -194,14 +194,3 @@ int update_persistent_clock(struct times
 {
 	return set_rtc_mmss(now.tv_sec);
 }
-
-unsigned long long native_read_tsc(void)
-{
-	DECLARE_ARGS(val, low, high);
-
-	asm volatile("rdtsc" : EAX_EDX_RET(val, low, high));
-	rdtsc_barrier();
-
-	return EAX_EDX_VAL(val, low, high);
-}
-
Index: linux/include/asm-x86/msr.h
===
--- linux.orig/include/asm-x86/msr.h
+++ linux/include/asm-x86/msr.h
@@ -91,7 +91,13 @@ static inline int native_write_msr_safe(
 	return err;
 }
 
-extern unsigned long long native_read_tsc(void);
+static inline unsigned long long native_read_tsc(void)
+{
+	DECLARE_ARGS(val, low, high);
+
+	asm volatile("rdtsc" : EAX_EDX_RET(val, low, high));
+	return EAX_EDX_VAL(val, low, high);
+}
 
 static inline unsigned long long native_read_pmc(int counter)
 {
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
> > An alternative might be to come up with something decent and target 2.6.24.x I don't see mmiotrace getting merged into a stable kernel... how do however see it getting cleaned up for 2.6.25 now that people know how fragile the kernel hooks for it are.. > We put the crappy code back in for 2.6.24 then take it out immediately > after 2.6.24 and put something else in to support mmiotrace and perhaps the > other new mystery features to which you refer below. hm. (I think the other mystery feature is actually a Novell kernel debugger but I'm not sure, madwifi use it for similiar reasons to mmiotrace I think..) > > > all that crap > > > } > > > > > > > > > But that's all speculation. Has anyone actually measured the pagefault > > > latency impact of this change? Message-Id: <[EMAIL PROTECTED]> Subject: [patch 20/38] Minor fault path optimization. Date: Fri, 27 Apr 2007 16:05:23 +0200 was a patch to do exactly that.. hch decided the feature wasn't useful and posted a patch to remove it.. > > That change has been in the mainline tree for nearly three months. All > these affected parties have left it until the eve of 2.6.24 to actually > tell us about it. This is causing me sympathy problems :( > Jan first complained on the 4th Decemeber last year, I'm just posting this now because Linus said send him a patch to revert regressions rather than just complain, I've prepared the patch to put back the old behaviour from 2.6.23. This was only brought to my notice this morning but I'm not going to let that stop me from trying to find a correct fix rather than just ripping the feature out.. I think we could apply the page fault cleanup patch I mentioned earlier on top of this patch and get back the 300 cycles and that would make people happy, it makes sense for mmiotrace to use kprobes hooks and not have to do this stuff directly but if that is what is wanted the mmiotrace guys can do it directly in the future. Dave. 
Re: pnpacpi : exceeded the max number of IO resources
> > Well, yes, the warning is actually new as well. Previously your kernel
> > just silently ignored 8 more mem resources than it does now it seems.
> >
> > Given that people are hitting these limits, it might make sense to just
> > do away with the warning for 2.6.24 again while waiting for the dynamic
> > code?
>
> Ping. Should these warnings be reverted for 2.6.24?

No. I don't think hiding this issue again is a good idea. I'd rather live with people complaining about an additional dmesg line.

-Len
Re: [PATCH 5/6] syslets: add generic syslets infrastructure
On Wednesday 09 January 2008 14:00:04 Zach Brown wrote: > > Firstly, why not just specify an address for the return value and be > > done with it? This infrastructure seems overkill, and you can always > > extend later if required. > > Sorry, which infrastructure? > > Providing the function and stack to return to? Sure, I could certainly > entertain the idea of not having syslet tasks return to userspace in the > first pass. Ingo sure seemed excited by the idea. > > Or do you mean the syscall return value ending up in the userspace > completion event ring? That's mostly about being able to wait for > pending syslets to complete. The latter. A ring is optimal for processing a huge number of requests, but if you're really going to be firing off syslet threads all over the place you're not going to be optimal anyway. And being able to point the return value to the stack or into some datastructure is way nicer to code (zero setup == easy to understand and easy to convert). For notification, see below. > > Secondly, you really should allow integration with an eventfd so you > > don't make the posix AIO mistake of providing a poll-incompatible > > interface. > > Yeah, this seems straight forward enough that I haven't made it an > initial priority. I'm sure it will be helpful for people who are stuck > integrating with entrenched software that wants to wait for pollable fds. Unfortunately, waiting for someone to write a killer app which uses your new API is the road to disappointment. The real target is convincing the handful of important apps (Samba, Apache, ...) to #ifdef around some small piece of code in order to get performance. And a mere single design wart could mean that never happens. Look at epoll, it's probably been the most successful and it's still damn niche. 
> For more flexible software, though, it's compelling to now be able to > aggregate waiting for completion of the existing waiting syscalls (poll, > epoll_wait, futexes, whatever) by issuing them as concurrent syslets. Is replacing epoll with syslets really going to win, even if you're writing apps from scratch? Anyway a fast notification mechanism is a different problem than syslets, and should be separated. > > Finally, and probably most alarmingly, AFAICT randomly changing TID will > > break all threaded programs, which means this won't be fitted into > > existing code bases, making it YA niche Linux-only API 8( > > I wonder if there isn't an opportunity to add a clone() flag which > juggles the association between TIDs and task_structs. I don't relish > the idea of investigating the life cycles of task_struct references that > derive from TIDs and seeing how those would race with a syslet blocking > and cloning, but, well, maybe that's what needs to be done. This must be solved, yet all avenues seem crawling with worms. Redirecting find_task_by_pid() to find the original and converting all the places where we return tids to userspace? Swapping tids when we clone? Duplicate tids, with only the non-syslet one being returned from find_task_by_pid()? > This all isn't my area of expertise, though, sadly. It would be swell > if someone wanted to look into it before I'm forced to learn yet another > weird corner of the kernel. Let's just tell Ingo it's impossible to solve :) Rusty.
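The eventfd point raised in this message can be illustrated from the user-space side. What follows is only a minimal sketch, assuming a Linux system that provides eventfd(2); the helper names `post_completions` and `wait_completions` are hypothetical and are not part of the syslet patches. It shows why a pollable descriptor is convenient: completions accumulate in the eventfd counter, and the waiter can use plain poll() alongside its other descriptors.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: signal that n operations have completed. */
static int post_completions(int efd, uint64_t n)
{
	/* Writing to an eventfd adds n to its kernel-side counter. */
	return write(efd, &n, sizeof(n)) == (ssize_t)sizeof(n) ? 0 : -1;
}

/*
 * Hypothetical helper: wait up to timeout_ms for completions.
 * Returns 1 and stores the count on success, 0 on timeout, -1 on error.
 */
static int wait_completions(int efd, int timeout_ms, uint64_t *count)
{
	struct pollfd pfd = { .fd = efd, .events = POLLIN };
	int ready = poll(&pfd, 1, timeout_ms);

	if (ready <= 0)
		return ready;	/* 0: timeout, -1: error */

	/* Reading returns the accumulated count and resets it to zero. */
	if (read(efd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
		return -1;
	return 1;
}
```

A caller would create the descriptor once with `eventfd(0, 0)` and could then add it to an existing poll or epoll set, which is the integration the message asks for.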
[PATCH][AGP] intel_agp: add new chipset ids
Dave, This one adds new pci ids for Intel intergrated graphics chipset, with gtt table access change on it and new gtt table size definition. Thanks. Signed-off-by: Zhenyu Wang <[EMAIL PROTECTED]> --- drivers/char/agp/agp.h |3 +++ drivers/char/agp/intel-agp.c | 31 ++- 2 files changed, 29 insertions(+), 5 deletions(-) diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h index b83824c..d132914 100644 --- a/drivers/char/agp/agp.h +++ b/drivers/char/agp/agp.h @@ -235,6 +235,9 @@ struct agp_bridge_data { #define I965_PGETBL_SIZE_512KB (0 << 1) #define I965_PGETBL_SIZE_256KB (1 << 1) #define I965_PGETBL_SIZE_128KB (2 << 1) +#define I965_PGETBL_SIZE_1MB (3 << 1) +#define I965_PGETBL_SIZE_2MB (4 << 1) +#define I965_PGETBL_SIZE_1_5MB (5 << 1) #define G33_PGETBL_SIZE_MASK(3 << 8) #define G33_PGETBL_SIZE_1M (1 << 8) #define G33_PGETBL_SIZE_2M (2 << 8) diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c index d879619..091e765 100644 --- a/drivers/char/agp/intel-agp.c +++ b/drivers/char/agp/intel-agp.c @@ -30,13 +30,16 @@ #define PCI_DEVICE_ID_INTEL_Q35_IG 0x29B2 #define PCI_DEVICE_ID_INTEL_Q33_HB 0x29D0 #define PCI_DEVICE_ID_INTEL_Q33_IG 0x29D2 +#define PCI_DEVICE_ID_INTEL_IGD_HB 0x2A40 +#define PCI_DEVICE_ID_INTEL_IGD_IG 0x2A42 #define IS_I965 (agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82946GZ_HB || \ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965G_1_HB || \ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965Q_HB || \ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965G_HB || \ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GM_HB || \ - agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GME_HB) + agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GME_HB || \ + agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_IGD_HB) #define IS_G33 (agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_G33_HB || \ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_Q35_HB || \ @@ -456,6 +459,15 @@ static void intel_i830_init_gtt_entries(void) case 
I965_PGETBL_SIZE_512KB: size = 512; break; + case I965_PGETBL_SIZE_1MB: + size = 1024; + break; + case I965_PGETBL_SIZE_2MB: + size = 2048; + break; + case I965_PGETBL_SIZE_1_5MB: + size = 1024 + 512; + break; default: printk(KERN_INFO PFX "Unknown page table size, " "assuming 512KB\n"); @@ -981,6 +993,7 @@ static int intel_i965_create_gatt_table(struct agp_bridge_data *bridge) struct aper_size_info_fixed *size; int num_entries; u32 temp; + int gtt_offset, gtt_size; size = agp_bridge->current_size; page_order = size->page_order; @@ -990,13 +1003,18 @@ static int intel_i965_create_gatt_table(struct agp_bridge_data *bridge) pci_read_config_dword(intel_private.pcidev, I915_MMADDR, ); temp &= 0xfff0; - intel_private.gtt = ioremap((temp + (512 * 1024)) , 512 * 1024); - if (!intel_private.gtt) - return -ENOMEM; + if (agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_IGD_HB) + gtt_offset = gtt_size = MB(2); + else + gtt_offset = gtt_size = KB(512); + + intel_private.gtt = ioremap((temp + gtt_offset) , gtt_size); + if (!intel_private.gtt) + return -ENOMEM; - intel_private.registers = ioremap(temp,128 * 4096); + intel_private.registers = ioremap(temp, 128 * 4096); if (!intel_private.registers) { iounmap(intel_private.gtt); return -ENOMEM; @@ -1884,6 +1902,8 @@ static const struct intel_driver_description { NULL, _g33_driver }, { PCI_DEVICE_ID_INTEL_Q33_HB, PCI_DEVICE_ID_INTEL_Q33_IG, 0, "Q33", NULL, _g33_driver }, + { PCI_DEVICE_ID_INTEL_IGD_HB, PCI_DEVICE_ID_INTEL_IGD_IG, 0, + "Intel Integrated Graphics Device", NULL, _i965_driver }, { 0, 0, 0, NULL, NULL, NULL } }; @@ -2073,6 +2093,7 @@ static struct pci_device_id agp_intel_pci_table[] = { ID(PCI_DEVICE_ID_INTEL_G33_HB), ID(PCI_DEVICE_ID_INTEL_Q35_HB), ID(PCI_DEVICE_ID_INTEL_Q33_HB), + ID(PCI_DEVICE_ID_INTEL_IGD_HB), { } }; -- 1.5.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please 
read the FAQ at http://www.tux.org/lkml/
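The size decoding that the patch extends can be tried out in isolation. The following is a stand-alone sketch, not the driver code: the encodings are copied from the agp.h hunk above, while the field mask `PGETBL_SIZE_FIELD` and the function name `gtt_size_kb` are assumptions made for illustration. The fallback mirrors the "assuming 512KB" default shown in the diff.

```c
#include <stdint.h>

/* Size encodings copied from the agp.h hunk of the patch. */
#define I965_PGETBL_SIZE_512KB	(0 << 1)
#define I965_PGETBL_SIZE_256KB	(1 << 1)
#define I965_PGETBL_SIZE_128KB	(2 << 1)
#define I965_PGETBL_SIZE_1MB	(3 << 1)
#define I965_PGETBL_SIZE_2MB	(4 << 1)
#define I965_PGETBL_SIZE_1_5MB	(5 << 1)

/* Assumption: the size field is the three bits spanned by the values above. */
#define PGETBL_SIZE_FIELD	(7 << 1)

/* Hypothetical helper: GTT size in KB for a given control register value. */
static unsigned int gtt_size_kb(uint32_t ctl)
{
	switch (ctl & PGETBL_SIZE_FIELD) {
	case I965_PGETBL_SIZE_128KB:
		return 128;
	case I965_PGETBL_SIZE_256KB:
		return 256;
	case I965_PGETBL_SIZE_512KB:
		return 512;
	case I965_PGETBL_SIZE_1MB:
		return 1024;
	case I965_PGETBL_SIZE_2MB:
		return 2048;
	case I965_PGETBL_SIZE_1_5MB:
		return 1024 + 512;
	default:
		/* Unknown encoding: fall back to 512KB, as the driver does. */
		return 512;
	}
}
```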
Re: [patch 1/1] Convert drivers in drivers/char/drm to use .unlocked_ioctl
> The drm drivers in this patch all used drm_ioctl to perform their > ioctl calls. The common function is converted to use lock_kernel() > and unlock_kernel() and the drivers are converted to use .unlocked_ioctl > NAK I've started looking at this already in the drm git tree, I'm going to provide both locked and unlocked paths for drivers to choose, as we need to audit the drivers on a per-driver basis, the other option is to provide wrappers in each driver to do the lock/unlock kernel and leave drm_ioctl alone.. I'll take a look kmalloc failure case sounds like a bug though.. Dave. > Signed-off-by: Kevin Winchester <[EMAIL PROTECTED]> > > --- > > I also noted that in the failed kmalloc case in drm_ioctl(), the function > immediately returns -ENOMEM, rather than following the error path that > calls atomic_dec(>ioctl_count);. I'm not sure if the ioctl_count > is just not important in the -ENOMEM case, or if this is a bug. > > drivers/char/drm/drmP.h |3 +-- > drivers/char/drm/drm_drv.c| 10 ++ > drivers/char/drm/i810_dma.c |2 +- > drivers/char/drm/i810_drv.c |2 +- > drivers/char/drm/i830_dma.c |2 +- > drivers/char/drm/i830_drv.c |2 +- > drivers/char/drm/i915_drv.c |2 +- > drivers/char/drm/mga_drv.c|2 +- > drivers/char/drm/r128_drv.c |2 +- > drivers/char/drm/radeon_drv.c |2 +- > drivers/char/drm/savage_drv.c |2 +- > drivers/char/drm/sis_drv.c|2 +- > drivers/char/drm/tdfx_drv.c |2 +- > drivers/char/drm/via_drv.c|2 +- > 14 files changed, 19 insertions(+), 18 deletions(-) > > Index: v2.6.24-rc7/drivers/char/drm/drmP.h > === > --- v2.6.24-rc7.orig/drivers/char/drm/drmP.h > +++ v2.6.24-rc7/drivers/char/drm/drmP.h > @@ -833,8 +833,7 @@ static inline int drm_mtrr_del(int handl > /* Driver support (drm_drv.h) */ > extern int drm_init(struct drm_driver *driver); > extern void drm_exit(struct drm_driver *driver); > -extern int drm_ioctl(struct inode *inode, struct file *filp, > - unsigned int cmd, unsigned long arg); > +extern long drm_ioctl(struct file *filp, unsigned int cmd, 
unsigned long > arg); > extern long drm_compat_ioctl(struct file *filp, >unsigned int cmd, unsigned long arg); > extern int drm_lastclose(struct drm_device *dev); > Index: v2.6.24-rc7/drivers/char/drm/drm_drv.c > === > --- v2.6.24-rc7.orig/drivers/char/drm/drm_drv.c > +++ v2.6.24-rc7/drivers/char/drm/drm_drv.c > @@ -438,7 +438,6 @@ static int drm_version(struct drm_device > /** > * Called whenever a process performs an ioctl on /dev/drm. > * > - * \param inode device inode. > * \param file_priv DRM file private. > * \param cmd command. > * \param arg user argument. > @@ -447,8 +446,7 @@ static int drm_version(struct drm_device > * Looks up the ioctl function in the ::ioctls table, checking for root > * previleges if so required, and dispatches to the respective function. > */ > -int drm_ioctl(struct inode *inode, struct file *filp, > - unsigned int cmd, unsigned long arg) > +long drm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > { > struct drm_file *file_priv = filp->private_data; > struct drm_device *dev = file_priv->head->dev; > @@ -458,6 +456,7 @@ int drm_ioctl(struct inode *inode, struc > int retcode = -EINVAL; > char *kdata = NULL; > > + lock_kernel(); > atomic_inc(>ioctl_count); > atomic_inc(>counts[_DRM_STAT_IOCTLS]); > ++file_priv->ioctl_count; > @@ -494,8 +493,10 @@ int drm_ioctl(struct inode *inode, struc > } else { > if (cmd & (IOC_IN | IOC_OUT)) { > kdata = kmalloc(_IOC_SIZE(cmd), GFP_KERNEL); > - if (!kdata) > + if (!kdata) { > + unlock_kernel(); > return -ENOMEM; > + } > } > > if (cmd & IOC_IN) { > @@ -520,6 +521,7 @@ int drm_ioctl(struct inode *inode, struc > atomic_dec(>ioctl_count); > if (retcode) > DRM_DEBUG("ret = %x\n", retcode); > + unlock_kernel(); > return retcode; > } > > Index: v2.6.24-rc7/drivers/char/drm/i810_dma.c > === > --- v2.6.24-rc7.orig/drivers/char/drm/i810_dma.c > +++ v2.6.24-rc7/drivers/char/drm/i810_dma.c > @@ -115,7 +115,7 @@ static int i810_mmap_buffers(struct file > static const struct file_operations 
i810_buffer_fops = { > .open = drm_open, > .release = drm_release, > - .ioctl = drm_ioctl, > + .unlocked_ioctl = drm_ioctl, > .mmap = i810_mmap_buffers, > .fasync = drm_fasync, > }; > Index: v2.6.24-rc7/drivers/char/drm/i810_drv.c >
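The kmalloc failure case noted in the patch comments is the usual argument for a single exit path. The sketch below is a user-space illustration under stated assumptions, not the DRM code and not the fix that was eventually applied: `sketch_ioctl`, `ioctl_count` and `lock_depth` are hypothetical stand-ins for the dispatcher, the atomic counter and the lock_kernel()/unlock_kernel() pair. Every return goes through one label, so the counter and the lock are released on the allocation failure path too.

```c
#include <errno.h>
#include <stdlib.h>

static int ioctl_count;	/* stands in for the driver's atomic counter */
static int lock_depth;	/* stands in for lock_kernel()/unlock_kernel() */

/*
 * Hypothetical dispatcher: every exit goes through "out", so the counter
 * and the lock are always released, including on allocation failure.
 */
static long sketch_ioctl(size_t arg_size, int simulate_alloc_failure)
{
	long retcode = 0;
	void *kdata = NULL;

	lock_depth++;
	ioctl_count++;

	if (arg_size) {
		kdata = simulate_alloc_failure ? NULL : malloc(arg_size);
		if (!kdata) {
			retcode = -ENOMEM;
			goto out;
		}
	}

	/* ... copy in, call the handler, copy out ... */

out:
	free(kdata);	/* free(NULL) is a no-op */
	ioctl_count--;
	lock_depth--;
	return retcode;
}
```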
[RFC] x86: Add oops_begin, oops_end to X86_32
Some more work is needed on this patch, but I'm looking for some feedback about the general direction. X86_64's implementation seems nicer and it would be useful to use a common base for further unification in the oops handling. Modify the X86_32 implementation of die() using helpers oops_begin()/oops_end(). Small whitespace change in traps_64.c for easier comparison between the two. Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]> --- arch/x86/kernel/traps_32.c | 137 +++- arch/x86/kernel/traps_64.c | 11 +-- 2 files changed, 76 insertions(+), 72 deletions(-) diff --git a/arch/x86/kernel/traps_32.c b/arch/x86/kernel/traps_32.c index 5f2b38e..a4092ed 100644 --- a/arch/x86/kernel/traps_32.c +++ b/arch/x86/kernel/traps_32.c @@ -352,10 +352,61 @@ int is_valid_bugaddr(unsigned long ip) return ud2 == 0x0b0f; } -static int die_counter; +static raw_spinlock_t die_lock = __RAW_SPIN_LOCK_UNLOCKED; +static int die_owner = -1; +static unsigned int die_nest_count; + +unsigned long __kprobes oops_begin(void) +{ + int cpu; + unsigned long flags; + + oops_enter(); + + raw_local_irq_save(flags); + cpu = smp_processor_id(); + /* racy, but better than risking deadlock. */ + if (!__raw_spin_trylock(_lock) && cpu != die_owner) { + __raw_spin_lock(_lock); + } + die_nest_count++; + die_owner = cpu; + console_verbose(); + bust_spinlocks(1); + return flags; +} + +void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr) +{ + die_owner = -1; + bust_spinlocks(0); + die_nest_count--; + if (!die_nest_count) + /* Nest count reaches zero, release the lock. 
*/ + __raw_spin_unlock(_lock); + raw_local_irq_restore(flags); + + if (!regs) { + oops_exit(); + return; + } + + if (kexec_should_crash(current)) + crash_kexec(regs); + + if (in_interrupt()) + panic("Fatal exception in interrupt"); + + if (panic_on_oops) + panic("Fatal exception"); + + oops_exit(); + do_exit(signr); +} int __kprobes __die(const char * str, struct pt_regs * regs, long err) { + static int die_counter; unsigned long sp; unsigned short ss; @@ -371,24 +422,23 @@ int __kprobes __die(const char * str, struct pt_regs * regs, long err) #endif printk("\n"); - if (notify_die(DIE_OOPS, str, regs, err, - current->thread.trap_no, SIGSEGV) != - NOTIFY_STOP) { - show_registers(regs); - /* Executive summary in case the oops scrolled away */ - sp = (unsigned long) (>sp); - savesegment(ss, ss); - if (user_mode(regs)) { - sp = regs->sp; - ss = regs->ss & 0x; - } - printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip); - print_symbol("%s", regs->ip); - printk(" SS:ESP %04x:%08lx\n", ss, sp); - return 0; - } else { + if (notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no, + SIGSEGV) == NOTIFY_STOP) return 1; + + show_registers(regs); + add_taint(TAINT_DIE); + /* Executive summary in case the oops scrolled away */ + sp = (unsigned long) (>sp); + savesegment(ss, ss); + if (user_mode(regs)) { + sp = regs->sp; + ss = regs->ss & 0x; } + printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip); + print_symbol("%s", regs->ip); + printk(" SS:ESP %04x:%08lx\n", ss, sp); + return 0; } /* @@ -397,58 +447,15 @@ int __kprobes __die(const char * str, struct pt_regs * regs, long err) */ void die(const char * str, struct pt_regs * regs, long err) { - static struct { - raw_spinlock_t lock; - u32 lock_owner; - int lock_owner_depth; - } die = { - .lock = __RAW_SPIN_LOCK_UNLOCKED, - .lock_owner = -1, - .lock_owner_depth = 0 - }; - unsigned long flags; + unsigned long flags = oops_begin(); - oops_enter(); - - if (die.lock_owner != raw_smp_processor_id()) { - console_verbose(); - 
raw_local_irq_save(flags); - __raw_spin_lock(); - die.lock_owner = smp_processor_id(); - die.lock_owner_depth = 0; - bust_spinlocks(1); - } else - raw_local_irq_save(flags); - - if (++die.lock_owner_depth < 3) { + if (!user_mode(regs)) report_bug(regs->ip, regs); - if (__die(str, regs, err)) - regs = NULL; - } else { - printk(KERN_EMERG
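The owner and nest-count logic of oops_begin()/oops_end() can be followed more easily outside the kernel. This is a user-space sketch using C11 atomics in place of the raw spinlock; the `sketch_` names and the explicit `cpu` argument are assumptions for illustration, and the interrupt, console and panic handling from the patch is left out. It mirrors the patch's logic as posted, including the unsynchronized read of the owner that the patch itself calls racy.

```c
#include <stdatomic.h>

static atomic_flag die_lock = ATOMIC_FLAG_INIT;
static int die_owner = -1;
static unsigned int die_nest_count;

/* Take the lock unless this (simulated) CPU already holds it. */
static void sketch_oops_begin(int cpu)
{
	/* Racy read of die_owner, as in the patch: better than deadlock. */
	if (atomic_flag_test_and_set(&die_lock) && cpu != die_owner) {
		/* Held by another CPU: spin until it is released. */
		while (atomic_flag_test_and_set(&die_lock))
			;
	}
	die_nest_count++;
	die_owner = cpu;
}

/* Drop one nesting level; release the lock when the last level exits. */
static void sketch_oops_end(void)
{
	die_owner = -1;
	die_nest_count--;
	if (!die_nest_count)
		atomic_flag_clear(&die_lock);
}
```

A nested oops on the same CPU only bumps the count, and the lock is dropped when the outermost level ends.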
Re: [PATCH] call sysrq_timer_list_show from a workqueue
On Wed, 9 Jan 2008 14:20:18 +1100 Rusty Russell <[EMAIL PROTECTED]> wrote: > On Wednesday 09 January 2008 11:21:59 Andrew Morton wrote: > > The string handling in here has become a bit scruffy. > > Yes, that patch also evokes a const warning. Fixed below. No patch was included. > I assume you've > queued these because you're thinking of applying them before 2.6.24? I'd say > only modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch > warrants that (the other is unlikely and not a regression). Actually I was thinking 2.6.25 on both. Kyle McMartin reports sysrq_timer_list_show() can hit the module mutex; these paths don't need to though, since we long ago changed all the module list manipulation to occur via stop_machine(). Disabling preemption is enough. Ah. sysrq_timer_list_show() is called from interrupt. OK, 2.6.24 seems reasonable. > > afacit the `namebuf[KSYM_NAME_LEN - 1] = 0;' would be unneeded if we were > > to use strlcpy() and I suspect the `namebuf[0] = 0;' isn't needed either. > > > > And the use of strlcpy() means we don't need to subtract 1 from > > KSYM_NAME_LEN and we don't need to fret about weird strncpy semantics when > > the input string is too large. > > > > > > And the fact that incoming arg `namebuf' MUST point at a > > KSYM_NAME_LEN-sized buffer could be better communicated by using a > > dedicated struct for this, or by giving the arg a type of `char > > namebuf[KSYM_NAME_LEN]'. Or by adding a comment. Or by just ignoring > > me and doing something more useful. 
> > Or better, rework all the name lookup interfaces, rather than having: > > struct module *module_text_address(unsigned long addr); > struct module *__module_text_address(unsigned long addr); > int is_module_address(unsigned long addr); > int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type, > char *name, char *module_name, int *exported); > char *module_address_lookup(unsigned long addr, > unsigned long *symbolsize, > unsigned long *offset, > char **modname, > char *namebuf); > int lookup_module_symbol_name(unsigned long addr, char *symname); > int lookup_module_symbol_attrs(unsigned long addr, unsigned long *size, > unsigned long *offset, char *modname, char > *name); > unsigned long module_kallsyms_lookup_name(const char *name); > > unsigned long kallsyms_lookup_name(const char *name); > extern int kallsyms_lookup_size_offset(unsigned long addr, > unsigned long *symbolsize, > unsigned long *offset); > const char *kallsyms_lookup(unsigned long addr, > unsigned long *symbolsize, > unsigned long *offset, > char **modname, char *namebuf); > extern int sprint_symbol(char *buffer, unsigned long address); > extern void __print_symbol(const char *fmt, unsigned long address); > int lookup_symbol_name(unsigned long addr, char *symname); > int lookup_symbol_attrs(unsigned long addr, unsigned long *size, > unsigned long *offset, char *modname, char *name); Yes, it could all do with a revisit.
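The strlcpy() remarks quoted in this thread come down to one property: the copy always terminates the destination, so no manual `namebuf[KSYM_NAME_LEN - 1] = 0` is needed. Since strlcpy() is not available in every C library, the sketch below implements the same semantics under a hypothetical name, `bounded_copy`; `NAME_LEN` is a deliberately tiny stand-in for KSYM_NAME_LEN so that truncation is easy to see.

```c
#include <string.h>

#define NAME_LEN 8	/* tiny stand-in for KSYM_NAME_LEN, for illustration */

/*
 * strlcpy-style copy: always NUL-terminates dst (when size > 0) and
 * returns strlen(src), so the caller can detect truncation.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

A return value greater than or equal to the buffer size tells the caller that the name was truncated, which strncpy() cannot report.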
Re: [PATCH 00/11] writeback bug fixes and simplifications
On Sat, Dec 29, 2007 at 03:56:59PM +0100, Hans-Peter Jansen wrote: > Am Freitag, 28. Dezember 2007 schrieb Sascha Warner: > > Andrew Morton wrote: > > > On Thu, 27 Dec 2007 23:08:40 +0100 Sascha Warner <[EMAIL PROTECTED]> > wrote: > > >> Hi, > > >> > > >> I applied your patches to 2.6.24-rc6-mm1, but now I am faced with one > > >> pdflush often using 100% CPU for a long time. There seem to be some > > >> rare pauses from its 100% usage, however. > > >> > > >> On ~23 minutes uptime i have ~19 minutes pdflush runtime. > > >> > > >> This is on E6600, x86_64, 2 Gig RAM, SATA HDD, running on gentoo > > >> ~x64_64 > > >> > > >> Let me know if you need more info. > > > > > > (some) cc's restored. Please, always do reply-to-all. > > > > Hi Wu, > > Sascha, if you want to address Fengguang by his first name, note that > chinese and bavarians (and some others I forgot now, too) typically use the > order: > > lastname firstname > > when they spell their names. Another evidence is, that the name Wu is a > pretty common chinese family name. > > Fengguang, if it's the other way around, correct me please (and I'm going to > wear a big brown paper bag for the rest of the day..). You are right. We normally do "Fengguang" or "Mr. Wu" :-) For LKML the first name is less ambiguous. Thanks, Fengguang
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
On Wed, 9 Jan 2008 03:17:37 + (GMT) Dave Airlie <[EMAIL PROTECTED]> wrote: > > > On Wed, 9 Jan 2008 02:34:46 + (GMT) Dave Airlie <[EMAIL PROTECTED]> > > wrote: > > > > > > > > [This an initial RFC but I'd like to have this patch in before 2.6.24 > > > goes > > > final as it really breaks this useful feature] > > > > > > mmiotrace the MMIO access tracer used to reverse engineer binary blobs > > > used this notifier interface and is planned on being pushed upstream. > > > > > > Having users able to just use the tracer module without having to rebuild > > > their kernel to add in a page fault handler hack means we get a lot > > > greater coverage for reverse engineering efforts. > > > > Sorry, but that's a really really small benefit. This very small number of > > fairly (or very) technical users will be able to work out a way of getting > > this to work in 2.6.24. And in 2.6.25 with a merged mmiotrace we can do > > something different. > > mmiotrace isn't targetted at fairly or technical users, its whole > usefulness is that you don't need a kernel re-build, the distro kernels > all contain enough support for us to just get a user to grab mmiotrace, > run make and get a trace so in my eyes this a major feature regression > to have to go back to custom kernel builds... An alternative might be to come up with something decent and target 2.6.24.x > > It's a modest convenience to a very small number of people. And the cost? > > Multiple functions calls and multiple cachelines hit for every pagefault > > on, what? Tens of millions of machines? > > Which has been happening for how many months? perhaps if we merge > mmiotrace in 2.6.25 we can clean up this function, otherwise I just count > it as a feature regression... We put the crappy code back in for 2.6.24 then take it out immediately after 2.6.24 and put something else in to support mmiotrace and perhaps the other new mystery features to which you refer below. hm. 
> > pagefault it populates a struct on the stack, passes that around for a > > while, does a bit of RCU stuff only to find that there was nothing to do. > > Surely we should at least be doing something along the lines of > > > > if (unlikely(notify_page_fault_chain.notifier_call != NULL)) { > > all that crap > > } > > > > > > But that's all speculation. Has anyone actually measured the pagefault > > latency impact of this change? ^^ this. > > > +/* > > > + * These are only here because kprobes.c wants them to implement a > > > + * blatant layering violation. Will hopefully go away soon once all > > > + * architectures are updated. > > > + */ > > > +static inline int register_page_fault_notifier(struct notifier_block *nb) > > > +{ > > > + return 0; > > > +} > > > +static inline int unregister_page_fault_notifier(struct notifier_block > > > *nb) > > > +{ > > > + return 0; > > > +} > > > + > > > > And this doesn't look very good either. For how long did this fixme remain > > unfixed? > > > > > > So I'd suggest that we leave things as they are for 2.6.24 - mmiotrace > > people will work something out, I'm sure. For 2.6.25 if we merge mmiotrace > > we can look at doing something which is vaguely efficient and tasteful. > > > > I just reverted Christophs patch I didn't try and work out if the old code > had problems no one has fixed... > > So all distros with 2.6.24 kernels are useless to mmiotrace I don't see > why leaving things as is until a suitable replacement mechanism can be > used.. I've heard others give out about this also madwifi and SuSE kernel > folks... That change has been in the mainline tree for nearly three months. All these affected parties have left it until the eve of 2.6.24 to actually tell us about it. 
This is causing me sympathy problems :(
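The empty-chain check suggested in this message can be sketched in user space. This is an illustration under stated assumptions, not the kernel's notifier implementation: the `struct notifier` layout, the function names and the sample callback `claim_low_addresses` are all hypothetical, and locking and RCU are omitted entirely. The point is only that an unregistered chain costs a single pointer test per fault.

```c
#include <stddef.h>

#define unlikely(x) __builtin_expect(!!(x), 0)	/* GCC/Clang builtin */

struct notifier {
	int (*call)(struct notifier *nb, unsigned long addr);
	struct notifier *next;
};

static struct notifier *fault_chain;	/* NULL when nobody is registered */
static unsigned long slow_path_entries;	/* counts walks of the chain */

static void register_fault_notifier(struct notifier *nb)
{
	nb->next = fault_chain;
	fault_chain = nb;
}

/* Sample callback for illustration: claims faults below 0x1000. */
static int claim_low_addresses(struct notifier *nb, unsigned long addr)
{
	(void)nb;
	return addr < 0x1000;
}

/* Returns nonzero if some notifier claimed the fault. */
static int notify_fault(unsigned long addr)
{
	struct notifier *nb;

	/* Fast path: one pointer test when the chain is empty. */
	if (unlikely(fault_chain != NULL)) {
		slow_path_entries++;
		for (nb = fault_chain; nb; nb = nb->next)
			if (nb->call(nb, addr))
				return 1;
	}
	return 0;
}
```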
[PATCH] x86_64: cleanup setup_node_zones called by paging_init
setup_node_zones calculates some variables but only uses them when CONFIG_FLAT_NODE_MEM_MAP is set, so move the #ifdef to avoid the unnecessary calculation. Also make the function static.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

Index: linux-2.6/arch/x86/mm/numa_64.c === --- linux-2.6.orig/arch/x86/mm/numa_64.c +++ linux-2.6/arch/x86/mm/numa_64.c @@ -227,15 +227,16 @@ void __init setup_node_bootmem(int nodei srat_reserve_add_area(nodeid); #endif node_set_online(nodeid); -} +} +#ifdef CONFIG_FLAT_NODE_MEM_MAP /* Initialize final allocator for a zone */ -void __init setup_node_zones(int nodeid) -{ +static void __init setup_node_zones(int nodeid) +{ unsigned long start_pfn, end_pfn, memmapsize, limit; - start_pfn = node_start_pfn(nodeid); - end_pfn = node_end_pfn(nodeid); + start_pfn = node_start_pfn(nodeid); + end_pfn = node_end_pfn(nodeid); Dprintk(KERN_INFO "Setting up memmap for node %d %lx-%lx\n", nodeid, start_pfn, end_pfn); @@ -244,14 +245,13 @@ void __init setup_node_zones(int nodeid) memory. */ memmapsize = sizeof(struct page) * (end_pfn-start_pfn); limit = end_pfn << PAGE_SHIFT; -#ifdef CONFIG_FLAT_NODE_MEM_MAP - NODE_DATA(nodeid)->node_mem_map = - __alloc_bootmem_core(NODE_DATA(nodeid)->bdata, - memmapsize, SMP_CACHE_BYTES, - round_down(limit - memmapsize, PAGE_SIZE), + NODE_DATA(nodeid)->node_mem_map = + __alloc_bootmem_core(NODE_DATA(nodeid)->bdata, + memmapsize, SMP_CACHE_BYTES, + round_down(limit - memmapsize, PAGE_SIZE), limit); +} #endif -} void __init numa_init_array(void) { @@ -570,9 +570,11 @@ void __init paging_init(void) sparse_memory_present_with_active_regions(MAX_NUMNODES); sparse_init(); +#ifdef CONFIG_FLAT_NODE_MEM_MAP for_each_online_node(i) { - setup_node_zones(i); + setup_node_zones(i); } +#endif free_area_init_nodes(max_zone_pfns); }
RE: [patch] split MMC_CAP_4_BIT_DATA
Hi all,

I'd like to say something about this issue. Currently, the Blackfin on-chip SD host supports only 1-bit MMC, while it supports both 1-bit and 4-bit SD/SDIO. We want our driver to support both 1-bit MMC and 4-bit SD/SDIO, but the current MMC driver framework only allows us to set one bus width, either 1-bit or 4-bit. So to handle our case, we need a more flexible mechanism to inform the upper common driver of our situation.

Cliff

-Original Message-
From: Bryan Wu [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 09, 2008 10:33 AM
To: Pierre Ossman; [EMAIL PROTECTED]
Cc: Mike Frysinger; linux-kernel@vger.kernel.org
Subject: Re: [patch] split MMC_CAP_4_BIT_DATA

On Jan 9, 2008 4:49 AM, Pierre Ossman <[EMAIL PROTECTED]> wrote:
> On Tue, 8 Jan 2008 14:40:49 -0500
> Mike Frysinger <[EMAIL PROTECTED]> wrote:
>
> >
> > i dont understand what's confusing. the Blackfin on chip host
> > controller only supports 1-bit MMC, but it supports 4-bit SD/SDIO.
> > this is a fact. while it may be a stupid decision, it is what it
> > is, and i need the framework made more flexible in order to get the
> > Blackfin driver merged cleanly. we do software for hardware, we dont do hardware.
>
> Well, since I've seen no _hardware_ differences between 4-bit MMC and 4-bit SD, "support" in this case must me "vendor will guarantee it works". And that is not the kind of "support" that needs a distinction in the code.
>
> So, again, if you feel that there is a hardware difference between 4-bit MMC and 4-bit SD then please elaborate as it is my understanding that they are identical.
>

As Mike said, the reason to split this flag is a limitation of the Blackfin on-chip SDIO controller. Cliff has been working on it for a long time, so I pulled him in. Hope he can clarify the confusing things.

Thanks
-Bryan Wu
Re: [PATCH 0/4] add task handling notifier
On Tue, 2008-01-08 at 18:24 -0800, Matt Helsley wrote:
> On Sun, 2007-12-23 at 12:26 +0000, Christoph Hellwig wrote:
> > On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> > > With more and more sub-systems/sub-components leaving their footprint
> > > in task handling functions, it seems reasonable to add notifiers that
> > > these components can use instead of having them all patch themselves
> > > directly into core files.
> >
> > I agree that we probably want something like this. As do some others,
> > so we already had a few attempts at similar things. The first one
> > is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> > includes allocating per-task data for its users. Then also from SGI
> > there has been a simplified version called pnotify that's also available
> > from the website above.
> >
> > Later Matt Helsley had something called "Task Watchers" which lwn has
> > an article on: http://lwn.net/Articles/208117/.
>
> Apologies for the late reply -- I haven't had internet access for the
> last few weeks.
>
> > For some reason neither ever made a lot of progress (performance
> > problems?).
>
> Yeah. Some discussion on measuring the performance of Task Watchers:
> http://thread.gmane.org/gmane.linux.lse/4698
>
> The requirements for Task Watchers were:
>
>	Allow sleeping in most/all notifier functions in these paths:
>		fork
>		exec
>		exit
>		change [re][ug]id
>	No performance overhead
>	One "chain" per path ("I only care about exec().")
>	Easy to use
>	Scales to large numbers of CPUs
>	Useful to make most in-tree code more readable. Task Watchers took
>	direct calls to these pieces of code out of the fork/exec/exit paths:
>		audit
>		semundo
>		cpusets
>		mempolicy
>		trace irqflags
>		lockdep
>		keys (for processes -- not for thread groups)
>		process events connector
>	Useful for loadable modules
>
> Performance overhead in microbenchmarks was measurable at around 1% (see
> the URL above). Overhead on benchmarks like kernbench on the other hand
> was in the noise margins (which were around 1.6%) and hence I couldn't
> determine the overhead there.
>
> I never got the loadable module part completely working due to races
> between notifier functions and the module unload path. The solution to
> the races seemed to require adding more overhead to the notifier
> function paths (SRCU-like grace periods).
>
> I stopped pushing the patch set because I hadn't found any new
> optimizations to offset the overheads while still meeting all the
> requirements and Andrew still felt that the "make it more readable"
> argument was not sufficient to justify its inclusion.

Oops. It's been nearly two years so I've forgotten exactly where Task Watchers v2 was when I stopped pushing it. After a bit more searching I found a more recent posting:
http://lkml.org/lkml/2006/12/14/384

And here's why I think the microbenchmark results improved to the point there was a small performance improvement over mainline:
http://lkml.org/lkml/2006/12/19/124

I seem to recall kernbench was still too noisy to tell. The patch allowing modules to register Task Watchers still isn't posted there for the reasons I've already described.

Cheers,
	-Matt Helsley
Re: regression: 100% io-wait with 2.6.24-rcX
On Mon, Jan 07, 2008 at 02:40:13PM +0100, Joerg Platte wrote: > Am Montag, 7. Januar 2008 schrieb Peter Zijlstra: > > On Mon, 2008-01-07 at 14:24 +0100, Joerg Platte wrote: > > > > This is from: 2.6.24-rc7 > > > > > kernel: pdflush D f41c2f14 0 18822 2 > > > kernel:f673f000 0046 0286 f41c2f14 f5194ce0 0286 > > > 0286 f41c2f14 kernel:00175279 f41c2f6c c0271f6c > > > f5ff363c f5ff3644 c0354a90 c0354a90 kernel:00175279 c0123251 > > > f5194b80 c03546c0 c0271f67 6c666470 00687375 kernel: Call Trace: > > > kernel: [] schedule_timeout+0x6e/0x8b > > > kernel: [] process_timeout+0x0/0x5 > > > kernel: [] schedule_timeout+0x69/0x8b > > > kernel: [] __sched_text_start+0x3a/0x70 > > > kernel: [] congestion_wait+0x4e/0x62 > > > kernel: [] autoremove_wake_function+0x0/0x33 > > > kernel: [] pdflush+0x0/0x1bf > > > kernel: [] wb_kupdate+0x8c/0xd1 > > > kernel: [] pdflush+0x0/0x1bf > > > kernel: [] pdflush+0x11b/0x1bf > > > kernel: [] wb_kupdate+0x0/0xd1 > > > kernel: [] kthread+0x36/0x5d > > > kernel: [] kthread+0x0/0x5d > > > kernel: [] kernel_thread_helper+0x7/0x10 > > > kernel: === > > > > What filesystem are you using? > > Here you can see all currently mounted filesystems: > > /dev/sda6 on / type ext3 (rw,noatime,errors=remount-ro,acl) > tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755) > proc on /proc type proc (rw,noexec,nosuid,nodev) > sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) > procbususb on /proc/bus/usb type usbfs (rw) > udev on /dev type tmpfs (rw,mode=0755) > tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) > devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620) > fusectl on /sys/fs/fuse/connections type fusectl (rw) > /dev/sda7 on /tmp type ext2 (rw,noatime,errors=remount-ro,acl) > /dev/sda8 on /export type ext3 (rw,noatime,errors=remount-ro,acl) > /dev/sda1 on /winxp type ntfs (rw,umask=002,gid=1,nls=utf8) So they are ext3/ext2/ntfs. What if you umount ntfs? and ext2 if possible? 
Re: Kernel Oops?
On Jan 8, 2008 9:02 PM, Alan Cox <[EMAIL PROTECTED]> wrote:
> > Except this time when rebooting the machine i got a kernel oops
> > message and it didn't boot completely. I could not copy it but I did
> > take a picture and now I have re-written the screen here(sorry about
>
> That is interesting - that sort of error usually points at memory
> corruption and early on tends to point at hardware (but not always). What
> hardware is in this system and does it have over 4GB of RAM?

There are 2GB of RAM, the motherboard is a DFI, and it has a dual-core Intel CPU. If you need specifics I could look them up.
Re: [PATCH 00 of 10] x86: unify asm/pgtable.h
Andi Kleen wrote:
> > Yeah, that may be true, but this particular tree is weird, and I'm
> > trying to understand what's going on here. Specifically, 64-bit
> > ioremap()s *don't* set _PAGE_GLOBAL, which appears to be an accident
> > resulting from the strange definitions of __PAGE_KERNEL_* vs
> > PAGE_KERNEL_*.
>
> ioremap() should set G

agreed.

> > For example, ioremap_64.c:__ioremap() creates a vma for the io mapping,
> > and explicitly sets _PAGE_GLOBAL in the vma's version of pgprot - but
> > then it calls ioremap_page_range() to actually create the mapping,
> > which ends up making a non-global mapping, because it's rolling its own
> > version of PAGE_KERNEL by using pgprot(__PAGE_KERNEL) - which is not
> > the actual definition of PAGE_KERNEL.
>
> That should not really matter because ioremap_change_attr()->c_p_a is
> only called when flags is != 0 and that means it is already different
> from PAGE_KERNEL.
>
> > I think there's a bug around here, but I think it's currently being
> > hidden
>
> There's one Jan pointed out: iounmap does not subtract the guard page
> size so it ends up resetting one page too much. That is probably what
> causes your problem. But again you should be passing in G in the first
> place.
>
> -Andi
>
> Here was Jan's patch; it incidentally fixes the G problem too

OK, great. Ingo, that means we can use this and go back to folding _PAGE_GLOBAL into __PAGE_KERNEL_*. Well, at least give it a try.

    J

> snip
>
> Additionally I found it necessary to fix ioremap_64.c's use of
> change_page_attr_addr():
>
> --- a/arch/x86/mm/ioremap_64.c
> +++ b/arch/x86/mm/ioremap_64.c
> @@ -48,7 +48,7 @@ ioremap_change_attr(unsigned long phys_a
>  	 * Must use a address here and not struct page because the phys addr
>  	 * can be a in hole between nodes and not have an memmap entry.
>  	 */
> -	err = change_page_attr_addr(vaddr,npages,__pgprot(__PAGE_KERNEL|flags));
> +	err = change_page_attr_addr(vaddr,npages,MAKE_GLOBAL(__PAGE_KERNEL|flags));
>  	if (!err)
>  		global_flush_tlb();
>  }
> @@ -199,7 +199,7 @@ void iounmap(volatile void __iomem *addr
>
>  	/* Reset the direct mapping. Can block */
>  	if (p->flags >> 20)
> -		ioremap_change_attr(p->phys_addr, p->size, 0);
> +		ioremap_change_attr(p->phys_addr, get_vm_area_size(p), 0);
>
>  	/* Finally remove it */
>  	o = remove_vm_area((void *)addr);
>
> Other extra changes I had in my version could possibly be counted as
> enhancements...
>
> Jan
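The iounmap() hunk in this message turns on one piece of arithmetic: the size recorded for the vm area includes a trailing guard page, so passing it unchanged resets one page too many. A minimal userspace sketch of that subtraction follows. `struct vm_area_sketch`, `SKETCH_PAGE_SIZE` and `mapped_size()` are hypothetical names made up for illustration, and the 4 KiB page size is an assumption; this is not the kernel's own definition of get_vm_area_size().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the kernel's per-area bookkeeping. */
struct vm_area_sketch {
	unsigned long phys_addr;
	size_t size;		/* mapping size plus one guard page */
};

/* Size of the real mapping, with the trailing guard page subtracted. */
static size_t mapped_size(const struct vm_area_sketch *area)
{
	assert(area->size >= SKETCH_PAGE_SIZE);
	return area->size - SKETCH_PAGE_SIZE;
}
```

For an area recorded as three pages, `mapped_size()` reports two, which is the range the attribute reset should cover.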
Re: The ext3 way of journalling
On Jan 08, 2008, at 15:51:53, Andi Kleen wrote:
> Theodore Tso <[EMAIL PROTECTED]> writes:
>> Now, there are good reasons for doing periodic checks every N mounts
>> and after M months. And it has to do with PC class hardware. (Ted's
>> aphorism: "PC class hardware is cr*p").
>
> If these reasons are good ones (some skepticism here) then the correct
> way to really handle this would be to do regular background scrubbing
> during runtime; ideally with metadata checksums so that you can
> actually detect all corruption.

Poor man's background scrubbing:

(A) Use LVM like virtually all modern distros offer
(B) Leave some extra space in your LVM volume group (enough for 1 snapshot over the time it takes to do an FSCK).
(C) Periodically run the following scriptlet:

    set -e
    START="$(date +'%Y%m%d%H%M%S')"
    lvcreate -s -n "${VOLUME}-snap" "${VG}/${VOLUME}"
    if nice +20 fsck -fy "/dev/mapper/${VG}_${VOLUME}-snap"; then
        echo 'Background scrubbing succeeded!'
        tune2fs -T "${START}" "/dev/mapper/${VG}_${VOLUME}"
    else
        echo 'Background scrubbing failed! Reboot to fsck soon!'
        tune2fs -C 16383 -T "19000101" "/dev/mapper/${VG}_${VOLUME}"
    fi
    lvremove "${VG}/${VOLUME}-snap"

Basically you can fsck the offline snapshot in the background. If it succeeds you can adjust the "last checked" date to the time when the snapshot was taken and if it fails you can schedule an FSCK at next reboot (and possibly remount the filesystem read-only or reboot immediately). You can do the same thing for your /boot volume, although you probably have to manually use dmsetup since most bootloaders can't interpret LVM volumes.

I've always been surprised that distros like RedHat which automatically use LVM don't stuff this in their weekly or monthly checks on desktop systems. User experience could also be dramatically improved with automated smartd configuration and user-interactive logging and warning messages.

> But since fsck is so slow and disks are so big this whole thing is a
> ticking time bomb now. e.g. it is not uncommon to require tens of
> minutes or even hours of fsck time and some server that reboots only
> every few months will eat that when it happens to reboot. This means
> you get a quite long downtime.

My servers all have an "interval-between-checks" of 2-6 weeks and are configured to run nice +20 background "fsck" checks during off-hours between once every few days and once every few weeks. I also have the "max mount count" numbers set to primes between 7 and 37 (depending on the filesystem) so that troubled or frequently-rebooted systems are more frequently verified.

The end result is that I almost never have the dreaded 4-hour-fsck-on-boot problem. A drive has certainly been fscked within the last few weeks of operation, and multiple large filesystems will all be fscked at the same boot only very rarely (once every lcm of their max-mount-counts).

Cheers,
Kyle Moffett
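The point about prime max-mount-counts can be checked with a little arithmetic. Assuming two filesystems are mounted once per boot and their counters start together, they reach their limits on the same boot once every least common multiple of the two counts. The sketch below is only an illustration of that arithmetic; `gcd_ul()` and `coincidence_period()` are hypothetical helper names, not part of e2fsprogs.

```c
#include <assert.h>

/* Greatest common divisor by Euclid's algorithm. */
static unsigned long gcd_ul(unsigned long a, unsigned long b)
{
	while (b != 0) {
		unsigned long t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/*
 * Number of mounts between boots on which both filesystems hit their
 * max mount count together: the least common multiple of the counts.
 * Assumes both are mounted once per boot with counters in step.
 */
static unsigned long coincidence_period(unsigned long count_a,
					unsigned long count_b)
{
	assert(count_a > 0 && count_b > 0);
	return count_a / gcd_ul(count_a, count_b) * count_b;
}
```

With counts of 7 and 37 the two forced checks coincide once every 259 mounts, whereas counts of 20 and 30 would coincide every 60, which is why coprime (for instance prime) counts spread the checks out.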
Re: [PATCH 0/4] add task handling notifier
On Tue, 08 Jan 2008 18:47:00 -0800 Matt Helsley <[EMAIL PROTECTED]> wrote:
> > > > ...
> > > Am I to conclude then that there's no point in addressing the issues
> > > other people pointed out? While I (obviously, since I submitted the
> > > patch disagree), I'm not certain how others feel. My main point for
> > > disagreement here is (I'm sorry to repeat this) that as long as
> > > certain code isn't allowed into the kernel I think it is not
> > > unreasonable to at least expect the kernel to provide some
> > > fundamental infrastructure that can be used for those (supposedly
> > > unacceptable) bits. All I did here was utilizing the base
> > > infrastructure I want added to clean up code that appeared pretty
> > > ad-hoc.
> > >
> >
> > Ah. That's a brand new requirement.
>
> In all fairness it's not really a brand new requirement -- just one that
> wasn't strongly emphasized during prior attempts to get something like
> this in.
>
> I had a mostly-working patch for this on top of the Task Watchers v2
> patch set. I never posted that specific patch because it had a race with
> module unloading and the fix only increased the overhead you were
> unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X]
> description for Task Watchers v2 (http://lwn.net/Articles/207873/):
>
> "TODO:
> ...
> I'm working on three more patches that add support for creating a task
> watcher from within a module using an ELF section. They haven't received
> as much attention since I've been focusing on measuring the performance
> impact of these patches."
>
> Would tainting the kernel upon registration of out-of-tree "notifiers"
> be more acceptable?

How does that work? module.c does the register/deregister on behalf of the module?

I certainly encourage people to disagree with me here, but my current thinking is:

- the cleanup aspect isn't worth the runtime overhead and

- the support-modular-users aspect is largely new and would need a lot more description and justification (with examples) before we can even begin to evaluate it.
Re: [PATCH] call sysrq_timer_list_show from a workqueue
On Wednesday 09 January 2008 11:21:59 Andrew Morton wrote:
> The string handling in here has become a bit scruffy.

Yes, that patch also evokes a const warning. Fixed below.

I assume you've queued these because you're thinking of applying them before 2.6.24? I'd say only modules-de-mutex-more-symbol-lookup-paths-in-the-module-code.patch warrants that (the other is unlikely and not a regression).

> afaict the `namebuf[KSYM_NAME_LEN - 1] = 0;' would be unneeded if we were
> to use strlcpy() and I suspect the `namebuf[0] = 0;' isn't needed either.
>
> And the use of strlcpy() means we don't need to subtract 1 from
> KSYM_NAME_LEN and we don't need to fret about weird strncpy semantics when
> the input string is too large.
>
>
> And the fact that incoming arg `namebuf' MUST point at a
> KSYM_NAME_LEN-sized buffer could be better communicated by using a
> dedicated struct for this, or by giving the arg a type of `char
> namebuf[KSYM_NAME_LEN]'. Or by adding a comment. Or by just ignoring
> me and doing something more useful.

Or better, rework all the name lookup interfaces, rather than having:

	struct module *module_text_address(unsigned long addr);
	struct module *__module_text_address(unsigned long addr);
	int is_module_address(unsigned long addr);
	int module_get_kallsym(unsigned int symnum, unsigned long *value,
			       char *type, char *name, char *module_name,
			       int *exported);
	char *module_address_lookup(unsigned long addr,
				    unsigned long *symbolsize,
				    unsigned long *offset,
				    char **modname, char *namebuf);
	int lookup_module_symbol_name(unsigned long addr, char *symname);
	int lookup_module_symbol_attrs(unsigned long addr, unsigned long *size,
				       unsigned long *offset, char *modname,
				       char *name);
	unsigned long module_kallsyms_lookup_name(const char *name);
	unsigned long kallsyms_lookup_name(const char *name);
	extern int kallsyms_lookup_size_offset(unsigned long addr,
					       unsigned long *symbolsize,
					       unsigned long *offset);
	const char *kallsyms_lookup(unsigned long addr,
				    unsigned long *symbolsize,
				    unsigned long *offset,
				    char **modname, char *namebuf);
	extern int sprint_symbol(char *buffer, unsigned long address);
	extern void __print_symbol(const char *fmt, unsigned long address);
	int lookup_symbol_name(unsigned long addr, char *symname);
	int lookup_symbol_attrs(unsigned long addr, unsigned long *size,
				unsigned long *offset, char *modname,
				char *name);

Cheers,
Rusty.
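The quoted remarks about strlcpy() come down to two properties: the destination is always NUL-terminated when the buffer size is non-zero, and the full buffer size is passed rather than size minus one. A minimal userspace sketch of those semantics is below. `sketch_strlcpy()` is a hypothetical helper written for illustration (strlcpy() itself is not available in every C library); it is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper with strlcpy()-style semantics: copy at most
 * size - 1 bytes, always NUL-terminate when size != 0, and return the
 * length of src so the caller can detect truncation.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		size_t n = (len >= size) ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';	/* no separate buf[size - 1] = 0 needed */
	}
	return len;
}
```

A caller passes `sizeof(buf)` directly and compares the return value against it to see whether the name was truncated; strncpy(), by contrast, leaves the buffer unterminated when the source is too long.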
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
> On Wed, 9 Jan 2008 02:34:46 +0000 (GMT) Dave Airlie <[EMAIL PROTECTED]> wrote:
>
> >
> > [This is an initial RFC but I'd like to have this patch in before 2.6.24
> > goes final as it really breaks this useful feature]
> >
> > mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> > used this notifier interface and is planned on being pushed upstream.
> >
> > Having users able to just use the tracer module without having to rebuild
> > their kernel to add in a page fault handler hack means we get a lot
> > greater coverage for reverse engineering efforts.
>
> Sorry, but that's a really really small benefit. This very small number of
> fairly (or very) technical users will be able to work out a way of getting
> this to work in 2.6.24. And in 2.6.25 with a merged mmiotrace we can do
> something different.

mmiotrace isn't targeted at fairly or very technical users; its whole usefulness is that you don't need a kernel rebuild. The distro kernels all contain enough support for us to just get a user to grab mmiotrace, run make and get a trace, so in my eyes this is a major feature regression to have to go back to custom kernel builds...

> It's a modest convenience to a very small number of people. And the cost?
> Multiple function calls and multiple cachelines hit for every pagefault
> on, what? Tens of millions of machines?

Which has been happening for how many months? Perhaps if we merge mmiotrace in 2.6.25 we can clean up this function, otherwise I just count it as a feature regression...

> pagefault it populates a struct on the stack, passes that around for a
> while, does a bit of RCU stuff only to find that there was nothing to do.
> Surely we should at least be doing something along the lines of
>
>	if (unlikely(notify_page_fault_chain.notifier_call != NULL)) {
>		all that crap
>	}
>
>
> But that's all speculation. Has anyone actually measured the pagefault
> latency impact of this change?
>
> > +/*
> > + * These are only here because kprobes.c wants them to implement a
> > + * blatant layering violation. Will hopefully go away soon once all
> > + * architectures are updated.
> > + */
> > +static inline int register_page_fault_notifier(struct notifier_block *nb)
> > +{
> > +	return 0;
> > +}
> > +static inline int unregister_page_fault_notifier(struct notifier_block *nb)
> > +{
> > +	return 0;
> > +}
> > +
>
> And this doesn't look very good either. For how long did this fixme remain
> unfixed?
>
>
> So I'd suggest that we leave things as they are for 2.6.24 - mmiotrace
> people will work something out, I'm sure. For 2.6.25 if we merge mmiotrace
> we can look at doing something which is vaguely efficient and tasteful.
>

I just reverted Christoph's patch; I didn't try to work out if the old code had problems no one has fixed...

So all distros with 2.6.24 kernels are useless to mmiotrace. I don't see why we can't leave things as they are until a suitable replacement mechanism can be used.. I've heard others give out about this also: madwifi and SuSE kernel folks...

Dave.
Re: umount -l, getcwd and /proc/<pid>/cwd inconsistent
On Mon, 2008-01-07 at 12:17 +0900, Ian Kent wrote:
>
> Basically, from a bash shell, setting working directory to a mounted
> directory all is fine with "pwd" and "/proc/<pid>/cwd". Following a
> "umount -l" on the mount "pwd" continues to return the expected string
> but "/proc/<pid>/cwd" returns an empty string.
>
> What I'm really after is why this happens because sys_getcwd and
> proc_pid_readlink appear to do essentially the same thing to get the
> string.

I think I understand what happens here now.

Basically, following a "umount -l", anything that calls d_path from within the unlinked mount and doesn't have a d_dname dentry ops method can no longer walk back up to the root to get the path. Of course this makes perfect sense as the mount has been unlinked from the tree. But it can also prevent processes still using the mount from successfully running through to completion to release the mount. I expect this was never the intent of the functionality but I think it should be. Especially since the VFS appears to handle this really well otherwise.

So, I'm after suggestions:

Does anyone feel strongly that this case shouldn't be handled for some reason? Why?
Does anyone have any suggestions about how this should be done?
Does anyone have any concerns about what shouldn't be done to deal with this?

Ian
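The two views being compared in this message can be observed from userspace with getcwd() and a readlink() on the process's own /proc entry. The sketch below is a hedged illustration only: it uses `/proc/self/cwd` as a stand-in for the per-pid path, `cwd_views_match()` is a hypothetical helper name, and it does not by itself reproduce the lazy-umount case, which needs a mount to detach.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Compare the two views of the current working directory.
 * Returns 1 if they match, 0 if they differ, -1 if either lookup fails
 * (for example when /proc is not mounted).
 */
static int cwd_views_match(void)
{
	char from_getcwd[PATH_MAX];
	char from_proc[PATH_MAX];
	ssize_t n;

	if (getcwd(from_getcwd, sizeof(from_getcwd)) == NULL)
		return -1;

	/* readlink() does not NUL-terminate, so leave room and do it here. */
	n = readlink("/proc/self/cwd", from_proc, sizeof(from_proc) - 1);
	if (n < 0)
		return -1;
	from_proc[n] = '\0';

	printf("getcwd:         %s\n/proc/self/cwd: %s\n",
	       from_getcwd, from_proc);
	return strcmp(from_getcwd, from_proc) == 0;
}
```

Run from a shell whose working directory sits on an ordinary mount, the two strings are expected to agree; the report above is that they stop agreeing after a "umount -l" of that mount.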
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
On Wed, 9 Jan 2008 02:34:46 +0000 (GMT) Dave Airlie <[EMAIL PROTECTED]> wrote:
>
> [This is an initial RFC but I'd like to have this patch in before 2.6.24
> goes final as it really breaks this useful feature]
>
> mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> used this notifier interface and is planned on being pushed upstream.
>
> Having users able to just use the tracer module without having to rebuild
> their kernel to add in a page fault handler hack means we get a lot
> greater coverage for reverse engineering efforts.

Sorry, but that's a really really small benefit. This very small number of fairly (or very) technical users will be able to work out a way of getting this to work in 2.6.24. And in 2.6.25 with a merged mmiotrace we can do something different.

It's a modest convenience to a very small number of people. And the cost? Multiple function calls and multiple cachelines hit for every pagefault on, what? Tens of millions of machines?

Plus the code which is getting restored isn't even very good. For every pagefault it populates a struct on the stack, passes that around for a while, does a bit of RCU stuff only to find that there was nothing to do. Surely we should at least be doing something along the lines of

	if (unlikely(notify_page_fault_chain.notifier_call != NULL)) {
		all that crap
	}

But that's all speculation. Has anyone actually measured the pagefault latency impact of this change?

> +/*
> + * These are only here because kprobes.c wants them to implement a
> + * blatant layering violation. Will hopefully go away soon once all
> + * architectures are updated.
> + */
> +static inline int register_page_fault_notifier(struct notifier_block *nb)
> +{
> +	return 0;
> +}
> +static inline int unregister_page_fault_notifier(struct notifier_block *nb)
> +{
> +	return 0;
> +}
> +

And this doesn't look very good either. For how long did this fixme remain unfixed?

So I'd suggest that we leave things as they are for 2.6.24 - mmiotrace people will work something out, I'm sure. For 2.6.25 if we merge mmiotrace we can look at doing something which is vaguely efficient and tasteful.
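The "is anyone registered?" test suggested in this message can be sketched in userspace as an empty-chain fast path. Everything below is a hypothetical, single-threaded stand-in: `struct sketch_notifier`, `fault_chain`, `sketch_register()` and `sketch_notify_fault()` are invented names, the kernel's real notifier chains have different types and RCU protection, and `__builtin_expect` is a GCC/Clang builtin standing in for likely()/unlikely().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical notifier node; not the kernel's struct notifier_block. */
struct sketch_notifier {
	int (*call)(struct sketch_notifier *self, unsigned long event);
	struct sketch_notifier *next;
};

static struct sketch_notifier *fault_chain;	/* NULL when nobody registered */
static unsigned long slow_path_entries;		/* how often the chain was walked */
static unsigned long events_seen;		/* bumped by the sample callback */

static void sketch_register(struct sketch_notifier *nb)
{
	nb->next = fault_chain;
	fault_chain = nb;
}

/* Sample callback used only to show that registered users still get called. */
static int count_event(struct sketch_notifier *self, unsigned long event)
{
	(void)self;
	(void)event;
	events_seen++;
	return 0;
}

static int sketch_notify_fault(unsigned long event)
{
	struct sketch_notifier *nb;
	int ret = 0;

	/* Fast path: one pointer test, no argument struct, no chain walk. */
	if (__builtin_expect(fault_chain == NULL, 1))
		return 0;

	slow_path_entries++;
	for (nb = fault_chain; nb != NULL; nb = nb->next)
		ret |= nb->call(nb, event);
	return ret;
}
```

With nothing registered, `sketch_notify_fault()` returns after a single comparison; only once `sketch_register()` has added a node does the per-fault cost of walking the chain appear. Whether that difference is measurable in the real page fault path is exactly the open question raised above.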
Re: Kernel Oops?
> Except this time when rebooting the machine i got a kernel oops
> message and it didn't boot completely. I could not copy it but I did
> take a picture and now I have re-written the screen here(sorry about

That is interesting - that sort of error usually points at memory corruption and early on tends to point at hardware (but not always). What hardware is in this system and does it have over 4GB of RAM?
[PATCH -mm 2/2] kexec/i386: kexec page table code clean up - page table setup in C
This patch transforms the kexec page tables setup code from assembler code to C code in machine_kexec_prepare. This improves readability and reduces code line number. Signed-off-by: Huang Ying <[EMAIL PROTECTED]> --- arch/x86/kernel/machine_kexec_32.c | 50 +++ arch/x86/kernel/relocate_kernel_32.S | 114 --- include/asm-x86/kexec_32.h | 18 - 3 files changed, 40 insertions(+), 142 deletions(-) --- a/arch/x86/kernel/machine_kexec_32.c +++ b/arch/x86/kernel/machine_kexec_32.c @@ -86,6 +86,42 @@ static void free_page_tables(struct kima free_page((unsigned long)image->arch_kimage.pte1); } +static void page_table_set_one(pgd_t *pgd, pmd_t *pmd, pte_t *pte, + unsigned long vaddr, unsigned long paddr) +{ + pud_t *pud; + + pgd += pgd_index(vaddr); +#ifdef CONFIG_X86_PAE + if (!(pgd_val(*pgd) & _PAGE_PRESENT)) + set_pgd(pgd, __pgd(__pa(pmd) | _PAGE_PRESENT)); +#endif + pud = pud_offset(pgd, vaddr); + pmd = pmd_offset(pud, vaddr); + if (!(pmd_val(*pmd) & _PAGE_PRESENT)) + set_pmd(pmd, __pmd(__pa(pte) | _PAGE_TABLE)); + pte = pte_offset_kernel(pmd, vaddr); + set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL_EXEC)); +} + +static void prepare_page_tables(struct kimage *image) +{ + void *control_page; + pmd_t *pmd = 0; + + control_page = page_address(image->control_code_page); +#ifdef CONFIG_X86_PAE + pmd = image->arch_kimage.pmd0; +#endif + page_table_set_one(image->arch_kimage.pgd, pmd, image->arch_kimage.pte0, + (unsigned long)relocate_kernel, __pa(control_page)); +#ifdef CONFIG_X86_PAE + pmd = image->arch_kimage.pmd1; +#endif + page_table_set_one(image->arch_kimage.pgd, pmd, image->arch_kimage.pte1, + __pa(control_page), __pa(control_page)); +} + /* * A architecture hook called to validate the * proposed image and prepare the control pages @@ -98,6 +134,7 @@ static void free_page_tables(struct kima * later. 
* * - Allocate page tables + * - Setup page tables */ int machine_kexec_prepare(struct kimage *image) { @@ -112,6 +149,7 @@ int machine_kexec_prepare(struct kimage free_page_tables(image); return -ENOMEM; } + prepare_page_tables(image); return 0; } @@ -140,19 +178,7 @@ NORET_TYPE void machine_kexec(struct kim memcpy(control_page, relocate_kernel, PAGE_SIZE); page_list[PA_CONTROL_PAGE] = __pa(control_page); - page_list[VA_CONTROL_PAGE] = (unsigned long)relocate_kernel; page_list[PA_PGD] = __pa(image->arch_kimage.pgd); - page_list[VA_PGD] = (unsigned long)image->arch_kimage.pgd; -#ifdef CONFIG_X86_PAE - page_list[PA_PMD_0] = __pa(image->arch_kimage.pmd0); - page_list[VA_PMD_0] = (unsigned long)image->arch_kimage.pmd0; - page_list[PA_PMD_1] = __pa(image->arch_kimage.pmd1); - page_list[VA_PMD_1] = (unsigned long)image->arch_kimage.pmd1; -#endif - page_list[PA_PTE_0] = __pa(image->arch_kimage.pte0); - page_list[VA_PTE_0] = (unsigned long)image->arch_kimage.pte0; - page_list[PA_PTE_1] = __pa(image->arch_kimage.pte1); - page_list[VA_PTE_1] = (unsigned long)image->arch_kimage.pte1; /* The segment registers are funny things, they have both a * visible and an invisible part. 
Whenever the visible part is --- a/arch/x86/kernel/relocate_kernel_32.S +++ b/arch/x86/kernel/relocate_kernel_32.S @@ -16,126 +16,12 @@ #define PTR(x) (x << 2) #define PAGE_ALIGNED (1 << PAGE_SHIFT) -#define PAGE_ATTR 0x63 /* _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY */ -#define PAE_PGD_ATTR 0x01 /* _PAGE_PRESENT */ .text .align PAGE_ALIGNED .globl relocate_kernel relocate_kernel: movl8(%esp), %ebp /* list of pages */ - -#ifdef CONFIG_X86_PAE - /* map the control page at its virtual address */ - - movlPTR(VA_PGD)(%ebp), %edi - movlPTR(VA_CONTROL_PAGE)(%ebp), %eax - andl$0xc000, %eax - shrl$27, %eax - addl%edi, %eax - - movlPTR(PA_PMD_0)(%ebp), %edx - orl $PAE_PGD_ATTR, %edx - movl%edx, (%eax) - - movlPTR(VA_PMD_0)(%ebp), %edi - movlPTR(VA_CONTROL_PAGE)(%ebp), %eax - andl$0x3fe0, %eax - shrl$18, %eax - addl%edi, %eax - - movlPTR(PA_PTE_0)(%ebp), %edx - orl $PAGE_ATTR, %edx - movl%edx, (%eax) - - movlPTR(VA_PTE_0)(%ebp), %edi - movlPTR(VA_CONTROL_PAGE)(%ebp), %eax - andl$0x001ff000, %eax - shrl$9, %eax - addl%edi, %eax - - movlPTR(PA_CONTROL_PAGE)(%ebp), %edx - orl $PAGE_ATTR, %edx - movl%edx, (%eax) - - /* identity map the control page at its physical address */ - - movl
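The assembler being removed by this patch locates each page table entry with a mask followed by a shift. A small C sketch of the same arithmetic follows, assuming the standard x86 PAE split of a 32-bit virtual address (2 + 9 + 9 + 12 bits) and 8-byte entries. The function names are hypothetical, and the full-width mask constants are written out from that assumed layout rather than copied from the archived text.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offsets of the entries covering vaddr, assuming x86 PAE:
 * bits 31:30 index the top-level table, bits 29:21 the middle level,
 * bits 20:12 the page table, and every entry is 8 bytes wide.
 * Shifting by (field position - 3) yields index * 8 directly.
 */
static uint32_t pae_pgd_offset(uint32_t vaddr)
{
	return (vaddr & 0xc0000000u) >> 27;	/* (vaddr >> 30) * 8 */
}

static uint32_t pae_pmd_offset(uint32_t vaddr)
{
	return (vaddr & 0x3fe00000u) >> 18;	/* ((vaddr >> 21) & 0x1ff) * 8 */
}

static uint32_t pae_pte_offset(uint32_t vaddr)
{
	return (vaddr & 0x001ff000u) >> 9;	/* ((vaddr >> 12) & 0x1ff) * 8 */
}
```

For a virtual address of 0xc0100000 this gives offsets 24, 0 and 2048, that is top-level entry 3, middle-level entry 0 and page table entry 256. The C replacement in the patch gets the same indices from pgd_index(), pmd_offset() and pte_offset_kernel() instead of open-coded masks.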
Re: [PATCH 5/6] syslets: add generic syslets infrastructure
> Firstly, why not just specify an address for the return value and be done
> with it? This infrastructure seems overkill, and you can always extend later
> if required.

Sorry, which infrastructure? Providing the function and stack to return to? Sure, I could certainly entertain the idea of not having syslet tasks return to userspace in the first pass. Ingo sure seemed excited by the idea.

Or do you mean the syscall return value ending up in the userspace completion event ring? That's mostly about being able to wait for pending syslets to complete.

> Secondly, you really should allow integration with an eventfd so you don't
> make the posix AIO mistake of providing a poll-incompatible interface.

Yeah, this seems straightforward enough that I haven't made it an initial priority. I'm sure it will be helpful for people who are stuck integrating with entrenched software that wants to wait for pollable fds.

For more flexible software, though, it's compelling to now be able to aggregate waiting for completion of the existing waiting syscalls (poll, epoll_wait, futexes, whatever) by issuing them as concurrent syslets.

> Finally, and probably most alarmingly, AFAICT randomly changing TID will
> break all threaded programs, which means this won't be fitted into existing
> code bases, making it YA niche Linux-only API 8(

Yeah, this still needs to be investigated. I haven't yet and I haven't heard of anyone else trying their hand at it.

In the YANLOA mode apps would know that executing syslets is an implicit clone() and would act accordingly. "8(", indeed.

I wonder if there isn't an opportunity to add a clone() flag which juggles the association between TIDs and task_structs. I don't relish the idea of investigating the life cycles of task_struct references that derive from TIDs and seeing how those would race with a syslet blocking and cloning, but, well, maybe that's what needs to be done.

This all isn't my area of expertise, though, sadly. It would be swell if someone wanted to look into it before I'm forced to learn yet another weird corner of the kernel.

- z
[PATCH -mm 1/2] kexec/i386: kexec page table code clean up - add arch_kimage
This patch add an architecture specific struct arch_kimage into struct kimage. Three pointers to page table pages used by kexec are added to struct arch_kimage. The page tables pages are dynamically allocated in machine_kexec_prepare instead of statically from BSS segment. This will save up to 20k memory when kexec image is not loaded. Signed-off-by: Huang Ying <[EMAIL PROTECTED]> --- arch/x86/kernel/machine_kexec_32.c | 68 + include/asm-x86/kexec_32.h | 12 ++ include/linux/kexec.h |4 ++ 3 files changed, 63 insertions(+), 21 deletions(-) --- a/arch/x86/kernel/machine_kexec_32.c +++ b/arch/x86/kernel/machine_kexec_32.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -21,15 +22,6 @@ #include #include -#define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE))) -static u32 kexec_pgd[1024] PAGE_ALIGNED; -#ifdef CONFIG_X86_PAE -static u32 kexec_pmd0[1024] PAGE_ALIGNED; -static u32 kexec_pmd1[1024] PAGE_ALIGNED; -#endif -static u32 kexec_pte0[1024] PAGE_ALIGNED; -static u32 kexec_pte1[1024] PAGE_ALIGNED; - static void set_idt(void *newidt, __u16 limit) { struct Xgt_desc_struct curidt; @@ -72,6 +64,28 @@ static void load_segments(void) #undef __STR } +static void alloc_page_tables(struct kimage *image) +{ + image->arch_kimage.pgd = (pgd_t *)get_zeroed_page(GFP_KERNEL); +#ifdef CONFIG_X86_PAE + image->arch_kimage.pmd0 = (pmd_t *)get_zeroed_page(GFP_KERNEL); + image->arch_kimage.pmd1 = (pmd_t *)get_zeroed_page(GFP_KERNEL); +#endif + image->arch_kimage.pte0 = (pte_t *)get_zeroed_page(GFP_KERNEL); + image->arch_kimage.pte1 = (pte_t *)get_zeroed_page(GFP_KERNEL); +} + +static void free_page_tables(struct kimage *image) +{ + free_page((unsigned long)image->arch_kimage.pgd); +#ifdef CONFIG_X86_PAE + free_page((unsigned long)image->arch_kimage.pmd0); + free_page((unsigned long)image->arch_kimage.pmd1); +#endif + free_page((unsigned long)image->arch_kimage.pte0); + free_page((unsigned long)image->arch_kimage.pte1); +} + /* * A architecture 
hook called to validate the * proposed image and prepare the control pages @@ -83,10 +97,21 @@ static void load_segments(void) * reboot code buffer to allow us to avoid allocations * later. * - * Currently nothing. + * - Allocate page tables */ int machine_kexec_prepare(struct kimage *image) { + alloc_page_tables(image); + if (!image->arch_kimage.pgd || +#ifdef CONFIG_X86_PAE + !image->arch_kimage.pmd0 || + !image->arch_kimage.pmd1 || +#endif + !image->arch_kimage.pte0 || + !image->arch_kimage.pte1) { + free_page_tables(image); + return -ENOMEM; + } return 0; } @@ -96,6 +121,7 @@ int machine_kexec_prepare(struct kimage */ void machine_kexec_cleanup(struct kimage *image) { + free_page_tables(image); } /* @@ -115,18 +141,18 @@ NORET_TYPE void machine_kexec(struct kim page_list[PA_CONTROL_PAGE] = __pa(control_page); page_list[VA_CONTROL_PAGE] = (unsigned long)relocate_kernel; - page_list[PA_PGD] = __pa(kexec_pgd); - page_list[VA_PGD] = (unsigned long)kexec_pgd; + page_list[PA_PGD] = __pa(image->arch_kimage.pgd); + page_list[VA_PGD] = (unsigned long)image->arch_kimage.pgd; #ifdef CONFIG_X86_PAE - page_list[PA_PMD_0] = __pa(kexec_pmd0); - page_list[VA_PMD_0] = (unsigned long)kexec_pmd0; - page_list[PA_PMD_1] = __pa(kexec_pmd1); - page_list[VA_PMD_1] = (unsigned long)kexec_pmd1; -#endif - page_list[PA_PTE_0] = __pa(kexec_pte0); - page_list[VA_PTE_0] = (unsigned long)kexec_pte0; - page_list[PA_PTE_1] = __pa(kexec_pte1); - page_list[VA_PTE_1] = (unsigned long)kexec_pte1; + page_list[PA_PMD_0] = __pa(image->arch_kimage.pmd0); + page_list[VA_PMD_0] = (unsigned long)image->arch_kimage.pmd0; + page_list[PA_PMD_1] = __pa(image->arch_kimage.pmd1); + page_list[VA_PMD_1] = (unsigned long)image->arch_kimage.pmd1; +#endif + page_list[PA_PTE_0] = __pa(image->arch_kimage.pte0); + page_list[VA_PTE_0] = (unsigned long)image->arch_kimage.pte0; + page_list[PA_PTE_1] = __pa(image->arch_kimage.pte1); + page_list[VA_PTE_1] = (unsigned long)image->arch_kimage.pte1; /* The segment registers 
are funny things, they have both a * visible and an invisible part. Whenever the visible part is --- a/include/asm-x86/kexec_32.h +++ b/include/asm-x86/kexec_32.h @@ -94,6 +94,18 @@ relocate_kernel(unsigned long indirectio unsigned long start_address, unsigned int has_pae) ATTRIB_NORET; +#define ARCH_HAS_ARCH_KIMAGE + +struct arch_kimage { + pgd_t *pgd; +#ifdef CONFIG_X86_PAE + pmd_t *pmd0; + pmd_t *pmd1; +#endif + pte_t *pte0; + pte_t *pte1; +}; + #endif /* __ASSEMBLY__ */
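
The prepare path in the patch above allocates every table first, checks them all in one place, and frees everything if any allocation failed. A minimal userspace sketch of that all-or-nothing pattern follows. It is an illustration, not the kernel code: calloc()/free() stand in for get_zeroed_page()/free_page(), and struct page_tables, TABLE_SIZE, alloc_tables(), free_tables() and prepare() are hypothetical names that mirror only the non-PAE case.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define TABLE_SIZE 4096  /* assumed stand-in for PAGE_SIZE */

/* Hypothetical userspace analogue of struct arch_kimage (non-PAE case). */
struct page_tables {
    void *pgd;
    void *pte0;
    void *pte1;
};

/* Allocate every table up front; failures are checked by the caller. */
static void alloc_tables(struct page_tables *t)
{
    t->pgd  = calloc(1, TABLE_SIZE);
    t->pte0 = calloc(1, TABLE_SIZE);
    t->pte1 = calloc(1, TABLE_SIZE);
}

/* free(NULL) is a no-op, so this is safe after a partial allocation. */
static void free_tables(struct page_tables *t)
{
    free(t->pgd);
    free(t->pte0);
    free(t->pte1);
    t->pgd = t->pte0 = t->pte1 = NULL;
}

/* Returns 0 on success, -ENOMEM if any table could not be allocated. */
static int prepare(struct page_tables *t)
{
    assert(t != NULL);
    alloc_tables(t);
    if (!t->pgd || !t->pte0 || !t->pte1) {
        free_tables(t);
        return -ENOMEM;
    }
    return 0;
}
```

Checking all pointers after the fact keeps the allocation helper free of error paths, at the cost of attempting the later allocations even when an earlier one has already failed.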
[PATCH -mm 0/2] kexec/i386: kexec page table code clean up
This patchset cleans up page table setup code of kexec on i386. This patchset is based on 2.6.24-rc5-mm1 and has been tested on i386 with/without PAE enabled. Best Regards, Huang Ying -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/5][V2]PCI: x86 MMCONFIG: Preamble
Greg,

Let me know what you think, and if there's anything you want me to fix/change.

[EMAIL PROTECTED] wrote:

OVERVIEW

This patch-set is being resubmitted after some discussion and in response to critiques of the original submission made by the lkml community.

The patches should be applied in sequence to obviate any possible build problems. The patch-set was built against 2.6.24-rc6.

The large amount of text in the explanation below is due to the nature of the problem and the discussion engendered on lkml by my first submission.

 arch/x86/pci/common.c | 69
 arch/x86/pci/direct.c | 49
 arch/x86/pci/init.c| 18 +--
 arch/x86/pci/mmconfig-shared.c |3 +-
 arch/x86/pci/pci.h |3 ++
 drivers/pci/pci.c |9 +
 drivers/pci/pci.h |1 +
 drivers/pci/probe.c|5 +++
 8 files changed, 146 insertions(+), 11 deletions(-)

Description
===========

There exist northbridges that do not respond correctly to PCI MMCONFIG accesses in x86 platforms. Among them is the AMD 8132. Here is an excerpt from an errata page published by AMD at the following link.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf

    The base configuration space of the AMD-8132 and PCI(-X) devices attached to it are accessible using only the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices attached to the AMD-8132 in the extended configuration space are not accessible. The AMD-8132 has no registers in the extended configuration space.

    Fix Planned: No

On bus numbers above that defined by PCI_MMCFG_MAX_CHECK_BUS, and whose pci_ops field points to the mmconf ops, each device is checked for mmconf compliance by comparing an MMCONFIG read to a Legacy PCI config read of the vendor/device dword. A miscompare means that a device does not correctly respond to MMCONFIG accesses. When the patch code detects this condition, the bus that serves this device, and all subordinate buses, will be programmed to use Legacy PCI Config accesses.
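
The compare-and-fall-back check described above can be sketched in plain C. This is a simplified userspace model under stated assumptions, not the code from the patch set: the cfg_read_fn type, the use_legacy[] array, MAX_BUS, check_mmconf() and the two fake readers are all hypothetical, and the buses behind a device are assumed to form the contiguous range up to its subordinate bus number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BUS 8  /* illustrative size, not a value from the patch set */

/* Hypothetical accessor type: returns the vendor/device dword. */
typedef uint32_t (*cfg_read_fn)(unsigned bus, unsigned devfn);

/* One flag per bus: true means "use legacy config access". */
static bool use_legacy[MAX_BUS];

/*
 * Compare the two access methods for one device. On a miscompare, mark
 * the device's bus and every bus up to 'subordinate' as legacy-only.
 * Returns true when the device answered the MMCONFIG read correctly.
 */
static bool check_mmconf(unsigned bus, unsigned devfn, unsigned subordinate,
                         cfg_read_fn mmconf_read, cfg_read_fn legacy_read)
{
    assert(bus <= subordinate && subordinate < MAX_BUS);

    if (mmconf_read(bus, devfn) == legacy_read(bus, devfn))
        return true;

    for (unsigned b = bus; b <= subordinate; b++)
        use_legacy[b] = true;
    return false;
}

/* Fake readers for demonstration; the ID value is made up. */
static uint32_t fake_legacy_read(unsigned bus, unsigned devfn)
{
    (void)bus;
    (void)devfn;
    return 0x12345678u;
}

/* Pretend that bus 3 sits behind a bridge that mishandles MMCONFIG. */
static uint32_t fake_mmconf_read(unsigned bus, unsigned devfn)
{
    (void)devfn;
    return bus == 3 ? 0xffffffffu : 0x12345678u;
}
```

Keeping the decision per bus rather than per device is what keeps the state small: one flag per bus instead of a bitmap entry for every device on every bus.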
This patch set does not scan the first few buses, a number defined by PCI_MMCFG_MAX_CHECK_BUS, because the routine unreachable_devices() in arch/x86/pci/mmconfig-shared.c already does this with device granularity using a bitmap.

Alternatives Considered
=======================

We chose not to extend the bitmap mechanism, since it would have become too large in order to cover all possible buses on all possible segments, and having the lookup into such a large bitmap inline with every pci config access would have had an adverse effect on performance.

An alternative would have been to allocate a bitmap on a per-bus basis, so every bus would have a bitmap of its own unreachable devices. This could be done with a new field in the pci_bus struct. However, the only devices that need to perform a mmconfig translation, and have problems with it, are northbridges. Once the translation is made and forwarded on the pci bus, the consumers of the pci config address do not know or care whether it was generated by an mmconfig or legacy pci access mechanism. This being the case, the secondary and subordinate buses also require legacy pci access, even though they are not aware of the mechanism, because the pci config access must still be translated by the root bridge to get to them.

Also considered in the discussion on lkml was a suggestion by Loic Prylli to always use legacy pci configuration for the first 256 bytes of config space. This would certainly have fixed the problem of configuring and booting. It would also have fixed the problem with bus sizing code programming devices to claim MMIO space that belongs to MMCONFIG and thereby hang the system (see below). However, there are devices (tg3) that make a lot of runtime use of that area of pci config space, so forcing legacy pci config access on all devices for the few situations where such a measure would be necessary, when in most situations mmconfig works just fine, was a performance penalty the consensus was unwilling to permit.
What this patch set does not fix

This patch-set does not detect or fix the condition where bus sizing code programs a device to consume MMIO space that also happens to include the MMCONFIG address range. This is a BIOS bug that we have seen in more than one system. When BIOS maps MMCONFIG space into an MMIO region below 4GB, some devices, typically graphics chips that want 256 MB or more of MMIO, will be inadvertently programmed by bus sizing code to claim this space. At that point, no further boot progress can be made. Up to now, the workaround for such systems is to type "pci=nommconf" at the boot command line.

There was a suggestion made by Ivan Kokshaysky to limit accesses to pci config space at offsets within
Re: [PATCH 0/4] add task handling notifier
On Tue, 2008-01-08 at 14:14 -0800, Andrew Morton wrote: > On Tue, 08 Jan 2008 13:38:03 + > "Jan Beulich" <[EMAIL PROTECTED]> wrote: > > > >>> Andrew Morton <[EMAIL PROTECTED]> 25.12.07 23:05 >>> > > >On Sun, 23 Dec 2007 12:26:21 + Christoph Hellwig <[EMAIL PROTECTED]> > > >wrote: > > > > > >> On Thu, Dec 20, 2007 at 01:11:24PM +, Jan Beulich wrote: > > >> > With more and more sub-systems/sub-components leaving their footprint > > >> > in task handling functions, it seems reasonable to add notifiers that > > >> > these components can use instead of having them all patch themselves > > >> > directly into core files. > > >> > > >> I agree that we probably want something like this. As do some others, > > >> so we already had a few a few attempts at similar things. The first one > > >> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also > > >> includes allocating per-task data for it's users. Then also from SGI > > >> there has been a simplified version called pnotify that's also available > > >> from the website above. > > >> > > >> Later Matt Helsley had something called "Task Watchers" which lwn has > > >> an article on: http://lwn.net/Articles/208117/. > > >> > > >> For some reason neither ever made a lot of progess (performance > > >> problems?). > > >> > > > > > >I had it in -mm, sorted out all the problems but ended up not pulling the > > >trigger. > > > > > >Problem is, it adds runtime overhead purely for the convenience of kernel > > >programmers, and I don't think that's a good tradeoff. > > > > > >Sprinkling direct calls into a few well-known sites won't kill us, and > > >we've survived this long. Why not keep doing that, and save everyone a few > > >cycles? > > > > Am I to conclude then that there's no point in addressing the issues other > > people pointed out? While I (obviously, since I submitted the patch > > disagree), > > I'm not certain how others feel. 
My main point for disagreement here is (I'm > > sorry to repeat this) that as long as certain code isn't allowed into the > > kernel > > I think it is not unreasonable to at least expect the kernel to provide some > > fundamental infrastructure that can be used for those (supposedly > > unacceptable) bits. All I did here was utilizing the base infrastructure I > > want > > added to clean up code that appeared pretty ad-hoc. > > > > Ah. That's a brand new requirement. In all fairness it's not really a brand new requirement -- just one that wasn't strongly emphasized during prior attempts to get something like this in. I had a mostly-working patch for this on top of the Task Watchers v2 patch set. I never posted that specific patch because it had a race with module unloading and the fix only increased the overhead you were unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X] description for Task Watchers v2 (http://lwn.net/Articles/207873/): "TODO: ... I'm working on three more patches that add support for creating a task watcher from within a module using an ELF section. They haven't recieved as much attention since I've been focusing on measuring the performance impact of these patches." Would tainting the kernel upon registration of out-of-tree "notifiers" be more acceptable? Cheers, -Matt Helsley -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.
On Tue, 2008-01-08 at 14:15 -0500, David P. Reed wrote:
> Alan Cox wrote:
> > The natsemi docs here say otherwise. I trust them not you.
> >
> As well you should. I am honestly curious (for my own satisfaction) as
> to what the natsemi docs say the delay code should do (can't imagine
> they say "use io port 80 because it is unused"). I don't have any

What is the outcome of this thread? Are we going to use timing based port delays, or can we finally drop these things entirely on 64-bit architectures?

I have a doubly vested interest in this, both as the owner of an affected HP dv9210us laptop and as a maintainer of paravirt code - and would like 64-bit Linux code to stop using I/O to port 0x80 in both cases (as I suspect would every other person involved with virtualization).

BTW, it isn't ever safe to pass port 0x80 through to hardware from a virtual machine; some OSes use port 0x80 as a hardware available scratch register (I believe Darwin/x86 did/does this during boot). This means simultaneous execution of two virtual machines can interleave port 0x80 values or share data with a hardware provided covert channel.

This means KVM should be trapping port 0x80 access, which is really expensive, or alternatively, Linux should not be using port 0x80 for timing bus access on modern (64-bit) hardware.

I've tried to follow this thread, but with all the jabs, 1-ups, and obscure legacy hardware pageantry going on, it isn't clear what we're really doing.

Thanks,
Zach
Re: [patch 05/19] split LRU lists into anon & file sets
On Tue, 8 Jan 2008 14:42:03 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Tue, 8 Jan 2008, Rik van Riel wrote:
>
> > > Also would it be possible to create generic functions that can move pages
> > > in pagevecs to an arbitrary lru list?
> >
> > What would you use those functions for?
>
> We keep on duplicating the pagevec lru operation functions in mm/swap.c.
> Some generic stuff would reduce the code size.

Good idea. Added to my TODO list :)

--
All rights reversed.
Re: [PATCH] libata and starting/stopping ATAPI floppy devices
Ondrej Zary wrote:
> Hello,
> I switched to libata drivers for my onboard PATA controller (PIIX4) recently.
> Everything works fine except that kernel tries to start not only my hard
> drive (sda) but also LS-120 floppy drive (sdb) which does not like it:
>
> sd 0:0:0:0: [sda] Starting disk
> ata1.00: configured for UDMA/33
> sd 0:0:0:0: [sda] 58633344 512-byte hardware sectors (30020 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> sd 1:0:1:0: [sdb] Starting disk
> ata2.00: configured for UDMA/33
> ata2.01: configured for PIO2
> sd 1:0:1:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
> sd 1:0:1:0: [sdb] Sense Key : 0x2 [current]
> sd 1:0:1:0: [sdb] ASC=0x3a ASCQ=0x0
>
>
> The question is: is it correct? Or a patch like this should be applied?

Yeah, looks good to me. Please reformat the message w/ S-O-B.

Acked-by: Tejun Heo <[EMAIL PROTECTED]>

--
tejun
Re: [PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
On Wed, Jan 09, 2008 at 02:34:46AM +, Dave Airlie wrote:
>
> [This an initial RFC but I'd like to have this patch in before 2.6.24 goes
> final as it really breaks this useful feature]
>
> mmiotrace the MMIO access tracer used to reverse engineer binary blobs
> used this notifier interface and is planned on being pushed upstream.
>
> Having users able to just use the tracer module without having to rebuild
> their kernel to add in a page fault handler hack means we get a lot
> greater coverage for reverse engineering efforts.
>
> Signed-off-by: David Airlie <[EMAIL PROTECTED]>

Acked-by: Andi Kleen <[EMAIL PROTECTED]>

I never liked the original patch.

-Andi
[PATCH] Revert "x86: optimize page faults like all other achitectures and kill notifier cruft"
[This an initial RFC but I'd like to have this patch in before 2.6.24 goes final as it really breaks this useful feature] mmiotrace the MMIO access tracer used to reverse engineer binary blobs used this notifier interface and is planned on being pushed upstream. Having users able to just use the tracer module without having to rebuild their kernel to add in a page fault handler hack means we get a lot greater coverage for reverse engineering efforts. Signed-off-by: David Airlie <[EMAIL PROTECTED]> This reverts commit 74a0b5762713a26496db72eac34fbbed46f20fce. Conflicts: include/asm-avr32/kprobes.h include/asm-ia64/kprobes.h include/asm-s390/kprobes.h include/asm-x86/kdebug_32.h include/asm-x86/kdebug_64.h include/asm-x86/kprobes_64.h --- arch/x86/kernel/kprobes_32.c |3 +- arch/x86/kernel/kprobes_64.c |1 + arch/x86/mm/fault_32.c| 43 ++- arch/x86/mm/fault_64.c| 44 +++- include/asm-avr32/kdebug.h| 16 ++ include/asm-avr32/kprobes.h |1 + include/asm-ia64/kdebug.h | 15 ++ include/asm-ia64/kprobes.h|1 + include/asm-powerpc/kdebug.h | 19 + include/asm-powerpc/kprobes.h |1 + include/asm-s390/kdebug.h | 15 ++ include/asm-s390/kprobes.h|1 + include/asm-sh/kdebug.h |2 + include/asm-sparc64/kdebug.h | 18 include/asm-sparc64/kprobes.h |1 + include/asm-x86/kdebug.h |3 ++ include/asm-x86/kprobes_32.h |2 +- include/asm-x86/kprobes_64.h |1 + kernel/kprobes.c | 39 +-- 19 files changed, 183 insertions(+), 43 deletions(-) diff --git a/arch/x86/kernel/kprobes_32.c b/arch/x86/kernel/kprobes_32.c index 3a020f7..1ba8fee 100644 --- a/arch/x86/kernel/kprobes_32.c +++ b/arch/x86/kernel/kprobes_32.c @@ -586,7 +586,7 @@ out: return 1; } -int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr) +static int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr) { struct kprobe *cur = kprobe_running(); struct kprobe_ctlblk *kcb = get_kprobe_ctlblk(); @@ -668,6 +668,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self, ret = NOTIFY_STOP; break; case DIE_GPF: 
+ case DIE_PAGE_FAULT: /* kprobe_running() needs smp_processor_id() */ preempt_disable(); if (kprobe_running() && diff --git a/arch/x86/kernel/kprobes_64.c b/arch/x86/kernel/kprobes_64.c index 5df19a9..279cea7 100644 --- a/arch/x86/kernel/kprobes_64.c +++ b/arch/x86/kernel/kprobes_64.c @@ -654,6 +654,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self, ret = NOTIFY_STOP; break; case DIE_GPF: + case DIE_PAGE_FAULT: /* kprobe_running() needs smp_processor_id() */ preempt_disable(); if (kprobe_running() && diff --git a/arch/x86/mm/fault_32.c b/arch/x86/mm/fault_32.c index a2273d4..f03cc93 100644 --- a/arch/x86/mm/fault_32.c +++ b/arch/x86/mm/fault_32.c @@ -25,7 +25,6 @@ #include #include #include -#include #include #include @@ -33,27 +32,33 @@ extern void die(const char *,struct pt_regs *,long); -#ifdef CONFIG_KPROBES -static inline int notify_page_fault(struct pt_regs *regs) +static ATOMIC_NOTIFIER_HEAD(notify_page_fault_chain); + +int register_page_fault_notifier(struct notifier_block *nb) { - int ret = 0; - - /* kprobe_running() needs smp_processor_id() */ - if (!user_mode_vm(regs)) { - preempt_disable(); - if (kprobe_running() && kprobe_fault_handler(regs, 14)) - ret = 1; - preempt_enable(); - } + vmalloc_sync_all(); + return atomic_notifier_chain_register(&notify_page_fault_chain, nb); +} +EXPORT_SYMBOL_GPL(register_page_fault_notifier); - return ret; +int unregister_page_fault_notifier(struct notifier_block *nb) +{ + return atomic_notifier_chain_unregister(&notify_page_fault_chain, nb); } -#else -static inline int notify_page_fault(struct pt_regs *regs) +EXPORT_SYMBOL_GPL(unregister_page_fault_notifier); + +static inline int notify_page_fault(struct pt_regs *regs, long err) { - return 0; + struct die_args args = { + .regs = regs, + .str = "page fault", + .err = err, + .trapnr = 14, + .signr = SIGSEGV + }; + return atomic_notifier_call_chain(&notify_page_fault_chain, + DIE_PAGE_FAULT, &args); } -#endif /* * Return EIP plus the CS segment base. 
The segment limit is also @@ -331,7 +336,7 @@ fastcall void __kprobes do_page_fault(struct pt_regs *regs, if (unlikely(address
Re: [PATCH] AMD Thermal Interrupt Support
On Tue, Jan 08, 2008 at 06:28:18PM -0800, Russell Leidich wrote:
> On Jan 8, 2008 3:52 PM, Andi Kleen <[EMAIL PROTECTED]> wrote:
> > > ENTRY(thermal_interrupt)
> > > - apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt
> > > + apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt(%rip)
> >
> > Are you sure a * is not needed? I would have thought it would jump
> > to the variable instead of through it. But if it works it's ok for me.
>
> I will test to make sure it works. I don't think stars mean anything
> in AT&T X86-64.

% cat t.s
	call foo
	call *foo
% as -o t.o t.s
% objdump -S t.o

t.o: file format elf64-x86-64

Disassembly of section .text:

<.text>:
   0: e8 00 00 00 00          callq 0x5
   5: ff 14 25 00 00 00 00    callq *0x0

-Andi
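
The distinction between calling a symbol and calling through the variable stored at that symbol has a direct analogue in C, sketched below. The handler names and the thermal_handler pointer are hypothetical and exist only for this illustration; they are not taken from the thermal interrupt patch.

```c
#include <assert.h>

static int handled;  /* records which handler ran last */

static void default_handler(void) { handled = 1; }
static void amd_handler(void)     { handled = 2; }

/* A variable that holds the address of the current handler. */
static void (*thermal_handler)(void) = default_handler;

/* Direct call: the target is the function itself, fixed at link time. */
static void call_direct(void)
{
    default_handler();
}

/* Indirect call: load the pointer stored in the variable, then call it. */
static void call_indirect(void)
{
    assert(thermal_handler != 0);
    thermal_handler();
}
```

Reassigning thermal_handler changes what call_indirect() runs but has no effect on call_direct(). A compiler would typically emit the starred form of the call instruction for the indirect case, which is the same difference the two encodings in the objdump output above show.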
Re: Kbuild update
>> > If we can make this to be an official project for Linux kernel, I
>> > think it won't be a big problem.
>>
>> We don't even manage to maintain the English language texts properly,
>> and I am therefore not overly optimistic that we'll have the
>> translations maintained properly for many years.
>Italian was 100% translated at one point in time.
>And the Linux Kernel Translation project has a number of
>spelling error fixes in queue (I dunno if they have been applied).
>
>So even when run as an external project it was ok for some languages,
>and having it official and someone taking patches to .po files would
>for sure allow more users to build a kernel.
>

Agreed. That's the goal of TLKTP.

Sam, can you contact the author of TLKTP? Maybe we can talk to him to see if we can restart the project. If so, I can help with the Chinese translation part.

Best regards.

Cong
Re: [patch] split MMC_CAP_4_BIT_DATA
On Jan 9, 2008 4:49 AM, Pierre Ossman <[EMAIL PROTECTED]> wrote:
> On Tue, 8 Jan 2008 14:40:49 -0500
> Mike Frysinger <[EMAIL PROTECTED]> wrote:
>
> >
> > i dont understand what's confusing. the Blackfin on chip host controller only
> > supports 1-bit MMC, but it supports 4-bit SD/SDIO. this is a fact. while it
> > may be a stupid decision, it is what it is, and i need the framework made
> > more flexible in order to get the Blackfin driver merged cleanly. we do
> > software for hardware, we dont do hardware.
>
> Well, since I've seen no _hardware_ differences between 4-bit MMC and 4-bit
> SD, "support" in this case must be "vendor will guarantee it works". And that
> is not the kind of "support" that needs a distinction in the code.
>
> So, again, if you feel that there is a hardware difference between 4-bit MMC
> and 4-bit SD then please elaborate as it is my understanding that they are
> identical.
>

As Mike said, the reason to split this flag is a limitation of the Blackfin on-chip SDIO controller. Cliff has been working on it for a long time, so I have added him to the thread. Hopefully he can clear up the confusion.

Thanks
-Bryan Wu
Re: Believed resolved: SATA kern-buffRd read slow: based on promise driver bug
Linda Walsh wrote:
>Is 'main' diff between NCQ/TCQ that TCQ can re-arrange 'write'
> priority under driver control, whereas NCQ is mostly a FIFO queue?

No, NCQ can reorder, although I recently heard that Windows issues overlapping NCQ commands and expects them to be processed in order (what were they thinking?). The biggest difference between TCQ and NCQ is that TCQ is for SCSI while NCQ is for ATA. Functional differences include a larger number of available tags and ordered tags for TCQ. The former doesn't matter for a single disk. The latter may make some difference, but not by much on a single disk.

> Am trying to differentiate NCQ/TCQ and SAS v. SCSI benefits.
> It seems both support (SAS & SATA) some type of port-multiplier/
> multiplexor/ option to allow more disks/port.
>
> However, (please correct?) SATA uses a hub type architecture while
> SAS uses a switch architecture. My experience with network hubs vs.
> switches is that network hubs can be much slower if there is
> communication contention. Is the word 'hub' being used in the
> "shared-communication media sense", or is someone using the term
> 'hub' as a [sic] replacement for a 'switch'?

A port multiplier is a switch too. It doesn't broadcast anything and definitely has forwarding buffers inside. An analogy which makes more sense is expander to router and port multiplier to switch. Unless you wanna nest them, they aren't that different.

--
tejun
Re: Kbuild update
>
>"only" is the wrong word in this context.
>
>If someone would update the translations for one language every
>3 months for the next years that would be great and disprove my
>concerns.
>
>After all, updates every 3 months would beat the maintenance level of
>at least three of our architectures...

Hmm, yes.

>
>And don't underestimate the amount of work required - even when talking
>about requiring "only" 10% of the help texts translated that's a four
>digit number of lines to translate.

Thanks for your point. I agree that the initial work is not so easy.

>
>> If we can make this to be an official project for Linux kernel, I
>> think it won't be a big problem.
>
>We don't even manage to maintain the English language texts properly,
>and I am therefore not overly optimistic that we'll have the
>translations maintained properly for many years.
>
>OTOH, if someone wouldn't just blindly translate the outdated English
>texts but also review the English texts when translating this alone
>might be worth it...

Fully agreed. Maybe we can restart TLKTP?
Re: [PATCH] AMD Thermal Interrupt Support
On Jan 8, 2008 3:52 PM, Andi Kleen <[EMAIL PROTECTED]> wrote:
> > ENTRY(thermal_interrupt)
> > - apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt
> > + apicinterrupt THERMAL_APIC_VECTOR,smp_thermal_interrupt(%rip)
>
> Are you sure a * is not needed? I would have thought it would jump
> to the variable instead of through it. But if it works it's ok for me.

I will test to make sure it works. I don't think stars mean anything in AT&T X86-64.

>
> The rest of the patch looks ok to me.

Thank you! I will give it a final test and submit the official patch this week.

>
> -Andi
>
>

--
Russell Leidich
Re: Kernel Oops?
On Jan 7, 2008 5:30 PM, Alan Cox <[EMAIL PROTECTED]> wrote:
> On Mon, 7 Jan 2008 17:15:01 -0600
> "Stoyan Gaydarov" <[EMAIL PROTECTED]> wrote:
>
> > Today I upgraded my kernel from 2.6.23.9 to 2.6.23.12 and in the past
> > 30 minutes I have had to restart my computer twice.
> > I believe its a kernel oops or a kernel panic because when the
> > computer freezes it blinks the caps and scroll lock LEDs.
> > I don't know what is causing the problem but I am willing to help, I
> > can provide you with any information you need.
> > The only problem is that I don't know how to debug the system myself.
> > If anyone can tell me what to do to I can do it and give back the
> > information.
>
> When the machine hangs in graphical mode its quite hard to get the data
> out - one of the long term todo items is to fix that.
>
> Boot the machine and leave it in text mode (or if it boots to graphical
> mode then switch to a text console/text mode) and wait.. with "luck" it
> will show the same problem in text mode and give you a meaningful screen
> dump you can then write down (or grab with a digital camera)
>
> Alan
>

I reverted to a clean install of Slackware 12.0 after trying, without luck, to get it to fail again. Then I installed the 2.6.23.9 kernel and continued to use it regularly. A few minutes ago I restarted the computer because it had frozen again, the same way. This time, when rebooting the machine, I got a kernel oops message and it didn't boot completely. I could not copy it, but I did take a picture and have retyped the screen here (sorry about the formatting):

Stack: 0010 00d0 0001 00d0 c20fb980 c2104000 c2103e00 0246 c0a32fc0 47807ae8 c23eeaa0 00d0 0282 c20fb980 c026661b c23eeaa0 f586df04 c23eeaa0 c02227f2 0246 c225c480
Call Trace:
 [] kmem_cache_alloc+0x6b/0x90
 [] dup_fd+0x22/0x2c0
 [] getnstimeofday+0x36/0xc0
 [] copy_files+0x41/0x60
 [] copy_process+0x488/0x11a0
 [] alloc_pid+0x152/0x280
 [] do_fork+0x76/0x230
 [] recalc_sigpending+0x5d/0xe0
 [] sigprocmask+0x5d/0xe0
 [] sys_clone+0x32/0x40
 [] syscall_call+0x7/0xb
 [] __mutex_lock_interruptible_slowpath+0xb0/0xc0
 ===
Code: 5b 5e 5f 5d c3 8b 7a 10 89 d0 c7 42 34 01 00 00 00 83 c0 10 39 c7 74 b6 8b 4c 24 10 8b 77 10 3b b1 98 00 00 00 0f 82 1d ff ff ff <0f> 0b eb fe 8b 4c 24 18 8b 54 24 18 8b 41 08 83 c2 08 89 78 04
EIP: [] cache_alloc_refill+0x1bd/0x540 SS:ESP 0068:f586de7c
INIT: Entering runlevel: 4
Going multiuser...
Updating shared library links: /sbin/ldconfig &

I hope that someone can find the problem and fix it.
Re: [PATCH 0/4] add task handling notifier
On Sun, 2007-12-23 at 12:26 +, Christoph Hellwig wrote:
> On Thu, Dec 20, 2007 at 01:11:24PM +, Jan Beulich wrote:
> > With more and more sub-systems/sub-components leaving their footprint
> > in task handling functions, it seems reasonable to add notifiers that
> > these components can use instead of having them all patch themselves
> > directly into core files.
>
> I agree that we probably want something like this. As do some others,
> so we already had a few attempts at similar things. The first one
> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> includes allocating per-task data for its users. Then also from SGI
> there has been a simplified version called pnotify that's also available
> from the website above.
>
> Later Matt Helsley had something called "Task Watchers" which lwn has
> an article on: http://lwn.net/Articles/208117/.

Apologies for the late reply -- I haven't had internet access for the
last few weeks.

> For some reason neither ever made a lot of progress (performance
> problems?).

Yeah. Some discussion on measuring the performance of Task Watchers:
http://thread.gmane.org/gmane.linux.lse/4698

The requirements for Task Watchers were:

	Allow sleeping in most/all notifier functions in these paths:
		fork
		exec
		exit
		change [re][ug]id
	No performance overhead
	One "chain" per path ("I only care about exec().")
	Easy to use
	Scales to large numbers of CPUs
	Useful to make most in-tree code more readable. Task Watchers took
	direct calls to these pieces of code out of the fork/exec/exit paths:
		audit
		semundo
		cpusets
		mempolicy
		trace irqflags
		lockdep
		keys (for processes -- not for thread groups)
		process events connector
	Useful for loadable modules

Performance overhead in microbenchmarks was measurable at around 1% (see
the URL above). Overhead on benchmarks like kernbench, on the other hand,
was within the noise margins (which were around 1.6%), so I couldn't
determine the overhead there.

I never got the loadable module part completely working due to races
between notifier functions and the module unload path. The solution to
the races seemed to require adding more overhead to the notifier function
paths (SRCU-like grace periods).

I stopped pushing the patch set because I hadn't found any new
optimizations to offset the overheads while still meeting all the
requirements, and Andrew still felt that the "make it more readable"
argument was not sufficient to justify its inclusion.

Jan, instead of adding notifiers, could utrace be used or made to work for
modules?

Also, please add me to the Cc list for any reposts of the entire series.
Thanks!

Cheers,
	-Matt Helsley
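To make the "one chain per path" requirement above concrete, here is a
minimal userspace sketch in C. It is not the Task Watchers code: the names
(task_event, task_watcher, watcher_register, watchers_notify) are all
hypothetical, and it ignores locking, sleeping, and the module-unload race
described above. It only shows that a watcher registered for exec is never
visited on the fork path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event paths, loosely mirroring the list in the mail. */
enum task_event { TASK_FORK, TASK_EXEC, TASK_EXIT, TASK_ID_CHANGE, TASK_EVENT_MAX };

struct task_watcher {
	int (*fn)(enum task_event ev, void *task);	/* returns 0 on success */
	struct task_watcher *next;
};

/* One independent chain per path: an exec-only watcher costs fork nothing. */
static struct task_watcher *chains[TASK_EVENT_MAX];

/* Push a watcher onto the head of one path's chain (no locking: assumption). */
static void watcher_register(enum task_event ev, struct task_watcher *w)
{
	assert(ev < TASK_EVENT_MAX);
	w->next = chains[ev];
	chains[ev] = w;
}

/* Walk only the chain for this path; stop at the first failing watcher. */
static int watchers_notify(enum task_event ev, void *task)
{
	struct task_watcher *w;
	int ret;

	for (w = chains[ev]; w; w = w->next) {
		ret = w->fn(ev, task);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example watcher: counts how often the exec path fires. */
static int exec_calls;

static int count_exec(enum task_event ev, void *task)
{
	(void)ev;
	(void)task;
	exec_calls++;
	return 0;
}

static struct task_watcher exec_watcher = { .fn = count_exec };

/* Register once, fire fork and exec, and report how often the watcher ran. */
static int demo(void)
{
	watcher_register(TASK_EXEC, &exec_watcher);
	watchers_notify(TASK_FORK, NULL);	/* fork chain is empty */
	watchers_notify(TASK_EXEC, NULL);	/* count_exec runs here */
	return exec_calls;
}
```

Calling demo() once from a main() is enough to try it. A real
implementation would also have to decide how a watcher is removed while a
chain is being walked, which is exactly where the overhead discussed above
comes from.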
Re: translations (Re: Kbuild update)
>"I will use ...
>http://images.google.cz/images?svnum=100=1=cs=firefox-a=org.mozilla%3Acs%3Aofficial=I+will+use+Google+before=Hledat+obr%C3%A1zky
>... for making translations..."
>http://www.google.com/translate?u=http%3A%2F%2Flxr.linux.no%2Flinux%2FDocumentation%2FHOWTO=en%7Czh-TW=en=UTF8
>?
>
>In case if people will help Google to have better quality of translation,
>that will be better generally for much bigger number of *people*,
>especially in China, isn't it?

Perhaps yes. But at least now, that kind of translation still sucks.
It can't satisfy me.

>
>Making any official world-domination/new-world-order projects with
>Linux will not help IMHO. Very fast code flow and almost no up to date
>documentation is still relevant and google search + email archives
>are not going to be obsolete in the near future.
>
>Also, future of the linux codebase with Chinese comments in C or in
>ASM is kind of weird nightmare. Those, who cannot read actual source
>code (i.e. C) will not go too far.
>
>So, translation guys, maybe you will stop making noise and will start
>to make e.g. less buggy Linux? Greg KH have much more stuff to care,
>than some translations IMHO.

I never said to translate C comments. What we want to translate is the
strings in Kconfig.

I absolutely agree that we should focus on the existing bugs of Linux, but
like Greg's inclusion of some kernel doc translations, this kind of work
is really helpful for attracting kernel newbies from non-English-speaking
countries. Even if we can't make official efforts, unofficial community
work, like TLKTP, is still worthwhile.

Believe me, I am leading a local LUG in my college and I found that one
_big_ reason why newbies are afraid of the Linux kernel is English, rather
than the C tricks or low-level programming.

Regards.

Cong
[PATCH] x86: Use fixup_exception() in traps_64.c
Use the fixup_exception() helper instead of the open-coded
search_extable() users.

Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]>
---
Ingo, this depends on my patch in x86.git unifying extable.c that
introduces fixup_exception() to X86_64.

 arch/x86/kernel/traps_64.c | 47 ++-
 1 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/traps_64.c b/arch/x86/kernel/traps_64.c
index e3d1ca1..c173687 100644
--- a/arch/x86/kernel/traps_64.c
+++ b/arch/x86/kernel/traps_64.c
@@ -606,19 +606,12 @@ static void __kprobes do_trap(int trapnr, int signr, char *str,
 	}
 
 
-	/* kernel trap */
-	{
-		const struct exception_table_entry *fixup;
-		fixup = search_exception_tables(regs->ip);
-		if (fixup)
-			regs->ip = fixup->fixup;
-		else {
-			tsk->thread.error_code = error_code;
-			tsk->thread.trap_no = trapnr;
-			die(str, regs, error_code);
-		}
-		return;
+	if (!fixup_exception(regs)) {
+		tsk->thread.error_code = error_code;
+		tsk->thread.trap_no = trapnr;
+		die(str, regs, error_code);
 	}
+	return;
 }
 
 #define DO_ERROR(trapnr, signr, str, name) \
@@ -707,22 +700,15 @@ asmlinkage void __kprobes do_general_protection(struct pt_regs * regs,
 		return;
 	}
 
-	/* kernel gp */
-	{
-		const struct exception_table_entry *fixup;
-		fixup = search_exception_tables(regs->ip);
-		if (fixup) {
-			regs->ip = fixup->fixup;
-			return;
-		}
+	if (fixup_exception(regs))
+		return;
 
-		tsk->thread.error_code = error_code;
-		tsk->thread.trap_no = 13;
-		if (notify_die(DIE_GPF, "general protection fault", regs,
-					error_code, 13, SIGSEGV) == NOTIFY_STOP)
-			return;
-		die("general protection fault", regs, error_code);
-	}
+	tsk->thread.error_code = error_code;
+	tsk->thread.trap_no = 13;
+	if (notify_die(DIE_GPF, "general protection fault", regs,
+				error_code, 13, SIGSEGV) == NOTIFY_STOP)
+		return;
+	die("general protection fault", regs, error_code);
 }
 
 static __kprobes void
@@ -914,12 +900,9 @@ clear_TF_reenable:
 
 static int kernel_math_error(struct pt_regs *regs, const char *str, int trapnr)
 {
-	const struct exception_table_entry *fixup;
-	fixup = search_exception_tables(regs->ip);
-	if (fixup) {
-		regs->ip = fixup->fixup;
+	if (fixup_exception(regs))
 		return 1;
-	}
+
 	notify_die(DIE_GPF, str, regs, 0, trapnr, SIGFPE);
 	/* Illegal floating point operation in the kernel */
 	current->thread.trap_no = trapnr;
-- 
1.5.4.rc2.1164.g6451
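For readers unfamiliar with the pattern the patch consolidates, the
following is a rough userspace sketch in C of what a fixup_exception()-style
helper does: look up the faulting instruction address in a sorted table
and, if an entry exists, redirect execution to its fixup address. Everything
here is an assumption made for illustration: struct extable_entry,
struct fake_regs, the addresses in the table, and the function names are
not the kernel's definitions. The return convention (nonzero when a fixup
was applied) is inferred from how the patch uses the helper.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an exception table entry (layout is an assumption). */
struct extable_entry {
	unsigned long insn;	/* address of an instruction that may fault */
	unsigned long fixup;	/* address to resume at if it does */
};

/* Simplified stand-in for struct pt_regs: only the instruction pointer. */
struct fake_regs {
	unsigned long ip;
};

/* Hypothetical table with made-up addresses; must stay sorted by insn. */
static const struct extable_entry table[] = {
	{ 0x1000, 0x9000 },
	{ 0x2000, 0x9100 },
	{ 0x3000, 0x9200 },
};

/* Binary search for an exact match on the faulting address; NULL if none. */
static const struct extable_entry *search_table(unsigned long ip)
{
	size_t lo = 0, hi = sizeof(table) / sizeof(table[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].insn == ip)
			return &table[mid];
		if (table[mid].insn < ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Returns 1 and redirects regs->ip if a fixup exists, otherwise 0. */
static int fixup_exception_sketch(struct fake_regs *regs)
{
	const struct extable_entry *e = search_table(regs->ip);

	if (!e)
		return 0;
	regs->ip = e->fixup;
	return 1;
}
```

With the lookup and the ip rewrite folded into one call, each trap handler
only needs the "if (fixup_exception(regs)) return;" shape seen in the diff,
rather than repeating the search and assignment in every handler.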