Re: [linux-pm] [PATCH v2] Add suspend/resume for HPET
On Mon, 2007-04-02 at 16:04 -0400, Alan Stern wrote:
> > It's not that simple though, especially with HPET. The BIOS may expect
> > the PIT to work, but Linux currently (and problematically!) uses HPET in
> > "legacy replacement mode". And ISTR the problems are coming up when the
> > system is already in a low-functionality state: IRQs off everywhere,
> > even timer ticks have stopped.
>
> I know nothing about the workings of the HPET and other clock code. My
> point was this: Suspend passes through various intermediate stages in
> which some devices are available and others aren't. So long as those
> stages are exact duplicates (in reverse order) of the stages that occurred
> during startup, it should be possible to make them all work.

Unfortunately it is not a fully linear problem. Devices are initialized
late and put the system into a more complex state (i.e. dynticks,
highres) which needs to be suspended and resumed. If we want to do this
completely linear we need to do a full reverse rollback of the system
states, which moves even more complexity into such systems.

Also the linear approach is not working with other devices, as one can
see with the still unresolved "IRQ#X nobody cared" issues at resume,
which break my laptop. It works nice on startup of the system, but
breaks on resume.

	tglx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Could the k8temp driver be interfering with ACPI?
Hi Dave,

On Mon, 2 Apr 2007 15:22:09 -0400, Dave Jones wrote:
> On Mon, Apr 02, 2007 at 05:48:59PM +0200, Jean Delvare wrote:
> > +	u8 val;
> > +#ifdef CONFIG_ACPI
> > +	acpi_ut_acquire_mutex(ACPI_MTX_INTERPRETER);
> > +#endif
> >  	outb(reg, data->addr + ADDR_REG_OFFSET);
> > -	return inb(data->addr + DATA_REG_OFFSET);
> > +	val = inb(data->addr + DATA_REG_OFFSET);
> > +#ifdef CONFIG_ACPI
> > +	acpi_ut_release_mutex(ACPI_MTX_INTERPRETER);
> > +#endif
> > +	return val;
>
> ... deletia, more of the same.
>
> it'd probably end up a lot cleaner to #define them to empty macros
> in the !ACPI case in acpi/acpi.h and just #include it unconditionally.

Sure, the implementation details can be refined later. I'm only trying
to see what can be done for now.

-- 
Jean Delvare
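Dave's suggestion (empty macros when the feature is compiled out, so the call sites lose their #ifdefs) can be sketched in plain userspace C. This is only an illustration of the pattern, not the ACPI API: USE_HW_LOCK, hw_lock(), hw_unlock(), lock_depth and fake_port are made-up names standing in for CONFIG_ACPI, the interpreter mutex calls and the I/O ports.

```c
#include <assert.h>

/* Hypothetical switch standing in for CONFIG_ACPI; left undefined here. */
/* #define USE_HW_LOCK 1 */

#ifdef USE_HW_LOCK
static int lock_depth;				/* toy lock: just counts nesting */
#define hw_lock()	do { lock_depth++; } while (0)
#define hw_unlock()	do { lock_depth--; } while (0)
#else
/* Feature compiled out: the macros expand to nothing. */
#define hw_lock()	do { } while (0)
#define hw_unlock()	do { } while (0)
#endif

/* Stand-in for the device registers behind outb()/inb(). */
static unsigned char fake_port[256];

/* The call site needs no #ifdef; it is the same in both configurations. */
static unsigned char read_reg(unsigned char reg)
{
	unsigned char val;

	hw_lock();
	val = fake_port[reg];
	hw_unlock();
	return val;
}
```

The do { } while (0) form keeps the empty macro usable as a single statement, for example inside an unbraced if.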
Re: [PATCH] vt: Expose system-wide UTF-8 default setting via sysfs
On Tue, 2007-04-03 at 10:06 +0600, Alexander E. Patrakov wrote:
> Antonino A. Daplas wrote:
> > Create a variable, default_utf8, that defines the system-wide default UTF-8
> > setting. This variable can be altered via sysfs. If the variable is properly
> > set, this should minimize breakage of UTF-8 encoded consoles when doing a
> > reset or echo -e '\033c' and of newly opened/allocated consoles.
> >
> > This is based on patches by Jan Engelhardt and Paul LeoNerd Evans.
> >
> > Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]>
> > ---
> >> I think you're missing the whole point of console reset. Its purpose is
> >> to force the console into a known-good state. The fewer pieces of state
> >> it leaves unset, the better. To some degree it's less important what
> >> that state actually is.
> >
> > Okay, you convinced me. Hopefully this is acceptable to all parties.
> >
> > Andrew,
> >
> > If everybody agrees, can you drop the previous patch I sent to you, and use
> > this instead?
> >
> > Tony
>
> > +static int default_utf8;
> > +module_param(default_utf8, int, S_IRUGO | S_IWUSR);
>
> Module parameter without description and documentation? Yes, I understand
> that it is impossible to make vt a module. How about adding a line to
> Documentation/kernel-parameters.txt?

I'll do that (and I'll also include Jan's palette patch) once I'm sure
there's no violent objection against the change.

Tony
Re: 2.6.20.3 AMD64 oops in CFQ code
[resending. my mail service was down for more than a week and this
message didn't get delivered.]

[EMAIL PROTECTED] wrote:
> > Anyway, what's annoying is that I can't figure out how to bring the
> > drive back on line without resetting the box. It's in a hot-swap enclosure,
> > but power cycling the drive doesn't seem to help. I thought libata hotplug
> > was working? (SiI3132 card, using the sil24 driver.)

Yeah, it's working but failing resets are considered highly dangerous
(in that the controller status is unknown and may cause something
dangerous like screaming interrupts) and the port is muted after that.
The plan is to handle this with polling hotplug such that libata tries
to revive the port if a PHY status change is detected by polling.
Patches are available but they need other things to be resolved before
they can get integrated. I think it'll happen before the summer.

Anyways, you can tell libata to retry the port by manually telling it to
rescan the port:

  echo - - - > /sys/class/scsi_host/hostX/scan

> > (H'm... after rebooting, reallocated sectors jumped from 26 to 39.
> > Something is up with that drive.)

Yeap, seems like a broken drive to me.

Thanks.

-- 
tejun
Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)
On Mon, 2007-04-02 at 21:57 -0700, Andrew Morton wrote:
> On Tue, 03 Apr 2007 14:45:02 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > Does that mean that to function correctly every user needs some internal
> > cursor so it doesn't end up scanning the first N entries over and over?
>
> If it wants to be well-behaved, and to behave as the VM expects, yes.
>
> There's an expectation that the callback will be performing some scan-based
> aging operation and of course to do LRU (or whatever) aging, the callback
> will need to remember where it was up to last time it was called.
>
> But it's just a guideline - callbacks could do something different but
> in-the-spirit, I guess.

Hmm, actually the callers I looked at (nfs, dcache, mbcache) seem to use
an LRU list and just walk the first "nr_to_scan" entries, and nr_to_scan
is always 128.

Someone who keeps a cursor will be disadvantaged: the other shrinkers
could well get less effective on repeated calls, but we won't. Someone
who picks entries at random might have the same issue.

I think it is clearest to describe how we expect everyone to work, and
let whoever is getting creative worry about it themselves.

How's this:
==
Cleanup and kernelify shrinker registration.

I can never remember what the function to register to receive VM pressure
is called. I have to trace down from __alloc_pages() to find it.

It's called "set_shrinker()", and it needs Your Help.

New version:
1) Don't hide struct shrinker. It contains no magic.
2) Don't allocate "struct shrinker". It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Reduce the 17 lines of waffly comments to 13, but document it properly.

Comments:
1) The comment in reiserfs4 makes me a little queasy.
2) The wrapper code in xfs might no longer be needed.
3) The placing of set_shrinker in the x86-64 "hot function list" seems a
   little unlikely. Clearly, Andi was testing if anyone was paying attention.
Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>

diff -r 0b43dab739aa arch/x86_64/kernel/functionlist
--- a/arch/x86_64/kernel/functionlist	Tue Apr 03 15:37:49 2007 +1000
+++ b/arch/x86_64/kernel/functionlist	Tue Apr 03 15:37:53 2007 +1000
@@ -1118,7 +1118,6 @@
 *(.text.simple_strtoll)
 *(.text.set_termios)
 *(.text.set_task_comm)
-*(.text.set_shrinker)
 *(.text.set_normalized_timespec)
 *(.text.set_brk)
 *(.text.serial_in)
diff -r 0b43dab739aa fs/dcache.c
--- a/fs/dcache.c	Tue Apr 03 15:37:49 2007 +1000
+++ b/fs/dcache.c	Tue Apr 03 15:37:53 2007 +1000
@@ -884,6 +884,11 @@ static int shrink_dcache_memory(int nr,
 	}
 	return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dcache_shrinker = {
+	.shrink = shrink_dcache_memory,
+	.seeks = DEFAULT_SEEKS,
+};
 
 /**
  * d_alloc	-	allocate a dcache entry
@@ -2144,8 +2149,8 @@ static void __init dcache_init(unsigned
 			(SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
 			SLAB_MEM_SPREAD),
 			NULL, NULL);
-
-	set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory);
+
+	register_shrinker(&dcache_shrinker);
 
 	/* Hash may have been set up in dcache_init_early */
 	if (!hashdist)
diff -r 0b43dab739aa fs/dquot.c
--- a/fs/dquot.c	Tue Apr 03 15:37:49 2007 +1000
+++ b/fs/dquot.c	Tue Apr 03 15:37:53 2007 +1000
@@ -538,6 +538,11 @@ static int shrink_dqcache_memory(int nr,
 	}
 	return (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure;
 }
+
+static struct shrinker dqcache_shrinker = {
+	.shrink = shrink_dqcache_memory,
+	.seeks = DEFAULT_SEEKS,
+};
 
 /*
  * Put reference to dquot
@@ -1871,7 +1876,7 @@ static int __init dquot_init(void)
 	printk("Dquot-cache hash table entries: %ld (order %ld, %ld bytes)\n",
 			nr_hash, order, (PAGE_SIZE << order));
 
-	set_shrinker(DEFAULT_SEEKS, shrink_dqcache_memory);
+	register_shrinker(&dqcache_shrinker);
 
 	return 0;
 }
diff -r 0b43dab739aa fs/inode.c
--- a/fs/inode.c	Tue Apr 03 15:37:49 2007 +1000
+++ b/fs/inode.c	Tue Apr 03 15:37:53 2007 +1000
@@ -474,6 +474,11 @@ static int shrink_icache_memory(int nr,
 	return (inodes_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
 }
 
+static struct shrinker icache_shrinker = {
+	.shrink = shrink_icache_memory,
+	.seeks = DEFAULT_SEEKS,
+};
+
 static void __wait_on_freeing_inode(struct inode *inode);
 /*
  * Called with the inode lock held.
@@ -1393,7 +1398,7 @@ void __init inode_init(unsigned long mem
 				SLAB_MEM_SPREAD),
 				init_once,
 				NULL);
-	set_shrinker(DEFAULT_SEEKS, shrink_icache_memory);
+	register_shrinker(&icache_shrinker);
 
 	/* Hash may have been set up in
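The registration style the patch moves to (a caller-owned static struct handed to register_shrinker(), instead of set_shrinker() allocating one) can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the kernel implementation: the next pointer, shrinker_list, shrink_all(), the demo_* names and the simplified shrink(int) signature are invented for the example, and the seeks value of 2 is assumed to match DEFAULT_SEEKS.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; .shrink and .seeks follow the patch, the rest is assumed. */
struct shrinker {
	int (*shrink)(int nr_to_scan);	/* returns objects left in the cache */
	int seeks;
	struct shrinker *next;		/* hypothetical singly linked registry */
};

static struct shrinker *shrinker_list;

static void register_shrinker(struct shrinker *s)
{
	s->next = shrinker_list;
	shrinker_list = s;
}

static void unregister_shrinker(struct shrinker *s)
{
	struct shrinker **p;

	for (p = &shrinker_list; *p; p = &(*p)->next) {
		if (*p == s) {
			*p = s->next;
			s->next = NULL;
			return;
		}
	}
}

/* Toy cache: scanning nr entries frees that many, up to what is left. */
static int demo_objects = 300;

static int demo_shrink(int nr_to_scan)
{
	if (nr_to_scan > demo_objects)
		nr_to_scan = demo_objects;
	demo_objects -= nr_to_scan;
	return demo_objects;
}

/* No allocation: the caller owns the struct, as in the patch. */
static struct shrinker demo_shrinker = {
	.shrink = demo_shrink,
	.seeks = 2,	/* assumed to match the kernel's DEFAULT_SEEKS */
};

/* Apply pressure: ask every registered shrinker to scan nr entries. */
static void shrink_all(int nr_to_scan)
{
	struct shrinker *s;

	for (s = shrinker_list; s; s = s->next)
		s->shrink(nr_to_scan);
}
```

Because the struct is static and caller-owned, registration cannot fail for lack of memory, which is part of what point 2 of the changelog is getting at.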
Re: 2.6.20.4: NETDEV WATCHDOG and lockups
On Tue, 3 Apr 2007, Len Brown wrote:
> Which increased stability, disabling ACPI, or disabling the IOAPIC?

To be honest, we're not sure. See below.

> Your box has MPS, so you should be able to use the IOAPIC in either mode.

MPS - Multiprocessor Specification? SMP? Yes, it'd be good to use the
IOAPIC again.

> Note that you can do these both independently at boot-time
> with "acpi=off" and "noapic", respectively.
>
> eg. 4 combos
> 1.
> 2. noapic
> 3. acpi=off
> 4. acpi=off noapic
>
> you started with #1, and are running hard-coded #4 now, but skipped #2 and #3

Indeed, we skipped quite a few options. As mentioned before, the boxes
are in production already so we don't have much time to play around and
we were just happy when they survived a few hours :(

But yes, we'll try booting with "acpi=off" and enabled IOAPIC again.

@Malte: when will we be able to do so?

Len et al., do you even suggest to use ACPI on a server system at all? I
myself always thought of ACPI being evil and to avoid when possible
(thus switching it off completely on a server system).

Since these NETDEV WATCHDOG issues seem to be a "known issue" (kinda,
given the many postings on the lists in the past), is there something
else we should look into? Would more debug .config options help to find
out why they lock up?

Thanks for your comments,
Christian.
-- 
BOFH excuse #340:

Well fix that in the next (upgrade, update, patch release, service pack).
2.6.21-rc5-mm4
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm4/

- The oops in git-net.patch has been fixed, so that tree has been restored.
  It is huge.

- Added the device-mapper development tree to the -mm lineup (Alasdair
  Kergon). It is a quilt tree, living at
  ftp://ftp.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/.

- Added davidel's signalfd stuff.


Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail [EMAIL PROTECTED]

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug. Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug. Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.


Changes since 2.6.21-rc5-mm3:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-agpgart.patch
 git-arm.patch
 git-avr32.patch
 git-cifs.patch
 git-cpufreq.patch
 git-powerpc.patch
 git-drm.patch
 git-dvb.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ia64.patch
 git-ieee1394.patch
 git-infiniband.patch
 git-input.patch
 git-kbuild.patch
 git-kvm.patch
 git-leds.patch
 git-libata-all.patch
 git-md-accel.patch
 git-md-accel-fix.patch
 git-mips.patch
 git-mmc.patch
 git-mtd.patch
 git-ubi.patch
 git-netdev-all.patch
 git-e1000.patch
 git-net.patch
 git-ioat.patch
 git-ocfs2.patch
 git-parisc.patch
 git-r8169.patch
 git-selinux.patch
 git-pciseg.patch
 git-s390.patch
 git-scsi-misc.patch
 git-block.patch
 git-unionfs.patch
 git-watchdog.patch
 git-wireless.patch
 git-ipwireless_cs.patch
 git-cryptodev.patch
 git-gccbug.patch

 git trees.

-proc-fix-linkage-with-config_sysctl=y-config_proc_sysctl=n.patch
-uml-fix-unreasonably-long-udelay.patch
-fix-firmware-sample-code.patch
-jdelvare-i2c-i2c-algo-bit-document-udelay.patch
-pcmcia-allow-pcmcia-scsi-drivers-to-be-built-into-the.patch
-gregkh-pci-pci-set-pci-bfsort-for-poweredge-r900.patch
-fix-gregkh-pci-pci-piggy-bus.patch
-drivers-scsi-dpt_i2oc-remove-dead-code.patch
-scsi-whitespace-cleanup-in-the-dpt-driver.patch
-drivers-scsi-aic7xxx-make-functions-static.patch
-remove-some-unused-scsi-related-kernel-config-variables.patch
-drivers-scsi-aacraid-cleanups.patch
-make-mptspi_target_destroy-static.patch
-qla2xxx-remove-duplicate-pci_disable_device-call.patch
-gregkh-usb-usb-gtcoc-fix-a-use-before-check.patch
-gregkh-usb-usb-ati_remote2-add-channel-support.patch
-usb-serial-whiteheat-convert-to-generic-boolean.patch
-x86_64-mm-dont-probe-for-ddc-on-vbe1_2.patch
-x86_64-mm-remove-hardcoding-of-hard_smp_processor_id-on-up-systems.patch
-drivers-mfd-sm501c-fix-an-off-by-one.patch

 Merged into mainline or a subsystem tree.

+md-avoid-a-deadlock-when-removing-a-device-from-an-md-array-via-sysfs.patch
+md-avoid-a-deadlock-when-removing-a-device-from-an-md-array-via-sysfs-fix.patch
+revert-driver-core-do-not-wait-unnecessarily-in-driver_unregister.patch

 2.6.21 queue.

+vmi-paravirt-ops-bugfix-for-2621.patch

 Might be 2.6.21 queue.

+drivers-acpi-kconfig-formulation-fixpatch.patch

 ACPI fixlet

+pata_platform-for-arm-riscpc.patch

 ARM/pata fix

+cifs-use-mutexdiff.patch

 CIFS cleanup

+agk-dm-dm-merge-max_hw_sector.patch
+agk-dm-dm-raid1-one-kmirrord-per-mirror.patch
+agk-dm-dm-crypt-disable-barriers.patch
+agk-dm-dm-crypt-add-null-iv.patch
+agk-dm-dm-mpath-log-device-name.patch
+agk-dm-dm-allow-offline-devices.patch
+agk-dm-dm-log-fault-detection.patch
+agk-dm-dm-log-report-fault-status.patch
+agk-dm-dm-raid1-add-handle_errors-feature-flag.patch
+agk-dm-dm-io-delay-dec_count.patch
+agk-dm-dm-io-prepare-for-new-interface.patch
+agk-dm-dm-io-new-interface.patch
+agk-dm-dm-kcopyd-update-dm-io-interface.patch
+agk-dm-dm-exception-store-update-dm-io-interface.patch
+agk-dm-dm-log-update-dm-io-interface.patch
+agk-dm-dm-raid1-update-dm-io-interface.patch
+agk-dm-dm-io-remove-old-interface.patch

 Device mapper development tree.
Re: [KJ][PATCH]ROUND_UP macro cleanup in drivers/net/ixgb
Milind Arun Choudhary wrote:

IXGB_ROUNDUP macro cleanup ,use ALIGN

cool beans! Same reply as to the ALIGN patch you sent for e1000 -> We'll
take it for a spin and I'll push your patch upstream as part of the
regular updates!

Thanks,

Auke

Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]>

---
 ixgb.h         |    3 ---
 ixgb_ethtool.c |    4 ++--
 ixgb_main.c    |    4 ++--
 ixgb_param.c   |    4 ++--
 4 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ixgb/ixgb.h b/drivers/net/ixgb/ixgb.h
index cf30a10..c8e9086 100644
--- a/drivers/net/ixgb/ixgb.h
+++ b/drivers/net/ixgb/ixgb.h
@@ -111,9 +111,6 @@ struct ixgb_adapter;
 /* How many Rx Buffers do we bundle into one write to the hardware ? */
 #define IXGB_RX_BUFFER_WRITE	8	/* Must be power of 2 */
 
-/* only works for sizes that are powers of 2 */
-#define IXGB_ROUNDUP(i, size) ((i) = (((i) + (size) - 1) & ~((size) - 1)))
-
 /* wrapper around a pointer to a socket buffer,
  * so a DMA handle can be stored along with the buffer */
 struct ixgb_buffer {
diff --git a/drivers/net/ixgb/ixgb_ethtool.c b/drivers/net/ixgb/ixgb_ethtool.c
index d6628bd..cdefaff 100644
--- a/drivers/net/ixgb/ixgb_ethtool.c
+++ b/drivers/net/ixgb/ixgb_ethtool.c
@@ -577,11 +577,11 @@ ixgb_set_ringparam(struct net_device *netdev,
 
 	rxdr->count = max(ring->rx_pending,(uint32_t)MIN_RXD);
 	rxdr->count = min(rxdr->count,(uint32_t)MAX_RXD);
-	IXGB_ROUNDUP(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
+	rxdr->count = ALIGN(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
 
 	txdr->count = max(ring->tx_pending,(uint32_t)MIN_TXD);
 	txdr->count = min(txdr->count,(uint32_t)MAX_TXD);
-	IXGB_ROUNDUP(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
+	txdr->count = ALIGN(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
 
 	if(netif_running(adapter->netdev)) {
 		/* Try to get new resources before deleting old */
diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
index afc2ec7..158c71e 100644
--- a/drivers/net/ixgb/ixgb_main.c
+++ b/drivers/net/ixgb/ixgb_main.c
@@ -685,7 +685,7 @@ ixgb_setup_tx_resources(struct ixgb_adapter *adapter)
 
 	/* round up to nearest 4K */
 	txdr->size = txdr->count * sizeof(struct ixgb_tx_desc);
-	IXGB_ROUNDUP(txdr->size, 4096);
+	txdr->size = ALIGN(txdr->size, 4096);
 
 	txdr->desc = pci_alloc_consistent(pdev, txdr->size, &txdr->dma);
 	if(!txdr->desc) {
@@ -774,7 +774,7 @@ ixgb_setup_rx_resources(struct ixgb_adapter *adapter)
 
 	/* Round up to nearest 4K */
 	rxdr->size = rxdr->count * sizeof(struct ixgb_rx_desc);
-	IXGB_ROUNDUP(rxdr->size, 4096);
+	rxdr->size = ALIGN(rxdr->size, 4096);
 
 	rxdr->desc = pci_alloc_consistent(pdev, rxdr->size, &rxdr->dma);
 
diff --git a/drivers/net/ixgb/ixgb_param.c b/drivers/net/ixgb/ixgb_param.c
index b27442a..ee8cc67 100644
--- a/drivers/net/ixgb/ixgb_param.c
+++ b/drivers/net/ixgb/ixgb_param.c
@@ -284,7 +284,7 @@ ixgb_check_options(struct ixgb_adapter *adapter)
 		} else {
 			tx_ring->count = opt.def;
 		}
-		IXGB_ROUNDUP(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
+		tx_ring->count = ALIGN(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE);
 	}
 	{ /* Receive Descriptor Count */
 		struct ixgb_option opt = {
@@ -303,7 +303,7 @@ ixgb_check_options(struct ixgb_adapter *adapter)
 		} else {
 			rx_ring->count = opt.def;
 		}
-		IXGB_ROUNDUP(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
+		rx_ring->count = ALIGN(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE);
 	}
 	{ /* Receive Checksum Offload Enable */
 		struct ixgb_option opt = {
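For readers who want to see what the round-up does, here is a small userspace sketch of the same arithmetic. ALIGN_UP and ring_bytes() are illustrative names, not kernel or driver symbols; the expression mirrors the removed IXGB_ROUNDUP body, minus the in-place assignment, and like it only works when the alignment is a power of 2.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-in for a round-up-to-multiple macro. Unlike the removed
 * IXGB_ROUNDUP it yields a value instead of assigning to its argument.
 * Only valid when a is a power of 2.
 */
#define ALIGN_UP(x, a)	(((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Bytes needed for a descriptor ring, rounded up to the next 4K boundary. */
static size_t ring_bytes(size_t count, size_t desc_size)
{
	return ALIGN_UP(count * desc_size, 4096);
}
```

With 16-byte descriptors, 256 entries fit exactly in 4096 bytes, while 257 entries need 4112 bytes and are rounded up to 8192.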
Re: [PATCH] Kprobes: Print details of kretprobe on assertion failure
On Mon, Apr 02, 2007 at 02:17:32PM -0700, Andrew Morton wrote:
> On Mon, 2 Apr 2007 14:56:36 +0530
> Ananth N Mavinakayanahalli <[EMAIL PROTECTED]> wrote:
>
> > From: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
> >
> > In certain cases like when the real return address can't be found or
> > when the number of tracked calls to a kretprobed function is less than
> > the number of returns, we may not be able to find the correct return
> > address after processing a kretprobe. Currently we just do a BUG_ON, but
> > no information is provided about the actual failing kretprobe.
> >
> > Print out details of the kretprobe before calling BUG().
> >
> > Signed-off-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
> >
> > ---
> >  arch/i386/kernel/kprobes.c    |    6 +-
> >  arch/ia64/kernel/kprobes.c    |    7 ++-
> >  arch/powerpc/kernel/kprobes.c |    7 ++-
> >  arch/x86_64/kernel/kprobes.c  |    7 ++-
> >  4 files changed, 23 insertions(+), 4 deletions(-)
> >
> > Index: linux-2.6.21-rc5/arch/i386/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/i386/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/i386/kernel/kprobes.c
> > @@ -440,7 +440,11 @@ fastcall void *__kprobes trampoline_hand
> >  			break;
> >  	}
> >
> > -	BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +	if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +		printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +				ri->rp, ri->rp->kp.addr);
> > +		BUG();
> > +	}
> >
> >  	spin_unlock_irqrestore(&kretprobe_lock, flags);
> >
> > Index: linux-2.6.21-rc5/arch/ia64/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/ia64/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/ia64/kernel/kprobes.c
> > @@ -444,7 +444,12 @@ int __kprobes trampoline_probe_handler(s
> >  			break;
> >  	}
> >
> > -	BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +	if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +		printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +				ri->rp, ri->rp->kp.addr);
> > +		BUG();
> > +	}
> > +
> >  	regs->cr_iip = orig_ret_address;
> >
> >  	reset_current_kprobe();
> > Index: linux-2.6.21-rc5/arch/powerpc/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/powerpc/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/powerpc/kernel/kprobes.c
> > @@ -293,7 +293,12 @@ int __kprobes trampoline_probe_handler(s
> >  			break;
> >  	}
> >
> > -	BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +	if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +		printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +				ri->rp, ri->rp->kp.addr);
> > +		BUG();
> > +	}
> > +
> >  	regs->nip = orig_ret_address;
> >
> >  	reset_current_kprobe();
> > Index: linux-2.6.21-rc5/arch/x86_64/kernel/kprobes.c
> > ===
> > --- linux-2.6.21-rc5.orig/arch/x86_64/kernel/kprobes.c
> > +++ linux-2.6.21-rc5/arch/x86_64/kernel/kprobes.c
> > @@ -438,7 +438,12 @@ int __kprobes trampoline_probe_handler(s
> >  			break;
> >  	}
> >
> > -	BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
> > +	if (!orig_ret_address || (orig_ret_address == trampoline_address)) {
> > +		printk("kretprobe BUG!: Processing kretprobe %p @ %p\n",
> > +				ri->rp, ri->rp->kp.addr);
> > +		BUG();
> > +	}
> > +
> >  	regs->rip = orig_ret_address;
> >
>
> A lot of copying-and-pasting there. Would it be better if this assertion
> was performed in a library function in kernel/kprobes.c?

Indeed. Here is the updated patch...

From: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>

In certain cases like when the real return address can't be found or
when the number of tracked calls to a kretprobed function is less than
the number of returns, we may not be able to find the correct return
address after processing a kretprobe. Currently we just do a BUG_ON, but
no information is provided about the actual failing kretprobe.

Print out details of the kretprobe before calling BUG().

Signed-off-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>

---
 arch/i386/kernel/kprobes.c    |    3 +--
 arch/ia64/kernel/kprobes.c    |    3 ++-
 arch/powerpc/kernel/kprobes.c |    2 +-
 arch/x86_64/kernel/kprobes.c  |    2 +-
 include/linux/kprobes.h       |   10 ++
 5 files changed, 15 insertions(+), 5 deletions(-)

Index: linux-2.6.21-rc5/arch/i386/kernel/kprobes.c
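The updated patch is cut off above, so its actual helper is not shown. As a rough idea of what Andrew's suggestion amounts to (one shared check instead of four pasted copies), here is a userspace sketch. struct probe_info and check_ret_address() are hypothetical names, and the function returns an error code where the kernel code would call BUG().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model of one return-probe instance. */
struct probe_info {
	void *probe;	/* stands in for ri->rp */
	void *addr;	/* stands in for ri->rp->kp.addr */
};

/*
 * Shared check used by every "architecture" caller.
 * Returns 0 if the recovered return address is usable, -1 after logging
 * the failing probe otherwise (the kernel code would call BUG() instead).
 */
static int check_ret_address(const struct probe_info *pi,
			     unsigned long orig_ret_address,
			     unsigned long trampoline_address)
{
	if (!orig_ret_address || orig_ret_address == trampoline_address) {
		fprintf(stderr,
			"kretprobe BUG!: Processing kretprobe %p @ %p\n",
			pi->probe, pi->addr);
		return -1;
	}
	return 0;
}
```

Each arch-specific trampoline handler would then make a single call to the helper in place of its own if/printk/BUG() block.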
Re: [xfs-masters] Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)
On Mon, Apr 02, 2007 at 09:57:02PM -0700, Andrew Morton wrote:
> On Tue, 03 Apr 2007 14:45:02 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
>
> > On Mon, 2007-04-02 at 20:58 -0700, Andrew Morton wrote:
> > > On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> > >
> > > > I can never remember what the function to register to receive VM pressure
> > > > is called. I have to trace down from __alloc_pages() to find it.
> > > >
> > > > It's called "set_shrinker()", and it needs Your Help.
> > > >
> > > > New version:
> > > > 1) Don't hide struct shrinker. It contains no magic.
> > > > 2) Don't allocate "struct shrinker". It's not helpful.
> > > > 3) Call them "register_shrinker" and "unregister_shrinker".
> > > > 4) Call the function "shrink" not "shrinker".
> > > > 5) Rename "nr_to_scan" argument to "nr_to_free".
> > >
> > > No, it is actually the number to scan. This is >= the number of freed
> > > objects.
> > >
> > > This is because, for better or for worse, the VM tries to balance the
> > > scanning rate of the various caches, not the reclaiming rate.
> >
> > Err, ok, I completely missed that distinction.
> >
> > Does that mean that to function correctly every user needs some internal
> > cursor so it doesn't end up scanning the first N entries over and over?
> >
>
> If it wants to be well-behaved, and to behave as the VM expects, yes.
>
> There's an expectation that the callback will be performing some scan-based
> aging operation and of course to do LRU (or whatever) aging, the callback
> will need to remember where it was up to last time it was called.
>
> But it's just a guideline - callbacks could do something different but
> in-the-spirit, I guess.

In XFS, one of the shrinkers that gets registered causes all the
xfsbufd's in the system to run and write back delayed write metadata -
this can't be freed up until it is clean, and this is the only hook we
have that can be used to trigger writeback on memory pressure. We need
this because we can potentially have hundreds of megabytes of dirty
metadata per XFS filesystem.

IOW, the way the VM expects the shrinkers to work can be far, far away
from what subsystems need the shrinker callbacks for.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
Re: 2.6.20.4: NETDEV WATCHDOG and lockups
On Mon, 2 Apr 2007, Chuck Ebbert wrote:
> Where is the info from before you changed to "noapic"? Or were the
> machines always using XT-PIC for all the interrupts???

XT-PIC is only used since we switched to noapic, before there was
IO-APIC-fasteoi on both ethernet cards and interrupts were balanced
well.

Thanks,
Christian.
-- 
BOFH excuse #340:

Well fix that in the next (upgrade, update, patch release, service pack).
Re: [PATCH] sched: staircase deadline misc fixes
On Tue, 2007-04-03 at 12:37 +1000, Con Kolivas wrote:
> On Thursday 29 March 2007 15:50, Mike Galbraith wrote:
> > On Thu, 2007-03-29 at 09:44 +1000, Con Kolivas wrote:
> > + * This contains a bitmap for each dynamic priority level with empty slots
> > + * for the valid priorities each different nice level can have. It allows
> > + * us to stagger the slots where differing priorities run in a way that
> > + * keeps latency differences between different nice levels at a minimum.
> > + * ie, where 0 means a slot for that priority, priority running from left to
> > + * right:
> > + * nice -20
> > + * nice -10 1001000100100010001001000100010010001000
> > + * nice   0 0101010101010101010101010101010101010101
> > + * nice   5 1101011010110101101011010110101101011011
> > + * nice  10 0110111011011101110110111011101101110111
> > + * nice  15 0101101101011011
> > + * nice  19 1110
>
> Try two instances of chew.c at _differing_ nice levels on one cpu on mainline,
> and then SD. This is why you can't renice X on mainline.

How about something more challenging instead :)

The numbers below are from my scheduler tree with massive_intr running
at nice 0, and chew at nice 5. Below these numbers are 100 lines from
the exact center of chew's output.

(interactivity remains intact with this rather heavy load)

[EMAIL PROTECTED]: ./massive_intr 30 180
005671	1506
005657	1506
005651	1491
005647	1466
005661	1484
005660	1475
005645	1514
005668	1384
005673	1516
005656	1449
005664	1512
005659	1507
005667	1513
005663	1521
005670	1440
005649	1522
005652	1487
005648	1405
005665	1472
005669	1418
005662	1489
005674	1523
005650	1480
005655	1476
005672	1530
005653	1463
005654	1427
005646	1499
005658	1510
005666	1476

100 sequential lines from the middle of chew's logged output.

pid 5642, prio   5, out for    2 ms, ran for    1 ms, load  34%
pid 5642, prio   5, out for 1268 ms, ran for   63 ms, load   4%
pid 5642, prio   5, out for   52 ms, ran for    0 ms, load   0%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  14%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  12%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  12%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    4 ms, ran for    1 ms, load  22%
pid 5642, prio   5, out for 1395 ms, ran for   50 ms, load   3%
pid 5642, prio   5, out for   26 ms, ran for    0 ms, load   3%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  13%
pid 5642, prio   5, out for    7 ms, ran for    0 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  14%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  20%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  14%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  13%
pid 5642, prio   5, out for 1400 ms, ran for   53 ms, load   3%
pid 5642, prio   5, out for   22 ms, ran for    1 ms, load   6%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    2 ms, ran for    1 ms, load  49%
pid 5642, prio   5, out for 1281 ms, ran for   50 ms, load   3%
pid 5642, prio   5, out for   50 ms, ran for    0 ms, load   1%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  16%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  19%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  13%
pid 5642, prio
Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"
Vivek Goyal <[EMAIL PROTECTED]> writes:

>> I guess at this point the easy case is that we modify /sbin/kexec to support
>> it. And the other bootloaders can be upgraded if the feature is
>> interesting enough.
>>
>> > On i386, somebody already found an interesting usage of CONFIG_PHYSICAL_START
>> > where he was running his kernel above 16MB so that he can maximize on
>> > DMA ZONE. Can't think of any usage for x86_64 at the moment but I think
>> > down the line people might come up with such usages.
>>
>> Agreed. We do have CONFIG_PHYSICAL_ALIGN that can handle that case,
>> although I admit that is a bit of a hack.
>>
>
> Yes, but x86_64 will not have any of those options and the only way to run
> the kernel will be to either use kexec or modify your boot-loader so that
> it can handle relocatable images.

True.

>> > To me, retaining CONFIG_PHYSICAL_START gives added flexibility to the user,
>> > at the expense of reduced simplicity. We should definitely change the type
>> > of vmlinux to ET_DYN but at the same time it might still be worth retaining
>> > the CONFIG_PHYSICAL_START option.
>>
>> I think something like CONFIG_PHYSICAL_START currently gives us very
>> little gain, and is hard to use correctly, and there are alternative
>> solutions. So if we can get rid of it, by only inconveniencing users
>> who want to load their kernels at a weird address, it is worth it.
>>
>> >> I think I can switch the vmlinux header type in about 100 lines or so
>> >> of code. Assuming I can ever get 30 minutes with the appropriate
>> >> kernel.
>> >>
>> >
>> > That would be awesome. Then vmlinux will be relocatable too. (Officially).
>>
>> Yes. For x86_64 I can do this. i386 is more difficult. (Although with
>> a little cleverness we can move the code that processes relocations into
>> vmlinux).
>>
>
> Performing relocations in vmlinux will be interesting. That way i386 vmlinux
> too will become relocatable and the only piece of the puzzle left to solve
> will be to make vmlinux of type ET_DYN.
Actually making vmlinux have type ET_DYN is the easier piece. Basically
the quick way to do this is to have an arch-specific "cmd_vmlinux__" like
uml does, so we can edit things after the make. Changing an integer in an
ELF header is simple.

Inserting the code to perform the relocations feels a bit trickier, but we
can probably just dump it in head.S like we do on x86_64.

We still need to insert the actual relocations to process, though, which
requires all of the post-processing we currently do, just called at a
slightly different location.

Eric
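As an illustration of "changing an integer in an ELF header", here is a minimal userspace sketch, not the kernel build step Eric has in mind. It assumes a little-endian ELF image already loaded into memory; the helper name `sketch_mark_et_dyn` and the `SKETCH_*` constants are hypothetical, with values taken from the ELF specification (e_type sits right after the 16-byte e_ident, ET_EXEC is 2, ET_DYN is 3).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values from the ELF specification, named locally so <elf.h> is not needed. */
enum { SKETCH_ET_EXEC = 2, SKETCH_ET_DYN = 3 };
enum { SKETCH_EI_DATA = 5, SKETCH_ELFDATA2LSB = 1 };
enum { SKETCH_E_TYPE_OFFSET = 16 };	/* e_type follows the 16-byte e_ident */

/*
 * Hypothetical helper: rewrite e_type from ET_EXEC to ET_DYN in place.
 * Returns 0 on success, -1 if the buffer is not a little-endian
 * ET_EXEC ELF image.
 */
static int sketch_mark_et_dyn(uint8_t *image, size_t len)
{
	static const uint8_t magic[4] = { 0x7f, 'E', 'L', 'F' };
	uint16_t type;

	if (len < SKETCH_E_TYPE_OFFSET + 2 || memcmp(image, magic, 4) != 0)
		return -1;
	if (image[SKETCH_EI_DATA] != SKETCH_ELFDATA2LSB)
		return -1;

	/* e_type is a 16-bit field stored low byte first. */
	type = (uint16_t)(image[SKETCH_E_TYPE_OFFSET] |
			  (image[SKETCH_E_TYPE_OFFSET + 1] << 8));
	if (type != SKETCH_ET_EXEC)
		return -1;

	image[SKETCH_E_TYPE_OFFSET] = SKETCH_ET_DYN & 0xff;
	image[SKETCH_E_TYPE_OFFSET + 1] = SKETCH_ET_DYN >> 8;
	return 0;
}
```

The header edit really is the small part; the relocation processing discussed above is not modelled here at all.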
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Tue, 2007-04-03 at 12:34 +1000, Con Kolivas wrote:
> On Saturday 31 March 2007 19:28, Xenofon Antidides wrote:
> > For long time now I use windows to work
> > problems. I cannot play wine games with audio, I
> > cannot sample video, I cannot use skype, I cannot play
> > midi. And even linux only things I try do I cannot
> > share my X, I cannot use more than one vmware. All
> > those is fix for me with SD.
>
> Any semblance of cpu bandwidth and latency guarantees are easily shot on
> mainline by a single process going wild (eg open tab in firefox).

I've written a patch which I believe fixes that.

-Mike
Re: 2.6.20.4: NETDEV WATCHDOG and lockups
On Monday 02 April 2007 15:41, Christian Kujau wrote:
>
> Hi there,
>
> we have serious problems with 2 of our servers: both shiny new amd64
> dual core, with both 2GB RAM, 32bit kernel+userland (Debian/testing).
> Both servers have 2 NICs, RTL8139 (eth0, irq10) and RTL8169s
> (eth1, irq11).
>
> Both boxes are running fine but after "a while" they lock up and
> eventually restart all of a sudden. The last messages in the logfile
> are:
>
> 14:15:11 db2 kernel: NETDEV WATCHDOG: eth0: transmit timed out
> 14:15:14 db2 kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
>
> Then the box reboots, nothing else in the log.
>
> As the servers have been set up recently, we only know that it happened
> with Debian's 2.6.17-? kernel. When we upgraded the installation, we
> went to 2.6.18-4-k7 and the problem persisted. We're now using vanilla
> 2.6.20.4 and while the problem persists, it takes longer to lock up (~20h
> as opposed to 4-5h). While this is a good thing for us, it's now harder
> to reproduce (we have to wait longer).
>
> Searching the archives turned up quite a few results but no real fix and
> lots of old postings too. We then disabled ACPI completely and booted
> with 'noapic'. Now both boxes are running for > 20h and we're curious
> how long they make it. However, booting with 'noapic' slowed down both
> servers *a lot*.

Which increased stability, disabling ACPI, or disabling the IOAPIC?

Your box has MPS, so you should be able to use the IOAPIC in either mode.

Note that you can do these both independently at boot-time
with "acpi=off" and "noapic", respectively.

eg. 4 combos
1. (no extra boot options)
2. noapic
3. acpi=off
4. acpi=off noapic

you started with #1, and are running hard-coded #4 now,
but skipped #2 and #3

cheers,
-Len

> From /proc/interrupts we can see that only CPU0 (core 0) is handling
> interrupts while CPU1 does not. We compiled with CONFIG_IRQBALANCE=n so
> that irqbalance(1) would work - but to no avail.
>
> Please see http://nerdbynature.de/bits/2.6.20.4/ for details for both
> hosts and feel free to ask for more details. Although both boxes are in
> production we'll be happy to test more bootoptions/patches and the like.
>
> TIA,
> Christian.
Re: [uml-devel] [RFC] UML kernel & rootfs bundle with every kernel release ?
On Mon, Apr 02, 2007 at 05:44:34PM -0400, Jeff Dike wrote:
> There are sites (http://uml.nagafix.co.uk/ being the best one I know
> of) where, with two downloads, two uncompressions, and one command
> line later, you have a booted UML.
>
> The only way I know of to improve on this, aside from improvements in
> the booted distro, is to package the filesystem as a rootfs within the
> UML kernel binary. I've considered this, but haven't done anything
> with it.

I've done the converse: package the uml kernel within the rootfs image,
and use a script that plays the part of bootloader. With ext2 at least,
it's fairly easy to use the debugfs 'cat' command for this. That way, you
simply distribute the fs image with a companion script that can boot any
number of such images.

Jason
[PATCH] Driver core: add suspend() and resume() to struct device_type
Hi Greg,

Here is another patch extending struct device_type. I need it to implement
generic suspend/resume routines for input devices. As you may remember,
input core devices and interface devices are mixed in the same class, and
because suspend/resume applies only to core devices I can't define these
methods at the class level.

--
Dmitry

Driver core: add suspend() and resume() to struct device_type

In cases when there are devices of different types in the same class
we can't use the class's implementation of suspend and resume methods
and we need to add them to struct device_type instead.

Also fix error handling in resume code (we should not try to call
the class's resume method if the bus's resume method for the device
failed).

Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]>
---

 drivers/base/power/resume.c  |   13 ++++++++++++-
 drivers/base/power/suspend.c |   12 ++++++++++++
 include/linux/device.h       |    2 ++
 3 files changed, 26 insertions(+), 1 deletion(-)

Index: work/drivers/base/power/resume.c
===================================================================
--- work.orig/drivers/base/power/resume.c
+++ work/drivers/base/power/resume.c
@@ -26,7 +26,9 @@ int resume_device(struct device * dev)
 
 	TRACE_DEVICE(dev);
 	TRACE_RESUME(0);
+
 	down(&dev->sem);
+
 	if (dev->power.pm_parent
 			&& dev->power.pm_parent->power.power_state.event) {
 		dev_err(dev, "PM: resume from %d, parent %s still %d\n",
@@ -34,15 +36,24 @@ int resume_device(struct device * dev)
 			dev->power.pm_parent->bus_id,
 			dev->power.pm_parent->power.power_state.event);
 	}
+
 	if (dev->bus && dev->bus->resume) {
 		dev_dbg(dev,"resuming\n");
 		error = dev->bus->resume(dev);
 	}
-	if (dev->class && dev->class->resume) {
+
+	if (!error && dev->type && dev->type->resume) {
+		dev_dbg(dev,"resuming\n");
+		error = dev->type->resume(dev);
+	}
+
+	if (!error && dev->class && dev->class->resume) {
 		dev_dbg(dev,"class resume\n");
 		error = dev->class->resume(dev);
 	}
+
 	up(&dev->sem);
+
 	TRACE_RESUME(error);
 	return error;
 }
Index: work/drivers/base/power/suspend.c
===================================================================
--- work.orig/drivers/base/power/suspend.c
+++ work/drivers/base/power/suspend.c
@@ -78,6 +78,18 @@ int suspend_device(struct device * dev,
 		suspend_report_result(dev->class->suspend, error);
 	}
 
+	if (!error && dev->type && dev->type->suspend && !dev->power.power_state.event) {
+		dev_dbg(dev, "%s%s\n",
+			suspend_verb(state.event),
+			((state.event == PM_EVENT_SUSPEND)
+					&& device_may_wakeup(dev))
+				? ", may wakeup"
+				: ""
+			);
+		error = dev->type->suspend(dev, state);
+		suspend_report_result(dev->type->suspend, error);
+	}
+
 	if (!error && dev->bus && dev->bus->suspend && !dev->power.power_state.event) {
 		dev_dbg(dev, "%s%s\n",
 			suspend_verb(state.event),
Index: work/include/linux/device.h
===================================================================
--- work.orig/include/linux/device.h
+++ work/include/linux/device.h
@@ -332,6 +332,8 @@ struct device_type {
 	int (*uevent)(struct device *dev, char **envp, int num_envp,
 		      char *buffer, int buffer_size);
 	void (*release)(struct device *dev);
+	int (*suspend)(struct device * dev, pm_message_t state);
+	int (*resume)(struct device * dev);
 };
 
 /* interface for exporting device attributes */
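The resume ordering and the error-handling fix in the patch above can be modelled outside the kernel. What follows is a rough userspace sketch under stated assumptions, not the driver-core code: `struct sketch_dev` and its three function pointers are hypothetical stand-ins for `dev->bus->resume`, `dev->type->resume` and `dev->class->resume`, and locking and tracing are left out.

```c
#include <stddef.h>

struct sketch_dev;
typedef int (*sketch_resume_fn)(struct sketch_dev *);

/* Hypothetical stand-in for struct device: one resume hook per layer. */
struct sketch_dev {
	sketch_resume_fn bus_resume;	/* models dev->bus->resume */
	sketch_resume_fn type_resume;	/* models dev->type->resume */
	sketch_resume_fn class_resume;	/* models dev->class->resume */
	int calls;			/* number of hooks that actually ran */
};

/* Resume bus first, then type, then class; stop at the first failure. */
static int sketch_resume_device(struct sketch_dev *dev)
{
	int error = 0;

	if (dev->bus_resume)
		error = dev->bus_resume(dev);
	if (!error && dev->type_resume)
		error = dev->type_resume(dev);
	if (!error && dev->class_resume)
		error = dev->class_resume(dev);
	return error;
}

/* Two trivial hooks used to exercise the ordering. */
static int sketch_ok(struct sketch_dev *dev)
{
	dev->calls++;
	return 0;
}

static int sketch_fail(struct sketch_dev *dev)
{
	dev->calls++;
	return -5;
}
```

With all three hooks set to `sketch_ok` every layer runs; with `sketch_fail` as the bus hook, the type and class hooks are skipped, which is the behaviour the "!error &&" checks introduce.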
Re: usb hid: reset NumLock
On Monday 02 April 2007 19:12, Pete Zaitcev wrote:
> On Mon, 2 Apr 2007 16:48:24 +0200 (CEST), Jiri Kosina <[EMAIL PROTECTED]> wrote:
> > On Sun, 1 Apr 2007, Pete Zaitcev wrote:
>
> > could you please change the order of the two functions, so that you
> > don't have to put the forward declaration here?
> >[...]
> > I'd say this is a little bit overcommented.
> >[...]
> > So as soon as you have the VIDs and PIDs of the hardware which
> > requires this, could you please update the patch and send it to me again?
>
> How about this?

Actually I think I will be adding the patch below, but it has to wait till
2.6.22 as it requires the input core to struct device conversion patch.
What do you think?

--
Dmitry

Input: add generic suspend and resume for input devices

Automatically turn off leds and sound effects as part of suspend
process and restore led state, sounds and repeat rate at resume.

Also synchronize hardware state with logical state at device
registration.

Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]>
---

 drivers/input/input.c |   80 ++
 1 files changed, 80 insertions(+)

Index: work/drivers/input/input.c
===================================================================
--- work.orig/drivers/input/input.c
+++ work/drivers/input/input.c
@@ -997,10 +997,88 @@ static int input_dev_uevent(struct devic
 	return 0;
 }
 
+static void input_dev_toggle(struct input_dev *dev,
+			     unsigned int type, unsigned int code,
+			     unsigned long *cap_bits, unsigned long *bits,
+			     int force_off)
+{
+	if (test_bit(code, cap_bits)) {
+		if (!force_off)
+			dev->event(dev, type, code, test_bit(code, bits));
+		else if (test_bit(code, bits))
+			dev->event(dev, type, code, 0);
+	}
+}
+
+static void input_dev_reset(struct input_dev *dev, int force_off)
+{
+	int i;
+
+	if (!dev->event)
+		return;
+
+	/* synchronize led state */
+	if (test_bit(EV_LED, dev->evbit))
+		for (i = 0; i <= LED_MAX; i++)
+			input_dev_toggle(dev, EV_LED, i,
+					 dev->ledbit, dev->led, force_off);
+
+	/* restore sound */
+	if (test_bit(EV_SND, dev->evbit))
+		for (i = 0; i <= SND_MAX; i++)
+			input_dev_toggle(dev, EV_SND, i,
+					 dev->sndbit, dev->snd, force_off);
+
+	if (!force_off && test_bit(EV_REP, dev->evbit)) {
+		dev->event(dev, EV_REP, REP_PERIOD, dev->rep[REP_PERIOD]);
+		dev->event(dev, EV_REP, REP_DELAY, dev->rep[REP_DELAY]);
+	}
+}
+
+#ifdef CONFIG_PM
+static int input_dev_suspend(struct device *dev, pm_message_t state)
+{
+	struct input_dev *input_dev = to_input_dev(dev);
+
+	mutex_lock(&input_dev->mutex);
+
+	if (dev->power.power_state.event != state.event) {
+		if (state.event == PM_EVENT_SUSPEND)
+			input_dev_reset(input_dev, 1);
+
+		dev->power.power_state = state;
+	}
+
+	mutex_unlock(&input_dev->mutex);
+
+	return 0;
+}
+
+static int input_dev_resume(struct device *dev)
+{
+	struct input_dev *input_dev = to_input_dev(dev);
+
+	mutex_lock(&input_dev->mutex);
+
+	if (dev->power.power_state.event != PM_EVENT_ON)
+		input_dev_reset(to_input_dev(dev), 0);
+
+	dev->power.power_state = PMSG_ON;
+
+	mutex_unlock(&input_dev->mutex);
+
+	return 0;
+}
+#endif /* CONFIG_PM */
+
 static struct device_type input_dev_type = {
 	.groups		= input_dev_attr_groups,
 	.release	= input_dev_release,
 	.uevent		= input_dev_uevent,
+#ifdef CONFIG_PM
+	.suspend	= input_dev_suspend,
+	.resume		= input_dev_resume,
+#endif
 };
 
 struct class input_class = {
@@ -1080,6 +1158,8 @@ int input_register_device(struct input_d
 		dev->rep[REP_PERIOD] = 33;
 	}
 
+	input_dev_reset(dev, 0);
+
 	if (!dev->getkeycode)
 		dev->getkeycode = input_default_getkeycode;
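The decision made by input_dev_toggle() in the patch above is easy to check in isolation. Below is a small userspace sketch of just that decision, under stated assumptions: `sketch_test_bit` is a simplified, non-atomic stand-in for the kernel's test_bit(), and `sketch_toggle_value` is a hypothetical helper that returns the value that would be passed to dev->event(), or -1 when no event would be sent.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Simplified, non-atomic stand-in for the kernel's test_bit(). */
static int sketch_test_bit(unsigned int nr, const unsigned long *map)
{
	return (int)((map[nr / SKETCH_BITS_PER_LONG] >>
		      (nr % SKETCH_BITS_PER_LONG)) & 1UL);
}

/*
 * Value the device would be sent for `code`, or -1 if nothing is sent.
 * cap_bits: what the device supports; bits: current logical state.
 */
static int sketch_toggle_value(unsigned int code,
			       const unsigned long *cap_bits,
			       const unsigned long *bits, int force_off)
{
	if (!sketch_test_bit(code, cap_bits))
		return -1;	/* device has no such LED or sound */
	if (!force_off)
		return sketch_test_bit(code, bits);	/* resync to logical state */
	if (sketch_test_bit(code, bits))
		return 0;	/* on right now: switch off for suspend */
	return -1;		/* already off: nothing to send */
}
```

On suspend (force_off set) only items that are currently on generate an event; on resume or registration every supported item is resynchronised with its logical state.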
Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)
On Tue, 03 Apr 2007 14:45:02 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Mon, 2007-04-02 at 20:58 -0700, Andrew Morton wrote:
> > On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > I can never remember what the function to register to receive VM pressure
> > > is called. I have to trace down from __alloc_pages() to find it.
> > >
> > > It's called "set_shrinker()", and it needs Your Help.
> > >
> > > New version:
> > > 1) Don't hide struct shrinker. It contains no magic.
> > > 2) Don't allocate "struct shrinker". It's not helpful.
> > > 3) Call them "register_shrinker" and "unregister_shrinker".
> > > 4) Call the function "shrink" not "shrinker".
> > > 5) Rename "nr_to_scan" argument to "nr_to_free".
> >
> > No, it is actually the number to scan. This is >= the number of freed
> > objects.
> >
> > This is because, for better or for worse, the VM tries to balance the
> > scanning rate of the various caches, not the reclaiming rate.
>
> Err, ok, I completely missed that distinction.
>
> Does that mean that to function correctly every user needs some internal
> cursor so it doesn't end up scanning the first N entries over and over?
>

If it wants to be well-behaved, and to behave as the VM expects, yes.

There's an expectation that the callback will be performing some scan-based
aging operation and of course to do LRU (or whatever) aging, the callback
will need to remember where it was up to last time it was called.

But it's just a guideline - callbacks could do something different but
in-the-spirit, I guess.
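The "remember where it was up to" point can be sketched with a toy cache. This is a userspace illustration only, assuming a fixed-size array as the cache; `struct sketch_cache` and `sketch_shrink` are hypothetical names, and unlike a real shrinker callback the sketch returns the number of objects it freed, purely to make the scanned-versus-freed distinction visible.

```c
#include <stddef.h>

#define SKETCH_CACHE_SIZE 8

/* Hypothetical toy cache: a fixed array of slots plus a scan cursor. */
struct sketch_cache {
	int in_use[SKETCH_CACHE_SIZE];	/* 1 = pinned, must not be freed */
	int present[SKETCH_CACHE_SIZE];	/* 1 = slot currently holds an object */
	size_t cursor;			/* where the previous scan stopped */
};

/*
 * Scan exactly nr_to_scan slots, starting where the last call left off,
 * and free whatever is freeable. Returns the number freed, which can be
 * smaller than the number scanned.
 */
static int sketch_shrink(struct sketch_cache *c, int nr_to_scan)
{
	int freed = 0;

	while (nr_to_scan-- > 0) {
		size_t i = c->cursor;

		c->cursor = (c->cursor + 1) % SKETCH_CACHE_SIZE;
		if (c->present[i] && !c->in_use[i]) {
			c->present[i] = 0;
			freed++;
		}
	}
	return freed;
}
```

Without the cursor, each call would rescan the same leading slots and the pinned entries there would absorb the whole scan budget every time.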
Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)
On Mon, 2007-04-02 at 20:58 -0700, Andrew Morton wrote:
> On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:
>
> >
> > I can never remember what the function to register to receive VM pressure
> > is called. I have to trace down from __alloc_pages() to find it.
> >
> > It's called "set_shrinker()", and it needs Your Help.
> >
> > New version:
> > 1) Don't hide struct shrinker. It contains no magic.
> > 2) Don't allocate "struct shrinker". It's not helpful.
> > 3) Call them "register_shrinker" and "unregister_shrinker".
> > 4) Call the function "shrink" not "shrinker".
> > 5) Rename "nr_to_scan" argument to "nr_to_free".
>
> No, it is actually the number to scan. This is >= the number of freed
> objects.
>
> This is because, for better or for worse, the VM tries to balance the
> scanning rate of the various caches, not the reclaiming rate.

Err, ok, I completely missed that distinction.

Does that mean that to function correctly every user needs some internal
cursor so it doesn't end up scanning the first N entries over and over?

Rusty.
Re: [PATCH] libata: add NCQ blacklist entries from Silicon Image Windows driver (v2)
Robert Hancock wrote:
> This adds some NCQ blacklist entries taken from the Silicon Image 3124/3132
> Windows driver .inf files. There are some confirming reports of problems
> with these drives under Linux (for example
> http://lkml.org/lkml/2007/3/4/178)
> so let's disable NCQ on these drives.
>
> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

Acked-by: Tejun Heo <[EMAIL PROTECTED]>

--
tejun
[PATCH] libata: add NCQ blacklist entries from Silicon Image Windows driver (v2)
This adds some NCQ blacklist entries taken from the Silicon Image 3124/3132
Windows driver .inf files. There are some confirming reports of problems
with these drives under Linux (for example http://lkml.org/lkml/2007/3/4/178)
so let's disable NCQ on these drives.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

--- linux-2.6.21-rc5-git9/drivers/ata/libata-core.c	2007-04-02 21:03:29.0 -0600
+++ linux-2.6.21-rc5-git9edit/drivers/ata/libata-core.c	2007-04-02 21:26:23.0 -0600
@@ -3363,6 +3363,11 @@ static const struct ata_blacklist_entry
 	{ "Maxtor 6L250S0", "BANC1G10", ATA_HORKAGE_NONCQ },
 	/* NCQ hard hangs device under heavier load, needs hard power cycle */
 	{ "Maxtor 6B250S0", "BANC1B70", ATA_HORKAGE_NONCQ },
+	/* Blacklist entries taken from Silicon Image 3124/3132
+	   Windows driver .inf file - also several Linux problem reports */
+	{ "HTS541060G9SA00",	"MB3OC60D",	ATA_HORKAGE_NONCQ, },
+	{ "HTS541080G9SA00",	"MB4OC60D",	ATA_HORKAGE_NONCQ, },
+	{ "HTS541010G9SA00",	"MBZOC60D",	ATA_HORKAGE_NONCQ, },
 
 	/* Devices with NCQ limits */
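For readers unfamiliar with how such a table is consulted, here is a minimal userspace sketch of a model/firmware lookup. It is an assumption-laden illustration, not libata's actual matching code: the struct, the `SKETCH_HORKAGE_NONCQ` flag and `sketch_lookup_horkage` are hypothetical names, and the rule that a NULL firmware string matches any revision is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_HORKAGE_NONCQ (1u << 0)	/* stand-in for ATA_HORKAGE_NONCQ */

/* Hypothetical table entry: model string, firmware revision, quirk flags. */
struct sketch_blacklist_entry {
	const char *model;
	const char *firmware;	/* NULL is treated as "any revision" here */
	unsigned int horkage;
};

/* The three drive/firmware pairs added by the patch above. */
static const struct sketch_blacklist_entry sketch_blacklist[] = {
	{ "HTS541060G9SA00", "MB3OC60D", SKETCH_HORKAGE_NONCQ },
	{ "HTS541080G9SA00", "MB4OC60D", SKETCH_HORKAGE_NONCQ },
	{ "HTS541010G9SA00", "MBZOC60D", SKETCH_HORKAGE_NONCQ },
};

/* Return the quirk flags for a drive, or 0 if it is not listed. */
static unsigned int sketch_lookup_horkage(const char *model, const char *fw)
{
	size_t n = sizeof(sketch_blacklist) / sizeof(sketch_blacklist[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		const struct sketch_blacklist_entry *e = &sketch_blacklist[i];

		if (strcmp(e->model, model) != 0)
			continue;
		if (e->firmware == NULL || strcmp(e->firmware, fw) == 0)
			return e->horkage;
	}
	return 0;
}
```

Keying on both strings means a later firmware revision of the same drive is not penalised unless it is listed explicitly.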
Re: [PATCH] vt: Expose system-wide UTF-8 default setting via sysfs
Antonino A. Daplas wrote:
> Create a variable, default_utf8, that defines the system-wide default
> UTF-8 setting. This variable can be altered via sysfs. If the variable
> is properly set, this should minimize breakage of UTF-8 encoded consoles
> when doing a reset or echo -e '\033c' and of newly opened/allocated
> consoles.
>
> This is based on patches by Jan Engelhardt and Paul LeoNerd Evans.
>
> Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]>
> ---
>
> > I think you're missing the whole point of console reset. Its purpose is
> > to force the console into a known-good state. The fewer pieces of state
> > it leaves unset, the better. To some degree it's less important what
> > that state actually is.
>
> Okay, you convinced me. Hopefully this is acceptable to all parties.
>
> Andrew,
>
> If everybody agrees, can you drop the previous patch I sent to you, and
> use this instead?
>
> Tony
>
> +static int default_utf8;
> +module_param(default_utf8, int, S_IRUGO | S_IWUSR);

Module parameter without description and documentation? Yes, I understand
that it is impossible to make vt a module. How about adding a line to
Documentation/kernel-parameters.txt?

Other than that, the patch looks like a useful change.

--
Alexander E. Patrakov
Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"
On Mon, Apr 02, 2007 at 04:59:26PM +0200, [EMAIL PROTECTED] wrote:
> From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> Date: Mon, Apr 02, 2007 at 04:49:14PM +0200
> >
> > I used a working 2.6.21-rc3-mm2 tree, patched it up to 2.6.21-rc5-mm3
> > and applied your patch. I ended up with the .config later in this email,
> > and got this error:
> >
> > CC arch/x86_64/kernel/head64.o
> > arch/x86_64/kernel/head64.c: In function 'x86_64_start_kernel':
> > arch/x86_64/kernel/head64.c:70: error: size of array 'type name' is negative
> > make[1]: *** [arch/x86_64/kernel/head64.o] Error 1
> > make: *** [arch/x86_64/kernel] Error 2
> >
> > After reverting your patch, the build didn't fail, but of course the
> > kernel won't build.
> >
> That should, of course, read 'kernel won't boot'.
>

I agree that the error message is not very clear. It is just an indication
that there is a problem on line 70 in head64.c. That's why I have put a
comment there so that anybody can make out that CONFIG_PHYSICAL_START is
not 2MB aligned, hence the failure.

Unfortunately, the Kconfig infrastructure does not allow placing alignment
restrictions on the values. Otherwise that would have been the best
solution. So we have still detected the problem at compilation time,
though in a slightly indirect manner.

Thanks
Vivek
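The "size of array 'type name' is negative" message is what a negative-array-size compile-time assertion produces. A minimal sketch of that technique follows, written for userspace and under stated assumptions: the `SKETCH_*` names and the 0x200000 start address are hypothetical, and the macro only mirrors the general shape of the kernel's BUILD_BUG_ON rather than quoting head64.c.

```c
/*
 * Compile-time assertion: when `cond` is non-zero the array size becomes
 * -1 and the compiler rejects the translation unit.
 */
#define SKETCH_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#define SKETCH_PHYSICAL_START	0x200000UL	/* hypothetical config value */
#define SKETCH_ALIGN_2MB	(1UL << 21)

/* Run-time version of the same alignment test, for comparison. */
static int sketch_is_aligned(unsigned long addr, unsigned long align)
{
	return (addr & (align - 1)) == 0;
}

static void sketch_check_config(void)
{
	/* Refuses to compile if the start address is not 2MB aligned. */
	SKETCH_BUILD_BUG_ON(SKETCH_PHYSICAL_START & (SKETCH_ALIGN_2MB - 1));
}
```

Changing `SKETCH_PHYSICAL_START` to, say, 0x100000UL makes the build fail with the same kind of "negative array size" diagnostic, which is why the accompanying source comment matters more than the compiler's wording.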
Re: 2.6.21-rc5-mm3 - no boot, "address not 2M aligned"
On Mon, Apr 02, 2007 at 11:26:38AM -0600, Eric W. Biederman wrote:
> Vivek Goyal <[EMAIL PROTECTED]> writes:
>
> > Only advantage of CONFIG_PHYSICAL_START seems to be that one has got
> > capability to run the kernel from other addresses without modifying the
> > boot-loader. One can argue that now people should use a relocatable kernel
> > for such a feature. But for using relocatable kernel, one needs to modify
> > grub, lilo and I am not sure if somebody is going to do that. Secondly, how
> > would one specify an address to a boot-loader to load image at?
>
> I thought this was important for vmlinux and Xen?
>

Yes it is. Actually you had already mentioned it in the previous mail,
that's why I did not repeat it here. Xen folks wanted to continue using
vmlinux for capturing dump. I am not sure if there is any technical
limitation in using relocatable bzImage or just that they wanted to
continue using the existing working interface and did not want to switch
to a new interface.

Magnus, Horms, do you want to add to it? Is there a reason that relocatable
bzImage will not work in Xen env and we need to retain the
CONFIG_PHYSICAL_START option in x86_64?

> I guess at this point the easy case is that we modify /sbin/kexec to support
> it. And the other bootloaders can be upgraded if the feature is
> interesting enough.
>
> > On i386, somebody already found an interesting usage of CONFIG_PHYSICAL_START
> > where he was running his kernel above 16MB so that he can maximize on
> > DMA ZONE. Can't think of any usage for x86_64 at the moment but I think
> > down the line people might come up with such usages.
>
> Agreed. We do have CONFIG_PHYSICAL_ALIGN that can handle that case,
> although I admit that is a bit of a hack.
>

Yes, but x86_64 will not have any of those options and the only way to run
the kernel will be to either use kexec or modify your boot-loader so that
it can handle relocatable images.

> > To me, retaining CONFIG_PHYSICAL_START gives added flexibility to the user,
> > at the expense of reduced simplicity. We should definitely change the type
> > of vmlinux to ET_DYN but at the same time it might still be worth retaining
> > the CONFIG_PHYSICAL_START option.
>
> I think something like CONFIG_PHYSICAL_START currently gives us very
> little gain, and is hard to use correctly, and there are alternative
> solutions. So if we can get rid of it, by only inconveniencing users
> who want to load their kernels at a weird address, it is worth it.
>
> >> I think I can switch the vmlinux header type in about 100 lines or so
> >> of code. Assuming I can ever get 30 minutes with the appropriate
> >> kernel.
> >>
> >
> > That would be awesome. Then vmlinux will be relocatable too. (Officially).
>
> Yes. For x86_64 I can do this. i386 is more difficult. (Although with
> a little cleverness we can move the code that processes relocations into
> vmlinux).
>

Performing relocations in vmlinux will be interesting. That way i386 vmlinux
too will become relocatable and the only piece of the puzzle left to solve
will be to make vmlinux of type ET_DYN.

Thanks
Vivek
Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)
On Tue, 03 Apr 2007 13:44:45 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:

>
> I can never remember what the function to register to receive VM pressure
> is called. I have to trace down from __alloc_pages() to find it.
>
> It's called "set_shrinker()", and it needs Your Help.
>
> New version:
> 1) Don't hide struct shrinker. It contains no magic.
> 2) Don't allocate "struct shrinker". It's not helpful.
> 3) Call them "register_shrinker" and "unregister_shrinker".
> 4) Call the function "shrink" not "shrinker".
> 5) Rename "nr_to_scan" argument to "nr_to_free".

No, it is actually the number to scan. This is >= the number of freed
objects.

This is because, for better or for worse, the VM tries to balance the
scanning rate of the various caches, not the reclaiming rate.

> 6) Reduce the 17 lines of waffly comments to 10, and document the -1 return.
>
> Comments:
> 1) The comment in reiserfs4 makes me a little queasy.

I'm going to have to split this patch up into mainline-bit and reiser4-bit.
And that's OK (it's a regular occurrence). But never miss a chance to whine.

> 2) The wrapper code in xfs might no longer be needed.
> 3) The placing in the x86-64 "hot function list" seems a little
>    unlikely. Clearly, Andi was testing if anyone was paying attention.
Re: Powerpc build unhappy in 2.6.20.4?
On Monday 02 April 2007 8:51 pm, Tony Breeds wrote:
> On Mon, Apr 02, 2007 at 03:14:14PM -0400, Rob Landley wrote:
>
> > Sure, quite easily the source of the trouble. Attached in both full .config
> > and mini.config formats.
>
> Okay, I have no idea how it happened but you seem to have an invalid
> config. It looks to me like you need to select a platform.

So "make oldconfig ARCH=powerpc" will accept a config that doesn't have a
platform selected?

> One of the following:
> CONFIG_PPC_PSERIES
> CONFIG_PPC_MAPLE
> CONFIG_PPC_IBM_CELL_BLADE
> CONFIG_PPC_PS3
> CONFIG_PPC_CHRP
> CONFIG_PPC_EFIKA
> CONFIG_PPC_PMAC

Hmmm... So CONFIG_PPC_MULTIPLATFORM doesn't cover it? ("There is no help
available for this kernel option"... Maybe a website somewhere?)

I just ran "make oldconfig" again and it didn't complain about any of those
not being set...

> When did this config last build a zImage? I'm guessing either CHRP or
> PMAC?

Er, never. I was largely guessing at what I needed via menuconfig. (I'm
trying to get something I can boot to a shell prompt under QEMU.)

I'll try this CHRP thing...

Thanks,

Rob
--
Penguicon 5.0 Apr 20-22, Linux Expo/SF Convention. Bruce Schneier,
Christine Peterson, Steve Jackson, Randy Milholland, Elizabeth Bear,
Charlie Stross...
Re: [PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)
On Tue, 2007-04-03 at 13:45 +1000, Rusty Russell wrote: > It's called "set_shrinker()", and it needs Your Help. Wrong copy. This is the one which actually compiles reiser4. == I can never remember what the function to register to receive VM pressure is called. I have to trace down from __alloc_pages() to find it. It's called "set_shrinker()", and it needs Your Help. New version: 1) Don't hide struct shrinker. It contains no magic. 2) Don't allocate "struct shrinker". It's not helpful. 3) Call them "register_shrinker" and "unregister_shrinker". 4) Call the function "shrink" not "shrinker". 5) Rename "nr_to_scan" argument to "nr_to_free". 6) Reduce the 17 lines of waffly comments to 10, and document the -1 return. Comments: 1) The comment in reiserfs4 makes me a little queasy. 2) The wrapper code in xfs might no longer be needed. 3) The placing in the x86-64 "hot function list" for seems a little unlikely. Clearly, Andi was testing if anyone was paying attention. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> diff -r a6c8dede237c arch/x86_64/kernel/functionlist --- a/arch/x86_64/kernel/functionlist Tue Apr 03 12:53:59 2007 +1000 +++ b/arch/x86_64/kernel/functionlist Tue Apr 03 13:15:11 2007 +1000 @@ -1118,7 +1118,6 @@ *(.text.simple_strtoll) *(.text.set_termios) *(.text.set_task_comm) -*(.text.set_shrinker) *(.text.set_normalized_timespec) *(.text.set_brk) *(.text.serial_in) diff -r a6c8dede237c fs/dcache.c --- a/fs/dcache.c Tue Apr 03 12:53:59 2007 +1000 +++ b/fs/dcache.c Tue Apr 03 13:09:55 2007 +1000 @@ -884,6 +884,11 @@ static int shrink_dcache_memory(int nr, } return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure; } + +static struct shrinker dcache_shrinker = { + .shrink = shrink_dcache_memory, + .seeks = DEFAULT_SEEKS, +}; /** * d_alloc - allocate a dcache entry @@ -2144,8 +2149,8 @@ static void __init dcache_init(unsigned (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC| SLAB_MEM_SPREAD), NULL, NULL); - - set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory); + + 
register_shrinker(_shrinker); /* Hash may have been set up in dcache_init_early */ if (!hashdist) diff -r a6c8dede237c fs/dquot.c --- a/fs/dquot.cTue Apr 03 12:53:59 2007 +1000 +++ b/fs/dquot.cTue Apr 03 13:10:31 2007 +1000 @@ -538,6 +538,11 @@ static int shrink_dqcache_memory(int nr, } return (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure; } + +static struct shrinker dqcache_shrinker = { + .shrink = shrink_dqcache_memory, + .seeks = DEFAULT_SEEKS, +}; /* * Put reference to dquot @@ -1871,7 +1876,7 @@ static int __init dquot_init(void) printk("Dquot-cache hash table entries: %ld (order %ld, %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order)); - set_shrinker(DEFAULT_SEEKS, shrink_dqcache_memory); + register_shrinker(_shrinker); return 0; } diff -r a6c8dede237c fs/inode.c --- a/fs/inode.cTue Apr 03 12:53:59 2007 +1000 +++ b/fs/inode.cTue Apr 03 13:11:05 2007 +1000 @@ -474,6 +474,11 @@ static int shrink_icache_memory(int nr, return (inodes_stat.nr_unused / 100) * sysctl_vfs_cache_pressure; } +static struct shrinker icache_shrinker = { + .shrink = shrink_icache_memory, + .seeks = DEFAULT_SEEKS, +}; + static void __wait_on_freeing_inode(struct inode *inode); /* * Called with the inode lock held. 
@@ -1393,7 +1398,7 @@ void __init inode_init(unsigned long mem SLAB_MEM_SPREAD), init_once, NULL); - set_shrinker(DEFAULT_SEEKS, shrink_icache_memory); + register_shrinker(&icache_shrinker); /* Hash may have been set up in inode_init_early */ if (!hashdist) diff -r a6c8dede237c fs/mbcache.c --- a/fs/mbcache.c Tue Apr 03 12:53:59 2007 +1000 +++ b/fs/mbcache.c Tue Apr 03 13:12:37 2007 +1000 @@ -100,7 +100,6 @@ static LIST_HEAD(mb_cache_list); static LIST_HEAD(mb_cache_list); static LIST_HEAD(mb_cache_lru_list); static DEFINE_SPINLOCK(mb_cache_spinlock); -static struct shrinker *mb_shrinker; static inline int mb_cache_indexes(struct mb_cache *cache) @@ -118,6 +117,10 @@ mb_cache_indexes(struct mb_cache *cache) static int mb_cache_shrink_fn(int nr_to_scan, gfp_t gfp_mask); +static struct shrinker mb_cache_shrinker = { + .shrink = mb_cache_shrink_fn, + .seeks = DEFAULT_SEEKS, +}; static inline int __mb_cache_entry_is_hashed(struct mb_cache_entry *ce) @@ -662,13 +665,13 @@ mb_cache_entry_find_next(struct mb_cache static int __init init_mbcache(void) { - mb_shrinker = set_shrinker(DEFAULT_SEEKS, mb_cache_shrink_fn); + register_shrinker(&mb_cache_shrinker); return 0; }
[PATCH] Cleanup and kernelify shrinker registration (rc5-mm2)
I can never remember what the function to register to receive VM pressure is called. I have to trace down from __alloc_pages() to find it. It's called "set_shrinker()", and it needs Your Help. New version: 1) Don't hide struct shrinker. It contains no magic. 2) Don't allocate "struct shrinker". It's not helpful. 3) Call them "register_shrinker" and "unregister_shrinker". 4) Call the function "shrink" not "shrinker". 5) Rename "nr_to_scan" argument to "nr_to_free". 6) Reduce the 17 lines of waffly comments to 10, and document the -1 return. Comments: 1) The comment in reiserfs4 makes me a little queasy. 2) The wrapper code in xfs might no longer be needed. 3) The placing in the x86-64 "hot function list" for seems a little unlikely. Clearly, Andi was testing if anyone was paying attention. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> diff -r a6c8dede237c arch/x86_64/kernel/functionlist --- a/arch/x86_64/kernel/functionlist Tue Apr 03 12:53:59 2007 +1000 +++ b/arch/x86_64/kernel/functionlist Tue Apr 03 13:15:11 2007 +1000 @@ -1118,7 +1118,6 @@ *(.text.simple_strtoll) *(.text.set_termios) *(.text.set_task_comm) -*(.text.set_shrinker) *(.text.set_normalized_timespec) *(.text.set_brk) *(.text.serial_in) diff -r a6c8dede237c fs/dcache.c --- a/fs/dcache.c Tue Apr 03 12:53:59 2007 +1000 +++ b/fs/dcache.c Tue Apr 03 13:09:55 2007 +1000 @@ -884,6 +884,11 @@ static int shrink_dcache_memory(int nr, } return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure; } + +static struct shrinker dcache_shrinker = { + .shrink = shrink_dcache_memory, + .seeks = DEFAULT_SEEKS, +}; /** * d_alloc - allocate a dcache entry @@ -2144,8 +2149,8 @@ static void __init dcache_init(unsigned (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC| SLAB_MEM_SPREAD), NULL, NULL); - - set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory); + + register_shrinker(&dcache_shrinker); /* Hash may have been set up in dcache_init_early */ if (!hashdist) diff -r a6c8dede237c fs/dquot.c --- a/fs/dquot.c Tue Apr 03 12:53:59 2007 +1000
+++ b/fs/dquot.c Tue Apr 03 13:10:31 2007 +1000 @@ -538,6 +538,11 @@ static int shrink_dqcache_memory(int nr, } return (dqstats.free_dquots / 100) * sysctl_vfs_cache_pressure; } + +static struct shrinker dqcache_shrinker = { + .shrink = shrink_dqcache_memory, + .seeks = DEFAULT_SEEKS, +}; /* * Put reference to dquot @@ -1871,7 +1876,7 @@ static int __init dquot_init(void) printk("Dquot-cache hash table entries: %ld (order %ld, %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order)); - set_shrinker(DEFAULT_SEEKS, shrink_dqcache_memory); + register_shrinker(&dqcache_shrinker); return 0; } diff -r a6c8dede237c fs/inode.c --- a/fs/inode.c Tue Apr 03 12:53:59 2007 +1000 +++ b/fs/inode.c Tue Apr 03 13:11:05 2007 +1000 @@ -474,6 +474,11 @@ static int shrink_icache_memory(int nr, return (inodes_stat.nr_unused / 100) * sysctl_vfs_cache_pressure; } +static struct shrinker icache_shrinker = { + .shrink = shrink_icache_memory, + .seeks = DEFAULT_SEEKS, +}; + static void __wait_on_freeing_inode(struct inode *inode); /* * Called with the inode lock held.
@@ -1393,7 +1398,7 @@ void __init inode_init(unsigned long mem SLAB_MEM_SPREAD), init_once, NULL); - set_shrinker(DEFAULT_SEEKS, shrink_icache_memory); + register_shrinker(&icache_shrinker); /* Hash may have been set up in inode_init_early */ if (!hashdist) diff -r a6c8dede237c fs/mbcache.c --- a/fs/mbcache.c Tue Apr 03 12:53:59 2007 +1000 +++ b/fs/mbcache.c Tue Apr 03 13:12:37 2007 +1000 @@ -100,7 +100,6 @@ static LIST_HEAD(mb_cache_list); static LIST_HEAD(mb_cache_list); static LIST_HEAD(mb_cache_lru_list); static DEFINE_SPINLOCK(mb_cache_spinlock); -static struct shrinker *mb_shrinker; static inline int mb_cache_indexes(struct mb_cache *cache) @@ -118,6 +117,10 @@ mb_cache_indexes(struct mb_cache *cache) static int mb_cache_shrink_fn(int nr_to_scan, gfp_t gfp_mask); +static struct shrinker mb_cache_shrinker = { + .shrink = mb_cache_shrink_fn, + .seeks = DEFAULT_SEEKS, +}; static inline int __mb_cache_entry_is_hashed(struct mb_cache_entry *ce) @@ -662,13 +665,13 @@ mb_cache_entry_find_next(struct mb_cache static int __init init_mbcache(void) { - mb_shrinker = set_shrinker(DEFAULT_SEEKS, mb_cache_shrink_fn); + register_shrinker(&mb_cache_shrinker); return 0; } static void __exit exit_mbcache(void) { - remove_shrinker(mb_shrinker); + unregister_shrinker(&mb_cache_shrinker); } module_init(init_mbcache) diff -r a6c8dede237c
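The registration pattern the patch moves to (a caller-owned, statically initialised struct shrinker passed by address to register_shrinker()) can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: the field names .shrink and .seeks and the function names come from the patch, but the linked list, shrink_all(), the demo cache, the value of DEFAULT_SEEKS and the treatment of a -1 return as "could not shrink" are all hypothetical stand-ins, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_SEEKS 2            /* value assumed for this sketch */

/* Userspace model: caller owns the struct, nothing is allocated. */
struct shrinker {
    int (*shrink)(int nr_to_free, unsigned int gfp_mask);
    int seeks;
    struct shrinker *next;         /* hypothetical list linkage */
};

static struct shrinker *shrinker_list;

/* Link a caller-provided shrinker into the list. */
void register_shrinker(struct shrinker *s)
{
    assert(s != NULL && s->shrink != NULL);
    s->next = shrinker_list;
    shrinker_list = s;
}

/* Unlink a previously registered shrinker; unknown pointers are ignored. */
void unregister_shrinker(struct shrinker *s)
{
    struct shrinker **p;

    for (p = &shrinker_list; *p != NULL; p = &(*p)->next) {
        if (*p == s) {
            *p = s->next;
            s->next = NULL;
            return;
        }
    }
}

/* Hypothetical pressure pass: returns how many callbacks made progress. */
int shrink_all(int nr_to_free)
{
    int progressed = 0;
    struct shrinker *s;

    for (s = shrinker_list; s != NULL; s = s->next) {
        if (s->shrink(nr_to_free, 0) != -1)  /* -1 assumed to mean "cannot shrink now" */
            progressed++;
    }
    return progressed;
}

/* Demo cache standing in for dcache, icache and friends. */
int demo_cache_objects = 100;

int demo_shrink(int nr_to_free, unsigned int gfp_mask)
{
    (void)gfp_mask;
    if (nr_to_free > demo_cache_objects)
        nr_to_free = demo_cache_objects;
    demo_cache_objects -= nr_to_free;
    return demo_cache_objects;     /* objects left in the cache */
}

struct shrinker demo_shrinker = {
    .shrink = demo_shrink,
    .seeks = DEFAULT_SEEKS,
};
```

Because the struct is owned by the caller, unregistering needs no handle returned from registration, which is what lets the mbcache hunk drop its mb_shrinker pointer.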
Re: [linux-usb-devel] [RFC] HID bus design overview.
On Monday 02 April 2007 21:15, Li Yu wrote: > > If we don't use "flip-flopping" means, the common driver and specific > driver concepts also don't need. They are completely same driver for HID > bus, just one without some hooks, another without. Exactly. I am glad we are getting on the same page. -- Dmitry
Re: [linux-usb-devel] [RFC] HID bus design overview.
On Monday 02 April 2007 21:40, Li Yu wrote: > May be, we need some means to change blacklist in runtime. and > loading/unloading such driver by specific script to do it. Please look at the new_id sysfs attribute implementation in drivers/pci/pci-driver.c. I believe we need something similar to dynamically adjust HID ignore blacklist. -- Dmitry
difference between arcmsr (areca) 1.20.0X.13 & 2.6.20 in tree driver?
Has anyone else noticed this regression? -J -- - Forwarded message from Joshua Hoblitt <[EMAIL PROTECTED]> - From: Joshua Hoblitt <[EMAIL PROTECTED]> Date: Thu, 29 Mar 2007 16:38:20 -1000 To: [EMAIL PROTECTED] Subject: difference between 1.20.0X.13 & 2.6.20 in tree driver? Hello, I just attempted to upgrade a system from 2.6.17 (gentoo revision 8) w/ 1.20.0X.13 to 2.6.20 (gentoo revision 4) with the in tree arcmsr driver. On the 2.6.17 kernel there are 2 ~4TB partitions that are visible on the system as /dev/sdb1 & /dev/sdc1. When the system is booted with the 2.6.20 kernel /dev/sdc1 is gone and /dev/sdb1 is properly reported as a 4TB EFI partition but fsck rejects the filesystem as corrupt. Is this a regression or has there been a fundamental change in the way arcmsr represents the array to the block layer? Here is the info on the RAID card: Controller Name ARC-1170 Firmware Version V1.39 2005-12-13 BOOT ROM Version V1.39 2005-12-13 Serial Number Y605CAAVAR700117 Unit Serial # Main Processor 500MHz IOP331 CPU ICache Size 32KBytes CPU DCache Size 32KBytes / Write Back System Memory 256MB / 333MHz Any idea as to what's going on? Thanks, -J -- Under 2.6.20: Mar 29 15:41:05 ipp000 ARECA RAID ADAPTER4: FIRMWARE VERSION V1.39 2005-12-13 Mar 29 15:41:05 ipp000 scsi4 : Areca SATA Host Adapter RAID Controller( RAID6 capable) Mar 29 15:41:05 ipp000 Driver Version 1.20.00.13 Mar 29 15:41:05 ipp000 scsi 4:0:0:0: Direct-Access ArecaARC-1170-VOL#00 R001 PQ: 0 ANSI: 3 Mar 29 15:41:05 ipp000 sdb : very big device. try to use READ CAPACITY(16). Mar 29 15:41:05 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB) Mar 29 15:41:05 ipp000 sdb: Write Protect is off Mar 29 15:41:05 ipp000 sdb: Mode Sense: cb 00 00 08 Mar 29 15:41:05 ipp000 SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA Mar 29 15:41:05 ipp000 sdb : very big device. try to use READ CAPACITY(16).
Mar 29 15:41:05 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB) Mar 29 15:41:05 ipp000 sdb: Write Protect is off Mar 29 15:41:05 ipp000 sdb: Mode Sense: cb 00 00 08 Mar 29 15:41:05 ipp000 SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA Mar 29 15:41:05 ipp000 sdb: sdb1 Mar 29 15:41:05 ipp000 sd 4:0:0:0: Attached scsi disk sdb Mar 29 15:41:05 ipp000 scsi 4:0:16:0: Processor ArecaRAID controller R001 PQ: 0 ANSI: 0 Under 2.6.17: Mar 29 15:45:11 ipp000 ARECA RAID ADAPTER0: 64BITS PCI BUS DMA ADDRESSING SUPPORTED Mar 29 15:45:11 ipp000 ARECA RAID ADAPTER0: FIRMWARE VERSION V1.39 2005-12-13 Mar 29 15:45:11 ipp000 scsi4 : Areca SATA Host Adapter RAID Controller( RAID6 capable) Mar 29 15:45:11 ipp000 Driver Version 1.20.0X.13 Mar 29 15:45:11 ipp000 Vendor: Areca Model: ARC-1170-VOL#00 Rev: R001 Mar 29 15:45:11 ipp000 Type: Direct-Access ANSI SCSI revision: 03 Mar 29 15:45:11 ipp000 sdb : very big device. try to use READ CAPACITY(16). Mar 29 15:45:11 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB) Mar 29 15:45:11 ipp000 sdb: Write Protect is off Mar 29 15:45:11 ipp000 sdb: Mode Sense: cb 00 00 08 Mar 29 15:45:11 ipp000 SCSI device sdb: drive cache: write back Mar 29 15:45:11 ipp000 sdb : very big device. try to use READ CAPACITY(16). Mar 29 15:45:11 ipp000 SCSI device sdb: 8595693568 512-byte hdwr sectors (4400995 MB) Mar 29 15:45:11 ipp000 sdb: Write Protect is off Mar 29 15:45:11 ipp000 sdb: Mode Sense: cb 00 00 08 Mar 29 15:45:11 ipp000 SCSI device sdb: drive cache: write back Mar 29 15:45:11 ipp000 sdb:<4>Alternate GPT is invalid, using primary GPT. Mar 29 15:45:11 ipp000 sdb1 Mar 29 15:45:11 ipp000 sd 4:0:0:0: Attached scsi disk sdb Mar 29 15:45:11 ipp000 sd 4:0:0:0: Attached scsi generic sg1 type 0 Mar 29 15:45:11 ipp000 Vendor: Areca Model: ARC-1170-VOL#01 Rev: R001 Mar 29 15:45:11 ipp000 Type: Direct-Access ANSI SCSI revision: 03 Mar 29 15:45:11 ipp000 sdc : very big device. 
try to use READ CAPACITY(16). Mar 29 15:45:11 ipp000 SCSI device sdc: 8595580928 512-byte hdwr sectors (4400937 MB) Mar 29 15:45:11 ipp000 sdc: Write Protect is off Mar 29 15:45:11 ipp000 sdc: Mode Sense: cb 00 00 08 Mar 29 15:45:11 ipp000 SCSI device sdc: drive cache: write back Mar 29 15:45:11 ipp000 sdc : very big device. try to use READ CAPACITY(16). Mar 29 15:45:11 ipp000 SCSI device sdc: 8595580928 512-byte hdwr sectors (4400937 MB) Mar 29 15:45:11 ipp000 sdc: Write Protect is off Mar 29 15:45:11 ipp000 sdc: Mode Sense: cb 00 00 08 Mar 29 15:45:11 ipp000 SCSI device sdc: drive cache: write back Mar 29 15:45:11 ipp000 sdc: sdc1 Mar 29 15:45:11 ipp000 sd 4:0:0:1: Attached scsi disk sdc Mar 29 15:45:11 ipp000 sd 4:0:0:1: Attached scsi generic sg2 type 0 Mar 29 15:45:11 ipp000 Vendor: Areca Model: RAID controller Rev: R001 Mar 29 15:45:11 ipp000 Type: Processor
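As a sanity check on the sizes in the logs above, the MB figures agree with the sector counts when 512-byte sectors are converted to decimal megabytes (10^6 bytes). The helper below is a hypothetical userspace sketch of that arithmetic, not the code the sd driver uses to print the number.

```c
#include <assert.h>
#include <stdint.h>

/* Decimal megabytes (10^6 bytes) for a count of 512-byte sectors. */
uint64_t sectors_to_mb(uint64_t sectors)
{
    /* Guard against overflow in the multiplication below. */
    assert(sectors <= UINT64_MAX / 512);
    return sectors * 512 / 1000000;
}
```

With the sector counts logged for sdb (8595693568) and sdc (8595580928), this yields 4400995 and 4400937, matching the logged values. sdb reports the same size under both kernels, so the difference between the two boots is the missing second volume rather than a changed capacity for the first.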
Re: [PATCH 2/9] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code
From: David Howells <[EMAIL PROTECTED]> Date: Mon, 02 Apr 2007 23:45:03 +0100 > Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can > use it too. > > The kdoc comments I've attached to the functions needs to be checked by > whoever > wrote them as I had to make some guesses about the workings of these > functions. > > Signed-Off-By: David Howells <[EMAIL PROTECTED]> Patch applied to net-2.6.22, thanks a lot David.
Re: [RFD driver-core] Lifetime problems of the current driver model
Cornelia Huck wrote: > On Mon, 2 Apr 2007 11:20:48 +0200, > Cornelia Huck <[EMAIL PROTECTED]> wrote: > >> Cool. However, there's something fishy there (not sure whether it's in >> your patch or a latent bug in the ccw bus code that just has been >> uncovered): > > Similar bug when loading/unloading a module that creates a driver > attribute. The winner seems to be kfree(sd->s_element) in > release_sysfs_dirent() (in case of an attribute, it will point to the > attribute structure, which is usually statically created)... Thanks for finding it out. I was suspecting that last minute change. The code should be if (dir node) kfree(s_element) else if (symlink node) do things and kfree() Thanks. -- tejun
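The fix Tejun describes amounts to freeing the element only for node types that own a heap allocation. A minimal userspace sketch of that rule follows; the enum, struct, counter and function names are hypothetical and only model the idea, they are not the sysfs code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node kinds, standing in for the sysfs dirent types. */
enum node_type { NODE_DIR, NODE_SYMLINK, NODE_ATTR };

struct dirent_model {
    enum node_type type;
    void *element;     /* heap-owned for dir/symlink, borrowed for attributes */
};

int frees_done;        /* counts how many elements were actually freed */

void release_dirent_model(struct dirent_model *d)
{
    assert(d != NULL);
    if (d->type == NODE_DIR) {
        free(d->element);
        frees_done++;
    } else if (d->type == NODE_SYMLINK) {
        /* symlink-specific teardown would happen here before the free */
        free(d->element);
        frees_done++;
    }
    /* NODE_ATTR: element usually points at a static attribute, so leave it. */
    d->element = NULL;
}
```

Freeing the attribute case unconditionally is what would hand a statically created structure to the allocator, which matches the crash Cornelia reports on module unload.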
[RFC] [PATCH] persistent_clock() for x86_64
This patch converts the get_cmos_time() function to read_persistent_clock(), which allows x86_64 to utilize the full generic timekeeping suspend/resume code path. Unfortunately I don't have any x86_64 boxes that suspend/resume to play w/ so this is mostly untested (but uses the same generic code path as i386). Any thoughts or comments? thanks -john Signed-off-by: John Stultz <[EMAIL PROTECTED]> time.c | 43 +-- 1 file changed, 1 insertion(+), 42 deletions(-) diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c index 75d73a9..df09c65 100644 --- a/arch/x86_64/kernel/time.c +++ b/arch/x86_64/kernel/time.c @@ -201,7 +201,7 @@ static irqreturn_t timer_interrupt(int i return IRQ_HANDLED; } -static unsigned long get_cmos_time(void) +unsigned long read_persistent_clock(void) { unsigned int year, mon, day, hour, min, sec; unsigned long flags; @@ -327,11 +327,6 @@ void __init time_init(void) { if (nohpet) hpet_address = 0; - xtime.tv_sec = get_cmos_time(); - xtime.tv_nsec = 0; - - set_normalized_timespec(&wall_to_monotonic, - -xtime.tv_sec, -xtime.tv_nsec); if (hpet_arch_init()) hpet_address = 0; @@ -364,59 +359,23 @@ void __init time_init(void) } -static long clock_cmos_diff; -static unsigned long sleep_start; - /* * sysfs support for the timer. 
*/ -static int timer_suspend(struct sys_device *dev, pm_message_t state) -{ - /* -* Estimate time zone so that set_time can update the clock -*/ - long cmos_time = get_cmos_time(); - - clock_cmos_diff = -cmos_time; - clock_cmos_diff += get_seconds(); - sleep_start = cmos_time; - return 0; -} - static int timer_resume(struct sys_device *dev) { - unsigned long flags; - unsigned long sec; - unsigned long ctime = get_cmos_time(); - long sleep_length = (ctime - sleep_start) * HZ; - - if (sleep_length < 0) { - printk(KERN_WARNING "Time skew detected in timer resume!\n"); - /* The time after the resume must not be earlier than the time -* before the suspend or some nasty things will happen -*/ - sleep_length = 0; - ctime = sleep_start; - } if (hpet_address) hpet_reenable(); else i8254_timer_resume(); - sec = ctime + clock_cmos_diff; - write_seqlock_irqsave(&xtime_lock,flags); - xtime.tv_sec = sec; - xtime.tv_nsec = 0; - jiffies += sleep_length; - write_sequnlock_irqrestore(&xtime_lock,flags); touch_softlockup_watchdog(); return 0; } static struct sysdev_class timer_sysclass = { .resume = timer_resume, - .suspend = timer_suspend, set_kset_name("timer"), };
[KJ][PATCH]ROUND_UP macro cleanup in drivers/net/ixgb
IXGB_ROUNDUP macro cleanup ,use ALIGN Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]> --- ixgb.h |3 --- ixgb_ethtool.c |4 ++-- ixgb_main.c|4 ++-- ixgb_param.c |4 ++-- 4 files changed, 6 insertions(+), 9 deletions(-) diff --git a/drivers/net/ixgb/ixgb.h b/drivers/net/ixgb/ixgb.h index cf30a10..c8e9086 100644 --- a/drivers/net/ixgb/ixgb.h +++ b/drivers/net/ixgb/ixgb.h @@ -111,9 +111,6 @@ struct ixgb_adapter; /* How many Rx Buffers do we bundle into one write to the hardware ? */ #define IXGB_RX_BUFFER_WRITE 8 /* Must be power of 2 */ -/* only works for sizes that are powers of 2 */ -#define IXGB_ROUNDUP(i, size) ((i) = (((i) + (size) - 1) & ~((size) - 1))) - /* wrapper around a pointer to a socket buffer, * so a DMA handle can be stored along with the buffer */ struct ixgb_buffer { diff --git a/drivers/net/ixgb/ixgb_ethtool.c b/drivers/net/ixgb/ixgb_ethtool.c index d6628bd..cdefaff 100644 --- a/drivers/net/ixgb/ixgb_ethtool.c +++ b/drivers/net/ixgb/ixgb_ethtool.c @@ -577,11 +577,11 @@ ixgb_set_ringparam(struct net_device *netdev, rxdr->count = max(ring->rx_pending,(uint32_t)MIN_RXD); rxdr->count = min(rxdr->count,(uint32_t)MAX_RXD); - IXGB_ROUNDUP(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); + rxdr->count = ALIGN(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); txdr->count = max(ring->tx_pending,(uint32_t)MIN_TXD); txdr->count = min(txdr->count,(uint32_t)MAX_TXD); - IXGB_ROUNDUP(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); + txdr->count = ALIGN(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); if(netif_running(adapter->netdev)) { /* Try to get new resources before deleting old */ diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c index afc2ec7..158c71e 100644 --- a/drivers/net/ixgb/ixgb_main.c +++ b/drivers/net/ixgb/ixgb_main.c @@ -685,7 +685,7 @@ ixgb_setup_tx_resources(struct ixgb_adapter *adapter) /* round up to nearest 4K */ txdr->size = txdr->count * sizeof(struct ixgb_tx_desc); - IXGB_ROUNDUP(txdr->size, 4096); + txdr->size = 
ALIGN(txdr->size, 4096); txdr->desc = pci_alloc_consistent(pdev, txdr->size, &txdr->dma); if(!txdr->desc) { @@ -774,7 +774,7 @@ ixgb_setup_rx_resources(struct ixgb_adapter *adapter) /* Round up to nearest 4K */ rxdr->size = rxdr->count * sizeof(struct ixgb_rx_desc); - IXGB_ROUNDUP(rxdr->size, 4096); + rxdr->size = ALIGN(rxdr->size, 4096); rxdr->desc = pci_alloc_consistent(pdev, rxdr->size, &rxdr->dma); diff --git a/drivers/net/ixgb/ixgb_param.c b/drivers/net/ixgb/ixgb_param.c index b27442a..ee8cc67 100644 --- a/drivers/net/ixgb/ixgb_param.c +++ b/drivers/net/ixgb/ixgb_param.c @@ -284,7 +284,7 @@ ixgb_check_options(struct ixgb_adapter *adapter) } else { tx_ring->count = opt.def; } - IXGB_ROUNDUP(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); + tx_ring->count = ALIGN(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); } { /* Receive Descriptor Count */ struct ixgb_option opt = { @@ -303,7 +303,7 @@ ixgb_check_options(struct ixgb_adapter *adapter) } else { rx_ring->count = opt.def; } - IXGB_ROUNDUP(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); + rx_ring->count = ALIGN(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); } { /* Receive Checksum Offload Enable */ struct ixgb_option opt = { -- Milind Arun Choudhary
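Both the removed IXGB_ROUNDUP macro and its ALIGN replacement rely on the same power-of-two rounding trick; the visible difference in the patch is that IXGB_ROUNDUP assigned to its first argument, while ALIGN is used as an expression, so every call site becomes an explicit assignment. A small userspace sketch of the arithmetic, with align_up as a hypothetical stand-in for the kernel macro:

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two. */
uint32_t align_up(uint32_t x, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    /* Adding a-1 carries into the next multiple unless x is already aligned;
     * masking with ~(a-1) then clears the low bits. */
    return (x + a - 1) & ~(a - 1);
}
```

For example, a descriptor ring of 4097 bytes rounds up to 8192, while one of exactly 4096 bytes stays at 4096.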
Re: [PATCH] sched: staircase deadline misc fixes
On Thursday 29 March 2007 15:50, Mike Galbraith wrote:
> On Thu, 2007-03-29 at 09:44 +1000, Con Kolivas wrote:
> + * This contains a bitmap for each dynamic priority level with empty slots
> + * for the valid priorities each different nice level can have. It allows
> + * us to stagger the slots where differing priorities run in a way that
> + * keeps latency differences between different nice levels at a minimum.
> + * ie, where 0 means a slot for that priority, priority running from left to
> + * right:
> + * nice -20
> + * nice -10 1001000100100010001001000100010010001000
> + * nice 0 0101010101010101010101010101010101010101
> + * nice 5 1101011010110101101011010110101101011011
> + * nice 10 0110111011011101110110111011101101110111
> + * nice 15 0101101101011011
> + * nice 19 1110

Try two instances of chew.c at _differing_ nice levels on one cpu on mainline, and then SD. This is why you can't renice X on mainline.

> -Mike

--
-ck

/*
 * original idea by Chris Friesen. Thanks.
 */
#include <stdio.h>
#include <sched.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

#define THRESHOLD_USEC 2000

/* Current wall-clock time in microseconds. */
unsigned long long stamp()
{
        struct timeval tv;
        gettimeofday(&tv, 0);
        return (unsigned long long) tv.tv_usec + ((unsigned long long) tv.tv_sec)*1000000;
}

int main()
{
        unsigned long long thresh_ticks = THRESHOLD_USEC;
        unsigned long long cur,last;
        struct timespec ts;

        sched_rr_get_interval(0, &ts);
        printf("pid %d, prio %3d, interval of %ld nsec\n", getpid(), getpriority(PRIO_PROCESS, 0), ts.tv_nsec);

        last = stamp();
        /* Spin forever and report every gap longer than the threshold,
         * i.e. every time this process was scheduled out. */
        while(1) {
                cur = stamp();
                unsigned long long delta = cur-last;
                if (delta > thresh_ticks) {
                        printf("pid %d, prio %3d, out for %4llu ms\n", getpid(), getpriority(PRIO_PROCESS, 0), delta/1000);
                        cur = stamp();
                }
                last = cur;
        }
        return 0;
}
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Saturday 31 March 2007 19:28, Xenofon Antidides wrote: > For long time now I use windows to work > problems. I cannot play wine games with audio, I > cannot sample video, I cannot use skype, I cannot play > midi. And even linux only things I try do I cannot > share my X, I cannot use more than one vmware. All > those is fix for me with SD. Any semblance of cpu bandwidth and latency guarantees are easily shot on mainline by a single process going wild (eg open tab in firefox). > I sorry I answer kernel > email and go away now for good. respected; dropped from cc -- -ck
Re: [PATCH] sched: staircase deadline misc fixes
On Thursday 29 March 2007 18:18, Mike Galbraith wrote: > Rereading to make sure I wasn't unclear anywhere... > > On Thu, 2007-03-29 at 07:50 +0200, Mike Galbraith wrote: > > I don't see what a < 95% load really means. > > Egad. Here I'm pondering the numbers and light load as I'm typing, and > my fingers (seemingly independent when mind wanders off) typed < 95% as > in not fully committed, instead of "light". 95% of cases where load is less than 4; not 95% load. -- -ck
Re: ENOENT creating /dev/root on MTD RAM partition
Olaf Hering wrote: On Mon, Apr 02, John Williams wrote: Any comments or suggestions on a possible cause or approach to track it down would be greatly appreciated. Just a guess: Check if '/dev' exists. I think it is now possible to not add the built-in cpio archive with the mandatory /dev, /dev/console and /root entries. Thanks Olaf - you got me on the right track. Turned out to be an arch link script error whereby the "rootfs" initcalls were not placed in the initcall table - thus the init ramfs wasn't yet mounted, and /dev wasn't there. Cheers, John
Re: hugetlb: Unable to handle kernel NULL pointer dereference
On 02.04.2007 [18:46:08 -0700], Nishanth Aravamudan wrote: > Adam, David, > > Just got the following Oops and recursive fault running `make func` > (apparently the `shared` test in particular) with kernel HEAD at > efab03d998da03f67836ffc664b04e0400f85448 on my x86_64. Will pull latest > Linus and reboot, but haven't seen it posted yet. This occurs both with > my branch and master from libhugetlbfs. Bah, blame operator error, perhaps. Not able to reproduce with current Linus tip. Thanks, Nish -- Nishanth Aravamudan <[EMAIL PROTECTED]> IBM Linux Technology Center
hugetlb: Unable to handle kernel NULL pointer dereference
Adam, David, Just got the following Oops and recursive fault running `make func` (apparently the `shared` test in particular) with kernel HEAD at efab03d998da03f67836ffc664b04e0400f85448 on my x86_64. Will pull latest Linus and reboot, but haven't seen it posted yet. This occurs both with my branch and master from libhugetlbfs. [351419.028499] Unable to handle kernel NULL pointer dereference at 0018 RIP: [351419.047110] [] _write_lock_irqsave+0x1d/0x90 [351419.047121] PGD 0 [351419.047124] Oops: 0002 [1] PREEMPT SMP [351419.047127] CPU 0 [351419.047129] Modules linked in: tun [351419.047135] Pid: 29122, comm: shared Not tainted 2.6.21-rc5-gefab03d9-dirty #25 [351419.047138] RIP: 0010:[] [] _write_lock_irqsave+0x1d/0x90 [351419.047144] RSP: 0018:81000a5d3c58 EFLAGS: 00010017 [351419.047148] RAX: 81000a5d3fd8 RBX: 0018 RCX: 81003fe71028 [351419.047151] RDX: 0217 RSI: 81000a5d3cd0 RDI: 0001 [351419.047155] RBP: 81000a5d3c68 R08: 0005 R09: fffc [351419.047158] R10: 0001 R11: 0246 R12: 81003fe71000 [351419.047162] R13: 810037052080 R14: 55c0 R15: 81000a5d3cb8 [351419.047166] FS: 2aea683c6030() GS:80728000() knlGS:556f16c0 [351419.047169] CS: 0010 DS: 002b ES: 002b CR0: 8005003b [351419.047172] CR2: 0018 CR3: 00101000 CR4: 06e0 [351419.047177] Process shared (pid: 29122, threadinfo 81000a5d2000, task 81001130c740) [351419.047179] Stack: 81000a5d3c78 0018 81000a5d3c78 8016a7a9 [351419.047187] 81000a5d3c98 8014c404 81003fe71000 81000a5d3c90 [351419.047193] 81000a5d3d08 801c2864 810037052100 8100087238c8 [351419.047198] Call Trace: [351419.047204] [] _write_lock_irq+0x9/0x10 [351419.047210] [] remove_from_page_cache+0x24/0x40 [351419.047216] [] __unmap_hugepage_range+0x174/0x1b0 [351419.047220] [] unmap_hugepage_range+0x47/0x70 [351419.047225] [] unmap_vmas+0x11b/0x7f0 [351419.047230] [] exit_mmap+0x87/0x130 [351419.047234] [] mmput+0x37/0xb0 [351419.047238] [] exit_mm+0xe5/0xf0 [351419.047242] [] do_exit+0x238/0x8c0 [351419.047246] [] _spin_unlock+0x14/0x40 [351419.047250] 
[] do_group_exit+0x89/0x90 [351419.047255] [] sys_exit_group+0x12/0x20 [351419.047262] [] cstar_do_call+0x1b/0x65 [351419.047264] [351419.047265] [351419.047266] Code: f0 81 2b 00 00 00 01 0f 94 c0 84 c0 75 4e f0 81 03 00 00 00 [351419.047275] RIP [] _write_lock_irqsave+0x1d/0x90 [351419.047280] RSP [351419.047282] CR2: 0018 [351419.047287] Fixing recursive fault but reboot is needed! [351419.047290] BUG: scheduling while atomic: shared/0x0003/29122 [351419.047292] [351419.047293] Call Trace: [351419.047298] [] __sched_text_start+0x5d/0x807 [351419.047304] [] default_wake_function+0xd/0x10 [351419.047310] [] autoremove_wake_function+0x11/0x40 [351419.047315] [] __wake_up_common+0x44/0x80 [351419.047319] [] do_exit+0x131/0x8c0 [351419.047323] [] do_page_fault+0x7f6/0x8f0 [351419.047328] [] wake_up_bit+0x28/0x40 [351419.047332] [] invalidate_inode_buffers+0x13/0xd0 [351419.047336] [] _spin_lock+0x16/0x80 [351419.047339] [] _spin_lock+0x16/0x80 [351419.047343] [] dput+0x22/0x160 [351419.047347] [] error_exit+0x0/0x84 [351419.047351] [] _write_lock_irqsave+0x1d/0x90 [351419.047355] [] _write_lock_irq+0x9/0x10 [351419.047359] [] remove_from_page_cache+0x24/0x40 [351419.047363] [] __unmap_hugepage_range+0x174/0x1b0 [351419.047368] [] unmap_hugepage_range+0x47/0x70 [351419.047371] [] unmap_vmas+0x11b/0x7f0 [351419.047376] [] exit_mmap+0x87/0x130 [351419.047379] [] mmput+0x37/0xb0 [351419.047383] [] exit_mm+0xe5/0xf0 [351419.047387] [] do_exit+0x238/0x8c0 [351419.047390] [] _spin_unlock+0x14/0x40 [351419.047394] [] do_group_exit+0x89/0x90 [351419.047398] [] sys_exit_group+0x12/0x20 [351419.047402] [] cstar_do_call+0x1b/0x65 [351419.047404] Thanks, Nish -- Nishanth Aravamudan <[EMAIL PROTECTED]> IBM Linux Technology Center
Re: [linux-usb-devel] [RFC] HID bus design overview.
Nicolas Mailhot wrote: >> Er, I also want to know what are drawbacks of "flip-flopping" ? >> > > This will cause major havoc as soon as hot-plugging and apps listening to > HAL events (xorg eventually) enter in play. > > ~_~ It really need some extra works in user space, but I do not think this is so critical. These HAL events should not be frequently, and happen when system boot early very likely. In fact, these works also exist with blacklist means, but it migrate to HID driver developer, and from runtime move to development-time. (Of course, you can do it by sysfs, just like vmware, I think it is so). Although I do not agree very much, since such many guru said the "flip-flopping" is not good idea, It is likely appropriate, I also will change code later, this make the implementation more easier in fact. May be, we need some means to change blacklist in runtime. and loading/unloading such driver by specific script to do it.
Re: Warning: unable to open an initial console.
Hi, Check if your U-boot enabled the udev, try diable the udev, then using mknod to create the /dev/console. Regards dave 2007/4/2, Tom Strader <[EMAIL PROTECTED]>: I checked /dev/ with U-boot and it shows the existence of /dev/console. From U-boot prompt: $ ls /dev crw---0 Mon Apr 02 17:52:27 2007 console crw-r--r--0 Mon Apr 02 17:52:27 2007 null crw-r--r--0 Mon Apr 02 17:52:27 2007 zero Also, I added a printk in the jffs2_add_fd_to_list() routine in fs/jffs2/nodelist.c to print out the dirent adds and it shows console being added as follows: ... add dirent "var", ino #14 add dirent "usr", ino #13 add dirent "tmp", ino #12 add dirent "sys", ino #11 add dirent "sbin", ino #10 add dirent "proc", ino #9 add dirent "mnt", ino #8 add dirent "linuxrc", ino #7 add dirent "lib", ino #6 add dirent "home", ino #5 add dirent "etc", ino #4 add dirent "dev", ino #3 add dirent "bin", ino #2 VFS: Mounted root (jffs2 filesystem). Freeing init memory: 76K add dirent "zero", ino #70 add dirent "null", ino #69 add dirent "console", ino #68 Warning: unable to open an initial console. add dirent "watchdog", ino #262 add dirent "udevstart", ino #261 add dirent "udevsend", ino #260 ... Any other ideas? Thanks, Tom -Original Message- From: Chris Wedgwood [mailto:[EMAIL PROTECTED] Sent: Monday, April 02, 2007 3:49 PM To: Tom Strader Cc: linux-kernel@vger.kernel.org Subject: Re: Warning: unable to open an initial console. On Mon, Apr 02, 2007 at 12:04:56PM -0700, Tom Strader wrote: > I have seen quite a few posts regarding unable to open an initial > console, but my system seems to have the necessary things in place > so I come looking for help. your rootfs/initramfs/initrd is missing a valid working /dev/console > VFS: Mounted root (jffs2 filesystem). 
check /dev/ on this filesystem
Re: 2.6.21-rc5 possible regression: KDE processes die silently (was: 2.6.21-rc3-mm2: KDE processes die while system is idle)
Adrian Bunk <[EMAIL PROTECTED]> writes: > On Sun, Apr 01, 2007 at 05:21:06PM +0200, Tilman Schmidt wrote: >> I'm sorry to say this has now happened with kernel 2.6.21-rc5, too. >> I started a kernel compilation in the evening and came back in the >> morning to find all KDE decorations gone. All processes normally >> running for a KDE session and labelled "[kinit]" in ps were gone >> but everything else was running fine, and the system was still >> usable via ssh. /var/log/kdm.log and /var/log/Xorg.0.log contained >> nothing remotely suspicious. /var/log/messages had two lines I >> never saw before: >> >> Mar 31 02:27:36 gx110 kernel: [153577.891443] ReiserFS: hda3: warning: > vs-8115: get_num_ver: not directory or indirect item >> Mar 31 02:27:36 gx110 kernel: [153577.891559] ReiserFS: hda3: warning: > vs-8115: get_num_ver: not directory or indirect item > > Reiserfs people Cc'ed for this. > >> But those didn't appear on previous occurrences of the "dying KDE" >> problem so I guess they are not related. >> >> This is SUSE LINUX 10.0 (i586) running on a Dell OptiPlex GX110 >> (Intel P3, 933 MHz, i810 chipset, 512 MB RAM, 60 GB ATA disk) >> % uname -a >> Linux gx110 2.6.21-rc5-noinitrd #1 PREEMPT Sat Mar 31 02:15:19 CEST 2007 i686 > i686 i386 GNU/Linux >> % cat /proc/cmdline >> root=/dev/hda3 selinux=0 x11i=vesa video=intelfb:[EMAIL PROTECTED] > nmi_watchdog=2 lapic 5 >> Kernel configuration mostly-modular, based on standard SuSE kernel's >> /proc/config.gz, just compiling into the kernel everything I need to >> boot without an initrd and omitting some parts I'm not interested in. >> (.config attached.) What else might be relevant? >> >> Again, this is a Heisenbug, ie. it's not reproducible and invariably >> happens when I'm away from the machine. (Probably Murphy at work.) >> It's pretty rare: I have seen it four times on 2.6.21-rc3-mm2 and >> once on 2.6.21-rc5, on a machine which spends about equal amounts >> of time running the latest stable, rc, and mm kernels. 
OTOH, so far >> it hasn't ever happened with any 2.6.20 or earlier kernel. Nor have >> I seen it with 2.6.21-rc[1-4] or 2.6.21-rc4-mm* - but for the -rc4 >> and -rc4-mm releases that's not conclusive as those have only been >> running for a very short time. > > We also have another report of crashes under KDE: > > Subject: crashes in KDE > References : http://bugzilla.kernel.org/show_bug.cgi?id=8157 > Submitter : Oliver Pinter <[EMAIL PROTECTED]> > Status : unknown > > We also have one bug kwin ran into that got fixed after -rc5: > > Subject: kwin dies silently > References : http://lkml.org/lkml/2007/2/28/112 > Submitter : Sid Boyce <[EMAIL PROTECTED]> > Boris Mogwitz <[EMAIL PROTECTED]> > Michael Wu <[EMAIL PROTECTED]> > Caused-By : Eric W. Biederman <[EMAIL PROTECTED]> > commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1 > Fixed-By : Eric W. Biederman <[EMAIL PROTECTED]> > Commit : 14e9d5730adfca26452b3a2838a80af6950556f5 > Status : fixed in -rc6 > > These might or might not be related issues. The description above sounds like the kwin bug, except for the trigger. (i.e. The set of processes that die are all largely part of the same process group, and they sound like the same set of processes). So I'm guessing it is the same issue. The crashes in KDE bug may also be the same problem there isn't enough information to make a good guess. So until -rc6 we get a test report from -rc6 after or whatever Linus latest tree is I'm not inclined to dig farther. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] vt: Expose system-wide UTF-8 default setting via sysfs
Create a variable, default_utf8, that defines the system-wide default UTF-8 setting. This variable can be altered via sysfs. If the variable is properly set, this should minimize breakage of UTF-8 encoded consoles when doing a reset or echo -e '\033c' and of newly opened/allocated consoles.

This is based on patches by Jan Engelhardt and Paul LeoNerd Evans.

Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]>
---

> I think you're missing the whole point of console reset. Its purpose is
> to force the console into a known-good state. The fewer pieces of state
> it leaves unset, the better. To some degree it's less important what
> that state actually is.

Okay, you convinced me. Hopefully this is acceptable to all parties.

Andrew,

If everybody agrees, can you drop the previous patch I sent to you, and use this instead?

Tony

 drivers/char/vt.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/char/vt.c b/drivers/char/vt.c
index 1bbb45b..8aca96f 100644
--- a/drivers/char/vt.c
+++ b/drivers/char/vt.c
@@ -157,6 +157,8 @@ static void blank_screen_t(unsigned long
 static void set_palette(struct vc_data *vc);
 
 static int printable;		/* Is console ready for printing? */
+static int default_utf8;
+module_param(default_utf8, int, S_IRUGO | S_IWUSR);
 
 /*
  * ignore_poke: don't unblank the screen when things are typed.  This is
@@ -1497,7 +1499,7 @@ static void reset_terminal(struct vc_dat
 	vc->vc_charset		= 0;
 	vc->vc_need_wrap	= 0;
 	vc->vc_report_mouse	= 0;
-	vc->vc_utf		= 0;
+	vc->vc_utf		= default_utf8;
 	vc->vc_utf_count	= 0;
 	vc->vc_disp_ctrl	= 0;
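To make the behavioural change easier to see outside the kernel, here is a small userspace sketch of it. It is not the vt.c code: struct vc_sketch and reset_terminal_sketch() are hypothetical stand-ins that model only the two fields near the changed line. The point is simply that a reset now forces the per-console UTF-8 flag to the system-wide default rather than to 0.

```c
/* Hypothetical stand-in for the vc_data fields near the changed line. */
struct vc_sketch {
	int vc_utf;        /* 1 = console is in UTF-8 mode, 0 = legacy */
	int vc_utf_count;  /* pending continuation bytes of a UTF-8 sequence */
};

/* System-wide default; in the patch this is a writable module parameter. */
int default_utf8;

/* Models the relevant part of reset_terminal() after the patch. */
void reset_terminal_sketch(struct vc_sketch *vc)
{
	vc->vc_utf = default_utf8;  /* previously: vc->vc_utf = 0 */
	vc->vc_utf_count = 0;
}
```

With default_utf8 left at 0 the reset behaves as before, so existing setups should see no change unless the default is explicitly set.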
Re: [linux-usb-devel] [RFC] HID bus design overview.
Marcel Holtmann wrote: > The cleanest solution without a layer violation is that you can > register a driver for a specific VID/PID and then report id (one or > more). All > reports with ids that we don't have a special driver for are handled by > the default HID->input driver or handed over to hidraw if not parseable. > The reports for ids with a special driver are handed over to the driver. > > And for hidraw it would be nice if we can apply filters for specific > report ids to keep the round-trips and overhead at a minimum. > > If we don't use "flip-flopping" means, the common driver and specific driver concepts also don't need. They are completely same driver for HID bus, just one without some hooks, another without. The common event processing is an API from HID core. so, here have not round-trips. What's the position of hidraw? It only is used when all other driver is not usable on some report? or, it should be stick every working device. PS: In last broken "flip-flopping" resolution, the USBHID work also need some changes ;) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [test] hackbench.c interactivity results: vanilla versus SD/RSDL
On Thursday 29 March 2007 21:22, Ingo Molnar wrote:
> [ A quick guess: could SD's substandard interactivity in this test be
> due to the SMP migration logic inconsistencies Mike noticed? This is
> an SMP system and the hackbench workload is very scheduling intense
> and tasks are frequently queued from one CPU to another. ]

I assume you put it in an endless loop since hackbench 10 runs for .5 second on my machine. Doubtful it's an SMP issue. update_if_moved should maintain cross cpu scheduling decisions. The same slowdown would happen on UP and is almost certainly due to the fact that hackbench 10 induces a load of _160_ on the machine.

--
-ck
[PATCH] sched: staircase deadline improvements
Staircase Deadline improvements. Nice is better distributed for waking tasks with a per-static-prio prio_level. SCHED_RR tasks were not being requeued on expiration. Tighten up accounting. Fix comment style. Microoptimisation courtesy of Dmitry Adamushko <[EMAIL PROTECTED]> Signed-off-by: Con Kolivas <[EMAIL PROTECTED]> --- kernel/sched.c | 97 +++-- 1 file changed, 60 insertions(+), 37 deletions(-) Index: linux-2.6.21-rc5-mm3/kernel/sched.c === --- linux-2.6.21-rc5-mm3.orig/kernel/sched.c2007-04-02 10:37:07.0 +1000 +++ linux-2.6.21-rc5-mm3/kernel/sched.c 2007-04-03 10:40:48.0 +1000 @@ -132,20 +132,20 @@ struct rq; * These are the runqueue data structures: */ struct prio_array { - struct list_head queue[MAX_PRIO]; /* Tasks queued at each priority */ + struct list_head queue[MAX_PRIO]; - DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1); /* * The bitmap of priorities queued for this array. While the expired * array will never have realtime tasks on it, it is simpler to have * equal sized bitmaps for a cheap array swap. Include 1 bit for * delimiter. */ + DECLARE_BITMAP(prio_bitmap, MAX_PRIO + 1); #ifdef CONFIG_SMP - struct rq *rq; /* For convenience looks back at rq */ + struct rq *rq; #endif }; @@ -212,14 +212,14 @@ struct rq { struct prio_array *active, *expired, arrays[2]; unsigned long *dyn_bitmap, *exp_bitmap; - int prio_level, best_static_prio; /* -* The current dynamic priority level this runqueue is at, and the -* best static priority queued this major rotation. +* The current dynamic priority level this runqueue is at per static +* priority level, and the best static priority queued this rotation. 
*/ + int prio_level[PRIO_RANGE], best_static_prio; - unsigned long prio_rotation; /* How many times we have rotated the priority queue */ + unsigned long prio_rotation; atomic_t nr_iowait; @@ -707,19 +707,29 @@ static inline int first_prio_slot(struct static inline int next_entitled_slot(struct task_struct *p, struct rq *rq) { DECLARE_BITMAP(tmp, PRIO_RANGE); - int search_prio; + int search_prio, uprio = USER_PRIO(p->static_prio); - if (p->static_prio < rq->best_static_prio) + /* +* Only priorities equal to the prio_level and above for their +* static_prio are acceptable, and only if it's not better than +* a queued better static_prio's prio_level. +*/ + if (p->static_prio < rq->best_static_prio) { search_prio = MAX_RT_PRIO; - else - search_prio = rq->prio_level; + if (likely(p->policy != SCHED_BATCH)) + rq->best_static_prio = p->static_prio; + } else if (p->static_prio == rq->best_static_prio) + search_prio = rq->prio_level[uprio]; + else { + search_prio = max(rq->prio_level[uprio], + rq->prio_level[USER_PRIO(rq->best_static_prio)]); + } if (unlikely(p->policy == SCHED_BATCH)) { search_prio = max(search_prio, p->static_prio); return SCHED_PRIO(find_next_zero_bit(p->bitmap, PRIO_RANGE, USER_PRIO(search_prio))); } - bitmap_or(tmp, p->bitmap, prio_matrix[USER_PRIO(p->static_prio)], - PRIO_RANGE); + bitmap_or(tmp, p->bitmap, prio_matrix[uprio], PRIO_RANGE); return SCHED_PRIO(find_next_zero_bit(tmp, PRIO_RANGE, USER_PRIO(search_prio))); } @@ -745,14 +755,18 @@ static void queue_expired(struct task_st if (src_rq == rq) return; - if (p->rotation == src_rq->prio_rotation) + /* +* Only need to set p->array when p->rotation == rq->prio_rotation as +* they will be set in recalc_task_prio when != rq->prio_rotation. 
+*/ + if (p->rotation == src_rq->prio_rotation) { p->rotation = rq->prio_rotation; - else + if (p->array == src_rq->expired) + p->array = rq->expired; + else + p->array = rq->active; + } else p->rotation = 0; - if (p->array == src_rq->expired) - p->array = rq->expired; - else - p->array = rq->active; } #else static inline void update_if_moved(struct task_struct *p, struct rq *rq) @@ -1671,16 +1685,16 @@ void fastcall sched_fork(struct task_str * total amount of pending timeslices in the system doesn't change, * resulting in more scheduling fairness. */ - if (unlikely(p->time_slice < 2)) - p->time_slice = 2; - p->time_slice
Re: [PATCH] vt: Do not clear UTF when resetting console
Jan Engelhardt wrote: On Apr 3 2007 08:16, Antonino A. Daplas wrote: That would be the cleanest and purest behavior. But it's possible to set one console to UTF-8 and another to legacy mode. The question would be: why would you want to have mixed consoles? Switching to UTF8 IMO does not take away any characters, and I mean no-framebuffer 80x25 that is limited to 256 glyphs. 512, not 256. However, the reason would be because you have an application (which might actually be running on another system entirely!) which expects the other behaviour. Antonio wrote: That would be the cleanest and purest behavior. But it's possible to set one console to UTF-8 and another to legacy mode. So one can corrupt the user's console just by issuing a reset or echo -e '\033c'. (Although one can argue that users who know what UTF-8 is also knows how to set the encoding back) Until userspace is more capable of setting back the terminal to its previous configuration, I would tend to agree with Jan, that we should leave the current utf setting of that particular vc alone. I think you're missing the whole point of console reset. Its purpose is to force the console into a known-good state. The fewer pieces of state it leaves unset, the better. To some degree it's less important what that state actually is. -hpa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] vt: Do not clear UTF when resetting console
On Tue, 2007-04-03 at 02:23 +0200, Jan Engelhardt wrote: > On Apr 3 2007 08:16, Antonino A. Daplas wrote: > > > >That would be the cleanest and purest behavior. But it's possible to set > >one console to UTF-8 and another to legacy mode. > > The question would be: why would you want to have mixed consoles? > Switching to UTF8 IMO does not take away any characters, and I mean > no-framebuffer 80x25 that is limited to 256 glyphs. As long as we provide the users the capability to support mixed encodings, the why is not important, it will happen. If we want to be more restrictive, I guess, we can remove support for echo -e '\033%G' and '\033%@' Tony
Re: Powerpc build unhappy in 2.6.20.4?
On Mon, Apr 02, 2007 at 03:14:14PM -0400, Rob Landley wrote:
> Sure, quite easily the source of the trouble. Attached in both full .config
> and mini.config formats.

Okay, I have no idea how it happened but you seem to have an invalid config. It looks to me like you need to select a platform. One of the following:

CONFIG_PPC_PSERIES
CONFIG_PPC_MAPLE
CONFIG_PPC_IBM_CELL_BLADE
CONFIG_PPC_PS3
CONFIG_PPC_CHRP
CONFIG_PPC_EFIKA
CONFIG_PPC_PMAC

When did this config last build a zImage? I'm guessing either CHRP or PMAC?

Yours Tony

linux.conf.au http://linux.conf.au/ || http://lca2008.linux.org.au/
Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!
Re: [PATCH] (re)register_binfmt returns with -EBUSY
On Mon, 2 Apr 2007 18:24:15 +0530 "kalash nainwal" <[EMAIL PROTECTED]> wrote:

> When a binary format is unregistered and re-registered,
> register_binfmt fails with -EBUSY. The reason is that
> unregister_binfmt does not set fmt->next to NULL, and seeing
> (fmt->next != NULL), register_binfmt fails with -EBUSY.
>
> One can find his way around by explicitly setting fmt->next to NULL
> after unregistering, but that is kind of unclean (one should better be
> using only the interfaces, and not the internal members, isn't it?)
>
> Attached one-liner can fix it (for 2.6.20).

Yes, that'll fix it.

But I wonder why register_binfmt() even checks that the to-be-registered linux_binfmt has a non-null fmt->next? Presumably that's there to catch erroneous re-registration of an already-registered format.

All very odd. It looks like that code should be converted to list_heads anyway...
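The failure mode discussed here can be reproduced with a userspace model of a singly linked registration list. This is only a sketch of the logic as described in the thread, not the fs/exec.c code: struct fmt_sketch, register_fmt() and unregister_fmt() are hypothetical names, and the clear_next flag exists purely so the old and the fixed behaviour can be compared side by side.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct linux_binfmt: only the list link matters. */
struct fmt_sketch {
	struct fmt_sketch *next;
	const char *name;
};

static struct fmt_sketch *formats;  /* head of the registered list */

/* Refuses a format whose ->next is already set, as described above. */
int register_fmt(struct fmt_sketch *fmt)
{
	struct fmt_sketch **tmp = &formats;

	if (!fmt)
		return -EINVAL;
	if (fmt->next)
		return -EBUSY;
	while (*tmp) {
		if (fmt == *tmp)
			return -EBUSY;
		tmp = &(*tmp)->next;
	}
	fmt->next = formats;
	formats = fmt;
	return 0;
}

/* clear_next = 0 models the reported behaviour, 1 models the one-line fix. */
int unregister_fmt(struct fmt_sketch *fmt, int clear_next)
{
	struct fmt_sketch **tmp = &formats;

	while (*tmp) {
		if (fmt == *tmp) {
			*tmp = fmt->next;
			if (clear_next)
				fmt->next = NULL;
			return 0;
		}
		tmp = &(*tmp)->next;
	}
	return -EINVAL;
}
```

Note that the stale pointer only shows up when the unregistered format was not the last entry in the list, which may be why the problem is easy to miss with a single format.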
[PATCH 2/3] x86_64: Implement SPARSE_VIRTUAL
x86_64 implement SPARSE_VIRTUAL x86_64 is using 2M page table entries to map its 1-1 kernel space. We implement the virtual memmap also using 2M page table entries. So there is no difference at all to FLATMEM. Both schemes require a page table and a TLB for each 2MB. FLATMEM still references memory since the mem_map pointer itself a variable. SPARSE_VIRTUAL uses a constant for vmemmap. Thus no memory reference. SPARSE_VIRTUAL should be superior to even FLATMEM. With this SPARSEMEM becomes the most efficient way of handling virt_to_page, pfn_to_page and friends for UP, SMP and NUMA on x86_64. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> Index: linux-2.6.21-rc5-mm3/include/asm-x86_64/page.h === --- linux-2.6.21-rc5-mm3.orig/include/asm-x86_64/page.h 2007-04-02 12:25:03.0 -0700 +++ linux-2.6.21-rc5-mm3/include/asm-x86_64/page.h 2007-04-02 12:27:16.0 -0700 @@ -127,6 +127,7 @@ VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) #define __HAVE_ARCH_GATE_AREA 1 +#define vmemmap ((struct page *)0xe200UL) #include #include Index: linux-2.6.21-rc5-mm3/Documentation/x86_64/mm.txt === --- linux-2.6.21-rc5-mm3.orig/Documentation/x86_64/mm.txt 2007-04-02 12:25:03.0 -0700 +++ linux-2.6.21-rc5-mm3/Documentation/x86_64/mm.txt2007-04-02 12:27:16.0 -0700 @@ -9,6 +9,7 @@ 8100 - c0ff (=46 bits) direct mapping of all phys. memory c100 - c1ff (=40 bits) hole c200 - e1ff (=45 bits) vmalloc/ioremap space +e200 - e2ff (=40 bits) virtual memory map ... unused hole ... 8000 - 8280 (=40 MB) kernel text mapping, from phys 0 ... unused hole ... 
Index: linux-2.6.21-rc5-mm3/arch/x86_64/Kconfig === --- linux-2.6.21-rc5-mm3.orig/arch/x86_64/Kconfig 2007-04-02 12:27:13.0 -0700 +++ linux-2.6.21-rc5-mm3/arch/x86_64/Kconfig2007-04-02 12:28:13.0 -0700 @@ -392,6 +392,12 @@ def_bool y depends on (NUMA || EXPERIMENTAL) +config SPARSE_VIRTUAL + def_bool y + +config ARCH_SUPPORTS_PMD_MAPPING + def_bool y + config ARCH_MEMORY_PROBE def_bool y depends on MEMORY_HOTPLUG - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/3] IA64: Implement SPARSE_VIRTUAL
[IA64] Sparse virtual implementation

Equip IA64 sparsemem with a virtual memmap. This is similar to the existing CONFIG_VMEMMAP functionality for discontig. It uses a page size mapping.

This is provided as a minimally intrusive solution. We split the 128TB VMALLOC area into two 64TB areas and use one for the virtual memmap.

I have another patch in testing that uses granule sized 16MB pages to map the memmap but this would require changes to the interrupt vector table and there are certain discussions that would need to take place before we can accept such a large page size for the memmap. That version is better because it improves IA64 performance by reducing TLB pressure.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5-mm2/arch/ia64/Kconfig
===
--- linux-2.6.21-rc5-mm2.orig/arch/ia64/Kconfig 2007-04-02 16:15:29.0 -0700
+++ linux-2.6.21-rc5-mm2/arch/ia64/Kconfig 2007-04-02 16:15:50.0 -0700
@@ -350,6 +350,10 @@ config ARCH_SPARSEMEM_ENABLE
 	def_bool y
 	depends on ARCH_DISCONTIGMEM_ENABLE
 
+config SPARSE_VIRTUAL
+	def_bool y
+	depends on ARCH_SPARSEMEM_ENABLE
+
 config ARCH_DISCONTIGMEM_DEFAULT
 	def_bool y if (IA64_SGI_SN2 || IA64_GENERIC || IA64_HP_ZX1 || IA64_HP_ZX1_SWIOTLB)
 	depends on ARCH_DISCONTIGMEM_ENABLE
Index: linux-2.6.21-rc5-mm2/include/asm-ia64/page.h
===
--- linux-2.6.21-rc5-mm2.orig/include/asm-ia64/page.h 2007-04-02 16:15:29.0 -0700
+++ linux-2.6.21-rc5-mm2/include/asm-ia64/page.h 2007-04-02 16:15:50.0 -0700
@@ -106,6 +106,9 @@ extern int ia64_pfn_valid (unsigned long
 # define ia64_pfn_valid(pfn) 1
 #endif
 
+#define vmemmap ((struct page *)(RGN_BASE(RGN_GATE) + \
+			(1UL << (4*PAGE_SHIFT - 10))))
+
 #ifdef CONFIG_VIRTUAL_MEM_MAP
 extern struct page *vmem_map;
 #ifdef CONFIG_DISCONTIGMEM
Index: linux-2.6.21-rc5-mm2/include/asm-ia64/pgtable.h
===
--- linux-2.6.21-rc5-mm2.orig/include/asm-ia64/pgtable.h 2007-04-02 16:15:29.0 -0700
+++ linux-2.6.21-rc5-mm2/include/asm-ia64/pgtable.h 2007-04-02 16:15:50.0 -0700
@@ -236,8 +236,13 @@ ia64_phys_addr_valid (unsigned long addr
 # define VMALLOC_END vmalloc_end
 extern unsigned long vmalloc_end;
 #else
+#if defined(CONFIG_SPARSEMEM) && defined(CONFIG_SPARSE_VIRTUAL)
+/* SPARSE_VIRTUAL uses half of vmalloc... */
+# define VMALLOC_END (RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 10)))
+#else
 # define VMALLOC_END (RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 9)))
 #endif
+#endif
 
 /* fs/proc/kcore.c */
 #define kc_vaddr_to_offset(v) ((v) - RGN_BASE(RGN_GATE))
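The "128TB split into two 64TB areas" figure can be checked against the shift expressions in the VMALLOC_END definitions. The sketch below assumes 16KB pages (PAGE_SHIFT of 14); that page size is an assumption chosen because it reproduces the numbers in the description, not something the patch states. gate_offset() is a hypothetical helper that computes the offset from RGN_BASE(RGN_GATE).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16KB pages, i.e. PAGE_SHIFT == 14 (not stated in the patch). */
#define SKETCH_PAGE_SHIFT 14

/*
 * Offset from RGN_BASE(RGN_GATE) given by 1 << (4*PAGE_SHIFT - sub).
 * sub == 9 is the original VMALLOC_END, sub == 10 the SPARSE_VIRTUAL one.
 */
uint64_t gate_offset(unsigned int sub)
{
	assert(sub < 4 * SKETCH_PAGE_SHIFT);
	return (uint64_t)1 << (4 * SKETCH_PAGE_SHIFT - sub);
}
```

Under that assumption gate_offset(9) is 2^47 bytes (128TB) and gate_offset(10) is 2^46 bytes (64TB). The new VMALLOC_END offset is also where the vmemmap definition places its base, so vmalloc would keep the lower 64TB and the virtual memmap would start right above it.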
[PATCH 1/1] Generic Virtual Memmap support for SPARSEMEM V2
Sparse Virtual: Virtual Memmap support for SPARSEMEM V2

V1->V2
- Support for PAGE_SIZE vmemmap which allows the general use of virtual memmap on any MMU capable platform (enabled IA64 support).
- Fix various issues as suggested by Dave Hansen.
- Add comments and error handling.

SPARSEMEM is a pretty nice framework that unifies quite a bit of code over all the arches. It would be great if it could be the default so that we can get rid of various forms of DISCONTIG and other variations on memory maps. So far what has hindered this are the additional lookups that SPARSEMEM introduces for virt_to_page and page_address. This goes so far that the code to do this has to be kept in a separate function and cannot be used inline.

This patch introduces virtual memmap support for sparsemem. virt_to_page, page_address and consorts become simple shift/add operations. No page flag fields, no table lookups, nothing involving memory is required.

The two key operations pfn_to_page and page_to_pfn become:

#define pfn_to_page(pfn)	(vmemmap + (pfn))
#define page_to_pfn(page)	((page) - vmemmap)

In order for this to work we will have to use a virtual mapping. These are usually for free since kernel memory is already mapped via a 1-1 mapping requiring a page table. The virtual mapping must be big enough to span all of memory that an arch can support which may make a virtual memmap difficult to use on 32 bit platforms that support 36 address bits.

However, if there is enough virtual space available and the arch already maps its 1-1 kernel space using TLBs (f.e. true of IA64 and x86_64) then this technique makes sparsemem lookups even more efficient than CONFIG_FLATMEM. FLATMEM still needs to read the contents of mem_map. vmemmap is a constant for a virtual memory map.

Maybe this patch will allow us to make SPARSEMEM the default configuration that will work on UP, SMP and NUMA on most platforms?
Then we may hopefully be able to remove the various forms of support for FLATMEM, DISCONTIG etc etc. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> Index: linux-2.6.21-rc5-mm2/include/asm-generic/memory_model.h === --- linux-2.6.21-rc5-mm2.orig/include/asm-generic/memory_model.h 2007-04-02 15:13:20.0 -0700 +++ linux-2.6.21-rc5-mm2/include/asm-generic/memory_model.h 2007-04-02 17:15:45.0 -0700 @@ -46,6 +46,14 @@ __pgdat->node_start_pfn; \ }) +#elif defined(CONFIG_SPARSE_VIRTUAL) + +/* + * We have a virtual memmap that makes lookups very simple + */ +#define __pfn_to_page(pfn) (vmemmap + (pfn)) +#define __page_to_pfn(page)((page) - vmemmap) + #elif defined(CONFIG_SPARSEMEM) /* * Note: section's mem_map is encorded to reflect its start_pfn. Index: linux-2.6.21-rc5-mm2/mm/sparse.c === --- linux-2.6.21-rc5-mm2.orig/mm/sparse.c 2007-04-02 15:58:23.0 -0700 +++ linux-2.6.21-rc5-mm2/mm/sparse.c2007-04-02 17:19:13.0 -0700 @@ -9,6 +9,8 @@ #include #include #include +#include +#include /* * Permanent SPARSEMEM data: @@ -101,7 +103,7 @@ static inline int sparse_index_init(unsi /* * Although written for the SPARSEMEM_EXTREME case, this happens - * to also work for the flat array case becase + * to also work for the flat array case because * NR_SECTION_ROOTS==NR_MEM_SECTIONS. */ int __section_nr(struct mem_section* ms) @@ -211,6 +213,214 @@ static int sparse_init_one_section(struc return 1; } +#ifdef CONFIG_SPARSE_VIRTUAL +/* + * Virtual Memory Map support + * + * (C) 2007 sgi. Christoph Lameter <[EMAIL PROTECTED]>. + * + * Virtual memory maps allow VM primitives pfn_to_page, page_to_pfn, + * virt_to_page, page_address() etc that involve no memory accesses at all. + * + * However, virtual mappings need a page table and TLBs. Many Linux + * architectures already map their physical space using 1-1 mappings + * via TLBs. For those arches the virtual memmory map is essentially + * for free if we use the same page size as the 1-1 mappings. 
In that + * case the overhead consists of a few additional pages that are + * allocated to create a view of memory for vmemmap. + * + * Special Kconfig settings: + * + * CONFIG_ARCH_POPULATES_VIRTUAL_MEMMAP + * + * The architecture has its own functions to populate the memory + * map and provides a vmemmap_populate function. + * + * CONFIG_ARCH_SUPPORTS_PMD_MAPPING + * + * If not set then PAGE_SIZE mappings are generated which + * require one PTE/TLB per PAGE_SIZE chunk of the virtual memory map. + * + * If set then PMD_SIZE mappings are generated which are much + * lighter on the TLB. On some platforms these generate + * the same overhead as the 1-1 mappings. + */ + +/* + * Allocate a block of memory to be used for the virtual memory map + * or the page tables that are used to create the
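As a rough illustration of why the virtual memmap lookups need no table walk, here is a userspace model of the two macros from the description. It is a sketch under stated assumptions: struct page_sketch, the backing array and the _sketch names are all hypothetical, and a static array stands in for the virtually contiguous memmap that the kernel would place at a fixed virtual address.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_PAGES 16

/* Hypothetical stand-in for struct page; the contents are irrelevant here. */
struct page_sketch {
	unsigned long flags;
	void *priv;
};

/* Stands in for the virtually contiguous array of page structs. */
static struct page_sketch backing[SKETCH_NR_PAGES];

/* Constant base pointer, playing the role of vmemmap. */
static struct page_sketch *const vmemmap_sketch = backing;

/* pfn_to_page: plain pointer addition, no section or node lookup. */
struct page_sketch *pfn_to_page_sketch(unsigned long pfn)
{
	assert(pfn < SKETCH_NR_PAGES);
	return vmemmap_sketch + pfn;
}

/* page_to_pfn: plain pointer subtraction. */
unsigned long page_to_pfn_sketch(const struct page_sketch *page)
{
	return (unsigned long)(page - vmemmap_sketch);
}
```

Both directions are a single add or subtract against a constant base, which is the property the patch description relies on when comparing against FLATMEM's read of the mem_map variable.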
Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)
On Mon, Apr 02, 2007 at 05:23:20PM -0700, Christoph Lameter wrote: > On Mon, 2 Apr 2007, Siddha, Suresh B wrote: > > > Set the node_possible_map at runtime. On a non NUMA system, > > num_possible_nodes() will now say '1' > > How does this relate to nr_node_ids? With this patch, nr_node_ids on non NUMA will also be '1' and as before nr_node_ids is same as num_possible_nodes() thanks, suresh
Re: [PATCH] vt: Do not clear UTF when resetting console
On Apr 3 2007 08:16, Antonino A. Daplas wrote: > >That would be the cleanest and purest behavior. But it's possible to set >one console to UTF-8 and another to legacy mode. The question would be: why would you want to have mixed consoles? Switching to UTF8 IMO does not take away any characters, and I mean no-framebuffer 80x25 that is limited to 256 glyphs. Jan --
Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)
On Mon, 2 Apr 2007, Siddha, Suresh B wrote: > Set the node_possible_map at runtime. On a non NUMA system, > num_possible_nodes() will now say '1' How does this relate to nr_node_ids?
Re: [PATCH] vt: Do not clear UTF when resetting console
On Mon, 2007-04-02 at 10:35 -0700, H. Peter Anvin wrote: > Antonino A. Daplas wrote: > > Resetting the console, either by ANSI escape sequences or by the reset > > utility, > > will drop the console back to legacy (non-UTF-8) mode. Fix this by leaving > > the > > field vc_data.vc_utf untouched in reset_terminal(). In addition, a global > > variable (default_utf8) which defines system-wide UTF-8 setting is created. > > This variable can be adjusted via sysfs. > > If you're going to introduce a system-wide default, instead of issuing > the appropriate escape code, then I would argue it should still be > forced (to the default) when issuing a console reset. > That would be the cleanest and purest behavior. But it's possible to set one console to UTF-8 and another to legacy mode. So one can corrupt the user's console just by issuing a reset or echo -e '\033c'. (Although one can argue that users who know what UTF-8 is also knows how to set the encoding back) Until userspace is more capable of setting back the terminal to its previous configuration, I would tend to agree with Jan, that we should leave the current utf setting of that particular vc alone. Tony - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] vt: Do not clear UTF when resetting console
On Mon, 2007-04-02 at 21:10 +0200, Jan Engelhardt wrote: > On Apr 2 2007 22:13, Antonino A. Daplas wrote: > >Resetting the console, either by ANSI escape sequences or by the reset > >utility, > >will drop the console back to legacy (non-UTF-8) mode. Fix this by leaving > >the > >field vc_data.vc_utf untouched in reset_terminal(). In addition, a global > >variable (default_utf8) which defines system-wide UTF-8 setting is created. > >This variable can be adjusted via sysfs. > > > >This is based from patches by Jan Engelhardt and Paul LeoNerd Evans. > > > >Signed-off-by: Antonino Daplas <[EMAIL PROTECTED]> > > Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]> > > > >--- > > > > drivers/char/vt.c |4 +++- > > 1 files changed, 3 insertions(+), 1 deletions(-) > > > BTW. Is it feasible to make utf8 the default (static int default_utf8 = 1) > or is that likely to break some installs? I guess it would, but most if not all major distribs are moving/have moved to utf-8. Tony - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Poor UDP performance using 2.6.21-rc5-rt5
> -----Original Message-----
> From: Ingo Molnar [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 02, 2007 3:05 PM
> To: [EMAIL PROTECTED]
> Cc: Dave Sperry; linux-rt-users@vger.kernel.org; linux-
> [EMAIL PROTECTED]
> Subject: Re: Poor UDP performance using 2.6.21-rc5-rt5
>
> * [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> > The Intel NIC seems to behave better under RT
>
> yeah.
>
> > I think there is some kind of bad behavior happening in the Nvidia
> > driver with respect to softirq-net-tx and IRQ-8406.
>
> yes. Part of the problem is that the forcedeth.c driver does not fully
> support NAPI - today i've implemented those bits (see them below), based
> on your testcase. The other part is that the Intel NIC uses MSI, while
> forcedeth uses fasteoi, correct? [you can see this in /proc/interrupts]

In my case forcedeth seems to be picking up MSI for eth2 & eth3

]$ cat /proc/interrupts
           CPU0       CPU1
   0:       110          0   IO-APIC-edge      timer
   1:         0         10   IO-APIC-edge      i8042
   8:         0          0   IO-APIC-edge      rtc
   9:         0          0   IO-APIC-fasteoi   acpi
  12:         0        124   IO-APIC-edge      i8042
  20:         0          0   IO-APIC-fasteoi   libata
  21:         4       7755   IO-APIC-fasteoi   libata
  22:         0       5570   IO-APIC-fasteoi   ehci_hcd:usb2
  23:         0          1   IO-APIC-fasteoi   ohci_hcd:usb1, libata
8406:         7      15969   PCI-MSI-edge      eth3
8407:         8      17249   PCI-MSI-edge      eth2
8408:         0        131   PCI-MSI-edge      eth1
8409:         0         85   PCI-MSI-edge      eth0
 NMI:         0          0
 LOC:    201594     202389
 ERR:         0

Could this be part of my problem?

The lspci for the device is:
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
        Subsystem: Super Micro Computer Inc Unknown device 1611
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR-

> there are a few other things i'm working on to improve this. I've
> uploaded -rt9 which is the current state of affairs. Note that using
> -rt9 you'll likely only see IRQ-8406 overhead in the system, because
> i've added an optimization to process the softirq-net-tx workload in
> the hardirq thread if the priority of the two is the same (which is the
> default behavior). But -rt9 is still work in progress that is not fully
> finished yet: in some cases i'm seeing 'fluctuating performance'
> problems on forcedeth that weren't there before.

I tried -rt9 and saw some odd 'fluctuating performance'. I'll try it
again tomorrow when I am much closer to the box's power button.

Thanks again,
Dave

>
> 	Ingo
>
> ----------------->
> From: Ingo Molnar <[EMAIL PROTECTED]>
> Subject: [patch] forcedeth.c: improve NAPI handler
>
> another forcedeth.c thing: i noticed that its NAPI handler does not do
> tx-ring processing. The patch below implements this - tested on
> DESC_VER_2 hardware, with CONFIG_FORCEDETH_NAPI=y.
>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
>
> Index: linux/drivers/net/forcedeth.c
> ===================================================================
> --- linux.orig/drivers/net/forcedeth.c
> +++ linux/drivers/net/forcedeth.c
> @@ -3118,9 +3118,17 @@ static int nv_napi_poll(struct net_devic
>  	int retcode;
>
>  	if (np->desc_ver == DESC_VER_1 || np->desc_ver == DESC_VER_2) {
> +		spin_lock_irqsave(&np->lock, flags);
> +		nv_tx_done(dev);
> +		spin_unlock_irqrestore(&np->lock, flags);
> +
>  		pkts = nv_rx_process(dev, limit);
>  		retcode = nv_alloc_rx(dev);
>  	} else {
> +		spin_lock_irqsave(&np->lock, flags);
> +		nv_tx_done_optimized(dev, np->tx_ring_size);
> +		spin_unlock_irqrestore(&np->lock, flags);
> +
>  		pkts = nv_rx_process_optimized(dev, limit);
>  		retcode = nv_alloc_rx_optimized(dev);
>  	}
Re: [PATCH 0/2] Use a single loader for i386 and x86_64
On Mon, 2007-04-02 at 16:43 -0300, Glauber de Oliveira Costa wrote:
> This patch moves lguest.c one level below, and enhances it with the
> ability to kick off 64 binaries. It would be much easier to just ifdef
> functions, but I have x86_64 machines loading 32-bit kernels as a longer
> goal, and that's why the patch features the load_elf_header() function.

Hi Glauber!

I've been writing documentation, and in the process completely
reorganised and cleaned up this file. I have also been working on
getting rid of the gratuitous u32's for addresses and trying to cleanse
myself of 32-bit thinking!

I've now pushed the changes to the repo. They should simplify this
patch...

Thanks!
Rusty.
Re: [PATCH 3/3] cpuid: switch to cpuid_on_cpu()
Alexey Dobriyan wrote:
> Now that cpuid_on_cpu() is in core, cpuid driver can be shrunk.
>
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

Hi Alexey,

This, and your other changes in this area, conflict with the work that
I've been doing on extending the usability of the CPUID and MSR drivers
(which is part of why this work has dragged out seemingly forever.)

I would really appreciate it if we could work together on this; there
need to be new paravirtualization entry points for this. Consequently,
I just updated and uploaded a git tree with the current status. It still
needs porting to x86-64, however.

The current cpuid/msr work is at:

http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-cpuidmsr.git;a=summary

	-hpa
Re: [PATCH 5/5] Fix race between cat /proc/slab_allocators and rmmod
On Tue, 03 Apr 2007 09:09:55 +1000 Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Mon, 2007-04-02 at 19:03 +0400, Alexey Dobriyan wrote:
> > Same story as with cat /proc/*/wchan race vs rmmod race, only
> > /proc/slab_allocators want more info than just symbol name.
> >
> > Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
>
> All these look excellent. I hope Andrew picked them up?
>

I wasn't cc'ed on them, but I'll push them into the hole marked "In" anyway.
RE: Warning: unable to open an initial console.
I checked /dev/ with U-boot and it shows the existence of /dev/console.

From U-boot prompt:
$ ls /dev
crw---0 Mon Apr 02 17:52:27 2007 console
crw-r--r--0 Mon Apr 02 17:52:27 2007 null
crw-r--r--0 Mon Apr 02 17:52:27 2007 zero

Also, I added a printk in the jffs2_add_fd_to_list() routine in
fs/jffs2/nodelist.c to print out the dirent adds and it shows console
being added as follows:

...
add dirent "var", ino #14
add dirent "usr", ino #13
add dirent "tmp", ino #12
add dirent "sys", ino #11
add dirent "sbin", ino #10
add dirent "proc", ino #9
add dirent "mnt", ino #8
add dirent "linuxrc", ino #7
add dirent "lib", ino #6
add dirent "home", ino #5
add dirent "etc", ino #4
add dirent "dev", ino #3
add dirent "bin", ino #2
VFS: Mounted root (jffs2 filesystem).
Freeing init memory: 76K
add dirent "zero", ino #70
add dirent "null", ino #69
add dirent "console", ino #68
Warning: unable to open an initial console.
add dirent "watchdog", ino #262
add dirent "udevstart", ino #261
add dirent "udevsend", ino #260
...

Any other ideas?

Thanks,
Tom

-----Original Message-----
From: Chris Wedgwood [mailto:[EMAIL PROTECTED]
Sent: Monday, April 02, 2007 3:49 PM
To: Tom Strader
Cc: linux-kernel@vger.kernel.org
Subject: Re: Warning: unable to open an initial console.

On Mon, Apr 02, 2007 at 12:04:56PM -0700, Tom Strader wrote:

> I have seen quite a few posts regarding unable to open an initial
> console, but my system seems to have the necessary things in place
> so I come looking for help.

your rootfs/initramfs/initrd is missing a valid working /dev/console

> VFS: Mounted root (jffs2 filesystem).

check /dev/ on this filesystem
Re: usb hid: reset NumLock
On Mon, 2 Apr 2007 16:48:24 +0200 (CEST), Jiri Kosina <[EMAIL PROTECTED]> wrote:
> On Sun, 1 Apr 2007, Pete Zaitcev wrote:

> could you please change the order of the two functions, so that you
> don't have to put the forward declaration here?
>[...]
> I'd say this is a little bit overcommented.
>[...]
> So as soon as you have the VIDs and PIDs of the hardware which
> requires this, could you please update the patch and send it to me again?

How about this?

diff --git a/drivers/usb/input/hid-core.c b/drivers/usb/input/hid-core.c
index 827a75a..23b1e70 100644
--- a/drivers/usb/input/hid-core.c
+++ b/drivers/usb/input/hid-core.c
@@ -545,6 +545,45 @@ void usbhid_init_reports(struct hid_device *hid)
 		warn("timeout initializing reports");
 }
 
+/*
+ * Reset LEDs which BIOS might have left on. For now, just NumLock (0x01).
+ */
+
+static int hid_find_field_early(struct hid_device *hid, unsigned int page,
+    unsigned int hid_code, struct hid_field **pfield)
+{
+	struct hid_report *report;
+	struct hid_field *field;
+	struct hid_usage *usage;
+	int i, j;
+
+	list_for_each_entry(report, &hid->report_enum[HID_OUTPUT_REPORT].report_list, list) {
+		for (i = 0; i < report->maxfield; i++) {
+			field = report->field[i];
+			for (j = 0; j < field->maxusage; j++) {
+				usage = &field->usage[j];
+				if ((usage->hid & HID_USAGE_PAGE) == page &&
+				    (usage->hid & 0xffff) == hid_code) {
+					*pfield = field;
+					return j;
+				}
+			}
+		}
+	}
+	return -1;
+}
+
+static void usbhid_set_leds(struct hid_device *hid)
+{
+	struct hid_field *field;
+	int offset;
+
+	if ((offset = hid_find_field_early(hid, HID_UP_LED, 0x01, &field)) != -1) {
+		hid_set_field(field, offset, 0);
+		usbhid_submit_report(hid, field->report, USB_DIR_OUT);
+	}
+}
+
 #define USB_VENDOR_ID_GTCO		0x078c
 #define USB_DEVICE_ID_GTCO_90		0x0090
 #define USB_DEVICE_ID_GTCO_100		0x0100
@@ -765,6 +804,9 @@ void usbhid_init_reports(struct hid_device *hid)
 #define USB_VENDOR_ID_SONY			0x054c
 #define USB_DEVICE_ID_SONY_PS3_CONTROLLER	0x0268
 
+#define USB_VENDOR_ID_DELL		0x413c
+#define USB_DEVICE_ID_DELL_W7658	0x2005
+
 /*
  * Alphabetically sorted blacklist by quirk type.
  */
@@ -947,6 +989,8 @@ static const struct hid_blacklist {
 
 	{ USB_VENDOR_ID_CIDC, 0x0103, HID_QUIRK_IGNORE },
 
+	{ USB_VENDOR_ID_DELL, USB_DEVICE_ID_DELL_W7658, HID_QUIRK_RESET_LEDS },
+
 	{ 0, 0 }
 };
 
@@ -1334,6 +1378,8 @@ static int hid_probe(struct usb_interface *intf, const struct usb_device_id *id)
 	usbhid_init_reports(hid);
 	hid_dump_device(hid);
 
+	if (hid->quirks & HID_QUIRK_RESET_LEDS)
+		usbhid_set_leds(hid);
 	if (!hidinput_connect(hid))
 		hid->claimed |= HID_CLAIMED_INPUT;
diff --git a/include/linux/hid.h b/include/linux/hid.h
index 8c97d4d..3e8dcb0 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -269,6 +269,7 @@ struct hid_item {
 #define HID_QUIRK_SONY_PS3_CONTROLLER		0x0008
 #define HID_QUIRK_LOGITECH_S510_DESCRIPTOR	0x0010
 #define HID_QUIRK_DUPLICATE_USAGES		0x0020
+#define HID_QUIRK_RESET_LEDS			0x0040
 
 /*
  * This is the global environment of the parser. This information is

I wasn't sure where to place the function, so I just put it above
its user, to signify that it's special-casing.

Also, it's unclear where to put the quirk entry and defines.
There's a comment saying that they are sorted alphabetically by quirk,
but apparently the order was violated with more recent additions.

-- Pete
Re: [PATCH 5/5] Fix race between cat /proc/slab_allocators and rmmod
On Mon, 2007-04-02 at 19:03 +0400, Alexey Dobriyan wrote:
> Same story as with cat /proc/*/wchan race vs rmmod race, only
> /proc/slab_allocators want more info than just symbol name.
>
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

All these look excellent. I hope Andrew picked them up?

Thanks,
Rusty.
Re: [PATCH 2/5] Fix race between rmmod and cat /proc/kallsyms
On Mon, 2007-04-02 at 19:01 +0400, Alexey Dobriyan wrote:
> +static inline int module_get_kallsym(unsigned int symnum, unsigned long *value,
> +					char *type, char *name,
> +					char *module_name, int *exported)
>  {
> -	return NULL;
> +	return -ERANGE;
>  }

This would normally be -ENOSYS, but since the return value is used as a
binary anyway, it doesn't matter.

Thanks,
Rusty.
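The point about the return value being "used as a binary" can be illustrated with a small user-space sketch. It assumes a caller that only tests for a negative result; the reduced signature and the names module_get_kallsym_stub() and count_module_symbols() are hypothetical, not taken from the patch.

```c
#include <errno.h>

/*
 * Stand-in for the stub used when module support is compiled out:
 * there are no module symbols, so every index is out of range.
 * (Hypothetical, reduced signature for illustration only.)
 */
static inline int module_get_kallsym_stub(unsigned int symnum)
{
	(void)symnum;
	return -ERANGE;
}

/*
 * Hypothetical caller: it stops at the first negative return and never
 * inspects which errno it got, so -ERANGE and -ENOSYS behave the same.
 */
static unsigned int count_module_symbols(int (*get)(unsigned int))
{
	unsigned int n = 0;

	while (get(n) >= 0)
		n++;
	return n;
}
```

With the stub plugged in, the loop ends immediately and reports zero module symbols, whichever negative errno the stub returns.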
Re: 2.6.21-rc5 possible regression: KDE processes die silently (was: 2.6.21-rc3-mm2: KDE processes die while system is idle)
On Sun, Apr 01, 2007 at 06:48:03PM +0200, Rafael J. Wysocki wrote:
> On Sunday, 1 April 2007 17:21, Tilman Schmidt wrote:
> > I'm sorry to say this has now happened with kernel 2.6.21-rc5, too.
> > I started a kernel compilation in the evening and came back in the
> > morning to find all KDE decorations gone. All processes normally
> > running for a KDE session and labelled "[kinit]" in ps were gone
> > but everything else was running fine, and the system was still
> > usable via ssh. /var/log/kdm.log and /var/log/Xorg.0.log contained
> > nothing remotely suspicious. /var/log/messages had two lines I
> > never saw before:
> >
> > Mar 31 02:27:36 gx110 kernel: [153577.891443] ReiserFS: hda3: warning: vs-8115: get_num_ver: not directory or indirect item
> > Mar 31 02:27:36 gx110 kernel: [153577.891559] ReiserFS: hda3: warning: vs-8115: get_num_ver: not directory or indirect item
> >
> > But those didn't appear on previous occurrences of the "dying KDE"
> > problem so I guess they are not related.
> >
> > This is SUSE LINUX 10.0 (i586) running on a Dell OptiPlex GX110
> > (Intel P3, 933 MHz, i810 chipset, 512 MB RAM, 60 GB ATA disk)
> > % uname -a
> > Linux gx110 2.6.21-rc5-noinitrd #1 PREEMPT Sat Mar 31 02:15:19 CEST 2007 i686 i686 i386 GNU/Linux
> > % cat /proc/cmdline
> > root=/dev/hda3 selinux=0 x11i=vesa video=intelfb:[EMAIL PROTECTED] nmi_watchdog=2 lapic 5
> > Kernel configuration mostly-modular, based on standard SuSE kernel's
> > /proc/config.gz, just compiling into the kernel everything I need to
> > boot without an initrd and omitting some parts I'm not interested in.
> > (.config attached.) What else might be relevant?
> >
> > Again, this is a Heisenbug, ie. it's not reproducible and invariably
> > happens when I'm away from the machine. (Probably Murphy at work.)
> > It's pretty rare: I have seen it four times on 2.6.21-rc3-mm2 and
> > once on 2.6.21-rc5, on a machine which spends about equal amounts
> > of time running the latest stable, rc, and mm kernels. OTOH, so far
> > it hasn't ever happened with any 2.6.20 or earlier kernel. Nor have
> > I seen it with 2.6.21-rc[1-4] or 2.6.21-rc4-mm* - but for the -rc4
> > and -rc4-mm releases that's not conclusive as those have only been
> > running for a very short time.
>
> I have a similar problem on x86_64 OpenSUSE 10.2, but it seems to happen
> when a sound (eg. notification) is played while the display is suspended
> (or "powered off").

Is it easily reproducible and still present with the latest -git?
If yes, can you bisect?

> IMO it's a SUSE bug.

We also have a report of KDE crashes on Debian [1].
And just a few days ago a kernel bug kwin ran into was fixed [2].

If the pattern is "works with 2.6.20 but does not work with 2.6.21-rc",
then it's most likely a kernel regression.

> Greetings,
> Rafael

cu
Adrian

[1] http://bugzilla.kernel.org/show_bug.cgi?id=8157
[2] commit 14e9d5730adfca26452b3a2838a80af6950556f5

--

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL
On Monday 02 April 2007 09:37, Christoph Lameter wrote:
> On Sun, 1 Apr 2007, Andi Kleen wrote:
> > Hmm, this means there is at least 2MB worth of struct page on every node?
> > Or do you have overlaps with other memory (I think you have)
> > In that case you have to handle the overlap in change_page_attr()
>
> Correct. 2MB worth of struct page is 128 MB of memory. Are there nodes
> with smaller amounts of memory?

Do you deal with max_addr= and mem=? RHEL4 (2.6.9) blows up if
max_addr= happens to leave you with CPU-only nodes. So hopefully you
can deal with arbitrary-sized nodes caused by max_addr= or mem=.
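The "2MB of struct page is 128 MB of memory" figure can be checked with a short calculation. The sketch below assumes 4 KiB pages and a 64-byte struct page; neither value is stated in the thread, but they are the values under which the quoted ratio holds. The function name memory_covered() is hypothetical.

```c
#include <stdint.h>

/* Assumed values, not stated in the thread. */
#define ASSUMED_PAGE_SIZE        4096ULL	/* bytes per page */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL		/* bytes per struct page */

/*
 * Hypothetical helper: bytes of RAM described by memmap_bytes worth of
 * struct page entries, one entry per page.
 */
static uint64_t memory_covered(uint64_t memmap_bytes)
{
	uint64_t entries = memmap_bytes / ASSUMED_STRUCT_PAGE_SIZE;

	return entries * ASSUMED_PAGE_SIZE;
}
```

Under these assumptions one 2 MB block of memmap holds 32768 entries, which describe 32768 * 4096 bytes = 128 MB, so a node with less memory than that would not fill a single 2 MB mapping of its own.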
RE: [PATCH 3/4] [SCSI]stex: fix reset recovery for console device
> -----Original Message-----
> From: James Bottomley [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 02, 2007 11:28 AM
> To: Ed Lin
> Cc: linux-scsi; linux-kernel; jeff; Promise_Linux
> Subject: RE: [PATCH 3/4] [SCSI]stex: fix reset recovery for console device
>
> On Mon, 2007-04-02 at 11:14 -0700, Ed Lin wrote:
> > I just saw the routine name scsi_eh_try_stu, and didn't notice the
> > allow_restart (partly because I thought it was not harmful...).
> > But the TEST_UNIT_READY must stay.
>
> Sure ... I was just checking since your change log implied you'd seen
> the problem from the error handler ... however, we can add it ...
> there's a possibility of getting spin up on init from sd anyway.
>

You make the decision. But after reconsideration, I think it's better
to remove unused code. It also needs change since the patch about id
mapping is modified in another mail. How about the attachment here?

Attachment: s3
Re: [SLUB 2/2] i386 arch page size slab fixes
On Sat, Mar 31, 2007 at 11:31:07AM -0800, Christoph Lameter wrote:
> Patch by William Irwin with only very minor modifications by me which are
> 1. Removal of HIGHMEM64G slab caches. It seems that virtualization hosts
>    require a full pgd page.

The HIGHMEM64G slab allocations are meaningfully performant vs.
page-sized allocations where virtualization is absent. I would
personally rather whip Xen into shape enough to be able to handle the
minimal pgd allocations than retain the oversized pgd allocations even
in only the Xen case. Also, the entire unshared kernel pmd shenanigan
in Xen is an artifact of its recursive pagetable affair, which can also
be done away with a SMOP.

On Sat, Mar 31, 2007 at 11:31:07AM -0800, Christoph Lameter wrote:
> 2. Add missing virtualization hook. Seems that we need a new way
>    of serializing paravirt_alloc(). It may need to do its own serialization.
> 3. Remove ARCH_USES_SLAB_PAGE_STRUCT

This doesn't quite cover all bases. The changes to pageattr.c and
fault.c are dubious and need verification at the very least. They were
largely slapped together to get the files past the compiler for the
performance comparisons that were never properly done.

-- wli
Re: [SLUB 2/2] i386 arch page size slab fixes
On Mon, 2 Apr 2007, William Lee Irwin III wrote:

> This doesn't quite cover all bases. The changes to pageattr.c and
> fault.c are dubious and need verification at the very least. They were
> largely slapped together to get the files past the compiler for the
> performance comparisons that were never properly done.

I looked through them but then I am no i386 specialist though. Looked
fine.
Re: [PATCH 1/5] Simplify module_get_kallsym() by dropping length arg
On Mon, 2007-04-02 at 19:01 +0400, Alexey Dobriyan wrote:
> -
> [PATCH 1/5] Simplify module_get_kallsym() by dropping length arg
>
> module_get_kallsym() could in theory truncate module symbol name to fit
> in buffer, but nobody does this. Always use KSYM_NAME_LEN + 1 bytes for name.
>
> Suggested by lg^WRusty.
>
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

Acked-by: Rusty Russell <[EMAIL PROTECTED]>

Cheers,
Rusty.
RE: [PATCH 1/4] [SCSI]stex: fix id mapping issue
> -----Original Message-----
> From: James Bottomley [mailto:[EMAIL PROTECTED]
> Sent: Saturday, March 31, 2007 7:22 AM
> To: Ed Lin
> Cc: linux-scsi; linux-kernel; jeff; Promise_Linux
> Subject: Re: [PATCH 1/4] [SCSI]stex: fix id mapping issue
>
> On Fri, 2007-03-30 at 15:21 -0700, Ed Lin wrote:
> > The internal id/lun mapping of st_vsc and st_vsc1 controllers is different
> > from st_shasta. The original driver code can only map first 16 'entities'
> > for st_vsc and st_vsc1 while there are actually 128 available.
> >
> > Also the ST_MAX_LUN_PER_TARGET should be 8, although this can do
> > no harm because inquiries beyond boundary are discarded by firmware.
> >
> > The correct internal mapping should be:
> > id:0~15, lun:0~7 (st_shasta)
> > id:0, lun:0~127 (st_yosemite)
> > id:0~127, lun:0 (st_vsc and st_vsc1)
> > To scsi mid layer they are all channel:0~7, id:0~15, lun:0, with a maximum
> > 'entity' number of 128. The RAID console only interfaces to scsi mid layer
> > and is always mapped at channel:0, id:16, lun:0.
>
> I'm with Christoph here ... if we're going to break the backwards
> compatibility of the mappings (which your code does) then we could just
> dump channel and use the SCSI id and lun directly.
>
> Understanding this code is predicated on this quirky definition in
> stex_queuecommand:
>
> 	id = cmd->device->id;
> 	lun = cmd->device->channel; /* firmware lun issue work around */
> 	      ^^^
>
> > @@ -645,12 +645,16 @@ stex_queuecommand(struct scsi_cmnd *cmd,
> >
> >  	req = stex_alloc_req(hba);
> >
> > -	if (hba->cardtype == st_yosemite) {
> > -		req->lun = lun * (ST_MAX_TARGET_NUM - 1) + id;
>
> This looks to be correct, it goes up id 0 to ST_MAX_TARGET_NUM -1 then
> takes the next channel.
>
> > -		req->target = 0;
> > -	} else {
> > +	if (hba->cardtype == st_shasta) {
> >  		req->lun = lun;
> >  		req->target = id;
> > +	} else if (hba->cardtype == st_yosemite){
> > +		req->lun = id * ST_MAX_LUN_PER_TARGET + lun;
> > +		req->target = 0;
> > +	} else {
> > +		/* st_vsc and st_vsc1 */
> > +		req->lun = 0;
> > +		req->target = id * ST_MAX_LUN_PER_TARGET + lun;
>
> These both look to be wrong. You're taking the channel as the lowest
> common denominator, so your first target is on channel 1 id 0, your next
> on channel 2, id 0 and so on. That's really going to mess with the
> ordering (which will be user visible) is that really what you want?
>

How about the attached one?

Attachment: s1
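The ordering concern raised in the review can be seen by modelling the quoted hunk in user space. This is a sketch of the proposed (and criticised) mapping only, not of what the driver finally does. It assumes ST_MAX_LUN_PER_TARGET is 8 as the changelog says, and that the mid-layer channel is passed in as lun, per the work-around quoted above. The enum ordering, struct fw_addr and stex_map() are hypothetical names.

```c
/* Card types named in the thread; the enum ordering is an assumption. */
enum card_type { st_shasta, st_vsc, st_vsc1, st_yosemite };

#define ST_MAX_LUN_PER_TARGET 8	/* value given in the changelog */

/* Hypothetical container for the firmware-side address. */
struct fw_addr {
	unsigned int target;
	unsigned int lun;
};

/*
 * Mirrors the branches of the quoted hunk. "lun" here is the mid-layer
 * channel (0..7) and "id" is the mid-layer id (0..15).
 */
static struct fw_addr stex_map(enum card_type type, unsigned int id,
			       unsigned int lun)
{
	struct fw_addr a;

	if (type == st_shasta) {
		a.lun = lun;
		a.target = id;
	} else if (type == st_yosemite) {
		a.lun = id * ST_MAX_LUN_PER_TARGET + lun;
		a.target = 0;
	} else {
		/* st_vsc and st_vsc1 */
		a.lun = 0;
		a.target = id * ST_MAX_LUN_PER_TARGET + lun;
	}
	return a;
}
```

With this formula, firmware target 1 on an st_vsc card corresponds to channel 1, id 0 rather than channel 0, id 1: consecutive firmware entities step through channels first, which is the user-visible ordering change the reviewer points out.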
Re: [patch 6/13] signal/timer/event fds v10 - timerfd core ...
On Mon, 2007-04-02 at 15:46 -0700, Davide Libenzi wrote: > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include Can you bring them into alphabetic order and check if the whole bunch is really required ? Otherwise it looks good ! tglx - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 2/13] signal/timer/event fds v10 - signalfd core ...
ChangeLog:

v10 - Renamed from "aino" to "anon_inode"

--

This patch series implements the new signalfd() system call.
I took part of the original Linus code (and you know how
badly it can be broken :), and I added even more breakage ;)
Signals are fetched from the same signal queue used by the process,
so signalfd will compete with standard kernel delivery in dequeue_signal().
If you want to reliably fetch signals on the signalfd file, you need to
block them with sigprocmask(SIG_BLOCK).
This seems to be working fine on my Dual Opteron machine. I made a quick
test program for it:

http://www.xmailserver.org/signafd-test.c

The signalfd() system call implements signal delivery into a file
descriptor receiver. The signalfd file descriptor is created with the
following API:

int signalfd(int ufd, const sigset_t *mask, size_t masksize);

The "ufd" parameter allows changing an existing signalfd sigmask, w/out
going through a close/create cycle (Linus idea). Use "ufd" == -1 if you
want a brand new signalfd file.
The "mask" specifies the signal mask of signals that we are interested
in. The "masksize" parameter is the size of "mask".
The signalfd fd supports the poll(2) and read(2) system calls. The poll(2)
will return POLLIN when signals are available to be dequeued. As a direct
consequence of supporting the Linux poll subsystem, the signalfd fd can be
used together with epoll(2) too.
The read(2) system call will return a "struct signalfd_siginfo" structure
in the userspace supplied buffer. The return value is the number of bytes
copied in the supplied buffer, or -1 in case of error. The read(2) call
can also return 0, in case the sighand structure to which the signalfd
was attached, has been orphaned. The O_NONBLOCK flag is also supported, and
read(2) will return -EAGAIN in case no signal is available.
If the size of the buffer passed to read(2) is lower than
sizeof(struct signalfd_siginfo), -EINVAL is returned.
A read from the signalfd can also return -ERESTARTSYS in case a signal
hits the process.
The format of the struct signalfd_siginfo is as follows, and the valid
fields depend on the (->code & __SI_MASK) value, in the same way a
struct siginfo would:

struct signalfd_siginfo {
	__u32 signo;	/* si_signo */
	__s32 err;	/* si_errno */
	__s32 code;	/* si_code */
	__u32 pid;	/* si_pid */
	__u32 uid;	/* si_uid */
	__s32 fd;	/* si_fd */
	__u32 tid;	/* si_tid */
	__u32 band;	/* si_band */
	__u32 overrun;	/* si_overrun */
	__u32 trapno;	/* si_trapno */
	__s32 status;	/* si_status */
	__s32 svint;	/* si_int */
	__u64 svptr;	/* si_ptr */
	__u64 utime;	/* si_utime */
	__u64 stime;	/* si_stime */
	__u64 addr;	/* si_addr */
};

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/fs/signalfd.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.21-rc5.fds/fs/signalfd.c	2007-04-02 15:06:29.000000000 -0700
@@ -0,0 +1,353 @@
+/*
+ *  fs/signalfd.c
+ *
+ *  Copyright (C) 2003  Linus Torvalds
+ *
+ *  Mon Mar 5, 2007: Davide Libenzi
+ *      Changed ->read() to return a siginfo structure instead of signal number.
+ *      Fixed locking in ->poll().
+ *      Added sighand-detach notification.
+ *      Added fd re-use in sys_signalfd() syscall.
+ *      Now using anonymous inode source.
+ *      Thanks to Oleg Nesterov for useful code review and suggestions.
+ *      More comments and suggestions from Arnd Bergmann.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+struct signalfd_ctx {
+	struct list_head lnk;
+	wait_queue_head_t wqh;
+	sigset_t sigmask;
+	struct task_struct *tsk;
+};
+
+struct signalfd_lockctx {
+	struct task_struct *tsk;
+	unsigned long flags;
+};
+
+/*
+ * Tries to acquire the sighand lock. We do not increment the sighand
+ * use count, and we do not even pin the task struct, so we need to
+ * do it inside an RCU read lock, and we must be prepared for the
+ * ctx->tsk going to NULL (in signalfd_deliver()), and for the sighand
+ * being detached. We return 0 if the sighand has been detached, or
+ * 1 if we were able to pin the sighand lock.
+ */
+static int signalfd_lock(struct signalfd_ctx *ctx, struct signalfd_lockctx *lk)
+{
+	struct sighand_struct *sighand = NULL;
+
+	rcu_read_lock();
+	lk->tsk = rcu_dereference(ctx->tsk);
+	if (likely(lk->tsk != NULL))
+		sighand = lock_task_sighand(lk->tsk, &lk->flags);
+	rcu_read_unlock();
+
+	if (sighand && !ctx->tsk) {
+		unlock_task_sighand(lk->tsk, &lk->flags);
+		sighand = NULL;
+	}
+
+	return sighand != NULL;
+}
+
+static void signalfd_unlock(struct signalfd_lockctx *lk)
+{
Re: non-NUMA cache_free_alien() (was Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of virt_to_slab(objp); nodeid = slabp->nodeid;)
On Fri, Mar 23, 2007 at 03:12:10PM +0100, Andi Kleen wrote: > > But that is based on compile time option, isn't it? Perhaps I need > > to use some other mechanism to find out the platform is not NUMA capable.. > > We can probably make it runtime on x86. That will be needed sooner or > later for correct NUMA hotplug support anyways. How about this patch? Thanks. --- From: Suresh Siddha <[EMAIL PROTECTED]> [patch] x86_64: set node_possible_map at runtime. Set the node_possible_map at runtime. On a non NUMA system, num_possible_nodes() will now say '1' Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]> --- diff --git a/arch/x86_64/mm/k8topology.c b/arch/x86_64/mm/k8topology.c index b5b8dba..d6f4447 100644 --- a/arch/x86_64/mm/k8topology.c +++ b/arch/x86_64/mm/k8topology.c @@ -49,11 +49,8 @@ int __init k8_scan_nodes(unsigned long start, unsigned long end) int found = 0; u32 reg; unsigned numnodes; - nodemask_t nodes_parsed; unsigned dualcore = 0; - nodes_clear(nodes_parsed); - if (!early_pci_allowed()) return -1; @@ -102,7 +99,7 @@ int __init k8_scan_nodes(unsigned long start, unsigned long end) nodeid, (base>>8)&3, (limit>>8) & 3); return -1; } - if (node_isset(nodeid, nodes_parsed)) { + if (node_isset(nodeid, node_possible_map)) { printk(KERN_INFO "Node %d already present. 
Skipping\n", nodeid); continue; @@ -155,7 +152,7 @@ int __init k8_scan_nodes(unsigned long start, unsigned long end) prevbase = base; - node_set(nodeid, nodes_parsed); + node_set(nodeid, node_possible_map); } if (!found) diff --git a/arch/x86_64/mm/numa.c b/arch/x86_64/mm/numa.c index 41b8fb0..5f7d4d8 100644 --- a/arch/x86_64/mm/numa.c +++ b/arch/x86_64/mm/numa.c @@ -383,6 +383,7 @@ static int __init numa_emulation(unsigned long start_pfn, unsigned long end_pfn) i, nodes[i].start, nodes[i].end, (nodes[i].end - nodes[i].start) >> 20); + node_set(i, node_possible_map); node_set_online(i); } memnode_shift = compute_hash_shift(nodes, numa_fake); @@ -405,6 +406,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn) { int i; + nodes_clear(node_possible_map); + #ifdef CONFIG_NUMA_EMU if (numa_fake && !numa_emulation(start_pfn, end_pfn)) return; @@ -432,6 +435,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn) memnodemap[0] = 0; nodes_clear(node_online_map); node_set_online(0); + node_set(0, node_possible_map); for (i = 0; i < NR_CPUS; i++) numa_set_node(i, 0); node_to_cpumask[0] = cpumask_of_cpu(0); diff --git a/arch/x86_64/mm/srat.c b/arch/x86_64/mm/srat.c index 2efe215..9f26e2b 100644 --- a/arch/x86_64/mm/srat.c +++ b/arch/x86_64/mm/srat.c @@ -25,7 +25,6 @@ int acpi_numa __initdata; static struct acpi_table_slit *acpi_slit; -static nodemask_t nodes_parsed __initdata; static struct bootnode nodes[MAX_NUMNODES] __initdata; static struct bootnode nodes_add[MAX_NUMNODES]; static int found_add_area __initdata; @@ -43,7 +42,7 @@ static __init int setup_node(int pxm) static __init int conflicting_nodes(unsigned long start, unsigned long end) { int i; - for_each_node_mask(i, nodes_parsed) { + for_each_node_mask(i, node_possible_map) { struct bootnode *nd = &nodes[i]; if (nd->start == nd->end) continue; @@ -321,7 +320,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma) } nd = &nodes[node]; oldnode = *nd; - if 
(!node_test_and_set(node, nodes_parsed)) { + if (!node_test_and_set(node, node_possible_map)) { nd->start = start; nd->end = end; } else { @@ -344,7 +343,7 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma) printk(KERN_NOTICE "SRAT: Hotplug region ignored\n"); *nd = oldnode; if ((nd->start | nd->end) == 0) - node_clear(node, nodes_parsed); + node_clear(node, node_possible_map); } } @@ -356,7 +355,7 @@ static int nodes_cover_memory(void) unsigned long pxmram, e820ram; pxmram = 0; - for_each_node_mask(i, nodes_parsed) { + for_each_node_mask(i, node_possible_map) { unsigned long s = nodes[i].start >> PAGE_SHIFT; unsigned long e = nodes[i].end >> PAGE_SHIFT; pxmram += e - s; @@ -380,7 +379,7 @@ static int nodes_cover_memory(void) static void unparse_node(int node) { int
Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL
Christoph Lameter wrote: On Mon, 2 Apr 2007, Martin Bligh wrote: For 64GB you'd need 256M which would be a quarter of low mem. Probably takes up too much of low mem. Yup. We could move whatever you currently use to handle that into i386 arch code. Or are there other platforms that do similar tricks with highmem? We already have special hooks for node lookups in sparsemem. Move all of that off into some arch dir? Well, all I did was basically an early vmalloc kind of thing. You only need to allocate enough virtual space for how much memory you actually *have*, not the full set. The problem on i386 is that you just need to reserve that space early, in order to shuffle everything else into fit. It's messy, but not hard. M.
[PATCH 7/9] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use
Add an interface to the AF_RXRPC module so that the AFS filesystem module can
more easily make use of the services available.  AFS still opens a socket but
then uses the action functions in lieu of sendmsg() and registers an intercept
function to grab messages before they're queued on the socket Rx queue.

This permits AFS (or whatever) to:

 (1) Avoid the overhead of using the recvmsg() call.

 (2) Use different keys directly on individual client calls on one socket
     rather than having to open a whole slew of sockets, one for each key it
     might want to use.

 (3) Avoid calling request_key() at the point of issue of a call or opening of
     a socket.  This is done instead by AFS at the point of open(), unlink() or
     other VFS operation and the key handed through.

 (4) Request the use of something other than GFP_KERNEL to allocate memory.

Furthermore:

 (*) The socket buffer markings used by RxRPC are made available for AFS so
     that it can interpret the cooked RxRPC messages itself.

 (*) rxgen (un)marshalling abort codes are made available.


The following documentation for the kernel interface is added to
Documentation/networking/rxrpc.txt:

=========================
AF_RXRPC KERNEL INTERFACE
=========================

The AF_RXRPC module also provides an interface for use by in-kernel utilities
such as the AFS filesystem.  This permits such a utility to:

 (1) Use different keys directly on individual client calls on one socket
     rather than having to open a whole slew of sockets, one for each key it
     might want to use.

 (2) Avoid having RxRPC call request_key() at the point of issue of a call or
     opening of a socket.  Instead the utility is responsible for requesting a
     key at the appropriate point.  AFS, for instance, would do this during VFS
     operations such as open() or unlink().  The key is then handed through
     when the call is initiated.

 (3) Request the use of something other than GFP_KERNEL to allocate memory.

 (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
     intercepted before they get put into the socket Rx queue and the socket
     buffers manipulated directly.

To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
bind an address as appropriate and listen if it's to be a server socket, but
then it passes this to the kernel interface functions.

The kernel interface functions are as follows:

 (*) Begin a new client call.

	struct rxrpc_call *
	rxrpc_kernel_begin_call(struct socket *sock,
				struct sockaddr_rxrpc *srx,
				struct key *key,
				unsigned long user_call_ID,
				gfp_t gfp);

     This allocates the infrastructure to make a new RxRPC call and assigns
     call and connection numbers.  The call will be made on the UDP port that
     the socket is bound to.  The call will go to the destination address of a
     connected client socket unless an alternative is supplied (srx is
     non-NULL).

     If a key is supplied then this will be used to secure the call instead of
     the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
     secured in this way will still share connections if at all possible.

     The user_call_ID is equivalent to that supplied to sendmsg() in the
     control data buffer.  It is entirely feasible to use this to point to a
     kernel data structure.

     If this function is successful, an opaque reference to the RxRPC call is
     returned.  The caller now holds a reference on this and it must be
     properly ended.

 (*) End a client call.

	void rxrpc_kernel_end_call(struct rxrpc_call *call);

     This is used to end a previously begun call.  The user_call_ID is expunged
     from AF_RXRPC's knowledge and will not be seen again in association with
     the specified call.

 (*) Send data through a call.

	int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
				   size_t len);

     This is used to supply either the request part of a client call or the
     reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
     data buffers to be used.  msg_iov may not be NULL and must point
     exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
     MSG_MORE if there will be subsequent data sends for this call.

     The msg must not specify a destination address, control data or any flags
     other than MSG_MORE.  len is the total amount of data to transmit.

 (*) Abort a call.

	void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);

     This is used to abort a call if it's still in an abortable state.  The
     abort code specified will be placed in the ABORT message sent.

 (*) Intercept received RxRPC messages.

	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
[patch 11/13] signal/timer/event fds v10 - eventfd wire up i386 arch ...
This patch wires the eventfd system call to the i386 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S
===
--- linux-2.6.21-rc5.fds.orig/arch/i386/kernel/syscall_table.S 2007-04-02 15:06:39.0 -0700
+++ linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S 2007-04-02 15:06:44.0 -0700
@@ -321,3 +321,4 @@
 	.long sys_epoll_pwait
 	.long sys_signalfd		/* 320 */
 	.long sys_timerfd
+	.long sys_eventfd
Index: linux-2.6.21-rc5.fds/include/asm-i386/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-i386/unistd.h 2007-04-02 15:06:39.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-i386/unistd.h 2007-04-02 15:06:44.0 -0700
@@ -327,10 +327,11 @@
 #define __NR_epoll_pwait	319
 #define __NR_signalfd		320
 #define __NR_timerfd		321
+#define __NR_eventfd		322

 #ifdef __KERNEL__

-#define NR_syscalls 322
+#define NR_syscalls 323

 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
Re: 2.6.21-rc5 possible regression: KDE processes die silently (was: 2.6.21-rc3-mm2: KDE processes die while system is idle)
On Sun, Apr 01, 2007 at 05:21:06PM +0200, Tilman Schmidt wrote:
> I'm sorry to say this has now happened with kernel 2.6.21-rc5, too.
> I started a kernel compilation in the evening and came back in the
> morning to find all KDE decorations gone. All processes normally
> running for a KDE session and labelled "[kinit]" in ps were gone
> but everything else was running fine, and the system was still
> usable via ssh. /var/log/kdm.log and /var/log/Xorg.0.log contained
> nothing remotely suspicious. /var/log/messages had two lines I
> never saw before:
>
> Mar 31 02:27:36 gx110 kernel: [153577.891443] ReiserFS: hda3: warning:
> vs-8115: get_num_ver: not directory or indirect item
> Mar 31 02:27:36 gx110 kernel: [153577.891559] ReiserFS: hda3: warning:
> vs-8115: get_num_ver: not directory or indirect item

Reiserfs people Cc'ed for this.

> But those didn't appear on previous occurrences of the "dying KDE"
> problem so I guess they are not related.
>
> This is SUSE LINUX 10.0 (i586) running on a Dell OptiPlex GX110
> (Intel P3, 933 MHz, i810 chipset, 512 MB RAM, 60 GB ATA disk)
> % uname -a
> Linux gx110 2.6.21-rc5-noinitrd #1 PREEMPT Sat Mar 31 02:15:19 CEST 2007 i686
> i686 i386 GNU/Linux
> % cat /proc/cmdline
> root=/dev/hda3 selinux=0 x11i=vesa video=intelfb:[EMAIL PROTECTED]
> nmi_watchdog=2 lapic 5
> Kernel configuration mostly-modular, based on standard SuSE kernel's
> /proc/config.gz, just compiling into the kernel everything I need to
> boot without an initrd and omitting some parts I'm not interested in.
> (.config attached.) What else might be relevant?
>
> Again, this is a Heisenbug, ie. it's not reproducible and invariably
> happens when I'm away from the machine. (Probably Murphy at work.)
> It's pretty rare: I have seen it four times on 2.6.21-rc3-mm2 and
> once on 2.6.21-rc5, on a machine which spends about equal amounts
> of time running the latest stable, rc, and mm kernels. OTOH, so far
> it hasn't ever happened with any 2.6.20 or earlier kernel. Nor have
> I seen it with 2.6.21-rc[1-4] or 2.6.21-rc4-mm* - but for the -rc4
> and -rc4-mm releases that's not conclusive as those have only been
> running for a very short time.

We also have another report of crashes under KDE:

Subject    : crashes in KDE
References : http://bugzilla.kernel.org/show_bug.cgi?id=8157
Submitter  : Oliver Pinter <[EMAIL PROTECTED]>
Status     : unknown

We also have one bug kwin ran into that got fixed after -rc5:

Subject    : kwin dies silently
References : http://lkml.org/lkml/2007/2/28/112
Submitter  : Sid Boyce <[EMAIL PROTECTED]>
             Boris Mogwitz <[EMAIL PROTECTED]>
             Michael Wu <[EMAIL PROTECTED]>
Caused-By  : Eric W. Biederman <[EMAIL PROTECTED]>
             commit 0475ac0845f9295bc5f69af45f58dff2c104c8d1
Fixed-By   : Eric W. Biederman <[EMAIL PROTECTED]>
Commit     : 14e9d5730adfca26452b3a2838a80af6950556f5
Status     : fixed in -rc6

These might or might not be related issues.

> HTH
> T.

cu
Adrian

--

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
[patch 9/13] signal/timer/event fds v10 - timerfd compat code ...
This patch implements the necessary compat code for the timerfd system call.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/fs/compat.c
===
--- linux-2.6.21-rc5.fds.orig/fs/compat.c 2007-04-02 15:06:36.0 -0700
+++ linux-2.6.21-rc5.fds/fs/compat.c 2007-04-02 15:06:41.0 -0700
@@ -2361,3 +2361,26 @@

 #endif /* CONFIG_SIGNALFD */

+#ifdef CONFIG_TIMERFD
+
+asmlinkage long compat_sys_timerfd(int ufd, int clockid, int flags,
+				   const struct compat_itimerspec __user *utmr)
+{
+	long res;
+	struct itimerspec t;
+	struct itimerspec __user *ut;
+
+	res = -EFAULT;
+	if (get_compat_itimerspec(&t, utmr))
+		goto err_exit;
+	ut = compat_alloc_user_space(sizeof(*ut));
+	if (copy_to_user(ut, &t, sizeof(t)) )
+		goto err_exit;
+
+	res = sys_timerfd(ufd, clockid, flags, ut);
+err_exit:
+	return res;
+}
+
+#endif /* CONFIG_TIMERFD */
+
Index: linux-2.6.21-rc5.fds/include/linux/compat.h
===
--- linux-2.6.21-rc5.fds.orig/include/linux/compat.h 2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/include/linux/compat.h 2007-04-02 15:06:41.0 -0700
@@ -225,6 +225,11 @@
 	return lhs->tv_nsec - rhs->tv_nsec;
 }

+extern int get_compat_itimerspec(struct itimerspec *dst,
+				 const struct compat_itimerspec __user *src);
+extern int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+				 const struct itimerspec *src);
+
 asmlinkage long compat_sys_adjtimex(struct compat_timex __user *utp);

 extern int compat_printk(const char *fmt, ...);
Index: linux-2.6.21-rc5.fds/kernel/compat.c
===
--- linux-2.6.21-rc5.fds.orig/kernel/compat.c 2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/kernel/compat.c 2007-04-02 15:06:41.0 -0700
@@ -475,8 +475,8 @@
 	return min_length;
 }

-static int get_compat_itimerspec(struct itimerspec *dst,
-				 struct compat_itimerspec __user *src)
+int get_compat_itimerspec(struct itimerspec *dst,
+			  const struct compat_itimerspec __user *src)
 {
 	if (get_compat_timespec(&dst->it_interval, &src->it_interval) ||
 	    get_compat_timespec(&dst->it_value, &src->it_value))
@@ -484,8 +484,8 @@
 	return 0;
 }

-static int put_compat_itimerspec(struct compat_itimerspec __user *dst,
-				 struct itimerspec *src)
+int put_compat_itimerspec(struct compat_itimerspec __user *dst,
+			  const struct itimerspec *src)
 {
 	if (put_compat_timespec(&src->it_interval, &dst->it_interval) ||
 	    put_compat_timespec(&src->it_value, &dst->it_value))
Re: Warning: unable to open an initial console.
On Mon, Apr 02, 2007 at 12:04:56PM -0700, Tom Strader wrote:
> I have seen quite a few posts regarding unable to open an initial
> console, but my system seems to have the necessary things in place
> so I come looking for help.

your rootfs/initramfs/initrd is missing a valid working /dev/console

> VFS: Mounted root (jffs2 filesystem).

check /dev/ on this filesystem
[patch 10/13] signal/timer/event fds v10 - eventfd core ...
ChangeLog:

v10 - Renamed from "aino" to "anon_inode"

--

This is a very simple and light file descriptor that can be used as an
event wait/dispatch mechanism by userspace (both wait and dispatch) and by
the kernel (dispatch only). It can be used instead of pipe(2) in all cases
where a pipe would simply be used to signal events. Its kernel overhead is
much lower than that of a pipe, and it does not consume two fds. When used
in the kernel, it can offer an fd-bridge to enable, for example, facilities
like KAIO or syslets/threadlets to signal to an fd the completion of certain
operations. More generally, an eventfd can be used by the kernel to signal
readiness, in a POSIX poll/select way, of interfaces that would otherwise be
incompatible with it. The API is:

int eventfd(unsigned int count);

The eventfd API accepts an initial "count" parameter, and returns an
eventfd fd. It supports poll(2) (POLLIN, POLLOUT, POLLERR), read(2) and
write(2).
The POLLIN flag is raised when the internal counter is greater than zero.
The POLLOUT flag is raised when at least a value of "1" can be written to
the internal counter.
The POLLERR flag is raised when an overflow in the counter value is detected.
The write(2) operation can never overflow the counter, since it blocks
(unless O_NONBLOCK is set, in which case -EAGAIN is returned).
But the eventfd_signal() function can do it, since it's supposed to not
sleep during its operation.
The read(2) function reads the __u64 counter value, and resets the internal
value to zero. If the value read is equal to (__u64) -1, an overflow happened
on the internal counter (due to 2^64 eventfd_signal() posts that have never
been retired; unlikely, but possible).
The write(2) call writes a __u64 count value, and adds it to the current
counter. The eventfd fd supports O_NONBLOCK also.
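The counter rules described above can be modelled in a few lines of userspace C. This is only a sketch and not the patch's code: the names evt_model, evt_write, evt_read, evt_signal and evt_readable are made up for illustration, and there is no locking, blocking or wait queue. It assumes that a write which would push the counter up to the overflow marker ((__u64) -1) is refused with -EAGAIN, as in the O_NONBLOCK case, and that the kernel-side post saturates at ULLONG_MAX.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical userspace model of the eventfd counter; not kernel code. */
struct evt_model {
	unsigned long long count;
};

/*
 * Models write(2) with O_NONBLOCK: add "val" unless the result would reach
 * the overflow marker (ULLONG_MAX), in which case refuse with -EAGAIN.
 */
static int evt_write(struct evt_model *m, unsigned long long val)
{
	if (ULLONG_MAX - m->count <= val)
		return -EAGAIN;
	m->count += val;
	return 0;
}

/* Models read(2): hand back the counter and reset it to zero. */
static unsigned long long evt_read(struct evt_model *m)
{
	unsigned long long val = m->count;

	m->count = 0;
	return val;
}

/*
 * Models the kernel-side post: it never refuses, saturates at ULLONG_MAX
 * and returns how much was actually added (less than "n" on overflow).
 */
static unsigned int evt_signal(struct evt_model *m, unsigned int n)
{
	if (ULLONG_MAX - m->count < n)
		n = (unsigned int)(ULLONG_MAX - m->count);
	m->count += n;
	return n;
}

/* POLLIN is raised while the counter is greater than zero. */
static int evt_readable(const struct evt_model *m)
{
	return m->count > 0;
}
```

In this model two writes of 3 and 4 followed by one read yield 7 and leave the counter at zero, while a post that hits the ceiling leaves ULLONG_MAX behind for the next read to report as overflow.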
On the kernel side, we have:

struct file *eventfd_fget(int fd);
int eventfd_signal(struct file *file, unsigned int n);

The eventfd_fget() should be called to get a struct file* from an eventfd
fd (this is an fget() + check of f_op being an eventfd fops pointer).
The kernel can then call eventfd_signal() every time it wants to post
an event to userspace. The eventfd_signal() function can be called from any
context.
A simple eventfd() test and benchmark is available here:

http://www.xmailserver.org/eventfd-bench.c

This is the eventfd-based version of pipetest-4 (pipe(2) based):

http://www.xmailserver.org/pipetest-4.c

Not that performance matters much in the eventfd case, but eventfd-bench
shows almost double the performance of pipetest-4.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/include/linux/syscalls.h
===
--- linux-2.6.21-rc5.fds.orig/include/linux/syscalls.h 2007-04-02 15:06:37.0 -0700
+++ linux-2.6.21-rc5.fds/include/linux/syscalls.h 2007-04-02 15:06:43.0 -0700
@@ -605,6 +605,7 @@
 asmlinkage long sys_signalfd(int ufd, sigset_t __user *user_mask, size_t sizemask);
 asmlinkage long sys_timerfd(int ufd, int clockid, int flags,
 			    const struct itimerspec __user *utmr);
+asmlinkage long sys_eventfd(unsigned int count);

 int kernel_execve(const char *filename, char *const argv[], char *const envp[]);

Index: linux-2.6.21-rc5.fds/fs/eventfd.c
===
--- /dev/null 1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5.fds/fs/eventfd.c 2007-04-02 15:06:43.0 -0700
@@ -0,0 +1,233 @@
+/*
+ *  fs/eventfd.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi
+ *
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+struct eventfd_ctx {
+	spinlock_t lock;
+	wait_queue_head_t wqh;
+	/*
+	 * Every time that a write(2) is performed on an eventfd, the
+	 * value of the __u64 being written is added to "count" and a
+	 * wakeup is performed on "wqh". A read(2) will return the "count"
+	 * value to userspace, and will reset "count" to zero. The kernel
+	 * size eventfd_signal() also, adds to the "count" counter and
+	 * issue a wakeup.
+	 */
+	__u64 count;
+};
+
+/*
+ * Adds "n" to the eventfd counter "count". Returns "n" in case of
+ * success, or a value lower then "n" in case of coutner overflow.
+ * This function is supposed to be called by the kernel in paths
+ * that do not allow sleeping. In this function we allow the counter
+ * to reach the ULLONG_MAX value, and we signal this as overflow
+ * condition by returining a POLLERR to poll(2).
+ */
+int eventfd_signal(struct file *file, int n)
+{
+	struct eventfd_ctx *ctx = file->private_data;
+	unsigned long flags;
+
+	if (n < 0)
+		return -EINVAL;
+	spin_lock_irqsave(&ctx->lock, flags);
+	if (ULLONG_MAX - ctx->count < n)
+		n = (int)
Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL
On Mon, 2 Apr 2007, Martin Bligh wrote:

> > For 64GB you'd need 256M which would be a quarter of low mem. Probably takes
> > up too much of low mem.
>
> Yup.

We could move whatever you currently use to handle that into i386 arch
code. Or are there other platforms that do similar tricks with highmem?

We already have special hooks for node lookups in sparsemem. Move all of
that off into some arch dir?
[patch 5/13] signal/timer/event fds v10 - signalfd compat code ...
This patch implements the necessary compat code for the signalfd system call.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/fs/compat.c
===
--- linux-2.6.21-rc5.fds.orig/fs/compat.c 2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/fs/compat.c 2007-04-02 15:06:36.0 -0700
@@ -46,6 +46,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -2335,3 +2336,28 @@
 #endif /* TIF_RESTORE_SIGMASK */

 #endif /* CONFIG_EPOLL */
+
+#ifdef CONFIG_SIGNALFD
+
+asmlinkage long compat_sys_signalfd(int ufd,
+				    const compat_sigset_t __user *sigmask,
+				    compat_size_t sigsetsize)
+{
+	compat_sigset_t ss32;
+	sigset_t tmp;
+	sigset_t __user *ksigmask;
+
+	if (sigsetsize != sizeof(compat_sigset_t))
+		return -EINVAL;
+	if (copy_from_user(&ss32, sigmask, sizeof(ss32)))
+		return -EFAULT;
+	sigset_from_compat(&tmp, &ss32);
+	ksigmask = compat_alloc_user_space(sizeof(sigset_t));
+	if (copy_to_user(ksigmask, &tmp, sizeof(sigset_t)))
+		return -EFAULT;
+
+	return sys_signalfd(ufd, ksigmask, sizeof(sigset_t));
+}
+
+#endif /* CONFIG_SIGNALFD */
+
[patch 3/13] signal/timer/event fds v10 - signalfd wire up i386 arch ...
This patch wires the signalfd system call to the i386 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S
===
--- linux-2.6.21-rc5.fds.orig/arch/i386/kernel/syscall_table.S 2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S 2007-04-02 15:06:33.0 -0700
@@ -319,3 +319,4 @@
 	.long sys_move_pages
 	.long sys_getcpu
 	.long sys_epoll_pwait
+	.long sys_signalfd		/* 320 */
Index: linux-2.6.21-rc5.fds/include/asm-i386/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-i386/unistd.h 2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-i386/unistd.h 2007-04-02 15:06:33.0 -0700
@@ -325,10 +325,11 @@
 #define __NR_move_pages		317
 #define __NR_getcpu		318
 #define __NR_epoll_pwait	319
+#define __NR_signalfd		320

 #ifdef __KERNEL__

-#define NR_syscalls 320
+#define NR_syscalls 321

 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
[patch 6/13] signal/timer/event fds v10 - timerfd core ...
ChangeLog:

v10 - Renamed from "aino" to "anon_inode"
    - Prevented DoS by re-arming the timer on read (Thomas Gleixner)

--

This patch introduces a new system call for timer events delivered
through file descriptors. This allows timer events to be used with
standard POSIX poll(2), select(2) and read(2). As a consequence of
supporting the Linux f_op->poll subsystem, they can be used with
epoll(2) too.
The system call is defined as:

int timerfd(int ufd, int clockid, int flags, const struct itimerspec *utmr);

The "ufd" parameter allows for re-use (re-programming) of an existing
timerfd without going through the close/open cycle (same as signalfd).
If "ufd" is -1, a new file descriptor will be created, otherwise the
existing "ufd" will be re-programmed.
The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME.
The time specified in the "utmr->it_value" parameter is the expiry
time for the timer.
If the TFD_TIMER_ABSTIME flag is set in "flags", this is an absolute
time, otherwise it's a relative time.
If the time specified in "utmr->it_interval" is not zero (zero meaning
.tv_sec == 0 and .tv_nsec == 0), this is the period at which the following
ticks should be generated.
The "utmr->it_interval" should be set to zero if only one tick is requested.
Setting the "utmr->it_value" to zero will disable the timer, or will create
a timerfd without the timer enabled.
The function returns the new (or same, in case "ufd" is a valid timerfd
descriptor) file descriptor, or -1 in case of error.
As stated before, the timerfd file descriptor supports poll(2), select(2)
and epoll(2). When a timer event happens on the timerfd, a POLLIN mask will
be returned.
The read(2) call can be used, and it will return a u32 variable holding
the number of "ticks" that happened on the interface since the last call
to read(2). The read(2) call supports the O_NONBLOCK flag too, and EAGAIN
will be returned if no ticks happened.
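As a rough illustration of the "utmr" argument described above, the helpers below fill a struct itimerspec for a one-shot or a periodic timer. make_timer_spec(), timer_spec_is_periodic() and the millisecond parameters are made up for this sketch; the timerfd() call itself is left out because it needs a kernel carrying this patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Hypothetical helper: build the "utmr" argument from millisecond values.
 *   first_ms  -> it_value    (expiry; 0 leaves the timer disabled)
 *   period_ms -> it_interval (0 requests a single tick)
 */
static struct itimerspec make_timer_spec(long first_ms, long period_ms)
{
	struct itimerspec its;

	its.it_value.tv_sec = first_ms / 1000;
	its.it_value.tv_nsec = (first_ms % 1000) * 1000000L;
	its.it_interval.tv_sec = period_ms / 1000;
	its.it_interval.tv_nsec = (period_ms % 1000) * 1000000L;
	return its;
}

/* A timer is periodic when it_interval is not zero. */
static int timer_spec_is_periodic(const struct itimerspec *its)
{
	return its->it_interval.tv_sec != 0 || its->it_interval.tv_nsec != 0;
}
```

With TFD_TIMER_ABSTIME unset, make_timer_spec(1500, 0) would describe a single tick 1.5 seconds from now, and make_timer_spec(250, 2000) a first tick after 250 ms followed by one every two seconds.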
A quick test program shows timerfd working correctly on my amd64 box:

http://www.xmailserver.org/timerfd-test.c

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/fs/timerfd.c
===
--- /dev/null 1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5.fds/fs/timerfd.c 2007-04-02 15:06:37.0 -0700
@@ -0,0 +1,233 @@
+/*
+ *  fs/timerfd.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi
+ *
+ *
+ *  Thanks to Thomas Gleixner for code reviews and useful comments.
+ *
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+struct timerfd_ctx {
+	struct hrtimer tmr;
+	ktime_t tintv;
+	spinlock_t lock;
+	wait_queue_head_t wqh;
+	int expired;
+};
+
+/*
+ * This gets called when the timer event triggers. We set the "expired"
+ * flag, but we do not re-arm the timer (in case it's necessary,
+ * tintv.tv64 != 0) until the timer is read.
+ */
+static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr)
+{
+	struct timerfd_ctx *ctx = container_of(htmr, struct timerfd_ctx, tmr);
+	unsigned long flags;
+
+	spin_lock_irqsave(&ctx->lock, flags);
+	ctx->expired = 1;
+	wake_up_locked(&ctx->wqh);
+	spin_unlock_irqrestore(&ctx->lock, flags);
+
+	return HRTIMER_NORESTART;
+}
+
+static void timerfd_setup(struct timerfd_ctx *ctx, int clockid, int flags,
+			  const struct itimerspec *ktmr)
+{
+	enum hrtimer_mode htmode;
+	ktime_t texp;
+
+	htmode = (flags & TFD_TIMER_ABSTIME) ?
+		HRTIMER_MODE_ABS: HRTIMER_MODE_REL;
+
+	texp = timespec_to_ktime(ktmr->it_value);
+	ctx->expired = 0;
+	ctx->tintv = timespec_to_ktime(ktmr->it_interval);
+	hrtimer_init(&ctx->tmr, clockid, htmode);
+	ctx->tmr.expires = texp;
+	ctx->tmr.function = timerfd_tmrproc;
+	if (texp.tv64 != 0)
+		hrtimer_start(&ctx->tmr, texp, htmode);
+}
+
+static int timerfd_release(struct inode *inode, struct file *file)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+
+	hrtimer_cancel(&ctx->tmr);
+	kfree(ctx);
+	return 0;
+}
+
+static unsigned int timerfd_poll(struct file *file, poll_table *wait)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	unsigned int events = 0;
+	unsigned long flags;
+
+	poll_wait(file, &ctx->wqh, wait);
+
+	spin_lock_irqsave(&ctx->lock, flags);
+	if (ctx->expired)
+		events |= POLLIN;
+	spin_unlock_irqrestore(&ctx->lock, flags);
+
+	return events;
+}
+
+static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count,
+			    loff_t *ppos)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	ssize_t res;
+	u32 ticks = 0;
+	DECLARE_WAITQUEUE(wait, current);
+
+	if (count < sizeof(ticks))
+		return -EINVAL;
+
[PATCH] SLAB: Mention slab name when listing corrupt objects
Mention the slab name when listing corrupt objects.  Although the function
that released the memory is mentioned, that is frequently ambiguous as such
functions often release several pieces of memory.

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 mm/slab.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 57f7aa4..4cbac24 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1802,8 +1802,8 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp)
 			/* Print header */
 			if (lines == 0) {
 				printk(KERN_ERR
-					"Slab corruption: start=%p, len=%d\n",
-					realobj, size);
+					"Slab corruption: %s start=%p, len=%d\n",
+					cachep->name, realobj, size);
 				print_objinfo(cachep, objp, 0);
 			}
 			/* Hexdump the affected line */
[patch 8/13] signal/timer/event fds v10 - timerfd wire up x86_64 arch ...
This patch wires the timerfd system call to the x86_64 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.21-rc5.fds.orig/arch/x86_64/ia32/ia32entry.S 2007-04-02 15:06:34.0 -0700
+++ linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S 2007-04-02 15:06:40.0 -0700
@@ -720,4 +720,5 @@
 	.quad sys_getcpu
 	.quad sys_epoll_pwait
 	.quad sys_signalfd		/* 320 */
+	.quad sys_timerfd
 ia32_syscall_end:
Index: linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-x86_64/unistd.h 2007-04-02 15:06:34.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h 2007-04-02 15:06:40.0 -0700
@@ -621,8 +621,10 @@
 __SYSCALL(__NR_move_pages, sys_move_pages)
 #define __NR_signalfd		280
 __SYSCALL(__NR_signalfd, sys_signalfd)
+#define __NR_timerfd		281
+__SYSCALL(__NR_timerfd, sys_timerfd)

-#define __NR_syscall_max __NR_signalfd
+#define __NR_syscall_max __NR_timerfd

 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
[patch 1/13] signal/timer/event fds v10 - anonymous inode source ...
ChangeLog:

v10 - Renamed from "aino" to "anon_inode"

--

This patch adds an anonymous inode source, to be used for files that need
an inode only in order to create a file*. We do not care about having an
inode for each file, and we do not even care about having different names in
the associated dentries (dentry names will be the same for classes of file*).
This allows code reuse, and will be used by epoll, signalfd and timerfd
(and whatever else there'll be).

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/fs/anon_inodes.c
===
--- /dev/null 1970-01-01 00:00:00.0 +
+++ linux-2.6.21-rc5.fds/fs/anon_inodes.c 2007-04-01 16:04:32.0 -0700
@@ -0,0 +1,200 @@
+/*
+ *  fs/anon_inodes.c
+ *
+ *  Copyright (C) 2007  Davide Libenzi
+ *
+ *  Thanks to Arnd Bergmann for code review and suggestions.
+ *  More changes for Thomas Gleixner suggestions.
+ *
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+static struct vfsmount *anon_inode_mnt __read_mostly;
+static struct inode *anon_inode_inode;
+static const struct file_operations anon_inode_fops;
+
+static int anon_inodefs_get_sb(struct file_system_type *fs_type, int flags,
+			       const char *dev_name, void *data,
+			       struct vfsmount *mnt)
+{
+	return get_sb_pseudo(fs_type, "anon_inode:", NULL, ANON_INODE_FS_MAGIC,
+			     mnt);
+}
+
+static int anon_inodefs_delete_dentry(struct dentry *dentry)
+{
+	/*
+	 * We faked vfs to believe the dentry was hashed when we created it.
+	 * Now we restore the flag so that dput() will work correctly.
+	 */
+	dentry->d_flags |= DCACHE_UNHASHED;
+	return 1;
+}
+
+static struct file_system_type anon_inode_fs_type = {
+	.name		= "anon_inodefs",
+	.get_sb		= anon_inodefs_get_sb,
+	.kill_sb	= kill_anon_super,
+};
+static struct dentry_operations anon_inodefs_dentry_operations = {
+	.d_delete	= anon_inodefs_delete_dentry,
+};
+
+/**
+ * anon_inode_getfd - creates a new file instance by hooking it up to and
+ *                    anonymous inode, and a dentry that describe the "class"
+ *                    of the file
+ *
+ * @pfd:     [out]   pointer to the file descriptor
+ * @dpinode: [out]   pointer to the inode
+ * @pfile:   [out]   pointer to the file struct
+ * @name:    [in]    name of the "class" of the new file
+ * @fops     [in]    file operations for the new file
+ * @priv     [in]    private data for the new file (will be file's private_data)
+ *
+ * Creates a new file by hooking it on a single inode. This is useful for files
+ * that do not need to have a full-fledged inode in order to operate correctly.
+ * All the files created with anon_inode_getfd() will share a single inode, by
+ * hence saving memory and avoiding code duplication for the file/inode/dentry
+ * setup.
+ */
+int anon_inode_getfd(int *pfd, struct inode **pinode, struct file **pfile,
+		     const char *name, const struct file_operations *fops,
+		     void *priv)
+{
+	struct qstr this;
+	struct dentry *dentry;
+	struct inode *inode;
+	struct file *file;
+	int error, fd;
+
+	if (IS_ERR(anon_inode_inode))
+		return -ENODEV;
+	file = get_empty_filp();
+	if (!file)
+		return -ENFILE;
+
+	inode = igrab(anon_inode_inode);
+	if (IS_ERR(inode)) {
+		error = PTR_ERR(inode);
+		goto err_put_filp;
+	}
+
+	error = get_unused_fd();
+	if (error < 0)
+		goto err_iput;
+	fd = error;
+
+	/*
+	 * Link the inode to a directory entry by creating a unique name
+	 * using the inode sequence number.
+	 */
+	error = -ENOMEM;
+	this.name = name;
+	this.len = strlen(name);
+	this.hash = 0;
+	dentry = d_alloc(anon_inode_mnt->mnt_sb->s_root, &this);
+	if (!dentry)
+		goto err_put_unused_fd;
+	dentry->d_op = &anon_inodefs_dentry_operations;
+	/* Do not publish this dentry inside the global dentry hash table */
+	dentry->d_flags &= ~DCACHE_UNHASHED;
+	d_instantiate(dentry, inode);
+
+	file->f_path.mnt = mntget(anon_inode_mnt);
+	file->f_path.dentry = dentry;
+	file->f_mapping = inode->i_mapping;
+
+	file->f_pos = 0;
+	file->f_flags = O_RDWR;
+	file->f_op = fops;
+	file->f_mode = FMODE_READ | FMODE_WRITE;
+	file->f_version = 0;
+	file->private_data = priv;
+
+	fd_install(fd, file);
+
+	*pfd = fd;
+	*pinode = inode;
+	*pfile = file;
+	return 0;
+
+err_put_unused_fd:
+	put_unused_fd(fd);
+err_iput:
+	iput(inode);
+err_put_filp:
+	put_filp(file);
+	return error;
+}
+
[patch 4/13] signal/timer/event fds v10 - signalfd wire up x86_64 arch ...
This patch wires the signalfd system call to the x86_64 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-x86_64/unistd.h 2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h 2007-04-02 15:06:34.0 -0700
@@ -619,8 +619,10 @@
 __SYSCALL(__NR_vmsplice, sys_vmsplice)
 #define __NR_move_pages		279
 __SYSCALL(__NR_move_pages, sys_move_pages)
+#define __NR_signalfd		280
+__SYSCALL(__NR_signalfd, sys_signalfd)

-#define __NR_syscall_max __NR_move_pages
+#define __NR_syscall_max __NR_signalfd

 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
Index: linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.21-rc5.fds.orig/arch/x86_64/ia32/ia32entry.S 2007-04-02 15:06:12.0 -0700
+++ linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S 2007-04-02 15:06:34.0 -0700
@@ -714,9 +714,10 @@
 	.quad compat_sys_get_robust_list
 	.quad sys_splice
 	.quad sys_sync_file_range
-	.quad sys_tee
+	.quad sys_tee			/* 315 */
 	.quad compat_sys_vmsplice
 	.quad compat_sys_move_pages
 	.quad sys_getcpu
 	.quad sys_epoll_pwait
+	.quad sys_signalfd		/* 320 */
 ia32_syscall_end:
[PATCH 4/9] AF_RXRPC: Key facility changes for AF_RXRPC
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 Documentation/keys.txt  |   12
 include/linux/key.h     |    2 ++
 security/keys/keyring.c |    2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents" for more information.
 
 	void unregister_key_type(struct key_type *type);
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+	struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 	 */
 	union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
 	} type_data;
 
 	/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
 	.read		= keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle
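The type_data change above can be pictured with a small userspace sketch. It is only an illustration under stated assumptions: struct list_head is re-declared locally as a stand-in for the kernel type, and demo_key, demo_key_set() and demo_key_get() are hypothetical names that do not appear in the patch.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

/* Same members as the union in the patch; they all share one storage area. */
union key_type_data {
	struct list_head link;
	unsigned long x[2];
	void *p[2];
};

/* Hypothetical key type that keeps two opaque pointers instead of a list link. */
struct demo_key {
	union key_type_data type_data;
};

/* Store two pointers in the space a list_head would otherwise occupy. */
static void demo_key_set(struct demo_key *key, void *a, void *b)
{
	key->type_data.p[0] = a;
	key->type_data.p[1] = b;
}

/* Read back pointer slot i (0 or 1). */
static void *demo_key_get(const struct demo_key *key, size_t i)
{
	return key->type_data.p[i];
}
```

A key type would use exactly one view of the union (link, x[] or p[]) for the lifetime of a key; mixing views on the same key would overwrite data.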
[patch 13/13] signal/timer/event fds v10 - KAIO eventfd support example ...
ChangeLog:

v10 - Added the "aio_flags" field (in place of the old "aio_reserved3") and
      introduced a new IOCB_FLAG_RESFD flag to indicate that the "aio_resfd"
      field is valid.

--

This is an example of how to add eventfd support to the current KAIO code,
so that KAIO can post readiness events to a pollable fd (and hence be
compatible with POSIX select/poll). The KAIO code simply signals the eventfd
fd when events are ready, and this triggers a POLLIN on the fd.
This patch uses a reserved-for-future-use member of struct iocb to pass an
eventfd file descriptor, which KAIO uses to post an event every time a
request completes. At that point, io_getevents() will return the completed
result in a struct io_event.
I made a quick test program to verify the patch, and it runs fine here:

http://www.xmailserver.org/eventfd-aio-test.c

The test program uses poll(2), but it would, of course, work with select and
epoll too.
This makes it possible to schedule both block I/O and requests on other
pollable devices, and to wait for the results using select/poll/epoll. In a
typical scenario, an application would submit KAIO requests using
io_submit(), use epoll_ctl() for the whole other class of devices (which,
with the addition of signals, timers and user events, is now pretty much
complete), and then do:

	epoll_wait(...);
	for_each_event {
		if (curr_event_is_kaiofd) {
			io_getevents();
			dispatch_aio_events();
		} else {
			dispatch_epoll_event();
		}
	}

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/fs/aio.c
===
--- linux-2.6.21-rc5.fds.orig/fs/aio.c 2007-04-02 15:06:11.0 -0700
+++ linux-2.6.21-rc5.fds/fs/aio.c 2007-04-02 15:06:47.0 -0700
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -421,6 +422,7 @@
 	req->private = NULL;
 	req->ki_iovec = NULL;
 	INIT_LIST_HEAD(&req->ki_run_list);
+	req->ki_eventfd = ERR_PTR(-EINVAL);
 
 	/* Check if the completion queue has enough free space to
 	 * accept an event from this io.
@@ -462,6 +464,8 @@
 {
 	assert_spin_locked(&ctx->ctx_lock);
 
+	if (!IS_ERR(req->ki_eventfd))
+		fput(req->ki_eventfd);
 	if (req->ki_dtor)
 		req->ki_dtor(req);
 	if (req->ki_iovec != &req->ki_inline_vec)
@@ -946,6 +950,14 @@
 		return 1;
 	}
 
+	/*
+	 * Check if the user asked us to deliver the result through an
+	 * eventfd. The eventfd_signal() function is safe to be called
+	 * from IRQ context.
+	 */
+	if (!IS_ERR(iocb->ki_eventfd))
+		eventfd_signal(iocb->ki_eventfd, 1);
+
 	info = &ctx->ring_info;
 
 	/* add a completion event to the ring buffer.
@@ -1530,8 +1542,7 @@
 	ssize_t ret;
 
 	/* enforce forwards compatibility on users */
-	if (unlikely(iocb->aio_reserved1 || iocb->aio_reserved2 ||
-		     iocb->aio_reserved3)) {
+	if (unlikely(iocb->aio_reserved1 || iocb->aio_reserved2)) {
 		pr_debug("EINVAL: io_submit: reserve field set\n");
 		return -EINVAL;
 	}
@@ -1555,6 +1566,19 @@
 		fput(file);
 		return -EAGAIN;
 	}
+	if (iocb->aio_flags & IOCB_FLAG_RESFD) {
+		/*
+		 * If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
+		 * instance of the file* now. The file descriptor must be
+		 * an eventfd() fd, and will be signaled for each completed
+		 * event using the eventfd_signal() function.
+		 */
+		req->ki_eventfd = eventfd_fget((int) iocb->aio_resfd);
+		if (unlikely(IS_ERR(req->ki_eventfd))) {
+			ret = PTR_ERR(req->ki_eventfd);
+			goto out_put_req;
+		}
+	}
 
 	req->ki_filp = file;
 	ret = put_user(req->ki_key, &user_iocb->aio_key);
Index: linux-2.6.21-rc5.fds/include/linux/aio.h
===
--- linux-2.6.21-rc5.fds.orig/include/linux/aio.h 2007-04-02 15:06:11.0 -0700
+++ linux-2.6.21-rc5.fds/include/linux/aio.h 2007-04-02 15:06:47.0 -0700
@@ -119,6 +119,12 @@
 	struct list_head	ki_list;	/* the aio core uses this
 						 * for cancellation */
+
+	/*
+	 * If the aio_resfd field of the userspace iocb is not zero,
+	 * this is the underlying file* to deliver event to.
+	 */
+	struct file		*ki_eventfd;
 };
 
 #define is_sync_kiocb(iocb)	((iocb)->ki_key == KIOCB_SYNC_KEY)
Index: linux-2.6.21-rc5.fds/include/linux/aio_abi.h
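The "signal the eventfd, then poll for POLLIN" flow described in this mail can be sketched from userspace without any AIO at all. The sketch below is an assumption-laden illustration, not the linked test program: it uses the eventfd() wrapper from glibc's <sys/eventfd.h> as found on current Linux systems (which may differ from the interface proposed in this series), a plain write() stands in for the kernel-side eventfd_signal(), and post_completions() and wait_completions() are hypothetical helper names.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Add n to the eventfd counter; stands in for the kernel's eventfd_signal(). */
static int post_completions(int efd, uint64_t n)
{
	return write(efd, &n, sizeof(n)) == (ssize_t)sizeof(n) ? 0 : -1;
}

/*
 * Wait up to timeout_ms for POLLIN on the eventfd, then drain the counter.
 * Returns the number of completions posted since the last read,
 * 0 on timeout, or -1 on error.
 */
static int64_t wait_completions(int efd, int timeout_ms)
{
	struct pollfd pfd = { .fd = efd, .events = POLLIN };
	uint64_t count;
	int rc = poll(&pfd, 1, timeout_ms);

	if (rc < 0)
		return -1;
	if (rc == 0 || !(pfd.revents & POLLIN))
		return 0;
	/* Reading returns the accumulated counter and resets it to zero. */
	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;
	return (int64_t)count;
}
```

In a real consumer, the value returned by wait_completions() would tell the application how many completions are waiting, and it would then collect them with io_getevents().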
[patch 12/13] signal/timer/event fds v10 - eventfd wire up x86_64 arch ...
This patch wires the eventfd system call to the x86_64 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S
===
--- linux-2.6.21-rc5.fds.orig/arch/x86_64/ia32/ia32entry.S 2007-04-02 15:06:40.0 -0700
+++ linux-2.6.21-rc5.fds/arch/x86_64/ia32/ia32entry.S 2007-04-02 15:06:46.0 -0700
@@ -721,4 +721,5 @@
 	.quad sys_epoll_pwait
 	.quad sys_signalfd		/* 320 */
 	.quad sys_timerfd
+	.quad sys_eventfd
 ia32_syscall_end:
Index: linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-x86_64/unistd.h 2007-04-02 15:06:40.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-x86_64/unistd.h 2007-04-02 15:06:46.0 -0700
@@ -623,8 +623,10 @@
 __SYSCALL(__NR_signalfd, sys_signalfd)
 #define __NR_timerfd		281
 __SYSCALL(__NR_timerfd, sys_timerfd)
+#define __NR_eventfd		282
+__SYSCALL(__NR_eventfd, sys_eventfd)
 
-#define __NR_syscall_max __NR_timerfd
+#define __NR_syscall_max __NR_eventfd
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
[patch 7/13] signal/timer/event fds v10 - timerfd wire up i386 arch ...
This patch wires the timerfd system call to the i386 architecture.

Signed-off-by: Davide Libenzi

- Davide

Index: linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S
===
--- linux-2.6.21-rc5.fds.orig/arch/i386/kernel/syscall_table.S 2007-04-02 15:06:33.0 -0700
+++ linux-2.6.21-rc5.fds/arch/i386/kernel/syscall_table.S 2007-04-02 15:06:39.0 -0700
@@ -320,3 +320,4 @@
 	.long sys_getcpu
 	.long sys_epoll_pwait
 	.long sys_signalfd		/* 320 */
+	.long sys_timerfd
Index: linux-2.6.21-rc5.fds/include/asm-i386/unistd.h
===
--- linux-2.6.21-rc5.fds.orig/include/asm-i386/unistd.h 2007-04-02 15:06:33.0 -0700
+++ linux-2.6.21-rc5.fds/include/asm-i386/unistd.h 2007-04-02 15:06:39.0 -0700
@@ -326,10 +326,11 @@
 #define __NR_getcpu		318
 #define __NR_epoll_pwait	319
 #define __NR_signalfd		320
+#define __NR_timerfd		321
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 321
+#define NR_syscalls 322
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
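Wiring up a system call touches two places that must stay in step: the table of handlers and the count of valid numbers (NR_syscalls is always the highest number plus one). A toy userspace model may make that relationship easier to see. Everything in it is hypothetical: the demo_ handlers, the numbering from 0, and demo_dispatch() are illustrations only and are not how the i386 entry code is actually written.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*syscall_fn)(long arg);

/* Placeholder handlers; they only return something recognisable. */
static long demo_signalfd(long arg) { return arg + 1; }
static long demo_timerfd(long arg)  { return arg + 2; }

/* Hypothetical numbers, starting at 0 for brevity. */
enum {
	DEMO_NR_signalfd = 0,
	DEMO_NR_timerfd  = 1,
	DEMO_NR_syscalls = 2	/* highest number + 1, like NR_syscalls */
};

/* One slot per number, in order, like syscall_table.S. */
static const syscall_fn demo_table[DEMO_NR_syscalls] = {
	[DEMO_NR_signalfd] = demo_signalfd,
	[DEMO_NR_timerfd]  = demo_timerfd,
};

/* Reject numbers past the end of the table, otherwise call the handler. */
static long demo_dispatch(unsigned long nr, long arg)
{
	if (nr >= DEMO_NR_syscalls)
		return -ENOSYS;
	return demo_table[nr](arg);
}
```

Adding a new entry to the table without bumping the count would leave the new number rejected by the bounds check, which is why the patch changes both files together.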
[PATCH 3/9] AF_RXRPC: Make it possible to merely try to cancel timers and delayed work
Export try_to_del_timer_sync() for use by the RxRPC module.

Add a try_to_cancel_delayed_work() so that it is possible to merely attempt to
cancel a delayed work timer.

Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 include/linux/workqueue.h |   21 +
 kernel/timer.c            |    2 ++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..40a61ae 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -204,4 +204,25 @@ static inline int cancel_delayed_work(struct delayed_work *work)
 	return ret;
 }
 
+/**
+ * try_to_cancel_delayed_work - Try to kill pending scheduled, delayed work
+ * @work: the work to cancel
+ *
+ * Try to kill off a pending schedule_delayed_work().
+ * - The timer may still be running afterwards, and if so, the work may still
+ *   be pending
+ * - Returns -1 if timer still active, 1 if timer removed, 0 if not scheduled
+ * - Can be called from the work routine; if it's still pending, just return
+ *   and it'll be called again.
+ */
+static inline int try_to_cancel_delayed_work(struct delayed_work *work)
+{
+	int ret;
+
+	ret = try_to_del_timer_sync(&work->timer);
+	if (ret > 0)
+		work_release(&work->work);
+	return ret;
+}
+
 #endif
diff --git a/kernel/timer.c b/kernel/timer.c
index 440048a..ba4d6e0 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
 	return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated
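The three return values documented for try_to_cancel_delayed_work() can be modelled in userspace to show what a caller has to handle. This is a simplified model under stated assumptions, not kernel code: the mock_ structures and functions are hypothetical, there is no locking, and a boolean stands in for the pending bit that work_release() clears.

```c
#include <stdbool.h>

struct mock_timer {
	bool pending;	/* queued, handler not started yet */
	bool running;	/* handler is executing right now */
};

struct mock_delayed_work {
	struct mock_timer timer;
	bool work_pending;
};

/* Models the documented results: -1 still active, 1 removed, 0 not scheduled. */
static int mock_try_to_del_timer_sync(struct mock_timer *t)
{
	if (t->running)
		return -1;
	if (t->pending) {
		t->pending = false;
		return 1;
	}
	return 0;
}

/* Same shape as the helper in the patch, built on the mock timer. */
static int mock_try_to_cancel_delayed_work(struct mock_delayed_work *w)
{
	int ret = mock_try_to_del_timer_sync(&w->timer);

	if (ret > 0)
		w->work_pending = false;	/* stands in for work_release() */
	return ret;
}
```

The point of the "try" variant is the -1 case: instead of waiting for a running handler to finish, the caller gets control back and can decide to return and retry later.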
[PATCH 1/9] AF_RXRPC: Add blkcipher accessors for using kernel data directly
Add blkcipher accessors for using kernel data directly without the use of scatter lists. Also add a CRYPTO_ALG_DMA algorithm capability flag to permit or deny the use of DMA and hardware accelerators. A hardware accelerator may not be used to access any arbitrary piece of kernel memory lest it not be in a DMA'able region. Only software algorithms may do that. If kernel data is going to be accessed directly, then CRYPTO_ALG_DMA must, for instance, be passed in the mask of crypto_alloc_blkcipher(), but not the type. This is used by AF_RXRPC to do quick encryptions, where the size of the data being encrypted or decrypted is 8 bytes or, occasionally, 16 bytes (ie: one or two chunks only), and since these data are generally on the stack they may be split over two pages. Because they're so small, and because they may be misaligned, setting up a scatter-gather list is overly expensive. It is very unlikely that a hardware FCrypt PCBC engine will be encountered (there is not, as far as I know, any such thing), and even if one is encountered, the setup/teardown costs for such small transactions will almost certainly be prohibitive. Encrypting and decrypting whole packets, on the other hand, is done through the scatter-gather list interface as the amount of data is sufficient that the expense of doing virtual address to page calculations is sufficiently small by comparison. 
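The remark about passing CRYPTO_ALG_DMA in the mask but not in the type can be illustrated with the flag comparison that, as far as I understand it, the crypto API lookup applies: an algorithm is accepted only if its flags agree with the requested type on every bit selected by the mask. The sketch below is a userspace illustration under that assumption; DEMO_ALG_DMA is a made-up bit value (the real one is defined by the patch) and demo_alg_matches() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value; the real CRYPTO_ALG_DMA is defined by the patch. */
#define DEMO_ALG_DMA 0x00000100u

/*
 * An algorithm matches when its flags equal `type` on every bit in `mask`.
 * Bits outside `mask` are ignored.
 */
static bool demo_alg_matches(uint32_t alg_flags, uint32_t type, uint32_t mask)
{
	return ((alg_flags ^ type) & mask) == 0;
}
```

With the DMA bit set in the mask and clear in the type, only algorithms that do not advertise the DMA capability match, which is the "software only" selection the description asks for; leaving the bit out of the mask would accept either kind.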
Signed-Off-By: David Howells <[EMAIL PROTECTED]>
---

 crypto/blkcipher.c     |    2 +
 crypto/pcbc.c          |   62 +
 include/linux/crypto.h |  118
 3 files changed, 181 insertions(+), 1 deletions(-)

diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c
index b5befe8..4498b2d 100644
--- a/crypto/blkcipher.c
+++ b/crypto/blkcipher.c
@@ -376,6 +376,8 @@ static int crypto_init_blkcipher_ops(struct crypto_tfm *tfm, u32 type, u32 mask)
 	crt->setkey = setkey;
 	crt->encrypt = alg->encrypt;
 	crt->decrypt = alg->decrypt;
+	crt->encrypt_kernel = alg->encrypt_kernel;
+	crt->decrypt_kernel = alg->decrypt_kernel;
 
 	addr = (unsigned long)crypto_tfm_ctx(tfm);
 	addr = ALIGN(addr, align);
diff --git a/crypto/pcbc.c b/crypto/pcbc.c
index 5174d7f..fa76111 100644
--- a/crypto/pcbc.c
+++ b/crypto/pcbc.c
@@ -126,6 +126,36 @@ static int crypto_pcbc_encrypt(struct blkcipher_desc *desc,
 	return err;
 }
 
+static int crypto_pcbc_encrypt_kernel(struct blkcipher_desc *desc,
+				      u8 *dst, const u8 *src,
+				      unsigned int nbytes)
+{
+	struct blkcipher_walk walk;
+	struct crypto_blkcipher *tfm = desc->tfm;
+	struct crypto_pcbc_ctx *ctx = crypto_blkcipher_ctx(tfm);
+	struct crypto_cipher *child = ctx->child;
+	void (*xor)(u8 *, const u8 *, unsigned int bs) = ctx->xor;
+
+	BUG_ON(crypto_tfm_alg_capabilities(crypto_cipher_tfm(child)) &
+	       CRYPTO_ALG_DMA);
+
+	if (nbytes == 0)
+		return 0;
+
+	memset(&walk, 0, sizeof(walk));
+	walk.src.virt.addr = (u8 *) src;
+	walk.dst.virt.addr = (u8 *) dst;
+	walk.nbytes = nbytes;
+	walk.total = nbytes;
+	walk.iv = desc->info;
+
+	if (walk.src.virt.addr == walk.dst.virt.addr)
+		nbytes = crypto_pcbc_encrypt_inplace(desc, &walk, child, xor);
+	else
+		nbytes = crypto_pcbc_encrypt_segment(desc, &walk, child, xor);
+	return 0;
+}
+
 static int crypto_pcbc_decrypt_segment(struct blkcipher_desc *desc,
 				       struct blkcipher_walk *walk,
 				       struct crypto_cipher *tfm,
@@ -211,6 +241,36 @@ static int crypto_pcbc_decrypt(struct blkcipher_desc *desc,
 	return err;
 }
 
+static int crypto_pcbc_decrypt_kernel(struct blkcipher_desc *desc,
+				      u8 *dst, const u8 *src,
+				      unsigned int nbytes)
+{
+	struct blkcipher_walk walk;
+	struct crypto_blkcipher *tfm = desc->tfm;
+	struct crypto_pcbc_ctx *ctx = crypto_blkcipher_ctx(tfm);
+	struct crypto_cipher *child = ctx->child;
+	void (*xor)(u8 *, const u8 *, unsigned int bs) = ctx->xor;
+
+	BUG_ON(crypto_tfm_alg_capabilities(crypto_cipher_tfm(child)) &
+	       CRYPTO_ALG_DMA);
+
+	if (nbytes == 0)
+		return 0;
+
+	memset(&walk, 0, sizeof(walk));
+	walk.src.virt.addr = (u8 *) src;
+	walk.dst.virt.addr = (u8 *) dst;
+	walk.nbytes = nbytes;
+	walk.total = nbytes;
+	walk.iv = desc->info;
+
+	if (walk.src.virt.addr == walk.dst.virt.addr)
+		nbytes = crypto_pcbc_decrypt_inplace(desc, &walk, child, xor);
+	else
+		nbytes = crypto_pcbc_decrypt_segment(desc, &walk, child, xor);
+	return 0;
+}
+
 static void xor_byte(u8 *a, const u8 *b, unsigned int bs)
 {
 	do {
@@ -313,6 +373,8 @@ static struct crypto_instance