Re: [PATCH 4/5] Fix the configuration dependencies
On Fri, 16 Nov 2007 11:33:20 +0900 "Ken'ichi Ohmichi" <[EMAIL PROTECTED]> wrote:

>
> This patch fixes the configuration dependencies in the vmcoreinfo data.
>
> i386's "node_data" is defined in arch/x86/mm/discontig_32.c,
> and x86_64's one is defined in arch/x86/mm/numa_64.c.
> They depend on CONFIG_NUMA:
> arch/x86/mm/Makefile_32:7
> obj-$(CONFIG_NUMA) += discontig_32.o
> arch/x86/mm/Makefile_64:7
> obj-$(CONFIG_NUMA) += numa_64.o
>
> ia64's "pgdat_list" is defined in arch/ia64/mm/discontig.c,
> and it depends on CONFIG_DISCONTIGMEM and CONFIG_SPARSEMEM:
> arch/ia64/mm/Makefile:9-10
> obj-$(CONFIG_DISCONTIGMEM) += discontig.o
> obj-$(CONFIG_SPARSEMEM)+= discontig.o
>
> ia64's "node_memblk" is defined in arch/ia64/mm/numa.c,
> and it depends on CONFIG_NUMA:
> arch/ia64/mm/Makefile:8
> obj-$(CONFIG_NUMA) += numa.o
>
> Signed-off-by: Ken'ichi Ohmichi <[EMAIL PROTECTED]>
> ---
> diff -rpuN a/arch/ia64/kernel/machine_kexec.c b/arch/ia64/kernel/machine_kexec.c
> --- a/arch/ia64/kernel/machine_kexec.c 2007-11-14 15:39:06.0 +0900
> +++ b/arch/ia64/kernel/machine_kexec.c 2007-11-14 15:41:41.0 +0900
> @@ -129,10 +129,11 @@ void machine_kexec(struct kimage *image)
>
>  void arch_crash_save_vmcoreinfo(void)
>  {
> -#if defined(CONFIG_ARCH_DISCONTIGMEM_ENABLE) && defined(CONFIG_NUMA)
> +#if defined(CONFIG_DISCONTIGMEM) || defined(CONFIG_SPARSEMEM)
>  	VMCOREINFO_SYMBOL(pgdat_list);
>  	VMCOREINFO_LENGTH(pgdat_list, MAX_NUMNODES);
> -
> +#endif
> +#ifdef CONFIG_NUMA
>  	VMCOREINFO_SYMBOL(node_memblk);
>  	VMCOREINFO_LENGTH(node_memblk, NR_NODE_MEMBLKS);
>  	VMCOREINFO_STRUCT_SIZE(node_memblk_s);
> diff -rpuN a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
> --- a/arch/x86/kernel/machine_kexec_32.c 2007-11-14 15:39:19.0 +0900
> +++ b/arch/x86/kernel/machine_kexec_32.c 2007-11-14 15:39:33.0 +0900
> @@ -151,7 +151,7 @@ NORET_TYPE void machine_kexec(struct kim
>
>  void arch_crash_save_vmcoreinfo(void)
>  {
> -#ifdef CONFIG_ARCH_DISCONTIGMEM_ENABLE
> +#ifdef CONFIG_NUMA
>  	VMCOREINFO_SYMBOL(node_data);
>  	VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
>  #endif
> diff -rpuN a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> --- a/arch/x86/kernel/machine_kexec_64.c 2007-11-14 15:39:19.0 +0900
> +++ b/arch/x86/kernel/machine_kexec_64.c 2007-11-14 15:39:33.0 +0900
> @@ -235,7 +235,7 @@ void arch_crash_save_vmcoreinfo(void)
>  {
>  	VMCOREINFO_SYMBOL(init_level4_pgt);
>
> -#ifdef CONFIG_ARCH_DISCONTIGMEM_ENABLE
> +#ifdef CONFIG_NUMA
>  	VMCOREINFO_SYMBOL(node_data);
>  	VMCOREINFO_LENGTH(node_data, MAX_NUMNODES);
>  #endif
> _
>

x86_64-make-sparsemem-vmemmap-the-default-memory-model-v2.patch removes the
`VMCOREINFO_SYMBOL(node_data);' from arch/x86/kernel/machine_kexec_64.c
altogether, so I dropped that part of your patch.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
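The rule the patch applies (guard every reference with the same option that builds the definition) can be sketched in plain userspace C. This is only an illustration: `DEMO_CONFIG_NUMA`, `demo_node_data` and `demo_save_info` are hypothetical stand-ins, not the kernel's Kconfig symbols or VMCOREINFO macros.

```c
#include <stdio.h>

/* Hypothetical stand-in for a Kconfig option; remove this line and the
 * definition and the reference below disappear together. */
#define DEMO_CONFIG_NUMA 1

#ifdef DEMO_CONFIG_NUMA
/* Only built when the option is set, like an obj-$(CONFIG_...) object. */
static int demo_node_data[4];
#endif

/* Writes one record per available symbol; returns the record count. */
static int demo_save_info(char *buf, size_t size)
{
	int records = 0;

	buf[0] = '\0';
#ifdef DEMO_CONFIG_NUMA
	/* Guarded by the same option as the definition above. */
	snprintf(buf, size, "SYMBOL(demo_node_data) LENGTH=%zu\n",
		 sizeof(demo_node_data) / sizeof(demo_node_data[0]));
	records++;
#endif
	return records;
}
```

If the guard on the reference used a different option than the guard on the definition, some configurations would fail to link, which is the class of breakage the patch description is about.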
Re: 2.6.24-rc3-mm1: I/O error, system hangs
On Sat, Nov 24, 2007 at 07:44:13PM +0200, James Bottomley wrote:
> Probing intermittent failures in Domain Validation, even with the fixes
> applied leads me to the conclusion that there are further problems with
> this commit:
>
> commit fc5eb4facedbd6d7117905e775cee1975f894e79
> Author: Hannes Reinecke <[EMAIL PROTECTED]>
> Date: Tue Nov 6 09:23:40 2007 +0100
>
> [SCSI] Do not requeue requests if REQ_FAILFAST is set
>
> The essence of the problems is that you're causing REQ_FAILFAST to
> terminate commands with error on requeuing conditions, some of which are
> relatively common on most SCSI devices. While this may be the correct
> behaviour for multi-path, it's certainly wrong for the previously
> understood meaning of REQ_FAILFAST, which was don't retry on error,
> which is why domain validation and other applications use it to control
> error handling, but don't expect to get failures for a simple requeue
> are now spitting errors.
>
> I honestly can't see that, even for the multi-path case, returning an
> error when we're over queue depth is the correct thing to do (it may not
> matter to something like a symmetrix, but an array that has a non-zero
> cost associated with a path change, like a CPQ HSV or the AVT
> controllers, will show fairly large slow downs if you do this). Even if
> this is the desired behaviour (and I think that's a policy issue),
> DID_NO_CONNECT is almost certainly the wrong error to be sending back.
>
> This patch fixes up domain validation to work again correctly, however,
> I really think it's just a bandaid. Do you want to rethink the above
> commit?
>
Given the amounted error, yes, I'll have to.
But we still face the initial problem that requeued requests will be
stuck in the queue forever (ie until the timeout catches it), causing
failover to be painfully slow.

Anyway, I'll think it over.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
Re: [PATCH 23/27] x86: debugctlmsr kconfig
> Why is it defined in configuration system instead of some *.h file?

That seems to be existing practice for this sort of thing.
I just followed what I saw.

Thanks,
Roland
[PATCH] PPC: CELLEB - fix potential NULL pointer dereference
This patch adds checking for NULL value returned to prevent possible NULL pointer dereference. Also two unneeded 'return' are removed. Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]> --- Any comments are welcome. arch/powerpc/platforms/celleb/pci.c | 23 --- 1 files changed, 20 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/celleb/pci.c b/arch/powerpc/platforms/celleb/pci.c index 6bc32fd..9b8bb01 100644 --- a/arch/powerpc/platforms/celleb/pci.c +++ b/arch/powerpc/platforms/celleb/pci.c @@ -138,8 +138,6 @@ static void celleb_config_read_fake(unsigned char *config, int where, *val = celleb_fake_config_readl(p); break; } - - return; } static void celleb_config_write_fake(unsigned char *config, int where, @@ -158,7 +156,6 @@ static void celleb_config_write_fake(unsigned char *config, int where, celleb_fake_config_writel(val, p); break; } - return; } static int celleb_fake_pci_read_config(struct pci_bus *bus, @@ -348,9 +345,25 @@ static int __init celleb_setup_fake_pci_device(struct device_node *node, pr_debug("PCI: res assigned 0x%016lx\n", (unsigned long)*res); wi0 = of_get_property(node, "device-id", NULL); + if (unlikely((!wi0))) { + printk(KERN_ERR "PCI: device-id not found.\n"); + goto error; + } wi1 = of_get_property(node, "vendor-id", NULL); + if (unlikely((!wi1))) { + printk(KERN_ERR "PCI: vendor-id not found.\n"); + goto error; + } wi2 = of_get_property(node, "class-code", NULL); + if (unlikely((!wi2))) { + printk(KERN_ERR "PCI: class-code not found.\n"); + goto error; + } wi3 = of_get_property(node, "revision-id", NULL); + if (unlikely((!wi3))) { + printk(KERN_ERR "PCI: revision-id not found.\n"); + goto error; + } celleb_config_write_fake(*config, PCI_DEVICE_ID, 2, wi0[0] & 0x); celleb_config_write_fake(*config, PCI_VENDOR_ID, 2, wi1[0] & 0x); @@ -372,6 +385,10 @@ static int __init celleb_setup_fake_pci_device(struct device_node *node, celleb_setup_pci_base_addrs(hose, devno, fn, num_base_addr); li = of_get_property(node, "interrupts", 
); + if (!li) { + printk(KERN_ERR "PCI: interrupts not found.\n"); + goto error; + } val = li[0]; celleb_config_write_fake(*config, PCI_INTERRUPT_PIN, 1, 1); celleb_config_write_fake(*config, PCI_INTERRUPT_LINE, 1, val);
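The pattern the patch adds (check a lookup result before dereferencing it) can be shown in a small userspace sketch. Everything here is hypothetical scaffolding: `demo_prop`, `demo_get_property` and `demo_read_cell` only imitate the shape of a property lookup that may return NULL, and are not the kernel's `of_get_property()`.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical property table standing in for a device-tree node. */
struct demo_prop {
	const char *name;
	const unsigned int *value;
};

/* Returns the property value, or NULL when the name is not present. */
static const unsigned int *demo_get_property(const struct demo_prop *props,
					     size_t count, const char *name)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (strcmp(props[i].name, name) == 0)
			return props[i].value;
	return NULL;
}

/* Reads the first cell of a property; returns 0 on success, -1 if missing. */
static int demo_read_cell(const struct demo_prop *props, size_t count,
			  const char *name, unsigned int *out)
{
	const unsigned int *cell = demo_get_property(props, count, name);

	if (!cell) {
		fprintf(stderr, "demo: %s not found.\n", name);
		return -1;	/* bail out instead of dereferencing NULL */
	}
	*out = cell[0];
	return 0;
}
```

The caller decides what a missing property means; in the patch above that decision is a jump to the existing error label.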
Re: [PATCH 58/59] sound/isa: Add missing "space"
At Mon, 19 Nov 2007 17:53:45 -0800,
Joe Perches wrote:
>
>
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

Applied to ALSA tree. Thanks.

Takashi

> ---
>  sound/isa/sc6000.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/sound/isa/sc6000.c b/sound/isa/sc6000.c
> index 94daf83..bc0c379 100644
> --- a/sound/isa/sc6000.c
> +++ b/sound/isa/sc6000.c
> @@ -390,7 +390,7 @@ static int __devinit sc6000_init_board(char __iomem *vport, int irq, int dma,
>
>  	err = sc6000_init_mss(vport, config, vmss_port, mss_config);
>  	if (err < 0) {
> -		snd_printk(KERN_ERR "Can not initialize"
> +		snd_printk(KERN_ERR "Can not initialize "
>  			   "Microsoft Sound System mode.\n");
>  		return -ENODEV;
>  	}
> --
> 1.5.3.5.652.gf192c
>
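For readers wondering why one space matters: C joins adjacent string literals with nothing in between, so a message split across source lines needs the separator inside one of the literals. A minimal sketch (the `demo_` names are made up for illustration; the message text is the one from the patch, without the trailing newline):

```c
/* Adjacent string literals are concatenated exactly as written. */
static const char demo_without_space[] = "Can not initialize"
					 "Microsoft Sound System mode.";

/* The trailing space in the first literal keeps the words apart. */
static const char demo_with_space[] = "Can not initialize "
				      "Microsoft Sound System mode.";
```

The first array holds "Can not initializeMicrosoft Sound System mode.", which is what the unpatched driver would have logged.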
Re: enable dual rng on VIA C7
On Sun, 11 Nov 2007 19:49:08 +0100 Udo van den Heuvel <[EMAIL PROTECTED]> wrote:

> Any reason why the second rng on the VIA C7 CPU is not enabled?
>
> Kind regards,
> Udo
>
>
> [via-rng.patch text/plain (634B)]
> --- old/drivers/char/hw_random/via-rng.c 2007-11-11 19:39:49.0 +0100
> +++ new/drivers/char/hw_random/via-rng.c 2007-11-11 19:40:41.0 +0100
> @@ -41,6 +41,7 @@
>  	VIA_STRFILT_ENABLE = (1 << 14),
>  	VIA_RAWBITS_ENABLE = (1 << 13),
>  	VIA_RNG_ENABLE = (1 << 6),
> +	VIA_RNG_DUAL= (1 << 9),
>  	VIA_XSTORE_CNT_MASK = 0x0F,
>
>  	VIA_RNG_CHUNK_8 = 0x00, /* 64 rand bits, 64 stored bits */
> @@ -128,6 +129,7 @@
>  	lo &= ~(0x7f << VIA_STRFILT_CNT_SHIFT);
>  	lo &= ~VIA_XSTORE_CNT_MASK;
>  	lo &= ~(VIA_STRFILT_ENABLE | VIA_STRFILT_FAIL | VIA_RAWBITS_ENABLE);
> +	lo |= VIA_RNG_DUAL;
>  	lo |= VIA_RNG_ENABLE;
>
>  	if (lo != old_lo)
>

Does the patch work?

It's missing a signed-off-by:, btw
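The bit manipulation in the second hunk can be tried outside the kernel. The sketch below copies the mask values visible in the quoted patch; `DEMO_STRFILT_FAIL` and `DEMO_STRFILT_CNT_SHIFT` are not shown in the hunk, so their values here are assumptions, and `demo_rng_config` is a hypothetical helper, not a driver function. It says nothing about what the hardware does with bit 9.

```c
#include <stdint.h>

/* Bit values quoted in the patch above. */
#define DEMO_STRFILT_ENABLE	(1u << 14)
#define DEMO_RAWBITS_ENABLE	(1u << 13)
#define DEMO_RNG_DUAL		(1u << 9)	/* proposed by the patch */
#define DEMO_RNG_ENABLE		(1u << 6)
#define DEMO_XSTORE_CNT_MASK	0x0Fu

/* Assumed values: these two are not visible in the quoted hunk. */
#define DEMO_STRFILT_FAIL	(1u << 15)
#define DEMO_STRFILT_CNT_SHIFT	16

/* Applies the same clear-then-set sequence to a register image. */
static uint32_t demo_rng_config(uint32_t lo)
{
	lo &= ~(0x7fu << DEMO_STRFILT_CNT_SHIFT);
	lo &= ~DEMO_XSTORE_CNT_MASK;
	lo &= ~(DEMO_STRFILT_ENABLE | DEMO_STRFILT_FAIL | DEMO_RAWBITS_ENABLE);
	lo |= DEMO_RNG_DUAL;
	lo |= DEMO_RNG_ENABLE;
	return lo;
}
```

Starting from an all-zero image, only bits 9 and 6 end up set; whether that is the right setting for the C7 is exactly the question raised in the reply.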
[PATCH 2/2] ide-scsi: use print_hex_dump from
these utilities implemented in lib/hexdump.c are more handy, please use this. Cc: Randy Dunlap <[EMAIL PROTECTED]> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]> --- there are still much other private hexdump implementations in the source, which reinvent the wheel, we can find them through: $ grep -RsIn hexdump ... drivers/scsi/ide-scsi.c | 18 -- 1 files changed, 4 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/ide-scsi.c b/drivers/scsi/ide-scsi.c index 8d0244c..8f3fc1d 100644 --- a/drivers/scsi/ide-scsi.c +++ b/drivers/scsi/ide-scsi.c @@ -242,16 +242,6 @@ static void idescsi_output_buffers (ide_drive_t *drive, idescsi_pc_t *pc, unsign } } -static void hexdump(u8 *x, int len) -{ - int i; - - printk("[ "); - for (i = 0; i < len; i++) - printk("%x ", x[i]); - printk("]\n"); -} - static int idescsi_check_condition(ide_drive_t *drive, struct request *failed_command) { idescsi_scsi_t *scsi = drive_to_idescsi(drive); @@ -282,7 +272,7 @@ static int idescsi_check_condition(ide_drive_t *drive, struct request *failed_co pc->scsi_cmd = ((idescsi_pc_t *) failed_command->special)->scsi_cmd; if (test_bit(IDESCSI_LOG_CMD, >log)) { printk ("ide-scsi: %s: queue cmd = ", drive->name); - hexdump(pc->c, 6); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, pc->c, 6, 1); } rq->rq_disk = scsi->disk; return ide_do_drive_cmd(drive, rq, ide_preempt); @@ -337,7 +327,7 @@ static int idescsi_end_request (ide_drive_t *drive, int uptodate, int nrsecs) idescsi_pc_t *opc = (idescsi_pc_t *) rq->buffer; if (log) { printk ("ide-scsi: %s: wrap up check %lu, rst = ", drive->name, opc->scsi_cmd->serial_number); - hexdump(pc->buffer,16); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, pc->buffer, 16, 1); } memcpy((void *) opc->scsi_cmd->sense_buffer, pc->buffer, SCSI_SENSE_BUFFERSIZE); kfree(pc->buffer); @@ -816,10 +806,10 @@ static int idescsi_queue (struct scsi_cmnd *cmd, if (test_bit(IDESCSI_LOG_CMD, >log)) { printk ("ide-scsi: %s: que %lu, cmd = ", drive->name, 
cmd->serial_number); - hexdump(cmd->cmnd, cmd->cmd_len); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, cmd->cmnd, cmd->cmd_len, 1); if (memcmp(pc->c, cmd->cmnd, cmd->cmd_len)) { printk ("ide-scsi: %s: que %lu, tsl = ", drive->name, cmd->serial_number); - hexdump(pc->c, 12); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, pc->c, 12, 1); } } -- 1.5.3.5 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
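To give a feel for the row-oriented output the patch switches to, here is a userspace approximation: an offset prefix followed by up to 16 bytes in one-byte groups. It is a sketch only; `demo_hex_row` and `demo_hex_dump` are hypothetical names, and the exact layout produced by the kernel's lib/hexdump.c may differ.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_ROW_SIZE 16

/*
 * Formats up to DEMO_ROW_SIZE bytes as "OFFSET: xx xx ...".
 * Returns the number of bytes consumed from buf.
 */
static size_t demo_hex_row(const unsigned char *buf, size_t len,
			   size_t offset, char *out, size_t outsz)
{
	size_t n = len < DEMO_ROW_SIZE ? len : DEMO_ROW_SIZE;
	size_t i;
	int pos;

	pos = snprintf(out, outsz, "%08zx:", offset);
	for (i = 0; i < n && pos > 0 && (size_t)pos < outsz; i++)
		pos += snprintf(out + pos, outsz - pos, " %02x", buf[i]);
	return n;
}

/* Dumps a whole buffer to stdout, one row per line. */
static void demo_hex_dump(const unsigned char *buf, size_t len)
{
	char row[8 + 1 + 3 * DEMO_ROW_SIZE + 1];
	size_t done = 0;

	while (done < len)
		done += demo_hex_row(buf + done, len - done, done,
				     row, sizeof(row)), puts(row);
}
```

Unlike the removed private `hexdump()`, which printed all bytes on one line with `%x`, a row-based dump pads each byte to two digits and starts a new line every 16 bytes.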
[PATCH 1/2] crypto test: use print_hex_dump from
these utilities implemented in lib/hexdump.c are more handy, please use this. Cc: Randy Dunlap <[EMAIL PROTECTED]> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]> --- crypto/tcrypt.c | 21 +++-- 1 files changed, 7 insertions(+), 14 deletions(-) diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c index 24141fb..8766023 100644 --- a/crypto/tcrypt.c +++ b/crypto/tcrypt.c @@ -81,14 +81,6 @@ static char *check[] = { "camellia", "seed", NULL }; -static void hexdump(unsigned char *buf, unsigned int len) -{ - while (len--) - printk("%02x", *buf++); - - printk("\n"); -} - static void tcrypt_complete(struct crypto_async_request *req, int err) { struct tcrypt_result *res = req->data; @@ -156,7 +148,8 @@ static void test_hash(char *algo, struct hash_testvec *template, goto out; } - hexdump(result, crypto_hash_digestsize(tfm)); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, result, crypto_hash_digestsize(tfm), 1); + printk("%s\n", memcmp(result, hash_tv[i].digest, crypto_hash_digestsize(tfm)) ? @@ -203,7 +196,7 @@ static void test_hash(char *algo, struct hash_testvec *template, goto out; } - hexdump(result, crypto_hash_digestsize(tfm)); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, result, crypto_hash_digestsize(tfm), 1); printk("%s\n", memcmp(result, hash_tv[i].digest, crypto_hash_digestsize(tfm)) ? @@ -319,7 +312,7 @@ static void test_cipher(char *algo, int enc, } q = kmap(sg_page([0])) + sg[0].offset; - hexdump(q, cipher_tv[i].rlen); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, q, cipher_tv[i].rlen, 1); printk("%s\n", memcmp(q, cipher_tv[i].result, @@ -393,7 +386,7 @@ static void test_cipher(char *algo, int enc, for (k = 0; k < cipher_tv[i].np; k++) { printk("page %u\n", k); q = kmap(sg_page([k])) + sg[k].offset; - hexdump(q, cipher_tv[i].tap[k]); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, q, cipher_tv[i].tap[k], 1); printk("%s\n", memcmp(q, cipher_tv[i].result + temp, cipher_tv[i].tap[k]) ? 
"fail" : @@ -839,7 +832,7 @@ static void test_deflate(void) printk("fail: ret=%d\n", ret); continue; } - hexdump(result, dlen); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, result, dlen, 1); printk("%s (ratio %d:%d)\n", memcmp(result, tv[i].output, dlen) ? "fail" : "pass", ilen, dlen); @@ -870,7 +863,7 @@ static void test_deflate(void) printk("fail: ret=%d\n", ret); continue; } - hexdump(result, dlen); + print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 16, 1, result, dlen, 1); printk("%s (ratio %d:%d)\n", memcmp(result, tv[i].output, dlen) ? "fail" : "pass", ilen, dlen); -- 1.5.3.5 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc3-mm1 (sync is slow ?)
On Sat, 24 Nov 2007 19:04:34 +0100 Gabriel C <[EMAIL PROTECTED]> wrote:

> >> It seems OK here from a quick test (i386, ext3-on-IDE).
> >>
> >> Maybe device driver/block breakage?
>
> Try revert
>
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d
>
> See also :
> http://lkml.org/lkml/2007/11/23/5
>
> and search for '2.6.24-rc3-mm1: I/O error, system hangs' on LKML
>

Thank you! The problem was fixed by reverting the patch you pointed out.

-Kame
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
> Except C doesn't have namespaces and this mechanism doesn't create them. So
> this is just complete and utter makework; as I said before, noone's going to
> confuse all those udp_* functions if they're not in the udp namespace.

I don't understand why you're so opposed to organizing the kernel's
exported symbols in a more self-documenting way. It seems pretty
clear to me that having a mechanism that requires modules to make
explicit which (semi-)internal APIs they use makes reviewing easier,
makes it easier to communicate "please don't use that API" to module
authors, and takes at least a small step towards bringing the kernel's
exported API under control. What's the real downside?

 - R.
2.6.24-rc3-$SHA1: kernel BUG at fs/jbd/checkpoint.c:683!
In a desperate attempt to screw up /proc one more time, I added some proc fixes, wrote test module which creates and removes simple proc file, then ran a) modprobe/rmmod loop, b) cat /proc/foo/bar loop, c) LTP loop. So far so good -- survived overnight run. While rebooting into new kernel, kernel died: [56400.857832] kernel BUG at fs/jbd/checkpoint.c:683! [56400.857911] invalid opcode: [1] PREEMPT SMP [56400.857996] CPU 0 [56400.858059] Modules linked in: foo [56400.858138] Pid: 392, comm: kjournald Not tainted 2.6.24-rc3-proc #11 [56400.858227] RIP: 0010:[] [] __journal_drop_transaction+0x110/0x120 [56400.858380] RSP: :81017f30dd58 EFLAGS: 00010286 [56400.858462] RAX: 81012ab9f210 RBX: 81017f336cd8 RCX: 81017fcbbe48 [56400.858555] RDX: 81012ab9f210 RSI: 810110eeb318 RDI: 81017f336cd8 [56400.858648] RBP: 81017aa8a2a0 R08: R09: 81017aa8a4f8 [56400.858741] R10: 0001 R11: 8021b220 R12: 81017aa8a2a0 [56400.858834] R13: 81017aa8a2a0 R14: 81017f30ddbc R15: 81017f30ddbc [56400.858927] FS: () GS:804ea000() knlGS: [56400.859070] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b [56400.859157] CR2: 00437c50 CR3: 000104dae000 CR4: 06e0 [56400.859250] DR0: DR1: DR2: [56400.859343] DR3: DR6: 0ff0 DR7: 0400 [56400.859436] Process kjournald (pid: 392, threadinfo 81017f30c000, task 81017fcc8ec0) [56400.859581] Stack: 802cbf7a 81016f795c60 802cc0a8 [56400.859734] 810110eeb318 81012ab9f210 0001 810117e4da50 [56400.859881] 81017f30ddbc 81017f336e3c 802cca0b 810117e4da50 [56400.859979] Call Trace: [56400.860093] [] __journal_remove_checkpoint+0x5a/0xb0 [56400.860183] [] journal_clean_one_cp_list+0xd8/0x170 [56400.860273] [] __journal_clean_checkpoint_list+0x4b/0xa0 [56400.860370] [] journal_commit_transaction+0x21d/0x1110 [56400.860462] [] lock_timer_base+0x34/0x70 [56400.860546] [] try_to_del_timer_sync+0x53/0x60 [56400.860633] [] kjournald+0xdf/0x240 [56400.860715] [] autoremove_wake_function+0x0/0x30 [56400.860803] [] kjournald+0x0/0x240 [56400.860884] [] kthread+0x4b/0x80 [56400.860967] [] 
child_rip+0xa/0x12 [56400.861047] [] kthread+0x0/0x80 [56400.861126] [] child_rip+0x0/0x12 [56400.861205] [56400.861262] [56400.861263] Code: 0f 0b eb fe 66 66 66 90 66 66 66 90 66 66 66 90 53 48 8b 77 [56400.861546] RIP [] __journal_drop_transaction+0x110/0x120 [56400.861642] RSP [56400.862158] Kernel panic - not syncing: Fatal exception Version:2.6.24-rc3-2ffbb8377c7a0713baf6644e285adc27a5654582 + proc fixes (cumulative patch attached) Box:Core 2 Duo E6400, 4G RAM mount info: /dev/sda2 on / type ext3 (rw,noatime,nodiratime) scheduler: CFQ .config: # # Automatically generated make config: don't edit # Linux kernel version: 2.6.24-rc3-proc # Sun Nov 25 14:29:24 2007 # CONFIG_64BIT=y # CONFIG_X86_32 is not set CONFIG_X86_64=y CONFIG_X86=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_MMU=y CONFIG_ZONE_DMA=y # CONFIG_QUICKLIST is not set CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_RWSEM_GENERIC_SPINLOCK=y # CONFIG_RWSEM_XCHGADD_ALGORITHM is not set # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_ZONE_DMA32=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_AUDIT_ARCH=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_X86_HT=y # CONFIG_KTIME_SCALAR is not set CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y # CONFIG_POSIX_MQUEUE is not set # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # 
CONFIG_PID_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=15 # CONFIG_CGROUPS is not set # CONFIG_FAIR_GROUP_SCHED is not set # CONFIG_SYSFS_DEPRECATED is not set # CONFIG_RELAY is not set # CONFIG_BLK_DEV_INITRD is not set # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y CONFIG_EMBEDDED=y # CONFIG_SYSCTL_SYSCALL is not set CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
> > I agree that we shouldn't make things too hard for out-of-tree
> > modules, but I disagree with your first statement: there clearly is a
> > large class of symbols that are used by multiple modules but which are
> > not generically useful -- they are only useful by a certain small class
> > of modules.
>
> If it is so clear, you should be able to easily provide examples?

Sure -- Andi's example of symbols required only by TCP congestion
modules; the SCSI internals that Christoph wants to mark; the symbols
exported by my mlx4_core driver (which I admit are currently only used
by the mlx4_ib driver, but which will also be used by at least the
ethernet NIC driver for the same hardware).

I thought this was already covered repeatedly in the thread and indeed
in Andi's code so there was no need to repeat it...
profile code added to netif_receive_skb function
Hi,

I have added some code to the netif_receive_skb function. As the Linux
kernel is multithreaded, there is no guarantee that my code runs to
completion without being interrupted by other code. The timer interrupt
handler is an example of code which might interrupt execution of my code.

I just want to observe what is interrupting my code. I think I need to
print EIP register values. How can I print cache contents as well in the
Linux kernel? Are there any tools available for this purpose?

thanks,
shahzad
Re: [2.6 patch] make I/O schedulers non-modular
Andrew Morton wrote:
> (cc's lovingly restored. Please do not do that)

Thanks! I'm replying off list.

> On Mon, 26 Nov 2007 07:57:00 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> > Jens Axboe wrote:
> > > On Sun, Nov 25 2007, Adrian Bunk wrote:
> > > > Is there any technical reason why we need 4 different schedulers at
> > > > all?
> > >
> > > Until we have the perfect scheduler :-)
> > >
> > > With some hard work and testing, we should be able to get rid of 'as'.
> > > It still beats cfq for some of the workloads that deadline is good at,
> > > so not quite yet.
> > >
> > > > I have the gut feeling that the usual thing happens and people e.g.
> > > > not report some cfq problems because as works for them...
> > >
> > > There's always a risk with "duplicate", like several drivers for the
> > > same hardware. I'm not disputing that.
> >
> > Actually, both 'cfq' and 'as' are broken, and have been repeatedly
> > reported as such. Deadline is the only one that currently looks sane,
> > and seems like a good starting point for a more involved iosched. But
> > keep in mind, the fact that 'cfq' and 'as' are broken may also point to
> > a lower-level block-io problem. So, incrementally improving deadline
> > may help discovering the problems both 'cfq' and 'as' are plagued with.
>
> Sorry, but these are vague and unuseful assertions.
>
> Please send bug reports, preferably with testcases which developers can
> use when fixing the bugs.

http://bugzilla.kernel.org/show_bug.cgi?id=5900

Thanks again!

--
Al
Re: Small System Paging Problem - OOM-killer goes nuts
Thanks for the response Mikael. Is your 486 running an IDE disk on a normal
interface or via USB? I wonder if the NSLU2 only having I/O via USB might be
significant. Also, this is a 2.6 kernel and I've seen spurious reports
across the internet about similar oom-killer problems since about 2.6.7.

Thanks!

-Josh

----- Original Message -----
From: "Mikael Pettersson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>;
Sent: Sunday, November 25, 2007 3:55 PM
Subject: Re: Small System Paging Problem - OOM-killer goes nuts

I'm no VM tuning expert, but I have and still do heavy compile
jobs on similarly configured machines, with no OOM problems:

I regularly build 2.6 kernels and occasionally also gcc on a
100MHz 486 with 28MB of RAM and perhaps 500MB of swap. It runs
a standard but stripped down Fedora Core 4 user-space, with
ext3 file systems and a kernel that doesn't include anything
non-essential. The machine will swap madly, but the OOM killer
never triggers. (All system settings are FC4 defaults. I haven't
touched them.)

In the past I did a fair amount of package rebuilds and test
suite runs on an NSLU2 myself, with a 2.4 Linksys/Openslug kernel,
ext3, and a 1GB or perhaps 2GB swap partition on a disk attached
via a USB2-to-PATA enclosure. Even when swapping heavily the OOM
killer wouldn't trigger.
RE: [PATCH 1/2] msi: set 'En' bit of MSI Mapping Capability on HT platform
> Isn't there a way we can make this work for any upstream HT > bridge, rather than only for specific NVIDIA chipsets? The lines Peer indicates below will work for any vendor's bridge device that implements an HT MSI mapping and is an upstream bridge of the endpoint requesting MSI. On some NVIDIA chipsets, the host bridge that implements HT MSI mapping is not hierarchically upstream from the MSI endpoint; it may be a peer on the same bus as the endpoint or the PCIe root complex that's above the endpoint. The NVIDIA-specific code in the patch is to detect those specific chipsets where this can occur. We have tested the patch with both internal and PCI Express MSI endpoints on each of these NVIDIA chipsets. It may be that other vendors have Hypertransport chipsets with similar requirements for HT MSI mapping, but we don't have that information or the ability to test code on those vendors' chipsets. Regards, Andy -- Andy Currid, NVIDIA Corporation [EMAIL PROTECTED] 408 566 6743 -Original Message- From: Peer Chen Sent: Sunday, November 25, 2007 20:02 To: Robert Hancock; peerchen Cc: linux-kernel; akpm; Andy Currid Subject: RE: [PATCH 1/2] msi: set 'En' bit of MSI Mapping Capability on HT platform I think the following lines are suitable for other bridges besides nvidia's, :) : === + if (pci_enable_msi_ht_cap(dev) != 0) { + return 0; + } else { + /* Get upstream bridge device handle */ + + bridge_dev = dev->bus->self; + while(bridge_dev != 0) { + if (pci_enable_msi_ht_cap(bridge_dev) != 0) { + return 0; + } else + bridge_dev = bridge_dev->bus->self; + } + + return 1; + } BRs Peer Chen -Original Message- From: Robert Hancock [mailto:[EMAIL PROTECTED] Sent: Monday, November 26, 2007 2:34 AM To: peerchen Cc: linux-kernel; akpm; Peer Chen; Andy Currid Subject: Re: [PATCH 1/2] msi: set 'En' bit of MSI Mapping Capability on HT platform peerchen wrote: > According to the HyperTransport spec, 'En' indicate if the MSI Mapping is active. So it should be set when enable the MSI. 
>
> The patch base on kernel 2.6.24-rc3
>
> Signed-off-by: Andy Currid <[EMAIL PROTECTED]>
> Signed-off-by: Peer Chen <[EMAIL PROTECTED]>

Isn't there a way we can make this work for any upstream HT bridge,
rather than only for specific NVIDIA chipsets?

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

---
This email message is for the sole use of the intended recipient(s) and may
contain confidential information. Any unauthorized review, use, disclosure
or distribution is prohibited. If you are not the intended recipient,
please contact the sender by reply email and destroy all copies of the
original message.
---
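The upstream walk quoted by Peer, and the limitation Andy describes, can be modelled with a few lines of standalone C. This is a simplified sketch: `demo_dev` and `demo_find_ht_msi` are hypothetical, a single `upstream` pointer stands in for `dev->bus->self`, and a flag stands in for the capability lookup. The 0/1 return convention mirrors the quoted lines.

```c
#include <stddef.h>

/* Hypothetical, simplified device: only what the walk needs. */
struct demo_dev {
	struct demo_dev *upstream;	/* parent bridge, NULL at the root */
	int has_ht_msi_cap;		/* non-zero if it has an HT MSI mapping */
};

/*
 * Returns 0 if the device or any bridge above it has the capability,
 * 1 otherwise (same convention as the quoted patch fragment).
 */
static int demo_find_ht_msi(const struct demo_dev *dev)
{
	const struct demo_dev *cur;

	for (cur = dev; cur != NULL; cur = cur->upstream)
		if (cur->has_ht_msi_cap)
			return 0;
	return 1;
}
```

A capable bridge that sits beside the endpoint instead of above it is never visited by this loop, which is the situation the NVIDIA-specific part of the patch is said to handle.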
Re: [linux-usb-devel] [BUG] USB_PERSIST
The device which has the root fs is a READ-ONLY device. There is no way for it to change between getting detached and reattached to the computer which is suspended. In such a case there is no possibility of hibernation because there is nothing to write back to. I understand that this is currently considered a feature but I am arguing here that there should also be another feature that allows this to work under suspend to ram the same as it does with suspend to disk (hibernation). Here's a scenario: 1) You are at the airport working on a laptop without a hard drive, which you have booted up using a live USB distro on a read-only USB key drive. 2) You want to board your plane so you suspend your laptop. You can't keep the USB stick in your laptop because you can not fit the laptop back in the bag with the USB stick still attached. So you detach the USB stick while the laptop is still suspended. 3) You get on the plane and after some time when you are allowed to work again you stick back in the USB stick, resume the laptop and continue work where you left off. This scenario is not currently possible with the any kernel after 2.6.22. It is a very important missing feature. And yes. This feature does work under the 2.6.21 kernel, exactly because the kernel did not have the USB suspend and persist feature available. Under the 2.6.21 kernel, during suspend, the kernel is totally unaware of what is happening to the USB device so nothing happens when the USB device is detached and reattached while the computer is suspended, hence making the described scenario above possible. I currently, and very frequently, use this feature on my live USB distro, FaunOS which uses kernel 2.6.21. Thank you, Raymano G. 
On 11/25/07, Alan Stern <[EMAIL PROTECTED]> wrote:
> On Sat, 24 Nov 2007, Andrew Morton wrote:
>
> > On Tue, 20 Nov 2007 17:04:32 -0700 "Raymano Garibaldi" <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Is there any other information that I can provide which might help in
> > > resolving this bug?
> >
> > Let's cc the USB developers.
> >
> > > On 11/18/07, Raymano Garibaldi <[EMAIL PROTECTED]> wrote:
> > > > The last time I tried this and it worked was 2.6.21. Below is a
>
> Sorry, that's not possible. 2.6.21 doesn't include USB Persist
> support. Nor does 2.6.22.
>
> There were some experimental patches with early versions of USB Persist
> for those kernels. They are different from what eventually went into
> 2.6.23.
>
> > > > On 11/18/07, Denys Vlasenko <[EMAIL PROTECTED]> wrote:
> > > > > On Sunday 18 November 2007 20:14, Raymano Garibaldi wrote:
> > > > > > In kernel 2.6.23.8 USB_PERSIST feature does not work if the same USB
> > > > > > device is detached and reattached while computer is suspended. The
> > > > > > mount points for the USB storage device mounted before suspend are
> > > > > > lost and the device has to be remounted after resume.
>
> USB Persist was never meant to allow you to detach and reattach a
> device while the computer is suspended; it was meant to deal with
> hibernation. So what you observed is the correct behavior, not a bug.
> Detaching and reattaching a device while the computer is suspended
> should result in exactly the same behavior as detaching and reattaching
> the device while the computer is awake.
>
> If you try doing the same thing but with the computer in hibernation
> instead of suspended, you may find it more in line with what you
> expect.
>
> Alan Stern
Re: [2.6 patch] make I/O schedulers non-modular
(cc's lovingly restored. Please do not do that)

On Mon, 26 Nov 2007 07:57:00 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:

> Jens Axboe wrote:
> > On Sun, Nov 25 2007, Adrian Bunk wrote:
> > > Is there any technical reason why we need 4 different schedulers at all?
> >
> > Until we have the perfect scheduler :-)
> >
> > With some hard work and testing, we should be able to get rid of 'as'.
> > It still beats cfq for some of the workloads that deadline is good at,
> > so not quite yet.
> >
> > > I have the gut feeling that the usual thing happens and people e.g. not
> > > report some cfq problems because as works for them...
> >
> > There's always a risk with "duplicate", like several drivers for the
> > same hardware. I'm not disputing that.
>
> Actually, both 'cfq' and 'as' are broken, and have been repeatedly reported
> as such. Deadline is the only one that currently looks sane, and seems like
> a good starting point for a more involved iosched. But keep in mind, the
> fact that 'cfq' and 'as' are broken may also point to a lower-level block-io
> problem. So, incrementally improving deadline may help discovering the
> problems both 'cfq' and 'as' are plagued with.
>

Sorry, but these are vague and unuseful assertions. Please send bug
reports, preferably with testcases which developers can use when fixing
the bugs.
Re: [2.6 patch] make I/O schedulers non-modular
Jens Axboe wrote:
> On Sun, Nov 25 2007, Adrian Bunk wrote:
> > Is there any technical reason why we need 4 different schedulers at all?
>
> Until we have the perfect scheduler :-)
>
> With some hard work and testing, we should be able to get rid of 'as'.
> It still beats cfq for some of the workloads that deadline is good at,
> so not quite yet.
>
> > I have the gut feeling that the usual thing happens and people e.g. not
> > report some cfq problems because as works for them...
>
> There's always a risk with "duplicate", like several drivers for the
> same hardware. I'm not disputing that.

Actually, both 'cfq' and 'as' are broken, and have been repeatedly reported
as such. Deadline is the only one that currently looks sane, and seems like
a good starting point for a more involved iosched. But keep in mind, the
fact that 'cfq' and 'as' are broken may also point to a lower-level block-io
problem. So, incrementally improving deadline may help discovering the
problems both 'cfq' and 'as' are plagued with.

Thanks!

--
Al
[Patch 4/4] sched: Improve fairness of cpu bandwidth allocation for task groups
The current load balancing scheme isn't good for group fairness.

For ex: on a 8-cpu system, I created 3 groups as under:

	a = 8 tasks (cpu.shares = 1024)
	b = 4 tasks (cpu.shares = 1024)
	c = 3 tasks (cpu.shares = 1024)

a, b and c are task groups that have equal weight. We would expect each
of the groups to receive 33.33% of cpu bandwidth under a fair scheduler.

This is what I get with the latest scheduler git tree:

Col1 | Col2    | Col3  | Col4
-----|---------|-------|-------------------------------------------------
a    | 277.676 | 57.8% | 54.1% 54.1% 54.1% 54.2% 56.7% 62.2% 62.8% 64.5%
b    | 116.108 | 24.2% | 47.4% 48.1% 48.7% 49.3%
c    |  86.326 | 18.0% | 47.5% 47.9% 48.5%

Explanation of o/p:

Col1 -> Group name
Col2 -> Cumulative execution time (in seconds) received by all tasks of
	that group in a 60sec window across 8 cpus
Col3 -> CPU bandwidth received by the group in the 60sec window, expressed
	in percentage. Col3 data is derived as:
		Col3 = 100 * Col2 / (NR_CPUS * 60)
Col4 -> CPU bandwidth received by each individual task of the group.
		Col4 = 100 * cpu_time_recd_by_task / 60

[I can share the test case that produces a similar o/p if reqd]

The deviation from desired group fairness is as below:

	a = +24.47%
	b = -9.13%
	c = -15.33%

which is quite high.

After the patch below is applied, here are the results:

Col1 | Col2    | Col3  | Col4
-----|---------|-------|-------------------------------------------------
a    | 163.112 | 34.0% | 33.2% 33.4% 33.5% 33.5% 33.7% 34.4% 34.8% 35.3%
b    | 156.220 | 32.5% | 63.3% 64.5% 66.1% 66.5%
c    | 160.653 | 33.5% | 85.8% 90.6% 91.4%

Deviation from desired group fairness is as below:

	a = +0.67%
	b = -0.83%
	c = +0.17%

which is far better IMO. Most of other runs have yielded a deviation within
+-2% at the most, which is good.

Why do we see bad (group) fairness with current scheduler?
==========================================================

Currently cpu's weight is just the summation of individual task weights.
This can yield incorrect results. For ex: consider three groups as below
on a 2-cpu system:

	CPU0	CPU1
	---------------
	A (10)	B(5)
		C(5)
	---------------

Group A has 10 tasks, all on CPU0, Group B and C have 5 tasks each all
of which are on CPU1. Each task has the same weight (NICE_0_LOAD =
1024).

The current scheme would yield a cpu weight of 10240 (10*1024) for each cpu
and the load balancer will think both CPUs are perfectly balanced and won't
move around any tasks. This, however, would yield this bandwidth:

	A = 50%
	B = 25%
	C = 25%

which is not the desired result.

What's changing in the patch?
=============================

	- How cpu weights are calculated when CONFIG_FAIR_GROUP_SCHED is
	  defined (see below)
	- API Change
		- Two tunables introduced in sysfs (under SCHED_DEBUG) to
		  control the frequency at which the load balance monitor
		  thread runs.

The basic change made in this patch is how cpu weight (rq->load.weight) is
calculated. It's now calculated as the summation of group weights on a cpu,
rather than summation of task weights. Weight exerted by a group on a
cpu is dependent on the shares allocated to it and also the number of
tasks the group has on that cpu compared to the total number of
(runnable) tasks the group has in the system.

Let,
	W(K,i)  = Weight of group K on cpu i
	T(K,i)  = Task load present in group K's cfs_rq on cpu i
	T(K)    = Total task load of group K across various cpus
	S(K)    = Shares allocated to group K
	NRCPUS  = Number of online cpus in the scheduler domain to
		  which group K is assigned.

Then,
	W(K,i) = S(K) * NRCPUS * T(K,i) / T(K)

A load balance monitor thread is created at bootup, which periodically
runs and adjusts group's weight on each cpu. To avoid its overhead, two
min/max tunables are introduced (under SCHED_DEBUG) to control the rate at
which it runs.

Signed-off-by: Srivatsa Vaddagiri <[EMAIL PROTECTED]>

---
 include/linux/sched.h |    4
 kernel/sched.c        |  265 --
 kernel/sched_fair.c   |   86 ++--
 kernel/sysctl.c       |   18 +++
 4 files changed, 334 insertions(+), 39 deletions(-)

Index: current/include/linux/sched.h
===================================================================
---
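The two formulas in the description above (Col3 and W(K,i)) can be tried out in userspace. What follows is only a sketch under stated assumptions, not code from the patch: the function names group_bandwidth_pct() and group_cpu_weight() are made up for illustration, and the zero-load guard is an assumption about how an idle group might be handled.

```c
#include <stddef.h>

/*
 * Col3 = 100 * Col2 / (NR_CPUS * window): share of total cpu bandwidth a
 * group received. Hypothetical helper, not part of the patch.
 */
double group_bandwidth_pct(double exec_seconds, unsigned int nr_cpus,
			   double window_seconds)
{
	return 100.0 * exec_seconds / (nr_cpus * window_seconds);
}

/*
 * W(K,i) = S(K) * NRCPUS * T(K,i) / T(K): weight of group K on cpu i.
 * Hypothetical helper; returning 0 for a group with no runnable load is
 * an assumption made here to avoid dividing by zero.
 */
unsigned long group_cpu_weight(unsigned long shares, unsigned int nr_cpus,
			       unsigned long task_load_on_cpu,
			       unsigned long task_load_total)
{
	unsigned long long w;

	if (task_load_total == 0)
		return 0;

	/* widen before multiplying so the product cannot overflow 32 bits */
	w = (unsigned long long)shares * nr_cpus * task_load_on_cpu;
	return (unsigned long)(w / task_load_total);
}
```

Plugging in the 2-cpu example (A's 10 tasks on CPU0, B's and C's 5 tasks each on CPU1, shares = 1024 for every group), each group's weight on the cpu it runs on is 1024 * 2 * 1 = 2048. CPU0 then carries 2048 and CPU1 carries 4096, so the imbalance that the plain sum of task weights (10240 on both cpus) hides becomes visible to the load balancer.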
[Patch 3/4 v2] sched: change how cpu load is calculated
This patch changes how the cpu load exerted by fair_sched_class tasks
is calculated. Load exerted by fair_sched_class tasks on a cpu is now a
summation of the group weights, rather than summation of task weights.
Weight exerted by a group on a cpu is dependent on the shares allocated
to it.

This version of patch (v2 of Patch 3/4) has a minor impact on code size
(but should have no runtime/functional impact) for
!CONFIG_FAIR_GROUP_SCHED case, but the overall code, IMHO, is neater
compared to v1 of Patch 3/4 (because of fewer #ifdefs).

I prefer v2 of Patch 3/4.

Signed-off-by: Srivatsa Vaddagiri <[EMAIL PROTECTED]>

---
 kernel/sched.c      |   27 +++
 kernel/sched_fair.c |   31 +++
 kernel/sched_rt.c   |    2 ++
 3 files changed, 40 insertions(+), 20 deletions(-)

Index: current/kernel/sched.c
===================================================================
--- current.orig/kernel/sched.c
+++ current/kernel/sched.c
@@ -869,6 +869,16 @@
 		      struct rq_iterator *iterator);
 #endif
 
+static inline void inc_cpu_load(struct rq *rq, unsigned long load)
+{
+	update_load_add(&rq->load, load);
+}
+
+static inline void dec_cpu_load(struct rq *rq, unsigned long load)
+{
+	update_load_sub(&rq->load, load);
+}
+
 #include "sched_stats.h"
 #include "sched_idletask.c"
 #include "sched_fair.c"
@@ -879,26 +889,14 @@
 
 #define sched_class_highest (&rt_sched_class)
 
-static inline void inc_load(struct rq *rq, const struct task_struct *p)
-{
-	update_load_add(&rq->load, p->se.load.weight);
-}
-
-static inline void dec_load(struct rq *rq, const struct task_struct *p)
-{
-	update_load_sub(&rq->load, p->se.load.weight);
-}
-
 static void inc_nr_running(struct task_struct *p, struct rq *rq)
 {
 	rq->nr_running++;
-	inc_load(rq, p);
 }
 
 static void dec_nr_running(struct task_struct *p, struct rq *rq)
 {
 	rq->nr_running--;
-	dec_load(rq, p);
 }
 
 static void set_load_weight(struct task_struct *p)
@@ -4070,10 +4068,8 @@
 		goto out_unlock;
 	}
 	on_rq = p->se.on_rq;
-	if (on_rq) {
+	if (on_rq)
 		dequeue_task(rq, p, 0);
-		dec_load(rq, p);
-	}
 
 	p->static_prio = NICE_TO_PRIO(nice);
 	set_load_weight(p);
@@ -4083,7 +4079,6 @@
 
 	if (on_rq) {
 		enqueue_task(rq, p, 0);
-		inc_load(rq, p);
 		/*
 		 * If the task increased its priority or is running and
 		 * lowered its priority, then reschedule its CPU:
Index: current/kernel/sched_fair.c
===================================================================
--- current.orig/kernel/sched_fair.c
+++ current/kernel/sched_fair.c
@@ -755,15 +755,26 @@
 static void enqueue_task_fair(struct rq *rq, struct task_struct *p, int wakeup)
 {
 	struct cfs_rq *cfs_rq;
-	struct sched_entity *se = &p->se;
+	struct sched_entity *se = &p->se,
+				*topse = NULL;	/* Highest schedulable entity */
+	int incload = 1;
 
 	for_each_sched_entity(se) {
-		if (se->on_rq)
+		topse = se;
+		if (se->on_rq) {
+			incload = 0;
 			break;
+		}
 		cfs_rq = cfs_rq_of(se);
 		enqueue_entity(cfs_rq, se, wakeup);
 		wakeup = 1;
 	}
+	/* Increment cpu load if we just enqueued the first task of a group on
+	 * 'rq->cpu'. 'topse' represents the group to which task 'p' belongs
+	 * at the highest grouping level.
+	 */
+	if (incload)
+		inc_cpu_load(rq, topse->load.weight);
 }
 
 /*
@@ -774,16 +785,28 @@
 static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int sleep)
 {
 	struct cfs_rq *cfs_rq;
-	struct sched_entity *se = &p->se;
+	struct sched_entity *se = &p->se,
+				*topse = NULL;	/* Highest schedulable entity */
+	int decload = 1;
 
 	for_each_sched_entity(se) {
+		topse = se;
 		cfs_rq = cfs_rq_of(se);
 		dequeue_entity(cfs_rq, se, sleep);
 		/* Don't dequeue parent if it has other entities besides us */
-		if (cfs_rq->load.weight)
+		if (cfs_rq->load.weight) {
+			if (parent_entity(se))
+				decload = 0;
 			break;
+		}
 		sleep = 1;
 	}
+	/* Decrement cpu load if we just dequeued the last task of a group on
+	 * 'rq->cpu'. 'topse' represents the group to which task 'p' belongs
+	 * at the highest grouping level.
+	 */
+	if (decload)
+		dec_cpu_load(rq, topse->load.weight);
 }
 
 /*
Index: current/kernel/sched_rt.c
===================================================================
--- current.orig/kernel/sched_rt.c
+++ current/kernel/sched_rt.c
@@
Re: bonding sysfs output
On Sun, 25 Nov 2007 16:12:57 +0100 Wagner Ferenc <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Am I totally of the limit with the attached patch against
> drivers/net/bonding/bond_sysfs.c? I'd like to receive some comments,
> as I'm not a kernel developer.

Please always cc [EMAIL PROTECTED] on networking-related matters.

> I propose it as a fix for trailing NULs and spaces like eg.
>
> $ od -c /sys/class/net/bond0/bonding/slaves
> 000   e   t   h   -   l   e   f   t       e   t   h   -   r   i   g
> 020   h   t      \n  \0
> 025
>
> I'm afraid there're other problems with "++more++" handling, but let's
> not consider those just yet. Find the patch attached. The first
> hunks also renames buffer to buf, for consistency's shake.
>
> The original version had varying behaviour for Not Applicable cases.
> This patch also settles for empty files (not even a line feed) in
> those cases, but I'm not sure about the general policy on this matter.
>

hm, there are a lot of changes there. Were they all actually needed to
fix the one bug which you have described?

--- bond_sysfs.c.orig	2007-11-16 19:14:27.0 +0100
+++ bond_sysfs.c	2007-11-25 16:01:23.092973099 +0100
@@ -74,7 +74,7 @@
  * "show" function for the bond_masters attribute.
  * The class parameter is ignored.
  */
-static ssize_t bonding_show_bonds(struct class *cls, char *buffer)
+static ssize_t bonding_show_bonds(struct class *cls, char *buf)
 {
 	int res = 0;
 	struct bonding *bond;
@@ -86,14 +86,13 @@
 			/* not enough space for another interface name */
 			if ((PAGE_SIZE - res) > 10)
 				res = PAGE_SIZE - 10;
-			res += sprintf(buffer + res, "++more++");
+			res += sprintf(buf + res, "++more++ ");
 			break;
 		}
-		res += sprintf(buffer + res, "%s ",
+		res += sprintf(buf + res, "%s ",
 			       bond->dev->name);
 	}
-	res += sprintf(buffer + res, "\n");
-	res++;
+	if (res) buf[res-1] = '\n'; /* eat the leftover space */
 	up_read(&(bonding_rwsem));
 	return res;
 }
@@ -237,14 +236,13 @@
 			/* not enough space for another interface name */
 			if ((PAGE_SIZE - res) > 10)
 				res = PAGE_SIZE - 10;
-			res += sprintf(buf + res, "++more++");
+			res += sprintf(buf + res, "++more++ ");
 			break;
 		}
 		res += sprintf(buf + res, "%s ", slave->dev->name);
 	}
 	read_unlock_bh(&bond->lock);
-	res += sprintf(buf + res, "\n");
-	res++;
+	if (res) buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
@@ -401,7 +399,7 @@
 
 	return sprintf(buf, "%s %d\n",
 			bond_mode_tbl[bond->params.mode].modename,
-			bond->params.mode) + 1;
+			bond->params.mode);
 }
 
 static ssize_t bonding_store_mode(struct device *d,
@@ -452,17 +450,14 @@
 					   struct device_attribute *attr,
 					   char *buf)
 {
-	int count;
+	int count = 0;
 	struct bonding *bond = to_bond(d);
 
-	if ((bond->params.mode != BOND_MODE_XOR) &&
-	    (bond->params.mode != BOND_MODE_8023AD)) {
-		// Not Applicable
-		count = sprintf(buf, "NA\n") + 1;
-	} else {
+	if ((bond->params.mode == BOND_MODE_XOR) ||
+	    (bond->params.mode == BOND_MODE_8023AD)) {
 		count = sprintf(buf, "%s %d\n",
 			xmit_hashtype_tbl[bond->params.xmit_policy].modename,
-			bond->params.xmit_policy) + 1;
+			bond->params.xmit_policy);
 	}
 
 	return count;
@@ -522,7 +517,7 @@
 
 	return sprintf(buf, "%s %d\n",
 		       arp_validate_tbl[bond->params.arp_validate].modename,
-		       bond->params.arp_validate) + 1;
+		       bond->params.arp_validate);
 }
 
 static ssize_t bonding_store_arp_validate(struct device *d,
@@ -574,7 +569,7 @@
 {
 	struct bonding *bond = to_bond(d);
 
-	return sprintf(buf, "%d\n", bond->params.arp_interval) + 1;
+	return sprintf(buf, "%d\n", bond->params.arp_interval);
 }
 
 static ssize_t bonding_store_arp_interval(struct device *d,
@@ -671,10 +666,7 @@
 		res += sprintf(buf + res, "%u.%u.%u.%u ",
 			       NIPQUAD(bond->params.arp_targets[i]));
 	}
-	if (res)
-		res--;  /* eat the leftover space */
-	res += sprintf(buf + res, "\n");
-	res++;
+	if (res) buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
@@ -775,7 +767,7 @@
 {
 	struct bonding *bond = to_bond(d);
 
-	return
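The trailing NUL in the od output comes from adding 1 to a count that sprintf() already reports without the terminator. A small userspace sketch of the "eat the leftover space" approach is given here for reference; it is not the bonding code itself, and format_names() and its parameters are hypothetical names chosen for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: write the names separated by spaces and end the
 * line with '\n'. The return value is the number of bytes a reader should
 * see. snprintf() returns the length without the terminating NUL, so no
 * "+ 1" is applied anywhere.
 */
size_t format_names(char *buf, size_t size, const char *const *names,
		    size_t n)
{
	size_t res = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int w = snprintf(buf + res, size - res, "%s ", names[i]);

		if (w < 0 || (size_t)w >= size - res)
			break;	/* no room left: stop before truncating */
		res += (size_t)w;
	}
	if (res)
		buf[res - 1] = '\n';	/* eat the leftover space */
	return res;
}
```

With the two slave names from the report, the buffer holds "eth-left eth-right\n" and the returned length is 19, so neither the extra space nor the NUL reaches the reader.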
[Patch 3/4 v1] sched: change how cpu load is calculated
This patch changes how the cpu load exerted by fair_sched_class tasks
is calculated. Load exerted by fair_sched_class tasks on a cpu is now a
summation of the group weights, rather than summation of task weights.
Weight exerted by a group on a cpu is dependent on the shares allocated
to it.

This version of patch (v1 of Patch 3/4) has zero impact for
!CONFIG_FAIR_GROUP_SCHED case.

Signed-off-by: Srivatsa Vaddagiri <[EMAIL PROTECTED]>

---
 kernel/sched.c      |   38 ++
 kernel/sched_fair.c |   31 +++
 kernel/sched_rt.c   |    2 ++
 3 files changed, 59 insertions(+), 12 deletions(-)

Index: current/kernel/sched.c
===================================================================
--- current.orig/kernel/sched.c
+++ current/kernel/sched.c
@@ -869,15 +869,25 @@
 		      struct rq_iterator *iterator);
 #endif
 
-#include "sched_stats.h"
-#include "sched_idletask.c"
-#include "sched_fair.c"
-#include "sched_rt.c"
-#ifdef CONFIG_SCHED_DEBUG
-# include "sched_debug.c"
-#endif
+#ifdef CONFIG_FAIR_GROUP_SCHED
 
-#define sched_class_highest (&rt_sched_class)
+static inline void inc_cpu_load(struct rq *rq, unsigned long load)
+{
+	update_load_add(&rq->load, load);
+}
+
+static inline void dec_cpu_load(struct rq *rq, unsigned long load)
+{
+	update_load_sub(&rq->load, load);
+}
+
+static inline void inc_load(struct rq *rq, const struct task_struct *p) { }
+static inline void dec_load(struct rq *rq, const struct task_struct *p) { }
+
+#else	/* CONFIG_FAIR_GROUP_SCHED */
+
+static inline void inc_cpu_load(struct rq *rq, unsigned long load) { }
+static inline void dec_cpu_load(struct rq *rq, unsigned long load) { }
 
 static inline void inc_load(struct rq *rq, const struct task_struct *p)
 {
@@ -889,6 +899,18 @@
 	update_load_sub(&rq->load, p->se.load.weight);
 }
 
+#endif	/* CONFIG_FAIR_GROUP_SCHED */
+
+#include "sched_stats.h"
+#include "sched_idletask.c"
+#include "sched_fair.c"
+#include "sched_rt.c"
+#ifdef CONFIG_SCHED_DEBUG
+# include "sched_debug.c"
+#endif
+
+#define sched_class_highest (&rt_sched_class)
+
 static void inc_nr_running(struct task_struct *p, struct rq *rq)
 {
 	rq->nr_running++;
Index: current/kernel/sched_fair.c
===================================================================
--- current.orig/kernel/sched_fair.c
+++ current/kernel/sched_fair.c
@@ -755,15 +755,26 @@
 static void enqueue_task_fair(struct rq *rq, struct task_struct *p, int wakeup)
 {
 	struct cfs_rq *cfs_rq;
-	struct sched_entity *se = &p->se;
+	struct sched_entity *se = &p->se,
+				*topse = NULL;	/* Highest schedulable entity */
+	int incload = 1;
 
 	for_each_sched_entity(se) {
-		if (se->on_rq)
+		topse = se;
+		if (se->on_rq) {
+			incload = 0;
 			break;
+		}
 		cfs_rq = cfs_rq_of(se);
 		enqueue_entity(cfs_rq, se, wakeup);
 		wakeup = 1;
 	}
+	/* Increment cpu load if we just enqueued the first task of a group on
+	 * 'rq->cpu'. 'topse' represents the group to which task 'p' belongs
+	 * at the highest grouping level.
+	 */
+	if (incload)
+		inc_cpu_load(rq, topse->load.weight);
 }
 
 /*
@@ -774,16 +785,28 @@
 static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int sleep)
 {
 	struct cfs_rq *cfs_rq;
-	struct sched_entity *se = &p->se;
+	struct sched_entity *se = &p->se,
+				*topse = NULL;	/* Highest schedulable entity */
+	int decload = 1;
 
 	for_each_sched_entity(se) {
+		topse = se;
 		cfs_rq = cfs_rq_of(se);
 		dequeue_entity(cfs_rq, se, sleep);
 		/* Don't dequeue parent if it has other entities besides us */
-		if (cfs_rq->load.weight)
+		if (cfs_rq->load.weight) {
+			if (parent_entity(se))
+				decload = 0;
 			break;
+		}
 		sleep = 1;
 	}
+	/* Decrement cpu load if we just dequeued the last task of a group on
+	 * 'rq->cpu'. 'topse' represents the group to which task 'p' belongs
+	 * at the highest grouping level.
+	 */
+	if (decload)
+		dec_cpu_load(rq, topse->load.weight);
 }
 
 /*
Index: current/kernel/sched_rt.c
===================================================================
--- current.orig/kernel/sched_rt.c
+++ current/kernel/sched_rt.c
@@ -31,6 +31,7 @@
 
 	list_add_tail(&p->run_list, array->queue + p->prio);
 	__set_bit(p->prio, array->bitmap);
+	inc_cpu_load(rq, p->se.load.weight);
 }
 
 /*
@@ -45,6 +46,7 @@
 	list_del(&p->run_list);
 	if (list_empty(array->queue + p->prio))
 		__clear_bit(p->prio, array->bitmap);
+	dec_cpu_load(rq, p->se.load.weight);
 }
 
 /*
[PATCH 2/4] sched: minor fixes for group scheduler
Minor bug fixes for group scheduler:

- Use a mutex to serialize add/remove of task groups and also when
  changing shares of a task group. Use the same mutex when printing
  cfs_rq stats for various task groups.
- Use list_for_each_entry_rcu in for_each_leaf_cfs_rq macro (when
  walking task group list)

Signed-off-by: Srivatsa Vaddagiri <[EMAIL PROTECTED]>

---
 kernel/sched.c      |   33 +
 kernel/sched_fair.c |    4 +++-
 2 files changed, 28 insertions(+), 9 deletions(-)

Index: current/kernel/sched.c
===================================================================
--- current.orig/kernel/sched.c
+++ current/kernel/sched.c
@@ -169,8 +169,6 @@ struct task_group {
 	/* runqueue "owned" by this group on each cpu */
 	struct cfs_rq **cfs_rq;
 	unsigned long shares;
-	/* spinlock to serialize modification to shares */
-	spinlock_t lock;
 	struct rcu_head rcu;
 };
 
@@ -182,6 +180,11 @@ static DEFINE_PER_CPU(struct cfs_rq, ini
 static struct sched_entity *init_sched_entity_p[NR_CPUS];
 static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
 
+/* task_group_mutex serializes add/remove of task groups and also changes to
+ * a task group's cpu shares.
+ */
+static DEFINE_MUTEX(task_group_mutex);
+
 /* Default task group.
  *	Every task in system belong to this group at bootup.
  */
@@ -222,9 +225,21 @@ static inline void set_task_cfs_rq(struc
 	p->se.parent = task_group(p)->se[cpu];
 }
 
+static inline void lock_task_group_list(void)
+{
+	mutex_lock(&task_group_mutex);
+}
+
+static inline void unlock_task_group_list(void)
+{
+	mutex_unlock(&task_group_mutex);
+}
+
 #else
 
 static inline void set_task_cfs_rq(struct task_struct *p, unsigned int cpu) { }
+static inline void lock_task_group_list(void) { }
+static inline void unlock_task_group_list(void) { }
 
 #endif	/* CONFIG_FAIR_GROUP_SCHED */
 
@@ -6747,7 +6762,6 @@ void __init sched_init(void)
 			se->parent = NULL;
 		}
 		init_task_group.shares = init_task_group_load;
-		spin_lock_init(&init_task_group.lock);
 #endif
 
 		for (j = 0; j < CPU_LOAD_IDX_MAX; j++)
@@ -6987,14 +7001,15 @@ struct task_group *sched_create_group(vo
 		se->parent = NULL;
 	}
 
+	tg->shares = NICE_0_LOAD;
+
+	lock_task_group_list();
 	for_each_possible_cpu(i) {
 		rq = cpu_rq(i);
 		cfs_rq = tg->cfs_rq[i];
 		list_add_rcu(&cfs_rq->leaf_cfs_rq_list, &rq->leaf_cfs_rq_list);
 	}
-
-	tg->shares = NICE_0_LOAD;
-	spin_lock_init(&tg->lock);
+	unlock_task_group_list();
 
 	return tg;
 
@@ -7040,10 +7055,12 @@ void sched_destroy_group(struct task_gro
 	struct cfs_rq *cfs_rq = NULL;
 	int i;
 
+	lock_task_group_list();
 	for_each_possible_cpu(i) {
 		cfs_rq = tg->cfs_rq[i];
 		list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
 	}
+	unlock_task_group_list();
 
 	BUG_ON(!cfs_rq);
 
@@ -7117,7 +7134,7 @@ int sched_group_set_shares(struct task_g
 {
 	int i;
 
-	spin_lock(&tg->lock);
+	lock_task_group_list();
 	if (tg->shares == shares)
 		goto done;
 
@@ -7126,7 +7143,7 @@ int sched_group_set_shares(struct task_g
 		set_se_shares(tg->se[i], shares);
 
 done:
-	spin_unlock(&tg->lock);
+	unlock_task_group_list();
 	return 0;
 }
 
Index: current/kernel/sched_fair.c
===================================================================
--- current.orig/kernel/sched_fair.c
+++ current/kernel/sched_fair.c
@@ -685,7 +685,7 @@ static inline struct cfs_rq *cpu_cfs_rq(
 
 /* Iterate thr' all leaf cfs_rq's on a runqueue */
 #define for_each_leaf_cfs_rq(rq, cfs_rq) \
-	list_for_each_entry(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
+	list_for_each_entry_rcu(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
 
 /* Do the two (enqueued) entities belong to the same group ? */
 static inline int
@@ -1126,7 +1126,9 @@ static void print_cfs_stats(struct seq_f
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	print_cfs_rq(m, cpu, &cpu_rq(cpu)->cfs);
 #endif
+	lock_task_group_list();
 	for_each_leaf_cfs_rq(cpu_rq(cpu), cfs_rq)
 		print_cfs_rq(m, cpu, cfs_rq);
+	unlock_task_group_list();
 }
 #endif

--
Regards,
vatsa
[PATCH 1/4] sched: code cleanup
Minor cleanups:

- Fix coding style
- remove obsolete comment

Signed-off-by: Srivatsa Vaddagiri <[EMAIL PROTECTED]>

---
 kernel/sched.c |   21 +++--
 1 files changed, 3 insertions(+), 18 deletions(-)

Index: current/kernel/sched.c
===================================================================
--- current.orig/kernel/sched.c
+++ current/kernel/sched.c
@@ -191,12 +191,12 @@ struct task_group init_task_group = {
 };
 
 #ifdef CONFIG_FAIR_USER_SCHED
-# define INIT_TASK_GRP_LOAD	2*NICE_0_LOAD
+# define INIT_TASK_GROUP_LOAD	2*NICE_0_LOAD
 #else
-# define INIT_TASK_GRP_LOAD	NICE_0_LOAD
+# define INIT_TASK_GROUP_LOAD	NICE_0_LOAD
 #endif
 
-static int init_task_group_load = INIT_TASK_GRP_LOAD;
+static int init_task_group_load = INIT_TASK_GROUP_LOAD;
 
 /* return group to which a task belongs */
 static inline struct task_group *task_group(struct task_struct *p)
@@ -864,21 +864,6 @@ iter_move_one_task(struct rq *this_rq, i
 
 #define sched_class_highest (&rt_sched_class)
 
-/*
- * Update delta_exec, delta_fair fields for rq.
- *
- * delta_fair clock advances at a rate inversely proportional to
- * total load (rq->load.weight) on the runqueue, while
- * delta_exec advances at the same rate as wall-clock (provided
- * cpu is not idle).
- *
- * delta_exec / delta_fair is a measure of the (smoothened) load on this
- * runqueue over any given interval. This (smoothened) load is used
- * during load balance.
- *
- * This function is called /before/ updating rq->load
- * and when switching tasks.
- */
 static inline void inc_load(struct rq *rq, const struct task_struct *p)
 {
 	update_load_add(&rq->load, p->se.load.weight);

--
Regards,
vatsa
[PATCH 0/4] sched: group scheduler related patches (V3)
Here's V3 of the group scheduler related patches, which is mainly addressing
improved fairness of cpu bandwidth allocation for task groups.

Patch 1/4      -> coding style cleanup
Patch 2/4      -> Minor group scheduling related bug fixes
Patch 3/4 (v1) -> Modifies how cpu load is calculated, such that there is zero
		  impact on !CONFIG_FAIR_GROUP_SCHED
Patch 3/4 (v2) -> Modifies how cpu load is calculated, such that there is a
		  small impact on code size (but should have NO impact on
		  functionality or runtime behavior) for
		  !CONFIG_FAIR_GROUP_SCHED case. The resulting code however
		  is much neater since it avoids some #ifdefs.
		  I prefer v2.
Patch 4/4      -> Updates load balance logic to provide improved fairness for
		  task groups.

To have zero impact on !CONFIG_FAIR_GROUP_SCHED case, please apply the
following patches:

	- Patch 1/4
	- Patch 2/4
	- Patch 3/4 (v1)
	- Patch 4/4

I personally prefer v2 of Patch 3/4. Even though it has a minor impact on code
size for !CONFIG_FAIR_GROUP_SCHED case, the overall code is much neater IMHO.

Impact on sched.o size:
=======================

!CONFIG_FAIR_GROUP_SCHED:

   text	   data	    bss	    dec	    hex	filename
  36829	   2766	     48	  39643	   9adb	sched.o-before-nofgs
  36829	   2766	     48	  39643	   9adb	sched.o-after-v1-nofgs (v1 of Patch 3/4)
  36843	   2766	     48	  39657	   9ae9	sched.o-after-v2-nofgs (v2 of Patch 3/4)

CONFIG_FAIR_GROUP_SCHED:

   text	   data	    bss	    dec	    hex	filename
  39019	   3346	    336	  42701	   a6cd	sched.o-before-fgs
  40303	   3482	    308	  44093	   ac3d	sched.o-after-v1-fgs (v1 of Patch 3/4)
  40303	   3482	    308	  44093	   ac3d	sched.o-after-v2-fgs (v2 of Patch 3/4)

Changes since V2 of this patchset [1]:

- Split the patches better and make them pass under checkpatch.pl script
- Fixed compile issues under different config options and also a suspend
  failure (as posted by Ingo at [2])
- Make load_balance_monitor thread run as real-time task, so that its
  execution is not limited by shares allocated to default task group
  (init_task_group).
- Reduced minimum shares that can be allocated to a group to 1 (from 100).
  Would be useful if someone wants a task group to get very low bandwidth
  or get bandwidth only when other groups are idle.
- Removed the tg->last_total_load check in rebalance_shares()
  (which was incorrect in V2)

Changes since V1 of this patchset [3]:

- Introduced a task_group_mutex to serialize add/removal of task groups (as
  pointed by Dipankar)

Please apply if there are no major concerns.

References:

1. http://marc.info/?l=linux-kernel&m=119549585223262
2. http://lkml.org/lkml/2007/11/19/127
3. http://marc.info/?l=linux-kernel&m=119547452517055

--
Regards,
vatsa
Re: Question regarding naming scheme (HP Jornada 6XX/7XX)
On Mon, Nov 26, 2007 at 12:03:29AM +0100, Kristoffer Ericson wrote:
> For instance an hp 620 user thought that their system was unsupported
> because everything was for '680'. Or the other way round 728 users
> didn't want to use 720 since they thought they would loose their extra
> ram (only difference between versions).
>
How exactly is changing from 6XX to 600 going to change this? If users
are confused, then you should be documenting this distinction better and
working on clearing up the confusion. I'm all for making things obvious
to the end user, but there gets to be a point where it just becomes
silly.

> Why I want to use 600-series/700-series instead of 6XX/7XX is simply
> because 600-series/700-series leaves no doubt.
>
Apparently your end users are more technically apt than I am, as I have
no idea how using 00 over XX makes things any less ambiguous.

We already have a 6xx mach-type that drivers can set their dependency
on. If it's not 680-only, then that's a perfectly reasonable dependency.
Feel free to change the Kconfig text to make the description more
useful, but please don't start idly shuffling around code and symbols
because users can't work out why a driver is available that they can't
support. Besides, the kernel frowns upon recursion, and all you need is
to find two equally confused users with differing viewpoints to hit
imminent death (whether self-inflicted or otherwise).
RE: [PATCH 1/2] msi: set 'En' bit of MSI Mapping Capability on HT platform
I think the following lines are suitable for other bridges besides nvidia's, :) :
===
+	if (pci_enable_msi_ht_cap(dev) != 0) {
+		return 0;
+	} else {
+		/* Get upstream bridge device handle */
+
+		bridge_dev = dev->bus->self;
+		while(bridge_dev != 0) {
+			if (pci_enable_msi_ht_cap(bridge_dev) != 0) {
+				return 0;
+			} else
+				bridge_dev = bridge_dev->bus->self;
+		}
+
+		return 1;
+	}

BRs
Peer Chen

-----Original Message-----
From: Robert Hancock [mailto:[EMAIL PROTECTED]
Sent: Monday, November 26, 2007 2:34 AM
To: peerchen
Cc: linux-kernel; akpm; Peer Chen; Andy Currid
Subject: Re: [PATCH 1/2] msi: set 'En' bit of MSI Mapping Capability on HT platform

peerchen wrote:
> According to the HyperTransport spec, 'En' indicate if the MSI Mapping is active. So it should be set when enable the MSI.
>
> The patch base on kernel 2.6.24-rc3
>
> Signed-off-by: Andy Currid <[EMAIL PROTECTED]>
> Signed-off-by: Peer Chen <[EMAIL PROTECTED]>

Isn't there a way we can make this work for any upstream HT bridge,
rather than only for specific NVIDIA chipsets?

--
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
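The upstream walk in the snippet above can be modelled outside the kernel to check its termination and return convention. This is a sketch only: struct dev, struct bus, the two flag fields and enable_msi_ht_cap() are hypothetical stand-ins rather than the real PCI structures or pci_enable_msi_ht_cap(), and the stand-in is assumed to return non-zero when the device has the capability and it was enabled.

```c
#include <stddef.h>

struct bus;

/* Hypothetical stand-in for a PCI device */
struct dev {
	struct bus *bus;	/* bus the device sits on */
	int has_ht_msi_cap;	/* device exposes an HT MSI mapping capability */
	int ht_msi_enabled;	/* set once the 'En' bit has been turned on */
};

/* Hypothetical stand-in for a PCI bus */
struct bus {
	struct dev *self;	/* bridge leading to this bus, NULL at the root */
};

/* Assumed convention: non-zero means the capability was found and enabled */
static int enable_msi_ht_cap(struct dev *d)
{
	if (!d->has_ht_msi_cap)
		return 0;
	d->ht_msi_enabled = 1;
	return 1;
}

/*
 * Try the device itself, then each bridge on the way to the root.
 * Returns 0 on success and 1 if nothing on the path has the capability,
 * mirroring the return values in the snippet.
 */
int enable_ht_msi_mapping(struct dev *dev)
{
	struct dev *d;

	for (d = dev; d != NULL; d = d->bus ? d->bus->self : NULL) {
		if (enable_msi_ht_cap(d))
			return 0;
	}
	return 1;
}
```

The loop stops at the first device or bridge that accepts the setting, so only the nearest upstream HT bridge is touched, and it ends at the root bus where self is NULL.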
Re: [RFC/PATCH] Export force_sig_info
Hi Andrew,

> Perhaps export it from within a powerpc-specific C file (along with
> suitable comment) to prevent people from generally relying upon the
> export?

Even better, I'll export it from a Cell-specific C file. I'll follow this up in my own spufs series for .25.

Cheers,

Jeremy
Re: [PATCH] net/irda/parameters.c: Trivial fixes
Samuel Ortiz wrote:
> Hi Richard,
>
> On Sat, Nov 24, 2007 at 09:44:05PM +0100, Richard Knutsson wrote:
>> Make a single va_start() -> va_end() path + fixing:
>
> Ok, this should be 2 separate patches then.

Thought about it, but they were so simple, I believed they would better be merged...

> The warning fixes are all good, but I fail to see the point of the va_end()
> one. That doesn't seem to bring any sort of improvement while adding one
> variable to the stack and one loop test. Any explanation here ?

Not really. Many seem to like a single return and since this made it one va_end() to every va_start(), I thought it would be appropriate. But if not, then I will only filter this hit out from the va_start()->va_end()-testing and get going.

> I'll push the warning fix for now, thanks.

Alright, thank you.

> Cheers,
> Samuel.

cu
Richard Knutsson
Re: [RFC/PATCH] SO_NO_CHECK for IPv6
David Schwartz <[EMAIL PROTECTED]> wrote:
>
> Exactly. But *he* doesn't need to check that checksum, given that he already
> got the packet, since he has an upper-level checksum. He is not saying that
> his reasoning applies to everyone, just that it applies to him. He is not
> talking about disabling the send checksum, but the receive checksum. He
> knows that he does not need it.

You must be in some other thread because this one started with a patch to disable sender checksums.

Oh and please do keep CCs on this list.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
irq on nforce4 and realtek RTL8168B
Hello,

I have an ASUS A6T with nforce4 and realtek RTL8111/8168B ethernet. I am testing kernel 2.6.24-rc3-git1 on this notebook and I have noticed strange behaviour with irq.

I pass hpet=force and acpi_use_timer_override to enable apic, otherwise timer cpu interrupt is in old XT-PIC mode. Unfortunately, irq balancing on turion X2 doesn't work very well and there are extra timer interrupts.

Realtek RTL8168B shares irq 17 with the nvidia 7600 go card, which is not very good: in fact, if I don't use the pci=nomsi option, Realtek RTL8168B is up but doesn't transmit any packet.

I have the following interrupt situation without pci=nomsi:

CPU0 CPU1
0: 57 29452 IO-APIC-edge timer
1: 0331 IO-APIC-edge i8042
7: 1 0 IO-APIC-edge
8: 0 2 IO-APIC-edge rtc
9:180193 IO-APIC-fasteoi acpi
12: 8291133 IO-APIC-edge i8042
14: 2 2810 IO-APIC-edge libata
15: 5186 2646 IO-APIC-edge libata
17: 0683 IO-APIC-fasteoi nvidia
18: 0 2 IO-APIC-fasteoi ohci1394
19: 1 45 IO-APIC-fasteoi ohci_hcd:usb1
20: 0 0 IO-APIC-fasteoi sdhci:slot0
21: 0292 IO-APIC-fasteoi HDA Intel
221: 0 0 PCI-MSI-edge eth1
NMI: 0 0 Non-maskable interrupts
LOC: 29452 8 Local timer interrupts
RES: 2799 4491 Rescheduling interrupts
CAL:128102 function call interrupts
TLB:354247 TLB shootdowns
TRM: 0 0 Thermal event interrupts
SPU: 0 0 Spurious interrupts
ERR: 1
MIS: 0

Besides, these irq problems seem to break lapic with no_hz. I also can't get suspend-to-memory working because of an irq fault; the notebook doesn't reboot after suspending to memory.

Unfortunately, the bios is very buggy and I believe ASUS has to behave better with linux users. I invite the ASUS, AMD, NVIDIA and REALTEK manufacturers to offer better support for linux, and not to violate the ACPI standard.

I wish to be personally CC'ed the answers/comments posted to the list in response to my posting.

Thanks
Best Regards
Francesco
Re: [PATCH] sata_nv: don't use legacy DMA in ADMA mode (v3)
Robert Hancock wrote:
> We need to run any DMA command with result taskfile requested in ADMA mode
> when the port is in ADMA mode, otherwise it may try to use the legacy DMA
> engine in ADMA mode which is not allowed. Enforce this with BUG_ON() since
> data corruption could potentially result if this happened. Also, fail any
> attempt to try and issue NCQ commands with result taskfile requested, since
> the hardware doesn't allow this.
>
> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

Acked-by: Tejun Heo <[EMAIL PROTECTED]>

Thanks.

--
tejun
Re: [PATCH] net/irda/parameters.c: Trivial fixes
Hi Richard, On Sat, Nov 24, 2007 at 09:44:05PM +0100, Richard Knutsson wrote: > Make a single va_start() -> va_end() path + fixing: Ok, this should be 2 separate patches then. The warning fixes are all good, but I fail to see the point of the va_end() one. That doesn't seem to bring any sort of improvement while adding one variable to the stack and one loop test. Any explanation here ? I'll push the warning fix for now, thanks. Cheers, Samuel. > CHECK /home/kernel/src/net/irda/parameters.c > /home/kernel/src/net/irda/parameters.c:466:2: warning: Using plain integer as > NULL pointer > /home/kernel/src/net/irda/parameters.c:520:2: warning: Using plain integer as > NULL pointer > /home/kernel/src/net/irda/parameters.c:573:2: warning: Using plain integer as > NULL pointer > > Signed-off-by: Richard Knutsson <[EMAIL PROTECTED]> > --- > Compile-tested on i386 with allyesconfig and allmodconfig. > > > diff --git a/net/irda/parameters.c b/net/irda/parameters.c > index 2627dad..bf19071 100644 > --- a/net/irda/parameters.c > +++ b/net/irda/parameters.c > @@ -368,10 +368,11 @@ int irda_param_pack(__u8 *buf, char *fmt, ...) > va_list args; > char *p; > int n = 0; > + int retval = 0; > > va_start(args, fmt); > > - for (p = fmt; *p != '\0'; p++) { > + for (p = fmt; *p != '\0' && retval == 0; p++) { > switch (*p) { > case 'b': /* 8 bits unsigned byte */ > buf[n++] = (__u8)va_arg(args, int); > @@ -392,13 +393,12 @@ int irda_param_pack(__u8 *buf, char *fmt, ...) > break; > #endif > default: > - va_end(args); > - return -1; > + retval = -1; > } > } > va_end(args); > > - return 0; > + return retval; > } > EXPORT_SYMBOL(irda_param_pack); > > @@ -411,10 +411,11 @@ static int irda_param_unpack(__u8 *buf, char *fmt, ...) 
> va_list args; > char *p; > int n = 0; > + int retval = 0; > > va_start(args, fmt); > > - for (p = fmt; *p != '\0'; p++) { > + for (p = fmt; *p != '\0' && retval == 0; p++) { > switch (*p) { > case 'b': /* 8 bits byte */ > arg.ip = va_arg(args, __u32 *); > @@ -436,14 +437,13 @@ static int irda_param_unpack(__u8 *buf, char *fmt, ...) > break; > #endif > default: > - va_end(args); > - return -1; > + retval = -1; > } > > } > va_end(args); > > - return 0; > + return retval; > } > > /* > @@ -463,7 +463,7 @@ int irda_param_insert(void *self, __u8 pi, __u8 *buf, int > len, > int n = 0; > > IRDA_ASSERT(buf != NULL, return ret;); > - IRDA_ASSERT(info != 0, return ret;); > + IRDA_ASSERT(info != NULL, return ret;); > > pi_minor = pi & info->pi_mask; > pi_major = pi >> info->pi_major_offset; > @@ -517,7 +517,7 @@ static int irda_param_extract(void *self, __u8 *buf, int > len, > int n = 0; > > IRDA_ASSERT(buf != NULL, return ret;); > - IRDA_ASSERT(info != 0, return ret;); > + IRDA_ASSERT(info != NULL, return ret;); > > pi_minor = buf[n] & info->pi_mask; > pi_major = buf[n] >> info->pi_major_offset; > @@ -570,7 +570,7 @@ int irda_param_extract_all(void *self, __u8 *buf, int len, > int n = 0; > > IRDA_ASSERT(buf != NULL, return ret;); > - IRDA_ASSERT(info != 0, return ret;); > + IRDA_ASSERT(info != NULL, return ret;); > > /* >* Parse all parameters. Each parameter must be at least two bytes - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/9]: Reduce Log I/O latency
Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c === --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2007-11-22 10:47:21.945395328 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2007-11-22 10:53:11.556186722 +1100 @@ -1443,6 +1443,8 @@ xlog_sync(xlog_t *log, XFS_BUF_ZEROFLAGS(bp); XFS_BUF_BUSY(bp); XFS_BUF_ASYNC(bp); + XFS_BUF_SET_LOGBUF(bp); + /* * Do an ordered write for the log block. * Its unnecessary to flush the first split block in the log wrap case. Whichever way you go with this one Dave you should probably add another XFS_BUF_SET_LOGBUF() call for the buffer split case further down in the same function. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 23/27] x86: debugctlmsr kconfig
On Sun, Nov 25, 2007 at 02:08:02PM -0800, Roland McGrath wrote:
>
> This adds the (internal) Kconfig macro CONFIG_X86_DEBUGCTLMSR,
> to be defined when configuring to support only hardware that
> definitely supports MSR_IA32_DEBUGCTLMSR with the BTF flag.
>
> The Intel documentation says "P6 family" and later processors all have it.
> I think the Kconfig dependencies are right to have it set for those and
> unset for others (i.e., when 586 and earlier are supported).

What about the non-Intel vendors ? Was this msr present on AMD K6 ? Geode? Winchip? VIA C3 ? If not, then this patch isn't complete.

Dave

--
http://www.codemonkey.org.uk
Re: "buggy cmd640" message followed by soft lockup
(Dropped Rafael from CC)

On Sunday 25 November 2007, Bartlomiej Zolnierkiewicz wrote:
> So either something went very very wrong or the oops itself is incorrect.
>
> Please put BUG() before the put_cmd640_reg() above so the next time
> BUG happens we will know which one is it.

I've spent quite a bit of time on this issue over the weekend and have seen all kinds of "interesting" behavior with various kernels with different debug patches, but no definite clues (except confirmation that on "good" boots no cmd64x hardware is detected).

At some point I scrapped the virtual machine I had been using and created a new one. Since then I've been unable to reproduce the problem.

I'm still quite confused by the issue exactly because it was so consistent when it _did_ happen and am still not sure if it can be blamed completely on the quirkiness of VirtualBox.

I'll keep testing new kernels in VirtualBox and will keep alert for the issue, but for now I think it's best to forget about it.

Bartlomiej: thanks for your feedback and suggestions.

Cheers,
FJP
Re: [RFC] Re: nozomi version 2.1d for review
Hi Frank

I was wondering if you had a git tree somewhere I could pull.

Thanks
Mike

On 11/11/2007, Frank Seidel <[EMAIL PROTECTED]> wrote:
> Hello,
>
> this one also holds the - little reworked and optimized -
> cleanup of the read/write_mem32 functions.
>
> Comments and any feedback is more than welcome.
>
> Thanks a lot - especially to Jiri, Alan and Greg,
> Frank
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Monday 26 November 2007 07:27:03 Roland Dreier wrote:
> > This patch allows to export symbols only for specific modules by
> > introducing symbol name spaces. A module name space has a white
> > list of modules that are allowed to import symbols for it; all others
> > can't use the symbols.
> >
> > It adds two new macros:
> >
> > MODULE_NAMESPACE_ALLOW(namespace, module);
>
> I definitely like the idea of organizing exported symbols into
> namespaces. However, I feel like it would make more sense to have
> something like
>
> MODULE_NAMESPACE_IMPORT(namespace);

Except C doesn't have namespaces and this mechanism doesn't create them. So this is just complete and utter makework; as I said before, no one's going to confuse all those udp_* functions if they're not in the udp namespace.

For better or worse, this is not C++.

Rusty.
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Monday 26 November 2007 07:29:39 Roland Dreier wrote:
> > Yes, and if a symbol is already used by multiple modules, it's
> > generically useful. And if so, why restrict it to in-tree modules?
>
> I agree that we shouldn't make things too hard for out-of-tree
> modules, but I disagree with your first statement: there clearly is a
> large class of symbols that are used by multiple modules but which are
> not generically useful -- they are only useful by a certain small class
> of modules.

If it is so clear, you should be able to easily provide examples?

Rusty.
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Saturday 24 November 2007 23:39:43 Andi Kleen wrote: > On Sat, Nov 24, 2007 at 03:53:34PM +1100, Rusty Russell wrote: > > So, you're saying that there's a problem with in-tree modules using > > symbols they shouldn't? Can you give an example? [ Note: no response to this ] > > If people aren't reviewing, this won't make them review. I don't think > > the > > With millions of LOC the primary maintainers cannot review everything. > It's not that anybody is doing a bad job -- it is just so much code > that explicit mechanisms are better than implicit contracts. > > > problem is that people are conniving to avoid review. > > No of course not -- it is just too much code to let everything > be reviewed by the core subsystem maintainers. But with explicit > marking of internal symbols they would need to look at it because > the relationship will be clearly spelled out in the code. No, a one-line patch adding the module to the set is all they'd see. There's no reason to think this will cause more review. > > > Several distributions have policies that require to > > > keep the changes to these exported interfaces minimal and that > > > is very hard with thousands of exported symbol. With name spaces > > > the number of truly publicly exported symbols will hopefully > > > shrink to a much smaller, more manageable set. > > > > *This* makes sense. But it's not clear that the burden should be placed > > on kernel coders. You can create a list yourself. How do I tell the > > difference between "truly publicly exported" symbols and others? > > Out of tree solutions generally do not scale. Nobody else can > keep up with 2+ Million changes each merge window. > > > If a symbol has more than one in-tree user, it's hard to argue against an > > There are still classes of drivers. e.g. for the SCSI example: SD,SG,SR > etc. are more internal while low level drivers like aic7xxx are clearly > external drivers. 
Then mark those symbols internal and only allow concurrently-built modules to access them. That's simpler and requires much less maintenance than your solution. > > out-of-tree module using the symbol, unless you're arguing against *all* > > out-of-tree modules. > > No, actually namespaces kind of help out of tree modules. Once they only > use interfaces that are really generic driver interfaces and fairly stable > their authors will have much less pain forward porting to newer kernel > version. But currently the authors cannot even know what is an instable > internal interface and what is a generic relatively stable driver level > interface. Namespaces are a mechanism to make this all explicit. So in your head you have a notion of a kernel API, and you're trying to make that API explicit in the code. Sorry, but no. Rusty. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: forcedeth ethernet driver & Low power state
On Sunday 25 November 2007 10:59, Jeroen wrote:
> On Nov 25, 2007 7:36 PM, Robert Hancock <[EMAIL PROTECTED]> wrote:
> > Are you sure forcedeth even supports that feature? I haven't seen any
> > code for it, and certainly it should never be enabled by default..
>
> The windows driver does. I have to disable it because otherwise I have
> lot's of connection speed troubles. This is also what i see when I use a
> linux distro on the server unfortunately I can't disable it.

You need to prepare a more extensive bug report, for starters. What are "connection speed troubles"? Which kernel version? Do you see any "interesting" messages in the kernel log? lspci output? ethtool output? etc...

--
vda
Re: [PATCH] iwlwifi: remove redundant declaration of 'iwl3945_priv' and 'iwl4965_priv' structs
On Sun, 2007-11-25 at 15:58 +0100, Miguel Botón wrote:
> This patch removes a redundant declaration of 'iwl3945_priv' and
> 'iwl4965_priv' structs.
>
> Signed-off-by: Miguel Boton <[EMAIL PROTECTED]>

ACK.

Thanks,
-yi
Re: [PATCH 23/27] x86: debugctlmsr kconfig
On Sunday 25 November 2007 14:08, Roland McGrath wrote:
> This adds the (internal) Kconfig macro CONFIG_X86_DEBUGCTLMSR,
> to be defined when configuring to support only hardware that
> definitely supports MSR_IA32_DEBUGCTLMSR with the BTF flag.
>
> The Intel documentation says "P6 family" and later processors all have it.
> I think the Kconfig dependencies are right to have it set for those and
> unset for others (i.e., when 586 and earlier are supported).
>
> +config X86_DEBUGCTLMSR
> +	bool
> +	depends on !(M586MMX || M586TSC || M586 || M486 || M386)
> +	default y

Why is it defined in configuration system instead of some *.h file?

--
vda
Re: [RFC] Documentation about unaligned memory access
> mc68020+ No No
> (mc68000/010 No 2) (not for Linux)

Actually ucLinux has been persuaded to run on m68000.
setting the init process's personality?
Hi,

Is there a simple way (via a kernel boot option or config setting or - if really necessary - a patch or something like that) to set the personality for the init process?

I'm running an x86_64 kernel on a system whose userland is almost entirely 32-bits (but needs an occasional 64-bit process to be run, hence the choice of kernel), and I'd like `uname -m` to be i686 unless I take special action. So I think that means letting init (which is indeed a 32-bit process) have the PER_LINUX32 personality (in case I'm wrong about this, the output of uname -m is essentially what matters to me).

So, where does the default come from?

--
David A. Madore
([EMAIL PROTECTED], http://www.madore.org/~david/ )
Re: [RFC][PATCH] Update REPORTING-BUGS
On Monday, 26 of November 2007, Adrian Bunk wrote: > On Mon, Nov 26, 2007 at 01:04:25AM +0100, Rafael J. Wysocki wrote: > > On Monday, 26 of November 2007, Adrian Bunk wrote: > > > On Mon, Nov 26, 2007 at 12:00:28AM +0100, Rafael J. Wysocki wrote: > > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > >... > > > > > I don't care whether that's done with Bugzilla, some email based bug > > > > > tracker like the Debian bug tracker, someone putting emails manually > > > > > into some bug tracker like you are doing, or whatever else. > > > > > > > > That last solution doesn't scale very well ... > > > > > > > > How about using the system in which it's possible to report bugs using > > > > both > > > > email and a web interface? > > > > > > > > We can request that the address of the bug tracker be added to the Cc > > > > lists of > > > > bug reports sent by email and we can make it resend reports filed with > > > > it to > > > > the appropriate mailing lists and with the appropriate email headers. > > > > This is > > > > technically doable. > > > > > > You are trying to solve something that is not a problem. > > > > It _is_ a problem, because many bug are reported using email and not really > > tracked. The ones that I manually put into the Bugzilla are the tip of the > > iceberg (and BTW I'd prefer not to have to do that manually). > > > > Every bug reported by email and not responded to by the right people, that > > is > > not a recent regression, is currently lost. I'd like to avoid that, if > > possible. > > This is solved by many other projects by asking the submitter to open a > bug for the issue when he sends it in an email. > > The submitter then simply copies the information from his email to his > newly opened bug in the bug tracker. > > -> no problem > > > > It does not matter which medium we choose for getting bug reports. > > > > [Well, you said that we should use a web interface for that. 
;-)] > > I said a web interface is not worse than via email. > And it's enough. > > (And I e.g. wouldn't oppose using the Debian bug tracker where the web > interface only allows reading and everything has to be done via email > if all kernel maintainers would agree to use this.) > > > No, it doesn't, as long as the bug reports reach the right place. Now, the > > question is what's that. > > > > IMO, ideally, for each subsystem there should be a mailing list to send bug > > reports to. The Bugzilla should forward the reports to these lists. On > > every > > such list there should be (at least) one person responsible for responding > > to > > the bug reports, if no one else responds first, and for forwarding the > > reports > > to the appropriate developers. This person should also be responsible for > > monitoring the status of each bug report sent to his/her list. > > After all discussions about crazy bug tracker features we are back at > the real problem: We started to discuss them, because you argued that the Bugzilla in its current shape was sufficient, which I didn't agree with and tried to give some arguments. > Where do we find the tree these people grow on? That's a good question, but either we find these people, or we'll start losing users at growing rates. I'm afraid that's already happening ... > > _Every_ bug report sent (including invalid ones) should be recorded in a bug > > tracking system (be it the Bugzilla or whatever else) along with all of it's > > history (at least, refernces to the bug's history should be stored), no > > matter > > how it's been handled. Moreover, a bug can only be resolved as "fixed" if > > there's a pointer to the exact commit fixing it in the bug's history. > > And back we are at crazy bug tracker features... No, they are not bug tracker features, but parts of a process that I think we should have in place. > > > The only thing that matters is that we get bug reports resolved within a > > > reasonable amount of time. 
> > > > I'm not sure if that's generally possible: > > - What about the bugs that take 2 weeks or more to reproduce? > > - What about the bugs that we _don't_ _know_ how to fix? > > We will never get 100% of all bugs fixed. > > Let's get back to the fact that we have many bug reports that could be > fixed within a reasonable amount of time but are not. Do you have specific examples? Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: ipmi_watchdog can not reset the kernel panic machine
The watchdog is "off" by default, meaning that you have to have something actually start resetting the watchdog before it will start running. That's why you are seeing this behavior. There is a start_now option that will start the watchdog when it is loaded, but then it will reset the system unless something resets the watchdog periodically, and you have a limited time to start this operation. On a panic, the IPMI driver attempts to preserve the state of the watchdog and (if running) increase the timeout time to allow a kdump or something like that to occur. That's the purpose of the code you reference. It is not to start a reset operation on any panic. It used to start a reset on every panic, but that cause problems for many users. -corey Andrew Morton wrote: (cc's added) On Fri, 23 Nov 2007 20:28:41 -0800 (PST) [EMAIL PROTECTED] wrote: Build kernel-2.6.24-rc3. pmi_watchdog can not reset the kernel panic machine. The watchdog can never to record panic information to IPMI SEL. 1. I disable auto reset when kernel panic by echo "0" > /proc/sys/kernel/panic 2. modprobe ipmi_watchdog timeout=120 action=reset 3. Load a driver, the driver will call panic() when ioctl to call into the driver. 4. By ioctl call into the driver, panic the system. in wdog_panic_handler, I printk "ipmi_watchdog_state=WDOG_TIMEOUT_NONE" so, the watchdog can never to record panic information to IPMI SEL. static int wdog_panic_handler(struct notifier_block *this, unsigned long event, void *unused) { static int panic_event_handled = 0; /* On a panic, if we have a panic timeout, make sure to extend the watchdog timer to a reasonable value to complete the panic, if the watchdog timer is running. Plus the pretimeout is meaningless at panic time. */ if (watchdog_user && !panic_event_handled && ipmi_watchdog_state != WDOG_TIMEOUT_NONE) { /* Make sure we do this only once. 
*/ panic_event_handled = 1; timeout = 255; pretimeout = 0; panic_halt_ipmi_set_timeout(); } return NOTIFY_OK; } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
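The behaviour Corey describes (extend the timeout on panic only if the watchdog is already running, and do nothing otherwise) can be checked with a small userspace model of the quoted handler. Everything here is a sketch: struct mock_wdog, the MOCK_ state names and mock_wdog_on_panic() are hypothetical, and the call to panic_halt_ipmi_set_timeout() is deliberately left out because no BMC is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the driver state consulted by the panic handler. */
enum mock_wdog_state {
	MOCK_WDOG_TIMEOUT_NONE,		/* watchdog loaded but never started */
	MOCK_WDOG_TIMEOUT_RESET		/* watchdog running, action=reset */
};

struct mock_wdog {
	int has_user;			/* models watchdog_user */
	enum mock_wdog_state state;	/* models ipmi_watchdog_state */
	int timeout;
	int pretimeout;
	int panic_event_handled;
};

/* Returns 1 if the timeout was extended, 0 if the handler did nothing. */
static int mock_wdog_on_panic(struct mock_wdog *w)
{
	assert(w != NULL);

	if (w->has_user && !w->panic_event_handled &&
	    w->state != MOCK_WDOG_TIMEOUT_NONE) {
		/* Make sure we do this only once. */
		w->panic_event_handled = 1;
		w->timeout = 255;	/* leave room for a kdump or similar */
		w->pretimeout = 0;	/* pretimeout is meaningless at panic time */
		return 1;
	}
	return 0;
}
```

In the reporter's scenario the module was loaded with timeout=120 action=reset but nothing had started the timer, so the state is still the "none" value and the handler leaves everything alone, which matches the printk observation in the report.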
Re: kernel bugzilla is FPOS (was: Re: "buggy cmd640" message followed by soft lockup)
On Monday, 26 of November 2007, Adrian Bunk wrote: > On Mon, Nov 26, 2007 at 12:28:17AM +0100, Rafael J. Wysocki wrote: > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > On Sun, Nov 25, 2007 at 11:38:59PM +0100, Rafael J. Wysocki wrote: > > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > > > On Sun, Nov 25, 2007 at 10:28:06PM +0100, Rafael J. Wysocki wrote: > > > > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > > > >.. > > > > > > > First of all, Bugzilla is a quite often used bug tracker in the > > > > > > > open > > > > > > > source world [1], so many users already know it. > > > > > > > > > > > > > > But more important, "it pretends to require them to spend" isn't > > > > > > > true > > > > > > > because there's no pretending - we actually often require bug > > > > > > > reporters > > > > > > > to spend a lot of time on the bug report (e.g. when asking for > > > > > > > bisecting). > > > > > > > > > > > > But not *initially*. > > > > > > > > > > > > We should not confuse *debugging* with *reporting bugs*. While the > > > > > > former is > > > > > > actually more difficult and more time consuming than writing the > > > > > > code in which > > > > > > the bug is present, the latter should be as simple as sending an > > > > > > email. > > > > > > > > > > For hardcore geeks like you and me sending an email might be easier > > > > > than > > > > > using some web interface. > > > > > > > > > > Normal humans tend to be more accustomed to web interfaces, and > > > > > following the instructions on some web page is _much_ easier than > > > > > reading three text files for knowing what to write in an email. > > > > > > > > Hm, this is a good argument for having such a web interface, but IMO it > > > > shouldn't be mandatory. IOW, there should be a way to report a bug > > > > using plain > > > > email, if the reporter prefers that. 
We can, however, request that the > > > > address > > > > of our bug tracking system be added to the report's Cc list. > > > > > > Looking at both other open source projects and the support of commercial > > > software a web interface should be enough. > > > > Well, IMHO the Linux kernel is exceptional in many ways ... > > If your goal is not to solve our problems with bug handling but trying > to maximize the "being different" factor... > > > > But this is not the problem - the problem is what happens after the > > > initial report with the bug report. > > > > Not only that. > > > > First, each bug report has to reach the right lists/people and that's what > > we > > can't assure using the Bugzilla alone right now. To make the Bugzilla > > generally useful for that we need to change the way in which the target of > > the > > report is selected and make it send reports to mailing lists rather than to > > individual people. > > In recent years, the default assignees of changed or new components in > the kernel Bugzilla have been pseudo addresses, and you can subscribe a > mailing list (like any other email address) to get copies of the emails > going to this pseudo address. OK Why haven't they been subscribed already, then? I think you would agree that right now the choice of subsystems in the Bugzilla doesn't reflect the current status of the kernel (some subsystems should be added, some should be called differently, some should be moved to different places etc.) and some addresses to which the bug reports are assigned by default are not the best ones ... > > Second, once the bug report have reached the right place, we have two > > problems > > to solve: > > (1) we need to make the developers respond and actively work on the bug > > This is the one problem we have. > > > (2) we need to make the tracking of the bug possibly unintrusive (ie. 
> > developers should be able to work with the reporter in a way that *they* > > prefer) > > While it's generally difficult to solve (1), we can at least make (2) happen > > (well, in theory). > > For normal communication (2) already works in the kernel Bugzilla. > > > > > Now, the question is what information this web interface should ask for. > > > > > > > > IMO, first, it should ask for what the bug is against, ie.: > > > > - kernel version (to be obtained from 'git describe' or from > > > > /proc/version or > > > > from .config, if the kernel doesn't boot) > > > > - architecture (x86, ARM, MIPS etc.) > > > > - subsystem and subsubsystem (that could be selectable from a menu and > > > > might > > > > depend on the architecture) > > > > > > > > It also should ask if the problem is a regression and what was the last > > > > known > > > > good kernel (I'd prefer that to be the last known major release > > > > selectable from > > > > a list). > > > > > > > > Also, the reporter should be required to provide a summary (subject) and > > > > a (concise) description of the problem and a list of email addresses to > > > > send the report to in addition
Re: [RFC][PATCH] Update REPORTING-BUGS
On Mon, Nov 26, 2007 at 01:04:25AM +0100, Rafael J. Wysocki wrote:
> On Monday, 26 of November 2007, Adrian Bunk wrote:
> > On Mon, Nov 26, 2007 at 12:00:28AM +0100, Rafael J. Wysocki wrote:
> > > On Sunday, 25 of November 2007, Adrian Bunk wrote:
> > >...
> > > > I don't care whether that's done with Bugzilla, some email based bug
> > > > tracker like the Debian bug tracker, someone putting emails manually
> > > > into some bug tracker like you are doing, or whatever else.
> > >
> > > That last solution doesn't scale very well ...
> > >
> > > How about using the system in which it's possible to report bugs using both
> > > email and a web interface?
> > >
> > > We can request that the address of the bug tracker be added to the Cc lists of
> > > bug reports sent by email and we can make it resend reports filed with it to
> > > the appropriate mailing lists and with the appropriate email headers. This is
> > > technically doable.
> >
> > You are trying to solve something that is not a problem.
>
> It _is_ a problem, because many bugs are reported using email and not really
> tracked. The ones that I manually put into the Bugzilla are the tip of the
> iceberg (and BTW I'd prefer not to have to do that manually).
>
> Every bug reported by email and not responded to by the right people, that is
> not a recent regression, is currently lost. I'd like to avoid that, if
> possible.

This is solved by many other projects by asking the submitter to open a bug for the issue when he sends it in an email. The submitter then simply copies the information from his email to his newly opened bug in the bug tracker.

-> no problem

> > It does not matter which medium we choose for getting bug reports.
>
> [Well, you said that we should use a web interface for that. ;-)]

I said a web interface is not worse than email. And it's enough.

(And I e.g. wouldn't oppose using the Debian bug tracker where the web interface only allows reading and everything has to be done via email if all kernel maintainers would agree to use this.)

> No, it doesn't, as long as the bug reports reach the right place. Now, the
> question is what's that.
>
> IMO, ideally, for each subsystem there should be a mailing list to send bug
> reports to. The Bugzilla should forward the reports to these lists. On every
> such list there should be (at least) one person responsible for responding to
> the bug reports, if no one else responds first, and for forwarding the reports
> to the appropriate developers. This person should also be responsible for
> monitoring the status of each bug report sent to his/her list.

After all discussions about crazy bug tracker features we are back at the real problem:

Where do we find the tree these people grow on?

> _Every_ bug report sent (including invalid ones) should be recorded in a bug
> tracking system (be it the Bugzilla or whatever else) along with all of its
> history (at least, references to the bug's history should be stored), no matter
> how it's been handled. Moreover, a bug can only be resolved as "fixed" if
> there's a pointer to the exact commit fixing it in the bug's history.

And back we are at crazy bug tracker features...

> > The only thing that matters is that we get bug reports resolved within a
> > reasonable amount of time.
>
> I'm not sure if that's generally possible:
> - What about the bugs that take 2 weeks or more to reproduce?
> - What about the bugs that we _don't_ _know_ how to fix?

We will never get 100% of all bugs fixed.

Let's get back to the fact that we have many bug reports that could be fixed within a reasonable amount of time but are not.

> Rafael

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] Update REPORTING-BUGS
On Monday, 26 of November 2007, Adrian Bunk wrote:
> On Mon, Nov 26, 2007 at 12:00:28AM +0100, Rafael J. Wysocki wrote:
> > On Sunday, 25 of November 2007, Adrian Bunk wrote:
> >...
> > > I don't care whether that's done with Bugzilla, some email based bug
> > > tracker like the Debian bug tracker, someone putting emails manually
> > > into some bug tracker like you are doing, or whatever else.
> >
> > That last solution doesn't scale very well ...
> >
> > How about using the system in which it's possible to report bugs using both
> > email and a web interface?
> >
> > We can request that the address of the bug tracker be added to the Cc lists of
> > bug reports sent by email and we can make it resend reports filed with it to
> > the appropriate mailing lists and with the appropriate email headers. This is
> > technically doable.
>
> You are trying to solve something that is not a problem.

It _is_ a problem, because many bugs are reported using email and not really tracked. The ones that I manually put into the Bugzilla are the tip of the iceberg (and BTW I'd prefer not to have to do that manually).

Every bug reported by email and not responded to by the right people, that is not a recent regression, is currently lost. I'd like to avoid that, if possible.

> It does not matter which medium we choose for getting bug reports.

[Well, you said that we should use a web interface for that. ;-)]

No, it doesn't, as long as the bug reports reach the right place. Now, the question is what's that.

IMO, ideally, for each subsystem there should be a mailing list to send bug reports to. The Bugzilla should forward the reports to these lists. On every such list there should be (at least) one person responsible for responding to the bug reports, if no one else responds first, and for forwarding the reports to the appropriate developers. This person should also be responsible for monitoring the status of each bug report sent to his/her list.
_Every_ bug report sent (including invalid ones) should be recorded in a bug tracking system (be it the Bugzilla or whatever else) along with all of its history (at least, references to the bug's history should be stored), no matter how it's been handled. Moreover, a bug can only be resolved as "fixed" if there's a pointer to the exact commit fixing it in the bug's history.

> The only thing that matters is that we get bug reports resolved within a
> reasonable amount of time.

I'm not sure if that's generally possible:
- What about the bugs that take 2 weeks or more to reproduce?
- What about the bugs that we _don't_ _know_ how to fix?

Rafael
[RFC/PATCH] drm: Fix for non-coherent DMA PowerPC
This patch fixes bits of the DRM so as to make the radeon DRI work on non-cache coherent PCI DMA variants of the PowerPC processors.

It moves the few places that need changes into wrappers so that other architectures with similar issues can easily add their own changes to those wrappers, at least until we have a more useful generic kernel API.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 drivers/char/drm/ati_pcigart.c |6 ++
 drivers/char/drm/drm_scatter.c | 12 +++-
 drivers/char/drm/drm_vm.c | 20 +++-
 3 files changed, 32 insertions(+), 6 deletions(-)

Index: linux-work/drivers/char/drm/ati_pcigart.c
===
--- linux-work.orig/drivers/char/drm/ati_pcigart.c 2007-11-26 10:07:29.0 +1100
+++ linux-work/drivers/char/drm/ati_pcigart.c 2007-11-26 10:21:33.0 +1100
@@ -214,6 +214,12 @@ int drm_ati_pcigart_init(struct drm_devi
 		}
 	}
 
+	if (gart_info->gart_table_location == DRM_ATI_GART_MAIN)
+		dma_sync_single_for_device(&dev->pdev->dev,
+					   bus_address,
+					   max_pages * sizeof(u32),
+					   PCI_DMA_TODEVICE);
+
 	ret = 1;
 
 #if defined(__i386__) || defined(__x86_64__)
Index: linux-work/drivers/char/drm/drm_scatter.c
===
--- linux-work.orig/drivers/char/drm/drm_scatter.c 2007-11-26 10:07:29.0 +1100
+++ linux-work/drivers/char/drm/drm_scatter.c 2007-11-26 10:20:08.0 +1100
@@ -36,6 +36,16 @@
 
 #define DEBUG_SCATTER 0
 
+static inline void *drm_vmalloc_dma(unsigned long size)
+{
+#if defined(__powerpc__) && defined(CONFIG_NOT_COHERENT_CACHE)
+	return __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM,
+			 PAGE_KERNEL | _PAGE_NO_CACHE);
+#else
+	return vmalloc_32(size);
+#endif
+}
+
 void drm_sg_cleanup(struct drm_sg_mem * entry)
 {
 	struct page *page;
@@ -104,7 +114,7 @@ int drm_sg_alloc(struct drm_device *dev,
 	}
 	memset((void *)entry->busaddr, 0, pages * sizeof(*entry->busaddr));
 
-	entry->virtual = vmalloc_32(pages << PAGE_SHIFT);
+	entry->virtual = drm_vmalloc_dma(pages << PAGE_SHIFT);
 	if (!entry->virtual) {
 		drm_free(entry->busaddr,
 			 entry->pages * sizeof(*entry->busaddr), DRM_MEM_PAGES);
Index: linux-work/drivers/char/drm/drm_vm.c
===
--- linux-work.orig/drivers/char/drm/drm_vm.c 2007-11-26 10:07:29.0 +1100
+++ linux-work/drivers/char/drm/drm_vm.c 2007-11-26 10:11:09.0 +1100
@@ -54,13 +54,24 @@ static pgprot_t drm_io_prot(uint32_t map
 		pgprot_val(tmp) |= _PAGE_NO_CACHE;
 	if (map_type == _DRM_REGISTERS)
 		pgprot_val(tmp) |= _PAGE_GUARDED;
-#endif
-#if defined(__ia64__)
+#elif defined(__ia64__)
 	if (efi_range_is_wc(vma->vm_start, vma->vm_end -
 				    vma->vm_start))
 		tmp = pgprot_writecombine(tmp);
 	else
 		tmp = pgprot_noncached(tmp);
+#elif defined(__sparc__)
+	tmp = pgprot_noncached(tmp);
+#endif
+	return tmp;
+}
+
+static pgprot_t drm_dma_prot(uint32_t map_type, struct vm_area_struct *vma)
+{
+	pgprot_t tmp = vm_get_page_prot(vma->vm_flags);
+
+#if defined(__powerpc__) && defined(CONFIG_NOT_COHERENT_CACHE)
+	tmp |= _PAGE_NO_CACHE;
 #endif
 	return tmp;
 }
@@ -617,9 +628,6 @@ static int drm_mmap_locked(struct file *
 		offset = dev->driver->get_reg_ofs(dev);
 		vma->vm_flags |= VM_IO;	/* not in core dump */
 		vma->vm_page_prot = drm_io_prot(map->type, vma);
-#ifdef __sparc__
-		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
-#endif
 		if (io_remap_pfn_range(vma, vma->vm_start,
 				       (map->offset + offset) >> PAGE_SHIFT,
 				       vma->vm_end - vma->vm_start,
@@ -638,6 +646,7 @@ static int drm_mmap_locked(struct file *
 				    page_to_pfn(virt_to_page(map->handle)),
 				    vma->vm_end - vma->vm_start, vma->vm_page_prot))
 			return -EAGAIN;
+		vma->vm_page_prot = drm_dma_prot(map->type, vma);
 	/* fall through to _DRM_SHM */
 	case _DRM_SHM:
 		vma->vm_ops = &drm_vm_shm_ops;
@@ -650,6 +659,7 @@ static int drm_mmap_locked(struct file *
 		vma->vm_ops = &drm_vm_sg_ops;
 		vma->vm_private_data = (void *)map;
 		vma->vm_flags |= VM_RESERVED;
+		vma->vm_page_prot = drm_dma_prot(map->type, vma);
 		break;
 	default:
 		return -EINVAL;	/* This should never happen. */
Re: kernel bugzilla is FPOS (was: Re: "buggy cmd640" message followed by soft lockup)
On Mon, Nov 26, 2007 at 12:28:17AM +0100, Rafael J. Wysocki wrote: > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > On Sun, Nov 25, 2007 at 11:38:59PM +0100, Rafael J. Wysocki wrote: > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > > On Sun, Nov 25, 2007 at 10:28:06PM +0100, Rafael J. Wysocki wrote: > > > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > > >.. > > > > > > First of all, Bugzilla is a quite often used bug tracker in the > > > > > > open > > > > > > source world [1], so many users already know it. > > > > > > > > > > > > But more important, "it pretends to require them to spend" isn't > > > > > > true > > > > > > because there's no pretending - we actually often require bug > > > > > > reporters > > > > > > to spend a lot of time on the bug report (e.g. when asking for > > > > > > bisecting). > > > > > > > > > > But not *initially*. > > > > > > > > > > We should not confuse *debugging* with *reporting bugs*. While the > > > > > former is > > > > > actually more difficult and more time consuming than writing the code > > > > > in which > > > > > the bug is present, the latter should be as simple as sending an > > > > > email. > > > > > > > > For hardcore geeks like you and me sending an email might be easier > > > > than > > > > using some web interface. > > > > > > > > Normal humans tend to be more accustomed to web interfaces, and > > > > following the instructions on some web page is _much_ easier than > > > > reading three text files for knowing what to write in an email. > > > > > > Hm, this is a good argument for having such a web interface, but IMO it > > > shouldn't be mandatory. IOW, there should be a way to report a bug using > > > plain > > > email, if the reporter prefers that. We can, however, request that the > > > address > > > of our bug tracking system be added to the report's Cc list. 
> > > > Looking at both other open source projects and the support of commercial > > software a web interface should be enough. > > Well, IMHO the Linux kernel is exceptional in many ways ... If your goal is not to solve our problems with bug handling but trying to maximize the "being different" factor... > > But this is not the problem - the problem is what happens after the > > initial report with the bug report. > > Not only that. > > First, each bug report has to reach the right lists/people and that's what we > can't assure using the Bugzilla alone right now. To make the Bugzilla > generally useful for that we need to change the way in which the target of the > report is selected and make it send reports to mailing lists rather than to > individual people. In recent years, the default assignees of changed or new components in the kernel Bugzilla have been pseudo addresses, and you can subscribe a mailing list (like any other email address) to get copies of the emails going to this pseudo address. > Second, once the bug report have reached the right place, we have two problems > to solve: > (1) we need to make the developers respond and actively work on the bug This is the one problem we have. > (2) we need to make the tracking of the bug possibly unintrusive (ie. > developers should be able to work with the reporter in a way that *they* > prefer) > While it's generally difficult to solve (1), we can at least make (2) happen > (well, in theory). For normal communication (2) already works in the kernel Bugzilla. > > > Now, the question is what information this web interface should ask for. > > > > > > IMO, first, it should ask for what the bug is against, ie.: > > > - kernel version (to be obtained from 'git describe' or from > > > /proc/version or > > > from .config, if the kernel doesn't boot) > > > - architecture (x86, ARM, MIPS etc.) 
> > > - subsystem and subsubsystem (that could be selectable from a menu and > > > might > > > depend on the architecture) > > > > > > It also should ask if the problem is a regression and what was the last > > > known > > > good kernel (I'd prefer that to be the last known major release > > > selectable from > > > a list). > > > > > > Also, the reporter should be required to provide a summary (subject) and > > > a (concise) description of the problem and a list of email addresses to > > > send the report to in addition to the regular handling (there should be a > > > way > > > to verify which addresses are acceptable). > > > > > > Anything else? > > > > > > Next, the report should be sent to a mailing list selected on the basis > > > of the > > > information provided (not necessarily to individual developers, unless > > > there > > > are some addresses provided explicitly by the reporter). > > > > The architecture choice seems to be the only thing from your list that > > isn't already available in the "Enter a new bug report" dialog of the > > kernel Bugzilla. > > Yet, the architecture choice affects the way in which the other choices are > made. I can
Re: [PATCH 1/9]: introduce radix_tree_gang_lookup_range
On Mon, Nov 26, 2007 at 10:17:24AM +1100, Nick Piggin wrote:
> On Thursday 22 November 2007 11:32, David Chinner wrote:
> > Introduce radix_tree_gang_lookup_range()
> >
> > The inode clustering in XFS requires a gang lookup on the radix tree to
> > find all the inodes in the cluster. The gang lookup has to set the
> > maximum items to that of a fully populated cluster so we get all the
> > inodes in the cluster, but we only populate the radix tree sparsely (on
> > demand).
> >
> > As a result, the gang lookup can search way, way past the index of end
> > of the cluster because it is looking for a fixed number of entries to
> > return.
> >
> > We know we want to terminate the search at either a specific index or a
> > maximum number of items, so we need to add a "last_index" parameter to
> > the lookup.
>
> Yeah, this fixes one downside of the gang lookup API. For consistency
> it would be nice to do this for the tag lookup API as well...

Sure, I have need to do that as well. ;)

> > Furthermore, the existing radix_tree_gang_lookup() can use this same
> > function if we define a RADIX_TREE_MAX_INDEX value so the search is not
> > limited by the last_index.
>
> Nit: should just define it to be ULONG_MAX.

Oh, right. Silly me.

I'll post updated radix tree patches later today. Thanks, Nick.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
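For readers following the thread, the termination rule discussed above can be illustrated with a minimal user-space sketch in C. This is not the kernel radix tree code and not the patch itself: `struct sparse_index`, `gang_lookup_range()` and `gang_lookup()` are hypothetical names, and a sorted array stands in for the sparsely populated tree. It only shows the two stop conditions (enough items found, or an index beyond `last_index`) and how the unbounded lookup can reuse the ranged one by passing ULONG_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a sparsely populated radix tree:
 * a sorted array holding only the indices that are present. */
struct sparse_index {
	const unsigned long *keys;	/* ascending, no duplicates */
	size_t nr_keys;
};

/* Copy up to max_items present indices from [first_index, last_index]
 * into results[] and return how many were found. */
static size_t gang_lookup_range(const struct sparse_index *idx,
				unsigned long *results,
				unsigned long first_index,
				unsigned long last_index,
				size_t max_items)
{
	size_t found = 0;

	assert(idx != NULL && results != NULL);

	for (size_t i = 0; i < idx->nr_keys && found < max_items; i++) {
		if (idx->keys[i] < first_index)
			continue;	/* before the range: skip */
		if (idx->keys[i] > last_index)
			break;		/* past the range: stop searching */
		results[found++] = idx->keys[i];
	}
	return found;
}

/* The unbounded variant is the ranged lookup with last_index = ULONG_MAX. */
static size_t gang_lookup(const struct sparse_index *idx,
			  unsigned long *results,
			  unsigned long first_index, size_t max_items)
{
	return gang_lookup_range(idx, results, first_index, ULONG_MAX,
				 max_items);
}
```

With the present indices {3, 5, 64, 70, 200} and a cluster assumed to cover indices 0 to 63, the ranged lookup returns only 3 and 5 even when asked for 8 items, whereas the unbounded lookup keeps walking into the following clusters to fill its quota.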
Re: 2.6.22 (gentoo + grsec) kernel BUG at mm/mlock.c:205!
On Sun, 25 Nov 2007 20:36:04 +0100 Mathias Kretschmer <[EMAIL PROTECTED]> wrote:

> Hi,
>
> this is a x86_64 kernel with 4GB of RAM. incident happened when
> compiling cdrecord (or some variant of it :) in a 32-bit chroot jail
> during the 'configure' process.
>
> alpha / # uname -a
> Linux alpha 2.6.22-hardened-r8 #10 SMP Sun Nov 25 12:52:39 CET 2007
> x86_64 AMD Processor model unknown AuthenticAMD GNU/Linux
>
> Let me know, if you need more info.
>

you have both a heavily patched kernel and a tainted kernel due to binary kernel modules; sounds like you're best off contacting the support side of whoever gave you the patches and/or the binary module; I don't think there's much lkml can do for you.

--
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
Re: [2.6 patch] make I/O schedulers non-modular
On Sun, 25 Nov 2007 17:56:54 +0100 Adrian Bunk <[EMAIL PROTECTED]> wrote:

> Is there any technical reason why we need 4 different schedulers at
> all?
>

there is at least one technical reason to need more than one: certain types of storage (both big EMC boxes as well as solid state disks) don't behave like disks and have no seek penalty; any cpu time spent on avoiding seeks is wasted on those, so for these devices one really wants to use a different IO scheduler, one which is much lighter weight
Re: [PATCH] power: use kasprintf
On Sat, Nov 17, 2007 at 07:55:58PM +0900, Akinobu Mita wrote:
> Use kasprintf instead of kmalloc()-strcpy()-strcat().

Applied to battery-2.6.git, thanks.

> Cc: Anton Vorontsov <[EMAIL PROTECTED]>
> Cc: David Woodhouse <[EMAIL PROTECTED]>
> Signed-off-by: Akinobu Mita <[EMAIL PROTECTED]>
>
> ---
>  drivers/power/power_supply_leds.c | 25 +++--
>  1 file changed, 7 insertions(+), 18 deletions(-)
>
> Index: 2.6-mm/drivers/power/power_supply_leds.c
> ===
> --- 2.6-mm.orig/drivers/power/power_supply_leds.c
> +++ 2.6-mm/drivers/power/power_supply_leds.c
> @@ -10,6 +10,7 @@
>   * You may use this code as per GPL version 2
>   */
>
> +#include
>  #include
>
>  #include "power_supply.h"
> @@ -48,28 +49,20 @@ static int power_supply_create_bat_trigg
>  {
>  	int rc = 0;
>
> -	psy->charging_full_trig_name = kmalloc(strlen(psy->name) +
> -				  sizeof("-charging-or-full"), GFP_KERNEL);
> +	psy->charging_full_trig_name = kasprintf(GFP_KERNEL,
> +					"%s-charging-or-full", psy->name);
>  	if (!psy->charging_full_trig_name)
>  		goto charging_full_failed;
>
> -	psy->charging_trig_name = kmalloc(strlen(psy->name) +
> -					  sizeof("-charging"), GFP_KERNEL);
> +	psy->charging_trig_name = kasprintf(GFP_KERNEL,
> +					"%s-charging", psy->name);
>  	if (!psy->charging_trig_name)
>  		goto charging_failed;
>
> -	psy->full_trig_name = kmalloc(strlen(psy->name) +
> -				      sizeof("-full"), GFP_KERNEL);
> +	psy->full_trig_name = kasprintf(GFP_KERNEL, "%s-full", psy->name);
>  	if (!psy->full_trig_name)
>  		goto full_failed;
>
> -	strcpy(psy->charging_full_trig_name, psy->name);
> -	strcat(psy->charging_full_trig_name, "-charging-or-full");
> -	strcpy(psy->charging_trig_name, psy->name);
> -	strcat(psy->charging_trig_name, "-charging");
> -	strcpy(psy->full_trig_name, psy->name);
> -	strcat(psy->full_trig_name, "-full");
> -
>  	led_trigger_register_simple(psy->charging_full_trig_name,
>  				    &psy->charging_full_trig);
>  	led_trigger_register_simple(psy->charging_trig_name,
> @@ -120,14 +113,10 @@ static int power_supply_create_gen_trigg
>  {
>  	int rc = 0;
>
> -	psy->online_trig_name = kmalloc(strlen(psy->name) + sizeof("-online"),
> -					GFP_KERNEL);
> +	psy->online_trig_name = kasprintf(GFP_KERNEL, "%s-online", psy->name);
>  	if (!psy->online_trig_name)
>  		goto online_failed;
>
> -	strcpy(psy->online_trig_name, psy->name);
> -	strcat(psy->online_trig_name, "-online");
> -
>  	led_trigger_register_simple(psy->online_trig_name, &psy->online_trig);
>
>  	goto success;

--
Anton Vorontsov
email: [EMAIL PROTECTED]
backup email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2
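The idea behind the patch, replacing a hand-sized kmalloc() plus strcpy()/strcat() with a single formatting allocation, can be tried outside the kernel. The following is only a sketch under stated assumptions: kasprintf() is a kernel function, so `xasprintf()` below is a hypothetical user-space stand-in built on the standard vsnprintf(), and the name "battery" used afterwards is made up for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user-space analogue of kasprintf(): format into a freshly
 * allocated buffer of exactly the right size. Returns NULL on failure;
 * the caller releases the result with free(). */
static char *xasprintf(const char *fmt, ...)
{
	va_list ap, ap2;
	int len;
	char *buf;

	assert(fmt != NULL);

	va_start(ap, fmt);
	va_copy(ap2, ap);			/* the arguments are walked twice */
	len = vsnprintf(NULL, 0, fmt, ap);	/* first pass: measure only */
	va_end(ap);

	if (len < 0) {
		va_end(ap2);
		return NULL;
	}

	buf = malloc((size_t)len + 1);		/* +1 for the terminating NUL */
	if (buf != NULL)
		vsnprintf(buf, (size_t)len + 1, fmt, ap2);	/* second pass: write */
	va_end(ap2);
	return buf;
}
```

Calling `xasprintf("%s-charging-or-full", "battery")` yields a heap string holding `battery-charging-or-full`. The size calculation and the string contents come from the same format string, so they cannot drift apart the way a separate `strlen() + sizeof()` computation and a later strcat() can.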
Re: [2.6 patch] power_supply_{leds,sysfs}.c should #include "power_supply.h"
On Mon, Nov 05, 2007 at 06:07:45PM +0100, Adrian Bunk wrote:
> Every file should include the headers containing the prototypes for
> its global functions.

Applied to battery-2.6.git, thanks.

p.s. Sorry for the delay, I was not Cc'ed, so I found out about that patch by pure chance (through looking in the -mm series).

> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>
> ---
>
>  drivers/power/power_supply_leds.c |2 ++
>  drivers/power/power_supply_sysfs.c |2 ++
>  2 files changed, 4 insertions(+)
>
> e34cc994731ec9102bf5b1c7d6585c0aa87d1fa2
> diff --git a/drivers/power/power_supply_leds.c b/drivers/power/power_supply_leds.c
> index 7f8f359..80ca288 100644
> --- a/drivers/power/power_supply_leds.c
> +++ b/drivers/power/power_supply_leds.c
> @@ -12,6 +12,8 @@
>
>  #include
>
> +#include "power_supply.h"
> +
>  /* Battery specific LEDs triggers. */
>
>  static void power_supply_update_bat_leds(struct power_supply *psy)
> diff --git a/drivers/power/power_supply_sysfs.c b/drivers/power/power_supply_sysfs.c
> index 249f61b..e8ad1fd 100644
> --- a/drivers/power/power_supply_sysfs.c
> +++ b/drivers/power/power_supply_sysfs.c
> @@ -14,6 +14,8 @@
>  #include
>  #include
>
> +#include "power_supply.h"
> +
>  /*
>   * This is because the name "current" breaks the device attr macro.
>   * The "current" word resolves to "(get_current())" so instead of
>

--
Anton Vorontsov
email: [EMAIL PROTECTED]
backup email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2
Re: [PATCH 1/9]: introduce radix_tree_gang_lookup_range
On Thursday 22 November 2007 11:32, David Chinner wrote:
> Introduce radix_tree_gang_lookup_range()
>
> The inode clustering in XFS requires a gang lookup on the radix tree to
> find all the inodes in the cluster. The gang lookup has to set the
> maximum items to that of a fully populated cluster so we get all the
> inodes in the cluster, but we only populate the radix tree sparsely (on
> demand).
>
> As a result, the gang lookup can search way, way past the index of end
> of the cluster because it is looking for a fixed number of entries to
> return.
>
> We know we want to terminate the search at either a specific index or a
> maximum number of items, so we need to add a "last_index" parameter to
> the lookup.

Yeah, this fixes one downside of the gang lookup API. For consistency it would be nice to do this for the tag lookup API as well...

> Furthermore, the existing radix_tree_gang_lookup() can use this same
> function if we define a RADIX_TREE_MAX_INDEX value so the search is not
> limited by the last_index.

Nit: should just define it to be ULONG_MAX.

>
> Signed-off-by: Dave Chinner <[EMAIL PROTECTED]>

Otherwise,
Acked-by: Nick Piggin <[EMAIL PROTECTED]>
Re: [RFC][PATCH] Update REPORTING-BUGS
On Mon, Nov 26, 2007 at 12:00:28AM +0100, Rafael J. Wysocki wrote:
> On Sunday, 25 of November 2007, Adrian Bunk wrote:
>...
> > I don't care whether that's done with Bugzilla, some email based bug
> > tracker like the Debian bug tracker, someone putting emails manually
> > into some bug tracker like you are doing, or whatever else.
>
> That last solution doesn't scale very well ...
>
> How about using the system in which it's possible to report bugs using both
> email and a web interface?
>
> We can request that the address of the bug tracker be added to the Cc lists of
> bug reports sent by email and we can make it resend reports filed with it to
> the appropriate mailing lists and with the appropriate email headers. This is
> technically doable.

You are trying to solve something that is not a problem.

It does not matter which medium we choose for getting bug reports.

The only thing that matters is that we get bug reports resolved within a reasonable amount of time.

> Greetings,
> Rafael

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: kernel bugzilla is FPOS (was: Re: "buggy cmd640" message followed by soft lockup)
On Sunday, 25 of November 2007, Adrian Bunk wrote: > On Sun, Nov 25, 2007 at 11:38:59PM +0100, Rafael J. Wysocki wrote: > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > On Sun, Nov 25, 2007 at 10:28:06PM +0100, Rafael J. Wysocki wrote: > > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > >.. > > > > > First of all, Bugzilla is a quite often used bug tracker in the open > > > > > source world [1], so many users already know it. > > > > > > > > > > But more important, "it pretends to require them to spend" isn't true > > > > > because there's no pretending - we actually often require bug > > > > > reporters > > > > > to spend a lot of time on the bug report (e.g. when asking for > > > > > bisecting). > > > > > > > > But not *initially*. > > > > > > > > We should not confuse *debugging* with *reporting bugs*. While the > > > > former is > > > > actually more difficult and more time consuming than writing the code > > > > in which > > > > the bug is present, the latter should be as simple as sending an email. > > > > > > For hardcore geeks like you and me sending an email might be easier than > > > using some web interface. > > > > > > Normal humans tend to be more accustomed to web interfaces, and > > > following the instructions on some web page is _much_ easier than > > > reading three text files for knowing what to write in an email. > > > > Hm, this is a good argument for having such a web interface, but IMO it > > shouldn't be mandatory. IOW, there should be a way to report a bug using > > plain > > email, if the reporter prefers that. We can, however, request that the > > address > > of our bug tracking system be added to the report's Cc list. > > Looking at both other open source projects and the support of commercial > software a web interface should be enough. Well, IMHO the Linux kernel is exceptional in many ways ... > But this is not the problem - the problem is what happens after the > initial report with the bug report. 
Not only that.

First, each bug report has to reach the right lists/people and that's what we can't assure using the Bugzilla alone right now. To make the Bugzilla generally useful for that we need to change the way in which the target of the report is selected and make it send reports to mailing lists rather than to individual people.

Second, once the bug report has reached the right place, we have two problems to solve:
(1) we need to make the developers respond and actively work on the bug
(2) we need to make the tracking of the bug possibly unintrusive (ie. developers should be able to work with the reporter in a way that *they* prefer)

While it's generally difficult to solve (1), we can at least make (2) happen (well, in theory).

> > Now, the question is what information this web interface should ask for.
> >
> > IMO, first, it should ask for what the bug is against, ie.:
> > - kernel version (to be obtained from 'git describe' or from /proc/version or
> >   from .config, if the kernel doesn't boot)
> > - architecture (x86, ARM, MIPS etc.)
> > - subsystem and subsubsystem (that could be selectable from a menu and might
> >   depend on the architecture)
> >
> > It also should ask if the problem is a regression and what was the last known
> > good kernel (I'd prefer that to be the last known major release selectable from
> > a list).
> >
> > Also, the reporter should be required to provide a summary (subject) and
> > a (concise) description of the problem and a list of email addresses to
> > send the report to in addition to the regular handling (there should be a way
> > to verify which addresses are acceptable).
> >
> > Anything else?
> >
> > Next, the report should be sent to a mailing list selected on the basis of the
> > information provided (not necessarily to individual developers, unless there
> > are some addresses provided explicitly by the reporter).
> > The architecture choice seems to be the only thing from your list that > isn't already available in the "Enter a new bug report" dialog of the > kernel Bugzilla. Yet, the architecture choice affects the way in which the other choices are made. Also, the "sending to mailing lists" part is obviously missing. > > IMO, it should be possible to work on the bug using both email and the web > > interface, whichever is preferred by the participant in question, without > > the > > need to stick to any of them (ie. email messages sent in the corresponding > > email thread should be registered by the bug tracking system and comments > > entered into it should appear as messages in the email thread with the > > appropriate To:, From: and Cc: information). > > > > There surely are more things that we'd like it to do, but the above seem to > > be > > a reasonable minimum. > > Except from the From: header in outgoing emails the kernel Bugzilla > already offers this for years. No, it doesn't. You can't send the initial report by
Question regarding naming scheme (HP Jornada 6XX/7XX)
Greetings, Just want some input before I start dropping patches everywhere. A simple ack will do nicely if you just agree. Currently we use the name of the most typical HP Jornada (680 and 720) to mean all 6XX/7XX (= 620/660/680/690 and 710/720/728). In the past this has led to some confusion when people tried to compile their own kernels. For instance an hp 620 user thought that their system was unsupported because everything was for '680'. Or the other way round 728 users didn't want to use 720 since they thought they would lose their extra ram (only difference between versions). So, I want to instead use the term 600-series or 700-series. This would mean changing Kconfig/Makefile and driver name. For example /drivers/input/keyboard/jornada680_kbd.c would become /drivers/input/keyboard/jornada600_kbd.c The machine name tag would also return (HP Jornada 600-series | HP Jornada 700-series) since I know for instance opie loves to grep the machine line. Currently this is set as "hp6xx" for 600-series and "HP Jornada 720" for 700-series. They are related machines so it would be nice to unify their output a tad. Why I want to use 600-series/700-series instead of 6XX/7XX is simply because 600-series/700-series leaves no doubt. Any objections? Best wishes Kristoffer Ericson - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sata_nv: don't use legacy DMA in ADMA mode (v3)
We need to run any DMA command with result taskfile requested in ADMA mode when the port is in ADMA mode, otherwise it may try to use the legacy DMA engine in ADMA mode which is not allowed. Enforce this with BUG_ON() since data corruption could potentially result if this happened. Also, fail any attempt to try and issue NCQ commands with result taskfile requested, since the hardware doesn't allow this. Signed-off-by: Robert Hancock <[EMAIL PROTECTED]> --- linux-2.6.24-rc3-git1edit/drivers/ata/sata_nv.c.before2 2007-11-25 16:28:58.0 -0600 +++ linux-2.6.24-rc3-git1edit/drivers/ata/sata_nv.c 2007-11-25 16:31:09.0 -0600 @@ -792,11 +792,13 @@ static void nv_adma_tf_read(struct ata_port *ap, struct ata_taskfile *tf) { - /* Since commands where a result TF is requested are not - executed in ADMA mode, the only time this function will be called - in ADMA mode will be if a command fails. In this case we - don't care about going into register mode with ADMA commands - pending, as the commands will all shortly be aborted anyway. */ + /* Other than when internal or pass-through commands are executed, + the only time this function will be called in ADMA mode will be + if a command fails. In the failure case we don't care about going + into register mode with ADMA commands pending, as the commands will + all shortly be aborted anyway. We assume that NCQ commands are not + issued via passthrough, which is the only way that switching into + ADMA mode could abort outstanding commands. */ nv_adma_register_mode(ap); ata_tf_read(ap, tf); @@ -1379,11 +1381,9 @@ struct nv_adma_port_priv *pp = qc->ap->private_data; /* ADMA engine can only be used for non-ATAPI DMA commands, - or interrupt-driven no-data commands, where a result taskfile - is not required. */ + or interrupt-driven no-data commands. 
*/ if ((pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) || - (qc->tf.flags & ATA_TFLAG_POLLING) || - (qc->flags & ATA_QCFLAG_RESULT_TF)) + (qc->tf.flags & ATA_TFLAG_POLLING)) return 1; if ((qc->flags & ATA_QCFLAG_DMAMAP) || @@ -1401,6 +1401,8 @@ NV_CPB_CTL_IEN; if (nv_adma_use_reg_mode(qc)) { + BUG_ON(!(pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) && + (qc->flags & ATA_QCFLAG_DMAMAP)); nv_adma_register_mode(qc->ap); ata_qc_prep(qc); return; @@ -1445,9 +1447,21 @@ VPRINTK("ENTER\n"); + /* We can't handle result taskfile with NCQ commands, since + retrieving the taskfile switches us out of ADMA mode and would abort + existing commands. */ + if (unlikely(qc->tf.protocol == ATA_PROT_NCQ && +(qc->flags & ATA_QCFLAG_RESULT_TF))) { + ata_dev_printk(qc->dev, KERN_ERR, + "NCQ w/ RESULT_TF not allowed\n"); + return AC_ERR_SYSTEM; + } + if (nv_adma_use_reg_mode(qc)) { /* use ATA register mode */ VPRINTK("using ATA register mode: 0x%lx\n", qc->flags); + BUG_ON(!(pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) && + (qc->flags & ATA_QCFLAG_DMAMAP)); nv_adma_register_mode(qc->ap); return ata_qc_issue_prot(qc); } else
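For readers following the sata_nv patch above, the two checks it adds can be modelled in plain userspace C. This is only a sketch under stated assumptions: the flag names and bit values below are hypothetical stand-ins for NV_ADMA_ATAPI_SETUP_COMPLETE, ATA_QCFLAG_DMAMAP and ATA_QCFLAG_RESULT_TF, and the helper names are not from the driver.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flag bits; the values are arbitrary. */
enum {
	PORT_ATAPI_SETUP_COMPLETE = 1u << 0, /* models NV_ADMA_ATAPI_SETUP_COMPLETE */
	QC_DMAMAP                 = 1u << 1, /* models ATA_QCFLAG_DMAMAP */
	QC_RESULT_TF              = 1u << 2  /* models ATA_QCFLAG_RESULT_TF */
};

/*
 * Models the new early-exit in the issue path: an NCQ command that also
 * requests a result taskfile is refused, because reading the taskfile
 * would leave ADMA mode and abort the other outstanding commands.
 */
static bool rejects_ncq_result_tf(bool is_ncq, unsigned int qc_flags)
{
	return is_ncq && (qc_flags & QC_RESULT_TF) != 0;
}

/*
 * Models the BUG_ON() condition added on the register-mode path: a DMA-mapped
 * command on a port that is not set up for ATAPI must never be sent through
 * register mode, since that would use the legacy DMA engine.
 */
static bool reg_mode_would_bug(unsigned int port_flags, unsigned int qc_flags)
{
	return !(port_flags & PORT_ATAPI_SETUP_COMPLETE) &&
	       (qc_flags & QC_DMAMAP) != 0;
}
```

Both helpers are pure predicates, so the interaction between the flags can be checked without any hardware; the real driver of course acts on the result (returning AC_ERR_SYSTEM or hitting BUG_ON()).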
Re: [PATCH 01/27] ptrace: arch_has_single_step
> Why should arch_has_single_step be a function-like macro? I can't think > of a case where this wouldn't be a compile-time constant. And given that > this is hopefully a transitionary ifdef because eventually all architectures > would use the generic code I'd prefer ifdefs in the code that clearly mark > this as transitional in this case. I'm not sure it's true that there is no machine where some chips support single-step and others don't, though I do think it's true that no arch code has a conditional like this now. In the case of block-step (in later patches), it is the case that a run-time check for availability of the hardware feature comes up (on some x86 configurations). So a main reason is to keep the two parallel macros with the same style and semantics. > > +static inline void user_enable_single_step(struct task_struct *task) > > > +static inline void user_disable_single_step(struct task_struct *task) > > And I don't think these should be provided at all as generic stubs. If > an arch doesn't use the generic code it simply shouldn't compile the > code using this. The code compiles away completely with if (0)'s. I did it this way to avoid more #ifdef's in the generic ptrace code. Previous patch reviews I've read (including ones from you) have said to use header-defined stubs in #ifdef and unconditional calls in the code. Please be explicit in proposing the specific alternatives you would prefer. > What's the reason for the user_ prefix btw, most architectures seem to > have these functions already anyway, just without the user_ prefix. The arch's are not consistent now, so I chose a new scheme to harmonize on. I think the "set_foo" names are a bit too nonspecific-sounding, especially given that we do have other things kicking around that use single-step functionality in kernel mode. 
Also, I plan to submit some more work harmonizing the arch-specific access to the user-mode view of machine state, and a uniform prefix for the new, reliably coherent, documented set of internal interfaces just seems like the right thing to do. (I don't really care enough to argue about the names for functions. Anyone who, for some reason I cannot fathom, cares enough to be contrary about the subject, is welcome to set the standard.) Thanks, Roland
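The "header-defined stubs plus unconditional calls" pattern defended in the reply above can be sketched in userspace C as follows. It is an illustration, not the patch itself: struct task, model_resume() and the -1 return value are hypothetical, and the stub bodies are assumptions. Only the macro and the two user_*_single_step names come from the thread.

```c
#include <stddef.h>
#include <stdlib.h>

struct task; /* hypothetical stand-in for struct task_struct */

/*
 * An arch header would define arch_has_single_step() and supply real
 * user_enable_single_step()/user_disable_single_step().  This model "arch"
 * supplies nothing, so the generic defaults below are used.
 */
#ifndef arch_has_single_step
#define arch_has_single_step() (0)

/* Stub: exists only so callers compile; unreachable while the guard is 0. */
static inline void user_enable_single_step(struct task *t)
{
	(void)t;
	abort();
}

/* Stub: nothing to undo when single-step is unsupported. */
static inline void user_disable_single_step(struct task *t)
{
	(void)t;
}
#endif

/*
 * Models a generic caller: no #ifdef here, just an ordinary condition that
 * the compiler can fold away when arch_has_single_step() is the constant 0.
 * Returns 0 on success, -1 where the request would be refused.
 */
static int model_resume(struct task *t, int want_step)
{
	if (want_step) {
		if (!arch_has_single_step())
			return -1;
		user_enable_single_step(t);
	} else {
		user_disable_single_step(t);
	}
	return 0;
}
```

Keeping the macro function-like means an arch could later make it a run-time check (as mentioned for block-step) without touching any caller.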
Re: Small System Paging Problem - OOM-killer goes nuts
On Sun, 25 Nov 2007 15:02:15 -0700, Josh Goldsmith wrote: > I have a Linksys NSLU2 running 2.6.21 (I can replicate the problem on > 2.6.23 but it isn't fully supported on SlugOS). It is a armv5teb device > with 32MB of RAM, 400+ MB swap on its 160GB USB2 root disk. The machine is > used as a fileserver and to build packages for other ARM devices. It may be > underpowered by today's standard but is a whole lot faster than my first > Linux system (386sx20 with 4MB RAM) but the whole system with disk uses <8 > watts and is silent. > > The problem comes when I try to untar a large file (in this case > linux-2.6.23.tar.bz2). Regardless if I kill off every other process, > eventually the oom-killer will appear and kill either the tar or the shell. > I've tried every tuning option I and my buddy Google could find including > (/proc/sys/vm/overcommit*) with no success. I'm not worried about paging > impacting performance. > > I'd appreciate any help, pointers, or gentle taps with the cluebat. I'm no VM tuning expert, but I have and still do heavy compile jobs on similarly configured machines, with no OOM problems: I regularly build 2.6 kernels and occasionally also gcc on a 100MHz 486 with 28MB of RAM and perhaps 500MB of swap. It runs a standard but stripped down Fedora Core 4 user-space, with ext3 file systems and a kernel that doesn't include anything non-essential. The machine will swap madly, but the OOM killer never triggers. (All system settings are FC4 defaults. I haven't touched them.) In the past I did a fair amount of package rebuilds and test suite runs on an NSLU2 myself, with a 2.4 Linksys/Openslug kernel, ext3, and a 1GB or perhaps 2GB swap partition on a disk attached via a USB2-to-PATA enclosure. Even when swapping heavily the OOM killer wouldn't trigger. 
Re: [RFC][PATCH] Update REPORTING-BUGS
On Sunday, 25 of November 2007, Adrian Bunk wrote: > On Sun, Nov 25, 2007 at 10:51:14PM +0100, Rafael J. Wysocki wrote: > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > > On Sun, Nov 25, 2007 at 09:57:09PM +0100, Rafael J. Wysocki wrote: [--snip--] > > > > > > How should a newbie find the correct mailing list? > > > > Read MAINTAINERS? Ok, I should have said about that. > > > > > Benchmark: > > > Easier than the "some more work" when using Bugzilla. > > > > Nope. Please try to file a report against libata/PATA using the Bugzilla. > > Good luck. ;-) > > $ grep PATA MAINTAINERS > $ Too bad (and this is a bug BTW). > > > >... > > > > +It also is a good idea to notify the maintainer of the affected > > > > subsystem and > > > > +the maintainer of the tree in which the bug is present by adding their > > > > email > > > > +addresses to the Cc list of the bug report message. The email > > > > addresses of > > > > +maintainers of the majority of kernel subsystems can be found in the > > > > MAINTAINERS > > > > +file, but you should not worry too much about getting a wrong person. > > > > > > If you don't already know MAINTAINERS well then finding the right > > > component in Bugzilla is much easier. > > > > I disagree. How a newbie is supposed to know what AIO and DIO mean and WTH > > is > > the difference between LVM2/DM and MD? > > > > I took only the IO/Storage submenu as an example, but there are other things > > like that. For instance, what is the difference between "Flash/Memory > > Technology Devices" and MMC/SD? Why "Hotplug" is under "Drivers" and WTH > > does it *mean*? What "W1" means for that matter?? Etc. > > Then let's get that improved. OK Who's supposed to be responsible for that? [--snip--] > > > > > > Really, we must define _one_ way for people to report a bug, and how > > > developers are reminded is _our_ job. > > > > Well, who's "we" in that context? IOW, who's job exactly it's supposed to > > be? 
> > "we" = "we kernel developers" > > And Natalie seems to be the person being paid for doing such stuff... > > > > >... > > > > +Generally, the following things are appreciated in a bug report: > > > >... > > > > > > If you expect people to read and follow this, wouldn't it be easier to > > > simply point them to open the bug in Bugzilla where we already have a > > > template asking these questions? > > > > I don't think so and please refer to the examples above. > > > > > You could replace the whole contents of this file with: > > > Go to http://bugzilla.kernel.org/ and click on "Enter a new bug report". > > > > > > It's a pity that we manage to add/change an average of 100.000 bugs^Wlines > > > of code each month, but do not have one generally accepted and working > > > process for bug reports. > > > > It's a pity that we do not have one, indeed, and so perhaps it's a good idea > > to try to create one? Not necessarily focusing on the Bugzilla for a little > > while. ;-) > > I'm not focussed on Bugzilla. > > But a submitter should send a bug report _once_ through one well-defined > medium, this should result in the bug report not being lost, and every > other communication of the submitter should be triggered by developers > requesting additional information. I don't think that have to be only *one* medium as long as we're able to track the bugs (see my last reply in the other thread). > I don't care whether that's done with Bugzilla, some email based bug > tracker like the Debian bug tracker, someone putting emails manually > into some bug tracker like you are doing, or whatever else. That last solution doesn't scale very well ... How about using the system in which it's possible to report bugs using both email and a web interface? We can request that the address of the bug tracker be added to the Cc lists of bug reports sent by email and we can make it resend reports filed with it to the appropriate mailing lists and with the appropriate email headers. 
This is technically doable. Greetings, Rafael
Re: kernel bugzilla is FPOS (was: Re: "buggy cmd640" message followed by soft lockup)
On Sun, Nov 25, 2007 at 11:38:59PM +0100, Rafael J. Wysocki wrote: > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > On Sun, Nov 25, 2007 at 10:28:06PM +0100, Rafael J. Wysocki wrote: > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > >.. > > > > First of all, Bugzilla is a quite often used bug tracker in the open > > > > source world [1], so many users already know it. > > > > > > > > But more important, "it pretends to require them to spend" isn't true > > > > because there's no pretending - we actually often require bug reporters > > > > to spend a lot of time on the bug report (e.g. when asking for > > > > bisecting). > > > > > > But not *initially*. > > > > > > We should not confuse *debugging* with *reporting bugs*. While the > > > former is > > > actually more difficult and more time consuming than writing the code in > > > which > > > the bug is present, the latter should be as simple as sending an email. > > > > For hardcore geeks like you and me sending an email might be easier than > > using some web interface. > > > > Normal humans tend to be more accustomed to web interfaces, and > > following the instructions on some web page is _much_ easier than > > reading three text files for knowing what to write in an email. > > Hm, this is a good argument for having such a web interface, but IMO it > shouldn't be mandatory. IOW, there should be a way to report a bug using > plain > email, if the reporter prefers that. We can, however, request that the > address > of our bug tracking system be added to the report's Cc list. Looking at both other open source projects and the support of commercial software a web interface should be enough. But this is not the problem - the problem is what happens after the initial report with the bug report. > Now, the question is what information this web interface should ask for. 
> > IMO, first, it should ask for what the bug is against, ie.: > - kernel version (to be obtained from 'git describe' or from /proc/version or > from .config, if the kernel doesn't boot) > - architecture (x86, ARM, MIPS etc.) > - subsystem and subsubsystem (that could be selectable from a menu and might > depend on the architecture) > > It also should ask if the problem is a regression and what was the last known > good kernel (I'd prefer that to be the last known major release selectable > from > a list). > > Also, the reporter should be required to provide a summary (subject) and > a (concise) description of the problem and a list of email addresses to > send the report to in addition to the regular handling (there should be a way > to verify which addresses are acceptable). > > Anything else? > > Next, the report should be sent to a mailing list selected on the basis of the > information provided (not necessarily to individual developers, unless there > are some addresses provided explicitly by the reporter). The architecture choice seems to be the only thing from your list that isn't already available in the "Enter a new bug report" dialog of the kernel Bugzilla. > IMO, it should be possible to work on the bug using both email and the web > interface, whichever is preferred by the participant in question, without the > need to stick to any of them (ie. email messages sent in the corresponding > email thread should be registered by the bug tracking system and comments > entered into it should appear as messages in the email thread with the > appropriate To:, From: and Cc: information). > > There surely are more things that we'd like it to do, but the above seem to be > a reasonable minimum. Except from the From: header in outgoing emails the kernel Bugzilla already offers this for years. >... > Greetings, > Rafael cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. 
"Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 10/27] ptrace: generic resume
> Could we by any chance just force every architecture using generic code > to implement PTRACE_SINGLESTEP and PTRACE_SYSEMU? This will lead to > both far less messy code and a more consistent user interface. I'd like to look into that later after most arch's have moved to using the generic code for their existing support. I am thoroughly in favor, but it requires some more groundwork that can come after this initial stage. Thanks, Roland
Re: kernel bugzilla is FPOS (was: Re: "buggy cmd640" message followed by soft lockup)
On Sunday, 25 of November 2007, Rafael J. Wysocki wrote: > On Sunday, 25 of November 2007, Adrian Bunk wrote: > > On Sun, Nov 25, 2007 at 10:28:06PM +0100, Rafael J. Wysocki wrote: > > > On Sunday, 25 of November 2007, Adrian Bunk wrote: [--snip--] > > Even worse: > > Different people have different opinions what they need and what they > > don't want... > > Let's collect these opitions, then, and try to find a solution that would > satisfy all of them or at least the majority of them. s/opitions/opinions/ Sorry.
Re: [PATCH 01/27] ptrace: arch_has_single_step
On Sun, Nov 25, 2007 at 01:55:07PM -0800, Roland McGrath wrote: > This defines the new macro arch_has_single_step() in linux/ptrace.h, a > default for when asm/ptrace.h does not define it. It declares the new > user_enable_single_step and user_disable_single_step functions. > This is not used yet, but paves the way to harmonize on this interface > for the arch-specific calls on all machines. Why should arch_has_single_step be a function-like macro? I can't think of a case where this wouldn't be a compile-time constant. And given that this is hopefully a transitionary ifdef because eventually all architectures would use the generic code I'd prefer ifdefs in the code that clearly mark this as transitional in this case. > +static inline void user_enable_single_step(struct task_struct *task) > +static inline void user_disable_single_step(struct task_struct *task) And I don't think these should be provided at all as generic stubs. If an arch doesn't use the generic code it simply shouldn't compile the code using this. What's the reason for the user_ prefix btw, most architectures seem to have these functions already anyway, just without the user_ prefix.
Re: [PATCH 10/27] ptrace: generic resume
On Sun, Nov 25, 2007 at 02:01:09PM -0800, Roland McGrath wrote: > This makes ptrace_request handle all the ptrace requests that wake > up the traced task. These do low-level ptrace implementation magic > that is not arch-specific and should be kept out of arch code. The > implementations on each arch usually do the same thing. The new > generic code makes use of the arch_has_single_step macro and generic > entry points to handle PTRACE_SINGLESTEP. Nice, I've been trying to get people to move this to common code for a while :) > +#ifdef PTRACE_SINGLESTEP > +#define is_singlestep(request) ((request) == PTRACE_SINGLESTEP) > +#else > +#define is_singlestep(request) 0 > +#endif > + > +#ifdef PTRACE_SYSEMU > +#define is_sysemu_singlestep(request)((request) == > PTRACE_SYSEMU_SINGLESTEP) > +#else > +#define is_sysemu_singlestep(request)0 > +#endif Could we by any chance just force every architecture using generic code to implement PTRACE_SINGLESTEP and PTRACE_SYSEMU? This will lead to both far less messy code and a more consistent user interface.
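To make the quoted is_singlestep()/is_sysemu_singlestep() macros easier to follow, here is a small standalone model of them. The assumptions are loud ones: this imaginary arch defines PTRACE_SINGLESTEP but no PTRACE_SYSEMU, the request number is set by hand instead of coming from a system header, and wants_single_step() is a hypothetical helper that is not part of the patch.

```c
/*
 * Hypothetical arch: single-step is available, SYSEMU is not.  The value is
 * defined here by hand for illustration rather than taken from a header.
 */
#define PTRACE_SINGLESTEP 9

/* Same shape as the macros in the quoted patch. */
#ifdef PTRACE_SINGLESTEP
#define is_singlestep(request) ((request) == PTRACE_SINGLESTEP)
#else
#define is_singlestep(request) 0
#endif

#ifdef PTRACE_SYSEMU
#define is_sysemu_singlestep(request) ((request) == PTRACE_SYSEMU_SINGLESTEP)
#else
#define is_sysemu_singlestep(request) 0
#endif

/*
 * Hypothetical generic helper: one expression serves every arch, and the
 * SYSEMU half collapses to the constant 0 where the request does not exist.
 */
static int wants_single_step(long request)
{
	return is_singlestep(request) || is_sysemu_singlestep(request);
}
```

If every arch using the generic code were required to define both requests, as suggested in the message above, the #else branches and the macros themselves could go away.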
Re: kernel bugzilla is FPOS (was: Re: "buggy cmd640" message followed by soft lockup)
On Sunday, 25 of November 2007, Adrian Bunk wrote: > On Sun, Nov 25, 2007 at 10:28:06PM +0100, Rafael J. Wysocki wrote: > > On Sunday, 25 of November 2007, Adrian Bunk wrote: > >.. > > > First of all, Bugzilla is a quite often used bug tracker in the open > > > source world [1], so many users already know it. > > > > > > But more important, "it pretends to require them to spend" isn't true > > > because there's no pretending - we actually often require bug reporters > > > to spend a lot of time on the bug report (e.g. when asking for > > > bisecting). > > > > But not *initially*. > > > > We should not confuse *debugging* with *reporting bugs*. While the former > > is > > actually more difficult and more time consuming than writing the code in > > which > > the bug is present, the latter should be as simple as sending an email. > > For hardcore geeks like you and me sending an email might be easier than > using some web interface. > > Normal humans tend to be more accustomed to web interfaces, and > following the instructions on some web page is _much_ easier than > reading three text files for knowing what to write in an email. Hm, this is a good argument for having such a web interface, but IMO it shouldn't be mandatory. IOW, there should be a way to report a bug using plain email, if the reporter prefers that. We can, however, request that the address of our bug tracking system be added to the report's Cc list. Now, the question is what information this web interface should ask for. IMO, first, it should ask for what the bug is against, ie.: - kernel version (to be obtained from 'git describe' or from /proc/version or from .config, if the kernel doesn't boot) - architecture (x86, ARM, MIPS etc.) - subsystem and subsubsystem (that could be selectable from a menu and might depend on the architecture) It also should ask if the problem is a regression and what was the last known good kernel (I'd prefer that to be the last known major release selectable from a list). 
Also, the reporter should be required to provide a summary (subject) and a (concise) description of the problem and a list of email addresses to send the report to in addition to the regular handling (there should be a way to verify which addresses are acceptable). Anything else? Next, the report should be sent to a mailing list selected on the basis of the information provided (not necessarily to individual developers, unless there are some addresses provided explicitly by the reporter). IMO, it should be possible to work on the bug using both email and the web interface, whichever is preferred by the participant in question, without the need to stick to any of them (ie. email messages sent in the corresponding email thread should be registered by the bug tracking system and comments entered into it should appear as messages in the email thread with the appropriate To:, From: and Cc: information). There surely are more things that we'd like it to do, but the above seem to be a reasonable minimum. > > > I'm also sometimes writing bug reports in different areas, and in my > > > experience it doesn't matter whether it's web-based Bugzilla, the > > > email-based Debian bug tracker or whatever else system - the time spent > > > on a good bug report is not spend on pasting the text whereever or on > > > clicking on a few boxes, the time is spent on tracking the issue down > > > and writing a good bug report. > > > > Apparently, you are expecting the reporters do *debug* problems, while they > > need > > not be aware of how to do that. > > > > IMHO, we should make reporting problems as simple as reasonably possible and > > Agreed, and as said above simple = web interface. > > >... > > > What matters for a bug reporter is to get a solution for his problem > > > within a reasonable amount of time. > > > > Still, it's annoying if you attach tons of information to the report and > > that > > information does not turn out to be useful. > > Agreed. 
> > > > > Also, some developers do not consider the Bugzilla as a useful thing and > > > > wouldn't like to use it (which is why this thread has appeared, among > > > > other > > > > things ;-)). > > > >... > > > > > > And that's part of the problem. > > > > > > Bugzilla is a usable tool, but it isn't the only tool available. > > > > > > If there was one tool all developers would be willing to use that would > > > be a reason why we should switch to whatever tool this is. > > > > The choice of the tool should be a result of the choice of a *method*. IOW, > > we have to know our needs and choose the tool that satisfies them or write > > one > > if it doesn't exist. > > > > For now, IMHO, we don't really know what we need. > > Even worse: > Different people have different opinions what they need and what they > don't want... Let's collect these opitions, then, and try to find a solution that would satisfy all of them or at least the majority of them.
[patch 3/4] Timerfd v3 - wire the new timerfd API to the x86 family
Wires up the new timerfd API to the x86 family. Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]> - Davide --- arch/x86/ia32/ia32entry.S |4 +++- arch/x86/kernel/syscall_table_32.S |4 +++- include/asm-x86/unistd_32.h|6 -- include/asm-x86/unistd_64.h|9 +++-- 4 files changed, 17 insertions(+), 6 deletions(-) Index: linux-2.6.mod/include/asm-x86/unistd_32.h === --- linux-2.6.mod.orig/include/asm-x86/unistd_32.h 2007-11-23 13:55:15.0 -0800 +++ linux-2.6.mod/include/asm-x86/unistd_32.h 2007-11-24 12:49:28.0 -0800 @@ -327,13 +327,15 @@ #define __NR_epoll_pwait 319 #define __NR_utimensat 320 #define __NR_signalfd 321 -#define __NR_timerfd 322 +#define __NR_timerfd_create322 #define __NR_eventfd 323 #define __NR_fallocate 324 +#define __NR_timerfd_settime 325 +#define __NR_timerfd_gettime 326 #ifdef __KERNEL__ -#define NR_syscalls 325 +#define NR_syscalls 327 #define __ARCH_WANT_IPC_PARSE_VERSION #define __ARCH_WANT_OLD_READDIR Index: linux-2.6.mod/include/asm-x86/unistd_64.h === --- linux-2.6.mod.orig/include/asm-x86/unistd_64.h 2007-11-23 13:55:15.0 -0800 +++ linux-2.6.mod/include/asm-x86/unistd_64.h 2007-11-24 12:49:28.0 -0800 @@ -629,12 +629,17 @@ __SYSCALL(__NR_epoll_pwait, sys_epoll_pwait) #define __NR_signalfd 282 __SYSCALL(__NR_signalfd, sys_signalfd) -#define __NR_timerfd 283 -__SYSCALL(__NR_timerfd, sys_timerfd) +#define __NR_timerfd_create283 +__SYSCALL(__NR_timerfd_create, sys_timerfd_create) #define __NR_eventfd 284 __SYSCALL(__NR_eventfd, sys_eventfd) #define __NR_fallocate 285 __SYSCALL(__NR_fallocate, sys_fallocate) +#define __NR_timerfd_settime 286 +__SYSCALL(__NR_timerfd_settime, sys_timerfd_settime) +#define __NR_timerfd_gettime 287 +__SYSCALL(__NR_timerfd_gettime, sys_timerfd_gettime) + #ifndef __NO_STUBS #define __ARCH_WANT_OLD_READDIR Index: linux-2.6.mod/arch/x86/kernel/syscall_table_32.S === --- linux-2.6.mod.orig/arch/x86/kernel/syscall_table_32.S 2007-11-23 13:55:16.0 -0800 +++ linux-2.6.mod/arch/x86/kernel/syscall_table_32.S2007-11-24 12:49:28.0 
-0800 @@ -321,6 +321,8 @@ .long sys_epoll_pwait .long sys_utimensat /* 320 */ .long sys_signalfd - .long sys_timerfd + .long sys_timerfd_create .long sys_eventfd .long sys_fallocate + .long sys_timerfd_settime /* 325 */ + .long sys_timerfd_gettime Index: linux-2.6.mod/arch/x86/ia32/ia32entry.S === --- linux-2.6.mod.orig/arch/x86/ia32/ia32entry.S2007-11-23 13:55:16.0 -0800 +++ linux-2.6.mod/arch/x86/ia32/ia32entry.S 2007-11-24 12:49:28.0 -0800 @@ -723,7 +723,9 @@ .quad sys_epoll_pwait .quad compat_sys_utimensat /* 320 */ .quad compat_sys_signalfd - .quad compat_sys_timerfd + .quad sys_timerfd_create .quad sys_eventfd .quad sys32_fallocate + .quad compat_sys_timerfd_settime/* 325 */ + .quad compat_sys_timerfd_gettime ia32_syscall_end:
[patch 4/4] Timerfd v3 - un-break CONFIG_TIMERFD
Remove the BROKEN dependency from CONFIG_TIMERFD.

Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>

- Davide

---
 init/Kconfig |1 -
 1 file changed, 1 deletion(-)

Index: linux-2.6.mod/init/Kconfig
===
--- linux-2.6.mod.orig/init/Kconfig 2007-11-23 13:55:15.0 -0800
+++ linux-2.6.mod/init/Kconfig 2007-11-24 12:49:30.0 -0800
@@ -566,7 +566,6 @@
 config TIMERFD
 	bool "Enable timerfd() system call" if EMBEDDED
 	select ANON_INODES
-	depends on BROKEN
 	default y
 	help
 	  Enable the timerfd() system call that allows to receive timer
[patch 1/4] Timerfd v3 - introduce a new hrtimer_forward_now() function
I think that advancing a timer relative to its own clock's current "now"
will be a pretty common usage, so, without exposing hrtimer's internals,
we add a new hrtimer_forward_now() function.

Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>

- Davide

---
 include/linux/hrtimer.h |    7 +++++++
 1 file changed, 7 insertions(+)

Index: linux-2.6.mod/include/linux/hrtimer.h
===
--- linux-2.6.mod.orig/include/linux/hrtimer.h	2007-11-23 13:55:16.0 -0800
+++ linux-2.6.mod/include/linux/hrtimer.h	2007-11-24 12:48:05.0 -0800
@@ -298,6 +298,13 @@
 extern unsigned long hrtimer_forward(struct hrtimer *timer, ktime_t now,
 				     ktime_t interval);
 
+/* Forward a hrtimer so it expires after the hrtimer's current now */
+static inline unsigned long hrtimer_forward_now(struct hrtimer *timer,
+						ktime_t interval)
+{
+	return hrtimer_forward(timer, timer->base->get_time(), interval);
+}
+
 /* Precise sleep: */
 extern long hrtimer_nanosleep(struct timespec *rqtp,
 			      struct timespec *rmtp,
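The "forward past now" arithmetic can be sketched in plain userspace C. This is only a model under stated assumptions: `struct model_timer`, `model_forward()`, `model_forward_now()` and `fake_clock()` are hypothetical names, times are plain nanosecond integers rather than ktime_t, and the clamping of very short intervals that the kernel performs is left out.

```c
#include <stdint.h>

typedef int64_t (*model_clock_fn)(void);

/* Hypothetical userspace stand-in for a timer; not the kernel's struct hrtimer. */
struct model_timer {
	int64_t expires;		/* absolute expiry time, in ns */
	model_clock_fn get_time;	/* clock this timer runs against */
};

/* Fake clock so the sketch is deterministic. */
static int64_t fake_now_ns = 350;

static int64_t fake_clock(void)
{
	return fake_now_ns;
}

/*
 * Advance t->expires by whole intervals until it lies strictly after now.
 * Returns the number of intervals added (the overrun count), or 0 if the
 * timer has not expired yet.
 */
static uint64_t model_forward(struct model_timer *t, int64_t now,
			      int64_t interval)
{
	int64_t delta = now - t->expires;
	uint64_t orun;

	if (delta < 0 || interval <= 0)
		return 0;

	orun = (uint64_t)(delta / interval) + 1;
	t->expires += (int64_t)orun * interval;
	return orun;
}

/* Same operation, but "now" is read from the timer's own clock. */
static uint64_t model_forward_now(struct model_timer *t, int64_t interval)
{
	return model_forward(t, t->get_time(), interval);
}
```

With `expires = 100`, an interval of 100 and the fake clock at 350, `model_forward_now()` reports 3 overruns and leaves `expires` at 400. The caller never has to know which clock the timer uses, which is the point of the wrapper.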
[patch 2/4] Timerfd v3 - new timerfd API
This is the new timerfd API as it is implemented by the following patch:

int timerfd_create(int clockid, int flags);
int timerfd_settime(int ufd, int flags,
		    const struct itimerspec *utmr,
		    struct itimerspec *otmr);
int timerfd_gettime(int ufd, struct itimerspec *otmr);

The timerfd_create() API creates an un-programmed timerfd fd. The "clockid"
parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.

The timerfd_settime() API applies new settings to the timerfd fd, optionally
retrieving the previous expiration time (if the "otmr" parameter is not
NULL). The time value specified in "utmr" is absolute if the
TFD_TIMER_ABSTIME bit is set in the "flags" parameter. Otherwise it's a
relative time.

The timerfd_gettime() API returns the next expiration time of the timer, or
{0, 0} if the timerfd has not been set yet.

Like the previous timerfd API implementation, read(2) and poll(2) are
supported (with the same interface).

Here's a simple test program I used to exercise the new timerfd APIs:

http://www.xmailserver.org/timerfd-test2.c

Signed-off-by: Davide Libenzi <[EMAIL PROTECTED]>

- Davide

---
 fs/compat.c              |   32 ++-
 fs/timerfd.c             |  199 ++-
 include/linux/compat.h   |    7 +
 include/linux/syscalls.h |    7 +
 4 files changed, 166 insertions(+), 79 deletions(-)

Index: linux-2.6.mod/fs/timerfd.c
===
--- linux-2.6.mod.orig/fs/timerfd.c	2007-11-23 13:55:16.0 -0800
+++ linux-2.6.mod/fs/timerfd.c	2007-11-24 12:49:21.0 -0800
@@ -25,13 +25,15 @@
 	struct hrtimer tmr;
 	ktime_t tintv;
 	wait_queue_head_t wqh;
+	u64 ticks;
 	int expired;
+	int clockid;
 };
 
 /*
  * This gets called when the timer event triggers. We set the "expired"
  * flag, but we do not re-arm the timer (in case it's necessary,
- * tintv.tv64 != 0) until the timer is read.
+ * tintv.tv64 != 0) until the timer is accessed.
  */
 static enum hrtimer_restart timerfd_tmrproc(struct hrtimer *htmr)
 {
@@ -40,13 +42,14 @@
 
 	spin_lock_irqsave(&ctx->wqh.lock, flags);
 	ctx->expired = 1;
+	ctx->ticks++;
 	wake_up_locked(&ctx->wqh);
 	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
 
 	return HRTIMER_NORESTART;
 }
 
-static void timerfd_setup(struct timerfd_ctx *ctx, int clockid, int flags,
+static void timerfd_setup(struct timerfd_ctx *ctx, int flags,
 			  const struct itimerspec *ktmr)
 {
 	enum hrtimer_mode htmode;
@@ -57,8 +60,9 @@
 	texp = timespec_to_ktime(ktmr->it_value);
 	ctx->expired = 0;
+	ctx->ticks = 0;
 	ctx->tintv = timespec_to_ktime(ktmr->it_interval);
-	hrtimer_init(&ctx->tmr, clockid, htmode);
+	hrtimer_init(&ctx->tmr, ctx->clockid, htmode);
 	ctx->tmr.expires = texp;
 	ctx->tmr.function = timerfd_tmrproc;
 	if (texp.tv64 != 0)
@@ -83,7 +87,7 @@
 	poll_wait(file, &ctx->wqh, wait);
 
 	spin_lock_irqsave(&ctx->wqh.lock, flags);
-	if (ctx->expired)
+	if (ctx->ticks)
 		events |= POLLIN;
 	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
 
@@ -102,11 +106,11 @@
 		return -EINVAL;
 	spin_lock_irq(&ctx->wqh.lock);
 	res = -EAGAIN;
-	if (!ctx->expired && !(file->f_flags & O_NONBLOCK)) {
+	if (!ctx->ticks && !(file->f_flags & O_NONBLOCK)) {
 		__add_wait_queue(&ctx->wqh, &wait);
 		for (res = 0;;) {
 			set_current_state(TASK_INTERRUPTIBLE);
-			if (ctx->expired) {
+			if (ctx->ticks) {
 				res = 0;
 				break;
 			}
@@ -121,22 +125,21 @@
 		__remove_wait_queue(&ctx->wqh, &wait);
 		__set_current_state(TASK_RUNNING);
 	}
-	if (ctx->expired) {
-		ctx->expired = 0;
-		if (ctx->tintv.tv64 != 0) {
+	if (ctx->ticks) {
+		ticks = ctx->ticks;
+		if (ctx->expired && ctx->tintv.tv64) {
 			/*
 			 * If tintv.tv64 != 0, this is a periodic timer that
 			 * needs to be re-armed. We avoid doing it in the timer
 			 * callback to avoid DoS attacks specifying a very
 			 * short timer period.
 			 */
-			ticks = (u64)
-				hrtimer_forward(&ctx->tmr,
-						hrtimer_cb_get_time(&ctx->tmr),
-						ctx->tintv);
+			ticks += (u64) hrtimer_forward_now(&ctx->tmr,
+							   ctx->tintv) - 1;
 			hrtimer_restart(&ctx->tmr);
-		} else
-			ticks = 1;
+		}
+		ctx->expired = 0;
[PATCH 27/27] x86: PTRACE_SINGLEBLOCK
This adds the PTRACE_SINGLEBLOCK request on x86, matching the ia64 feature.
The implementation comes from the generic ptrace code and relies on the
low-level machine support provided by arch_has_block_step() et al.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/ia32/ptrace32.c     |    1 +
 include/asm-x86/ptrace-abi.h |    2 ++
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/ia32/ptrace32.c b/arch/x86/ia32/ptrace32.c
index 5661abd..d1fe78c 100644
--- a/arch/x86/ia32/ptrace32.c
+++ b/arch/x86/ia32/ptrace32.c
@@ -212,6 +212,7 @@ asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data)
 	case PTRACE_KILL:
 	case PTRACE_CONT:
 	case PTRACE_SINGLESTEP:
+	case PTRACE_SINGLEBLOCK:
 	case PTRACE_DETACH:
 	case PTRACE_SYSCALL:
 	case PTRACE_OLDSETOPTIONS:
diff --git a/include/asm-x86/ptrace-abi.h b/include/asm-x86/ptrace-abi.h
index 7524e12..adce6b5 100644
--- a/include/asm-x86/ptrace-abi.h
+++ b/include/asm-x86/ptrace-abi.h
@@ -78,4 +78,6 @@
 # define PTRACE_SYSEMU_SINGLESTEP 32
 #endif
 
+#define PTRACE_SINGLEBLOCK	33	/* resume execution until next branch */
+
 #endif
[PATCH 25/27] x86: debugctlmsr arch_has_block_step
This implements user-mode step-until-branch on x86 using the BTF bit in
MSR_IA32_DEBUGCTLMSR. It's just like single-step, only less so.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/step.c     |   64 +--
 arch/x86/kernel/traps_32.c |    6 ++++++
 arch/x86/kernel/traps_64.c |    6 ++++++
 include/asm-x86/ptrace.h   |    7 +++++++
 4 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
index 243bff6..cf4b9da 100644
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -107,7 +107,10 @@ static int is_setting_trap_flag(struct task_struct *child, struct pt_regs *regs)
 	return 0;
 }
 
-void user_enable_single_step(struct task_struct *child)
+/*
+ * Enable single-stepping. Return nonzero if user mode is not using TF itself.
+ */
+static int enable_single_step(struct task_struct *child)
 {
 	struct pt_regs *regs = task_pt_regs(child);
 
@@ -122,7 +125,7 @@ void user_enable_single_step(struct task_struct *child)
 	 * If TF was already set, don't do anything else
 	 */
 	if (regs->eflags & X86_EFLAGS_TF)
-		return;
+		return 0;
 
 	/* Set TF on the kernel stack.. */
 	regs->eflags |= X86_EFLAGS_TF;
@@ -133,13 +136,68 @@ void user_enable_single_step(struct task_struct *child)
 	 * won't clear it by hand later.
 	 */
 	if (is_setting_trap_flag(child, regs))
-		return;
+		return 0;
 
 	set_tsk_thread_flag(child, TIF_FORCED_TF);
+
+	return 1;
+}
+
+/*
+ * Install this value in MSR_IA32_DEBUGCTLMSR whenever child is running.
+ */
+static void write_debugctlmsr(struct task_struct *child, unsigned long val)
+{
+	child->thread.debugctlmsr = val;
+
+	if (child != current)
+		return;
+
+#ifdef CONFIG_X86_64
+	wrmsrl(MSR_IA32_DEBUGCTLMSR, val);
+#else
+	wrmsr(MSR_IA32_DEBUGCTLMSR, val, 0);
+#endif
+}
+
+/*
+ * Enable single or block step.
+ */
+static void enable_step(struct task_struct *child, bool block)
+{
+	/*
+	 * Make sure block stepping (BTF) is not enabled unless it should be.
+	 * Note that we don't try to worry about any is_setting_trap_flag()
+	 * instructions after the first when using block stepping.
+	 * So no one should try to use debugger block stepping in a program
+	 * that uses user-mode single stepping itself.
+	 */
+	if (enable_single_step(child) && block) {
+		set_tsk_thread_flag(child, TIF_DEBUGCTLMSR);
+		write_debugctlmsr(child, DEBUGCTLMSR_BTF);
+	} else if (test_and_clear_tsk_thread_flag(child, TIF_DEBUGCTLMSR)) {
+		write_debugctlmsr(child, 0);
+	}
+}
+
+void user_enable_single_step(struct task_struct *child)
+{
+	enable_step(child, 0);
+}
+
+void user_enable_block_step(struct task_struct *child)
+{
+	enable_step(child, 1);
 }
 
 void user_disable_single_step(struct task_struct *child)
 {
+	/*
+	 * Make sure block stepping (BTF) is disabled.
+	 */
+	if (test_and_clear_tsk_thread_flag(child, TIF_DEBUGCTLMSR))
+		write_debugctlmsr(child, 0);
+
 	/* Always clear TIF_SINGLESTEP... */
 	clear_tsk_thread_flag(child, TIF_SINGLESTEP);
 
diff --git a/arch/x86/kernel/traps_32.c b/arch/x86/kernel/traps_32.c
index 298d13e..03d5b41 100644
--- a/arch/x86/kernel/traps_32.c
+++ b/arch/x86/kernel/traps_32.c
@@ -830,6 +830,12 @@ fastcall void __kprobes do_debug(struct pt_regs * regs, long error_code)
 
 	get_debugreg(condition, 6);
 
+	/*
+	 * The processor cleared BTF, so don't mark that we need it set.
+	 */
+	clear_tsk_thread_flag(tsk, TIF_DEBUGCTLMSR);
+	tsk->thread.debugctlmsr = 0;
+
 	if (notify_die(DIE_DEBUG, "debug", regs, condition, error_code,
 					SIGTRAP) == NOTIFY_STOP)
 		return;
diff --git a/arch/x86/kernel/traps_64.c b/arch/x86/kernel/traps_64.c
index daf35a8..ec70f5c 100644
--- a/arch/x86/kernel/traps_64.c
+++ b/arch/x86/kernel/traps_64.c
@@ -848,6 +848,12 @@ asmlinkage void __kprobes do_debug(struct pt_regs * regs,
 
 	get_debugreg(condition, 6);
 
+	/*
+	 * The processor cleared BTF, so don't mark that we need it set.
+	 */
+	clear_tsk_thread_flag(tsk, TIF_DEBUGCTLMSR);
+	tsk->thread.debugctlmsr = 0;
+
 	if (notify_die(DIE_DEBUG, "debug", regs, condition, error_code,
 						SIGTRAP) == NOTIFY_STOP)
 		return;
diff --git a/include/asm-x86/ptrace.h b/include/asm-x86/ptrace.h
index d223dec..04204f3 100644
--- a/include/asm-x86/ptrace.h
+++ b/include/asm-x86/ptrace.h
@@ -150,6 +150,13 @@ enum {
 
 extern void user_enable_single_step(struct task_struct *);
 extern void user_disable_single_step(struct task_struct *);
+extern void user_enable_block_step(struct task_struct *);
[PATCH 24/27] x86: debugctlmsr context switch
This adds low-level support for a per-thread value of MSR_IA32_DEBUGCTLMSR.
The per-thread value is switched in when TIF_DEBUGCTLMSR is set.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/process_32.c     |    6 +++++-
 arch/x86/kernel/process_64.c     |    3 +++
 include/asm-x86/processor_32.h   |    2 ++
 include/asm-x86/processor_64.h   |    2 ++
 include/asm-x86/thread_info_32.h |    6 ++++--
 include/asm-x86/thread_info_64.h |    4 +++-
 6 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index f59544e..3a822e3 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -581,10 +581,14 @@ static noinline void __switch_to_xtra(struct task_struct *prev_p,
 				      struct task_struct *next_p,
 				      struct tss_struct *tss)
 {
-	struct thread_struct *next;
+	struct thread_struct *prev, *next;
 
+	prev = &prev_p->thread;
 	next = &next_p->thread;
 
+	if (next->debugctlmsr != prev->debugctlmsr)
+		wrmsr(MSR_IA32_DEBUGCTLMSR, next->debugctlmsr, 0);
+
 	if (test_tsk_thread_flag(next_p, TIF_DEBUG)) {
 		set_debugreg(next->debugreg[0], 0);
 		set_debugreg(next->debugreg[1], 1);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 586f88e..c1e2e9a 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -544,6 +544,9 @@ static inline void __switch_to_xtra(struct task_struct *prev_p,
 	prev = &prev_p->thread,
 	next = &next_p->thread;
 
+	if (next->debugctlmsr != prev->debugctlmsr)
+		wrmsrl(MSR_IA32_DEBUGCTLMSR, next->debugctlmsr);
+
 	if (test_tsk_thread_flag(next_p, TIF_DEBUG)) {
 		loaddebug(next, 0);
 		loaddebug(next, 1);
diff --git a/include/asm-x86/processor_32.h b/include/asm-x86/processor_32.h
index 34e8063..660d9b0 100644
--- a/include/asm-x86/processor_32.h
+++ b/include/asm-x86/processor_32.h
@@ -370,6 +370,8 @@ struct thread_struct {
 	unsigned long	iopl;
 /* max allowed port in the bitmap, in bytes: */
 	unsigned long	io_bitmap_max;
+/* MSR_IA32_DEBUGCTLMSR value to switch in if TIF_DEBUGCTLMSR is set. */
+	unsigned long	debugctlmsr;
 };
 
 #define INIT_THREAD  {							\
diff --git a/include/asm-x86/processor_64.h b/include/asm-x86/processor_64.h
index 2dd739a..1d6daa0 100644
--- a/include/asm-x86/processor_64.h
+++ b/include/asm-x86/processor_64.h
@@ -239,6 +239,8 @@ struct thread_struct {
 	int		ioperm;
 	unsigned long	*io_bitmap_ptr;
 	unsigned io_bitmap_max;
+/* MSR_IA32_DEBUGCTLMSR value to switch in if TIF_DEBUGCTLMSR is set. */
+	unsigned long	debugctlmsr;
 /* cached TLS descriptors. */
 	u64 tls_array[GDT_ENTRY_TLS_ENTRIES];
 } __attribute__((aligned(16)));
diff --git a/include/asm-x86/thread_info_32.h b/include/asm-x86/thread_info_32.h
index 8a6483f..d5ae1e9 100644
--- a/include/asm-x86/thread_info_32.h
+++ b/include/asm-x86/thread_info_32.h
@@ -138,6 +138,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_FREEZE		19	/* is freezing for suspend */
 #define TIF_NOTSC		20	/* TSC is not accessible in userland */
 #define TIF_FORCED_TF		21	/* true if TF in eflags artificially */
+#define TIF_DEBUGCTLMSR		22	/* uses thread_struct.debugctlmsr */
 
 #define _TIF_SYSCALL_TRACE	(1
[PATCH 26/27] x86: debugctlmsr kprobes
This adjusts the x86 kprobes implementation to cope with per-thread
MSR_IA32_DEBUGCTLMSR being set for user mode. I haven't delved deep enough
into the kprobes code to be really sure this covers all the cases where the
user-mode BTF setting needs to be cleared or restored. It looks about right
to me.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/kprobes_32.c |   15 +++++++++++++++
 arch/x86/kernel/kprobes_64.c |   15 +++++++++++++++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kprobes_32.c b/arch/x86/kernel/kprobes_32.c
index d87a523..f151f06 100644
--- a/arch/x86/kernel/kprobes_32.c
+++ b/arch/x86/kernel/kprobes_32.c
@@ -217,8 +217,21 @@ static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 		kcb->kprobe_saved_eflags &= ~IF_MASK;
 }
 
+static __always_inline void clear_btf(void)
+{
+	if (test_thread_flag(TIF_DEBUGCTLMSR))
+		wrmsr(MSR_IA32_DEBUGCTLMSR, 0, 0);
+}
+
+static __always_inline void restore_btf(void)
+{
+	if (test_thread_flag(TIF_DEBUGCTLMSR))
+		wrmsr(MSR_IA32_DEBUGCTLMSR, current->thread.debugctlmsr, 0);
+}
+
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
 {
+	clear_btf();
 	regs->eflags |= TF_MASK;
 	regs->eflags &= ~IF_MASK;
 	/*single step inline if the instruction is an int3*/
@@ -542,6 +555,8 @@ static void __kprobes resume_execution(struct kprobe *p,
 	regs->eip = orig_eip + (regs->eip - copy_eip);
 
 no_change:
+	restore_btf();
+
 	return;
 }
 
diff --git a/arch/x86/kernel/kprobes_64.c b/arch/x86/kernel/kprobes_64.c
index 3db3611..d3be418 100644
--- a/arch/x86/kernel/kprobes_64.c
+++ b/arch/x86/kernel/kprobes_64.c
@@ -256,8 +256,21 @@ static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 		kcb->kprobe_saved_rflags &= ~IF_MASK;
 }
 
+static __always_inline void clear_btf(void)
+{
+	if (test_thread_flag(TIF_DEBUGCTLMSR))
+		wrmsrl(MSR_IA32_DEBUGCTLMSR, 0);
+}
+
+static __always_inline void restore_btf(void)
+{
+	if (test_thread_flag(TIF_DEBUGCTLMSR))
+		wrmsrl(MSR_IA32_DEBUGCTLMSR, current->thread.debugctlmsr);
+}
+
 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
 {
+	clear_btf();
 	regs->eflags |= TF_MASK;
 	regs->eflags &= ~IF_MASK;
 	/*single step inline if the instruction is an int3*/
@@ -534,6 +547,8 @@ static void __kprobes resume_execution(struct kprobe *p,
 	} else {
 		regs->rip = orig_rip + (regs->rip - copy_rip);
 	}
+
+	restore_btf();
 }
 
 int __kprobes post_kprobe_handler(struct pt_regs *regs)
[PATCH 22/27] x86: debugctlmsr constants
This adds constant macros for a few of the bits in MSR_IA32_DEBUGCTLMSR.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 include/asm-x86/msr-index.h |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86/msr-index.h b/include/asm-x86/msr-index.h
index a494473..4045bbe 100644
--- a/include/asm-x86/msr-index.h
+++ b/include/asm-x86/msr-index.h
@@ -63,6 +63,13 @@
 #define MSR_IA32_LASTINTFROMIP	0x01dd
 #define MSR_IA32_LASTINTTOIP	0x01de
 
+/* DEBUGCTLMSR bits (others vary by model): */
+#define _DEBUGCTLMSR_LBR	0 /* last branch recording */
+#define _DEBUGCTLMSR_BTF	1 /* single-step on branches */
+
+#define DEBUGCTLMSR_LBR		(1UL << _DEBUGCTLMSR_LBR)
+#define DEBUGCTLMSR_BTF		(1UL << _DEBUGCTLMSR_BTF)
+
 #define MSR_IA32_MC0_CTL	0x0400
 #define MSR_IA32_MC0_STATUS	0x0401
 #define MSR_IA32_MC0_ADDR	0x0402
[PATCH 23/27] x86: debugctlmsr kconfig
This adds the (internal) Kconfig macro CONFIG_X86_DEBUGCTLMSR, to be defined
when configuring to support only hardware that definitely supports
MSR_IA32_DEBUGCTLMSR with the BTF flag.

The Intel documentation says "P6 family" and later processors all have it.
I think the Kconfig dependencies are right to have it set for those and
unset for others (i.e., when 586 and earlier are supported).

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.cpu |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index c301622..69e2ee4 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -399,3 +399,7 @@ config X86_MINIMUM_CPU_FAMILY
 	default "4" if X86_32 && (X86_XADD || X86_CMPXCHG || X86_BSWAP || X86_WP_WORKS_OK)
 	default "3"
 
+config X86_DEBUGCTLMSR
+	bool
+	depends on !(M586MMX || M586TSC || M586 || M486 || M386)
+	default y
[PATCH 20/27] ptrace: arch_has_block_step
This defines the new macro arch_has_block_step() in linux/ptrace.h, a
default for when asm/ptrace.h does not define it. This is the analog of
arch_has_single_step() for step-until-branch on machines that have it.
It declares the new user_enable_block_step function, which goes with the
existing user_enable_single_step and user_disable_single_step.

This is not used yet, but paves the way to harmonize on this interface
for the arch-specific calls on all machines.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 include/linux/ptrace.h |   37 +++++++++++++++++++++++++++++++++----
 1 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index a6effc8..dd8f751 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -154,7 +154,8 @@ int generic_ptrace_pokedata(struct task_struct *tsk, long addr, long data);
  *
  * This can only be called when arch_has_single_step() has returned nonzero.
  * Set @task so that when it returns to user mode, it will trap after the
- * next single instruction executes.
+ * next single instruction executes. If arch_has_block_step() is defined,
+ * this must clear the effects of user_enable_block_step() too.
  */
 static inline void user_enable_single_step(struct task_struct *task)
 {
@@ -165,15 +166,43 @@ static inline void user_enable_single_step(struct task_struct *task)
  * user_disable_single_step - cancel user-mode single-step
  * @task: either current or a task stopped in %TASK_TRACED
  *
- * Clear @task of the effects of user_enable_single_step(). This can
- * be called whether or not user_enable_single_step() was ever called
- * on @task, and even if arch_has_single_step() returned zero.
+ * Clear @task of the effects of user_enable_single_step() and
+ * user_enable_block_step(). This can be called whether or not either
+ * of those was ever called on @task, and even if arch_has_single_step()
+ * returned zero.
  */
 static inline void user_disable_single_step(struct task_struct *task)
 {
 }
 #endif	/* arch_has_single_step */
 
+#ifndef arch_has_block_step
+/**
+ * arch_has_block_step - does this CPU support user-mode block-step?
+ *
+ * If this is defined, then there must be a function declaration or inline
+ * for user_enable_block_step(), and arch_has_single_step() must be defined
+ * too. arch_has_block_step() should evaluate to nonzero iff the machine
+ * supports step-until-branch for user mode. It can be a constant or it
+ * can test a CPU feature bit.
+ */
+#define arch_has_block_step()		(0)
+
+/**
+ * user_enable_block_step - step until branch in user-mode task
+ * @task: either current or a task stopped in %TASK_TRACED
+ *
+ * This can only be called when arch_has_block_step() has returned nonzero,
+ * and will never be called when single-instruction stepping is being used.
+ * Set @task so that when it returns to user mode, it will trap after the
+ * next branch or trap taken.
+ */
+static inline void user_enable_block_step(struct task_struct *task)
+{
+	BUG();			/* This can never be called. */
+}
+#endif	/* arch_has_block_step */
+
 #endif
 
 #endif
[PATCH 21/27] ptrace: generic PTRACE_SINGLEBLOCK
This makes ptrace_request handle PTRACE_SINGLEBLOCK along with
PTRACE_CONT et al. The new generic code makes use of the
arch_has_block_step macro and generic entry points on machines
that define them.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 kernel/ptrace.c |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 309796a..2824726 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -373,6 +373,12 @@ static int ptrace_setsiginfo(struct task_struct *child, siginfo_t __user * data)
 #define is_singlestep(request)		0
 #endif
 
+#ifdef PTRACE_SINGLEBLOCK
+#define is_singleblock(request)		((request) == PTRACE_SINGLEBLOCK)
+#else
+#define is_singleblock(request)		0
+#endif
+
 #ifdef PTRACE_SYSEMU
 #define is_sysemu_singlestep(request)	((request) == PTRACE_SYSEMU_SINGLESTEP)
 #else
@@ -396,7 +402,11 @@ static int ptrace_resume(struct task_struct *child, long request, long data)
 		clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
 #endif
 
-	if (is_singlestep(request) || is_sysemu_singlestep(request)) {
+	if (is_singleblock(request)) {
+		if (unlikely(!arch_has_block_step()))
+			return -EIO;
+		user_enable_block_step(child);
+	} else if (is_singlestep(request) || is_sysemu_singlestep(request)) {
 		if (unlikely(!arch_has_single_step()))
 			return -EIO;
 		user_enable_single_step(child);
@@ -438,6 +448,9 @@ int ptrace_request(struct task_struct *child, long request,
 #ifdef PTRACE_SINGLESTEP
 	case PTRACE_SINGLESTEP:
 #endif
+#ifdef PTRACE_SINGLEBLOCK
+	case PTRACE_SINGLEBLOCK:
+#endif
 #ifdef PTRACE_SYSEMU
 	case PTRACE_SYSEMU:
 	case PTRACE_SYSEMU_SINGLESTEP:
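The dispatch order in ptrace_resume() can be modelled outside the kernel. The sketch below is an assumption-laden illustration, not kernel code: `struct model_arch`, `enum step_mode` and `model_resume()` are hypothetical names, the capability checks are plain struct fields instead of the arch_has_*() macros, and the request numbers are the x86 values (33 for PTRACE_SINGLEBLOCK comes from patch 27/27).

```c
#include <errno.h>

/* Request numbers as used on x86; the enum itself is illustrative. */
enum {
	REQ_CONT	= 7,
	REQ_SINGLESTEP	= 9,
	REQ_SINGLEBLOCK	= 33,
};

enum step_mode { STEP_NONE, STEP_SINGLE, STEP_BLOCK };

/* Hypothetical stand-in for arch_has_single_step()/arch_has_block_step(). */
struct model_arch {
	int has_single_step;
	int has_block_step;
};

/*
 * Mirror the order of checks in the patched ptrace_resume(): block step is
 * tested first, then single step; anything else resumes without stepping.
 * Returns 0 and sets *mode on success, -EIO if the machine lacks support.
 */
static int model_resume(const struct model_arch *arch, long request,
			enum step_mode *mode)
{
	if (request == REQ_SINGLEBLOCK) {
		if (!arch->has_block_step)
			return -EIO;
		*mode = STEP_BLOCK;
	} else if (request == REQ_SINGLESTEP) {
		if (!arch->has_single_step)
			return -EIO;
		*mode = STEP_SINGLE;
	} else {
		*mode = STEP_NONE;
	}
	return 0;
}
```

On a machine model with `has_block_step == 0`, a REQ_SINGLEBLOCK request fails with -EIO and leaves `*mode` untouched, which is what a debugger would see when the hardware has no step-until-branch support.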
[PATCH 17/27] x86-64 ptrace debugreg cleanup
This cleans up the 64-bit ptrace code to separate the guts of the debug
register access from the implementation of PTRACE_PEEKUSR and PTRACE_POKEUSR.
The new functions ptrace_[gs]et_debugreg are made global so that the ia32
code can later be changed to call them too.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/ptrace_64.c |  140 ---
 include/asm-x86/ptrace.h    |    3 +
 2 files changed, 69 insertions(+), 74 deletions(-)

diff --git a/arch/x86/kernel/ptrace_64.c b/arch/x86/kernel/ptrace_64.c
index 8123ecb..bad8b3c 100644
--- a/arch/x86/kernel/ptrace_64.c
+++ b/arch/x86/kernel/ptrace_64.c
@@ -183,9 +183,63 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno)
 
 }
 
+unsigned long ptrace_get_debugreg(struct task_struct *child, int n)
+{
+	switch (n) {
+	case 0:		return child->thread.debugreg0;
+	case 1:		return child->thread.debugreg1;
+	case 2:		return child->thread.debugreg2;
+	case 3:		return child->thread.debugreg3;
+	case 6:		return child->thread.debugreg6;
+	case 7:		return child->thread.debugreg7;
+	}
+	return 0;
+}
+
+int ptrace_set_debugreg(struct task_struct *child, int n, unsigned long data)
+{
+	int i;
+
+	if (n < 4) {
+		int dsize = test_tsk_thread_flag(child, TIF_IA32) ? 3 : 7;
+		if (unlikely(data >= TASK_SIZE_OF(child) - dsize))
+			return -EIO;
+	}
+
+	switch (n) {
+	case 0:		child->thread.debugreg0 = data; break;
+	case 1:		child->thread.debugreg1 = data; break;
+	case 2:		child->thread.debugreg2 = data; break;
+	case 3:		child->thread.debugreg3 = data; break;
+
+	case 6:
+		if (data >> 32)
+			return -EIO;
+		child->thread.debugreg6 = data;
+		break;
+
+	case 7:
+		/*
+		 * See ptrace_32.c for an explanation of this awkward check.
+		 */
+		data &= ~DR_CONTROL_RESERVED;
+		for (i = 0; i < 4; i++)
+			if ((0x5554 >> ((data >> (16 + 4*i)) & 0xf)) & 1)
+				return -EIO;
+		child->thread.debugreg7 = data;
+		if (data)
+			set_tsk_thread_flag(child, TIF_DEBUG);
+		else
+			clear_tsk_thread_flag(child, TIF_DEBUG);
+		break;
+	}
+
+	return 0;
+}
+
 long arch_ptrace(struct task_struct *child, long request, long addr, long data)
 {
-	long i, ret;
+	long ret;
 	unsigned ui;
 
 	switch (request) {
@@ -204,32 +258,14 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
 		    addr > sizeof(struct user) - 7)
 			break;
 
-		switch (addr) {
-		case 0 ... sizeof(struct user_regs_struct) - sizeof(long):
+		tmp = 0;
+		if (addr < sizeof(struct user_regs_struct))
 			tmp = getreg(child, addr);
-			break;
-		case offsetof(struct user, u_debugreg[0]):
-			tmp = child->thread.debugreg0;
-			break;
-		case offsetof(struct user, u_debugreg[1]):
-			tmp = child->thread.debugreg1;
-			break;
-		case offsetof(struct user, u_debugreg[2]):
-			tmp = child->thread.debugreg2;
-			break;
-		case offsetof(struct user, u_debugreg[3]):
-			tmp = child->thread.debugreg3;
-			break;
-		case offsetof(struct user, u_debugreg[6]):
-			tmp = child->thread.debugreg6;
-			break;
-		case offsetof(struct user, u_debugreg[7]):
-			tmp = child->thread.debugreg7;
-			break;
-		default:
-			tmp = 0;
-			break;
+		else if (addr >= offsetof(struct user, u_debugreg[0])) {
+			addr -= offsetof(struct user, u_debugreg[0]);
+			tmp = ptrace_get_debugreg(child, addr / sizeof(long));
 		}
+
 		ret = put_user(tmp,(unsigned long __user *) data);
 		break;
 	}
@@ -241,63 +277,19 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
 		break;
 
 	case PTRACE_POKEUSR: /* write the word at location addr in the USER area */
-	{
-		int dsize = test_tsk_thread_flag(child, TIF_IA32) ? 3 : 7;
 		ret = -EIO;
 		if ((addr & 7) ||
 		    addr > sizeof(struct user) - 7)
 			break;
 
-		switch
[PATCH 19/27] x86-32 ptrace debugreg cleanup
This cleans up the 32-bit ptrace code to separate the guts of the debug
register access from the implementation of PTRACE_PEEKUSR and PTRACE_POKEUSR.
The new functions ptrace_[gs]et_debugreg match the new 64-bit entry points
for parity, but they don't need to be global.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/ptrace_32.c |  119 +--
 1 files changed, 69 insertions(+), 50 deletions(-)

diff --git a/arch/x86/kernel/ptrace_32.c b/arch/x86/kernel/ptrace_32.c
index 7c33244..0aa3756 100644
--- a/arch/x86/kernel/ptrace_32.c
+++ b/arch/x86/kernel/ptrace_32.c
@@ -119,6 +119,72 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno)
 }
 
 /*
+ * This function is trivial and will be inlined by the compiler.
+ * Having it separates the implementation details of debug
+ * registers from the interface details of ptrace.
+ */
+static unsigned long ptrace_get_debugreg(struct task_struct *child, int n)
+{
+	return child->thread.debugreg[n];
+}
+
+static int ptrace_set_debugreg(struct task_struct *child,
+			       int n, unsigned long data)
+{
+	if (unlikely(n == 4 || n == 5))
+		return -EIO;
+
+	if (n < 4 && unlikely(data >= TASK_SIZE - 3))
+		return -EIO;
+
+	if (n == 7) {
+		/*
+		 * Sanity-check data. Take one half-byte at once with
+		 * check = (val >> (16 + 4*i)) & 0xf. It contains the
+		 * R/Wi and LENi bits; bits 0 and 1 are R/Wi, and bits
+		 * 2 and 3 are LENi. Given a list of invalid values,
+		 * we do mask |= 1 << invalid_value, so that
+		 * (mask >> check) & 1 is a correct test for invalid
+		 * values.
+		 *
+		 * R/Wi contains the type of the breakpoint /
+		 * watchpoint, LENi contains the length of the watched
+		 * data in the watchpoint case.
+		 *
+		 * The invalid values are:
+		 * - LENi == 0x10 (undefined), so mask |= 0x0f00.
+		 * - R/Wi == 0x10 (break on I/O reads or writes), so
+		 *   mask |= 0x4444.
+		 * - R/Wi == 0x00 && LENi != 0x00, so we have mask |=
+		 *   0x1110.
+		 *
+		 * Finally, mask = 0x0f00 | 0x4444 | 0x1110 == 0x5f54.
+		 *
+		 * See the Intel Manual "System Programming Guide",
+		 * 15.2.4
+		 *
+		 * Note that LENi == 0x10 is defined on x86_64 in long
+		 * mode (i.e. even for 32-bit userspace software, but
+		 * 64-bit kernel), so the x86_64 mask value is 0x5454.
+		 * See the AMD manual no. 24593 (AMD64 System Programming)
+		 */
+		int i;
+		data &= ~DR_CONTROL_RESERVED;
+		for (i = 0; i < 4; i++)
+			if ((0x5f54 >> ((data >> (16 + 4*i)) & 0xf)) & 1)
+				return -EIO;
+		if (data)
+			set_tsk_thread_flag(child, TIF_DEBUG);
+		else
+			clear_tsk_thread_flag(child, TIF_DEBUG);
+	}
+
+	child->thread.debugreg[n] = data;
+
+	return 0;
+}
+
+/*
  * Called by kernel/ptrace.c when detaching..
  *
  * Make sure the single step bit is not set.
@@ -158,7 +224,7 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
 		   addr <= (long) &dummy->u_debugreg[7]){
 			addr -= (long) &dummy->u_debugreg[0];
 			addr = addr >> 2;
-			tmp = child->thread.debugreg[addr];
+			tmp = ptrace_get_debugreg(child, addr);
 		}
 		ret = put_user(tmp, datap);
 		break;
@@ -188,56 +254,9 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
 		ret = -EIO;
 		if(addr >= (long) &dummy->u_debugreg[0] &&
 		   addr <= (long) &dummy->u_debugreg[7]){
-
-			if(addr == (long) &dummy->u_debugreg[4]) break;
-			if(addr == (long) &dummy->u_debugreg[5]) break;
-			if(addr < (long) &dummy->u_debugreg[4] &&
-			   ((unsigned long) data) >= TASK_SIZE-3) break;
-
-			/* Sanity-check data. Take one half-byte at once with
-			 * check = (val >> (16 + 4*i)) & 0xf. It contains the
-			 * R/Wi and LENi bits; bits 0 and 1 are R/Wi, and bits
-			 * 2 and 3 are LENi. Given a list of invalid values,
-			 * we do mask |= 1 << invalid_value, so that
-			 * (mask >> check) & 1 is a correct test for invalid
-			 * values.
-			 *
-			 * R/Wi contains the
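The mask arithmetic in that comment can be checked with a small standalone program. This is a sketch under assumptions: `dr7_invalid_mask()` and `dr7_fields_valid()` are hypothetical names, the reserved-bit masking done with DR_CONTROL_RESERVED is omitted, and the values the comment writes as "0x10" and "0x00" are treated as the two-bit binary patterns 10 and 00.

```c
#include <stdint.h>

/*
 * Build the lookup mask described in the comment: bit v is set when the
 * 4-bit field v = (LENi << 2) | R/Wi is invalid.
 * len_10_defined selects whether LENi == binary 10 counts as valid.
 */
static unsigned int dr7_invalid_mask(int len_10_defined)
{
	unsigned int v, mask = 0;

	for (v = 0; v < 16; v++) {
		unsigned int rw = v & 3;	/* bits 0-1: R/Wi */
		unsigned int len = v >> 2;	/* bits 2-3: LENi */

		if (rw == 2)			/* break on I/O: 0x4444 */
			mask |= 1u << v;
		if (rw == 0 && len != 0)	/* exec needs LENi == 00: 0x1110 */
			mask |= 1u << v;
		if (len == 2 && !len_10_defined) /* undefined length: 0x0f00 */
			mask |= 1u << v;
	}
	return mask;
}

/*
 * Apply the mask to the four R/W+LEN fields of a DR7 value, which start
 * at bit 16 and are 4 bits wide each. Returns 1 if all fields are valid.
 */
static int dr7_fields_valid(uint32_t data, unsigned int mask)
{
	int i;

	for (i = 0; i < 4; i++)
		if ((mask >> ((data >> (16 + 4 * i)) & 0xf)) & 1)
			return 0;
	return 1;
}
```

`dr7_invalid_mask(0)` evaluates to 0x5f54, the constant used in the 32-bit ptrace_set_debugreg() above, and `dr7_invalid_mask(1)` evaluates to 0x5554, the constant that appears in the ptrace_64.c hunk of patch 17/27. Encoding the invalid combinations as bits of one constant turns the per-field check into a single shift and mask.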
[2.6 patch] scsi/qla2xxx/qla_os.c section fix
qla2x00_remove_one() mustn't be __devexit since it's called from
qla2xxx_pci_error_detected().

This patch fixes the following section mismatch:

<-- snip -->

...
WARNING: vmlinux.o(.text+0x2a4462): Section mismatch: reference to .exit.text:qla2x00_remove_one (between 'qla2xxx_pci_error_detected' and 'qla2x00_stop_timer')
...

<-- snip -->

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 drivers/scsi/qla2xxx/qla_os.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

764ebbed3c09f765963c20a3a326cf651685a81a
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index a5bcf1f..8ecc047 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1831,7 +1831,7 @@ probe_out:
 	return ret;
 }
 
-static void __devexit
+static void
 qla2x00_remove_one(struct pci_dev *pdev)
 {
 	scsi_qla_host_t *ha;
@@ -2965,7 +2965,7 @@ static struct pci_driver qla2xxx_pci_driver = {
 	},
 	.id_table	= qla2xxx_pci_tbl,
 	.probe		= qla2x00_probe_one,
-	.remove		= __devexit_p(qla2x00_remove_one),
+	.remove		= qla2x00_remove_one,
 	.err_handler	= &qla2xxx_err_handler,
 };
[PATCH 18/27] x86-64 ia32 ptrace debugreg cleanup
This cleans up the ia32 compat ptrace code to use shared code from
native ptrace for the implementation guts of debug register access.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/ia32/ptrace32.c |   63 ++
 1 files changed, 8 insertions(+), 55 deletions(-)

diff --git a/arch/x86/ia32/ptrace32.c b/arch/x86/ia32/ptrace32.c
index a9a5cd4..5661abd 100644
--- a/arch/x86/ia32/ptrace32.c
+++ b/arch/x86/ia32/ptrace32.c
@@ -40,7 +40,6 @@
 
 static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 {
-	int i;
 	__u64 *stack = (__u64 *)task_pt_regs(child);
 
 	switch (regno) {
@@ -95,43 +94,10 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 		break;
 	}
 
-	case offsetof(struct user32, u_debugreg[4]):
-	case offsetof(struct user32, u_debugreg[5]):
-		return -EIO;
-
-	case offsetof(struct user32, u_debugreg[0]):
-		child->thread.debugreg0 = val;
-		break;
-
-	case offsetof(struct user32, u_debugreg[1]):
-		child->thread.debugreg1 = val;
-		break;
-
-	case offsetof(struct user32, u_debugreg[2]):
-		child->thread.debugreg2 = val;
-		break;
-
-	case offsetof(struct user32, u_debugreg[3]):
-		child->thread.debugreg3 = val;
-		break;
-
-	case offsetof(struct user32, u_debugreg[6]):
-		child->thread.debugreg6 = val;
-		break;
-
-	case offsetof(struct user32, u_debugreg[7]):
-		val &= ~DR_CONTROL_RESERVED;
-		/* See arch/i386/kernel/ptrace.c for an explanation of
-		 * this awkward check.*/
-		for(i=0; i<4; i++)
-			if ((0x5454 >> ((val >> (16 + 4*i)) & 0xf)) & 1)
-			       return -EIO;
-		child->thread.debugreg7 = val;
-		if (val)
-			set_tsk_thread_flag(child, TIF_DEBUG);
-		else
-			clear_tsk_thread_flag(child, TIF_DEBUG);
-		break;
+	case offsetof(struct user32, u_debugreg[0]) ...
+		offsetof(struct user32, u_debugreg[7]):
+		regno -= offsetof(struct user32, u_debugreg[0]);
+		return ptrace_set_debugreg(child, regno / 4, val);
 
 	default:
 		if (regno > sizeof(struct user32) || (regno & 3))
@@ -188,23 +154,10 @@ static int getreg32(struct task_struct *child, unsigned regno, u32 *val)
 		*val &= ~X86_EFLAGS_TF;
 		break;
 
-	case offsetof(struct user32, u_debugreg[0]):
-		*val = child->thread.debugreg0;
-		break;
-	case offsetof(struct user32, u_debugreg[1]):
-		*val = child->thread.debugreg1;
-		break;
-	case offsetof(struct user32, u_debugreg[2]):
-		*val = child->thread.debugreg2;
-		break;
-	case offsetof(struct user32, u_debugreg[3]):
-		*val = child->thread.debugreg3;
-		break;
-	case offsetof(struct user32, u_debugreg[6]):
-		*val = child->thread.debugreg6;
-		break;
-	case offsetof(struct user32, u_debugreg[7]):
-		*val = child->thread.debugreg7;
+	case offsetof(struct user32, u_debugreg[0]) ...
+		offsetof(struct user32, u_debugreg[7]):
+		regno -= offsetof(struct user32, u_debugreg[0]);
+		*val = ptrace_get_debugreg(child, regno / 4);
 		break;
 
 	default:
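The "awkward check" on debug register 7 that this patch moves into shared code can be tried out in user space. The sketch below is only an illustration of the bitmask test quoted in the diffs (mask 0x5454 in the 64-bit code, 0x5f54 in the 32-bit code); demo_dr7_invalid and the DEMO_MASK_* names are hypothetical, and nothing here touches real debug registers.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as they appear in the quoted diffs. Bit n of a mask is set
 * when the 4-bit field value n (LENi in bits 2-3, R/Wi in bits 0-1) is
 * rejected. */
#define DEMO_MASK_I386   0x5f54u
#define DEMO_MASK_X86_64 0x5454u

/* Return 1 if any of the four per-breakpoint fields in the upper half of
 * a DR7 value is marked invalid in the given mask, otherwise 0. */
static int demo_dr7_invalid(uint32_t dr7, unsigned int mask)
{
	int i;

	for (i = 0; i < 4; i++) {
		/* Take one half-byte at a time, starting at bit 16. */
		unsigned int check = (dr7 >> (16 + 4 * i)) & 0xf;

		if ((mask >> check) & 1)
			return 1;
	}
	return 0;
}
```

For example, a field value of 0x9 (LENi binary 10, R/Wi binary 01) is rejected by the 0x5f54 mask but accepted by 0x5454, which matches the remark in the comment that this LENi encoding is defined on x86_64.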
[2.6 patch] finish the VID_HARDWARE_* removal
This patch removes a few remainders of the VID_HARDWARE_* removal. Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]> --- Documentation/DocBook/videobook.tmpl |9 - drivers/media/video/usbvision/usbvision.h |4 2 files changed, 13 deletions(-) 643d01fb38b6f376cced035549f4e193018776e7 diff --git a/Documentation/DocBook/videobook.tmpl b/Documentation/DocBook/videobook.tmpl index b629da3..b3d93ee 100644 --- a/Documentation/DocBook/videobook.tmpl +++ b/Documentation/DocBook/videobook.tmpl @@ -96,7 +96,6 @@ static struct video_device my_radio { "My radio", VID_TYPE_TUNER, -VID_HARDWARE_MYRADIO, radio_open. radio_close, NULL,/* no read */ @@ -119,13 +118,6 @@ static struct video_device my_radio way to change channel so it is tuneable. -The VID_HARDWARE_ types are unique to each device. Numbers are assigned by -[EMAIL PROTECTED] when device drivers are going to be released. Until then you -can pull a suitably large number out of your hat and use it. 1 should be -safe for a very long time even allowing for the huge number of vendors -making new and different radio cards at the moment. - - We declare an open and close routine, but we do not need read or write, which are used to read and write video data to or from the card itself. As we have no read or write there is no poll function. @@ -844,7 +836,6 @@ static struct video_device my_camera "My Camera", VID_TYPE_OVERLAY|VID_TYPE_SCALES|\ VID_TYPE_CAPTURE|VID_TYPE_CHROMAKEY, -VID_HARDWARE_MYCAMERA, camera_open. 
camera_close, camera_read, /* no read */ diff --git a/drivers/media/video/usbvision/usbvision.h b/drivers/media/video/usbvision/usbvision.h index c5b6c50..2b7c1bf 100644 --- a/drivers/media/video/usbvision/usbvision.h +++ b/drivers/media/video/usbvision/usbvision.h @@ -40,10 +40,6 @@ #define USBVISION_DEBUG/* Turn on debug messages */ -#ifndef VID_HARDWARE_USBVISION - #define VID_HARDWARE_USBVISION 34 /* USBVision Video Grabber */ -#endif - #define USBVISION_PWR_REG 0x00 #define USBVISION_SSPND_EN (1 << 1) #define USBVISION_RES2 (1 << 2)
[PATCH 16/27] x86-64 ptrace: use task_pt_regs
This cleans up the 64-bit ptrace code to use task_pt_regs instead of its own redundant code that does the same thing a different way. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/x86/kernel/ptrace_64.c | 60 -- 1 files changed, 12 insertions(+), 48 deletions(-) diff --git a/arch/x86/kernel/ptrace_64.c b/arch/x86/kernel/ptrace_64.c index 85fba7b..8123ecb 100644 --- a/arch/x86/kernel/ptrace_64.c +++ b/arch/x86/kernel/ptrace_64.c @@ -43,44 +43,6 @@ #define FLAG_MASK 0x54dd5UL /* - * eflags and offset of eflags on child stack.. - */ -#define EFLAGS offsetof(struct pt_regs, eflags) -#define EFL_OFFSET ((int)(EFLAGS-sizeof(struct pt_regs))) - -/* - * this routine will get a word off of the processes privileged stack. - * the offset is how far from the base addr as stored in the TSS. - * this routine assumes that all the privileged stacks are in our - * data space. - */ -static inline unsigned long get_stack_long(struct task_struct *task, int offset) -{ - unsigned char *stack; - - stack = (unsigned char *)task->thread.rsp0; - stack += offset; - return (*((unsigned long *)stack)); -} - -/* - * this routine will put a word on the processes privileged stack. - * the offset is how far from the base addr as stored in the TSS. - * this routine assumes that all the privileged stacks are in our - * data space. - */ -static inline long put_stack_long(struct task_struct *task, int offset, - unsigned long data) -{ - unsigned char * stack; - - stack = (unsigned char *) task->thread.rsp0; - stack += offset; - *(unsigned long *) stack = data; - return 0; -} - -/* * Called by kernel/ptrace.c when detaching.. * * Make sure the single step bit is not set. 
@@ -90,11 +52,16 @@ void ptrace_disable(struct task_struct *child) user_disable_single_step(child); } +static unsigned long *pt_regs_access(struct pt_regs *regs, unsigned long offset) +{ + BUILD_BUG_ON(offsetof(struct pt_regs, r15) != 0); + return >r15 + (offset / sizeof(regs->r15)); +} + static int putreg(struct task_struct *child, unsigned long regno, unsigned long value) { - unsigned long tmp; - + struct pt_regs *regs = task_pt_regs(child); switch (regno) { case offsetof(struct user_regs_struct,fs): if (value && (value & 3) != 3) @@ -152,9 +119,7 @@ static int putreg(struct task_struct *child, clear_tsk_thread_flag(child, TIF_FORCED_TF); else if (test_tsk_thread_flag(child, TIF_FORCED_TF)) value |= X86_EFLAGS_TF; - tmp = get_stack_long(child, EFL_OFFSET); - tmp &= ~FLAG_MASK; - value |= tmp; + value |= regs->eflags & ~FLAG_MASK; break; case offsetof(struct user_regs_struct,cs): if ((value & 3) != 3) @@ -162,12 +127,13 @@ static int putreg(struct task_struct *child, value &= 0x; break; } - put_stack_long(child, regno - sizeof(struct pt_regs), value); + *pt_regs_access(regs, regno) = value; return 0; } static unsigned long getreg(struct task_struct *child, unsigned long regno) { + struct pt_regs *regs = task_pt_regs(child); unsigned long val; switch (regno) { case offsetof(struct user_regs_struct, fs): @@ -202,16 +168,14 @@ static unsigned long getreg(struct task_struct *child, unsigned long regno) /* * If the debugger set TF, hide it from the readout. 
*/ - regno = regno - sizeof(struct pt_regs); - val = get_stack_long(child, regno); + val = regs->eflags; if (test_tsk_thread_flag(child, TIF_IA32)) val &= 0x; if (test_tsk_thread_flag(child, TIF_FORCED_TF)) val &= ~X86_EFLAGS_TF; return val; default: - regno = regno - sizeof(struct pt_regs); - val = get_stack_long(child, regno); + val = *pt_regs_access(regs, regno); if (test_tsk_thread_flag(child, TIF_IA32)) val &= 0x; return val;
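The idea behind pt_regs_access in this patch, turning a byte offset into a pointer to the matching register slot, can be sketched in user space as follows. This is an illustration under stated assumptions: demo_regs is a hypothetical four-field frame (the real struct pt_regs has many more members), and the helper indexes from the start of the structure, which is equivalent only because the first member sits at offset 0 and every member has the same size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register frame; every slot is one unsigned long. */
struct demo_regs {
	unsigned long r15;
	unsigned long r14;
	unsigned long rip;
	unsigned long eflags;
};

/* Map a byte offset within the frame to a pointer to that slot. */
static unsigned long *demo_regs_access(struct demo_regs *regs,
				       unsigned long offset)
{
	/* Compile-time check in the spirit of the BUILD_BUG_ON in the patch. */
	_Static_assert(offsetof(struct demo_regs, r15) == 0,
		       "r15 must be the first member");

	return (unsigned long *)regs + offset / sizeof(regs->r15);
}
```

With such a helper, a write like *demo_regs_access(&r, offsetof(struct demo_regs, eflags)) = value replaces the older stack-pointer arithmetic with a lookup that starts from the register frame itself.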
[PATCH 13/27] powerpc: arch_has_single_step
This defines the new standard arch_has_single_step macro. It makes the existing set_single_step and clear_single_step entry points global, and renames them to the new standard names user_enable_single_step and user_disable_single_step, respectively. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/powerpc/kernel/ptrace.c | 12 ++-- include/asm-powerpc/ptrace.h |7 +++ 2 files changed, 13 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c index 3e17d15..b970d79 100644 --- a/arch/powerpc/kernel/ptrace.c +++ b/arch/powerpc/kernel/ptrace.c @@ -256,7 +256,7 @@ static int set_evrregs(struct task_struct *task, unsigned long *data) #endif /* CONFIG_SPE */ -static void set_single_step(struct task_struct *task) +void user_enable_single_step(struct task_struct *task) { struct pt_regs *regs = task->thread.regs; @@ -271,7 +271,7 @@ static void set_single_step(struct task_struct *task) set_tsk_thread_flag(task, TIF_SINGLESTEP); } -static void clear_single_step(struct task_struct *task) +void user_disable_single_step(struct task_struct *task) { struct pt_regs *regs = task->thread.regs; @@ -313,7 +313,7 @@ static int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, void ptrace_disable(struct task_struct *child) { /* make sure the single step bit is not set. */ - clear_single_step(child); + user_disable_single_step(child); } /* @@ -456,7 +456,7 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data) clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); child->exit_code = data; /* make sure the single step bit is not set. */ - clear_single_step(child); + user_disable_single_step(child); wake_up_process(child); ret = 0; break; @@ -473,7 +473,7 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data) break; child->exit_code = SIGKILL; /* make sure the single step bit is not set. 
*/ - clear_single_step(child); + user_disable_single_step(child); wake_up_process(child); break; } @@ -483,7 +483,7 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data) if (!valid_signal(data)) break; clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - set_single_step(child); + user_enable_single_step(child); child->exit_code = data; /* give it a chance to run. */ wake_up_process(child); diff --git a/include/asm-powerpc/ptrace.h b/include/asm-powerpc/ptrace.h index 13fccc5..3063363 100644 --- a/include/asm-powerpc/ptrace.h +++ b/include/asm-powerpc/ptrace.h @@ -119,6 +119,13 @@ do { \ } while (0) #endif /* __powerpc64__ */ +/* + * These are defined as per linux/ptrace.h, which see. + */ +#define arch_has_single_step() (1) +extern void user_enable_single_step(struct task_struct *); +extern void user_disable_single_step(struct task_struct *); + #endif /* __ASSEMBLY__ */ #endif /* __KERNEL__ */
[PATCH 15/27] x86-32 ptrace: use task_pt_regs
This cleans up the 32-bit ptrace code to use task_pt_regs instead of its own redundant code that does the same thing a different way. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/x86/kernel/ptrace_32.c | 68 ++ 1 files changed, 16 insertions(+), 52 deletions(-) diff --git a/arch/x86/kernel/ptrace_32.c b/arch/x86/kernel/ptrace_32.c index 50882b3..7c33244 100644 --- a/arch/x86/kernel/ptrace_32.c +++ b/arch/x86/kernel/ptrace_32.c @@ -37,53 +37,20 @@ */ #define FLAG_MASK 0x00050dd5 -/* - * Offset of eflags on child stack.. - */ -#define EFL_OFFSET offsetof(struct pt_regs, eflags) - -static inline struct pt_regs *get_child_regs(struct task_struct *task) -{ - void *stack_top = (void *)task->thread.esp0; - return stack_top - sizeof(struct pt_regs); -} - -/* - * This routine will get a word off of the processes privileged stack. - * the offset is bytes into the pt_regs structure on the stack. - * This routine assumes that all the privileged stacks are in our - * data space. - */ -static inline int get_stack_long(struct task_struct *task, int offset) +static long *pt_regs_access(struct pt_regs *regs, unsigned long regno) { - unsigned char *stack; - - stack = (unsigned char *)task->thread.esp0 - sizeof(struct pt_regs); - stack += offset; - return (*((int *)stack)); -} - -/* - * This routine will put a word on the processes privileged stack. - * the offset is bytes into the pt_regs structure on the stack. - * This routine assumes that all the privileged stacks are in our - * data space. 
- */ -static inline int put_stack_long(struct task_struct *task, int offset, - unsigned long data) -{ - unsigned char * stack; - - stack = (unsigned char *)task->thread.esp0 - sizeof(struct pt_regs); - stack += offset; - *(unsigned long *) stack = data; - return 0; + BUILD_BUG_ON(offsetof(struct pt_regs, ebx) != 0); + if (regno > FS) + --regno; + return >ebx + regno; } static int putreg(struct task_struct *child, unsigned long regno, unsigned long value) { - switch (regno >> 2) { + struct pt_regs *regs = task_pt_regs(child); + regno >>= 2; + switch (regno) { case GS: if (value && (value & 3) != 3) return -EIO; @@ -113,26 +80,25 @@ static int putreg(struct task_struct *child, clear_tsk_thread_flag(child, TIF_FORCED_TF); else if (test_tsk_thread_flag(child, TIF_FORCED_TF)) value |= X86_EFLAGS_TF; - value |= get_stack_long(child, EFL_OFFSET) & ~FLAG_MASK; + value |= regs->eflags & ~FLAG_MASK; break; } - if (regno > FS*4) - regno -= 1*4; - put_stack_long(child, regno, value); + *pt_regs_access(regs, regno) = value; return 0; } -static unsigned long getreg(struct task_struct *child, - unsigned long regno) +static unsigned long getreg(struct task_struct *child, unsigned long regno) { + struct pt_regs *regs = task_pt_regs(child); unsigned long retval = ~0UL; - switch (regno >> 2) { + regno >>= 2; + switch (regno) { case EFL: /* * If the debugger set TF, hide it from the readout. 
*/ - retval = get_stack_long(child, EFL_OFFSET); + retval = regs->eflags; if (test_tsk_thread_flag(child, TIF_FORCED_TF)) retval &= ~X86_EFLAGS_TF; break; @@ -147,9 +113,7 @@ static unsigned long getreg(struct task_struct *child, retval = 0x; /* fall through */ default: - if (regno > FS*4) - regno -= 1*4; - retval &= get_stack_long(child, regno); + retval &= *pt_regs_access(regs, regno); } return retval; }
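The handling of the trap flag in putreg and getreg can be modelled in a few lines of user-space C. This is a simplified sketch, assuming that a boolean can stand in for the TIF_FORCED_TF thread flag and leaving out the FLAG_MASK merging the real code performs; demo_thread, demo_put_eflags and demo_get_eflags are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_EFLAGS_TF 0x100UL	/* trap flag, bit 8 of EFLAGS */

struct demo_thread {
	unsigned long eflags;	/* saved user-mode flags */
	bool forced_tf;		/* stand-in for TIF_FORCED_TF */
};

/* Write path: if the caller sets TF itself, stop treating TF as forced by
 * the debugger; otherwise keep a forced TF set in the stored value. */
static void demo_put_eflags(struct demo_thread *t, unsigned long value)
{
	if (value & DEMO_EFLAGS_TF)
		t->forced_tf = false;
	else if (t->forced_tf)
		value |= DEMO_EFLAGS_TF;
	t->eflags = value;
}

/* Read path: hide TF from the readout when the debugger forced it. */
static unsigned long demo_get_eflags(const struct demo_thread *t)
{
	unsigned long value = t->eflags;

	if (t->forced_tf)
		value &= ~DEMO_EFLAGS_TF;
	return value;
}
```

The point of the model is that a forced TF survives a write that does not mention it, yet never shows up in what the tracer reads back.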
[PATCH 14/27] powerpc: ptrace generic resume
This removes the handling for PTRACE_CONT et al from the powerpc ptrace code, so it uses the new generic code via ptrace_request. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/powerpc/kernel/ptrace.c | 46 -- 1 files changed, 0 insertions(+), 46 deletions(-) diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c index b970d79..8b056d2 100644 --- a/arch/powerpc/kernel/ptrace.c +++ b/arch/powerpc/kernel/ptrace.c @@ -445,52 +445,6 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data) break; } - case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall */ - case PTRACE_CONT: { /* restart after signal. */ - ret = -EIO; - if (!valid_signal(data)) - break; - if (request == PTRACE_SYSCALL) - set_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - else - clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - child->exit_code = data; - /* make sure the single step bit is not set. */ - user_disable_single_step(child); - wake_up_process(child); - ret = 0; - break; - } - -/* - * make the child exit. Best I can do is send it a sigkill. - * perhaps it should be put in the status that it wants to - * exit. - */ - case PTRACE_KILL: { - ret = 0; - if (child->exit_state == EXIT_ZOMBIE) /* already dead */ - break; - child->exit_code = SIGKILL; - /* make sure the single step bit is not set. */ - user_disable_single_step(child); - wake_up_process(child); - break; - } - - case PTRACE_SINGLESTEP: { /* set the trap flag. */ - ret = -EIO; - if (!valid_signal(data)) - break; - clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - user_enable_single_step(child); - child->exit_code = data; - /* give it a chance to run. 
*/ - wake_up_process(child); - ret = 0; - break; - } - case PTRACE_GET_DEBUGREG: { ret = -EINVAL; /* We only support one DABR and no IABRS at the moment */
[PATCH 11/27] x86-64: ptrace generic resume
This removes the handling for PTRACE_CONT et al from the 64-bit ptrace code, so it uses the new generic code via ptrace_request. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/x86/kernel/ptrace_64.c | 45 --- 1 files changed, 0 insertions(+), 45 deletions(-) diff --git a/arch/x86/kernel/ptrace_64.c b/arch/x86/kernel/ptrace_64.c index d8453da..85fba7b 100644 --- a/arch/x86/kernel/ptrace_64.c +++ b/arch/x86/kernel/ptrace_64.c @@ -334,23 +334,6 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data) } break; } - case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall */ - case PTRACE_CONT:/* restart after signal. */ - - ret = -EIO; - if (!valid_signal(data)) - break; - if (request == PTRACE_SYSCALL) - set_tsk_thread_flag(child,TIF_SYSCALL_TRACE); - else - clear_tsk_thread_flag(child,TIF_SYSCALL_TRACE); - clear_tsk_thread_flag(child, TIF_SINGLESTEP); - child->exit_code = data; - /* make sure the single step bit is not set. */ - user_disable_single_step(child); - wake_up_process(child); - ret = 0; - break; #ifdef CONFIG_IA32_EMULATION /* This makes only sense with 32bit programs. Allow a @@ -378,34 +361,6 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data) ret = do_arch_prctl(child, data, addr); break; -/* - * make the child exit. Best I can do is send it a sigkill. - * perhaps it should be put in the status that it wants to - * exit. - */ - case PTRACE_KILL: - ret = 0; - if (child->exit_state == EXIT_ZOMBIE) /* already dead */ - break; - clear_tsk_thread_flag(child, TIF_SINGLESTEP); - child->exit_code = SIGKILL; - /* make sure the single step bit is not set. */ - user_disable_single_step(child); - wake_up_process(child); - break; - - case PTRACE_SINGLESTEP:/* set the trap flag. */ - ret = -EIO; - if (!valid_signal(data)) - break; - clear_tsk_thread_flag(child,TIF_SYSCALL_TRACE); - user_enable_single_step(child); - child->exit_code = data; - /* give it a chance to run. 
*/ - wake_up_process(child); - ret = 0; - break; - case PTRACE_GETREGS: { /* Get all gp regs from the child. */ if (!access_ok(VERIFY_WRITE, (unsigned __user *)data, sizeof(struct user_regs_struct))) {
Small System Paging Problem - OOM-killer goes nuts
Hi,

I have a Linksys NSLU2 running 2.6.21 (I can replicate the problem on
2.6.23, but it isn't fully supported on SlugOS). It is an armv5teb device
with 32MB of RAM and 400+ MB of swap on its 160GB USB2 root disk. The
machine is used as a fileserver and to build packages for other ARM
devices. It may be underpowered by today's standards, but it is a whole
lot faster than my first Linux system (386sx20 with 4MB RAM), and the
whole system with disk uses <8 watts and is silent.

The problem comes when I try to untar a large file (in this case
linux-2.6.23.tar.bz2). Even if I kill off every other process, the
oom-killer eventually appears and kills either the tar or the shell.
I've tried every tuning option my buddy Google and I could find,
including /proc/sys/vm/overcommit*, with no success. I'm not worried
about paging impacting performance.

I'd appreciate any help, pointers, or gentle taps with the cluebat.

-Josh

Error output to console: http://www.pastebin.ca/797155
config -> http://www.pastebin.ca/797206

slug2>$ uname -a
Linux slug2 2.6.21 #1 PREEMPT Fri Nov 9 11:54:06 MST 2007 armv5teb unknown

slug2:~$ free
             total       used       free     shared    buffers     cached
Mem:         30352      29124       1228          0      10196       9468
-/+ buffers/cache:       9460      20892
Swap:       465876          0     465876

cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/sda4                               partition       465876  0       -1

slug2:~$ lsmod
Module                  Size  Used by
nfsd                  186556  8
exportfs                4320  1 nfsd
lockd                  51416  2 nfsd
sunrpc                131952  2 nfsd,lockd
reiserfs              255380  1
ixp4xx_mac             14644  0
ixp4xx_qmgr             5388  5 ixp4xx_mac
mii                     3424  1 ixp4xx_mac
ext3                  110472  2
jbd                    47784  1 ext3
mbcache                 5604  1 ext3
ohci_hcd               16804  0
ehci_hcd               30252  0

slug2>$ dmesg
<5>Linux version 2.6.21 ([EMAIL PROTECTED]) (gcc version 4.1.1) #1 PREEMPT Fri Nov 9 11:54:06 MST 2007
<4>CPU: XScale-IXP42x Family [690541f1] revision 1 (ARMv5TE), cr=39ff
<4>Machine: Linksys NSLU2
<4>Memory policy: ECC disabled, Data cache writeback
<7>On node 0 totalpages: 8192
<7>  DMA zone: 64 pages used for memmap
<7>  DMA zone: 0 pages reserved
<7>  DMA zone: 8128 pages, LIFO batch:0
<7>  Normal zone: 0 pages used for memmap
<4>CPU0: D VIVT undefined 5 cache
<4>CPU0: I cache: 32768 bytes, associativity 32, 32 byte lines, 32 sets
<4>CPU0: D cache: 32768 bytes, associativity 32, 32 byte lines, 32 sets
<4>Built 1 zonelists.  Total pages: 8128
<5>Kernel command line: rtc-x1205.probe=0,0x6f console=ttyS0,115200n8 root=/dev/mtdblock4 rootfstype=jffs2 rw init=/linuxrc noirqdebug
<6>IRQ lockup detection disabled
<4>PID hash table entries: 128 (order: 7, 512 bytes)
<4>Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
<4>Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
<6>Memory: 32MB = 32MB total
<5>Memory: 30268KB available (1940K code, 154K data, 84K init)
<7>Calibrating delay loop... 266.24 BogoMIPS (lpj=1331200)
<4>Mount-cache hash table entries: 512
<6>CPU: Testing write buffer coherency: ok
<6>NET: Registered protocol family 16
<4>IXP4xx: Using 16MiB expansion bus window size
<4>PCI: IXP4xx is host
<4>PCI: IXP4xx Using direct access for memory space
<6>PCI: bus0: Fast back to back transfers disabled
<6>dmabounce: registered device :00:01.0 on pci bus
<6>dmabounce: registered device :00:01.1 on pci bus
<6>dmabounce: registered device :00:01.2 on pci bus
<5>SCSI subsystem initialized
<6>usbcore: registered new interface driver usbfs
<6>usbcore: registered new interface driver hub
<6>usbcore: registered new device driver usb
<6>Time: OSTS clocksource has been installed.
<6>NET: Registered protocol family 2
<4>IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
<4>TCP established hash table entries: 1024 (order: 1, 8192 bytes)
<4>TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
<6>TCP: Hash tables configured (established 1024 bind 1024)
<6>TCP reno registered
<4>NetWinder Floating Point Emulator V0.97 (double precision)
<6>JFFS2 version 2.2. (NAND) (C) 2001-2006 Red Hat, Inc.
<6>io scheduler noop registered
<6>io scheduler deadline registered (default)
<6>Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing disabled
<6>serial8250.0: ttyS0 at MMIO 0xc800 (irq = 15) is a XScale
<6>serial8250.0: ttyS1 at MMIO 0xc8001000 (irq = 13) is a XScale
<4>RAMDISK driver initialized: 4 RAM disks of 10240K size 1024 blocksize
<6>IXP4XX NPE driver Version 0.3.0 initialized
<6>NFTL driver: nftlcore.c $Revision: 1.98 $, nftlmount.c $Revision: 1.41 $
<6>IXP4XX-Flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
<7>IXP4XX-Flash.0: Found an alias at 0x80 for the chip at 0x0
<4> Intel/Sharp
[PATCH 12/27] x86-32: ptrace generic resume
This removes the handling for PTRACE_CONT et al from the 32-bit ptrace code, so it uses the new generic code via ptrace_request. Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> --- arch/x86/kernel/ptrace_32.c | 57 --- 1 files changed, 0 insertions(+), 57 deletions(-) diff --git a/arch/x86/kernel/ptrace_32.c b/arch/x86/kernel/ptrace_32.c index a493017..50882b3 100644 --- a/arch/x86/kernel/ptrace_32.c +++ b/arch/x86/kernel/ptrace_32.c @@ -277,63 +277,6 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data) } break; - case PTRACE_SYSEMU: /* continue and stop at next syscall, which will not be executed */ - case PTRACE_SYSCALL:/* continue and stop at next (return from) syscall */ - case PTRACE_CONT: /* restart after signal. */ - ret = -EIO; - if (!valid_signal(data)) - break; - if (request == PTRACE_SYSEMU) { - set_tsk_thread_flag(child, TIF_SYSCALL_EMU); - clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - } else if (request == PTRACE_SYSCALL) { - set_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - clear_tsk_thread_flag(child, TIF_SYSCALL_EMU); - } else { - clear_tsk_thread_flag(child, TIF_SYSCALL_EMU); - clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - } - child->exit_code = data; - /* make sure the single step bit is not set. */ - user_disable_single_step(child); - wake_up_process(child); - ret = 0; - break; - -/* - * make the child exit. Best I can do is send it a sigkill. - * perhaps it should be put in the status that it wants to - * exit. - */ - case PTRACE_KILL: - ret = 0; - if (child->exit_state == EXIT_ZOMBIE) /* already dead */ - break; - child->exit_code = SIGKILL; - /* make sure the single step bit is not set. */ - user_disable_single_step(child); - wake_up_process(child); - break; - - case PTRACE_SYSEMU_SINGLESTEP: /* Same as SYSEMU, but singlestep if not syscall */ - case PTRACE_SINGLESTEP: /* set the trap flag. 
*/ - ret = -EIO; - if (!valid_signal(data)) - break; - - if (request == PTRACE_SYSEMU_SINGLESTEP) - set_tsk_thread_flag(child, TIF_SYSCALL_EMU); - else - clear_tsk_thread_flag(child, TIF_SYSCALL_EMU); - - clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); - user_enable_single_step(child); - child->exit_code = data; - /* give it a chance to run. */ - wake_up_process(child); - ret = 0; - break; - case PTRACE_GETREGS: { /* Get all gp regs from the child. */ if (!access_ok(VERIFY_WRITE, datap, FRAME_SIZE*sizeof(long))) { ret = -EIO;
[PATCH 10/27] ptrace: generic resume
This makes ptrace_request handle all the ptrace requests that wake up
the traced task.  These do low-level ptrace implementation magic that
is not arch-specific and should be kept out of arch code.  The
implementations on each arch usually do the same thing.  The new
generic code makes use of the arch_has_single_step macro and generic
entry points to handle PTRACE_SINGLESTEP.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 kernel/ptrace.c |   61 +++
 1 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 7c76f2f..309796a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -366,6 +366,50 @@ static int ptrace_setsiginfo(struct task_struct *child, siginfo_t __user * data)
 	return error;
 }
 
+
+#ifdef PTRACE_SINGLESTEP
+#define is_singlestep(request)		((request) == PTRACE_SINGLESTEP)
+#else
+#define is_singlestep(request)		0
+#endif
+
+#ifdef PTRACE_SYSEMU
+#define is_sysemu_singlestep(request)	((request) == PTRACE_SYSEMU_SINGLESTEP)
+#else
+#define is_sysemu_singlestep(request)	0
+#endif
+
+static int ptrace_resume(struct task_struct *child, long request, long data)
+{
+	if (!valid_signal(data))
+		return -EIO;
+
+	if (request == PTRACE_SYSCALL)
+		set_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
+	else
+		clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
+
+#ifdef TIF_SYSCALL_EMU
+	if (request == PTRACE_SYSEMU || request == PTRACE_SYSEMU_SINGLESTEP)
+		set_tsk_thread_flag(child, TIF_SYSCALL_EMU);
+	else
+		clear_tsk_thread_flag(child, TIF_SYSCALL_EMU);
+#endif
+
+	if (is_singlestep(request) || is_sysemu_singlestep(request)) {
+		if (unlikely(!arch_has_single_step()))
+			return -EIO;
+		user_enable_single_step(child);
+	}
+	else
+		user_disable_single_step(child);
+
+	child->exit_code = data;
+	wake_up_process(child);
+
+	return 0;
+}
+
 int ptrace_request(struct task_struct *child, long request,
 		   long addr, long data)
 {
@@ -390,6 +434,23 @@ int ptrace_request(struct task_struct *child, long request,
 	case PTRACE_DETACH:	 /* detach a process that was attached. */
 		ret = ptrace_detach(child, data);
 		break;
+
+#ifdef PTRACE_SINGLESTEP
+	case PTRACE_SINGLESTEP:
+#endif
+#ifdef PTRACE_SYSEMU
+	case PTRACE_SYSEMU:
+	case PTRACE_SYSEMU_SINGLESTEP:
+#endif
+	case PTRACE_SYSCALL:
+	case PTRACE_CONT:
+		return ptrace_resume(child, request, data);
+
+	case PTRACE_KILL:
+		if (child->exit_state)	/* already dead */
+			return 0;
+		return ptrace_resume(child, request, SIGKILL);
+
 	default:
 		break;
 	}
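To make the control flow of the generic resume path easier to follow, here is a small user-space model of it. It is a sketch, not kernel code: demo_task replaces struct task_struct with plain fields, the SYSEMU requests are left out, DEMO_NSIG and DEMO_SIGKILL are assumed values, and demo_arch_has_single_step() is a stand-in for the arch_has_single_step macro.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum demo_req { DEMO_CONT, DEMO_SYSCALL, DEMO_SINGLESTEP, DEMO_KILL };

#define DEMO_NSIG    64		/* assumed upper bound for valid signals */
#define DEMO_SIGKILL 9
#define demo_arch_has_single_step() (1)	/* stand-in for the arch macro */

/* Hypothetical, flattened view of the state the resume path touches. */
struct demo_task {
	bool syscall_trace;	/* models TIF_SYSCALL_TRACE */
	bool single_step;	/* models the arch single-step state */
	bool awake;		/* set where the kernel would wake the task */
	int exit_code;
	int exit_state;		/* non-zero means already dead */
};

static int demo_resume(struct demo_task *t, enum demo_req req, long data)
{
	if ((unsigned long)data > DEMO_NSIG)	/* models !valid_signal(data) */
		return -EIO;

	t->syscall_trace = (req == DEMO_SYSCALL);

	if (req == DEMO_SINGLESTEP) {
		if (!demo_arch_has_single_step())
			return -EIO;
		t->single_step = true;
	} else {
		t->single_step = false;
	}

	t->exit_code = (int)data;
	t->awake = true;
	return 0;
}

/* Dispatcher: KILL resumes with SIGKILL unless the task is already dead. */
static int demo_request(struct demo_task *t, enum demo_req req, long data)
{
	if (req == DEMO_KILL) {
		if (t->exit_state)
			return 0;
		return demo_resume(t, req, DEMO_SIGKILL);
	}
	return demo_resume(t, req, data);
}
```

Every request that wakes the tracee funnels through one function here, which is the design choice the patch makes; the only per-request differences are which flags end up set and which signal number is stored.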
[PATCH 08/27] x86: single_step: share code
This removes the single-step code from ptrace_32.c and uses the step.c code
shared with the 64-bit kernel. The two versions of the code were nearly
identical already, so the shared code has only a couple of simple #ifdef's.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/Makefile_32 |    1 +
 arch/x86/kernel/ptrace_32.c |  125 ---
 arch/x86/kernel/step.c      |   14 +
 3 files changed, 15 insertions(+), 125 deletions(-)

diff --git a/arch/x86/kernel/Makefile_32 b/arch/x86/kernel/Makefile_32
index e660584..959ad3c 100644
--- a/arch/x86/kernel/Makefile_32
+++ b/arch/x86/kernel/Makefile_32
@@ -11,6 +11,7 @@ obj-y	:= process_32.o signal_32.o entry_32.o traps_32.o irq_32.o \
 		quirks.o i8237.o topology.o alternative.o i8253.o tsc_32.o
 obj-y				+= tls.o
+obj-y				+= step.o
 obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
 obj-y				+= cpu/
 obj-y				+= acpi/
diff --git a/arch/x86/kernel/ptrace_32.c b/arch/x86/kernel/ptrace_32.c
index d1d74e1..e599db5 100644
--- a/arch/x86/kernel/ptrace_32.c
+++ b/arch/x86/kernel/ptrace_32.c
@@ -137,131 +137,6 @@ static unsigned long getreg(struct task_struct *child,
 	return retval;
 }
-#define LDT_SEGMENT 4
-
-static unsigned long convert_eip_to_linear(struct task_struct *child, struct pt_regs *regs)
-{
-	unsigned long addr, seg;
-
-	addr = regs->eip;
-	seg = regs->xcs & 0xffff;
-	if (regs->eflags & VM_MASK) {
-		addr = (addr & 0xffff) + (seg << 4);
-		return addr;
-	}
-
-	/*
-	 * We'll assume that the code segments in the GDT
-	 * are all zero-based. That is largely true: the
-	 * TLS segments are used for data, and the PNPBIOS
-	 * and APM bios ones we just ignore here.
-	 */
-	if (seg & LDT_SEGMENT) {
-		u32 *desc;
-		unsigned long base;
-
-		seg &= ~7UL;
-
-		mutex_lock(&child->mm->context.lock);
-		if (unlikely((seg >> 3) >= child->mm->context.size))
-			addr = -1L; /* bogus selector, access would fault */
-		else {
-			desc = child->mm->context.ldt + seg;
-			base = ((desc[0] >> 16) |
-				((desc[1] & 0xff) << 16) |
-				(desc[1] & 0xff000000));
-
-			/* 16-bit code segment? */
-			if (!((desc[1] >> 22) & 1))
-				addr &= 0xffff;
-			addr += base;
-		}
-		mutex_unlock(&child->mm->context.lock);
-	}
-	return addr;
-}
-
-static inline int is_setting_trap_flag(struct task_struct *child, struct pt_regs *regs)
-{
-	int i, copied;
-	unsigned char opcode[15];
-	unsigned long addr = convert_eip_to_linear(child, regs);
-
-	copied = access_process_vm(child, addr, opcode, sizeof(opcode), 0);
-	for (i = 0; i < copied; i++) {
-		switch (opcode[i]) {
-		/* popf and iret */
-		case 0x9d: case 0xcf:
-			return 1;
-		/* opcode and address size prefixes */
-		case 0x66: case 0x67:
-			continue;
-		/* irrelevant prefixes (segment overrides and repeats) */
-		case 0x26: case 0x2e:
-		case 0x36: case 0x3e:
-		case 0x64: case 0x65:
-		case 0xf0: case 0xf2: case 0xf3:
-			continue;
-
-		/*
-		 * pushf: NOTE! We should probably not let
-		 * the user see the TF bit being set. But
-		 * it's more pain than it's worth to avoid
-		 * it, and a debugger could emulate this
-		 * all in user space if it _really_ cares.
-		 */
-		case 0x9c:
-		default:
-			return 0;
-		}
-	}
-	return 0;
-}
-
-void user_enable_single_step(struct task_struct *child)
-{
-	struct pt_regs *regs = get_child_regs(child);
-
-	/*
-	 * Always set TIF_SINGLESTEP - this guarantees that
-	 * we single-step system calls etc.. This will also
-	 * cause us to set TF when returning to user mode.
-	 */
-	set_tsk_thread_flag(child, TIF_SINGLESTEP);
-
-	/*
-	 * If TF was already set, don't do anything else
-	 */
-	if (regs->eflags & X86_EFLAGS_TF)
-		return;
-
-	/* Set TF on the kernel stack.. */
-	regs->eflags |= X86_EFLAGS_TF;
-
-	/*
-	 * ..but if TF is changed by the instruction we will trace,
-	 * don't mark it as being "us" that set it, so that we
-	 * won't clear it by hand later.
-	 */
-	if (is_setting_trap_flag(child, regs))
-		return;
-
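Annotation (not part of the posted patch): the removed convert_eip_to_linear() turns a segment:offset pair into a linear address. The following is a minimal user-space sketch of that arithmetic, assuming a simplified model. The names ldt_desc, desc_base, desc_is_32bit, ip_to_linear and linear_selfcheck are hypothetical, the LDT is indexed by entry rather than by byte offset, and no locking is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define LDT_SEGMENT 4	/* table-indicator bit of a segment selector */

/* Hypothetical stand-in for the two 32-bit words of one LDT descriptor. */
struct ldt_desc {
	uint32_t lo;	/* desc[0] in the patch */
	uint32_t hi;	/* desc[1] in the patch */
};

/* Reassemble the 32-bit segment base scattered across the descriptor. */
static uint32_t desc_base(const struct ldt_desc *d)
{
	return (d->lo >> 16) | ((d->hi & 0xff) << 16) | (d->hi & 0xff000000u);
}

/* Bit 22 of the high word (D/B) marks a 32-bit code segment. */
static int desc_is_32bit(const struct ldt_desc *d)
{
	return (d->hi >> 22) & 1;
}

/*
 * Model of the address conversion: returns the linear address, or ~0UL
 * for a selector that points past the end of the LDT.
 */
static unsigned long ip_to_linear(unsigned long ip, unsigned int seg, int vm86,
				  const struct ldt_desc *ldt, unsigned int nr)
{
	seg &= 0xffff;
	if (vm86)	/* real-mode style: segment * 16 + 16-bit offset */
		return (ip & 0xffff) + ((unsigned long)seg << 4);

	if (seg & LDT_SEGMENT) {
		unsigned int idx = seg >> 3;

		if (idx >= nr)
			return ~0UL;	/* bogus selector, access would fault */
		if (!desc_is_32bit(&ldt[idx]))
			ip &= 0xffff;	/* 16-bit code segment */
		ip += desc_base(&ldt[idx]);
	}
	return ip;	/* GDT code segments are assumed zero-based */
}

/* Small self-check with made-up descriptor values. */
static void linear_selfcheck(void)
{
	struct ldt_desc d32 = { 0x56780000u, 0x12400034u };	/* base 0x12345678, 32-bit */
	struct ldt_desc d16 = { 0x56780000u, 0x12000034u };	/* same base, 16-bit */

	assert(desc_base(&d32) == 0x12345678u);
	assert(ip_to_linear(0x10, 0x07, 0, &d32, 1) == 0x12345688ul);
	assert(ip_to_linear(0x12345, 0x07, 0, &d16, 1) == 0x123479bdul);
	assert(ip_to_linear(0x1234, 0x1000, 1, &d32, 1) == 0x11234ul);
	assert(ip_to_linear(0x10, 0x0f, 0, &d32, 1) == ~0UL);
}
```

The sketch keeps the same three cases as the kernel function (vm86, LDT selector, everything else), which is why the shared step.c version only needs an #ifdef around the vm86 branch.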
[PATCH 09/27] x86 single_step: TIF_FORCED_TF
This changes the single-step support to use a new thread_info flag
TIF_FORCED_TF instead of the PT_DTRACE flag in task_struct.ptrace.
This keeps arch implementation uses out of this non-arch field.

This changes the ptrace access to eflags to mask TF and maintain
the TIF_FORCED_TF flag directly if userland sets TF, instead of
relying on ptrace_signal_deliver. The 64-bit and 32-bit kernels
are harmonized on this same behavior. The ptrace_signal_deliver
approach works now, but this change makes the low-level register
access code reliable when called from different contexts than a
ptrace stop, which will be possible in the future.

The 64-bit do_debug exception handler is also changed not to clear TF
from user-mode registers. This matches the 32-bit kernel's behavior.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/ia32/ptrace32.c         |   20 ++--
 arch/x86/kernel/process_32.c     |    3 ---
 arch/x86/kernel/process_64.c     |    5 -
 arch/x86/kernel/ptrace_32.c      |   17 +
 arch/x86/kernel/ptrace_64.c      |   20
 arch/x86/kernel/signal_32.c      |   12 +---
 arch/x86/kernel/signal_64.c      |   14 +-
 arch/x86/kernel/step.c           |    9 +++--
 arch/x86/kernel/traps_64.c       |   23 +--
 include/asm-x86/signal.h         |   11 ++-
 include/asm-x86/thread_info_32.h |    2 ++
 include/asm-x86/thread_info_64.h |    3 ++-
 12 files changed, 79 insertions(+), 60 deletions(-)

diff --git a/arch/x86/ia32/ptrace32.c b/arch/x86/ia32/ptrace32.c
index 4a233ad..a9a5cd4 100644
--- a/arch/x86/ia32/ptrace32.c
+++ b/arch/x86/ia32/ptrace32.c
@@ -82,6 +82,15 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 val)
 	case offsetof(struct user32, regs.eflags): {
 		__u64 *flags = &stack[offsetof(struct pt_regs, eflags)/8];
 		val &= FLAG_MASK;
+		/*
+		 * If the user value contains TF, mark that
+		 * it was not "us" (the debugger) that set it.
+		 * If not, make sure it stays set if we had.
+		 */
+		if (val & X86_EFLAGS_TF)
+			clear_tsk_thread_flag(child, TIF_FORCED_TF);
+		else if (test_tsk_thread_flag(child, TIF_FORCED_TF))
+			val |= X86_EFLAGS_TF;
 		*flags = val | (*flags & ~FLAG_MASK);
 		break;
 	}
@@ -168,9 +177,17 @@ static int getreg32(struct task_struct *child, unsigned regno, u32 *val)
 	R32(eax, rax);
 	R32(orig_eax, orig_rax);
 	R32(eip, rip);
-	R32(eflags, eflags);
 	R32(esp, rsp);
+	case offsetof(struct user32, regs.eflags):
+		/*
+		 * If the debugger set TF, hide it from the readout.
+		 */
+		*val = stack[offsetof(struct pt_regs, eflags)/8];
+		if (test_tsk_thread_flag(child, TIF_FORCED_TF))
+			*val &= ~X86_EFLAGS_TF;
+		break;
+
 	case offsetof(struct user32, u_debugreg[0]):
 		*val = child->thread.debugreg0;
 		break;
@@ -401,4 +418,3 @@ asmlinkage long sys32_ptrace(long request, u32 pid, u32 addr, u32 data)
 	put_task_struct(child);
 	return ret;
 }
-
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index ebbbfc5..f59544e 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -796,9 +796,6 @@ asmlinkage int sys_execve(struct pt_regs regs)
 			(char __user * __user *) regs.edx,
 			&regs);
 	if (error == 0) {
-		task_lock(current);
-		current->ptrace &= ~PT_DTRACE;
-		task_unlock(current);
 		/* Make sure we don't return using sysenter.. */
 		set_thread_flag(TIF_IRET);
 	}
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3fdbf78..586f88e 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -698,11 +698,6 @@ long sys_execve(char __user *name, char __user * __user *argv,
 	if (IS_ERR(filename))
 		return error;
 	error = do_execve(filename, argv, envp, &regs);
-	if (error == 0) {
-		task_lock(current);
-		current->ptrace &= ~PT_DTRACE;
-		task_unlock(current);
-	}
 	putname(filename);
 	return error;
 }
diff --git a/arch/x86/kernel/ptrace_32.c b/arch/x86/kernel/ptrace_32.c
index e599db5..a493017 100644
--- a/arch/x86/kernel/ptrace_32.c
+++ b/arch/x86/kernel/ptrace_32.c
@@ -104,6 +104,15 @@ static int putreg(struct task_struct *child,
 			break;
 		case EFL:
 			value &= FLAG_MASK;
+			/*
+			 * If the user value contains TF, mark that
+			 * it was not "us" (the debugger)
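Annotation (not part of the posted patch): the TF bookkeeping described in the commit message can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: struct task_model and the model_* functions are hypothetical names, the forced_tf field stands in for TIF_FORCED_TF, the flag mask is passed in by the caller rather than taken from the kernel's FLAG_MASK, and the is_setting_trap_flag() check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_TF 0x100u	/* trap flag, bit 8 of EFLAGS */

/* Hypothetical model of the per-task state the patch touches. */
struct task_model {
	uint32_t eflags;	/* saved user-mode flags */
	bool forced_tf;		/* stands in for TIF_FORCED_TF */
};

/* Debugger asks for a single step: set TF and remember that we did it. */
static void model_enable_step(struct task_model *t)
{
	if (t->eflags & X86_EFLAGS_TF)
		return;		/* TF was already set by the program */
	t->eflags |= X86_EFLAGS_TF;
	t->forced_tf = true;
}

/* Write path (putreg): the tracer stores a new eflags value. */
static void model_put_eflags(struct task_model *t, uint32_t val, uint32_t mask)
{
	val &= mask;
	if (val & X86_EFLAGS_TF)
		t->forced_tf = false;		/* the user value owns TF now */
	else if (t->forced_tf)
		val |= X86_EFLAGS_TF;		/* keep our forced TF in place */
	t->eflags = val | (t->eflags & ~mask);
}

/* Read path (getreg): hide a TF that only the debugger set. */
static uint32_t model_get_eflags(const struct task_model *t)
{
	uint32_t val = t->eflags;

	if (t->forced_tf)
		val &= ~X86_EFLAGS_TF;
	return val;
}

/* Walk through the three cases with an illustrative mask that includes TF. */
static void forced_tf_selfcheck(void)
{
	const uint32_t mask = 0xdd5u;	/* illustrative value, not the kernel's */
	struct task_model t = { 0, false };

	model_enable_step(&t);
	assert(t.eflags & X86_EFLAGS_TF);
	assert(model_get_eflags(&t) == 0);	/* forced TF is hidden */

	model_put_eflags(&t, 0, mask);
	assert(t.eflags & X86_EFLAGS_TF);	/* forced TF survives the write */

	model_put_eflags(&t, X86_EFLAGS_TF, mask);
	assert(!t.forced_tf);
	assert(model_get_eflags(&t) == X86_EFLAGS_TF);	/* now visible */
}
```

Because the read and write paths consult the flag themselves, the result does not depend on a later fix-up at signal delivery, which appears to be the point of moving away from ptrace_signal_deliver.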
[PATCH 05/27] x86: single_step moved
This moves the single-step support code from ptrace_64.c into a new file
step.c, verbatim. This paves the way for consolidating this code between
64-bit and 32-bit versions.

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 arch/x86/kernel/Makefile_64 |    2 +
 arch/x86/kernel/ptrace_64.c |  134 -
 arch/x86/kernel/step.c      |  140 +++
 3 files changed, 142 insertions(+), 134 deletions(-)

diff --git a/arch/x86/kernel/Makefile_64 b/arch/x86/kernel/Makefile_64
index 203a9d8..d35ee6f 100644
--- a/arch/x86/kernel/Makefile_64
+++ b/arch/x86/kernel/Makefile_64
@@ -13,6 +13,8 @@ obj-y	:= process_64.o signal_64.o entry_64.o traps_64.o irq_64.o \
 		pci-dma_64.o pci-nommu_64.o alternative.o hpet.o tsc_64.o bugs_64.o \
 		i8253.o
+obj-y				+= step.o
+
 obj-$(CONFIG_IA32_EMULATION)	+= tls.o
 obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
diff --git a/arch/x86/kernel/ptrace_64.c b/arch/x86/kernel/ptrace_64.c
index c2e1a13..52479b1 100644
--- a/arch/x86/kernel/ptrace_64.c
+++ b/arch/x86/kernel/ptrace_64.c
@@ -80,140 +80,6 @@ static inline long put_stack_long(struct task_struct *task, int offset,
 	return 0;
 }
-#define LDT_SEGMENT 4
-
-unsigned long convert_rip_to_linear(struct task_struct *child, struct pt_regs *regs)
-{
-	unsigned long addr, seg;
-
-	addr = regs->rip;
-	seg = regs->cs & 0xffff;
-
-	/*
-	 * We'll assume that the code segments in the GDT
-	 * are all zero-based. That is largely true: the
-	 * TLS segments are used for data, and the PNPBIOS
-	 * and APM bios ones we just ignore here.
-	 */
-	if (seg & LDT_SEGMENT) {
-		u32 *desc;
-		unsigned long base;
-
-		seg &= ~7UL;
-
-		mutex_lock(&child->mm->context.lock);
-		if (unlikely((seg >> 3) >= child->mm->context.size))
-			addr = -1L; /* bogus selector, access would fault */
-		else {
-			desc = child->mm->context.ldt + seg;
-			base = ((desc[0] >> 16) |
-				((desc[1] & 0xff) << 16) |
-				(desc[1] & 0xff000000));
-
-			/* 16-bit code segment? */
-			if (!((desc[1] >> 22) & 1))
-				addr &= 0xffff;
-			addr += base;
-		}
-		mutex_unlock(&child->mm->context.lock);
-	}
-
-	return addr;
-}
-
-static int is_setting_trap_flag(struct task_struct *child, struct pt_regs *regs)
-{
-	int i, copied;
-	unsigned char opcode[15];
-	unsigned long addr = convert_rip_to_linear(child, regs);
-
-	copied = access_process_vm(child, addr, opcode, sizeof(opcode), 0);
-	for (i = 0; i < copied; i++) {
-		switch (opcode[i]) {
-		/* popf and iret */
-		case 0x9d: case 0xcf:
-			return 1;
-
-			/* CHECKME: 64 65 */
-
-		/* opcode and address size prefixes */
-		case 0x66: case 0x67:
-			continue;
-		/* irrelevant prefixes (segment overrides and repeats) */
-		case 0x26: case 0x2e:
-		case 0x36: case 0x3e:
-		case 0x64: case 0x65:
-		case 0xf2: case 0xf3:
-			continue;
-
-		case 0x40 ... 0x4f:
-			if (regs->cs != __USER_CS)
-				/* 32-bit mode: register increment */
-				return 0;
-			/* 64-bit mode: REX prefix */
-			continue;
-
-			/* CHECKME: f2, f3 */
-
-		/*
-		 * pushf: NOTE! We should probably not let
-		 * the user see the TF bit being set. But
-		 * it's more pain than it's worth to avoid
-		 * it, and a debugger could emulate this
-		 * all in user space if it _really_ cares.
-		 */
-		case 0x9c:
-		default:
-			return 0;
-		}
-	}
-	return 0;
-}
-
-void user_enable_single_step(struct task_struct *child)
-{
-	struct pt_regs *regs = task_pt_regs(child);
-
-	/*
-	 * Always set TIF_SINGLESTEP - this guarantees that
-	 * we single-step system calls etc.. This will also
-	 * cause us to set TF when returning to user mode.
-	 */
-	set_tsk_thread_flag(child, TIF_SINGLESTEP);
-
-	/*
-	 * If TF was already set, don't do anything else
-	 */
-	if (regs->eflags & X86_EFLAGS_TF)
-		return;
-
-	/* Set TF on the kernel stack.. */
-	regs->eflags |= X86_EFLAGS_TF;
-
-	/*
-	 * ..but if TF is changed by the instruction we will trace,
-
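Annotation (not part of the posted patch): the prefix scan in the 64-bit is_setting_trap_flag() can be tried out in user space. The sketch below is an approximation under stated assumptions: sets_trap_flag and trap_flag_selfcheck are hypothetical names, the bytes come from a plain buffer instead of access_process_vm(), and a boolean mode64 argument replaces the regs->cs != __USER_CS test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the instruction starting at opcode[0] is popf or iret,
 * i.e. one that may itself change TF. Prefix bytes are skipped; any
 * other opcode ends the scan with false.
 */
static bool sets_trap_flag(const unsigned char *opcode, size_t len, bool mode64)
{
	for (size_t i = 0; i < len; i++) {
		switch (opcode[i]) {
		case 0x9d: case 0xcf:		/* popf and iret */
			return true;
		case 0x66: case 0x67:		/* operand and address size prefixes */
		case 0x26: case 0x2e:		/* segment overrides */
		case 0x36: case 0x3e:
		case 0x64: case 0x65:
		case 0xf2: case 0xf3:		/* repeat prefixes */
			continue;		/* advances the for loop */
		default:
			/* 0x40..0x4f: REX prefix in 64-bit mode, inc/dec otherwise */
			if (mode64 && opcode[i] >= 0x40 && opcode[i] <= 0x4f)
				continue;
			return false;		/* includes pushf (0x9c) */
		}
	}
	return false;
}

/* A few byte sequences exercising each branch. */
static void trap_flag_selfcheck(void)
{
	const unsigned char popf[] = { 0x9d };
	const unsigned char popfw[] = { 0x66, 0x9d };
	const unsigned char iretq[] = { 0x48, 0xcf };
	const unsigned char pushf[] = { 0x9c };

	assert(sets_trap_flag(popf, sizeof(popf), true));
	assert(sets_trap_flag(popfw, sizeof(popfw), false));
	assert(sets_trap_flag(iretq, sizeof(iretq), true));
	assert(!sets_trap_flag(iretq, sizeof(iretq), false));	/* 0x48 is dec in 32-bit mode */
	assert(!sets_trap_flag(pushf, sizeof(pushf), true));
}
```

The 0x40..0x4f range is the one place where the 64-bit and 32-bit scans must differ, since those bytes are REX prefixes in 64-bit mode and complete instructions otherwise.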