Re: 2.6.26-rc9: Reported regressions from 2.6.25

2008-07-06 Thread Linus Torvalds
] Linus Torvalds [EMAIL PROTECTED] Paul E. McKenney [EMAIL PROTECTED] Patch : http://lkml.org/lkml/2008/5/28/16 This one is the same thing that is reported as unresolved, and no, I don't think that existing patch was ever really tested to fix anything. Paul? I suspect SRCU

Re: [Bug #10872] x86_64 boot hang when CONFIG_NUMA=n

2008-07-06 Thread Linus Torvalds
On Sun, 6 Jul 2008, Randy Dunlap wrote: This still happens with 2.6.26-rc9. Using CONFIG_NUMA=y boots OK. Ok, then it wasn't the nr_zones thing. Since it seems to be repeatable for you, can you bisect it? Linus -- To unsubscribe from this list: send the line unsubscribe

Re: 2.6.26-rc9: Reported regressions from 2.6.25

2008-07-06 Thread Linus Torvalds
) References: http://marc.info/?l=linux-kernel&m=121524504505805&w=4 Handled-By: Linus Torvalds [EMAIL PROTECTED] The revert that was confirmed by Andrey to fix this regression is now committed as 09ca8adbe9f724a7e96f512c0039c4c4a1c5dcc0. Linus

Re: 2.6.26-rc9: Reported regressions from 2.6.25

2008-07-06 Thread Linus Torvalds
On Mon, 7 Jul 2008, Adrian Bunk wrote: When did you tell me that maintainers should not or cannot be Cc'ed on regression reports? That is not what I'm complaining about. I'm complaining about the fact that you *always* argue against closing bugreports. You have argued against it for over

Re: 2.6.26-rc9-git4: Reported regressions from 2.6.25

2008-07-10 Thread Linus Torvalds
On Thu, 10 Jul 2008, Alexey Dobriyan wrote: On Thu, Jul 10, 2008 at 05:25:35PM +1000, Nick Piggin wrote: Attached is my fix for this problem. I don't think it is a regression as such, but it can't hurt to go into 2.6.26 IMO. Nick, you're a hero. PREEMPT_RCU without HOTPLUG_CPU is

Re: 2.6.26-rc9-git12: Reported regressions from 2.6.25

2008-07-13 Thread Linus Torvalds
On Sun, 13 Jul 2008, Rafael J. Wysocki wrote: Regressions with patches Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11072 Subject : scsi-layer crash after usb storage device unplug Submitter : Johannes Berg [EMAIL PROTECTED] Date

Re: [Bug #11273] 2.6.27-rc1: softcursor behaviour changed

2008-08-11 Thread Linus Torvalds
On Sun, 10 Aug 2008, Rafael J. Wysocki wrote: The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). The revert is now in my tree as 3838f59fc2ea9821f3ea13adb555bfc6ea43c74c.

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-23 Thread Linus Torvalds
On Sat, 23 Aug 2008, Rafael J. Wysocki wrote: The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11342 Subject

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-23 Thread Linus Torvalds
On Sat, 23 Aug 2008, Linus Torvalds wrote: This one makes no sense. It's triggering a BUG_ON(in_interrupt()), but then the call chain shows that there is no interrupt going on. Ahh, later in that thread there's another totally unrelated oops in debug_mutex_add_waiter(). I'd guess

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-24 Thread Linus Torvalds
On Sat, 23 Aug 2008, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11405 Subject : 2.6.27-rc3 segfault on cold boot; not on warm boot. Submitter : David Greaves [EMAIL PROTECTED] Date : 2008-08-21 9:45 (3 days old)

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-24 Thread Linus Torvalds
On Sat, 23 Aug 2008, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11410 Subject : SLUB list_lock vs obj_hash.lock... Submitter : Daniel J Blueman [EMAIL PROTECTED] Date : 2008-08-22 21:48 (2 days old) References:

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-24 Thread Linus Torvalds
On Sat, 23 Aug 2008, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11401 Subject : pktcdvd: BUG, NULL pointer dereference in pkt_ioctl, bisected Submitter : Laurent Riffard [EMAIL PROTECTED] Date : 2008-08-22 08:16 (2 days

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-24 Thread Linus Torvalds
On Sat, 23 Aug 2008, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11354 Subject : AMD Elan regression with 2.6.27-rc3 Submitter : Sean Young [EMAIL PROTECTED] Date : 2008-08-15 18:37 (9 days old) References:

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-24 Thread Linus Torvalds
On Sun, 24 Aug 2008, Vegard Nossum wrote: I haven't really used the hlists before, so my first instinct was to do what is obvious. I do agree that the hlist versions aren't very nice in this regard. The regular lists are much better at moving lists around. Other than that, I guess

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-24 Thread Linus Torvalds
On Sat, 23 Aug 2008, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11356 Subject : Linux 2.6.27-rc3 - build failure: undefined reference to `.lockdep_count_forward_deps' Submitter : Frans Pop [EMAIL PROTECTED] Date :

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-25 Thread Linus Torvalds
On Mon, 25 Aug 2008, Alan D. Brunelle wrote: Before adding any more debugging, this is the status of my kernel boots: 3 times in a row w/ this same error. (Primary problem is the same, secondary stacks differ of course.) Ok, so I took a closer look, and the oops really is suggestive.. [

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-25 Thread Linus Torvalds
On Mon, 25 Aug 2008, Alan D. Brunelle wrote: With /just/ DEBUG_PAGE_ALLOC defined, I have seen two general panic types: o A new double fault w/ SMP_DEBUG_PAGEALLOC problem (prob4.txt) Yeah, that's a stack overflow. Confirmed. Linus

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-25 Thread Linus Torvalds
On Mon, 25 Aug 2008, Linus Torvalds wrote: Could you make your kernel image available somewhere, and we can take a look at it? Some versions of gcc are total pigs when it comes to stack usage, and your exact configuration matters too. But yes, module loading is a bad case, for me

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-25 Thread Linus Torvalds
On Mon, 25 Aug 2008, Alan D. Brunelle wrote: Mine has: Dump of assembler code for function sys_init_module: 0x802688c4 <sys_init_module+4>: sub $0x1c0,%rsp so 448 bytes. Yeah, your build seems to have consistently bigger stack usage, and that may be due to some config

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-25 Thread Linus Torvalds
On Mon, 25 Aug 2008, Linus Torvalds wrote: But I'll look at your vmlinux, see what stands out. Oops. I already see the problem. Your .config has some _huge_ CPU count, doesn't it? checkstack.pl shows these things as the top problems: 0x80266234 smp_call_function_mask

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-25 Thread Linus Torvalds
On Mon, 25 Aug 2008, Linus Torvalds wrote: checkstack.pl shows these things as the top problems: 0x80266234 smp_call_function_mask [vmlinux]:2736 0x80234747 __build_sched_domains [vmlinux]: 2232 0x8023523f __build_sched_domains [vmlinux

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Yinghai Lu wrote: wonder if could use unsigned long * directly. I would actually suggest something like this: - we continue to have a magic cpumask_t. - we do different cases for big and small NR_CPUS: #if NR_CPUS <= BITS_PER_LONG /* * Make

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Rusty Russell wrote: Your workaround is very random, and that scares me. I think a huge number of CPUs needs a real solution (an actual cpumask allocator, then do something clever if we come across an actual fastpath). The thing is, the inlining thing is a separate

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Adrian Bunk wrote: A debugging option (for better traces) to disallow gcc some inlining might make sense (and might even make sense for distributions to enable in their kernels), but when you go to use cases that require really small kernels the cost is too high.

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Adrian Bunk wrote: I added -fno-inline-functions-called-once -fno-early-inlining to KBUILD_CFLAGS, and (with gcc 4.3) that increased the size of my kernel image by 2%. Btw, did you check with just -fno-inline-functions-called-once? The -fearly-inlining decisions

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Mike Travis wrote: The need to allow distros to set NR_CPUS=4096 (and NODES_SHIFT=9) is critical to our upcoming SGI systems using what we have been calling UV. That's fine. You can do it. The default kernel will not, because it's clearly not safe. I really don't

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Jamie Lokier wrote: A function which is only called from one place should, if everything made sense, _never_ use more stack through being inlined. But that's simply not true. See the whole discussion. The problem is that if you inline that function, the stack usage of
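The reply above is cut off, so the following is only a minimal sketch of the general effect being discussed, not code from the thread. It assumes GCC or Clang (for the `noinline` attribute); the names `slow_path_len`, `message_len` and `SCRATCH_BYTES` are hypothetical. The idea: a rarely taken helper with a large local buffer costs stack only while it runs, as long as it stays a separate function. Once it is inlined, the compiler is free to reserve that buffer in the caller's frame for every call, including calls that never take the slow path.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SCRATCH_BYTES 1024  /* hypothetical size, chosen only for illustration */

/*
 * Rarely used helper with a large local buffer. The noinline attribute
 * asks the compiler to keep it as a separate function, so the buffer
 * occupies stack only while the helper itself is running.
 */
__attribute__((noinline))
static size_t slow_path_len(const char *msg)
{
    char scratch[SCRATCH_BYTES];

    /* Copy into the scratch buffer; snprintf always NUL-terminates. */
    snprintf(scratch, sizeof(scratch), "%s", msg);
    return strlen(scratch);
}

/*
 * Caller whose common path needs almost no stack. If slow_path_len()
 * were inlined here, room for scratch[] could be reserved in this
 * frame on every call, whether or not the slow path is taken.
 */
static size_t message_len(const char *msg, int use_slow_path)
{
    if (!use_slow_path)
        return 0;
    return slow_path_len(msg);
}
```

Running scripts/checkstack.pl on builds with and without the attribute is one way to see whether a given compiler actually moves the buffer into the caller's frame; that depends on compiler version and options.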

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Mike Travis wrote: I would be most interested in any tools to analyze call-trees and accumulated stack usages. My current method of using kdb is really time consuming. Well, even just scripts/checkstack.pl is quite relevant. The fact is, anything with a stack

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Adrian Bunk wrote: I had in mind that we anyway have to support it for tiny kernels. I actually don't think that is true. If we really were to decide to be stricter about it, and it makes a big size difference, we can probably also add a tool to warn about functions

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Adrian Bunk wrote: If you think we have too many stacksize problems I'd suggest to consider removing the choice of 4k stacks on i386, sh and m68knommu instead of using -fno-inline-functions-called-once: Don't be silly. That makes the problem _worse_. We're much

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Wed, 27 Aug 2008, Adrian Bunk wrote: We're much better off with a 1% code-size reduction than forcing big stacks on people. The 4kB stack option is also a good way of saying if it works with this, then 8kB is certainly safe. You implicitly assume both would solve the same

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Wed, 27 Aug 2008, Adrian Bunk wrote: When did we get callpaths like nfs+xfs+md+scsi reliably working with 4kB stacks on x86-32? XFS may never have been usable, but the rest, sure. And you seem to be making this whole argument an excuse to SUCK, and an excuse to let gcc crap even

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Parag Warudkar wrote: And although you said in your later reply that Linux x86 with 4K stacks should be more than usable - my experiences running an untainted desktop/file server with 4K stack have always been disastrous, XFS or not. It _might_ work for some well defined

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-26 Thread Linus Torvalds
On Tue, 26 Aug 2008, Parag Warudkar wrote: What about deep call chains? The problem with the uptake of 4K stacks seems to be that it is not reliably provable that it will work under all circumstances. Umm. Neither is 8k stacks. Nobody proved anything. But yes, some subsystems have insanely

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-27 Thread Linus Torvalds
On Wed, 27 Aug 2008, Paul Mackerras wrote: I think your memory is failing you. In 2.4 and earlier, the kernel stack was 8kB minus the size of the task_struct, which sat at the start of the 8kB. Yup, you're right. Linus

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-27 Thread Linus Torvalds
On Wed, 27 Aug 2008, Peter Osterlund wrote: Why not just revert the offending change and try again during the next merge window, assuming someone has figured out an acceptable way to handle this mess by then? Well, for 2.6.27 that's what we'll have to do. But there's actually a real

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-27 Thread Linus Torvalds
On Wed, 27 Aug 2008, Linus Torvalds wrote: I also wonder if any other block_ioctl users were converted.. Well, doing git log -p v2.6.26.. -Sunlocked_ioctl and looking for blkdev_ioctl, that does seem to be the only one. So hopefully no other case like this is lurking, although

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-27 Thread Linus Torvalds
On Wed, 27 Aug 2008, Alan Cox wrote: I'll take a crack at it tomorrow - but if its 185 entries then it probably wants to go into -next instead. Being more careful.. This: git grep 'unlocked_ioctl.*=' | sed 's/^.*=[]*\([_a-zA-Z0-9]*\).*$/\1/' |

Re: 2.6.27-rc4-git1: Reported regressions from 2.6.26

2008-08-27 Thread Linus Torvalds
On Wed, 27 Aug 2008, Linus Torvalds wrote: I wonder if I could essentially automate something to do the conversion.. Hmm. compat_ioctl() actually has exactly the same issue. Damn. So you can't just add the new argument, you also have to _pass_ the argument in the compat_ioctl handlers

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-08-29 Thread Linus Torvalds
On Fri, 29 Aug 2008, Jes Sorensen wrote: I have only tested this on ia64, but it boots, so it's obviously perfect(tm) :-) Well, it probably boots because it doesn't really seem to _change_ much of anything. Things like this: -static inline void arch_send_call_function_ipi(cpumask_t

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-29 Thread Linus Torvalds
On Fri, 29 Aug 2008, Yinghai Lu wrote: http://www.sisk.pl/kernel/debug/mainline/2.6.27-rc5/broken.log pci :00:00.0: BAR has MMCONFIG at e000- And that seems utter crap to begin with. PCI: Using MMCONFIG at e000 - efff Where did it get that bogus

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-29 Thread Linus Torvalds
Btw, what was the original regression that commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd was trying to fix? It's not listed in that commit, even though the commit has a Bisected-by: David Witbrodt [EMAIL PROTECTED]. In fact, I can find it with google by searching for David

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-29 Thread Linus Torvalds
On Fri, 29 Aug 2008, Yinghai Lu wrote: the root cause is: before 2.6.26, call init_apic_mapping and will insert_resource for lapic address. and then call e820_resource_resouce (with request_resource) to register e820 entries. So the problem there was that traditionally,

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-29 Thread Linus Torvalds
On Fri, 29 Aug 2008, Linus Torvalds wrote: Yes. And I do think this is a workable model. Ok, and here's the patch to do insert_resource_expand_to_fit(root, new); and while I still haven't actually tested it, it looks sane and compiles to code that also looks sane. I'll happily

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-29 Thread Linus Torvalds
On Fri, 29 Aug 2008, Yinghai Lu wrote: Yeah, no, that's horrid. I'm happy it's reverted. if update res-end according mmconfig end, before insert it forcibly, then could fix the chipset BAR problem too. Except it's still a horrible patch that special-cases all the wrong things (ie random

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-29 Thread Linus Torvalds
On Fri, 29 Aug 2008, Yinghai Lu wrote: we need to use insert_resource_split_to_fit instead... otherwise __request_region will not be happy. Are you really really sure? Try just removing the IORESOURCE_BUSY. As mentioned, if we expect the PCI BAR's to work with the e820 resources, then

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Fri, 29 Aug 2008, Yinghai Lu wrote: please check __request_region: conflict: (reserved) [dd00, efff], res: (qla2xxx) [ddffc000, ddff] busy flag qla2xxx :83:00.0: BAR 1: can't reserve mem region [0xddffc000-0xddff] Ok, this is actually when the driver wants to

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Fri, 29 Aug 2008, Linus Torvalds wrote: IORESOURCE_BUSY is really more of a legacy bit. It has almost no bearing on the actual allocations. And just to clarify - I think that while you get that error for the qla2xxx driver, I suspect that your actual resource tree is all good

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Rafael J. Wysocki wrote: And if you have the whole dmesg, that would be useful. dmesg from -rc5 with the offending commit reverted and with the patch below applied is at: http://www.sisk.pl/kernel/debug/mainline/2.6.27-rc5/2.6.27-rc5-git.log Ok, the more I look

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Linus Torvalds wrote: We simply shouldn't try to compare the BAR start with randomly chosen things. Btw, looking at that bogus BAR#3 some more: I don't actually think it's even an MCFG resource. I think it's literally the resource that describes the HT window

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Yinghai Lu wrote: do you agree to use quirk to make the BAR res to have correct end between pci_probe and pci_resource_survey? In general I would agree, but now that I've looked at it a bit more, I actually don't think it's a bug in the chipset any more. See my

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Yinghai Lu wrote: AMD CPU/NB (quad core aka fam 10h later) has MSR to state MMCONFIG, and the ATI bridge BAR that have same address for MMCONFIG not even have chance to decode that, because NB intercept that already. Ok, so it's similar to the local APIC in that

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Yinghai Lu wrote: in old kernel, after BAR3 request_filed, pci_assigned_unassigned should get update resource for that... but it could find that big space for it. Exactly. So what happens is that it doesn't actually re-allocate it at all. Not that it is necessarily

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Yinghai Lu wrote: wait, THAT BAR is 64BIT capable, So kernel should assign 64bit range to it... it request_resource fails... I don't think we've ever done new allocations in 64 bits. Although looking for it, I have to admit that I don't see what would limit us right

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Yinghai Lu wrote: then 1. we should not probe them in probe.c 2. at least we should not try to request_resource for them in pcibios_resource_survey... just pretend that they are not existing. You are missing the fact that we need to know where existing resources

Re: Linux 2.6.27-rc5: System boot regression caused by commit a2bd7274b47124d2fc4dfdb8c0591f545ba749dd

2008-08-30 Thread Linus Torvalds
On Sat, 30 Aug 2008, Linus Torvalds wrote: Short recap: - we need to populate the resource map with as much possible information about the system as we can.. - .. because when we assign _dynamic_ resources, we need to make sure that they don't clash with random system

Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

2008-09-30 Thread Linus Torvalds
On Tue, 30 Sep 2008, Mike Travis wrote: One pain is: typedef struct __cpumask_s *cpumask_t; const cpumask_t xxx; is not the same as: typedef const struct __cpumask_s *const_cpumask_t; const_cpumask_t xxx; and I'm not exactly sure why. Umm. The const has
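The reply above is truncated, so what follows is not a quote from it. It is a small sketch of the standard C rule that explains the difference Mike describes: `const` applied to a pointer typedef qualifies the pointer itself, not the object it points to. The type `struct mask_sketch` and both function names are hypothetical stand-ins for illustration.

```c
#include <stddef.h>

struct mask_sketch {            /* hypothetical stand-in for struct __cpumask_s */
    unsigned long bits;
};

typedef struct mask_sketch *mask_ptr_t;
typedef const struct mask_sketch *const_mask_ptr_t;

/*
 * "const mask_ptr_t p" means "struct mask_sketch *const p": the pointer
 * cannot be re-aimed, but the object it points at can still be modified.
 */
static unsigned long write_through_const_typedef(struct mask_sketch *m)
{
    const mask_ptr_t p = m;

    p->bits |= 1UL;             /* allowed: the pointee is not const */
    /* p = NULL; would not compile: p itself is const */
    return p->bits;
}

/*
 * const_mask_ptr_t is "const struct mask_sketch *": the pointer can be
 * re-aimed, but the object cannot be modified through it.
 */
static unsigned long read_through_pointer_to_const(const struct mask_sketch *a,
                                                   const struct mask_sketch *b)
{
    const_mask_ptr_t q = a;

    q = b;                      /* allowed: q itself is not const */
    /* q->bits = 0; would not compile: the pointee is const */
    return q->bits;
}
```

In other words, a typedef is not a textual macro: once the `*` is hidden inside the typedef name, a later `const` can only attach to the pointer as a whole.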

Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)

2008-11-10 Thread Linus Torvalds
On Tue, 11 Nov 2008, Benjamin Herrenschmidt wrote: In any case, it doesn't seem to be directly related to those radeonfb changes, though a clash with X like that is indeed more likely to actually happen if radeonfb relies more heavily on acceleration. Just a silly question, without actually

Re: 2.6.28-rc5: Reported regressions from 2.6.27

2008-11-16 Thread Linus Torvalds
On Sun, 16 Nov 2008, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12049 Subject : Oops in acpi_system_wakeup_device_seq_show Submitter : Bruno Prémont [EMAIL PROTECTED] Date : 2008-11-16 13:04 (1 days old) First-Bad-Commit:

Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, Eric Dumazet wrote: Ingo Molnar wrote: it gives a small speedup of ~1% on my box: before: Throughput 3437.65 MB/sec 64 procs after: Throughput 3473.99 MB/sec 64 procs Strange, I get 2350 MB/sec on my 8 cpus box. tbench 8 I think Ingo may

Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, David Miller wrote: The scheduler has accounted for at least 10% of the tbench regressions at this point, what are you talking about? I'm wondering if you're not looking at totally different issues. For example, if I recall correctly, David had a big hit on the

Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, David Miller wrote: From: Ingo Molnar [EMAIL PROTECTED] Date: Mon, 17 Nov 2008 12:01:19 +0100 The scheduler's overhead barely even registers on a 16-way x86 system i'm running tbench on. Here's the NMI profile during 64 threads tbench on a 16-way x86 box with an

Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, David Miller wrote: Again, do a non-NMI profile and the top (at least for me) looks like this: Can _you_ please do a NMI profile and see what your real problem is? I can't imagine that Niagara (or whatever) is so weak that it can't do NMI's. The fact is, David, that

Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, David Miller wrote: And as a result I found that wake_up() is now 4 times slower than it was in 2.6.22, I even analyzed this for every single kernel release till now. ..and that's the one where you then pointed to hrtimers, and now you claim that was fixed? At least

Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, David Miller wrote: It's on my workstation which is a much simpler 2 processor UltraSPARC-IIIi (1.5Ghz) system. Ok. It could easily be something like a cache footprint issue. And while I don't know my sparc cpu's very well, I think the Ultrasparc-IIIi is super-scalar

Re: skb_release_head_state(): Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, Ingo Molnar wrote: this function _really_ hurts from a 16-bit op: 8048943e: 650366 c7 83 a8 00 00 00movw $0x0,0xa8(%rbx) 80489445:000 00 80489447: 1741015b pop%rbx I don't think that

Re: eth_type_trans(): Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, Ingo Molnar wrote: 8049e2ae:00f b7 c0movzwl %ax,%eax 8049e2b1:03d ff 05 00 00 cmp$0x5ff,%eax 8049e2b6: 4687f 18 jg 8049e2d0 eth_type_trans+0xbb

Re: system_call() - Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Mon, 17 Nov 2008, Ingo Molnar wrote: syscall entry instruction costs - unavoidable security checks, etc. - hardware costs. Yes. One thing to look out for on x86 is the system call _return_ path. It doesn't show up in kernel profiles (it shows up as user costs), and we had a bug where

Re: eth_type_trans(): Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-17 Thread Linus Torvalds
On Tue, 18 Nov 2008, Eric Dumazet wrote: * * Compare two ethernet addresses, returns 0 if equal */ static inline unsigned compare_ether_addr(const u8 *addr1, const u8 *addr2) { const u16 *a = (const u16 *) addr1; const u16 *b = (const u16 *) addr2;
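The quoted function is cut off after its first two lines, and the rest is not reproduced here. As a rough illustration of the technique those lines point at (comparing a 6-byte Ethernet address as three 16-bit words), here is a userspace sketch. The name `ether_addr_differs` and the `ETH_ADDR_LEN` constant are assumptions made for this example, and the `memcpy` step is a deliberate deviation from the pointer casts in the quoted code, used so the sketch makes no alignment assumptions.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6  /* length of an Ethernet MAC address in bytes */

/*
 * Returns 0 if the two addresses are equal, non-zero otherwise.
 * Hypothetical userspace sketch, not the kernel's compare_ether_addr().
 */
static unsigned ether_addr_differs(const uint8_t *addr1, const uint8_t *addr2)
{
    uint16_t a[ETH_ADDR_LEN / 2];
    uint16_t b[ETH_ADDR_LEN / 2];

    /* memcpy avoids any alignment or aliasing assumptions about the inputs */
    memcpy(a, addr1, sizeof(a));
    memcpy(b, addr2, sizeof(b));

    /* XOR is zero only for equal words; OR accumulates any difference */
    return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
}
```

The XOR/OR form has no branches between the three word comparisons, which is presumably why this style is attractive on a hot receive path.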

Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28

2008-11-18 Thread Linus Torvalds
On Tue, 18 Nov 2008, Nick Piggin wrote: On Tuesday 18 November 2008 07:58, David Miller wrote: From: Linus Torvalds [EMAIL PROTECTED] Ok. It could easily be something like a cache footprint issue. And while I don't know my sparc cpu's very well, I think the Ultrasparc-IIIi

Re: [Bug #12158] commit b1ee26b freezes system on switching from X to text console

2008-12-03 Thread Linus Torvalds
On Wed, 3 Dec 2008, Rafael J. Wysocki wrote: This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.27. Please verify if it still should be listed and let me know (either

Re: [Bug #12422] 2.6.28-git can't resume from str

2009-01-12 Thread Linus Torvalds
On Tue, 13 Jan 2009, Jeff Chua wrote: I was trying to bisect further, but ended up with this strange behavior... # git bisect good a3a798c88a14b35e5d4ca30716dbc9eb9a1ddfe2 Bisecting: 579 revisions left to test after this [079899c2384023cd8efcd3806680b4f1d2abbd54] Btrfs: Change

Re: [Bug #12809] iozone regression with 2.6.29-rc6

2009-03-14 Thread Linus Torvalds
On Sat, 14 Mar 2009, Rafael J. Wysocki wrote: The following bug entry is on the current list of known regressions from 2.6.28. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12809 Subject

Re: 2.6.29-rc8: Reported regressions from 2.6.28

2009-03-15 Thread Linus Torvalds
On Sun, 15 Mar 2009, Johannes Berg wrote: On Sun, 2009-03-15 at 11:06 +0800, Jeff Chua wrote: The commit below is causing problem with associating with the hidden AP as well. 71c11fb57b924c160297ccd9e1761db598d00ac2 is first bad commit commit

Re: 2.6.29-rc8: Reported regressions from 2.6.28

2009-03-16 Thread Linus Torvalds
On Mon, 16 Mar 2009, Jeff Chua wrote: Take the attached bisect log and replay it Taking a bisect log is repeatable, but pointless. If you made any mistakes in bisecting (marking a kernel that was good as being bad, or the other way around), the log will always replay to the same thing,

Re: 2.6.29-rc8-git5: Reported regressions from 2.6.28

2009-03-23 Thread Linus Torvalds
On Mon, 23 Mar 2009, Wu Fengguang wrote: I think this was tracked back to the effective halving of dirty_ratio by 1cf6e7d83 (mm: task dirty accounting fix) and doubling the ratio fixed the iozone regression. Yes, exactly. The patch for fixing this regression is trivial. I was

Re: 2.6.29-git13: Reported regressions from 2.6.28

2009-04-06 Thread Linus Torvalds
On Mon, 6 Apr 2009, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13019 Subject : /proc/pid/maps offset output broken in 2.6.29 Submitter : Chris Friesen cfrie...@nortel.com Date : 2009-04-01 23:18 (6 days old) References

Re: [Bug #12809] iozone regression with 2.6.29-rc6

2009-04-06 Thread Linus Torvalds
On Mon, 6 Apr 2009, Rafael J. Wysocki wrote: This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.28. Please verify if it still should be listed and let me know (either

Re: 2.6.29-git13: Reported regressions from 2.6.28

2009-04-06 Thread Linus Torvalds
On Mon, 6 Apr 2009, Trenton D. Adams wrote: This went through bisection, but looking at the email log, I tend to suspect that maybe Trenton marked some versions good even though they weren't (because they got version numbers from v2.6.27), and didn't realize that that messes up

Re: 2.6.29-git13: Reported regressions from 2.6.28

2009-04-16 Thread Linus Torvalds
On Thu, 16 Apr 2009, Chris Friesen wrote: I'm okay with that. The problem causes some backwards compatibility problems with existing apps that get confused by the large offset number. The fix is going to cause problems too, but in a different way. We'll work around it. If you have

Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29

2009-04-16 Thread Linus Torvalds
I think you put this in the wrong regression pile: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13112 Subject : Oops in drain_array Submitter : Bart m...@riz.pl Date : 2009-04-14 10:21 (3 days old) References:

Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29

2009-04-16 Thread Linus Torvalds
On Thu, 16 Apr 2009, Rafael J. Wysocki wrote: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13098 Subject : 2.6.29-git12 breaks vga=0x0f07 on MSI/Intel GPU Submitter : Andi Kleen a...@firstfloor.org Date : 2009-04-06 01:14 (11 days old) References

Re: [Bug #13058] First hibernation attempt fails

2009-04-17 Thread Linus Torvalds
On Fri, 17 Apr 2009, Alan Jenkins wrote: As another datapoint: I tried blindly applying the commit to 2.6.29. The resulting kernel was able to hibernate fine the first time. Yeah, so it's not that commit per se that causes it. I bet it needs all the IO scheduler changes too - and even

Re: [Bug #13058] First hibernation attempt fails

2009-04-17 Thread Linus Torvalds
On Fri, 17 Apr 2009, Rafael J. Wysocki wrote: Can you please try to reproduce the problem with the appended debug patch applied and send the output of dmesg to me? Maybe something like this instead (or in addition to). It does show_mem() when memory shrinking fails. It will show a _lot_ of

Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29

2009-04-17 Thread Linus Torvalds
On Sat, 18 Apr 2009, leiming wrote: From 5715e310a939f3f7cd3e88eae8f25fedbb28def4 Mon Sep 17 00:00:00 2001 From: Ming Lei tom.leim...@gmail.com Date: Wed, 15 Apr 2009 22:32:51 +0800 Subject: [PATCH] V4L/DVB:usbvideo:fix uvc resume failed Now urb buffers is not freed before suspend, so

Re: [PATCH 3/4] PM/Hibernate: Use memory allocations to free memory (rev. 2)

2009-05-02 Thread Linus Torvalds
On Sun, 3 May 2009, Rafael J. Wysocki wrote: Remove the shrinking of memory from the suspend-to-RAM code, where it is not really necessary. Hmm. Shouldn't we do this _regardless_? IOW, shouldn't this be a totally separate patch? It seems to be left-over from when we shared the same

Re: 2.6.30-rc2-git2: Reported regressions from 2.6.29

2009-05-09 Thread Linus Torvalds
On Sat, 9 May 2009, Ming Lei wrote: Rc5 has been released today, why isn't this patch accepted by upstream now? It is really a bug fix. I can take it directly, but was hoping to get it through the regular DVB tree. Haven't had a DVB update request yet (or maybe it got lost?)

Re: [Bug #13122] reiserfs_delete_xattrs: Couldn't delete all xattrs (-13)

2009-05-17 Thread Linus Torvalds
On Sat, 16 May 2009, Rafael J. Wysocki wrote: The following bug entry is on the current list of known regressions from 2.6.29. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13122 Subject

Re: 2.6.30-rc6: Reported regressions from 2.6.29

2009-05-18 Thread Linus Torvalds
On Mon, 18 May 2009, Ingo Molnar wrote: Btw., why did the patch (and the revert) make any difference to the test? Timing differences look improbable. It's the change from !signal_group_exit(signal) to !sig_kernel_only(signr) and quite frankly, I still don't see the

Re: [Bug #14030] Kernel NULL pointer dereference at 0000000000000008, pty-related

2009-08-25 Thread Linus Torvalds
://marc.info/?l=linux-kernel&m=125074724623423&w=4 Handled-By: Linus Torvalds torva...@linux-foundation.org Patch : http://patchwork.kernel.org/patch/43679/ This is now committed as 5c58ceff103d8a654f24769bb1baaf84a841b0cc. Linus

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-02 Thread Linus Torvalds
On Tue, 1 Sep 2009, Rafael J. Wysocki wrote: On Tuesday 01 September 2009, Mikael Pettersson wrote: Starting with 2.6.31-rc8 and reverting 85dfd81dc57e8183a277ddd7a56aa65c96f3f487 pty: fix data loss when stopped (^S/^Q) d945cb9cce20ac7143c2de8d88b187f62db99bdc pty: Rework the pty

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-03 Thread Linus Torvalds
On Thu, 3 Sep 2009, OGAWA Hirofumi wrote: If I'm not missing, I think it doesn't have big change with old code. But I would need to check more deeply. The thing is, the old pty code pushed _directly_ to the receiving ldisc, with no buffering. I'm not entirely sure why Alan felt it needed

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-04 Thread Linus Torvalds
On Fri, 4 Sep 2009, Linus Torvalds wrote: And I suspect that that means that the bug is related to do_output_char() expanding '\n' into '\r\n'. And the different buffering (and the pty 'space' logic) just means that we now hit a case that we didn't use to hit. The relevant call chain

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-04 Thread Linus Torvalds
On Fri, 4 Sep 2009, Linus Torvalds wrote: How about something like this? It's way too anal - it says that we can only write data if there's enough space to always push it all the way to the receive buffer (including all the data that was already buffered up, ie the memory_used part

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-04 Thread Linus Torvalds
On Fri, 4 Sep 2009, Linus Torvalds wrote: And again - UNTESTED. Maybe this makes the buffering _too_ small (the 'memory_used' thing is not really counted in bytes buffered, it's counted in how much buffer space we've allocated) and things break even worse and pty's don't work at all

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-04 Thread Linus Torvalds
On Fri, 4 Sep 2009, Alan Cox wrote: In which case ppp will no longer work properly in some cases (ditto other protocols) and things like the pppoe gateway won't work as they don't in 2.6.30 - you need to go back to somewhere between 2.6.28/29 to undo this, then apply the alternative locking

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-04 Thread Linus Torvalds
On Fri, 4 Sep 2009, Mikael Pettersson wrote: Comparing the gcc outputs for this test case from runs with 2.6.30 and 2.6.31-rc8 shows that 2.6.31-rc8 lost a single newline (\n) byte at byte offset 131660. So two lines of diagnostics were fused together and the testsuite framework failed to
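
The comparison described above (two captured outputs that differ by one missing byte) can be reproduced with a small helper that reports the first offset at which two buffers disagree. This is only a sketch of that kind of check, not the tool Mikael used; the names first_diff and NO_DIFF are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical sentinel meaning "the two buffers are identical". */
#define NO_DIFF ((size_t)-1)

/*
 * Return the offset of the first byte at which a and b differ.
 * If one buffer is a strict prefix of the other, return the length
 * of the shorter one. If both are identical, return NO_DIFF.
 */
static size_t first_diff(const unsigned char *a, size_t alen,
                         const unsigned char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;

    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return i;

    return alen == blen ? NO_DIFF : n;
}
```

When a single '\n' has been dropped from one capture, the returned offset is where that newline sits in the complete capture, since everything before it still matches.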

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-04 Thread Linus Torvalds
On Fri, 4 Sep 2009, Linus Torvalds wrote: So I'm starting to suspect that the real bug is that we do that 'pty_space()' in pty_write() call at all. The _callers_ should already have done the write_room() check, and if somebody doesn't do it, then the tty buffering will eventually do

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-04 Thread Linus Torvalds
On Fri, 4 Sep 2009, Linus Torvalds wrote: I'm sure you already figured the obvious meaning out, but here's a fixed version. And here's another patch that may also fix this, simply by virtue of writing the \r\n as a single string, rather than as two characters. That way, we should never
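
The idea of writing "\r\n" as one string rather than as two separate characters can be illustrated with a toy userspace model. This is not the kernel's tty or pty code: struct toy_buf, toy_write() and both emit_newline_*() helpers are hypothetical names, and the model assumes only that a write into a nearly full buffer may be cut short.

```c
#include <stddef.h>
#include <string.h>

/* Toy bounded buffer. NOT the kernel tty buffer; illustration only. */
struct toy_buf {
    char data[8];
    size_t used;
    size_t cap;   /* logical capacity, must be <= sizeof(data) */
};

static size_t toy_room(const struct toy_buf *b)
{
    return b->cap - b->used;
}

/* Accept up to len bytes; return how many were actually stored. */
static size_t toy_write(struct toy_buf *b, const char *s, size_t len)
{
    size_t n = toy_room(b);

    if (n > len)
        n = len;
    memcpy(b->data + b->used, s, n);
    b->used += n;
    return n;
}

/*
 * Expand a newline as two independent one-byte writes whose results
 * are ignored. With one byte of room, '\r' is stored and '\n' is lost.
 */
static void emit_newline_split(struct toy_buf *b)
{
    toy_write(b, "\r", 1);
    toy_write(b, "\n", 1);
}

/*
 * Expand a newline as one two-byte write, all or nothing.
 * Returns 0 on success, -1 if there is not room for both bytes.
 */
static int emit_newline_atomic(struct toy_buf *b)
{
    if (toy_room(b) < 2)
        return -1;
    toy_write(b, "\r\n", 2);
    return 0;
}
```

In this model the split version can leave a lone '\r' in the buffer when only one byte of room remains, while the single-string version either stores both bytes or reports failure so the caller can retry later.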

Re: [Bug #14015] pty regressed again, breaking expect and gcc's testsuite

2009-09-05 Thread Linus Torvalds
On Sun, 6 Sep 2009, OGAWA Hirofumi wrote: This is not meaning to object to your patch though, I think we would be good to fix pty_space(), not leaving as wrong. With fix it, I guess we don't get strange behavior in the near of buffer limit. I'd actually rather not make that function any
