Re: [RFC] Crash on modpost, addend_386_rel()
On Mon, May 21, 2007 at 10:01:27PM -0400, Ben Collins wrote:
> Got this crash in modpost. Bisect blames this commit:
>
> commit f892b7d480eec809a5dfbd6e65742b3f3155e50e
> Author: Atsushi Nemoto <[EMAIL PROTECTED]>
> Date:   Thu May 17 01:14:38 2007 +0900
>
>     kbuild: make better section mismatch reports on i386, arm and mips
>
>     On i386, ARM and MIPS, warn_sec_mismatch() sometimes fails to show
>     usefull symbol name. This is because empty 'refsym' due to 0 r_addend
>     value. This patch is to adjust r_addend value, consulting with
>     apply_relocate() routine in kernel code.
>
> Sorry, I don't know enough about the elf stuff to fix this up myself.
> Config that causes it is at the end. Thanks.

Linus already reported the same in private mail and I asked him to revert
this commit which he has done by now.
When properly fixed it will be re-added but this time it will cook a little
in -mm first.

	Sam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: This kernel requires the following features not present on the CPU... (on a VIA C7 CPU)
Hello Artur,

I purchased a C7 VIA Esther and I experience a similar problem with random
crashes. There is also a thread on Via Arena about "J7F4K - Hard Lockups"
where people describe the same issue:
http://forums.viaarena.com/messageview.aspx?catid=28&threadid=77032

I've tried the following kernels:
2.6.21.1 - crashes pretty quick
2.6.21.1 with disabled CPU scaling - better mileage but still crashes
2.6.22-rc1 - hangs on boot
2.6.22-rc1-mm1 - doesn't compile
2.6.22-rc2 - hangs on boot

here, too: crashes over crashes...

I've run memtest86 and no memory problems were found.
Any ideas or suggestions? Thank You.

The same here. I applied the patch mentioned by Christian Volkmann in the
thread "Re: 2.6.22-rc1 does not boot on VIA C3_2 cause of X86_CMPXCHG64 II"
on May 19th 2007. Since then I did not have a crash any more. Touch wood!

Could anyone explain to me what CMPXCHG64 / cx8 is and what happens if the
kernel has been compiled to use it but the CPU does not have it? Especially
if the CPU, like the C7, supports it but it's disabled? Would random crashes
be a plausible effect?

Regards, claas
[PATCH] CFS: sched-design-CFS.txt - ambiguity about leftmost
Hello,

I felt the description of the leftmost task was a bit ambiguous. Is it the
leftmost task in the rbtree? Or did you mean the "most leftout task" in the
task list? If so, then this patch should correct the leftmost task to
"most leftout task". NACK it if I'm wrong. Just trying to help. :)

Changes "leftmost task" to "most leftout task".

Signed-off-by: Pranith Kumar D<[EMAIL PROTECTED]>
---

--- sched-design-CFS.txt.orig	2007-05-22 12:04:43.0 +0530
+++ sched-design-CFS.txt	2007-05-22 12:11:35.0 +0530
@@ -37,9 +37,9 @@ the task schedules (or a scheduler tick
 'accounted for': the (small) time it just spent using the physical CPU
 is deducted from p->wait_runtime. [minus the 'fair share' it would have
 gotten anyway]. Once p->wait_runtime gets low enough so that another
-task becomes the 'leftmost task' (plus a small amount of 'granularity'
-distance relative to the leftmost task so that we do not over-schedule
-tasks and trash the cache) then the new leftmost task is picked and the
+task becomes the 'most leftout task' (plus a small amount of 'granularity'
+distance relative to the most leftout task so that we do not over-schedule
+tasks and trash the cache) then the new most leftout task is picked and the
 current task is preempted.
 
 The rq->fair_clock value tracks the 'CPU time a runnable task would have
@@ -47,10 +47,10 @@ fairly gotten, had it been runnable duri
 rq->fair_clock values we can accurately timestamp and measure the
 'expected CPU time' a task should have gotten. All runnable tasks are
 sorted in the rbtree by the "rq->fair_clock - p->wait_runtime" key, and
-CFS picks the 'leftmost' task and sticks to it. As the system progresses
+CFS picks the 'most leftout' task and sticks to it. As the system progresses
 forwards, newly woken tasks are put into the tree more and more to the
 right - slowly but surely giving a chance for every task to become the
-'leftmost task' and thus get on the CPU within a deterministic amount of
+'most leftout task' and thus get on the CPU within a deterministic amount of
 time.
 
 Some implementation details:
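On the rbtree reading of "leftmost": the text being patched says runnable
tasks are sorted in the rbtree by the "rq->fair_clock - p->wait_runtime" key,
so the leftmost node would simply be the task with the smallest key. Below is
a minimal user-space sketch of that reading. It assumes a plain array in
place of the kernel rbtree, and struct toy_task, toy_key() and
pick_leftmost() are made-up names for illustration, not kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a runnable task; not the kernel's task_struct. */
struct toy_task {
	const char *name;
	long long wait_runtime;	/* CPU time the task is still "owed" */
};

/* Sort key as quoted from the design document: fair_clock - wait_runtime. */
static long long toy_key(long long fair_clock, const struct toy_task *p)
{
	return fair_clock - p->wait_runtime;
}

/*
 * Return the task with the smallest key, i.e. the one an rbtree ordered
 * by that key would hold in its leftmost node. A linear scan is used for
 * clarity only. Returns NULL for an empty set.
 */
static const struct toy_task *pick_leftmost(long long fair_clock,
					    const struct toy_task *tasks,
					    size_t n)
{
	const struct toy_task *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!best ||
		    toy_key(fair_clock, &tasks[i]) < toy_key(fair_clock, best))
			best = &tasks[i];
	}
	return best;
}
```

With three tasks owed 10, 50 and 30 units, the one owed 50 has the smallest
key and is picked, which matches the idea that the task that has waited
longest ends up leftmost.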
Re: [PATCH] Prevent going idle with softirq pending
* Andrew Morton <[EMAIL PROTECTED]> wrote:

> [ 550.280860] BUG: at kernel/softirq.c:138 local_bh_enable()

yep. The correct patch is the one below.

	Ingo

->
Subject: Prevent going idle with softirq pending
From: Thomas Gleixner <[EMAIL PROTECTED]>

The NOHZ patch contains a check for softirqs pending when a CPU goes
idle. The BUG is unrelated to NOHZ, it just was made visible by the NOHZ
patch. The BUG showed up mainly on P4 / hyperthreading enabled machines
which lead the investigations into the wrong direction in the first
place. The real cause is in cond_resched_softirq():

cond_resched_softirq() is enabling softirqs without invoking the softirq
daemon when softirqs are pending. This leads to the warning message in
the NOHZ idle code:

t1 runs softirq disabled code on CPU#0
interrupt happens, softirq is raised, but deferred (softirqs disabled)
t1 calls cond_resched_softirq()
	enables softirqs via _local_bh_enable()
	calls schedule()
t2 runs
t1 is migrated to CPU#1
t2 is done and invokes idle()
NOHZ detects the pending softirq

Fix: change _local_bh_enable() to local_bh_enable() so the softirq
daemon is invoked.

Thanks to Anant Nitya for debugging this with great patience !

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4212,9 +4212,7 @@ int __sched cond_resched_softirq(void)
 	BUG_ON(!in_softirq());
 
 	if (need_resched() && system_state == SYSTEM_RUNNING) {
-		raw_local_irq_disable();
-		_local_bh_enable();
-		raw_local_irq_enable();
+		local_bh_enable();
 		__cond_resched();
 		local_bh_disable();
 		return 1;
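The difference between the two enable paths described in the changelog can
be modelled in user space. This is only a toy sketch under stated
assumptions: struct toy_cpu and every toy_*() function are invented for
illustration, a single flag stands in for the pending softirq mask, and no
real interrupts or scheduling are involved.

```c
#include <stdbool.h>

/* Toy per-CPU state; all names here are illustrative, not kernel symbols. */
struct toy_cpu {
	int  bh_disable_count;	/* > 0 means softirqs are disabled */
	bool softirq_pending;	/* raised but not yet handled */
	int  softirqs_run;	/* how many times pending work was run */
};

/* An interrupt raises a softirq; it stays pending until someone runs it. */
static void toy_raise_softirq(struct toy_cpu *cpu)
{
	cpu->softirq_pending = true;
}

static void toy_bh_disable(struct toy_cpu *cpu)
{
	cpu->bh_disable_count++;
}

/* Models _local_bh_enable(): drops the count, never runs pending work. */
static void toy_bh_enable_no_run(struct toy_cpu *cpu)
{
	cpu->bh_disable_count--;
}

/* Models local_bh_enable(): drops the count and runs pending softirqs. */
static void toy_bh_enable(struct toy_cpu *cpu)
{
	cpu->bh_disable_count--;
	if (cpu->bh_disable_count == 0 && cpu->softirq_pending) {
		cpu->softirq_pending = false;
		cpu->softirqs_run++;
	}
}

/* What the NOHZ idle check would see: true if work was left behind. */
static bool toy_idle_sees_pending(const struct toy_cpu *cpu)
{
	return cpu->softirq_pending;
}
```

Running disable, raise, then toy_bh_enable_no_run() leaves the pending flag
set for the idle check to trip over; the same sequence ending in
toy_bh_enable() clears it, which is the effect the one-line fix relies on.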
Re: [PATCH] Prevent going idle with softirq pending
On Mon, 21 May 2007 23:34:24 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> The NOHZ patch contains a check for softirqs pending when a CPU goes
> idle. The BUG is unrelated to NOHZ, it just was made visible by the NOHZ
> patch. The BUG showed up mainly on P4 / hyperthreading enabled machines
> which lead the investigations into the wrong direction in the first
> place. The real cause is in cond_resched_softirq():
>
> cond_resched_softirq() is enabling softirqs without invoking the softirq
> daemon when softirqs are pending. This leads to the warning message in
> the NOHZ idle code:
>
> t1 runs softirq disabled code on CPU#0
> interrupt happens, softirq is raised, but deferred (softirqs disabled)
> t1 calls cond_resched_softirq()
> 	enables softirqs via _local_bh_enable()
> 	calls schedule()
> t2 runs
> t1 is migrated to CPU#1
> t2 is done and invokes idle()
> NOHZ detects the pending softirq
>
> Fix: change _local_bh_enable() to local_bh_enable() so the softirq
> daemon is invoked.
>
> Thanks to Anant Nitya for debugging this with great patience !
>
> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
>
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4776,7 +4776,7 @@ int __sched cond_resched_softirq(void)
>  
>  	if (need_resched() && system_state == SYSTEM_RUNNING) {
>  		raw_local_irq_disable();
> -		_local_bh_enable();
> +		local_bh_enable();
>  		raw_local_irq_enable();
>  		__cond_resched();
>  		local_bh_disable();
>

[ 550.280860] BUG: at kernel/softirq.c:138 local_bh_enable()
[ 550.281019] [] local_bh_enable+0x3c/0x79
[ 550.281153] [] cond_resched_softirq+0x2d/0x43
[ 550.281291] [] release_sock+0x38/0x74
[ 550.281414] [] tcp_sendmsg+0x8e4/0x9d2
[ 550.281565] [] inet_sendmsg+0x3b/0x45
[ 550.281692] [] sock_sendmsg+0xcf/0xea
[ 550.281826] [] autoremove_wake_function+0x0/0x35
[ 550.281974] [] __qdisc_run+0x9a/0x12b
[ 550.282095] [] dev_queue_xmit+0x1e7/0x206
[ 550.282225] [] ip_output+0x23b/0x277
[ 550.282341] [] __nf_ct_refresh_acct+0xcf/0x102 [nf_conntrack]
[ 550.282528] [] tcp_packet+0x9c7/0x9f0 [nf_conntrack]
[ 550.282693] [] xdr_skb_read_bits+0x21/0x35 [sunrpc]
[ 550.282872] [] xdr_partial_copy_from_skb+0x12a/0x172 [sunrpc]
[ 550.283067] [] kernel_sendmsg+0x27/0x35
[ 550.283192] [] xs_send_kvec+0x98/0xa0 [sunrpc]
[ 550.283376] [] xs_sendpages+0x75/0x1b4 [sunrpc]
[ 550.283554] [] xs_tcp_send_request+0x5a/0x11c [sunrpc]
[ 550.283739] [] xprt_transmit+0xc2/0x1a4 [sunrpc]
[ 550.283901] [] rpcauth_wrap_req+0x6c/0x74 [sunrpc]
[ 550.284070] [] rpcauth_marshcred+0x4b/0x52 [sunrpc]
[ 550.284239] [] xprt_prepare_transmit+0x6a/0x73 [sunrpc]
[ 550.284423] [] nfs3_xdr_readargs+0x0/0x88 [nfs]
[ 550.284595] [] call_transmit+0x1c0/0x1f3 [sunrpc]
[ 550.284766] [] call_reserve+0x3c/0x65 [sunrpc]
[ 550.284933] [] __rpc_execute+0x6f/0x1fc [sunrpc]
[ 550.285095] [] sigprocmask+0x86/0x8d
[ 550.285222] [] nfs_execute_read+0x30/0x3f [nfs]
[ 550.285396] [] nfs_pagein_one+0x9d/0xda [nfs]
[ 550.285563] [] nfs_pageio_doio+0x2c/0x52 [nfs]
[ 550.285731] [] nfs_pageio_add_request+0xa2/0xb3 [nfs]
[ 550.285912] [] readpage_async_filler+0x102/0x11f [nfs]
[ 550.286102] [] readpage_async_filler+0x0/0x11f [nfs]
[ 550.286274] [] read_cache_pages+0x72/0xd4
[ 550.286426] [] nfs_readpages+0x10c/0x14d [nfs]
[ 550.286595] [] ip_finish_output+0x0/0x1e7
[ 550.286727] [] nfs_pagein_one+0x0/0xda [nfs]
[ 550.286893] [] nfs_readpages+0x0/0x14d [nfs]
[ 550.287054] [] __do_page_cache_readahead+0xe3/0x19c
[ 550.287204] [] xdr_partial_copy_from_skb+0x12a/0x172 [sunrpc]
[ 550.291280] [] xs_tcp_data_recv+0x3cd/0x401 [sunrpc]
[ 550.295331] [] xdr_skb_read_bits+0x0/0x35 [sunrpc]
[ 550.299385] [] blockable_page_cache_readahead+0x4c/0x9f
[ 550.303465] [] make_ahead_window+0x7c/0x99
[ 550.307499] [] page_cache_readahead+0x17a/0x1a4
[ 550.311532] [] do_generic_mapping_read+0x13b/0x432
[ 550.315583] [] generic_file_aio_read+0x130/0x157
[ 550.319512] [] file_read_actor+0x0/0xd1
[ 550.323492] [] do_sync_read+0xc6/0x109
[ 550.327471] [] ip_rcv_finish+0x0/0x235
[ 550.331524] [] autoremove_wake_function+0x0/0x35
[ 550.335629] [] do_sync_read+0x0/0x109
[ 550.339532] [] vfs_read+0xa6/0x150
[ 550.343513] [] sys_read+0x41/0x67
[ 550.347385] [] syscall_call+0x7/0xb
[ 550.351321] ===

That's

	WARN_ON_ONCE(irqs_disabled());
Re: [RFC][PATCH 10/14] In-kernel file copy between union mounted filesystems
On May 22 2007 08:43, Bharata B Rao wrote:
>On Fri, May 18, 2007 at 09:47:31AM -0400, Shaya Potter wrote:
>> Bharata B Rao wrote:
>> >
>> >Not really. This is called during copyup of a file residing in a lower
>> >layer. And that is done only for regular files.
>>
>> That is broken.
>
>But it only breaks the semantics (in other cases we allow writes only to the
>top layer files). So the question is why do we have to copy up the device
>node ? What difference it makes to writing to the device itself ?

Because `chmod 666 blockdevnode` is not the same as writing
to the device itself?

>Currently we allow write to the device using the lower layer device node
>itself.

	Jan
--
Re: [rfc] increase struct page size?!
On Mon, May 21, 2007 at 10:04:10PM -0700, William Lee Irwin III wrote:
> On Mon, May 21, 2007 at 06:39:51PM -0700, William Lee Irwin III wrote:
> >> address (virtual and physical are trivially inter-convertible), mock
> >> up something akin to what filesystems do for anonymous pages, etc.
> >> The real objection everyone's going to have is that driver writers
> >> will stain their shorts when faced with the rules for handling such
> >> things. The thing is, I'm not entirely sure who these driver writers
> >> that would have such trouble are, since the driver writers I know
> >> personally are sophisticates rather than walking disaster areas as such
> >> would imply. I suppose they may not be representative of the whole.
>
> On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> > That's not the objection I would have. I would say that firstly, I
> > don't think the mem_map overhead is very significant (at any rate,
> > an allocated-on-demand metadata is not going to be any smaller if
> > you fill up on pagecache...). Secondly, I think there is merit to
> > having the same page metadata used by the major subsystems, because
> > it helps for locality of reference.
>
> The size isn't the advantage being cited; I'd actually expect the net
> result to be larger. It's the control over the layout of the metadata
> for cache locality and even things like having enough flags, folding
> buffer_head -like affairs into the per-page metadata for filesystems
> and so reaping cache locality benefits even there (assuming it works
> out in other respects), and so on.
>
> Passing pages between subsystems doesn't seem very significant to me.
> There isn't going to be much locality of reference, or even any
> guarantee that the subsystem gets fed a cache hot page structure. The
> subsystem being passed the page will have its own cache hot accounting
> structures to stick the information about the memory into.

Well consider the page allocator and pagecache. The page allocator uses
page metadata rather than eg. a bitmap, and it uses page list heads for
the per-cpu allocator. If we were to instead perhaps use external bitmaps
and arrays to keep track of pages, then the pagecache would have to go
and allocate its own structures rather than reuse the cache hot page
allocator structures.

Buffer heads might be something that would work well, but we'd still like
to be able to deallocate them without freeing the whole pagecache (because
they tend to be associated with less frequent operations like IO).

But anyway, I don't know. I'm sure there would be cases where it works
better.

> On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> > But I haven't explored the idea enough myself to know whether there
> > would be any really killer benefits to this. Delayed metadata freeing
> > via RCU without holding up the freeing of the actual page would have
> > been something, however I can do similar with speculative references
> > now (or whenever the code gets merged), which doesn't even require the
> > RCU overhead.
>
> I'm not entirely sure what you're on about there, but it sounds
> interesting.

Heh :) Well the lockless pagecache would become basically trivial if we
could RCU-free pagecache pages, however doing that is really awful for a
number of reasons. However if you had a system where the metadata is
decoupled, you could simply RCU-free the 'struct page' (while still
immediately freeing the page itself) which would make lockless pagecache
(and potentially similar things) equally trivial. I assumed K42 might have
been into that angle.
Re: bad networking related lag in v2.6.22-rc2
* Anant Nitya <[EMAIL PROTECTED]> wrote:

> > I think I already found the bug, please try if this patch helps.
>
> Sorry, but this patch is not helping here. [...]

btw., could you please send this patch on-list too please?

	Ingo
Re: bad networking related lag in v2.6.22-rc2
* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > Sorry, but this patch is not helping here. [...]
>
> btw., could you please send this patch on-list too please?

disregard this - just found Patrick's patch.

	Ingo
Re: bad networking related lag in v2.6.22-rc2
* Anant Nitya <[EMAIL PROTECTED]> wrote:

> > I think I already found the bug, please try if this patch helps.
>
> Sorry, but this patch is not helping here. I recompiled the kernel
> with this patch but same load pattern still make system to crawl.
>
> Here is the link for script I use to shape traffic.
>
> http://cybertek.info/taitai/adslbwopt.sh

could you also apply the fix for the softirq problem below, to make sure
it does not interact?

	Ingo

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4212,9 +4212,7 @@ int __sched cond_resched_softirq(void)
 	BUG_ON(!in_softirq());
 
 	if (need_resched() && system_state == SYSTEM_RUNNING) {
-		raw_local_irq_disable();
-		_local_bh_enable();
-		raw_local_irq_enable();
+		local_bh_enable();
 		__cond_resched();
 		local_bh_disable();
 		return 1;
Re: [lm-sensors] [RFC] ACPI based hwmon driver for ASUS
Hi,

I have the following readings:

w83627ehf-isa-0290
Adapter: ISA adapter
VCore:     +1.52 V  (min =  +0.00 V, max =  +1.74 V)
in1:      +12.30 V  (min = +13.46 V, max = +13.04 V) ALARM
AVCC:      +3.36 V  (min =  +4.08 V, max =  +3.95 V) ALARM
3VCC:      +3.36 V  (min =  +4.05 V, max =  +3.06 V) ALARM
in4:       +2.04 V  (min =  +1.78 V, max =  +2.04 V)
in5:       +1.60 V  (min =  +2.04 V, max =  +2.02 V) ALARM
in6:       +5.12 V  (min =  +6.12 V, max =  +6.53 V) ALARM
VSB:       +3.36 V  (min =  +4.08 V, max =  +4.08 V) ALARM
VBAT:      +3.30 V  (min =  +4.08 V, max =  +3.04 V) ALARM
in9:       +1.65 V  (min =  +0.98 V, max =  +2.04 V)
Case Fan:    0 RPM  (min =    0 RPM, div = 8)
CPU Fan:  1638 RPM  (min =    0 RPM, div = 4)
Aux Fan:  1436 RPM  (min = 4272 RPM, div = 4) ALARM
fan5:        0 RPM  (min =    0 RPM, div = 16)
Sys Temp:    +28°C  (high =   -65°C, hyst =   -34°C) ALARM
CPU Temp:  +34.0°C  (high = +80.0°C, hyst = +75.0°C)
AUX Temp:  +38.5°C  (high = +80.0°C, hyst = +75.0°C)

Fan4 is disabled in the chip. I think the board has 4 connectors. I don't
have time right now to check the manual.

Rudolf
Re: [PATCH -ak] Fix missing include
On Monday 21 May 2007 22:03, Thomas Gleixner wrote:
> ftp://firstfloor.org/pub/ak/x86_64/quilt/x86_64-2.6.22-rc2-070521-1.bz2
>
> explodes in various places due to missing defines of __cold. We can't
> rely on the assumption that linux/compiler.h is included magically
> before bug.h is included. Include it explicitely.

Hmm, somehow it worked here. Perhaps longer term it would be a good idea
to turn compiler.h into an -include

> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

Added thanks

-Andi
Re: bad networking related lag in v2.6.22-rc2
On Monday 21 May 2007 15:50:09 Ingo Molnar wrote:
> * Anant Nitya <[EMAIL PROTECTED]> wrote:
> > Tcp:
> > 5 connections established
>
> hm, this does not explain the /proc/net/tcp overhead i think - although
> it could be a red herring. Will have a closer look at your new trace.
>
> if possible please try to generate the automatic softirq trace for
> Thomas, and then a separate trace for the firefox/net-lag thing, using
> trace-it-10sec.c. Btw., for the second trace, could you boot with
> maxcpus=1? That would make the second trace quite a bit more
> straightforward to analyze. You probably need both cpus to trigger the
> softirq problem.
>
> Ingo

here is the link for new trace with maxcpus=1.

http://cybertek.info/taitai/trace-it-10sec-to-ingo-with-maxcpus=1.bz2

Regards
Ananitya

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism
Re: bad networking related lag in v2.6.22-rc2
On Tuesday 22 May 2007 03:00:31 Patrick McHardy wrote:
> Patrick McHardy wrote:
> > Ingo Molnar wrote:
> >>* Anant Nitya <[EMAIL PROTECTED]> wrote:
> >>>I am posting links to the information you asked for. One more thing,
> >>>after digging a bit more I found its QoS shaping that is making the
> >>>box crawl. Once I disabled the traffic shaping everything comes back
> >>>to smooth and normal. Shaping being done on very low speed residential
> >>>ADSL 256/64 Kbps connection. If you want me to post shaping rules,
> >>>please free to ask. BTW its a simple HTB/SFQ rules.
> >>
> >>[...]
> >>
> >>>http://cybertek.info/taitai/trace-to-ingo.txt.bz2
> >>
> >>thanks! This trace indeed includes the smoking gun, htb_dequeue() and
> >>__qdisc_run():
> >>
> >>[..]
> >
> > This looks like fallout from the switch to hrtimers. Anant, please
> > send me your HTB script, I'll try to reproduce it.
>
> I think I already found the bug, please try if this patch helps.

Sorry, but this patch is not helping here. I recompiled the kernel with this
patch, but the same load pattern still makes the system crawl.

Here is the link to the script I use to shape traffic.

http://cybertek.info/taitai/adslbwopt.sh

Regards
Ananitya

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism
[PATCH (take 2)] Documentation/memory-barriers.txt: various fixes
On Mon, May 21, 2007 at 03:12:07PM +0100, David Howells wrote:
> Jarek Poplawski <[EMAIL PROTECTED]> wrote:
...
> > > I think this changes the meaning to one I don't want. But I'm not entirely
> > > sure. In a way the two concepts "update of perception" and "update perception"
> > > are different things. I think this can be argued either way.

No, your way is right. I've recollected this afterwards.
So, it's all about "The Doors of Perception"... Now it's
all clear! These CPUs are really cool and relaxed... Jim
Morrison would be proud of them (William Blake even more),
I hope.

Could you sign (or ack) this patch, please.

Regards,
Jarek P.

PS: I'm not sure you've read Robert's suggestion about
this adjective. I've included here only explicitly accepted
fixes. Google shows it's probably not the most respected
rule, but I can resend this once more - no problem.

---> (take 2)

Subject: Documentation/memory-barriers.txt: various fixes

CC: "Robert P. J. Day" <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nur 2.6.22-rc1-git7-/Documentation/memory-barriers.txt 2.6.22-rc1-git7/Documentation/memory-barriers.txt
--- 2.6.22-rc1-git7-/Documentation/memory-barriers.txt	2007-04-26 05:08:32.0 +0200
+++ 2.6.22-rc1-git7/Documentation/memory-barriers.txt	2007-05-21 20:05:07.0 +0200
@@ -24,7 +24,7 @@
  (*) Explicit kernel barriers.
 
      - Compiler barrier.
-     - The CPU memory barriers.
+     - CPU memory barriers.
      - MMIO write barrier.
 
  (*) Implicit kernel memory barriers.
@@ -457,7 +457,7 @@
 	(Q == &A) implies (D == 1)
 	(Q == &B) implies (D == 4)
 
-But! CPU 2's perception of P may be updated _before_ its perception of B, thus
+But! CPU 2's perception of P may be updated _before_ its perception of B, thus
 leading to the following situation:
 
 	(Q == &B) and (D == 2)
@@ -546,10 +546,10 @@
 When dealing with CPU-CPU interactions, certain types of memory barrier should
 always be paired. A lack of appropriate pairing is almost certainly an error.
 
-A write barrier should always be paired with a data dependency barrier or read -barrier, though a general barrier would also be viable. Similarly a read -barrier or a data dependency barrier should always be paired with at least an -write barrier, though, again, a general barrier is viable: +A write barrier should always be paired with a data dependency barrier or a +read barrier, though a general barrier would also be viable. Similarly the +read barrier or the data dependency barrier should always be paired with at +least the write barrier, though, again, the general barrier is viable: CPU 1 CPU 2 === === @@ -573,7 +573,7 @@ the "weaker" type. [!] Note that the stores before the write barrier would normally be expected to -match the loads after the read barrier or data dependency barrier, and vice +match the loads after the read barrier or the data dependency barrier, and vice versa: CPU 1 CPU 2 @@ -588,7 +588,7 @@ EXAMPLES OF MEMORY BARRIER SEQUENCES -Firstly, write barriers act as a partial orderings on store operations. +Firstly, write barriers act as partial orderings on store operations. Consider the following sequence of events: CPU 1 @@ -608,15 +608,15 @@ +---+ : : | | +--+ | |-->| C=3 | } /\ - | | :+--+ }- \ -> Events perceptible - | | :| A=1 | }\/ to rest of system + | | :+--+ }- \ -> Events perceptible to + | | :| A=1 | }\/ the rest of the system | | :+--+ } | CPU 1 | :| B=2 | } | | +--+ } | | } <--- At this point the write barrier | | +--+ }requires all stores prior to the | | :| E=5 | }barrier to be committed before - | | :+--+ }further stores may be take place. + | | :+--+ }further stores may take place | |-->| D=4 | } | | +--+ +---+ : : @@ -626,7 +626,7 @@ V -Secondly, data dependency barriers act as a partial orderings on data-dependent +Secondly, data dependency barriers act as partial orderings on data-dependent loads. 
Consider the following sequence of events: CPU 1 CPU 2 @@ -975,7 +975,7 @@ barrier(); -This a general barrier - lesser varieties of compiler barrier do not exist. +This is a general barrier - lesser varieties of compiler barrier do not exist. The compiler barrier has no direct effect on the CP
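The pairing rule in the patched text (a write barrier on one CPU matched by
a read barrier on the other) can be sketched outside the kernel. The example
below is an analogy only: it uses C11 <stdatomic.h> fences in user space
rather than the kernel's own barrier primitives, and payload, ready,
toy_publish() and toy_consume() are invented names.

```c
#include <stdatomic.h>

static int payload;		/* plain data, written before the flag */
static atomic_int ready;	/* flag that publishes the payload */

/* Writer side: the release fence plays the role of the write barrier. */
static void toy_publish(int value)
{
	payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/*
 * Reader side: the acquire fence plays the role of the paired read
 * barrier. Returns 1 and fills *out if the payload was published,
 * 0 if the flag has not been seen yet.
 */
static int toy_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return 0;
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return 1;
}
```

Dropping either fence breaks the pairing: without the writer's fence the
flag may become visible before the payload, and without the reader's fence
the payload load may be satisfied before the flag load.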
8250_pnp is confused... (udev?)
I've used serial ports a lot in the past but not for the past year or so.
Did something fundamental change? I have the serial_core, 8250 and 8250_pnp
modules installed and things are quite weird... I get all of the /dev/ttyS's
that I DON'T have and none of the ones that I do!

$modprobe -a 8250_pnp
$tail /var/log/message
May 21 22:59:35 home Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
May 21 22:59:35 home pnp: Device 00:07 activated.
May 21 22:59:35 home 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A

So, yes, it looks like it sees the one serial port that I also think I have.
but...

$ls -al /dev/ttyS*
crw-rw 1 root uucp 4, 65 May 21 22:59 /dev/ttyS1
crw-rw 1 root uucp 4, 66 May 21 22:59 /dev/ttyS2
crw-rw 1 root uucp 4, 67 May 21 22:59 /dev/ttyS3

It seems that somehow either 8250_pnp (or maybe udev) gets the logic
inverted. It doesn't create a device for the one device that 8250_pnp found
and yet it does create devices for everything NOT found??

If I rmmod 8250_pnp 8250 serial_core then all the serial devices go away.
So it's 8250/udev creating these oddities and not me.

I am using gentoo with kernel v2.6.21.1 on an intel 64 bit dual-core with
Nvidia 580i chipset. I haven't tweaked any udev rules. udev is version
104-r12.

Any help is appreciated. Thanks!

- Jeff
Re: [RFC] Crash on modpost, addend_386_rel()
On Mon, 21 May 2007 21:52:59 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Would you mind also just making this whole logic (that is generic and
> shared with all the different arch versions) be an inline function of its
> own?
>
> > +	Elf_Shdr *sechdrs = elf->sechdrs;
> > +	unsigned int *location;
> > +	int section = sechdrs[rsection].sh_info;
> > +
> > +	location = (void *)elf->hdr + sechdrs[section].sh_offset +
> > +		(r->r_offset - sechdrs[section].sh_addr);
>
> so that all the functions could just use some generic
>
>	location = reloc_location(elf, rsection, r);
>
> or similar, instead of having that complex thing duplicated three times
> (arm, mips and i386)?

Sure, updated.

> Especially since other architectures will likely end up doing the same
> thing too...

Archs using RELA instead of REL section are not affected by this
patch. But I'm not sure how many are using RELA.


Subject: [PATCH] kbuild: make better section mismatch reports on i386, arm and mips (take 3)

On i386, ARM and MIPS, warn_sec_mismatch() sometimes fails to show
usefull symbol name. This is because empty 'refsym' due to 0 r_addend
value. This patch is to adjust r_addend value, consulting with
apply_relocate() routine in kernel code.

Signed-off-by: Atsushi Nemoto <[EMAIL PROTECTED]>
---
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 8e5610d..44c3960 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -384,6 +384,8 @@ static int parse_elf(struct elf_info *info, const char *filename)
 		sechdrs[i].sh_size = TO_NATIVE(sechdrs[i].sh_size);
 		sechdrs[i].sh_link = TO_NATIVE(sechdrs[i].sh_link);
 		sechdrs[i].sh_name = TO_NATIVE(sechdrs[i].sh_name);
+		sechdrs[i].sh_info = TO_NATIVE(sechdrs[i].sh_info);
+		sechdrs[i].sh_addr = TO_NATIVE(sechdrs[i].sh_addr);
 	}
 	/* Find symbol table. */
 	for (i = 1; i < hdr->e_shnum; i++) {
@@ -753,6 +755,8 @@ static Elf_Sym *find_elf_symbol(struct elf_info *elf, Elf_Addr addr,
 	for (sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
 		if (sym->st_shndx != relsym->st_shndx)
 			continue;
+		if (ELF_ST_TYPE(sym->st_info) == STT_SECTION)
+			continue;
 		if (sym->st_value == addr)
 			return sym;
 	}
@@ -895,6 +899,69 @@ static void warn_sec_mismatch(const char *modname, const char *fromsec,
 	}
 }
 
+static inline unsigned int *reloc_location(struct elf_info *elf,
+					   int rsection, Elf_Rela *r)
+{
+	Elf_Shdr *sechdrs = elf->sechdrs;
+	int section = sechdrs[rsection].sh_info;
+
+	return (void *)elf->hdr + sechdrs[section].sh_offset +
+		(r->r_offset - sechdrs[section].sh_addr);
+}
+
+static void addend_386_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+	unsigned int r_typ = ELF_R_TYPE(r->r_info);
+	unsigned int *location = reloc_location(elf, rsection, r);
+
+	switch (r_typ) {
+	case R_386_32:
+		r->r_addend = TO_NATIVE(*location);
+		break;
+	case R_386_PC32:
+		r->r_addend = TO_NATIVE(*location) + 4;
+		break;
+	}
+}
+
+static void addend_arm_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+	unsigned int r_typ = ELF_R_TYPE(r->r_info);
+	unsigned int *location = reloc_location(elf, rsection, r);
+
+	switch (r_typ) {
+	case R_ARM_ABS32:
+		r->r_addend = TO_NATIVE(*location);
+		break;
+	case R_ARM_PC24:
+		r->r_addend = ((TO_NATIVE(*location) & 0x00ffffff) << 2) + 8;
+		break;
+	}
+}
+
+static int addend_mips_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+	unsigned int r_typ = ELF_R_TYPE(r->r_info);
+	unsigned int *location = reloc_location(elf, rsection, r);
+	unsigned int inst;
+
+	if (r_typ == R_MIPS_HI16)
+		return 1;	/* skip this */
+	inst = TO_NATIVE(*location);
+	switch (r_typ) {
+	case R_MIPS_LO16:
+		r->r_addend = inst & 0xffff;
+		break;
+	case R_MIPS_26:
+		r->r_addend = (inst & 0x03ffffff) << 2;
+		break;
+	case R_MIPS_32:
+		r->r_addend = inst;
+		break;
+	}
+	return 0;
+}
+
 /**
  * A module includes a number of sections that are discarded
  * either when loaded or when used as built-in.
@@ -938,8 +1005,11 @@ static void check_sec_ref(struct module *mod, const char *modname,
 			r.r_offset = TO_NATIVE(rela->r_offset);
 #if KERNEL_ELFCLASS == ELFCLASS64
 			if (hdr->e_machine == EM_MIPS) {
+				unsigned int r_typ;
 				r_sym = ELF64_MIPS_
Re: Define CONFIG_BOUNCE to avoid useless inclusion of bounce buffer logic.
On Tue, 22 May 2007, KAMEZAWA Hiroyuki wrote: > > +config BOUNCE > > + def_bool y > > + depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM) > > + > > AFAIK, ppc has only ZONE_DMA and it never needs bounce. > Is this ok ? That is wrong. ppc should have ZONE_NORMAL and no ZONE_DMA. Otherwise you cannot switch off ZONE_DMA and you cannot switch off bounce. ZONE_DMA is a zone for exceptional allocs. If you do not have those then you only have normal allocs -> ZONE_NORMAL.
[PATCH 3/3] Make ide dma blacklist handling a bit saner.
Earlier, the matching of (model,rev) in ide-dma black/white list handling was to consider "ALL" in the table to match any revision. This changes the wildcard to NULL. This way, the DMA_BLACK_LIST macro used in the previous patch does not have to use a slightly funky compile time constant expression to convert NULL to "ALL". Signed-off-by: Junio C Hamano <[EMAIL PROTECTED]> --- * I do not really know what I am doing in the mips area, but that architecture specific table seems to be used by the same ide_in_drive_list() function, so the entries are matched to the updated code. drivers/ide/ide-dma.c | 14 +++--- include/asm-mips/mach-au1x00/au1xxx_ide.h | 28 ++-- 2 files changed, 21 insertions(+), 21 deletions(-) diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c index a6a2074..c0b5b10 100644 --- a/drivers/ide/ide-dma.c +++ b/drivers/ide/ide-dma.c @@ -91,16 +91,16 @@ static const struct drive_list_entry drive_whitelist [] = { - { "Micropolis 2112A", "ALL" }, - { "CONNER CTMA 4000", "ALL" }, - { "CONNER CTT8000-A", "ALL" }, - { "ST34342A", "ALL" }, + { "Micropolis 2112A", NULL}, + { "CONNER CTMA 4000", NULL}, + { "CONNER CTT8000-A", NULL}, + { "ST34342A", NULL}, { NULL , NULL} }; static const struct drive_list_entry drive_blacklist [] = { -#define DMA_BLACK_LIST(model,rev) { (model), (rev==NULL ? 
"ALL" : (rev)) } +#define DMA_BLACK_LIST(model,rev) { (model), (rev) } #include "dma-blacklist.h" #undef DMA_BLACK_LIST { NULL , NULL} @@ -120,8 +120,8 @@ int ide_in_drive_list(struct hd_driveid *id, const struct drive_list_entry *driv { for ( ; drive_table->id_model ; drive_table++) if ((!strcmp(drive_table->id_model, id->model)) && - ((strstr(id->fw_rev, drive_table->id_firmware)) || -(!strcmp(drive_table->id_firmware, "ALL" + (!drive_table->id_firmware || +strstr(id->fw_rev, drive_table->id_firmware))) return 1; return 0; } diff --git a/include/asm-mips/mach-au1x00/au1xxx_ide.h b/include/asm-mips/mach-au1x00/au1xxx_ide.h index 8fcae21..4663e8b 100644 --- a/include/asm-mips/mach-au1x00/au1xxx_ide.h +++ b/include/asm-mips/mach-au1x00/au1xxx_ide.h @@ -88,26 +88,26 @@ static const struct drive_list_entry dma_white_list [] = { /* * Hitachi */ -{ "HITACHI_DK14FA-20", "ALL" }, -{ "HTS726060M9AT00" , "ALL" }, +{ "HITACHI_DK14FA-20", NULL}, +{ "HTS726060M9AT00" , NULL}, /* * Maxtor */ -{ "Maxtor 6E040L0" , "ALL" }, -{ "Maxtor 6Y080P0" , "ALL" }, -{ "Maxtor 6Y160P0" , "ALL" }, +{ "Maxtor 6E040L0" , NULL}, +{ "Maxtor 6Y080P0" , NULL}, +{ "Maxtor 6Y160P0" , NULL}, /* * Seagate */ -{ "ST3120026A" , "ALL" }, -{ "ST320014A" , "ALL" }, -{ "ST94011A", "ALL" }, -{ "ST340016A" , "ALL" }, +{ "ST3120026A" , NULL}, +{ "ST320014A" , NULL}, +{ "ST94011A", NULL}, +{ "ST340016A" , NULL}, /* * Western Digital */ -{ "WDC WD400UE-00HCT0" , "ALL" }, -{ "WDC WD400JB-00JJC0" , "ALL" }, +{ "WDC WD400UE-00HCT0" , NULL}, +{ "WDC WD400JB-00JJC0" , NULL}, { NULL , NULL} }; @@ -116,9 +116,9 @@ static const struct drive_list_entry dma_black_list [] = { /* * Western Digital */ -{ "WDC WD100EB-00CGH0" , "ALL" }, -{ "WDC WD200BB-00AUA1" , "ALL" }, -{ "WDC AC24300L", "ALL" }, +{ "WDC WD100EB-00CGH0" , NULL}, +{ "WDC WD200BB-00AUA1" , NULL}, +{ "WDC AC24300L", NULL}, { NULL , NULL} }; #endif -- 1.5.2.24.g93d4 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a 
message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
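The matching rule this patch ends up with can be tried outside the kernel. The following is only a minimal user-space sketch: `demo_list_entry`, `demo_blacklist` and `demo_in_drive_list()` are hypothetical stand-ins for the kernel's `struct drive_list_entry`, `drive_blacklist[]` and `ide_in_drive_list()`, and the two table rows are copied from the shared blacklist purely for illustration. A NULL firmware field acts as the wildcard; otherwise the entry's firmware string must occur somewhere in the drive's reported revision.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct drive_list_entry. */
struct demo_list_entry {
	const char *id_model;
	const char *id_firmware;	/* NULL matches any firmware revision */
};

/* Two illustrative rows, terminated by a NULL model like the kernel tables. */
static const struct demo_list_entry demo_blacklist[] = {
	{ "WDC AC11000H", NULL },
	{ "WDC AC32100H", "24.09P07" },
	{ NULL, NULL }
};

/* Return 1 if (model, fw_rev) matches an entry in the table, 0 otherwise. */
static int demo_in_drive_list(const char *model, const char *fw_rev,
			      const struct demo_list_entry *table)
{
	for ( ; table->id_model; table++)
		if (!strcmp(table->id_model, model) &&
		    (!table->id_firmware ||
		     strstr(fw_rev, table->id_firmware)))
			return 1;
	return 0;
}
```

Testing the NULL case first matters: with the wildcard now spelled NULL instead of "ALL", calling strstr() on the firmware field before checking it would dereference a null pointer.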
[PATCH 2/3] Unify dma blacklist in ide-dma.c and libata-core.c
This introduces a shared header file that defines the entries for two dma blacklists in ide-dma.c and libata-core.c to make it easier to keep them in sync. Signed-off-by: Junio C Hamano <[EMAIL PROTECTED]> --- * Removes more lines than it adds. I am not proud of the DMA_BLACK_LIST macro in ide-dma.c which relies on the compiler doing a sane thing for compile time constant expression to initialize the list, but that hopefully can be fixed in the next patch. drivers/ata/libata-core.c | 34 -- drivers/ide/dma-blacklist.h | 39 +++ drivers/ide/ide-dma.c | 33 +++-- 3 files changed, 46 insertions(+), 60 deletions(-) create mode 100644 drivers/ide/dma-blacklist.h diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index a6de57e..93b7fa7 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -3739,36 +3739,10 @@ struct ata_blacklist_entry { static const struct ata_blacklist_entry ata_device_blacklist [] = { /* Devices with DMA related problems under Linux */ - { "WDC AC11000H", NULL, ATA_HORKAGE_NODMA }, - { "WDC AC22100H", NULL, ATA_HORKAGE_NODMA }, - { "WDC AC32500H", NULL, ATA_HORKAGE_NODMA }, - { "WDC AC33100H", NULL, ATA_HORKAGE_NODMA }, - { "WDC AC31600H", NULL, ATA_HORKAGE_NODMA }, - { "WDC AC32100H", "24.09P07", ATA_HORKAGE_NODMA }, - { "WDC AC23200L", "21.10N21", ATA_HORKAGE_NODMA }, - { "Compaq CRD-8241B", NULL, ATA_HORKAGE_NODMA }, - { "CRD-8400B", NULL, ATA_HORKAGE_NODMA }, - { "CRD-8480B", NULL, ATA_HORKAGE_NODMA }, - { "CRD-8482B", NULL, ATA_HORKAGE_NODMA }, - { "CRD-84", NULL, ATA_HORKAGE_NODMA }, - { "SanDisk SDP3B", NULL, ATA_HORKAGE_NODMA }, - { "SanDisk SDP3B-64", NULL, ATA_HORKAGE_NODMA }, - { "SANYO CD-ROM CRD", NULL, ATA_HORKAGE_NODMA }, - { "HITACHI CDR-8", NULL, ATA_HORKAGE_NODMA }, - { "HITACHI CDR-8335", NULL, ATA_HORKAGE_NODMA }, - { "HITACHI CDR-8435", NULL, ATA_HORKAGE_NODMA }, - { "Toshiba CD-ROM XM-6202B", NULL, ATA_HORKAGE_NODMA }, - { "TOSHIBA CD-ROM XM-1702BC", NULL, ATA_HORKAGE_NODMA }, - { "CD-532E-A", 
NULL, ATA_HORKAGE_NODMA }, - { "E-IDE CD-ROM CR-840",NULL, ATA_HORKAGE_NODMA }, - { "CD-ROM Drive/F5A", NULL, ATA_HORKAGE_NODMA }, - { "WPI CDD-820",NULL, ATA_HORKAGE_NODMA }, - { "SAMSUNG CD-ROM SC-148C", NULL, ATA_HORKAGE_NODMA }, - { "SAMSUNG CD-ROM SC", NULL, ATA_HORKAGE_NODMA }, - { "ATAPI CD-ROM DRIVE 40X MAXIMUM",NULL,ATA_HORKAGE_NODMA }, - { "_NEC DV5800A", NULL, ATA_HORKAGE_NODMA }, - { "SAMSUNG CD-ROM SN-124","N001", ATA_HORKAGE_NODMA }, - { "Seagate STT2A", NULL,ATA_HORKAGE_NODMA }, + +#define DMA_BLACK_LIST(model,rev) { (model), (rev), ATA_HORKAGE_NODMA } +#include "../ide/dma-blacklist.h" +#undef DMA_BLACK_LIST /* Weird ATAPI devices */ { "TORiSAN DVD-ROM DRD-N216", NULL, ATA_HORKAGE_MAX_SEC_128 | diff --git a/drivers/ide/dma-blacklist.h b/drivers/ide/dma-blacklist.h new file mode 100644 index 000..19b4e0c --- /dev/null +++ b/drivers/ide/dma-blacklist.h @@ -0,0 +1,39 @@ +/* + * Shared between ide-dma.c::drive_blacklist[] and + * ../ata/libata-core.c::ata_device_blacklist[]. + * + * Each of the above users define DMA_BLACK_LIST() macro + * which expands these to structure of their liking. 
+ */ + + DMA_BLACK_LIST("WDC AC11000H", NULL), + DMA_BLACK_LIST("WDC AC22100H", NULL), + DMA_BLACK_LIST("WDC AC32500H", NULL), + DMA_BLACK_LIST("WDC AC33100H", NULL), + DMA_BLACK_LIST("WDC AC31600H", NULL), + DMA_BLACK_LIST("WDC AC32100H", "24.09P07"), + DMA_BLACK_LIST("WDC AC23200L", "21.10N21"), + DMA_BLACK_LIST("Compaq CRD-8241B", NULL), + DMA_BLACK_LIST("CRD-8400B", NULL), + DMA_BLACK_LIST("CRD-8480B", NULL), + DMA_BLACK_LIST("CRD-8482B", NULL), + DMA_BLACK_LIST("CRD-84", NULL), + DMA_BLACK_LIST("SanDisk SDP3B", NULL), + DMA_BLACK_LIST("SanDisk SDP3B-64", NULL), + DMA_BLACK_LIST("SANYO CD-ROM CRD", NULL), + DMA_BLACK_LIST("HITACHI CDR-8", NULL), + DMA_BLACK_LIST("HITACHI CDR-8335", NULL), + DMA_BLACK_LIST("HITACHI CDR-8435", NULL), + DMA_BLACK_LIST("Toshiba CD-ROM XM-6202B", NULL), + DMA_BLACK_LIST("TOSHIBA CD-ROM XM-1702BC", NULL), + DMA_BLACK_LIST("CD-532E-A", NULL), + DMA_BLACK_LIST("E-IDE CD-ROM CR-840", NULL), + DMA_BLACK_LIST("CD-ROM Drive/F5A", NULL), + DMA_BLACK_LIST("WPI
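The idea of expanding one list of entries into two differently shaped tables can be sketched in a single file. This is a variation on the patch rather than a copy of it: the patch keeps the entries in dma-blacklist.h and redefines DMA_BLACK_LIST() before each #include, while the sketch below passes the per-entry macro as an argument so that it compiles on its own. All DEMO_/demo_ names, and the flag value, are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the shared header: one X() invocation per blacklisted drive. */
#define DEMO_DMA_BLACKLIST(X) \
	X("WDC AC11000H", NULL) \
	X("WDC AC32100H", "24.09P07") \
	X("SAMSUNG CD-ROM SN-124", "N001")

/* Two consumers with different entry layouts, as in ide-dma.c and libata. */
struct demo_ide_entry { const char *model; const char *rev; };
struct demo_ata_entry { const char *model; const char *rev; unsigned long horkage; };

#define DEMO_HORKAGE_NODMA (1UL << 1)	/* hypothetical flag value */

/* Expand the list into the two-field layout. */
#define DEMO_AS_IDE(model, rev) { (model), (rev) },
static const struct demo_ide_entry demo_ide_list[] = {
	DEMO_DMA_BLACKLIST(DEMO_AS_IDE)
	{ NULL, NULL }
};
#undef DEMO_AS_IDE

/* Expand the same list into the three-field layout. */
#define DEMO_AS_ATA(model, rev) { (model), (rev), DEMO_HORKAGE_NODMA },
static const struct demo_ata_entry demo_ata_list[] = {
	DEMO_DMA_BLACKLIST(DEMO_AS_ATA)
	{ NULL, NULL, 0 }
};
#undef DEMO_AS_ATA
```

Either way, adding a drive means touching one place only, which is the stated goal of keeping the two blacklists in sync.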
Re: Linux 2.6.22-rc2
On Mon, 21 May 2007, Stephen Hemminger wrote: > > AHCI on this motherboard doesn't seem to use MSI. The problems occur > even if I boot with nomsi. Have you tried playing with PCI latency counters etc? Maybe the SATA/AHCI thing is better at saturating the bus, and the sky2 hardware gets upset if it has overlong DMA access latencies due to some other controller keeping the bus busy with a long burst access? I can't really see that being a real problem in this day and age of PCI-X etc, but it _used_ to be a possible issue a decade ago. Maybe you've found a case where it matters even on modern hardware? We occasionally used to set the PCI latency timer to make people happy. (Not that I'm convinced it even has any semantic meaning on a modern PCI system..) Linus
Re: [PATCH] Prevent going idle with softirq pending
* Thomas Gleixner <[EMAIL PROTECTED]> wrote: > The NOHZ patch contains a check for softirqs pending when a CPU goes > idle. The BUG is unrelated to NOHZ, it just was made visible by the > NOHZ patch. The BUG showed up mainly on P4 / hyperthreading enabled > machines which lead the investigations into the wrong direction in the > first place. The real cause is in cond_resched_softirq(): > > cond_resched_softirq() is enabling softirqs without invoking the > softirq daemon when softirqs are pending. This leads to the warning > message in the NOHZ idle code: good find! > raw_local_irq_disable(); > - _local_bh_enable(); > + local_bh_enable(); > raw_local_irq_enable(); hm, i think this should be done without having irqs disabled? Ingo
Re: [rfc] increase struct page size?!
On Mon, May 21, 2007 at 06:39:51PM -0700, William Lee Irwin III wrote: >> address (virtual and physical are trivially inter-convertible), mock >> up something akin to what filesystems do for anonymous pages, etc. >> The real objection everyone's going to have is that driver writers >> will stain their shorts when faced with the rules for handling such >> things. The thing is, I'm not entirely sure who these driver writers >> that would have such trouble are, since the driver writers I know >> personally are sophisticates rather than walking disaster areas as such >> would imply. I suppose they may not be representative of the whole. On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote: > That's not the objection I would have. I would say that firstly, I > don't think the mem_map overhead is very significant (at any rate, > an allocated-on-demand metadata is not going to be any smaller if > you fill up on pagecache...). Secondly, I think there is merit to > having the same page metadata used by the major subsystems, because > it helps for locality of reference. The size isn't the advantage being cited; I'd actually expect the net result to be larger. It's the control over the layout of the metadata for cache locality and even things like having enough flags, folding buffer_head -like affairs into the per-page metadata for filesystems and so reaping cache locality benefits even there (assuming it works out in other respects), and so on. Passing pages between subsystems doesn't seem very significant to me. There isn't going to be much locality of reference, or even any guarantee that the subsystem gets fed a cache hot page structure. The subsystem being passed the page will have its own cache hot accounting structures to stick the information about the memory into. On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote: > But I haven't explored the idea enough myself to know whether there > would be any really killer benefits to this. 
Delayed metadata freeing > via RCU without holding up the freeing of the actual page would have > been something, however I can do similar with speculative references > now (or whenever the code gets merged), which doesn't even require the > RCU overhead. I'm not entirely sure what you're on about there, but it sounds interesting. -- wli - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] Match DMA blacklist entries between ide-dma.c and libata-core.c
There are a few entries in ata_device_blacklist[] in libata-core.c marked with HORKAGE_NODMA but are missing from drive_blacklist[] in ide-dma.c. This patch makes the lists in sync. Also remove a duplicated entry for "SanDisk SDP3B-64". Signed-off-by: Junio C Hamano <[EMAIL PROTECTED]> --- * Recently Alan Cox responded that libata blacklist needs to be kept in sync when Dave Jones added STT2A to DMA blacklist in ide-dma.c --- drivers/ide/ide-dma.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c index b77b7d1..ead141e 100644 --- a/drivers/ide/ide-dma.c +++ b/drivers/ide/ide-dma.c @@ -119,15 +119,17 @@ static const struct drive_list_entry drive_blacklist [] = { { "HITACHI CDR-8335", "ALL" }, { "HITACHI CDR-8435", "ALL" }, { "Toshiba CD-ROM XM-6202B" , "ALL" }, + { "TOSHIBA CD-ROM XM-1702BC", "ALL" }, { "CD-532E-A" , "ALL" }, { "E-IDE CD-ROM CR-840","ALL" }, { "CD-ROM Drive/F5A", "ALL" }, { "WPI CDD-820","ALL" }, { "SAMSUNG CD-ROM SC-148C", "ALL" }, { "SAMSUNG CD-ROM SC", "ALL" }, - { "SanDisk SDP3B-64", "ALL" }, { "ATAPI CD-ROM DRIVE 40X MAXIMUM", "ALL" }, { "_NEC DV5800A", "ALL" }, + { "SAMSUNG CD-ROM SN-124", "N001" }, + { "Seagate STT2A", "ALL" }, { NULL , NULL} }; -- 1.5.2.24.g93d4 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] Crash on modpost, addend_386_rel()
On Tue, 22 May 2007, Atsushi Nemoto wrote: > > Anyway, here is a updated patch tested on i386 (RELOCATABLE=y/n), arm, > and mips. On calculation of 'location', sh_addr should be subtracted > (thank you for debugging, Linus). And this patch contains an another > fix and an improvement of added_mips_rel Would you mind also just making this whole logic (that is generic and shared with all the different arch versions) be an inline function of its own? > + Elf_Shdr *sechdrs = elf->sechdrs; > + unsigned int *location; > + int section = sechdrs[rsection].sh_info; > + > + location = (void *)elf->hdr + sechdrs[section].sh_offset + > + (r->r_offset - sechdrs[section].sh_addr); so that all the functions could just use some generic location = reloc_location(elf, rsection, r); or similar, instead of having that complex thing duplicated three times (arm, mips and i386)? Especially since other architectures will likely end up doing the same thing too... Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
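The address arithmetic behind the suggested helper can be checked in isolation. What follows is a rough user-space sketch, not the modpost code: `demo_shdr`, `demo_elf` and `demo_reloc_location()` are hypothetical simplifications that keep only the three section-header fields the calculation uses, and the function returns a byte pointer instead of `unsigned int *` to sidestep alignment concerns in a toy buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down section header: only the fields used here. */
struct demo_shdr {
	uint32_t sh_addr;	/* address the section is linked at (0 in plain .o files) */
	uint32_t sh_offset;	/* where the section's bytes start in the file */
	uint32_t sh_info;	/* for a REL section: index of the section it patches */
};

/* Hypothetical stand-in for modpost's struct elf_info. */
struct demo_elf {
	unsigned char *image;		/* start of the file image in memory */
	struct demo_shdr *sechdrs;	/* section header table */
};

/*
 * Locate the bytes a relocation applies to.  rsection is the index of
 * the relocation section; r_offset comes from the relocation entry.
 * Subtracting sh_addr turns an address-style r_offset back into an
 * offset within the target section.
 */
static unsigned char *demo_reloc_location(const struct demo_elf *elf,
					  int rsection, uint32_t r_offset)
{
	const struct demo_shdr *sechdrs = elf->sechdrs;
	int section = sechdrs[rsection].sh_info;

	return elf->image + sechdrs[section].sh_offset +
	       (r_offset - sechdrs[section].sh_addr);
}
```

When the target section has sh_addr == 0, as in an ordinary relocatable object, the subtraction changes nothing. It only matters when r_offset is expressed as an address, which appears to be the CONFIG_RELOCATABLE case that triggered the crash.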
Kernel panic during hibernation
Hi, I get a kernel panic when trying to put a Dell Inspiron 6400 into hibernation. The message is: Process pm-hibernate (pid: 3168, threadinfo 810013dba000, task 810018d0e 860) Stack: 01bc7e40f260 07ef 000fdff0 800a74aa 810037e206f8 880777bc 1ebf7000 0001ebf7 0001enf6 Call Trace: [] swsusp_write+0x2fa/0x440 [] :scsi_mod:scsi_schedule_eh+0x45/0x55 [] pm_suspend_disk+0x5b/0xce [] enter_state+0x52/0x19b [] state_store+0x5e/0x79 [] sysfs_write_file+0xb9/0xe8 [] vfs_write+0xce/0x174 [] sys_write+0x45/0x6e [] tracesys+0xd1/0xdc Code 0f ba 6d 00 00 19 c0 85 c0 74 08 48 89 ef e8 49 c0 f7 ff 48 RIP [] rw_swap_page_sync+0x1c/0xc2 RSP <0>Kernel panic - not syncing: Fatal exception I am running "Linux localhost.localdomain 2.6.18-8.el5 #1 SMP Thu Mar 15 19:46:53 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux" (CentOS 5). Could you please tell me whether there is already a fix for this problem? I couldn't find anything yet. Thank you, Vlad.
Re: [patch] CFS scheduler, -v12
Peter Williams wrote: Dmitry Adamushko wrote: On 18/05/07, Peter Williams <[EMAIL PROTECTED]> wrote: [...] One thing that might work is to jitter the load balancing interval a bit. The reason I say this is that one of the characteristics of top and gkrellm is that they run at a more or less constant interval (and, in this case, X would also be following this pattern as it's doing screen updates for top and gkrellm) and this means that it's possible for the load balancing interval to synchronize with their intervals which in turn causes the observed problem. Hum.. I guess, a 0/4 scenario wouldn't fit well in this explanation.. No, and I haven't seen one. all 4 spinners "tend" to be on CPU0 (and as I understand each gets ~25% approx.?), so there must be plenty of moments for *idle_balance()* to be called on CPU1 - as gkrellm, top and X consume together just a few % of CPU. Hence, we should not be that dependent on the load balancing interval here.. The split that I see is 3/1 and neither CPU seems to be favoured with respect to getting the majority. However, top, gkrellm and X seem to be always on the CPU with the single spinner. The CPU% reported by top is approx. 33%, 33%, 33% and 100% for the spinners. If I renice the spinners to -10 (so that their load weights dominate the run queue load calculations) the problem goes away and the spinner to CPU allocation is 2/2 and top reports them all getting approx. 50% each. For no good reason other than curiosity, I tried a variation of this experiment where I reniced the spinners to 10 instead of -10 and, to my surprise, they were allocated 2/2 to the CPUs on average. I say on average because the allocations were a little more volatile and occasionally 0/4 splits would occur but these would last for less than one top cycle before the 2/2 was re-established. The quickness of these recoveries would indicate that it was most likely the idle balance mechanism that restored the balance. 
This may point the finger at the tick based load balance mechanism being too conservative in when it decides whether tasks need to be moved. In the case where the spinners are at nice == 0, the idle balance mechanism never comes into play as the 0/4 split is never seen so only the tick based mechanism is in force in this case and this is where the anomalies are seen. This tick rebalance mechanism only situation is also true for the nice == -10 case but in this case the high load weights of the spinners overcomes the tick based load balancing mechanism's conservatism e.g. the difference in queue loads for a 1/3 split in this case is the equivalent to the difference that would be generated by an imbalance of about 18 nice == 0 spinners i.e. too big to be ignored. The evidence seems to indicate that IF a rebalance operation gets initiated then the right amount of load will get moved. This new evidence weakens (but does not totally destroy) my synchronization (a.k.a. conspiracy) theory. Peter PS As the total load weight for 4 nice == 10 tasks is only about 40% of the load weight of a single nice == 0 task, the occasional 0/4 split in the spinners at nice == 10 case is not unexpected as it would be the desirable allocation if there were exactly one other running task at nice == 0. -- Peter Williams [EMAIL PROTECTED] "Learning, n. The kind of ignorance distinguishing the studious." -- Ambrose Bierce - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
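The two figures quoted above ("about 18 nice == 0 spinners" and "about 40%") can be reproduced with a little arithmetic. The sketch below assumes the per-nice load weights from the mainline prio_to_weight[] table (9548 for nice -10, 1024 for nice 0, 110 for nice 10); the -v12 patch under discussion may have used slightly different numbers, and every DEMO_/demo_ name is made up for this example.

```c
/* Assumed per-nice load weights (mainline prio_to_weight[] values). */
#define DEMO_W_NICE_MINUS10	9548
#define DEMO_W_NICE_0		1024
#define DEMO_W_NICE_10		110

/* Run queue load contributed by ntasks equal-weight tasks. */
static double demo_queue_load(int ntasks, int weight)
{
	return (double)ntasks * weight;
}

/*
 * Load difference between the two CPUs for a 3/1 split of equal-weight
 * spinners, expressed as a number of nice-0 tasks.
 */
static double demo_imbalance_in_nice0_tasks(int weight)
{
	return (demo_queue_load(3, weight) - demo_queue_load(1, weight)) /
	       DEMO_W_NICE_0;
}

/* Combined weight of four nice-10 tasks relative to one nice-0 task. */
static double demo_four_nice10_fraction(void)
{
	return demo_queue_load(4, DEMO_W_NICE_10) / DEMO_W_NICE_0;
}
```

Under these assumed weights, a 3/1 split at nice -10 differs by 2 * 9548 / 1024, roughly 18.6 nice-0 tasks, while the same split at nice 0 differs by exactly 2. Four nice-10 tasks add up to 440 / 1024, roughly 43% of one nice-0 task, in line with the rough figures given in the message.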
Re: Linux 2.6.22-rc2
On Tue, 22 May 2007 00:36:15 -0400 Jeff Garzik <[EMAIL PROTECTED]> wrote: > Stephen Hemminger wrote: > > There maybe some hardware level interaction with SATA controller. > > I saw no failures running off i386 kernel of PATA drive and quickly > > see errors with SATA/AHCI and x86_64. > > > I presume AHCI is the only other device in the system using PCI MSI, > when you see problems? > > Jeff > > AHCI on this motherboard doesn't seem to use MSI. The problems occur even if I boot with nomsi. -- Stephen Hemminger <[EMAIL PROTECTED]>
Re: [RFC] Crash on modpost, addend_386_rel()
On Mon, 21 May 2007 22:01:27 -0400, Ben Collins <[EMAIL PROTECTED]> wrote: > Got this crash in modpost. Bisect blames this commit: > > commit f892b7d480eec809a5dfbd6e65742b3f3155e50e > Author: Atsushi Nemoto <[EMAIL PROTECTED]> > Date: Thu May 17 01:14:38 2007 +0900 > kbuild: make better section mismatch reports on i386, arm and mips Sorry, the patch breaks the CONFIG_RELOCATABLE=y build; I had not tested with that configuration. Linus has already reverted the commit in his git tree due to this breakage. Unfortunately, fixing the whole thing in the _right_ way is somewhat beyond my ELF understanding. Help from ELF gurus is welcome! Anyway, here is an updated patch tested on i386 (RELOCATABLE=y/n), arm, and mips. In the calculation of 'location', sh_addr should be subtracted (thank you for debugging, Linus). This patch also contains another fix and an improvement to addend_mips_rel() (kbuild-fix-and-improve-mips-rel-handling.patch in the mm tree). Subject: [PATCH] kbuild: make better section mismatch reports on i386, arm and mips (take 2) On i386, ARM and MIPS, warn_sec_mismatch() sometimes fails to show a useful symbol name. This is because 'refsym' is empty when the r_addend value is 0. This patch adjusts the r_addend value, following the apply_relocate() routines in the kernel code. Signed-off-by: Atsushi Nemoto <[EMAIL PROTECTED]> --- diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c index 8e5610d..e0bd1cd 100644 --- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -384,6 +384,8 @@ static int parse_elf(struct elf_info *info, const char *filename) sechdrs[i].sh_size = TO_NATIVE(sechdrs[i].sh_size); sechdrs[i].sh_link = TO_NATIVE(sechdrs[i].sh_link); sechdrs[i].sh_name = TO_NATIVE(sechdrs[i].sh_name); + sechdrs[i].sh_info = TO_NATIVE(sechdrs[i].sh_info); + sechdrs[i].sh_addr = TO_NATIVE(sechdrs[i].sh_addr); } /* Find symbol table. 
*/ for (i = 1; i < hdr->e_shnum; i++) { @@ -753,6 +755,8 @@ static Elf_Sym *find_elf_symbol(struct elf_info *elf, Elf_Addr addr, for (sym = elf->symtab_start; sym < elf->symtab_stop; sym++) { if (sym->st_shndx != relsym->st_shndx) continue; + if (ELF_ST_TYPE(sym->st_info) == STT_SECTION) + continue; if (sym->st_value == addr) return sym; } @@ -895,6 +899,74 @@ static void warn_sec_mismatch(const char *modname, const char *fromsec, } } +static void addend_386_rel(struct elf_info *elf, int rsection, Elf_Rela *r) +{ + Elf_Shdr *sechdrs = elf->sechdrs; + unsigned int r_typ; + unsigned int *location; + int section = sechdrs[rsection].sh_info; + + r_typ = ELF_R_TYPE(r->r_info); + location = (void *)elf->hdr + sechdrs[section].sh_offset + + (r->r_offset - sechdrs[section].sh_addr); + switch (r_typ) { + case R_386_32: + r->r_addend = TO_NATIVE(*location); + break; + case R_386_PC32: + r->r_addend = TO_NATIVE(*location) + 4; + break; + } +} + +static void addend_arm_rel(struct elf_info *elf, int rsection, Elf_Rela *r) +{ + Elf_Shdr *sechdrs = elf->sechdrs; + unsigned int r_typ; + unsigned int *location; + int section = sechdrs[rsection].sh_info; + + r_typ = ELF_R_TYPE(r->r_info); + location = (void *)elf->hdr + sechdrs[section].sh_offset + + (r->r_offset - sechdrs[section].sh_addr); + switch (r_typ) { + case R_ARM_ABS32: + r->r_addend = TO_NATIVE(*location); + break; + case R_ARM_PC24: + r->r_addend = ((TO_NATIVE(*location) & 0x00ffffff) << 2) + 8; + break; + } +} + +static int addend_mips_rel(struct elf_info *elf, int rsection, Elf_Rela *r) +{ + Elf_Shdr *sechdrs = elf->sechdrs; + unsigned int r_typ; + unsigned int *location; + int section = sechdrs[rsection].sh_info; + unsigned int inst; + + r_typ = ELF_R_TYPE(r->r_info); + if (r_typ == R_MIPS_HI16) + return 1; /* skip this */ + location = (void *)elf->hdr + sechdrs[section].sh_offset + + (r->r_offset - sechdrs[section].sh_addr); + inst = TO_NATIVE(*location); + switch (r_typ) { + case R_MIPS_LO16: + r->r_addend = inst & 0xffff; 
+ break; + case R_MIPS_26: + r->r_addend = (inst & 0x03ffffff) << 2; + break; + case R_MIPS_32: + r->r_addend = inst; + break; + } + return 0; +} + /** * A module includes a number of sections that are discarded * either when loaded or when used as built-in. @@ -938,8 +1010,11 @@ static void check_sec_ref(struct module *mod, const char *modname,
Re: Define CONFIG_BOUNCE to avoid useless inclusion of bounce buffer logic.
On Mon, 21 May 2007 21:03:40 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > The bounce buffer logic is included on systems that do not need it. > If a system does not have zones like ZONE_DMA and ZONE_HIGHMEM that > can lead to the use of bounce buffers then there is no need to reserve > memory pools etc etc. This is true f.e. for SGI Altix. > +config BOUNCE > + def_bool y > + depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM) > + AFAIK, ppc has only ZONE_DMA and it never needs bounce. Is this ok ? -Kame
Re: Linux 2.6.22-rc2
Stephen Hemminger wrote: > There maybe some hardware level interaction with SATA controller. > I saw no failures running off i386 kernel of PATA drive and quickly > see errors with SATA/AHCI and x86_64. I presume AHCI is the only other device in the system using PCI MSI, when you see problems? Jeff
[PATCH] smbfs: fix header_check (and build)
The build dies in the header_check portion: /garz/repo/linux-2.6/usr/include/linux/smb_fs.h requires linux/jiffies.h, which does not exist in exported headers make[3]: *** [/garz/repo/linux-2.6/usr/include/linux/.check.smb_fs.h] Error 1 The solution is to move the jiffies.h include inside __KERNEL__. Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]> diff --git a/include/linux/smb_fs.h b/include/linux/smb_fs.h index 6b51a48..d5212f0 100644 --- a/include/linux/smb_fs.h +++ b/include/linux/smb_fs.h @@ -9,7 +9,6 @@ #ifndef _LINUX_SMB_FS_H #define _LINUX_SMB_FS_H -#include #include /* @@ -26,6 +25,7 @@ #include #include +#include #include #include #include - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Linux 2.6.22-rc2
On Mon, 21 May 2007 22:58:06 -0400 Mike Houston <[EMAIL PROTECTED]> wrote: > On Mon, 21 May 2007 10:37:55 -0700 > Stephen Hemminger <[EMAIL PROTECTED]> wrote: > > > On Mon, 21 May 2007 13:10:55 -0400 > > Mike Houston <[EMAIL PROTECTED]> wrote: > > > > > On Mon, 21 May 2007 08:45:49 -0700 > > > Stephen Hemminger <[EMAIL PROTECTED]> wrote: > > > > > > > It's almost certainly a problem with the BIOS and hardware (not > > > > a sky2) driver issue. Since there are many similar boards and > > > > configurations, I made the decision not to enforce restrictions > > > > in the driver. > > > > > > >> May 20 15:57:48 cramit kernel: sky2 :04:00.0: v1.14 addr > > > >> 0xf800 irq 16 Yukon-EC Ultra (0xb4) rev 2 > > > > > > Thank you for your answer. I was half wondering if that was the > > > case after staring at those log messages several more times. I > > > don't understand hardware at the low level but got thinking maybe > > > interrupt routing issue. There's an Nvidia PCI Express card in > > > there that gets IRQ 16, though it was not initialized by a driver > > > at the time. (plain old VGA console after fresh cold boot... no > > > framebuffer, no X, no nvidia module). I guess some things don't > > > share well. > > > > > > It works well in that other OS that came with the hardware, but > > > that's beside the point. > > > > It is some low level PCI Express related stuff, try latest BIOS (F9) > > and if that doesn't help there is a EEPROM update from Gigabyte > > for the Marvell hardware that might help. > > Thanks for your suggestions, I followed through on them. It may still > be interesting/useful to hear from me that it didn't help. The > problem is the same. > > My motherboard is a newer revision (Gigabyte GA-965P-DS3 Rev 3.3) and > already had the "F10" bios version, but I flashed to the latest F11 > version anyways. I also flashed with the EEPROM update from Gigabyte, > from a FAQ entry for my motherboard revision. > (faq_marvell_eeprom.zip). 
Both operations were successful. I cleared > the CMOS and reconfigured after the bios flash too. > > Incidently, it was showing IRQ 16 in that early initialization > message, but actually getting a MSI interrupt (IRQ 219, PCI-MSI-edge) > > I've disabled the onboard yukon2 adapter in bios and gone > back to the PCI card now. I think we can consider the matter closed, > since it's not a problem with the driver, but just so you know, I'm > always willing to help test when it's hardware that I have. > > Mike Houston There may be some hardware-level interaction with the SATA controller. I saw no failures running an i386 kernel off a PATA drive, but quickly see errors with SATA/AHCI and x86_64. -- Stephen Hemminger <[EMAIL PROTECTED]>
[PATCH] partitions/LDM: build fix
This from a "tested" patch... Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]> diff --git a/fs/partitions/ldm.c b/fs/partitions/ldm.c index c387812..99873a2 100644 --- a/fs/partitions/ldm.c +++ b/fs/partitions/ldm.c @@ -158,7 +158,7 @@ static bool ldm_parse_privhead(const u8 *data, struct privhead *ph) /* Warn the user and continue, carefully. */ ldm_info("Database is normally %u bytes, it claims to " "be %llu bytes.", LDM_DB_SIZE, - udunsigned long long)ph->config_size); + (unsigned long long)ph->config_size); } if ((ph->logical_disk_size == 0) || (ph->logical_disk_start + ph->logical_disk_size > ph->config_start)) {
Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree
On Mon, May 21, 2007 at 07:40:14PM -0700, Ken Chen wrote:
> tested, like this?

ACK. Could merge loop_init_one() into the only remaining caller, but it
won't make the code simpler, so let's leave it at that.
Oops in dentry_iput with 2.6.22-rc2 on AMD64
I was running a multithreaded perl application that leaks some memory so
it gets to eat up a significant chunk of my 2 GB and even push a bit into
swap. I left it running before going out for a walk. When I got back, I
found this in the log:

[28818.103829] Unable to handle kernel paging request at 910025c9a8b4 RIP:
[28818.103836] [] dentry_iput+0x58/0xbb
[28818.103845] PGD 0
[28818.103848] Oops: [1] SMP
[28818.103851] CPU 0
[28818.103853] Modules linked in: radeon ntfs sbp2 lp lgdt330x cx88_dvb cx88_vp3054_i2c dvb_pll video_buf_dvb tuner cx8802 cx88_alsa cx8800 cx88xx ir_common rtc tveeprom video_buf btcx_risc i2c_nforce2 evdev forcedeth
[28818.103868] Pid: 253, comm: kswapd0 Not tainted 2.6.22-rc2 #1
[28818.103871] RIP: 0010:[] [] dentry_iput+0x58/0xbb
[28818.103877] RSP: :81007de75d40 EFLAGS: 00010246
[28818.103880] RAX: RBX: 810075662e00 RCX: 810025c9a898
[28818.103883] RDX: 810025c9a898 RSI: 0001 RDI: 0282
[28818.103886] RBP: 81007de75d50 R08: 534d R09: 0064
[28818.103890] R10: 81007de75d80 R11: 000a R12: 910025c9a868
[28818.103893] R13: 810075662e08 R14: R15: 005e
[28818.103897] FS: 42804940() GS:806c8000() knlGS:
[28818.103900] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
[28818.103903] CR2: 910025c9a8b4 CR3: 254bf000 CR4: 06e0
[28818.103907] Process kswapd0 (pid: 253, threadinfo 81007de74000, task 81007de706d0)
[28818.103909] Stack: 810075662e00 0001 81007de75d70 80288aaf
[28818.103916] 81007e606070 0001 81007de75d90 80289a35
[28818.103921] 81007e606070 810075662e00 81007de75dd0 80289d3d
[28818.103926] Call Trace:
[28818.103931] [] d_kill+0x38/0x58
[28818.103935] [] prune_one_dentry+0x3c/0x10f
[28818.103939] [] prune_dcache+0x137/0x1a0
[28818.103944] [] shrink_dcache_memory+0x1c/0x35
[28818.103948] [] shrink_slab+0xe6/0x162
[28818.103953] [] kswapd+0x329/0x4ca
[28818.103958] [] autoremove_wake_function+0x0/0x38
[28818.103963] [] kswapd+0x0/0x4ca
[28818.103967] [] kthread+0x49/0x76
[28818.103971] [] child_rip+0xa/0x12
[28818.103977] [] kthread+0x0/0x76
[28818.103980] [] child_rip+0x0/0x12
[28818.103982]
[28818.103984]
[28818.103985] Code: 41 83 7c 24 4c 00 75 1c 4c 89 e7 45 31 c0 31 c9 31 d2 be 00
[28818.103994] RIP [] dentry_iput+0x58/0xbb
[28818.103999] RSP
[28818.104001] CR2: 910025c9a8b4

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
http://geekz.co.uk/schneierfacts/fact/163
Re: [PATCH] eCryptfs: Delay writing 0's after llseek until write
On Mon, 21 May 2007 18:00:21 -0500 Michael Halcrow <[EMAIL PROTECTED]> wrote:

> Delay writing 0's out in eCryptfs after a seek past the end of the
> file until data is actually written.

a) why?

b) what is the impact upon a user of them not having this patch?
Re: Status of squashfs?
> So Fedora uses squashfs, Ubuntu uses squashfs, Gentoo uses squashfs... It
> seems like the only place I can get a kernel _without_ squashfs is
> kernel.org.
>
> Is there a reason for this?

Has anyone tried to merge it upstream? Do the squashfs developers want
to merge it?

 - R.
Define CONFIG_BOUNCE to avoid useless inclusion of bounce buffer logic.
The bounce buffer logic is included on systems that do not need it. If a
system does not have zones like ZONE_DMA and ZONE_HIGHMEM that can lead to
the use of bounce buffers then there is no need to reserve memory pools
etc etc. This is true e.g. for SGI Altix.

Also nicifies the Makefile and gets rid of the tricky "and" there.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/blkdev.h |    2 +-
 mm/Kconfig             |    4 ++++
 mm/Makefile            |    4 +---
 3 files changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6/include/linux/blkdev.h
===
--- linux-2.6.orig/include/linux/blkdev.h	2007-05-18 15:29:43.0 -0700
+++ linux-2.6/include/linux/blkdev.h	2007-05-18 15:32:48.0 -0700
@@ -607,7 +607,7 @@ extern unsigned long blk_max_low_pfn, bl
 #define BLK_BOUNCE_ANY		((u64)blk_max_pfn << PAGE_SHIFT)
 #define BLK_BOUNCE_ISA		(ISA_DMA_THRESHOLD)
 
-#ifdef CONFIG_MMU
+#ifdef CONFIG_BOUNCE
 extern int init_emergency_isa_pool(void);
 extern void blk_queue_bounce(request_queue_t *q, struct bio **bio);
 #else
Index: linux-2.6/mm/Kconfig
===
--- linux-2.6.orig/mm/Kconfig	2007-05-18 15:31:18.0 -0700
+++ linux-2.6/mm/Kconfig	2007-05-18 15:38:25.0 -0700
@@ -163,6 +163,10 @@ config ZONE_DMA_FLAG
 	default "0" if !ZONE_DMA
 	default "1"
 
+config BOUNCE
+	def_bool y
+	depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
+
 config NR_QUICK
 	int
 	depends on QUICKLIST
Index: linux-2.6/mm/Makefile
===
--- linux-2.6.orig/mm/Makefile	2007-05-18 15:27:57.0 -0700
+++ linux-2.6/mm/Makefile	2007-05-18 15:33:17.0 -0700
@@ -13,9 +13,7 @@ obj-y := bootmem.o filemap.o mempool.o
 			   prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \
 			   $(mmu-y)
 
-ifeq ($(CONFIG_MMU)$(CONFIG_BLOCK),yy)
-obj-y += bounce.o
-endif
+obj-$(CONFIG_BOUNCE)	+= bounce.o
 obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o thrash.o
 obj-$(CONFIG_HUGETLBFS)	+= hugetlb.o
 obj-$(CONFIG_NUMA) 	+= mempolicy.o
Status of squashfs?
So Fedora uses squashfs, Ubuntu uses squashfs, Gentoo uses squashfs... It
seems like the only place I can get a kernel _without_ squashfs is
kernel.org.

Is there a reason for this?

Rob
Re: [PATCH] ahci: add Marvell support (WIP)
George T. Joseph (development) wrote:
> Hi Jeff,
>
> Two issues with the patch...
>
> msi has to be disabled for the Marvell or the driver load will throw a
> "nobody cared" message and eventually hang before discovering all the
> drives.
>
> Light I/O works fine but heavy I/O generates "exception Emask 0x0 Sact
> 0xb Serr 0x0 action 0x2 frozen" Then softreset and identify failures.
> Adding AHCI_FLAG_NO_NCQ to the flags fixes this.
>
> I've been running with both these changes over your original patch for
> a few months now with no problems. This is on a very heavily used 4
> drive mdraid10 array.

I poked Marvell about this, we'll see what they say. If I don't hear
anything useful, I'll push the no-MSI, no-NCQ version upstream.

	Jeff
Re: [RFC][PATCH 10/14] In-kernel file copy between union mounted filesystems
On Fri, May 18, 2007 at 09:47:31AM -0400, Shaya Potter wrote:
> Bharata B Rao wrote:
> >
> >Not really. This is called during copyup of a file residing in a lower
> >layer. And that is done only for regular files.
>
> That is broken.

But it only breaks the semantics (in other cases we allow writes only to
the top layer files). So the question is why do we have to copy up the
device node? What difference does it make to writing to the device
itself? Currently we allow writes to the device using the lower layer
device node itself.

>
> You should be able to change the permissions on a device node on a layer
> that is RO.
>

Hmm, not sure why we need to touch the permissions of the device. See below.

> so it would copy it up (1. mknod, 2. copy attributes) and then the
> appropriate attribute notification change would be called.

With union mount, when a regular file is opened for write, it is checked
if it resides in the lower layer and if so copied up to the topmost layer
and this new fd is returned from open. And any subsequent writes using
this fd will go to the newly created topmost file. (We are aware that we
are not yet copying the (extended) attributes to the newly created
topmost file, which we have to do).

Regards,
Bharata.
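The copy-up-on-open behaviour described in the last paragraph above can be sketched in userspace C roughly as follows. This is only a toy model under stated assumptions, not the union mount code: the layers are small in-memory tables, and every name here (toy_layer, toy_file, toy_lookup, toy_create, toy_open_for_write) is made up for illustration. Like the current patches, it copies only the data, not attributes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_FILES 8
#define TOY_NAME_LEN  16
#define TOY_DATA_LEN  32

/* Hypothetical in-memory "file"; stands in for a regular file in one layer. */
struct toy_file {
	bool used;
	char name[TOY_NAME_LEN];
	char data[TOY_DATA_LEN];
};

/* Hypothetical layer: a fixed table of files. */
struct toy_layer {
	struct toy_file files[TOY_MAX_FILES];
};

/* Find a file by name in one layer, or return NULL. */
static struct toy_file *toy_lookup(struct toy_layer *l, const char *name)
{
	for (int i = 0; i < TOY_MAX_FILES; i++)
		if (l->files[i].used && strcmp(l->files[i].name, name) == 0)
			return &l->files[i];
	return NULL;
}

/* Create a file in a layer with the given contents; NULL if the table is full. */
static struct toy_file *toy_create(struct toy_layer *l, const char *name,
				   const char *data)
{
	for (int i = 0; i < TOY_MAX_FILES; i++) {
		struct toy_file *f = &l->files[i];

		if (!f->used) {
			f->used = true;
			snprintf(f->name, sizeof(f->name), "%s", name);
			snprintf(f->data, sizeof(f->data), "%s", data);
			return f;
		}
	}
	return NULL;
}

/*
 * Open for write: if the file only exists in the lower layer, copy it up
 * to the top layer first and hand back the new top-layer file, so all
 * later writes land in the top layer and the lower copy stays untouched.
 */
static struct toy_file *toy_open_for_write(struct toy_layer *top,
					   struct toy_layer *lower,
					   const char *name)
{
	struct toy_file *f = toy_lookup(top, name);

	if (f)
		return f;		/* already in the topmost layer */
	f = toy_lookup(lower, name);
	if (!f)
		return NULL;		/* no such file in either layer */
	return toy_create(top, f->name, f->data);	/* copy-up */
}
```

A caller would create a file in the lower layer, call toy_open_for_write() on it, and write through the returned pointer; the lower layer's copy keeps its old contents.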
Re: Linux 2.6.22-rc2
On Mon, 21 May 2007 10:37:55 -0700
Stephen Hemminger <[EMAIL PROTECTED]> wrote:

> On Mon, 21 May 2007 13:10:55 -0400
> Mike Houston <[EMAIL PROTECTED]> wrote:
>
> > On Mon, 21 May 2007 08:45:49 -0700
> > Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> >
> > > It's almost certainly a problem with the BIOS and hardware (not
> > > a sky2) driver issue. Since there are many similar boards and
> > > configurations, I made the decision not to enforce restrictions
> > > in the driver.
> >
> > >> May 20 15:57:48 cramit kernel: sky2 :04:00.0: v1.14 addr
> > >> 0xf800 irq 16 Yukon-EC Ultra (0xb4) rev 2
> >
> > Thank you for your answer. I was half wondering if that was the
> > case after staring at those log messages several more times. I
> > don't understand hardware at the low level but got thinking maybe
> > interrupt routing issue. There's an Nvidia PCI Express card in
> > there that gets IRQ 16, though it was not initialized by a driver
> > at the time. (plain old VGA console after fresh cold boot... no
> > framebuffer, no X, no nvidia module). I guess some things don't
> > share well.
> >
> > It works well in that other OS that came with the hardware, but
> > that's beside the point.
>
> It is some low level PCI Express related stuff, try latest BIOS (F9)
> and if that doesn't help there is a EEPROM update from Gigabyte
> for the Marvell hardware that might help.

Thanks for your suggestions, I followed through on them. It may still
be interesting/useful to hear from me that it didn't help. The
problem is the same.

My motherboard is a newer revision (Gigabyte GA-965P-DS3 Rev 3.3) and
already had the "F10" bios version, but I flashed to the latest F11
version anyways. I also flashed with the EEPROM update from Gigabyte,
from a FAQ entry for my motherboard revision.
(faq_marvell_eeprom.zip). Both operations were successful. I cleared
the CMOS and reconfigured after the bios flash too.

Incidentally, it was showing IRQ 16 in that early initialization
message, but actually getting a MSI interrupt (IRQ 219, PCI-MSI-edge)

I've disabled the onboard yukon2 adapter in bios and gone
back to the PCI card now. I think we can consider the matter closed,
since it's not a problem with the driver, but just so you know, I'm
always willing to help test when it's hardware that I have.

Mike Houston
Re: [RFC] enhancing the kernel's graphics subsystem
Hi,

2007/5/18, Jesse Barnes <[EMAIL PROTECTED]>:
> Comments, questions, suggestions?

FBUI, kdrive

http://home.comcast.net/~fbui/
http://www.freedesktop.org/wiki/Software/Xserver

TIA
Re: [RFC] enhancing the kernel's graphics subsystem
On Mon, 2007-05-21 at 17:51 -0700, Keith Packard wrote:
>
> That's the plan; the kernel just provides mechanism. The architecture
> used in the X server splits precisely at this point with the mechanism
> in the driver and the configuration and policy up in the X server
> proper. Quite a bit of that code could be broken out into a shared
> library for fbdev-based apps and the X server to share, but that's
> down the road a bit after the kernel APIs look a lot more solid.

Ok, good plan then.

> With the goal of getting to a single-mode-set boot to avoid screen
> flashing before login, the key here is to make any early user-mode
> graphics apps share the same kernel graphics infrastructure as the X
> server to identify the common cases where the startup and X modes are
> the same and avoid resetting the configuration.

Ok. Fair enough.

Cheers,
Ben.
[PATCH 3/3] rd: Simplify by using the same helper functions in libfs
While the ramdisk code in the page cache started with the ramfs code it
has diverged, and as a result is more complicated than it currently needs
to be.

This patch simplifies the ramdisk code by syncing it with ramfs and
similar pieces of code.

The big difference is that the ramdisk must cope with people placing
buffer heads on its pages so there is extra code required to mark
those buffer heads dirty and uptodate.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 drivers/block/rd.c | 76 +++
 1 files changed, 5 insertions(+), 71 deletions(-)

diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index 41de0f4..56b2b54 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -92,36 +92,16 @@ static int rd_blocksize = CONFIG_BLK_DEV_RAM_BLOCKSIZE;
  * aops copied from ramfs.
  */
 
-/*
- * If a ramdisk page has buffers, some may be uptodate and some may be not.
- * To bring the page uptodate we zero out the non-uptodate buffers. The
- * page must be locked.
- */
 static void make_page_uptodate(struct page *page)
 {
+	clear_highpage(page);
 	if (page_has_buffers(page)) {
 		struct buffer_head *bh = page_buffers(page);
 		struct buffer_head *head = bh;
 
 		do {
-			if (!buffer_uptodate(bh)) {
-				memset(bh->b_data, 0, bh->b_size);
-				/*
-				 * akpm: I'm totally undecided about this. The
-				 * buffer has just been magically brought "up to
-				 * date", but nobody should want to be reading
-				 * it anyway, because it hasn't been used for
-				 * anything yet. It is still in a "not read
-				 * from disk yet" state.
-				 *
-				 * But non-uptodate buffers against an uptodate
-				 * page are against the rules. So do it anyway.
-				 */
-				set_buffer_uptodate(bh);
-			}
+			set_buffer_uptodate(bh);
 		} while ((bh = bh->b_this_page) != head);
-	} else {
-		memset(page_address(page), 0, PAGE_CACHE_SIZE);
 	}
 	flush_dcache_page(page);
 	SetPageUptodate(page);
@@ -129,55 +109,11 @@ static void make_page_uptodate(struct page *page)
 
 static int ramdisk_readpage(struct file *file, struct page *page)
 {
-	if (!PageUptodate(page))
-		make_page_uptodate(page);
+	make_page_uptodate(page);
 	unlock_page(page);
 	return 0;
 }
 
-static int ramdisk_prepare_write(struct file *file, struct page *page,
-				unsigned offset, unsigned to)
-{
-	if (!PageUptodate(page))
-		make_page_uptodate(page);
-	return 0;
-}
-
-static int ramdisk_commit_write(struct file *file, struct page *page,
-				unsigned offset, unsigned to)
-{
-	set_page_dirty(page);
-	return 0;
-}
-
-/*
- * ->writepage to the blockdev's mapping has to redirty the page so that the
- * VM doesn't go and steal it. We return AOP_WRITEPAGE_ACTIVATE so that the VM
- * won't try to (pointlessly) write the page again for a while.
- *
- * Really, these pages should not be on the LRU at all.
- */
-static int ramdisk_writepage(struct page *page, struct writeback_control *wbc)
-{
-	if (!PageUptodate(page))
-		make_page_uptodate(page);
-	SetPageDirty(page);
-	if (wbc->for_reclaim)
-		return AOP_WRITEPAGE_ACTIVATE;
-	unlock_page(page);
-	return 0;
-}
-
-/*
- * This is a little speedup thing: short-circuit attempts to write back the
- * ramdisk blockdev inode to its non-existent backing store.
- */
-static int ramdisk_writepages(struct address_space *mapping,
-				struct writeback_control *wbc)
-{
-	return 0;
-}
-
 /*
  * ramdisk blockdev pages have their own ->set_page_dirty() because we don't
  * want them to contribute to dirty memory accounting.
@@ -206,11 +142,9 @@ static int ramdisk_set_page_dirty(struct page *page)
 
 static const struct address_space_operations ramdisk_aops = {
 	.readpage	= ramdisk_readpage,
-	.prepare_write	= ramdisk_prepare_write,
-	.commit_write	= ramdisk_commit_write,
-	.writepage	= ramdisk_writepage,
+	.prepare_write	= simple_prepare_write,
+	.commit_write	= simple_commit_write,
 	.set_page_dirty	= ramdisk_set_page_dirty,
-	.writepages	= ramdisk_writepages,
 };
 
 static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector,
-- 
1.5.1.1.181.g2de0
Re: [PATCH] Kconfig powernow-k8 driver should depend on ACPI P-States driver
On Fri, May 18, 2007 at 12:01:08PM -0400, Dave Jones wrote:
> On Fri, May 18, 2007 at 12:09:38AM -0400, Ed Sweetman wrote:
[snip]
> still has unnecessary whitespace changes
[snip]
> and still wordwrapped.
> (also capitalise ACPI)

I haven't seen any more e-mail traffic on this topic so I'm assuming that
the ball has been dropped. Please excuse me if an acceptable patch has
been submitted that I wasn't CC'd on.

Here is a cleaned up version of Ed's patch that I believe addresses
Dave's stylistic concerns and applies the relevant changes to both x86 &
x86_64.

Signed-off-by: Joshua Hoblitt <[EMAIL PROTECTED]>

--
 i386/kernel/cpu/cpufreq/Kconfig |   13 ++++++++++---
 x86_64/kernel/cpufreq/Kconfig   |   13 ++++++++++---
 2 files changed, 20 insertions(+), 6 deletions(-)

diff -Nurp linux-2.6.22-rc1-mm1.orig/arch/i386/kernel/cpu/cpufreq/Kconfig linux-2.6.22-rc1-mm1/arch/i386/kernel/cpu/cpufreq/Kconfig
--- linux-2.6.22-rc1-mm1.orig/arch/i386/kernel/cpu/cpufreq/Kconfig	2007-04-27 11:49:26.0 -1000
+++ linux-2.6.22-rc1-mm1/arch/i386/kernel/cpu/cpufreq/Kconfig	2007-05-21 16:20:47.0 -1000
@@ -90,10 +90,17 @@ config X86_POWERNOW_K8
 	  If in doubt, say N.
 
 config X86_POWERNOW_K8_ACPI
-	bool
-	depends on X86_POWERNOW_K8 && ACPI_PROCESSOR
-	depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m)
+	bool "ACPI Support"
+	select ACPI_PROCESSOR
+	depends on X86_POWERNOW_K8
 	default y
+	help
+	  This provides access to the K8s Processor Performance States via ACPI.
+	  This driver is probably required for CPUFreq to work with multi-socket and
+	  SMP systems. It is not required on at least some single-socket yet
+	  multi-core systems, even if SMP is enabled.
+
+	  It is safe to say Y here.
 
 config X86_GX_SUSPMOD
 	tristate "Cyrix MediaGX/NatSemi Geode Suspend Modulation"
diff -Nurp linux-2.6.22-rc1-mm1.orig/arch/x86_64/kernel/cpufreq/Kconfig linux-2.6.22-rc1-mm1/arch/x86_64/kernel/cpufreq/Kconfig
--- linux-2.6.22-rc1-mm1.orig/arch/x86_64/kernel/cpufreq/Kconfig	2007-05-21 16:11:16.0 -1000
+++ linux-2.6.22-rc1-mm1/arch/x86_64/kernel/cpufreq/Kconfig	2007-05-21 16:29:11.0 -1000
@@ -24,10 +24,17 @@ config X86_POWERNOW_K8
 	  If in doubt, say N.
 
 config X86_POWERNOW_K8_ACPI
-	bool
-	depends on X86_POWERNOW_K8 && ACPI_PROCESSOR
-	depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m)
+	bool "ACPI Support"
+	select ACPI_PROCESSOR
+	depends on X86_POWERNOW_K8
 	default y
+	help
+	  This provides access to the K8s Processor Performance States via ACPI.
+	  This driver is probably required for CPUFreq to work with multi-socket and
+	  SMP systems. It is not required on at least some single-socket yet
+	  multi-core systems, even if SMP is enabled.
+
+	  It is safe to say Y here.
 
 config X86_SPEEDSTEP_CENTRINO
 	tristate "Intel Enhanced SpeedStep (deprecated)"
Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree
On 5/21/07, Ken Chen <[EMAIL PROTECTED]> wrote:
On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:
> No, it doesn't. Really. It's easy to split; untested incremental to your
> patch follows:
>
> 	for (i = 0; i < nr; i++) {
> -		if (!loop_init_one(i))
> -			goto err;
> +		lo = loop_alloc(i);
> +		if (!lo)
> +			goto Enomem;
> +		list_add_tail(&lo->lo_list, &loop_devices);
> 	}

ah, yes, use the loop_device list_head to link all the pending devices.

> +	/* point of no return */
> +
> +	list_for_each_entry(lo, &loop_devices, lo_list)
> +		add_disk(lo->lo_disk);
> +
> +	blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> +				  THIS_MODULE, loop_probe, NULL, NULL);
> +
> 	printk(KERN_INFO "loop: module loaded\n");
> 	return 0;
> -err:
> -	loop_exit();
> +
> +Enomem:
> 	printk(KERN_INFO "loop: out of memory\n");
> +
> +	while(!list_empty(&loop_devices)) {
> +		lo = list_entry(loop_devices.next, struct loop_device, lo_list);
> +		loop_del_one(lo);
> +	}
> +
> +	unregister_blkdev(LOOP_MAJOR, "loop");
> 	return -ENOMEM;
> }

I suppose the loop_del_one call in Enomem label needs to be split up
too since in the error path, it hasn't done add_disk() yet?

tested, like this?

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 5526ead..0ed5470 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1354,7 +1354,7 @@ #endif
  */
 static int max_loop;
 module_param(max_loop, int, 0);
-MODULE_PARM_DESC(max_loop, "obsolete, loop device is created on-demand");
+MODULE_PARM_DESC(max_loop, "Maximum number of loop devices");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_BLOCKDEV_MAJOR(LOOP_MAJOR);
 
@@ -1394,16 +1394,11 @@ int loop_unregister_transfer
 EXPORT_SYMBOL(loop_register_transfer);
 EXPORT_SYMBOL(loop_unregister_transfer);
 
-static struct loop_device *loop_init_one(int i)
+static struct loop_device *loop_alloc(int i)
 {
 	struct loop_device *lo;
 	struct gendisk *disk;
 
-	list_for_each_entry(lo, &loop_devices, lo_list) {
-		if (lo->lo_number == i)
-			return lo;
-	}
-
 	lo = kzalloc(sizeof(*lo), GFP_KERNEL);
 	if (!lo)
 		goto out;
@@ -1427,8 +1422,6 @@ static struct loop_device *loop_init_one
 	disk->private_data	= lo;
 	disk->queue		= lo->lo_queue;
 	sprintf(disk->disk_name, "loop%d", i);
-	add_disk(disk);
-	list_add_tail(&lo->lo_list, &loop_devices);
 	return lo;
 
 out_free_queue:
@@ -1439,15 +1432,37 @@ out:
 	return NULL;
 }
 
-static void loop_del_one(struct loop_device *lo)
+static void loop_free(struct loop_device *lo)
 {
-	del_gendisk(lo->lo_disk);
 	blk_cleanup_queue(lo->lo_queue);
 	put_disk(lo->lo_disk);
 	list_del(&lo->lo_list);
 	kfree(lo);
 }
 
+static struct loop_device *loop_init_one(int i)
+{
+	struct loop_device *lo;
+
+	list_for_each_entry(lo, &loop_devices, lo_list) {
+		if (lo->lo_number == i)
+			return lo;
+	}
+
+	lo = loop_alloc(i);
+	if (lo) {
+		add_disk(lo->lo_disk);
+		list_add_tail(&lo->lo_list, &loop_devices);
+	}
+	return lo;
+}
+
+static void loop_del_one(struct loop_device *lo)
+{
+	del_gendisk(lo->lo_disk);
+	loop_free(lo);
+}
+
 static struct kobject *loop_probe(dev_t dev, int *part, void *data)
 {
 	struct loop_device *lo;
@@ -1464,28 +1479,77 @@ static struct kobject *loop_probe
 
 static int __init loop_init(void)
 {
-	if (register_blkdev(LOOP_MAJOR, "loop"))
-		return -EIO;
-	blk_register_region(MKDEV(LOOP_MAJOR, 0), 1UL << MINORBITS,
-				  THIS_MODULE, loop_probe, NULL, NULL);
+	int i, nr;
+	unsigned long range;
+	struct loop_device *lo, *next;
+
+	/*
+	 * loop module now has a feature to instantiate underlying device
+	 * structure on-demand, provided that there is an access dev node.
+	 * However, this will not work well with user space tool that doesn't
+	 * know about such "feature". In order to not break any existing
+	 * tool, we do the following:
+	 *
+	 * (1) if max_loop is specified, create that many upfront, and this
+	 *     also becomes a hard limit.
+	 * (2) if max_loop is not specified, create 8 loop device on module
+	 *     load, user can further extend loop device by create dev node
+	 *     themselves and have kernel automatically instantiate actual
+	 *     device on-demand.
+	 */
+	if (max_loop > 1UL << MINORBITS)
+		return -EINVAL;
 
 	if (max_loop) {
-		printk(KERN_INFO "loop: the max_loop option is obsolete "
-				 "and will b
[PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty
The problem: When we are trying to free buffers try_to_free_buffers
will look at ramdisk pages with clean buffer heads and remove the
dirty bit from the page. Resulting in ramdisk pages with data that get
removed from the page cache. Ouch!

When we mark a ramdisk page dirty we call set_page_dirty which then
calls ramdisk_set_page_dirty. Currently we don't mark the buffer heads
dirty, leaving us susceptible to the problem above.

So to fix the mismatch between buffer head state and page state this
patch modifies ramdisk_set_page_dirty to set the dirty bit on all of
the buffers a page may possess.

I set the uptodate bit on the buffer head so that later we can use
simple_commit_write, and because it is trivially safe.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 drivers/block/rd.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index a1512da..41de0f4 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -184,6 +184,21 @@ static int ramdisk_writepages(struct address_space *mapping,
  */
 static int ramdisk_set_page_dirty(struct page *page)
 {
+	struct address_space * const mapping = page_mapping(page);
+
+	spin_lock(&mapping->private_lock);
+	if (page_has_buffers(page)) {
+		struct buffer_head *head = page_buffers(page);
+		struct buffer_head *bh = head;
+
+		do {
+			set_buffer_uptodate(bh);
+			set_buffer_dirty(bh);
+			bh = bh->b_this_page;
+		} while (bh != head);
+	}
+	spin_unlock(&mapping->private_lock);
+
 	if (!TestSetPageDirty(page))
 		return 1;
 	return 0;
-- 
1.5.1.1.181.g2de0
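For readers who want to poke at the buffer walk outside the kernel, here is a minimal userspace sketch of what the patched ramdisk_set_page_dirty does. It is an illustration only: toy_bh, toy_page and toy_set_page_dirty are made-up stand-ins for the kernel types and function, the flags are plain bools instead of atomic bit operations, and the private_lock is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head (only the bits we need). */
struct toy_bh {
	bool uptodate;
	bool dirty;
	struct toy_bh *b_this_page;	/* circular list of a page's buffers */
};

/* Hypothetical stand-in for struct page. */
struct toy_page {
	bool dirty;
	struct toy_bh *buffers;		/* NULL when the page has no buffers */
};

/*
 * Mark every buffer on the page uptodate and dirty, then dirty the page.
 * Returns 1 if the page was newly dirtied and 0 if it was already dirty,
 * mirroring the TestSetPageDirty() result handling in the patch.
 */
static int toy_set_page_dirty(struct toy_page *page)
{
	if (page->buffers) {
		struct toy_bh *head = page->buffers;
		struct toy_bh *bh = head;

		do {
			bh->uptodate = true;
			bh->dirty = true;
			bh = bh->b_this_page;
		} while (bh != head);
	}

	if (!page->dirty) {
		page->dirty = true;
		return 1;
	}
	return 0;
}
```

The do/while form matters because the buffer list is circular: the head must be visited once before the termination test can compare against it.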
Re: PCI device problem - MMCONFIG, cannot allocate resource region, resource collisions
Jesse Barnes wrote:
> There's a recent thread about PCI resource assignment (sounds like your
> BIOS might be buggy btw, or you're somehow running out of space), search
> for the title "PCI bridge range sizing bug". You may need the kernel to
> reassign the resource for your NIC before you can use it. I think Ivan
> has some test patches along these lines. If you can find out what
> resource it's colliding with, that might give you a clue.

I don't have anything else plugged in to the PC (except a USB drive).
BIOS is set to PNP OS. How do I find out what it is colliding with?

Here is a full lspci output:

# lspci -v
00:00.0 Host bridge: ATI Technologies Inc RS480 Host Bridge (rev 10)
	Subsystem: ATI Technologies Inc RS480 Host Bridge
	Flags: bus master, 66MHz, medium devsel, latency 0

00:01.0 PCI bridge: ATI Technologies Inc RS480 PCI Bridge (prog-if 00 [Normal decode])
	Flags: bus master, 66MHz, medium devsel, latency 64
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
	I/O behind bridge: a000-afff
	Memory behind bridge: ff40-ff4f
	Prefetchable memory behind bridge: fab0-fea0
	Capabilities: [44] HyperTransport: MSI Mapping
	Capabilities: [b0] #0d []

00:12.0 SATA controller: ATI Technologies Inc SB600 Non-Raid-5 SATA (prog-if 01 [AHCI 1.0])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7244
	Flags: bus master, 66MHz, medium devsel, latency 96, IRQ 18
	I/O ports at e800 [size=8]
	I/O ports at e400 [size=4]
	I/O ports at e000 [size=8]
	I/O ports at dc00 [size=4]
	I/O ports at d800 [size=16]
	Memory at ff6ffc00 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [60] Power Management version 2
	Capabilities: [50] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-
	Capabilities: [70] #12 [0010]

00:13.0 USB Controller: ATI Technologies Inc SB600 USB (OHCI0) (prog-if 10 [OHCI])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 19
	Memory at ff6fe000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.1 USB Controller: ATI Technologies Inc SB600 USB (OHCI1) (prog-if 10 [OHCI])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 21
	Memory at ff6fd000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.2 USB Controller: ATI Technologies Inc SB600 USB (OHCI2) (prog-if 10 [OHCI])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 22
	Memory at ff6fc000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.3 USB Controller: ATI Technologies Inc SB600 USB (OHCI3) (prog-if 10 [OHCI])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 21
	Memory at ff6fb000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.4 USB Controller: ATI Technologies Inc SB600 USB (OHCI4) (prog-if 10 [OHCI])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 22
	Memory at ff6fa000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.5 USB Controller: ATI Technologies Inc SB600 USB Controller (EHCI) (prog-if 20 [EHCI])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 20
	Memory at ff6ff800 (32-bit, non-prefetchable) [size=256]
	Capabilities: [c0] Power Management version 2
	Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-
	Capabilities: [e4] Debug port

00:14.0 SMBus: ATI Technologies Inc SB600 SMBus (rev 13)
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: 66MHz, medium devsel
	I/O ports at 0b00 [size=16]
	Capabilities: [b0] HyperTransport: MSI Mapping

00:14.1 IDE interface: ATI Technologies Inc SB600 IDE (prog-if 8a [Master SecP PriP])
	Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 19
	I/O ports at 01f0 [size=8]
	I/O ports at 03f4 [size=1]
	I/O ports at 0170 [size=8]
	I/O ports at 0374 [size=1]
	I/O ports at ff00 [size=16]
	Capabilities: [70] Mess
[PATCH 1/3] Preserve the dirty bit in init_page_buffers
The problem: When we are trying to free buffers try_to_free_buffers
will look at ramdisk pages with clean buffer heads and remove the
dirty bit from the page. Resulting in ramdisk pages with data that get
removed from the page cache. Ouch!

Buffer heads appear on ramdisk pages when a filesystem calls getblk,
which through a series of function calls eventually calls
init_page_buffers.

So to fix the mismatch between buffer head state and page state this
patch modifies init_page_buffers to transfer the dirty bit from the
page to the buffer heads like we currently do for the uptodate bit.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 fs/buffer.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index aa68206..c6b58e8 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -953,6 +953,7 @@ init_page_buffers(struct page *page, struct block_device *bdev,
 	struct buffer_head *head = page_buffers(page);
 	struct buffer_head *bh = head;
 	int uptodate = PageUptodate(page);
+	int dirty = PageDirty(page);
 
 	do {
 		if (!buffer_mapped(bh)) {
@@ -961,6 +962,8 @@ init_page_buffers(struct page *page, struct block_device *bdev,
 			bh->b_blocknr = block;
 			if (uptodate)
 				set_buffer_uptodate(bh);
+			if (dirty)
+				set_buffer_dirty(bh);
 			set_buffer_mapped(bh);
 		}
 		block++;
-- 
1.5.1.1.181.g2de0
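To make the failure mode above concrete, here is a rough userspace model of the interaction. It is a sketch under stated assumptions rather than kernel code: toy_bh and toy_page are invented stand-ins, toy_init_page_buffers mirrors only the state transfer done by the patched init_page_buffers, and toy_try_to_free_buffers models nothing more than the behaviour described in the changelog (a page whose buffers are all clean loses its dirty bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head. */
struct toy_bh {
	bool mapped;
	bool uptodate;
	bool dirty;
	struct toy_bh *b_this_page;	/* circular list of a page's buffers */
};

/* Hypothetical stand-in for struct page. */
struct toy_page {
	bool uptodate;
	bool dirty;
	struct toy_bh *buffers;
};

/*
 * Model of the patched init_page_buffers(): every not-yet-mapped buffer
 * inherits both the uptodate and the dirty state of its page.
 */
static void toy_init_page_buffers(struct toy_page *page)
{
	struct toy_bh *head = page->buffers;
	struct toy_bh *bh = head;

	do {
		if (!bh->mapped) {
			if (page->uptodate)
				bh->uptodate = true;
			if (page->dirty)
				bh->dirty = true;
			bh->mapped = true;
		}
		bh = bh->b_this_page;
	} while (bh != head);
}

/*
 * Simplified model of the behaviour described in the changelog: if no
 * buffer on the page is dirty, the page's dirty bit is dropped, which
 * for a ramdisk means the data can be thrown away.
 */
static void toy_try_to_free_buffers(struct toy_page *page)
{
	struct toy_bh *head = page->buffers;
	struct toy_bh *bh = head;

	do {
		if (bh->dirty)
			return;		/* a dirty buffer keeps the page dirty */
		bh = bh->b_this_page;
	} while (bh != head);

	page->dirty = false;
}
```

With the dirty state transferred at init time, a dirty page that later gains buffers keeps its dirty bit through the free attempt; if the buffers stay clean, the page is silently cleaned.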
Re: [stable] [PATCH] - store sysfs inode nrs in s_ino to avoid readdir oopses
(2nd try, better(?) changelog, quilt refreshed(!) patch)

--

Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there
is no synchronization with readdir. This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.

Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.

Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>

Index: linux-2.6.21/fs/sysfs/dir.c
===================================================================
--- linux-2.6.21.orig/fs/sysfs/dir.c
+++ linux-2.6.21/fs/sysfs/dir.c
@@ -30,6 +30,14 @@ static struct dentry_operations sysfs_de
 	.d_iput		= sysfs_d_iput,
 };
 
+static unsigned int sysfs_inode_counter;
+ino_t sysfs_get_inum(void)
+{
+	if (unlikely(sysfs_inode_counter < 3))
+		sysfs_inode_counter = 3;
+	return sysfs_inode_counter++;
+}
+
 /*
  * Allocates a new sysfs_dirent and links it to the parent sysfs_dirent
  */
@@ -41,6 +49,7 @@ static struct sysfs_dirent * __sysfs_new
 	if (!sd)
 		return NULL;
 
+	sd->s_ino = sysfs_get_inum();
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_event, 1);
 	INIT_LIST_HEAD(&sd->s_children);
@@ -509,7 +518,7 @@ static int sysfs_readdir(struct file * f
 	switch (i) {
 		case 0:
-			ino = dentry->d_inode->i_ino;
+			ino = parent_sd->s_ino;
 			if (filldir(dirent, ".", 1, i, ino, DT_DIR) < 0)
 				break;
 			filp->f_pos++;
@@ -538,10 +547,7 @@ static int sysfs_readdir(struct file * f
 				name = sysfs_get_name(next);
 				len = strlen(name);
-				if (next->s_dentry)
-					ino = next->s_dentry->d_inode->i_ino;
-				else
-					ino = iunique(sysfs_sb, 2);
+				ino = next->s_ino;
 
 				if (filldir(dirent, name, len, filp->f_pos, ino,
 						 dt_type(next)) < 0)
Index: linux-2.6.21/fs/sysfs/inode.c
===================================================================
--- linux-2.6.21.orig/fs/sysfs/inode.c
+++ linux-2.6.21/fs/sysfs/inode.c
@@ -140,6 +140,7 @@ struct inode * sysfs_new_inode(mode_t mo
 		inode->i_mapping->a_ops = &sysfs_aops;
 		inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 		inode->i_op = &sysfs_inode_operations;
+		inode->i_ino = sd->s_ino;
 		lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
 		if (sd->s_iattr) {
Index: linux-2.6.21/fs/sysfs/mount.c
===================================================================
--- linux-2.6.21.orig/fs/sysfs/mount.c
+++ linux-2.6.21/fs/sysfs/mount.c
@@ -33,6 +33,7 @@ static struct sysfs_dirent sysfs_root = 
 	.s_element	= NULL,
 	.s_type		= SYSFS_ROOT,
 	.s_iattr	= NULL,
+	.s_ino		= 1,
 };
 
 static void sysfs_clear_inode(struct inode *inode)
Index: linux-2.6.21/fs/sysfs/sysfs.h
===================================================================
--- linux-2.6.21.orig/fs/sysfs/sysfs.h
+++ linux-2.6.21/fs/sysfs/sysfs.h
@@ -5,6 +5,7 @@ struct sysfs_dirent {
 	void 			* s_element;
 	int			s_type;
 	umode_t			s_mode;
+	ino_t			s_ino;
 	struct dentry		* s_dentry;
 	struct iattr		* s_iattr;
 	atomic_t		s_event;
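
The "brain-dead incrementing counter" is easy to model in userspace. The sketch below copies the logic of sysfs_get_inum() from the patch into hypothetical toy_* functions (the setter exists only so the wraparound case can be exercised) and shows why the changelog says uniqueness is not guaranteed: once the unsigned counter wraps, numbers starting at 3 are handed out again.

```c
#include <assert.h>	/* only needed for the checks in the usage note */
#include <limits.h>

/* Hypothetical userspace copy of the counter; not the kernel symbol. */
static unsigned int toy_inode_counter;

/*
 * Hand out the next number, never returning 0, 1 or 2.
 * (In the patch the sysfs root gets s_ino = 1.)
 */
unsigned int toy_get_inum(void)
{
	if (toy_inode_counter < 3)
		toy_inode_counter = 3;
	return toy_inode_counter++;
}

/* Test hook: force the counter to a chosen value. */
void toy_set_counter(unsigned int value)
{
	toy_inode_counter = value;
}
```

Starting from a zeroed counter the first two calls return 3 and 4. After toy_set_counter(UINT_MAX), one call returns UINT_MAX and the next returns 3 again, i.e. a number that may still be in use.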
[RFC][PATCH] do_div_signed()
Here's a quick pass at adding do_div_signed() which provides a signed
version of do_div, avoiding having do_div users hack around signed
issues (like in ntp.c).

It probably could be optimized further, so let me know if you have any
suggestions.

Other thoughts?

thanks
-john

Signed-off-by: John Stultz <[EMAIL PROTECTED]>

diff --git a/include/asm-arm/div64.h b/include/asm-arm/div64.h
index 0b5f881..af44e10 100644
--- a/include/asm-arm/div64.h
+++ b/include/asm-arm/div64.h
@@ -225,5 +225,6 @@
 #endif
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 
 #endif
diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
index a4a4937..e1cac65 100644
--- a/include/asm-generic/div64.h
+++ b/include/asm-generic/div64.h
@@ -62,4 +62,5 @@ extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
 
 #endif /* BITS_PER_LONG */
 
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif /* _ASM_GENERIC_DIV64_H */
diff --git a/include/asm-i386/div64.h b/include/asm-i386/div64.h
index 438e980..05320a5 100644
--- a/include/asm-i386/div64.h
+++ b/include/asm-i386/div64.h
@@ -49,4 +49,5 @@ div_ll_X_l_rem(long long divs, long div, long *rem)
 }
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif
diff --git a/include/asm-m68k/div64.h b/include/asm-m68k/div64.h
index 33caad1..3c76059 100644
--- a/include/asm-m68k/div64.h
+++ b/include/asm-m68k/div64.h
@@ -26,4 +26,5 @@
 })
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif /* _M68K_DIV64_H */
diff --git a/include/asm-mips/div64.h b/include/asm-mips/div64.h
index 66189f5..851ce40 100644
--- a/include/asm-mips/div64.h
+++ b/include/asm-mips/div64.h
@@ -111,5 +111,6 @@ static inline uint64_t div64_64(uint64_t dividend, uint64_t divisor)
 }
 
 #endif /* (_MIPS_SZLONG == 64) */
 
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif /* _ASM_DIV64_H */
diff --git a/include/asm-um/div64.h b/include/asm-um/div64.h
index 7b73b2c..1fc4a2c 100644
--- a/include/asm-um/div64.h
+++ b/include/asm-um/div64.h
@@ -4,4 +4,5 @@
 #include "asm/arch/div64.h"
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 87aa5ff..7c093ad 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -302,16 +302,11 @@ int do_adjtimex(struct timex *txc)
 	    freq_adj = time_offset * mtemp;
 	    freq_adj = shift_right(freq_adj, time_constant * 2 +
 				   (SHIFT_PLL + 2) * 2 - SHIFT_NSEC);
-	    if (mtemp >= MINSEC && (time_status & STA_FLL || mtemp > MAXSEC)) {
+	    if (mtemp >= MINSEC &&
+			(time_status & STA_FLL || mtemp > MAXSEC)) {
 		temp64 = time_offset << (SHIFT_NSEC - SHIFT_FLL);
-		if (time_offset < 0) {
-		    temp64 = -temp64;
-		    do_div(temp64, mtemp);
-		    freq_adj -= temp64;
-		} else {
-		    do_div(temp64, mtemp);
-		    freq_adj += temp64;
-		}
+		do_div_signed(&temp64, mtemp);
+		freq_adj += temp64;
 	    }
 	    freq_adj += time_freq;
 	    freq_adj = min(freq_adj, (s64)MAXFREQ_NSEC);
diff --git a/lib/div64.c b/lib/div64.c
index b71cf93..e6ff440 100644
--- a/lib/div64.c
+++ b/lib/div64.c
@@ -79,3 +79,37 @@ uint64_t div64_64(uint64_t dividend, uint64_t divisor)
 EXPORT_SYMBOL(div64_64);
 
 #endif /* BITS_PER_LONG == 32 */
+
+/* Signed 64 bit dividend, result, rem. Signed 32 bit divisor */
+int64_t do_div_signed(int64_t *n, int32_t base)
+{
+	uint64_t num, den;
+	int64_t rem;
+	int num_sign = (*n < 0);
+	int den_sign = (base < 0);
+
+	if (num_sign)
+		num = (uint64_t)(-*n);
+	else
+		num = (uint64_t)*n;
+
+	/* XXX this is sort of obnoxious,but seems necessary
+	 * to handle the base possibly being negative as well
+	 */
+	if (den_sign)
+		den = (uint32_t)(-base);
+	else
+		den = (uint32_t)base;
+
+	rem = do_div(num, den);
+
+	*n = (int64_t)num;
+	if(num_sign ^ den_sign)
+		*n = -*n;
+	if(num_sign)
+		rem = -rem;
+
+	return rem;
+}
+
+EXPORT_SYMBOL(do_div_signed);
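
The sign handling in the proposal can be tried in userspace. The following is a minimal sketch, not the kernel helper: toy_div_signed() is a hypothetical name, plain / and % stand in for do_div(), and the negations are done in unsigned arithmetic so that the most negative inputs do not overflow (that last part is my own choice and not something the patch does). The convention matches C's truncating division: the quotient is negative when the signs differ, and the remainder takes the sign of the dividend.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of the proposed helper: divide *n by
 * base in place and return the remainder, working on magnitudes.
 */
int64_t toy_div_signed(int64_t *n, int32_t base)
{
	int num_neg = (*n < 0);
	int den_neg = (base < 0);
	/* Negate as unsigned so INT64_MIN and INT32_MIN are handled. */
	uint64_t num = num_neg ? 0 - (uint64_t)*n : (uint64_t)*n;
	uint64_t den = den_neg ? 0 - (uint64_t)(int64_t)base : (uint64_t)base;
	uint64_t quot, rem;

	assert(base != 0);
	quot = num / den;
	rem = num % den;

	/*
	 * Quotient is negative when the signs differ. The conversion back
	 * to int64_t assumes the usual two's complement representation.
	 */
	*n = (num_neg ^ den_neg) ? (int64_t)(0 - quot) : (int64_t)quot;

	/* Remainder follows the sign of the dividend; |rem| < |base|. */
	return num_neg ? -(int64_t)rem : (int64_t)rem;
}
```

For example, dividing -7 by 2 leaves -3 in *n and returns -1, while dividing 7 by -2 leaves -3 and returns 1.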
Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree
On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:
> No, it doesn't. Really. It's easy to split; untested incremental to your
> patch follows:
>
> 	for (i = 0; i < nr; i++) {
> -		if (!loop_init_one(i))
> -			goto err;
> +		lo = loop_alloc(i);
> +		if (!lo)
> +			goto Enomem;
> +		list_add_tail(&lo->lo_list, &loop_devices);
> 	}

ah, yes, use the loop_device list_head to link all the pending devices.

> +	/* point of no return */
> +
> +	list_for_each_entry(lo, &loop_devices, lo_list)
> +		add_disk(lo->lo_disk);
> +
> +	blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> +				  THIS_MODULE, loop_probe, NULL, NULL);
> +
> 	printk(KERN_INFO "loop: module loaded\n");
> 	return 0;
> -err:
> -	loop_exit();
> +
> +Enomem:
> 	printk(KERN_INFO "loop: out of memory\n");
> +
> +	while(!list_empty(&loop_devices)) {
> +		lo = list_entry(loop_devices.next, struct loop_device, lo_list);
> +		loop_del_one(lo);
> +	}
> +
> +	unregister_blkdev(LOOP_MAJOR, "loop");
> 	return -ENOMEM;
> }

I suppose the loop_del_one call in Enomem label needs to be split up
too since in the error path, it hasn't done add_disk() yet?
Re: 2.6.22-rc1-mm1
Hi,

> This implies a miscompile somewhere, *or* that your bios stomps on
> registers that gcc expect preserved, and adding printf's disturbs the
> register allocation sufficiently.

I think it may be caused by gcc optimization, so I added volatile to the
inline assembly in read_sector, and with that the kernel boots
successfully. Please check this patch:

diff -ur linux/arch/i386/boot/edd.c linux.new/arch/i386/boot/edd.c
--- linux/arch/i386/boot/edd.c	2007-05-22 10:08:59.0 +
+++ linux.new/arch/i386/boot/edd.c	2007-05-22 10:06:24.0 +
@@ -47,7 +47,7 @@
 	ax = 0x4200;		/* Extended Read */
 	si = (size_t)&dapa;
 	dx = devno;
-	asm ("pushfl; stc; int $0x13; setc %%al; popfl"
+	asm volatile("pushfl; stc; int $0x13; setc %%al; popfl"
 	     : "+a" (ax), "+S" (si), "+d" (devno)
 	     : : "ebx", "ecx", "edi");
 
@@ -58,7 +58,7 @@
 	cx = 0x0001;		/* Sector 0-0-1 */
 	dx = devno;
 	bx = (size_t)buf;
-	asm ("pushfl; stc; int $0x13; setc %%al; popfl"
+	asm volatile("pushfl; stc; int $0x13; setc %%al; popfl"
 	     : "+a" (ax), "+c" (cx), "+d" (dx), "+b" (bx)
 	     : : "esi", "edi");

Regards
dave
Re: PCI device problem - MMCONFIG, cannot allocate resource region, resource collisions
On Monday, May 21, 2007, System Design Works wrote:
> The kernel has a problem allocating resources for my PCI NIC. Here is
> what the kernel is reporting:
>
> # uname -a
> Linux wopr 2.6.20-gentoo-r8 #7 SMP Sun May 20 20:56:56 PDT 2007 i686 AMD
> Athlon(tm) 64 X2 Dual Core Processor 3800+ AuthenticAMD GNU/Linux
>
> # dmesg
> ...
> PCI: BIOS Bug: MCFG area at e000 is not E820-reserved
> PCI: Not using MMCONFIG.

This is actually an unrelated problem. We're a little too conservative
about using MCFG space (though this turns out to be a good thing on some
of my machines), but shouldn't affect the rest of your PCI resource
assignment.

> ...
> PCI: Cannot allocate resource region 0 of device :02:02.0
> ...
> PCI: Device :02:02.0 not available because of resource collisions
> skge: :02:02.0 cannot enable PCI device
> skge: probe of :02:02.0 failed with error -22
> ...
>
> I have seen other posts reporting similar error messages. I would like
> to help resolve this problem, and I can do some testing if needed. More
> info:
>
> Kernel boot params: pci=nomsi

There's a recent thread about PCI resource assignment (sounds like your
BIOS might be buggy btw, or you're somehow running out of space), search
for the title "PCI bridge range sizing bug". You may need the kernel to
reassign the resource for your NIC before you can use it. I think Ivan
has some test patches along these lines.

If you can find out what resource it's colliding with, that might give
you a clue.

Jesse
Re: IDE/ATA: Intel i865-based mainboard, CDROM not detected
> > Are we talking CONFIG_PATA_MARVELL here? If so then the kernel I just
> > booted has this set to "y" (ie: built-in) and yet the drive is still not
> > detected. Is there a newer version of this driver somewhere? The kernel
> > was 2.6.22-rc2.
>
> Should be current. Its known to work fine for that chip so you might need
> to do some debugging and provide more detail.

I got someone to go into the BIOS setup (I am debugging this remotely)
and check the IDE/ATA related options. The two relevant options were set
as follows:

  ATA/IDE Mode:      [ Choices: Legacy, Native ]
  Configure SATA as: [ Choices: IDE, RAID, AHCI ]

I had them change the "Configure SATA as" setting from "IDE" to "AHCI"
and then reboot into 2.6.22-rc2. At this point things appear to be much
happier:

ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Probing IDE interface ide0...
Probing IDE interface ide1...
:
ahci :00:1f.2: version 2.1
ACPI: PCI Interrupt :00:1f.2[A] -> GSI 19 (level, low) -> IRQ 19
ahci :00:1f.2: AHCI 0001.0100 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
ahci :00:1f.2: flags: 64bit ncq led clo pio slum part
PCI: Setting latency timer of device :00:1f.2 to 64
scsi0 : ahci
:
scsi5 : ahci
ata1: SATA max UDMA/133 cmd 0xf882a100 ctl 0x bmdma 0x irq 0
:
ata6: SATA max UDMA/133 cmd 0xf882a380 ctl 0x bmdma 0x irq 0
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: ATA-7: WDC WD2500AAJS-00RYA0, 12.01B01, max UDMA/133
ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
ata1.00: configured for UDMA/133
ata2: SATA link down (SStatus 0 SControl 300)
:
ata6: SATA link down (SStatus 0 SControl 300)
scsi 0:0:0:0: Direct-Access ATA WDC WD2500AAJS-0 12.0 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
sd 0:0:0:0: [sda] Write Protect is off
:
sd 0:0:0:0: [sda] Attached SCSI disk
ACPI: PCI Interrupt :02:00.0[A] -> GSI 17 (level, low) -> IRQ 16
PCI: Setting latency timer of device :02:00.0 to 64
scsi6 : pata_marvell
scsi7 : pata_marvell
ata7: PATA max UDMA/100 cmd 0x00011018 ctl 0x00011026 bmdma 0x00011000 irq 0
ata8: DUMMY
BAR5:00:00 01:7F 02:22 03:CA 04:00 05:00 06:00 07:00 08:00 09:00 0A:00 0B:00 0C:01 0D:00 0E:00 0F:00
ata7.00: ATAPI, max UDMA/66
ata7.00: limited to UDMA/33 due to 40-wire cable
ata7.00: configured for UDMA/33
scsi 6:0:0:0: CD-ROM SONY DVD RW DRU-830A SS25 PQ: 0 ANSI: 5
sr0: scsi3-mmc drive: 40x/40x writer dvd-ram cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 6:0:0:0: Attached scsi CD-ROM sr0
sr 6:0:0:0: Attached scsi generic sg1 type 5

Therefore it seems that for whatever reason, the pata_marvell driver
will only find the Marvell PATA IDE controller if the *SATA* mode in the
BIOS is set to "AHCI". It's somewhat counter-intuitive, but since AHCI
is the "correct" setting for SATA performance reasons it's probably not
such a bad thing.

So, thanks for all the suggestions; the problem appears solved.

Regards
jonathan
Re: [rfc] increase struct page size?!
On Mon, May 21, 2007 at 06:39:51PM -0700, William Lee Irwin III wrote:
> On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
> >> ... yeah, something like that would bypass
>
> On Mon, May 21, 2007 at 05:43:16PM -0500, Matt Mackall wrote:
> > As long as we're throwing out crazy unpopular ideas, try this one:
> > Divide struct page in two such that all the most commonly used
> > elements are in one piece that's nicely sized and the rest are in
> > another. Have two parallel arrays containing these pieces and accessor
> > functions around the unpopular bits.
> > Whether a sensible divide between popular and unpopular bits isn't
> > clear to me. But hey, I said it was crazy.
>
> I have a crazier and even less popular idea. Eliminate struct page
> entirely as an accounting structure (and, of course, mem_map with it).
> Filesystems can keep the per-page metadata they need in their own
> accounting structures, slab mutatis mutandis, etc. The brilliant bit
> here is that devolving the accounting structures this way allows the
> fs and/or subsystem to arrange for strong cache locality, file offset
> adjacency to imply memory adjacency of the page accounting fields,
> etc., where grabbing random structures out of some array is a real
> cache thrasher.
>
> The page allocation and page replacement algorithms would have to be
> adjusted, and things would have to allocate their own refcounts,
> supposing they want/need refcounts, but it's not so far out. Refer to
> filesystem pages by pairs, refer to slab pages by

BTW. I think the filesystem APIs (at least the VM-side ones) should be
doing this anyway (not even index, but offset). Passing things like lists
of pages around is just horrible. See my write_begin/write_end and
perform_write aops for (what I think is) a step in the right direction.

> address (virtual and physical are trivially inter-convertible), mock
> up something akin to what filesystems do for anonymous pages, etc.
>
> The real objection everyone's going to have is that driver writers
> will stain their shorts when faced with the rules for handling such
> things. The thing is, I'm not entirely sure who these driver writers
> that would have such trouble are, since the driver writers I know
> personally are sophisticates rather than walking disaster areas as such
> would imply. I suppose they may not be representative of the whole.

That's not the objection I would have. I would say that firstly, I don't
think the mem_map overhead is very significant (at any rate, an
allocated-on-demand metadata is not going to be any smaller if you fill
up on pagecache...). Secondly, I think there is merit to having the same
page metadata used by the major subsystems, because it helps for locality
of reference.

But I haven't explored the idea enough myself to know whether there would
be any really killer benefits to this. Delayed metadata freeing via RCU
without holding up the freeing of the actual page would have been
something, however I can do similar with speculative references now (or
whenever the code gets merged), which doesn't even require the RCU
overhead.

> -- wli
>
> P.S. This idea is not plucked out of the air; it has precedents. A
> number of microkernels do this, and IIRC k42 does so also.

Psst, just say "kernels" when you mention this to Linus ;)
Re: [RFC] enhancing the kernel's graphics subsystem
On Monday, May 21, 2007, Jon Smirl wrote:
> I am not asking that these features be implemented today. I am asking
> that enough planning go into the architecture today to make sure that
> these features can be built in the future without tearing up the
> graphics system for a third time.
>
> This is the essence of my complaint about this patch. The patch
> introduces a new low level graphics API to the kernel. Once we put an
> API in it is basically impossible to get it back out. I am not
> convinced that enough planning has gone into this API yet.

Jon, that's why I'm posting this stuff in the first place! :)

Again, if you have specific problems with the proposed interfaces
(problems that would preclude your wishlist from being fully
implementable), please let me know (preferably with specifics).

Thanks,
Jesse
Re: [stable] [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes
* Chris Wright ([EMAIL PROTECTED]) wrote:
> * Alan Cox ([EMAIL PROTECTED]) wrote:
> > Yeah - fix your mailer, you got a reply 5 days ago.
>
> Sure wouldn't be the first time something broke. I'll take a look.

Thanks for the prod. I found 2 quite stale RBL entries, causing long
connection delay (enough from some MTAs to walk away perhaps).

thanks,
-chris
Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree
On Mon, May 21, 2007 at 06:30:15PM -0700, Ken Chen wrote:
> On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:
> >On Mon, May 21, 2007 at 03:00:55PM -0700, [EMAIL PROTECTED] wrote:
> >> +	if (register_blkdev(LOOP_MAJOR, "loop"))
> >> +		return -EIO;
> >> +	blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> >> +				  THIS_MODULE, loop_probe, NULL, NULL);
> >> +
> >> +	for (i = 0; i < nr; i++) {
> >> +		if (!loop_init_one(i))
> >> +			goto err;
> >> +	}
> >> +
> >> +	printk(KERN_INFO "loop: module loaded\n");
> >> +	return 0;
> >> +err:
> >> +	loop_exit();
> >
> >This isn't good. You *can't* fail once a single disk has been registered.
> >Anyone could've opened it by now.
> >
> >IOW, you need to
> >* register region *after* you are past the point of no return
>
> That option is a lot harder than I thought. This requires an array to
> keep intermediate result of preallocated "lo" device, blk_queue, and
> disk structure before calling add_disk() or register region. And this
> array could be potentially 1 million entries. Maybe I will use
> vmalloc for it, but seems rather sick.

No, it doesn't. Really. It's easy to split; untested incremental to your
patch follows:

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0aae8d8..2300490 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1394,16 +1394,11 @@ int loop_unregister_transfer(int number)
 EXPORT_SYMBOL(loop_register_transfer);
 EXPORT_SYMBOL(loop_unregister_transfer);
 
-static struct loop_device *loop_init_one(int i)
+static struct loop_device *loop_alloc(int i)
 {
 	struct loop_device *lo;
 	struct gendisk *disk;
 
-	list_for_each_entry(lo, &loop_devices, lo_list) {
-		if (lo->lo_number == i)
-			return lo;
-	}
-
 	lo = kzalloc(sizeof(*lo), GFP_KERNEL);
 	if (!lo)
 		goto out;
@@ -1427,8 +1422,6 @@ static struct loop_device *loop_init_one(int i)
 	disk->private_data	= lo;
 	disk->queue		= lo->lo_queue;
 	sprintf(disk->disk_name, "loop%d", i);
-	add_disk(disk);
-	list_add_tail(&lo->lo_list, &loop_devices);
 	return lo;
 
 out_free_queue:
@@ -1439,6 +1432,23 @@ out:
 	return NULL;
 }
 
+static struct loop_device *loop_init_one(int i)
+{
+	struct loop_device *lo;
+
+	list_for_each_entry(lo, &loop_devices, lo_list) {
+		if (lo->lo_number == i)
+			return lo;
+	}
+
+	lo = loop_alloc(i);
+	if (lo) {
+		add_disk(lo->lo_disk);
+		list_add_tail(&lo->lo_list, &loop_devices);
+	}
+	return lo;
+}
+
 static void loop_del_one(struct loop_device *lo)
 {
 	del_gendisk(lo->lo_disk);
@@ -1481,6 +1491,7 @@ static int __init loop_init(void)
 {
 	int i, nr;
 	unsigned long range;
+	struct loop_device *lo;
 
 	/*
 	 * loop module now has a feature to instantiate underlying device
@@ -1506,19 +1517,34 @@ static int __init loop_init(void)
 
 	if (register_blkdev(LOOP_MAJOR, "loop"))
 		return -EIO;
-	blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
-				  THIS_MODULE, loop_probe, NULL, NULL);
 
 	for (i = 0; i < nr; i++) {
-		if (!loop_init_one(i))
-			goto err;
+		lo = loop_alloc(i);
+		if (!lo)
+			goto Enomem;
+		list_add_tail(&lo->lo_list, &loop_devices);
 	}
 
+	/* point of no return */
+
+	list_for_each_entry(lo, &loop_devices, lo_list)
+		add_disk(lo->lo_disk);
+
+	blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
+				  THIS_MODULE, loop_probe, NULL, NULL);
+
 	printk(KERN_INFO "loop: module loaded\n");
 	return 0;
-err:
-	loop_exit();
+
+Enomem:
 	printk(KERN_INFO "loop: out of memory\n");
+
+	while(!list_empty(&loop_devices)) {
+		lo = list_entry(loop_devices.next, struct loop_device, lo_list);
+		loop_del_one(lo);
+	}
+
+	unregister_blkdev(LOOP_MAJOR, "loop");
 	return -ENOMEM;
 }
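
The shape of the fix (allocate everything first, publish only once nothing can fail) can be illustrated without any block-layer code. What follows is a minimal userspace sketch under stated assumptions: every toy_* name is hypothetical, "publishing" is just a flag rather than add_disk(), and toy_fail_at is a test hook that simulates an out-of-memory condition. In line with the follow-up question about loop_del_one in this thread, the error path in the sketch only frees, since nothing it unwinds has been published yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a loop device; nothing here is kernel API. */
struct toy_dev {
	int number;
	bool published;		/* true once the device is visible to users */
	struct toy_dev *next;
};

struct toy_dev *toy_devices;	/* devices allocated so far, newest first */
int toy_fail_at = -1;		/* test hook: index whose allocation fails */
int toy_live;			/* allocations not yet freed */

static struct toy_dev *toy_alloc(int i)
{
	struct toy_dev *dev;

	if (i == toy_fail_at)
		return NULL;
	dev = calloc(1, sizeof(*dev));
	if (dev) {
		dev->number = i;
		toy_live++;
	}
	return dev;
}

/* Phase 1 allocates everything and may fail; phase 2 publishes and cannot. */
int toy_init(int nr)
{
	struct toy_dev *dev;
	int i;

	for (i = 0; i < nr; i++) {
		dev = toy_alloc(i);
		if (!dev)
			goto enomem;
		dev->next = toy_devices;
		toy_devices = dev;
	}

	/* point of no return */
	for (dev = toy_devices; dev; dev = dev->next)
		dev->published = true;
	return 0;

enomem:
	/* Nothing was published, so only the allocations need undoing. */
	while (toy_devices) {
		dev = toy_devices;
		assert(!dev->published);
		toy_devices = dev->next;
		free(dev);
		toy_live--;
	}
	return -1;
}
```

Setting toy_fail_at = 3 and calling toy_init(8) returns -1 with an empty list and toy_live back at 0; no device was ever marked published, so no one could have opened it.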
Re: [PATCH]serial: make early_uart to use early_param instead of console_initcall
On 5/21/07, Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
>
> I don't want to add asm-ia64/fixmap.h with dummy definitions
> just for this.
>
> Can we add this:
>
> asm-ia64/io.h: #define bt_ioremap ioremap
> asm-x86_64/io.h: #define bt_ioremap early_ioremap
>
> and use bt_ioremap instead?
>

Please check whether it works with ia64.

I added fix_ioremap in
include/asm-x86_64/io.h
include/asm-i386/io.h
include/asm-ia64/io.h

The command line will be
console=uart8250,io,0x3f8,9600n8
console=uart8250,mmio,0xffe5,115200n8

YH

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 09220a1..634d809 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -444,8 +444,16 @@ and is between 256 and 4096 characters. It is defined in the file
 			Documentation/networking/netconsole.txt for an
 			alternative.
 
-		uart,io,[,options]
-		uart,mmio,[,options]
+		uart8250,io,[,options]
+		uart8250,mmio,[,options]
+			Start an early, polled-mode console on the 8250/16550
+			UART at the specified I/O port or MMIO address,
+			switching to the matching ttyS device later. The
+			options are the same as for ttyS, above.
+
+	earlycon=	[KNL] Output early console device and options.
+		uart8250,io,[,options]
+		uart8250,mmio,[,options]
 			Start an early, polled-mode console on the 8250/16550
 			UART at the specified I/O port or MMIO address,
 			switching to the matching ttyS device later. The
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index f74dfc4..8271466 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -168,6 +168,12 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
 .section .init.text,"ax",@progbits
 #endif
 
+	/* Do an early initialization of the fixmap area */
+	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx
+	movl $(swapper_pg_pmd - __PAGE_OFFSET), %eax
+	addl $0x007, %eax			/* 0x007 = PRESENT+RW+USER */
+	movl %eax, 4092(%edx)
+
 #ifdef CONFIG_SMP
 ENTRY(startup_32_smp)
 	cld
@@ -507,6 +513,8 @@ ENTRY(_stext)
 .section ".bss.page_aligned","w"
 ENTRY(swapper_pg_dir)
 	.fill 1024,4,0
+ENTRY(swapper_pg_pmd)
+	.fill 1024,4,0
 ENTRY(empty_zero_page)
 	.fill 4096,1,0
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index eaa6a24..dd7f95b 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -390,10 +390,6 @@ early_console_setup (char *cmdline)
 	if (!efi_setup_pcdp_console(cmdline))
 		earlycons++;
 #endif
-#ifdef CONFIG_SERIAL_8250_CONSOLE
-	if (!early_serial_console_init(cmdline))
-		earlycons++;
-#endif
 
 	return (earlycons) ? 0 : -1;
 }
diff --git a/arch/x86_64/kernel/head.S b/arch/x86_64/kernel/head.S
index 1fab487..941c84b 100644
--- a/arch/x86_64/kernel/head.S
+++ b/arch/x86_64/kernel/head.S
@@ -73,7 +73,11 @@ startup_64:
 	addq	%rbp, init_level4_pgt + (511*8)(%rip)
 
 	addq	%rbp, level3_ident_pgt + 0(%rip)
+
 	addq	%rbp, level3_kernel_pgt + (510*8)(%rip)
+	addq	%rbp, level3_kernel_pgt + (511*8)(%rip)
+
+	addq	%rbp, level2_fixmap_pgt + (506*8)(%rip)
 
 	/* Add an Identity mapping if I am above 1G */
 	leaq	_text(%rip), %rdi
@@ -314,7 +318,16 @@ NEXT_PAGE(level3_kernel_pgt)
 	.fill	510,8,0
 	/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
 	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
-	.fill	1,8,0
+	.quad	level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+
+NEXT_PAGE(level2_fixmap_pgt)
+	.fill	506,8,0
+	.quad	level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+	/* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */
+	.fill	5,8,0
+
+NEXT_PAGE(level1_fixmap_pgt)
+	.fill	512,8,0
 
 NEXT_PAGE(level2_ident_pgt)
 	/* Since I easily can, map the first 1G.
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index c84dab0..5a91ac5 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -2367,6 +2367,9 @@ static struct uart_ops serial8250_pops = {
 	.request_port	= serial8250_request_port,
 	.config_port	= serial8250_config_port,
 	.verify_port	= serial8250_verify_port,
+#ifdef CONFIG_SERIAL_8250_CONSOLE
+	.find_port_for_earlycon = serial8250_find_port_for_earlycon,
+#endif
 };
 
 static struct uart_8250_port serial8250_ports[UART_NR];
@@ -2533,7 +2536,7 @@ static int __init serial8250_console_init(void)
 }
 console_initcall(serial8250_console_init);
 
-static int __init find_port(struct uart_port *p)
+int __init find_port_serial8250(struct uart_port *p)
 {
 	int line;
 	struct uart_port *port;
@@ -2546,26 +2549,6 @@ static int __init find_port(struct uart_port *p)
 	return -ENODEV;
 }
 
-int __init serial8250_start_console(struct uart_port *port, char *options)
-{
-	int line;
-
-	line = find_port(port);
-	if (line < 0)
-		return -ENODEV;
-
-	add_preferred_console("ttyS", line, options);
-	printk("Adding console on ttyS%d at %s 0x%lx (options '%s')\n",
-		line, port->iotype == UPIO_MEM ? "MMIO" : "I/O port",
-		port->iotype == UPIO_MEM ? (unsigned long) port->mapbase :
-
PCI device problem - MMCONFIG, cannot allocate resource region, resource collisions
The kernel has a problem allocating resources for my PCI NIC. Here is
what the kernel is reporting:

# uname -a
Linux wopr 2.6.20-gentoo-r8 #7 SMP Sun May 20 20:56:56 PDT 2007 i686 AMD
Athlon(tm) 64 X2 Dual Core Processor 3800+ AuthenticAMD GNU/Linux

# dmesg
...
PCI: BIOS Bug: MCFG area at e000 is not E820-reserved
PCI: Not using MMCONFIG.
...
PCI: Cannot allocate resource region 0 of device :02:02.0
...
PCI: Device :02:02.0 not available because of resource collisions
skge: :02:02.0 cannot enable PCI device
skge: probe of :02:02.0 failed with error -22
...

I have seen other posts reporting similar error messages. I would like
to help resolve this problem, and I can do some testing if needed. More
info:

Kernel boot params: pci=nomsi

PCI Device: D-Link DGE-530T (10/100/1000 Gigabit Desktop Adapter)
http://www.dlink.com/products/?pid=284

Motherboard: MSI K9AGM-FID (AMD Socket AM2) Chipset: SB600 / RS485
http://www.msicomputer.com/product/p_spec.asp?model=K9AGM-FID&class=mb

# lspci -vvv
...
02:02.0 Non-VGA unclassified device: D-Link System Inc Unknown device 4901 (rev 11)
	Subsystem: D-Link System Inc Unknown device 4901
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR-
	Latency: 64 (5750ns min, 7250ns max), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at (32-bit, non-prefetchable)
	Region 1: I/O ports at b800 [size=256]
	Expansion ROM at [disabled]
	Capabilities: [48] Power Management version 2
		Flags: PMEClk- DSI- D1- D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [50] Vital Product Data
...

Thanks,
Wayne
Re: [rfc] increase struct page size?!
On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
>> ... yeah, something like that would bypass

On Mon, May 21, 2007 at 05:43:16PM -0500, Matt Mackall wrote:
> As long as we're throwing out crazy unpopular ideas, try this one:
> Divide struct page in two such that all the most commonly used
> elements are in one piece that's nicely sized and the rest are in
> another. Have two parallel arrays containing these pieces and accessor
> functions around the unpopular bits.
> Whether a sensible divide between popular and unpopular bits isn't
> clear to me. But hey, I said it was crazy.

I have a crazier and even less popular idea. Eliminate struct page
entirely as an accounting structure (and, of course, mem_map with it).
Filesystems can keep the per-page metadata they need in their own
accounting structures, slab mutatis mutandis, etc. The brilliant bit
here is that devolving the accounting structures this way allows the
fs and/or subsystem to arrange for strong cache locality, file offset
adjacency to imply memory adjacency of the page accounting fields,
etc., where grabbing random structures out of some array is a real
cache thrasher.

The page allocation and page replacement algorithms would have to be
adjusted, and things would have to allocate their own refcounts,
supposing they want/need refcounts, but it's not so far out. Refer to
filesystem pages by pairs, refer to slab pages by
address (virtual and physical are trivially inter-convertible), mock
up something akin to what filesystems do for anonymous pages, etc.

The real objection everyone's going to have is that driver writers
will stain their shorts when faced with the rules for handling such
things. The thing is, I'm not entirely sure who these driver writers
that would have such trouble are, since the driver writers I know
personally are sophisticates rather than walking disaster areas as such
would imply. I suppose they may not be representative of the whole.

-- wli

P.S. This idea is not plucked out of the air; it has precedents. A
number of microkernels do this, and IIRC k42 does so also.
Re: [stable] [PATCH] - fix oops in sysfs_readdir
Andrew Morton wrote:
> Actually, someone (eg distros) looking at Tejun's changelog would still be
> struggling to answer the question "do I need this". The one thing it
> claims to fix is "duplicate inode numbers". But why is that a problem?
> What are the user-visible consequences of not merging the patch? Unobvious.

The oops part is explained in #2. sysfs_dirent->s_dentry can go away anytime and the original code accesses it without any synchronization, so it can end up dereferencing NULL or access already freed memory.

And, yeah, this is another place where reclaim-related oops occurs.

-- tejun
Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree
On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:
On Mon, May 21, 2007 at 03:00:55PM -0700, [EMAIL PROTECTED] wrote:
> +    if (register_blkdev(LOOP_MAJOR, "loop"))
> +        return -EIO;
> +    blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> +                        THIS_MODULE, loop_probe, NULL, NULL);
> +
> +    for (i = 0; i < nr; i++) {
> +        if (!loop_init_one(i))
> +            goto err;
> +    }
> +
> +    printk(KERN_INFO "loop: module loaded\n");
> +    return 0;
> +err:
> +    loop_exit();

This isn't good. You *can't* fail once a single disk has been registered. Anyone could've opened it by now. IOW, you need to
* register region *after* you are past the point of no return

That option is a lot harder than I thought. This requires an array to keep intermediate result of preallocated "lo" device, blk_queue, and disk structure before calling add_disk() or register region. And this array could be potentially 1 million entries. Maybe I will use vmalloc for it, but seems rather sick.
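The ordering Al is asking for can be sketched in plain userspace C. This is only an illustration of "do everything that can fail first, then publish", not the loop driver's code: `struct dev`, `init_devices()` and the `fail_at` test hook are made-up names, and setting `published` merely stands in for add_disk() or region registration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a preallocated device; not a kernel structure. */
struct dev {
    int id;
    int published;   /* set only once the device is visible to users */
};

/*
 * Phase 1 allocates every device and may fail; nothing is visible yet,
 * so unwinding is safe. Phase 2 publishes and has no failure path.
 * fail_at is a hypothetical test hook: the allocation for that index is
 * forced to fail (pass -1 for no forced failure).
 */
int init_devices(struct dev **out, int nr, int fail_at)
{
    int i;

    for (i = 0; i < nr; i++) {
        out[i] = (i == fail_at) ? NULL : calloc(1, sizeof(*out[i]));
        if (!out[i])
            goto unwind;
        out[i]->id = i;
    }

    /* Point of no return: from here on nothing is allowed to fail. */
    for (i = 0; i < nr; i++)
        out[i]->published = 1;
    return 0;

unwind:
    /* Free only what was allocated before the failing index. */
    while (i-- > 0) {
        free(out[i]);
        out[i] = NULL;
    }
    return -1;
}
```

The `out` array plays the role of the intermediate array mentioned in the reply above, which is exactly the part that gets expensive when the device count can reach a million.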
Re: select(0, ..) is valid ?
On 5/18/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
On Wednesday 16 May 2007 17:37, Anton Blanchard wrote:
> Hi Hugh,
>
> > It's interesting that compat_core_sys_select() shows this kmalloc(0)
> > failure but core_sys_select() does not. That's because core_sys_select()
> > avoids kmalloc by using a buffer on the stack for small allocations (and
> > 0 sure is small). Shouldn't compat_core_sys_select() do just the same?
> > Or is SLUB going to be so efficient that doing so is a waste of time?
>
> Nice catch, the original optimisation from Andi is:
>
> http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f
>
> And I think it makes sense for the compat code to do it too.

Yes agreed. I just forgot the copy'n'pasted code when doing the original change.

Is this headed upstream? It's causing some noise on test.kernel.org now that SLAB is also warning about kmalloc(0).

Thanks,
Nish
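For readers who have not looked at the optimisation being discussed, the general pattern is roughly the following. This is a userspace sketch, not the code from core_sys_select(): `STACK_BUF_BYTES` and `sum_with_scratch()` are invented for illustration, and malloc() stands in for kmalloc().

```c
#include <stdlib.h>
#include <string.h>

#define STACK_BUF_BYTES 256   /* hypothetical threshold, chosen for the example */

/*
 * Copies n bytes into a scratch buffer and returns their sum.
 * Small requests (including n == 0) use the on-stack buffer, so the
 * allocator is never asked for zero bytes. Returns -1 if a large
 * request cannot be allocated.
 */
long sum_with_scratch(const unsigned char *src, size_t n)
{
    unsigned char stack_buf[STACK_BUF_BYTES];
    unsigned char *buf = stack_buf;
    long total = 0;
    size_t i;

    if (n > sizeof(stack_buf)) {
        buf = malloc(n);            /* only large requests hit the heap */
        if (!buf)
            return -1;
    }

    if (n)
        memcpy(buf, src, n);
    for (i = 0; i < n; i++)
        total += buf[i];

    if (buf != stack_buf)
        free(buf);
    return total;
}
```

With this shape a zero-length request never reaches the allocator at all, which is why the non-compat path did not trip the kmalloc(0) warning.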
Re: [stable] [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes
* Alan Cox ([EMAIL PROTECTED]) wrote:
> Yeah - fix your mailer, you got a reply 5 days ago.

Sure wouldn't be the first time something broke. I'll take a look.

thanks,
-chris
Re: [rfc] increase struct page size?!
On Tue, 22 May 2007, Nick Piggin wrote:
> That would be unpopular with pagecache, because that uses pretty well
> all fields.

SLUB also uses all fields
[PATCH] Fix headers check fallout
commit e8edc6e03a5c8562dc70a6d969f732bdb355a7e7 added an include of linux/jiffies.h in linux/smb_fs.h outside the ifdef __KERNEL__.

Signed-off-by: Stephen Rothwell <[EMAIL PROTECTED]>
---
 include/linux/smb_fs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

-- 
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]

diff --git a/include/linux/smb_fs.h b/include/linux/smb_fs.h
index 6b51a48..2c5cd55 100644
--- a/include/linux/smb_fs.h
+++ b/include/linux/smb_fs.h
@@ -9,7 +9,6 @@
 #ifndef _LINUX_SMB_FS_H
 #define _LINUX_SMB_FS_H
 
-#include
 #include
 
 /*
@@ -30,6 +29,7 @@
 #include
 #include
 #include
+#include
 #include
 
 static inline struct smb_sb_info *SMB_SB(struct super_block *sb)
-- 
1.5.1.4
Re: [rfc] increase struct page size?!
On Mon, May 21, 2007 at 05:43:16PM -0500, Matt Mackall wrote:
> On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
> >
> > ... yeah, something like that would bypass
>
> As long as we're throwing out crazy unpopular ideas, try this one:
>
> Divide struct page in two such that all the most commonly used
> elements are in one piece that's nicely sized and the rest are in
> another. Have two parallel arrays containing these pieces and accessor
> functions around the unpopular bits.
>
> Whether a sensible divide between popular and unpopular bits isn't
> clear to me. But hey, I said it was crazy.

That would be unpopular with pagecache, because that uses pretty well all fields.
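To make the parallel-array idea concrete, here is a rough sketch of what the split with accessors could look like. Everything in it is hypothetical: the field names, the hot/cold division and `NR_PAGES` are invented for the example and say nothing about which struct page fields would actually belong where.

```c
#include <stddef.h>

#define NR_PAGES 1024   /* hypothetical number of page frames */

/* Frequently touched fields, kept small so more entries fit per cacheline. */
struct page_hot {
    unsigned long flags;
    int refcount;
};

/* Rarely touched fields, stored in a second array at the same index. */
struct page_cold {
    unsigned long private_data;
    void *mapping;
};

static struct page_hot hot_map[NR_PAGES];
static struct page_cold cold_map[NR_PAGES];

/* Accessor for the "unpopular" half: same index, different array. */
static inline struct page_cold *page_cold_of(const struct page_hot *hot)
{
    return &cold_map[hot - hot_map];
}
```

The objection raised above still applies to the sketch: a user like the pagecache that touches nearly every field pays for two cachelines instead of one.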
Re: [Patch] Off by one in floppy.c
On Tue, 22 May 2007 00:57:56 +0200, Eric Sesterhenn / Snakebyte <[EMAIL PROTECTED]> wrote:

> http://marc.info/?l=linux-kernel&m=115144559823592&w=2

Shows how much we care about floppy... It's going to be a year old soon.

> +++ linux-2.6/drivers/block/floppy.c 2007-05-22 00:54:18.0 +0200
> @@ -670,7 +670,7 @@ static void __reschedule_timeout(int dri
>      if (drive == current_reqD)
>          drive = current_drive;
>      del_timer(&fd_timeout);
> -    if (drive < 0 || drive > N_DRIVE) {
> +    if (drive < 0 || drive >= N_DRIVE) {

You need to find someone willing to take this. Maybe Andrew.

-- Pete
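The off-by-one in the quoted hunk is the usual one: for an array of N_DRIVE entries the valid indices are 0 through N_DRIVE - 1, so the rejection test has to use `>=`. A minimal standalone check, assuming a value of 8 for N_DRIVE purely for illustration (`drive_index_valid()` is a made-up helper, not a function in floppy.c):

```c
#include <stdbool.h>

#define N_DRIVE 8   /* assumed value, for illustration only */

/* True when drive can safely index an array declared with N_DRIVE entries. */
static bool drive_index_valid(int drive)
{
    return drive >= 0 && drive < N_DRIVE;
}
```

With the old `drive > N_DRIVE` test, a value equal to N_DRIVE would pass the check and index one element past the end of the array.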
Re: kconfig - scan all Kconfig files
> A simple example would be > help texts, right now they are per symbol, but they should really be per > menu, so archs can provide different help texts for something. This one turned out easy. I assume what you had in mind was something like the attached. With this the help entry present is no loger tha last one seen but the one belonging to the menu (the symbol used within that menu). So if I have: menu "My first menu" config FOO bool "Foobar" help First menu help endmenu menu "My second menu" config FOO bool "barfoo" help Second menu help endmenu Then the help text will be the expected one in the two menus. gconf + qconf will be fixed later if you are OK with this one. This is IMO a general improvement and should be finished and applied. Sam diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c index 1199baf..c2ca5fc 100644 --- a/scripts/kconfig/conf.c +++ b/scripts/kconfig/conf.c @@ -187,9 +187,9 @@ int conf_string(struct menu *menu) /* print help */ if (line[1] == '\n') { help = nohelp_text; - if (menu->sym->help) - help = menu->sym->help; - printf("\n%s\n", menu->sym->help); + if (menu->help) + help = menu->help; + printf("\n%s\n", menu->help); def = NULL; break; } @@ -233,7 +233,7 @@ static int conf_sym(struct menu *menu) printf("/m"); if (oldval != yes && sym_tristate_within_range(sym, yes)) printf("/y"); - if (sym->help) + if (menu->help) printf("/?"); printf("] "); conf_askvalue(sym, sym_get_string_value(sym)); @@ -270,8 +270,8 @@ static int conf_sym(struct menu *menu) return 0; help: help = nohelp_text; - if (sym->help) - help = sym->help; + if (menu->help) + help = menu->help; printf("\n%s\n", help); } } @@ -342,7 +342,7 @@ static int conf_choice(struct menu *menu) goto conf_childs; } printf("[1-%d", cnt); - if (sym->help) + if (menu->help) printf("?"); printf("]: "); switch (input_mode) { @@ -359,8 +359,8 @@ static int conf_choice(struct menu *menu) fgets(line, 128, stdin); strip(line); if (line[0] == '?') { - printf("\n%s\n", menu->sym->help ? 
- menu->sym->help : nohelp_text); + printf("\n%s\n", menu->help ? + menu->help : nohelp_text); continue; } if (!line[0]) @@ -391,8 +391,8 @@ static int conf_choice(struct menu *menu) if (!child) continue; if (line[strlen(line) - 1] == '?') { - printf("\n%s\n", child->sym->help ? - child->sym->help : nohelp_text); + printf("\n%s\n", child->help ? + child->help : nohelp_text); continue; } sym_set_choice_value(sym, child->sym); diff --git a/scripts/kconfig/expr.h b/scripts/kconfig/expr.h index 6084525..0b22c73 100644 --- a/scripts/kconfig/expr.h +++ b/scripts/kconfig/expr.h @@ -71,7 +71,7 @@ enum { struct symbol { struct symbol *next; char *name; - char *help; + //char *help; enum symbol_type type; struct symbol_value curr; struct symbol_value def[4]; @@ -139,7 +139,7 @@ struct menu { struct property *prompt; struct expr *dep; unsigned int flags; - //char *help; + char *help; struct file *file; int lineno; void *data; diff --git a/scripts/kconfig/kxgettext.c b/scripts/kconfig/kxgettext.c index abee55c..4b0aaa1 100644 --- a/scripts/kconfig/kxgettext.c +++ b/scripts/kconfig/kxgettext.c @@ -170,8 +170,8 @@ void menu_build_message_list(struct menu *menu) menu->file == NULL ? "Root Menu" : menu->file->name, menu->lineno); - if (menu->sym != NULL && menu->sym->help != NULL) - message__add(menu->sym->help, menu->sym->name, + if (menu->sym != NULL && menu->help != NULL) + message__add(menu->help, menu->sym->name,
Re: [rfc] increase struct page size?!
On Mon, 21 May 2007 17:38:58 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Tue, 22 May 2007, KAMEZAWA Hiroyuki wrote:
>
> > For i386(32bit arch), there is not enough space for vmemmap.
>
> I thought 32 bit would use flatmem? Is memory really sparse on 32
> bit? Likely difficult due to lack of address space?
>

Of course, i386 can use flatmem. I am just concerned that memory hotplug is supported only with sparsemem. But I also think we can add memory-hotplug for flatmem if necessary. (I myself have no plan now. I wonder whether a memory power-save mode may be supported by chipsets.)

-Kame
Re: [stable] [PATCH] - fix oops in sysfs_readdir
On Mon, 21 May 2007 19:18:55 -0500 Eric Sandeen <[EMAIL PROTECTED]> wrote: > Andrew Morton wrote: > > On Mon, 21 May 2007 13:11:21 -0500 > > Eric Sandeen <[EMAIL PROTECTED]> wrote: > > > >> This is a non-ida backport of Tejun's patch in -mm at: > >> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch > >> for the 2.6.16 -stable tree - it follows the same scheme of using s_ino to > >> safely > >> store & retrieve the inode number of sysfs entries for use in > >> sysfs_readdir, > >> but uses a brain-dead-simple inode nr allocator rather than ida, which > >> would > >> bring along a lot of newer, more complex code. > >> > >> No, this doesn't guarantee uniqueness of sysfs inode numbers, but then > >> the code in -stable today doesn't either - and with this change, at least > >> it shouldn't oops. > > > > So I'm sitting here whether to commend this patch to google kernel > > maintainers > > for 2.6.18 backport, but I realise I don't know what it does. And I don't > > know > > if it fixes the reclaim-time oopses they were intermittently seeing, or if > > it > > fixes something else and if so what that is. > > > > Sigh. Better changelogs, please. > > > > Sorry Andrew. I referenced Tejun's upstream patch in -mm which has a > nice changelog etc, and this is a backport of that, and does the same > thing in the same way and solves the same problem - but that doesn't > help if you just want to toss this message into your patch stack. Will > fix up & resend. > Actually, someone (eg distros) looking at Tejun's changelog would still be struggling to answer the question "do I need this". The one thing it claims to fix is "duplicate inode numbers". But why is that a problem? What are the user-visible consequences of not merging the patch? Unobvious. 
Re: [RFC] enhancing the kernel's graphics subsystem
> So why do you want it in kernel security is not the sensible answer
> here.

I'm not proposing the KGI solution where every device driver presents the same API. That model does require a lot of code in the kernel. The existing DRM model where each driver provides its own API is a good one. The user space DRI driver then takes this API and turns it into a standard one. Applying the DRM style model to fbdev may allow parts of fbdev to be moved out to user space.

What I don't want is a permanent root priv process hanging around in the system. It simply isn't needed and I have prototyped a system that runs without root so I know it can be done. With minor mods DRI can run without the need for root, with more major mods the X server can run without the need for root. Most of the mods to the X server are to remove things like PCI bus probing, mode setting and VBIOS support.

-- Jon Smirl [EMAIL PROTECTED]
Re: [rfc] increase struct page size?!
On Mon, May 21, 2007 at 04:26:03AM -0700, William Lee Irwin III wrote: > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > >> Choosing k distinct integers (mem_map array indices) from the interval > >> [0,n-1] results in k(n-k+1)/n non-adjacent intervals of contiguous > >> array indices on average. The average interval length is > >> (n+1)/(n-k+1) - 1/C(n,k). Alignment considerations make going much > >> further somewhat hairy, but it should be clear that contiguity arising > >> from random choice is non-negligible. > > On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote: > > That doesn't say anything about temporal locality, though. > > It doesn't need to. If what's in the cache is uniformly distributed, > you get that result for spatial locality. From there, it's counting > cachelines. OK, so your 'k' is the number of struct pages that are in cache? Then that's fine. I'm not sure how many that is going to be, but I would be surprised if it were a significant proportion of mem_map, even on not-so-large memory systems. > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > >> In any event, I don't have all that much of an objection to what's > >> actually proposed, just this particular cache footprint argument. > >> One can motivate increases in sizeof(struct page), but not this way. > > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > > Realise that you have to have a run of I think at least 7 or 8 contiguous > > pages and temporally close references in order to save a single cacheline. > > Then also that if the page being touched is not partially in cache from > > an earlier access, then it is statistically going to cost more lines to > > touch it (up to 75% if you touch the first and the last field, obviously 0% > > if you only touch a single field, but that's unlikely given that you > > usually take a reference then do at least something else like check flags). 
> > I think the problem with the cache footprint argument is just whether > > it makes any significant difference to performance. But.. > > The average interval ("run") length is (n+1)/(n-k+1) - 1/C(n,k), so for > that to be >= 8 you need (n+1)/(n-k+1) - 1/C(n,k) >= 8 which also happens > when (n+1)/(n-k+1) >= 9 or when n >= (9/8)*k - 1 or k <= (8/9)*(n+1). > Clearly a lower bound on k is required, but not obviously derivable. > k >= 8 is obvious, but the least k where (n+1)/(n-k+1) - 1/C(n,k) >= 8 > is not entirely obvious. Numerically solving for the least such k finds > that k actually needs to be relatively close to (8/9)*n. A lower bound > of something like 0.87*n + O(1) probably holds. Ah, you worked it out... yeah I'd guess this is going to be pretty difficult a condition to satisfy (given that it isn't possible for a 4GB system, even if you had 32MB of cache to fill entirely with struct pages). > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > >> Now that I've been informed of the ->_count and ->_mapcount issues, > >> I'd say that they're grave and should be corrected even at the cost > >> of sizeof(struct page). > > On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote: > > ... yeah, something like that would bypass > > Did you get cut off here? Must have. I was going to say it would bypass the whole speed/size discussion anyway :P - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
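The first formula quoted in this thread, k(n-k+1)/n for the average number of contiguous runs when k of n array indices are chosen uniformly at random, is easy to check by brute force for small n. The sketch below enumerates every k-subset as a bitmask and averages the run counts; `count_runs()`, `popcount_n()` and `mean_runs()` are illustrative names, and the approach is only practical for n up to about 20. It does not attempt to verify the run-length expression.

```c
/* Number of maximal runs of consecutive set bits among the low n bits. */
static int count_runs(unsigned mask, int n)
{
    int runs = 0;
    int i;

    for (i = 0; i < n; i++) {
        int cur = (mask >> i) & 1;
        int prev = i ? (int)((mask >> (i - 1)) & 1) : 0;

        if (cur && !prev)   /* a run starts where a set bit follows a clear one */
            runs++;
    }
    return runs;
}

/* Number of set bits among the low n bits. */
static int popcount_n(unsigned mask, int n)
{
    int bits = 0;
    int i;

    for (i = 0; i < n; i++)
        bits += (mask >> i) & 1;
    return bits;
}

/* Mean run count over all k-element subsets of {0, ..., n-1}; n <= 20. */
static double mean_runs(int n, int k)
{
    unsigned long total = 0, subsets = 0;
    unsigned mask;

    for (mask = 0; mask < (1u << n); mask++) {
        if (popcount_n(mask, n) != k)
            continue;
        total += (unsigned long)count_runs(mask, n);
        subsets++;
    }
    return subsets ? (double)total / (double)subsets : 0.0;
}
```

For n = 10 and k = 3 the formula predicts 3 * 8 / 10 = 2.4 runs on average, and the enumeration can be compared against that value directly.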
Re: [RFC] enhancing the kernel's graphics subsystem
On Tue, 2007-05-22 at 10:09 +1000, Benjamin Herrenschmidt wrote:
> I do stongly beleive that the decision of what mode to choose should not
> be made in the kernel.

That's the plan; the kernel just provides mechanism. The architecture used in the X server splits precisely at this point with the mechanism in the driver and the configuration and policy up in the X server proper. Quite a bit of that code could be broken out into a shared library for fbdev-based apps and the X server to share, but that's down the road a bit after the kernel APIs look a lot more solid.

With the goal of getting to a single-mode-set boot to avoid screen flashing before login, the key here is to make any early user-mode graphics apps share the same kernel graphics infrastructure as the X server to identify the common cases where the startup and X modes are the same and avoid resetting the configuration.

-- [EMAIL PROTECTED]
Re: [RFC] enhancing the kernel's graphics subsystem
On 5/21/07, Alan Cox <[EMAIL PROTECTED]> wrote:
> > the kernel, it can be a lot smaller than X and auditable.. sticking
> > the DRI protocol in the kernel is just pointless..
>
> It is a quite sensible idea.
>
> The userspace X server SHOULD be running under a non-root user, with
> appropriate fine-grained privs granted to it.
>
> "I need root to do graphics" is a myopic, antiquated view of the world.

X server: priviledges below everything, pageable
kernel: priviledges as high as conceivable, non-pageable

So why do you want it in kernel security is not the sensible answer here.

Have you inspected the multi-megabyte X server for security holes to the same level the kernel has been inspected? The only part that needs to be in the kernel driver is the code controlling locking and code that plays with the hardware. Moving it into the driver ensures that only the minimal amount of root priv code possible is going to end up in the system. If someone tries to move too much into the kernel I'm sure you'll let them know that it's a bad idea.

The problem right now is that code that needs root priv is all intertwined with code that doesn't need it and it all ends up getting run as root.

BTW, when I prototyped this a couple of years ago by merging Radeon DRM/fbdev I only needed to add about 10K more code to the device driver. Most of that was associated with getting the VBIOS to run in x86 mode when the driver was first loaded. That code can be marked __init. We're not talking about a lot of code needing to go into the kernel.

-- Jon Smirl [EMAIL PROTECTED]
Re: [rfc] increase struct page size?!
On Tue, 22 May 2007, KAMEZAWA Hiroyuki wrote:
> For i386(32bit arch), there is not enough space for vmemmap.

I thought 32 bit would use flatmem? Is memory really sparse on 32 bit? Likely difficult due to lack of address space?

> For 64bit arch, page flags are not exhausted yet.

Right.
Re: [stable] [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes
> It's there specifically to fish out why it was sent to -stable w/out
> ever making it upstream. Having sent the same question w/ no response
> 5 days ago

Yeah - fix your mailer, you got a reply 5 days ago.

Alan
Re: [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes
On Mon, 21 May 2007 16:18:25 -0700 (PDT) Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> On Mon, 21 May 2007, Chris Wright wrote:
> > ---
> > [chrisw: Why is this not upstream yet?]
>
> And equally importantly, why is it even in the stable queue if it's not
> upstream.

It's not relevant to upstream - upstream has different updates which removed the bug but not in a clean "backport this" way.

> It's against stable rules, and it means that we may have stuff that gets
> fixed in -stable and not in -upstream, if people don't notice. THAT IS
> BAD

Then the rules are stupid in this specific case.

Alan
Re: [RFC] enhancing the kernel's graphics subsystem
Alan Cox wrote: the kernel, it can be a lot smaller than X and auditable.. sticking the DRI protocol in the kernel is just pointless.. It is a quite sensible idea. The userspace X server SHOULD be running under a non-root user, with appropriate fine-grained privs granted to it. "I need root to do graphics" is a myopic, antiquated view of the world. X server: priviledges below everything, pageable kernel: priviledges as high as conceivable, non-pageable So why do you want it in kernel security is not the sensible answer here.

Replying/quoting mixup. I was responding to the root-privs userspace aspect, not the "put it in the kernel" aspect. I do not want it in the kernel (should have snipped that last quoted line).

Jeff
Re: IDE/ATA: Intel i865-based mainboard, CDROM not detected
> Are we talking CONFIG_PATA_MARVELL here? If so then the kernel I just
> booted has this set to "y" (ie: built-in) and yet the drive is still not
> detected. Is there a newer version of this driver somewhere? The kernel
> was 2.6.22-rc2.

Should be current. It's known to work fine for that chip so you might need to do some debugging and provide more detail.
Re: [rfc] increase struct page size?!
On Mon, 21 May 2007 10:08:06 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Sun, 20 May 2007, Andi Kleen wrote:
>
> > Besides with the scarcity of pageflags it might make sense to do "64 bit
> > only" flags at some point.
>
> There is no scarcity of page flags. There is
>
> 1. Hoarding by Andrew
>
> 2. Waste by Sparsemem (section flags no longer necessary with
>    virtual memmap)

For i386(32bit arch), there is not enough space for vmemmap.
For 64bit arch, page flags are not exhausted yet.

-Kame
Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
On Tue, May 22, 2007 at 12:42:00AM +0200, Pavel Machek wrote: > On Mon 2007-05-21 14:45:53, Matthew Garrett wrote: > > So don't do it badly. The advantage of doing so is that you can make it > > work properly, which you can't by putting it in the kernel. > > You want stuff like critical shutdowns to work even if userspace is > dead. I don't think anyone suggested putting the critical shutdown control in userspace. The kernel already handles that fine. > I do not think you can control passive cooling adequately from > userspace, and you can certainly not prevent kernel from slowing > machine down too soon. Given the choice between something impossible and something difficult, I'm inclined towards picking the difficult one. > Plus, this is actually nasty user-visible change, and a regression > from 2.6.21. I am not sure why we are even debating this; user-kernel > interface changed without warning. Patch should be simply reverted. In http://lkml.org/lkml/2007/1/27/93 you were more than happy to break an interface even though it could be fixed in a (ugly) way that made it work again. Here, there's no way to fix this properly - the platform will quite happily do things based on what it believes the trip points should be, and one of those things may be to alter the trip points. Imagine the following situation: 1) Platform sets critical shutdown trip point to 85C 2) Userspace sets critical shutdown trip point to 95C 3) Temperature reaches 90C 4) Platform forces reevaluation of trip points 5) Entire invasion fleet is lost How do you avoid that? Disable the ability for the platform to set trip points? You're breaking the spec and potentially causing hardware damage. If you have specific hardware that requires specific spec breakage, then a better approach would probably be to quirk the kernel to rectify it. On the other hand, if it works with the Other Leading OS, we ought to be able to just fix the problem properly. 
-- Matthew Garrett | [EMAIL PROTECTED]
Re: [RFC] enhancing the kernel's graphics subsystem
> > the kernel, it can be a lot smaller than X and auditable.. sticking
> > the DRI protocol in the kernel is just pointless..
>
> It is a quite sensible idea.
>
> The userspace X server SHOULD be running under a non-root user, with
> appropriate fine-grained privs granted to it.
>
> "I need root to do graphics" is a myopic, antiquated view of the world.

X server: privileges below everything, pageable
kernel: privileges as high as conceivable, non-pageable

So why do you want it in the kernel? Security is not the sensible answer here.
[git patches] libata fixes and administrivia
Two fixes; the rest is trivial pre-release administrivia (ie. only bump versions and chomp whitespace after everything else gets merged up) Please pull from 'upstream-linus' branch of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git upstream-linus to receive the following updates: drivers/ata/Kconfig|2 +- drivers/ata/ahci.c | 13 +++-- drivers/ata/ata_generic.c |2 +- drivers/ata/ata_piix.c |9 + drivers/ata/libata-core.c | 14 ++ drivers/ata/libata-eh.c|2 +- drivers/ata/pata_artop.c |4 ++-- drivers/ata/pata_cmd640.c |4 ++-- drivers/ata/pata_cmd64x.c |2 +- drivers/ata/pata_cs5520.c |2 +- drivers/ata/pata_cs5530.c |2 +- drivers/ata/pata_cs5535.c |2 +- drivers/ata/pata_cypress.c |2 +- drivers/ata/pata_hpt366.c | 28 +--- drivers/ata/pata_hpt37x.c | 10 +- drivers/ata/pata_hpt3x3.c |2 +- drivers/ata/pata_isapnp.c |2 +- drivers/ata/pata_it8213.c |2 +- drivers/ata/pata_ixp4xx_cf.c |2 +- drivers/ata/pata_jmicron.c |2 +- drivers/ata/pata_legacy.c |2 +- drivers/ata/pata_platform.c|2 +- drivers/ata/pata_qdi.c |2 +- drivers/ata/pata_rz1000.c |2 +- drivers/ata/pata_sc1200.c |2 +- drivers/ata/pata_scc.c |2 +- drivers/ata/pata_serverworks.c |2 +- drivers/ata/pata_sl82c105.c|2 +- drivers/ata/pata_winbond.c |2 +- drivers/ata/pdc_adma.c |2 +- drivers/ata/sata_inic162x.c|2 +- drivers/ata/sata_mv.c |2 +- drivers/ata/sata_nv.c |6 +++--- drivers/ata/sata_qstor.c |2 +- drivers/ata/sata_sil.c |2 +- drivers/ata/sata_sil24.c |2 +- drivers/ata/sata_sis.c |2 +- drivers/ata/sata_svw.c |2 +- drivers/ata/sata_sx4.c |2 +- drivers/ata/sata_uli.c |2 +- drivers/ata/sata_via.c |2 +- drivers/ata/sata_vsc.c |2 +- include/linux/libata.h |2 -- 43 files changed, 65 insertions(+), 93 deletions(-) Alan Cox (3): pata_hpt366: Enable bits are unreliable so don't use them ata_piix: clean up libata: Kiss post_set_mode goodbye Dave Jones (1): libata: Add Seagate STT2A to DMA blacklist. 
Jeff Garzik (2): libata: Trim trailing whitespace libata: bump versions Tejun Heo (1): ahci: disable 64bit dma on sb600 diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig index ad1f59c..b4a8d60 100644 --- a/drivers/ata/Kconfig +++ b/drivers/ata/Kconfig @@ -132,7 +132,7 @@ config SATA_SIS depends on PCI select PATA_SIS help - This option enables support for SiS Serial ATA on + This option enables support for SiS Serial ATA on SiS 964/965/966/180 and Parallel ATA on SiS 180. The PATA support for SiS 180 requires additionally to enable the PATA_SIS driver in the config. diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index e00e1b9..7baeaff 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -46,7 +46,7 @@ #include #define DRV_NAME "ahci" -#define DRV_VERSION"2.1" +#define DRV_VERSION"2.2" enum { @@ -170,6 +170,7 @@ enum { AHCI_FLAG_IGN_IRQ_IF_ERR= (1 << 25), /* ignore IRQ_IF_ERR */ AHCI_FLAG_HONOR_PI = (1 << 26), /* honor PORTS_IMPL */ AHCI_FLAG_IGN_SERR_INTERNAL = (1 << 27), /* ignore SERR_INTERNAL */ + AHCI_FLAG_32BIT_ONLY= (1 << 28), /* force 32bit */ AHCI_FLAG_COMMON= ATA_FLAG_SATA | ATA_FLAG_NO_LEGACY | ATA_FLAG_MMIO | ATA_FLAG_PIO_DMA | @@ -354,7 +355,8 @@ static const struct ata_port_info ahci_port_info[] = { /* board_ahci_sb600 */ { .flags = AHCI_FLAG_COMMON | - AHCI_FLAG_IGN_SERR_INTERNAL, + AHCI_FLAG_IGN_SERR_INTERNAL | + AHCI_FLAG_32BIT_ONLY, .pio_mask = 0x1f, /* pio0-4 */ .udma_mask = 0x7f, /* udma0-6 ; FIXME */ .port_ops = &ahci_ops, @@ -492,6 +494,13 @@ static void ahci_save_initial_config(struct pci_dev *pdev, hpriv->saved_cap = cap = readl(mmio + HOST_CAP); hpriv->saved_port_map = port_map = readl(mmio + HOST_PORTS_IMPL); + /* some chips lie about 64bit support */ + if ((cap & HOST_CAP_64) && (pi->flags & AHCI_FLAG_32BIT_ONLY)) { + dev_printk(KERN_INFO, &pdev->dev, + "controller can't do 64bit DMA, forcing 32bit\n"); + cap &= ~HOST_CAP_64; + } + /* fixup zero port_map */ if (!port_map) { port_map = (1 << ahci_
Re: [1/5] 2.6.22-rc2: known regressions
Hey there,

On 5/19/07, Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> Here is a list of some known regressions in 2.6.22-rc2.
>
> Feel free to add new regressions/remove fixed etc.
> http://kernelnewbies.org/known_regressions
>
> Subject    : nx6125 has lost fan control
> References : http://lkml.org/lkml/2007/5/16/249
> Submitter  : Ray Lee <[EMAIL PROTECTED]>
> Status     : Unknown

I'm withdrawing this one. At boot, it wasn't controlling the fans, but
after a suspend-resume it started up fine. My subsequent boots have also
been okay, so I'm at a loss to explain what I saw.

Regardless, given that it's hard to reproduce, I have no evidence that
it's a regression; it could just be something that's really rare.

Thanks,
Ray
Re: [RFC] enhancing the kernel's graphics subsystem
On 5/21/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
> Jon Smirl wrote:
> > 2) Address the long outstanding issue of multi-seat at the console
> > level. My solution to this is the one device per CRTC model.
>
> This is very very low priority. Pretty much nobody besides you is
> clamoring for it.
>
> > 3) Eliminate the need for a root priv controlling process. Get rid of
> > the potential for a security hole.
>
> Agreed.
>
> > 4) OOPS should always display even if in a graphics mode
>
> Agreed, and this was in the list that Jesse(?) posted.
>
> > 8) Allow multiple user space graphics systems to run. These user space
>
> Another very very low priority item. There are a lot more important
> things to work on.
>
> Linux is about what people need -right now-, not what you think Linux
> might need in the future; not what you think might be nice to have.

I am not asking that these features be implemented today. I am asking
that enough planning go into the architecture today to make sure that
these features can be built in the future without tearing up the
graphics system for a third time.

This is the essence of my complaint about this patch. The patch
introduces a new low level graphics API to the kernel. Once we put an
API in it is basically impossible to get it back out. I am not convinced
that enough planning has gone into this API yet. I'm also not convinced
that there is a transition plan in place to ensure that all drivers get
updated to this new API. The last thing we want is to maintain two
parallel sets of video drivers forever into the future. V4L2 did
something similar to this and orphaned a lot of drivers that the
distributions were forced into updating later.

Mode setting is intimately intertwined with the console. VT swapping
adds another messy layer which can and should be eliminated in a
redesign. Multi-seat and unicode add more complexity. All of this needs
to be designed as a unified system. Satisfying the needs of the X server
is the easiest piece of the puzzle.

--
Jon Smirl
[EMAIL PROTECTED]
Re: intermittant petabyte usage reported with broadcom nic
On Mon, 2 Apr 2007 11:43:19 +1000 CaT <[EMAIL PROTECTED]> wrote:
>
> I take minute by minute snapshots of network traffic by sampling
> /proc/net/dev and most of the time everything works fine. Occasionally
> though I get petabyte byte traffic and corresponding packet traffic.

We were able to reproduce the problem and confirmed that it was a DMA
problem of the statistics block. About once an hour on average, wrong
counter values will be DMA'ed to host memory. Luckily, the DMA write
stays within the intended address range so it will not corrupt other
parts of memory. Other types of DMA including traffic and buffer
descriptors are not affected.

If you happen to be reading /proc/net/dev within a second after the DMA
corruption, you'll see bogus counters. One second later and until the
next bad DMA, the counters will be normal again.

We are considering ways to work around the problem.

Thanks.
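
Until a driver-side workaround exists, a userspace poller could simply discard any sample whose delta is impossible for the link. What follows is only a minimal sketch in C under that assumption; `struct byte_sampler`, `sampler_feed()` and the `max_bytes` bound are illustrative names, not part of the driver or of any existing tool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-interface sampler state (illustrative only). */
struct byte_sampler {
    uint64_t last_good;   /* last counter value accepted as sane */
    unsigned intervals;   /* sampling intervals elapsed since last_good */
    bool     have_last;   /* false until the first sample is stored */
};

/*
 * Feed one raw byte counter as read from /proc/net/dev.
 * max_bytes is the most the link could move in ONE sampling interval
 * (for example 1 Gbit/s * 60 s / 8 bytes).
 * Returns true and stores the delta when the sample is plausible;
 * returns false and keeps the old baseline when it looks bogus.
 */
bool sampler_feed(struct byte_sampler *s, uint64_t raw,
                  uint64_t max_bytes, uint64_t *delta_out)
{
    if (!s->have_last) {
        /* First sample only establishes the baseline. */
        s->last_good = raw;
        s->intervals = 0;
        s->have_last = true;
        *delta_out = 0;
        return true;
    }

    s->intervals++;

    /* Reject counters that went backwards or jumped further than the
     * link could carry in the time since the last accepted sample. */
    if (raw < s->last_good ||
        raw - s->last_good > max_bytes * s->intervals)
        return false;

    *delta_out = raw - s->last_good;
    s->last_good = raw;
    s->intervals = 0;
    return true;
}
```

This sketch treats a backwards step as bogus, so a legitimate counter wrap would also be rejected, and a corrupted first sample would poison the baseline; a real poller would need to reset the baseline after several consecutive rejections.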
Re: [RFC] enhancing the kernel's graphics subsystem
On Mon, 2007-05-21 at 13:42 -0400, Jon Smirl wrote:
>
> When I went through the design process for all this I came to the same
> conclusion about needing a user space console process.
>
> User space console does impact on all of this because it implies that
> the current console should be be defeatured down until it becomes only
> a system recovery console and not a console for everyday use.

I do agree (heh, for once) with that in the sense that in the long run,
we should strip the kernel console down to the bare minimum to boot,
display oopses, etc... and have all the fancy stuff, unicode, VT, and
more in a userspace console process.

However, I'm a little bit worried that we'll end up with 10 competing
incompatible and inconsistent userspace console projects and -that- will
be horrible.

But it's something separate from what Dave and Jesse are trying to
address. Let's first get the foundation right and -then- we can do all
sorts of crazy things. Or maybe you can start working on a user console
project in parallel using the new APIs that Jesse and Dave are
providing ? :-)

> For example, one part of the defeaturing would be to remove the
> drawing acceleration code in the existing fbdev console drivers and to
> rework it to support accelerated drawing from the user space console
> implementation. You want the system recovery console mode to be as
> simple as possible so that it is always guaranteed to work. User space
> console is also what leads to the idea of compiling VT out of the
> kernel.

I do agree on that in the long run, but again, let's look into this
-after- we have solved the more immediate issues. We can probably kill
most of fbdev, fbcon and current VT once we have a solid userland based
replacement that isn't completely bloated (maybe with a "slim" version
that does only VGA and non-utf8 for server type apps).

> Once you decide that a user space console is needed then the per CRTC
> device node becomes more obvious since different people can be logged
> onto the different consoles.

That's irrelevant. Implementation detail.

> All of the points in the list are interrelated and the architecture
> needs to address everything as a unified whole.

And that's bullshit :-)

Ben.
Re: [stable] [PATCH] - fix oops in sysfs_readdir
Andrew Morton wrote:
> On Mon, 21 May 2007 13:11:21 -0500 Eric Sandeen <[EMAIL PROTECTED]> wrote:
>
> > This is a non-ida backport of Tejun's patch in -mm at:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch
> > for the 2.6.16 -stable tree - it follows the same scheme of using s_ino
> > to safely store & retrieve the inode number of sysfs entries for use in
> > sysfs_readdir, but uses a brain-dead-simple inode nr allocator rather
> > than ida, which would bring along a lot of newer, more complex code.
> >
> > No, this doesn't guarantee uniqueness of sysfs inode numbers, but then
> > the code in -stable today doesn't either - and with this change, at
> > least it shouldn't oops.
>
> So I'm sitting here whether to commend this patch to google kernel
> maintainers for 2.6.18 backport, but I realise I don't know what it
> does. And I don't know if it fixes the reclaim-time oopses they were
> intermittently seeing, or if it fixes something else and if so what
> that is.
>
> Sigh. Better changelogs, please.

Sorry Andrew. I referenced Tejun's upstream patch in -mm which has a
nice changelog etc, and this is a backport of that, and does the same
thing in the same way and solves the same problem - but that doesn't
help if you just want to toss this message into your patch stack.

Will fix up & resend.

-Eric
Re: error in recent patch to fs/partitions/ldm.c
On Mon, 21 May 2007 19:24:57 -0400 (EDT) Robert P. J. Day wrote:

>
> $ git show dde33348e53ecab687a9768bf5262f0b8f79b7f2
> ...
> --- a/fs/partitions/ldm.c
> +++ b/fs/partitions/ldm.c
> ...
> - (unsigned long long)ph->config_size );
> + udunsigned long long)ph->config_size);
>   ^^

Patch has already been posted to lkml:

From: Anton Altaparmakov <[EMAIL PROTECTED]>
Subject: Fix to the fix! - Re: [2.6 PATCH] LDM: Fix for Windows Vista dynamic disks
Date: Mon, 21 May 2007 21:13:37 +0100 (BST)

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree
On Mon, May 21, 2007 at 03:00:55PM -0700, [EMAIL PROTECTED] wrote:
> +	if (register_blkdev(LOOP_MAJOR, "loop"))
> +		return -EIO;
> +	blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> +				  THIS_MODULE, loop_probe, NULL, NULL);
> +
> +	for (i = 0; i < nr; i++) {
> +		if (!loop_init_one(i))
> +			goto err;
> +	}
> +
> +	printk(KERN_INFO "loop: module loaded\n");
> +	return 0;
> +err:
> +	loop_exit();

This isn't good. You *can't* fail once a single disk has been
registered. Anyone could've opened it by now.

IOW, you need to
	* register region *after* you are past the point of no return
	* either not fail on failing loop_init_one() here, or separate
	  allocations and actual add_disk().
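
The ordering asked for above can be illustrated in plain userspace C: do everything that can fail first, and only then make anything visible. This is a sketch under stated assumptions, not the loop driver's code; `struct fake_disk`, `alloc_disk_stub()`, `publish_disk_stub()`, `init_all()` and `count_published()` are made-up stand-ins for the allocation step and for add_disk().

```c
#include <stdbool.h>
#include <stdlib.h>

#define NR_DEVS 8

/* Made-up stand-in for a per-device structure. */
struct fake_disk {
    int  index;
    bool published;   /* true once "anyone could have opened it" */
};

static struct fake_disk *disks[NR_DEVS];

/* Phase 1 helper: may fail, but nothing is visible to users yet. */
static struct fake_disk *alloc_disk_stub(int i, int fail_at)
{
    struct fake_disk *d;

    if (i == fail_at)
        return NULL;            /* simulated allocation failure */
    d = calloc(1, sizeof(*d));
    if (d)
        d->index = i;
    return d;
}

/* Phase 2 helper: cannot fail; this is the point of no return. */
static void publish_disk_stub(struct fake_disk *d)
{
    d->published = true;
}

/*
 * Returns 0 on success, -1 on failure. On failure nothing has been
 * published, so freeing the partial allocations is safe.
 * fail_at < 0 means "no simulated failure".
 */
int init_all(int fail_at)
{
    int i;

    /* Phase 1: every step that can fail. */
    for (i = 0; i < NR_DEVS; i++) {
        disks[i] = alloc_disk_stub(i, fail_at);
        if (!disks[i])
            goto unwind;
    }

    /* Phase 2: publish; no failure paths from here on. */
    for (i = 0; i < NR_DEVS; i++)
        publish_disk_stub(disks[i]);
    return 0;

unwind:
    /* Free only the entries allocated before the failure. */
    while (--i >= 0) {
        free(disks[i]);
        disks[i] = NULL;
    }
    return -1;
}

/* Count how many devices are currently visible. */
int count_published(void)
{
    int i, n = 0;

    for (i = 0; i < NR_DEVS; i++)
        if (disks[i] && disks[i]->published)
            n++;
    return n;
}
```

Calling `init_all(3)` simulates a failure on the fourth device and leaves nothing published, while `init_all(-1)` publishes all eight. The point of the split is that the unwind path never has to take back something a user may already hold open.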
Re: [RFC] enhancing the kernel's graphics subsystem
On Mon, 2007-05-21 at 18:14 +0100, Dave Airlie wrote:
>
> > 6) Eliminate the existing VT swap driver free for all. I would compile
> > out the VT layer and replace it with a compatible API that enforces
> > some sanity.
>
> I'm hoping to look into this but it is a parallel problem to what this
> code does, the VT switch API sucks rocks, so providing something
> compatible is going to suck rocks..

Yeah, please, don't even go near that until everything else is done &
merged or you'll never have anything finished :-) VT is a can of worms
that will take some time to sort out and has nothing to do with what we
are talking about right now anyway :-)

Ben.
Re: bug in 2.6.22-rc2: loop mount limited to one single iso image
On Tue, May 22, 2007 at 01:10:38AM +0100, Al Viro wrote:
> On Mon, May 21, 2007 at 09:11:02AM -0700, Linus Torvalds wrote:
> >
> >
> > On Sun, 20 May 2007, Kay Sievers wrote:
> > >
> > > Right, providing "preallocated" devices, 8 or the number given in
> > > max_loop, sounds like the best option until the tools can handle that.
> >
> > Yes. Can somebody who actually _uses_ loop send a tested patch, please?
>
> FWIW, I do and I have tested it; what I did *not* do is using a testbox
> with dynamic /dev for testing. Mea culpa...
>
> AFAICS, patch that went to akpm (preallocate max_loop instances) is OK.

... except that it needs to do cleanup on failure in loop_init().
Re: [RFC] enhancing the kernel's graphics subsystem
> In collaboration with the FB guys, we've been working on enhancing the
> kernel's graphics subsystem in an attempt to bring some sanity to the
> Linux graphics world and avoid the situation we have now where several
> kernel and userspace drivers compete for control of graphics devices.

.../...

A little note about initial mode setting at boot...

I do strongly believe that the decision of what mode to choose should
not be made in the kernel. At boot the kernel should either leave the HW
in whatever state the FW set it (and text mode is fine) or set up some
sane default (ie 640x480 has the most chances of working). If that's not
an option, maybe some very minimum EDID parsing in case you have a fixed
frequency weirdo panel, but that's it (unless it's a Mac and OF tells you
what to use :-)

The kernel would provide userland with connector info (presence, load
detect, EDID, ...) and userland gets to decide what to do.

Some reasons to keep that policy completely out of the kernel even at
boot time are:

 - User may want to configure his default gfx setup and have it up early

 - EDIDs do lie, monitors routinely ship with windows .inf files
   containing "updated" info, Apple has that too in OS X (EDID overrides
   afaik) etc... and if we're going to do such a database of known
   monitors, it should definitely not be in the kernel. Besides, we want
   to add more info that EDIDs don't provide most of the time to it, like
   subpixel ordering etc...

 - It sounds better that way :-) (yeah, that's the best reason !)

So while I agree that the register frobbing, memory management, etc...
should be indeed moved to the kernel as you guys have been doing lately,
the policy of deciding what mode to set should totally stick to userland.

IMHO, the best would be a lib (or daemon or both) for monitor detection &
mode setting that is separate from X :-) That could handle storing the
admin's default setup (including weird monitor info if any) and
restoring it at boot time, etc...

Cheers,
Ben.
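
The userland policy described above could look roughly like this in C: prefer an entry from a known-monitor override table, fall back to what EDID reports, and finally to 640x480. Everything here is invented for illustration (`struct mode`, `struct monitor_override`, `choose_mode()`, the 60 Hz fallback refresh); no existing library or kernel interface is implied.

```c
#include <stddef.h>
#include <string.h>

struct mode {
    int width;
    int height;
    int refresh;   /* Hz */
};

/* Hypothetical override entry, keyed by a monitor ID string. */
struct monitor_override {
    const char *monitor_id;
    struct mode mode;
};

/* Last-resort default; the 60 Hz refresh rate is an assumption. */
static const struct mode fallback_mode = { 640, 480, 60 };

/*
 * Pick a mode in userland:
 *   1. an override for this monitor, if the table has one;
 *   2. otherwise the EDID preferred mode, if it looks usable;
 *   3. otherwise the 640x480 fallback.
 * monitor_id and edid_preferred may be NULL when unknown.
 */
struct mode choose_mode(const char *monitor_id,
                        const struct mode *edid_preferred,
                        const struct monitor_override *db, size_t db_len)
{
    size_t i;

    if (monitor_id) {
        for (i = 0; i < db_len; i++)
            if (strcmp(db[i].monitor_id, monitor_id) == 0)
                return db[i].mode;
    }

    if (edid_preferred &&
        edid_preferred->width > 0 && edid_preferred->height > 0)
        return *edid_preferred;

    return fallback_mode;
}
```

Because the table is ordinary userland data, an admin's saved setup and fixes for lying EDIDs can live in the same place without touching the kernel.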
Re: bug in 2.6.22-rc2: loop mount limited to one single iso image
On Mon, May 21, 2007 at 09:11:02AM -0700, Linus Torvalds wrote:
>
>
> On Sun, 20 May 2007, Kay Sievers wrote:
> >
> > Right, providing "preallocated" devices, 8 or the number given in
> > max_loop, sounds like the best option until the tools can handle that.
>
> Yes. Can somebody who actually _uses_ loop send a tested patch, please?

FWIW, I do and I have tested it; what I did *not* do is using a testbox
with dynamic /dev for testing. Mea culpa...

AFAICS, patch that went to akpm (preallocate max_loop instances) is OK.
Re: [RFC] enhancing the kernel's graphics subsystem
On 5/21/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:
> Jon Smirl wrote:
> > There is a significant group of Linux users who want to be able to
> > login separate users to each screen/head/crtc/output device. These
> > people are concentrated in the third world and don't show up at OLS to
> > argue their case.
> >
> > There is another group that wants Unicode consoles. The people I
> > talked to were from India and Japan.
>
> If these "significant groups" do not bubble up to the surface, nothing
> we can do about that.

You need to take into account that these features have been asked about
for years and we didn't respond. People interested in this have turned
to other solutions.

James Simmons maintained a multi-seat version of the 2.4 kernel for
years that had many happy users. I don't know the details about why it
never got merged. I believe two companies built special multi-head video
hardware for it (Appian and Matrox? it has been a while).

People needing Unicode console support can't speak English. How can they
complain on LKML? Linux is an OS for the world, we really should make
Unicode console work.

--
Jon Smirl
[EMAIL PROTECTED]