date:20070521


* Andrew Morton <[EMAIL PROTECTED]> wrote:

> [  550.280860] BUG: at kernel/softirq.c:138 local_bh_enable()

yep. The correct patch is the one below.

Ingo

->
Subject: Prevent going idle with softirq pending
From: Thomas Gleixner <[EMAIL PROTECTED]>
 
The NOHZ patch contains a check for pending softirqs when a CPU goes 
idle. The BUG is unrelated to NOHZ; it was merely made visible by the NOHZ 
patch. The BUG showed up mainly on P4 / hyperthreading-enabled machines, 
which led the investigation in the wrong direction at first.
The real cause is in cond_resched_softirq():
 
cond_resched_softirq() is enabling softirqs without invoking the softirq 
daemon when softirqs are pending.  This leads to the warning message in 
the NOHZ idle code:
 
t1 runs softirq disabled code on CPU#0
interrupt happens, softirq is raised, but deferred (softirqs disabled)
t1 calls cond_resched_softirq()
  enables softirqs via _local_bh_enable()
  calls schedule()
t2 runs
t1 is migrated to CPU#1
t2 is done and invokes idle()
NOHZ detects the pending softirq
 
Fix: change _local_bh_enable() to local_bh_enable() so the softirq
daemon is invoked.
 
Thanks to Anant Nitya for debugging this with great patience !
 
Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>

Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4212,9 +4212,7 @@ int __sched cond_resched_softirq(void)
 	BUG_ON(!in_softirq());
 
 	if (need_resched() && system_state == SYSTEM_RUNNING) {
-		raw_local_irq_disable();
-		_local_bh_enable();
-		raw_local_irq_enable();
+		local_bh_enable();
 		__cond_resched();
 		local_bh_disable();
 		return 1;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
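
Archive note: the sequence in Thomas's changelog can be modelled in
userspace. The sketch below is a hypothetical single-CPU model, not kernel
code: struct cpu_model and every model_* function are invented for
illustration, and the real local_bh_enable() does considerably more
(preempt count handling, deciding whether to wake the softirq daemon). It
only shows why dropping the disable count without looking at pending work
leaves a raised softirq behind when the CPU goes idle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; not the kernel's real data structures. */
struct cpu_model {
	int  bh_disable_depth;	/* > 0 means softirqs are disabled */
	bool softirq_pending;	/* raised but not yet processed */
	int  softirqs_run;	/* how often pending work was processed */
};

/* An interrupt raises a softirq; it stays pending while bh is disabled. */
static void model_raise_softirq(struct cpu_model *cpu)
{
	cpu->softirq_pending = true;
}

/* Process pending softirq work, if any. */
static void model_do_softirq(struct cpu_model *cpu)
{
	if (cpu->softirq_pending) {
		cpu->softirq_pending = false;
		cpu->softirqs_run++;
	}
}

/* Stand-in for _local_bh_enable(): only drops the disable count. */
static void model_bh_enable_no_run(struct cpu_model *cpu)
{
	assert(cpu->bh_disable_depth > 0);
	cpu->bh_disable_depth--;
}

/* Stand-in for local_bh_enable(): drops the count, then handles pending work. */
static void model_bh_enable(struct cpu_model *cpu)
{
	assert(cpu->bh_disable_depth > 0);
	cpu->bh_disable_depth--;
	if (cpu->bh_disable_depth == 0)
		model_do_softirq(cpu);
}

/*
 * Replays the changelog's sequence on CPU#0 and reports whether the CPU
 * would reach idle with a softirq still pending.
 */
static bool model_idle_with_pending(bool use_fix)
{
	struct cpu_model cpu = { 0 };

	cpu.bh_disable_depth = 1;	/* t1 runs softirq-disabled code */
	model_raise_softirq(&cpu);	/* interrupt: raised, but deferred */

	/* t1 calls cond_resched_softirq() */
	if (use_fix)
		model_bh_enable(&cpu);
	else
		model_bh_enable_no_run(&cpu);

	/* schedule(): t1 migrates away, t2 runs, finishes, CPU#0 goes idle */
	return cpu.softirq_pending;
}
```

Calling model_idle_with_pending(false) replays the buggy path and
model_idle_with_pending(true) the patched one; only the former reaches
idle with work still pending.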

Re: [PATCH] Prevent going idle with softirq pending

2007-05-21 Thread Andrew Morton

On Mon, 21 May 2007 23:34:24 +0200 Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> The NOHZ patch contains a check for pending softirqs when a CPU goes
> idle. The BUG is unrelated to NOHZ; it was merely made visible by the NOHZ
> patch. The BUG showed up mainly on P4 / hyperthreading-enabled machines,
> which led the investigation in the wrong direction at first.
> The real cause is in cond_resched_softirq():
> 
> cond_resched_softirq() is enabling softirqs without invoking the softirq
> daemon when softirqs are pending. This leads to the warning message in
> the NOHZ idle code:
> 
> t1 runs softirq disabled code on CPU#0
> interrupt happens, softirq is raised, but deferred (softirqs disabled)
> t1 calls cond_resched_softirq()
>   enables softirqs via _local_bh_enable()
>   calls schedule()
> t2 runs
> t1 is migrated to CPU#1
> t2 is done and invokes idle()
> NOHZ detects the pending softirq
> 
> Fix: change _local_bh_enable() to local_bh_enable() so the softirq
> daemon is invoked.
> 
> Thanks to Anant Nitya for debugging this with great patience !
> 
> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
> 
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4776,7 +4776,7 @@ int __sched cond_resched_softirq(void)
>  
>   if (need_resched() && system_state == SYSTEM_RUNNING) {
>   raw_local_irq_disable();
> - _local_bh_enable();
> + local_bh_enable();
>   raw_local_irq_enable();
>   __cond_resched();
>   local_bh_disable();
> 

[  550.280860] BUG: at kernel/softirq.c:138 local_bh_enable()
[  550.281019]  [] local_bh_enable+0x3c/0x79
[  550.281153]  [] cond_resched_softirq+0x2d/0x43
[  550.281291]  [] release_sock+0x38/0x74
[  550.281414]  [] tcp_sendmsg+0x8e4/0x9d2
[  550.281565]  [] inet_sendmsg+0x3b/0x45
[  550.281692]  [] sock_sendmsg+0xcf/0xea
[  550.281826]  [] autoremove_wake_function+0x0/0x35
[  550.281974]  [] __qdisc_run+0x9a/0x12b
[  550.282095]  [] dev_queue_xmit+0x1e7/0x206
[  550.282225]  [] ip_output+0x23b/0x277
[  550.282341]  [] __nf_ct_refresh_acct+0xcf/0x102 [nf_conntrack]
[  550.282528]  [] tcp_packet+0x9c7/0x9f0 [nf_conntrack]
[  550.282693]  [] xdr_skb_read_bits+0x21/0x35 [sunrpc]
[  550.282872]  [] xdr_partial_copy_from_skb+0x12a/0x172 [sunrpc]
[  550.283067]  [] kernel_sendmsg+0x27/0x35
[  550.283192]  [] xs_send_kvec+0x98/0xa0 [sunrpc]
[  550.283376]  [] xs_sendpages+0x75/0x1b4 [sunrpc]
[  550.283554]  [] xs_tcp_send_request+0x5a/0x11c [sunrpc]
[  550.283739]  [] xprt_transmit+0xc2/0x1a4 [sunrpc]
[  550.283901]  [] rpcauth_wrap_req+0x6c/0x74 [sunrpc]
[  550.284070]  [] rpcauth_marshcred+0x4b/0x52 [sunrpc]
[  550.284239]  [] xprt_prepare_transmit+0x6a/0x73 [sunrpc]
[  550.284423]  [] nfs3_xdr_readargs+0x0/0x88 [nfs]
[  550.284595]  [] call_transmit+0x1c0/0x1f3 [sunrpc]
[  550.284766]  [] call_reserve+0x3c/0x65 [sunrpc]
[  550.284933]  [] __rpc_execute+0x6f/0x1fc [sunrpc]
[  550.285095]  [] sigprocmask+0x86/0x8d
[  550.285222]  [] nfs_execute_read+0x30/0x3f [nfs]
[  550.285396]  [] nfs_pagein_one+0x9d/0xda [nfs]
[  550.285563]  [] nfs_pageio_doio+0x2c/0x52 [nfs]
[  550.285731]  [] nfs_pageio_add_request+0xa2/0xb3 [nfs]
[  550.285912]  [] readpage_async_filler+0x102/0x11f [nfs]
[  550.286102]  [] readpage_async_filler+0x0/0x11f [nfs]
[  550.286274]  [] read_cache_pages+0x72/0xd4
[  550.286426]  [] nfs_readpages+0x10c/0x14d [nfs]
[  550.286595]  [] ip_finish_output+0x0/0x1e7
[  550.286727]  [] nfs_pagein_one+0x0/0xda [nfs]
[  550.286893]  [] nfs_readpages+0x0/0x14d [nfs]
[  550.287054]  [] __do_page_cache_readahead+0xe3/0x19c
[  550.287204]  [] xdr_partial_copy_from_skb+0x12a/0x172 [sunrpc]
[  550.291280]  [] xs_tcp_data_recv+0x3cd/0x401 [sunrpc]
[  550.295331]  [] xdr_skb_read_bits+0x0/0x35 [sunrpc]
[  550.299385]  [] blockable_page_cache_readahead+0x4c/0x9f
[  550.303465]  [] make_ahead_window+0x7c/0x99
[  550.307499]  [] page_cache_readahead+0x17a/0x1a4
[  550.311532]  [] do_generic_mapping_read+0x13b/0x432
[  550.315583]  [] generic_file_aio_read+0x130/0x157
[  550.319512]  [] file_read_actor+0x0/0xd1
[  550.323492]  [] do_sync_read+0xc6/0x109
[  550.327471]  [] ip_rcv_finish+0x0/0x235
[  550.331524]  [] autoremove_wake_function+0x0/0x35
[  550.335629]  [] do_sync_read+0x0/0x109
[  550.339532]  [] vfs_read+0xa6/0x150
[  550.343513]  [] sys_read+0x41/0x67
[  550.347385]  [] syscall_call+0x7/0xb
[  550.351321]  ===

That's

WARN_ON_ONCE(irqs_disabled());
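
Archive note: the quoted patch above still calls local_bh_enable() between
raw_local_irq_disable() and raw_local_irq_enable(), which is what trips the
check Andrew points at. The sketch below is a hypothetical userspace model
of that situation; MODEL_WARN_ON_ONCE, model_irqs_off and the model_*
functions are invented names, not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool model_irqs_off;	/* stands in for irqs_disabled() */
static int  model_warnings;	/* number of warnings actually emitted */

/*
 * Warn-once sketch: every expansion site gets its own static flag, so a
 * given call site reports the condition at most once.
 */
#define MODEL_WARN_ON_ONCE(cond)					\
	do {								\
		static bool warned;					\
		if ((cond) && !warned) {				\
			warned = true;					\
			model_warnings++;				\
			fprintf(stderr, "WARNING at %s:%d\n",		\
				__FILE__, __LINE__);			\
		}							\
	} while (0)

/* Stand-in for local_bh_enable(): must not run with interrupts off. */
static void model_local_bh_enable(void)
{
	MODEL_WARN_ON_ONCE(model_irqs_off);
}

/* First version of the patch: interrupts are still disabled around the call. */
static void model_first_patch(void)
{
	model_irqs_off = true;		/* raw_local_irq_disable() */
	model_local_bh_enable();
	model_irqs_off = false;		/* raw_local_irq_enable() */
}

/* Corrected version: the interrupt disable/enable pair is gone. */
static void model_final_patch(void)
{
	model_local_bh_enable();
}
```

In this model model_final_patch() never warns, while model_first_patch()
warns on its first call only, matching the "once" behaviour.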


Re: [RFC][PATCH 10/14] In-kernel file copy between union mounted filesystems

2007-05-21 Thread Jan Engelhardt


On May 22 2007 08:43, Bharata B Rao wrote:
>On Fri, May 18, 2007 at 09:47:31AM -0400, Shaya Potter wrote:
>> Bharata B Rao wrote:
>> 
>> >
>> >Not really. This is called during copyup of a file residing in a lower
>> >layer. And that is done only for regular files.
>> 
>> That is broken.
>
>But it only breaks the semantics (in other cases we allow writes only to the
>top layer files). So the question is why do we have to copy up the device
>node ? What difference it makes to writing to the device itself ?

Because `chmod 666 blockdevnode` is not the same as writing
to the device itself?

>Currently we allow write to the device using the lower layer device node
>itself.


Jan
-- 

Re: [rfc] increase struct page size?!

On Mon, May 21, 2007 at 10:04:10PM -0700, William Lee Irwin III wrote:
> On Mon, May 21, 2007 at 06:39:51PM -0700, William Lee Irwin III wrote:
> >> address (virtual and physical are trivially inter-convertible), mock
> >> up something akin to what filesystems do for anonymous pages, etc.
> >> The real objection everyone's going to have is that driver writers
> >> will stain their shorts when faced with the rules for handling such
> >> things. The thing is, I'm not entirely sure who these driver writers
> >> that would have such trouble are, since the driver writers I know
> >> personally are sophisticates rather than walking disaster areas as such
> >> would imply. I suppose they may not be representative of the whole.
> 
> On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> > That's not the objection I would have. I would say that firstly, I
> > don't think the mem_map overhead is very significant (at any rate,
> > an allocated-on-demand metadata is not going to be any smaller if
> > you fill up on pagecache...). Secondly, I think there is merit to
> > having the same page metadata used by the major subsystems, because
> > it helps for locality of reference.
> 
> The size isn't the advantage being cited; I'd actually expect the net
> result to be larger. It's the control over the layout of the metadata
> for cache locality and even things like having enough flags, folding
> buffer_head -like affairs into the per-page metadata for filesystems
> and so reaping cache locality benefits even there (assuming it works
> out in other respects), and so on.
> 
> Passing pages between subsystems doesn't seem very significant to me.
> There isn't going to be much locality of reference, or even any
> guarantee that the subsystem gets fed a cache hot page structure. The
> subsystem being passed the page will have its own cache hot accounting
> structures to stick the information about the memory into.

Well consider the page allocator and pagecache. The page allocator
uses page metadata rather than eg. a bitmap, and it uses page list
heads for the per-cpu allocator.

If we were to instead perhaps use external bitmaps and arrays to 
keep track of pages, then the pagecache would have to go and allocate
its own structures rather than reuse the cache hot page allocator
structures.

Buffer heads might be something that would work well, but we'd still
like to be able to deallocate them without freeing the whole pagecache
(because they tend to be associated with less frequent operations like
IO). But anyway, I don't know. I'm sure there would be cases where it
works better.


> On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> > But I haven't explored the idea enough myself to know whether there
> > would be any really killer benefits to this. Delayed metadata freeing
> > via RCU without holding up the freeing of the actual page would have
> > been something, however I can do similar with speculative references
> > now (or whenever the code gets merged), which doesn't even require the
> > RCU overhead.
> 
> I'm not entirely sure what you're on about there, but it sounds
> interesting.

Heh :) Well the lockless pagecache would become basically trivial if we
could RCU-free pagecache pages, however doing that is really awful for
a number of reasons. However if you had a system where the metadata is
decoupled, you could simply RCU-free the 'struct page' (while still
immediately freeing the page itself) which would make lockless pagecache
(and potentially similar things) equally trivial.

I assumed K42 might have been into that angle.


Re: bad networking related lag in v2.6.22-rc2


* Anant Nitya <[EMAIL PROTECTED]> wrote:

> > I think I already found the bug, please try if this patch helps.
> 
> Sorry, but this patch is not helping here. [...]

btw., could you please send this patch on-list too please?

Ingo

Re: bad networking related lag in v2.6.22-rc2


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > Sorry, but this patch is not helping here. [...]
> 
> btw., could you please send this patch on-list too please?

disregard this - just found Patrick's patch.

Ingo

Re: bad networking related lag in v2.6.22-rc2


* Anant Nitya <[EMAIL PROTECTED]> wrote:

> > I think I already found the bug, please try if this patch helps.
> 
> Sorry, but this patch is not helping here. I recompiled the kernel 
> with this patch but the same load pattern still makes the system crawl.
> 
> Here is the link to the script I use to shape traffic.
> 
> http://cybertek.info/taitai/adslbwopt.sh

could you also apply the fix for the softirq problem below, to make sure 
it does not interact?

Ingo

Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4212,9 +4212,7 @@ int __sched cond_resched_softirq(void)
 	BUG_ON(!in_softirq());
 
 	if (need_resched() && system_state == SYSTEM_RUNNING) {
-		raw_local_irq_disable();
-		_local_bh_enable();
-		raw_local_irq_enable();
+		local_bh_enable();
 		__cond_resched();
 		local_bh_disable();
 		return 1;

Re: [lm-sensors] [RFC] ACPI based hwmon driver for ASUS

2007-05-21 Thread Rudolf Marek


Hi,

I have the following readings:

w83627ehf-isa-0290
Adapter: ISA adapter
VCore:     +1.52 V  (min =  +0.00 V, max =  +1.74 V)
in1:      +12.30 V  (min = +13.46 V, max = +13.04 V) ALARM
AVCC:      +3.36 V  (min =  +4.08 V, max =  +3.95 V) ALARM
3VCC:      +3.36 V  (min =  +4.05 V, max =  +3.06 V) ALARM
in4:       +2.04 V  (min =  +1.78 V, max =  +2.04 V)
in5:       +1.60 V  (min =  +2.04 V, max =  +2.02 V) ALARM
in6:       +5.12 V  (min =  +6.12 V, max =  +6.53 V) ALARM
VSB:       +3.36 V  (min =  +4.08 V, max =  +4.08 V) ALARM
VBAT:      +3.30 V  (min =  +4.08 V, max =  +3.04 V) ALARM
in9:       +1.65 V  (min =  +0.98 V, max =  +2.04 V)
Case Fan:    0 RPM  (min =    0 RPM, div = 8)
CPU Fan:  1638 RPM  (min =    0 RPM, div = 4)
Aux Fan:  1436 RPM  (min = 4272 RPM, div = 4) ALARM
fan5:        0 RPM  (min =    0 RPM, div = 16)
Sys Temp:    +28°C  (high =   -65°C, hyst =   -34°C)   ALARM
CPU Temp:  +34.0°C  (high = +80.0°C, hyst = +75.0°C)
AUX Temp:  +38.5°C  (high = +80.0°C, hyst = +75.0°C)

Fan4 is disabled in the chip. I think the board has 4 connectors. I don't have 
time right now to check the manual.


Rudolf


Re: [PATCH -ak] Fix missing include

2007-05-21 Thread Andi Kleen

On Monday 21 May 2007 22:03, Thomas Gleixner wrote:
> ftp://firstfloor.org/pub/ak/x86_64/quilt/x86_64-2.6.22-rc2-070521-1.bz2
>
> explodes in various places due to missing defines of __cold. We can't
> rely on the assumption that linux/compiler.h is included magically
> before bug.h is included. Include it explicitly.

Hmm, somehow it worked here. Perhaps longer term it would be a good
idea to turn compiler.h into an -include

> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

Added thanks

-Andi

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Anant Nitya

On Monday 21 May 2007 15:50:09 Ingo Molnar wrote:
> * Anant Nitya <[EMAIL PROTECTED]> wrote:
> > Tcp:
> > 5 connections established
>
> hm, this does not explain the /proc/net/tcp overhead i think - although
> it could be a red herring. Will have a closer look at your new trace.
>
> if possible please try to generate the automatic softirq trace for
> Thomas, and then a separate trace for the firefox/net-lag thing, using
> trace-it-10sec.c. Btw., for the second trace, could you boot with
> maxcpus=1? That would make the second trace quite a bit more
> straightforward to analyze. You probably need both cpus to trigger the
> softirq problem.
>
>   Ingo

here is the link for new trace with maxcpus=1.
http://cybertek.info/taitai/trace-it-10sec-to-ingo-with-maxcpus=1.bz2

Regards
Ananitya

-- 
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

Re: bad networking related lag in v2.6.22-rc2

2007-05-21 Thread Anant Nitya

On Tuesday 22 May 2007 03:00:31 Patrick McHardy wrote:
> Patrick McHardy wrote:
> > Ingo Molnar wrote:
> >>* Anant Nitya <[EMAIL PROTECTED]> wrote:
> >>>I am posting links to the information you asked for. One more thing,
> >>>after digging a bit more I found its QoS shaping that is making the
> >>>box crawl. Once I disabled the traffic shaping everything comes back
> >>>to smooth and normal. Shaping being done on very low speed residential
> >>>ADSL 256/64 Kbps connection. If you want me to post shaping rules,
> >>>please free to ask. BTW its a simple HTB/SFQ rules.
> >>
> >>[...]
> >>
> >>>http://cybertek.info/taitai/trace-to-ingo.txt.bz2
> >>
> >>thanks! This trace indeed includes the smoking gun, htb_dequeue() and
> >>__qdisc_run():
> >>
> >>[..]
> >
> > This looks like fallout from the switch to hrtimers. Anant, please
> > send me your HTB script, I'll try to reproduce it.
>
> I think I already found the bug, please try if this patch helps.

Sorry, but this patch is not helping here. I recompiled the kernel with this 
patch but the same load pattern still makes the system crawl.

Here is the link to the script I use to shape traffic.

http://cybertek.info/taitai/adslbwopt.sh

Regards
Ananitya

-- 
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

[PATCH (take 2)] Documentation/memory-barriers.txt: various fixes

2007-05-21 Thread Jarek Poplawski

On Mon, May 21, 2007 at 03:12:07PM +0100, David Howells wrote:
> Jarek Poplawski <[EMAIL PROTECTED]> wrote:
...
> > > I think this changes the meaning to one I don't want.  But I'm not
> > > entirely sure.  In a way the two concepts "update of perception" and
> > > "update perception" are different things.  I think this can be argued
> > > either way.

No, your way is right. I've recollected this afterwards.
So, it's all about "The Doors of Perception"... Now it's
all clear! These CPUs are really cool and relaxed...
Jim Morrison would be proud of them (William Blake even
more), I hope.

Could you sign (or ack) this patch, please.

Regards,
Jarek P.

PS: I'm not sure you've read Robert's suggestion about this
adjective. I've included here only explicitly accepted fixes.
Google shows it's probably not the most respected rule, but
I can resend this once more - no problem.

---> (take 2)

Subject: Documentation/memory-barriers.txt: various fixes

CC: "Robert P. J. Day" <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nur 2.6.22-rc1-git7-/Documentation/memory-barriers.txt 2.6.22-rc1-git7/Documentation/memory-barriers.txt
--- 2.6.22-rc1-git7-/Documentation/memory-barriers.txt  2007-04-26 05:08:32.0 +0200
+++ 2.6.22-rc1-git7/Documentation/memory-barriers.txt   2007-05-21 20:05:07.0 +0200
@@ -24,7 +24,7 @@
  (*) Explicit kernel barriers.
 
  - Compiler barrier.
- - The CPU memory barriers.
+ - CPU memory barriers.
  - MMIO write barrier.
 
  (*) Implicit kernel memory barriers.
@@ -457,7 +457,7 @@
(Q == &A) implies (D == 1)
(Q == &B) implies (D == 4)
 
-But! CPU 2's perception of P may be updated _before_ its perception of B, thus
+But!  CPU 2's perception of P may be updated _before_ its perception of B, thus
 leading to the following situation:
 
(Q == &B) and (D == 2) 
@@ -546,10 +546,10 @@
 When dealing with CPU-CPU interactions, certain types of memory barrier should
 always be paired.  A lack of appropriate pairing is almost certainly an error.
 
-A write barrier should always be paired with a data dependency barrier or read
-barrier, though a general barrier would also be viable.  Similarly a read
-barrier or a data dependency barrier should always be paired with at least an
-write barrier, though, again, a general barrier is viable:
+A write barrier should always be paired with a data dependency barrier or a
+read barrier, though a general barrier would also be viable.  Similarly the
+read barrier or the data dependency barrier should always be paired with at
+least the write barrier, though, again, the general barrier is viable:
 
CPU 1   CPU 2
=== ===
@@ -573,7 +573,7 @@
 the "weaker" type.
 
 [!] Note that the stores before the write barrier would normally be expected to
-match the loads after the read barrier or data dependency barrier, and vice
+match the loads after the read barrier or the data dependency barrier, and vice
 versa:
 
CPU 1   CPU 2
@@ -588,7 +588,7 @@
 EXAMPLES OF MEMORY BARRIER SEQUENCES
 
 
-Firstly, write barriers act as a partial orderings on store operations.
+Firstly, write barriers act as partial orderings on store operations.
 Consider the following sequence of events:
 
CPU 1
@@ -608,15 +608,15 @@
 	+-------+       :      :
 	|       |       +------+
 	|       |------>| C=3  |     }     /\
-	|       |  :    +------+     }-----  \  -----> Events perceptible
-	|       |  :    | A=1  |     }        \/       to rest of system
+	|       |  :    +------+     }-----  \  -----> Events perceptible to
+	|       |  :    | A=1  |     }        \/       the rest of the system
 	|       |  :    +------+     }
 	| CPU 1 |  :    | B=2  |     }
 	|       |       +------+     }
 	|       |   wwwwwwwwwwwwwwww }   <--- At this point the write barrier
 	|       |       +------+     }        requires all stores prior to the
 	|       |  :    | E=5  |     }        barrier to be committed before
-	|       |  :    +------+     }        further stores may be take place.
+	|       |  :    +------+     }        further stores may take place
 	|       |------>| D=4  |     }
 	|       |       +------+
 	+-------+       :      :
@@ -626,7 +626,7 @@
   V
 
 
-Secondly, data dependency barriers act as a partial orderings on data-dependent
+Secondly, data dependency barriers act as partial orderings on data-dependent
 loads.  Consider the following sequence of events:
 
CPU 1   CPU 2
@@ -975,7 +975,7 @@
 
barrier();
 
-This a general barrier - lesser varieties of compiler barrier do not exist.
+This is a general barrier - lesser varieties of compiler barrier do not exist.
 
 The compiler barrier has no direct effect on the CP
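
Archive note: the pairing rule this patch rewords (a write barrier on one
CPU paired with a read barrier or data dependency barrier on the other) can
be sketched in portable C11. This is an analogy under stated assumptions,
not the kernel API: C11 release and acquire fences stand in for the
kernel's write and read barriers, and model_payload, model_ready,
model_writer() and model_reader() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int         model_payload;	/* plain data published by the writer */
static atomic_bool model_ready;		/* flag announcing the data */

/* Writer side: store the data, then the barrier, then the flag. */
static void model_writer(int value)
{
	model_payload = value;
	/* Plays the role of the write barrier: data is ordered before flag. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&model_ready, true, memory_order_relaxed);
}

/* Reader side: load the flag, then the paired barrier, then the data. */
static bool model_reader(int *out)
{
	if (!atomic_load_explicit(&model_ready, memory_order_relaxed))
		return false;
	/* Plays the role of the read barrier paired with the writer's. */
	atomic_thread_fence(memory_order_acquire);
	*out = model_payload;
	return true;
}
```

Dropping either fence breaks the pairing: the reader could then observe
the flag without being guaranteed to observe the payload written before
it, which is the kind of error the quoted text warns about.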

8250_pnp is confused... (udev?)

2007-05-21 Thread Jeff Wiegley

I've used serial ports a lot in the past but not for the
past year or so. Did something fundamental change?

I have serial_core, 8250 and 8250_pnp modules installed
and things are quite weird... I get all of the /dev/ttyS's
that I DON'T have and none of the ones that I do!

$modprobe -a 8250_pnp
$tail /var/log/message
May 21 22:59:35 home Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
May 21 22:59:35 home pnp: Device 00:07 activated.
May 21 22:59:35 home 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A

So, yes it looks like it sees the one serial port that I also
think I have. but...

$ls -al /dev/ttyS*
crw-rw---- 1 root uucp 4, 65 May 21 22:59 /dev/ttyS1
crw-rw---- 1 root uucp 4, 66 May 21 22:59 /dev/ttyS2
crw-rw---- 1 root uucp 4, 67 May 21 22:59 /dev/ttyS3

It seems that somehow either 8250_pnp (or maybe udev) gets the
logic inverted. It doesn't create a device for the one device
that 8250_pnp found and yet it does create devices for everything
NOT found??

If I rmmod 8250_pnp 8250 serial_core then all the serial devices
go away. So it's 8250/udev creating these oddities and not me.

I am using gentoo with kernel v2.6.21.1 on an intel 64 bit dual-core
with Nvidia 580i chipset. I haven't tweaked any udev rules. udev is
version 104-r12.

Any help is appreciated. Thanks!

- Jeff

Re: [RFC] Crash on modpost, addend_386_rel()

2007-05-21 Thread Atsushi Nemoto

On Mon, 21 May 2007 21:52:59 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Would you mind also just making this whole logic (that is generic and 
> shared with all the different arch versions) be an inline function of its 
> own? 
> 
> > +   Elf_Shdr *sechdrs = elf->sechdrs;
> > +   unsigned int *location;
> > +   int section = sechdrs[rsection].sh_info;
> > +
> > +   location = (void *)elf->hdr + sechdrs[section].sh_offset +
> > +   (r->r_offset - sechdrs[section].sh_addr);
> 
> so that all the functions could just use some generic
> 
>   location = reloc_location(elf, rsection, r);
> 
> or similar, instead of having that complex thing duplicated three times 
> (arm, mips and i386)?

Sure, updated.

> Especially since other architectures will likely end up doing the same 
> thing too...

Archs using RELA instead of REL section are not affected by this
patch.  But I'm not sure how many are using RELA.
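
Archive note: the REL versus RELA distinction can be sketched as follows.
With REL relocations the addend is not stored in the relocation entry but
in the word being relocated, so it has to be read back from the section
contents. The structures below are simplified, hypothetical 32-bit
stand-ins (struct model_shdr, model_rel, model_elf), and the sketch assumes
host and target byte order match, so the TO_NATIVE() conversion used in the
patch is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the ELF structures modpost works with. */
struct model_shdr { uint32_t sh_offset, sh_addr, sh_info; };
struct model_rel  { uint32_t r_offset; int32_t r_addend; };
struct model_elf  { unsigned char *image; struct model_shdr *sechdrs; };

/*
 * Locate the word a relocation applies to: sh_info of the relocation
 * section names the section being patched, and r_offset is relative to
 * that section's sh_addr.
 */
static unsigned char *model_reloc_location(const struct model_elf *elf,
					   int rsection,
					   const struct model_rel *r)
{
	const struct model_shdr *sechdrs = elf->sechdrs;
	uint32_t section = sechdrs[rsection].sh_info;

	return elf->image + sechdrs[section].sh_offset +
		(r->r_offset - sechdrs[section].sh_addr);
}

/* PC-relative 32-bit case: the stored word plus 4, as in addend_386_rel(). */
static void model_addend_pc32(const struct model_elf *elf, int rsection,
			      struct model_rel *r)
{
	int32_t word;

	/* memcpy avoids unaligned access and aliasing problems */
	memcpy(&word, model_reloc_location(elf, rsection, r), sizeof(word));
	r->r_addend = word + 4;
}
```

For a stored word of -4 (a common in-place value for an i386 call) the
recovered addend is 0, so the reference resolves to the start of the
symbol rather than to an offset inside it.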


Subject: [PATCH] kbuild: make better section mismatch reports on i386, arm and mips (take 3)

On i386, ARM and MIPS, warn_sec_mismatch() sometimes fails to show a
useful symbol name.  This is because 'refsym' is empty when r_addend
is 0.  This patch adjusts the r_addend value, following the
apply_relocate() routines in the kernel code.

Signed-off-by: Atsushi Nemoto <[EMAIL PROTECTED]>
---
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 8e5610d..44c3960 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -384,6 +384,8 @@ static int parse_elf(struct elf_info *info, const char *filename)
sechdrs[i].sh_size   = TO_NATIVE(sechdrs[i].sh_size);
sechdrs[i].sh_link   = TO_NATIVE(sechdrs[i].sh_link);
sechdrs[i].sh_name   = TO_NATIVE(sechdrs[i].sh_name);
+   sechdrs[i].sh_info   = TO_NATIVE(sechdrs[i].sh_info);
+   sechdrs[i].sh_addr   = TO_NATIVE(sechdrs[i].sh_addr);
}
/* Find symbol table. */
for (i = 1; i < hdr->e_shnum; i++) {
@@ -753,6 +755,8 @@ static Elf_Sym *find_elf_symbol(struct elf_info *elf, Elf_Addr addr,
for (sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
if (sym->st_shndx != relsym->st_shndx)
continue;
+   if (ELF_ST_TYPE(sym->st_info) == STT_SECTION)
+   continue;
if (sym->st_value == addr)
return sym;
}
@@ -895,6 +899,69 @@ static void warn_sec_mismatch(const char *modname, const char *fromsec,
}
 }
 
+static inline unsigned int *reloc_location(struct elf_info *elf,
+  int rsection, Elf_Rela *r)
+{
+   Elf_Shdr *sechdrs = elf->sechdrs;
+   int section = sechdrs[rsection].sh_info;
+
+   return (void *)elf->hdr + sechdrs[section].sh_offset +
+   (r->r_offset - sechdrs[section].sh_addr);
+}
+
+static void addend_386_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+   unsigned int r_typ = ELF_R_TYPE(r->r_info);
+   unsigned int *location = reloc_location(elf, rsection, r);
+
+   switch (r_typ) {
+   case R_386_32:
+   r->r_addend = TO_NATIVE(*location);
+   break;
+   case R_386_PC32:
+   r->r_addend = TO_NATIVE(*location) + 4;
+   break;
+   }
+}
+
+static void addend_arm_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+   unsigned int r_typ = ELF_R_TYPE(r->r_info);
+   unsigned int *location = reloc_location(elf, rsection, r);
+
+   switch (r_typ) {
+   case R_ARM_ABS32:
+   r->r_addend = TO_NATIVE(*location);
+   break;
+   case R_ARM_PC24:
+   r->r_addend = ((TO_NATIVE(*location) & 0x00ffffff) << 2) + 8;
+   break;
+   }
+}
+
+static int addend_mips_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+   unsigned int r_typ = ELF_R_TYPE(r->r_info);
+   unsigned int *location = reloc_location(elf, rsection, r);
+   unsigned int inst;
+
+   if (r_typ == R_MIPS_HI16)
+   return 1;   /* skip this */
+   inst = TO_NATIVE(*location);
+   switch (r_typ) {
+   case R_MIPS_LO16:
+   r->r_addend = inst & 0xffff;
+   break;
+   case R_MIPS_26:
+   r->r_addend = (inst & 0x03ffffff) << 2;
+   break;
+   case R_MIPS_32:
+   r->r_addend = inst;
+   break;
+   }
+   return 0;
+}
+
 /**
  * A module includes a number of sections that are discarded
  * either when loaded or when used as built-in.
@@ -938,8 +1005,11 @@ static void check_sec_ref(struct module *mod, const char *modname,
r.r_offset = TO_NATIVE(rela->r_offset);
 #if KERNEL_ELFCLASS == ELFCLASS64
if (hdr->e_machine == EM_MIPS) {
+   unsigned int r_typ;
r_sym = ELF64_MIPS_

Re: Define CONFIG_BOUNCE to avoid useless inclusion of bounce buffer logic.

On Tue, 22 May 2007, KAMEZAWA Hiroyuki wrote:

> > +config BOUNCE
> > +   def_bool y
> > +   depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
> > +
> 
> AFAIK, ppc has only ZONE_DMA and it never needs bounce.
> Is this ok ?

That is wrong. ppc should have ZONE_NORMAL and no ZONE_DMA.
Otherwise you cannot switch off ZONE_DMA and you cannot switch off 
bounce. ZONE_DMA is a zone for exceptional allocs. If you do not have 
those then you only have normal allocs -> ZONE_NORMAL.


[PATCH 3/3] Make ide dma blacklist handling a bit saner.

2007-05-21 Thread Junio C Hamano

Earlier, the matching of (model,rev) in ide-dma black/white list
handling was to consider "ALL" in the table to match any
revision.  This changes the wildcard to NULL.  This way, the
DMA_BLACK_LIST macro used in the previous patch does not have to
use a slightly funky compile time constant expression to convert
NULL to "ALL".

Signed-off-by: Junio C Hamano <[EMAIL PROTECTED]>
---

 * I do not really know what I am doing in the mips area, but
   that architecture specific table seems to be used by the same
   ide_in_drive_list() function, so the entries are matched to
   the updated code.

 drivers/ide/ide-dma.c |   14 +++---
 include/asm-mips/mach-au1x00/au1xxx_ide.h |   28 ++--
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c
index a6a2074..c0b5b10 100644
--- a/drivers/ide/ide-dma.c
+++ b/drivers/ide/ide-dma.c
@@ -91,16 +91,16 @@
 
 static const struct drive_list_entry drive_whitelist [] = {
 
-   { "Micropolis 2112A",   "ALL"   },
-   { "CONNER CTMA 4000",   "ALL"   },
-   { "CONNER CTT8000-A",   "ALL"   },
-   { "ST34342A",   "ALL"   },
+   { "Micropolis 2112A",   NULL},
+   { "CONNER CTMA 4000",   NULL},
+   { "CONNER CTT8000-A",   NULL},
+   { "ST34342A",   NULL},
{ NULL  ,   NULL}
 };
 
 static const struct drive_list_entry drive_blacklist [] = {
 
-#define DMA_BLACK_LIST(model,rev) { (model), (rev==NULL ? "ALL" : (rev)) }
+#define DMA_BLACK_LIST(model,rev) { (model), (rev) }
 #include "dma-blacklist.h"
 #undef DMA_BLACK_LIST
{ NULL  ,   NULL}
@@ -120,8 +120,8 @@ int ide_in_drive_list(struct hd_driveid *id, const struct drive_list_entry *driv
 {
for ( ; drive_table->id_model ; drive_table++)
if ((!strcmp(drive_table->id_model, id->model)) &&
-   ((strstr(id->fw_rev, drive_table->id_firmware)) ||
-(!strcmp(drive_table->id_firmware, "ALL"))))
+   (!drive_table->id_firmware ||
+strstr(id->fw_rev, drive_table->id_firmware)))
return 1;
return 0;
 }
diff --git a/include/asm-mips/mach-au1x00/au1xxx_ide.h b/include/asm-mips/mach-au1x00/au1xxx_ide.h
index 8fcae21..4663e8b 100644
--- a/include/asm-mips/mach-au1x00/au1xxx_ide.h
+++ b/include/asm-mips/mach-au1x00/au1xxx_ide.h
@@ -88,26 +88,26 @@ static const struct drive_list_entry dma_white_list [] = {
 /*
  * Hitachi
  */
-{ "HITACHI_DK14FA-20",   "ALL"   },
-{ "HTS726060M9AT00"  ,   "ALL"   },
+{ "HITACHI_DK14FA-20",   NULL},
+{ "HTS726060M9AT00"  ,   NULL},
 /*
  * Maxtor
  */
-{ "Maxtor 6E040L0"  ,   "ALL"   },
-{ "Maxtor 6Y080P0"  ,   "ALL"   },
-{ "Maxtor 6Y160P0"  ,   "ALL"   },
+{ "Maxtor 6E040L0"  ,   NULL},
+{ "Maxtor 6Y080P0"  ,   NULL},
+{ "Maxtor 6Y160P0"  ,   NULL},
 /*
  * Seagate
  */
-{ "ST3120026A"  ,   "ALL"   },
-{ "ST320014A"   ,   "ALL"   },
-{ "ST94011A",   "ALL"   },
-{ "ST340016A"   ,   "ALL"   },
+{ "ST3120026A"  ,   NULL},
+{ "ST320014A"   ,   NULL},
+{ "ST94011A",   NULL},
+{ "ST340016A"   ,   NULL},
 /*
  * Western Digital
  */
-{ "WDC WD400UE-00HCT0"  ,   "ALL"   },
-{ "WDC WD400JB-00JJC0"  ,   "ALL"   },
+{ "WDC WD400UE-00HCT0"  ,   NULL},
+{ "WDC WD400JB-00JJC0"  ,   NULL},
 { NULL  ,   NULL}
 };
 
@@ -116,9 +116,9 @@ static const struct drive_list_entry dma_black_list [] = {
 /*
  * Western Digital
  */
-{ "WDC WD100EB-00CGH0"  ,   "ALL"   },
-{ "WDC WD200BB-00AUA1"  ,   "ALL"   },
-{ "WDC AC24300L",   "ALL"   },
+{ "WDC WD100EB-00CGH0"  ,   NULL},
+{ "WDC WD200BB-00AUA1"  ,   NULL},
+{ "WDC AC24300L",   NULL},
 { NULL  ,   NULL}
 };
 #endif
-- 
1.5.2.24.g93d4



[PATCH 2/3] Unify dma blacklist in ide-dma.c and libata-core.c

2007-05-21 Thread Junio C Hamano

This introduces a shared header file that defines the entries
for two dma blacklists in ide-dma.c and libata-core.c to make it
easier to keep them in sync.

Signed-off-by: Junio C Hamano <[EMAIL PROTECTED]>
---

 * Removes more lines than it adds.  I am not proud of the
   DMA_BLACK_LIST macro in ide-dma.c which relies on the
   compiler doing a sane thing for compile time constant
   expression to initialize the list, but that hopefully can be
   fixed in the next patch.
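
The shared-list idea can be sketched in a single file as follows. This is
only an illustration under stated assumptions: the patch keeps the entries
in dma-blacklist.h and pulls them in with #include, whereas the sketch
holds them in a DMA_BLACK_LIST_ENTRIES macro so it stays self-contained.
The struct names, SKETCH_HORKAGE_NODMA and lists_in_sync() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for the shared header. Each user defines DMA_BLACK_LIST()
 * before expanding this list and undefines it afterwards.
 */
#define DMA_BLACK_LIST_ENTRIES \
	DMA_BLACK_LIST("WDC AC11000H", NULL), \
	DMA_BLACK_LIST("WDC AC32100H", "24.09P07"),

/* Hypothetical entry types for the two users of the list. */
struct ide_entry { const char *model; const char *rev; };
struct ata_entry { const char *model; const char *rev; unsigned long horkage; };

#define SKETCH_HORKAGE_NODMA (1UL << 1)	/* placeholder flag value */

/* First user: expand each entry to a (model, rev) pair. */
#define DMA_BLACK_LIST(model, rev) { (model), (rev) }
static const struct ide_entry ide_list[] = {
	DMA_BLACK_LIST_ENTRIES
	{ NULL, NULL }
};
#undef DMA_BLACK_LIST

/* Second user: the same entries, each with a flag appended. */
#define DMA_BLACK_LIST(model, rev) { (model), (rev), SKETCH_HORKAGE_NODMA }
static const struct ata_entry ata_list[] = {
	DMA_BLACK_LIST_ENTRIES
	{ NULL, NULL, 0 }
};
#undef DMA_BLACK_LIST

/* Return 1 if both tables carry the same (model, rev) pairs in order. */
static int lists_in_sync(void)
{
	size_t i;

	/* Both arrays expand the same list, so their lengths must agree. */
	assert(sizeof(ide_list) / sizeof(ide_list[0]) ==
	       sizeof(ata_list) / sizeof(ata_list[0]));
	for (i = 0; ide_list[i].model; i++) {
		if (strcmp(ide_list[i].model, ata_list[i].model))
			return 0;
		if ((ide_list[i].rev == NULL) != (ata_list[i].rev == NULL))
			return 0;
		if (ide_list[i].rev && strcmp(ide_list[i].rev, ata_list[i].rev))
			return 0;
	}
	return 1;
}
```

Because both tables are generated from one list, adding an entry in one
place updates both users, which is the point of the shared header.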

 drivers/ata/libata-core.c   |   34 --
 drivers/ide/dma-blacklist.h |   39 +++
 drivers/ide/ide-dma.c   |   33 +++--
 3 files changed, 46 insertions(+), 60 deletions(-)
 create mode 100644 drivers/ide/dma-blacklist.h

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index a6de57e..93b7fa7 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3739,36 +3739,10 @@ struct ata_blacklist_entry {
 
 static const struct ata_blacklist_entry ata_device_blacklist [] = {
/* Devices with DMA related problems under Linux */
-   { "WDC AC11000H",   NULL,   ATA_HORKAGE_NODMA },
-   { "WDC AC22100H",   NULL,   ATA_HORKAGE_NODMA },
-   { "WDC AC32500H",   NULL,   ATA_HORKAGE_NODMA },
-   { "WDC AC33100H",   NULL,   ATA_HORKAGE_NODMA },
-   { "WDC AC31600H",   NULL,   ATA_HORKAGE_NODMA },
-   { "WDC AC32100H",   "24.09P07", ATA_HORKAGE_NODMA },
-   { "WDC AC23200L",   "21.10N21", ATA_HORKAGE_NODMA },
-   { "Compaq CRD-8241B",   NULL,   ATA_HORKAGE_NODMA },
-   { "CRD-8400B",  NULL,   ATA_HORKAGE_NODMA },
-   { "CRD-8480B",  NULL,   ATA_HORKAGE_NODMA },
-   { "CRD-8482B",  NULL,   ATA_HORKAGE_NODMA },
-   { "CRD-84", NULL,   ATA_HORKAGE_NODMA },
-   { "SanDisk SDP3B",  NULL,   ATA_HORKAGE_NODMA },
-   { "SanDisk SDP3B-64",   NULL,   ATA_HORKAGE_NODMA },
-   { "SANYO CD-ROM CRD",   NULL,   ATA_HORKAGE_NODMA },
-   { "HITACHI CDR-8",  NULL,   ATA_HORKAGE_NODMA },
-   { "HITACHI CDR-8335",   NULL,   ATA_HORKAGE_NODMA },
-   { "HITACHI CDR-8435",   NULL,   ATA_HORKAGE_NODMA },
-   { "Toshiba CD-ROM XM-6202B", NULL,  ATA_HORKAGE_NODMA },
-   { "TOSHIBA CD-ROM XM-1702BC", NULL, ATA_HORKAGE_NODMA },
-   { "CD-532E-A",  NULL,   ATA_HORKAGE_NODMA },
-   { "E-IDE CD-ROM CR-840",NULL,   ATA_HORKAGE_NODMA },
-   { "CD-ROM Drive/F5A",   NULL,   ATA_HORKAGE_NODMA },
-   { "WPI CDD-820",NULL,   ATA_HORKAGE_NODMA },
-   { "SAMSUNG CD-ROM SC-148C", NULL,   ATA_HORKAGE_NODMA },
-   { "SAMSUNG CD-ROM SC",  NULL,   ATA_HORKAGE_NODMA },
-   { "ATAPI CD-ROM DRIVE 40X MAXIMUM",NULL,ATA_HORKAGE_NODMA },
-   { "_NEC DV5800A",   NULL,   ATA_HORKAGE_NODMA },
-   { "SAMSUNG CD-ROM SN-124","N001",   ATA_HORKAGE_NODMA },
-   { "Seagate STT2A", NULL,ATA_HORKAGE_NODMA },
+
+#define DMA_BLACK_LIST(model,rev) { (model), (rev), ATA_HORKAGE_NODMA }
+#include "../ide/dma-blacklist.h"
+#undef DMA_BLACK_LIST
 
/* Weird ATAPI devices */
{ "TORiSAN DVD-ROM DRD-N216", NULL, ATA_HORKAGE_MAX_SEC_128 |
diff --git a/drivers/ide/dma-blacklist.h b/drivers/ide/dma-blacklist.h
new file mode 100644
index 0000000..19b4e0c
--- /dev/null
+++ b/drivers/ide/dma-blacklist.h
@@ -0,0 +1,39 @@
+/*
+ * Shared between ide-dma.c::drive_blacklist[] and
+ * ../ata/libata-core.c::ata_device_blacklist[].
+ *
+ * Each of the above users define DMA_BLACK_LIST() macro
+ * which expands these to structure of their liking.
+ */
+
+   DMA_BLACK_LIST("WDC AC11000H", NULL),
+   DMA_BLACK_LIST("WDC AC22100H", NULL),
+   DMA_BLACK_LIST("WDC AC32500H", NULL),
+   DMA_BLACK_LIST("WDC AC33100H", NULL),
+   DMA_BLACK_LIST("WDC AC31600H", NULL),
+   DMA_BLACK_LIST("WDC AC32100H", "24.09P07"),
+   DMA_BLACK_LIST("WDC AC23200L", "21.10N21"),
+   DMA_BLACK_LIST("Compaq CRD-8241B", NULL),
+   DMA_BLACK_LIST("CRD-8400B", NULL),
+   DMA_BLACK_LIST("CRD-8480B", NULL),
+   DMA_BLACK_LIST("CRD-8482B", NULL),
+   DMA_BLACK_LIST("CRD-84", NULL),
+   DMA_BLACK_LIST("SanDisk SDP3B", NULL),
+   DMA_BLACK_LIST("SanDisk SDP3B-64", NULL),
+   DMA_BLACK_LIST("SANYO CD-ROM CRD", NULL),
+   DMA_BLACK_LIST("HITACHI CDR-8", NULL),
+   DMA_BLACK_LIST("HITACHI CDR-8335", NULL),
+   DMA_BLACK_LIST("HITACHI CDR-8435", NULL),
+   DMA_BLACK_LIST("Toshiba CD-ROM XM-6202B", NULL),
+   DMA_BLACK_LIST("TOSHIBA CD-ROM XM-1702BC", NULL),
+   DMA_BLACK_LIST("CD-532E-A", NULL),
+   DMA_BLACK_LIST("E-IDE CD-ROM CR-840", NULL),
+   DMA_BLACK_LIST("CD-ROM Drive/F5A", NULL),
+   DMA_BLACK_LIST("WPI

Re: Linux 2.6.22-rc2

2007-05-21 Thread Linus Torvalds

On Mon, 21 May 2007, Stephen Hemminger wrote:
>
> AHCI on this motherboard doesn't seem to use MSI. The problems occur
> even if I boot with nomsi.

Have you tried playing with PCI latency counters etc? 

Maybe the SATA/AHCI thing is better at saturating the bus, and the sky2 
hardware gets upset if it has overlong DMA access latencies due to some 
other controller keeping the bus busy with a long burst access?

I can't really see that being a real problem in this day and age of PCI-X 
etc, but it _used_ to be a possible issue a decade ago. Maybe you've found 
a case where it matters even on modern hardware? We occasionally used to 
set the PCI latency timer to make people happy.

(Not that I'm convinced it even has any semantic meaning on a modern PCI 
system..)

Linus

Re: [PATCH] Prevent going idle with softirq pending


* Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> The NOHZ patch contains a check for softirqs pending when a CPU goes 
> idle. The BUG is unrelated to NOHZ, it just was made visible by the 
> NOHZ patch. The BUG showed up mainly on P4 / hyperthreading enabled 
> machines which lead the investigations into the wrong direction in the 
> first place. The real cause is in cond_resched_softirq():
> 
> cond_resched_softirq() is enabling softirqs without invoking the 
> softirq daemon when softirqs are pending. This leads to the warning 
> message in the NOHZ idle code:

good find!

>   raw_local_irq_disable();
> - _local_bh_enable();
> + local_bh_enable();
>   raw_local_irq_enable();

hm, i think this should be done without having irqs disabled?

Ingo
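
The difference between the two enable calls discussed in this thread can
be illustrated with a toy userspace model. This is not kernel code: the
counters and the toy_* functions are invented for the sketch, and the only
behaviour it assumes is what the patch description states, namely that
_local_bh_enable() re-enables softirqs without running pending ones while
local_bh_enable() runs them.

```c
#include <assert.h>

static int bh_disable_depth;	/* > 0 means softirq processing is deferred */
static int softirq_pending;	/* a softirq was raised but has not run yet */
static int softirqs_run;	/* how many times the toy handler has run */

static void toy_bh_disable(void)
{
	bh_disable_depth++;
}

/* Raise a softirq: run it now, or mark it pending if currently disabled. */
static void toy_raise_softirq(void)
{
	if (bh_disable_depth > 0)
		softirq_pending = 1;
	else
		softirqs_run++;
}

/* Models _local_bh_enable(): only drops the count, pending work stays. */
static void toy_bh_enable_nocheck(void)
{
	assert(bh_disable_depth > 0);
	bh_disable_depth--;
}

/* Models local_bh_enable(): drops the count, then runs pending work. */
static void toy_bh_enable(void)
{
	assert(bh_disable_depth > 0);
	bh_disable_depth--;
	if (bh_disable_depth == 0 && softirq_pending) {
		softirq_pending = 0;
		softirqs_run++;
	}
}
```

In this model, a softirq raised while disabled is still pending after
toy_bh_enable_nocheck(), which corresponds to the state the NOHZ idle
check complained about; toy_bh_enable() clears it.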

Re: [rfc] increase struct page size?!

2007-05-21 Thread William Lee Irwin III

On Mon, May 21, 2007 at 06:39:51PM -0700, William Lee Irwin III wrote:
>> address (virtual and physical are trivially inter-convertible), mock
>> up something akin to what filesystems do for anonymous pages, etc.
>> The real objection everyone's going to have is that driver writers
>> will stain their shorts when faced with the rules for handling such
>> things. The thing is, I'm not entirely sure who these driver writers
>> that would have such trouble are, since the driver writers I know
>> personally are sophisticates rather than walking disaster areas as such
>> would imply. I suppose they may not be representative of the whole.

On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> That's not the objection I would have. I would say that firstly, I
> don't think the mem_map overhead is very significant (at any rate,
> an allocated-on-demand metadata is not going to be any smaller if
> you fill up on pagecache...). Secondly, I think there is merit to
> having the same page metadata used by the major subsystems, because
> it helps for locality of reference.

The size isn't the advantage being cited; I'd actually expect the net
result to be larger. It's the control over the layout of the metadata
for cache locality and even things like having enough flags, folding
buffer_head -like affairs into the per-page metadata for filesystems
and so reaping cache locality benefits even there (assuming it works
out in other respects), and so on.

Passing pages between subsystems doesn't seem very significant to me.
There isn't going to be much locality of reference, or even any
guarantee that the subsystem gets fed a cache hot page structure. The
subsystem being passed the page will have its own cache hot accounting
structures to stick the information about the memory into.

On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> But I haven't explored the idea enough myself to know whether there
> would be any really killer benefits to this. Delayed metadata freeing
> via RCU without holding up the freeing of the actual page would have
> been something, however I can do similar with speculative references
> now (or whenever the code gets merged), which doesn't even require the
> RCU overhead.

I'm not entirely sure what you're on about there, but it sounds
interesting.

-- wli

[PATCH] Match DMA blacklist entries between ide-dma.c and libata-core.c

2007-05-21 Thread Junio C Hamano

A few entries in ata_device_blacklist[] in libata-core.c are marked
with HORKAGE_NODMA but are missing from drive_blacklist[] in ide-dma.c.
This patch brings the two lists into sync.

Also remove a duplicated entry for "SanDisk SDP3B-64".

Signed-off-by: Junio C Hamano <[EMAIL PROTECTED]>
---

 * Recently Alan Cox responded that libata blacklist needs to be
   kept in sync when Dave Jones added STT2A to DMA blacklist
   in ide-dma.c
---
 drivers/ide/ide-dma.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c
index b77b7d1..ead141e 100644
--- a/drivers/ide/ide-dma.c
+++ b/drivers/ide/ide-dma.c
@@ -119,15 +119,17 @@ static const struct drive_list_entry drive_blacklist [] = {
{ "HITACHI CDR-8335",   "ALL"   },
{ "HITACHI CDR-8435",   "ALL"   },
{ "Toshiba CD-ROM XM-6202B" ,   "ALL"   },
+   { "TOSHIBA CD-ROM XM-1702BC",   "ALL"   },
{ "CD-532E-A"   ,   "ALL"   },
{ "E-IDE CD-ROM CR-840","ALL"   },
{ "CD-ROM Drive/F5A",   "ALL"   },
{ "WPI CDD-820","ALL"   },
{ "SAMSUNG CD-ROM SC-148C", "ALL"   },
{ "SAMSUNG CD-ROM SC",  "ALL"   },
-   { "SanDisk SDP3B-64",   "ALL"   },
{ "ATAPI CD-ROM DRIVE 40X MAXIMUM", "ALL"   },
{ "_NEC DV5800A",   "ALL"   },  
+   { "SAMSUNG CD-ROM SN-124",  "N001" },
+   { "Seagate STT2A",  "ALL" },
{ NULL  ,   NULL}
 
 };
-- 
1.5.2.24.g93d4



Re: [RFC] Crash on modpost, addend_386_rel()

2007-05-21 Thread Linus Torvalds



On Tue, 22 May 2007, Atsushi Nemoto wrote:
> 
> Anyway, here is an updated patch tested on i386 (RELOCATABLE=y/n), arm,
> and mips.  In the calculation of 'location', sh_addr should be subtracted
> (thank you for debugging, Linus).  This patch also contains another
> fix and an improvement of addend_mips_rel

Would you mind also just making this whole logic (that is generic and 
shared with all the different arch versions) be an inline function of its 
own? 

> + Elf_Shdr *sechdrs = elf->sechdrs;
> + unsigned int *location;
> + int section = sechdrs[rsection].sh_info;
> +
> + location = (void *)elf->hdr + sechdrs[section].sh_offset +
> + (r->r_offset - sechdrs[section].sh_addr);

so that all the functions could just use some generic

location = reloc_location(elf, rsection, r);

or similar, instead of having that complex thing duplicated three times 
(arm, mips and i386)?

Especially since other architectures will likely end up doing the same 
thing too...

Linus

Kernel panic during hibernation

2007-05-21 Thread makalski


Hi,

I get a kernel panic when trying to put a Dell Inspiron 6400 into
hibernation.


The following is the message:

Process pm-hibernate (pid: 3168, threadinfo 810013dba000, task 810018d0e860)
Stack:   01bc7e40f260 07ef
000fdff0 800a74aa 810037e206f8 880777bc
1ebf7000 0001ebf7 0001enf6 
Call Trace:
[] swsusp_write+0x2fa/0x440
[] :scsi_mod:scsi_schedule_eh+0x45/0x55
[] pm_suspend_disk+0x5b/0xce
[] enter_state+0x52/0x19b
[] state_store+0x5e/0x79
[] sysfs_write_file+0xb9/0xe8
[] vfs_write+0xce/0x174
[] sys_write+0x45/0x6e
[] tracesys+0xd1/0xdc


Code 0f ba 6d 00 00 19 c0 85 c0 74 08 48 89 ef e8 49 c0 f7 ff 48
RIP  [] rw_swap_page_sync+0x1c/0xc2
RSP 
<0>Kernel panic - not syncing: Fatal exception


I am using "Linux localhost.localdomain 2.6.18-8.el5 #1 SMP Thu Mar 15
19:46:53 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux", CentOS 5.


Could you please tell me whether there is already a fix for this problem?
I have not found anything yet.


Thank You,

Vlad.



Re: [patch] CFS scheduler, -v12

2007-05-21 Thread Peter Williams


Peter Williams wrote:

Dmitry Adamushko wrote:

On 18/05/07, Peter Williams <[EMAIL PROTECTED]> wrote:
[...]

One thing that might work is to jitter the load balancing interval a
bit.  The reason I say this is that one of the characteristics of top
and gkrellm is that they run at a more or less constant interval (and,
in this case, X would also be following this pattern as it's doing
screen updates for top and gkrellm) and this means that it's possible
for the load balancing interval to synchronize with their intervals
which in turn causes the observed problem.


Hum.. I guess, a 0/4 scenario wouldn't fit well in this explanation..


No, and I haven't seen one.


all 4 spinners "tend" to be on CPU0 (and as I understand each gets
~25% approx.?), so there must be plenty of moments for
*idle_balance()* to be called on CPU1 - as gkrellm, top and X consume
together just a few % of CPU. Hence, we should not be that dependent
on the load balancing interval here..


The split that I see is 3/1 and neither CPU seems to be favoured with 
respect to getting the majority.  However, top, gkrellm and X seem to be 
always on the CPU with the single spinner.  The CPU% reported by top is 
approx. 33%, 33%, 33% and 100% for the spinners.


If I renice the spinners to -10 (so that their load weights dominate the
run queue load calculations) the problem goes away and the spinner to
CPU allocation is 2/2 and top reports them all getting approx. 50% each.


For no good reason other than curiosity, I tried a variation of this 
experiment where I reniced the spinners to 10 instead of -10 and, to my 
surprise, they were allocated 2/2 to the CPUs on average.  I say on 
average because the allocations were a little more volatile and 
occasionally 0/4 splits would occur but these would last for less than 
one top cycle before the 2/2 was re-established.  The quickness of these 
recoveries would indicate that it was most likely the idle balance 
mechanism that restored the balance.


This may point the finger at the tick based load balance mechanism being 
too conservative in when it decides whether tasks need to be moved.  In 
the case where the spinners are at nice == 0, the idle balance mechanism 
never comes into play as the 0/4 split is never seen so only the tick 
based mechanism is in force in this case and this is where the anomalies 
are seen.


This tick-rebalance-only situation also holds for the nice == -10 case,
but there the high load weights of the spinners overcome the tick based
load balancing mechanism's conservatism, e.g. the difference in queue
loads for a 1/3 split in this case is equivalent to the difference that
would be generated by an imbalance of about 18 nice == 0 spinners, i.e.
too big to be ignored.


The evidence seems to indicate that IF a rebalance operation gets 
initiated then the right amount of load will get moved.


This new evidence weakens (but does not totally destroy) my 
synchronization (a.k.a. conspiracy) theory.


Peter
PS As the total load weight for 4 nice == 10 tasks is only about 40% of 
the load weight of a single nice == 0 task, the occasional 0/4 split in 
the spinners at nice == 10 case is not unexpected as it would be the 
desirable allocation if there were exactly one other running task at 
nice == 0.

--
Peter Williams   [EMAIL PROTECTED]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
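
The two figures quoted above (about 40% and about 18) can be reproduced
with a small calculation, assuming each nice level changes a task's load
weight by a factor of roughly 1.25. That ratio is an assumption made for
this sketch, not something stated in the message, and nice_weight() and
queue_load() are hypothetical helpers.

```c
#include <assert.h>

#define NICE_STEP 1.25	/* assumed weight ratio between adjacent nice levels */

/* Weight of one task at the given nice level, relative to nice == 0. */
static double nice_weight(int nice)
{
	double w = 1.0;
	int i;

	assert(nice >= -20 && nice <= 19);
	for (i = 0; i < nice; i++)	/* positive nice: lighter */
		w /= NICE_STEP;
	for (i = 0; i > nice; i--)	/* negative nice: heavier */
		w *= NICE_STEP;
	return w;
}

/* Load of n spinners at one nice level, in units of a nice == 0 task. */
static double queue_load(int n, int nice)
{
	return n * nice_weight(nice);
}
```

Under that assumption, queue_load(4, 10) comes out a little above 0.4 of
one nice == 0 task, and queue_load(3, -10) - queue_load(1, -10) comes out
between 18 and 19 nice == 0 tasks, in line with the numbers in the
message.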

Re: Linux 2.6.22-rc2

2007-05-21 Thread Stephen Hemminger

On Tue, 22 May 2007 00:36:15 -0400
Jeff Garzik <[EMAIL PROTECTED]> wrote:

> Stephen Hemminger wrote:
> > There may be some hardware level interaction with SATA controller.
> > I saw no failures running off i386 kernel of PATA drive and quickly
> > see errors with SATA/AHCI and x86_64.
> 
> 
> I presume AHCI is the only other device in the system using PCI MSI, 
> when you see problems?
> 
>   Jeff
> 
> 
AHCI on this motherboard doesn't seem to use MSI. The problems occur
even if I boot with nomsi.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>

Re: [RFC] Crash on modpost, addend_386_rel()

2007-05-21 Thread Atsushi Nemoto

On Mon, 21 May 2007 22:01:27 -0400, Ben Collins <[EMAIL PROTECTED]> wrote:
> Got this crash in modpost. Bisect blames this commit:
> 
> commit f892b7d480eec809a5dfbd6e65742b3f3155e50e
> Author: Atsushi Nemoto <[EMAIL PROTECTED]> 
> Date:   Thu May 17 01:14:38 2007 +0900
> kbuild: make better section mismatch reports on i386, arm and mips

Sorry, the patch breaks the CONFIG_RELOCATABLE=y build.  I had not
tested with that configuration.

Linus has already reverted the commit in the git tree due to this breakage.

Unfortunately, fixing the whole thing in the _right_ way is somewhat beyond
my ELF understanding.  Help from ELF gurus is welcome!

Anyway, here is an updated patch tested on i386 (RELOCATABLE=y/n), arm,
and mips.  In the calculation of 'location', sh_addr should be subtracted
(thank you for debugging, Linus).  This patch also contains another
fix and an improvement of addend_mips_rel
(kbuild-fix-and-improve-mips-rel-handling.patch in mm tree).


Subject: [PATCH] kbuild: make better section mismatch reports on i386, arm and mips (take 2)

On i386, ARM and MIPS, warn_sec_mismatch() sometimes fails to show a
useful symbol name.  This is because 'refsym' is empty when the r_addend
value is 0.  This patch adjusts the r_addend value, following the
apply_relocate() routines in the kernel code.

Signed-off-by: Atsushi Nemoto <[EMAIL PROTECTED]>
---
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 8e5610d..e0bd1cd 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -384,6 +384,8 @@ static int parse_elf(struct elf_info *info, const char *filename)
sechdrs[i].sh_size   = TO_NATIVE(sechdrs[i].sh_size);
sechdrs[i].sh_link   = TO_NATIVE(sechdrs[i].sh_link);
sechdrs[i].sh_name   = TO_NATIVE(sechdrs[i].sh_name);
+   sechdrs[i].sh_info   = TO_NATIVE(sechdrs[i].sh_info);
+   sechdrs[i].sh_addr   = TO_NATIVE(sechdrs[i].sh_addr);
}
/* Find symbol table. */
for (i = 1; i < hdr->e_shnum; i++) {
@@ -753,6 +755,8 @@ static Elf_Sym *find_elf_symbol(struct elf_info *elf, Elf_Addr addr,
for (sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
if (sym->st_shndx != relsym->st_shndx)
continue;
+   if (ELF_ST_TYPE(sym->st_info) == STT_SECTION)
+   continue;
if (sym->st_value == addr)
return sym;
}
@@ -895,6 +899,74 @@ static void warn_sec_mismatch(const char *modname, const char *fromsec,
}
 }
 
+static void addend_386_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+   Elf_Shdr *sechdrs = elf->sechdrs;
+   unsigned int r_typ;
+   unsigned int *location;
+   int section = sechdrs[rsection].sh_info;
+
+   r_typ = ELF_R_TYPE(r->r_info);
+   location = (void *)elf->hdr + sechdrs[section].sh_offset +
+   (r->r_offset - sechdrs[section].sh_addr);
+   switch (r_typ) {
+   case R_386_32:
+   r->r_addend = TO_NATIVE(*location);
+   break;
+   case R_386_PC32:
+   r->r_addend = TO_NATIVE(*location) + 4;
+   break;
+   }
+}
+
+static void addend_arm_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+   Elf_Shdr *sechdrs = elf->sechdrs;
+   unsigned int r_typ;
+   unsigned int *location;
+   int section = sechdrs[rsection].sh_info;
+
+   r_typ = ELF_R_TYPE(r->r_info);
+   location = (void *)elf->hdr + sechdrs[section].sh_offset +
+   (r->r_offset - sechdrs[section].sh_addr);
+   switch (r_typ) {
+   case R_ARM_ABS32:
+   r->r_addend = TO_NATIVE(*location);
+   break;
+   case R_ARM_PC24:
+   r->r_addend = ((TO_NATIVE(*location) & 0x00ffffff) << 2) + 8;
+   break;
+   }
+}
+
+static int addend_mips_rel(struct elf_info *elf, int rsection, Elf_Rela *r)
+{
+   Elf_Shdr *sechdrs = elf->sechdrs;
+   unsigned int r_typ;
+   unsigned int *location;
+   int section = sechdrs[rsection].sh_info;
+   unsigned int inst;
+
+   r_typ = ELF_R_TYPE(r->r_info);
+   if (r_typ == R_MIPS_HI16)
+   return 1;   /* skip this */
+   location = (void *)elf->hdr + sechdrs[section].sh_offset +
+   (r->r_offset - sechdrs[section].sh_addr);
+   inst = TO_NATIVE(*location);
+   switch (r_typ) {
+   case R_MIPS_LO16:
+   r->r_addend = inst & 0xffff;
+   break;
+   case R_MIPS_26:
+   r->r_addend = (inst & 0x03ffffff) << 2;
+   break;
+   case R_MIPS_32:
+   r->r_addend = inst;
+   break;
+   }
+   return 0;
+}
+
 /**
  * A module includes a number of sections that are discarded
  * either when loaded or when used as built-in.
@@ -938,8 +1010,11 @@ static void check_sec_ref(struct module *mod, const char 
*modname,

Re: Define CONFIG_BOUNCE to avoid useless inclusion of bounce buffer logic.

2007-05-21 Thread KAMEZAWA Hiroyuki

On Mon, 21 May 2007 21:03:40 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> The bounce buffer logic is included on systems that do not need it.
> If a system does not have zones like ZONE_DMA and ZONE_HIGHMEM that
> can lead to the use of bounce buffers then there is no need to reserve 
> memory pools etc etc. This is true f.e. for SGI Altix.

> +config BOUNCE
> + def_bool y
> + depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
> +

AFAIK, ppc has only ZONE_DMA and it never needs bounce.
Is this ok ?

-Kame


Re: Linux 2.6.22-rc2


Stephen Hemminger wrote:

There may be some hardware level interaction with SATA controller.
I saw no failures running off i386 kernel of PATA drive and quickly
see errors with SATA/AHCI and x86_64.



I presume AHCI is the only other device in the system using PCI MSI, 
when you see problems?


Jeff



[PATCH] smbfs: fix header_check (and build)


The build dies in the header_check portion:
/garz/repo/linux-2.6/usr/include/linux/smb_fs.h requires linux/jiffies.h, which does not exist in exported headers
make[3]: *** [/garz/repo/linux-2.6/usr/include/linux/.check.smb_fs.h] Error 1

The solution is to move the jiffies.h include inside __KERNEL__.

Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

diff --git a/include/linux/smb_fs.h b/include/linux/smb_fs.h
index 6b51a48..d5212f0 100644
--- a/include/linux/smb_fs.h
+++ b/include/linux/smb_fs.h
@@ -9,7 +9,6 @@
 #ifndef _LINUX_SMB_FS_H
 #define _LINUX_SMB_FS_H
 
-#include 
 #include 
 
 /*
@@ -26,6 +25,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 

Re: Linux 2.6.22-rc2

2007-05-21 Thread Stephen Hemminger

On Mon, 21 May 2007 22:58:06 -0400
Mike Houston <[EMAIL PROTECTED]> wrote:

> On Mon, 21 May 2007 10:37:55 -0700
> Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, 21 May 2007 13:10:55 -0400
> > Mike Houston <[EMAIL PROTECTED]> wrote:
> > 
> > > On Mon, 21 May 2007 08:45:49 -0700
> > > Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> > > 
> > > > It's almost certainly a problem with the BIOS and hardware (not
> > > > a sky2) driver issue. Since there are many similar boards and
> > > > configurations, I made the decision not to enforce restrictions
> > > > in the driver.
> > > 
> > > >> May 20 15:57:48 cramit kernel: sky2 :04:00.0: v1.14 addr
> > > >> 0xf800 irq 16 Yukon-EC Ultra (0xb4) rev 2
> > > 
> > > Thank you for your answer. I was half wondering if that was the
> > > case after staring at those log messages several more times. I
> > > don't understand hardware at the low level but got thinking maybe
> > > interrupt routing issue. There's an Nvidia PCI Express card in
> > > there that gets IRQ 16, though it was not initialized by a driver
> > > at the time. (plain old VGA console after fresh cold boot... no
> > > framebuffer, no X, no nvidia module). I guess some things don't
> > > share well.
> > > 
> > > It works well in that other OS that came with the hardware, but
> > > that's beside the point.
> > 
> > It is some low level PCI Express related stuff, try latest BIOS (F9)
> > and if that doesn't help there is a EEPROM update from Gigabyte
> > for the Marvell hardware that might help.
> 
> Thanks for your suggestions, I followed through on them. It may still
> be interesting/useful to hear from me that it didn't help. The
> problem is the same.
> 
> My motherboard is a newer revision (Gigabyte GA-965P-DS3 Rev 3.3) and
> already had the "F10" bios version, but I flashed to the latest F11
> version anyways. I also flashed with the EEPROM update from Gigabyte,
> from a FAQ entry for my motherboard revision.
> (faq_marvell_eeprom.zip). Both operations were successful. I cleared
> the CMOS and reconfigured after the bios flash too.
> 
> Incidentally, it was showing IRQ 16 in that early initialization
> message, but actually getting a MSI interrupt (IRQ 219, PCI-MSI-edge)
> 
> I've disabled the onboard yukon2 adapter in bios and gone
> back to the PCI card now. I think we can consider the matter closed,
> since it's not a problem with the driver, but just so you know, I'm
> always willing to help test when it's hardware that I have.
> 
> Mike Houston

There may be some hardware level interaction with SATA controller.
I saw no failures running off i386 kernel of PATA drive and quickly
see errors with SATA/AHCI and x86_64.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>

[PATCH] partitions/LDM: build fix


This from a "tested" patch...

Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

diff --git a/fs/partitions/ldm.c b/fs/partitions/ldm.c
index c387812..99873a2 100644
--- a/fs/partitions/ldm.c
+++ b/fs/partitions/ldm.c
@@ -158,7 +158,7 @@ static bool ldm_parse_privhead(const u8 *data, struct privhead *ph)
/* Warn the user and continue, carefully. */
ldm_info("Database is normally %u bytes, it claims to "
"be %llu bytes.", LDM_DB_SIZE,
-   udunsigned long long)ph->config_size);
+   (unsigned long long)ph->config_size);
}
if ((ph->logical_disk_size == 0) || (ph->logical_disk_start +
ph->logical_disk_size > ph->config_start)) {

Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree

On Mon, May 21, 2007 at 07:40:14PM -0700, Ken Chen wrote:
> tested, like this?
 
ACK.  Could merge loop_init_one() into the only remaining caller,
but it won't make the code simpler, so let's leave it at that.

Oops in dentry_iput with 2.6.22-rc2 on AMD64

2007-05-21 Thread Florin Iucha

I was running a multithreaded perl application that leaks some memory
so it gets to eat up a significant chunk of my 2 GB and even push a
bit into swap.  I left it running before going out for a walk.

When I got back, I found this in the log:

[28818.103829] Unable to handle kernel paging request at 910025c9a8b4 RIP: 
[28818.103836]  [] dentry_iput+0x58/0xbb
[28818.103845] PGD 0 
[28818.103848] Oops:  [1] SMP 
[28818.103851] CPU 0 
[28818.103853] Modules linked in: radeon ntfs sbp2 lp lgdt330x cx88_dvb cx88_vp3054_i2c dvb_pll video_buf_dvb tuner cx8802 cx88_alsa cx8800 cx88xx ir_common rtc tveeprom video_buf btcx_risc i2c_nforce2 evdev forcedeth
[28818.103868] Pid: 253, comm: kswapd0 Not tainted 2.6.22-rc2 #1
[28818.103871] RIP: 0010:[]  [] dentry_iput+0x58/0xbb
[28818.103877] RSP: :81007de75d40  EFLAGS: 00010246
[28818.103880] RAX:  RBX: 810075662e00 RCX: 810025c9a898
[28818.103883] RDX: 810025c9a898 RSI: 0001 RDI: 0282
[28818.103886] RBP: 81007de75d50 R08: 534d R09: 0064
[28818.103890] R10: 81007de75d80 R11: 000a R12: 910025c9a868
[28818.103893] R13: 810075662e08 R14:  R15: 005e
[28818.103897] FS:  42804940() GS:806c8000() knlGS:
[28818.103900] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[28818.103903] CR2: 910025c9a8b4 CR3: 254bf000 CR4: 06e0
[28818.103907] Process kswapd0 (pid: 253, threadinfo 81007de74000, task 81007de706d0)
[28818.103909] Stack:  810075662e00 0001 81007de75d70 80288aaf
[28818.103916]  81007e606070 0001 81007de75d90 80289a35
[28818.103921]  81007e606070 810075662e00 81007de75dd0 80289d3d
[28818.103926] Call Trace:
[28818.103931]  [] d_kill+0x38/0x58
[28818.103935]  [] prune_one_dentry+0x3c/0x10f
[28818.103939]  [] prune_dcache+0x137/0x1a0
[28818.103944]  [] shrink_dcache_memory+0x1c/0x35
[28818.103948]  [] shrink_slab+0xe6/0x162
[28818.103953]  [] kswapd+0x329/0x4ca
[28818.103958]  [] autoremove_wake_function+0x0/0x38
[28818.103963]  [] kswapd+0x0/0x4ca
[28818.103967]  [] kthread+0x49/0x76
[28818.103971]  [] child_rip+0xa/0x12
[28818.103977]  [] kthread+0x0/0x76
[28818.103980]  [] child_rip+0x0/0x12
[28818.103982] 
[28818.103984] 
[28818.103985] Code: 41 83 7c 24 4c 00 75 1c 4c 89 e7 45 31 c0 31 c9 31 d2 be 00 
[28818.103994] RIP  [] dentry_iput+0x58/0xbb
[28818.103999]  RSP 
[28818.104001] CR2: 910025c9a8b4

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163



Re: [PATCH] eCryptfs: Delay writing 0's after llseek until write

2007-05-21 Thread Andrew Morton

On Mon, 21 May 2007 18:00:21 -0500 Michael Halcrow <[EMAIL PROTECTED]> wrote:

> Delay writing 0's out in eCryptfs after a seek past the end of the
> file until data is actually written.

a) why?

b) what is the impact upon a user of them not having this patch?

Re: Status of squashfs?

2007-05-21 Thread Roland Dreier

 > So Fedora uses squashfs, Ubuntu uses squashfs, Gentoo uses squashfs...  It 
 > seems like the only place I can get a kernel _without_ squashfs is 
 > kernel.org.
 > 
 > Is there a reason for this?

Has anyone tried to merge it upstream?  Do the squashfs developers
want to merge it?

 - R.

Define CONFIG_BOUNCE to avoid useless inclusion of bounce buffer logic.

The bounce buffer logic is included on systems that do not need it.
If a system does not have zones like ZONE_DMA and ZONE_HIGHMEM that
can lead to the use of bounce buffers, then there is no need to reserve
memory pools and so on. This is true, for example, for SGI Altix.

Also tidies up the Makefile and gets rid of the tricky "and" there.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/blkdev.h |2 +-
 mm/Kconfig |4 
 mm/Makefile|4 +---
 3 files changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6/include/linux/blkdev.h
===
--- linux-2.6.orig/include/linux/blkdev.h   2007-05-18 15:29:43.0 -0700
+++ linux-2.6/include/linux/blkdev.h2007-05-18 15:32:48.0 -0700
@@ -607,7 +607,7 @@ extern unsigned long blk_max_low_pfn, bl
 #define BLK_BOUNCE_ANY ((u64)blk_max_pfn << PAGE_SHIFT)
 #define BLK_BOUNCE_ISA (ISA_DMA_THRESHOLD)
 
-#ifdef CONFIG_MMU
+#ifdef CONFIG_BOUNCE
 extern int init_emergency_isa_pool(void);
 extern void blk_queue_bounce(request_queue_t *q, struct bio **bio);
 #else
Index: linux-2.6/mm/Kconfig
===
--- linux-2.6.orig/mm/Kconfig   2007-05-18 15:31:18.0 -0700
+++ linux-2.6/mm/Kconfig2007-05-18 15:38:25.0 -0700
@@ -163,6 +163,10 @@ config ZONE_DMA_FLAG
default "0" if !ZONE_DMA
default "1"
 
+config BOUNCE
+   def_bool y
+   depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)
+
 config NR_QUICK
int
depends on QUICKLIST
Index: linux-2.6/mm/Makefile
===
--- linux-2.6.orig/mm/Makefile  2007-05-18 15:27:57.0 -0700
+++ linux-2.6/mm/Makefile   2007-05-18 15:33:17.0 -0700
@@ -13,9 +13,7 @@ obj-y := bootmem.o filemap.o mempool.o
   prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \
   $(mmu-y)
 
-ifeq ($(CONFIG_MMU)$(CONFIG_BLOCK),yy)
-obj-y  += bounce.o
-endif
+obj-$(CONFIG_BOUNCE)   += bounce.o
 obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o thrash.o
 obj-$(CONFIG_HUGETLBFS)+= hugetlb.o
 obj-$(CONFIG_NUMA) += mempolicy.o
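
As a rough illustration of what changes in the build decision, the old
Makefile test and the new Kconfig dependency can be written as two plain
C predicates.  This is only a sketch of the boolean logic; the function
names are made up and nothing like them exists in the tree:

```c
#include <stdbool.h>

/* Old Makefile rule: bounce.o was built whenever MMU and BLOCK were both 'y'. */
static bool bounce_built_old(bool mmu, bool block)
{
	return mmu && block;
}

/* New rule: mirrors "depends on BLOCK && MMU && (ZONE_DMA || HIGHMEM)". */
static bool bounce_built_new(bool block, bool mmu, bool zone_dma, bool highmem)
{
	return block && mmu && (zone_dma || highmem);
}
```

For a configuration with BLOCK and MMU set but neither ZONE_DMA nor
HIGHMEM, the old rule still builds bounce.o while the new one leaves it
out, which is the case the changelog is about.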

Status of squashfs?

2007-05-21 Thread Rob Landley

So Fedora uses squashfs, Ubuntu uses squashfs, Gentoo uses squashfs...  It 
seems like the only place I can get a kernel _without_ squashfs is 
kernel.org.

Is there a reason for this?

Rob

Re: [PATCH] ahci: add Marvell support (WIP)


George T. Joseph (development) wrote:

> Hi Jeff,
>
> Two issues with the patch...
>
> msi has to be disabled for the Marvell or the driver load will throw a
> "nobody cared" message and eventually hang before discovering all the
> drives.
>
> Light I/O works fine but heavy I/O generates
> "exception Emask 0x0 Sact 0xb Serr 0x0 action 0x2 frozen"
>
> Then softreset and identify failures.
> Adding AHCI_FLAG_NO_NCQ to the flags fixes this.
>
> I've been running with both these changes over your original patch for a
> few months now with no problems.  This is on a very heavily used 4 drive
> mdraid10 array.



I poked Marvell about this, we'll see what they say.

If I don't hear anything useful, I'll push the no-MSI, no-NCQ version 
upstream.


Jeff



Re: [RFC][PATCH 10/14] In-kernel file copy between union mounted filesystems

2007-05-21 Thread Bharata B Rao

On Fri, May 18, 2007 at 09:47:31AM -0400, Shaya Potter wrote:
> Bharata B Rao wrote:
> 
> >
> >Not really. This is called during copyup of a file residing in a lower
> >layer. And that is done only for regular files.
> 
> That is broken.

But it only breaks the semantics (in other cases we allow writes only to the
top layer files). So the question is: why do we have to copy up the device
node? What difference does it make to writing to the device itself? Currently
we allow writes to the device using the lower layer device node itself.

> 
> You should be able to change the permissions on a device node on a layer 
> that is RO.
> 

Hmm, not sure why we need to touch the permissions of the device. See below.

> so it would copy it up (1. mknod, 2. copy attributes) and then the 
> appropriate attribute notification change would be called.

With union mount, when a regular file is opened for write, it is checked
if it resides in the lower layer and if so copied up to the topmost layer
and this new fd is returned from open. And any subsequent writes using this
fd will go to the newly created topmost file. (We are aware that we are not
yet copying the (extended) attributes to the newly created topmost file,
which we have to do).
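
To make the open-for-write behaviour above concrete, here is a toy
userspace model of one path in a two-layer union.  It is only a sketch:
struct union_file and union_open() are hypothetical names, only regular
file data is modeled, and attribute copying is left out just as in the
current code.

```c
#include <stdbool.h>
#include <string.h>

enum layer { LAYER_NONE, LAYER_LOWER, LAYER_TOP };

/* Toy model of one path in a two-layer union; all names are hypothetical. */
struct union_file {
	bool in_lower;          /* file exists in the read-only lower layer */
	bool in_top;            /* file exists in the writable top layer */
	char lower_data[32];
	char top_data[32];
};

/*
 * Open for read or write and report which layer backs the returned handle.
 * A write open of a file that lives only in the lower layer copies it up
 * first, so every later write through that handle lands in the top layer.
 */
static enum layer union_open(struct union_file *f, bool for_write)
{
	if (f->in_top)
		return LAYER_TOP;
	if (!f->in_lower)
		return LAYER_NONE;
	if (!for_write)
		return LAYER_LOWER;

	/* copy-up: duplicate the data into the top layer (attributes not modeled) */
	memcpy(f->top_data, f->lower_data, sizeof(f->top_data));
	f->in_top = true;
	return LAYER_TOP;
}
```

A read-only open leaves the lower file alone; the first write open
creates the top-layer copy, and all later opens resolve to it.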

Regards,
Bharata.

Re: Linux 2.6.22-rc2

2007-05-21 Thread Mike Houston

On Mon, 21 May 2007 10:37:55 -0700
Stephen Hemminger <[EMAIL PROTECTED]> wrote:

> On Mon, 21 May 2007 13:10:55 -0400
> Mike Houston <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, 21 May 2007 08:45:49 -0700
> > Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> > 
> > > It's almost certainly a problem with the BIOS and hardware (not
> > > a sky2 driver issue). Since there are many similar boards and
> > > configurations, I made the decision not to enforce restrictions
> > > in the driver.
> > 
> > >> May 20 15:57:48 cramit kernel: sky2 :04:00.0: v1.14 addr
> > >> 0xf800 irq 16 Yukon-EC Ultra (0xb4) rev 2
> > 
> > Thank you for your answer. I was half wondering if that was the
> > case after staring at those log messages several more times. I
> > don't understand hardware at the low level but got thinking maybe
> > interrupt routing issue. There's an Nvidia PCI Express card in
> > there that gets IRQ 16, though it was not initialized by a driver
> > at the time. (plain old VGA console after fresh cold boot... no
> > framebuffer, no X, no nvidia module). I guess some things don't
> > share well.
> > 
> > It works well in that other OS that came with the hardware, but
> > that's beside the point.
> 
> It is some low level PCI Express related stuff, try latest BIOS (F9)
> and if that doesn't help there is a EEPROM update from Gigabyte
> for the Marvell hardware that might help.

Thanks for your suggestions, I followed through on them. It may still
be interesting/useful to hear from me that it didn't help. The
problem is the same.

My motherboard is a newer revision (Gigabyte GA-965P-DS3 Rev 3.3) and
already had the "F10" BIOS version, but I flashed to the latest F11
version anyway. I also flashed the EEPROM update from Gigabyte, taken
from a FAQ entry for my motherboard revision (faq_marvell_eeprom.zip).
Both operations were successful. I cleared the CMOS and reconfigured
after the BIOS flash too.

Incidentally, it was showing IRQ 16 in that early initialization
message, but actually getting an MSI interrupt (IRQ 219, PCI-MSI-edge).

I've disabled the onboard yukon2 adapter in bios and gone
back to the PCI card now. I think we can consider the matter closed,
since it's not a problem with the driver, but just so you know, I'm
always willing to help test when it's hardware that I have.

Mike Houston

Re: [RFC] enhancing the kernel's graphics subsystem

2007-05-21 Thread l l


Hi,

2007/5/18, Jesse Barnes <[EMAIL PROTECTED]>:

Comments, questions, suggestions?


FBUI, kdrive
http://home.comcast.net/~fbui/
http://www.freedesktop.org/wiki/Software/Xserver

TIA

Re: [RFC] enhancing the kernel's graphics subsystem

On Mon, 2007-05-21 at 17:51 -0700, Keith Packard wrote:
> 
> That's the plan; the kernel just provides mechanism. The architecture
> used in the X server splits precisely at this point with the mechanism
> in the driver and the configuration and policy up in the X server
> proper. Quite a bit of that code could be broken out into a shared
> library for fbdev-based apps and the X server to share, but that's
> down the road a bit after the kernel APIs look a lot more solid.

Ok, good plan then.

> With the goal of getting to a single-mode-set boot to avoid screen
> flashing before login, the key here is to make any early user-mode
> graphics apps share the same kernel graphics infrastructure as the X
> server to identify the common cases where the startup and X modes are
> the same and avoid resetting the configuration.

Ok. Fair enough.

Cheers,
Ben.



[PATCH 3/3] rd: Simplify by using the same helper functions in libfs

2007-05-21 Thread Eric W. Biederman


While the ramdisk code in the page cache started with the ramfs
code, it has diverged, and as a result is more complicated than
it currently needs to be.  This patch simplifies the ramdisk
code by syncing it with ramfs and similar pieces of code.

The big difference is that the ramdisk must cope with people placing
buffer heads on its pages, so there is extra code required to mark
those buffer heads dirty and uptodate.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 drivers/block/rd.c |   76 +++
 1 files changed, 5 insertions(+), 71 deletions(-)

diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index 41de0f4..56b2b54 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -92,36 +92,16 @@ static int rd_blocksize = CONFIG_BLK_DEV_RAM_BLOCKSIZE;
  * aops copied from ramfs.
  */
 
-/*
- * If a ramdisk page has buffers, some may be uptodate and some may be not.
- * To bring the page uptodate we zero out the non-uptodate buffers.  The
- * page must be locked.
- */
 static void make_page_uptodate(struct page *page)
 {
+   clear_highpage(page);
if (page_has_buffers(page)) {
struct buffer_head *bh = page_buffers(page);
struct buffer_head *head = bh;
 
do {
-   if (!buffer_uptodate(bh)) {
-   memset(bh->b_data, 0, bh->b_size);
-   /*
-* akpm: I'm totally undecided about this.  The
-* buffer has just been magically brought "up to
-* date", but nobody should want to be reading
-* it anyway, because it hasn't been used for
-* anything yet.  It is still in a "not read
-* from disk yet" state.
-*
-* But non-uptodate buffers against an uptodate
-* page are against the rules.  So do it anyway.
-*/
-set_buffer_uptodate(bh);
-   }
+   set_buffer_uptodate(bh);
} while ((bh = bh->b_this_page) != head);
-   } else {
-   memset(page_address(page), 0, PAGE_CACHE_SIZE);
}
flush_dcache_page(page);
SetPageUptodate(page);
@@ -129,55 +109,11 @@ static void make_page_uptodate(struct page *page)
 
 static int ramdisk_readpage(struct file *file, struct page *page)
 {
-   if (!PageUptodate(page))
-   make_page_uptodate(page);
+   make_page_uptodate(page);
unlock_page(page);
return 0;
 }
 
-static int ramdisk_prepare_write(struct file *file, struct page *page,
-   unsigned offset, unsigned to)
-{
-   if (!PageUptodate(page))
-   make_page_uptodate(page);
-   return 0;
-}
-
-static int ramdisk_commit_write(struct file *file, struct page *page,
-   unsigned offset, unsigned to)
-{
-   set_page_dirty(page);
-   return 0;
-}
-
-/*
- * ->writepage to the blockdev's mapping has to redirty the page so that the
- * VM doesn't go and steal it.  We return AOP_WRITEPAGE_ACTIVATE so that the VM
- * won't try to (pointlessly) write the page again for a while.
- *
- * Really, these pages should not be on the LRU at all.
- */
-static int ramdisk_writepage(struct page *page, struct writeback_control *wbc)
-{
-   if (!PageUptodate(page))
-   make_page_uptodate(page);
-   SetPageDirty(page);
-   if (wbc->for_reclaim)
-   return AOP_WRITEPAGE_ACTIVATE;
-   unlock_page(page);
-   return 0;
-}
-
-/*
- * This is a little speedup thing: short-circuit attempts to write back the
- * ramdisk blockdev inode to its non-existent backing store.
- */
-static int ramdisk_writepages(struct address_space *mapping,
-   struct writeback_control *wbc)
-{
-   return 0;
-}
-
 /*
  * ramdisk blockdev pages have their own ->set_page_dirty() because we don't
  * want them to contribute to dirty memory accounting.
@@ -206,11 +142,9 @@ static int ramdisk_set_page_dirty(struct page *page)
 
 static const struct address_space_operations ramdisk_aops = {
.readpage   = ramdisk_readpage,
-   .prepare_write  = ramdisk_prepare_write,
-   .commit_write   = ramdisk_commit_write,
-   .writepage  = ramdisk_writepage,
+   .prepare_write  = simple_prepare_write,
+   .commit_write   = simple_commit_write,
.set_page_dirty = ramdisk_set_page_dirty,
-   .writepages = ramdisk_writepages,
 };
 
 static int rd_blkdev_pagecache_IO(int rw, struct bio_vec *vec, sector_t sector,
-- 
1.5.1.1.181.g2de0


Re: [PATCH] Kconfig powernow-k8 driver should depend on ACPI P-States driver

2007-05-21 Thread Joshua Hoblitt

On Fri, May 18, 2007 at 12:01:08PM -0400, Dave Jones wrote:
> On Fri, May 18, 2007 at 12:09:38AM -0400, Ed Sweetman wrote:
[snip]
> still has unnecessary whitespace changes
[snip]
> and still wordwrapped.
> (also capitalise ACPI)

I haven't seen any more e-mail traffic on this topic so I'm assuming
that the ball has been dropped.  Please excuse me if an acceptable patch
has been submitted that I wasn't CC'd on. 

Here is a cleaned-up version of Ed's patch that I believe addresses Dave's
stylistic concerns and applies the relevant changes to both x86 & x86_64.

Signed-off-by: Joshua Hoblitt <[EMAIL PROTECTED]>
--
 i386/kernel/cpu/cpufreq/Kconfig |   13 ++---
 x86_64/kernel/cpufreq/Kconfig   |   13 ++---
 2 files changed, 20 insertions(+), 6 deletions(-)

diff -Nurp linux-2.6.22-rc1-mm1.orig/arch/i386/kernel/cpu/cpufreq/Kconfig linux-2.6.22-rc1-mm1/arch/i386/kernel/cpu/cpufreq/Kconfig
--- linux-2.6.22-rc1-mm1.orig/arch/i386/kernel/cpu/cpufreq/Kconfig  2007-04-27 11:49:26.0 -1000
+++ linux-2.6.22-rc1-mm1/arch/i386/kernel/cpu/cpufreq/Kconfig   2007-05-21 16:20:47.0 -1000
@@ -90,10 +90,17 @@ config X86_POWERNOW_K8
  If in doubt, say N.
 
 config X86_POWERNOW_K8_ACPI
-   bool
-   depends on X86_POWERNOW_K8 && ACPI_PROCESSOR
-   depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m)
+   bool "ACPI Support"
+   select ACPI_PROCESSOR
+   depends on X86_POWERNOW_K8
default y
+   help
+ This provides access to the K8s Processor Performance States via ACPI.
+ This driver is probably required for CPUFreq to work with multi-socket and
+ SMP systems.  It is not required on at least some single-socket yet
+ multi-core systems, even if SMP is enabled.
+
+ It is safe to say Y here.
 
 config X86_GX_SUSPMOD
tristate "Cyrix MediaGX/NatSemi Geode Suspend Modulation"
diff -Nurp linux-2.6.22-rc1-mm1.orig/arch/x86_64/kernel/cpufreq/Kconfig linux-2.6.22-rc1-mm1/arch/x86_64/kernel/cpufreq/Kconfig
--- linux-2.6.22-rc1-mm1.orig/arch/x86_64/kernel/cpufreq/Kconfig    2007-05-21 16:11:16.0 -1000
+++ linux-2.6.22-rc1-mm1/arch/x86_64/kernel/cpufreq/Kconfig 2007-05-21 16:29:11.0 -1000
@@ -24,10 +24,17 @@ config X86_POWERNOW_K8
  If in doubt, say N.
 
 config X86_POWERNOW_K8_ACPI
-   bool
-   depends on X86_POWERNOW_K8 && ACPI_PROCESSOR
-   depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m)
+   bool "ACPI Support"
+   select ACPI_PROCESSOR
+   depends on X86_POWERNOW_K8
default y
+   help
+ This provides access to the K8s Processor Performance States via ACPI.
+ This driver is probably required for CPUFreq to work with multi-socket and
+ SMP systems.  It is not required on at least some single-socket yet
+ multi-core systems, even if SMP is enabled.
+   
+ It is safe to say Y here.
 
 config X86_SPEEDSTEP_CENTRINO
tristate "Intel Enhanced SpeedStep (deprecated)"



Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree

2007-05-21 Thread Ken Chen


On 5/21/07, Ken Chen <[EMAIL PROTECTED]> wrote:

On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:
> No, it doesn't.  Really.  It's easy to split; untested incremental to your
> patch follows:
>
> for (i = 0; i < nr; i++) {
> -   if (!loop_init_one(i))
> -   goto err;
> +   lo = loop_alloc(i);
> +   if (!lo)
> +   goto Enomem;
> +   list_add_tail(&lo->lo_list, &loop_devices);
> }

ah, yes, use the loop_device list_head to link all the pending devices.


> +   /* point of no return */
> +
> +   list_for_each_entry(lo, &loop_devices, lo_list)
> +   add_disk(lo->lo_disk);
> +
> +   blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> + THIS_MODULE, loop_probe, NULL, NULL);
> +
> printk(KERN_INFO "loop: module loaded\n");
> return 0;
> -err:
> -   loop_exit();
> +
> +Enomem:
> printk(KERN_INFO "loop: out of memory\n");
> +
> +   while(!list_empty(&loop_devices)) {
> +   lo = list_entry(loop_devices.next, struct loop_device, lo_list);
> +   loop_del_one(lo);
> +   }
> +
> +   unregister_blkdev(LOOP_MAJOR, "loop");
> return -ENOMEM;
>  }

I suppose the loop_del_one call in Enomem label needs to be split up
too since in the error path, it hasn't done add_disk() yet?
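
The shape being discussed (allocate everything first, publish only after
the point of no return, and use a plain free in the error path) can be
sketched in userspace C as below.  This is an illustration under
assumptions, not the loop driver: struct dev, dev_alloc(), dev_free(),
init_all(), fail_at and live_devs are all hypothetical, and "published"
stands in for add_disk().

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct loop_device. */
struct dev {
	int number;
	int published;          /* set once the device is visible to users */
	struct dev *next;
};

static int fail_at = -1;        /* device number whose allocation fails; -1 = never */
static int live_devs;           /* allocated-but-not-freed devices */

static struct dev *dev_alloc(int number)
{
	struct dev *d;

	if (number == fail_at)
		return NULL;
	d = calloc(1, sizeof(*d));
	if (d) {
		d->number = number;
		live_devs++;
	}
	return d;
}

/* Frees a device that was never published (no un-registration needed). */
static void dev_free(struct dev *d)
{
	live_devs--;
	free(d);
}

/*
 * Phase 1: allocate all nr devices onto a private list.
 * Phase 2 (point of no return): publish every device.
 * Returns the list head, or NULL if an allocation failed (or nr is 0).
 */
static struct dev *init_all(int nr)
{
	struct dev *head = NULL, **tail = &head, *d;
	int i;

	for (i = 0; i < nr; i++) {
		d = dev_alloc(i);
		if (!d)
			goto enomem;
		*tail = d;
		tail = &d->next;
	}

	for (d = head; d; d = d->next)
		d->published = 1;
	return head;

enomem:
	/* Nothing was published yet, so freeing is the whole cleanup. */
	while (head) {
		d = head;
		head = d->next;
		dev_free(d);
	}
	return NULL;
}
```

Because the error path only ever sees unpublished devices, it needs the
"free" half of the teardown and not the "delete from the system" half,
which is the split the question above is asking for.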



tested, like this?

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 5526ead..0ed5470 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1354,7 +1354,7 @@ #endif
 */
static int max_loop;
module_param(max_loop, int, 0);
-MODULE_PARM_DESC(max_loop, "obsolete, loop device is created on-demand");
+MODULE_PARM_DESC(max_loop, "Maximum number of loop devices");
MODULE_LICENSE("GPL");
MODULE_ALIAS_BLOCKDEV_MAJOR(LOOP_MAJOR);

@@ -1394,16 +1394,11 @@ int loop_unregister_transfer
EXPORT_SYMBOL(loop_register_transfer);
EXPORT_SYMBOL(loop_unregister_transfer);

-static struct loop_device *loop_init_one(int i)
+static struct loop_device *loop_alloc(int i)
{
struct loop_device *lo;
struct gendisk *disk;

-   list_for_each_entry(lo, &loop_devices, lo_list) {
-   if (lo->lo_number == i)
-   return lo;
-   }
-
lo = kzalloc(sizeof(*lo), GFP_KERNEL);
if (!lo)
goto out;
@@ -1427,8 +1422,6 @@ static struct loop_device *loop_init_one
disk->private_data   = lo;
disk->queue  = lo->lo_queue;
sprintf(disk->disk_name, "loop%d", i);
-   add_disk(disk);
-   list_add_tail(&lo->lo_list, &loop_devices);
return lo;

out_free_queue:
@@ -1439,15 +1432,37 @@ out:
return NULL;
}

-static void loop_del_one(struct loop_device *lo)
+static void loop_free(struct loop_device *lo)
{
-   del_gendisk(lo->lo_disk);
blk_cleanup_queue(lo->lo_queue);
put_disk(lo->lo_disk);
list_del(&lo->lo_list);
kfree(lo);
}

+static struct loop_device *loop_init_one(int i)
+{
+   struct loop_device *lo;
+
+   list_for_each_entry(lo, &loop_devices, lo_list) {
+   if (lo->lo_number == i)
+   return lo;
+   }
+
+   lo = loop_alloc(i);
+   if (lo) {
+   add_disk(lo->lo_disk);
+   list_add_tail(&lo->lo_list, &loop_devices);
+   }
+   return lo;
+}
+
+static void loop_del_one(struct loop_device *lo)
+{
+   del_gendisk(lo->lo_disk);
+   loop_free(lo);
+}
+
static struct kobject *loop_probe(dev_t dev, int *part, void *data)
{
struct loop_device *lo;
@@ -1464,28 +1479,77 @@ static struct kobject *loop_probe

static int __init loop_init(void)
{
-   if (register_blkdev(LOOP_MAJOR, "loop"))
-   return -EIO;
-   blk_register_region(MKDEV(LOOP_MAJOR, 0), 1UL << MINORBITS,
- THIS_MODULE, loop_probe, NULL, NULL);
+   int i, nr;
+   unsigned long range;
+   struct loop_device *lo, *next;
+
+   /*
+* loop module now has a feature to instantiate underlying device
+* structure on-demand, provided that there is an access dev node.
+* However, this will not work well with user space tool that doesn't
+* know about such "feature".  In order to not break any existing
+* tool, we do the following:
+*
+* (1) if max_loop is specified, create that many upfront, and this
+* also becomes a hard limit.
+* (2) if max_loop is not specified, create 8 loop device on module
+* load, user can further extend loop device by create dev node
+* themselves and have kernel automatically instantiate actual
+* device on-demand.
+*/
+   if (max_loop > 1UL << MINORBITS)
+   return -EINVAL;

if (max_loop) {
-   printk(KERN_INFO "loop: the max_loop option is obsolete "
-"and will b

[PATCH 2/3] rd: Mark ramdisk buffer heads dirty in ramdisk_set_page_dirty

2007-05-21 Thread Eric W. Biederman


The problem:  When we are trying to free buffers, try_to_free_buffers
will look at ramdisk pages with clean buffer heads and remove the
dirty bit from the page, resulting in ramdisk pages with data that
get removed from the page cache.  Ouch!

When we mark a ramdisk page dirty we call set_page_dirty, which then
calls ramdisk_set_page_dirty.  Currently we don't mark the buffer
heads dirty, leaving us susceptible to the problem above.

So to fix the mismatch between buffer head state and page state this patch
modifies ramdisk_set_page_dirty to set the dirty bit on all of the buffers
a page may possess.

I set the uptodate bit on the buffer head so that later we can use
simple_commit_write, and because it is trivially safe.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 drivers/block/rd.c |   15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/block/rd.c b/drivers/block/rd.c
index a1512da..41de0f4 100644
--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -184,6 +184,21 @@ static int ramdisk_writepages(struct address_space *mapping,
  */
 static int ramdisk_set_page_dirty(struct page *page)
 {
+   struct address_space * const mapping = page_mapping(page);
+
+   spin_lock(&mapping->private_lock);
+   if (page_has_buffers(page)) {
+   struct buffer_head *head = page_buffers(page);
+   struct buffer_head *bh = head;
+
+   do {
+   set_buffer_uptodate(bh);
+   set_buffer_dirty(bh);
+   bh = bh->b_this_page;
+   } while (bh != head);
+   }
+   spin_unlock(&mapping->private_lock);
+
if (!TestSetPageDirty(page))
return 1;
return 0;
-- 
1.5.1.1.181.g2de0
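
For readers who want to poke at the loop in the patch above outside the
kernel, here is a simplified userspace model of it.  The struct and
function names are stand-ins invented for this sketch (they are not the
kernel's struct buffer_head or struct page), and the private_lock
locking is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for struct buffer_head and struct page. */
struct buf {
	bool uptodate;
	bool dirty;
	struct buf *this_page;  /* next buffer on the same page, circular */
};

struct page_model {
	bool dirty;
	struct buf *buffers;    /* NULL when the page has no buffers */
};

/*
 * Mark every buffer on the page uptodate and dirty, then set the page's
 * dirty bit.  Returns 1 if the page was newly dirtied, 0 if it already was.
 */
static int model_set_page_dirty(struct page_model *page)
{
	if (page->buffers) {
		struct buf *head = page->buffers;
		struct buf *bh = head;

		/* The buffer list is a ring, so stop when we are back at head. */
		do {
			bh->uptodate = true;
			bh->dirty = true;
			bh = bh->this_page;
		} while (bh != head);
	}

	if (!page->dirty) {
		page->dirty = true;
		return 1;
	}
	return 0;
}
```

After the call, page state and buffer state agree: a dirty page never
carries a clean buffer, which is the invariant the changelog is after.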


Re: PCI device problem - MMCONFIG, cannot allocate resource region, resource collisions

2007-05-21 Thread Wayne Sherman


Jesse Barnes wrote:
> There's a recent thread about PCI resource assignment (sounds like your
> BIOS might be buggy btw, or you're somehow running out of space), search
> for the title "PCI bridge range sizing bug".  You may need the kernel to
> reassign the resource for your NIC before you can use it.  I think Ivan
> has some test patches along these lines.
>
> If you can find out what resource it's colliding with, that might give you
> a clue.


I don't have anything else plugged in to the PC (except a USB drive). 
BIOS is set to PNP OS.  How do I find out what it is colliding with?


Here is a full lspci output:

# lspci -v

00:00.0 Host bridge: ATI Technologies Inc RS480 Host Bridge (rev 10)
Subsystem: ATI Technologies Inc RS480 Host Bridge
Flags: bus master, 66MHz, medium devsel, latency 0

00:01.0 PCI bridge: ATI Technologies Inc RS480 PCI Bridge (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 64
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: a000-afff
Memory behind bridge: ff40-ff4f
Prefetchable memory behind bridge: fab0-fea0
Capabilities: [44] HyperTransport: MSI Mapping
Capabilities: [b0] #0d []

00:12.0 SATA controller: ATI Technologies Inc SB600 Non-Raid-5 SATA (prog-if 01 [AHCI 1.0])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7244
Flags: bus master, 66MHz, medium devsel, latency 96, IRQ 18
I/O ports at e800 [size=8]
I/O ports at e400 [size=4]
I/O ports at e000 [size=8]
I/O ports at dc00 [size=4]
I/O ports at d800 [size=16]
Memory at ff6ffc00 (32-bit, non-prefetchable) [size=1K]
Capabilities: [60] Power Management version 2
Capabilities: [50] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-
Capabilities: [70] #12 [0010]

00:13.0 USB Controller: ATI Technologies Inc SB600 USB (OHCI0) (prog-if 10 [OHCI])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 19
Memory at ff6fe000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.1 USB Controller: ATI Technologies Inc SB600 USB (OHCI1) (prog-if 10 [OHCI])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 21
Memory at ff6fd000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.2 USB Controller: ATI Technologies Inc SB600 USB (OHCI2) (prog-if 10 [OHCI])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 22
Memory at ff6fc000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.3 USB Controller: ATI Technologies Inc SB600 USB (OHCI3) (prog-if 10 [OHCI])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 21
Memory at ff6fb000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.4 USB Controller: ATI Technologies Inc SB600 USB (OHCI4) (prog-if 10 [OHCI])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 22
Memory at ff6fa000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-

00:13.5 USB Controller: ATI Technologies Inc SB600 USB Controller (EHCI) (prog-if 20 [EHCI])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 20
Memory at ff6ff800 (32-bit, non-prefetchable) [size=256]
Capabilities: [c0] Power Management version 2
Capabilities: [d0] Message Signalled Interrupts: 64bit- Queue=0/0 Enable-
Capabilities: [e4] Debug port

00:14.0 SMBus: ATI Technologies Inc SB600 SMBus (rev 13)
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: 66MHz, medium devsel
I/O ports at 0b00 [size=16]
Capabilities: [b0] HyperTransport: MSI Mapping

00:14.1 IDE interface: ATI Technologies Inc SB600 IDE (prog-if 8a [Master SecP PriP])
Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 19
I/O ports at 01f0 [size=8]
I/O ports at 03f4 [size=1]
I/O ports at 0170 [size=8]
I/O ports at 0374 [size=1]
I/O ports at ff00 [size=16]
Capabilities: [70] Mess

[PATCH 1/3] Preserve the dirty bit in init_page_buffers

2007-05-21 Thread Eric W. Biederman


The problem:  When we are trying to free buffers, try_to_free_buffers
will look at ramdisk pages with clean buffer heads and remove the
dirty bit from the page, resulting in ramdisk pages with data that
get removed from the page cache.  Ouch!

Buffer heads appear on ramdisk pages when a filesystem calls getblk,
which through a series of function calls eventually calls
init_page_buffers. 

So to fix the mismatch between buffer head state and page state this
patch modifies init_page_buffers to transfer the dirty bit from the
page to the buffer heads like we currently do for the uptodate bit.

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 fs/buffer.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index aa68206..c6b58e8 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -953,6 +953,7 @@ init_page_buffers(struct page *page, struct block_device *bdev,
struct buffer_head *head = page_buffers(page);
struct buffer_head *bh = head;
int uptodate = PageUptodate(page);
+   int dirty = PageDirty(page);
 
do {
if (!buffer_mapped(bh)) {
@@ -961,6 +962,8 @@ init_page_buffers(struct page *page, struct block_device *bdev,
bh->b_blocknr = block;
if (uptodate)
set_buffer_uptodate(bh);
+   if (dirty)
+   set_buffer_dirty(bh);
set_buffer_mapped(bh);
}
block++;
-- 
1.5.1.1.181.g2de0
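
The failure mode and the fix can be reproduced with a very rough
userspace model.  Everything here is an assumption made for the sketch:
struct rd_page, attach_buffers() and free_buffers() are invented names,
and free_buffers() only captures the one behaviour described in the
changelog (clean buffers get dropped and take the page's dirty bit with
them), not the real try_to_free_buffers.

```c
#include <stdbool.h>

#define NBUF 4

/* Invented stand-in for a page with up to NBUF buffers attached. */
struct rd_page {
	bool uptodate, dirty;
	bool has_buffers;
	struct { bool mapped, uptodate, dirty; } buf[NBUF];
};

/*
 * Attach buffers to a page.  inherit_dirty selects the behaviour with
 * (true) or without (false) the dirty-bit transfer added by the patch.
 */
static void attach_buffers(struct rd_page *p, bool inherit_dirty)
{
	int i;

	p->has_buffers = true;
	for (i = 0; i < NBUF; i++) {
		if (p->buf[i].mapped)
			continue;
		if (p->uptodate)
			p->buf[i].uptodate = true;
		if (inherit_dirty && p->dirty)
			p->buf[i].dirty = true;
		p->buf[i].mapped = true;
	}
}

/*
 * Rough model of the reclaim path: if no buffer is dirty, the buffers are
 * dropped and the page's dirty bit is cleared along with them.
 * Returns true if the buffers were freed.
 */
static bool free_buffers(struct rd_page *p)
{
	int i;

	for (i = 0; i < NBUF; i++)
		if (p->buf[i].dirty)
			return false;
	p->has_buffers = false;
	p->dirty = false;
	return true;
}
```

Without the transfer, a dirty page picks up clean buffers and loses its
dirty bit on the next reclaim attempt; with it, the buffers stay dirty
and the page is left alone.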


Re: [stable] [PATCH] - store sysfs inode nrs in s_ino to avoid readdir oopses

2007-05-21 Thread Eric Sandeen


(2nd try, better(?) changelog, quilt refreshed(!) patch)

--

Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there
is no synchronization with readdir.  This patch follows Tejun's
scheme of allocating and storing an inode number in the new s_ino
member of a sysfs_dirent, when dirents are created, and retrieving 
it from there for readdir, so that the pointer chain doesn't have 
to be traversed.


Tejun's upstream patch uses a new-ish "ida" allocator which brings along
some extra complexity; this -stable patch has a brain-dead incrementing
counter which does not guarantee uniqueness, but because sysfs doesn't 
hash inodes as iunique expects, uniqueness wasn't guaranteed today anyway.


Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>

Index: linux-2.6.21/fs/sysfs/dir.c
===
--- linux-2.6.21.orig/fs/sysfs/dir.c
+++ linux-2.6.21/fs/sysfs/dir.c
@@ -30,6 +30,14 @@ static struct dentry_operations sysfs_de
.d_iput = sysfs_d_iput,
};

+static unsigned int sysfs_inode_counter;
+ino_t sysfs_get_inum(void)
+{
+   if (unlikely(sysfs_inode_counter < 3))
+   sysfs_inode_counter = 3;
+   return sysfs_inode_counter++;
+}
+
/*
 * Allocates a new sysfs_dirent and links it to the parent sysfs_dirent
 */
@@ -41,6 +49,7 @@ static struct sysfs_dirent * __sysfs_new
if (!sd)
return NULL;

+   sd->s_ino = sysfs_get_inum();
atomic_set(&sd->s_count, 1);
atomic_set(&sd->s_event, 1);
INIT_LIST_HEAD(&sd->s_children);
@@ -509,7 +518,7 @@ static int sysfs_readdir(struct file * f

switch (i) {
case 0:
-   ino = dentry->d_inode->i_ino;
+   ino = parent_sd->s_ino;
if (filldir(dirent, ".", 1, i, ino, DT_DIR) < 0)
break;
filp->f_pos++;
@@ -538,10 +547,7 @@ static int sysfs_readdir(struct file * f

name = sysfs_get_name(next);
len = strlen(name);
-   if (next->s_dentry)
-   ino = next->s_dentry->d_inode->i_ino;
-   else
-   ino = iunique(sysfs_sb, 2);
+   ino = next->s_ino;

if (filldir(dirent, name, len, filp->f_pos, ino,
 dt_type(next)) < 0)
Index: linux-2.6.21/fs/sysfs/inode.c
===
--- linux-2.6.21.orig/fs/sysfs/inode.c
+++ linux-2.6.21/fs/sysfs/inode.c
@@ -140,6 +140,7 @@ struct inode * sysfs_new_inode(mode_t mo
inode->i_mapping->a_ops = &sysfs_aops;
inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
inode->i_op = &sysfs_inode_operations;
+   inode->i_ino = sd->s_ino;
lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);

if (sd->s_iattr) {
Index: linux-2.6.21/fs/sysfs/mount.c
===
--- linux-2.6.21.orig/fs/sysfs/mount.c
+++ linux-2.6.21/fs/sysfs/mount.c
@@ -33,6 +33,7 @@ static struct sysfs_dirent sysfs_root = 
	.s_element	= NULL,

.s_type = SYSFS_ROOT,
.s_iattr= NULL,
+   .s_ino  = 1,
};

static void sysfs_clear_inode(struct inode *inode)
Index: linux-2.6.21/fs/sysfs/sysfs.h
===
--- linux-2.6.21.orig/fs/sysfs/sysfs.h
+++ linux-2.6.21/fs/sysfs/sysfs.h
@@ -5,6 +5,7 @@ struct sysfs_dirent {
void* s_element;
int s_type;
umode_t s_mode;
+   ino_t   s_ino;
struct dentry   * s_dentry;
struct iattr* s_iattr;
atomic_t s_event;


[RFC][PATCH] do_div_signed()

2007-05-21 Thread john stultz

Here's a quick pass at adding do_div_signed() which provides a signed
version of do_div, avoiding having do_div users hack around signed
issues (like in ntp.c).

It probably could be optimized further, so let me know if you have any
suggestions.


Other thoughts?

thanks
-john


Signed-off-by: John Stultz <[EMAIL PROTECTED]>


diff --git a/include/asm-arm/div64.h b/include/asm-arm/div64.h
index 0b5f881..af44e10 100644
--- a/include/asm-arm/div64.h
+++ b/include/asm-arm/div64.h
@@ -225,5 +225,6 @@
 #endif
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 
 #endif
diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
index a4a4937..e1cac65 100644
--- a/include/asm-generic/div64.h
+++ b/include/asm-generic/div64.h
@@ -62,4 +62,5 @@ extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
 
 #endif /* BITS_PER_LONG */
 
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif /* _ASM_GENERIC_DIV64_H */
diff --git a/include/asm-i386/div64.h b/include/asm-i386/div64.h
index 438e980..05320a5 100644
--- a/include/asm-i386/div64.h
+++ b/include/asm-i386/div64.h
@@ -49,4 +49,5 @@ div_ll_X_l_rem(long long divs, long div, long *rem)
 }
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif
diff --git a/include/asm-m68k/div64.h b/include/asm-m68k/div64.h
index 33caad1..3c76059 100644
--- a/include/asm-m68k/div64.h
+++ b/include/asm-m68k/div64.h
@@ -26,4 +26,5 @@
 })
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif /* _M68K_DIV64_H */
diff --git a/include/asm-mips/div64.h b/include/asm-mips/div64.h
index 66189f5..851ce40 100644
--- a/include/asm-mips/div64.h
+++ b/include/asm-mips/div64.h
@@ -111,5 +111,6 @@ static inline uint64_t div64_64(uint64_t dividend, uint64_t divisor)
 }
 
 #endif /* (_MIPS_SZLONG == 64) */
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 
 #endif /* _ASM_DIV64_H */
diff --git a/include/asm-um/div64.h b/include/asm-um/div64.h
index 7b73b2c..1fc4a2c 100644
--- a/include/asm-um/div64.h
+++ b/include/asm-um/div64.h
@@ -4,4 +4,5 @@
 #include "asm/arch/div64.h"
 
 extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+extern int64_t do_div_signed(int64_t *n, int32_t base);
 #endif
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 87aa5ff..7c093ad 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -302,16 +302,11 @@ int do_adjtimex(struct timex *txc)
freq_adj = time_offset * mtemp;
freq_adj = shift_right(freq_adj, time_constant * 2 +
   (SHIFT_PLL + 2) * 2 - SHIFT_NSEC);
-   if (mtemp >= MINSEC && (time_status & STA_FLL || mtemp > MAXSEC)) {
+   if (mtemp >= MINSEC && 
+   (time_status & STA_FLL || mtemp > MAXSEC)) {
temp64 = time_offset << (SHIFT_NSEC - SHIFT_FLL);
-   if (time_offset < 0) {
-   temp64 = -temp64;
-   do_div(temp64, mtemp);
-   freq_adj -= temp64;
-   } else {
-   do_div(temp64, mtemp);
-   freq_adj += temp64;
-   }
+   do_div_signed(&temp64, mtemp);
+   freq_adj += temp64;
}
freq_adj += time_freq;
freq_adj = min(freq_adj, (s64)MAXFREQ_NSEC);
diff --git a/lib/div64.c b/lib/div64.c
index b71cf93..e6ff440 100644
--- a/lib/div64.c
+++ b/lib/div64.c
@@ -79,3 +79,37 @@ uint64_t div64_64(uint64_t dividend, uint64_t divisor)
 EXPORT_SYMBOL(div64_64);
 
 #endif /* BITS_PER_LONG == 32 */
+
+/* Signed 64 bit dividend, result, rem. Signed 32 bit divisor */
+int64_t do_div_signed(int64_t *n, int32_t base)
+{
+   uint64_t num, den;
+   int64_t rem;
+   int num_sign = (*n < 0);
+   int den_sign = (base < 0);
+
+   if (num_sign)
+   num = (uint64_t)(-*n);
+   else
+   num = (uint64_t)*n;
+   
+   /* XXX this is sort of obnoxious,but seems necessary 
+* to handle the base possibly being negative as well
+*/
+   if (den_sign)
+   den = (uint32_t)(-base);
+   else
+   den = (uint32_t)base;
+
+   rem = do_div(num, den);
+
+   *n = (int64_t)num;
+   if(num_sign ^ den_sign)
+   *n = -*n;
+   if(num_sign)
+   rem = -rem;
+
+   return rem;
+}
+
+EXPORT_SYMBOL(do_div_signed);
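
For experimenting outside the kernel, the same sign handling can be
sketched in userspace.  This is an illustration under assumptions, not
the posted code: sketch_div_signed is a hypothetical name, plain C '/'
and '%' stand in for do_div(), and the negations are done in unsigned
arithmetic so the most negative inputs do not overflow a signed type.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for do_div_signed(): divides *n by base in place
 * and returns the remainder.
 */
static int64_t sketch_div_signed(int64_t *n, int32_t base)
{
	int num_neg = (*n < 0);
	int den_neg = (base < 0);
	/* Negate as unsigned so INT64_MIN and INT32_MIN stay well defined. */
	uint64_t num = num_neg ? 0 - (uint64_t)*n : (uint64_t)*n;
	uint64_t den = den_neg ? 0 - (uint64_t)(int64_t)base : (uint64_t)base;
	uint64_t quot, rem;

	assert(base != 0);
	quot = num / den;
	rem = num % den;

	/*
	 * The quotient is negative when exactly one operand is negative.
	 * The conversion back to int64_t relies on two's-complement
	 * wrapping, which GCC and Clang provide.
	 */
	*n = (int64_t)((num_neg ^ den_neg) ? 0 - quot : quot);

	/* The remainder takes the sign of the dividend, as C99 '%' does. */
	return num_neg ? -(int64_t)rem : (int64_t)rem;
}
```

With this, dividing -7 by 2 leaves -3 in *n and returns -1, matching what
C's truncating division gives for the same operands.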



Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree

2007-05-21 Thread Ken Chen


On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:

> No, it doesn't.  Really.  It's easy to split; untested incremental to your
> patch follows:
>
> for (i = 0; i < nr; i++) {
> -   if (!loop_init_one(i))
> -   goto err;
> +   lo = loop_alloc(i);
> +   if (!lo)
> +   goto Enomem;
> +   list_add_tail(&lo->lo_list, &loop_devices);
> }


ah, yes, use the loop_device list_head to link all the pending devices.



> +   /* point of no return */
> +
> +   list_for_each_entry(lo, &loop_devices, lo_list)
> +   add_disk(lo->lo_disk);
> +
> +   blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> + THIS_MODULE, loop_probe, NULL, NULL);
> +
> printk(KERN_INFO "loop: module loaded\n");
> return 0;
> -err:
> -   loop_exit();
> +
> +Enomem:
> printk(KERN_INFO "loop: out of memory\n");
> +
> +   while(!list_empty(&loop_devices)) {
> +   lo = list_entry(loop_devices.next, struct loop_device, lo_list);
> +   loop_del_one(lo);
> +   }
> +
> +   unregister_blkdev(LOOP_MAJOR, "loop");
> return -ENOMEM;
>  }


I suppose the loop_del_one call in Enomem label needs to be split up
too since in the error path, it hasn't done add_disk() yet?

Re: 2.6.22-rc1-mm1

2007-05-21 Thread young dave


Hi,

> This implies a miscompile somewhere, *or* that your bios stomps on
> registers that gcc expect preserved, and adding printf's disturbs the
> register allocation sufficiently.


I think it may be caused by a gcc optimization, so I added volatile to
the read_sector inline assembly; with that change the kernel boots
successfully.

Please check this patch:

diff -ur linux/arch/i386/boot/edd.c linux.new/arch/i386/boot/edd.c
--- linux/arch/i386/boot/edd.c  2007-05-22 10:08:59.0 +
+++ linux.new/arch/i386/boot/edd.c  2007-05-22 10:06:24.0 +
@@ -47,7 +47,7 @@
   ax = 0x4200;/* Extended Read */
   si = (size_t)&dapa;
   dx = devno;
-   asm ("pushfl; stc; int $0x13; setc %%al; popfl"
+   asm volatile("pushfl; stc; int $0x13; setc %%al; popfl"
   : "+a" (ax), "+S" (si), "+d" (devno)
   : : "ebx", "ecx", "edi");

@@ -58,7 +58,7 @@
   cx = 0x0001;/* Sector 0-0-1 */
   dx = devno;
   bx = (size_t)buf;
-   asm ("pushfl; stc; int $0x13; setc %%al; popfl"
+   asm volatile("pushfl; stc; int $0x13; setc %%al; popfl"
   : "+a" (ax), "+c" (cx), "+d" (dx), "+b" (bx)
   : : "esi", "edi");
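
As background, here is an architecture-neutral sketch of what the
volatile qualifier changes, going by the GCC extended-asm documentation.
It is not the edd.c code: an empty template stands in for the int $0x13
call, the function names are hypothetical, and it does not establish
that this is the actual cause of the boot failure.

```c
/* Hypothetical helpers; an empty template stands in for the BIOS call. */

/* Without volatile, GCC may delete or move this asm if 'x' is never read. */
static inline unsigned int toy_asm_plain(unsigned int x)
{
	__asm__("" : "+r"(x));
	return x;
}

/* With volatile, the asm statement is kept even if its outputs go unused. */
static inline unsigned int toy_asm_volatile(unsigned int x)
{
	__asm__ volatile("" : "+r"(x));
	return x;
}
```

Note that volatile alone does not tell the compiler that the asm reads
or writes memory; GCC documents a "memory" clobber for that purpose.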

Regards
dave

Re: PCI device problem - MMCONFIG, cannot allocate resource region, resource collisions

2007-05-21 Thread Jesse Barnes

On Monday, May 21, 2007, System Design Works wrote:
> The kernel has a problem allocating resources for my PCI NIC.  Here is
> what the kernel is reporting:
>
> # uname -a
> Linux wopr 2.6.20-gentoo-r8 #7 SMP Sun May 20 20:56:56 PDT 2007 i686 AMD
> Athlon(tm) 64 X2 Dual Core Processor 3800+ AuthenticAMD GNU/Linux
>
> # dmesg
> ...
> PCI: BIOS Bug:  MCFG area at e000 is not E820-reserved
> PCI: Not using MMCONFIG.

This is actually an unrelated problem.  We're a little too conservative 
about using MCFG space (though this turns out to be a good thing on some 
of my machines), but that shouldn't affect the rest of your PCI resource 
assignment.

> ...
> PCI: Cannot allocate resource region 0 of device :02:02.0
> ...
> PCI: Device :02:02.0 not available because of resource collisions
> skge: :02:02.0 cannot enable PCI device
> skge: probe of :02:02.0 failed with error -22
> ...
>
> I have seen other posts reporting similar error messages.  I would like
> to help resolve this problem, and I can do some testing if needed.  More
> info:
>
> Kernel boot params:   pci=nomsi

There's a recent thread about PCI resource assignment (sounds like your 
BIOS might be buggy btw, or you're somehow running out of space), search 
for the title "PCI bridge range sizing bug".  You may need the kernel to 
reassign the resource for your NIC before you can use it.  I think Ivan 
has some test patches along these lines.

If you can find out what resource it's colliding with, that might give you 
a clue.
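
One way to look for the colliding resource is to compare the address
range of the failing BAR against the ranges listed in /proc/iomem, which
are printed as inclusive start-end pairs.  A minimal sketch of the
overlap test, with a hypothetical toy_range type:

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, in the style of /proc/iomem entries. */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

/* Two inclusive ranges collide when neither lies wholly before the other. */
static bool toy_ranges_overlap(struct toy_range a, struct toy_range b)
{
	return a.start <= b.end && b.start <= a.end;
}
```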

Jesse

Re: IDE/ATA: Intel i865-based mainboard, CDROM not detected

2007-05-21 Thread Jonathan Woithe

> > Are we talking CONFIG_PATA_MARVELL here?  If so then the kernel I just
> > booted has this set to "y" (ie: built-in) and yet the drive is still not
> > detected.  Is there a newer version of this driver somewhere?  The kernel
> > was 2.6.22-rc2.
> 
> Should be current. Its known to work fine for that chip so you might need
> to do some debugging and provide more detail.

I got someone to go into the BIOS setup (I am debugging this remotely) and
check the IDE/ATA related options.  The two relevant options were set
as follows:

  ATA/IDE Mode: [ Choices: Legacy, Native ]
  Configure SATA as:   [ Choices: IDE, RAID, AHCI ]

I had them change the "Configure SATA as" setting from "IDE" to "AHCI"
and then reboot into 2.6.22-rc2.  At this point things appear to be much
happier:

  ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
  Probing IDE interface ide0...
  Probing IDE interface ide1...
  :
  ahci :00:1f.2: version 2.1
  ACPI: PCI Interrupt :00:1f.2[A] -> GSI 19 (level, low) -> IRQ 19
  ahci :00:1f.2: AHCI 0001.0100 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
  ahci :00:1f.2: flags: 64bit ncq led clo pio slum part 
  PCI: Setting latency timer of device :00:1f.2 to 64
  scsi0 : ahci
  :
  scsi5 : ahci
  ata1: SATA max UDMA/133 cmd 0xf882a100 ctl 0x bmdma 0x irq 0
  :
  ata6: SATA max UDMA/133 cmd 0xf882a380 ctl 0x bmdma 0x irq 0
  ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
  ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
  ata1.00: ATA-7: WDC WD2500AAJS-00RYA0, 12.01B01, max UDMA/133
  ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 31/32)
  ata1.00: ata_hpa_resize 1: sectors = 488397168, hpa_sectors = 488397168
  ata1.00: configured for UDMA/133
  ata2: SATA link down (SStatus 0 SControl 300)
  :
  ata6: SATA link down (SStatus 0 SControl 300)
  scsi 0:0:0:0: Direct-Access ATA  WDC WD2500AAJS-0 12.0 PQ: 0 ANSI: 5
  sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
  sd 0:0:0:0: [sda] Write Protect is off
  :
  sd 0:0:0:0: [sda] Attached SCSI disk
  ACPI: PCI Interrupt :02:00.0[A] -> GSI 17 (level, low) -> IRQ 16
  PCI: Setting latency timer of device :02:00.0 to 64
  scsi6 : pata_marvell
  scsi7 : pata_marvell
  ata7: PATA max UDMA/100 cmd 0x00011018 ctl 0x00011026 bmdma 0x00011000 irq 0
  ata8: DUMMY
  BAR5:00:00 01:7F 02:22 03:CA 04:00 05:00 06:00 07:00 08:00 09:00 0A:00 0B:00 0C:01 0D:00 0E:00 0F:00 
  ata7.00: ATAPI, max UDMA/66
  ata7.00: limited to UDMA/33 due to 40-wire cable
  ata7.00: configured for UDMA/33
  scsi 6:0:0:0: CD-ROMSONY DVD RW DRU-830A  SS25 PQ: 0 ANSI: 5
  sr0: scsi3-mmc drive: 40x/40x writer dvd-ram cd/rw xa/form2 cdda tray
  Uniform CD-ROM driver Revision: 3.20
  sr 6:0:0:0: Attached scsi CD-ROM sr0
  sr 6:0:0:0: Attached scsi generic sg1 type 5

Therefore it seems that for whatever reason, the pata_marvell driver will
only find the Marvell PATA IDE controller if the *SATA* mode in the BIOS is
set to "AHCI".  It's somewhat counter-intuitive, but since AHCI is the
"correct" setting for SATA performance reasons it's probably not such a bad
thing.

So, thanks for all the suggestions; the problem appears solved.

Regards
  jonathan

Re: [rfc] increase struct page size?!

On Mon, May 21, 2007 at 06:39:51PM -0700, William Lee Irwin III wrote:
> On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
> >> ... yeah, something like that would bypass 
> 
> On Mon, May 21, 2007 at 05:43:16PM -0500, Matt Mackall wrote:
> > As long as we're throwing out crazy unpopular ideas, try this one:
> > Divide struct page in two such that all the most commonly used
> > elements are in one piece that's nicely sized and the rest are in
> > another. Have two parallel arrays containing these pieces and accessor
> > functions around the unpopular bits.
> > Whether there's a sensible divide between popular and unpopular bits
> > isn't clear to me. But hey, I said it was crazy.
> 
> I have a crazier and even less popular idea. Eliminate struct page
> entirely as an accounting structure (and, of course, mem_map with it).
> Filesystems can keep the per-page metadata they need in their own
> accounting structures, slab mutatis mutandis, etc. The brilliant bit
> here is that devolving the accounting structures this way allows the
> fs and/or subsystem to arrange for strong cache locality, file offset
> adjacency to imply memory adjacency of the page accounting fields,
> etc., where grabbing random structures out of some array is a real
> cache thrasher.
> 
> The page allocation and page replacement algorithms would have to be
> adjusted, and things would have to allocate their own refcounts,
> supposing they want/need refcounts, but it's not so far out. Refer to
> filesystem pages by  pairs, refer to slab pages by

BTW. I think the filesystem APIs (at least the VM-side ones) should be
doing this anyway (not even index, but offset). Passing things like
lists of pages around is just horrible. See my write_begin/write_end
and perform_write aops for (what I think is) a step in the right
direction.


> address (virtual and physical are trivially inter-convertible), mock
> up something akin to what filesystems do for anonymous pages, etc.
> 
> The real objection everyone's going to have is that driver writers
> will stain their shorts when faced with the rules for handling such
> things. The thing is, I'm not entirely sure who these driver writers
> that would have such trouble are, since the driver writers I know
> personally are sophisticates rather than walking disaster areas as such
> would imply. I suppose they may not be representative of the whole.

That's not the objection I would have. I would say that firstly, I
don't think the mem_map overhead is very significant (at any rate,
allocated-on-demand metadata is not going to be any smaller if
you fill up on pagecache...). Secondly, I think there is merit to
having the same page metadata used by the major subsystems, because
it helps for locality of reference.

But I haven't explored the idea enough myself to know whether there
would be any really killer benefits to this. Delayed metadata freeing
via RCU without holding up the freeing of the actual page would have
been something; however, I can do something similar with speculative
references now (or whenever the code gets merged), which doesn't even
require the RCU overhead.


> -- wli
> 
> P.S. This idea is not plucked out of the air; it has precedents. A
> number of microkernels do this, and IIRC k42 does so also.

Psst, just say "kernels" when you mention this to Linus ;)

Re: [RFC] enhancing the kernel's graphics subsystem

2007-05-21 Thread Jesse Barnes

On Monday, May 21, 2007, Jon Smirl wrote:
> I am not asking that these features be implemented today. I am asking
> that enough planning go into the architecture today to make sure that
> these features can be built in the future without tearing up the
> graphics system for a third time.
>
> This is the essence of my complaint about this patch. The patch
> introduces a new low level graphics API to the kernel. Once we put an
> API in it is basically impossible to get it back out. I am not
> convinced that enough planning has gone into this API yet.

Jon, that's why I'm posting this stuff in the first place! :)  Again, if 
you have specific problems with the proposed interfaces (problems that 
would preclude your wishlist from being fully implementable), please let 
me know (preferably with specifics).

Thanks,
Jesse

Re: [stable] [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes

2007-05-21 Thread Chris Wright

* Chris Wright ([EMAIL PROTECTED]) wrote:
> * Alan Cox ([EMAIL PROTECTED]) wrote:
> > Yeah - fix your mailer, you got a reply 5 days ago.
> 
> Sure wouldn't be the first time something broke.  I'll take a look.

Thanks for the prod.  I found 2 quite stale RBL entries, causing
long connection delays (long enough for some MTAs to walk away, perhaps).

thanks,
-chris

Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree

On Mon, May 21, 2007 at 06:30:15PM -0700, Ken Chen wrote:
> On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:
> >On Mon, May 21, 2007 at 03:00:55PM -0700, [EMAIL PROTECTED] wrote:
> >> + if (register_blkdev(LOOP_MAJOR, "loop"))
> >> + return -EIO;
> >> + blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> >> +   THIS_MODULE, loop_probe, NULL, NULL);
> >> +
> >> + for (i = 0; i < nr; i++) {
> >> + if (!loop_init_one(i))
> >> + goto err;
> >> + }
> >> +
> >> + printk(KERN_INFO "loop: module loaded\n");
> >> + return 0;
> >> +err:
> >> + loop_exit();
> >
> >This isn't good.  You *can't* fail once a single disk has been registered.
> >Anyone could've opened it by now.
> >
> >IOW, you need to
> >* register region *after* you are past the point of no return
> 
> That option is a lot harder than I thought.  This requires an array to
> keep intermediate result of preallocated "lo" device, blk_queue, and
> disk structure before calling add_disk() or register region.  And this
> array could be potentially 1 million entries.  Maybe I will use
> vmalloc for it, but seems rather sick.

No, it doesn't.  Really.  It's easy to split; untested incremental to your
patch follows:

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0aae8d8..2300490 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1394,16 +1394,11 @@ int loop_unregister_transfer(int number)
 EXPORT_SYMBOL(loop_register_transfer);
 EXPORT_SYMBOL(loop_unregister_transfer);
 
-static struct loop_device *loop_init_one(int i)
+static struct loop_device *loop_alloc(int i)
 {
struct loop_device *lo;
struct gendisk *disk;
 
-   list_for_each_entry(lo, &loop_devices, lo_list) {
-   if (lo->lo_number == i)
-   return lo;
-   }
-
lo = kzalloc(sizeof(*lo), GFP_KERNEL);
if (!lo)
goto out;
@@ -1427,8 +1422,6 @@ static struct loop_device *loop_init_one(int i)
disk->private_data  = lo;
disk->queue = lo->lo_queue;
sprintf(disk->disk_name, "loop%d", i);
-   add_disk(disk);
-   list_add_tail(&lo->lo_list, &loop_devices);
return lo;
 
 out_free_queue:
@@ -1439,6 +1432,23 @@ out:
return NULL;
 }
 
+static struct loop_device *loop_init_one(int i)
+{
+   struct loop_device *lo;
+
+   list_for_each_entry(lo, &loop_devices, lo_list) {
+   if (lo->lo_number == i)
+   return lo;
+   }
+
+   lo = loop_alloc(i);
+   if (lo) {
+   add_disk(lo->lo_disk);
+   list_add_tail(&lo->lo_list, &loop_devices);
+   }
+   return lo;
+}
+
 static void loop_del_one(struct loop_device *lo)
 {
del_gendisk(lo->lo_disk);
@@ -1481,6 +1491,7 @@ static int __init loop_init(void)
 {
int i, nr;
unsigned long range;
+   struct loop_device *lo;
 
/*
 * loop module now has a feature to instantiate underlying device
@@ -1506,19 +1517,34 @@ static int __init loop_init(void)
 
if (register_blkdev(LOOP_MAJOR, "loop"))
return -EIO;
-   blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
- THIS_MODULE, loop_probe, NULL, NULL);
 
for (i = 0; i < nr; i++) {
-   if (!loop_init_one(i))
-   goto err;
+   lo = loop_alloc(i);
+   if (!lo)
+   goto Enomem;
+   list_add_tail(&lo->lo_list, &loop_devices);
}
 
+   /* point of no return */
+
+   list_for_each_entry(lo, &loop_devices, lo_list)
+   add_disk(lo->lo_disk);
+
+   blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
+ THIS_MODULE, loop_probe, NULL, NULL);
+
printk(KERN_INFO "loop: module loaded\n");
return 0;
-err:
-   loop_exit();
+
+Enomem:
printk(KERN_INFO "loop: out of memory\n");
+
+   while(!list_empty(&loop_devices)) {
+   lo = list_entry(loop_devices.next, struct loop_device, lo_list);
+   loop_del_one(lo);
+   }
+
+   unregister_blkdev(LOOP_MAJOR, "loop");
return -ENOMEM;
 }
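
The shape of the split, allocate everything first and only then make it
visible, can be tried out in userspace.  The sketch below is a toy under
stated assumptions: toy_dev, toy_alloc, toy_free_all and toy_init are
hypothetical names, a "published" flag stands in for add_disk(), and
fail_at is a test hook that simulates running out of memory.

```c
#include <stdbool.h>
#include <stdlib.h>

struct toy_dev {
	int number;
	bool published;		/* stands in for add_disk() having run */
	struct toy_dev *next;
};

/* Phase 1 helper: allocate only.  fail_at simulates an OOM at index i. */
static struct toy_dev *toy_alloc(int i, int fail_at)
{
	struct toy_dev *d;

	if (i == fail_at)
		return NULL;
	d = calloc(1, sizeof(*d));
	if (d)
		d->number = i;
	return d;
}

/* Undo for devices that were never published: just free them. */
static void toy_free_all(struct toy_dev *head)
{
	while (head) {
		struct toy_dev *next = head->next;

		free(head);
		head = next;
	}
}

/*
 * Returns the list of published devices, or NULL if any allocation
 * failed (or nr is 0).  Nothing becomes visible unless every
 * allocation succeeded.
 */
static struct toy_dev *toy_init(int nr, int fail_at)
{
	struct toy_dev *head = NULL, **tail = &head, *d;
	int i;

	for (i = 0; i < nr; i++) {
		d = toy_alloc(i, fail_at);
		if (!d) {
			toy_free_all(head);
			return NULL;
		}
		*tail = d;
		tail = &d->next;
	}

	/* point of no return: nothing below can fail */
	for (d = head; d; d = d->next)
		d->published = true;
	return head;
}
```

In this toy the error path only frees memory, because nothing has been
published yet.  That is the same asymmetry raised elsewhere in the
thread about calling loop_del_one() before add_disk() has run.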
 

Re: [PATCH]serial: make early_uart to use early_param instead of console_initcall

2007-05-21 Thread Yinghai Lu


On 5/21/07, Bjorn Helgaas <[EMAIL PROTECTED]> wrote:
>
> I don't want to add asm-ia64/fixmap.h with dummy definitions
> just for this.
>
> Can we add this:
>
>   asm-ia64/io.h:   #define bt_ioremap ioremap
>   asm-x86_64/io.h: #define bt_ioremap early_ioremap
>
> and use bt_ioremap instead?
>


Please check whether it works with ia64.

I added fix_ioremap in
include/asm-x86_64/io.h
include/asm-i386/io.h
include/asm-ia64/io.h

The command line will be
console=uart8250,io,0x3f8,9600n8
console=uart8250,mmio,0xffe5,115200n8
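
To make the option format concrete, here is a userspace sketch that
splits such a string into its parts.  It is not the parser from the
patch: toy_parse_earlycon and the toy_earlycon struct are hypothetical,
and the "uart8250,<io|mmio>,<addr>[,options]" layout is inferred from
the two example command lines above.

```c
#include <stdlib.h>
#include <string.h>

enum toy_iotype { TOY_IO_PORT, TOY_IO_MEM };

struct toy_earlycon {
	enum toy_iotype iotype;
	unsigned long addr;
	const char *options;	/* points into the input string, or NULL */
};

/* Parse "uart8250,<io|mmio>,<addr>[,options]"; returns 0 on success. */
static int toy_parse_earlycon(const char *arg, struct toy_earlycon *out)
{
	static const char prefix[] = "uart8250,";
	char *end;

	if (strncmp(arg, prefix, sizeof(prefix) - 1) != 0)
		return -1;
	arg += sizeof(prefix) - 1;

	if (strncmp(arg, "io,", 3) == 0) {
		out->iotype = TOY_IO_PORT;
		arg += 3;
	} else if (strncmp(arg, "mmio,", 5) == 0) {
		out->iotype = TOY_IO_MEM;
		arg += 5;
	} else {
		return -1;
	}

	out->addr = strtoul(arg, &end, 0);	/* base 0 accepts a 0x prefix */
	if (end == arg)
		return -1;			/* no digits at all */

	if (*end == ',')
		out->options = end + 1;
	else if (*end == '\0')
		out->options = NULL;
	else
		return -1;
	return 0;
}
```

Feeding it the value after "console=" from the first example yields an
I/O port at 0x3f8 with the options string "9600n8".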

YH
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 09220a1..634d809 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -444,8 +444,16 @@ and is between 256 and 4096 characters. It is defined in the file
 			Documentation/networking/netconsole.txt for an
 			alternative.
 
-		uart,io,[,options]
-		uart,mmio,[,options]
+		uart8250,io,[,options]
+		uart8250,mmio,[,options]
+			Start an early, polled-mode console on the 8250/16550
+			UART at the specified I/O port or MMIO address,
+			switching to the matching ttyS device later.  The
+			options are the same as for ttyS, above.
+
+	earlycon=	[KNL] Output early console device and options.
+		uart8250,io,[,options]
+		uart8250,mmio,[,options]
 			Start an early, polled-mode console on the 8250/16550
 			UART at the specified I/O port or MMIO address,
 			switching to the matching ttyS device later.  The
diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index f74dfc4..8271466 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -168,6 +168,12 @@ page_pde_offset = (__PAGE_OFFSET >> 20);
 .section .init.text,"ax",@progbits
 #endif
 
+	/* Do an early initialization of the fixmap area */
+	movl $(swapper_pg_dir - __PAGE_OFFSET), %edx
+	movl $(swapper_pg_pmd - __PAGE_OFFSET), %eax
+	addl $0x007, %eax			/* 0x007 = PRESENT+RW+USER */
+	movl %eax, 4092(%edx)
+
 #ifdef CONFIG_SMP
 ENTRY(startup_32_smp)
 	cld
@@ -507,6 +513,8 @@ ENTRY(_stext)
 .section ".bss.page_aligned","w"
 ENTRY(swapper_pg_dir)
 	.fill 1024,4,0
+ENTRY(swapper_pg_pmd)
+	.fill 1024,4,0
 ENTRY(empty_zero_page)
 	.fill 4096,1,0
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index eaa6a24..dd7f95b 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -390,10 +390,6 @@ early_console_setup (char *cmdline)
 	if (!efi_setup_pcdp_console(cmdline))
 		earlycons++;
 #endif
-#ifdef CONFIG_SERIAL_8250_CONSOLE
-	if (!early_serial_console_init(cmdline))
-		earlycons++;
-#endif
 
 	return (earlycons) ? 0 : -1;
 }
diff --git a/arch/x86_64/kernel/head.S b/arch/x86_64/kernel/head.S
index 1fab487..941c84b 100644
--- a/arch/x86_64/kernel/head.S
+++ b/arch/x86_64/kernel/head.S
@@ -73,7 +73,11 @@ startup_64:
 	addq	%rbp, init_level4_pgt + (511*8)(%rip)
 
 	addq	%rbp, level3_ident_pgt + 0(%rip)
+
 	addq	%rbp, level3_kernel_pgt + (510*8)(%rip)
+	addq	%rbp, level3_kernel_pgt + (511*8)(%rip)
+
+	addq	%rbp, level2_fixmap_pgt + (506*8)(%rip)
 
 	/* Add an Identity mapping if I am above 1G */
 	leaq	_text(%rip), %rdi
@@ -314,7 +318,16 @@ NEXT_PAGE(level3_kernel_pgt)
 	.fill	510,8,0
 	/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
 	.quad	level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE
-	.fill	1,8,0
+	.quad	level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+
+NEXT_PAGE(level2_fixmap_pgt)
+	.fill	506,8,0
+	.quad	level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE
+	/* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */
+	.fill	5,8,0
+
+NEXT_PAGE(level1_fixmap_pgt)
+	.fill	512,8,0
 
 NEXT_PAGE(level2_ident_pgt)
 	/* Since I easily can, map the first 1G.
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index c84dab0..5a91ac5 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -2367,6 +2367,9 @@ static struct uart_ops serial8250_pops = {
 	.request_port	= serial8250_request_port,
 	.config_port	= serial8250_config_port,
 	.verify_port	= serial8250_verify_port,
+#ifdef CONFIG_SERIAL_8250_CONSOLE
+	.find_port_for_earlycon = serial8250_find_port_for_earlycon,
+#endif
 };
 
 static struct uart_8250_port serial8250_ports[UART_NR];
@@ -2533,7 +2536,7 @@ static int __init serial8250_console_init(void)
 }
 console_initcall(serial8250_console_init);
 
-static int __init find_port(struct uart_port *p)
+int __init find_port_serial8250(struct uart_port *p)
 {
 	int line;
 	struct uart_port *port;
@@ -2546,26 +2549,6 @@ static int __init find_port(struct uart_port *p)
 	return -ENODEV;
 }
 
-int __init serial8250_start_console(struct uart_port *port, char *options)
-{
-	int line;
-
-	line = find_port(port);
-	if (line < 0)
-		return -ENODEV;
-
-	add_preferred_console("ttyS", line, options);
-	printk("Adding console on ttyS%d at %s 0x%lx (options '%s')\n",
-		line, port->iotype == UPIO_MEM ? "MMIO" : "I/O port",
-		port->iotype == UPIO_MEM ? (unsigned long) port->mapbase :
-

PCI device problem - MMCONFIG, cannot allocate resource region, resource collisions

2007-05-21 Thread System Design Works

The kernel has a problem allocating resources for my PCI NIC.  Here is 
what the kernel is reporting:


# uname -a
Linux wopr 2.6.20-gentoo-r8 #7 SMP Sun May 20 20:56:56 PDT 2007 i686 AMD 
Athlon(tm) 64 X2 Dual Core Processor 3800+ AuthenticAMD GNU/Linux


# dmesg
...
PCI: BIOS Bug:  MCFG area at e000 is not E820-reserved
PCI: Not using MMCONFIG.
...
PCI: Cannot allocate resource region 0 of device :02:02.0
...
PCI: Device :02:02.0 not available because of resource collisions
skge: :02:02.0 cannot enable PCI device
skge: probe of :02:02.0 failed with error -22
...

I have seen other posts reporting similar error messages.  I would like 
to help resolve this problem, and I can do some testing if needed.  More 
info:


Kernel boot params:   pci=nomsi

PCI Device:
   D-Link DGE-530T  (10/100/1000 Gigabit Desktop Adapter)
   http://www.dlink.com/products/?pid=284

Motherboard:
   MSI K9AGM-FID  (AMD Socket AM2)
   Chipset:  SB600 / RS485
   http://www.msicomputer.com/product/p_spec.asp?model=K9AGM-FID&class=mb

# lspci -vvv
...
02:02.0 Non-VGA unclassified device: D-Link System Inc Unknown device 4901 (rev 11)
   Subsystem: D-Link System Inc Unknown device 4901
   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
   Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR-
   Latency: 64 (5750ns min, 7250ns max), Cache Line Size: 64 bytes
   Interrupt: pin A routed to IRQ 11
   Region 0: Memory at  (32-bit, non-prefetchable)
   Region 1: I/O ports at b800 [size=256]
   Expansion ROM at  [disabled]
   Capabilities: [48] Power Management version 2
   Flags: PMEClk- DSI- D1- D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
   Status: D0 PME-Enable- DSel=0 DScale=1 PME-
   Capabilities: [50] Vital Product Data
...

Thanks,

Wayne
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [rfc] increase struct page size?!

2007-05-21 Thread William Lee Irwin III

On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
>> ... yeah, something like that would bypass 

On Mon, May 21, 2007 at 05:43:16PM -0500, Matt Mackall wrote:
> As long as we're throwing out crazy unpopular ideas, try this one:
> Divide struct page in two such that all the most commonly used
> elements are in one piece that's nicely sized and the rest are in
> another. Have two parallel arrays containing these pieces and accessor
> functions around the unpopular bits.
> Whether a sensible divide between popular and unpopular bits isn't
> clear to me. But hey, I said it was crazy.

I have a crazier and even less popular idea. Eliminate struct page
entirely as an accounting structure (and, of course, mem_map with it).
Filesystems can keep the per-page metadata they need in their own
accounting structures, slab mutatis mutandis, etc. The brilliant bit
here is that devolving the accounting structures this way allows the
fs and/or subsystem to arrange for strong cache locality, file offset
adjacency to imply memory adjacency of the page accounting fields,
etc., where grabbing random structures out of some array is a real
cache thrasher.

The page allocation and page replacement algorithms would have to be
adjusted, and things would have to allocate their own refcounts,
supposing they want/need refcounts, but it's not so far out. Refer to
filesystem pages by  pairs, refer to slab pages by
address (virtual and physical are trivially inter-convertible), mock
up something akin to what filesystems do for anonymous pages, etc.

The real objection everyone's going to have is that driver writers
will stain their shorts when faced with the rules for handling such
things. The thing is, I'm not entirely sure who these driver writers
that would have such trouble are, since the driver writers I know
personally are sophisticates rather than walking disaster areas as such
would imply. I suppose they may not be representative of the whole.

-- wli

P.S. This idea is not plucked out of the air; it has precedents. A
number of microkernels do this, and IIRC k42 does so also.

Re: [stable] [PATCH] - fix oops in sysfs_readdir

2007-05-21 Thread Tejun Heo

Andrew Morton wrote:
> Actually, someone (eg distros) looking at Tejun's changelog would still be
> struggling to answer the question "do I need this".  The one thing it
> claims to fix is "duplicate inode numbers".  But why is that a problem? 
> What are the user-visible consequences of not merging the patch?  Unobvious.

The oops part is explained in #2.  sysfs_dirent->s_dentry can go away
anytime and the original code accesses it without any synchronization,
so it can end up dereferencing NULL or access already freed memory.
And, yeah, this is another place where reclaim-related oops occurs.

-- 
tejun

Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree

2007-05-21 Thread Ken Chen

On 5/21/07, Al Viro <[EMAIL PROTECTED]> wrote:
> On Mon, May 21, 2007 at 03:00:55PM -0700, [EMAIL PROTECTED] wrote:
> > + if (register_blkdev(LOOP_MAJOR, "loop"))
> > + return -EIO;
> > + blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> > +   THIS_MODULE, loop_probe, NULL, NULL);
> > +
> > + for (i = 0; i < nr; i++) {
> > + if (!loop_init_one(i))
> > + goto err;
> > + }
> > +
> > + printk(KERN_INFO "loop: module loaded\n");
> > + return 0;
> > +err:
> > + loop_exit();
>
> This isn't good.  You *can't* fail once a single disk has been registered.
> Anyone could've opened it by now.
>
> IOW, you need to
> * register region *after* you are past the point of no return

That option is a lot harder than I thought.  It requires an array to
hold the intermediate results (the preallocated "lo" device, blk_queue,
and disk structures) before calling add_disk() or registering the
region.  That array could potentially have 1 million entries.  Maybe I
will use vmalloc for it, but that seems rather sick.
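
For reference, the "stage everything first, publish only after nothing can
fail" ordering discussed above can be sketched in plain userspace C.  This is
only an illustration under assumptions: fake_dev, fake_dev_alloc(),
fake_dev_publish(), init_all() and the fail_at test hook are hypothetical
names, not the loop driver's API, and publishing here just sets a flag where
the driver would call add_disk() or register the region.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a preallocated device; not a kernel structure. */
struct fake_dev {
	int index;
	int published;
};

static int fail_at = -1;	/* test hook: index whose allocation fails */
static int published_count;	/* how many devices became visible */

static struct fake_dev *fake_dev_alloc(int i)
{
	struct fake_dev *d;

	if (i == fail_at)
		return NULL;
	d = calloc(1, sizeof(*d));
	if (d)
		d->index = i;
	return d;
}

/* Point of no return: after this the device is visible to others. */
static void fake_dev_publish(struct fake_dev *d)
{
	d->published = 1;
	published_count++;
}

/* Phase 1 may fail and unwinds completely; phase 2 cannot fail. */
static int init_all(int nr, struct fake_dev ***out)
{
	struct fake_dev **staged;
	int i;

	if (nr <= 0)
		return -1;
	staged = calloc(nr, sizeof(*staged));
	if (!staged)
		return -1;

	for (i = 0; i < nr; i++) {
		staged[i] = fake_dev_alloc(i);
		if (!staged[i])
			goto unwind;
	}

	for (i = 0; i < nr; i++)
		fake_dev_publish(staged[i]);

	*out = staged;
	return 0;

unwind:
	/* Nothing has been published yet, so freeing is safe. */
	while (--i >= 0)
		free(staged[i]);
	free(staged);
	return -1;
}
```

The cost is exactly the staging array mentioned above: one pointer per
device, which is what gets large when the device count does.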

Re: select(0, ..) is valid ?

2007-05-21 Thread Nish Aravamudan


On 5/18/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
> On Wednesday 16 May 2007 17:37, Anton Blanchard wrote:
> > Hi Hugh,
> >
> > > It's interesting that compat_core_sys_select() shows this kmalloc(0)
> > > failure but core_sys_select() does not.  That's because core_sys_select()
> > > avoids kmalloc by using a buffer on the stack for small allocations (and
> > > 0 sure is small).  Shouldn't compat_core_sys_select() do just the same?
> > > Or is SLUB going to be so efficient that doing so is a waste of time?
> >
> > Nice catch, the original optimisation from Andi is:
> >
> > http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f
> >
> > And I think it makes sense for the compat code to do it too.
>
> Yes agreed. I just forgot the copy'n'pasted code when doing the original
> change.


Is this headed upstream? It's causing some noise on test.kernel.org
now that SLAB is also warning about kmalloc(0).
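
For anyone following along, the stack-buffer trick described in the quoted
text looks roughly like the userspace sketch below.  It is an illustration
only: sum_via_scratch() and SCRATCH_ON_STACK are made-up names and the
256-byte threshold is an assumption, not what core_sys_select() uses.  The
point is that a zero-length (or small) request never reaches the allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_ON_STACK 256	/* assumed threshold, for illustration */

/*
 * Copy len bytes into a scratch buffer and sum them.  Small requests
 * (including len == 0) use the on-stack buffer; only large ones hit
 * the heap.  *used_heap reports which path was taken.
 */
static long sum_via_scratch(const unsigned char *src, size_t len,
			    int *used_heap)
{
	unsigned char stack_buf[SCRATCH_ON_STACK];
	unsigned char *buf = stack_buf;
	long sum = 0;
	size_t i;

	*used_heap = 0;
	if (len > sizeof(stack_buf)) {
		buf = malloc(len);
		if (!buf)
			return -1;
		*used_heap = 1;
	}

	if (len)
		memcpy(buf, src, len);
	for (i = 0; i < len; i++)
		sum += buf[i];

	if (buf != stack_buf)
		free(buf);
	return sum;
}
```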

Thanks,
Nish

Re: [stable] [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes

2007-05-21 Thread Chris Wright

* Alan Cox ([EMAIL PROTECTED]) wrote:
> Yeah - fix your mailer, you got a reply 5 days ago.

Sure wouldn't be the first time something broke.  I'll take a look.

thanks,
-chris

Re: [rfc] increase struct page size?!

On Tue, 22 May 2007, Nick Piggin wrote:

> That would be unpopular with pagecache, because that uses pretty well
> all fields.

SLUB also uses all fields


[PATCH] Fix headers check fallout

2007-05-21 Thread Stephen Rothwell

commit e8edc6e03a5c8562dc70a6d969f732bdb355a7e7 added an include of
linux/jiffies.h in linux/smb_fs.h outside the ifdef __KERNEL__.

Signed-off-by: Stephen Rothwell <[EMAIL PROTECTED]>
---
 include/linux/smb_fs.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

-- 
Cheers,
Stephen Rothwell[EMAIL PROTECTED]

diff --git a/include/linux/smb_fs.h b/include/linux/smb_fs.h
index 6b51a48..2c5cd55 100644
--- a/include/linux/smb_fs.h
+++ b/include/linux/smb_fs.h
@@ -9,7 +9,6 @@
 #ifndef _LINUX_SMB_FS_H
 #define _LINUX_SMB_FS_H
 
-#include 
 #include 
 
 /*
@@ -30,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static inline struct smb_sb_info *SMB_SB(struct super_block *sb)
-- 
1.5.1.4


Re: [rfc] increase struct page size?!

On Mon, May 21, 2007 at 05:43:16PM -0500, Matt Mackall wrote:
> On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
> > 
> > ... yeah, something like that would bypass 
> 
> As long as we're throwing out crazy unpopular ideas, try this one:
> 
> Divide struct page in two such that all the most commonly used
> elements are in one piece that's nicely sized and the rest are in
> another. Have two parallel arrays containing these pieces and accessor
> functions around the unpopular bits.
> 
> Whether a sensible divide between popular and unpopular bits isn't
> clear to me. But hey, I said it was crazy.

That would be unpopular with pagecache, because that uses pretty well
all fields.
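
To make the quoted proposal concrete, here is a rough userspace sketch of two
parallel arrays with accessors around the less-used fields.  Everything in it
is hypothetical: the field split, NR_PAGES, and the names page_hot, page_cold,
page_index() and set_page_index() are chosen for illustration and do not
reflect how struct page is actually laid out.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PAGES 1024	/* hypothetical number of page frames */

/* Frequently touched fields, kept small and densely packed. */
struct page_hot {
	unsigned long flags;
	int refcount;
};

/* Rarely touched fields, stored in a second array at the same index. */
struct page_cold {
	void *private_data;
	unsigned long index;
};

static struct page_hot hot_map[NR_PAGES];
static struct page_cold cold_map[NR_PAGES];

/* Array position of a page, derived from its address in hot_map. */
static size_t page_to_idx(const struct page_hot *p)
{
	return (size_t)(p - hot_map);
}

/* Accessors hide the second lookup from callers. */
static unsigned long page_index(const struct page_hot *p)
{
	return cold_map[page_to_idx(p)].index;
}

static void set_page_index(struct page_hot *p, unsigned long idx)
{
	cold_map[page_to_idx(p)].index = idx;
}
```

A user that touches both halves on every access pays for two cachelines
instead of one, which is the pagecache objection above.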


Re: [Patch] Off by one in floppy.c

2007-05-21 Thread Pete Zaitcev

On Tue, 22 May 2007 00:57:56 +0200, Eric Sesterhenn / Snakebyte <[EMAIL PROTECTED]> wrote:

> http://marc.info/?l=linux-kernel&m=115144559823592&w=2

Shows how much we care about floppy... It's going to be a year old soon.

> +++ linux-2.6/drivers/block/floppy.c  2007-05-22 00:54:18.0 +0200
> @@ -670,7 +670,7 @@ static void __reschedule_timeout(int dri
>   if (drive == current_reqD)
>   drive = current_drive;
>   del_timer(&fd_timeout);
> - if (drive < 0 || drive > N_DRIVE) {
> + if (drive < 0 || drive >= N_DRIVE) {

You need to find someone willing to take this. Maybe Andrew.
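
For the record, the off-by-one is the usual one: an array of N_DRIVE entries
has valid indices 0 .. N_DRIVE-1, so the rejecting test has to be ">=".  A
minimal standalone sketch follows; the value 8 for N_DRIVE, the
timeout_ticks array and both helper names are assumptions for illustration,
not code from floppy.c.

```c
#include <assert.h>

#define N_DRIVE 8	/* assumed value, for illustration only */

static int timeout_ticks[N_DRIVE];

/* Valid indices are 0 .. N_DRIVE-1; N_DRIVE itself is out of range. */
static int drive_in_range(int drive)
{
	return drive >= 0 && drive < N_DRIVE;
}

/* Returns 0 on success, -1 if drive would index past the array. */
static int set_drive_timeout(int drive, int ticks)
{
	if (!drive_in_range(drive))
		return -1;
	timeout_ticks[drive] = ticks;
	return 0;
}
```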

-- Pete

Re: kconfig - scan all Kconfig files

2007-05-21 Thread Sam Ravnborg


> A simple example would be 
> help texts, right now they are per symbol, but they should really be per 
> menu, so archs can provide different help texts for something.

This one turned out easy.
I assume what you had in mind was something like the attached.

With this, the help entry presented is no longer the last one seen but the
one belonging to the menu (the symbol used within that menu).
So if I have:

menu "My first menu"
config FOO
bool "Foobar"
help
  First menu help
endmenu

menu "My second menu"
config FOO
bool "barfoo"
help
 Second menu help
endmenu

Then the help text will be the expected one in the two menus.

gconf + qconf will be fixed later if you are OK with this one.
This is IMO a general improvement and should be finished and applied.
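
Reduced to its core, the lookup change is sketched below in standalone C.
The types and names here (sym, menu_entry, menu_help()) are simplified
stand-ins rather than the real kconfig structures, and the nohelp_text
string is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder fallback text; the real string lives in conf.c. */
static const char nohelp_text[] = "(no help available)\n";

/* Simplified stand-ins for the kconfig types. */
struct sym {
	const char *name;
};

struct menu_entry {
	struct sym *sym;	/* may be shared by several entries */
	const char *prompt;
	const char *help;	/* per-entry help, may be NULL */
};

/* Help is looked up on the menu entry, not on the shared symbol. */
static const char *menu_help(const struct menu_entry *menu)
{
	return (menu && menu->help) ? menu->help : nohelp_text;
}
```

With two entries pointing at the same symbol FOO, each one returns its own
text, and an entry without help falls back to nohelp_text.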

Sam

diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c
index 1199baf..c2ca5fc 100644
--- a/scripts/kconfig/conf.c
+++ b/scripts/kconfig/conf.c
@@ -187,9 +187,9 @@ int conf_string(struct menu *menu)
/* print help */
if (line[1] == '\n') {
help = nohelp_text;
-   if (menu->sym->help)
-   help = menu->sym->help;
-   printf("\n%s\n", menu->sym->help);
+   if (menu->help)
+   help = menu->help;
+   printf("\n%s\n", menu->help);
def = NULL;
break;
}
@@ -233,7 +233,7 @@ static int conf_sym(struct menu *menu)
printf("/m");
if (oldval != yes && sym_tristate_within_range(sym, yes))
printf("/y");
-   if (sym->help)
+   if (menu->help)
printf("/?");
printf("] ");
conf_askvalue(sym, sym_get_string_value(sym));
@@ -270,8 +270,8 @@ static int conf_sym(struct menu *menu)
return 0;
 help:
help = nohelp_text;
-   if (sym->help)
-   help = sym->help;
+   if (menu->help)
+   help = menu->help;
printf("\n%s\n", help);
}
 }
@@ -342,7 +342,7 @@ static int conf_choice(struct menu *menu)
goto conf_childs;
}
printf("[1-%d", cnt);
-   if (sym->help)
+   if (menu->help)
printf("?");
printf("]: ");
switch (input_mode) {
@@ -359,8 +359,8 @@ static int conf_choice(struct menu *menu)
fgets(line, 128, stdin);
strip(line);
if (line[0] == '?') {
-   printf("\n%s\n", menu->sym->help ?
-   menu->sym->help : nohelp_text);
+   printf("\n%s\n", menu->help ?
+   menu->help : nohelp_text);
continue;
}
if (!line[0])
@@ -391,8 +391,8 @@ static int conf_choice(struct menu *menu)
if (!child)
continue;
if (line[strlen(line) - 1] == '?') {
-   printf("\n%s\n", child->sym->help ?
-   child->sym->help : nohelp_text);
+   printf("\n%s\n", child->help ?
+   child->help : nohelp_text);
continue;
}
sym_set_choice_value(sym, child->sym);
diff --git a/scripts/kconfig/expr.h b/scripts/kconfig/expr.h
index 6084525..0b22c73 100644
--- a/scripts/kconfig/expr.h
+++ b/scripts/kconfig/expr.h
@@ -71,7 +71,7 @@ enum {
 struct symbol {
struct symbol *next;
char *name;
-   char *help;
+   //char *help;
enum symbol_type type;
struct symbol_value curr;
struct symbol_value def[4];
@@ -139,7 +139,7 @@ struct menu {
struct property *prompt;
struct expr *dep;
unsigned int flags;
-   //char *help;
+   char *help;
struct file *file;
int lineno;
void *data;
diff --git a/scripts/kconfig/kxgettext.c b/scripts/kconfig/kxgettext.c
index abee55c..4b0aaa1 100644
--- a/scripts/kconfig/kxgettext.c
+++ b/scripts/kconfig/kxgettext.c
@@ -170,8 +170,8 @@ void menu_build_message_list(struct menu *menu)
 menu->file == NULL ? "Root Menu" : menu->file->name,
 menu->lineno);
 
-   if (menu->sym != NULL && menu->sym->help != NULL)
-   message__add(menu->sym->help, menu->sym->name,
+   if (menu->sym != NULL && menu->help != NULL)
+   message__add(menu->help, menu->sym->name,

Re: [rfc] increase struct page size?!

2007-05-21 Thread KAMEZAWA Hiroyuki

On Mon, 21 May 2007 17:38:58 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Tue, 22 May 2007, KAMEZAWA Hiroyuki wrote:
> 
> > For i386(32bit arch), there is not enough space for vmemmap.
> 
> I thought 32 bit would use flatmem? Is memory really sparse on 32 
> bit? Likely difficult due to lack of address space?
> 

Of course, i386 can use flatmem.

I am just concerned that memory hotplug works only with sparsemem.
But I also think we can add memory hotplug for flatmem if necessary.
(I myself have no plan now. I wonder whether a memory power-save mode may
 be supported by chipsets.)

-Kame


Re: [stable] [PATCH] - fix oops in sysfs_readdir

2007-05-21 Thread Andrew Morton

On Mon, 21 May 2007 19:18:55 -0500 Eric Sandeen <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > On Mon, 21 May 2007 13:11:21 -0500
> > Eric Sandeen <[EMAIL PROTECTED]> wrote:
> > 
> >> This is a non-ida backport of Tejun's patch in -mm at:
> >> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch
> >> for the 2.6.16 -stable tree - it follows the same scheme of using s_ino to
> >> safely store & retrieve the inode number of sysfs entries for use in
> >> sysfs_readdir, but uses a brain-dead-simple inode nr allocator rather than
> >> ida, which would bring along a lot of newer, more complex code.
> >>
> >> No, this doesn't guarantee uniqueness of sysfs inode numbers, but then
> >> the code in -stable today doesn't either - and with this change, at least
> >> it shouldn't oops.
> > 
> > So I'm sitting here wondering whether to commend this patch to google
> > kernel maintainers for 2.6.18 backport, but I realise I don't know what it
> > does.  And I don't know if it fixes the reclaim-time oopses they were
> > intermittently seeing, or if it fixes something else and if so what that is.
> > 
> > Sigh.  Better changelogs, please.
> > 
> 
> Sorry Andrew.  I referenced Tejun's upstream patch in -mm which has a 
> nice changelog etc, and this is a backport of that, and does the same 
> thing in the same way and solves the same problem - but that doesn't 
> help if you just want to toss this message into your patch stack.  Will 
> fix up & resend.
> 

Actually, someone (eg distros) looking at Tejun's changelog would still be
struggling to answer the question "do I need this".  The one thing it
claims to fix is "duplicate inode numbers".  But why is that a problem? 
What are the user-visible consequences of not merging the patch?  Unobvious.

Re: [RFC] enhancing the kernel's graphics subsystem


> So why do you want it in kernel? Security is not the sensible answer
> here.


I'm not proposing the KGI solution where every device driver presents
the same API. That model does require a lot of code in the kernel.

The existing DRM model where each driver provides its own API is a
good one. The user space DRI driver then takes this API and turns it
into a standard one. Applying the DRM style model to fbdev may allow
parts of fbdev to be moved out to user space.

What I don't want is a permanent root priv process hanging around in
the system. It simply isn't needed and I have prototyped a system that
runs without root so I know it can be done. With minor mods DRI can
run without the need for root, with more major mods the X server can
run without the need for root. Most of the mods to the X server are to
remove things like PCI bus probing, mode setting and VBIOS support.

--
Jon Smirl
[EMAIL PROTECTED]

Re: [rfc] increase struct page size?!

On Mon, May 21, 2007 at 04:26:03AM -0700, William Lee Irwin III wrote:
> On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote:
> >> Choosing k distinct integers (mem_map array indices) from the interval
> >> [0,n-1] results in k(n-k+1)/n non-adjacent intervals of contiguous
> >> array indices on average. The average interval length is
> >> (n+1)/(n-k+1) - 1/C(n,k). Alignment considerations make going much
> >> further somewhat hairy, but it should be clear that contiguity arising
> >> from random choice is non-negligible.
> 
> On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
> > That doesn't say anything about temporal locality, though.
> 
> It doesn't need to. If what's in the cache is uniformly distributed,
> you get that result for spatial locality. From there, it's counting
> cachelines.

OK, so your 'k' is the number of struct pages that are in cache? Then
that's fine.

I'm not sure how many that is going to be, but I would be surprised if
it were a significant proportion of mem_map, even on not-so-large
memory systems.


> On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote:
> >> In any event, I don't have all that much of an objection to what's
> >> actually proposed, just this particular cache footprint argument.
> >> One can motivate increases in sizeof(struct page), but not this way.
> 
> On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote:
> > Realise that you have to have a run of I think at least 7 or 8 contiguous
> > pages and temporally close references in order to save a single cacheline.
> > Then also that if the page being touched is not partially in cache from
> > an earlier access, then it is statistically going to cost more lines to
> > touch it (up to 75% if you touch the first and the last field, obviously 0%
> > if you only touch a single field, but that's unlikely given that you
> > usually take a reference then do at least something else like check flags).
> > I think the problem with the cache footprint argument is just whether
> > it makes any significant difference to performance. But..
> 
> The average interval ("run") length is (n+1)/(n-k+1) - 1/C(n,k), so for
> that to be >= 8 you need (n+1)/(n-k+1) - 1/C(n,k) >= 8 which also happens
> when (n+1)/(n-k+1) >= 9 or when n <= (9/8)*k - 1 or k >= (8/9)*(n+1).
> Clearly a lower bound on k is required, but not obviously derivable.
> k >= 8 is obvious, but the least k where (n+1)/(n-k+1) - 1/C(n,k) >= 8
> is not entirely obvious. Numerically solving for the least such k finds
> that k actually needs to be relatively close to (8/9)*n. A lower bound
> of something like 0.87*n + O(1) probably holds.

Ah, you worked it out... yeah I'd guess this is going to be pretty difficult
a condition to satisfy (given that it isn't possible for a 4GB system, even
if you had 32MB of cache to fill entirely with struct pages).
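
The two closed forms quoted above are easy to sanity-check numerically.  The
C sketch below brute-forces all k-subsets of [0,n-1] for small n and compares
against k(n-k+1)/n and (n+1)/(n-k+1) - 1/C(n,k), taking "average interval
length" to mean the mean of k divided by the number of runs (an assumption
about the intended definition).  All function names are made up for this
example.

```c
#include <assert.h>

/* Portable population count. */
static int popcount_u32(unsigned int x)
{
	int c = 0;

	while (x) {
		x &= x - 1;
		c++;
	}
	return c;
}

/* A run starts at every set bit whose lower neighbour is clear. */
static int count_runs(unsigned int mask)
{
	return popcount_u32(mask & ~(mask << 1));
}

/* 1/C(n,k), built as a product so it underflows rather than overflows. */
static double inv_binomial(int n, int k)
{
	double inv = 1.0;
	int i;

	for (i = 1; i <= k; i++)
		inv *= (double)i / (double)(n - k + i);
	return inv;
}

/* Closed forms from the thread. */
static double expected_runs(int n, int k)
{
	return (double)k * (n - k + 1) / n;
}

static double expected_run_len(int n, int k)
{
	return (double)(n + 1) / (n - k + 1) - inv_binomial(n, k);
}

/* Enumerate every k-subset of {0..n-1}; only sensible for small n. */
static void brute_force(int n, int k, double *mean_runs, double *mean_len)
{
	double sum_runs = 0.0, sum_len = 0.0;
	unsigned long subsets = 0;
	unsigned int mask;

	for (mask = 0; mask < (1u << n); mask++) {
		int runs;

		if (popcount_u32(mask) != k)
			continue;
		runs = count_runs(mask);
		sum_runs += runs;
		sum_len += (double)k / runs;
		subsets++;
	}
	*mean_runs = sum_runs / subsets;
	*mean_len = sum_len / subsets;
}

/* Least k in 1..n whose expected run length reaches target, or -1. */
static int least_k_for_run_len(int n, double target)
{
	int k;

	for (k = 1; k <= n; k++)
		if (expected_run_len(n, k) >= target)
			return k;
	return -1;
}
```

For n = 1000 and a target run length of 8, least_k_for_run_len() returns 876,
in line with the roughly 0.87*n estimate quoted above.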


> On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote:
> >> Now that I've been informed of the ->_count and ->_mapcount issues,
> >> I'd say that they're grave and should be corrected even at the cost
> >> of sizeof(struct page).
> 
> On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
> > ... yeah, something like that would bypass 
> 
> Did you get cut off here?

Must have. I was going to say it would bypass the whole speed/size
discussion anyway :P

Re: [RFC] enhancing the kernel's graphics subsystem

2007-05-21 Thread Keith Packard

On Tue, 2007-05-22 at 10:09 +1000, Benjamin Herrenschmidt wrote:

> I do strongly believe that the decision of what mode to choose should not
> be made in the kernel.

That's the plan; the kernel just provides mechanism. The architecture
used in the X server splits precisely at this point with the mechanism
in the driver and the configuration and policy up in the X server
proper. Quite a bit of that code could be broken out into a shared
library for fbdev-based apps and the X server to share, but that's down
the road a bit after the kernel APIs look a lot more solid.

With the goal of getting to a single-mode-set boot to avoid screen
flashing before login, the key here is to make any early user-mode
graphics apps share the same kernel graphics infrastructure as the X
server to identify the common cases where the startup and X modes are
the same and avoid resetting the configuration.

-- 
[EMAIL PROTECTED]

Re: [RFC] enhancing the kernel's graphics subsystem

On 5/21/07, Alan Cox <[EMAIL PROTECTED]> wrote:

> > the kernel, it can be a lot smaller than X and auditable.. sticking
> > the DRI protocol in the kernel is just pointless..
>
> It is a quite sensible idea.
>
> The userspace X server SHOULD be running under a non-root user, with
> appropriate fine-grained privs granted to it.
>
> "I need root to do graphics" is a myopic, antiquated view of the world.

X server: privileges below everything, pageable
kernel: privileges as high as conceivable, non-pageable

So why do you want it in kernel? Security is not the sensible answer
here.

Have you inspected the multi-megabyte X server for security holes to
the same level the kernel has been inspected?

The only part that needs to be in the kernel driver is the code
controlling locking and code that plays with the hardware. Moving it
into the driver ensures that only the minimal amount of root priv code
possible is going to end up in the system. If someone tries to move
too much into the kernel I'm sure you'll let them know that it's a bad
idea.

The problem right now is that code that needs root priv is all
intertwined with code that doesn't need it and it all ends up getting
run as root.

BTW, when I prototyped this a couple of years ago by merging Radeon
DRM/fbdev I only needed to add about 10K more code to the device
driver. Most of that was associated with getting the VBIOS to run in
x86 mode when the driver was first loaded. That code can be marked
_init. We're not talking about a lot of code needing to go into the
kernel.

--
Jon Smirl
[EMAIL PROTECTED]

Re: [rfc] increase struct page size?!

On Tue, 22 May 2007, KAMEZAWA Hiroyuki wrote:

> For i386(32bit arch), there is not enough space for vmemmap.

I thought 32 bit would use flatmem? Is memory really sparse on 32 
bit? Likely difficult due to lack of address space?

> For 64bit arch, page flags are not exhausted yet.

Right.


Re: [stable] [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes

> It's there specifically to fish out why it was sent to -stable w/out
> ever making it upstream.  Having sent the same question w/ no response
> 5 days ago

Yeah - fix your mailer, you got a reply 5 days ago.

Alan

Re: [patch 07/69] libata-sff: Undo bug introduced with pci_iomap changes

On Mon, 21 May 2007 16:18:25 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> 
> 
> On Mon, 21 May 2007, Chris Wright wrote:
> > ---
> > [chrisw: Why is this not upstream yet?]
> 
> And equally importantly, why is it even in the stable queue if it's not 
> upstream.

It's not relevant to upstream - upstream has different updates which
removed the bug but not in a clean "backport this" way.

> It's against stable rules, and it means that we may have stuff that gets 
> fixed in -stable and not in -upstream, if people don't notice. THAT IS 
> BAD

Then the rules are stupid in this specific case.

Alan

Re: [RFC] enhancing the kernel's graphics subsystem


Alan Cox wrote:

the kernel, it can be a lot smaller than X and auditable.. sticking
the DRI protocol in the kernel is just pointless..

It is a quite sensible idea.

The userspace X server SHOULD be running under a non-root user, with 
appropriate fine-grained privs granted to it.


"I need root to do graphics" is a myopic, antiquated view of the world.


X server: privileges below everything, pageable
kernel: privileges as high as conceivable, non-pageable

So why do you want it in kernel? Security is not the sensible answer
here.


Replying/quoting mixup.  I was responding to the root-privs userspace 
aspect, not the "put it in the kernel" aspect.


I do not want it in the kernel (should have snipped that last quoted line).

Jeff




Re: IDE/ATA: Intel i865-based mainboard, CDROM not detected

> Are we talking CONFIG_PATA_MARVELL here?  If so then the kernel I just
> booted has this set to "y" (ie: built-in) and yet the drive is still not
> detected.  Is there a newer version of this driver somewhere?  The kernel
> was 2.6.22-rc2.

Should be current. It's known to work fine for that chip so you might need
to do some debugging and provide more detail.

Re: [rfc] increase struct page size?!

2007-05-21 Thread KAMEZAWA Hiroyuki

On Mon, 21 May 2007 10:08:06 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> On Sun, 20 May 2007, Andi Kleen wrote:
> 
> > Besides with the scarcity of pageflags it might make sense to do "64 bit 
> > only"
> > flags at some point.
> 
> There is no scarcity of page flags. There is
> 
> 1. Hoarding by Andrew
> 
> 2. Waste by Sparsemem (section flags no longer necessary with
>virtual memmap)

For i386(32bit arch), there is not enough space for vmemmap.
For 64bit arch, page flags are not exhausted yet.

-Kame



Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]

2007-05-21 Thread Matthew Garrett

On Tue, May 22, 2007 at 12:42:00AM +0200, Pavel Machek wrote:
> On Mon 2007-05-21 14:45:53, Matthew Garrett wrote:
> > So don't do it badly. The advantage of doing so is that you can make it 
> > work properly, which you can't by putting it in the kernel.
> 
> You want stuff like critical shutdowns to work even if userspace is
> dead.

I don't think anyone suggested putting the critical shutdown control in 
userspace. The kernel already handles that fine.

> I do not think you can control passive cooling adequately from 
> userspace, and you can certainly not prevent kernel from slowing 
> machine down too soon.

Given the choice between something impossible and something difficult, 
I'm inclined towards picking the difficult one.

> Plus, this is actually nasty user-visible change, and a regression
> from 2.6.21. I am not sure why we are even debating this; user-kernel
> interface changed without warning. Patch should be simply reverted.

In http://lkml.org/lkml/2007/1/27/93 you were more than happy to break 
an interface even though it could be fixed in an (ugly) way that made it 
work again. Here, there's no way to fix this properly - the platform 
will quite happily do things based on what it believes the trip points 
should be, and one of those things may be to alter the trip points. 
Imagine the following situation:

1) Platform sets critical shutdown trip point to 85C
2) Userspace sets critical shutdown trip point to 95C
3) Temperature reaches 90C
4) Platform forces reevaluation of trip points
5) Entire invasion fleet is lost

How do you avoid that? Disable the ability for the platform to set trip 
points? You're breaking the spec and potentially causing hardware 
damage. If you have specific hardware that requires specific spec 
breakage, then a better approach would probably be to quirk the kernel 
to rectify it. On the other hand, if it works with the Other Leading OS, 
we ought to be able to just fix the problem properly.
-- 
Matthew Garrett | [EMAIL PROTECTED]

Re: [RFC] enhancing the kernel's graphics subsystem

> > the kernel, it can be a lot smaller than X and auditable.. sticking
> > the DRI protocol in the kernel is just pointless..
> 
> It is a quite sensible idea.
> 
> The userspace X server SHOULD be running under a non-root user, with 
> appropriate fine-grained privs granted to it.
> 
> "I need root to do graphics" is a myopic, antiquated view of the world.

X server: privileges below everything, pageable
kernel: privileges as high as conceivable, non-pageable

So why do you want it in kernel? Security is not the sensible answer
here.


[git patches] libata fixes and administrivia


Two fixes; the rest is trivial pre-release administrivia (ie. only bump
versions and chomp whitespace after everything else gets merged up)

Please pull from 'upstream-linus' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git 
upstream-linus

to receive the following updates:

 drivers/ata/Kconfig            |    2 +-
 drivers/ata/ahci.c             |   13 +++--
 drivers/ata/ata_generic.c      |    2 +-
 drivers/ata/ata_piix.c         |    9 +
 drivers/ata/libata-core.c      |   14 ++
 drivers/ata/libata-eh.c        |    2 +-
 drivers/ata/pata_artop.c       |    4 ++--
 drivers/ata/pata_cmd640.c      |    4 ++--
 drivers/ata/pata_cmd64x.c      |    2 +-
 drivers/ata/pata_cs5520.c      |    2 +-
 drivers/ata/pata_cs5530.c      |    2 +-
 drivers/ata/pata_cs5535.c      |    2 +-
 drivers/ata/pata_cypress.c     |    2 +-
 drivers/ata/pata_hpt366.c      |   28 +---
 drivers/ata/pata_hpt37x.c      |   10 +-
 drivers/ata/pata_hpt3x3.c      |    2 +-
 drivers/ata/pata_isapnp.c      |    2 +-
 drivers/ata/pata_it8213.c      |    2 +-
 drivers/ata/pata_ixp4xx_cf.c   |    2 +-
 drivers/ata/pata_jmicron.c     |    2 +-
 drivers/ata/pata_legacy.c      |    2 +-
 drivers/ata/pata_platform.c    |    2 +-
 drivers/ata/pata_qdi.c         |    2 +-
 drivers/ata/pata_rz1000.c      |    2 +-
 drivers/ata/pata_sc1200.c      |    2 +-
 drivers/ata/pata_scc.c         |    2 +-
 drivers/ata/pata_serverworks.c |    2 +-
 drivers/ata/pata_sl82c105.c    |    2 +-
 drivers/ata/pata_winbond.c     |    2 +-
 drivers/ata/pdc_adma.c         |    2 +-
 drivers/ata/sata_inic162x.c    |    2 +-
 drivers/ata/sata_mv.c          |    2 +-
 drivers/ata/sata_nv.c          |    6 +++---
 drivers/ata/sata_qstor.c       |    2 +-
 drivers/ata/sata_sil.c         |    2 +-
 drivers/ata/sata_sil24.c       |    2 +-
 drivers/ata/sata_sis.c         |    2 +-
 drivers/ata/sata_svw.c         |    2 +-
 drivers/ata/sata_sx4.c         |    2 +-
 drivers/ata/sata_uli.c         |    2 +-
 drivers/ata/sata_via.c         |    2 +-
 drivers/ata/sata_vsc.c         |    2 +-
 include/linux/libata.h         |    2 --
 43 files changed, 65 insertions(+), 93 deletions(-)

Alan Cox (3):
  pata_hpt366: Enable bits are unreliable so don't use them
  ata_piix: clean up
  libata: Kiss post_set_mode goodbye

Dave Jones (1):
  libata: Add Seagate STT2A to DMA blacklist.

Jeff Garzik (2):
  libata: Trim trailing whitespace
  libata: bump versions

Tejun Heo (1):
  ahci: disable 64bit dma on sb600

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index ad1f59c..b4a8d60 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -132,7 +132,7 @@ config SATA_SIS
depends on PCI
select PATA_SIS
help
- This option enables support for SiS Serial ATA on 
+ This option enables support for SiS Serial ATA on
  SiS 964/965/966/180 and Parallel ATA on SiS 180.
  The PATA support for SiS 180 requires additionally to
  enable the PATA_SIS driver in the config.
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index e00e1b9..7baeaff 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -46,7 +46,7 @@
 #include 
 
 #define DRV_NAME   "ahci"
-#define DRV_VERSION"2.1"
+#define DRV_VERSION"2.2"
 
 
 enum {
@@ -170,6 +170,7 @@ enum {
AHCI_FLAG_IGN_IRQ_IF_ERR= (1 << 25), /* ignore IRQ_IF_ERR */
AHCI_FLAG_HONOR_PI  = (1 << 26), /* honor PORTS_IMPL */
AHCI_FLAG_IGN_SERR_INTERNAL = (1 << 27), /* ignore SERR_INTERNAL */
+   AHCI_FLAG_32BIT_ONLY= (1 << 28), /* force 32bit */
 
AHCI_FLAG_COMMON= ATA_FLAG_SATA | ATA_FLAG_NO_LEGACY |
  ATA_FLAG_MMIO | ATA_FLAG_PIO_DMA |
@@ -354,7 +355,8 @@ static const struct ata_port_info ahci_port_info[] = {
/* board_ahci_sb600 */
{
.flags  = AHCI_FLAG_COMMON |
- AHCI_FLAG_IGN_SERR_INTERNAL,
+ AHCI_FLAG_IGN_SERR_INTERNAL |
+ AHCI_FLAG_32BIT_ONLY,
.pio_mask   = 0x1f, /* pio0-4 */
.udma_mask  = 0x7f, /* udma0-6 ; FIXME */
.port_ops   = &ahci_ops,
@@ -492,6 +494,13 @@ static void ahci_save_initial_config(struct pci_dev *pdev,
hpriv->saved_cap = cap = readl(mmio + HOST_CAP);
hpriv->saved_port_map = port_map = readl(mmio + HOST_PORTS_IMPL);
 
+   /* some chips lie about 64bit support */
+   if ((cap & HOST_CAP_64) && (pi->flags & AHCI_FLAG_32BIT_ONLY)) {
+   dev_printk(KERN_INFO, &pdev->dev,
+  "controller can't do 64bit DMA, forcing 32bit\n");
+   cap &= ~HOST_CAP_64;
+   }
+
/* fixup zero port_map */
if (!port_map) {
port_map = (1 << ahci_

Re: [1/5] 2.6.22-rc2: known regressions

2007-05-21 Thread Ray Lee


Hey there,

On 5/19/07, Michal Piotrowski <[EMAIL PROTECTED]> wrote:

Here is a list of some known regressions in 2.6.22-rc2.

Feel free to add new regressions/remove fixed etc.
http://kernelnewbies.org/known_regressions

Subject: nx6125 has lost fan control
References : http://lkml.org/lkml/2007/5/16/249
Submitter  : Ray Lee <[EMAIL PROTECTED]>
Status : Unknown


I'm withdrawing this one. At boot, it wasn't controlling the fans, but
after a suspend-resume it started up fine. My subsequent boots have
also been okay, so I'm at a loss to explain what I saw.

Regardless, given that it's hard to reproduce, I have no evidence that
it's a regression; it could just be something that's really rare.

Thanks,

Ray

Re: [RFC] enhancing the kernel's graphics subsystem

On 5/21/07, Jeff Garzik <[EMAIL PROTECTED]> wrote:

Jon Smirl wrote:
> 2) Address the long outstanding issue of multi-seat at the console
> level. My solution to this is the one device per CRTC model.

This is very very low priority.  Pretty much nobody besides you is
clamoring for it.

> 3) Eliminate the need for a root priv controlling process. Get rid of
> the potential for a security hole.

Agreed.

> 4) OOPS should always display even if in a graphics mode

Agreed, and this was in the list that Jesse(?) posted.

> 8) Allow multiple user space graphics systems to run. These user space

Another very very low priority item.

There are a lot more important things to work on.  Linux is about what
people need -right now-, not what you think Linux might need in the
future; not what you think might be nice to have.

I am not asking that these features be implemented today. I am asking
that enough planning go into the architecture today to make sure that
these features can be built in the future without tearing up the
graphics system for a third time.

This is the essence of my complaint about this patch. The patch
introduces a new low level graphics API to the kernel. Once we put an
API in it is basically impossible to get it back out. I am not
convinced that enough planning has gone into this API yet.

I'm also not convinced that there is a transition plan in place to
ensure that all drivers get updated to this new API. The last thing we
want is to maintain two parallel sets of video drivers forever into
the future. V4L2 did something similar to this and orphaned a lot of
drivers that the distributions were forced into updating later.

Mode setting is intimately intertwined with the console. VT swapping
adds another messy layer which can and should be eliminated in a
redesign. Multi-seat and unicode add more complexity. All of this
needs to be designed as a unified system. Satisfying the needs of the
X server is the easiest piece of the puzzle.

--
Jon Smirl
[EMAIL PROTECTED]

Re: intermittant petabyte usage reported with broadcom nic

2007-05-21 Thread Michael Chan

On Mon, 2 Apr 2007 11:43:19 +1000 CaT <[EMAIL PROTECTED]> wrote:
>  
> I take minute by minute snapshots of network traffic by sampling
> /proc/net/dev and most of the time everything works fine. Occasionally
> though I get petabyte byte traffic and corresponding packet traffic.

We were able to reproduce the problem and confirmed that it was a DMA
problem of the statistics block.  About once an hour on average, wrong
counter values will be DMA'ed to host memory.  Luckily, the DMA write
stays within the intended address range so it will not corrupt other
parts of memory.  Other types of DMA including traffic and buffer
descriptors are not affected.

If you happen to be reading /proc/net/dev within a second after the DMA
corruption, you'll see bogus counters.  One second later and until the
next bad DMA, the counters will be normal again.

We are considering ways to work around the problem.  Thanks.
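
Until a driver-side workaround exists, a tool that samples /proc/net/dev could reject implausible samples on its own. The sketch below is only an illustration of that idea, not anything from the driver; the function name and its parameters are hypothetical, and it assumes the caller knows the link speed and keeps the last accepted sample as the baseline.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace filter: accept a byte-counter sample only if the
 * counter did not go backwards and did not grow by more than the link
 * could have carried in interval_s seconds.
 */
bool sample_is_plausible(uint64_t prev_bytes, uint64_t cur_bytes,
                         uint64_t interval_s, uint64_t link_bits_per_s)
{
    /* Upper bound on bytes transferable during the interval. */
    uint64_t max_bytes = (link_bits_per_s / 8) * interval_s;

    if (cur_bytes < prev_bytes)
        return false; /* counter went backwards: treat as bogus */

    return cur_bytes - prev_bytes <= max_bytes;
}
```

A rejected sample is simply skipped; since the counters return to normal about a second later, the next sample compares cleanly against the last accepted one (with interval_s widened to cover the gap).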


Re: [RFC] enhancing the kernel's graphics subsystem

On Mon, 2007-05-21 at 13:42 -0400, Jon Smirl wrote:
> 
> When I went through the design process for all this I came to the same
> conclusion about needing a user space console process.
> 
> User space console does impact on all of this because it implies that
> the current console should be defeatured down until it becomes only
> a system recovery console and not a console for everyday use.

I do agree (heh, for once) with that in the sense that in the long run,
we should strip the kernel console down to the bare minimum to boot,
display oopses, etc... and have all the fancy stuff, unicode, VT, and
more in a userspace console process.

However, I'm a little bit worried that we'll end up with 10 competing
incompatible and inconsistent userspace console projects and -that- will
be horrible.

But it's something separate from what Dave and Jesse are trying to
address. Let's first get the foundation right and -then- we can do all
sorts of crazy things. Or maybe you can start working on a user console
project in parallel using the new APIs that Jesse and Dave are
providing ? :-)

> For example, one part of the defeaturing would be to remove the
> drawing acceleration code in the existing fbdev console drivers and to
> rework it to support accelerated drawing from the user space console
> implementation. You want the system recovery console mode to be as
> simple as possible so that it is always guaranteed to work. User space
> console is also what leads to the idea of compiling VT out of the
> kernel.

I do agree on that in the long run, but again, let's look into this
-after- we have solved the more immediate issues. We can probably kill
most of fbdev, fbcon and current VT once we have a solid userland based
replacement that isn't completely bloated (maybe with a "slim" version
that does only VGA and non-utf8 for server type apps).

> Once you decide that a user space console is needed then the per CRTC
> device node becomes more obvious since different people can be logged
> onto the different consoles.

That's irrelevant. Implementation detail.

> All of the points in the list are interrelated and the architecture
> needs to address everything as a unified whole.

And that's bullshit :-)

Ben.



Re: [stable] [PATCH] - fix oops in sysfs_readdir

2007-05-21 Thread Eric Sandeen

Andrew Morton wrote:

On Mon, 21 May 2007 13:11:21 -0500
Eric Sandeen <[EMAIL PROTECTED]> wrote:

This is a non-ida backport of Tejun's patch in -mm at:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch
for the 2.6.16 -stable tree - it follows the same scheme of using s_ino to
safely store & retrieve the inode number of sysfs entries for use in
sysfs_readdir, but uses a brain-dead-simple inode nr allocator rather than
ida, which would bring along a lot of newer, more complex code.

No, this doesn't guarantee uniqueness of sysfs inode numbers, but then
the code in -stable today doesn't either - and with this change, at least
it shouldn't oops.

So I'm sitting here wondering whether to commend this patch to google kernel maintainers
for 2.6.18 backport, but I realise I don't know what it does. And I don't know
if it fixes the reclaim-time oopses they were intermittently seeing, or if it
fixes something else and if so what that is.

Sigh. Better changelogs, please.

Sorry Andrew. I referenced Tejun's upstream patch in -mm which has a
nice changelog etc, and this is a backport of that, and does the same
thing in the same way and solves the same problem - but that doesn't
help if you just want to toss this message into your patch stack. Will
fix up & resend.

-Eric


Re: error in recent patch to fs/partitions/ldm.c

2007-05-21 Thread Randy Dunlap

On Mon, 21 May 2007 19:24:57 -0400 (EDT) Robert P. J. Day wrote:

> 
> $ git show dde33348e53ecab687a9768bf5262f0b8f79b7f2
> ...
> --- a/fs/partitions/ldm.c
> +++ b/fs/partitions/ldm.c
> ...
> -   (unsigned long long)ph->config_size );
> +   udunsigned long long)ph->config_size);
> ^^

Patch has already been posted to lkml:

From:   Anton Altaparmakov <[EMAIL PROTECTED]>
Subject: Fix to the fix! - Re: [2.6 PATCH] LDM: Fix for Windows Vista dynamic 
disks
Date:   Mon, 21 May 2007 21:13:37 +0100 (BST)

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: + loop-preallocate-eight-loop-devices.patch added to -mm tree

On Mon, May 21, 2007 at 03:00:55PM -0700, [EMAIL PROTECTED] wrote:
> + if (register_blkdev(LOOP_MAJOR, "loop"))
> + return -EIO;
> + blk_register_region(MKDEV(LOOP_MAJOR, 0), range,
> +   THIS_MODULE, loop_probe, NULL, NULL);
> +
> + for (i = 0; i < nr; i++) {
> + if (!loop_init_one(i))
> + goto err;
> + }
> +
> + printk(KERN_INFO "loop: module loaded\n");
> + return 0;
> +err:
> + loop_exit();

This isn't good.  You *can't* fail once a single disk has been registered.
Anyone could've opened it by now.

IOW, you need to
* register region *after* you are past the point of no return
* either not fail on failing loop_init_one() here, or separate
allocations and actual add_disk().
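
The second option (separate allocations from the actual add_disk()) can be sketched as a two-phase init. This is a userspace mock with hypothetical names (mock_dev, mock_alloc, mock_publish, mock_init), not the loop driver; it assumes that publishing cannot fail once allocation has succeeded.

```c
#include <stdlib.h>

#define MOCK_MAX_DEVS 8

/* Hypothetical stand-in for a loop device plus its gendisk. */
struct mock_dev {
    int published; /* 1 once visible to openers */
};

struct mock_dev *mock_devs[MOCK_MAX_DEVS];

/* Phase 1: may fail, but nothing is visible to anyone yet. */
struct mock_dev *mock_alloc(int simulate_failure)
{
    if (simulate_failure)
        return NULL;
    return calloc(1, sizeof(struct mock_dev));
}

/* Phase 2: stands in for add_disk(); assumed not to fail. */
void mock_publish(struct mock_dev *dev)
{
    dev->published = 1;
}

/* fail_at is the index whose allocation fails, or -1 for none. */
int mock_init(int nr, int fail_at)
{
    int i;

    if (nr > MOCK_MAX_DEVS)
        return -1;

    for (i = 0; i < nr; i++) {
        mock_devs[i] = mock_alloc(i == fail_at);
        if (!mock_devs[i])
            goto err;
    }

    /* Point of no return: only now does anything become visible. */
    for (i = 0; i < nr; i++)
        mock_publish(mock_devs[i]);
    return 0;

err:
    /* Safe to unwind: nothing was published, so nobody can hold it open. */
    while (i-- > 0) {
        free(mock_devs[i]);
        mock_devs[i] = NULL;
    }
    return -1;
}
```

The failure path only ever frees objects that were never published, which is the property the error handling in the quoted patch lacks.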

Re: [RFC] enhancing the kernel's graphics subsystem

On Mon, 2007-05-21 at 18:14 +0100, Dave Airlie wrote:
> 
> > 6) Eliminate the existing VT swap driver free for all. I would
> > compile out the VT layer and replace it with a compatible API that
> > enforces some sanity.
> 
> I'm hoping to look into this but it is a parallel problem to what this
> code does, the VT switch API sucks rocks, so providing something
> compatible is going to suck rocks..

Yeah, please, don't even go near that until everything else is done &
merged or you'll never have anything finished :-) VT is a can of worms
that will take some time to sort out and has nothing to do with what we
are talking about right now anyway :-)

Ben.



Re: bug in 2.6.22-rc2: loop mount limited to one single iso image

On Tue, May 22, 2007 at 01:10:38AM +0100, Al Viro wrote:
> On Mon, May 21, 2007 at 09:11:02AM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Sun, 20 May 2007, Kay Sievers wrote:
> > > 
> > > Right, providing "preallocated" devices, 8 or the number given in
> > > max_loop, sounds like the best option until the tools can handle that.
> > 
> > Yes. Can somebody who actually _uses_ loop send a tested patch, please?
> 
> FWIW, I do and I have tested it; what I did *not* do is using a testbox
> with dynamic /dev for testing.  Mea culpa...
> 
> AFAICS, patch that went to akpm (preallocate max_loop instances) is OK.

... except that it needs to do cleanup on failure in loop_init().

Re: [RFC] enhancing the kernel's graphics subsystem


> In collaboration with the FB guys, we've been working on enhancing the
> kernel's graphics subsystem in an attempt to bring some sanity to the
> Linux graphics world and avoid the situation we have now where several
> kernel and userspace drivers compete for control of graphics devices.

 .../...

A little note about initial mode setting at boot...

I do strongly believe that the decision of what mode to choose should not
be made in the kernel. At boot the kernel should either leave the HW in
whatever state the FW set it (and text mode is fine) or, if that's not an
option, set up some sane default (ie 640x480 has the most chances of
working), maybe with some very minimal EDID parsing in case you have a
fixed frequency weirdo panel, but that's it (unless it's a mac and OF
tells you what to use :-)

The kernel would provide userland with connector infos (presence load
detect, EDID, ...) and userland gets to decide what to do.

Some reasons to keep that policy completely out of the kernel even at
boot time are:

 - User may want to configure his default gfx setup and have it up early

 - EDIDs do lie: monitors routinely ship with Windows .inf files
containing "updated" infos, Apple has that too in OS X (EDID overrides
afaik), etc. If we're going to do such a database of known monitors, it
should definitely not be in the kernel. Besides, we want to add more
infos to it that EDIDs don't provide most of the time, like subpixel
ordering etc...

 - It sounds better that way :-) (yeah, that's the best reason !)

So while I agree that the register frobbing, memory management, etc...
should be indeed moved to the kernel as you guys have been doing lately,
the policy of deciding what mode to set should totally stick to
userland.

IMHO, the best would be a lib (or daemon or both) for monitor detection
& mode setting that is separate from X :-) That could handle storing the
admin's default setup (including weird monitor info if any) and
restoring it at boot time, etc...

Cheers,
Ben.



Re: bug in 2.6.22-rc2: loop mount limited to one single iso image

On Mon, May 21, 2007 at 09:11:02AM -0700, Linus Torvalds wrote:
> 
> 
> On Sun, 20 May 2007, Kay Sievers wrote:
> > 
> > Right, providing "preallocated" devices, 8 or the number given in
> > max_loop, sounds like the best option until the tools can handle that.
> 
> Yes. Can somebody who actually _uses_ loop send a tested patch, please?

FWIW, I do and I have tested it; what I did *not* do is using a testbox
with dynamic /dev for testing.  Mea culpa...

AFAICS, patch that went to akpm (preallocate max_loop instances) is OK.

Re: [RFC] enhancing the kernel's graphics subsystem